Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Michal Orzel
Hi Julien,

On 15.12.2021 19:25, Julien Grall wrote:
> Hi Michal,
> 
> On 15/12/2021 10:40, Michal Orzel wrote:
>> On 15.12.2021 11:32, Jan Beulich wrote:
>>> (Re-sending an abridged version, as apparently spam filters didn't like
>>> the original message with more retained context; I'll have to see whether
>>> this one also isn't liked. Sorry.)
>>>
>>> On 15.12.2021 10:48, Michal Orzel wrote:
 This patch and the problem it solves is about clearing top 32bits of all 
 gp registers so not only x0,x1.
>>>
>>> That's well understood. Yet for everything still in registers simply
>>> using mov ahead of the respective push (as you had it) is still
>>> preferable imo.
>>>
>>> Jan
>>>
>>
>> In that case let's wait for Julien's opinion to decide whether I should get 
>> back to the previous
>> solution with mov or to the stack solution.
> 
> IIUC, your proposal is to:
>    1) Push all the 64-bit registers
>    2) Zero the top 32-bit
> 
> Jan's suggestion is to:
>    1) clobber the top 32-bit using mov wX, wX
>    2) Push all the registers
> 
> My preference is for the latter because there will be less memory/cache 
> access.
> 
> So, this would be your original patch + a compile time check to ensure 
> save_x0_x1 is 0 when compat=1.
> 
That is exactly what I would like to do. I will send this as v2.

> Cheers,
> 

Cheers,
Michal



Re: [PATCH] MAINTAINERS: remove typo from XEN PVUSB DRIVER section

2021-12-15 Thread Juergen Gross

On 16.12.21 07:55, Lukas Bulwahn wrote:

Commit a92548f90fa6 ("xen: add Xen pvUSB maintainer") adds the new XEN
PVUSB DRIVER section, but one file entry contains an obvious typo.

Fortunately, ./scripts/get_maintainer.pl --self-test=patterns warns:

   warning: no file matchesF:divers/usb/host/xen*

Remove this obvious typo.

Signed-off-by: Lukas Bulwahn 


Reviewed-by: Juergen Gross 


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[PATCH] MAINTAINERS: remove typo from XEN PVUSB DRIVER section

2021-12-15 Thread Lukas Bulwahn
Commit a92548f90fa6 ("xen: add Xen pvUSB maintainer") adds the new XEN
PVUSB DRIVER section, but one file entry contains an obvious typo.

Fortunately, ./scripts/get_maintainer.pl --self-test=patterns warns:

  warning: no file matchesF:divers/usb/host/xen*

Remove this obvious typo.

Signed-off-by: Lukas Bulwahn 
---
applies on next-20211215

Juergen, please ack.

Greg, please pick this minor clean-up on top of the commit above.

 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 97215d89df4e..a5df6e1219b6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21007,7 +21007,7 @@ M:  Juergen Gross 
 L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
 L: linux-...@vger.kernel.org
 S: Supported
-F: divers/usb/host/xen*
+F: drivers/usb/host/xen*
 F: include/xen/interface/io/usbif.h
 
 XEN SOUND FRONTEND DRIVER
-- 
2.17.1




RE: [patch V3 00/35] genirq/msi, PCI/MSI: Spring cleaning - Part 2

2021-12-15 Thread Michael Kelley (LINUX)
From: Thomas Gleixner  Sent: Wednesday, December 15, 2021 
8:36 AM
> 
> On Wed, Dec 15 2021 at 17:18, Thomas Gleixner wrote:
> 
> > On Tue, Dec 14 2021 at 22:19, Thomas Gleixner wrote:
> >> On Tue, Dec 14 2021 at 14:56, Nishanth Menon wrote:
> >>
> >> thanks for trying. I'll have a look again with brain awake tomorrow
> >> morning.
> >
> > Morning was busy with other things, but I found what my sleepy brain
> > managed to do wrong yesterday evening.
> >
> > Let me reintegrate the pile and I'll send you an update.
> 
>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git 
> msi-v4.1-part-2
>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git 
> msi-v4.2-part-3
> 
> That should cure the problem.

Tested the msi-v4.2-part-3 tag in two different Azure/Hyper-V VMs.  One
is a Generation 1 VM that has legacy PCI devices and one is a Generation 2
VM with no legacy PCI devices.   Tested hot add and remove of Mellanox
CX-3 and CX-4 SR-IOV NIC virtual functions that are directly mapped into the
VM.  Also tested local NVMe devices directly mapped into one of the VMs.

No issues encountered.  So for Azure/Hyper-V specifically,

Tested-by: Michael Kelley 




[PATCH] xen: make some per-scheduler performance counters sched global ones

2021-12-15 Thread Juergen Gross
Some performance counters listed to be credit or credit2 specific are
being used by the null scheduler, too.

Make those sched global ones.

Fixes: ab6ba8c6753fa76 ("perfc: conditionalize credit/credit2 counters")
Signed-off-by: Juergen Gross 
---
 xen/include/xen/perfc_defn.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/include/xen/perfc_defn.h b/xen/include/xen/perfc_defn.h
index 672b51c456..0027d95a60 100644
--- a/xen/include/xen/perfc_defn.h
+++ b/xen/include/xen/perfc_defn.h
@@ -34,6 +34,9 @@ PERFCOUNTER(tickled_idle_cpu,   "sched: tickled_idle_cpu")
 PERFCOUNTER(tickled_idle_cpu_excl,  "sched: tickled_idle_cpu_exclusive")
 PERFCOUNTER(tickled_busy_cpu,   "sched: tickled_busy_cpu")
 PERFCOUNTER(unit_check, "sched: unit_check")
+PERFCOUNTER(migrate_running,"sched: migrate_running")
+PERFCOUNTER(migrate_on_runq,"sched: migrate_on_runq")
+PERFCOUNTER(migrated,   "sched: migrated")
 
 /* credit specific counters */
 #ifdef CONFIG_SCHED_CREDIT
@@ -55,7 +58,6 @@ PERFCOUNTER(steal_trylock,  "csched: steal_trylock")
 PERFCOUNTER(steal_trylock_failed,   "csched: steal_trylock_failed")
 PERFCOUNTER(steal_peer_idle,"csched: steal_peer_idle")
 PERFCOUNTER(migrate_queued, "csched: migrate_queued")
-PERFCOUNTER(migrate_running,"csched: migrate_running")
 PERFCOUNTER(migrate_kicked_away,"csched: migrate_kicked_away")
 PERFCOUNTER(unit_hot,   "csched: unit_hot")
 #endif
@@ -67,13 +69,11 @@ PERFCOUNTER(acct_load_balance,  "csched2: 
acct_load_balance")
 PERFCOUNTER(upd_max_weight_quick,   "csched2: update_max_weight_quick")
 PERFCOUNTER(upd_max_weight_full,"csched2: update_max_weight_full")
 PERFCOUNTER(migrate_requested,  "csched2: migrate_requested")
-PERFCOUNTER(migrate_on_runq,"csched2: migrate_on_runq")
 PERFCOUNTER(migrate_no_runq,"csched2: migrate_no_runq")
 PERFCOUNTER(runtime_min_timer,  "csched2: runtime_min_timer")
 PERFCOUNTER(runtime_max_timer,  "csched2: runtime_max_timer")
 PERFCOUNTER(pick_resource,  "csched2: pick_resource")
 PERFCOUNTER(need_fallback_cpu,  "csched2: need_fallback_cpu")
-PERFCOUNTER(migrated,   "csched2: migrated")
 PERFCOUNTER(migrate_resisted,   "csched2: migrate_resisted")
 PERFCOUNTER(credit_reset,   "csched2: credit_reset")
 PERFCOUNTER(deferred_to_tickled_cpu,"csched2: deferred_to_tickled_cpu")
-- 
2.26.2




Re: [PATCH v3 02/13] xen: harmonize return types of hypercall handlers

2021-12-15 Thread Juergen Gross

On 16.12.21 03:10, Stefano Stabellini wrote:

On Wed, 15 Dec 2021, Juergen Gross wrote:

On 14.12.21 18:36, Julien Grall wrote:

Hi,

On 08/12/2021 15:55, Juergen Gross wrote:

Today most hypercall handlers have a return type of long, while the
compat ones return an int. There are a few exceptions from that rule,
however.


So on Arm64, I don't think you can make use of the full 64-bit because a
32-bit domain would not be able to see the top 32-bit.

In fact, this could potentially cause us some trouble (see [1]) in Xen.
So it feels like the hypercalls should always return a 32-bit signed value
on Arm.


This would break hypercalls like XENMEM_maximum_ram_page which are able
to return larger values, right?


The other advantage is it would be clear that the top 32-bit are not
usuable. Stefano, what do you think?


Wouldn't it make more sense to check the return value to be a sign
extended 32-bit value for 32-bit guests in do_trap_hypercall() instead?

The question is what to return if this is not the case. -EDOM?



I can see where Julien is coming from: we have been trying to keep the
arm32 and arm64 ABIs identical since the beginning of the project. So,
like Julien, my preference would be to always return 32-bit on ARM, both
aarch32 and aarch64. It would make things simple.

The case of XENMEM_maximum_ram_page is interesting but it is not a
problem in reality because the max physical address size is only 40-bit
for aarch32 guests, so 32-bit are always enough to return the highest
page in memory for 32-bit guests.


You are aware that this isn't the guest's max page, but the host's?

And all of this is about a 32-bit guest on a 64-bit host.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[ovmf test] 167436: all pass - PUSHED

2021-12-15 Thread osstest service owner
flight 167436 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167436/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf f14fff513540757bef62923ee4aeca4bf3ea8081
baseline version:
 ovmf 38f6d78c3b62f8825e7d802697b7992418a72da7

Last test of basis   167419  2021-12-14 23:11:46 Z1 days
Testing same since   167436  2021-12-15 15:38:44 Z0 days1 attempts


People who touched revisions under test:
  Wei6 Xu 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   38f6d78c3b..f14fff5135  f14fff513540757bef62923ee4aeca4bf3ea8081 -> 
xen-tested-master



arm32 randconfig failure

2021-12-15 Thread Stefano Stabellini
Hi all,

gitlab-ci spotted another randconfig build issue on arm32. To repro, use
the attached Xen config file and build with XEN_TARGET_ARCH=arm32 (you
need an appropriate cross-compiler.)  This is the error:

https://gitlab.com/xen-project/people/andyhhp/xen/-/jobs/1888385010

Also appended below for your convenience. It affects the null scheduler.
I couldn't spot anything obvious so I ran a bisection.  This is the
offending commit:

commit ab6ba8c6753fa7642de2ffc84f6decadc6c40c2c
Author: Jan Beulich 
Date:   Fri Dec 10 10:25:44 2021 +0100

perfc: conditionalize credit/credit2 counters

There's no point including them when the respective scheduler isn't
enabled in the build.

Signed-off-by: Jan Beulich 
Reviewed-by: Luca Fancellu 

 xen/include/xen/perfc_defn.h | 4 
 1 file changed, 4 insertions(+)


Ideas on how to fix it?

Cheers,

Stefano



In file included from /local/repos/xen-upstream/xen/include/xen/mm.h:58,
 from 
/local/repos/xen-upstream/xen/arch/arm/include/asm/p2m.h:4,
 from 
/local/repos/xen-upstream/xen/arch/arm/include/asm/domain.h:7,
 from /local/repos/xen-upstream/xen/include/xen/domain.h:8,
 from /local/repos/xen-upstream/xen/include/xen/sched.h:11,
 from null.c:31:
null.c: In function ‘null_unit_migrate’:
/local/repos/xen-upstream/xen/include/xen/perfc.h:65:53: error: 
‘PERFC_migrate_on_runq’ undeclared (first use in this function); did you mean 
‘PERFC_migrate_running’?
   65 | #define perfc_incr(x) (++this_cpu(perfcounters)[PERFC_ ## x])
  | ^~
/local/repos/xen-upstream/xen/include/xen/sched.h:47:46: note: in expansion of 
macro ‘perfc_incr’
   47 | #define SCHED_STAT_CRANK(_X)(perfc_incr(_X))
  |  ^~
null.c:713:9: note: in expansion of macro ‘SCHED_STAT_CRANK’
  713 | SCHED_STAT_CRANK(migrate_on_runq);
  | ^~~~
/local/repos/xen-upstream/xen/include/xen/perfc.h:65:53: note: each undeclared 
identifier is reported only once for each function it appears in
   65 | #define perfc_incr(x) (++this_cpu(perfcounters)[PERFC_ ## x])
  | ^~
/local/repos/xen-upstream/xen/include/xen/sched.h:47:46: note: in expansion of 
macro ‘perfc_incr’
   47 | #define SCHED_STAT_CRANK(_X)(perfc_incr(_X))
  |  ^~
null.c:713:9: note: in expansion of macro ‘SCHED_STAT_CRANK’
  713 | SCHED_STAT_CRANK(migrate_on_runq);
  | ^~~~
/local/repos/xen-upstream/xen/include/xen/perfc.h:65:53: error: 
‘PERFC_migrated’ undeclared (first use in this function); did you mean 
‘PSCI_migrate’?
   65 | #define perfc_incr(x) (++this_cpu(perfcounters)[PERFC_ ## x])
  | ^~
/local/repos/xen-upstream/xen/include/xen/sched.h:47:46: note: in expansion of 
macro ‘perfc_incr’
   47 | #define SCHED_STAT_CRANK(_X)(perfc_incr(_X))
  |  ^~
null.c:715:5: note: in expansion of macro ‘SCHED_STAT_CRANK’
  715 | SCHED_STAT_CRANK(migrated);
  | ^~~~
  LD  built_in.o
make[4]: *** [/local/repos/xen-upstream/xen/Rules.mk:197: null.o] Error 1#
# Automatically generated file; DO NOT EDIT.
# Xen/arm 4.17-unstable Configuration
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=100201
CONFIG_CLANG_VERSION=0
CONFIG_CC_HAS_VISIBILITY_ATTRIBUTE=y
CONFIG_ARM_32=y
CONFIG_ARM=y
CONFIG_ARCH_DEFCONFIG="arch/arm/configs/arm32_defconfig"

#
# Architecture Features
#
CONFIG_NR_CPUS=128
CONFIG_HVM=y
CONFIG_NEW_VGIC=y
# CONFIG_SBSA_VUART_CONSOLE is not set
# CONFIG_ARM_SSBD is not set
CONFIG_HARDEN_BRANCH_PREDICTOR=y
CONFIG_TEE=y
CONFIG_OPTEE=y
# end of Architecture Features

#
# ARM errata workaround via the alternative framework
#
# CONFIG_ARM_ERRATUM_858921 is not set
# end of ARM errata workaround via the alternative framework

CONFIG_ARM32_HARDEN_BRANCH_PREDICTOR=y
CONFIG_ALL_PLAT=y
# CONFIG_NO_PLAT is not set
CONFIG_ALL32_PLAT=y

#
# Common Features
#
# CONFIG_GRANT_TABLE is not set
CONFIG_HAS_ALTERNATIVE=y
CONFIG_HAS_DEVICE_TREE=y
CONFIG_HAS_PDX=y
CONFIG_MEM_ACCESS=y
# CONFIG_STATIC_MEMORY is not set

#
# Speculative hardening
#
CONFIG_SPECULATIVE_HARDEN_ARRAY=y
# end of Speculative hardening

CONFIG_HYPFS=y
# CONFIG_HYPFS_CONFIG is not set
CONFIG_IOREQ_SERVER=y
CONFIG_EFI_SET_VIRTUAL_ADDRESS_MAP=y
# CONFIG_XSM is not set
CONFIG_ARGO=y

#
# Schedulers
#
CONFIG_SCHED_CREDIT=y
# CONFIG_SCHED_CREDIT2 is not set
# CONFIG_SCHED_RTDS is not set
# CONFIG_SCHED_ARINC653 is not set
CONFIG_SCHED_NULL=y
CONFIG_SCHED_CREDIT_DEFAULT=y
# CONFIG_SCHED_NULL_DEFAULT is not set
CONFIG_SCHED_DEFAULT="credit"
# end of Schedulers

# CONFIG_LIVEPATCH is not set
# CONFIG_ENFORCE_UNIQUE_SYMBOLS 

[qemu-mainline test] 167426: tolerable FAIL - PUSHED

2021-12-15 Thread osstest service owner
flight 167426 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167426/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167228
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167228
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167228
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167228
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167228
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167228
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167228
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167228
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass

version targeted for testing:
 qemuu76b56fdfc9fa43ec6e5986aee33f108c6c6a511e
baseline version:
 qemuua3607def89f9cd68c1b994e1030527df33aa91d0

Last test of basis   167228  2021-12-08 05:42:12 Z7 days
Failing since167417  2021-12-14 21:09:04 Z1 days2 attempts
Testing same since   167426  2021-12-15 09:11:59 Z0 days1 attempts


People who touched r

Re: [PATCH] xen/arm: vpci: Remove PCI I/O ranges property value

2021-12-15 Thread Stefano Stabellini
On Tue, 14 Dec 2021, Rahul Singh wrote:
> IO ports on ARM don't exist so all IO ports related hypercalls are going
> to fail on ARM when we passthrough a PCI device.
> Failure of xc_domain_ioport_permission(..) would turn into a critical
> failure at domain creation. We need to avoid this outcome, instead we
> want to continue with domain creation as normal even if
> xc_domain_ioport_permission(..) fails. XEN_DOMCTL_ioport_permission
> is not implemented on ARM so it would return -ENOSYS.
> 
> To solve above issue remove PCI I/O ranges property value from dom0
> device tree node so that dom0 linux will not allocate I/O space for PCI
> devices if pci-passthrough is enabled.
> 
> Another valid reason to remove I/O ranges is that PCI I/O space are not
> mapped to dom0 when PCI passthrough is enabled, also there is no vpci
> trap handler register for IO bar.
> 
> Signed-off-by: Rahul Singh 
> ---
>  xen/arch/arm/domain_build.c   | 14 +++
>  xen/common/device_tree.c  | 72 +++
>  xen/include/xen/device_tree.h | 10 +
>  3 files changed, 96 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index d02bacbcd1..60f6b2c73b 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -749,6 +749,11 @@ static int __init write_properties(struct domain *d, 
> struct kernel_info *kinfo,
>  continue;
>  }
>  
> +if ( is_pci_passthrough_enabled() &&
> +dt_device_type_is_equal(node, "pci") )
> +if ( dt_property_name_is_equal(prop, "ranges") )
> +continue;

It looks like we are skipping the "ranges" property entirely for the PCI
node, is that right? Wouldn't that also remove the other (not ioports)
address ranges?


>  res = fdt_property(kinfo->fdt, prop->name, prop_data, prop_len);
>  
>  if ( res )
> @@ -769,6 +774,15 @@ static int __init write_properties(struct domain *d, 
> struct kernel_info *kinfo,
>  if ( res )
>  return res;
>  }
> +
> +/*
> + * PCI IO bar are not mapped to dom0 when PCI passthrough is enabled,
> + * also there is no trap handler registered for IO bar therefor 
> remove
> + * the IO range property from the device tree node for dom0.
> + */
> +res = dt_pci_remove_io_ranges(kinfo->fdt, node);
> +if ( res )
> +return res;

I tried to apply this patch to staging to make it easier to review but I
think this chuck got applied wrongly: I can see
dt_pci_remove_io_ranges() added to the function "handle_prop_pfdt" which
is for guest partial DTBs and not for dom0.

Is dt_pci_remove_io_ranges() meant to be called from write_properties
instead? It looks like it would be best to call it from
write_properties, maybe it could be combined with the other new check
just above in this patch?


>  /*
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 4aae281e89..9fa25f6723 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -2195,6 +2195,78 @@ int dt_get_pci_domain_nr(struct dt_device_node *node)
>  return (u16)domain;
>  }
>  
> +int dt_pci_remove_io_ranges(void *fdt, const struct dt_device_node *dev)
> +{
> +const struct dt_device_node *parent = NULL;
> +const struct dt_bus *bus, *pbus;
> +unsigned int rlen;
> +int na, ns, pna, pns, rone, ret;
> +const __be32 *ranges;
> +__be32 regs[((GUEST_ROOT_ADDRESS_CELLS * 2) + GUEST_ROOT_SIZE_CELLS + 1)
> +   * 2];
> +__be32 *addr = ®s[0];
> +
> +bus = dt_match_bus(dev);
> +if ( !bus )
> +return 0; /* device is not a bus */
> +
> +parent = dt_get_parent(dev);
> +if ( parent == NULL )
> +return -EINVAL;
> +
> +ranges = dt_get_property(dev, "ranges", &rlen);
> +if ( ranges == NULL )
> +{
> +printk(XENLOG_ERR "DT: no ranges; cannot enumerate %s\n",
> +   dev->full_name);
> +return -EINVAL;
> +}
> +if ( rlen == 0 ) /* Nothing to do */
> +return 0;
> +
> +bus->count_cells(dev, &na, &ns);
> +if ( !DT_CHECK_COUNTS(na, ns) )
> +{
> +printk(XENLOG_ERR "dt_parse: Bad cell count for device %s\n",
> +  dev->full_name);
> +return -EINVAL;
> +}
> +pbus = dt_match_bus(parent);
> +if ( pbus == NULL )
> +{
> +printk("DT: %s is not a valid bus\n", parent->full_name);
> +return -EINVAL;
> +}
> +
> +pbus->count_cells(dev, &pna, &pns);
> +if ( !DT_CHECK_COUNTS(pna, pns) )
> +{
> +printk(XENLOG_ERR "dt_parse: Bad cell count for parent %s\n",
> +   dev->full_name);
> +return -EINVAL;
> +}
> +/* Now walk through the ranges */
> +rlen /= 4;
> +rone = na + pna + ns;
> +
> +for ( ; rlen >= rone; rlen -= rone, ranges += rone )
> +{
> +unsigned int flags = bus->get_flags(ranges);

Re: [PATCH v3 02/13] xen: harmonize return types of hypercall handlers

2021-12-15 Thread Stefano Stabellini
On Wed, 15 Dec 2021, Juergen Gross wrote:
> On 14.12.21 18:36, Julien Grall wrote:
> > Hi,
> > 
> > On 08/12/2021 15:55, Juergen Gross wrote:
> > > Today most hypercall handlers have a return type of long, while the
> > > compat ones return an int. There are a few exceptions from that rule,
> > > however.
> > 
> > So on Arm64, I don't think you can make use of the full 64-bit because a
> > 32-bit domain would not be able to see the top 32-bit.
> > 
> > In fact, this could potentially cause us some trouble (see [1]) in Xen.
> > So it feels like the hypercalls should always return a 32-bit signed value
> > on Arm.
> 
> This would break hypercalls like XENMEM_maximum_ram_page which are able
> to return larger values, right?
> 
> > The other advantage is it would be clear that the top 32-bit are not
> > usuable. Stefano, what do you think?
> 
> Wouldn't it make more sense to check the return value to be a sign
> extended 32-bit value for 32-bit guests in do_trap_hypercall() instead?
> 
> The question is what to return if this is not the case. -EDOM?


I can see where Julien is coming from: we have been trying to keep the
arm32 and arm64 ABIs identical since the beginning of the project. So,
like Julien, my preference would be to always return 32-bit on ARM, both
aarch32 and aarch64. It would make things simple.

The case of XENMEM_maximum_ram_page is interesting but it is not a
problem in reality because the max physical address size is only 40-bit
for aarch32 guests, so 32-bit are always enough to return the highest
page in memory for 32-bit guests.

So XENMEM_maximum_ram_page could be the exception and return "long"
which is 4 bytes on aarch32 and 8 bytes on aarch64, and it is exactly
what we need.



Re: [patch V3 00/35] genirq/msi, PCI/MSI: Spring cleaning - Part 2

2021-12-15 Thread Nishanth Menon
Hi Thomas,

On 17:35-20211215, Thomas Gleixner wrote:
>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git 
> msi-v4.2-part-3

As you helped offline, summarizing the details on part3 of the series:

I was seeing failure[1] of NFS(DMA) on all TI K3 platforms:

[1.013258] ti-bcdma 485c0100.dma-controller: Number of rings: 68
[1.019963] ti-bcdma 485c0100.dma-controller: Failed to allocate IRQs -28
[1.026938] ti-bcdma 485c0100.dma-controller: Failed to allocate MSI 
interrupts

Rationale as you explained:
"
-28 is ENOSPC, which is returned when the interrupt allocation in the
 MSI domain fails. Fix below.
"

Which turned out to be the fixup[2] you suggested and I confirm that
fixes the problem for me.

With the fixup in place:

Tested-by: Nishanth Menon 

for part 3 of the series as well.

Thanks once again for your help. Hope we can roll in the fixes for
part3.

[1] https://gist.github.com/nmenon/5971ab27aa626c022e276cc946e4b6c3
[2]
--- a/drivers/soc/ti/ti_sci_inta_msi.c
+++ b/drivers/soc/ti/ti_sci_inta_msi.c
@@ -68,6 +68,7 @@ static int ti_sci_inta_msi_alloc_descs(s
int set, i, count = 0;
 
memset(&msi_desc, 0, sizeof(msi_desc));
+   msi_desc.nvec_used = 1;
 
for (set = 0; set < res->sets; set++) {
for (i = 0; i < res->desc[set].num; i++, count++) {
-- 
Regards,
Nishanth Menon
Key (0xDDB5849D1736249D)/Fingerprint: F8A2 8693 54EB 8232 17A3  1A34 DDB5 849D 
1736 249D



Re: [RFC v1 0/5] Introduce SCI-mediator feature

2021-12-15 Thread Oleksandr



On 14.12.21 11:34, Oleksii Moisieiev wrote:


Hi Oleksii


Introducing the feature, called SCI mediator.
It's purpose is to redirect SCMI requests from the domains to firmware
(SCP, ATF etc), which controls the power/clock/resets etc.
The idea is to make SCP firmware (or similar, such as AT-F) responsible for
control power/clock/resets and provide SCMI interface so controls can be shared
between the Domains.
Originally, we've met a problem, that the devices, shared between different
Domains, can't have an access to HW registers to work with clocks/resets/power
etc. You have to pass cpg to the Domain, so the devices can access HW directly.
The solution for this is to move HW controls over power/clock/resets to
SCP firmware and use Linux-kernel SCMI drivers to pass requests to SCP.
Xen is responsible for permissions setting, so Domain can access only to
power/clock/resets which are related to this Domain. Also XEN is the mediator
which redirects SCMI requests, adding agentID so firmware should know the
sender.
SMC is currently used as transport, but this should be configurable.

Here is the high level design:

SCI (System Control Interface) feature can be enabled in xen_config:

CONFIG_SCI=y

Mediator can be configured:

CONFIG_SCMI_SMC=y

Currently, only SCMI_SMC mediator is implemented, which using shared memory
region to communicate with firmware and SMC as transport.

Xen scmi should be configured in the device-tree.
Format is the following:
cpu_scp_shm: scp-shmem@0x53FF {
compatible = "arm,scmi-shmem";
reg = <0x0 0x53FF 0x0 0x1000>;
};

firmware {
scmi {
compatible = "arm,scmi-smc";
arm,smc-id = <0x8202>;
shmem = <&cpu_scp_shm>;
#address-cells = <1>;
#size-cells = <0>;

scmi_power: protocol@11 {
reg = <0x11>;
#power-domain-cells = <1>;
};

scmi_clock: protocol@14 {
reg = <0x14>;
#clock-cells = <1>;
};

scmi_reset: protocol@16 {
reg = <0x16>;
#reset-cells = <1>;
};
};
};

Where:
&cpu_scp_shm is the shared memory for scmi buffers;
0x53FF, size 0x1000 is the platform specific free address, which provide
space for the communication.
&scmi node, which should be copied to Dom0 device-tree.

Device configured to use scmi:
&avb {
scmi_devid = <0>;
clocks = <&scmi_clock 0>;
power-domains = <&scmi_power 0>;
resets = <&scmi_reset 0>;
};

Where:
scmi_devid - id from the firmware, which is assigned for AVB.

During initialization, XEN scans probes the first SCI-mediator driver which has
matching node in the device-tree. If no device-tree was provided, then the
first registered mediator driver should be probed.

DomX should be configured:
Device-tree should include the same nodes, described above.
&cpu_scp_shm should be altered during domain creation. Xen allocates free page
from the memory region, provided in &cpu_scp_shm in XEN device-tree, so each
domain should have unique page. Nodes &cpu_scp_shm and /firmware/scmi should be
copied from partial device-tree to domain device-tree, so kernel can initialize
scmi driver.

SCI mediator can be enabled in dom.cfg the following way:

sci = "scmi_smc"

which sets scmi_smc to be used for the domain.



Great work! I can imagine this is going to be nice feature once upstreamed.

I am wondering, would the Xen (with the required updates of course) also 
be able to send it's own requests to the SCP? For example, to control 
overall system performance (CPU frequency)


or other let's say important power management task.




Oleksii Moisieiev (5):
   xen/arm: add support for Renesas R-Car Gen3 platform
   xen/arm: add generic SCI mediator framework
   xen/arm: introduce SCMI-SMC mediator driver
   tools/arm: add "scmi_smc" option to xl.cfg
   xen/arm: add SCI mediator support for DomUs

  MAINTAINERS   |   6 +
  docs/man/xl.cfg.5.pod.in  |  22 +
  tools/include/libxl.h |   5 +
  tools/include/xenctrl.h   |   3 +
  tools/include/xenguest.h  |   2 +
  tools/libs/ctrl/xc_domain.c   |  23 +
  tools/libs/guest/xg_dom_arm.c |   5 +-
  tools/libs/light/libxl_arm.c  | 122 -
  tools/libs/light/libxl_create.c   |  54 +-
  tools/libs/light/libxl_dom.c  |   1 +
  tools/libs/light/libxl_internal.h |   4 +
  tools/libs/light/libxl_types.idl  |   6 +
  tools/xl/xl_parse.c   |   9 +
  xen/arch/arm/Kconfig  |  10 +
  xen/arch/arm/Makefile |   1 +
  xen/arch/arm/domain.c |  24 +
  xen

Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs

2021-12-15 Thread Oleksandr



On 14.12.21 11:34, Oleksii Moisieiev wrote:

Hi Oleksii


Integration of the SCMI mediator with xen libs:
- add hypercalls to aquire SCI channel and set device permissions
for DomUs;
- add SCMI_SMC nodes to DomUs device-tree based on partial device-tree;
- SCI requests redirection from DomUs to Firmware.

Signed-off-by: Oleksii Moisieiev 
---
  tools/include/xenctrl.h   |   3 +
  tools/include/xenguest.h  |   2 +
  tools/libs/ctrl/xc_domain.c   |  23 ++
  tools/libs/guest/xg_dom_arm.c |   5 +-
  tools/libs/light/libxl_arm.c  | 122 +++---
  tools/libs/light/libxl_create.c   |  54 -
  tools/libs/light/libxl_dom.c  |   1 +
  tools/libs/light/libxl_internal.h |   4 +
  xen/arch/arm/domctl.c |  15 
  xen/include/public/domctl.h   |   9 +++
  10 files changed, 223 insertions(+), 15 deletions(-)

diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 07b96e6671..cdd14f465f 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -1238,6 +1238,9 @@ int xc_domain_getvnuma(xc_interface *xch,
  int xc_domain_soft_reset(xc_interface *xch,
   uint32_t domid);
  
+int xc_domain_add_sci_device(xc_interface *xch,

+  uint32_t domid, char *path);
+
  #if defined(__i386__) || defined(__x86_64__)
  /*
   * PC BIOS standard E820 types and structure.
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index 61d0a82f48..35c611ac73 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -242,6 +242,8 @@ struct xc_dom_image {
  
  /* Number of vCPUs */

  unsigned int max_vcpus;
+
+xen_pfn_t sci_shmem_gfn;
  };
  
  /* --- arch specific hooks - */

diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index b155d6afd2..07bb390e17 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -2206,6 +2206,29 @@ int xc_domain_soft_reset(xc_interface *xch,
  domctl.domain = domid;
  return do_domctl(xch, &domctl);
  }
+
+int xc_domain_add_sci_device(xc_interface *xch,
+  uint32_t domid, char *path)
+{
+size_t size = strlen(path);
+int rc;
+DECLARE_DOMCTL;
+DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+if ( xc_hypercall_bounce_pre(xch, path) )
+return -1;
+
+domctl.cmd = XEN_DOMCTL_add_sci_device;
+domctl.domain = domid;
+domctl.u.sci_device_op.size = size;
+set_xen_guest_handle(domctl.u.sci_device_op.path, path);
+rc = do_domctl(xch, &domctl);
+
+xc_hypercall_bounce_post(xch, path);
+
+return rc;
+}
+
  /*
   * Local variables:
   * mode: C
diff --git a/tools/libs/guest/xg_dom_arm.c b/tools/libs/guest/xg_dom_arm.c
index 5e3b76355e..368a670c46 100644
--- a/tools/libs/guest/xg_dom_arm.c
+++ b/tools/libs/guest/xg_dom_arm.c
@@ -25,11 +25,12 @@
  
  #include "xg_private.h"
  
-#define NR_MAGIC_PAGES 4

+#define NR_MAGIC_PAGES 5
  #define CONSOLE_PFN_OFFSET 0
  #define XENSTORE_PFN_OFFSET 1
  #define MEMACCESS_PFN_OFFSET 2
  #define VUART_PFN_OFFSET 3
+#define SCI_SHMEM_OFFSET 4
  
  #define LPAE_SHIFT 9
  
@@ -69,11 +70,13 @@ static int alloc_magic_pages(struct xc_dom_image *dom)

  dom->console_pfn = base + CONSOLE_PFN_OFFSET;
  dom->xenstore_pfn = base + XENSTORE_PFN_OFFSET;
  dom->vuart_gfn = base + VUART_PFN_OFFSET;
+dom->sci_shmem_gfn = base + SCI_SHMEM_OFFSET;
  
  xc_clear_domain_page(dom->xch, dom->guest_domid, dom->console_pfn);

  xc_clear_domain_page(dom->xch, dom->guest_domid, dom->xenstore_pfn);
  xc_clear_domain_page(dom->xch, dom->guest_domid, base + 
MEMACCESS_PFN_OFFSET);
  xc_clear_domain_page(dom->xch, dom->guest_domid, dom->vuart_gfn);
+xc_clear_domain_page(dom->xch, dom->guest_domid, dom->sci_shmem_gfn);
  
  xc_hvm_param_set(dom->xch, dom->guest_domid, HVM_PARAM_CONSOLE_PFN,

  dom->console_pfn);
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0939..73ef591822 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -8,6 +8,8 @@
  #include 
  #include 
  
+#define SCMI_NODE_PATH  "/firmware/scmi"

+
  static const char *gicv_to_string(libxl_gic_version gic_version)
  {
  switch (gic_version) {
@@ -101,6 +103,19 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
  return ERROR_FAIL;
  }
  
+switch (d_config->b_info.sci) {

+case LIBXL_SCI_TYPE_NONE:
+config->arch.sci_type = XEN_DOMCTL_CONFIG_SCI_NONE;
+break;
+case LIBXL_SCI_TYPE_SCMI_SMC:
+config->arch.sci_type = XEN_DOMCTL_CONFIG_SCI_SCMI_SMC;
+break;
+default:
+LOG(ERROR, "Unknown SCI type %d",
+d_config->b_info.sci);
+return ERROR_FAIL;
+}
+
  return 0;
  }
  
@@ -122,6 +137,7 @@ int libxl__arch_domain_save_config(libxl__gc *gc,

  }
  
  s

[xen-unstable-smoke test] 167437: tolerable all pass - PUSHED

2021-12-15 Thread osstest service owner
flight 167437 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167437/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  dc27c174b2fc1ba0697ed96dc96066c272e09c24
baseline version:
 xen  8b3cbdbe782cae972e9a47cf22620ebee61a96a6

Last test of basis   167435  2021-12-15 15:03:05 Z0 days
Testing same since   167437  2021-12-15 19:01:40 Z0 days1 attempts


People who touched revisions under test:
  Julien Grall 
  Oleksandr Andrushchenko 
  Rahul Singh 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   8b3cbdbe78..dc27c174b2  dc27c174b2fc1ba0697ed96dc96066c272e09c24 -> smoke



[PATCH v2 3/4] x86/cpuid: Introduce dom0-cpuid command line option

2021-12-15 Thread Andrew Cooper
Specifically, this lets the user opt in to non-default for dom0.

Collect all dom0 settings together in dom0_{en,dis}able_feat[], and apply it
to dom0's policy when other tweaks are being made.

As recalculate_cpuid_policy() is an expensive action, and dom0-cpuid= is
likely to only be used by the x86 maintainers for development purposes, forgo
the recalculation in the general case.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Rework almost from scratch, on top of broken-out changes.
---
 docs/misc/xen-command-line.pandoc | 17 +
 xen/arch/x86/cpuid.c  | 36 
 2 files changed, 53 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc 
b/docs/misc/xen-command-line.pandoc
index f7797ea233f9..383a854dec60 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -801,6 +801,23 @@ Controls for how dom0 is constructed on x86 systems.
 
 If using this option is necessary to fix an issue, please report a bug.
 
+### dom0-cpuid
+= List of comma separated booleans
+
+Applicability: x86
+
+This option allows for fine tuning of the facilities dom0 will use, after
+accounting for hardware capabilities and Xen settings as enumerated via CPUID.
+
+Options are accepted in positive and negative form, to enable or disable
+specific features, but specify both forms of the same option is undefined.
+All selections via this mechanism are subject to normal CPU Policy safety and
+dependency logic.
+
+This option is intended for developers to opt dom0 into non-default features,
+and is not intended for use in production circumstances.  If using this option
+is necessary to fix an issue, please report a bug.
+
 ### dom0-iommu
 = List of [ passthrough=, strict=, map-inclusive=,
 map-reserved=, none ]
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e11f5a3c9a6b..83a80ba6de70 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -116,6 +116,23 @@ static int __init parse_xen_cpuid(const char *s)
 }
 custom_param("cpuid", parse_xen_cpuid);
 
+static bool __initdata dom0_cpuid_cmdline;
+static uint32_t __initdata dom0_enable_feat[FSCAPINTS];
+static uint32_t __initdata dom0_disable_feat[FSCAPINTS];
+
+static void __init _parse_dom0_cpuid(unsigned int feat, bool val)
+{
+__set_bit(feat, val ? dom0_enable_feat : dom0_disable_feat);
+}
+
+static int __init parse_dom0_cpuid(const char *s)
+{
+dom0_cpuid_cmdline = true;
+
+return parse_cpuid(s, _parse_dom0_cpuid);
+}
+custom_param("dom0-cpuid", parse_dom0_cpuid);
+
 #define EMPTY_LEAF ((struct cpuid_leaf){})
 static void zero_leaves(struct cpuid_leaf *l,
 unsigned int first, unsigned int last)
@@ -768,6 +785,25 @@ void __init init_dom0_cpuid_policy(struct domain *d)
  */
 if ( cpu_has_arch_caps )
 p->feat.arch_caps = true;
+
+/* Apply dom0-cpuid= command line settings, if provided. */
+if ( dom0_cpuid_cmdline )
+{
+uint32_t fs[FSCAPINTS];
+unsigned int i;
+
+cpuid_policy_to_featureset(p, fs);
+
+for ( i = 0; i < ARRAY_SIZE(fs); ++i )
+{
+fs[i] |= dom0_enable_feat[i];
+fs[i] &= ~dom0_disable_feat[i];
+}
+
+cpuid_featureset_to_policy(fs, p);
+
+recalculate_cpuid_policy(d);
+}
 }
 
 void guest_cpuid(const struct vcpu *v, uint32_t leaf,
-- 
2.11.0




[PATCH v2 2/4] x86/cpuid: Factor common parsing out of parse_xen_cpuid()

2021-12-15 Thread Andrew Cooper
dom0-cpuid= is going to want to reuse the common parsing loop, so factor it
out into parse_cpuid().

Irritatingly, despite being static const, the features[] array gets duplicated
each time parse_cpuid() is inlined.  As it is a large (and ever growing with
new CPU features) datastructure, move it to being file scope so all inlines
use the same single object.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New

We probably want to be wary of fallout from this pattern elsewhere.  I only
noticed it by chance.
---
 xen/arch/x86/cpuid.c | 45 -
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index f63f5efc17f5..e11f5a3c9a6b 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -26,17 +26,26 @@ static const uint32_t __initconst hvm_hap_def_featuremask[] 
=
 INIT_HVM_HAP_DEF_FEATURES;
 static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
 
-static int __init parse_xen_cpuid(const char *s)
+static const struct feature_name {
+const char *name;
+unsigned int bit;
+} feature_names[] __initconstrel = INIT_FEATURE_NAMES;
+
+/*
+ * Parse a list of cpuid feature names -> bool, calling the callback for any
+ * matches found.
+ *
+ * always_inline, because this is init code only and we really don't want a
+ * function pointer call in the middle of the loop.
+ */
+static int __init always_inline parse_cpuid(
+const char *s, void (*callback)(unsigned int feat, bool val))
 {
 const char *ss;
 int val, rc = 0;
 
 do {
-static const struct feature {
-const char *name;
-unsigned int bit;
-} features[] __initconstrel = INIT_FEATURE_NAMES;
-const struct feature *lhs, *rhs, *mid = NULL /* GCC... */;
+const struct feature_name *lhs, *rhs, *mid = NULL /* GCC... */;
 const char *feat;
 
 ss = strchr(s, ',');
@@ -49,8 +58,8 @@ static int __init parse_xen_cpuid(const char *s)
 feat += 3;
 
 /* (Re)initalise lhs and rhs for binary search. */
-lhs = features;
-rhs = features + ARRAY_SIZE(features);
+lhs = feature_names;
+rhs = feature_names + ARRAY_SIZE(feature_names);
 
 while ( lhs < rhs )
 {
@@ -72,11 +81,7 @@ static int __init parse_xen_cpuid(const char *s)
 
 if ( (val = parse_boolean(mid->name, s, ss)) >= 0 )
 {
-if ( !val )
-setup_clear_cpu_cap(mid->bit);
-else if ( mid->bit == X86_FEATURE_RDRAND &&
-  (cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_RDRAND)) )
-setup_force_cpu_cap(X86_FEATURE_RDRAND);
+callback(mid->bit, val);
 mid = NULL;
 }
 
@@ -95,6 +100,20 @@ static int __init parse_xen_cpuid(const char *s)
 
 return rc;
 }
+
+static void __init _parse_xen_cpuid(unsigned int feat, bool val)
+{
+if ( !val )
+setup_clear_cpu_cap(feat);
+else if ( feat == X86_FEATURE_RDRAND &&
+  (cpuid_ecx(1) & cpufeat_mask(X86_FEATURE_RDRAND)) )
+setup_force_cpu_cap(X86_FEATURE_RDRAND);
+}
+
+static int __init parse_xen_cpuid(const char *s)
+{
+return parse_cpuid(s, _parse_xen_cpuid);
+}
 custom_param("cpuid", parse_xen_cpuid);
 
 #define EMPTY_LEAF ((struct cpuid_leaf){})
-- 
2.11.0




[PATCH v2 1/4] x86/cpuid: Split dom0 handling out of init_domain_cpuid_policy()

2021-12-15 Thread Andrew Cooper
To implement dom0-cpuid= support, the special cases would need extending.
However there is already a problem with late hwdom where the special cases
override toolstack settings, which is unintended and poor behaviour.

Introduce a new init_dom0_cpuid_policy() for the purpose, moving the ITSC and
ARCH_CAPS logic.  The is_hardware_domain() can be dropped, and for now there
is no need to rerun recalculate_cpuid_policy(); this is a relatively expensive
operation, and will become more-so over time.

Rearrange the logic in create_dom0() to make room for a call to
init_dom0_cpuid_policy().  The AMX plans for having variable sized XSAVE
states require that modifications to the policy happen before vCPUs are
created.

Additionally, factor out domid into a variable so we can be slightly more
correct in the case of a failure, and also print the error from
domain_create().  This will at least help distinguish -EINVAL from -ENOMEM.

No practical change in behaviour.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New

It is slightly weird now having CPUID split into two helpers, while the MSR
side of things remains as a special case.  It doesn't actually matter right
now, and it is going to have to change anyway in due course, so I've decided
to go with the simple option for now.
---
 xen/arch/x86/cpuid.c | 25 +++--
 xen/arch/x86/include/asm/cpuid.h |  3 +++
 xen/arch/x86/setup.c | 15 +++
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 151944f65702..f63f5efc17f5 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -727,23 +727,28 @@ int init_domain_cpuid_policy(struct domain *d)
 if ( !p )
 return -ENOMEM;
 
-/* The hardware domain can't migrate.  Give it ITSC if available. */
-if ( is_hardware_domain(d) )
-p->extd.itsc = cpu_has_itsc;
+d->arch.cpuid = p;
+
+recalculate_cpuid_policy(d);
+
+return 0;
+}
+
+void __init init_dom0_cpuid_policy(struct domain *d)
+{
+struct cpuid_policy *p = d->arch.cpuid;
+
+/* dom0 can't migrate.  Give it ITSC if available. */
+if ( cpu_has_itsc )
+p->extd.itsc = true;
 
 /*
  * Expose the "hardware speculation behaviour" bits of ARCH_CAPS to dom0,
  * so dom0 can turn off workarounds as appropriate.  Temporary, until the
  * domain policy logic gains a better understanding of MSRs.
  */
-if ( is_hardware_domain(d) && cpu_has_arch_caps )
+if ( cpu_has_arch_caps )
 p->feat.arch_caps = true;
-
-d->arch.cpuid = p;
-
-recalculate_cpuid_policy(d);
-
-return 0;
 }
 
 void guest_cpuid(const struct vcpu *v, uint32_t leaf,
diff --git a/xen/arch/x86/include/asm/cpuid.h b/xen/arch/x86/include/asm/cpuid.h
index 46904061d0ef..9c3637549a10 100644
--- a/xen/arch/x86/include/asm/cpuid.h
+++ b/xen/arch/x86/include/asm/cpuid.h
@@ -59,6 +59,9 @@ bool recheck_cpu_features(unsigned int cpu);
 /* Allocate and initialise a CPUID policy suitable for the domain. */
 int init_domain_cpuid_policy(struct domain *d);
 
+/* Apply dom0-specific tweaks to the CPUID policy. */
+void init_dom0_cpuid_policy(struct domain *d);
+
 /* Clamp the CPUID policy to reality. */
 void recalculate_cpuid_policy(struct domain *d);
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index f40a9fe5d351..e716005145d3 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -772,6 +772,7 @@ static struct domain *__init create_dom0(const module_t 
*image,
 };
 struct domain *d;
 char *cmdline;
+domid_t domid;
 
 if ( opt_dom0_pvh )
 {
@@ -786,10 +787,16 @@ static struct domain *__init create_dom0(const module_t 
*image,
 if ( iommu_enabled )
 dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
-/* Create initial domain 0. */
-d = domain_create(get_initial_domain_id(), &dom0_cfg, !pv_shim);
-if ( IS_ERR(d) || (alloc_dom0_vcpu0(d) == NULL) )
-panic("Error creating domain 0\n");
+/* Create initial domain.  Not d0 for pvshim. */
+domid = get_initial_domain_id();
+d = domain_create(domid, &dom0_cfg, !pv_shim);
+if ( IS_ERR(d) )
+panic("Error creating d%u: %ld\n", domid, PTR_ERR(d));
+
+init_dom0_cpuid_policy(d);
+
+if ( alloc_dom0_vcpu0(d) == NULL )
+panic("Error creating d%uv0\n", domid);
 
 /* Grab the DOM0 command line. */
 cmdline = image->string ? __va(image->string) : NULL;
-- 
2.11.0




[PATCH v2 4/4] x86/cpuid: Advertise SERIALIZE by default to guests

2021-12-15 Thread Andrew Cooper
I've played with SERIALIZE, TSXLDTRK, MOVDIRI and MOVDIR64 on real hardware,
and they all seem fine, including emulation support.

SERIALIZE exists specifically to have a userspace usable serialising operation
without other side effects.  (The only other two choices are CPUID which is a
VMExit under virt and clobbers 4 registers, and IRET-to-self which very slow
and consumes content from the stack.)

TSXLDTRK is a niche TSX feature, and TSX itself is niche outside of demos of
speculative sidechannels.  Leave the feature opt-in until a usecase is found,
in an effort to preempt the multiple person years of effort it has taken to
mop up TSX issues impacting every processor line.

MOVDIRI and MOVDIR64 are harder to judge.  They're architectural building
blocks towards ENQCMD{,S} without obvious usecases on their own.  They're of
no use to domains without PCI devices, so leave them opt-in for now.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * New
---
 xen/include/public/arch-x86/cpufeatureset.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/public/arch-x86/cpufeatureset.h 
b/xen/include/public/arch-x86/cpufeatureset.h
index 647ee9e5e277..0b399375566f 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -278,7 +278,7 @@ XEN_CPUFEATURE(SRBDS_CTRL,9*32+ 9) /*   
MSR_MCU_OPT_CTRL and RNGDS_MITG_DIS.
 XEN_CPUFEATURE(MD_CLEAR,  9*32+10) /*A  VERW clears microarchitectural 
buffers */
 XEN_CPUFEATURE(RTM_ALWAYS_ABORT, 9*32+11) /*! June 2021 TSX defeaturing in 
microcode. */
 XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */
-XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*a  SERIALIZE insn */
+XEN_CPUFEATURE(SERIALIZE, 9*32+14) /*A  SERIALIZE insn */
 XEN_CPUFEATURE(TSXLDTRK,  9*32+16) /*a  TSX load tracking suspend/resume 
insns */
 XEN_CPUFEATURE(CET_IBT,   9*32+20) /*   CET - Indirect Branch Tracking */
 XEN_CPUFEATURE(IBRSB, 9*32+26) /*A  IBRS and IBPB support (used by 
Intel) */
-- 
2.11.0




[PATCH v2 0/4] x86/cpuid: Introduce dom0-cpuid=

2021-12-15 Thread Andrew Cooper
Andrew Cooper (4):
  x86/cpuid: Split dom0 handling out of init_domain_cpuid_policy()
  x86/cpuid: Factor common parsing out of parse_xen_cpuid()
  x86/cpuid: Introduce dom0-cpuid command line option
  x86/cpuid: Advertise SERIALIZE by default to guests

 docs/misc/xen-command-line.pandoc   |  17 +
 xen/arch/x86/cpuid.c| 100 ++--
 xen/arch/x86/include/asm/cpuid.h|   3 +
 xen/arch/x86/setup.c|  15 +++--
 xen/include/public/arch-x86/cpufeatureset.h |   2 +-
 5 files changed, 112 insertions(+), 25 deletions(-)

-- 
2.11.0




[xen-unstable test] 167418: regressions - FAIL

2021-12-15 Thread osstest service owner
flight 167418 xen-unstable real [real]
flight 167439 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167418/
http://logs.test-lab.xenproject.org/osstest/logs/167439/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. 
vs. 167411

Tests which are failing intermittently (not blocking):
 test-amd64-i386-qemuu-rhel6hvm-amd 14 guest-start/redhat.repeat fail pass in 
167439-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167411
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167411
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167411
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167411
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167411
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167411
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167411
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167411
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167411
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167411
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167411
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167411
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-check  

[linux-5.4 test] 167422: tolerable FAIL - PUSHED

2021-12-15 Thread osstest service owner
flight 167422 linux-5.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167422/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-debianhvm-amd64 18 guest-localmigrate/x10 fail in 
167413 pass in 167422
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 18 
guest-localmigrate/x10 fail in 167413 pass in 167422
 test-arm64-arm64-libvirt-raw 13 guest-startfail pass in 167413

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail in 
167413 like 167230
 test-arm64-arm64-libvirt-raw 14 migrate-support-check fail in 167413 never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-check fail in 167413 never 
pass
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167230
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167230
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167230
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167230
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167230
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167230
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167230
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167230
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167230
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167230
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167230
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167230
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvi

Re: [RFC v1 4/5] tools/arm: add "scmi_smc" option to xl.cfg

2021-12-15 Thread Oleksandr



On 14.12.21 11:34, Oleksii Moisieiev wrote:


Hi Oleksii


This enumeration sets SCI type for the domain. Currently there is
two possible options: either 'none' or 'scmi_smc'.

'none' is the default value and it disables SCI support at all.

'scmi_smc' enables access to the Firmware from the domains using SCMI
protocol and SMC as transport.

Signed-off-by: Oleksii Moisieiev 
---
  docs/man/xl.cfg.5.pod.in | 22 ++
  tools/include/libxl.h|  5 +
  tools/libs/light/libxl_types.idl |  6 ++
  tools/xl/xl_parse.c  |  9 +
  4 files changed, 42 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index b98d161398..92d0593875 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -1614,6 +1614,28 @@ This feature is a B.
  
  =back
  
+=item B

+
+B Set SCI type for the guest. SCI is System Control Protocol -
+allows domain to manage various functions that are provided by HW platform.
+
+=over 4
+
+=item B
+
+Don't allow guest to use SCI if present on the platform. This is the default
+value.
+
+=item B
+
+Enables SCMI-SMC support for the guest. SCMI is System Control Management
+Inferface - allows domain to manage various functions that are provided by HW
+platform, such as clocks, resets and power-domains. Xen will mediate access to
+clocks, power-domains and resets between Domains and ATF. Disabled by default.
+SMC is used as transport.
+
+=back
+
  =back
  
  =head2 Paravirtualised (PV) Guest Specific Options

diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 2bbbd21f0b..30e5aee119 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -278,6 +278,11 @@
   */
  #define LIBXL_HAVE_BUILDINFO_ARCH_ARM_TEE 1
  
+/*

+ * libxl_domain_build_info has the arch_arm.sci field.
+ */
+#define LIBXL_HAVE_BUILDINFO_ARCH_ARM_SCI 1
+
  /*
   * LIBXL_HAVE_SOFT_RESET indicates that libxl supports performing
   * 'soft reset' for domains and there is 'soft_reset' shutdown reason
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 2a42da2f7d..9067b509f4 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl


I assume that at least goland bindings want updating.



@@ -480,6 +480,11 @@ libxl_tee_type = Enumeration("tee_type", [
  (1, "optee")
  ], init_val = "LIBXL_TEE_TYPE_NONE")
  
+libxl_sci_type = Enumeration("sci_type", [

+(0, "none"),
+(1, "scmi_smc")
+], init_val = "LIBXL_SCI_TYPE_NONE")
+
  libxl_rdm_reserve = Struct("rdm_reserve", [
  ("strategy",libxl_rdm_reserve_strategy),
  ("policy",  libxl_rdm_reserve_policy),
@@ -564,6 +569,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
  ("apic", libxl_defbool),
  ("dm_restrict",  libxl_defbool),
  ("tee",  libxl_tee_type),
+("sci",  libxl_sci_type),
  ("u", KeyedUnion(None, libxl_domain_type, "type",
  [("hvm", Struct(None, [("firmware", string),
 ("bios", libxl_bios_type),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 117fcdcb2b..c37bf6298b 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -2747,6 +2747,15 @@ skip_usbdev:
  }
  }
  
+if (!xlu_cfg_get_string (config, "sci", &buf, 1)) {

+e = libxl_sci_type_from_string(buf, &b_info->sci);
+if (e) {
+fprintf(stderr,
+"Unknown sci \"%s\" specified\n", buf);
+exit(-ERROR_FAIL);
+}
+}
+
  parse_vkb_list(config, d_config);
  
  xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat",


--
Regards,

Oleksandr Tyshchenko




Re: [PATCH V6 1/2] libxl: Add support for Virtio disk configuration

2021-12-15 Thread Oleksandr



On 15.12.21 17:58, Juergen Gross wrote:


Hi Juergen



On 15.12.21 16:02, Oleksandr wrote:


On 15.12.21 08:08, Juergen Gross wrote:

Hi Juergen


On 14.12.21 18:44, Oleksandr wrote:


On 14.12.21 18:03, Anthony PERARD wrote:

Hi Anthony



On Wed, Dec 08, 2021 at 06:59:43PM +0200, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

This patch adds basic support for configuring and assisting 
virtio-disk
backend (emualator) which is intended to run out of Qemu and 
could be

run in any domain.
Although the Virtio block device is quite different from traditional
Xen PV block device (vbd) from the toolstack point of view:
  - as the frontend is virtio-blk which is not a Xenbus driver, 
nothing

    written to Xenstore are fetched by the frontend (the vdev is not
    passed to the frontend)
  - the ring-ref/event-channel are not used for the 
backend<->frontend

    communication, the proposed IPC for Virtio is IOREQ/DM
it is still a "block device" and ought to be integrated in existing
"disk" handling. So, re-use (and adapt) "disk" parsing/configuration
logic to deal with Virtio devices as well.
How backend are intended to be created? Is there something 
listening on

xenstore?

You mention QEMU as been the backend, do you intend to have QEMU
listening on xenstore to create a virtio backend? Or maybe it is 
on the

command line? There is QMP as well, but it's probably a lot more
complicated as I think libxl needs refactoring for that.



No, QEMU is not involved there. The backend is a standalone 
application,
it is launched from the command line. The backend reads the 
Xenstore to get
the configuration and to detect when guest with the frontend is 
created/destroyed.


I think this should be reflected somehow in the configuration, as I
expect qemu might gain this functionality in the future.


I understand this and agree in general (however I am wondering 
whether this can be postponed until it is actually needed), but ...


This might lead to the need to support some "legacy" options in future.
I think we should at least think whether these scheme will cover (or
prohibit) extensions which are already on the horizon.


ok






I'm wondering whether we shouldn't split the backend from the protocol
(or specification?). Something like "protocol=virtio" (default would be
e.g. "xen") and then you could add "backend=external" for your use 
case?


... I am afraid, I didn't get the idea. Are we speaking about the 
(new?) disk configuration options
here or these are not disk specific things at all and to be 
applicable for all possible backends?


I was talking of a general approach using the disk as an example. For
disks it is just rather obvious.

If the former, then could the new backendtype simply do the job? For 
example, "backendtype=virtio_external" for our current use-case and 
"backendtype=virtio_qemu"

for the possible future use-cases? Could you please clarify the idea.


I want to avoid overloading the backendtype with information which is
in general not really related by the backend. You can have a qemu based
qdisk backend serving a Xen PV-disk (like today) or a virtio disk.

A similar approach has been chosen for the disk format: it is not part
of the backend, but a parameter of its own. This way e.g. the qdisk
backend can use the original qdisk format, or the qcow format.

In practice we are having something like the "protocol" already today:
the disk device name is encoding that ("xvd*" is a Xen PV disk, while
"sd*" is an emulated SCSI disk, which happens to be presented to the
guest as "xvd*", too). And this is an additional information not
related to the backendtype.

So we have basically the following configuration items, which are
orthogonal to each other (some combinations might not make sense,
but in theory most would be possible):

1. protocol: emulated (not PV), Xen (like today), virtio

2. backendtype: phy (blkback), qdisk (qemu), other (e.g. a daemon)

3. format: raw, qcow, qcow2, vhd, qed

The combination virtio+phy would be equivalent to vhost, BTW. And
virtio+other might even use vhost-user, depending on the daemon.

yes, BTW the combination virtio+other is close to our use-case.


Thank you for the detailed explanation, now I see your point why using 
backendtype=virtio is not flexible option in the long term
and why we would want/need to an extra configuration option such as 
protocol, etc. I think, it makes sense and would be correct.


If we take a disk as an example, then from the configuration PoV we will 
need to:

- add an optional "protocol" option
- add new backendtype: external/other/daemon/etc.
This seems to cover all possible combinations describe above (although I 
agree that some of them might not make sense). Is my understanding correct?


Unfortunately, disk configuration/management code is spread over 
multiple sources (including auto-generated) in the toolstack which is 
not so easy to follow (at least to me
who is not familiar enough with all this stuff), 

[libvirt test] 167423: regressions - FAIL

2021-12-15 Thread osstest service owner
flight 167423 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167423/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  7826148a72c97367fc6aaa76397fe92d32169723
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  523 days
Failing since151818  2020-07-11 04:18:52 Z  522 days  504 attempts
Testing same since   167423  2021-12-15 04:20:20 Z0 days1 attempts


People who touched revisions under test:
Adolfo Jayme Barrientos 
  Aleksandr Alekseev 
  Aleksei Zakharov 
  Andika Triwidada 
  Andrea Bolognani 
  Ani Sinha 
  Balázs Meskó 
  Barrett Schonefeld 
  Bastian Germann 
  Bastien Orivel 
  BiaoXiang Ye 
  Bihong Yu 
  Binfeng Wu 
  Bjoern Walk 
  Boris Fiuczynski 
  Brian Turek 
  Bruno Haible 
  Chris Mayo 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Kirbach 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Cédric Bosdonnat 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Didik Supriadi 
  dinglimin 
  Dmitrii Shcherbakov 
  Dmytro Linkin 
  Eiichi Tsukata 
  Eric Farman 
  Erik Skultety 
  Fabian Affolter 
  Fabian Freyer 
  Fabiano Fidêncio 
  Fangge Jin 
  Farhan Ali 
  Fedora Weblate Translation 
  Franck Ridel 
  Gavi Teitz 
  gongwei 
  Guoyi Tu
  Göran Uddeborg 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Hela Basa 
  Helmut Grohne 
  Hiroki Narukawa 
  Hyman Huang(黄勇) 
  Ian Wienand 
  Ioanna Alifieraki 
  Jakob Meng 
  Jamie Strandboge 
  Jamie Strandboge 
  Jan Kuparinen 
  jason lee 
  Jean-Baptiste Holcroft 
  Jia Zhou 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jinsheng Zhang 
  Jiri Denemark 
  Joachim Falk 
  John Ferlan 
  Jonathan Watt 
  Jonathon Jongsma 
  Julio Faracco 
  Justin Gatzen 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Koichi Murase 
  Kristina Hanicova 
  Laine Stump 
  Laszlo Ersek 
  Lee Yarwood 
  Lei Yang 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Lin Ma 
  Liu Yiding 
  Luke Yue 
  Luyao Zhong 
  Marc Hartmayer 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matej Cepl 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Meina Li 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Moshe Levi 
  Muha Aliss 
  Nathan 
  Neal Gompa 
  Nick Chevsky 
  Nick Shyrokovskiy 
  Nickys Music Group 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olaf Hering 
  Olesya Gerasimenko 
  Or Ozeri 
  Orion Poplawski 
  Pany 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peng Liang 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Praveen K Paladugu 
  Richard W.M. Jones 
  Ricky Tigg 
  Robin Lee 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Gahagan 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  SeongHyun Jo 
  Shalini Chellathurai Saroja 
  Shaojun Yang 
  Shi Lei 
  simmon 
  Simon Chopin 
  Simon Gaiser 
  Simon Rowe 
  Stefan Bader 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Stefan Hajnoczi 
  Szymon Scholz 

Re: [patch V2 21/31] soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation

2021-12-15 Thread Thomas Gleixner
On Mon, Dec 06 2021 at 23:51, Thomas Gleixner wrote:
>
> No functional change intended.

Famous last words.

>  static int ti_sci_inta_msi_alloc_descs(struct device *dev,
>  struct ti_sci_resource *res)
>  {
> - struct msi_desc *msi_desc;
> + struct msi_desc msi_desc;
>   int set, i, count = 0;
>  
> + memset(&msi_desc, 0, sizeof(msi_desc));

This fails to initialize msi_desc.nvec_used which makes the subsequent
interrupt allocation fail. Delta fix below.

Thanks,

tglx
---
--- a/drivers/soc/ti/ti_sci_inta_msi.c
+++ b/drivers/soc/ti/ti_sci_inta_msi.c
@@ -68,6 +68,7 @@ static int ti_sci_inta_msi_alloc_descs(s
int set, i, count = 0;
 
memset(&msi_desc, 0, sizeof(msi_desc));
+   msi_desc.nvec_used = 1;
 
for (set = 0; set < res->sets; set++) {
for (i = 0; i < res->desc[set].num; i++, count++) {



Re: [RFC v1 1/5] xen/arm: add support for Renesas R-Car Gen3 platform

2021-12-15 Thread Oleksii Moisieiev
Hi Oleksandr,

Thank you for the point.
I will add it to the next patch version.

Best regards,
Oleksii 

On Wed, Dec 15, 2021 at 06:38:00AM +, Oleksandr Andrushchenko wrote:
> Hi, Oleksii!
> 
> On 14.12.21 11:34, Oleksii Moisieiev wrote:
> > Implementation includes platform-specific smc handler for rcar3 platform.
> >
> > Signed-off-by: Oleksii Moisieiev 
> Reviewed-by: Oleksandr Andrushchenko 
> 
> > ---
> >   xen/arch/arm/platforms/Makefile |  1 +
> >   xen/arch/arm/platforms/rcar3.c  | 46 +
> >   2 files changed, 47 insertions(+)
> >   create mode 100644 xen/arch/arm/platforms/rcar3.c
> >
> > diff --git a/xen/arch/arm/platforms/Makefile 
> > b/xen/arch/arm/platforms/Makefile
> > index 8632f4115f..b64c25de6c 100644
> > --- a/xen/arch/arm/platforms/Makefile
> > +++ b/xen/arch/arm/platforms/Makefile
> > @@ -4,6 +4,7 @@ obj-$(CONFIG_ALL32_PLAT) += exynos5.o
> >   obj-$(CONFIG_ALL32_PLAT) += midway.o
> >   obj-$(CONFIG_ALL32_PLAT) += omap5.o
> >   obj-$(CONFIG_ALL32_PLAT) += rcar2.o
> > +obj-$(CONFIG_RCAR3) += rcar3.o
> >   obj-$(CONFIG_ALL64_PLAT) += seattle.o
> >   obj-$(CONFIG_ALL_PLAT)   += sunxi.o
> >   obj-$(CONFIG_ALL64_PLAT) += thunderx.o
> > diff --git a/xen/arch/arm/platforms/rcar3.c b/xen/arch/arm/platforms/rcar3.c
> > new file mode 100644
> > index 00..d740145c71
> > --- /dev/null
> > +++ b/xen/arch/arm/platforms/rcar3.c
> > @@ -0,0 +1,46 @@
> > +/*
> > + * xen/arch/arm/platforms/rcar3.c
> > + *
> > + * Renesas R-Car Gen3 specific settings
> > + *
> > + * Oleksii Moisieiev 
> > + * Copyright (C) 2021 EPAM Systems
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + */
> > +
> > +#include 
> > +
> > +static bool rcar3_smc(struct cpu_user_regs *regs)
> > +{
> > +return false;
> > +}
> > +
> > +static const char *const rcar3_dt_compat[] __initconst =
> > +{
> > +"renesas,r8a7795",
> > +"renesas,r8a7796",
> > +NULL
> > +};
> > +
> > +PLATFORM_START(rcar3, "Renesas R-Car Gen3")
> > +.compatible = rcar3_dt_compat,
> > +.smc = rcar3_smc
> > +PLATFORM_END
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */


[xen-unstable-smoke test] 167435: tolerable all pass - PUSHED

2021-12-15 Thread osstest service owner
flight 167435 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167435/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  8b3cbdbe782cae972e9a47cf22620ebee61a96a6
baseline version:
 xen  9956fdc70f99b0f133be7f16f62417928a84622c

Last test of basis   167429  2021-12-15 10:02:58 Z0 days
Testing same since   167435  2021-12-15 15:03:05 Z0 days1 attempts


People who touched revisions under test:
  Bobby Eshleman 
  Julien Grall 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   9956fdc70f..8b3cbdbe78  8b3cbdbe782cae972e9a47cf22620ebee61a96a6 -> smoke



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Julien Grall

Hi Michal,

On 15/12/2021 10:40, Michal Orzel wrote:

On 15.12.2021 11:32, Jan Beulich wrote:

(Re-sending an abridged version, as apparently spam filters didn't like
the original message with more retained context; I'll have to see whether
this one also isn't liked. Sorry.)

On 15.12.2021 10:48, Michal Orzel wrote:

This patch and the problem it solves is about clearing top 32bits of all gp 
registers so not only x0,x1.


That's well understood. Yet for everything still in registers simply
using mov ahead of the respective push (as you had it) is still
preferable imo.

Jan



In that case let's wait for Julien's opinion to decide whether I should get 
back to the previous
solution with mov or to the stack solution.


IIUC, your proposal is to:
   1) Push all the 64-bit registers
   2) Zero the top 32-bit

Jan's suggestion is to:
   1) clobber the top 32-bit using mov wX, wX
   2) Push all the registers

My preference is for the latter because there will be less memory/cache 
access.


So, this would be your original patch + a compile time check to ensure 
save_x0_x1 is 0 when compat=1.


Cheers,

--
Julien Grall



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Julien Grall

Hi,

On 15/12/2021 09:35, Jan Beulich wrote:

1. Regarding save_x0_x1, it is 0 only for guest_sync_slowpath, which is called 
by guest_sync.
So as we are dealing only with compat=1, save_x0_x1 cannot be 0.
The conclusion is that we do not need to worry about it.


Oh, good point. I guess you may want to add a build time check to
avoid silently introducing a user of the macro violating that
assumption.


+1

Cheers,

--
Julien Grall



Re: [patch V3 00/35] genirq/msi, PCI/MSI: Spring cleaning - Part 2

2021-12-15 Thread Nishanth Menon
On 17:35-20211215, Thomas Gleixner wrote:
>git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git 
> msi-v4.1-part-2
[...]

> That should cure the problem.

And it sure does. Thanks for looking closer and providing a fix.

https://gist.github.com/nmenon/9862a1c31b17fd6dfe0a30c54d396187
(msi-v4.1-part-2) looks clean

Also while I had detected pointer corruption in the previous v4
https://gist.github.com/nmenon/ce4d12f460db5cd511185c047d5d35d0

Running it again on v4.1 does indicate the fix is in place.
https://gist.github.com/nmenon/3231fbb0faa1b9c827b40d1ae6160626


please feel free to add:

Tested-by: Nishanth Menon 

-- 
Regards,
Nishanth Menon
Key (0xDDB5849D1736249D)/Fingerprint: F8A2 8693 54EB 8232 17A3  1A34 DDB5 849D 
1736 249D



Re: [PATCH v8 0/4] PCI devices passthrough on Arm, part 2

2021-12-15 Thread Oleksandr Andrushchenko
Hi, Julien!

On 15.12.21 19:48, Julien Grall wrote:
> On 09/12/2021 07:29, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>>
>> Hi, all!
>
> Hi Oleksandr,
>
>> This is an assorted series of patches which aim is to make some further
>> basis for PCI passthrough on Arm support. The series continues the work
>> published earlier by Arm [1] and adds new helpers and clears the way for
>> vPCI changes which will follow.
>>
>> RFC is at [2], [3]. Design presentation can be found at [4].Hi
>>
>> I have removed patch
>> [PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
>> as it seems that this needs more time for decision on how to achive
>> that.
>>
>> I have also added a new patch
>> [PATCH v7 4/7] xen/arm: account IO handler for emulated PCI host bridge
>> with a tiny latent bug fix.
>>
>> This series contains all the patches which are left un-committed yet.
>>
>> Thank you,
>> Oleksandr
>>
>> [1] 
>> https://urldefense.com/v3/__https://patchwork.kernel.org/project/xen-devel/list/?series=558681__;!!GF_29dbcQIUBPA!gqz5e3dL-6UrscJs6ZorKgDOMpYsfiPNFn0ffortKrcGBkil9SMKjbDcX7V_T9RVID_vrU1iUA$
>>  [patchwork[.]kernel[.]org]
>> [2] 
>> https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html__;!!GF_29dbcQIUBPA!gqz5e3dL-6UrscJs6ZorKgDOMpYsfiPNFn0ffortKrcGBkil9SMKjbDcX7V_T9RVID-GAYv29Q$
>>  [lists[.]xenproject[.]org]
>> [3] 
>> https://urldefense.com/v3/__https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html*01184__;Iw!!GF_29dbcQIUBPA!gqz5e3dL-6UrscJs6ZorKgDOMpYsfiPNFn0ffortKrcGBkil9SMKjbDcX7V_T9RVID_PWk-hRA$
>>  [lists[.]xenproject[.]org]
>> [4] 
>> https://urldefense.com/v3/__https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf__;!!GF_29dbcQIUBPA!gqz5e3dL-6UrscJs6ZorKgDOMpYsfiPNFn0ffortKrcGBkil9SMKjbDcX7V_T9RVID9SzhK4bw$
>>  [static[.]sched[.]com]
>>
>> Oleksandr Andrushchenko (4):
>>    xen/arm: add pci-domain for disabled devices
>>    xen/arm: setup MMIO range trap handlers for hardware domain
>>    xen/arm: account IO handler for emulated PCI host bridge
>>    xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
>>
>>   xen/arch/arm/domain.c  |   2 +
>>   xen/arch/arm/domain_build.c    | 132 -
>>   xen/arch/arm/pci/ecam.c    |  14 +++
>>   xen/arch/arm/pci/pci-host-common.c |  77 -
>>   xen/arch/arm/pci/pci-host-zynqmp.c |   1 +
>>   xen/arch/arm/vpci.c    |  85 ---
>>   xen/arch/arm/vpci.h    |   6 ++
>>   xen/include/asm-arm/pci.h  |  22 +
>>   xen/include/asm-arm/setup.h    |  13 +++
>
> As a FYI, Jan pushed today a commit that moved the headers from 
> xen/include/asm-arm to xen/arch/arm/include/asm/.
>
> I have handled the clash for this series while committing.
Thank you for doing that!
>
> Thank you for the contribution.
Thank you all for supporting this work!
Oleksandr
>
> Cheers,
>


Re: [PATCH v8 0/4] PCI devices passthrough on Arm, part 2

2021-12-15 Thread Julien Grall

On 09/12/2021 07:29, Oleksandr Andrushchenko wrote:

From: Oleksandr Andrushchenko 

Hi, all!


Hi Oleksandr,


This is an assorted series of patches which aim is to make some further
basis for PCI passthrough on Arm support. The series continues the work
published earlier by Arm [1] and adds new helpers and clears the way for
vPCI changes which will follow.

RFC is at [2], [3]. Design presentation can be found at [4].Hi

I have removed patch
[PATCH v6 5/7] xen/arm: do not map IRQs and memory for disabled devices
as it seems that this needs more time for decision on how to achive
that.

I have also added a new patch
[PATCH v7 4/7] xen/arm: account IO handler for emulated PCI host bridge
with a tiny latent bug fix.

This series contains all the patches which are left un-committed yet.

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=558681
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg01184.html
[3] 
https://lists.xenproject.org/archives/html/xen-devel/2020-07/threads.html#01184
[4] 
https://static.sched.com/hosted_files/xen2021/e4/PCI_Device_Passthrough_On_Arm.pdf

Oleksandr Andrushchenko (4):
   xen/arm: add pci-domain for disabled devices
   xen/arm: setup MMIO range trap handlers for hardware domain
   xen/arm: account IO handler for emulated PCI host bridge
   xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m

  xen/arch/arm/domain.c  |   2 +
  xen/arch/arm/domain_build.c| 132 -
  xen/arch/arm/pci/ecam.c|  14 +++
  xen/arch/arm/pci/pci-host-common.c |  77 -
  xen/arch/arm/pci/pci-host-zynqmp.c |   1 +
  xen/arch/arm/vpci.c|  85 ---
  xen/arch/arm/vpci.h|   6 ++
  xen/include/asm-arm/pci.h  |  22 +
  xen/include/asm-arm/setup.h|  13 +++


As a FYI, Jan pushed today a commit that moved the headers from 
xen/include/asm-arm to xen/arch/arm/include/asm/.


I have handled the clash for this series while committing.

Thank you for the contribution.

Cheers,

--
Julien Grall



Re: [patch V4 09-02/35] PCI/MSI: Allocate MSI device data on first use

2021-12-15 Thread Greg Kroah-Hartman
On Wed, Dec 15, 2021 at 06:19:49PM +0100, Thomas Gleixner wrote:
> Allocate MSI device data on first use, i.e. when a PCI driver invokes one
> of the PCI/MSI enablement functions.
> 
> Add a wrapper function to ensure that the ordering vs. pcim_msi_release()
> is correct.
> 
> Signed-off-by: Thomas Gleixner 

Reviewed-by: Greg Kroah-Hartman 



Re: [patch V4 09-01/35] PCI/MSI: Decouple MSI[-X] disable from pcim_release()

2021-12-15 Thread Greg Kroah-Hartman
On Wed, Dec 15, 2021 at 06:16:44PM +0100, Thomas Gleixner wrote:
> The MSI core will introduce runtime allocation of MSI related data. This
> data will be devres managed and has to be set up before enabling
> PCI/MSI[-X]. This would introduce an ordering issue vs. pcim_release().
> 
> The setup order is:
> 
>pcim_enable_device()
>   devres_alloc(pcim_release...);
>   ...
>   pci_irq_alloc()
> msi_setup_device_data()
>devres_alloc(msi_device_data_release, ...)
> 
> and once the device is released these release functions are invoked in the
> opposite order:
> 
> msi_device_data_release()
> ...
> pcim_release()
>pci_disable_msi[x]()
> 
> which is obviously wrong, because pci_disable_msi[x]() requires the MSI
> data to be available to tear down the MSI[-X] interrupts.
> 
> Remove the MSI[-X] teardown from pcim_release() and add an explicit action
> to be installed on the attempt of enabling PCI/MSI[-X].
> 
> This allows the MSI core data allocation to be ordered correctly in a
> subsequent step.
> 
> Reported-by: Nishanth Menon 
> Signed-off-by: Thomas Gleixner 
> ---
> V4: New patch


Reviewed-by: Greg Kroah-Hartman 



Re: [PATCH v8 2/4] xen/arm: setup MMIO range trap handlers for hardware domain

2021-12-15 Thread Julien Grall

On 10/12/2021 18:37, Oleksandr Andrushchenko wrote:

Hi, Julien!


Hello,


On 10.12.21 19:52, Julien Grall wrote:

Hi Oleksandr,

On 09/12/2021 07:29, Oleksandr Andrushchenko wrote:

+unsigned int domain_vpci_get_num_mmio_handlers(struct domain *d)
+{
+    if ( !has_vpci(d) )
+    return 0;
+
+    if ( is_hardware_domain(d) )
+    {
+    int ret = pci_host_iterate_bridges_and_count(d, 
vpci_get_num_handlers_cb);
+
+    return ret < 0 ? 0 : ret;


Sorry I only spotted this now. AFAICT, ret is not meant to return ret < 0 in 
this case. But if it were then I think it would be wrong to continue as nothing 
happened because the code will likely fall over/crash when registering the I/O 
handlers.

I would document this oddity with

if ( ret < 0 )
{
    ASSERT_UNREACHABLE();
    return 0;
}

I can do the change on commit if the others are happy with it.

Yes, please, do me a favor


Ok. With that:

Acked-by: Julien Grall 

Cheers,



Cheers,


Thank you,
Oleksandr


--
Julien Grall



[patch V4 09-02/35] PCI/MSI: Allocate MSI device data on first use

2021-12-15 Thread Thomas Gleixner
Allocate MSI device data on first use, i.e. when a PCI driver invokes one
of the PCI/MSI enablement functions.

Add a wrapper function to ensure that the ordering vs. pcim_msi_release()
is correct.

Signed-off-by: Thomas Gleixner 
---
V4: Adopted to ensure devres ordering
---
 drivers/pci/msi/msi.c |   17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -366,6 +366,19 @@ static int pcim_setup_msi_release(struct
return ret;
 }
 
+/*
+ * Ordering vs. devres: msi device data has to be installed first so that
+ * pcim_msi_release() is invoked before it on device release.
+ */
+static int pci_setup_msi_context(struct pci_dev *dev)
+{
+   int ret = msi_setup_device_data(&dev->dev);
+
+   if (!ret)
+   ret = pcim_setup_msi_release(dev);
+   return ret;
+}
+
 static struct msi_desc *
 msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
 {
@@ -909,7 +922,7 @@ static int __pci_enable_msi_range(struct
if (nvec > maxvec)
nvec = maxvec;
 
-   rc = pcim_setup_msi_release(dev);
+   rc = pci_setup_msi_context(dev);
if (rc)
return rc;
 
@@ -956,7 +969,7 @@ static int __pci_enable_msix_range(struc
if (WARN_ON_ONCE(dev->msix_enabled))
return -EINVAL;
 
-   rc = pcim_setup_msi_release(dev);
+   rc = pci_setup_msi_context(dev);
if (rc)
return rc;
 



[patch V4 09-01/35] PCI/MSI: Decouple MSI[-X] disable from pcim_release()

2021-12-15 Thread Thomas Gleixner
The MSI core will introduce runtime allocation of MSI related data. This
data will be devres managed and has to be set up before enabling
PCI/MSI[-X]. This would introduce an ordering issue vs. pcim_release().

The setup order is:

   pcim_enable_device()
devres_alloc(pcim_release...);
...
pci_irq_alloc()
  msi_setup_device_data()
 devres_alloc(msi_device_data_release, ...)

and once the device is released these release functions are invoked in the
opposite order:

msi_device_data_release()
...
pcim_release()
   pci_disable_msi[x]()

which is obviously wrong, because pci_disable_msi[x]() requires the MSI
data to be available to tear down the MSI[-X] interrupts.

Remove the MSI[-X] teardown from pcim_release() and add an explicit action
to be installed on the attempt of enabling PCI/MSI[-X].

This allows the MSI core data allocation to be ordered correctly in a
subsequent step.

Reported-by: Nishanth Menon 
Signed-off-by: Thomas Gleixner 
---
V4: New patch
---
 drivers/pci/msi/msi.c |   33 +
 drivers/pci/pci.c |5 -
 include/linux/pci.h   |3 ++-
 3 files changed, 35 insertions(+), 6 deletions(-)

--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -341,6 +341,31 @@ void pci_restore_msi_state(struct pci_de
 }
 EXPORT_SYMBOL_GPL(pci_restore_msi_state);
 
+static void pcim_msi_release(void *pcidev)
+{
+   struct pci_dev *dev = pcidev;
+
+   dev->is_msi_managed = false;
+   pci_free_irq_vectors(dev);
+}
+
+/*
+ * Needs to be separate from pcim_release to prevent an ordering problem
+ * vs. msi_device_data_release() in the MSI core code.
+ */
+static int pcim_setup_msi_release(struct pci_dev *dev)
+{
+   int ret;
+
+   if (!pci_is_managed(dev) || dev->is_msi_managed)
+   return 0;
+
+   ret = devm_add_action(&dev->dev, pcim_msi_release, dev);
+   if (!ret)
+   dev->is_msi_managed = true;
+   return ret;
+}
+
 static struct msi_desc *
 msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
 {
@@ -884,6 +909,10 @@ static int __pci_enable_msi_range(struct
if (nvec > maxvec)
nvec = maxvec;
 
+   rc = pcim_setup_msi_release(dev);
+   if (rc)
+   return rc;
+
for (;;) {
if (affd) {
nvec = irq_calc_affinity_vectors(minvec, nvec, affd);
@@ -927,6 +956,10 @@ static int __pci_enable_msix_range(struc
if (WARN_ON_ONCE(dev->msix_enabled))
return -EINVAL;
 
+   rc = pcim_setup_msi_release(dev);
+   if (rc)
+   return rc;
+
for (;;) {
if (affd) {
nvec = irq_calc_affinity_vectors(minvec, nvec, affd);
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2024,11 +2024,6 @@ static void pcim_release(struct device *
struct pci_devres *this = res;
int i;
 
-   if (dev->msi_enabled)
-   pci_disable_msi(dev);
-   if (dev->msix_enabled)
-   pci_disable_msix(dev);
-
for (i = 0; i < DEVICE_COUNT_RESOURCE; i++)
if (this->region_mask & (1 << i))
pci_release_region(dev, i);
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -425,7 +425,8 @@ struct pci_dev {
unsigned intats_enabled:1;  /* Address Translation Svc */
unsigned intpasid_enabled:1;/* Process Address Space ID */
unsigned intpri_enabled:1;  /* Page Request Interface */
-   unsigned intis_managed:1;
+   unsigned intis_managed:1;   /* Managed via devres */
+   unsigned intis_msi_managed:1;   /* MSI release via devres 
installed */
unsigned intneeds_freset:1; /* Requires fundamental reset */
unsigned intstate_saved:1;
unsigned intis_physfn:1;



Re: [patch V3 00/35] genirq/msi, PCI/MSI: Spring cleaning - Part 2

2021-12-15 Thread Thomas Gleixner
On Wed, Dec 15 2021 at 17:18, Thomas Gleixner wrote:

> On Tue, Dec 14 2021 at 22:19, Thomas Gleixner wrote:
>> On Tue, Dec 14 2021 at 14:56, Nishanth Menon wrote:
>>
>> thanks for trying. I'll have a look again with brain awake tomorrow
>> morning.
>
> Morning was busy with other things, but I found what my sleepy brain
> managed to do wrong yesterday evening.
>
> Let me reintegrate the pile and I'll send you an update.

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git msi-v4.1-part-2
   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git msi-v4.2-part-3

That should cure the problem.




Re: [patch V3 00/35] genirq/msi, PCI/MSI: Spring cleaning - Part 2

2021-12-15 Thread Thomas Gleixner
On Tue, Dec 14 2021 at 22:19, Thomas Gleixner wrote:
> On Tue, Dec 14 2021 at 14:56, Nishanth Menon wrote:
>
> thanks for trying. I'll have a look again with brain awake tomorrow
> morning.

Morning was busy with other things, but I found what my sleepy brain
managed to do wrong yesterday evening.

Let me reintegrate the pile and I'll send you an update.

Thanks,

tglx



Re: [PATCH V6 1/2] libxl: Add support for Virtio disk configuration

2021-12-15 Thread Juergen Gross

On 15.12.21 16:02, Oleksandr wrote:


On 15.12.21 08:08, Juergen Gross wrote:

Hi Juergen


On 14.12.21 18:44, Oleksandr wrote:


On 14.12.21 18:03, Anthony PERARD wrote:

Hi Anthony



On Wed, Dec 08, 2021 at 06:59:43PM +0200, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

This patch adds basic support for configuring and assisting 
virtio-disk

backend (emualator) which is intended to run out of Qemu and could be
run in any domain.
Although the Virtio block device is quite different from traditional
Xen PV block device (vbd) from the toolstack point of view:
  - as the frontend is virtio-blk which is not a Xenbus driver, 
nothing

    written to Xenstore are fetched by the frontend (the vdev is not
    passed to the frontend)
  - the ring-ref/event-channel are not used for the backend<->frontend
    communication, the proposed IPC for Virtio is IOREQ/DM
it is still a "block device" and ought to be integrated in existing
"disk" handling. So, re-use (and adapt) "disk" parsing/configuration
logic to deal with Virtio devices as well.

How backend are intended to be created? Is there something listening on
xenstore?

You mention QEMU as been the backend, do you intend to have QEMU
listening on xenstore to create a virtio backend? Or maybe it is on the
command line? There is QMP as well, but it's probably a lot more
complicated as I think libxl needs refactoring for that.



No, QEMU is not involved there. The backend is a standalone application,
it is launched from the command line. The backend reads the Xenstore 
to get
the configuration and to detect when guest with the frontend is 
created/destroyed.


I think this should be reflected somehow in the configuration, as I
expect qemu might gain this functionality in the future.


I understand this and agree in general (however I am wondering whether 
this can be postponed until it is actually needed), but ...


This might lead to the need to support some "legacy" options in future.
I think we should at least think whether these scheme will cover (or
prohibit) extensions which are already on the horizon.


I'm wondering whether we shouldn't split the backend from the protocol
(or specification?). Something like "protocol=virtio" (default would be
e.g. "xen") and then you could add "backend=external" for your use case?


... I am afraid, I didn't get the idea. Are we speaking about the (new?) 
disk configuration options
here or these are not disk specific things at all and to be applicable 
for all possible backends?


I was talking of a general approach using the disk as an example. For
disks it is just rather obvious.

If the former, then could the new backendtype simply do the job? For 
example, "backendtype=virtio_external" for our current use-case and 
"backendtype=virtio_qemu"

for the possible future use-cases? Could you please clarify the idea.


I want to avoid overloading the backendtype with information which is
in general not really related by the backend. You can have a qemu based
qdisk backend serving a Xen PV-disk (like today) or a virtio disk.

A similar approach has been chosen for the disk format: it is not part
of the backend, but a parameter of its own. This way e.g. the qdisk
backend can use the original qdisk format, or the qcow format.

In practice we are having something like the "protocol" already today:
the disk device name is encoding that ("xvd*" is a Xen PV disk, while
"sd*" is an emulated SCSI disk, which happens to be presented to the
guest as "xvd*", too). And this is an additional information not
related to the backendtype.

So we have basically the following configuration items, which are
orthogonal to each other (some combinations might not make sense,
but in theory most would be possible):

1. protocol: emulated (not PV), Xen (like today), virtio

2. backendtype: phy (blkback), qdisk (qemu), other (e.g. a daemon)

3. format: raw, qcow, qcow2, vhd, qed

The combination virtio+phy would be equivalent to vhost, BTW. And
virtio+other might even use vhost-user, depending on the daemon.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[ovmf test] 167419: all pass - PUSHED

2021-12-15 Thread osstest service owner
flight 167419 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167419/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 38f6d78c3b62f8825e7d802697b7992418a72da7
baseline version:
 ovmf 9006967c8d24f5d9585278fb6363b08f2118d424

Last test of basis   167414  2021-12-14 16:10:26 Z0 days
Testing same since   167419  2021-12-14 23:11:46 Z0 days1 attempts


People who touched revisions under test:
  Pierre Gondois 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   9006967c8d..38f6d78c3b  38f6d78c3b62f8825e7d802697b7992418a72da7 -> 
xen-tested-master



Re: [PATCH v2 14/18] IOMMU: fold flush-all hook into "flush one"

2021-12-15 Thread Oleksandr



On 24.09.21 12:53, Jan Beulich wrote:

Hi Jan


Having a separate flush-all hook has always been puzzling me some. We
will want to be able to force a full flush via accumulated flush flags
from the map/unmap functions. Introduce a respective new flag and fold
all flush handling to use the single remaining hook.

Note that because of the respective comments in SMMU and IPMMU-VMSA
code, I've folded the two prior hook functions into one.


Changes to IPMMU-VMSA lgtm, for SMMU-v2 I think the same.



For SMMU-v3,
which lacks a comment towards incapable hardware, I've left both
functions in place on the assumption that selective and full flushes
will eventually want separating.

Signed-off-by: Jan Beulich 
---
TBD: What we really are going to need is for the map/unmap functions to
  specify that a wider region needs flushing than just the one
  covered by the present set of (un)maps. This may still be less than
  a full flush, but at least as a first step it seemed better to me
  to keep things simple and go the flush-all route.
---
v2: New.

--- a/xen/drivers/passthrough/amd/iommu.h
+++ b/xen/drivers/passthrough/amd/iommu.h
@@ -242,7 +242,6 @@ int amd_iommu_get_reserved_device_memory
  int __must_check amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
   unsigned long page_count,
   unsigned int flush_flags);
-int __must_check amd_iommu_flush_iotlb_all(struct domain *d);
  void amd_iommu_print_entries(const struct amd_iommu *iommu, unsigned int 
dev_id,
   dfn_t dfn);
  
--- a/xen/drivers/passthrough/amd/iommu_map.c

+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -475,15 +475,18 @@ int amd_iommu_flush_iotlb_pages(struct d
  {
  unsigned long dfn_l = dfn_x(dfn);
  
-ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));

-ASSERT(flush_flags);
+if ( !(flush_flags & IOMMU_FLUSHF_all) )
+{
+ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
+ASSERT(flush_flags);
+}
  
  /* Unless a PTE was modified, no flush is required */

  if ( !(flush_flags & IOMMU_FLUSHF_modified) )
  return 0;
  
-/* If the range wraps then just flush everything */

-if ( dfn_l + page_count < dfn_l )
+/* If so requested or if the range wraps then just flush everything. */
+if ( (flush_flags & IOMMU_FLUSHF_all) || dfn_l + page_count < dfn_l )
  {
  amd_iommu_flush_all_pages(d);
  return 0;
@@ -508,13 +511,6 @@ int amd_iommu_flush_iotlb_pages(struct d
  
  return 0;

  }
-
-int amd_iommu_flush_iotlb_all(struct domain *d)
-{
-amd_iommu_flush_all_pages(d);
-
-return 0;
-}
  
  int amd_iommu_reserve_domain_unity_map(struct domain *d,

 const struct ivrs_unity_map *map,
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -642,7 +642,6 @@ static const struct iommu_ops __initcons
  .map_page = amd_iommu_map_page,
  .unmap_page = amd_iommu_unmap_page,
  .iotlb_flush = amd_iommu_flush_iotlb_pages,
-.iotlb_flush_all = amd_iommu_flush_iotlb_all,
  .reassign_device = reassign_device,
  .get_device_group_id = amd_iommu_group_id,
  .enable_x2apic = iov_enable_xt,
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -930,13 +930,19 @@ out:
  }
  
  /* Xen IOMMU ops */

-static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
+static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,
+  unsigned long page_count,
+  unsigned int flush_flags)
  {
  struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
  
+ASSERT(flush_flags);

+
  if ( !xen_domain || !xen_domain->root_domain )
  return 0;
  
+/* The hardware doesn't support selective TLB flush. */

+
  spin_lock(&xen_domain->lock);
  ipmmu_tlb_invalidate(xen_domain->root_domain);
  spin_unlock(&xen_domain->lock);
@@ -944,16 +950,6 @@ static int __must_check ipmmu_iotlb_flus
  return 0;
  }
  
-static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,

-  unsigned long page_count,
-  unsigned int flush_flags)
-{
-ASSERT(flush_flags);
-
-/* The hardware doesn't support selective TLB flush. */
-return ipmmu_iotlb_flush_all(d);
-}
-
  static struct ipmmu_vmsa_domain *ipmmu_get_cache_domain(struct domain *d,
  struct device *dev)
  {
@@ -1303,7 +1299,6 @@ static const struct iommu_ops ipmmu_iomm
  .hwdom_init  = ipmmu_iommu_hwdom_init,
  .teardown= ipmmu_iommu_domain_teardown,
  .iotlb_flush = ipmmu_iotlb_flush,
-.iotlb_flush_all = ipmmu_iotlb_flush_all,
  .assign_device   =

Re: [PATCH v2 17/18] AMD/IOMMU: free all-empty page tables

2021-12-15 Thread Roger Pau Monné
On Fri, Sep 24, 2021 at 11:55:57AM +0200, Jan Beulich wrote:
> When a page table ends up with no present entries left, it can be
> replaced by a non-present entry at the next higher level. The page table
> itself can then be scheduled for freeing.
> 
> Note that while its output isn't used there yet, update_contig_markers()
> right away needs to be called in all places where entries get updated,
> not just the one where entries get cleared.

Couldn't you also coalesce all contiguous page tables into a
super-page entry at the higher level? (not that should be done here,
it's just that I'm not seeing any patch to that effect in the series)

> Signed-off-by: Jan Beulich 
> ---
> v2: New.
> 
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -21,6 +21,9 @@
>  
>  #include "iommu.h"
>  
> +#define CONTIG_MASK IOMMU_PTE_CONTIG_MASK
> +#include 
> +
>  /* Given pfn and page table level, return pde index */
>  static unsigned int pfn_to_pde_idx(unsigned long pfn, unsigned int level)
>  {
> @@ -33,16 +36,20 @@ static unsigned int pfn_to_pde_idx(unsig
>  
>  static union amd_iommu_pte clear_iommu_pte_present(unsigned long l1_mfn,
> unsigned long dfn,
> -   unsigned int level)
> +   unsigned int level,
> +   bool *free)
>  {
>  union amd_iommu_pte *table, *pte, old;
> +unsigned int idx = pfn_to_pde_idx(dfn, level);
>  
>  table = map_domain_page(_mfn(l1_mfn));
> -pte = &table[pfn_to_pde_idx(dfn, level)];
> +pte = &table[idx];
>  old = *pte;
>  
>  write_atomic(&pte->raw, 0);
>  
> +*free = update_contig_markers(&table->raw, idx, level, PTE_kind_null);
> +
>  unmap_domain_page(table);
>  
>  return old;
> @@ -85,7 +92,11 @@ static union amd_iommu_pte set_iommu_pte
>  if ( !old.pr || old.next_level ||
>   old.mfn != next_mfn ||
>   old.iw != iw || old.ir != ir )
> +{
>  set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> +update_contig_markers(&table->raw, pfn_to_pde_idx(dfn, level), level,
> +  PTE_kind_leaf);
> +}
>  else
>  old.pr = false; /* signal "no change" to the caller */
>  
> @@ -259,6 +270,9 @@ static int iommu_pde_from_dfn(struct dom
>  smp_wmb();
>  set_iommu_pde_present(pde, next_table_mfn, next_level, true,
>true);
> +update_contig_markers(&next_table_vaddr->raw,
> +  pfn_to_pde_idx(dfn, level),
> +  level, PTE_kind_table);
>  
>  *flush_flags |= IOMMU_FLUSHF_modified;
>  }
> @@ -284,6 +298,9 @@ static int iommu_pde_from_dfn(struct dom
>  next_table_mfn = mfn_x(page_to_mfn(table));
>  set_iommu_pde_present(pde, next_table_mfn, next_level, true,
>true);
> +update_contig_markers(&next_table_vaddr->raw,
> +  pfn_to_pde_idx(dfn, level),
> +  level, PTE_kind_table);

Would be nice if we could pack the update_contig_markers in
set_iommu_pde_present (like you do for clear_iommu_pte_present).

Thanks, Roger.



Re: [PATCH V6 1/2] libxl: Add support for Virtio disk configuration

2021-12-15 Thread Oleksandr



On 15.12.21 08:08, Juergen Gross wrote:

Hi Juergen


On 14.12.21 18:44, Oleksandr wrote:


On 14.12.21 18:03, Anthony PERARD wrote:

Hi Anthony



On Wed, Dec 08, 2021 at 06:59:43PM +0200, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

This patch adds basic support for configuring and assisting 
virtio-disk

backend (emualator) which is intended to run out of Qemu and could be
run in any domain.
Although the Virtio block device is quite different from traditional
Xen PV block device (vbd) from the toolstack point of view:
  - as the frontend is virtio-blk which is not a Xenbus driver, 
nothing

    written to Xenstore are fetched by the frontend (the vdev is not
    passed to the frontend)
  - the ring-ref/event-channel are not used for the backend<->frontend
    communication, the proposed IPC for Virtio is IOREQ/DM
it is still a "block device" and ought to be integrated in existing
"disk" handling. So, re-use (and adapt) "disk" parsing/configuration
logic to deal with Virtio devices as well.

How backend are intended to be created? Is there something listening on
xenstore?

You mention QEMU as been the backend, do you intend to have QEMU
listening on xenstore to create a virtio backend? Or maybe it is on the
command line? There is QMP as well, but it's probably a lot more
complicated as I think libxl needs refactoring for that.



No, QEMU is not involved there. The backend is a standalone application,
it is launched from the command line. The backend reads the Xenstore 
to get
the configuration and to detect when guest with the frontend is 
created/destroyed.


I think this should be reflected somehow in the configuration, as I
expect qemu might gain this functionality in the future.


I understand this and agree in general (however I am wondering whether 
this can be postponed until it is actually needed), but ...






I'm wondering whether we shouldn't split the backend from the protocol
(or specification?). Something like "protocol=virtio" (default would be
e.g. "xen") and then you could add "backend=external" for your use case?


... I am afraid, I didn't get the idea. Are we speaking about the (new?) 
disk configuration options
here or these are not disk specific things at all and to be applicable 
for all possible backends?
If the former, then could the new backendtype simply do the job? For 
example, "backendtype=virtio_external" for our current use-case and 
"backendtype=virtio_qemu"

for the possible future use-cases? Could you please clarify the idea.


Thank you.




We could later expand that to "backend=qemu" or "backend=".


Juergen


--
Regards,

Oleksandr Tyshchenko




Re: [PATCH v5 00/14] PCI devices passthrough on Arm, part 3

2021-12-15 Thread Oleksandr Andrushchenko
Hi, Roger!

On 15.12.21 16:51, Roger Pau Monné wrote:
> On Wed, Dec 15, 2021 at 12:22:32PM +, Oleksandr Andrushchenko wrote:
>> Hi, Jan!
>>
>> On 15.12.21 14:07, Jan Beulich wrote:
>>> On 15.12.2021 12:56, Oleksandr Andrushchenko wrote:
 Dear rest maintainers!

 Could you please review this series which seems to get stuck?
>>> I don't seem to have any record of you having pinged Roger as the vPCI
>>> maintainer.
>> No, I didn't. Roger is on CC, so he might shed some light on when it might
>> happen, so we, those who work on PCI passthrough on Arm,
>> can also plan the relevant upcoming (re)work: we still miss MSI/MSI-X and
>> IOMMU series which do depend on this one
> Hello,
>
> I'm quite overloaded with patch review and other stuff, since I've
> taken over the Community Manager role while George is away.
>
> There are series on the mailing list that have been pending for way
> longer, and while I understand that this is of no help or relief for
> you it wouldn't be fair for me to review this piece for work before
> other series that have been pending for longer, as other submitters
> also deserve review.
This is fair
>
> Sorry, but I think it's unlikely I will get to it until after new
> year.
Thank you in advance,
Oleksandr
>
> Thanks, Roger.


Re: [PATCH v5 00/14] PCI devices passthrough on Arm, part 3

2021-12-15 Thread Roger Pau Monné
On Wed, Dec 15, 2021 at 12:22:32PM +, Oleksandr Andrushchenko wrote:
> Hi, Jan!
> 
> On 15.12.21 14:07, Jan Beulich wrote:
> > On 15.12.2021 12:56, Oleksandr Andrushchenko wrote:
> >> Dear rest maintainers!
> >>
> >> Could you please review this series which seems to get stuck?
> > I don't seem to have any record of you having pinged Roger as the vPCI
> > maintainer.
> No, I didn't. Roger is on CC, so he might shed some light on when it might
> happen, so we, those who work on PCI passthrough on Arm,
> can also plan the relevant upcoming (re)work: we still miss MSI/MSI-X and
> IOMMU series which do depend on this one

Hello,

I'm quite overloaded with patch review and other stuff, since I've
taken over the Community Manager role while George is away.

There are series on the mailing list that have been pending for way
longer, and while I understand that this is of no help or relief for
you it wouldn't be fair for me to review this piece for work before
other series that have been pending for longer, as other submitters
also deserve review.

Sorry, but I think it's unlikely I will get to it until after new
year.

Thanks, Roger.



[PATCH] x86: xen: Fix xen_initdom_restore_msi #ifdef

2021-12-15 Thread Arnd Bergmann
From: Arnd Bergmann 

The #ifdef check around the definition doesn't match the one around the
declaration, leading to a link failure when CONFIG_XEN_DOM0 is enabled
but CONFIG_XEN_PV_DOM0 is not:

x86_64-linux-ld: arch/x86/kernel/apic/msi.o: in function 
`arch_restore_msi_irqs':
msi.c:(.text+0x29a): undefined reference to `xen_initdom_restore_msi'

Change the declaration to use the same check that was already
present around the function definition.

Fixes: ae72f3156729 ("PCI/MSI: Make arch_restore_msi_irqs() less horrible.")
Signed-off-by: Arnd Bergmann 
---
This should go on top the irq/msi branch of the tip tree, which
introduced the build regression
---
 arch/x86/include/asm/xen/hypervisor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/xen/hypervisor.h 
b/arch/x86/include/asm/xen/hypervisor.h
index 677f84d0039f..1bf2ad34188a 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -59,7 +59,7 @@ static inline bool __init xen_x2apic_para_available(void)
 
 struct pci_dev;
 
-#ifdef CONFIG_XEN_DOM0
+#ifdef CONFIG_XEN_PV_DOM0
 bool xen_initdom_restore_msi(struct pci_dev *dev);
 #else
 static inline bool xen_initdom_restore_msi(struct pci_dev *dev) { return true; 
}
-- 
2.29.2




[xen-unstable-smoke test] 167429: tolerable all pass - PUSHED

2021-12-15 Thread osstest service owner
flight 167429 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167429/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  9956fdc70f99b0f133be7f16f62417928a84622c
baseline version:
 xen  249e0f1d8f203188ccdcced5a05c2149739e1566

Last test of basis   167412  2021-12-14 13:01:35 Z1 days
Testing same since   167429  2021-12-15 10:02:58 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Jan Beulich 
  Julien Grall 
  Paul Durrant 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   249e0f1d8f..9956fdc70f  9956fdc70f99b0f133be7f16f62417928a84622c -> smoke



Re: [PATCH v2 16/18] x86: introduce helper for recording degree of contiguity in page tables

2021-12-15 Thread Roger Pau Monné
On Fri, Sep 24, 2021 at 11:55:30AM +0200, Jan Beulich wrote:
> This is a re-usable helper (kind of a template) which gets introduced
> without users so that the individual subsequent patches introducing such
> users can get committed independently of one another.
> 
> See the comment at the top of the new file. To demonstrate the effect,
> if a page table had just 16 entries, this would be the set of markers
> for a page table with fully contiguous mappings:
> 
> index  0 1 2 3 4 5 6 7 8 9 A B C D E F
> marker 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0
> 
> "Contiguous" here means not only present entries with successively
> increasing MFNs, each one suitably aligned for its slot, but also a
> respective number of all non-present entries.
> 
> Signed-off-by: Jan Beulich 
> ---
> v2: New.
> 
> --- /dev/null
> +++ b/xen/include/asm-x86/contig-marker.h
> @@ -0,0 +1,105 @@
> +#ifndef __ASM_X86_CONTIG_MARKER_H
> +#define __ASM_X86_CONTIG_MARKER_H
> +
> +/*
> + * Short of having function templates in C, the function defined below is
> + * intended to be used by multiple parties interested in recording the
> + * degree of contiguity in mappings by a single page table.
> + *
> + * Scheme: Every entry records the order of contiguous successive entries,
> + * up to the maximum order covered by that entry (which is the number of
> + * clear low bits in its index, with entry 0 being the exception using
> + * the base-2 logarithm of the number of entries in a single page table).
> + * While a few entries need touching upon update, knowing whether the
> + * table is fully contiguous (and can hence be replaced by a higher level
> + * leaf entry) is then possible by simply looking at entry 0's marker.
> + *
> + * Prereqs:
> + * - CONTIG_MASK needs to be #define-d, to a value having at least 4
> + *   contiguous bits (ignored by hardware), before including this file,
> + * - page tables to be passed here need to be initialized with correct
> + *   markers.

Given this requirement I think it would make sense to place the page
table marker initialization currently placed in iommu_alloc_pgtable as
a helper here also?

> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +/* This is the same for all anticipated users, so doesn't need passing in. */
> +#define CONTIG_LEVEL_SHIFT 9
> +#define CONTIG_NR  (1 << CONTIG_LEVEL_SHIFT)
> +
> +#define GET_MARKER(e) MASK_EXTR(e, CONTIG_MASK)
> +#define SET_MARKER(e, m) \
> +((void)(e = ((e) & ~CONTIG_MASK) | MASK_INSR(m, CONTIG_MASK)))
> +
> +enum PTE_kind {
> +PTE_kind_null,
> +PTE_kind_leaf,
> +PTE_kind_table,
> +};
> +
> +static bool update_contig_markers(uint64_t *pt, unsigned int idx,

Maybe pt_update_contig_markers, so it's not such a generic name.

> +  unsigned int level, enum PTE_kind kind)
> +{
> +unsigned int b, i = idx;
> +unsigned int shift = (level - 1) * CONTIG_LEVEL_SHIFT + PAGE_SHIFT;
> +
> +ASSERT(idx < CONTIG_NR);
> +ASSERT(!(pt[idx] & CONTIG_MASK));
> +
> +/* Step 1: Reduce markers in lower numbered entries. */
> +while ( i )
> +{
> +b = find_first_set_bit(i);
> +i &= ~(1U << b);
> +if ( GET_MARKER(pt[i]) > b )
> +SET_MARKER(pt[i], b);
> +}
> +
> +/* An intermediate table is never contiguous with anything. */
> +if ( kind == PTE_kind_table )
> +return false;
> +
> +/*
> + * Present entries need in sync index and address to be a candidate
> + * for being contiguous: What we're after is whether ultimately the
> + * intermediate table can be replaced by a superpage.
> + */
> +if ( kind != PTE_kind_null &&
> + idx != ((pt[idx] >> shift) & (CONTIG_NR - 1)) )

Don't you just need to check that the address is aligned to at least
idx, not that it's exactly aligned?

AFAICT this will return early if the address has an alignment that
exceeds the requirements imposed by idx.

> +return false;
> +
> +/* Step 2: Check higher numbered entries for contiguity. */
> +for ( b = 0; b < CONTIG_LEVEL_SHIFT && !(idx & (1U << b)); ++b )
> +{
> +i = idx | (1U << b);
> +if ( (kind == PTE_kind_leaf
> +  ? ((pt[i] ^ pt[idx]) & ~CONTIG_MASK) != (1ULL << (b + shift))

Maybe this could be a macro, CHECK_CONTIG or some such? It's also used
below.

I would also think this would be clearer as:

(pt[idx] & ~CONTIG_MASK) + (1ULL << (shift + b)) == (pt[i] & ~CONTIG_MASK)

> +  : pt[i] & ~CONTIG_MASK) ||

Isn't PTE_kind_null always supposed to be empty? (ie: wouldn't this
check always succeed?)

> + GET_MARKER(pt[i]) != b )
> +break;
> +}
> +
> +/* Step 3: Update markers in this and lower numbered entries. */
> +for ( ; SET_MARKER(pt[idx], b), b < CONTIG_LEVEL_SHIFT; ++b )
> +{
> +i = idx ^ (1U << b);
> +if ( (kind == PTE_kind_leaf
> +  ? ((pt[i] ^ pt[idx]) & ~CONTIG_MASK) != (1ULL << (b + shift))
> +  : p

Re: [PATCH 09/10] iommu/ipmmu-vmsa: Use refcount for the micro-TLBs

2021-12-15 Thread Oleksandr



On 15.12.21 04:58, Volodymyr Babchuk wrote:

Hi Oleksandr,


Hi Volodymyr





Oleksandr Tyshchenko  writes:


From: Oleksandr Tyshchenko 

Reference-count the micro-TLBs as several bus masters can be
connected to the same micro-TLB (and drop TODO comment).
This wasn't an issue so far, since the platform devices
(this driver deals with) get assigned/deassigned together during
domain creation/destruction. But, in order to support PCI devices
(which are hot-pluggable) in the near future we will need to
take care of.

Signed-off-by: Oleksandr Tyshchenko 
---
  xen/drivers/passthrough/arm/ipmmu-vmsa.c | 17 +
  1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c 
b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
index 22dd84e..32609f8 100644
--- a/xen/drivers/passthrough/arm/ipmmu-vmsa.c
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -134,6 +134,7 @@ struct ipmmu_vmsa_device {
  spinlock_t lock;/* Protects ctx and domains[] */
  DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
  struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+unsigned int utlb_refcount[IPMMU_UTLB_MAX];
  const struct ipmmu_features *features;
  };
  
@@ -477,13 +478,12 @@ static int ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,

  }
  }
  
-/*

- * TODO: Reference-count the micro-TLB as several bus masters can be
- * connected to the same micro-TLB.
- */
-ipmmu_imuasid_write(mmu, utlb, 0);
-ipmmu_imuctr_write(mmu, utlb, imuctr |
-   IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
+if ( mmu->utlb_refcount[utlb]++ == 0 )
+{
+ipmmu_imuasid_write(mmu, utlb, 0);
+ipmmu_imuctr_write(mmu, utlb, imuctr |
+   IMUCTR_TTSEL_MMU(domain->context_id) | 
IMUCTR_MMUEN);
+}
  
  return 0;

  }
@@ -494,7 +494,8 @@ static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain 
*domain,
  {
  struct ipmmu_vmsa_device *mmu = domain->mmu;
  
-ipmmu_imuctr_write(mmu, utlb, 0);

It would be great to have

+  ASSERT(mmu->utlb_refcount[utlb] > 0);

there. Just in case.


ok, will add.


Thank you.





+if ( --mmu->utlb_refcount[utlb] == 0 )
+ipmmu_imuctr_write(mmu, utlb, 0);
  }
  
  /* Domain/Context Management */



--
Regards,

Oleksandr Tyshchenko




[xen-4.14-testing test] 167415: tolerable FAIL - PUSHED

2021-12-15 Thread osstest service owner
flight 167415 xen-4.14-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/167415/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167216
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167216
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167216
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167216
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167216
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167216
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167216
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167216
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167216
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167216
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167216
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167216
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  cbadf67bcab4e29c883410db393f4f5ef34df04a
baseline version:
 xen  c4cf5388652e

Re: [PATCH V6 1/2] libxl: Add support for Virtio disk configuration

2021-12-15 Thread Oleksandr



On 14.12.21 19:44, Oleksandr wrote:

Hi Anthony



On 14.12.21 18:03, Anthony PERARD wrote:

Hi Anthony



On Wed, Dec 08, 2021 at 06:59:43PM +0200, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

This patch adds basic support for configuring and assisting virtio-disk
backend (emualator) which is intended to run out of Qemu and could be
run in any domain.
Although the Virtio block device is quite different from traditional
Xen PV block device (vbd) from the toolstack point of view:
  - as the frontend is virtio-blk which is not a Xenbus driver, nothing
    written to Xenstore are fetched by the frontend (the vdev is not
    passed to the frontend)
  - the ring-ref/event-channel are not used for the backend<->frontend
    communication, the proposed IPC for Virtio is IOREQ/DM
it is still a "block device" and ought to be integrated in existing
"disk" handling. So, re-use (and adapt) "disk" parsing/configuration
logic to deal with Virtio devices as well.

How backend are intended to be created? Is there something listening on
xenstore?

You mention QEMU as been the backend, do you intend to have QEMU
listening on xenstore to create a virtio backend? Or maybe it is on the
command line? There is QMP as well, but it's probably a lot more
complicated as I think libxl needs refactoring for that.



No, QEMU is not involved there. The backend is a standalone application,
it is launched from the command line. The backend reads the Xenstore 
to get
the configuration and to detect when guest with the frontend is 
created/destroyed.






Besides introducing new disk backend type (LIBXL_DISK_BACKEND_VIRTIO)
introduce new device kind (LIBXL__DEVICE_KIND_VIRTIO_DISK) as current
one (LIBXL__DEVICE_KIND_VBD) doesn't fit into Virtio disk model.

In order to inform the toolstack that Virtio disk needs to be used
extend "disk" configuration by introducing new "virtio" flag.
An example of domain configuration:
disk = [ 'backend=DomD, phy:/dev/mmcblk1p3, xvda1, rw, virtio' ]

This new "virtio" flags feels strange. Would having something like
"backendtype=virtio" works?



IIRC I considered "backendtype=virtio" option, but decided to go "an 
extra virtio flag" route, however I don't remember what exactly 
affected my decision.
But, I see your point and agree, I will analyze whether we can use 
"backendtype=virtio", I hope that we can, but need to make sure.



I have just rechecked/experimented, we can use "backendtype=virtio" 
instead of an extra "virtio" flags.


disk = [ 'backend=DomD, phy:/dev/mmcblk0p3, xvda1, backendtype=virtio' ]

Also backendtype section in xl-disk-configuration.5.pod.in really wants 
updating as

at least the first sentence is not true for virtio backend:

"This does not affect the guest's view of the device.  It controls
which software implementation of the Xen backend driver is used."

So shall I go this direction?









Please note, this patch is not enough for virtio-disk to work
on Xen (Arm), as for every Virtio device (including disk) we need
to allocate Virtio MMIO params (IRQ and memory region) and pass
them to the backend, also update Guest device-tree. The subsequent
patch will add these missing bits. For the current patch,
the default "irq" and "base" are just written to the Xenstore.

This feels like the patches are in the wrong order. I don't think it is
a good idea to allow to create broken configuration until a follow-up
patch fixes things.


Yes, I also think this is not an ideal splitting, so I decided to 
write a few sentences to draw reviewer's attention to this.
The problem is that second patch adds Arm bits which are local to 
libs/light/libxl_arm.c and if I put it before the current one I will 
break the bisectability

as there will be no callers yet.





diff --git a/tools/xl/xl_block.c b/tools/xl/xl_block.c
index 70eed43..50a4d45 100644
--- a/tools/xl/xl_block.c
+++ b/tools/xl/xl_block.c
@@ -50,6 +50,11 @@ int main_blockattach(int argc, char **argv)
  return 0;
  }
  +    if (disk.virtio) {
+    fprintf(stderr, "block-attach is not supported for Virtio 
device\n");

+    return 1;
+    }

This might not be the right place. libxl might want to prevent hotplug
instead.


Could you please point me the right place where to prevent hotplug?


Thank you.




Thanks,


--
Regards,

Oleksandr Tyshchenko




[xen-4.15-testing test] 167416: tolerable FAIL - PUSHED

2021-12-15 Thread osstest service owner
flight 167416 xen-4.15-testing real [real]
flight 167433 xen-4.15-testing real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167416/
http://logs.test-lab.xenproject.org/osstest/logs/167433/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-credit2  18 guest-localmigrate  fail pass in 167433-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167217
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167217
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167217
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167217
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167217
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167217
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167217
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167217
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167217
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167217
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167217
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167217
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkf

Re: [PATCH v5 00/14] PCI devices passthrough on Arm, part 3

2021-12-15 Thread Oleksandr Andrushchenko
Hi, Jan!

On 15.12.21 14:07, Jan Beulich wrote:
> On 15.12.2021 12:56, Oleksandr Andrushchenko wrote:
>> Dear rest maintainers!
>>
>> Could you please review this series which seems to get stuck?
> I don't seem to have any record of you having pinged Roger as the vPCI
> maintainer.
No, I didn't. Roger is on CC, so he might shed some light on when it might
happen, so we, those who work on PCI passthrough on Arm,
can also plan the relevant upcoming (re)work: we still miss MSI/MSI-X and
IOMMU series which do depend on this one
>   Also, as said on the Community Call when discussing this,
> I don't think I'd view this series as in a state where an emergency
> fallback to REST would be appropriate.
No emergency here, but v5 without any ack/nack might ring a bell
Which made me write this e-mail
>   As indicated, in particular I
> wouldn't want to commit any of it without Roger's basic agreement.
This is clear as it is up to the relevant maintainer to commit which
I might not expect from the rest maintainers
>   IOW
> while REST maintainer reviews may help making progress (but as much
> would reviews by anyone else),
This is my goal: to have ack/nack at least from the REST mainatainers
>   they may not put the series in a state
> where it could go in.
Fair enough
>
> In any event, as also said on the call, afaic this series is in my to-
> be-reviewed folder,
Appreciate this
>   alongside a few dozen more patches. I'll get to it
> if nobody else would, but I can't predict when that's going to be.
Thank you
> There's simply too much other stuff in need of taking care of.
Sure, our companies do want us to do something useful for them as well, but 
review

Thank you,
Oleksandr
>
> Jan
>
>> Thank you in advance,
>> Oleksandr
>>
>> On 25.11.21 13:02, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko 
>>>
>>> Hi, all!
>>>
>>> 1. This patch series is focusing on vPCI and adds support for non-identity
>>> PCI BAR mappings which is required while passing through a PCI device to
>>> a guest. The highlights are:
>>>
>>> - Add relevant vpci register handlers when assigning PCI device to a domain
>>> and remove those when de-assigning. This allows having different
>>> handlers for different domains, e.g. hwdom and other guests.
>>>
>>> - Emulate guest BAR register values based on physical BAR values.
>>> This allows creating a guest view of the registers and emulates
>>> size and properties probe as it is done during PCI device enumeration by
>>> the guest.
>>>
>>> - Instead of handling a single range set, that contains all the memory
>>> regions of all the BARs and ROM, have them per BAR.
>>>
>>> - Take into account guest's BAR view and program its p2m accordingly:
>>> gfn is guest's view of the BAR and mfn is the physical BAR value as set
>>> up by the host bridge in the hardware domain.
>>> This way hardware doamin sees physical BAR values and guest sees
>>> emulated ones.
>>>
>>> 2. The series also adds support for virtual PCI bus topology for guests:
>>>- We emulate a single host bridge for the guest, so segment is always 0.
>>>- The implementation is limited to 32 devices which are allowed on
>>>  a single PCI bus.
>>>- The virtual bus number is set to 0, so virtual devices are seen
>>>  as embedded endpoints behind the root complex.
>>>
>>> 3. The series has complete re-work of the locking scheme used/absent before 
>>> with
>>> the help of the work started by Roger [1]:
>>> [PATCH v5 03/13] vpci: move lock outside of struct vpci
>>>
>>> This way the lock can be used to check whether vpci is present, and
>>> removal can be performed while holding the lock, in order to make
>>> sure there are no accesses to the contents of the vpci struct.
>>> Previously removal could race with vpci_read for example, since the
>>> lock was dropped prior to freeing pdev->vpci.
>>> This also solves synchronization issues between all vPCI code entities
>>> which could run in parallel.
>>>
>>> 4. There is an outstanding TODO left unimplemented by this series:
>>> for unprivileged guests vpci_{read|write} need to be re-worked
>>> to not passthrough accesses to the registers not explicitly handled
>>> by the corresponding vPCI handlers: without fixing that passthrough
>>> to guests is completely unsafe as Xen allows them full access to
>>> the registers.
>>>
>>> Xen needs to be sure that every register a guest accesses is not
>>> going to cause the system to malfunction, so Xen needs to keep a
>>> list of the registers it is safe for a guest to access.
>>>
>>> For example, we should only expose the PCI capabilities that we know
>>> are safe for a guest to use, i.e.: MSI and MSI-X initially.
>>> The rest of the capabilities should be blocked from guest access,
>>> unless we audit them and declare safe for a guest to access.
>>>
>>> As a reference we might want to look at the approach currently used
>>> by QEMU in order to do PCI passthrough. A very limited set of PCI
>>>

Re: [PATCH] x86/cpuid: Introduce dom0-cpuid command line option

2021-12-15 Thread Andrew Cooper
On 15/12/2021 08:34, Jan Beulich wrote:
> On 14.12.2021 22:16, Andrew Cooper wrote:
>> Specifically, this lets the user opt in to non-default for dom0.
>>
>> Split features[] out of parse_xen_cpuid(), giving it a lightly more
>> appropraite name, so it can be shared with parse_xen_cpuid().
> With the latter one I guess you mean parse_dom0_cpuid()?

I do, yes.  This is a copy/paste error.
>> Collect all dom0 settings together in dom0_{en,dis}able_feat[], and apply it
>> to dom0's policy when other tweaks are being made.
>>
>> Signed-off-by: Andrew Cooper 
>> ---
>> CC: Jan Beulich 
>> CC: Roger Pau Monné 
>> CC: Wei Liu 
>> CC: Daniel Smith 
>>
>> RFC, because I think we've got a preexisting error with late hwdom here.  We
>> really should not be cobbering a late hwdom's settings (which were provided 
>> in
>> the usual way by the toolstack in dom0).
> For ITSC I think also covering late hwdom is at least acceptable. For the
> speculation controls I'm less certain (but as per the comment there they're
> temporary only anyway), and I agree the command line option here should
> strictly only apply to Dom0 (or else, as a minor aspect, the option also
> would better be named "hwdom-cpuid=").
>
>> Furthermore, the distinction gets more murky in a hyperlaunch future where
>> multiple domains may be constructed by Xen, and there is reason to expect 
>> that
>> a full toolstack-like configuration is made available for them.
> Like above, anything created via the toolstack interfaces should use the
> toolstack controls. If there was something dom0less-like on x86, domains
> created that way (without toolstack involvement) would instead want to
> have another way of controlling their CPUID settings.
>
>> One option might be to remove the special case from 
>> init_domain_cpuid_policy()
>> and instead make a call into the cpuid code from create_dom0().  It would 
>> have
>> to be placed between domain_create() and alloc_dom0_vcpu0() for dynamic 
>> sizing
>> of the FPU block to work.  Thoughts?
> As said above, I think the ITSC special case could stay. But apart from
> this I agree.

So I disagree with keeping the ITSC special case.

I do agree that a non-dom0 hwdom probably wants ITSC, but ITSC
explicitly can be controlled by the toolstack, and therefore Xen should
not be overriding the toolstack's decision.

IMO, this really does want to remain dom0-cpuid= rather than
hwdom-cpuid=.  It is specific to the domain which Xen creates as part of
bringing the system up.

In a future world with hyperlaunch/dom0less/etc, there is (or should be)
adequate provision to specify settings like this for all created
domains.  In this case, perhaps hyperlaunch declares that dom0-cpuid=
gets ignored, because we probably don't want random command line
settings impacting one N'th of the carefully curated system configuration.

(Also, any Xen constructed domains in a hyperlaunch world probably want
ITSC, but I'll again argue firmly that Xen should not be special-casing
this one feature.)
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -801,6 +801,22 @@ Controls for how dom0 is constructed on x86 systems.
>>  
>>  If using this option is necessary to fix an issue, please report a bug.
>>  
>> +### dom0-cpuid
>> += List of comma separated booleans
>> +
>> +Applicability: x86
>> +
>> +This option allows for fine tuning of the facilities dom0 will use, after
>> +accounting for hardware capabilities and Xen settings as enumerated via 
>> CPUID.
>> +
>> +Options are accepted in positive and negative form, to enable or disable
>> +specific features.  All selections via this mechanism are subject to normal
>> +CPU Policy safety logic.
>> +
>> +This option is intended for developers to opt dom0 into non-default 
>> features,
>> +and is not intended for use in production circumstances.  If using this 
>> option
>> +is necessary to fix an issue, please report a bug.
> You may want to state explicitly that disables take priority over enables,
> as per the present implementation. Personally I would find it better if the
> item specified last took effect. This is, as mentioned in other contexts,
> so one can override earlier settings (e.g. in a xen.cfg file used with
> xen.efi) by simply appending to the command line.

Order of enabled/disabled I feel is an implementation detail.  Perhaps
what to put in the docs is that specifying both forms is unsupported,
but "this is for developers only" is already a fairly big hint.

The only way to make a latest-takes-priority scheme work is to use
string_param() (creating an arbitrary upper bound limit), and parsing
the list during dom0 construction.  For a developer-only option, I
really don't think the complexity is worth the effort.
>> @@ -97,6 +98,73 @@ static int __init parse_xen_cpuid(const char *s)
>>  }
>>  custom_param("cpuid", parse_xen_cpuid);
>>  
>> +static uint32_t __hwdom_initdata dom0_enable_feat[FSCAPINTS];
>> +static uint32_t __hwdom_initd

Re: [PATCH v5 00/14] PCI devices passthrough on Arm, part 3

2021-12-15 Thread Jan Beulich
On 15.12.2021 12:56, Oleksandr Andrushchenko wrote:
> Dear rest maintainers!
> 
> Could you please review this series which seems to get stuck?

I don't seem to have any record of you having pinged Roger as the vPCI
maintainer. Also, as said on the Community Call when discussing this,
I don't think I'd view this series as in a state where an emergency
fallback to REST would be appropriate. As indicated, in particular I
wouldn't want to commit any of it without Roger's basic agreement. IOW
while REST maintainer reviews may help making progress (but as much
would reviews by anyone else), they may not put the series in a state
where it could go in.

In any event, as also said on the call, afaic this series is in my to-
be-reviewed folder, alongside a few dozen more patches. I'll get to it
if nobody else would, but I can't predict when that's going to be.
There's simply too much other stuff in need of taking care of.

Jan

> Thank you in advance,
> Oleksandr
> 
> On 25.11.21 13:02, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko 
>>
>> Hi, all!
>>
>> 1. This patch series is focusing on vPCI and adds support for non-identity
>> PCI BAR mappings which is required while passing through a PCI device to
>> a guest. The highlights are:
>>
>> - Add relevant vpci register handlers when assigning PCI device to a domain
>>and remove those when de-assigning. This allows having different
>>handlers for different domains, e.g. hwdom and other guests.
>>
>> - Emulate guest BAR register values based on physical BAR values.
>>This allows creating a guest view of the registers and emulates
>>size and properties probe as it is done during PCI device enumeration by
>>the guest.
>>
>> - Instead of handling a single range set, that contains all the memory
>>regions of all the BARs and ROM, have them per BAR.
>>
>> - Take into account guest's BAR view and program its p2m accordingly:
>>gfn is guest's view of the BAR and mfn is the physical BAR value as set
>>up by the host bridge in the hardware domain.
>>This way hardware doamin sees physical BAR values and guest sees
>>emulated ones.
>>
>> 2. The series also adds support for virtual PCI bus topology for guests:
>>   - We emulate a single host bridge for the guest, so segment is always 0.
>>   - The implementation is limited to 32 devices which are allowed on
>> a single PCI bus.
>>   - The virtual bus number is set to 0, so virtual devices are seen
>> as embedded endpoints behind the root complex.
>>
>> 3. The series has complete re-work of the locking scheme used/absent before 
>> with
>> the help of the work started by Roger [1]:
>> [PATCH v5 03/13] vpci: move lock outside of struct vpci
>>
>> This way the lock can be used to check whether vpci is present, and
>> removal can be performed while holding the lock, in order to make
>> sure there are no accesses to the contents of the vpci struct.
>> Previously removal could race with vpci_read for example, since the
>> lock was dropped prior to freeing pdev->vpci.
>> This also solves synchronization issues between all vPCI code entities
>> which could run in parallel.
>>
>> 4. There is an outstanding TODO left unimplemented by this series:
>> for unprivileged guests vpci_{read|write} need to be re-worked
>> to not passthrough accesses to the registers not explicitly handled
>> by the corresponding vPCI handlers: without fixing that passthrough
>> to guests is completely unsafe as Xen allows them full access to
>> the registers.
>>
>> Xen needs to be sure that every register a guest accesses is not
>> going to cause the system to malfunction, so Xen needs to keep a
>> list of the registers it is safe for a guest to access.
>>
>> For example, we should only expose the PCI capabilities that we know
>> are safe for a guest to use, i.e.: MSI and MSI-X initially.
>> The rest of the capabilities should be blocked from guest access,
>> unless we audit them and declare safe for a guest to access.
>>
>> As a reference we might want to look at the approach currently used
>> by QEMU in order to do PCI passthrough. A very limited set of PCI
>> capabilities known to be safe for untrusted access are exposed to the
>> guest and registers need to be explicitly handled or else access is
>> rejected. Xen needs a fairly similar model in vPCI or else none of
>> this will be safe for unprivileged access.
>>
>> 5. The series was also tested on:
>>   - x86 PVH Dom0 and doesn't break it.
>>   - x86 HVM with PCI passthrough to DomU and doesn't break it.
>>
>> Thank you,
>> Oleksandr
>>
>> [1] 
>> https://lore.kernel.org/xen-devel/20180717094830.54806-2-roger@citrix.com/
>>
>> Oleksandr Andrushchenko (13):
>>rangeset: add RANGESETF_no_print flag
>>vpci: fix function attributes for vpci_process_pending
>>vpci: cancel pending map/unmap on vpci removal
>>vpci: add hooks for PCI device assign/de-assign
>>vpci/header: implement guest BAR register handlers
>>vpci/header: ha

Re: [PATCH v5 00/14] PCI devices passthrough on Arm, part 3

2021-12-15 Thread Oleksandr Andrushchenko
Dear rest maintainers!

Could you please review this series which seems to get stuck?

Thank you in advance,
Oleksandr

On 25.11.21 13:02, Oleksandr Andrushchenko wrote:
> From: Oleksandr Andrushchenko 
>
> Hi, all!
>
> 1. This patch series is focusing on vPCI and adds support for non-identity
> PCI BAR mappings which is required while passing through a PCI device to
> a guest. The highlights are:
>
> - Add relevant vpci register handlers when assigning PCI device to a domain
>and remove those when de-assigning. This allows having different
>handlers for different domains, e.g. hwdom and other guests.
>
> - Emulate guest BAR register values based on physical BAR values.
>This allows creating a guest view of the registers and emulates
>size and properties probe as it is done during PCI device enumeration by
>the guest.
>
> - Instead of handling a single range set, that contains all the memory
>regions of all the BARs and ROM, have them per BAR.
>
> - Take into account guest's BAR view and program its p2m accordingly:
>gfn is guest's view of the BAR and mfn is the physical BAR value as set
>up by the host bridge in the hardware domain.
>This way hardware doamin sees physical BAR values and guest sees
>emulated ones.
>
> 2. The series also adds support for virtual PCI bus topology for guests:
>   - We emulate a single host bridge for the guest, so segment is always 0.
>   - The implementation is limited to 32 devices which are allowed on
> a single PCI bus.
>   - The virtual bus number is set to 0, so virtual devices are seen
> as embedded endpoints behind the root complex.
>
> 3. The series has complete re-work of the locking scheme used/absent before 
> with
> the help of the work started by Roger [1]:
> [PATCH v5 03/13] vpci: move lock outside of struct vpci
>
> This way the lock can be used to check whether vpci is present, and
> removal can be performed while holding the lock, in order to make
> sure there are no accesses to the contents of the vpci struct.
> Previously removal could race with vpci_read for example, since the
> lock was dropped prior to freeing pdev->vpci.
> This also solves synchronization issues between all vPCI code entities
> which could run in parallel.
>
> 4. There is an outstanding TODO left unimplemented by this series:
> for unprivileged guests vpci_{read|write} need to be re-worked
> to not passthrough accesses to the registers not explicitly handled
> by the corresponding vPCI handlers: without fixing that passthrough
> to guests is completely unsafe as Xen allows them full access to
> the registers.
>
> Xen needs to be sure that every register a guest accesses is not
> going to cause the system to malfunction, so Xen needs to keep a
> list of the registers it is safe for a guest to access.
>
> For example, we should only expose the PCI capabilities that we know
> are safe for a guest to use, i.e.: MSI and MSI-X initially.
> The rest of the capabilities should be blocked from guest access,
> unless we audit them and declare safe for a guest to access.
>
> As a reference we might want to look at the approach currently used
> by QEMU in order to do PCI passthrough. A very limited set of PCI
> capabilities known to be safe for untrusted access are exposed to the
> guest and registers need to be explicitly handled or else access is
> rejected. Xen needs a fairly similar model in vPCI or else none of
> this will be safe for unprivileged access.
>
> 5. The series was also tested on:
>   - x86 PVH Dom0 and doesn't break it.
>   - x86 HVM with PCI passthrough to DomU and doesn't break it.
>
> Thank you,
> Oleksandr
>
> [1] 
> https://lore.kernel.org/xen-devel/20180717094830.54806-2-roger@citrix.com/
>
> Oleksandr Andrushchenko (13):
>rangeset: add RANGESETF_no_print flag
>vpci: fix function attributes for vpci_process_pending
>vpci: cancel pending map/unmap on vpci removal
>vpci: add hooks for PCI device assign/de-assign
>vpci/header: implement guest BAR register handlers
>vpci/header: handle p2m range sets per BAR
>vpci/header: program p2m with guest BAR view
>vpci/header: emulate PCI_COMMAND register for guests
>vpci/header: reset the command register when adding devices
>vpci: add initial support for virtual PCI bus topology
>xen/arm: translate virtual PCI bus topology for guests
>xen/arm: account IO handlers for emulated PCI MSI-X
>vpci: add TODO for the registers not explicitly handled
>
> Roger Pau Monne (1):
>vpci: move lock outside of struct vpci
>
>   tools/tests/vpci/emul.h   |   5 +-
>   tools/tests/vpci/main.c   |   4 +-
>   xen/arch/arm/vpci.c   |  33 +++-
>   xen/arch/x86/hvm/vmsi.c   |   8 +-
>   xen/common/rangeset.c |   5 +-
>   xen/drivers/Kconfig   |   4 +
>   xen/drivers/passthrough/pci.c |  11 ++
>   xen/drivers/vpci/header.c | 352 +++---
>   xen/drivers/vpci/msi.c|  11 +-
>   xen/dr

[xen-unstable-coverity test] 167428: regressions - ALL FAIL

2021-12-15 Thread osstest service owner
flight 167428 xen-unstable-coverity real [real]
flight 167430 xen-unstable-coverity real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167428/
http://logs.test-lab.xenproject.org/osstest/logs/167430/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 coverity-amd647 coverity-upload  fail REGR. vs. 167384

version targeted for testing:
 xen  249e0f1d8f203188ccdcced5a05c2149739e1566
baseline version:
 xen  df3e1a5efe700a9f59eced801cac73f9fd02a0e2

Last test of basis   167384  2021-12-12 09:20:52 Z3 days
Testing same since   167428  2021-12-15 09:21:06 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Oleksandr Andrushchenko 
  Paul Durrant 

jobs:
 coverity-amd64   fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 249e0f1d8f203188ccdcced5a05c2149739e1566
Author: Andrew Cooper 
Date:   Mon Dec 13 20:33:42 2021 +

x86/cpuid: Fix TSXLDTRK definition

TSXLDTRK lives in CPUID leaf 7[0].edx, not 7[0].ecx.

Bit 16 in ecx is LA57.

Fixes: a6d1b558471f ("x86emul: support X{SUS,RES}LDTRK")
Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 

commit 1382241fe880d48e109f2056cec052bb3919a9d1
Author: Juergen Gross 
Date:   Tue Dec 14 09:50:07 2021 +0100

perfc: drop calls_to_multicall performance counter

The calls_to_multicall performance counter is basically redundant to
the multicall hypercall counter. The only difference is the counting
of continuation calls, which isn't really that interesting.

Drop the calls_to_multicall performance counter.

Suggested-by: Jan Beulich 
Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich 

commit 668dd44902bbaf52f8e7214e0da060b6ec962e88
Author: Juergen Gross 
Date:   Tue Dec 14 09:49:23 2021 +0100

x86/perfc: add hypercall performance counters for hvm, correct pv

The HVM hypercall handler is missing incrementing the per hypercall
counters. Add that.

The counters for PV are handled wrong, as they are not using
perf_incra() with the number of the hypercall as index, but are
incrementing the first hypercall entry (set_trap_table) for each
hypercall. Fix that.

Signed-off-by: Juergen Gross 
Reviewed-by: Jan Beulich 

commit 7b99e7258559c9caa235d9faf323b22c68e4a581
Author: Jan Beulich 
Date:   Tue Dec 14 09:48:17 2021 +0100

x86emul: drop "seg" parameter from insn_fetch() hook

This is specified (and asserted for in a number of places) to always be
CS. Passing this as an argument in various places is therefore
pointless. The price to pay is two simple new functions, with the
benefit of the PTWR case now gaining a more appropriate error code.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 
Acked-by: Paul Durrant 

commit c49ee0329ff3de98722fd74ed5ba6d9665701e54
Author: Jan Beulich 
Date:   Tue Dec 14 09:47:31 2021 +0100

SUPPORT.md: limit security support for hosts with very much memory

Sufficient and in particular regular testing on very large hosts cannot
currently be guaranteed. Anyone wanting us to support larger hosts is
free to propose so, but will need to supply not only test results, but
also a test plan.

This is a follow-up to XSA-385.

Signed-off-by: Jan Beulich 
Acked-by: Andrew Cooper 
Acked-by: Julien Grall 

commit 53ed194539ddbea4f6aecb1b7c2f33aa8c0201d9
Author: Jan Beulich 
Date:   Tue Dec 14 09:46:48 2021 +0100

x86/monitor: don't open-code hvm_has_set_descriptor_access_exiting()

Signed-off-by: Jan Beulich 
Acked-by: Andrew Cooper 
Reviewed by: Alexandru Isaila 

commit 7dc0233f534f64e7f3ee71e74e05dd5ab8a24808
Author: Oleksandr Andrushchenko 
Date:   Tue Dec 14 09:44:44 2021 +0100

vpci: fix function attributes for vpci_process_pending

vpci_process_pending is defined with different attributes, e.g.
with __must_check if CONFIG_HAS_VPCI enabled and not otherwise.
Fix this by defining both of the definitions with __must_check.

Fixes: 14583a590783 ("7fbb0

Re: [PATCH] x86/cpuid: Introduce dom0-cpuid command line option

2021-12-15 Thread Daniel P. Smith

On 12/14/21 4:16 PM, Andrew Cooper wrote:

Specifically, this lets the user opt in to non-default for dom0.

Split features[] out of parse_xen_cpuid(), giving it a lightly more
appropraite name, so it can be shared with parse_xen_cpuid().

Collect all dom0 settings together in dom0_{en,dis}able_feat[], and apply it
to dom0's policy when other tweaks are being made.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
CC: Daniel Smith 

RFC, because I think we've got a preexisting error with late hwdom here.  We
really should not be cobbering a late hwdom's settings (which were provided in
the usual way by the toolstack in dom0).

Furthermore, the distinction gets more murky in a hyperlaunch future where
multiple domains may be constructed by Xen, and there is reason to expect that
a full toolstack-like configuration is made available for them.


For hyperlaunch, perhaps Christohper and I should revisit the DTB to add 
a cpuid field/mask where CPU features can be masked out.



One option might be to remove the special case from init_domain_cpuid_policy()
and instead make a call into the cpuid code from create_dom0().  It would have
to be placed between domain_create() and alloc_dom0_vcpu0() for dynamic sizing
of the FPU block to work.  Thoughts?



v/r,
dps



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Michal Orzel



On 15.12.2021 11:32, Jan Beulich wrote:
> (Re-sending an abridged version, as apparently spam filters didn't like
> the original message with more retained context; I'll have to see whether
> this one also isn't liked. Sorry.)
> 
> On 15.12.2021 10:48, Michal Orzel wrote:
>> This patch and the problem it solves is about clearing top 32bits of all gp 
>> registers so not only x0,x1.
> 
> That's well understood. Yet for everything still in registers simply
> using mov ahead of the respective push (as you had it) is still
> preferable imo.
> 
> Jan
> 

In that case let's wait for Julien's opinion to decide whether I should get 
back to the previous
solution with mov or to the stack solution.

Cheers,
Michal



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Jan Beulich
(Re-sending an abridged version, as apparently spam filters didn't like
the original message with more retained context; I'll have to see whether
this one also isn't liked. Sorry.)

On 15.12.2021 10:48, Michal Orzel wrote:
> This patch and the problem it solves is about clearing top 32bits of all gp 
> registers so not only x0,x1.

That's well understood. Yet for everything still in registers simply
using mov ahead of the respective push (as you had it) is still
preferable imo.

Jan




Re: [RFC v1 1/5] xen/arm: add support for Renesas R-Car Gen3 platform

2021-12-15 Thread Oleksandr Tyshchenko
On Tue, Dec 14, 2021 at 11:35 AM Oleksii Moisieiev <
oleksii_moisie...@epam.com> wrote:

Hi Oleksii

[sorry for the possible format issues]

Implementation includes platform-specific smc handler for rcar3 platform.
>
> Signed-off-by: Oleksii Moisieiev 
> ---
>  xen/arch/arm/platforms/Makefile |  1 +
>  xen/arch/arm/platforms/rcar3.c  | 46 +
>  2 files changed, 47 insertions(+)
>  create mode 100644 xen/arch/arm/platforms/rcar3.c
>
> diff --git a/xen/arch/arm/platforms/Makefile
> b/xen/arch/arm/platforms/Makefile
> index 8632f4115f..b64c25de6c 100644
> --- a/xen/arch/arm/platforms/Makefile
> +++ b/xen/arch/arm/platforms/Makefile
> @@ -4,6 +4,7 @@ obj-$(CONFIG_ALL32_PLAT) += exynos5.o
>  obj-$(CONFIG_ALL32_PLAT) += midway.o
>  obj-$(CONFIG_ALL32_PLAT) += omap5.o
>  obj-$(CONFIG_ALL32_PLAT) += rcar2.o
> +obj-$(CONFIG_RCAR3) += rcar3.o
>  obj-$(CONFIG_ALL64_PLAT) += seattle.o
>  obj-$(CONFIG_ALL_PLAT)   += sunxi.o
>  obj-$(CONFIG_ALL64_PLAT) += thunderx.o
> diff --git a/xen/arch/arm/platforms/rcar3.c
> b/xen/arch/arm/platforms/rcar3.c
> new file mode 100644
> index 00..d740145c71
> --- /dev/null
> +++ b/xen/arch/arm/platforms/rcar3.c
> @@ -0,0 +1,46 @@
> +/*
> + * xen/arch/arm/platforms/rcar3.c
> + *
> + * Renesas R-Car Gen3 specific settings
> + *
> + * Oleksii Moisieiev 
> + * Copyright (C) 2021 EPAM Systems
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include 
> +
> +static bool rcar3_smc(struct cpu_user_regs *regs)
> +{
> +return false;
> +}
> +
> +static const char *const rcar3_dt_compat[] __initconst =
> +{
> +"renesas,r8a7795",
> +"renesas,r8a7796",
>


Please note that since Linux commit:
"9c9f7891093b02eb64ca4e1c7ab776a4296c058f soc: renesas: Identify R-Car
M3-W+"
the compatible string for R-Car M3-W+ (ES3.0) SoC is "renesas,r8a77961". So
in case we want to have vSCMI feature on this new SoC revision as well we
will need
to extend the compatible list.


+NULL
> +};
> +
> +PLATFORM_START(rcar3, "Renesas R-Car Gen3")
> +.compatible = rcar3_dt_compat,
> +.smc = rcar3_smc
> +PLATFORM_END
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> --
> 2.27.0
>
>

-- 
Regards,

Oleksandr Tyshchenko


Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Michal Orzel



On 15.12.2021 10:35, Jan Beulich wrote:
> On 15.12.2021 10:27, Michal Orzel wrote:
>> Replying to both Julien and Jan,
>>
>> On 14.12.2021 12:30, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 11:01, Julien Grall wrote:
 Hi,

 Replying in one e-mail the comments from Jan and Michal.

 On 14/12/2021 10:01, Jan Beulich wrote:
> On 14.12.2021 10:51, Michal Orzel wrote:
>> Hi Julien,
>>
>> On 14.12.2021 10:33, Julien Grall wrote:
>>>
>>>
>>> On 14/12/2021 09:17, Michal Orzel wrote:
 Hi Julien, Jan
>>>
>>> Hi,
>>>
 On 08.12.2021 10:55, Julien Grall wrote:
> Hi,
>
> On 08/12/2021 07:20, Jan Beulich wrote:
>> On 07.12.2021 20:11, Julien Grall wrote:
>>>
>>>
>>> On 07/12/2021 08:37, Michal Orzel wrote:
 Hi Julien,
>>>
>>> Hi,
>>>
 On 06.12.2021 16:29, Julien Grall wrote:
> Hi,
>
> On 06/12/2021 14:20, Michal Orzel wrote:
>> to hypervisor when switching to AArch32 state.
>>
 I will change to "from AArch32 state".
>> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
>> "If the general-purpose register was accessible from AArch32 
>> state the
>> upper 32 bits either become zero, or hold the value that the same
>> architectural register held before any AArch32 execution.
>> The choice between these two options is IMPLEMENTATIONDEFINED"
>
> Typo: Missing space between IMPLEMENTATION and DEFINED.
>
 Ok.
>>
>> Currently Xen does not ensure that the top 32 bits are zeroed 
>> and this
>> needs to be fixed.
>
> Can you outline why this is a problem and why we need to protect? 
> IIRC, the main concern is Xen may misinterpret what the guest 
> requested but we are not concerned about Xen using wrong value.
>
 I would say:
 "
 The reason why this is a problem is that there are places in Xen 
 where we assume that top 32bits are zero for AArch32 guests.
 If they are not, this can lead to misinterpretation of Xen 
 regarding what the guest requested.
 For example hypercalls returning an error encoded in a signed long 
 like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
 if the command passed as the first argument was clobbered,
 "
>>
>> Fix this bug by zeroing the upper 32 bits of these registers on 
>> an
>> entry to hypervisor when switching to AArch32 state.
>>
>> Set default value of parameter compat of macro entry to 0 
>> (AArch64 mode
>> as we are on 64-bit hypervisor) to avoid checking if parameter 
>> is blank
>> when not passed.
>
> Which error do you see otherwise? Is it a compilation error?
>
 Yes, this is a compilation error. The errors appear at each line 
 when "entry" is called without passing value for "compat".
 So basically in all the places where entry is called with hyp=1.
 When taking the current patch and removing default value for 
 compat you will get:
 ```
 entry.S:254: Error: ".endif" without ".if"
 entry.S:258: Error: symbol `.if' is already defined
 entry.S:258: Error: ".endif" without ".if"
 entry.S:262: Error: symbol `.if' is already defined
 entry.S:262: Error: ".endif" without ".if"
 entry.S:266: Error: symbol `.if' is already defined
 entry.S:266: Error: ".endif" without ".if"
 entry.S:278: Error: symbol `.if' is already defined
 entry.S:278: Error: ".endif" without ".if"
 entry.S:292: Error: symbol `.if' is already defined
 entry.S:292: Error: ".endif" without ".if"
 entry.S:317: Error: symbol `.if' is already defined
 entry.S:317: Error: ".endif" without ".if"
 ```
>>>
>>> Thanks for input. I am concerned with your suggested approach (or 
>>> using
>>> .if 0\compat as suggested by Jan) because they allow the caller to 
>>> not
>>> properly specify compat when hyp=0. The risk here is we may 
>>> generate the
>>> wrong entry.
>>>
>>> compat should need to be specified when hyp=1 as we will always run 
>>> in
>>> aarch64 mode. So could we protect this code with hyp=0?
>>
>> Since my suggestion was only to avoid the need for specifying a 
>> default
>> for the parameter (whi

Re: [PATCH v3 01/13] xen: move do_vcpu_op() to arch specific code

2021-12-15 Thread Julien Grall

Hi Juergen,

On 15/12/2021 07:12, Juergen Gross wrote:

On 14.12.21 18:21, Julien Grall wrote:

Hi Juergen,

On 08/12/2021 15:55, Juergen Gross wrote:

Today Arm is using another entry point for the vcpu_op hypercall as


NIT: The 'as' doesn't sound right here. Did you mean 'compare to'?


Hmm, it should even be "different" instead of "another". And then it
should be:

   The entry point used for the vcpu_op hypercall on Arm is different
   from the one on x86 today.


LGTM.

Cheers,

--
Julien Grall



Re: [RFC v1 1/5] xen/arm: add support for Renesas R-Car Gen3 platform

2021-12-15 Thread Julien Grall

Hi,

On 14/12/2021 09:34, Oleksii Moisieiev wrote:

Implementation includes platform-specific smc handler for rcar3 platform.

Signed-off-by: Oleksii Moisieiev 
---
  xen/arch/arm/platforms/Makefile |  1 +
  xen/arch/arm/platforms/rcar3.c  | 46 +
  2 files changed, 47 insertions(+)
  create mode 100644 xen/arch/arm/platforms/rcar3.c

diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
index 8632f4115f..b64c25de6c 100644
--- a/xen/arch/arm/platforms/Makefile
+++ b/xen/arch/arm/platforms/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_ALL32_PLAT) += exynos5.o
  obj-$(CONFIG_ALL32_PLAT) += midway.o
  obj-$(CONFIG_ALL32_PLAT) += omap5.o
  obj-$(CONFIG_ALL32_PLAT) += rcar2.o
+obj-$(CONFIG_RCAR3) += rcar3.o
  obj-$(CONFIG_ALL64_PLAT) += seattle.o
  obj-$(CONFIG_ALL_PLAT)   += sunxi.o
  obj-$(CONFIG_ALL64_PLAT) += thunderx.o
diff --git a/xen/arch/arm/platforms/rcar3.c b/xen/arch/arm/platforms/rcar3.c
new file mode 100644
index 00..d740145c71
--- /dev/null
+++ b/xen/arch/arm/platforms/rcar3.c
@@ -0,0 +1,46 @@
+/*
+ * xen/arch/arm/platforms/rcar3.c
+ *
+ * Renesas R-Car Gen3 specific settings
+ *
+ * Oleksii Moisieiev 
+ * Copyright (C) 2021 EPAM Systems
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+
+static bool rcar3_smc(struct cpu_user_regs *regs)
+{
+return false;
+}


Looking at the rest of the series, this is going to be replaced in patch 
#2 with:


return sci_handle_call();

SCMI is not specific to RCAR3. So I would expect the function to be 
called from common code.


If it still needs some platform specific code, then I think it would be 
best to introduce rcar3.c at the end of the series. So we don't 
introduce a dummy platform and not hook the code in the middle of 
patch#2 which is meant to be generic.


I will have a proper review of the rest of the series in the new year.

Cheers,

--
Julien Grall



Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Jan Beulich
On 15.12.2021 10:27, Michal Orzel wrote:
> Replying to both Julien and Jan,
> 
> On 14.12.2021 12:30, Julien Grall wrote:
>>
>>
>> On 14/12/2021 11:01, Julien Grall wrote:
>>> Hi,
>>>
>>> Replying in one e-mail the comments from Jan and Michal.
>>>
>>> On 14/12/2021 10:01, Jan Beulich wrote:
 On 14.12.2021 10:51, Michal Orzel wrote:
> Hi Julien,
>
> On 14.12.2021 10:33, Julien Grall wrote:
>>
>>
>> On 14/12/2021 09:17, Michal Orzel wrote:
>>> Hi Julien, Jan
>>
>> Hi,
>>
>>> On 08.12.2021 10:55, Julien Grall wrote:
 Hi,

 On 08/12/2021 07:20, Jan Beulich wrote:
> On 07.12.2021 20:11, Julien Grall wrote:
>>
>>
>> On 07/12/2021 08:37, Michal Orzel wrote:
>>> Hi Julien,
>>
>> Hi,
>>
>>> On 06.12.2021 16:29, Julien Grall wrote:
 Hi,

 On 06/12/2021 14:20, Michal Orzel wrote:
> to hypervisor when switching to AArch32 state.
>
>>> I will change to "from AArch32 state".
> According to section D1.20.2 of Arm Arm(DDI 0487A.j):
> "If the general-purpose register was accessible from AArch32 
> state the
> upper 32 bits either become zero, or hold the value that the same
> architectural register held before any AArch32 execution.
> The choice between these two options is IMPLEMENTATIONDEFINED"

 Typo: Missing space between IMPLEMENTATION and DEFINED.

>>> Ok.
>
> Currently Xen does not ensure that the top 32 bits are zeroed and 
> this
> needs to be fixed.

 Can you outline why this is a problem and why we need to protect? 
 IIRC, the main concern is Xen may misinterpret what the guest 
 requested but we are not concerned about Xen using wrong value.

>>> I would say:
>>> "
>>> The reason why this is a problem is that there are places in Xen 
>>> where we assume that top 32bits are zero for AArch32 guests.
>>> If they are not, this can lead to misinterpretation of Xen 
>>> regarding what the guest requested.
>>> For example hypercalls returning an error encoded in a signed long 
>>> like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>>> if the command passed as the first argument was clobbered,
>>> "
>
> Fix this bug by zeroing the upper 32 bits of these registers on an
> entry to hypervisor when switching to AArch32 state.
>
> Set default value of parameter compat of macro entry to 0 
> (AArch64 mode
> as we are on 64-bit hypervisor) to avoid checking if parameter is 
> blank
> when not passed.

 Which error do you see otherwise? Is it a compilation error?

>>> Yes, this is a compilation error. The errors appear at each line 
>>> when "entry" is called without passing value for "compat".
>>> So basically in all the places where entry is called with hyp=1.
>>> When taking the current patch and removing default value for compat 
>>> you will get:
>>> ```
>>> entry.S:254: Error: ".endif" without ".if"
>>> entry.S:258: Error: symbol `.if' is already defined
>>> entry.S:258: Error: ".endif" without ".if"
>>> entry.S:262: Error: symbol `.if' is already defined
>>> entry.S:262: Error: ".endif" without ".if"
>>> entry.S:266: Error: symbol `.if' is already defined
>>> entry.S:266: Error: ".endif" without ".if"
>>> entry.S:278: Error: symbol `.if' is already defined
>>> entry.S:278: Error: ".endif" without ".if"
>>> entry.S:292: Error: symbol `.if' is already defined
>>> entry.S:292: Error: ".endif" without ".if"
>>> entry.S:317: Error: symbol `.if' is already defined
>>> entry.S:317: Error: ".endif" without ".if"
>>> ```
>>
>> Thanks for input. I am concerned with your suggested approach (or 
>> using
>> .if 0\compat as suggested by Jan) because they allow the caller to 
>> not
>> properly specify compat when hyp=0. The risk here is we may generate 
>> the
>> wrong entry.
>>
>> compat should need to be specified when hyp=1 as we will always run 
>> in
>> aarch64 mode. So could we protect this code with hyp=0?
>
> Since my suggestion was only to avoid the need for specifying a 
> default
> for the parameter (which you didn't seem to be happy about), it would
> then merely extend to
>
> .if !0\hyp && 0\compat
 Isn't it effectively the same as setting a def

Re: [PATCH] xen/arm64: Zero the top 32 bits of gp registers on entry...

2021-12-15 Thread Michal Orzel
Replying to both Julien and Jan,

On 14.12.2021 12:30, Julien Grall wrote:
> 
> 
> On 14/12/2021 11:01, Julien Grall wrote:
>> Hi,
>>
>> Replying in one e-mail the comments from Jan and Michal.
>>
>> On 14/12/2021 10:01, Jan Beulich wrote:
>>> On 14.12.2021 10:51, Michal Orzel wrote:
 Hi Julien,

 On 14.12.2021 10:33, Julien Grall wrote:
>
>
> On 14/12/2021 09:17, Michal Orzel wrote:
>> Hi Julien, Jan
>
> Hi,
>
>> On 08.12.2021 10:55, Julien Grall wrote:
>>> Hi,
>>>
>>> On 08/12/2021 07:20, Jan Beulich wrote:
 On 07.12.2021 20:11, Julien Grall wrote:
>
>
> On 07/12/2021 08:37, Michal Orzel wrote:
>> Hi Julien,
>
> Hi,
>
>> On 06.12.2021 16:29, Julien Grall wrote:
>>> Hi,
>>>
>>> On 06/12/2021 14:20, Michal Orzel wrote:
 to hypervisor when switching to AArch32 state.

>> I will change to "from AArch32 state".
 According to section D1.20.2 of Arm Arm(DDI 0487A.j):
 "If the general-purpose register was accessible from AArch32 state 
 the
 upper 32 bits either become zero, or hold the value that the same
 architectural register held before any AArch32 execution.
 The choice between these two options is IMPLEMENTATIONDEFINED"
>>>
>>> Typo: Missing space between IMPLEMENTATION and DEFINED.
>>>
>> Ok.

 Currently Xen does not ensure that the top 32 bits are zeroed and 
 this
 needs to be fixed.
>>>
>>> Can you outline why this is a problem and why we need to protect? 
>>> IIRC, the main concern is Xen may misinterpret what the guest 
>>> requested but we are not concerned about Xen using wrong value.
>>>
>> I would say:
>> "
>> The reason why this is a problem is that there are places in Xen 
>> where we assume that top 32bits are zero for AArch32 guests.
>> If they are not, this can lead to misinterpretation of Xen regarding 
>> what the guest requested.
>> For example hypercalls returning an error encoded in a signed long 
>> like do_sched_op, do_hmv_op, do_memory_op would return -ENOSYS
>> if the command passed as the first argument was clobbered,
>> "

 Fix this bug by zeroing the upper 32 bits of these registers on an
 entry to hypervisor when switching to AArch32 state.

 Set default value of parameter compat of macro entry to 0 (AArch64 
 mode
 as we are on 64-bit hypervisor) to avoid checking if parameter is 
 blank
 when not passed.
>>>
>>> Which error do you see otherwise? Is it a compilation error?
>>>
>> Yes, this is a compilation error. The errors appear at each line 
>> when "entry" is called without passing value for "compat".
>> So basically in all the places where entry is called with hyp=1.
>> When taking the current patch and removing default value for compat 
>> you will get:
>> ```
>> entry.S:254: Error: ".endif" without ".if"
>> entry.S:258: Error: symbol `.if' is already defined
>> entry.S:258: Error: ".endif" without ".if"
>> entry.S:262: Error: symbol `.if' is already defined
>> entry.S:262: Error: ".endif" without ".if"
>> entry.S:266: Error: symbol `.if' is already defined
>> entry.S:266: Error: ".endif" without ".if"
>> entry.S:278: Error: symbol `.if' is already defined
>> entry.S:278: Error: ".endif" without ".if"
>> entry.S:292: Error: symbol `.if' is already defined
>> entry.S:292: Error: ".endif" without ".if"
>> entry.S:317: Error: symbol `.if' is already defined
>> entry.S:317: Error: ".endif" without ".if"
>> ```
>
> Thanks for input. I am concerned with your suggested approach (or 
> using
> .if 0\compat as suggested by Jan) because they allow the caller to not
> properly specify compat when hyp=0. The risk here is we may generate 
> the
> wrong entry.
>
> compat should need to be specified when hyp=1 as we will always run in
> aarch64 mode. So could we protect this code with hyp=0?

 Since my suggestion was only to avoid the need for specifying a default
 for the parameter (which you didn't seem to be happy about), it would
 then merely extend to

 .if !0\hyp && 0\compat
>>> Isn't it effectively the same as setting a default value?
>>>
>>> The reason we seem to get away is because other part of the macro (e.g. 
>>> entry_guest) will need compat to be valid.
>>>
>>> But that seems pretty fragil

Re: [XEN PATCH v8 14/47] build: rename __LINKER__ to LINKER_SCRIPT

2021-12-15 Thread Julien Grall

Hi Jan,

On 15/12/2021 07:49, Jan Beulich wrote:

On 14.12.2021 18:05, Julien Grall wrote:

On 25/11/2021 13:39, Anthony PERARD wrote:

For two reasons: this macro is used to generate a "linker script" and
is not by the linker, and name starting with an underscore '_' are
supposed to be reserved, so better avoid them when not needed.


If that's the case, then shouldn't we also rename __ASSEMBLY__?


I'd rather not - unlike __LINKER__ (afaict at least) __ASSEMBLY__


I can't remember where I took __LINKER__ from.


is a commonly used identifier (which we've actually inherited from
Linux)


Fair enough.

Cheers,

--
Julien Grall



[qemu-mainline test] 167417: regressions - FAIL

2021-12-15 Thread osstest service owner
flight 167417 qemu-mainline real [real]
flight 167424 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/167417/
http://logs.test-lab.xenproject.org/osstest/logs/167424/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 18 
guest-start/debianhvm.repeat fail REGR. vs. 167228

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 167228

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167228
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167228
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 167228
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167228
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 167228
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 167228
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167228
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167228
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass

version targeted for testing:
 qemuu   

Re: [PATCH] xen/pciback: fix typo in a comment

2021-12-15 Thread Juergen Gross

On 12.12.21 04:24, Jason Wang wrote:

The double `the' in the comment in line 163 is repeated. Remove it
from the comment.

Signed-off-by: Jason Wang 
---
  drivers/xen/xen-pciback/pciback_ops.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pciback_ops.c 
b/drivers/xen/xen-pciback/pciback_ops.c
index 3fbc21466a93..e38b43b5065e 100644
--- a/drivers/xen/xen-pciback/pciback_ops.c
+++ b/drivers/xen/xen-pciback/pciback_ops.c
@@ -160,7 +160,7 @@ int xen_pcibk_enable_msi(struct xen_pcibk_device *pdev,
}
  
  	/* The value the guest needs is actually the IDT vector, not the

-* the local domain's IRQ number. */
+* local domain's IRQ number. */


When touching this comment, would you mind to correct its style, too?
It should have a leading "/*" and a trailing "*/" line like e.g. in
line 72 of the same source file.


Juergen


OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [OSSTEST PATCH 1/2] daily-cron-email-{play,adhoc}-*: Drop Citrix email

2021-12-15 Thread Roger Pau Monné
On Tue, Dec 14, 2021 at 02:52:25PM +, Ian Jackson wrote:
> Any such adhoc flights run from cron should be reported to
> osstest-admin, not my (soon to be deleted) Citrix adddress.
> 
> Now the only remaining occurrences are
>   - examples
>   - authorship annotation of the init script
>   - crontab-cambridge, which would need updating
> 
> CC: Roger Pau Monné 
> Signed-off-by: Ian Jackson 

Acked-by: Roger Pau Monné 

Thanks.



Re: [OSSTEST PATCH 2/2] daily-cron-email-real-*: Temporarily drop osstest-admin

2021-12-15 Thread Roger Pau Monné
On Tue, Dec 14, 2021 at 02:52:26PM +, Ian Jackson wrote:
> Roger has agreed to take on osstest admin for the moment.  Someone who
> intents to take on the role long term will probably want to get CC's
> of these flight reports, but it's fairly noisy.  So for now, send them
> only to the lists.
> 
> CC: Roger Pau Monné 
> Signed-off-by: Ian Jackson 

Acked-by: Roger Pau Monné 

Thanks.



Re: [PATCH] x86/cpuid: Introduce dom0-cpuid command line option

2021-12-15 Thread Jan Beulich
On 14.12.2021 22:16, Andrew Cooper wrote:
> Specifically, this lets the user opt in to non-default for dom0.
> 
> Split features[] out of parse_xen_cpuid(), giving it a lightly more
> appropraite name, so it can be shared with parse_xen_cpuid().

With the latter one I guess you mean parse_dom0_cpuid()?

> Collect all dom0 settings together in dom0_{en,dis}able_feat[], and apply it
> to dom0's policy when other tweaks are being made.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Roger Pau Monné 
> CC: Wei Liu 
> CC: Daniel Smith 
> 
> RFC, because I think we've got a preexisting error with late hwdom here.  We
> really should not be cobbering a late hwdom's settings (which were provided in
> the usual way by the toolstack in dom0).

For ITSC I think also covering late hwdom is at least acceptable. For the
speculation controls I'm less certain (but as per the comment there they're
temporary only anyway), and I agree the command line option here should
strictly only apply to Dom0 (or else, as a minor aspect, the option also
would better be named "hwdom-cpuid=").

> Furthermore, the distinction gets more murky in a hyperlaunch future where
> multiple domains may be constructed by Xen, and there is reason to expect that
> a full toolstack-like configuration is made available for them.

Like above, anything created via the toolstack interfaces should use the
toolstack controls. If there was something dom0less-like on x86, domains
created that way (without toolstack involvement) would instead want to
have another way of controlling their CPUID settings.

> One option might be to remove the special case from init_domain_cpuid_policy()
> and instead make a call into the cpuid code from create_dom0().  It would have
> to be placed between domain_create() and alloc_dom0_vcpu0() for dynamic sizing
> of the FPU block to work.  Thoughts?

As said above, I think the ITSC special case could stay. But apart from
this I agree.

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -801,6 +801,22 @@ Controls for how dom0 is constructed on x86 systems.
>  
>  If using this option is necessary to fix an issue, please report a bug.
>  
> +### dom0-cpuid
> += List of comma separated booleans
> +
> +Applicability: x86
> +
> +This option allows for fine tuning of the facilities dom0 will use, after
> +accounting for hardware capabilities and Xen settings as enumerated via 
> CPUID.
> +
> +Options are accepted in positive and negative form, to enable or disable
> +specific features.  All selections via this mechanism are subject to normal
> +CPU Policy safety logic.
> +
> +This option is intended for developers to opt dom0 into non-default features,
> +and is not intended for use in production circumstances.  If using this 
> option
> +is necessary to fix an issue, please report a bug.

You may want to state explicitly that disables take priority over enables,
as per the present implementation. Personally I would find it better if the
item specified last took effect. This is, as mentioned in other contexts,
so one can override earlier settings (e.g. in a xen.cfg file used with
xen.efi) by simply appending to the command line.

> @@ -97,6 +98,73 @@ static int __init parse_xen_cpuid(const char *s)
>  }
>  custom_param("cpuid", parse_xen_cpuid);
>  
> +static uint32_t __hwdom_initdata dom0_enable_feat[FSCAPINTS];
> +static uint32_t __hwdom_initdata dom0_disable_feat[FSCAPINTS];
> +
> +static int __init parse_dom0_cpuid(const char *s)
> +{
> +const char *ss;
> +int val, rc = 0;
> +
> +do {
> +const struct feature_name *lhs, *rhs, *mid = NULL /* GCC... */;
> +const char *feat;
> +
> +ss = strchr(s, ',');
> +if ( !ss )
> +ss = strchr(s, '\0');
> +
> +/* Skip the 'no-' prefix for name comparisons. */
> +feat = s;
> +if ( strncmp(s, "no-", 3) == 0 )
> +feat += 3;
> +
> +/* (Re)initalise lhs and rhs for binary search. */
> +lhs = feature_names;
> +rhs = feature_names + ARRAY_SIZE(feature_names);
> +
> +while ( lhs < rhs )
> +{
> +int res;
> +
> +mid = lhs + (rhs - lhs) / 2;
> +res = cmdline_strcmp(feat, mid->name);
> +
> +if ( res < 0 )
> +{
> +rhs = mid;
> +continue;
> +}
> +if ( res > 0 )
> +{
> +lhs = mid + 1;
> +continue;
> +}
> +
> +if ( (val = parse_boolean(mid->name, s, ss)) >= 0 )
> +{
> +__set_bit(mid->bit,
> +  val ? dom0_enable_feat : dom0_disable_feat);
> +mid = NULL;
> +}
> +
> +break;
> +}
> +
> +/*
> + * Mid being NULL means that the name and boolean were successfully
> + * identified.  Everything else is an error.
> + *