Re: [PATCH] xen/netback: Pass (void *) to virt_to_page()

2023-05-24 Thread Jakub Kicinski
On Wed, 24 May 2023 22:11:47 -0700 Jakub Kicinski wrote:
> On Tue, 23 May 2023 16:03:42 +0200 Linus Walleij wrote:
> > virt_to_page() takes a virtual address as argument but
> > the driver passes an unsigned long, which works because
> > the target platform(s) uses polymorphic macros to calculate
> > the page.
> > 
> > Since many architectures implement virt_to_pfn() as
> > a macro, this function becomes polymorphic and accepts both a
> > (unsigned long) and a (void *).
> > 
> > Fix this up by an explicit (void *) cast.  
> 
> Paul, Wei, looks like netdev may be the usual path for this patch 
> to flow thru, although I'm never 100% sure with Xen.
> Please ack or LUK if you prefer to direct the patch elsewhere?

Ugh, Wei already acked this, sorry for the noise.



Re: [PATCH] xen/netback: Pass (void *) to virt_to_page()

2023-05-24 Thread Jakub Kicinski
On Tue, 23 May 2023 16:03:42 +0200 Linus Walleij wrote:
> virt_to_page() takes a virtual address as argument but
> the driver passes an unsigned long, which works because
> the target platform(s) uses polymorphic macros to calculate
> the page.
> 
> Since many architectures implement virt_to_pfn() as
> a macro, this function becomes polymorphic and accepts both a
> (unsigned long) and a (void *).
> 
> Fix this up by an explicit (void *) cast.

Paul, Wei, looks like netdev may be the usual path for this patch 
to flow thru, although I'm never 100% sure with Xen.
Please ack or LUK if you prefer to direct the patch elsewhere?



[xen-unstable-smoke test] 180936: tolerable all pass - PUSHED

2023-05-24 Thread osstest service owner
flight 180936 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180936/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  cca2361947b3c9851b3ded6e43cc48caf5258eee
baseline version:
 xen  511b9f286c3dadd041e0d90beeff7d47c9bf3b7a

Last test of basis   180931  2023-05-24 20:01:55 Z0 days
Testing same since   180936  2023-05-25 02:02:55 Z0 days1 attempts


People who touched revisions under test:
  Michal Orzel 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   511b9f286c..cca2361947  cca2361947b3c9851b3ded6e43cc48caf5258eee -> smoke



[xen-unstable test] 180930: tolerable FAIL - PUSHED

2023-05-24 Thread osstest service owner
flight 180930 xen-unstable real [real]
flight 180935 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180930/
http://logs.test-lab.xenproject.org/osstest/logs/180935/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qcow2 21 guest-start/debian.repeat fail pass in 
180935-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180922
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180922
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180922
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 180922
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 180922
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 180922
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 180922
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 180922
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180922
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180922
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180922
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180922
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass

version targeted for testing:
 xen  380c6c170393c48852d4f2b1ea97125a399cfc61
baseline version:
 xen  c7908869ac26961a3919491705e521179ad3fc0e

Last test of basis   180922  2023-05-24 01:54:52 Z 

[qemu-mainline test] 180933: regressions - FAIL

2023-05-24 Thread osstest service owner
flight 180933 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180933/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 180691
 build-amd64-xsm   6 xen-buildfail REGR. vs. 180691
 build-i386-xsm6 xen-buildfail REGR. vs. 180691
 build-i3866 xen-buildfail REGR. vs. 180691
 build-arm64   6 xen-buildfail REGR. vs. 180691
 build-arm64-xsm   6 xen-buildfail REGR. vs. 180691
 build-armhf   6 xen-buildfail REGR. vs. 180691

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-vhd1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-vhd   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-coresched-i386-xl  1 build-check(1)   blocked  n/a
 test-amd64-coresched-amd64-xl  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 

Re: [PATCH 2/2] xen/misra: diff-report.py: add report patching feature

2023-05-24 Thread Stefano Stabellini
On Fri, 19 May 2023, Luca Fancellu wrote:
> > On 19 May 2023, at 10:46, Luca Fancellu  wrote:
> > 
> > Add a feature to the diff-report.py script that improves the comparison
> > between two analysis report, one from a baseline codebase and the other
> > from the changes applied to the baseline.
> > 
> > The comparison between reports of different codebase is an issue because
> > entries in the baseline could have been moved in position due to addition
> > or deletion of unrelated lines or can disappear because of deletion of
> > the interested line, making the comparison between two revisions of the
> > code harder.
> > 
> > Having a baseline report, a report of the codebase with the changes
> > called "new report" and a git diff format file that describes the
> > changes happened to the code from the baseline, this feature can
> > understand which entries from the baseline report are deleted or shifted
> > in position due to changes to unrelated lines and can modify them as
> > they will appear in the "new report".
> > 
> > Having the "patched baseline" and the "new report", now it's simple
> > to make the diff between them and print only the entry that are new.
> > 
> > Signed-off-by: Luca Fancellu 
> > ---
> > Changes from v1:
> > - Made the script compatible with python2 (Stefano)
> > ---
> > xen/scripts/diff-report.py|  55 -
> > xen/scripts/xen_analysis/diff_tool/debug.py   |  21 ++
> > xen/scripts/xen_analysis/diff_tool/report.py  |  87 +++
> > .../diff_tool/unified_format_parser.py| 232 ++
> > 4 files changed, 393 insertions(+), 2 deletions(-)
> > create mode 100644 
> > xen/scripts/xen_analysis/diff_tool/unified_format_parser.py
> > 
> > diff --git a/xen/scripts/diff-report.py b/xen/scripts/diff-report.py
> > index f97cb2355cc3..d608e3a05aa1 100755
> > --- a/xen/scripts/diff-report.py
> > +++ b/xen/scripts/diff-report.py
> > @@ -7,6 +7,10 @@ from argparse import ArgumentParser
> > from xen_analysis.diff_tool.cppcheck_report import CppcheckReport
> > from xen_analysis.diff_tool.debug import Debug
> > from xen_analysis.diff_tool.report import ReportError
> > +from xen_analysis.diff_tool.unified_format_parser import \
> > +(UnifiedFormatParser, UnifiedFormatParseError)
> > +from xen_analysis.settings import repo_dir
> > +from xen_analysis.utils import invoke_command
> > 
> > 
> > def log_info(text, end='\n'):
> > @@ -36,9 +40,32 @@ def main(argv):
> >  "against the baseline.")
> > parser.add_argument("-v", "--verbose", action='store_true',
> > help="Print more informations during the run.")
> > +parser.add_argument("--patch", type=str,
> > +help="The patch file containing the changes to the 
> > "
> > + "code, from the baseline analysis result to 
> > the "
> > + "'check report' analysis result.\n"
> > + "Do not use with --baseline-rev/--report-rev")
> > +parser.add_argument("--baseline-rev", type=str,
> > +help="Revision or SHA of the codebase analysed to "
> > + "create the baseline report.\n"
> > + "Use together with --report-rev")
> > +parser.add_argument("--report-rev", type=str,
> > +help="Revision or SHA of the codebase analysed to "
> > + "create the 'check report'.\n"
> > + "Use together with --baseline-rev")
> > 
> > args = parser.parse_args()
> > 
> > +if args.patch and (args.baseline_rev or args.report_rev):
> > +print("ERROR: '--patch' argument can't be used with 
> > '--baseline-rev'"
> > +  " or '--report-rev'.")
> > +sys.exit(1)
> > +
> > +if bool(args.baseline_rev) != bool(args.report_rev):
> > +print("ERROR: '--baseline-rev' must be used together with "
> > +  "'--report-rev'.")
> > +sys.exit(1)
> > +
> > if args.out == "stdout":
> > file_out = sys.stdout
> > else:
> > @@ -63,11 +90,35 @@ def main(argv):
> > new_rep.parse()
> > debug.debug_print_parsed_report(new_rep)
> > log_info(" [OK]")
> > -except ReportError as e:
> > +diff_source = None
> > +if args.patch:
> > +diff_source = os.path.realpath(args.patch)
> > +elif args.baseline_rev:
> > +git_diff = invoke_command(
> > +"git diff --git-dir={} -C -C {}..{}".format(repo_dir,
> > +
> > args.baseline_rev,
> > +
> > args.report_rev),
> > +True, "Error occured invoking:\n{}\n\n{}"
> > +)
> 
> I’ve noticed now an issue here, when using --baseline-rev/--report-rev, the 
> fix is this one:
> 
> diff --git a/xen/scripts/diff-report.py 

Re: [PATCH 1/2] xen/misra: add diff-report.py tool

2023-05-24 Thread Stefano Stabellini
On Fri, 19 May 2023, Luca Fancellu wrote:
> Add a new tool, diff-report.py that can be used to make diff between
> reports generated by xen-analysis.py tool.
> Currently this tool supports the Xen cppcheck text report format in
> its operations.
> 
> The tool prints every finding that is in the report passed with -r
> (check report) which is not in the report passed with -b (baseline).
> 
> Signed-off-by: Luca Fancellu 

Acked-by: Stefano Stabellini 
Tested-by: Stefano Stabellini 


> ---
> Changes from v1:
>  - removed 2 method from class ReportEntry that landed there by a
>mistake on rebase.
>  - Made the script compatible also with python2 (Stefano)
> ---
>  xen/scripts/diff-report.py|  80 ++
>  .../xen_analysis/diff_tool/__init__.py|   0
>  .../xen_analysis/diff_tool/cppcheck_report.py |  44 
>  xen/scripts/xen_analysis/diff_tool/debug.py   |  40 +++
>  xen/scripts/xen_analysis/diff_tool/report.py  | 100 ++
>  5 files changed, 264 insertions(+)
>  create mode 100755 xen/scripts/diff-report.py
>  create mode 100644 xen/scripts/xen_analysis/diff_tool/__init__.py
>  create mode 100644 xen/scripts/xen_analysis/diff_tool/cppcheck_report.py
>  create mode 100644 xen/scripts/xen_analysis/diff_tool/debug.py
>  create mode 100644 xen/scripts/xen_analysis/diff_tool/report.py
> 
> diff --git a/xen/scripts/diff-report.py b/xen/scripts/diff-report.py
> new file mode 100755
> index ..f97cb2355cc3
> --- /dev/null
> +++ b/xen/scripts/diff-report.py
> @@ -0,0 +1,80 @@
> +#!/usr/bin/env python3
> +
> +from __future__ import print_function
> +import os
> +import sys
> +from argparse import ArgumentParser
> +from xen_analysis.diff_tool.cppcheck_report import CppcheckReport
> +from xen_analysis.diff_tool.debug import Debug
> +from xen_analysis.diff_tool.report import ReportError
> +
> +
> +def log_info(text, end='\n'):
> +# type: (str, str) -> None
> +global args
> +global file_out
> +
> +if (args.verbose):
> +print(text, end=end, file=file_out)
> +
> +
> +def main(argv):
> +# type: (list) -> None
> +global args
> +global file_out
> +
> +parser = ArgumentParser(prog="diff-report.py")
> +parser.add_argument("-b", "--baseline", required=True, type=str,
> +help="Path to the baseline report.")
> +parser.add_argument("--debug", action='store_true',
> +help="Produce intermediate reports during 
> operations.")
> +parser.add_argument("-o", "--out", default="stdout", type=str,
> +help="Where to print the tool output. Default is "
> + "stdout")
> +parser.add_argument("-r", "--report", required=True, type=str,
> +help="Path to the 'check report', the one checked "
> + "against the baseline.")
> +parser.add_argument("-v", "--verbose", action='store_true',
> +help="Print more informations during the run.")
> +
> +args = parser.parse_args()
> +
> +if args.out == "stdout":
> +file_out = sys.stdout
> +else:
> +try:
> +file_out = open(args.out, "wt")
> +except OSError as e:
> +print("ERROR: Issue opening file {}: {}".format(args.out, e))
> +sys.exit(1)
> +
> +debug = Debug(args)
> +
> +try:
> +baseline_path = os.path.realpath(args.baseline)
> +log_info("Loading baseline report {}".format(baseline_path), "")
> +baseline = CppcheckReport(baseline_path)
> +baseline.parse()
> +debug.debug_print_parsed_report(baseline)
> +log_info(" [OK]")
> +new_rep_path = os.path.realpath(args.report)
> +log_info("Loading check report {}".format(new_rep_path), "")
> +new_rep = CppcheckReport(new_rep_path)
> +new_rep.parse()
> +debug.debug_print_parsed_report(new_rep)
> +log_info(" [OK]")
> +except ReportError as e:
> +print("ERROR: {}".format(e))
> +sys.exit(1)
> +
> +output = new_rep - baseline
> +print(output, end="", file=file_out)
> +
> +if len(output) > 0:
> +sys.exit(1)
> +
> +sys.exit(0)
> +
> +
> +if __name__ == "__main__":
> +main(sys.argv[1:])
> diff --git a/xen/scripts/xen_analysis/diff_tool/__init__.py 
> b/xen/scripts/xen_analysis/diff_tool/__init__.py
> new file mode 100644
> index ..e69de29bb2d1
> diff --git a/xen/scripts/xen_analysis/diff_tool/cppcheck_report.py 
> b/xen/scripts/xen_analysis/diff_tool/cppcheck_report.py
> new file mode 100644
> index ..e7e80a9dde84
> --- /dev/null
> +++ b/xen/scripts/xen_analysis/diff_tool/cppcheck_report.py
> @@ -0,0 +1,44 @@
> +#!/usr/bin/env python3
> +
> +import re
> +from .report import Report, ReportError
> +
> +
> +class CppcheckReport(Report):
> +def __init__(self, report_path):
> +# type: (str) -> None
> +

Re: [PATCH 3/3] xen/misra: xen-analysis.py: Fix cppcheck report relative paths

2023-05-24 Thread Stefano Stabellini
On Fri, 19 May 2023, Luca Fancellu wrote:
> Fix the generation of the relative path from the repo, for cppcheck
> reports, when the script is launching make with in-tree build.
> 
> Fixes: b046f7e37489 ("xen/misra: xen-analysis.py: use the relative path from 
> the ...")
> Reported-by: Michal Orzel 
> Signed-off-by: Luca Fancellu 
> ---
>  .../xen_analysis/cppcheck_report_utils.py | 25 ---
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/scripts/xen_analysis/cppcheck_report_utils.py 
> b/xen/scripts/xen_analysis/cppcheck_report_utils.py
> index fdc299c7e029..10100f6c6a57 100644
> --- a/xen/scripts/xen_analysis/cppcheck_report_utils.py
> +++ b/xen/scripts/xen_analysis/cppcheck_report_utils.py
> @@ -1,6 +1,7 @@
>  #!/usr/bin/env python3
>  
> -import os
> +import os, re
> +from . import settings
>  from xml.etree import ElementTree
>  
>  class CppcheckHTMLReportError(Exception):
> @@ -101,12 +102,28 @@ def cppcheck_merge_txt_fragments(fragments_list, 
> out_txt_file, strip_paths):
>  text_report_content = list(text_report_content)
>  # Strip path from report lines
>  for i in list(range(0, len(text_report_content))):
> -for path in strip_paths:
> -text_report_content[i] = text_report_content[i].replace(
> -path + "/", 
> "")
>  # Split by : separator
>  text_report_content[i] = text_report_content[i].split(":")
>  
> +for path in strip_paths:
> +text_report_content[i][0] = \
> +text_report_content[i][0].replace(path + "/", "")

Why moving this for loop after "Split by : separator"? If it is
necessary, would it make sense to do it in the previous patch?


> +# When the compilation is in-tree, the makefile places
> +# the directory in /xen/xen, making cppcheck produce
> +# relative path from there, so check if "xen/" is a prefix
> +# of the path and if it's not, check if it can be added to
> +# have a relative path from the repository instead of from
> +# /xen/xen
> +if not text_report_content[i][0].startswith("xen/"):
> +# cppcheck first entry is in this format:
> +# path/to/file(line,cols), remove (line,cols)
> +cppcheck_file = re.sub(r'\(.*\)', '',
> +   text_report_content[i][0])
> +if os.path.isfile(settings.xen_dir + "/" + 
> cppcheck_file):
> +text_report_content[i][0] = \
> +"xen/" + text_report_content[i][0]
> +
>  # sort alphabetically for second field (misra rule) and as second
>  # criteria for the first field (file name)
>  text_report_content.sort(key = lambda x: (x[1], x[0]))
> -- 
> 2.34.1
> 



Re: [RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

2023-05-24 Thread Trilok Soni

On 5/24/2023 3:20 PM, Edgecombe, Rick P wrote:

On Fri, 2023-05-05 at 17:20 +0200, Mickaël Salaün wrote:

# How does it work?

This implementation mainly leverages KVM capabilities to control the
Second
Layer Address Translation (or the Two Dimensional Paging e.g.,
Intel's EPT or
AMD's RVI/NPT) and Mode Based Execution Control (Intel's MBEC)
introduced with
the Kaby Lake (7th generation) architecture. This allows to set
permissions on
memory pages in a complementary way to the guest kernel's managed
memory
permissions. Once these permissions are set, they are locked and
there is no
way back.

A first KVM_HC_LOCK_MEM_PAGE_RANGES hypercall enables the guest
kernel to lock
a set of its memory page ranges with either the HEKI_ATTR_MEM_NOWRITE
or the
HEKI_ATTR_MEM_EXEC attribute. The first one denies write access to a
specific
set of pages (allow-list approach), and the second only allows kernel
execution
for a set of pages (deny-list approach).

The current implementation sets the whole kernel's .rodata (i.e., any
const or
__ro_after_init variables, which includes critical security data such
as LSM
parameters) and .text sections as non-writable, and the .text section
is the
only one where kernel execution is allowed. This is possible thanks
to the new
MBEC support also brough by this series (otherwise the vDSO would
have to be
executable). Thanks to this hardware support (VT-x, EPT and MBEC),
the
performance impact of such guest protection is negligible.

The second KVM_HC_LOCK_CR_UPDATE hypercall enables guests to pin some
of its
CPU control register flags (e.g., X86_CR0_WP, X86_CR4_SMEP,
X86_CR4_SMAP),
which is another complementary hardening mechanism.

Heki can be enabled with the heki=1 boot command argument.




Can the guest kernel ask the host VMM's emulated devices to DMA into
the protected data? It should go through the host userspace mappings I
think, which don't care about EPT permissions. Or did I miss where you
are protecting that another way? There are a lot of easy ways to ask
the host to write to guest memory that don't involve the EPT. You
probably need to protect the host userspace mappings, and also the
places in KVM that kmap a GPA provided by the guest.

[ snip ]



# Current limitations

The main limitation of this patch series is the statically enforced
permissions. This is not an issue for kernels without module but this
needs to
be addressed.  Mechanisms that dynamically impact kernel executable
memory are
not handled for now (e.g., kernel modules, tracepoints, eBPF JIT),
and such
code will need to be authenticated.  Because the hypervisor is highly
privileged and critical to the security of all the VMs, we don't want
to
implement a code authentication mechanism in the hypervisor itself
but delegate
this verification to something much less privileged. We are thinking
of two
ways to solve this: implement this verification in the VMM or spawn a
dedicated
special VM (similar to Windows's VBS). There are pros on cons to each
approach:
complexity, verification code ownership (guest's or VMM's), access to
guest
memory (i.e., confidential computing).


The kernel often creates writable aliases in order to write to
protected data (kernel text, etc). Some of this is done right as text
is being first written out (alternatives for example), and some happens
way later (jump labels, etc). So for verification, I wonder what stage
you would be verifying? If you want to verify the end state, you would
have to maintain knowledge in the verifier of all the touch-ups the
kernel does. I think it would get very tricky.


Right and for the ARM (from what I know) is that Erratas can be applied
using the alternatives fwk when you hotplug in the CPU post boot.

---Trilok Soni



Re: [PATCH 1/3] xen/misra: xen-analysis.py: Improve the cppcheck version check

2023-05-24 Thread Stefano Stabellini
On Fri, 19 May 2023, Luca Fancellu wrote:
> Use tuple comparison to check the cppcheck version.
> 
> Take the occasion to harden the regex, escaping the dots so that we
> check for them instead of generic characters.
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Luca Fancellu 

Acked-by: Stefano Stabellini 


> ---
>  xen/scripts/xen_analysis/cppcheck_analysis.py | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/scripts/xen_analysis/cppcheck_analysis.py 
> b/xen/scripts/xen_analysis/cppcheck_analysis.py
> index c8abbe0fca79..8dc45e653b79 100644
> --- a/xen/scripts/xen_analysis/cppcheck_analysis.py
> +++ b/xen/scripts/xen_analysis/cppcheck_analysis.py
> @@ -157,7 +157,7 @@ def generate_cppcheck_deps():
>  "Error occured retrieving cppcheck version:\n{}\n\n{}"
>  )
>  
> -version_regex = re.search('^Cppcheck (\d+).(\d+)(?:.\d+)?$',
> +version_regex = re.search('^Cppcheck (\d+)\.(\d+)(?:\.\d+)?$',
>invoke_cppcheck, flags=re.M)
>  # Currently, only cppcheck version >= 2.7 is supported, but version 2.8 
> is
>  # known to be broken, please refer to docs/misra/cppcheck.txt
> @@ -166,15 +166,10 @@ def generate_cppcheck_deps():
>  "Can't find cppcheck version or version not identified: "
>  "{}".format(invoke_cppcheck)
>  )
> -major = int(version_regex.group(1))
> -minor = int(version_regex.group(2))
> -if major < 2 or (major == 2 and minor < 7):
> +version = (int(version_regex.group(1)), int(version_regex.group(2)))
> +if version < (2, 7) or version == (2, 8):
>  raise CppcheckDepsPhaseError(
> -"Cppcheck version < 2.7 is not supported"
> -)
> -if major == 2 and minor == 8:
> -raise CppcheckDepsPhaseError(
> -"Cppcheck version 2.8 is known to be broken, see the 
> documentation"
> +"Cppcheck version < 2.7 or 2.8 are not supported"
>  )
>  
>  # If misra option is selected, append misra addon and generate cppcheck
> -- 
> 2.34.1
> 



Re: [RFC] Xen crashes on ASSERT on suspend/resume, suggested fix

2023-05-24 Thread Stefano Stabellini
On Tue, 23 May 2023, Roger Pau Monné wrote:
> On Tue, May 23, 2023 at 03:54:36PM +0200, Roger Pau Monné wrote:
> > On Thu, May 18, 2023 at 04:44:53PM -0700, Stefano Stabellini wrote:
> > > Hi all,
> > > 
> > > After many PVH Dom0 suspend/resume cycles we are seeing the following
> > > Xen crash (it is random and doesn't reproduce reliably):
> > > 
> > > (XEN) [555.042981][] R 
> > > arch/x86/irq.c#_clear_irq_vector+0x214/0x2bd
> > > (XEN) [555.042986][] F destroy_irq+0xe2/0x1b8
> > > (XEN) [555.042991][] F msi_free_irq+0x5e/0x1a7
> > > (XEN) [555.042995][] F unmap_domain_pirq+0x441/0x4b4
> > > (XEN) [555.043001][] F 
> > > arch/x86/hvm/vmsi.c#vpci_msi_disable+0xc0/0x155
> > > (XEN) [555.043006][] F vpci_msi_arch_disable+0x1e/0x2b
> > > (XEN) [555.043013][] F 
> > > drivers/vpci/msi.c#control_write+0x109/0x10e
> > > (XEN) [555.043018][] F vpci_write+0x11f/0x268
> > > (XEN) [555.043024][] F 
> > > arch/x86/hvm/io.c#vpci_portio_write+0xa0/0xa7
> > > (XEN) [555.043029][] F 
> > > hvm_process_io_intercept+0x203/0x26f
> > > (XEN) [555.043034][] F hvm_io_intercept+0x2a/0x4c
> > > (XEN) [555.043039][] F 
> > > arch/x86/hvm/emulate.c#hvmemul_do_io+0x29b/0x5f6
> > > (XEN) [555.043043][] F 
> > > arch/x86/hvm/emulate.c#hvmemul_do_io_buffer+0x2f/0x6a
> > > (XEN) [555.043048][] F hvmemul_do_pio_buffer+0x33/0x35
> > > (XEN) [555.043053][] F handle_pio+0x6d/0x1b4
> > > (XEN) [555.043059][] F svm_vmexit_handler+0x10bf/0x18b0
> > > (XEN) [555.043064][] F svm_stgi_label+0x8/0x18
> > > (XEN) [555.043067]
> > > (XEN) [555.469861]
> > > (XEN) [555.471855] 
> > > (XEN) [555.477315] Panic on CPU 9:
> > > (XEN) [555.480608] Assertion 'per_cpu(vector_irq, cpu)[old_vector] == 
> > > irq' failed at arch/x86/irq.c:233
> > > (XEN) [555.489882] 
> > > 
> > > Looking at the code in question, the ASSERT looks wrong to me.
> > > 
> > > Specifically, if you see send_cleanup_vector and
> > > irq_move_cleanup_interrupt, it is entirely possible to have old_vector
> > > still valid and also move_in_progress still set, but only some of the
> > > per_cpu(vector_irq, me)[vector] cleared. It seems to me that this could
> > > happen especially when an MSI has a large old_cpu_mask.
> > 
> > i guess the only way to get into such situation would be if you happen
> > to execute _clear_irq_vector() with a cpu_online_map smaller than the
> > one in old_cpu_mask, at which point you will leave old_vector fields
> > not updated.
> > 
> > Maybe somehow you get into this situation when doing suspend/resume?
> > 
> > Could you try to add a:
> > 
> > ASSERT(cpumask_equal(tmp_mask, desc->arch.old_cpu_mask));
> > 
> > Before the `for_each_cpu(cpu, tmp_mask)` loop?
> 
> I see that the old_cpu_mask is cleared in release_old_vec(), so that
> suggestion is not very useful.
> 
> Does the crash happen at specific points, for example just after
> resume or before suspend?

Hi Roger, Jan,

Unfortunately we are all completely blind on this issue. It is not
reproducible. It only happened once. We only have our wits to solve this
problem. However, looking at the codebase I think there are a few
possible races. Here is my analysis and suggested fix.


---
xen/irq: fix races between send_cleanup_vector and _clear_irq_vector

It is possible that send_cleanup_vector and _clear_irq_vector are
running at the same time on different CPUs. In that case we have a race
as both _clear_irq_vector and irq_move_cleanup_interrupt are trying to
clear old_vector.

This patch fixes 3 races:

1) As irq_move_cleanup_interrupt is running on multiple CPUs at the
same time, and also _clear_irq_vector is running, it is possible that
only some per_cpu(vector_irq, cpu)[old_vector] are valid but not all.
So, turn the ASSERT in _clear_irq_vector into an if.

2) It is possible that _clear_irq_vector is running at the same time as
release_old_vec, called from irq_move_cleanup_interrupt. At the moment,
it is possible for _clear_irq_vector to read a valid old_cpu_mask but an
invalid old_vector (because it is being set to invalid by
release_old_vec). To avoid this problem in release_old_vec move clearing
old_cpu_mask before setting old_vector to invalid. This way, we know that
in _clear_irq_vector if old_vector is invalid also old_cpu_mask is zero
and we don't enter the loop.

3) It is possible that release_old_vec is running twice at the same time
for the same old_vector. Change the code in release_old_vec to make it
OK to call it twice. Remove both ASSERTs. With those gone, it should be
possible now to call release_old_vec twice in a row for the same
old_vector.

Signed-off-by: Stefano Stabellini 
---
 xen/arch/x86/irq.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 20150b1c7f..d9c139cf1b 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -112,16 +112,11 @@ static void release_old_vec(struct irq_desc *desc)
 {
 unsigned int 

Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Stefano Stabellini
On Wed, 24 May 2023, Jan Beulich wrote:
> Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
> console) are associated with DomXEN, not Dom0. This means that while
> looking for overlapping BARs such devices cannot be found on Dom0's list
> of devices; DomXEN's list also needs to be scanned.
> 
> Suppress vPCI init altogether for r/o devices (which constitute a subset
> of hidden ones).
> 
> Signed-off-by: Jan Beulich 

This works! Ship it! :-)
https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/4346896950

I understand this is an RFC and there are still open questions, but
thank you for addressing the issue quickly.



> ---
> RFC: The modify_bars() change is intentionally mis-formatted, as the
>  necessary re-indentation would make the diff difficult to read. At
>  this point I'd merely like to gather input towards possible better
>  approaches to solve the issue (not the least because quite possibly
>  there are further places needing changing).
> 
> RFC: Whether to actually suppress vPCI init is up for debate; adding the
>  extra logic is following Roger's suggestion (I'm not convinced it is
>  useful to have). If we want to keep that, a 2nd question would be
>  whether to keep it in vpci_add_handlers(): Both of its callers (can)
>  have a struct pci_seg readily available, so checking ->ro_map at the
>  call sites would be easier.
> 
> RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
>  modify_bars() to consistently respect BARs of hidden devices while
>  setting up "normal" ones (i.e. to avoid as much as possible the
>  "continue" path introduced here), setting up of the former may want
>  doing first.
> 
> RFC: vpci_write()'s check of the r/o map may want moving out of mainline
>  code, into the case dealing with !pdev->vpci.
> ---
> v2: Extend existing comment. Relax assertion. Don't initialize vPCI for
> r/o devices.
> 
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -218,6 +218,7 @@ static int modify_bars(const struct pci_
>  struct vpci_header *header = >vpci->header;
>  struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>  struct pci_dev *tmp, *dev = NULL;
> +const struct domain *d;
>  const struct vpci_msix *msix = pdev->vpci->msix;
>  unsigned int i;
>  int rc;
> @@ -285,9 +286,11 @@ static int modify_bars(const struct pci_
>  
>  /*
>   * Check for overlaps with other BARs. Note that only BARs that are
> - * currently mapped (enabled) are checked for overlaps.
> + * currently mapped (enabled) are checked for overlaps. Note also that
> + * for Dom0 we also need to include hidden, i.e. DomXEN's, devices.
>   */
> -for_each_pdev ( pdev->domain, tmp )
> +for ( d = pdev->domain; ; d = dom_xen ) {//todo
> +for_each_pdev ( d, tmp )
>  {
>  if ( tmp == pdev )
>  {
> @@ -304,6 +307,7 @@ static int modify_bars(const struct pci_
>   */
>  continue;
>  }
> +if ( !tmp->vpci ) { ASSERT(d == dom_xen); continue; }//todo
>  
>  for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
>  {
> @@ -330,6 +334,7 @@ static int modify_bars(const struct pci_
>  }
>  }
>  }
> +if ( !is_hardware_domain(d) ) break; }//todo
>  
>  ASSERT(dev);
>  
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -70,6 +70,7 @@ void vpci_remove_device(struct pci_dev *
>  int vpci_add_handlers(struct pci_dev *pdev)
>  {
>  unsigned int i;
> +const unsigned long *ro_map;
>  int rc = 0;
>  
>  if ( !has_vpci(pdev->domain) )
> @@ -78,6 +79,11 @@ int vpci_add_handlers(struct pci_dev *pd
>  /* We should not get here twice for the same device. */
>  ASSERT(!pdev->vpci);
>  
> +/* No vPCI for r/o devices. */
> +ro_map = pci_get_ro_map(pdev->sbdf.seg);
> +if ( ro_map && test_bit(pdev->sbdf.bdf, ro_map) )
> +return 0;
> +
>  pdev->vpci = xzalloc(struct vpci);
>  if ( !pdev->vpci )
>  return -ENOMEM;
> 



Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Stefano Stabellini
On Wed, 24 May 2023, Jan Beulich wrote:
> >> RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
> >>  modify_bars() to consistently respect BARs of hidden devices while
> >>  setting up "normal" ones (i.e. to avoid as much as possible the
> >>  "continue" path introduced here), setting up of the former may want
> >>  doing first.
> > 
> > But BARs of hidden devices should be mapped into dom0 physmap?
> 
> Yes.

The BARs would be mapped read-only (not read-write), right? Otherwise we
let dom0 access devices that belong to Xen, which doesn't seem like a
good idea.

But even if we map the BARs read-only, what is the benefit of mapping
them to Dom0? If Dom0 loads a driver for it and the driver wants to
initialize the device, the driver will crash because the MMIO region is
read-only instead of read-write, right?

How does this device hiding work for dom0? How does dom0 know not to
access a device that is present on the PCI bus but is used by Xen?

The reason why I was suggesting to hide the device completely in the
past was that I assumed we had to hide the device (make it "disappear"
on the PCI bus) otherwise Dom0 would access it. Is there another way to
mark a PCI devices as "inaccessible" or "disabled"?



[xen-unstable-smoke test] 180931: tolerable all pass - PUSHED

2023-05-24 Thread osstest service owner
flight 180931 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180931/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  511b9f286c3dadd041e0d90beeff7d47c9bf3b7a
baseline version:
 xen  380c6c170393c48852d4f2b1ea97125a399cfc61

Last test of basis   180929  2023-05-24 15:03:16 Z0 days
Testing same since   180931  2023-05-24 20:01:55 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   380c6c1703..511b9f286c  511b9f286c3dadd041e0d90beeff7d47c9bf3b7a -> smoke



[linux-linus test] 180925: regressions - FAIL

2023-05-24 Thread osstest service owner
flight 180925 linux-linus real [real]
flight 180932 linux-linus real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180925/
http://logs.test-lab.xenproject.org/osstest/logs/180932/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit1   8 xen-boot fail REGR. vs. 180278

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-vhd 21 guest-start/debian.repeat fail pass in 180932-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  8 xen-boot fail  like 180278
 test-armhf-armhf-examine  8 reboot   fail  like 180278
 test-armhf-armhf-xl-arndale   8 xen-boot fail  like 180278
 test-armhf-armhf-libvirt-qcow2  8 xen-bootfail like 180278
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180278
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180278
 test-armhf-armhf-xl-vhd   8 xen-boot fail  like 180278
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180278
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180278
 test-armhf-armhf-xl-multivcpu  8 xen-boot fail like 180278
 test-armhf-armhf-libvirt-raw  8 xen-boot fail  like 180278
 test-armhf-armhf-xl-credit2   8 xen-boot fail  like 180278
 test-armhf-armhf-xl-rtds  8 xen-boot fail  like 180278
 test-armhf-armhf-xl   8 xen-boot fail  like 180278
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180278
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 linux9d646009f65d62d32815f376465a3b92d8d9b046
baseline version:
 linux6c538e1adbfc696ac4747fb10d63e704344f763d

Last test of basis   180278  2023-04-16 19:41:46 Z   38 days
Failing since180281  2023-04-17 06:24:36 Z   37 days   69 attempts
Testing same since   180925  2023-05-24 05:07:18 Z0 days1 attempts


2480 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops 

Re: [RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

2023-05-24 Thread Edgecombe, Rick P
On Fri, 2023-05-05 at 17:20 +0200, Mickaël Salaün wrote:
> # How does it work?
> 
> This implementation mainly leverages KVM capabilities to control the
> Second
> Layer Address Translation (or the Two Dimensional Paging e.g.,
> Intel's EPT or
> AMD's RVI/NPT) and Mode Based Execution Control (Intel's MBEC)
> introduced with
> the Kaby Lake (7th generation) architecture. This allows to set
> permissions on
> memory pages in a complementary way to the guest kernel's managed
> memory
> permissions. Once these permissions are set, they are locked and
> there is no
> way back.
> 
> A first KVM_HC_LOCK_MEM_PAGE_RANGES hypercall enables the guest
> kernel to lock
> a set of its memory page ranges with either the HEKI_ATTR_MEM_NOWRITE
> or the
> HEKI_ATTR_MEM_EXEC attribute. The first one denies write access to a
> specific
> set of pages (allow-list approach), and the second only allows kernel
> execution
> for a set of pages (deny-list approach).
> 
> The current implementation sets the whole kernel's .rodata (i.e., any
> const or
> __ro_after_init variables, which includes critical security data such
> as LSM
> parameters) and .text sections as non-writable, and the .text section
> is the
> only one where kernel execution is allowed. This is possible thanks
> to the new
> MBEC support also brough by this series (otherwise the vDSO would
> have to be
> executable). Thanks to this hardware support (VT-x, EPT and MBEC),
> the
> performance impact of such guest protection is negligible.
> 
> The second KVM_HC_LOCK_CR_UPDATE hypercall enables guests to pin some
> of its
> CPU control register flags (e.g., X86_CR0_WP, X86_CR4_SMEP,
> X86_CR4_SMAP),
> which is another complementary hardening mechanism.
> 
> Heki can be enabled with the heki=1 boot command argument.
> 
> 

Can the guest kernel ask the host VMM's emulated devices to DMA into
the protected data? It should go through the host userspace mappings I
think, which don't care about EPT permissions. Or did I miss where you
are protecting that another way? There are a lot of easy ways to ask
the host to write to guest memory that don't involve the EPT. You
probably need to protect the host userspace mappings, and also the
places in KVM that kmap a GPA provided by the guest.

[ snip ]

> 
> # Current limitations
> 
> The main limitation of this patch series is the statically enforced
> permissions. This is not an issue for kernels without module but this
> needs to
> be addressed.  Mechanisms that dynamically impact kernel executable
> memory are
> not handled for now (e.g., kernel modules, tracepoints, eBPF JIT),
> and such
> code will need to be authenticated.  Because the hypervisor is highly
> privileged and critical to the security of all the VMs, we don't want
> to
> implement a code authentication mechanism in the hypervisor itself
> but delegate
> this verification to something much less privileged. We are thinking
> of two
> ways to solve this: implement this verification in the VMM or spawn a
> dedicated
> special VM (similar to Windows's VBS). There are pros on cons to each
> approach:
> complexity, verification code ownership (guest's or VMM's), access to
> guest
> memory (i.e., confidential computing).

The kernel often creates writable aliases in order to write to
protected data (kernel text, etc). Some of this is done right as text
is being first written out (alternatives for example), and some happens
way later (jump labels, etc). So for verification, I wonder what stage
you would be verifying? If you want to verify the end state, you would
have to maintain knowledge in the verifier of all the touch-ups the
kernel does. I think it would get very tricky.

It also seems it will be a decent ask for the guest kernel to keep
track of GPA permissions as well as normal virtual memory pemirssions,
if this thing is not widely used.

So I wondering if you could go in two directions with this:
1. Make this a feature only for super locked down kernels (no modules,
etc). Forbid any configurations that might modify text. But eBPF is
used for seccomp, so you might be turning off some security protections
to get this.
2. Loosen the rules to allow the protections to not be so one-way
enable. Get less security, but used more widely.

There were similar dilemmas with the PV CR pinning stuff.


Re: [PATCH v1] xen: fix comment typo regarding IOREQ Server

2023-05-24 Thread Stefano Stabellini
On Wed, 24 May 2023, Olaf Hering wrote:
> Signed-off-by: Olaf Hering 

Acked-by: Stefano Stabellini 


> ---
>  xen/include/public/hvm/dm_op.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
> index acdf91693d..fa98551914 100644
> --- a/xen/include/public/hvm/dm_op.h
> +++ b/xen/include/public/hvm/dm_op.h
> @@ -17,7 +17,7 @@
>  /*
>   * IOREQ Servers
>   *
> - * The interface between an I/O emulator an Xen is called an IOREQ Server.
> + * The interface between an I/O emulator and Xen is called an IOREQ Server.
>   * A domain supports a single 'legacy' IOREQ Server which is instantiated if
>   * parameter...
>   *
> 



Re: [RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

2023-05-24 Thread Trilok Soni

On 5/5/2023 8:20 AM, Mickaël Salaün wrote:

Hi,

This patch series is a proof-of-concept that implements new KVM features
(extended page tracking, MBEC support, CR pinning) and defines a new API to
protect guest VMs. No VMM (e.g., Qemu) modification is required.

The main idea being that kernel self-protection mechanisms should be delegated
to a more privileged part of the system, hence the hypervisor. It is still the
role of the guest kernel to request such restrictions according to its


Only for the guest kernel images here? Why not for the host OS kernel? 
Embedded devices w/ Android you have mentioned below supports the host 
OS as well it seems, right?


Do we suggest that all the functionalities should be implemented in the 
Hypervisor (NS-EL2 for ARM) or even at Secure EL like Secure-EL1 (ARM).


I am hoping that whatever we suggest the interface here from the Guest 
to the Hypervisor becomes the ABI right?





# Current limitations

The main limitation of this patch series is the statically enforced
permissions. This is not an issue for kernels without module but this needs to
be addressed.  Mechanisms that dynamically impact kernel executable memory are
not handled for now (e.g., kernel modules, tracepoints, eBPF JIT), and such
code will need to be authenticated.  Because the hypervisor is highly
privileged and critical to the security of all the VMs, we don't want to
implement a code authentication mechanism in the hypervisor itself but delegate
this verification to something much less privileged. We are thinking of two
ways to solve this: implement this verification in the VMM or spawn a dedicated
special VM (similar to Windows's VBS). There are pros on cons to each approach:
complexity, verification code ownership (guest's or VMM's), access to guest
memory (i.e., confidential computing).


Do you foresee the performance regressions due to lot of tracking here? 
Production kernels do have lot of tracepoints and we use it as feature 
in the GKI kernel for the vendor hooks implementation and in those cases 
every vendor driver is a module. Separate VM further fragments this 
design and delegates more of it to proprietary solutions?


Do you have any performance numbers w/ current RFC?

---Trilok Soni



Re: [PATCH v1 2/9] KVM: x86/mmu: Add support for prewrite page tracking

2023-05-24 Thread Madhavan T. Venkataraman



On 5/5/23 12:31, Sean Christopherson wrote:
> On Fri, May 05, 2023, Micka�l Sala�n wrote:
>>
>> On 05/05/2023 18:28, Sean Christopherson wrote:
>>> I have no doubt that we'll need to solve performance and scaling issues 
>>> with the
>>> memory attributes implementation, e.g. to utilize xarray multi-range support
>>> instead of storing information on a per-4KiB-page basis, but AFAICT, the 
>>> core
>>> idea is sound.  And a very big positive from a maintenance perspective is 
>>> that
>>> any optimizations, fixes, etc. for one use case (CoCo vs. hardening) should 
>>> also
>>> benefit the other use case.
>>>
>>> [1] https://lore.kernel.org/all/20230311002258.852397-22-sea...@google.com
>>> [2] https://lore.kernel.org/all/y2wb48kd0j4vg...@google.com
>>> [3] https://lore.kernel.org/all/y1a1i9vbj%2fpvm...@google.com
>>
>> I agree, I used this mechanism because it was easier at first to rely on a
>> previous work, but while I was working on the MBEC support, I realized that
>> it's not the optimal way to do it.
>>
>> I was thinking about using a new special EPT bit similar to
>> EPT_SPTE_HOST_WRITABLE, but it may not be portable though. What do you
>> think?
> 
> On x86, SPTEs are even more ephemeral than memslots.  E.g. for historical 
> reasons,
> KVM zaps all SPTEs if _any_ memslot is deleted, which is problematic if the 
> guest
> is moving around BARs, using option ROMs, etc.
> 
> ARM's pKVM tracks metadata in its stage-2 PTEs, i.e. doesn't need an xarray to
> otrack attributes, but that works only because pKVM is more privileged than 
> the
> host kernel, and the shared vs. private memory attribute that pKVM cares about
> is very, very restricted in how it can be used and changed.
> 
> I tried shoehorning private vs. shared metadata into x86's SPTEs in the past, 
> and
> it ended up being a constant battle with the kernel, e.g. page migration, and 
> with
> KVM itself, e.g. the above memslot mess.

Sorry for the delay in responding to this. I wanted to study the KVM code and 
fully
understand your comment before responding.

Yes, I quite agree with you. I will make an attempt to address this in the next 
version.
I am working on it right now.

Thanks.

Madhavan



Re: [patch v3 31/36] x86/apic: Provide cpu_primary_thread mask

2023-05-24 Thread Kirill A. Shutemov
On Mon, May 08, 2023 at 09:44:17PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner 
> 
> Make the primary thread tracking CPU mask based in preparation for simpler
> handling of parallel bootup.
> 
> Signed-off-by: Thomas Gleixner 
> Tested-by: Michael Kelley 
> 
> 
> ---
>  arch/x86/include/asm/apic.h |2 --
>  arch/x86/include/asm/topology.h |   19 +++
>  arch/x86/kernel/apic/apic.c |   20 +---
>  arch/x86/kernel/smpboot.c   |   12 +++-
>  4 files changed, 27 insertions(+), 26 deletions(-)
> ---
> 

...

> @@ -2386,20 +2386,16 @@ bool arch_match_cpu_phys_id(int cpu, u64
>  }
>  
>  #ifdef CONFIG_SMP
> -/**
> - * apic_id_is_primary_thread - Check whether APIC ID belongs to a primary 
> thread
> - * @apicid: APIC ID to check
> - */
> -bool apic_id_is_primary_thread(unsigned int apicid)
> +static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
>  {
> - u32 mask;
> -
> - if (smp_num_siblings == 1)
> - return true;
>   /* Isolate the SMT bit(s) in the APICID and check for 0 */
> - mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
> - return !(apicid & mask);
> + u32 mask = (1U << (fls(smp_num_siblings) - 1)) - 1;
> +
> + if (smp_num_siblings == 1 || !(apicid & mask))
> + cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
>  }
> +#else
> +static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int 
> apicid) { }
>  #endif
>  
>  /*

This patch causes boot regression on TDX guest. The guest crashes on SMP
bring up.

The change makes use of smp_num_siblings earlier than before: the mask get
constructed in acpi_boot_init() codepath. Later on smp_num_siblings gets
updated in detect_ht().

In my setup with 16 vCPUs, smp_num_siblings is 16 before detect_ht() and
set to 1 in detect_ht().

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



[PATCH v1] xen: fix comment typo regarding IOREQ Server

2023-05-24 Thread Olaf Hering
Signed-off-by: Olaf Hering 
---
 xen/include/public/hvm/dm_op.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/public/hvm/dm_op.h b/xen/include/public/hvm/dm_op.h
index acdf91693d..fa98551914 100644
--- a/xen/include/public/hvm/dm_op.h
+++ b/xen/include/public/hvm/dm_op.h
@@ -17,7 +17,7 @@
 /*
  * IOREQ Servers
  *
- * The interface between an I/O emulator an Xen is called an IOREQ Server.
+ * The interface between an I/O emulator and Xen is called an IOREQ Server.
  * A domain supports a single 'legacy' IOREQ Server which is instantiated if
  * parameter...
  *



[qemu-mainline test] 180927: regressions - FAIL

2023-05-24 Thread osstest service owner
flight 180927 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180927/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 180691
 build-amd64-xsm   6 xen-buildfail REGR. vs. 180691
 build-i386-xsm6 xen-buildfail REGR. vs. 180691
 build-i3866 xen-buildfail REGR. vs. 180691
 build-arm64   6 xen-buildfail REGR. vs. 180691
 build-arm64-xsm   6 xen-buildfail REGR. vs. 180691
 build-armhf   6 xen-buildfail REGR. vs. 180691

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-vhd1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-vhd   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-coresched-i386-xl  1 build-check(1)   blocked  n/a
 test-amd64-coresched-amd64-xl  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 

Re: [PATCH v2 03/10] x86/cpu-policy: Infrastructure for MSR_ARCH_CAPS

2023-05-24 Thread Andrew Cooper
On 24/05/2023 3:53 pm, Jan Beulich wrote:
> On 24.05.2023 13:25, Andrew Cooper wrote:
>> Bits through 24 are already defined, meaning that we're not far off needing
>> the second word.  Put both in right away.
>>
>> As both halves are present now, the arch_caps field is full width.  Adjust 
>> the
>> unit test, which notices.
>>
>> The bool bitfield names in the arch_caps union are unused, and somewhat out 
>> of
>> date.  They'll shortly be automatically generated.
>>
>> Add CPUID and MSR prefixes to the ./xen-cpuid verbose output, now that there
>> are a mix of the two.
>>
>> Signed-off-by: Andrew Cooper 
> Acked-by: Jan Beulich 

Thanks,

> albeit ...
>
>> --- a/tools/misc/xen-cpuid.c
>> +++ b/tools/misc/xen-cpuid.c
>> @@ -226,31 +226,41 @@ static const char *const str_7d2[32] =
>>  [ 4] = "bhi-ctrl",  [ 5] = "mcdt-no",
>>  };
>>  
>> +static const char *const str_10Al[32] =
>> +{
>> +};
>> +
>> +static const char *const str_10Ah[32] =
>> +{
>> +};
>> +
>>  static const struct {
>>  const char *name;
>>  const char *abbr;
>>  const char *const *strs;
>>  } decodes[] =
>>  {
>> -{ "0x0001.edx",   "1d",  str_1d },
>> -{ "0x0001.ecx",   "1c",  str_1c },
>> -{ "0x8001.edx",   "e1d", str_e1d },
>> -{ "0x8001.ecx",   "e1c", str_e1c },
>> -{ "0x000d:1.eax", "Da1", str_Da1 },
>> -{ "0x0007:0.ebx", "7b0", str_7b0 },
>> -{ "0x0007:0.ecx", "7c0", str_7c0 },
>> -{ "0x8007.edx",   "e7d", str_e7d },
>> -{ "0x8008.ebx",   "e8b", str_e8b },
>> -{ "0x0007:0.edx", "7d0", str_7d0 },
>> -{ "0x0007:1.eax", "7a1", str_7a1 },
>> -{ "0x8021.eax",  "e21a", str_e21a },
>> -{ "0x0007:1.ebx", "7b1", str_7b1 },
>> -{ "0x0007:2.edx", "7d2", str_7d2 },
>> -{ "0x0007:1.ecx", "7c1", str_7c1 },
>> -{ "0x0007:1.edx", "7d1", str_7d1 },
>> +{ "CPUID 0x0001.edx","1d", str_1d },
>> +{ "CPUID 0x0001.ecx","1c", str_1c },
>> +{ "CPUID 0x8001.edx",   "e1d", str_e1d },
>> +{ "CPUID 0x8001.ecx",   "e1c", str_e1c },
>> +{ "CPUID 0x000d:1.eax", "Da1", str_Da1 },
>> +{ "CPUID 0x0007:0.ebx", "7b0", str_7b0 },
>> +{ "CPUID 0x0007:0.ecx", "7c0", str_7c0 },
>> +{ "CPUID 0x8007.edx",   "e7d", str_e7d },
>> +{ "CPUID 0x8008.ebx",   "e8b", str_e8b },
>> +{ "CPUID 0x0007:0.edx", "7d0", str_7d0 },
>> +{ "CPUID 0x0007:1.eax", "7a1", str_7a1 },
>> +{ "CPUID 0x8021.eax",  "e21a", str_e21a },
>> +{ "CPUID 0x0007:1.ebx", "7b1", str_7b1 },
>> +{ "CPUID 0x0007:2.edx", "7d2", str_7d2 },
>> +{ "CPUID 0x0007:1.ecx", "7c1", str_7c1 },
>> +{ "CPUID 0x0007:1.edx", "7d1", str_7d1 },
> ... I'm not really happy about this added verbosity. In a tool of this
> name, I think it's pretty clear that unadorned names are CPUID stuff.

You might make the connection, but it's unreasonable to expect the same
of everyone else.  This is used by end users.

If nothing else, the name of the binary is made stale by this change.

>> +{ "MSR   0x010a.lo",  "m10Al", str_10Al },
>> +{ "MSR   0x010a.hi",  "m10Ah", str_10Ah },
> Once we gain a few more MSRs, I'm afraid the raw numbers aren't going
> to be very useful. As vaguely suggested before, how about
>
> { "MSR_ARCH_CAPS.lo",  "m10Al", str_10Al },
> { "MSR_ARCH_CAPS.hi",  "m10Ah", str_10Ah },
>
> ?

I've done this.  I remain to be convinced, but it probably is nicer for
people who don't know the MSR indices like I do.

~Andrew



[xen-unstable-smoke test] 180929: tolerable all pass - PUSHED

2023-05-24 Thread osstest service owner
flight 180929 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180929/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  380c6c170393c48852d4f2b1ea97125a399cfc61
baseline version:
 xen  c7908869ac26961a3919491705e521179ad3fc0e

Last test of basis   180897  2023-05-22 15:00:25 Z2 days
Testing same since   180929  2023-05-24 15:03:16 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  Daniel P. Smith 
  Jan Beulich 
  Julien Grall 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   c7908869ac..380c6c1703  380c6c170393c48852d4f2b1ea97125a399cfc61 -> smoke



Re: [PATCH v2] iommu/vtd: fix address translation for superpages

2023-05-24 Thread Jan Beulich
On 24.05.2023 17:22, Roger Pau Monne wrote:
> When translating an address that falls inside of a superpage in the
> IOMMU page tables the fetching of the PTE value wasn't masking of the
> contiguous related data, which caused the returned data to be
> corrupt as it would contain bits that the caller would interpret as
> part of the address.
> 
> Fix this by masking off the contiguous bits.
> 
> Fixes: c71e55501a61 ('VT-d: have callers specify the target level for page 
> table walks')
> Signed-off-by: Roger Pau Monné 

Just to clarify: The title says superpages and you also only deal with
superpages. Yet in the earlier discussion I pointed out that the 4k-page
case looks to also be flawed (I don't think anymore we iterate one too
many times, but I'm pretty sure the r/w flags are missing in what we
return to intel_iommu_lookup_page()). Did you convince yourself
otherwise in the meantime? Or is that going to be a separate change
(whether by you or someone else, like me)?

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -368,7 +368,7 @@ static uint64_t addr_to_dma_page_maddr(struct domain 
> *domain, daddr_t addr,
>   * with the address adjusted to account for the residual of
>   * the walk.
>   */
> -pte_maddr = pte->val +
> +pte_maddr = (pte->val & ~DMA_PTE_CONTIG_MASK) +

While this addresses the problem at hand, wouldn't masking by PADDR_MASK
be more forward compatible (for whenever another of the high bits gets
used)?

Jan

>  (addr & ((1UL << level_to_offset_bits(level)) - 1) &
>   PAGE_MASK);
>  if ( !target )




[PATCH] iscsi_ibft: Fix finding the iBFT under Xen Dom 0

2023-05-24 Thread Ross Lagerwall
Since firmware doesn't indicate the iBFT in the E820, add a reserved
region so that it gets identity mapped when running as Dom 0 so that it
is possible to search for it. Move the call to reserve_ibft_region()
later so that it is called after the Xen identity mapping adjustments
are applied.

Finally, instead of using isa_bus_to_virt() which doesn't do the right
thing under Xen, use early_memremap() like the dmi_scan code does.

Signed-off-by: Ross Lagerwall 
---

It tested this to work under Xen and native Linux.

 arch/x86/kernel/setup.c|  2 +-
 arch/x86/xen/setup.c   |  8 +++-
 drivers/firmware/iscsi_ibft_find.c | 24 +---
 3 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 16babff771bd..616b80507abd 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -796,7 +796,6 @@ static void __init early_reserve_memory(void)
 
memblock_x86_reserve_range_setup_data();
 
-   reserve_ibft_region();
reserve_bios_regions();
trim_snb_memory();
 }
@@ -1032,6 +1031,7 @@ void __init setup_arch(char **cmdline_p)
if (efi_enabled(EFI_BOOT))
efi_init();
 
+   reserve_ibft_region();
dmi_setup();
 
/*
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index c2be3efb2ba0..daab59df3b99 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -772,8 +772,14 @@ char * __init xen_memory_setup(void)
 * UNUSABLE regions in domUs are not handled and will need
 * a patch in the future.
 */
-   if (xen_initial_domain())
+   if (xen_initial_domain()) {
xen_ignore_unusable();
+   /* Reserve 0.5 MiB to 1 MiB region so iBFT can be found */
+   xen_e820_table.entries[xen_e820_table.nr_entries].addr = 
0x8;
+   xen_e820_table.entries[xen_e820_table.nr_entries].size = 
0x8;
+   xen_e820_table.entries[xen_e820_table.nr_entries].type = 
E820_TYPE_RESERVED;
+   xen_e820_table.nr_entries++;
+   }
 
/* Make sure the Xen-supplied memory map is well-ordered. */
e820__update_table(_e820_table);
diff --git a/drivers/firmware/iscsi_ibft_find.c 
b/drivers/firmware/iscsi_ibft_find.c
index 94b49ccd23ac..e3c1449987dd 100644
--- a/drivers/firmware/iscsi_ibft_find.c
+++ b/drivers/firmware/iscsi_ibft_find.c
@@ -52,9 +52,9 @@ static const struct {
  */
 void __init reserve_ibft_region(void)
 {
-   unsigned long pos;
+   unsigned long pos, virt_pos = 0;
unsigned int len = 0;
-   void *virt;
+   void *virt = NULL;
int i;
 
ibft_phys_addr = 0;
@@ -70,13 +70,20 @@ void __init reserve_ibft_region(void)
 * so skip that area */
if (pos == VGA_MEM)
pos += VGA_SIZE;
-   virt = isa_bus_to_virt(pos);
+
+   /* Map page by page */
+   if (offset_in_page(pos) == 0) {
+   if (virt)
+   early_memunmap(virt, PAGE_SIZE);
+   virt = early_memremap_ro(pos, PAGE_SIZE);
+   virt_pos = pos;
+   }
 
for (i = 0; i < ARRAY_SIZE(ibft_signs); i++) {
-   if (memcmp(virt, ibft_signs[i].sign, IBFT_SIGN_LEN) ==
-   0) {
+   if (memcmp(virt + (pos - virt_pos), ibft_signs[i].sign,
+  IBFT_SIGN_LEN) == 0) {
unsigned long *addr =
-   (unsigned long *)isa_bus_to_virt(pos + 4);
+   (unsigned long *)(virt + pos - virt_pos + 
4);
len = *addr;
/* if the length of the table extends past 1M,
 * the table cannot be valid. */
@@ -84,9 +91,12 @@ void __init reserve_ibft_region(void)
ibft_phys_addr = pos;
memblock_reserve(ibft_phys_addr, 
PAGE_ALIGN(len));
pr_info("iBFT found at %pa.\n", 
_phys_addr);
-   return;
+   goto out;
}
}
}
}
+
+out:
+   early_memunmap(virt, PAGE_SIZE);
 }
-- 
2.31.1




Re: [PATCH] x86/pci/xen: populate MSI sysfs entries

2023-05-24 Thread Dave Hansen
On 5/24/23 08:47, Juergen Gross wrote:
>> Any other feedback on this one? This is definitely a bug but I
>> understand that
>> there might be different ways to fix it.
> 
> I'd be happy to take the patch via the Xen tree, but I think x86
> maintainers should at least ack that.

Ack.

Works for me.



Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Roger Pau Monné
On Wed, May 24, 2023 at 03:45:58PM +0200, Jan Beulich wrote:
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -218,6 +218,7 @@ static int modify_bars(const struct pci_
>  struct vpci_header *header = >vpci->header;
>  struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>  struct pci_dev *tmp, *dev = NULL;
> +const struct domain *d;
>  const struct vpci_msix *msix = pdev->vpci->msix;
>  unsigned int i;
>  int rc;
> @@ -285,9 +286,11 @@ static int modify_bars(const struct pci_
>  
>  /*
>   * Check for overlaps with other BARs. Note that only BARs that are
> - * currently mapped (enabled) are checked for overlaps.
> + * currently mapped (enabled) are checked for overlaps. Note also that
> + * for Dom0 we also need to include hidden, i.e. DomXEN's, devices.
>   */
> -for_each_pdev ( pdev->domain, tmp )
> +for ( d = pdev->domain; ; d = dom_xen ) {//todo

Looking at this again, I think this is slightly more complex, as during
runtime dom0 will get here with pdev->domain == hardware_domain OR
dom_xen, and hence you also need to account that devices that have
pdev->domain == dom_xen need to iterate over devices that belong to
the hardware_domain, ie:

for ( d = pdev->domain; ;
  d = (pdev->domain == dom_xen) ? hardware_domain : dom_xen )

And we likely want to limit this to devices that belong to the
hardware_domain or to dom_xen (in preparation for vPCI being used for
domUs).

Thanks, Roger.



Re: [PATCH] x86/pci/xen: populate MSI sysfs entries

2023-05-24 Thread Juergen Gross

On 24.05.23 17:43, Maximilian Heyne wrote:

On Wed, May 03, 2023 at 01:16:53PM +, Maximilian Heyne wrote:

Commit bf5e758f02fc ("genirq/msi: Simplify sysfs handling") reworked the
creation of sysfs entries for MSI IRQs. The creation used to be in
msi_domain_alloc_irqs_descs_locked after calling ops->domain_alloc_irqs.
Then it moved into __msi_domain_alloc_irqs which is an implementation of
domain_alloc_irqs. However, Xen comes with the only other implementation
of domain_alloc_irqs and hence doesn't run the sysfs population code
anymore.

Commit 6c796996ee70 ("x86/pci/xen: Fixup fallout from the PCI/MSI
overhaul") set the flag MSI_FLAG_DEV_SYSFS for the xen msi_domain_info
but that doesn't actually have an effect because Xen uses it's own
domain_alloc_irqs implementation.

Fix this by making use of the fallback functions for sysfs population.

Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
Signed-off-by: Maximilian Heyne 



Any other feedback on this one? This is definitely a bug but I understand that
there might be different ways to fix it.


I'd be happy to take the patch via the Xen tree, but I think x86 maintainers
should at least ack that.


Juergen



OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH] x86/pci/xen: populate MSI sysfs entries

2023-05-24 Thread Maximilian Heyne
On Wed, May 03, 2023 at 01:16:53PM +, Maximilian Heyne wrote:
> Commit bf5e758f02fc ("genirq/msi: Simplify sysfs handling") reworked the
> creation of sysfs entries for MSI IRQs. The creation used to be in
> msi_domain_alloc_irqs_descs_locked after calling ops->domain_alloc_irqs.
> Then it moved into __msi_domain_alloc_irqs which is an implementation of
> domain_alloc_irqs. However, Xen comes with the only other implementation
> of domain_alloc_irqs and hence doesn't run the sysfs population code
> anymore.
> 
> Commit 6c796996ee70 ("x86/pci/xen: Fixup fallout from the PCI/MSI
> overhaul") set the flag MSI_FLAG_DEV_SYSFS for the xen msi_domain_info
> but that doesn't actually have an effect because Xen uses it's own
> domain_alloc_irqs implementation.
> 
> Fix this by making use of the fallback functions for sysfs population.
> 
> Fixes: bf5e758f02fc ("genirq/msi: Simplify sysfs handling")
> Signed-off-by: Maximilian Heyne 


Any other feedback on this one? This is definitely a bug but I understand that
there might be different ways to fix it.



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879






Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Roger Pau Monné
On Wed, May 24, 2023 at 04:44:49PM +0200, Jan Beulich wrote:
> On 24.05.2023 16:22, Roger Pau Monné wrote:
> > On Wed, May 24, 2023 at 03:45:58PM +0200, Jan Beulich wrote:
> >> Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
> >> console) are associated with DomXEN, not Dom0. This means that while
> >> looking for overlapping BARs such devices cannot be found on Dom0's list
> >> of devices; DomXEN's list also needs to be scanned.
> >>
> >> Suppress vPCI init altogether for r/o devices (which constitute a subset
> >> of hidden ones).
> >>
> >> Signed-off-by: Jan Beulich 
> >> ---
> >> RFC: The modify_bars() change is intentionally mis-formatted, as the
> >>  necessary re-indentation would make the diff difficult to read. At
> >>  this point I'd merely like to gather input towards possible better
> >>  approaches to solve the issue (not the least because quite possibly
> >>  there are further places needing changing).
> > 
> > I think we should also handle the case of !pdev->vpci for the hardware
> > doamin, as it's allowed for the vpci_add_handlers() call in
> > setup_one_hwdom_device() to fail and the device wold still be assigned
> > to the hardware domain.
> > 
> > I can submit that as a separate bugfix, as it's already an issue
> > without taking r/o or hidden devices into account.
> 
> Yeah, I think that wants dealing with separately. I'm not actually sure
> though that "is allowed to fail" is proper behavior ...

One better option would be to mark the device r/o if the
vpci_add_handlers() call fails in setup_one_hwdom_device(), as that
would prevent dom0 from accessing native MSI(-X) capabilities.

> But anyway - I take this as you agreeing to go that route, which is the
> prereq for me to actually make a well-formed patch. Please shout soon
> if that's a misunderstanding of mine.

Sure, will send the fix later today or tomorrow so that you can
rebase.

> >> RFC: Whether to actually suppress vPCI init is up for debate; adding the
> >>  extra logic is following Roger's suggestion (I'm not convinced it is
> >>  useful to have). If we want to keep that, a 2nd question would be
> >>  whether to keep it in vpci_add_handlers(): Both of its callers (can)
> >>  have a struct pci_seg readily available, so checking ->ro_map at the
> >>  call sites would be easier.
> > 
> > But that would duplicate the logic into the callers, which doesn't
> > seem very nice to me, and makes it easier to make mistakes if further
> > callers are added and r/o is not checked there.
> 
> Right, hence why I didn't do it the alternative way from the beginning.
> Both approaches have a pro and a con.
> 
> But prior to answering the 2nd question, what about the 1st one? Is it
> really worth to have the extra logic?

Why would we want to do all the vPCI initialization for r/o devices?
None of the handlers setup will get called, so I see it the other way
around: not short-circuiting vpci_add_handlers() for r/o devices is a
waste of time and resources because none of the setup state would be
used anyway.

> >  And
> > hence doing those before or after normal devices will lead to the same
> > result.  The loop in modify_bars() is there to avoid attempting to map
> > the same range twice, or to unmap a range while there are devices
> > still using it, but the unmap is never done during initial device
> > setup.
> 
> Okay, so maybe indeed there's no effect on the final result. Yet still
> the anomaly bothered me when seeing it in the logs (it actually mislead
> me initially in my conclusions as to what was actually going on), when
> I added a printk() to that new "continue" path. We would skip hidden
> devices up until them getting initialized themselves. There would be
> less skipping if all (there aren't going to be many) DomXEN devices
> were initialized first.

I think that just makes the logic more complicated for no reason, the
only reason you don't see this with devices assigned to dom0 is
because device addition is interleaved with calls to
vpci_add_handlers().  However it would also be valid to add all
devices to dom0 and then call vpci_add_handlers() for each one of them.

> > One further question is whether we want to map BARs of r/o devices
> > into the hardware domain physmap.  Not sure that's very helpful, as
> > dom0 won't be allowed to modify any of the config space bits of those
> > devices, so even attempts to size the BARs will fail.  I wonder what
> > kind of issues this can cause to dom0 OSes.
> 
> This is what Linux (6.3) says:
> 
> pci :02:00.1: [Firmware Bug]: reg 0x10: invalid BAR (can't size)
> pci :02:00.1: [Firmware Bug]: reg 0x14: invalid BAR (can't size)
> pci :02:00.1: [Firmware Bug]: reg 0x24: invalid BAR (can't size)

OK, seems fine then.  There's no point in mapping the BARs of r/o
devices to the dom0 physmap, as the domain is unable to size them in
the first place.

Thanks, Roger.



Re: [XEN PATCH v8 02/22] xen/arm: tee: add a primitive FF-A mediator

2023-05-24 Thread Bertrand Marquis
Hi Andrew,

> On 4 May 2023, at 15:14, Bertrand Marquis  wrote:
> 
> Hi Andrew,
> 
>> On 14 Apr 2023, at 10:58, Jens Wiklander  wrote:
>> 
>> Hi,
>> 
>> On Thu, Apr 13, 2023 at 3:27 PM Andrew Cooper  
>> wrote:
>>> 
>>> On 13/04/2023 1:26 pm, Julien Grall wrote:
> +static int ffa_domain_init(struct domain *d)
> +{
> +struct ffa_ctx *ctx;
> +
> +if ( !ffa_version )
> +return -ENODEV;
> +
> +ctx = xzalloc(struct ffa_ctx);
> +if ( !ctx )
> +return -ENOMEM;
> +
> +d->arch.tee = ctx;
> +
> +return 0;
> +}
> +
> +/* This function is supposed to undo what ffa_domain_init() has done */
 
 I think there is a problem in the TEE framework. The callback
 .relinquish_resources() will not be called if domain_create() failed.
 So this will result to a memory leak.
 
 We also can't call .relinquish_resources() on early domain creation
 failure because relinquishing resources can take time and therefore
 needs to be preemptible.
 
 So I think we need to introduce a new callback domain_free() that will
 be called arch_domain_destroy(). Is this something you can look at?
>>> 
>>> 
>>> Cleanup of an early domain creation failure, however you do it, is at
>>> most "the same amount of time again".  It cannot (absent of development
>>> errors) take the same indefinite time periods of time that a full
>>> domain_destroy() can.
>>> 
>>> The error path in domain_create() explicitly does call domain_teardown()
>>> so we can (eventually) purge these duplicate cleanup paths.  There are
>>> far too many easy errors to be made which occur from having split
>>> cleanup, and we have had to issue XSAs in the past to address some of
>>> them.  (Hence the effort to try and specifically change things, and
>>> remove the ability to introduce the errors in the first place.)
>>> 
>>> 
>>> Right now, it is specifically awkward to do this nicely because
>>> domain_teardown() doesn't call into a suitable arch hook.
>>> 
>>> IMO the best option here is extend domain_teardown() with an
>>> arch_domain_teardown() state/hook, and wire in the TEE cleanup path into
>>> this too.
>>> 
>>> Anything else is explicitly adding to technical debt that I (or someone
>>> else) is going to have to revert further down the line.
>>> 
>>> If you want, I am happy to prototype the arch_domain_teardown() bit of
>>> the fix, but I will have to defer wiring in the TEE part to someone
>>> capable of testing it.
>> 
>> You're more than welcome to prototype the fix, I can test it and add
>> it to the next version of the patch set when we're happy with the
>> result.
> 
> 
> Could you tell us if you are still happy to work on the prototype for
> arch_domain_teardown and when you would be able to give a first prototype ?

Could you answer to this question ?

Cheers
Bertrand



[PATCH v2] iommu/vtd: fix address translation for superpages

2023-05-24 Thread Roger Pau Monne
When translating an address that falls inside of a superpage in the
IOMMU page tables the fetching of the PTE value wasn't masking of the
contiguous related data, which caused the returned data to be
corrupt as it would contain bits that the caller would interpret as
part of the address.

Fix this by masking off the contiguous bits.

Fixes: c71e55501a61 ('VT-d: have callers specify the target level for page 
table walks')
Signed-off-by: Roger Pau Monné 
---
Changes since v1:
 - Return all the PTE bits except for the contiguous count ones.
---
 xen/drivers/passthrough/vtd/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 130a159cde07..d7ba9a9c349f 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -368,7 +368,7 @@ static uint64_t addr_to_dma_page_maddr(struct domain 
*domain, daddr_t addr,
  * with the address adjusted to account for the residual of
  * the walk.
  */
-pte_maddr = pte->val +
+pte_maddr = (pte->val & ~DMA_PTE_CONTIG_MASK) +
 (addr & ((1UL << level_to_offset_bits(level)) - 1) &
  PAGE_MASK);
 if ( !target )
-- 
2.40.0




Re: [PATCH v7 12/12] xen/changelog: Add SVE and "dom0" options to the changelog for Arm

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Arm now can use the "dom0=" Xen command line option and the support
> for guests running SVE instructions is added, put entries in the
> changelog.
> 
> Mention the "Tech Preview" status and add an entry in SUPPORT.md
> 
> Signed-off-by: Luca Fancellu 
> Acked-by: Henry Wang  # CHANGELOG

Reviewed-by: Bertrand Marquis 

Cheers
Bertrand

> ---
> Changes from v6:
> - Add Henry's A-by to CHANGELOG
> Changes from v5:
> - Add Tech Preview status and add entry in SUPPORT.md (Bertrand)
> Changes from v4:
> - No changes
> Change from v3:
> - new patch
> ---
> CHANGELOG.md | 3 +++
> SUPPORT.md   | 6 ++
> 2 files changed, 9 insertions(+)
> 
> diff --git a/CHANGELOG.md b/CHANGELOG.md
> index 5bfd3aa5c0d5..512b7bdc0fcb 100644
> --- a/CHANGELOG.md
> +++ b/CHANGELOG.md
> @@ -11,6 +11,8 @@ The format is based on [Keep a 
> Changelog](https://keepachangelog.com/en/1.0.0/)
>cap toolstack provided values.
>  - Ignore VCPUOP_set_singleshot_timer's VCPU_SSHOTTMR_future flag. The only
>known user doesn't use it properly, leading to in-guest breakage.
> + - The "dom0" option is now supported on Arm and "sve=" sub-option can be 
> used
> +   to enable dom0 guest to use SVE/SVE2 instructions.
> 
> ### Added
>  - On x86, support for features new in Intel Sapphire Rapids CPUs:
> @@ -20,6 +22,7 @@ The format is based on [Keep a 
> Changelog](https://keepachangelog.com/en/1.0.0/)
>- Bus-lock detection, used by Xen to mitigate (by rate-limiting) the system
>  wide impact of a guest misusing atomic instructions.
>  - xl/libxl can customize SMBIOS strings for HVM guests.
> + - On Arm, Xen supports guests running SVE/SVE2 instructions. (Tech Preview)
> 
> ## 
> [4.17.0](https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;h=RELEASE-4.17.0)
>  - 2022-12-12
> 
> diff --git a/SUPPORT.md b/SUPPORT.md
> index 6dbed9d5d029..e0fa2246807b 100644
> --- a/SUPPORT.md
> +++ b/SUPPORT.md
> @@ -99,6 +99,12 @@ Extension to the GICv3 interrupt controller to support MSI.
> 
> Status: Experimental
> 
> +### ARM Scalable Vector Extension (SVE/SVE2)
> +
> +AArch64 guest can use Scalable Vector Extension (SVE/SVE2).
> +
> +Status: Tech Preview
> +
> ## Guest Type
> 
> ### x86/PV
> -- 
> 2.34.1
> 




Re: [PATCH v7 11/12] xen/arm: add sve property for dom0less domUs

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Add a device tree property in the dom0less domU configuration
> to enable the guest to use SVE.
> 
> Update documentation.
> 
> Signed-off-by: Luca Fancellu 

Reviewed-by: Bertrand Marquis 

Cheers
Bertrand

> ---
> Changes from v6:
> - Use ifdef in create_domUs and fail if 'sve' is used on systems
>   with CONFIG_ARM64_SVE not selected (Bertrand, Julien, Jan)
> Changes from v5:
> - Stop the domain creation if SVE not supported or SVE VL
>   errors (Julien, Bertrand)
> - now sve_sanitize_vl_param is renamed to sve_domctl_vl_param
>   and returns a boolean, change the affected code.
> - Reworded documentation.
> Changes from v4:
> - Now it is possible to specify the property "sve" for dom0less
>   device tree node without any value, that means the platform
>   supported VL will be used.
> Changes from v3:
> - Now domainconfig_encode_vl is named sve_encode_vl
> Changes from v2:
> - xen_domctl_createdomain field name has changed into sve_vl
>   and its value is the VL/128, use domainconfig_encode_vl
>   to encode a plain VL in bits.
> Changes from v1:
> - No changes
> Changes from RFC:
> - Changed documentation
> ---
> docs/misc/arm/device-tree/booting.txt | 16 +++
> xen/arch/arm/domain_build.c   | 28 +++
> 2 files changed, 44 insertions(+)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt 
> b/docs/misc/arm/device-tree/booting.txt
> index 3879340b5e0a..32a0e508c471 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -193,6 +193,22 @@ with the following properties:
> Optional. Handle to a xen,cpupool device tree node that identifies the
> cpupool where the guest will be started at boot.
> 
> +- sve
> +
> +Optional. The `sve` property enables Arm SVE usage for the domain and 
> sets
> +the maximum SVE vector length, the option is applicable only to AArch64
> +guests.
> +A value equal to 0 disables the feature, this is the default value.
> +Specifying this property with no value, means that the SVE vector length
> +will be set equal to the maximum vector length supported by the platform.
> +Values above 0 explicitly set the maximum SVE vector length for the 
> domain,
> +allowed values are from 128 to maximum 2048, being multiple of 128.
> +Please note that when the user explicitly specifies the value, if that 
> value
> +is above the hardware supported maximum SVE vector length, the domain
> +creation will fail and the system will stop, the same will occur if the
> +option is provided with a non zero value, but the platform doesn't 
> support
> +SVE.
> +
> - xen,enhanced
> 
> A string property. Possible property values are:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 9202a96d9c28..ba4fe9e165ee 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -4008,6 +4008,34 @@ void __init create_domUs(void)
> d_cfg.max_maptrack_frames = val;
> }
> 
> +if ( dt_get_property(node, "sve", ) )
> +{
> +#ifdef CONFIG_ARM64_SVE
> +unsigned int sve_vl_bits;
> +bool ret = false;
> +
> +if ( !val )
> +{
> +/* Property found with no value, means max HW VL supported */
> +ret = sve_domctl_vl_param(-1, _vl_bits);
> +}
> +else
> +{
> +if ( dt_property_read_u32(node, "sve", ) )
> +ret = sve_domctl_vl_param(val, _vl_bits);
> +else
> +panic("Error reading 'sve' property");
> +}
> +
> +if ( ret )
> +d_cfg.arch.sve_vl = sve_encode_vl(sve_vl_bits);
> +else
> +panic("SVE vector length error\n");
> +#else
> +panic("'sve' property found, but CONFIG_ARM64_SVE not selected");
> +#endif
> +}
> +
> /*
>  * The variable max_init_domid is initialized with zero, so here it's
>  * very important to use the pre-increment operator to call
> -- 
> 2.34.1
> 




[libvirt test] 180924: tolerable all pass - PUSHED

2023-05-24 Thread osstest service owner
flight 180924 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180924/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 180910
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 180910
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 180910
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 15 saverestore-support-checkfail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass

version targeted for testing:
 libvirt  44a0f2f0c8ff5e78c238013ed297b8fce223ac5a
baseline version:
 libvirt  3b6d69237f0a07bf8d9807cd68a387f8d42b076f

Last test of basis   180910  2023-05-23 04:18:51 Z1 days
Testing same since   180924  2023-05-24 04:20:20 Z0 days1 attempts


People who touched revisions under test:
  Boris Fiuczynski 
  Jiri Denemark 
  Martin Kletzander 
  Michal Privoznik 
  Tim Wiederhake 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-qcow2   pass
 test-arm64-arm64-libvirt-raw pass
 test-armhf-armhf-libvirt-raw pass
 test-amd64-i386-libvirt-raw  pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at

Re: [PATCH v2 10/10] x86/spec-ctrl: Remove opencoded MSR_ARCH_CAPS check

2023-05-24 Thread Jan Beulich
On 24.05.2023 13:25, Andrew Cooper wrote:
> MSR_ARCH_CAPS data is now included in featureset information.  Replace
> opencoded checks with regular feature ones.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 





Re: [PATCH v2 06/10] x86/boot: Expose MSR_ARCH_CAPS data in guest max policies

2023-05-24 Thread Jan Beulich
On 24.05.2023 13:25, Andrew Cooper wrote:
> We already have common and default feature adjustment helpers.  Introduce one
> for max featuresets too.
> 
> Offer MSR_ARCH_CAPS unconditionally in the max policy, and stop clobbering the
> data inherited from the Host policy.  This will be necessary to level a VM
> safely for migration.  Annotate the ARCH_CAPS CPUID bit as special.  Note:
> ARCH_CAPS is still max-only for now, so will not be inhereted by the default
> policies.
> 
> With this done, the special case for dom0 can be shrunk to just resampling the
> Host policy (as ARCH_CAPS isn't visible by default yet).
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 





Re: [PATCH v2 04/10] x86/cpu-policy: MSR_ARCH_CAPS feature names

2023-05-24 Thread Jan Beulich
On 24.05.2023 13:25, Andrew Cooper wrote:
> Seed the default visibility from the dom0 special case, which for the most
> part just exposes the *_NO bits.  EIBRS is the one non-*_NO bit, which is
> "just" a status bit to the guest indicating a change in implemention of IBRS
> which is already fully supported.
> 
> Insert a block dependency from the ARCH_CAPS CPUID bit to the entire content
> of the MSR.  This is because MSRs have no structure information similar to
> CPUID, and used by x86_cpu_policy_clear_out_of_range_leaves(), in order to
> bulk-clear inaccessable words.
> 
> The overall CPUID bit is still max-only, so all of MSR_ARCH_CAPS is hidden in
> the default policies.
> 
> Signed-off-by: Andrew Cooper 

Reviewed-by: Jan Beulich 





Re: [PATCH v2 03/10] x86/cpu-policy: Infrastructure for MSR_ARCH_CAPS

2023-05-24 Thread Jan Beulich
On 24.05.2023 13:25, Andrew Cooper wrote:
> Bits through 24 are already defined, meaning that we're not far off needing
> the second word.  Put both in right away.
> 
> As both halves are present now, the arch_caps field is full width.  Adjust the
> unit test, which notices.
> 
> The bool bitfield names in the arch_caps union are unused, and somewhat out of
> date.  They'll shortly be automatically generated.
> 
> Add CPUID and MSR prefixes to the ./xen-cpuid verbose output, now that there
> are a mix of the two.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 
albeit ...

> --- a/tools/misc/xen-cpuid.c
> +++ b/tools/misc/xen-cpuid.c
> @@ -226,31 +226,41 @@ static const char *const str_7d2[32] =
>  [ 4] = "bhi-ctrl",  [ 5] = "mcdt-no",
>  };
>  
> +static const char *const str_10Al[32] =
> +{
> +};
> +
> +static const char *const str_10Ah[32] =
> +{
> +};
> +
>  static const struct {
>  const char *name;
>  const char *abbr;
>  const char *const *strs;
>  } decodes[] =
>  {
> -{ "0x0001.edx",   "1d",  str_1d },
> -{ "0x0001.ecx",   "1c",  str_1c },
> -{ "0x8001.edx",   "e1d", str_e1d },
> -{ "0x8001.ecx",   "e1c", str_e1c },
> -{ "0x000d:1.eax", "Da1", str_Da1 },
> -{ "0x0007:0.ebx", "7b0", str_7b0 },
> -{ "0x0007:0.ecx", "7c0", str_7c0 },
> -{ "0x8007.edx",   "e7d", str_e7d },
> -{ "0x8008.ebx",   "e8b", str_e8b },
> -{ "0x0007:0.edx", "7d0", str_7d0 },
> -{ "0x0007:1.eax", "7a1", str_7a1 },
> -{ "0x8021.eax",  "e21a", str_e21a },
> -{ "0x0007:1.ebx", "7b1", str_7b1 },
> -{ "0x0007:2.edx", "7d2", str_7d2 },
> -{ "0x0007:1.ecx", "7c1", str_7c1 },
> -{ "0x0007:1.edx", "7d1", str_7d1 },
> +{ "CPUID 0x0001.edx","1d", str_1d },
> +{ "CPUID 0x0001.ecx","1c", str_1c },
> +{ "CPUID 0x8001.edx",   "e1d", str_e1d },
> +{ "CPUID 0x8001.ecx",   "e1c", str_e1c },
> +{ "CPUID 0x000d:1.eax", "Da1", str_Da1 },
> +{ "CPUID 0x0007:0.ebx", "7b0", str_7b0 },
> +{ "CPUID 0x0007:0.ecx", "7c0", str_7c0 },
> +{ "CPUID 0x8007.edx",   "e7d", str_e7d },
> +{ "CPUID 0x8008.ebx",   "e8b", str_e8b },
> +{ "CPUID 0x0007:0.edx", "7d0", str_7d0 },
> +{ "CPUID 0x0007:1.eax", "7a1", str_7a1 },
> +{ "CPUID 0x8021.eax",  "e21a", str_e21a },
> +{ "CPUID 0x0007:1.ebx", "7b1", str_7b1 },
> +{ "CPUID 0x0007:2.edx", "7d2", str_7d2 },
> +{ "CPUID 0x0007:1.ecx", "7c1", str_7c1 },
> +{ "CPUID 0x0007:1.edx", "7d1", str_7d1 },

... I'm not really happy about this added verbosity. In a tool of this
name, I think it's pretty clear that unadorned names are CPUID stuff.

> +{ "MSR   0x010a.lo",  "m10Al", str_10Al },
> +{ "MSR   0x010a.hi",  "m10Ah", str_10Ah },

Once we gain a few more MSRs, I'm afraid the raw numbers aren't going
to be very useful. As vaguely suggested before, how about

{ "MSR_ARCH_CAPS.lo",  "m10Al", str_10Al },
{ "MSR_ARCH_CAPS.hi",  "m10Ah", str_10Ah },

?

Jan



Re: [PATCH v2] x86/iommu: adjust type in arch_iommu_hwdom_init()

2023-05-24 Thread Jan Beulich
On 24.05.2023 16:30, Roger Pau Monne wrote:
> The 'i' iterator index stores a PDX, not a PFN, and hence the initial
> assignation of start (which stores a PFN) needs a conversion from PFN
> to PDX.
> 
> This is harmless currently, as the PDX compression skips the bottom
> MAX_ORDER bits which cover the low 1MB, but still do the conversion
> from PDX to PFN for type correctness.
> 
> Signed-off-by: Roger Pau Monné 

Acked-by: Jan Beulich 





Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Jan Beulich
On 24.05.2023 16:22, Roger Pau Monné wrote:
> On Wed, May 24, 2023 at 03:45:58PM +0200, Jan Beulich wrote:
>> Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
>> console) are associated with DomXEN, not Dom0. This means that while
>> looking for overlapping BARs such devices cannot be found on Dom0's list
>> of devices; DomXEN's list also needs to be scanned.
>>
>> Suppress vPCI init altogether for r/o devices (which constitute a subset
>> of hidden ones).
>>
>> Signed-off-by: Jan Beulich 
>> ---
>> RFC: The modify_bars() change is intentionally mis-formatted, as the
>>  necessary re-indentation would make the diff difficult to read. At
>>  this point I'd merely like to gather input towards possible better
>>  approaches to solve the issue (not the least because quite possibly
>>  there are further places needing changing).
> 
> I think we should also handle the case of !pdev->vpci for the hardware
> doamin, as it's allowed for the vpci_add_handlers() call in
> setup_one_hwdom_device() to fail and the device wold still be assigned
> to the hardware domain.
> 
> I can submit that as a separate bugfix, as it's already an issue
> without taking r/o or hidden devices into account.

Yeah, I think that wants dealing with separately. I'm not actually sure
though that "is allowed to fail" is proper behavior ...

But anyway - I take this as you agreeing to go that route, which is the
prereq for me to actually make a well-formed patch. Please shout soon
if that's a misunderstanding of mine.

>> RFC: Whether to actually suppress vPCI init is up for debate; adding the
>>  extra logic is following Roger's suggestion (I'm not convinced it is
>>  useful to have). If we want to keep that, a 2nd question would be
>>  whether to keep it in vpci_add_handlers(): Both of its callers (can)
>>  have a struct pci_seg readily available, so checking ->ro_map at the
>>  call sites would be easier.
> 
> But that would duplicate the logic into the callers, which doesn't
> seem very nice to me, and makes it easier to make mistakes if further
> callers are added and r/o is not checked there.

Right, hence why I didn't do it the alternative way from the beginning.
Both approaches have a pro and a con.

But prior to answering the 2nd question, what about the 1st one? Is it
really worth to have the extra logic?

>> RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
>>  modify_bars() to consistently respect BARs of hidden devices while
>>  setting up "normal" ones (i.e. to avoid as much as possible the
>>  "continue" path introduced here), setting up of the former may want
>>  doing first.
> 
> But BARs of hidden devices should be mapped into dom0 physmap?

Yes.

>  And
> hence doing those before or after normal devices will lead to the same
> result.  The loop in modify_bars() is there to avoid attempting to map
> the same range twice, or to unmap a range while there are devices
> still using it, but the unmap is never done during initial device
> setup.

Okay, so maybe indeed there's no effect on the final result. Yet still
the anomaly bothered me when seeing it in the logs (it actually mislead
me initially in my conclusions as to what was actually going on), when
I added a printk() to that new "continue" path. We would skip hidden
devices up until them getting initialized themselves. There would be
less skipping if all (there aren't going to be many) DomXEN devices
were initialized first.

>> RFC: vpci_write()'s check of the r/o map may want moving out of mainline
>>  code, into the case dealing with !pdev->vpci.
> 
> Indeed.

Will extend the patch accordingly then

>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -218,6 +218,7 @@ static int modify_bars(const struct pci_
>>  struct vpci_header *header = >vpci->header;
>>  struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>>  struct pci_dev *tmp, *dev = NULL;
>> +const struct domain *d;
>>  const struct vpci_msix *msix = pdev->vpci->msix;
>>  unsigned int i;
>>  int rc;
>> @@ -285,9 +286,11 @@ static int modify_bars(const struct pci_
>>  
>>  /*
>>   * Check for overlaps with other BARs. Note that only BARs that are
>> - * currently mapped (enabled) are checked for overlaps.
>> + * currently mapped (enabled) are checked for overlaps. Note also that
>> + * for Dom0 we also need to include hidden, i.e. DomXEN's, devices.
>>   */
>> -for_each_pdev ( pdev->domain, tmp )
>> +for ( d = pdev->domain; ; d = dom_xen ) {//todo
>> +for_each_pdev ( d, tmp )
>>  {
>>  if ( tmp == pdev )
>>  {
>> @@ -304,6 +307,7 @@ static int modify_bars(const struct pci_
>>   */
>>  continue;
>>  }
>> +if ( !tmp->vpci ) { ASSERT(d == dom_xen); continue; }//todo
>>  
>>  for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
>>  {
>> @@ -330,6 +334,7 @@ static int 

[ovmf test] 180928: all pass - PUSHED

2023-05-24 Thread osstest service owner
flight 180928 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180928/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf ba91d0292e593df8528b66f99c1b0b14fadc8e16
baseline version:
 ovmf 5ce29ae84db340244c3c3299f84713a88dec5171

Last test of basis   180908  2023-05-23 01:12:21 Z1 days
Testing same since   180928  2023-05-24 13:10:44 Z0 days1 attempts


People who touched revisions under test:
  Wendy Liao 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   5ce29ae84d..ba91d0292e  ba91d0292e593df8528b66f99c1b0b14fadc8e16 -> 
xen-tested-master



[PATCH v2] x86/iommu: adjust type in arch_iommu_hwdom_init()

2023-05-24 Thread Roger Pau Monne
The 'i' iterator index stores a PDX, not a PFN, and hence the initial
assignation of start (which stores a PFN) needs a conversion from PFN
to PDX.

This is harmless currently, as the PDX compression skips the bottom
MAX_ORDER bits which cover the low 1MB, but still do the conversion
from PDX to PFN for type correctness.

Signed-off-by: Roger Pau Monné 
---
Changes since v1:
 - Soften the description as it's not an error.
---
 xen/drivers/passthrough/x86/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/passthrough/x86/iommu.c 
b/xen/drivers/passthrough/x86/iommu.c
index cb0788960a08..6bc79e7ec843 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -406,7 +406,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
  */
 start = paging_mode_translate(d) ? PFN_DOWN(MB(1)) : 0;
 
-for ( i = start, count = 0; i < top; )
+for ( i = pfn_to_pdx(start), count = 0; i < top; )
 {
 unsigned long pfn = pdx_to_pfn(i);
 unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn);
-- 
2.40.0




Re: [PATCH v7 01/12] xen/arm: enable SVE extension for Xen

2023-05-24 Thread Bertrand Marquis
Hi Julien,

> On 24 May 2023, at 11:58, Julien Grall  wrote:
> 
> Hi,
> 
> On 24/05/2023 10:01, Bertrand Marquis wrote:
>>> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
>>> index c4ec38bb2554..83b84368f6d5 100644
>>> --- a/xen/arch/arm/cpufeature.c
>>> +++ b/xen/arch/arm/cpufeature.c
>>> @@ -9,6 +9,7 @@
>>> #include 
>>> #include 
>>> #include 
>>> +#include 
>>> #include 
>>> 
>>> DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
>>> @@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)
>>> 
>>> c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);
>>> 
>>> +if ( cpu_has_sve )
>>> +c->zcr64.bits[0] = compute_max_zcr();
>>> +
>>> c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>>> 
>>> c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
>>> @@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
>>> guest_cpuinfo.pfr64.mpam = 0;
>>> guest_cpuinfo.pfr64.mpam_frac = 0;
>>> 
>>> -/* Hide SVE as Xen does not support it */
>>> +/* Hide SVE by default to the guests */
>> Everything is for guests and as Jan mentioned in an other comment
>> this could be wrongly interpreted.
> 
> (Not directly related to this patch, so no changes expected here)
> 
> Hmmm... The name of the function/variable is confusing as well given that the 
> cpuinfo should also apply to dom0. Should we s/guest/domain/?

Would make sense to do some kind of coherency check here to use domain whenever 
something is for dom0 or guest.
So yes that would be a good idea and I can add this to my todolist (after SVE 
is merged to prevent conflicts).

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall




Re: [PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Roger Pau Monné
On Wed, May 24, 2023 at 03:45:58PM +0200, Jan Beulich wrote:
> Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
> console) are associated with DomXEN, not Dom0. This means that while
> looking for overlapping BARs such devices cannot be found on Dom0's list
> of devices; DomXEN's list also needs to be scanned.
> 
> Suppress vPCI init altogether for r/o devices (which constitute a subset
> of hidden ones).
> 
> Signed-off-by: Jan Beulich 
> ---
> RFC: The modify_bars() change is intentionally mis-formatted, as the
>  necessary re-indentation would make the diff difficult to read. At
>  this point I'd merely like to gather input towards possible better
>  approaches to solve the issue (not the least because quite possibly
>  there are further places needing changing).

I think we should also handle the case of !pdev->vpci for the hardware
doamin, as it's allowed for the vpci_add_handlers() call in
setup_one_hwdom_device() to fail and the device wold still be assigned
to the hardware domain.

I can submit that as a separate bugfix, as it's already an issue
without taking r/o or hidden devices into account.

> 
> RFC: Whether to actually suppress vPCI init is up for debate; adding the
>  extra logic is following Roger's suggestion (I'm not convinced it is
>  useful to have). If we want to keep that, a 2nd question would be
>  whether to keep it in vpci_add_handlers(): Both of its callers (can)
>  have a struct pci_seg readily available, so checking ->ro_map at the
>  call sites would be easier.

But that would duplicate the logic into the callers, which doesn't
seem very nice to me, and makes it easier to make mistakes if further
callers are added and r/o is not checked there.

> RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
>  modify_bars() to consistently respect BARs of hidden devices while
>  setting up "normal" ones (i.e. to avoid as much as possible the
>  "continue" path introduced here), setting up of the former may want
>  doing first.

But BARs of hidden devices should be mapped into dom0 physmap?  And
hence doing those before or after normal devices will lead to the same
result.  The loop in modify_bars() is there to avoid attempting to map
the same range twice, or to unmap a range while there are devices
still using it, but the unmap is never done during initial device
setup.

> 
> RFC: vpci_write()'s check of the r/o map may want moving out of mainline
>  code, into the case dealing with !pdev->vpci.

Indeed.

> ---
> v2: Extend existing comment. Relax assertion. Don't initialize vPCI for
> r/o devices.
> 
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -218,6 +218,7 @@ static int modify_bars(const struct pci_
>  struct vpci_header *header = >vpci->header;
>  struct rangeset *mem = rangeset_new(NULL, NULL, 0);
>  struct pci_dev *tmp, *dev = NULL;
> +const struct domain *d;
>  const struct vpci_msix *msix = pdev->vpci->msix;
>  unsigned int i;
>  int rc;
> @@ -285,9 +286,11 @@ static int modify_bars(const struct pci_
>  
>  /*
>   * Check for overlaps with other BARs. Note that only BARs that are
> - * currently mapped (enabled) are checked for overlaps.
> + * currently mapped (enabled) are checked for overlaps. Note also that
> + * for Dom0 we also need to include hidden, i.e. DomXEN's, devices.
>   */
> -for_each_pdev ( pdev->domain, tmp )
> +for ( d = pdev->domain; ; d = dom_xen ) {//todo
> +for_each_pdev ( d, tmp )
>  {
>  if ( tmp == pdev )
>  {
> @@ -304,6 +307,7 @@ static int modify_bars(const struct pci_
>   */
>  continue;
>  }
> +if ( !tmp->vpci ) { ASSERT(d == dom_xen); continue; }//todo
>  
>  for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
>  {
> @@ -330,6 +334,7 @@ static int modify_bars(const struct pci_
>  }
>  }
>  }
> +if ( !is_hardware_domain(d) ) break; }//todo

AFAICT in order to support hidden devices there's one bit missing in
vpci_{write,read}(): the call to pci_get_pdev() should be also done
against dom_xen when handling accesses originated from the hardware
domain.

One further question is whether we want to map BARs of r/o devices
into the hardware domain physmap.  Not sure that's very helpful, as
dom0 won't be allowed to modify any of the config space bits of those
devices, so even attempts to size the BARs will fail.  I wonder what
kind of issues this can cause to dom0 OSes.

Thanks, Roger.



Re: [XEN v7 07/11] xen/arm: p2m: Use the pa_range_info table to support ARM_32 and ARM_64

2023-05-24 Thread Ayan Kumar Halder



On 19/05/2023 09:54, Michal Orzel wrote:

Hi Ayan,

Hi Michal,


On 18/05/2023 16:39, Ayan Kumar Halder wrote:

Restructure the code so that one can use pa_range_info[] table for both
ARM_32 as well as ARM_64.

Also, removed the hardcoding for P2M_ROOT_ORDER and P2M_ROOT_LEVEL as
p2m_root_order can be obtained from the pa_range_info[].root_order and
p2m_root_level can be obtained from pa_range_info[].sl0.

Refer ARM DDI 0406C.d ID040418, B3-1345,
"Use of concatenated first-level translation tables

...However, a 40-bit input address range with a translation granularity of 4KB
requires a total of 28 bits of address resolution. Therefore, a stage 2
translation that supports a 40-bit input address range requires two concatenated
first-level translation tables,..."

Thus, root-order is 1 for 40-bit IPA on ARM_32.

Refer ARM DDI 0406C.d ID040418, B3-1348,

"Determining the required first lookup level for stage 2 translations

For a stage 2 translation, the output address range from the stage 1
translations determines the required input address range for the stage 2
translation. The permitted values of VTCR.SL0 are:

0b00 Stage 2 translation lookup must start at the second level.
0b01 Stage 2 translation lookup must start at the first level.

VTCR.T0SZ must indicate the required input address range. The size of the input
address region is 2^(32-T0SZ) bytes."

Thus VTCR.SL0 = 1 (maximum value) and VTCR.T0SZ = -8 when the size of input
address region is 2^40 bytes.

Thus, pa_range_info[].t0sz = 1 (VTCR.S) | 8 (VTCR.T0SZ) ie 11000b which is 24.

VTCR.T0SZ, is bits [5:0] for ARM_64.
VTCR.T0SZ is bits [3:0] and S(sign extension), bit[4] for ARM_32.

Thus, VTCR.T0SZ bits are interpreted accordingly for different architecture.
For this, we have used union.

pa_range_info[] is indexed by ID_AA64MMFR0_EL1.PARange which is present in Arm64
only. This is the reason we do not specify the indices for ARM_32. Also, we
duplicated the entry "{ 40,  24/*24*/,  1,  1 }" between ARM_64 and
ARM_32. This is done to avoid introducing extra #if-defs.

Signed-off-by: Ayan Kumar Halder 
---
Changes from -

v3 - 1. New patch introduced in v4.
2. Restructure the code such that pa_range_info[] is used both by ARM_32 as
well as ARM_64.

v4 - 1. Removed the hardcoded definitions of P2M_ROOT_ORDER and P2M_ROOT_LEVEL.
The reason being root_order will not be always 1 (See the next patch).
2. Updated the commit message to explain t0sz, sl0 and root_order values for
32-bit IPA on Arm32.
3. Some sanity fixes.

v5 - pa_range_info is indexed by system_cpuinfo.mm64.pa_range. ie
when PARange is 0, the PA size is 32, 1 -> 36 and so on. So pa_range_info[] has
been updated accordingly.
For ARM_32 pa_range_info[0] = 0 and pa_range_info[1] = 0 as we do not support
32-bit, 36-bit physical address range yet.

v6 - 1. Added pa_range_info[] entries for ARM_32 without indices. Some entry
may be duplicated between ARM_64 and ARM_32.
2. Recalculate p2m_ipa_bits for ARM_32 from T0SZ (similar to ARM_64).
3. Introduced an union to reinterpret T0SZ bits between ARM_32 and ARM_64.

  xen/arch/arm/include/asm/p2m.h |  6 --
  xen/arch/arm/p2m.c | 37 +++---
  2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/xen/arch/arm/include/asm/p2m.h b/xen/arch/arm/include/asm/p2m.h
index f67e9ddc72..940495d42b 100644
--- a/xen/arch/arm/include/asm/p2m.h
+++ b/xen/arch/arm/include/asm/p2m.h
@@ -14,16 +14,10 @@
  /* Holds the bit size of IPAs in p2m tables.  */
  extern unsigned int p2m_ipa_bits;
  
-#ifdef CONFIG_ARM_64

  extern unsigned int p2m_root_order;
  extern unsigned int p2m_root_level;
  #define P2M_ROOT_ORDERp2m_root_order
  #define P2M_ROOT_LEVEL p2m_root_level
-#else
-/* First level P2M is always 2 consecutive pages */
-#define P2M_ROOT_ORDER1
-#define P2M_ROOT_LEVEL 1
-#endif
  
  struct domain;
  
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c

index 418997843d..755cb86c5b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -19,9 +19,9 @@
  
  #define INVALID_VMID 0 /* VMID 0 is reserved */
  
-#ifdef CONFIG_ARM_64

  unsigned int __read_mostly p2m_root_order;
  unsigned int __read_mostly p2m_root_level;
+#ifdef CONFIG_ARM_64
  static unsigned int __read_mostly max_vmid = MAX_VMID_8_BIT;
  /* VMID is by default 8 bit width on AArch64 */
  #define MAX_VMID   max_vmid
@@ -2247,16 +2247,6 @@ void __init setup_virt_paging(void)
  /* Setup Stage 2 address translation */
  register_t val = VTCR_RES1|VTCR_SH0_IS|VTCR_ORGN0_WBWA|VTCR_IRGN0_WBWA;
  
-#ifdef CONFIG_ARM_32

-if ( p2m_ipa_bits < 40 )
-panic("P2M: Not able to support %u-bit IPA at the moment\n",
-  p2m_ipa_bits);
-
-printk("P2M: 40-bit IPA\n");
-p2m_ipa_bits = 40;
-val |= VTCR_T0SZ(0x18); /* 40 bit IPA */
-val |= VTCR_SL0(0x1); /* P2M starts at first level */
-#else /* CONFIG_ARM_64 */
  static const struct {
  unsigned int pabits; /* Physical Address Size 

Re: [XEN PATCH 03/15] build, x86: clean build log for boot/ targets

2023-05-24 Thread Jan Beulich
On 23.05.2023 18:37, Anthony PERARD wrote:
> We are adding %.lnk to .PRECIOUS or make treat them as intermediate
> targets and remove them.

What's wrong with them getting removed? Note also that's no different from
today, so it's an orthogonal change in any event (unless I'm overlooking
something). Plus if such behavior was intended, shouldn't $(targets) be
made a prereq of .PRECIOUS in generic makefile logic?

> Signed-off-by: Anthony PERARD 
> ---
>  xen/arch/x86/boot/Makefile | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index 03d8ce3a9e..2693b938bd 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -5,6 +5,8 @@ head-bin-objs := cmdline.o reloc.o
>  nocov-y   += $(head-bin-objs)
>  noubsan-y += $(head-bin-objs)
>  targets   += $(head-bin-objs)
> +targets   += $(head-bin-objs:.o=.bin)
> +targets   += $(head-bin-objs:.o=.lnk)

Leaving aside the question of whether .lnk really should be part
of $(targets), don't these two lines also ...

> @@ -26,10 +28,16 @@ $(head-bin-objs): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
>  LDFLAGS_DIRECT-$(call ld-option,--warn-rwx-segments) := 
> --no-warn-rwx-segments
>  LDFLAGS_DIRECT += $(LDFLAGS_DIRECT-y)
>  
> -%.bin: %.lnk
> - $(OBJCOPY) -j .text -O binary $< $@
> +%.bin: OBJCOPYFLAGS := -j .text -O binary
> +%.bin: %.lnk FORCE
> + $(call if_changed,objcopy)
>  
> -%.lnk: %.o $(src)/build32.lds
> - $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT)) -N -T $(filter %.lds,$^) 
> -o $@ $<
> +quiet_cmd_ld_lnk_o = LD  $@
> +cmd_ld_lnk_o = $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT)) -N -T $(filter 
> %.lds,$^) -o $@ $<
> +
> +%.lnk: %.o $(src)/build32.lds FORCE
> + $(call if_changed,ld_lnk_o)
>  
>  clean-files := *.lnk *.bin

... eliminate the need for this?

Jan

> +
> +.PRECIOUS: %.lnk




Re: [XEN PATCH 02/15] build: rework asm-offsets.* build step to use kbuild

2023-05-24 Thread Jan Beulich
On 23.05.2023 18:37, Anthony PERARD wrote:
> Use $(if_changed_dep, ) macro to generate "asm-offsets.s" and remove
> the use of $(move-if-changes,). That mean that "asm-offset.s" will be
> changed even when the output doesn't change.
> 
> But "asm-offsets.s" is only used to generated "asm-offsets.h". So
> instead of regenerating "asm-offsets.h" every time "asm-offsets.s"
> change, we will use "$(filechk, )" to only update the ".h" when the
> output change. Also, with "$(filechk, )", the file does get
> regenerated when the rule change in the makefile.
> 
> This changes also result in a cleaner build log.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Instead of having a special $(cmd_asm-offsets.s) command, we could
> probably reuse $(cmd_cc_s_c) from Rules.mk, but that would mean that
> an hypothetical additional flags "-flto" in CFLAGS would not be
> removed anymore, not sure if that matter here.

There's no real code being generated there, and what we're after are
merely the special .ascii directives. So the presence of -flto should
have no effect there, and hence it would even look more consistent to
me if we didn't use different options (and even rules) for .c -> .s
transformations.

> But then we could write this:
> 
> targets += arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s
> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s: CFLAGS-y += -g0
> arch/$(TARGET_ARCH)/include/asm/asm-offsets.h: 
> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s FORCE
> 
> instead of having to write a rule for asm-offsets.s

Ftaod, I'd be happy to ack the patch as it is, but I would favor if
you could do the rework / simplification as outlined.

Jan



Re: [PATCH v2 05/10] x86/boot: Record MSR_ARCH_CAPS for the Raw and Host CPU policy

2023-05-24 Thread Andrew Cooper
On 24/05/2023 12:25 pm, Andrew Cooper wrote:
> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
> index 9bbb385db42d..f1084bb1ed36 100644
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -477,6 +477,11 @@ static void generic_identify(struct cpuinfo_x86 *c)
>   cpuid_count(0xd, 1,
>   >x86_capability[FEATURESET_Da1],
>   , , );
> +
> + if (test_bit(X86_FEATURE_ARCH_CAPS, c->x86_capability))
> + rdmsr(MSR_ARCH_CAPABILITIES,
> +   c->x86_capability[FEATURESET_10Al],
> +   c->x86_capability[FEATURESET_10Ah]);

I managed to send out a stale version.  I've corrected this and the
other instances locally.

~Andrew



Re: PVH Dom0 related UART failure

2023-05-24 Thread Jan Beulich
On 24.05.2023 03:13, Stefano Stabellini wrote:
> On Tue, 23 May 2023, Jan Beulich wrote:
>> On 23.05.2023 00:20, Stefano Stabellini wrote:
>>> On Sat, 20 May 2023, Roger Pau Monné wrote:
 diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
 index ec2e978a4e6b..0ff8e940fa8d 100644
 --- a/xen/drivers/vpci/header.c
 +++ b/xen/drivers/vpci/header.c
 @@ -289,6 +289,13 @@ static int modify_bars(const struct pci_dev *pdev, 
 uint16_t cmd, bool rom_only)
   */
  for_each_pdev ( pdev->domain, tmp )
  {
 +if ( !tmp->vpci )
 +{
 +printk(XENLOG_G_WARNING "%pp: not handled by vPCI for %pd\n",
 +   >sbdf, pdev->domain);
 +continue;
 +}
 +
  if ( tmp == pdev )
  {
  /*
 diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
 index 652807a4a454..0baef3a8d3a1 100644
 --- a/xen/drivers/vpci/vpci.c
 +++ b/xen/drivers/vpci/vpci.c
 @@ -72,7 +72,12 @@ int vpci_add_handlers(struct pci_dev *pdev)
  unsigned int i;
  int rc = 0;
  
 -if ( !has_vpci(pdev->domain) )
 +if ( !has_vpci(pdev->domain) ||
 + /*
 +  * Ignore RO and hidden devices, those are in use by Xen and vPCI
 +  * won't work on them.
 +  */
 + pci_get_pdev(dom_xen, pdev->sbdf) )
  return 0;
  
  /* We should not get here twice for the same device. */
>>>
>>>
>>> Now this patch works! Thank you!! :-)
>>>
>>> You can check the full logs here
>>> https://gitlab.com/xen-project/people/sstabellini/xen/-/jobs/4329259080
>>>
>>> Is the patch ready to be upstreamed aside from the commit message?
>>
>> I don't think so. vPCI ought to work on "r/o" devices. Out of curiosity,
>> have you also tried my (hackish and hence RFC) patch [1]?
>>
>> [1] https://lists.xen.org/archives/html/xen-devel/2021-08/msg01489.html
> 
> I don't know the code well enough to discuss what is the best solution.
> I'll let you and Roger figure it out. I would only kindly request to
> solve this in few days so that we can enable the real hardware PVH test
> in gitlab-ci as soon as possible. I think it is critical as it will
> allow us to catch many real issues going forward.
> 
> For sure I can test your patch. BTW it is also really easy for you to do
> it your simply by pushing a branch to a repo on gitlab-ci and watch for
> the results. If you are interested let me know I can give you a
> tutorial, you just need to create a repo, and register the gitlab runner
> and voila'.
> 
> This is the outcome:
> 
> https://gitlab.com/xen-project/people/sstabellini/xen/-/pipelines/876808194
> 
> 
> (XEN) PCI add device :00:00.0
> (XEN) PCI add device :00:00.2
> (XEN) PCI add device :00:01.0
> (XEN) PCI add device :00:02.0
> (XEN) Assertion 'd == dom_xen && system_state < SYS_STATE_active' failed at 
> drivers/vpci/header.c:313

I've sent an updated RFC patch, integrating a variant of Roger's patch
at the same time.

Jan



[PATCH RFC v2] vPCI: account for hidden devices

2023-05-24 Thread Jan Beulich
Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
console) are associated with DomXEN, not Dom0. This means that while
looking for overlapping BARs such devices cannot be found on Dom0's list
of devices; DomXEN's list also needs to be scanned.

Suppress vPCI init altogether for r/o devices (which constitute a subset
of hidden ones).

Signed-off-by: Jan Beulich 
---
RFC: The modify_bars() change is intentionally mis-formatted, as the
 necessary re-indentation would make the diff difficult to read. At
 this point I'd merely like to gather input towards possible better
 approaches to solve the issue (not the least because quite possibly
 there are further places needing changing).

RFC: Whether to actually suppress vPCI init is up for debate; adding the
 extra logic is following Roger's suggestion (I'm not convinced it is
 useful to have). If we want to keep that, a 2nd question would be
 whether to keep it in vpci_add_handlers(): Both of its callers (can)
 have a struct pci_seg readily available, so checking ->ro_map at the
 call sites would be easier.

RFC: _setup_hwdom_pci_devices()' loop may want splitting: For
 modify_bars() to consistently respect BARs of hidden devices while
 setting up "normal" ones (i.e. to avoid as much as possible the
 "continue" path introduced here), setting up of the former may want
 doing first.

RFC: vpci_write()'s check of the r/o map may want moving out of mainline
 code, into the case dealing with !pdev->vpci.
---
v2: Extend existing comment. Relax assertion. Don't initialize vPCI for
r/o devices.

--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -218,6 +218,7 @@ static int modify_bars(const struct pci_
 struct vpci_header *header = >vpci->header;
 struct rangeset *mem = rangeset_new(NULL, NULL, 0);
 struct pci_dev *tmp, *dev = NULL;
+const struct domain *d;
 const struct vpci_msix *msix = pdev->vpci->msix;
 unsigned int i;
 int rc;
@@ -285,9 +286,11 @@ static int modify_bars(const struct pci_
 
 /*
  * Check for overlaps with other BARs. Note that only BARs that are
- * currently mapped (enabled) are checked for overlaps.
+ * currently mapped (enabled) are checked for overlaps. Note also that
+ * for Dom0 we also need to include hidden, i.e. DomXEN's, devices.
  */
-for_each_pdev ( pdev->domain, tmp )
+for ( d = pdev->domain; ; d = dom_xen ) {//todo
+for_each_pdev ( d, tmp )
 {
 if ( tmp == pdev )
 {
@@ -304,6 +307,7 @@ static int modify_bars(const struct pci_
  */
 continue;
 }
+if ( !tmp->vpci ) { ASSERT(d == dom_xen); continue; }//todo
 
 for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ )
 {
@@ -330,6 +334,7 @@ static int modify_bars(const struct pci_
 }
 }
 }
+if ( !is_hardware_domain(d) ) break; }//todo
 
 ASSERT(dev);
 
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -70,6 +70,7 @@ void vpci_remove_device(struct pci_dev *
 int vpci_add_handlers(struct pci_dev *pdev)
 {
 unsigned int i;
+const unsigned long *ro_map;
 int rc = 0;
 
 if ( !has_vpci(pdev->domain) )
@@ -78,6 +79,11 @@ int vpci_add_handlers(struct pci_dev *pd
 /* We should not get here twice for the same device. */
 ASSERT(!pdev->vpci);
 
+/* No vPCI for r/o devices. */
+ro_map = pci_get_ro_map(pdev->sbdf.seg);
+if ( ro_map && test_bit(pdev->sbdf.bdf, ro_map) )
+return 0;
+
 pdev->vpci = xzalloc(struct vpci);
 if ( !pdev->vpci )
 return -ENOMEM;



Re: [PATCH] [v3] x86: xen: add missing prototypes

2023-05-24 Thread Juergen Gross

On 23.05.23 22:56, Arnd Bergmann wrote:

From: Arnd Bergmann 

These function are all called from assembler files, or from inline assembler,
so there is no immediate need for a prototype in a header, but if 
-Wmissing-prototypes
is enabled, the compiler warns about them:

arch/x86/xen/efi.c:130:13: error: no previous prototype for 'xen_efi_init' 
[-Werror=missing-prototypes]
arch/x86/platform/pvh/enlighten.c:120:13: error: no previous prototype for 
'xen_prepare_pvh' [-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:358:20: error: no previous prototype for 'xen_pte_val' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:366:20: error: no previous prototype for 'xen_pgd_val' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:372:17: error: no previous prototype for 'xen_make_pte' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:380:17: error: no previous prototype for 'xen_make_pgd' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:387:20: error: no previous prototype for 'xen_pmd_val' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:425:17: error: no previous prototype for 'xen_make_pmd' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:432:20: error: no previous prototype for 'xen_pud_val' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:438:17: error: no previous prototype for 'xen_make_pud' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:522:20: error: no previous prototype for 'xen_p4d_val' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:528:17: error: no previous prototype for 'xen_make_p4d' 
[-Werror=missing-prototypes]
arch/x86/xen/mmu_pv.c:1442:17: error: no previous prototype for 
'xen_make_pte_init' [-Werror=missing-prototypes]


I'd suggest to add the prototypes of the functions defined in
arch/x86/xen/mmu_pv.c to the same source, as they are referenced
nowhere else. This avoids the need of #ifdefs for those functions.


arch/x86/xen/enlighten_pv.c:1233:34: error: no previous prototype for 
'xen_start_kernel' [-Werror=missing-prototypes]
arch/x86/xen/irq.c:22:14: error: no previous prototype for 
'xen_force_evtchn_callback' [-Werror=missing-prototypes]
arch/x86/entry/common.c:302:24: error: no previous prototype for 
'xen_pv_evtchn_do_upcall' [-Werror=missing-prototypes]

Declare all of them in an appropriate header file to avoid the warnings.
For consistency, also move the asm_cpu_bringup_and_idle() declaration out of
smp_pv.c.

Signed-off-by: Arnd Bergmann 
---
v3: move declations of conditional function in an #ifdef with a stub
v2: fix up changelog
---
  arch/x86/xen/efi.c |  2 ++
  arch/x86/xen/smp.h | 11 +++
  arch/x86/xen/smp_pv.c  |  1 -
  arch/x86/xen/xen-ops.h | 26 ++
  include/xen/xen.h  |  3 +++
  5 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/efi.c b/arch/x86/xen/efi.c
index 7d7ffb9c826a..863d0d6b3edc 100644
--- a/arch/x86/xen/efi.c
+++ b/arch/x86/xen/efi.c
@@ -16,6 +16,8 @@
  #include 
  #include 
  
+#include "xen-ops.h"

+
  static efi_char16_t vendor[100] __initdata;
  
  static efi_system_table_t efi_systab_xen __initdata = {

diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index 22fb982ff971..9367502281dc 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -2,6 +2,10 @@
  #ifndef _XEN_SMP_H
  
  #ifdef CONFIG_SMP

+
+void asm_cpu_bringup_and_idle(void);
+asmlinkage void cpu_bringup_and_idle(void);
+
  extern void xen_send_IPI_mask(const struct cpumask *mask,
  int vector);
  extern void xen_send_IPI_mask_allbutself(const struct cpumask *mask,
@@ -29,6 +33,13 @@ struct xen_common_irq {
  };
  #else /* CONFIG_SMP */
  
+static inline void asm_cpu_bringup_and_idle(void)

+{
+}
+static inline void cpu_bringup_and_idle(void)
+{
+}
+


No need for above stubs, as there is no way they can get called
without CONFIG_SMP.


  static inline int xen_smp_intr_init(unsigned int cpu)
  {
return 0;
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index a92e8002b5cf..d5ae5de2daa2 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -55,7 +55,6 @@ static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work) = 
{ .irq = -1 };
  static DEFINE_PER_CPU(struct xen_common_irq, xen_pmu_irq) = { .irq = -1 };
  
  static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);

-void asm_cpu_bringup_and_idle(void);
  
  static void cpu_bringup(void)

  {
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 6d7f6318fc07..eb4cb30570c7 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -146,12 +146,38 @@ int xen_cpuhp_setup(int (*cpu_up_prepare_cb)(unsigned 
int),
  void xen_pin_vcpu(int cpu);
  
  void xen_emergency_restart(void);

+void xen_force_evtchn_callback(void);
+
  #ifdef CONFIG_XEN_PV
  void xen_pv_pre_suspend(void);
  void xen_pv_post_suspend(int suspend_cancelled);
+pteval_t xen_pte_val(pte_t pte);
+pgdval_t xen_pgd_val(pgd_t pgd);
+pmdval_t xen_pmd_val(pmd_t pmd);
+pudval_t xen_pud_val(pud_t 

[xen-unstable test] 180922: tolerable FAIL

2023-05-24 Thread osstest service owner
flight 180922 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180922/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail in 180913 pass in 
180922
 test-amd64-i386-libvirt-raw   7 xen-install  fail in 180913 pass in 180922
 test-amd64-i386-xl-vhd   22 guest-start.2fail in 180913 pass in 180922
 test-amd64-amd64-xl-qemut-debianhvm-amd64 20 guest-start/debianhvm.repeat fail 
pass in 180913

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180913
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180913
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180913
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 180913
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 180913
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 180913
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeatfail  like 180913
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 180913
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 180913
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180913
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180913
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180913
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180913
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   

Re: PVH Dom0 related UART failure

2023-05-24 Thread Jan Beulich
On 24.05.2023 08:14, Jan Beulich wrote:
> On 24.05.2023 03:13, Stefano Stabellini wrote:
>> For sure I can test your patch. BTW it is also really easy for you to do
>> it your simply by pushing a branch to a repo on gitlab-ci and watch for
>> the results. If you are interested let me know I can give you a
>> tutorial, you just need to create a repo, and register the gitlab runner
>> and voila'.
>>
>> This is the outcome:
>>
>> https://gitlab.com/xen-project/people/sstabellini/xen/-/pipelines/876808194
>>
>>
>> (XEN) PCI add device :00:00.0
>> (XEN) PCI add device :00:00.2
>> (XEN) PCI add device :00:01.0
>> (XEN) PCI add device :00:02.0
>> (XEN) Assertion 'd == dom_xen && system_state < SYS_STATE_active' failed at 
>> drivers/vpci/header.c:313
> 
> So this is an assertion my patch adds. The right side of the && may be too
> strict, but it's been too long to recall why exactly I thought the case
> should occur only before Dom0 starts. You may want to retry with that 2nd
> half of the condition dropped.

And indeed it needs dropping. The patch pre-dates 163db6a72b66 ("x86/PVH:
permit more physdevop-s to be used by Dom0"), and despite me being the
author of that one I failed to make the connection. Which quite clearly
indicates some other oddity, because in principle I should be hitting
that same assertion then as well when booting PVH Dom0. Yet I don't, so
I have more to figure out.

Jan



[PATCH v2 10/10] x86/spec-ctrl: Remove opencoded MSR_ARCH_CAPS check

2023-05-24 Thread Andrew Cooper
MSR_ARCH_CAPS data is now included in featureset information.  Replace
opencoded checks with regular feature ones.

No functional change.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/include/asm/cpufeature.h |  7 
 xen/arch/x86/spec_ctrl.c  | 56 +--
 2 files changed, 33 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/include/asm/cpufeature.h 
b/xen/arch/x86/include/asm/cpufeature.h
index 9047ea43f503..50235f098d70 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -183,8 +183,15 @@ static inline bool boot_cpu_has(unsigned int feat)
 #define cpu_has_avx_ne_convert  boot_cpu_has(X86_FEATURE_AVX_NE_CONVERT)
 
 /* MSR_ARCH_CAPS */
+#define cpu_has_rdcl_no boot_cpu_has(X86_FEATURE_RDCL_NO)
+#define cpu_has_eibrs   boot_cpu_has(X86_FEATURE_EIBRS)
+#define cpu_has_rsbaboot_cpu_has(X86_FEATURE_RSBA)
+#define cpu_has_skip_l1dfl  boot_cpu_has(X86_FEATURE_SKIP_L1DFL)
+#define cpu_has_mds_no  boot_cpu_has(X86_FEATURE_MDS_NO)
 #define cpu_has_if_pschange_mc_no boot_cpu_has(X86_FEATURE_IF_PSCHANGE_MC_NO)
 #define cpu_has_tsx_ctrlboot_cpu_has(X86_FEATURE_TSX_CTRL)
+#define cpu_has_taa_no  boot_cpu_has(X86_FEATURE_TAA_NO)
+#define cpu_has_fb_clearboot_cpu_has(X86_FEATURE_FB_CLEAR)
 
 /* Synthesized. */
 #define cpu_has_arch_perfmonboot_cpu_has(X86_FEATURE_ARCH_PERFMON)
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index f81db2143328..50d467f74cf8 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -282,12 +282,10 @@ custom_param("spec-ctrl", parse_spec_ctrl);
 int8_t __read_mostly opt_xpti_hwdom = -1;
 int8_t __read_mostly opt_xpti_domu = -1;
 
-static __init void xpti_init_default(uint64_t caps)
+static __init void xpti_init_default(void)
 {
-if ( boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
-caps = ARCH_CAPS_RDCL_NO;
-
-if ( caps & ARCH_CAPS_RDCL_NO )
+if ( (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
+ cpu_has_rdcl_no )
 {
 if ( opt_xpti_hwdom < 0 )
 opt_xpti_hwdom = 0;
@@ -390,9 +388,10 @@ static int __init cf_check parse_pv_l1tf(const char *s)
 }
 custom_param("pv-l1tf", parse_pv_l1tf);
 
-static void __init print_details(enum ind_thunk thunk, uint64_t caps)
+static void __init print_details(enum ind_thunk thunk)
 {
 unsigned int _7d0 = 0, _7d2 = 0, e8b = 0, max = 0, tmp;
+uint64_t caps = 0;
 
 /* Collect diagnostics about available mitigations. */
 if ( boot_cpu_data.cpuid_level >= 7 )
@@ -401,6 +400,8 @@ static void __init print_details(enum ind_thunk thunk, 
uint64_t caps)
 cpuid_count(7, 2, , , , &_7d2);
 if ( boot_cpu_data.extended_cpuid_level >= 0x8008 )
 cpuid(0x8008, , , , );
+if ( cpu_has_arch_caps )
+rdmsrl(MSR_ARCH_CAPABILITIES, caps);
 
 printk("Speculative mitigation facilities:\n");
 
@@ -578,7 +579,7 @@ static bool __init check_smt_enabled(void)
 }
 
 /* Calculate whether Retpoline is known-safe on this CPU. */
-static bool __init retpoline_safe(uint64_t caps)
+static bool __init retpoline_safe(void)
 {
 unsigned int ucode_rev = this_cpu(cpu_sig).rev;
 
@@ -596,7 +597,7 @@ static bool __init retpoline_safe(uint64_t caps)
  * Processors offering Enhanced IBRS are not guarenteed to be
  * repoline-safe.
  */
-if ( caps & (ARCH_CAPS_RSBA | ARCH_CAPS_IBRS_ALL) )
+if ( cpu_has_rsba || cpu_has_eibrs )
 return false;
 
 switch ( boot_cpu_data.x86_model )
@@ -845,7 +846,7 @@ static void __init ibpb_calculations(void)
 }
 
 /* Calculate whether this CPU is vulnerable to L1TF. */
-static __init void l1tf_calculations(uint64_t caps)
+static __init void l1tf_calculations(void)
 {
 bool hit_default = false;
 
@@ -933,7 +934,7 @@ static __init void l1tf_calculations(uint64_t caps)
 }
 
 /* Any processor advertising RDCL_NO should be not vulnerable to L1TF. */
-if ( caps & ARCH_CAPS_RDCL_NO )
+if ( cpu_has_rdcl_no )
 cpu_has_bug_l1tf = false;
 
 if ( cpu_has_bug_l1tf && hit_default )
@@ -992,7 +993,7 @@ static __init void l1tf_calculations(uint64_t caps)
 }
 
 /* Calculate whether this CPU is vulnerable to MDS. */
-static __init void mds_calculations(uint64_t caps)
+static __init void mds_calculations(void)
 {
 /* MDS is only known to affect Intel Family 6 processors at this time. */
 if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -1000,7 +1001,7 @@ static __init void mds_calculations(uint64_t caps)
 return;
 
 /* Any processor advertising MDS_NO should be not vulnerable to MDS. */
-if ( caps & ARCH_CAPS_MDS_NO )
+if ( cpu_has_mds_no )
 return;
 
 switch ( boot_cpu_data.x86_model )
@@ -1113,10 +1114,6 @@ void __init init_speculation_mitigations(void)
 enum ind_thunk thunk = 

[PATCH v2 08/10] x86/vtx: Remove opencoded MSR_ARCH_CAPS check

2023-05-24 Thread Andrew Cooper
MSR_ARCH_CAPS data is now included in featureset information.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
CC: Jun Nakajima 
CC: Kevin Tian 
---
 xen/arch/x86/hvm/vmx/vmx.c| 8 ++--
 xen/arch/x86/include/asm/cpufeature.h | 3 +++
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 096c69251d58..9dc16d0cc6b9 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2849,8 +2849,6 @@ static void __init ler_to_fixup_check(void);
  */
 static bool __init has_if_pschange_mc(void)
 {
-uint64_t caps = 0;
-
 /*
  * If we are virtualised, there is nothing we can do.  Our EPT tables are
  * shadowed by our hypervisor, and not walked by hardware.
@@ -2858,10 +2856,8 @@ static bool __init has_if_pschange_mc(void)
 if ( cpu_has_hypervisor )
 return false;
 
-if ( cpu_has_arch_caps )
-rdmsrl(MSR_ARCH_CAPABILITIES, caps);
-
-if ( caps & ARCH_CAPS_IF_PSCHANGE_MC_NO )
+/* Hardware reports itself as fixed. */
+if ( cpu_has_if_pschange_mc_no )
 return false;
 
 /*
diff --git a/xen/arch/x86/include/asm/cpufeature.h 
b/xen/arch/x86/include/asm/cpufeature.h
index d0ead8e7a51e..e3154ec5800d 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -182,6 +182,9 @@ static inline bool boot_cpu_has(unsigned int feat)
 #define cpu_has_avx_vnni_int8   boot_cpu_has(X86_FEATURE_AVX_VNNI_INT8)
 #define cpu_has_avx_ne_convert  boot_cpu_has(X86_FEATURE_AVX_NE_CONVERT)
 
+/* MSR_ARCH_CAPS */
+#define cpu_has_if_pschange_mc_no boot_cpu_has(X86_FEATURE_IF_PSCHANGE_MC_NO)
+
 /* Synthesized. */
 #define cpu_has_arch_perfmonboot_cpu_has(X86_FEATURE_ARCH_PERFMON)
 #define cpu_has_cpuid_faulting  boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
-- 
2.30.2




[PATCH v2 05/10] x86/boot: Record MSR_ARCH_CAPS for the Raw and Host CPU policy

2023-05-24 Thread Andrew Cooper
Extend x86_cpu_policy_fill_native() with a read of ARCH_CAPS based on the
CPUID information just read, removing the specially handling in
calculate_raw_cpu_policy().

Right now, the only use of x86_cpu_policy_fill_native() outside of Xen is the
unit tests.  Getting MSR data in this context is left to whomever first
encounters a genuine need to have it.

Extend generic_identify() to read ARCH_CAPS into x86_capability[], which is
fed into the Host Policy.  This in turn means there's no need to special case
arch_caps in calculate_host_policy().

No practical change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Extend commit message to discuss the absence of MSRs outside of __XEN__
---
 xen/arch/x86/cpu-policy.c | 12 
 xen/arch/x86/cpu/common.c |  5 +
 xen/lib/x86/cpuid.c   |  7 ++-
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 49f5465ec445..dfd9abd8564c 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -354,9 +354,6 @@ void calculate_raw_cpu_policy(void)
 
 /* 0x00ce  MSR_INTEL_PLATFORM_INFO */
 /* Was already added by probe_cpuid_faulting() */
-
-if ( cpu_has_arch_caps )
-rdmsrl(MSR_ARCH_CAPABILITIES, p->arch_caps.raw);
 }
 
 static void __init calculate_host_policy(void)
@@ -409,15 +406,6 @@ static void __init calculate_host_policy(void)
 /* 0x00ce  MSR_INTEL_PLATFORM_INFO */
 /* probe_cpuid_faulting() sanity checks presence of MISC_FEATURES_ENABLES 
*/
 p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting;
-
-/* Temporary, until we have known_features[] for feature bits in MSRs. */
-p->arch_caps.raw = raw_cpu_policy.arch_caps.raw &
-(ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
- ARCH_CAPS_SKIP_L1DFL | ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO |
- ARCH_CAPS_IF_PSCHANGE_MC_NO | ARCH_CAPS_TSX_CTRL | ARCH_CAPS_TAA_NO |
- ARCH_CAPS_SBDR_SSDP_NO | ARCH_CAPS_FBSDP_NO | ARCH_CAPS_PSDP_NO |
- ARCH_CAPS_FB_CLEAR | ARCH_CAPS_RRSBA | ARCH_CAPS_BHI_NO |
- ARCH_CAPS_PBRSB_NO);
 }
 
 static void __init guest_common_default_feature_adjustments(uint32_t *fs)
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 9bbb385db42d..f1084bb1ed36 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -477,6 +477,11 @@ static void generic_identify(struct cpuinfo_x86 *c)
cpuid_count(0xd, 1,
>x86_capability[FEATURESET_Da1],
, , );
+
+   if (test_bit(X86_FEATURE_ARCH_CAPS, c->x86_capability))
+   rdmsr(MSR_ARCH_CAPABILITIES,
+ c->x86_capability[FEATURESET_10Al],
+ c->x86_capability[FEATURESET_10Ah]);
 }
 
 /*
diff --git a/xen/lib/x86/cpuid.c b/xen/lib/x86/cpuid.c
index e795ce375032..07e550191448 100644
--- a/xen/lib/x86/cpuid.c
+++ b/xen/lib/x86/cpuid.c
@@ -226,7 +226,12 @@ void x86_cpu_policy_fill_native(struct cpu_policy *p)
 p->hv_limit = 0;
 p->hv2_limit = 0;
 
-/* TODO MSRs */
+#ifdef __XEN__
+/* TODO MSR_PLATFORM_INFO */
+
+if ( p->feat.arch_caps )
+rdmsrl(MSR_ARCH_CAPABILITIES, p->arch_caps.raw);
+#endif
 
 x86_cpu_policy_recalc_synth(p);
 }
-- 
2.30.2




[PATCH v2 04/10] x86/cpu-policy: MSR_ARCH_CAPS feature names

2023-05-24 Thread Andrew Cooper
Seed the default visibility from the dom0 special case, which for the most
part just exposes the *_NO bits.  EIBRS is the one non-*_NO bit, which is
"just" a status bit to the guest indicating a change in implemention of IBRS
which is already fully supported.

Insert a block dependency from the ARCH_CAPS CPUID bit to the entire content
of the MSR.  This is because MSRs have no structure information similar to
CPUID, and used by x86_cpu_policy_clear_out_of_range_leaves(), in order to
bulk-clear inaccessable words.

The overall CPUID bit is still max-only, so all of MSR_ARCH_CAPS is hidden in
the default policies.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

There is no libxl logic because libxl still uses the older xend format which
is specific to CPUID data.  That is going to need untangling at some other
point.

v2:
 * Don't expose SKIP_L1DFL to guests (it's only applicable for nested virt)
 * Fix SBDR_SSDP_NO and FBSDP_NO names.
 * Extend the commit message.
---
 tools/misc/xen-cpuid.c  | 13 
 xen/include/public/arch-x86/cpufeatureset.h | 23 +
 xen/tools/gen-cpuid.py  |  3 +++
 3 files changed, 39 insertions(+)

diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 15ad2d33e2a1..8925a583edd5 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -228,6 +228,19 @@ static const char *const str_7d2[32] =
 
 static const char *const str_10Al[32] =
 {
+[ 0] = "rdcl-no", [ 1] = "eibrs",
+[ 2] = "rsba",[ 3] = "skip-l1dfl",
+[ 4] = "intel-ssb-no",[ 5] = "mds-no",
+[ 6] = "if-pschange-mc-no",   [ 7] = "tsx-ctrl",
+[ 8] = "taa-no",  [ 9] = "mcu-ctrl",
+[10] = "misc-pkg-ctrl",   [11] = "energy-ctrl",
+[12] = "doitm",   [13] = "sbdr-ssdp-no",
+[14] = "fbsdp-no",[15] = "psdp-no",
+/* 16 */  [17] = "fb-clear",
+[18] = "fb-clear-ctrl",   [19] = "rrsba",
+[20] = "bhi-no",  [21] = "xapic-status",
+/* 22 */  [23] = "ovrclk-status",
+[24] = "pbrsb-no",
 };
 
 static const char *const str_10Ah[32] =
diff --git a/xen/include/public/arch-x86/cpufeatureset.h 
b/xen/include/public/arch-x86/cpufeatureset.h
index 032cec3ccba2..033b1a72feea 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -308,6 +308,29 @@ XEN_CPUFEATURE(AVX_NE_CONVERT, 15*32+ 5) /*A  
AVX-NE-CONVERT Instructions */
 XEN_CPUFEATURE(CET_SSS,15*32+18) /*   CET Supervisor Shadow Stacks 
safe to use */
 
 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */
+XEN_CPUFEATURE(RDCL_NO,16*32+ 0) /*A  No Rogue Data Cache Load 
(Meltdown) */
+XEN_CPUFEATURE(EIBRS,  16*32+ 1) /*A  Enhanced IBRS */
+XEN_CPUFEATURE(RSBA,   16*32+ 2) /*!A RSB Alternative (Retpoline 
not safe) */
+XEN_CPUFEATURE(SKIP_L1DFL, 16*32+ 3) /*   Don't need to flush L1D on 
VMEntry */
+XEN_CPUFEATURE(INTEL_SSB_NO,   16*32+ 4) /*A  No Speculative Store Bypass 
*/
+XEN_CPUFEATURE(MDS_NO, 16*32+ 5) /*A  No Microarchitectural Data 
Sampling */
+XEN_CPUFEATURE(IF_PSCHANGE_MC_NO,  16*32+ 6) /*A  No Instruction fetch #MC */
+XEN_CPUFEATURE(TSX_CTRL,   16*32+ 7) /*   MSR_TSX_CTRL */
+XEN_CPUFEATURE(TAA_NO, 16*32+ 8) /*A  No TSX Async Abort */
+XEN_CPUFEATURE(MCU_CTRL,   16*32+ 9) /*   MSR_MCU_CTRL */
+XEN_CPUFEATURE(MISC_PKG_CTRL,  16*32+10) /*   MSR_MISC_PKG_CTRL */
+XEN_CPUFEATURE(ENERGY_FILTERING,   16*32+11) /*   
MSR_MISC_PKG_CTRL.ENERGY_FILTERING */
+XEN_CPUFEATURE(DOITM,  16*32+12) /*   Data Operand Invariant 
Timing Mode */
+XEN_CPUFEATURE(SBDR_SSDP_NO,   16*32+13) /*A  No Shared Buffer Data Read 
or Sideband Stale Data Propagation */
+XEN_CPUFEATURE(FBSDP_NO,   16*32+14) /*A  No Fill Buffer Stale Data 
Propagation */
+XEN_CPUFEATURE(PSDP_NO,16*32+15) /*A  No Primary Stale Data 
Propagation */
+XEN_CPUFEATURE(FB_CLEAR,   16*32+17) /*A  Fill Buffers cleared by VERW 
*/
+XEN_CPUFEATURE(FB_CLEAR_CTRL,  16*32+18) /*   
MSR_OPT_CPU_CTRL.FB_CLEAR_DIS */
+XEN_CPUFEATURE(RRSBA,  16*32+19) /*!A Restricted RSB Alternative */
+XEN_CPUFEATURE(BHI_NO, 16*32+20) /*A  No Branch History Injection  
*/
+XEN_CPUFEATURE(XAPIC_STATUS,   16*32+21) /*   MSR_XAPIC_DISABLE_STATUS */
+XEN_CPUFEATURE(OVRCLK_STATUS,  16*32+23) /*   MSR_OVERCLOCKING_STATUS */
+XEN_CPUFEATURE(PBRSB_NO,   16*32+24) /*A  No Post-Barrier RSB 
predictions */
 
 /* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.edx, word 17 */
 
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 86d00bb3c273..f28ff708a2fc 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -325,6 +325,9 @@ def crunch_numbers(state):
 
 # In principle the TSXLDTRK 

[PATCH v2 06/10] x86/boot: Expose MSR_ARCH_CAPS data in guest max policies

2023-05-24 Thread Andrew Cooper
We already have common and default feature adjustment helpers.  Introduce one
for max featuresets too.

Offer MSR_ARCH_CAPS unconditionally in the max policy, and stop clobbering the
data inherited from the Host policy.  This will be necessary to level a VM
safely for migration.  Annotate the ARCH_CAPS CPUID bit as special.  Note:
ARCH_CAPS is still max-only for now, so will not be inhereted by the default
policies.

With this done, the special case for dom0 can be shrunk to just resampling the
Host policy (as ARCH_CAPS isn't visible by default yet).

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Annotate ARCH_CAPS as special.
---
 xen/arch/x86/cpu-policy.c   | 42 -
 xen/include/public/arch-x86/cpufeatureset.h |  2 +-
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index dfd9abd8564c..74266d30b551 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -408,6 +408,25 @@ static void __init calculate_host_policy(void)
 p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting;
 }
 
+static void __init guest_common_max_feature_adjustments(uint32_t *fs)
+{
+if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+{
+/*
+ * MSR_ARCH_CAPS is just feature data, and we can offer it to guests
+ * unconditionally, although limit it to Intel systems as it is highly
+ * uarch-specific.
+ *
+ * In particular, the RSBA and RRSBA bits mean "you might migrate to a
+ * system where RSB underflow uses alternative predictors (a.k.a
+ * Retpoline not safe)", so these need to be visible to a guest in all
+ * cases, even when it's only some other server in the pool which
+ * suffers the identified behaviour.
+ */
+__set_bit(X86_FEATURE_ARCH_CAPS, fs);
+}
+}
+
 static void __init guest_common_default_feature_adjustments(uint32_t *fs)
 {
 /*
@@ -483,6 +502,7 @@ static void __init calculate_pv_max_policy(void)
 __clear_bit(X86_FEATURE_IBRS, fs);
 }
 
+guest_common_max_feature_adjustments(fs);
 guest_common_feature_adjustments(fs);
 
 sanitise_featureset(fs);
@@ -490,8 +510,6 @@ static void __init calculate_pv_max_policy(void)
 recalculate_xstate(p);
 
 p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
-
-p->arch_caps.raw = 0; /* Not supported yet. */
 }
 
 static void __init calculate_pv_def_policy(void)
@@ -598,6 +616,7 @@ static void __init calculate_hvm_max_policy(void)
 if ( !cpu_has_vmx )
 __clear_bit(X86_FEATURE_PKS, fs);
 
+guest_common_max_feature_adjustments(fs);
 guest_common_feature_adjustments(fs);
 
 sanitise_featureset(fs);
@@ -606,8 +625,6 @@ static void __init calculate_hvm_max_policy(void)
 
 /* It's always possible to emulate CPUID faulting for HVM guests */
 p->platform_info.cpuid_faulting = true;
-
-p->arch_caps.raw = 0; /* Not supported yet. */
 }
 
 static void __init calculate_hvm_def_policy(void)
@@ -828,7 +845,10 @@ void __init init_dom0_cpuid_policy(struct domain *d)
  * domain policy logic gains a better understanding of MSRs.
  */
 if ( is_hardware_domain(d) && cpu_has_arch_caps )
+{
 p->feat.arch_caps = true;
+p->arch_caps.raw = host_cpu_policy.arch_caps.raw;
+}
 
 /* Apply dom0-cpuid= command line settings, if provided. */
 if ( dom0_cpuid_cmdline )
@@ -858,20 +878,6 @@ void __init init_dom0_cpuid_policy(struct domain *d)
 p->platform_info.cpuid_faulting = false;
 
 recalculate_cpuid_policy(d);
-
-if ( is_hardware_domain(d) && cpu_has_arch_caps )
-{
-uint64_t val;
-
-rdmsrl(MSR_ARCH_CAPABILITIES, val);
-
-p->arch_caps.raw = val &
-(ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
- ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO | ARCH_CAPS_IF_PSCHANGE_MC_NO 
|
- ARCH_CAPS_TAA_NO | ARCH_CAPS_SBDR_SSDP_NO | ARCH_CAPS_FBSDP_NO |
- ARCH_CAPS_PSDP_NO | ARCH_CAPS_FB_CLEAR | ARCH_CAPS_RRSBA |
- ARCH_CAPS_BHI_NO | ARCH_CAPS_PBRSB_NO);
-}
 }
 
 static void __init __maybe_unused build_assertions(void)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h 
b/xen/include/public/arch-x86/cpufeatureset.h
index 033b1a72feea..777041425e0a 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -271,7 +271,7 @@ XEN_CPUFEATURE(AVX512_FP16,   9*32+23) /*   AVX512 FP16 
instructions */
 XEN_CPUFEATURE(IBRSB, 9*32+26) /*A  IBRS and IBPB support (used by 
Intel) */
 XEN_CPUFEATURE(STIBP, 9*32+27) /*A  STIBP */
 XEN_CPUFEATURE(L1D_FLUSH, 9*32+28) /*S  MSR_FLUSH_CMD and L1D flush. */
-XEN_CPUFEATURE(ARCH_CAPS, 9*32+29) /*a  IA32_ARCH_CAPABILITIES MSR */
+XEN_CPUFEATURE(ARCH_CAPS, 9*32+29) /*!a IA32_ARCH_CAPABILITIES MSR */
 

[PATCH v2 07/10] x86/cpufeature: Rework {boot_,}cpu_has()

2023-05-24 Thread Andrew Cooper
One area where Xen deviates from Linux is that test_bit() forces a volatile
read.  This leads to poor code generation, because the optimiser cannot merge
bit operations on the same word.

Drop the use of test_bit(), and write the expressions in regular C.  This
removes the include of bitops.h (which is a frequent source of header
tangles), and it offers the optimiser far more flexibility.

Bloat-o-meter reports a net change of:

  add/remove: 0/0 grow/shrink: 21/87 up/down: 641/-2751 (-2110)

with half of that in x86_emulate() alone.  vmx_ctxt_switch_to() seems to be
the fastpath with the greatest delta at -24, where the optimiser has
successfully removed the branch hidden in cpu_has_msr_tsc_aux.

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Drop stdbool.  It is already covered by other includes.
---
 xen/arch/x86/include/asm/cpufeature.h | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/include/asm/cpufeature.h 
b/xen/arch/x86/include/asm/cpufeature.h
index 4140ec0938b2..d0ead8e7a51e 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -17,7 +17,6 @@
 #define X86_FEATURE_ALWAYS  X86_FEATURE_LM
 
 #ifndef __ASSEMBLY__
-#include 
 
 struct cpuinfo_x86 {
 unsigned char x86; /* CPU family */
@@ -43,8 +42,15 @@ struct cpuinfo_x86 {
 
 extern struct cpuinfo_x86 boot_cpu_data;
 
-#define cpu_has(c, bit)test_bit(bit, (c)->x86_capability)
-#define boot_cpu_has(bit)  test_bit(bit, boot_cpu_data.x86_capability)
+static inline bool cpu_has(const struct cpuinfo_x86 *info, unsigned int feat)
+{
+return info->x86_capability[cpufeat_word(feat)] & cpufeat_mask(feat);
+}
+
+static inline bool boot_cpu_has(unsigned int feat)
+{
+return cpu_has(_cpu_data, feat);
+}
 
 #define CPUID_PM_LEAF6
 #define CPUID6_ECX_APERFMPERF_CAPABILITY 0x1
-- 
2.30.2




[PATCH v2 01/10] x86/boot: Rework dom0 feature configuration

2023-05-24 Thread Andrew Cooper
Right now, dom0's feature configuration is split between between the common
path and a dom0-specific one.  This mostly is by accident, and causes some
very subtle bugs.

First, start by clearly defining init_dom0_cpuid_policy() to be the domain
that Xen builds automatically.  The late hwdom case is still constructed in a
mostly normal way, with the control domain having full discretion over the CPU
policy.

Identifying this highlights a latent bug - the two halves of the MSR_ARCH_CAPS
bodge are asymmetric with respect to the hardware domain.  This means that
shim, or a control-only dom0 sees the MSR_ARCH_CAPS CPUID bit but none of the
MSR content.  This in turn declares the hardware to be retpoline-safe by
failing to advertise the {R,}RSBA bits appropriately.  Restrict this logic to
the hardware domain, although the special case will cease to exist shortly.

For the CPUID Faulting adjustment, the comment in ctxt_switch_levelling()
isn't actually relevant.  Provide a better explanation.

Move the recalculate_cpuid_policy() call outside of the dom0-cpuid= case.
This is no change for now, but will become necessary shortly.

Finally, place the second half of the MSR_ARCH_CAPS bodge after the
recalculate_cpuid_policy() call.  This is necessary to avoid transiently
breaking the hardware domain's view while the handling is cleaned up.  This
special case will cease to exist shortly.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/cpu-policy.c | 57 +--
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index ef6a2d0d180a..5e7e19fbcda8 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -687,29 +687,6 @@ int init_domain_cpu_policy(struct domain *d)
 if ( !p )
 return -ENOMEM;
 
-/* See comment in ctxt_switch_levelling() */
-if ( !opt_dom0_cpuid_faulting && is_control_domain(d) && is_pv_domain(d) )
-p->platform_info.cpuid_faulting = false;
-
-/*
- * Expose the "hardware speculation behaviour" bits of ARCH_CAPS to dom0,
- * so dom0 can turn off workarounds as appropriate.  Temporary, until the
- * domain policy logic gains a better understanding of MSRs.
- */
-if ( is_hardware_domain(d) && cpu_has_arch_caps )
-{
-uint64_t val;
-
-rdmsrl(MSR_ARCH_CAPABILITIES, val);
-
-p->arch_caps.raw = val &
-(ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
- ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO | ARCH_CAPS_IF_PSCHANGE_MC_NO 
|
- ARCH_CAPS_TAA_NO | ARCH_CAPS_SBDR_SSDP_NO | ARCH_CAPS_FBSDP_NO |
- ARCH_CAPS_PSDP_NO | ARCH_CAPS_FB_CLEAR | ARCH_CAPS_RRSBA |
- ARCH_CAPS_BHI_NO | ARCH_CAPS_PBRSB_NO);
-}
-
 d->arch.cpu_policy = p;
 
 recalculate_cpuid_policy(d);
@@ -845,11 +822,15 @@ void recalculate_cpuid_policy(struct domain *d)
 p->extd.raw[0x19] = EMPTY_LEAF;
 }
 
+/*
+ * Adjust the CPU policy for dom0.  Really, this is "the domain Xen builds
+ * automatically on boot", and might not have the domid 0 (e.g. pvshim).
+ */
 void __init init_dom0_cpuid_policy(struct domain *d)
 {
 struct cpu_policy *p = d->arch.cpuid;
 
-/* dom0 can't migrate.  Give it ITSC if available. */
+/* Dom0 doesn't migrate relative to Xen.  Give it ITSC if available. */
 if ( cpu_has_itsc )
 p->extd.itsc = true;
 
@@ -858,7 +839,7 @@ void __init init_dom0_cpuid_policy(struct domain *d)
  * so dom0 can turn off workarounds as appropriate.  Temporary, until the
  * domain policy logic gains a better understanding of MSRs.
  */
-if ( cpu_has_arch_caps )
+if ( is_hardware_domain(d) && cpu_has_arch_caps )
 p->feat.arch_caps = true;
 
 /* Apply dom0-cpuid= command line settings, if provided. */
@@ -876,8 +857,32 @@ void __init init_dom0_cpuid_policy(struct domain *d)
 }
 
 x86_cpu_featureset_to_policy(fs, p);
+}
+
+/*
+ * PV Control domains used to require unfiltered CPUID.  This was fixed in
+ * Xen 4.13, but there is an cmdline knob to restore the prior behaviour.
+ *
+ * If the domain is getting unfiltered CPUID, don't let the guest kernel
+ * play with CPUID faulting either, as Xen's CPUID path won't cope.
+ */
+if ( !opt_dom0_cpuid_faulting && is_control_domain(d) && is_pv_domain(d) )
+p->platform_info.cpuid_faulting = false;
 
-recalculate_cpuid_policy(d);
+recalculate_cpuid_policy(d);
+
+if ( is_hardware_domain(d) && cpu_has_arch_caps )
+{
+uint64_t val;
+
+rdmsrl(MSR_ARCH_CAPABILITIES, val);
+
+p->arch_caps.raw = val &
+(ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
+ ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO | ARCH_CAPS_IF_PSCHANGE_MC_NO 
|
+ ARCH_CAPS_TAA_NO | ARCH_CAPS_SBDR_SSDP_NO | 

[PATCH v2 00/10] x86: Introduce MSR_ARCH_CAPS into featuresets

2023-05-24 Thread Andrew Cooper
Also combined with "x86: Feature check cleanup" for simplicity.

See individual patches for v2 deltas.

Andrew Cooper (10):
  x86/boot: Rework dom0 feature configuration
  x86/boot: Adjust MSR_ARCH_CAPS handling for the Host policy
  x86/cpu-policy: Infrastructure for MSR_ARCH_CAPS
  x86/cpu-policy: MSR_ARCH_CAPS feature names
  x86/boot: Record MSR_ARCH_CAPS for the Raw and Host CPU policy
  x86/boot: Expose MSR_ARCH_CAPS data in guest max policies
  x86/cpufeature: Rework {boot_,}cpu_has()
  x86/vtx: Remove opencoded MSR_ARCH_CAPS check
  x86/tsx: Remove opencoded MSR_ARCH_CAPS check
  x86/spec-ctrl: Remove opencoded MSR_ARCH_CAPS check

 tools/misc/xen-cpuid.c  | 57 +-
 tools/tests/cpu-policy/test-cpu-policy.c|  5 --
 xen/arch/x86/cpu-policy.c   | 83 ++---
 xen/arch/x86/cpu/common.c   |  5 ++
 xen/arch/x86/hvm/vmx/vmx.c  |  8 +-
 xen/arch/x86/include/asm/cpufeature.h   | 23 +-
 xen/arch/x86/include/asm/processor.h|  2 +-
 xen/arch/x86/spec_ctrl.c| 56 +++---
 xen/arch/x86/tsx.c  | 13 ++--
 xen/include/public/arch-x86/cpufeatureset.h | 29 ++-
 xen/include/xen/lib/x86/cpu-policy.h| 50 ++---
 xen/lib/x86/cpuid.c | 11 ++-
 xen/tools/gen-cpuid.py  |  3 +
 13 files changed, 208 insertions(+), 137 deletions(-)

-- 
2.30.2




[PATCH v2 09/10] x86/tsx: Remove opencoded MSR_ARCH_CAPS check

2023-05-24 Thread Andrew Cooper
The current cpu_has_tsx_ctrl tristate is serving double pupose; to signal the
first pass through tsx_init(), and the availability of MSR_TSX_CTRL.

Drop the variable, replacing it with a once boolean, and altering
cpu_has_tsx_ctrl to come out of the feature information.

No functional change.

Signed-off-by: Andrew Cooper 
Reviewed-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/include/asm/cpufeature.h |  1 +
 xen/arch/x86/include/asm/processor.h  |  2 +-
 xen/arch/x86/tsx.c| 13 -
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/include/asm/cpufeature.h 
b/xen/arch/x86/include/asm/cpufeature.h
index e3154ec5800d..9047ea43f503 100644
--- a/xen/arch/x86/include/asm/cpufeature.h
+++ b/xen/arch/x86/include/asm/cpufeature.h
@@ -184,6 +184,7 @@ static inline bool boot_cpu_has(unsigned int feat)
 
 /* MSR_ARCH_CAPS */
 #define cpu_has_if_pschange_mc_no boot_cpu_has(X86_FEATURE_IF_PSCHANGE_MC_NO)
+#define cpu_has_tsx_ctrlboot_cpu_has(X86_FEATURE_TSX_CTRL)
 
 /* Synthesized. */
 #define cpu_has_arch_perfmonboot_cpu_has(X86_FEATURE_ARCH_PERFMON)
diff --git a/xen/arch/x86/include/asm/processor.h 
b/xen/arch/x86/include/asm/processor.h
index 0eaa2c3094d0..f983ff501d95 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -535,7 +535,7 @@ static inline uint8_t get_cpu_family(uint32_t raw, uint8_t 
*model,
 return fam;
 }
 
-extern int8_t opt_tsx, cpu_has_tsx_ctrl;
+extern int8_t opt_tsx;
 extern bool rtm_disabled;
 void tsx_init(void);
 
diff --git a/xen/arch/x86/tsx.c b/xen/arch/x86/tsx.c
index 41b6092cfe16..fc199815994d 100644
--- a/xen/arch/x86/tsx.c
+++ b/xen/arch/x86/tsx.c
@@ -19,7 +19,6 @@
  * controlling TSX behaviour, and where TSX isn't force-disabled by firmware.
  */
 int8_t __read_mostly opt_tsx = -1;
-int8_t __read_mostly cpu_has_tsx_ctrl = -1;
 bool __read_mostly rtm_disabled;
 
 static int __init cf_check parse_tsx(const char *s)
@@ -37,24 +36,28 @@ custom_param("tsx", parse_tsx);
 
 void tsx_init(void)
 {
+static bool __read_mostly once;
+
 /*
  * This function is first called between microcode being loaded, and CPUID
  * being scanned generally.  Read into boot_cpu_data.x86_capability[] for
  * the cpu_has_* bits we care about using here.
  */
-if ( unlikely(cpu_has_tsx_ctrl < 0) )
+if ( unlikely(!once) )
 {
-uint64_t caps = 0;
 bool has_rtm_always_abort;
 
+once = true;
+
 if ( boot_cpu_data.cpuid_level >= 7 )
 boot_cpu_data.x86_capability[FEATURESET_7d0]
 = cpuid_count_edx(7, 0);
 
 if ( cpu_has_arch_caps )
-rdmsrl(MSR_ARCH_CAPABILITIES, caps);
+rdmsr(MSR_ARCH_CAPABILITIES,
+  boot_cpu_data.x86_capability[FEATURESET_10Al],
+  boot_cpu_data.x86_capability[FEATURESET_10Ah]);
 
-cpu_has_tsx_ctrl = !!(caps & ARCH_CAPS_TSX_CTRL);
 has_rtm_always_abort = cpu_has_rtm_always_abort;
 
 if ( cpu_has_tsx_ctrl && cpu_has_srbds_ctrl )
-- 
2.30.2




[PATCH v2 03/10] x86/cpu-policy: Infrastructure for MSR_ARCH_CAPS

2023-05-24 Thread Andrew Cooper
Bits through 24 are already defined, meaning that we're not far off needing
the second word.  Put both in right away.

As both halves are present now, the arch_caps field is full width.  Adjust the
unit test, which notices.

The bool bitfield names in the arch_caps union are unused, and somewhat out of
date.  They'll shortly be automatically generated.

Add CPUID and MSR prefixes to the ./xen-cpuid verbose output, now that there
are a mix of the two.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

v2:
 * Adjust the unit test.
 * Use an m prefix on the short name.
 * Add CPUID/MSR to the verbose name.
---
 tools/misc/xen-cpuid.c  | 44 +++---
 tools/tests/cpu-policy/test-cpu-policy.c|  5 ---
 xen/include/public/arch-x86/cpufeatureset.h |  4 ++
 xen/include/xen/lib/x86/cpu-policy.h| 50 ++---
 xen/lib/x86/cpuid.c |  4 ++
 5 files changed, 59 insertions(+), 48 deletions(-)

diff --git a/tools/misc/xen-cpuid.c b/tools/misc/xen-cpuid.c
index 8ec143ebc854..15ad2d33e2a1 100644
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -226,31 +226,41 @@ static const char *const str_7d2[32] =
 [ 4] = "bhi-ctrl",  [ 5] = "mcdt-no",
 };
 
+static const char *const str_10Al[32] =
+{
+};
+
+static const char *const str_10Ah[32] =
+{
+};
+
 static const struct {
 const char *name;
 const char *abbr;
 const char *const *strs;
 } decodes[] =
 {
-{ "0x0001.edx",   "1d",  str_1d },
-{ "0x0001.ecx",   "1c",  str_1c },
-{ "0x8001.edx",   "e1d", str_e1d },
-{ "0x8001.ecx",   "e1c", str_e1c },
-{ "0x000d:1.eax", "Da1", str_Da1 },
-{ "0x0007:0.ebx", "7b0", str_7b0 },
-{ "0x0007:0.ecx", "7c0", str_7c0 },
-{ "0x8007.edx",   "e7d", str_e7d },
-{ "0x8008.ebx",   "e8b", str_e8b },
-{ "0x0007:0.edx", "7d0", str_7d0 },
-{ "0x0007:1.eax", "7a1", str_7a1 },
-{ "0x8021.eax",  "e21a", str_e21a },
-{ "0x0007:1.ebx", "7b1", str_7b1 },
-{ "0x0007:2.edx", "7d2", str_7d2 },
-{ "0x0007:1.ecx", "7c1", str_7c1 },
-{ "0x0007:1.edx", "7d1", str_7d1 },
+{ "CPUID 0x0001.edx","1d", str_1d },
+{ "CPUID 0x0001.ecx","1c", str_1c },
+{ "CPUID 0x8001.edx",   "e1d", str_e1d },
+{ "CPUID 0x8001.ecx",   "e1c", str_e1c },
+{ "CPUID 0x000d:1.eax", "Da1", str_Da1 },
+{ "CPUID 0x0007:0.ebx", "7b0", str_7b0 },
+{ "CPUID 0x0007:0.ecx", "7c0", str_7c0 },
+{ "CPUID 0x8007.edx",   "e7d", str_e7d },
+{ "CPUID 0x8008.ebx",   "e8b", str_e8b },
+{ "CPUID 0x0007:0.edx", "7d0", str_7d0 },
+{ "CPUID 0x0007:1.eax", "7a1", str_7a1 },
+{ "CPUID 0x8021.eax",  "e21a", str_e21a },
+{ "CPUID 0x0007:1.ebx", "7b1", str_7b1 },
+{ "CPUID 0x0007:2.edx", "7d2", str_7d2 },
+{ "CPUID 0x0007:1.ecx", "7c1", str_7c1 },
+{ "CPUID 0x0007:1.edx", "7d1", str_7d1 },
+{ "MSR   0x010a.lo",  "m10Al", str_10Al },
+{ "MSR   0x010a.hi",  "m10Ah", str_10Ah },
 };
 
-#define COL_ALIGN "18"
+#define COL_ALIGN "24"
 
 static const char *const fs_names[] = {
 [XEN_SYSCTL_cpu_featureset_raw] = "Raw",
diff --git a/tools/tests/cpu-policy/test-cpu-policy.c 
b/tools/tests/cpu-policy/test-cpu-policy.c
index f1d968adfc39..301df2c00285 100644
--- a/tools/tests/cpu-policy/test-cpu-policy.c
+++ b/tools/tests/cpu-policy/test-cpu-policy.c
@@ -391,11 +391,6 @@ static void test_msr_deserialise_failure(void)
 .msr = { .idx = 0xce, .val = ~0ull },
 .rc = -EOVERFLOW,
 },
-{
-.name = "truncated val",
-.msr = { .idx = 0x10a, .val = ~0ull },
-.rc = -EOVERFLOW,
-},
 };
 
 printf("Testing MSR deserialise failure:\n");
diff --git a/xen/include/public/arch-x86/cpufeatureset.h 
b/xen/include/public/arch-x86/cpufeatureset.h
index 8de73aebc3e0..032cec3ccba2 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -307,6 +307,10 @@ XEN_CPUFEATURE(AVX_VNNI_INT8,  15*32+ 4) /*A  
AVX-VNNI-INT8 Instructions */
 XEN_CPUFEATURE(AVX_NE_CONVERT, 15*32+ 5) /*A  AVX-NE-CONVERT Instructions 
*/
 XEN_CPUFEATURE(CET_SSS,15*32+18) /*   CET Supervisor Shadow Stacks 
safe to use */
 
+/* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.eax, word 16 */
+
+/* Intel-defined CPU features, MSR_ARCH_CAPS 0x10a.edx, word 17 */
+
 #endif /* XEN_CPUFEATURE */
 
 /* Clean up from a default include.  Close the enum (for C). */
diff --git a/xen/include/xen/lib/x86/cpu-policy.h 
b/xen/include/xen/lib/x86/cpu-policy.h
index bfa425060464..6d5e9edd269b 100644
--- a/xen/include/xen/lib/x86/cpu-policy.h
+++ b/xen/include/xen/lib/x86/cpu-policy.h
@@ -4,22 +4,24 @@
 
 #include 
 
-#define FEATURESET_1d 0 /* 

[PATCH v2 02/10] x86/boot: Adjust MSR_ARCH_CAPS handling for the Host policy

2023-05-24 Thread Andrew Cooper
We are about to move MSR_ARCH_CAPS into featureset, but the order of
operations (copy raw policy, then copy x86_capabilitiles[] in) will end up
clobbering the ARCH_CAPS value.

Some toolstacks use this information to handle TSX compatibility across the
CPUs and microcode versions where support was removed.

To avoid this transient breakage, read from raw_cpu_policy rather than
modifying it in place.  This logic will be removed entirely in due course.

Signed-off-by: Andrew Cooper 
Acked-by: Jan Beulich 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 
---
 xen/arch/x86/cpu-policy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
index 5e7e19fbcda8..49f5465ec445 100644
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -411,7 +411,7 @@ static void __init calculate_host_policy(void)
 p->platform_info.cpuid_faulting = cpu_has_cpuid_faulting;
 
 /* Temporary, until we have known_features[] for feature bits in MSRs. */
-p->arch_caps.raw &=
+p->arch_caps.raw = raw_cpu_policy.arch_caps.raw &
 (ARCH_CAPS_RDCL_NO | ARCH_CAPS_IBRS_ALL | ARCH_CAPS_RSBA |
  ARCH_CAPS_SKIP_L1DFL | ARCH_CAPS_SSB_NO | ARCH_CAPS_MDS_NO |
  ARCH_CAPS_IF_PSCHANGE_MC_NO | ARCH_CAPS_TSX_CTRL | ARCH_CAPS_TAA_NO |
-- 
2.30.2




Re: [XEN PATCH 15/15] build: remove Config.mk include from Rules.mk

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> Everything needed to build the hypervisor should already be configured
> by "xen/Makefile", thus Config.mk shouldn't be needed.
> 
> Then, Config.mk keeps on testing support of some CFLAGS with CC, the
> result of this testing is not used at this stage so the build is
> slowed unnecessarily.
> 
> Likewise, GCC is checked to be at the minimum at 4.2 when entering
> every sub-directory, so the check have run countless time at this
> stage.
> 
> We only need to export a few more configuration variables. And add
> some variables in Kbuild.include, and macro fallbacks for Make older
> than 3.81. (Adding `or` just in case. it's only used in xen/Makefile,
> which includes Config.mk and so has already the fallback.)
> 
> Signed-off-by: Anthony PERARD 
> ---

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 






Re: [XEN PATCH 14/15] Config.mk: move $(cc-option, ) to config/compiler-testing.mk

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> In xen/, it isn't necessary to include Config.mk in every Makefile in
> subdirectories as nearly all necessary variables should be calculated
> in xen/Makefile. But some Makefile make use of the macro $(cc-option,)
> that is only available in Config.mk.
> 
> Extract $(cc-option,) from Config.mk so we can use it without
> including Config.mk and thus without having to recalculate some CFLAGS
> which would be ignored.
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 






Re: [PATCH v7 07/12] xen: enable Dom0 to use SVE feature

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Add a command line parameter to allow Dom0 the use of SVE resources,
> the command line parameter sve=, sub argument of dom0=,
> controls the feature on this domain and sets the maximum SVE vector
> length for Dom0.
> 
> Add a new function, parse_signed_integer(), to parse an integer
> command line argument.
> 
> Signed-off-by: Luca Fancellu 

Reviewed-by: Bertrand Marquis 

with ...

> ---
> Changes from v6:
> - Fixed case for e==NULL in parse_signed_integer, drop parenthesis
>   from if conditions, delete inline sve_domctl_vl_param and rely on
>   DCE from the compiler (Jan)
> - Drop parenthesis from opt_dom0_sve (Julien)
> - Do not continue if 'sve' is in command line args but
>   CONFIG_ARM64_SVE is not selected:
>   https://lore.kernel.org/all/7614ae25-f59d-430a-9c3e-30b1ce0e1...@arm.com/
> Changes from v5:
> - stop the domain if VL error occurs (Julien, Bertrand)
> - update the documentation
> - Rename sve_sanitize_vl_param to sve_domctl_vl_param to
>   mark the fact that we are sanitizing a parameter coming from
>   the user before encoding it into sve_vl in domctl structure.
>   (suggestion from Bertrand in a separate discussion)
> - update comment in parse_signed_integer, return boolean in
>   sve_domctl_vl_param (Jan).
> Changes from v4:
> - Negative values as user param means max supported HW VL (Jan)
> - update documentation, make use of no_config_param(), rename
>   parse_integer into parse_signed_integer and take long long *,
>   also put a comment on the -2 return condition, update
>   declaration comment to reflect the modifications (Jan)
> Changes from v3:
> - Don't use fixed len types when not needed (Jan)
> - renamed domainconfig_encode_vl to sve_encode_vl
> - Use a sub argument of dom0= to enable the feature (Jan)
> - Add parse_integer() function
> Changes from v2:
> - xen_domctl_createdomain field has changed into sve_vl and its
>   value now is the VL / 128, create an helper function for that.
> Changes from v1:
> - No changes
> Changes from RFC:
> - Changed docs to explain that the domain won't be created if the
>   requested vector length is above the supported one from the
>   platform.
> ---
> docs/misc/xen-command-line.pandoc| 20 ++--
> xen/arch/arm/arm64/sve.c | 20 
> xen/arch/arm/domain_build.c  | 26 ++
> xen/arch/arm/include/asm/arm64/sve.h | 10 ++
> xen/common/kernel.c  | 28 
> xen/include/xen/lib.h| 10 ++
> 6 files changed, 112 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc 
> b/docs/misc/xen-command-line.pandoc
> index e0b89b7d3319..47e5b4eb6199 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -777,9 +777,9 @@ Specify the bit width of the DMA heap.
> 
> ### dom0
> = List of [ pv | pvh, shadow=, verbose=,
> -cpuid-faulting=, msr-relaxed= ]
> +cpuid-faulting=, msr-relaxed= ] (x86)
> 
> -Applicability: x86
> += List of [ sve= ] (Arm)
> 
> Controls for how dom0 is constructed on x86 systems.
> 
> @@ -838,6 +838,22 @@ Controls for how dom0 is constructed on x86 systems.
> 
> If using this option is necessary to fix an issue, please report a bug.
> 
> +Enables features on dom0 on Arm systems.
> +
> +*   The `sve` integer parameter enables Arm SVE usage for Dom0 domain and 
> sets
> +the maximum SVE vector length, the option is applicable only to AArch64
> +guests.

Here i would just remove "guests", just AArch64 is enough.
I am ok if you choose to use "AArch64 Dom0 kernels"

> +A value equal to 0 disables the feature, this is the default value.
> +Values below 0 means the feature uses the maximum SVE vector length
> +supported by hardware, if SVE is supported.
> +Values above 0 explicitly set the maximum SVE vector length for Dom0,
> +allowed values are from 128 to maximum 2048, being multiple of 128.
> +Please note that when the user explicitly specifies the value, if that 
> value
> +is above the hardware supported maximum SVE vector length, the domain
> +creation will fail and the system will stop, the same will occur if the
> +option is provided with a non zero value, but the platform doesn't 
> support
> +SVE.
> +

I agree on the discussion with Jan here so you can keep my R-b if modified as 
discussed.


Cheers
Bertrand

> ### dom0-cpuid
> = List of comma separated booleans
> 
> diff --git a/xen/arch/arm/arm64/sve.c b/xen/arch/arm/arm64/sve.c
> index 84a6dedc1fd7..feaca2cf647d 100644
> --- a/xen/arch/arm/arm64/sve.c
> +++ b/xen/arch/arm/arm64/sve.c
> @@ -13,6 +13,9 @@
> #include 
> #include 
> 
> +/* opt_dom0_sve: allow Dom0 to use SVE and set maximum vector length. */
> +int __initdata opt_dom0_sve;
> +
> extern unsigned int sve_get_hw_vl(void);
> 
> /*
> @@ -152,6 +155,23 

Re: [PATCH v7 01/12] xen/arm: enable SVE extension for Xen

2023-05-24 Thread Julien Grall

Hi,

On 24/05/2023 10:01, Bertrand Marquis wrote:

diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index c4ec38bb2554..83b84368f6d5 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -9,6 +9,7 @@
#include 
#include 
#include 
+#include 
#include 

DECLARE_BITMAP(cpu_hwcaps, ARM_NCAPS);
@@ -143,6 +144,9 @@ void identify_cpu(struct cpuinfo_arm *c)

 c->zfr64.bits[0] = READ_SYSREG(ID_AA64ZFR0_EL1);

+if ( cpu_has_sve )
+c->zcr64.bits[0] = compute_max_zcr();
+
 c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);

 c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
@@ -199,7 +203,7 @@ static int __init create_guest_cpuinfo(void)
 guest_cpuinfo.pfr64.mpam = 0;
 guest_cpuinfo.pfr64.mpam_frac = 0;

-/* Hide SVE as Xen does not support it */
+/* Hide SVE by default to the guests */


Everything is for guests and as Jan mentioned in an other comment
this could be wrongly interpreted.


(Not directly related to this patch, so no changes expected here)

Hmmm... The name of the function/variable is confusing as well given 
that the cpuinfo should also apply to dom0. Should we s/guest/domain/?


Cheers,

--
Julien Grall



Re: [PATCH v7 05/12] arm/sve: save/restore SVE context switch

2023-05-24 Thread Luca Fancellu


> On 24 May 2023, at 10:47, Bertrand Marquis  wrote:
> 
> Hi Luca,
> 
>> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
>> 
>> Save/restore context switch for SVE, allocate memory to contain
>> the Z0-31 registers whose length is maximum 2048 bits each and
>> FFR who can be maximum 256 bits, the allocated memory depends on
>> how many bits is the vector length for the domain and how many bits
>> are supported by the platform.
>> 
>> Save P0-15 whose length is maximum 256 bits each, in this case the
>> memory used is from the fpregs field in struct vfp_state,
>> because V0-31 are part of Z0-31 and this space would have been
>> unused for SVE domain otherwise.
>> 
>> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
>> creation given the requested vector length and restore it on
>> context switch, save/restore ZCR_EL1 value as well.
>> 
>> List import macros from Linux in README.LinuxPrimitives.
>> 
>> Signed-off-by: Luca Fancellu 
> Reviewed-by: Bertrand Marquis 
> 
> Just ...
> 
>> ---
>> Changes from v6:
>> - Add comment for explain why sve_save/sve_load are different from
>>  Linux, add macros in xen/arch/arm/README.LinuxPrimitives (Julien)
>> - Add comments in sve_context_init and sve_context_free, handle the
>>  case where sve_zreg_ctx_end is NULL, move setting of v->arch.zcr_el2
>>  in sve_context_init (Julien)
>> - remove stubs for sve_context_* and sve_save_* and rely on compiler
>>  DCE (Jan)
>> - Add comments for sve_save_ctx/sve_load_ctx (Julien)
>> Changes from v5:
>> - use XFREE instead of xfree, keep the headers (Julien)
>> - Avoid math computation for every save/restore, store the computation
>>  in struct vfp_state once (Bertrand)
>> - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>>  that sve_vl is available only on arm64
>> Changes from v4:
>> - No changes
>> Changes from v3:
>> - don't use fixed len types when not needed (Jan)
>> - now VL is an encoded value, decode it before using.
>> Changes from v2:
>> - No changes
>> Changes from v1:
>> - No changes
>> Changes from RFC:
>> - Moved zcr_el2 field introduction in this patch, restore its
>>  content inside sve_restore_state function. (Julien)
>> 
>> fix patch 5
>> 
>> Signed-off-by: Luca Fancellu 
>> Change-Id: Ief65b2ff14fd579afa4fd110ce08a19980e64fa9
> 
> You have a signed off and a change-id that should not be here.
> They are in the comment section so should be removed during push so might be 
> ok :-)

Ohh yeah I missed that, probably it’s from a squash! 

> 
> Cheers
> Bertrand




Re: [PATCH v7 05/12] arm/sve: save/restore SVE context switch

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Save/restore context switch for SVE, allocate memory to contain
> the Z0-31 registers whose length is maximum 2048 bits each and
> FFR who can be maximum 256 bits, the allocated memory depends on
> how many bits is the vector length for the domain and how many bits
> are supported by the platform.
> 
> Save P0-15 whose length is maximum 256 bits each, in this case the
> memory used is from the fpregs field in struct vfp_state,
> because V0-31 are part of Z0-31 and this space would have been
> unused for SVE domain otherwise.
> 
> Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
> creation given the requested vector length and restore it on
> context switch, save/restore ZCR_EL1 value as well.
> 
> List import macros from Linux in README.LinuxPrimitives.
> 
> Signed-off-by: Luca Fancellu 
Reviewed-by: Bertrand Marquis 

Just ...

> ---
> Changes from v6:
> - Add comment for explain why sve_save/sve_load are different from
>   Linux, add macros in xen/arch/arm/README.LinuxPrimitives (Julien)
> - Add comments in sve_context_init and sve_context_free, handle the
>   case where sve_zreg_ctx_end is NULL, move setting of v->arch.zcr_el2
>   in sve_context_init (Julien)
> - remove stubs for sve_context_* and sve_save_* and rely on compiler
>   DCE (Jan)
> - Add comments for sve_save_ctx/sve_load_ctx (Julien)
> Changes from v5:
> - use XFREE instead of xfree, keep the headers (Julien)
> - Avoid math computation for every save/restore, store the computation
>   in struct vfp_state once (Bertrand)
> - protect access to v->domain->arch.sve_vl inside arch_vcpu_create now
>   that sve_vl is available only on arm64
> Changes from v4:
> - No changes
> Changes from v3:
> - don't use fixed len types when not needed (Jan)
> - now VL is an encoded value, decode it before using.
> Changes from v2:
> - No changes
> Changes from v1:
> - No changes
> Changes from RFC:
> - Moved zcr_el2 field introduction in this patch, restore its
>   content inside sve_restore_state function. (Julien)
> 
> fix patch 5
> 
> Signed-off-by: Luca Fancellu 
> Change-Id: Ief65b2ff14fd579afa4fd110ce08a19980e64fa9

You have a signed off and a change-id that should not be here.
They are in the comment section so should be removed during push so might be ok 
:-)

Cheers
Bertrand




Re: [XEN PATCH 13/15] build: fix compile.h compiler version command line

2023-05-24 Thread Luca Fancellu


> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> CFLAGS is just from Config.mk, instead use the flags used to build
> Xen.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Notes:
>I don't know if CFLAGS is even useful there, just --version without the
>flags might produce the same result.
> 
> xen/build.mk | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/build.mk b/xen/build.mk
> index e2a78aa806..d468bb6e26 100644
> --- a/xen/build.mk
> +++ b/xen/build.mk
> @@ -23,7 +23,7 @@ define cmd_compile.h
>-e 's/@@whoami@@/$(XEN_WHOAMI)/g' \
>-e 's/@@domain@@/$(XEN_DOMAIN)/g' \
>-e 's/@@hostname@@/$(XEN_BUILD_HOST)/g' \
> --e 's!@@compiler@@!$(shell $(CC) $(CFLAGS) --version 2>&1 | head -1)!g' \
> +-e 's!@@compiler@@!$(shell $(CC) $(XEN_CFLAGS) --version 2>&1 | head 
> -1)!g' \
>-e 's/@@version@@/$(XEN_VERSION)/g' \
>-e 's/@@subversion@@/$(XEN_SUBVERSION)/g' \
>-e 's/@@extraversion@@/$(XEN_EXTRAVERSION)/g' \
> -- 
> Anthony PERARD
> 
> 

Yes I think Andrew is right, so I guess $(XEN_CFLAGS) can be dropped?

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 

I’ve tested this patch with and without the $(XEN_CFLAGS), so if you drop it 
you can
retain my r-by if you want.



Re: [XEN PATCH 12/15] build: avoid Config.mk's CFLAGS

2023-05-24 Thread Luca Fancellu


> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> The variable $(CFLAGS) is too often set in the environment,
> especially when building a package for a distribution. Often, those
> CFLAGS are intended to be use to build user spaces binaries, not a

NIT: s/use/used/

But I’m not a native speaker so I might be wrong on this

> kernel. This mean packager needs to takes extra steps to build Xen by
> overriding the CFLAGS provided by the package build environment.
> 
> With this patch, we avoid using the variable $(CFLAGS). Also, the
> hypervisor's build system have complete control over which CFLAGS are
> used.
> 
> No change intended to XEN_CFLAGS used, beside some flags which may be
> in a different order on the command line.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Notes:
>There's still $(EXTRA_CFLAGS_XEN_CORE) which allows to add more CFLAGS
>if someone building Xen needs to add more CFLAGS to the hypervisor
>build.
> 

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 





Re: [XEN PATCH 03/15] build, x86: clean build log for boot/ targets

2023-05-24 Thread Andrew Cooper
On 23/05/2023 5:37 pm, Anthony PERARD wrote:
> diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
> index 03d8ce3a9e..2693b938bd 100644
> --- a/xen/arch/x86/boot/Makefile
> +++ b/xen/arch/x86/boot/Makefile
> @@ -26,10 +28,16 @@ $(head-bin-objs): XEN_CFLAGS := $(CFLAGS_x86_32) -fpic
>  LDFLAGS_DIRECT-$(call ld-option,--warn-rwx-segments) := 
> --no-warn-rwx-segments
>  LDFLAGS_DIRECT += $(LDFLAGS_DIRECT-y)
>  
> -%.bin: %.lnk
> - $(OBJCOPY) -j .text -O binary $< $@
> +%.bin: OBJCOPYFLAGS := -j .text -O binary
> +%.bin: %.lnk FORCE
> + $(call if_changed,objcopy)
>  
> -%.lnk: %.o $(src)/build32.lds
> - $(LD) $(subst x86_64,i386,$(LDFLAGS_DIRECT)) -N -T $(filter %.lds,$^) 
> -o $@ $<
> +quiet_cmd_ld_lnk_o = LD  $@

I'd suggest that this be LD32 because it is different to most other LD's
in the log.

However, this is a definite improvement.

Reviewed-by: Andrew Cooper 
Tested-by: Andrew Cooper 

Happy to fix up on commit.



Re: [XEN PATCH 00/15] build: cleanup build log, avoid user's CFLAGS, avoid too many include of Config.mk

2023-05-24 Thread Andrew Cooper
On 23/05/2023 5:37 pm, Anthony PERARD wrote:
> Patch series available in this git branch:
> https://xenbits.xen.org/git-http/people/aperard/xen-unstable.git 
> br.build-system-xen-removing-config.mk-v1
>
> Hi,
>
> This series of patch cleanup the remaining rules still displaying their 
> command
> line.
>
> Then, some change are made in Config.mk to remove build-id stuff only used by
> Xen build.
>
> Then, the variable AFLAGS and CFLAGS are been renamed XEN_AFLAGS and 
> XEN_CFLAGS
> from the beginning to about inclusion of users CFLAGS as those are usually 
> made
> for user space program, not kernel, especially in build environment for distro
> packages.
>
> Last patch removes the inclusion of Config.mk from xen/Rules.mk, as this slow
> down the build unnecessarily. xen/Makefile should do everything necessary to
> setup the build of the hypervisor, and is its only entry point.

Thankyou for doing this.  I'm tempted to summarily ack it, but lets do
things properly.

One thing though, which I think might be a regression but I'm not sure. 
When doing an incremental build (second build with no change), we get:

...
  UPD include/xen/compile.h
 __  __    _  _    _  ___ _    _ _  
 \ \/ /___ _ __   | || |  / |( _ )    _   _ _ __  ___| |_ __ _| |__ | | ___
  \  // _ \ '_ \  | || |_ | |/ _ \ __| | | | '_ \/ __| __/ _` | '_ \| |/ _ \
  /  \  __/ | | | |__   _|| | (_) |__| |_| | | | \__ \ || (_| | |_) | |  __/
 /_/\_\___|_| |_|    |_|(_)_|\___/    \__,_|_| |_|___/\__\__,_|_.__/|_|\___|
    
make[2]: Nothing to be done for 'all'.
make[4]: Nothing to be done for 'all'.
  CC  common/version.o
  CC  arch/x86/efi/boot.o
...

Where I think those two "nothing to be done for 'all'" are new.  I don't
see them in a build from clean.

~Andrew

P.S. I do have some other notes for further cleanup, but I'm not going
to extend this current series with them.



Re: [PATCH v7 03/12] xen/arm: Expose SVE feature to the guest

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> When a guest is allowed to use SVE, expose the SVE features through
> the identification registers.
> 
> Signed-off-by: Luca Fancellu 
> Acked-by: Julien Grall 

Reviewed-by: Bertrand Marquis 

Cheers
Bertrand

> ---
> Changes from v6:
> - code style fix, add A-by Julien
> Changes from v5:
> - given the move of is_sve_domain() in asm/arm64/sve.h, add the
>   header to vsysreg.c
> - dropping Bertrand's R-by because of the change
> Changes from v4:
> - no changes
> Changes from v3:
> - no changes
> Changes from v2:
> - no changes
> Changes from v1:
> - No changes
> Changes from RFC:
> - No changes
> ---
> xen/arch/arm/arm64/vsysreg.c | 41 ++--
> 1 file changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 758750983c11..fe31f7b3827f 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -18,6 +18,8 @@
> 
> #include 
> 
> +#include 
> +#include 
> #include 
> #include 
> #include 
> @@ -295,7 +297,28 @@ void do_sysreg(struct cpu_user_regs *regs,
> GENERATE_TID3_INFO(MVFR0_EL1, mvfr, 0)
> GENERATE_TID3_INFO(MVFR1_EL1, mvfr, 1)
> GENERATE_TID3_INFO(MVFR2_EL1, mvfr, 2)
> -GENERATE_TID3_INFO(ID_AA64PFR0_EL1, pfr64, 0)
> +
> +case HSR_SYSREG_ID_AA64PFR0_EL1:
> +{
> +register_t guest_reg_value = guest_cpuinfo.pfr64.bits[0];
> +
> +if ( is_sve_domain(v->domain) )
> +{
> +/* 4 is the SVE field width in id_aa64pfr0_el1 */
> +uint64_t mask = GENMASK(ID_AA64PFR0_SVE_SHIFT + 4 - 1,
> +ID_AA64PFR0_SVE_SHIFT);
> +/* sysval is the sve field on the system */
> +uint64_t sysval = cpuid_feature_extract_unsigned_field_width(
> +system_cpuinfo.pfr64.bits[0],
> +ID_AA64PFR0_SVE_SHIFT, 4);
> +guest_reg_value &= ~mask;
> +guest_reg_value |= (sysval << ID_AA64PFR0_SVE_SHIFT) & mask;
> +}
> +
> +return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
> +  guest_reg_value);
> +}
> +
> GENERATE_TID3_INFO(ID_AA64PFR1_EL1, pfr64, 1)
> GENERATE_TID3_INFO(ID_AA64DFR0_EL1, dbg64, 0)
> GENERATE_TID3_INFO(ID_AA64DFR1_EL1, dbg64, 1)
> @@ -306,7 +329,21 @@ void do_sysreg(struct cpu_user_regs *regs,
> GENERATE_TID3_INFO(ID_AA64MMFR2_EL1, mm64, 2)
> GENERATE_TID3_INFO(ID_AA64AFR0_EL1, aux64, 0)
> GENERATE_TID3_INFO(ID_AA64AFR1_EL1, aux64, 1)
> -GENERATE_TID3_INFO(ID_AA64ZFR0_EL1, zfr64, 0)
> +
> +case HSR_SYSREG_ID_AA64ZFR0_EL1:
> +{
> +/*
> + * When the guest has the SVE feature enabled, the whole 
> id_aa64zfr0_el1
> + * needs to be exposed.
> + */
> +register_t guest_reg_value = guest_cpuinfo.zfr64.bits[0];
> +
> +if ( is_sve_domain(v->domain) )
> +guest_reg_value = system_cpuinfo.zfr64.bits[0];
> +
> +return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
> +  guest_reg_value);
> +}
> 
> /*
>  * Those cases are catching all Reserved registers trapped by TID3 which
> -- 
> 2.34.1
> 




Re: [PATCH v7 02/12] xen/arm: add SVE vector length field to the domain

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Add sve_vl field to arch_domain and xen_arch_domainconfig struct,
> to allow the domain to have an information about the SVE feature
> and the number of SVE register bits that are allowed for this
> domain.
> 
> sve_vl field is the vector length in bits divided by 128, this
> allows to use less space in the structures.
> 
> The field is used also to allow or forbid a domain to use SVE,
> because a value equal to zero means the guest is not allowed to
> use the feature.
> 
> Check that the requested vector length is lower or equal to the
> platform supported vector length, otherwise fail on domain
> creation.
> 
> Check that only 64 bit domains have SVE enabled, otherwise fail.
> 
> Signed-off-by: Luca Fancellu 

Reviewed-by: Bertrand Marquis 

Cheers
Bertrand




Re: [PATCH v7 01/12] xen/arm: enable SVE extension for Xen

2023-05-24 Thread Bertrand Marquis
Hi Luca,

> On 23 May 2023, at 09:43, Luca Fancellu  wrote:
> 
> Enable Xen to handle the SVE extension, add code in cpufeature module
> to handle ZCR SVE register, disable trapping SVE feature on system
> boot only when SVE resources are accessed.
> While there, correct coding style for the comment on coprocessor
> trapping.
> 
> Now cptr_el2 is part of the domain context and it will be restored
> on context switch, this is a preparation for saving the SVE context
> which will be part of VFP operations, so restore it before the call
> to save VFP registers.
> To save an additional isb barrier, restore cptr_el2 before an
> existing isb barrier and move the call for saving VFP context after
> that barrier. To keep a (mostly) specularity of ctxt_switch_to()
> and ctxt_switch_from(), move vfp_save_state() up in the function.
> 
> Change the KConfig entry to make ARM64_SVE symbol selectable, by
> default it will be not selected.
> 
> Create sve module and sve_asm.S that contains assembly routines for
> the SVE feature, this code is inspired from linux and it uses
> instruction encoding to be compatible with compilers that does not
> support SVE, imported instructions are documented in
> README.LinuxPrimitives.
> 
> Signed-off-by: Luca Fancellu 

Reviewed-by: Bertrand Marquis 

with one minor NIT that could be fixed on commit...

> ---
> Changes from v6:
> - modified licence, add emacs block, move vfp_save_state up in the
>   function, add comments to CPTR_EL2 and vfp_restore_state, don't
>   use variable in init_traps(), code style fixes,
>   add entries to README.LinuxPrimitives (Julien)
> - vl_to_zcr is moved into sve.c module as changes to the series led
>   to its usage only inside it, remove stub for compute_max_zcr and
>   rely on compiler DCE.
> Changes from v5:
> - Add R-by Bertrand
> Changes from v4:
> - don't use fixed types in vl_to_zcr, forgot to address that in
>   v3, by mistake I changed that in patch 2, fixing now (Jan)
> Changes from v3:
> - no changes
> Changes from v2:
> - renamed sve_asm.S in sve-asm.S, new files should not contain
>   underscore in the name (Jan)
> Changes from v1:
> - Add assert to vl_to_zcr, it is never called with vl==0, but just
>   to be sure it won't in the future.
> Changes from RFC:
> - Moved restoring of cptr before an existing barrier (Julien)
> - Marked the feature as unsupported for now (Julien)
> - Trap and un-trap only when using SVE resources in
>   compute_max_zcr() (Julien)
> ---
> xen/arch/arm/Kconfig | 10 ++--
> xen/arch/arm/README.LinuxPrimitives  |  9 
> xen/arch/arm/arm64/Makefile  |  1 +
> xen/arch/arm/arm64/cpufeature.c  |  7 ++-
> xen/arch/arm/arm64/sve-asm.S | 48 +++
> xen/arch/arm/arm64/sve.c | 59 
> xen/arch/arm/cpufeature.c|  6 ++-
> xen/arch/arm/domain.c| 20 +---
> xen/arch/arm/include/asm/arm64/sve.h | 27 +++
> xen/arch/arm/include/asm/arm64/sysregs.h |  1 +
> xen/arch/arm/include/asm/cpufeature.h| 14 ++
> xen/arch/arm/include/asm/domain.h|  1 +
> xen/arch/arm/include/asm/processor.h |  2 +
> xen/arch/arm/setup.c |  5 +-
> xen/arch/arm/traps.c | 27 ++-
> 15 files changed, 210 insertions(+), 27 deletions(-)
> create mode 100644 xen/arch/arm/arm64/sve-asm.S
> create mode 100644 xen/arch/arm/arm64/sve.c
> create mode 100644 xen/arch/arm/include/asm/arm64/sve.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 239d3aed3c7f..41f45d8d1203 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -112,11 +112,15 @@ config ARM64_PTR_AUTH
>  This feature is not supported in Xen.
> 
> config ARM64_SVE
> - def_bool n
> + bool "Enable Scalar Vector Extension support (UNSUPPORTED)" if UNSUPPORTED
> depends on ARM_64
> help
> -  Scalar Vector Extension support.
> -  This feature is not supported in Xen.
> +  Scalar Vector Extension (SVE/SVE2) support for guests.
> +
> +  Please be aware that currently, enabling this feature will add latency on
> +  VM context switch between SVE enabled guests, between not-enabled SVE
> +  guests and SVE enabled guests and viceversa, compared to the time
> +  required to switch between not-enabled SVE guests.
> 
> config ARM64_MTE
> def_bool n
> diff --git a/xen/arch/arm/README.LinuxPrimitives 
> b/xen/arch/arm/README.LinuxPrimitives
> index 1d53e6a898da..76c8df29e416 100644
> --- a/xen/arch/arm/README.LinuxPrimitives
> +++ b/xen/arch/arm/README.LinuxPrimitives
> @@ -62,6 +62,15 @@ done
> linux/arch/arm64/lib/clear_page.S   xen/arch/arm/arm64/lib/clear_page.S
> linux/arch/arm64/lib/copy_page.Sunused in Xen
> 
> +-
> +
> +SVE assembly macro: last sync @ v6.3.0 (last commit: 457391b03803)
> +
> +linux/arch/arm64/include/asm/fpsimdmacros.h   
> xen/arch/arm/include/asm/arm64/sve-asm.S

Re: [PATCH v2 5/6] block/linux-aio: convert to blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:12:59PM -0400, Stefan Hajnoczi wrote:

Stop using the .bdrv_co_io_plug() API because it is not multi-queue
block layer friendly. Use the new blk_io_plug_call() API to batch I/O
submission instead.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
include/block/raw-aio.h |  7 ---
block/file-posix.c  | 28 
block/linux-aio.c   | 41 +++--
3 files changed, 11 insertions(+), 65 deletions(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index da60ca13ef..0f63c2800c 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -62,13 +62,6 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, 
QEMUIOVector *qiov,

void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
-
-/*
- * laio_io_plug/unplug work in the thread's current AioContext, therefore the
- * caller must ensure that they are paired in the same IOThread.
- */
-void laio_io_plug(void);
-void laio_io_unplug(uint64_t dev_max_batch);
#endif
/* io_uring.c - Linux io_uring implementation */
#ifdef CONFIG_LINUX_IO_URING
diff --git a/block/file-posix.c b/block/file-posix.c
index 7baa8491dd..ac1ed54811 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2550,26 +2550,6 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState 
*bs, int64_t offset,
return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
}

-static void coroutine_fn raw_co_io_plug(BlockDriverState *bs)
-{
-BDRVRawState __attribute__((unused)) *s = bs->opaque;
-#ifdef CONFIG_LINUX_AIO
-if (s->use_linux_aio) {
-laio_io_plug();
-}
-#endif
-}
-
-static void coroutine_fn raw_co_io_unplug(BlockDriverState *bs)
-{
-BDRVRawState __attribute__((unused)) *s = bs->opaque;
-#ifdef CONFIG_LINUX_AIO
-if (s->use_linux_aio) {
-laio_io_unplug(s->aio_max_batch);
-}
-#endif
-}
-
static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs)
{
BDRVRawState *s = bs->opaque;
@@ -3914,8 +3894,6 @@ BlockDriver bdrv_file = {
.bdrv_co_copy_range_from = raw_co_copy_range_from,
.bdrv_co_copy_range_to  = raw_co_copy_range_to,
.bdrv_refresh_limits = raw_refresh_limits,
-.bdrv_co_io_plug= raw_co_io_plug,
-.bdrv_co_io_unplug  = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,

.bdrv_co_truncate   = raw_co_truncate,
@@ -4286,8 +4264,6 @@ static BlockDriver bdrv_host_device = {
.bdrv_co_copy_range_from = raw_co_copy_range_from,
.bdrv_co_copy_range_to  = raw_co_copy_range_to,
.bdrv_refresh_limits = raw_refresh_limits,
-.bdrv_co_io_plug= raw_co_io_plug,
-.bdrv_co_io_unplug  = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,

.bdrv_co_truncate   = raw_co_truncate,
@@ -4424,8 +4400,6 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_pwritev= raw_co_pwritev,
.bdrv_co_flush_to_disk  = raw_co_flush_to_disk,
.bdrv_refresh_limits= cdrom_refresh_limits,
-.bdrv_co_io_plug= raw_co_io_plug,
-.bdrv_co_io_unplug  = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,

.bdrv_co_truncate   = raw_co_truncate,
@@ -4552,8 +4526,6 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_co_pwritev= raw_co_pwritev,
.bdrv_co_flush_to_disk  = raw_co_flush_to_disk,
.bdrv_refresh_limits= cdrom_refresh_limits,
-.bdrv_co_io_plug= raw_co_io_plug,
-.bdrv_co_io_unplug  = raw_co_io_unplug,
.bdrv_attach_aio_context = raw_aio_attach_aio_context,

.bdrv_co_truncate   = raw_co_truncate,
diff --git a/block/linux-aio.c b/block/linux-aio.c
index 442c86209b..5021aed68f 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -15,6 +15,7 @@
#include "qemu/event_notifier.h"
#include "qemu/coroutine.h"
#include "qapi/error.h"
+#include "sysemu/block-backend.h"

/* Only used for assertions.  */
#include "qemu/coroutine_int.h"
@@ -46,7 +47,6 @@ struct qemu_laiocb {
};

typedef struct {
-int plugged;
unsigned int in_queue;
unsigned int in_flight;
bool blocked;
@@ -236,7 +236,7 @@ static void 
qemu_laio_process_completions_and_submit(LinuxAioState *s)
{
qemu_laio_process_completions(s);

-if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(>io_q.pending)) {
+if (!QSIMPLEQ_EMPTY(>io_q.pending)) {
ioq_submit(s);
}
}
@@ -277,7 +277,6 @@ static void qemu_laio_poll_ready(EventNotifier *opaque)
static void ioq_init(LaioQueue *io_q)
{
QSIMPLEQ_INIT(_q->pending);
-io_q->plugged = 0;
io_q->in_queue = 0;
io_q->in_flight = 0;
io_q->blocked = false;
@@ -354,31 +353,11 @@ static uint64_t laio_max_batch(LinuxAioState *s, uint64_t 
dev_max_batch)
return max_batch;
}

-void laio_io_plug(void)
+static void laio_unplug_fn(void *opaque)
{
- 

Re: [PATCH v2 6/6] block: remove bdrv_co_io_plug() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:13:00PM -0400, Stefan Hajnoczi wrote:

No block driver implements .bdrv_co_io_plug() anymore. Get rid of the
function pointers.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
include/block/block-io.h |  3 ---
include/block/block_int-common.h | 11 --
block/io.c   | 37 
3 files changed, 51 deletions(-)


Reviewed-by: Stefano Garzarella 




Re: [PATCH v2 4/6] block/io_uring: convert to blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:12:58PM -0400, Stefan Hajnoczi wrote:

Stop using the .bdrv_co_io_plug() API because it is not multi-queue
block layer friendly. Use the new blk_io_plug_call() API to batch I/O
submission instead.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
v2
- Removed whitespace hunk [Eric]
---
include/block/raw-aio.h |  7 ---
block/file-posix.c  | 10 --
block/io_uring.c| 44 -
block/trace-events  |  5 ++---
4 files changed, 19 insertions(+), 47 deletions(-)


Reviewed-by: Stefano Garzarella 




Re: [XEN PATCH 10/15] build: rename $(AFLAGS) to $(XEN_AFLAGS)

2023-05-24 Thread Luca Fancellu



> On 24 May 2023, at 09:29, Luca Fancellu  wrote:
> 
> 
> 
>> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
>> 
>> We don't want the AFLAGS from the environment, they are usually meant
>> to build user space application and not for the hypervisor.
>> 
>> Config.mk doesn't provied any $(AFLAGS) so we can start a fresh

NIT: there is a typo s/provied/provide/

>> $(XEN_AFLAGS).
>> 
>> Signed-off-by: Anthony PERARD 
>> ---
> 
> Reviewed-by: Luca Fancellu 
> Tested-by: Luca Fancellu 
> 
> 
> 




Re: [XEN PATCH 11/15] build: rename CFLAGS to XEN_CFLAGS in xen/Makefile

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> This is a preparatory patch. A future patch will not even use
> $(CFLAGS) to seed $(XEN_CFLAGS).
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 





Re: [XEN PATCH 10/15] build: rename $(AFLAGS) to $(XEN_AFLAGS)

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> We don't want the AFLAGS from the environment, they are usually meant
> to build user space application and not for the hypervisor.
> 
> Config.mk doesn't provied any $(AFLAGS) so we can start a fresh
> $(XEN_AFLAGS).
> 
> Signed-off-by: Anthony PERARD 
> ---

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 






[qemu-mainline test] 180921: regressions - FAIL

2023-05-24 Thread osstest service owner
flight 180921 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180921/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   6 xen-buildfail REGR. vs. 180691
 build-amd64-xsm   6 xen-buildfail REGR. vs. 180691
 build-i386-xsm6 xen-buildfail REGR. vs. 180691
 build-i3866 xen-buildfail REGR. vs. 180691
 build-arm64   6 xen-buildfail REGR. vs. 180691
 build-arm64-xsm   6 xen-buildfail REGR. vs. 180691
 build-armhf   6 xen-buildfail REGR. vs. 180691

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemuu-ws16-amd64  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-shadow 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-vhd1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-xsm1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-vhd   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-raw   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-coresched-i386-xl  1 build-check(1)   blocked  n/a
 test-amd64-coresched-amd64-xl  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-xsm   1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-shadow1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)   blocked  n/a
 build-arm64-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ws16-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 build-armhf-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 1 build-check(1) blocked 
n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-amd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow  1 build-check(1) blocked n/a
 test-amd64-amd64-dom0pvh-xl-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 

[xen-4.17-testing test] 180919: tolerable FAIL - PUSHED

2023-05-24 Thread osstest service owner
flight 180919 xen-4.17-testing real [real]
flight 180926 xen-4.17-testing real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180919/
http://logs.test-lab.xenproject.org/osstest/logs/180926/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 12 debian-hvm-install fail pass in 
180926-retest

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 180683
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180683
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 180683
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180683
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180683
 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check   fail like 180683
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 180683
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180683
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 180683
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180683
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180683
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180683
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 xen  47eb94123035a2987dd1e328e9adec6db36e7fb3
baseline version:
 xen  b773c48e368d9cf1ea29b259fe4ae434b8bb42da

Last test of basis   180683  

Re: [XEN PATCH 09/15] build: hide commands run for kconfig

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> but still show a log entry for syncconfig. We have to use kecho
> instead of $(cmd,) to avoid issue with prompt from kconfig.
> 
> linux commits for reference:
>23cd88c91343 ("kbuild: hide commands to run Kconfig, and show short log 
> for syncconfig")
>d952cfaf0cff ("kbuild: do not suppress Kconfig prompts for silent build")
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 






Re: [PATCH] [v2] x86: xen: add missing prototypes

2023-05-24 Thread Arnd Bergmann
On Wed, May 24, 2023, at 01:09, Boris Ostrovsky wrote:
> On 5/23/23 4:37 PM, Arnd Bergmann wrote:
>> On Sat, May 20, 2023, at 00:24, Boris Ostrovsky wrote:
>>> On 5/19/23 5:28 AM, Arnd Bergmann wrote:
>>
>> Not sure if there is much point for the second one, since
>> it's only called from assembler, so the #else path is
>> never seen, but I can do that for consistency if you
>> like.
>> 
>> I generally prefer to avoid the extra #if checks
>> when there is no strict need for an empty stub.
>
> Do we need the empty stubs? Neither of these should be referenced if 
> !SMP (or more precisely if !CONFIG_XEN_PV_SMP)

We don't need the prototypes at all for building, they
are only there to avoid the missing-prototype warning!

I added the stubs in v3 because you asked for an #ifdef,
and having an #ifdef without an else clause seemed even
weirder: that would only add complexity for something that
is already unused while making it harder to maintain or
use if an actual user comes up.

I'll let someone else figure out what you actually want
here.

 Arnd



Re: [XEN PATCH 02/15] build: rework asm-offsets.* build step to use kbuild

2023-05-24 Thread Luca Fancellu


> On 24 May 2023, at 09:01, Jan Beulich  wrote:
> 
> On 24.05.2023 09:39, Luca Fancellu wrote:
>>> On 23 May 2023, at 17:37, Anthony PERARD  wrote:
>>> Instead of having a special $(cmd_asm-offsets.s) command, we could
>>> probably reuse $(cmd_cc_s_c) from Rules.mk, but that would mean that
>>> an hypothetical additional flags "-flto" in CFLAGS would not be
>>> removed anymore, not sure if that matter here.
>>> 
>>> But then we could write this:
>>> 
>>> targets += arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s
>>> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s: CFLAGS-y += -g0
>>> arch/$(TARGET_ARCH)/include/asm/asm-offsets.h: 
>>> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s FORCE
>>> 
>>> instead of having to write a rule for asm-offsets.s
>> 
>> The solution above seems clean, maybe I am wrong but -flto should not matter 
>> here as we are
>> not building objects to include in the final build, isn’t it? And gcc 
>> documentation states just:
>> 
>> “It is recommended that you compile all the files participating in the same 
>> link with the same
>> options and also specify those options at link time."
>> 
>> I’ve also tested this patch and it works fine, I have to say however that I 
>> preferred
>> a more verbose output, so that people can check how we are invoking the 
>> compiler,
>> but I guess now it’s more consistent with the other invocations that doesn’t 
>> print
>> the compiler invocation.
> 
> If you want it more verbose, you can pass V=1 on the make command line.
> (Of course that'll affect all commands' output.)

Yes I have to say that after sending the mail, I’ve checked the Makefile and 
I’ve found the comment

# To put more focus on warnings, be less verbose as default
# Use 'make V=1' to see the full commands

Thank you for pointing that out


> 
> Jan



Re: [XEN PATCH 08/15] build: use $(filechk, ) for all compat/.xlat/%.lst

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> Make use of filechk mean that we don't have to use
> $(move-if-changed,). It also mean that will have sometime "UPD .." in
> the build output when the target changed, rather than having "GEN ..."
> all the time when "xlat.lst" happen to have a more recent modification
> timestamp.
> 
> While there, replace `grep -v` by `sed '//d'` to avoid an extra
> fork and pipe when building.
> 
> Signed-off-by: Anthony PERARD 
> ---


Seems good to me!

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 






Re: [PATCH v2 3/6] block/blkio: convert to blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:12:57PM -0400, Stefan Hajnoczi wrote:

Stop using the .bdrv_co_io_plug() API because it is not multi-queue
block layer friendly. Use the new blk_io_plug_call() API to batch I/O
submission instead.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
v2
- Add missing #include and fix blkio_unplug_fn() prototype [Stefano]
---
block/blkio.c | 43 ---
1 file changed, 24 insertions(+), 19 deletions(-)


Reviewed-by: Stefano Garzarella 




Re: [PATCH v2 2/6] block/nvme: convert to blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:12:56PM -0400, Stefan Hajnoczi wrote:

Stop using the .bdrv_co_io_plug() API because it is not multi-queue
block layer friendly. Use the new blk_io_plug_call() API to batch I/O
submission instead.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
v2
- Remove unused nvme_process_completion_queue_plugged trace event
 [Stefano]
---
block/nvme.c   | 44 
block/trace-events |  1 -
2 files changed, 12 insertions(+), 33 deletions(-)


Reviewed-by: Stefano Garzarella 




Re: [PATCH v2 1/6] block: add blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 01:12:55PM -0400, Stefan Hajnoczi wrote:

Introduce a new API for thread-local blk_io_plug() that does not
traverse the block graph. The goal is to make blk_io_plug() multi-queue
friendly.

Instead of having block drivers track whether or not we're in a plugged
section, provide an API that allows them to defer a function call until
we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
called multiple times with the same fn/opaque pair, then fn() is only
called once at the end of the function - resulting in batching.

This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
because the plug state is now thread-local.

Later patches convert block drivers to blk_io_plug_call() and then we
can finally remove .bdrv_co_io_plug() once all block drivers have been
converted.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Eric Blake 
---
v2
- "is not be freed" -> "is not freed" [Eric]
---
MAINTAINERS   |   1 +
include/sysemu/block-backend-io.h |  13 +--
block/block-backend.c |  22 -
block/plug.c  | 159 ++
hw/block/dataplane/xen-block.c|   8 +-
hw/block/virtio-blk.c |   4 +-
hw/scsi/virtio-scsi.c |   6 +-
block/meson.build |   1 +
8 files changed, 173 insertions(+), 41 deletions(-)
create mode 100644 block/plug.c


Reviewed-by: Stefano Garzarella 




Re: [XEN PATCH 07/15] build: move XEN_HAS_BUILD_ID out of Config.mk

2023-05-24 Thread Luca Fancellu



> On 23 May 2023, at 17:38, Anthony PERARD  wrote:
> 
> Whether or not the linker can do build id is only used by the
> hypervisor build, so move that there.
> 
> Rename $(build_id_linker) to $(XEN_LDFLAGS_BUILD_ID) as this is a
> better name to be exported as to use the "XEN_*" namespace.
> 
> Also update XEN_TREEWIDE_CFLAGS so flags can be used for
> arch/x86/boot/ CFLAGS_x86_32
> 
> Beside a reordering of the command line where CFLAGS is used, there
> shouldn't be any other changes.
> 
> Signed-off-by: Anthony PERARD 


Seems good to me!

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 





Re: [PATCH 1/6] block: add blk_io_plug_call() API

2023-05-24 Thread Stefano Garzarella

On Tue, May 23, 2023 at 11:47:08AM -0400, Stefan Hajnoczi wrote:

On Fri, May 19, 2023 at 10:45:57AM +0200, Stefano Garzarella wrote:

On Wed, May 17, 2023 at 06:10:17PM -0400, Stefan Hajnoczi wrote:
> Introduce a new API for thread-local blk_io_plug() that does not
> traverse the block graph. The goal is to make blk_io_plug() multi-queue
> friendly.
>
> Instead of having block drivers track whether or not we're in a plugged
> section, provide an API that allows them to defer a function call until
> we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
> called multiple times with the same fn/opaque pair, then fn() is only
> called once at the end of the function - resulting in batching.
>
> This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
> blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
> because the plug state is now thread-local.
>
> Later patches convert block drivers to blk_io_plug_call() and then we
> can finally remove .bdrv_co_io_plug() once all block drivers have been
> converted.
>
> Signed-off-by: Stefan Hajnoczi 
> ---
> MAINTAINERS   |   1 +
> include/sysemu/block-backend-io.h |  13 +--
> block/block-backend.c |  22 -
> block/plug.c  | 159 ++
> hw/block/dataplane/xen-block.c|   8 +-
> hw/block/virtio-blk.c |   4 +-
> hw/scsi/virtio-scsi.c |   6 +-
> block/meson.build |   1 +
> 8 files changed, 173 insertions(+), 41 deletions(-)
> create mode 100644 block/plug.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 50585117a0..574202295c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2644,6 +2644,7 @@ F: util/aio-*.c
> F: util/aio-*.h
> F: util/fdmon-*.c
> F: block/io.c
> +F: block/plug.c
> F: migration/block*
> F: include/block/aio.h
> F: include/block/aio-wait.h
> diff --git a/include/sysemu/block-backend-io.h 
b/include/sysemu/block-backend-io.h
> index d62a7ee773..be4dcef59d 100644
> --- a/include/sysemu/block-backend-io.h
> +++ b/include/sysemu/block-backend-io.h
> @@ -100,16 +100,9 @@ void blk_iostatus_set_err(BlockBackend *blk, int error);
> int blk_get_max_iov(BlockBackend *blk);
> int blk_get_max_hw_iov(BlockBackend *blk);
>
> -/*
> - * blk_io_plug/unplug are thread-local operations. This means that multiple
> - * IOThreads can simultaneously call plug/unplug, but the caller must ensure
> - * that each unplug() is called in the same IOThread of the matching plug().
> - */
> -void coroutine_fn blk_co_io_plug(BlockBackend *blk);
> -void co_wrapper blk_io_plug(BlockBackend *blk);
> -
> -void coroutine_fn blk_co_io_unplug(BlockBackend *blk);
> -void co_wrapper blk_io_unplug(BlockBackend *blk);
> +void blk_io_plug(void);
> +void blk_io_unplug(void);
> +void blk_io_plug_call(void (*fn)(void *), void *opaque);
>
> AioContext *blk_get_aio_context(BlockBackend *blk);
> BlockAcctStats *blk_get_stats(BlockBackend *blk);
> diff --git a/block/block-backend.c b/block/block-backend.c
> index ca537cd0ad..1f1d226ba6 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -2568,28 +2568,6 @@ void blk_add_insert_bs_notifier(BlockBackend *blk, 
Notifier *notify)
> notifier_list_add(>insert_bs_notifiers, notify);
> }
>
> -void coroutine_fn blk_co_io_plug(BlockBackend *blk)
> -{
> -BlockDriverState *bs = blk_bs(blk);
> -IO_CODE();
> -GRAPH_RDLOCK_GUARD();
> -
> -if (bs) {
> -bdrv_co_io_plug(bs);
> -}
> -}
> -
> -void coroutine_fn blk_co_io_unplug(BlockBackend *blk)
> -{
> -BlockDriverState *bs = blk_bs(blk);
> -IO_CODE();
> -GRAPH_RDLOCK_GUARD();
> -
> -if (bs) {
> -bdrv_co_io_unplug(bs);
> -}
> -}
> -
> BlockAcctStats *blk_get_stats(BlockBackend *blk)
> {
> IO_CODE();
> diff --git a/block/plug.c b/block/plug.c
> new file mode 100644
> index 00..6738a568ba
> --- /dev/null
> +++ b/block/plug.c
> @@ -0,0 +1,159 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Block I/O plugging
> + *
> + * Copyright Red Hat.
> + *
> + * This API defers a function call within a blk_io_plug()/blk_io_unplug()
> + * section, allowing multiple calls to batch up. This is a performance
> + * optimization that is used in the block layer to submit several I/O 
requests
> + * at once instead of individually:
> + *
> + *   blk_io_plug(); <-- start of plugged region
> + *   ...
> + *   blk_io_plug_call(my_func, my_obj); <-- deferred my_func(my_obj) call
> + *   blk_io_plug_call(my_func, my_obj); <-- another
> + *   blk_io_plug_call(my_func, my_obj); <-- another
> + *   ...
> + *   blk_io_unplug(); <-- end of plugged region, my_func(my_obj) is called 
once
> + *
> + * This code is actually generic and not tied to the block layer. If another
> + * subsystem needs this functionality, it could be renamed.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/coroutine-tls.h"
> +#include "qemu/notify.h"
> +#include "qemu/thread.h"
> 

Re: [XEN PATCH 02/15] build: rework asm-offsets.* build step to use kbuild

2023-05-24 Thread Jan Beulich
On 24.05.2023 09:39, Luca Fancellu wrote:
>> On 23 May 2023, at 17:37, Anthony PERARD  wrote:
>> Instead of having a special $(cmd_asm-offsets.s) command, we could
>> probably reuse $(cmd_cc_s_c) from Rules.mk, but that would mean that
>> an hypothetical additional flags "-flto" in CFLAGS would not be
>> removed anymore, not sure if that matter here.
>>
>> But then we could write this:
>>
>> targets += arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s
>> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s: CFLAGS-y += -g0
>> arch/$(TARGET_ARCH)/include/asm/asm-offsets.h: 
>> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s FORCE
>>
>> instead of having to write a rule for asm-offsets.s
> 
> The solution above seems clean, maybe I am wrong but -flto should not matter 
> here as we are
> not building objects to include in the final build, isn’t it? And gcc 
> documentation states just:
> 
> “It is recommended that you compile all the files participating in the same 
> link with the same
> options and also specify those options at link time."
> 
> I’ve also tested this patch and it works fine, I have to say however that I 
> preferred
> a more verbose output, so that people can check how we are invoking the 
> compiler,
> but I guess now it’s more consistent with the other invocations that doesn’t 
> print
> the compiler invocation.

If you want it more verbose, you can pass V=1 on the make command line.
(Of course that'll affect all commands' output.)

Jan



Re: [PATCH v3 3/6] iommu/arm: Introduce iommu_add_dt_pci_sideband_ids API

2023-05-24 Thread Michal Orzel
Hi Stewart,

On 18/05/2023 23:06, Stewart Hildebrand wrote:
> 
> 
> From: Oleksandr Tyshchenko 
> 
> The main purpose of this patch is to add a way to register PCI device
> (which is behind the IOMMU) using the generic PCI-IOMMU DT bindings [1]
> before assigning that device to a domain.
> 
> This behaves in almost the same way as existing iommu_add_dt_device API,
> the difference is in devices to handle and DT bindings to use.
> 
> The function of_map_id to translate an ID through a downstream mapping
> (which is also suitable for mapping Requester ID) was borrowed from Linux
> (v5.10-rc6) and updated according to the Xen code base.
> 
> XXX: I don't port pci_for_each_dma_alias from Linux which is a part
> of PCI-IOMMU bindings infrastucture as I don't have a good understanding
> for how it is expected to work in Xen environment.
> Also it is not completely clear whether we need to distinguish between
> different PCI types here (DEV_TYPE_PCI, DEV_TYPE_PCI_HOST_BRIDGE, etc).
> For example, how we should behave here if the host bridge doesn't have
> a stream ID (so not described in iommu-map property) just simple
> fail or bypasses translation?
> 
> [1] 
> https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/pci-iommu.txt
> 
> Signed-off-by: Oleksandr Tyshchenko 
> Signed-off-by: Stewart Hildebrand 
> ---
> v2->v3:
> * new patch title (was: iommu/arm: Introduce iommu_add_dt_pci_device API)
> * renamed function
>   from: iommu_add_dt_pci_device
>   to: iommu_add_dt_pci_sideband_ids
> * removed stale ops->add_device check
> * iommu.h: add empty stub iommu_add_dt_pci_sideband_ids for !HAS_DEVICE_TREE
> * iommu.h: add iommu_add_pci_sideband_ids helper
> * iommu.h: don't wrap prototype in #ifdef CONFIG_HAS_PCI
> * s/iommu_fwspec_free(pci_to_dev(pdev))/iommu_fwspec_free(dev)/
> 
> v1->v2:
> * remove extra devfn parameter since pdev fully describes the device
> * remove ops->add_device() call from iommu_add_dt_pci_device(). Instead, rely 
> on
>   the existing iommu call in iommu_add_device().
> * move the ops->add_device and ops->dt_xlate checks earlier
> 
> downstream->v1:
> * rebase
> * add const qualifier to struct dt_device_node *np arg in dt_map_id()
> * add const qualifier to struct dt_device_node *np declaration in 
> iommu_add_pci_device()
> * use stdint.h types instead of u8/u32/etc...
> * rename functions:
>   s/dt_iommu_xlate/iommu_dt_xlate/
>   s/dt_map_id/iommu_dt_pci_map_id/
>   s/iommu_add_pci_device/iommu_add_dt_pci_device/
> * add device_is_protected check in iommu_add_dt_pci_device
> * wrap prototypes in CONFIG_HAS_PCI
> 
> (cherry picked from commit 734e3bf6ee77e7947667ab8fa96c25b349c2e1da from
>  the downstream branch poc/pci-passthrough from
>  https://gitlab.com/xen-project/people/bmarquis/xen-arm-poc.git)
> ---
>  xen/drivers/passthrough/device_tree.c | 140 ++
>  xen/include/xen/device_tree.h |  25 +
>  xen/include/xen/iommu.h   |  17 +++-
>  3 files changed, 181 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/passthrough/device_tree.c 
> b/xen/drivers/passthrough/device_tree.c
> index 1b50f4670944..d568166e19ec 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -151,6 +151,146 @@ static int iommu_dt_xlate(struct device *dev,
>  return ops->dt_xlate(dev, iommu_spec);
>  }
> 
> +#ifdef CONFIG_HAS_PCI
> +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id,
> +const char *map_name, const char *map_mask_name,
> +struct dt_device_node **target, uint32_t *id_out)
> +{
> +uint32_t map_mask, masked_id, map_len;
> +const __be32 *map = NULL;
> +
> +if ( !np || !map_name || (!target && !id_out) )
> +return -EINVAL;
> +
> +map = dt_get_property(np, map_name, _len);
> +if ( !map )
> +{
> +if ( target )
> +return -ENODEV;
empty line here

> +/* Otherwise, no map implies no translation */
> +*id_out = id;
> +return 0;
> +}
> +
> +if ( !map_len || map_len % (4 * sizeof(*map)) )
could you enclose the second expression in parantheses?

> +{
> +printk(XENLOG_ERR "%pOF: Error: Bad %s length: %d\n", np,
%pOF is a Linux special printk format to print full name of the node.
We do not have this in Xen (see printk-formats.txt). If you want to achieve the 
same, use np->full_name.
This applies to all the uses below.
Also, use %u for map_len as it is unsigned.

> +map_name, map_len);
incorrect alignment

> +return -EINVAL;
> +}
> +
> +/* The default is to select all bits. */
> +map_mask = 0x;
> +
> +/*
> + * Can be overridden by "{iommu,msi}-map-mask" property.
> + * If of_property_read_u32() fails, the default is used.
s/of_property_read_u32/dt_property_read_u32

> + */
> +if ( map_mask_name )
> +dt_property_read_u32(np, map_mask_name, _mask);
> +
> +masked_id 

Re: [XEN PATCH 02/15] build: rework asm-offsets.* build step to use kbuild

2023-05-24 Thread Luca Fancellu


> On 23 May 2023, at 17:37, Anthony PERARD  wrote:
> 
> Use $(if_changed_dep, ) macro to generate "asm-offsets.s" and remove
> the use of $(move-if-changes,). That mean that "asm-offset.s" will be
> changed even when the output doesn't change.
> 
> But "asm-offsets.s" is only used to generated "asm-offsets.h". So
> instead of regenerating "asm-offsets.h" every time "asm-offsets.s"
> change, we will use "$(filechk, )" to only update the ".h" when the
> output change. Also, with "$(filechk, )", the file does get
> regenerated when the rule change in the makefile.
> 
> This changes also result in a cleaner build log.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Instead of having a special $(cmd_asm-offsets.s) command, we could
> probably reuse $(cmd_cc_s_c) from Rules.mk, but that would mean that
> an hypothetical additional flags "-flto" in CFLAGS would not be
> removed anymore, not sure if that matter here.
> 
> But then we could write this:
> 
> targets += arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s
> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s: CFLAGS-y += -g0
> arch/$(TARGET_ARCH)/include/asm/asm-offsets.h: 
> arch/$(TARGET_ARCH)/$(TARGET_SUBARCH)/asm-offsets.s FORCE
> 
> instead of having to write a rule for asm-offsets.s

The solution above seems clean, maybe I am wrong but -flto should not matter 
here as we are
not building objects to include in the final build, isn’t it? And gcc 
documentation states just:

“It is recommended that you compile all the files participating in the same 
link with the same
options and also specify those options at link time."

I’ve also tested this patch and it works fine, I have to say however that I 
preferred
a more verbose output, so that people can check how we are invoking the 
compiler,
but I guess now it’s more consistent with the other invocations that doesn’t 
print
the compiler invocation.

So if you want to proceed with this one, for me looks fine:

Reviewed-by: Luca Fancellu 
Tested-by: Luca Fancellu 




Re: [XEN PATCH 06/15] build: quiet for .allconfig.tmp target

2023-05-24 Thread Jan Beulich
On 23.05.2023 18:38, Anthony PERARD wrote:
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -339,7 +339,7 @@ filechk_kconfig_allconfig = \
>  :
>  
>  .allconfig.tmp: FORCE
> - set -e; { $(call filechk_kconfig_allconfig); } > $@
> + $(Q)set -e; { $(call filechk_kconfig_allconfig); } > $@

So this then leaves no trace of the generation of this file. This might
be okay, if only the file wasn't needlessly generated for e.g. "make
syncconfig". So I'd be okay with this complete silencing only if prior
to that (or in the same patch) the dependencies were limited to just
the *config goals which actually need the file (and the setting of
KCONFIG_ALLCONFIG).

Jan



  1   2   >