Re: [PATCH 2/5] efi: only set a console mode if the current one is invalid

2023-03-30 Thread Jan Beulich
On 30.03.2023 18:17, Roger Pau Monné wrote:
> On Thu, Mar 30, 2023 at 06:07:57PM +0200, Jan Beulich wrote:
>> On 30.03.2023 17:44, Roger Pau Monné wrote:
>>> I guess I'm slightly confused by the usage of both GOP and StdOut, I
>>> would assume if we have a gop, and can correctly initialize it there's
>>> no need to fiddle with StdOut also?
>>
>> Setting the GOP mode is done last before exiting boot services; this
>> may be a graphics mode which doesn't support a text output protocol.
> 
> Right, that's what I was missing.  I assumed that all modes available
> in GOP would be compatible with the ConOut mode.
> 
> Would you be OK with leaving StdOut as-is when booted from multiboot2,
> or there's a chance of things not being properly setup?

On modern UEFI it may be unlikely, but I think it's not impossible (see
below).

> IMO it's not very friendly to change the StdOut mode if not explicitly
> requested, as in the multiboot2 case that gets setup by the
> bootloader.

May get set up, that is. If it was set up, then yes, we probably should
leave it alone unless told to use another mode. I.e. no vga= or
vga=current should minimally result in no further mode change. Aiui we
can't easily honor vga=gfx-... in that case, so leaving the mode alone
there may also be better than trying to guess a mode. The only time
where I would think it would be nice to switch by default even in the
xen.gz case is if the boot loader handed us the screen in some text
mode.

Jan



[xen-unstable test] 180073: regressions - trouble: blocked/fail/pass/starved

2023-03-30 Thread osstest service owner
flight 180073 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180073/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-arm64-pvops 6 kernel-build fail REGR. vs. 180062

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-examine  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-raw  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit1   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-credit2   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-thunderx  1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-vhd   1 build-check(1)   blocked  n/a
 test-arm64-arm64-xl-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180062
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180062
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180062
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180062
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 180062
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180062
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 180062
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180062
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180062
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  12c5eea3ca6fa2674726d62cc9f369f81861d23f
baseline version:
 xen  eef4608fe71feddb5fea86678cf3acaf84d10fd2

Last test of basis   180062  2023-03-30 05:16:10 Z1 days
Testing same since   180073  2023-03-30 20:38:51 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Christian Lindig 
  Jan Beulich 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-prev pass
 build-i386-prev  pass
 build-amd64-pvopspass
 build

[linux-linus test] 180069: regressions - trouble: fail/pass/starved

2023-03-30 Thread osstest service owner
flight 180069 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180069/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit1  17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-pair25 guest-start/debian   fail REGR. vs. 178042
 test-amd64-amd64-freebsd12-amd64 13 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-intel 14 guest-start   fail REGR. vs. 178042
 test-amd64-amd64-xl-multivcpu 17 guest-saverestore   fail REGR. vs. 178042
 test-amd64-amd64-xl-xsm  17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start   fail REGR. vs. 178042
 test-amd64-amd64-xl  18 guest-localmigrate   fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-amd 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-freebsd11-amd64 13 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-xl-shadow  22 guest-start/debian.repeat fail REGR. vs. 178042
 test-amd64-amd64-libvirt-xsm 14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair 25 guest-start/debian  fail REGR. vs. 178042
 test-amd64-coresched-amd64-xl 19 guest-saverestore.2 fail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 17 guest-saverestore.2 fail REGR. 
vs. 178042
 test-arm64-arm64-xl-thunderx 14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-credit1  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-xsm  17 guest-stop   fail REGR. vs. 178042
 test-arm64-arm64-libvirt-xsm 14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-credit2  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl  14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 18 guest-localmigrate/x10 fail 
REGR. vs. 178042
 test-amd64-amd64-xl-credit2  17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-qemuu-nested-intel 12 debian-hvm-install fail REGR. vs. 178042
 test-amd64-amd64-libvirt 14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 12 debian-hvm-install fail 
REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ovmf-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 12 debian-hvm-install fail 
REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-qemuu-nested-amd 12 debian-hvm-install  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 debian-hvm-install 
fail REGR. vs. 178042
 test-amd64-amd64-xl-pvshim   14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-vhd  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-pygrub  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-libvirt-raw 12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-libvirt-qcow2 12 debian-di-install  fail REGR. vs. 178042
 test-arm64-arm64-xl-vhd  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install 
fail REGR. vs. 178042
 test-arm64-arm64-libvirt-raw 12 debian-di-installfail REGR. vs. 178042

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds22 guest-start/debian.repeat fail REGR. vs. 178042

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 178042
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1) 

[linux-5.4 test] 180066: tolerable trouble: fail/pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180066 linux-5.4 real [real]
flight 180079 linux-5.4 real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180066/
http://logs.test-lab.xenproject.org/osstest/logs/180079/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-examine-bios  6 xen-install fail pass in 180079-retest

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 179870
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 179870
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 179870
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 179870
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 179870
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 179870
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 179870
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 179870
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 179870
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 linux09b1a76e7879184fb35d71a221cae9451b895fff
baseline version:
 linux6849d8c4a61a93bb3abf2f65c84ec1ebfa9a9fb6

Last test of basis   179870  2023-03-22 12:45:03 Z8 days
Testing same since   180066  2023-03-30 13:12:10 Z0 days1 attempts


People who touched revisions under test:
  Greg Kroah-Hartman 
  Rishabh Bhatnagar 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf 

[ovmf test] 180074: all pass - PUSHED

2023-03-30 Thread osstest service owner
flight 180074 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180074/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf b08a19eae28e76fb5a296a604c27d06fab29b08a
baseline version:
 ovmf e9e61671237e10f7d7672ff74e8907941a954c1d

Last test of basis   180070  2023-03-30 17:40:41 Z0 days
Testing same since   180074  2023-03-30 20:41:05 Z0 days1 attempts


People who touched revisions under test:
  Ard Biesheuvel 
  Michael D Kinney 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   e9e6167123..b08a19eae2  b08a19eae28e76fb5a296a604c27d06fab29b08a -> 
xen-tested-master



[xen-unstable-smoke test] 180076: tolerable trouble: pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180076 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180076/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  64c21916167e6269280929fab1202537b54b13cf
baseline version:
 xen  12c5eea3ca6fa2674726d62cc9f369f81861d23f

Last test of basis   180068  2023-03-30 16:01:56 Z0 days
Testing same since   180072  2023-03-30 20:00:24 Z0 days2 attempts


People who touched revisions under test:
  Ayan Kumar Halder 
  Julien Grall 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   12c5eea3ca..64c2191616  64c21916167e6269280929fab1202537b54b13cf -> smoke



[xen-unstable-smoke test] 180072: regressions - trouble: fail/pass/starved

2023-03-30 Thread osstest service owner
flight 180072 xen-unstable-smoke real [real]
flight 180075 xen-unstable-smoke real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180072/
http://logs.test-lab.xenproject.org/osstest/logs/180075/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-libvirt  8 xen-boot fail REGR. vs. 180068

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  64c21916167e6269280929fab1202537b54b13cf
baseline version:
 xen  12c5eea3ca6fa2674726d62cc9f369f81861d23f

Last test of basis   180068  2023-03-30 16:01:56 Z0 days
Testing same since   180072  2023-03-30 20:00:24 Z0 days1 attempts


People who touched revisions under test:
  Ayan Kumar Halder 
  Julien Grall 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit 64c21916167e6269280929fab1202537b54b13cf
Author: Ayan Kumar Halder 
Date:   Tue Mar 21 14:03:47 2023 +

xen/arm: Use the correct format specifier

1. One should use 'PRIpaddr' to display 'paddr_t' variables. However,
while creating nodes in fdt, the address (if present in the node name)
should be represented using 'PRIx64'. This is to be in conformance
with the following rule present in https://elinux.org/Device_Tree_Linux

. node names
"unit-address does not have leading zeros"

As 'PRIpaddr' introduces leading zeros, we cannot use it.

So, we have introduced a wrapper ie domain_fdt_begin_node() which will
represent physical address using 'PRIx64'.

2. One should use 'PRIx64' to display 'u64' in hex format. The current
use of 'PRIpaddr' for printing PTE is buggy as this is not a physical
address.

Signed-off-by: Ayan Kumar Halder 
Reviewed-by: Stefano Stabellini 
Acked-by: Julien Grall 
(qemu changes not included)



Re: [XEN v4 10/11] xen/arm: p2m: Use the pa_range_info table to support Arm_32 and Arm_64

2023-03-30 Thread Julien Grall

Hi,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

  /* Choose suitable "pa_range" according to the resulted "p2m_ipa_bits". */
  for ( i = 0; i < ARRAY_SIZE(pa_range_info); i++ )
@@ -2324,14 +2323,13 @@ void __init setup_virt_paging(void)
  if ( pa_range >= ARRAY_SIZE(pa_range_info) || 
!pa_range_info[pa_range].pabits )
  panic("Unknown encoding of ID_AA64MMFR0_EL1.PARange %x\n", pa_range);
  
-val |= VTCR_PS(pa_range);

+#ifdef CONFIG_ARM_64
  val |= VTCR_TG0_4K;
+val |= VTCR_PS(pa_range);
  
  /* Set the VS bit only if 16 bit VMID is supported. */

  if ( MAX_VMID == MAX_VMID_16_BIT )
  val |= VTCR_VS;
-val |= VTCR_SL0(pa_range_info[pa_range].sl0);
-val |= VTCR_T0SZ(pa_range_info[pa_range].t0sz);
  
  p2m_root_order = pa_range_info[pa_range].root_order;

  p2m_root_level = 2 - pa_range_info[pa_range].sl0;


As mentioned in the next patch, this now wants to be outside of the 
ARM_64 specific section because at least the root order will be 
different. The root level is the same, but I think you would be better 
if this is moved out at the same time.


Cheers,

--
Julien Grall



Re: [XEN v4 11/11] xen/arm: p2m: Enable support for 32bit IPA for ARM_32

2023-03-30 Thread Julien Grall

Hi Ayan,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

The pabits, t0sz, root_order and sl0 values are the same as those for
ARM_64.


To me this read as the line should be common. But you still duplicate it.

In any case, you should justify this change with a pointer to the Arm 
Arm. Not just saying they are common.




Signed-off-by: Ayan Kumar Halder 
---

Changes from -

v1 - New patch.

v2 - 1. Added Ack.

v3 - 1. Dropped Ack.
2. Rebased the patch based on the previous change.

  xen/arch/arm/p2m.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f34b6e6f11..20beecc6e8 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -2272,8 +2272,9 @@ void __init setup_virt_paging(void)
  unsigned int sl0;/* Desired SL0, maximum in comment */
  } pa_range_info[] __initconst = {
  #ifdef CONFIG_ARM_32
-[0] = { 40,  24/*24*/,  1,  1 },
-[1] = { 0 } /* Invalid */
+[0] = { 32,  32/*32*/,  0,  1 },


As I pointed out in one of the previous version, the root order is 
different than ...



+[1] = { 40,  24/*24*/,  1,  1 },


... here. Yet, you still keep P2M_ROOT_ORDER and P2M_ROOT_LEVEL 
hardcoded. Your previous patch wants to define p2M_root_order and 
p2m_root_level (lower-case intended). IOW making more code common 
between arm64 and arm32.


Cheers,

--
Julien Grall



Re: [XEN v4 10/11] xen/arm: p2m: Use the pa_range_info table to support Arm_32 and Arm_64

2023-03-30 Thread Julien Grall

Hi Ayan:

Title: Shouldn't you remove the _ or use uppercase?

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

Restructure the code so that one can use pa_range_info[] table for both
ARM_32 as well as ARM_64. >
Signed-off-by: Ayan Kumar Halder 
---
Changes from -

v3 - 1. New patch introduced in v4.
2. Restructure the code such that pa_range_info[] is used both by ARM_32 as
well as ARM_64.

  xen/arch/arm/p2m.c | 28 +++-
  1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 948f199d84..f34b6e6f11 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -2265,22 +2265,16 @@ void __init setup_virt_paging(void)
  /* Setup Stage 2 address translation */
  register_t val = VTCR_RES1|VTCR_SH0_IS|VTCR_ORGN0_WBWA|VTCR_IRGN0_WBWA;
  
-#ifdef CONFIG_ARM_32

-if ( p2m_ipa_bits < 40 )
-panic("P2M: Not able to support %u-bit IPA at the moment\n",
-  p2m_ipa_bits);
-
-printk("P2M: 40-bit IPA\n");
-p2m_ipa_bits = 40;
-val |= VTCR_T0SZ(0x18); /* 40 bit IPA */
-val |= VTCR_SL0(0x1); /* P2M starts at first level */
-#else /* CONFIG_ARM_64 */
  static const struct {
  unsigned int pabits; /* Physical Address Size */
  unsigned int t0sz;   /* Desired T0SZ, minimum in comment */
  unsigned int root_order; /* Page order of the root of the p2m */
  unsigned int sl0;/* Desired SL0, maximum in comment */
  } pa_range_info[] __initconst = {
+#ifdef CONFIG_ARM_32
+[0] = { 40,  24/*24*/,  1,  1 },
+[1] = { 0 } /* Invalid */
+#else
  /* T0SZ minimum and SL0 maximum from ARM DDI 0487H.a Table D5-6 */
  /*  PA size, t0sz(min), root-order, sl0(max) */
  [0] = { 32,  32/*32*/,  0,  1 },
@@ -2291,11 +2285,13 @@ void __init setup_virt_paging(void)
  [5] = { 48,  16/*16*/,  0,  2 },
  [6] = { 52,  12/*12*/,  4,  2 },
  [7] = { 0 }  /* Invalid */
+#endif
  };
  
  unsigned int i;

  unsigned int pa_range = 0x10; /* Larger than any possible value */
  
+#ifdef CONFIG_ARM_64

  /*
   * Restrict "p2m_ipa_bits" if needed. As P2M table is always configured
   * with IPA bits == PA bits, compare against "pabits".
@@ -2309,6 +2305,9 @@ void __init setup_virt_paging(void)
   */
  if ( system_cpuinfo.mm64.vmid_bits == MM64_VMID_16_BITS_SUPPORT )
  max_vmid = MAX_VMID_16_BIT;
+#else
+p2m_ipa_bits = PADDR_BITS;
+#endif
  
  /* Choose suitable "pa_range" according to the resulted "p2m_ipa_bits". */

  for ( i = 0; i < ARRAY_SIZE(pa_range_info); i++ )
@@ -2324,14 +2323,13 @@ void __init setup_virt_paging(void)
  if ( pa_range >= ARRAY_SIZE(pa_range_info) || 
!pa_range_info[pa_range].pabits )
  panic("Unknown encoding of ID_AA64MMFR0_EL1.PARange %x\n", pa_range);
  
-val |= VTCR_PS(pa_range);

+#ifdef CONFIG_ARM_64
  val |= VTCR_TG0_4K;
+val |= VTCR_PS(pa_range);

Why is this moved rather than adding #ifdef a line before?

  
  /* Set the VS bit only if 16 bit VMID is supported. */

  if ( MAX_VMID == MAX_VMID_16_BIT )
  val |= VTCR_VS;
-val |= VTCR_SL0(pa_range_info[pa_range].sl0);
-val |= VTCR_T0SZ(pa_range_info[pa_range].t0sz);
  
  p2m_root_order = pa_range_info[pa_range].root_order;

  p2m_root_level = 2 - pa_range_info[pa_range].sl0;
@@ -2342,6 +2340,10 @@ void __init setup_virt_paging(void)
 pa_range_info[pa_range].pabits,
 ( MAX_VMID == MAX_VMID_16_BIT ) ? 16 : 8);
  #endif
+
+val |= VTCR_SL0(pa_range_info[pa_range].sl0);
+val |= VTCR_T0SZ(pa_range_info[pa_range].t0sz);
+
  printk("P2M: %d levels with order-%d root, VTCR 0x%"PRIregister"\n",
 4 - P2M_ROOT_LEVEL, P2M_ROOT_ORDER, val);
  


Cheers,

--
Julien Grall



Re: [XEN v4 09/11] xen/arm: Restrict zeroeth_table_offset for ARM_64

2023-03-30 Thread Julien Grall

Hi,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

When 32 bit physical addresses are used (ie ARM_PA_32=y),
"va >> ZEROETH_SHIFT" causes an overflow.
Also, there is no zeroeth level page table on Arm 32-bit.

Also took the opportunity to clean up dump_pt_walk(). One could use
DECLARE_OFFSETS() macro instead of declaring the declaring an array
of page table offsets.

Acked-by: Julien Grall 
Reviewed-by: Stefano Stabellini 
Signed-off-by: Ayan Kumar Halder 


This should be re-ordered so the signed-off-by tag is first.


---
Changes from -

v1 - Removed the duplicate declaration for DECLARE_OFFSETS.

v2 - 1. Reworded the commit message.
2. Use CONFIG_ARM_PA_32 to restrict zeroeth_table_offset.

v3 - 1. Added R-b and Ack.

  xen/arch/arm/include/asm/lpae.h | 4 
  xen/arch/arm/mm.c   | 7 +--
  2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/include/asm/lpae.h b/xen/arch/arm/include/asm/lpae.h
index 3fdd5d0de2..0d40388f93 100644
--- a/xen/arch/arm/include/asm/lpae.h
+++ b/xen/arch/arm/include/asm/lpae.h
@@ -259,7 +259,11 @@ lpae_t mfn_to_xen_entry(mfn_t mfn, unsigned int attr);
  #define first_table_offset(va)  TABLE_OFFSET(first_linear_offset(va))
  #define second_table_offset(va) TABLE_OFFSET(second_linear_offset(va))
  #define third_table_offset(va)  TABLE_OFFSET(third_linear_offset(va))
+#ifdef CONFIG_ARM_PA_BITS_32


I know I already acked this patch. However, looking at the previous 
patch, you are using CONFIG_PHYS_ADDR_T_32 but here you are using 
CONFIG_ARM_PA_BITS_32.


It is not fully clear to me why you differ in the #ifdef approach. Can 
you clarify?



+#define zeroeth_table_offset(va)  0
+#else
  #define zeroeth_table_offset(va)  TABLE_OFFSET(zeroeth_linear_offset(va))
+#endif
  
  /*

   * Macros to define page-tables:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d8b43ef38c..41e0896b0f 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -221,12 +221,7 @@ void dump_pt_walk(paddr_t ttbr, paddr_t addr,
  {
  static const char *level_strs[4] = { "0TH", "1ST", "2ND", "3RD" };
  const mfn_t root_mfn = maddr_to_mfn(ttbr);
-const unsigned int offsets[4] = {
-zeroeth_table_offset(addr),
-first_table_offset(addr),
-second_table_offset(addr),
-third_table_offset(addr)
-};
+DECLARE_OFFSETS(offsets, addr);
  lpae_t pte, *mapping;
  unsigned int level, root_table;
  


Cheers,

--
Julien Grall



Re: [XEN v4 06/11] xen/arm: smmu: Use writeq_relaxed_non_atomic() for writing to SMMU_CBn_TTBR0

2023-03-30 Thread Julien Grall

Hi Ayan,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

Refer ARM IHI 0062D.c ID070116 (SMMU 2.0 spec), 17-360, 17.3.9,
SMMU_CBn_TTBR0 is a 64 bit register. Thus, one can use
writeq_relaxed_non_atomic() to write to it instead of invoking
writel_relaxed() twice for lower half and upper half of the register.

This also helps us as p2maddr is 'paddr_t' (which may be u32 in future).
Thus, one can assign p2maddr to a 64 bit register and do the bit
manipulations on it, to generate the value for SMMU_CBn_TTBR0.

Reviewed-by: Stefano Stabellini 
Signed-off-by: Ayan Kumar Halder 


The tags should be ordered in a timeline. So Signed-off-by should be first.

I am happy to do it on commit if you can confirm that this patch doesn't 
dependent on the patches before.


Cheers,

--
Julien Grall



Re: [XEN v4 05/11] xen/arm: Introduce a wrapper for dt_device_get_address() to handle paddr_t

2023-03-30 Thread Julien Grall

Hi Ayan,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

dt_device_get_address() can accept uint64_t only for address and size.
However, the address/size denotes physical addresses. Thus, they should
be represented by 'paddr_t'.
Consequently, we introduce a wrapper for dt_device_get_address() ie
dt_device_get_paddr() which accepts address/size as paddr_t and inturn
invokes dt_device_get_address() after converting address/size to
uint64_t.

The reason for introducing doing this is that in future 'paddr_t' may
be defined as uint32_t.


Technically, you will define it as 'unsigned long' after. To avoid 
relying on how paddr_t is defined, I would suggest ot write "'paddr_t' 
may not always be 64-bit" or similar.



diff --git a/xen/arch/arm/platforms/brcm-raspberry-pi.c 
b/xen/arch/arm/platforms/brcm-raspberry-pi.c
index 811b40b1a6..407ec07f63 100644
--- a/xen/arch/arm/platforms/brcm-raspberry-pi.c
+++ b/xen/arch/arm/platforms/brcm-raspberry-pi.c
@@ -64,7 +64,7 @@ static void __iomem *rpi4_map_watchdog(void)
  if ( !node )
  return NULL;
  
-ret = dt_device_get_address(node, 0, &start, &len);

+ret = dt_device_get_paddr(node, 0, &start, &len);
  if ( ret )
  {
  printk("Cannot read watchdog register address\n");
diff --git a/xen/arch/arm/platforms/brcm.c b/xen/arch/arm/platforms/brcm.c
index d481b2c60f..4310feee73 100644
--- a/xen/arch/arm/platforms/brcm.c
+++ b/xen/arch/arm/platforms/brcm.c
@@ -40,7 +40,7 @@ static __init int brcm_get_dt_node(char *compat_str,
 u32 *reg_base)
  {
  const struct dt_device_node *node;
-u64 reg_base_64;
+paddr_t reg_base_64;


'64' reads a bit odd now. I think you want to rename to reg_base_paddr.

Cheers,

--
Julien Grall



Re: [XEN v4 03/11] xen/arm: Typecast the DT values into paddr_t

2023-03-30 Thread Julien Grall

Hi Ayan,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

In future, we wish to support 32 bit physical address.
However, the current dt and fdt functions can only read u64 values.


Reading this again this is a bit misleading. At least the DT functions 
are able to deal read 32-bit or 64-bit values (this is based on the 
number of cells). The difference is that the functions are expecting 
64-bit variable and this will be incompatible with paddr_t (which after 
your patch may be 32-bit).



We wish to make the DT functions read 32bit as well 64bit values
(depending on the width of physical address). Also, we wish to detect
if any truncation has occurred (ie while reading 32bit physical
addresses from 64bit values read from DT).

device_tree_get_reg() should now be able to return paddr_t. This is
invoked by various callers to get DT address and size.

For fdt_get_mem_rsv(), we have introduced wrapper ie
fdt_get_mem_rsv_paddr() while will invoke fdt_get_mem_rsv() and
translate uint64_t to paddr_t. The reason being we cannot modify
fdt_get_mem_rsv() as it has been imported from external source.

For dt_read_number(), we have also introduced a wrapper ie


NIT: 'ie' seems to be out of place. I would drop it.


dt_read_paddr() to read physical addresses. We chose not to modify the
original function as it been used in places where it needs to


"it been used" -> "it is used" ?


specifically 64bit values from dt (For eg dt_property_read_u64()).


I can't parse the sencen after "where it needs to...".


Xen prints an error when it detects a truncation (during typecast
between uint64_t and paddr_t). It is not possible to return an error in
all scenarios. So, it is user's responsibility to check the error logs.


The last sentence is a bit odd to say. Quite likely a user will not be 
aware of this requirements. And a single line in the log is likely going 
to be difficult to spot (see more below).




Also, replaced u32/u64 with uint32_t/uint64_t in the functions touched
by the code changes.

Signed-off-by: Ayan Kumar Halder 
---

Changes from

v1 - 1. Dropped "[XEN v1 2/9] xen/arm: Define translate_dt_address_size() for the 
translation between u64 and paddr_t" and
"[XEN v1 4/9] xen/arm: Use translate_dt_address_size() to translate between device 
tree addr/size and paddr_t", instead
this approach achieves the same purpose.

2. No need to check for truncation while converting values from u64 to paddr_t.

v2 - 1. Use "( (dt_start >> (PADDR_SHIFT - 1)) > 1 )" to detect truncation.
2. Introduced libfdt_xen.h to implement fdt_get_mem_rsv_paddr
3. Logged error messages in case truncation is detected.

v3 - 1. Renamed libfdt_xen.h to libfdt-xen.h.
2. Replaced u32/u64 with uint32_t/uint64_t
3. Use "(paddr_t)val != val" to check for truncation.
4. Removed the alias "#define PADDR_SHIFT PADDR_BITS".

  xen/arch/arm/bootfdt.c  | 41 ++-
  xen/arch/arm/domain_build.c |  2 +-
  xen/arch/arm/include/asm/setup.h|  4 +--
  xen/arch/arm/setup.c| 14 
  xen/arch/arm/smpboot.c  |  2 +-
  xen/include/xen/device_tree.h   | 21 
  xen/include/xen/libfdt/libfdt-xen.h | 52 +
  7 files changed, 116 insertions(+), 20 deletions(-)
  create mode 100644 xen/include/xen/libfdt/libfdt-xen.h

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 0085c28d74..33bef1c15e 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -11,7 +11,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  #include 
@@ -52,11 +52,32 @@ static bool __init device_tree_node_compatible(const void 
*fdt, int node,
  return false;
  }
  
-void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,

-u32 size_cells, u64 *start, u64 *size)
+void __init device_tree_get_reg(const __be32 **cell, uint32_t address_cells,
+uint32_t size_cells, paddr_t *start,
+paddr_t *size)
  {
-*start = dt_next_cell(address_cells, cell);
-*size = dt_next_cell(size_cells, cell);
+uint64_t dt_start, dt_size;
+
+/*
+ * dt_next_cell will return u64 whereas paddr_t may be u64 or u32. Thus,


s/u64/uint64_t/ I would also say "whereas paddr_t may not be 64-bit" 
just to shorten the sentence.



+ * there is an implicit cast from u64 to paddr_t.


s/u64/uint64_t/


+ */
+dt_start = dt_next_cell(address_cells, cell);
+dt_size = dt_next_cell(size_cells, cell);
+
+if ( dt_start != (paddr_t)dt_start  )
+printk("Error: Physical address greater than max width supported\n");


I would add a WARN(). Same...


+
+if ( dt_size != (paddr_t)dt_size )
+printk("Error: Physical size greater than max width supported\n");


... here as this is easier to spot in the logs.


+
+/*
+ * Note: It is user's responsibility to check for the error messages.


See my remar

Re: [XEN v4 02/11] xen/arm: domain_build: Track unallocated pages using the frame number

2023-03-30 Thread Julien Grall

Hi,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

rangeset_{xxx}_range() functions are invoked with 'start' and 'size' as
arguments which are either 'uint64_t' or 'paddr_t'. However, the function
accepts 'unsigned long' for 'start' and 'size'. 'unsigned long' is 32 bits for
ARM_32. Thus, there is an implicit downcasting from 'uint64_t'/'paddr_t' to
'unsigned long' when invoking rangeset_{xxx}_range().

So, it may seem there is a possibility of lose of data due to truncation.

In reality, 'start' and 'size' are always page aligned. And ARM_32 currently
supports 40 bits as the width of physical address.
So if the addresses are page aligned, the last 12 bits contain zeroes.
Thus, we could instead pass page frame number which will contain 28 bits (40-12
on Arm_32) and this can be represented using 'unsigned long'.


Is Arm_32 meant to refer to the config or the architecture? If the 
former, then it should be ARM_32 if the latter, it should be Arm32. I 
would prefer the latter.




On Arm_64, this change will not induce any adverse side effect as the width of


Same here for Arm_64.


physical address is 48 bits. Thus, the width of 'mfn' (ie 48 - 12 = 36) can be
represented using 'unsigned long' (which is 64 bits wide).

Signed-off-by: Ayan Kumar Halder 
---
Changes from -

v3 - 1. Extracted the patch from 
https://lists.xenproject.org/archives/html/xen-devel/2023-02/msg00657.html
and added it to this series.
2. Modified add_ext_regions(). This accepts a frame number instead of physical
address.

  xen/arch/arm/domain_build.c | 27 +--
  1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 15fa88e977..24b12b7512 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1500,10 +1500,13 @@ static int __init make_resv_memory_node(const struct 
domain *d,
  return res;
  }
  
-static int __init add_ext_regions(unsigned long s, unsigned long e, void *data)

+static int __init add_ext_regions(unsigned long s_pfn, unsigned long e_pfn,


We are trying to phase out any using of 'pfn' in the code. In this case, 
this is mean (see include/xen/mm.h for more details). Here, you want to 
use 'gfn' as we are looking for space in the dom0 memory address space.



+  void *data)
  {
  struct meminfo *ext_regions = data;
  paddr_t start, size;
+paddr_t s = PFN_UP(s_pfn);
+paddr_t e = PFN_UP(e_pfn);


PFN_UP() takes a physical address in parameter and return a page frame 
number. So this is not what you want here. You want pfn_to_paddr().


The rest looks good to me.

Cheers,

--
Julien Grall



[xen-unstable test] 180062: tolerable trouble: fail/pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180062 xen-unstable real [real]
flight 180071 xen-unstable real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180062/
http://logs.test-lab.xenproject.org/osstest/logs/180071/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-libvirt-raw  12 debian-di-install   fail pass in 180071-retest
 test-amd64-amd64-xl-qcow222 guest-start.2   fail pass in 180071-retest

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt-raw 14 migrate-support-check fail in 180071 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 180051
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180051
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180051
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180051
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 180051
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 180051
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 180051
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180051
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180051
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 test-arm64-arm64-libvirt-xsm  3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl   3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl-thunderx  3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl-vhd   3 hosts-allocate   starved  n/a
 test-arm64-arm64-libvirt-raw  3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl-credit2   3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl-xsm   3 hosts-allocate   starved  n/a
 test-arm64-arm64-xl-credit1   3 hosts-allocate   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  eef4608fe71feddb5fea86678cf3acaf84d10fd2
baseline version:
 xen  b177892d2d0e8a31122c218989f43130aeba5282

Last test of basis   180051  2023-03-29 12:19:29 Z1 days
Testing same since   180062  2023-03-30 05:16:10 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Juergen Gross 
  Julien Grall 
  Michal Orzel 
  Roger Pau Monné 
  Stefano Stabellini 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64-xtf  pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-prev  

[ovmf test] 180070: all pass - PUSHED

2023-03-30 Thread osstest service owner
flight 180070 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180070/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf e9e61671237e10f7d7672ff74e8907941a954c1d
baseline version:
 ovmf 6f0c65cdb073eb8ee8c1cb6fad5d0060ec86f3cc

Last test of basis   180067  2023-03-30 14:12:20 Z0 days
Testing same since   180070  2023-03-30 17:40:41 Z0 days1 attempts


People who touched revisions under test:
  Michael D Kinney 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   6f0c65cdb0..e9e6167123  e9e61671237e10f7d7672ff74e8907941a954c1d -> 
xen-tested-master



[ANNOUNCE] Call for agenda items for 13 April Community Call @ 1600 UTC

2023-03-30 Thread George Dunlap
Hi all,

NOTE THE DATE CHANGE: The call has been pushed back by a week to avoid any
clashes with Easter holidays.


The proposed agenda is in
https://cryptpad.fr/pad/#/2/pad/edit/HtfnjfjUlr9KcUC-vP2RGLXe/ and you can
edit to add items.  Alternatively, you can reply to this mail directly.

Agenda items appreciated a few days before the call: please put your name
besides items if you edit the document.

Note the following administrative conventions for the call:
* Unless, agreed in the previous meeting otherwise, the call is on the 1st
Thursday of each month at 1600 British Time (either GMT or BST)
* I usually send out a meeting reminder a few days before with a
provisional agenda

* To allow time to switch between meetings, we'll plan on starting the
agenda at 16:05 sharp.  Aim to join by 16:03 if possible to allocate time
to sort out technical difficulties &c

* If you want to be CC'ed please add or remove yourself from the
sign-up-sheet at
https://cryptpad.fr/pad/#/2/pad/edit/D9vGzihPxxAOe6RFPz0sRCf+/

Best Regards
George


== Dial-in Information ==
## Meeting time
15:00 - 16:00 UTC
Further International meeting times:


https://www.timeanddate.com/worldclock/meetingdetails.html?year=2023&month=4&day=13&hour=15&min=0&sec=0&p1=1234&p2=37&p3=224&p4=179


## Dial in details
Web: https://meet.jit.si/XenProjectCommunityCall

Dial-in info and pin can be found here:

https://meet.jit.si/static/dialInInfo.html?room=XenProjectCommunityCall


Re: [XEN v4 01/11] xen/arm: Use the correct format specifier

2023-03-30 Thread Julien Grall

Hi,

On 21/03/2023 14:03, Ayan Kumar Halder wrote:

1. One should use 'PRIpaddr' to display 'paddr_t' variables. However,
while creating nodes in fdt, the address (if present in the node name)
should be represented using 'PRIx64'. This is to be in conformance
with the following rule present in https://elinux.org/Device_Tree_Linux

. node names
"unit-address does not have leading zeros"

As 'PRIpaddr' introduces leading zeros, we cannot use it.

So, we have introduced a wrapper ie domain_fdt_begin_node() which will
represent physical address using 'PRIx64'.

2. One should use 'PRIx64' to display 'u64' in hex format. The current
use of 'PRIpaddr' for printing PTE is buggy as this is not a physical
address.

Signed-off-by: Ayan Kumar Halder 
Reviewed-by: Stefano Stabellini 
Acked-by: Julien Grall 


I have committed this patch.

Cheers,


---

Changes from -

v3 - 1. Extracted the patch from 
https://lists.xenproject.org/archives/html/xen-devel/2023-02/msg00655.html
and added to this series.
2. No changes done.

  xen/arch/arm/domain_build.c | 64 +++--
  xen/arch/arm/gic-v2.c   |  6 ++--
  xen/arch/arm/mm.c   |  2 +-
  3 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9707eb7b1b..15fa88e977 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1288,6 +1288,39 @@ static int __init fdt_property_interrupts(const struct 
kernel_info *kinfo,
  return res;
  }
  
+/*

+ * Wrapper to convert physical address from paddr_t to uint64_t and
+ * invoke fdt_begin_node(). This is required as the physical address
+ * provided as part of node name should not contain any leading
+ * zeroes. Thus, one should use PRIx64 (instead of PRIpaddr) to append
+ * unit (which contains the physical address) with name to generate a
+ * node name.
+ */
+static int __init domain_fdt_begin_node(void *fdt, const char *name,
+uint64_t unit)
+{
+/*
+ * The size of the buffer to hold the longest possible string (i.e.
+ * interrupt-controller@ + a 64-bit number + \0).
+ */
+char buf[38];
+int ret;
+
+/* ePAPR 3.4 */
+ret = snprintf(buf, sizeof(buf), "%s@%"PRIx64, name, unit);
+
+if ( ret >= sizeof(buf) )
+{
+printk(XENLOG_ERR
+   "Insufficient buffer. Minimum size required is %d\n",
+   (ret + 1));
+
+return -FDT_ERR_TRUNCATED;
+}
+
+return fdt_begin_node(fdt, buf);
+}
+
  static int __init make_memory_node(const struct domain *d,
 void *fdt,
 int addrcells, int sizecells,
@@ -1296,8 +1329,6 @@ static int __init make_memory_node(const struct domain *d,
  unsigned int i;
  int res, reg_size = addrcells + sizecells;
  int nr_cells = 0;
-/* Placeholder for memory@ + a 64-bit number + \0 */
-char buf[24];
  __be32 reg[NR_MEM_BANKS * 4 /* Worst case addrcells + sizecells */];
  __be32 *cells;
  
@@ -1314,9 +1345,7 @@ static int __init make_memory_node(const struct domain *d,
  
  dt_dprintk("Create memory node\n");
  
-/* ePAPR 3.4 */

-snprintf(buf, sizeof(buf), "memory@%"PRIx64, mem->bank[i].start);
-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "memory", mem->bank[i].start);
  if ( res )
  return res;
  
@@ -1375,16 +1404,13 @@ static int __init make_shm_memory_node(const struct domain *d,

  {
  uint64_t start = mem->bank[i].start;
  uint64_t size = mem->bank[i].size;
-/* Placeholder for xen-shmem@ + a 64-bit number + \0 */
-char buf[27];
  const char compat[] = "xen,shared-memory-v1";
  /* Worst case addrcells + sizecells */
  __be32 reg[GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS];
  __be32 *cells;
  unsigned int len = (addrcells + sizecells) * sizeof(__be32);
  
-snprintf(buf, sizeof(buf), "xen-shmem@%"PRIx64, mem->bank[i].start);

-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "xen-shmem", mem->bank[i].start);
  if ( res )
  return res;
  
@@ -2716,12 +2742,9 @@ static int __init make_gicv2_domU_node(struct kernel_info *kinfo)

  __be32 reg[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) * 2];
  __be32 *cells;
  const struct domain *d = kinfo->d;
-/* Placeholder for interrupt-controller@ + a 64-bit number + \0 */
-char buf[38];
  
-snprintf(buf, sizeof(buf), "interrupt-controller@%"PRIx64,

- vgic_dist_base(&d->arch.vgic));
-res = fdt_begin_node(fdt, buf);
+res = domain_fdt_begin_node(fdt, "interrupt-controller",
+vgic_dist_base(&d->arch.vgic));
  if ( res )
  return res;
  
@@ -2771,14 +2794,10 @@ static int __init make_gicv3_domU_node(struct kernel_info *kinfo)

  int res = 0;

Re: [PATCH] bump default SeaBIOS version to 1.16.2

2023-03-30 Thread Julien Grall

Hi Jan,

On 30/03/2023 11:38, Jan Beulich wrote:

Signed-off-by: Jan Beulich 


FWIW:

Acked-by: Julien Grall 

Cheers,



--- a/Config.mk
+++ b/Config.mk
@@ -225,7 +225,7 @@ MINIOS_UPSTREAM_URL ?= https://xenbits.x
  MINIOS_UPSTREAM_REVISION ?= 5bcb28aaeba1c2506a82fab0cdad0201cd9b54b3
  
  SEABIOS_UPSTREAM_URL ?= https://xenbits.xen.org/git-http/seabios.git

-SEABIOS_UPSTREAM_REVISION ?= rel-1.16.1
+SEABIOS_UPSTREAM_REVISION ?= rel-1.16.2
  
  ETHERBOOT_NICS ?= rtl8139 8086100e
  


--
Julien Grall



[libvirt test] 180061: tolerable trouble: pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180061 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180061/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-qcow2 14 migrate-support-checkfail never pass
 test-arm64-arm64-libvirt-qcow2 15 saverestore-support-checkfail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 libvirt  2c6b5a84257379e516ca1999782dca88dfd8a9de
baseline version:
 libvirt  eb677e3a101107d8113feb841537425302784908

Last test of basis   180048  2023-03-29 04:20:27 Z1 days
Testing same since   180061  2023-03-30 04:20:17 Z0 days1 attempts


People who touched revisions under test:
  Anastasia Belova 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-libvirt-xsm pass
 test-arm64-arm64-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-libvirt pass
 test-arm64-arm64-libvirt pass
 test-armhf-armhf-libvirt starved 
 test-amd64-i386-libvirt  pass
 test-amd64-amd64-libvirt-pairpass
 test-amd64-i386-libvirt-pair pass
 test-arm64-arm64-libvirt-qcow2   pass
 test-armhf-armhf-libvirt-qcow2   starved 
 test-arm64-arm64-libvirt-raw pass
 test-armhf-armhf-libvirt-raw starved 
 test-amd64-i386-libvirt-raw  pass
 test-amd64-amd64-libvirt-vhd pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.x

[xen-unstable-smoke test] 180068: tolerable trouble: pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180068 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180068/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  12c5eea3ca6fa2674726d62cc9f369f81861d23f
baseline version:
 xen  f41c88a6fca59f99a2eb5e7ed3d90ab7bca08b1b

Last test of basis   180065  2023-03-30 12:00:25 Z0 days
Testing same since   180068  2023-03-30 16:01:56 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Christian Lindig 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   f41c88a6fc..12c5eea3ca  12c5eea3ca6fa2674726d62cc9f369f81861d23f -> smoke



[ovmf test] 180067: all pass - PUSHED

2023-03-30 Thread osstest service owner
flight 180067 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180067/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 6f0c65cdb073eb8ee8c1cb6fad5d0060ec86f3cc
baseline version:
 ovmf 53eb26b238541799134aa5846530485c915735da

Last test of basis   180064  2023-03-30 11:13:36 Z0 days
Testing same since   180067  2023-03-30 14:12:20 Z0 days1 attempts


People who touched revisions under test:
  Nickle Wang 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   53eb26b238..6f0c65cdb0  6f0c65cdb073eb8ee8c1cb6fad5d0060ec86f3cc -> 
xen-tested-master



[linux-linus test] 180059: regressions - trouble: fail/pass/starved

2023-03-30 Thread osstest service owner
flight 180059 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180059/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-pair25 guest-start/debian   fail REGR. vs. 178042
 test-amd64-amd64-xl-credit1  18 guest-localmigrate   fail REGR. vs. 178042
 test-amd64-amd64-freebsd12-amd64 17 guest-localmigrate   fail REGR. vs. 178042
 test-amd64-amd64-xl-shadow   17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-intel 17 guest-saverestore fail REGR. vs. 178042
 test-amd64-amd64-xl-xsm  14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl  14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-pvhv2-amd 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-freebsd11-amd64 13 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-intel 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-xl-multivcpu 20 guest-localmigrate/x10  fail REGR. vs. 178042
 test-amd64-amd64-dom0pvh-xl-amd 14 guest-start   fail REGR. vs. 178042
 test-amd64-amd64-libvirt-xsm 17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-libvirt-pair 26 guest-migrate/src_host/dst_host fail REGR. 
vs. 178042
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail REGR. 
vs. 178042
 test-arm64-arm64-xl-thunderx 14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-credit1  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-xsm  17 guest-stop   fail REGR. vs. 178042
 test-arm64-arm64-xl  14 guest-start  fail REGR. vs. 178042
 test-arm64-arm64-xl-credit2 18 guest-start/debian.repeat fail REGR. vs. 178042
 test-arm64-arm64-libvirt-xsm 18 guest-start/debian.repeat fail REGR. vs. 178042
 test-amd64-amd64-xl-credit2  17 guest-saverestorefail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm 17 guest-saverestore.2 fail REGR. 
vs. 178042
 test-amd64-amd64-qemuu-nested-intel 12 debian-hvm-install fail REGR. vs. 178042
 test-amd64-amd64-libvirt 14 guest-start  fail REGR. vs. 178042
 test-amd64-coresched-amd64-xl 14 guest-start fail REGR. vs. 178042
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 12 debian-hvm-install fail 
REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-ovmf-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow 12 debian-hvm-install fail 
REGR. vs. 178042
 test-amd64-amd64-xl-qemut-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 12 debian-hvm-install fail REGR. vs. 
178042
 test-amd64-amd64-qemuu-nested-amd 12 debian-hvm-install  fail REGR. vs. 178042
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 debian-hvm-install 
fail REGR. vs. 178042
 test-amd64-amd64-xl-pvshim   14 guest-start  fail REGR. vs. 178042
 test-amd64-amd64-xl-vhd  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-pygrub  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-libvirt-raw 12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-libvirt-qcow2 12 debian-di-install  fail REGR. vs. 178042
 test-arm64-arm64-xl-vhd  12 debian-di-installfail REGR. vs. 178042
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install 
fail REGR. vs. 178042
 test-arm64-arm64-libvirt-raw 12 debian-di-installfail REGR. vs. 178042

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds22 guest-start/debian.repeat fail REGR. vs. 178042

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 178042
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-examine  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl

Re: [PATCH v4 09/12] tools: add physinfo arch_capabilities handling for Arm

2023-03-30 Thread Anthony PERARD
On Mon, Mar 27, 2023 at 11:59:41AM +0100, Luca Fancellu wrote:
> ---
>  tools/golang/xenlight/helpers.gen.go|  2 ++
>  tools/golang/xenlight/types.gen.go  |  1 +
>  tools/include/arm-arch-capabilities.h   | 33 +

Could you move that new file into "tools/include/xen-tools/", where
"common-macros.h" is. The top-dir "tools/include" already has a mixture
of installed and internal headers, but most of them are installed. So
the fairly recent "xen-tools" dir which have been introduced to share
macros at build time seems more appropriate for another built-time
macro.

>  tools/include/xen-tools/common-macros.h |  2 ++
> 
> diff --git a/tools/include/arm-arch-capabilities.h 
> b/tools/include/arm-arch-capabilities.h
> new file mode 100644
> index ..46e876651052
> --- /dev/null
> +++ b/tools/include/arm-arch-capabilities.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef ARM_ARCH_CAPABILITIES_H
> +#define ARM_ARCH_CAPABILITIES_H
> +
> +/* Tell the Xen public headers we are a user-space tools build. */
> +#ifndef __XEN_TOOLS__
> +#define __XEN_TOOLS__ 1

Somehow, this doesn't seems appropriate in this header. This macro
should instead be set on the command line. Any reason you've added this
in the header?

> +#endif
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +static inline
> +unsigned int arch_capabilities_arm_sve(unsigned int arch_capabilities)
> +{
> +#if defined(__aarch64__)
> +unsigned int sve_vl = MASK_EXTR(arch_capabilities,
> +XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
> +
> +/* Vector length is divided by 128 before storing it in 
> arch_capabilities */
> +return sve_vl * 128U;
> +#else
> +return 0;
> +#endif
> +}
> +
> +#endif /* ARM_ARCH_CAPABILITIES_H */
> diff --git a/tools/libs/light/libxl_types.idl 
> b/tools/libs/light/libxl_types.idl
> index c10292e0d7e3..fd31dacf7d5a 100644
> --- a/tools/libs/light/libxl_types.idl
> +++ b/tools/libs/light/libxl_types.idl
> @@ -1133,6 +1133,7 @@ libxl_physinfo = Struct("physinfo", [
>  ("cap_vpmu", bool),
>  ("cap_gnttab_v1", bool),
>  ("cap_gnttab_v2", bool),
> +("arch_capabilities", uint32),

This additional field needs a new LIBXL_HAVE_ macro in "libxl.h".

>  ], dir=DIR_OUT)
>  
>  libxl_connectorinfo = Struct("connectorinfo", [
> diff --git a/tools/python/xen/lowlevel/xc/xc.c 
> b/tools/python/xen/lowlevel/xc/xc.c
> index 35901c2d63b6..254d3b5dccd2 100644
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -7,6 +7,7 @@
>  #define PY_SSIZE_T_CLEAN
>  #include 
>  #define XC_WANT_COMPAT_MAP_FOREIGN_API
> +#include 

Could you add this header ...

>  #include 
>  #include 
>  #include 
> @@ -22,8 +23,6 @@
>  #include 
>  #include 
>  
> -#include 
> -

... here, instead?

Also, I think #include common-macros, can stay.

>  /* Needed for Python versions earlier than 2.3. */
>  #ifndef PyMODINIT_FUNC
>  #define PyMODINIT_FUNC DL_EXPORT(void)
> @@ -897,7 +896,7 @@ static PyObject *pyxc_physinfo(XcObject *self)
>  if ( p != virt_caps )
>*(p-1) = '\0';
>  
> -return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s}",
> +return Py_BuildValue("{s:i,s:i,s:i,s:i,s:l,s:l,s:l,s:i,s:s,s:s,s:i}",
>  "nr_nodes", pinfo.nr_nodes,
>  "threads_per_core", pinfo.threads_per_core,
>  "cores_per_socket", pinfo.cores_per_socket,
> @@ -907,7 +906,10 @@ static PyObject *pyxc_physinfo(XcObject *self)
>  "scrub_memory", 
> pages_to_kib(pinfo.scrub_pages),
>  "cpu_khz",  pinfo.cpu_khz,
>  "hw_caps",  cpu_cap,
> -"virt_caps",virt_caps);
> +"virt_caps",virt_caps,
> +"arm_sve_vl",
> +  
> arch_capabilities_arm_sve(pinfo.arch_capabilities)

arch_capabilities_arm_sve() returns an "unsigned int", but the format is
"i", which seems to be an "int". Shouldn't the format be "I" instead?

https://docs.python.org/3/c-api/arg.html#c.Py_BuildValue

> +);
>  }
>  
>  static PyObject *pyxc_getcpuinfo(XcObject *self, PyObject *args, PyObject 
> *kwds)
> diff --git a/tools/xl/xl_info.c b/tools/xl/xl_info.c
> index 712b7638b013..bf18ba2449ef 100644
> --- a/tools/xl/xl_info.c
> +++ b/tools/xl/xl_info.c
> @@ -14,6 +14,7 @@
>  
>  #define _GNU_SOURCE
>  
> +#include 

Any reason reason to have this header first?
I feel like private headers should come after public ones, so here, this
include would be added between  and "xl.h".

>  #include 
>  #include 
>  #include 
> @@ -224,6 +225,13 @@ static void output_physinfo(void)
>   info.cap_gnttab_v2 ? " gnttab-v2" : ""
>  );
>  
> +/* Print arm SVE vector le

[PATCH v8 5/7] PCI: Allow pci_bus_for_each_resource() to take less arguments

2023-03-30 Thread Andy Shevchenko
Refactor pci_bus_for_each_resource() in the same way as it's done in
pci_dev_for_each_resource() case. This will allow to hide iterator
inside the loop, where it's not used otherwise.

No functional changes intended.

Signed-off-by: Andy Shevchenko 
Reviewed-by: Krzysztof Wilczyński 
Reviewed-by: Philippe Mathieu-Daudé 
---
 drivers/pci/bus.c  |  7 +++
 drivers/pci/hotplug/shpchp_sysfs.c |  8 
 drivers/pci/pci.c  |  3 +--
 drivers/pci/probe.c|  2 +-
 drivers/pci/setup-bus.c| 10 --
 include/linux/pci.h| 24 +++-
 6 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 549c4bd5caec..5bc81cc0a2de 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -182,13 +182,13 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, 
struct resource *res,
void *alignf_data,
struct pci_bus_region *region)
 {
-   int i, ret;
struct resource *r, avail;
resource_size_t max;
+   int ret;
 
type_mask |= IORESOURCE_TYPE_BITS;
 
-   pci_bus_for_each_resource(bus, r, i) {
+   pci_bus_for_each_resource(bus, r) {
resource_size_t min_used = min;
 
if (!r)
@@ -289,9 +289,8 @@ bool pci_bus_clip_resource(struct pci_dev *dev, int idx)
struct resource *res = &dev->resource[idx];
struct resource orig_res = *res;
struct resource *r;
-   int i;
 
-   pci_bus_for_each_resource(bus, r, i) {
+   pci_bus_for_each_resource(bus, r) {
resource_size_t start, end;
 
if (!r)
diff --git a/drivers/pci/hotplug/shpchp_sysfs.c 
b/drivers/pci/hotplug/shpchp_sysfs.c
index 64beed7a26be..01d47a42da04 100644
--- a/drivers/pci/hotplug/shpchp_sysfs.c
+++ b/drivers/pci/hotplug/shpchp_sysfs.c
@@ -24,16 +24,16 @@
 static ssize_t show_ctrl(struct device *dev, struct device_attribute *attr, 
char *buf)
 {
struct pci_dev *pdev;
-   int index, busnr;
struct resource *res;
struct pci_bus *bus;
size_t len = 0;
+   int busnr;
 
pdev = to_pci_dev(dev);
bus = pdev->subordinate;
 
len += sysfs_emit_at(buf, len, "Free resources: memory\n");
-   pci_bus_for_each_resource(bus, res, index) {
+   pci_bus_for_each_resource(bus, res) {
if (res && (res->flags & IORESOURCE_MEM) &&
!(res->flags & IORESOURCE_PREFETCH)) {
len += sysfs_emit_at(buf, len,
@@ -43,7 +43,7 @@ static ssize_t show_ctrl(struct device *dev, struct 
device_attribute *attr, char
}
}
len += sysfs_emit_at(buf, len, "Free resources: prefetchable memory\n");
-   pci_bus_for_each_resource(bus, res, index) {
+   pci_bus_for_each_resource(bus, res) {
if (res && (res->flags & IORESOURCE_MEM) &&
   (res->flags & IORESOURCE_PREFETCH)) {
len += sysfs_emit_at(buf, len,
@@ -53,7 +53,7 @@ static ssize_t show_ctrl(struct device *dev, struct 
device_attribute *attr, char
}
}
len += sysfs_emit_at(buf, len, "Free resources: IO\n");
-   pci_bus_for_each_resource(bus, res, index) {
+   pci_bus_for_each_resource(bus, res) {
if (res && (res->flags & IORESOURCE_IO)) {
len += sysfs_emit_at(buf, len,
 "start = %8.8llx, length = 
%8.8llx\n",
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 45c3bb039f21..585bb3988ddf 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -779,9 +779,8 @@ struct resource *pci_find_parent_resource(const struct 
pci_dev *dev,
 {
const struct pci_bus *bus = dev->bus;
struct resource *r;
-   int i;
 
-   pci_bus_for_each_resource(bus, r, i) {
+   pci_bus_for_each_resource(bus, r) {
if (!r)
continue;
if (resource_contains(r, res)) {
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index a3f68b6ba6ac..f8191750f6b7 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -533,7 +533,7 @@ void pci_read_bridge_bases(struct pci_bus *child)
pci_read_bridge_mmio_pref(child);
 
if (dev->transparent) {
-   pci_bus_for_each_resource(child->parent, res, i) {
+   pci_bus_for_each_resource(child->parent, res) {
if (res && res->flags) {
pci_bus_add_resource(child, res,
 PCI_SUBTRACTIVE_DECODE);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 027b985dd1ee..fdeb121e9175 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -770,9 +770,8 @@ static struct resource *find_bus_resource_of_type(struct 
pci_bus *bus,

[PATCH v8 7/7] pcmcia: Convert to use less arguments in pci_bus_for_each_resource()

2023-03-30 Thread Andy Shevchenko
The pci_bus_for_each_resource() can hide the iterator loop since
it may be not used otherwise. With this, we may drop that iterator
variable definition.

Signed-off-by: Andy Shevchenko 
Reviewed-by: Krzysztof Wilczyński 
Acked-by: Dominik Brodowski 
---
 drivers/pcmcia/rsrc_nonstatic.c | 9 +++--
 drivers/pcmcia/yenta_socket.c   | 3 +--
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/pcmcia/rsrc_nonstatic.c b/drivers/pcmcia/rsrc_nonstatic.c
index ad1141fddb4c..96264ebee46a 100644
--- a/drivers/pcmcia/rsrc_nonstatic.c
+++ b/drivers/pcmcia/rsrc_nonstatic.c
@@ -934,7 +934,7 @@ static int adjust_io(struct pcmcia_socket *s, unsigned int 
action, unsigned long
 static int nonstatic_autoadd_resources(struct pcmcia_socket *s)
 {
struct resource *res;
-   int i, done = 0;
+   int done = 0;
 
if (!s->cb_dev || !s->cb_dev->bus)
return -ENODEV;
@@ -960,12 +960,9 @@ static int nonstatic_autoadd_resources(struct 
pcmcia_socket *s)
 */
if (s->cb_dev->bus->number == 0)
return -EINVAL;
-
-   for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; i++) {
-   res = s->cb_dev->bus->resource[i];
-#else
-   pci_bus_for_each_resource(s->cb_dev->bus, res, i) {
 #endif
+
+   pci_bus_for_each_resource(s->cb_dev->bus, res) {
if (!res)
continue;
 
diff --git a/drivers/pcmcia/yenta_socket.c b/drivers/pcmcia/yenta_socket.c
index 1365eaa20ff4..fd18ab571ce8 100644
--- a/drivers/pcmcia/yenta_socket.c
+++ b/drivers/pcmcia/yenta_socket.c
@@ -673,9 +673,8 @@ static int yenta_search_res(struct yenta_socket *socket, 
struct resource *res,
u32 min)
 {
struct resource *root;
-   int i;
 
-   pci_bus_for_each_resource(socket->dev->bus, root, i) {
+   pci_bus_for_each_resource(socket->dev->bus, root) {
if (!root)
continue;
 
-- 
2.40.0.1.gaa8946217a0b




[PATCH v8 4/7] PCI: Document pci_bus_for_each_resource() to avoid confusion

2023-03-30 Thread Andy Shevchenko
There might be a confusion with the implementation of the
pci_bus_for_each_resources() due to side effect of Logical
OR. Document entire macro and explain how it works and why
the conditional needs to be like that.

Signed-off-by: Andy Shevchenko 
---
 include/linux/pci.h | 20 
 1 file changed, 20 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5cacd9e4c8cd..e3b3af606280 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1446,6 +1446,26 @@ int devm_request_pci_bus_resources(struct device *dev,
 /* Temporary until new and working PCI SBR API in place */
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
 
+/**
+ * pci_bus_for_each_resource - iterate over PCI bus resources
+ * @bus: the PCI bus
+ * @res: a varible to keep a pointer to the current resource
+ * @i: a variable to keep the index of the current resource
+ *
+ * Iterate over PCI bus resources. The first part is to go over PCI bus
+ * resource array, which has at most the %PCI_BRIDGE_RESOURCE_NUM entries.
+ * After that continue with the separate list of the additional resources,
+ * if not empty. That's why the Logical OR is being used.
+ *
+ * Possible usage:
+ *
+ * struct pci_bus *bus = ...;
+ * struct resource *res;
+ * unsigned int i;
+ *
+ * pci_bus_for_each_resource(bus, res, i)
+ * pr_info("PCI bus resource[%u]: %pR\n", i, res);
+ */
 #define pci_bus_for_each_resource(bus, res, i) \
for (i = 0; \
(res = pci_bus_resource_n(bus, i)) || i < PCI_BRIDGE_RESOURCE_NUM; \
-- 
2.40.0.1.gaa8946217a0b




[PATCH v8 6/7] EISA: Convert to use less arguments in pci_bus_for_each_resource()

2023-03-30 Thread Andy Shevchenko
The pci_bus_for_each_resource() can hide the iterator loop since
it may be not used otherwise. With this, we may drop that iterator
variable definition.

Signed-off-by: Andy Shevchenko 
Reviewed-by: Krzysztof Wilczyński 
Reviewed-by: Philippe Mathieu-Daudé 
---
 drivers/eisa/pci_eisa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/eisa/pci_eisa.c b/drivers/eisa/pci_eisa.c
index 930c2332c3c4..8173e60bb808 100644
--- a/drivers/eisa/pci_eisa.c
+++ b/drivers/eisa/pci_eisa.c
@@ -20,8 +20,8 @@ static struct eisa_root_device pci_eisa_root;
 
 static int __init pci_eisa_init(struct pci_dev *pdev)
 {
-   int rc, i;
struct resource *res, *bus_res = NULL;
+   int rc;
 
if ((rc = pci_enable_device (pdev))) {
dev_err(&pdev->dev, "Could not enable device\n");
@@ -38,7 +38,7 @@ static int __init pci_eisa_init(struct pci_dev *pdev)
 * eisa_root_register() can only deal with a single io port resource,
*  so we use the first valid io port resource.
 */
-   pci_bus_for_each_resource(pdev->bus, res, i)
+   pci_bus_for_each_resource(pdev->bus, res)
if (res && (res->flags & IORESOURCE_IO)) {
bus_res = res;
break;
-- 
2.40.0.1.gaa8946217a0b




[PATCH v8 3/7] PCI: Introduce pci_dev_for_each_resource()

2023-03-30 Thread Andy Shevchenko
From: Mika Westerberg 

Instead of open-coding it everywhere introduce a tiny helper that can be
used to iterate over each resource of a PCI device, and convert the most
obvious users into it.

While at it drop doubled empty line before pdev_sort_resources().

No functional changes intended.

Suggested-by: Andy Shevchenko 
Signed-off-by: Mika Westerberg 
Signed-off-by: Andy Shevchenko 
Reviewed-by: Krzysztof Wilczyński 
---
 .clang-format |  1 +
 arch/alpha/kernel/pci.c   |  5 ++--
 arch/arm/kernel/bios32.c  | 16 ++---
 arch/arm/mach-dove/pcie.c | 10 
 arch/arm/mach-mv78xx0/pcie.c  | 10 
 arch/arm/mach-orion5x/pci.c   | 10 
 arch/mips/pci/ops-bcm63xx.c   |  8 +++
 arch/mips/pci/pci-legacy.c|  3 +--
 arch/powerpc/kernel/pci-common.c  | 21 
 arch/powerpc/platforms/4xx/pci.c  |  8 +++
 arch/powerpc/platforms/52xx/mpc52xx_pci.c |  5 ++--
 arch/powerpc/platforms/pseries/pci.c  | 16 ++---
 arch/sh/drivers/pci/pcie-sh7786.c | 10 
 arch/sparc/kernel/leon_pci.c  |  5 ++--
 arch/sparc/kernel/pci.c   | 10 
 arch/sparc/kernel/pcic.c  |  5 ++--
 drivers/pci/remove.c  |  5 ++--
 drivers/pci/setup-bus.c   | 27 -
 drivers/pci/setup-res.c   |  4 +---
 drivers/pci/vgaarb.c  | 17 -
 drivers/pci/xen-pcifront.c|  4 +---
 drivers/pnp/quirks.c  | 29 ---
 include/linux/pci.h   | 15 
 23 files changed, 112 insertions(+), 132 deletions(-)

diff --git a/.clang-format b/.clang-format
index d988e9fa9b26..2048b0296d76 100644
--- a/.clang-format
+++ b/.clang-format
@@ -520,6 +520,7 @@ ForEachMacros:
   - 'of_property_for_each_string'
   - 'of_property_for_each_u32'
   - 'pci_bus_for_each_resource'
+  - 'pci_dev_for_each_resource'
   - 'pci_doe_for_each_off'
   - 'pcl_for_each_chunk'
   - 'pcl_for_each_segment'
diff --git a/arch/alpha/kernel/pci.c b/arch/alpha/kernel/pci.c
index 64fbfb0763b2..4458eb7f44f0 100644
--- a/arch/alpha/kernel/pci.c
+++ b/arch/alpha/kernel/pci.c
@@ -288,11 +288,10 @@ pcibios_claim_one_bus(struct pci_bus *b)
struct pci_bus *child_bus;
 
list_for_each_entry(dev, &b->devices, bus_list) {
+   struct resource *r;
int i;
 
-   for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-   struct resource *r = &dev->resource[i];
-
+   pci_dev_for_each_resource(dev, r, i) {
if (r->parent || !r->start || !r->flags)
continue;
if (pci_has_flag(PCI_PROBE_ONLY) ||
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index e7ef2b5bea9c..d334c7fb672b 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -142,15 +142,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_WINBOND2, 
PCI_DEVICE_ID_WINBOND2_89C940F,
  */
 static void pci_fixup_dec21285(struct pci_dev *dev)
 {
-   int i;
-
if (dev->devfn == 0) {
+   struct resource *r;
+
dev->class &= 0xff;
dev->class |= PCI_CLASS_BRIDGE_HOST << 8;
-   for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-   dev->resource[i].start = 0;
-   dev->resource[i].end   = 0;
-   dev->resource[i].flags = 0;
+   pci_dev_for_each_resource(dev, r) {
+   r->start = 0;
+   r->end = 0;
+   r->flags = 0;
}
}
 }
@@ -162,13 +162,11 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DEC, 
PCI_DEVICE_ID_DEC_21285, pci_fixup_d
 static void pci_fixup_ide_bases(struct pci_dev *dev)
 {
struct resource *r;
-   int i;
 
if ((dev->class >> 8) != PCI_CLASS_STORAGE_IDE)
return;
 
-   for (i = 0; i < PCI_NUM_RESOURCES; i++) {
-   r = dev->resource + i;
+   pci_dev_for_each_resource(dev, r) {
if ((r->start & ~0x80) == 0x374) {
r->start |= 2;
r->end = r->start;
diff --git a/arch/arm/mach-dove/pcie.c b/arch/arm/mach-dove/pcie.c
index 754ca381f600..3044b7e03890 100644
--- a/arch/arm/mach-dove/pcie.c
+++ b/arch/arm/mach-dove/pcie.c
@@ -142,14 +142,14 @@ static struct pci_ops pcie_ops = {
 static void rc_pci_fixup(struct pci_dev *dev)
 {
if (dev->bus->parent == NULL && dev->devfn == 0) {
-   int i;
+   struct resource *r;
 
dev->class &= 0xff;
dev->class |= PCI_CLASS_BRIDGE_HOST << 8;
-   for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
-   dev->resource[i].start = 0;
-

[PATCH v8 1/7] kernel.h: Split out COUNT_ARGS() and CONCATENATE()

2023-03-30 Thread Andy Shevchenko
kernel.h is being used as a dump for all kinds of stuff for a long time.
The COUNT_ARGS() and CONCATENATE() macros may be used in some places
without need of the full kernel.h dependency train with it.

Here is the attempt on cleaning it up by splitting out these macros().

Signed-off-by: Andy Shevchenko 
---
 include/linux/args.h   | 13 +
 include/linux/kernel.h |  8 +---
 2 files changed, 14 insertions(+), 7 deletions(-)
 create mode 100644 include/linux/args.h

diff --git a/include/linux/args.h b/include/linux/args.h
new file mode 100644
index ..16ef6fad8add
--- /dev/null
+++ b/include/linux/args.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_ARGS_H
+#define _LINUX_ARGS_H
+
+/* This counts to 12. Any more, it will return 13th argument. */
+#define __COUNT_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, 
_n, X...) _n
+#define COUNT_ARGS(X...) __COUNT_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 
2, 1, 0)
+
+#define __CONCAT(a, b) a ## b
+#define CONCATENATE(a, b) __CONCAT(a, b)
+
+#endif /* _LINUX_ARGS_H */
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 0d91e0af0125..fa675d50d7b7 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -13,6 +13,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -457,13 +458,6 @@ ftrace_vprintk(const char *fmt, va_list ap)
 static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
 #endif /* CONFIG_TRACING */
 
-/* This counts to 12. Any more, it will return 13th argument. */
-#define __COUNT_ARGS(_0, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, 
_n, X...) _n
-#define COUNT_ARGS(X...) __COUNT_ARGS(, ##X, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 
2, 1, 0)
-
-#define __CONCAT(a, b) a ## b
-#define CONCATENATE(a, b) __CONCAT(a, b)
-
 /* Rebuild everything on CONFIG_FTRACE_MCOUNT_RECORD */
 #ifdef CONFIG_FTRACE_MCOUNT_RECORD
 # define REBUILD_DUE_TO_FTRACE_MCOUNT_RECORD
-- 
2.40.0.1.gaa8946217a0b




[PATCH v8 2/7] PCI: Introduce pci_resource_n()

2023-03-30 Thread Andy Shevchenko
Introduce pci_resource_n() and replace open-coded implementations of it
in pci.h.

Signed-off-by: Andy Shevchenko 
Reviewed-by: Philippe Mathieu-Daudé 
---
 include/linux/pci.h | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index b50e5c79f7e3..aeaa95455d4c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1995,14 +1995,13 @@ int pci_iobar_pfn(struct pci_dev *pdev, int bar, struct 
vm_area_struct *vma);
  * These helpers provide future and backwards compatibility
  * for accessing popular PCI BAR info
  */
-#define pci_resource_start(dev, bar)   ((dev)->resource[(bar)].start)
-#define pci_resource_end(dev, bar) ((dev)->resource[(bar)].end)
-#define pci_resource_flags(dev, bar)   ((dev)->resource[(bar)].flags)
-#define pci_resource_len(dev,bar) \
-   ((pci_resource_end((dev), (bar)) == 0) ? 0 :\
-   \
-(pci_resource_end((dev), (bar)) -  \
- pci_resource_start((dev), (bar)) + 1))
+#define pci_resource_n(dev, bar)   (&(dev)->resource[(bar)])
+#define pci_resource_start(dev, bar)   (pci_resource_n(dev, bar)->start)
+#define pci_resource_end(dev, bar) (pci_resource_n(dev, bar)->end)
+#define pci_resource_flags(dev, bar)   (pci_resource_n(dev, bar)->flags)
+#define pci_resource_len(dev,bar)  \
+   (pci_resource_end((dev), (bar)) ?   \
+resource_size(pci_resource_n((dev), (bar))) : 0)
 
 /*
  * Similar to the helpers above, these manipulate per-pci_dev
-- 
2.40.0.1.gaa8946217a0b




[PATCH v8 0/7] Add pci_dev_for_each_resource() helper and update users

2023-03-30 Thread Andy Shevchenko
Provide two new helper macros to iterate over PCI device resources and
convert users.

Looking at it, refactor existing pci_bus_for_each_resource() and convert
users accordingly.

Note, the amount of lines grew due to the documentation update.

Changelog v8:
- fixed issue with pci_bus_for_each_resource() macro (LKP)
- due to above added a new patch to document how it works
- moved the last patch to be #2 (Philippe)
- added tags (Philippe)

Changelog v7:
- made both macros to share same name (Bjorn)
- split out the pci_resource_n() conversion (Bjorn)

Changelog v6:
- dropped unused variable in PPC code (LKP)

Changelog v5:
- renamed loop variable to minimize the clash (Keith)
- addressed smatch warning (Dan)
- addressed 0-day bot findings (LKP)

Changelog v4:
- rebased on top of v6.3-rc1
- added tag (Krzysztof)

Changelog v3:
- rebased on top of v2 by Mika, see above
- added tag to pcmcia patch (Dominik)

Changelog v2:
- refactor to have two macros
- refactor existing pci_bus_for_each_resource() in the same way and
  convert users

Andy Shevchenko (6):
  kernel.h: Split out COUNT_ARGS() and CONCATENATE()
  PCI: Introduce pci_resource_n()
  PCI: Document pci_bus_for_each_resource() to avoid confusion
  PCI: Allow pci_bus_for_each_resource() to take less arguments
  EISA: Convert to use less arguments in pci_bus_for_each_resource()
  pcmcia: Convert to use less arguments in pci_bus_for_each_resource()

Mika Westerberg (1):
  PCI: Introduce pci_dev_for_each_resource()

 .clang-format |  1 +
 arch/alpha/kernel/pci.c   |  5 +-
 arch/arm/kernel/bios32.c  | 16 +++--
 arch/arm/mach-dove/pcie.c | 10 ++--
 arch/arm/mach-mv78xx0/pcie.c  | 10 ++--
 arch/arm/mach-orion5x/pci.c   | 10 ++--
 arch/mips/pci/ops-bcm63xx.c   |  8 +--
 arch/mips/pci/pci-legacy.c|  3 +-
 arch/powerpc/kernel/pci-common.c  | 21 +++
 arch/powerpc/platforms/4xx/pci.c  |  8 +--
 arch/powerpc/platforms/52xx/mpc52xx_pci.c |  5 +-
 arch/powerpc/platforms/pseries/pci.c  | 16 ++---
 arch/sh/drivers/pci/pcie-sh7786.c | 10 ++--
 arch/sparc/kernel/leon_pci.c  |  5 +-
 arch/sparc/kernel/pci.c   | 10 ++--
 arch/sparc/kernel/pcic.c  |  5 +-
 drivers/eisa/pci_eisa.c   |  4 +-
 drivers/pci/bus.c |  7 +--
 drivers/pci/hotplug/shpchp_sysfs.c|  8 +--
 drivers/pci/pci.c |  3 +-
 drivers/pci/probe.c   |  2 +-
 drivers/pci/remove.c  |  5 +-
 drivers/pci/setup-bus.c   | 37 +---
 drivers/pci/setup-res.c   |  4 +-
 drivers/pci/vgaarb.c  | 17 ++
 drivers/pci/xen-pcifront.c|  4 +-
 drivers/pcmcia/rsrc_nonstatic.c   |  9 +--
 drivers/pcmcia/yenta_socket.c |  3 +-
 drivers/pnp/quirks.c  | 29 -
 include/linux/args.h  | 13 
 include/linux/kernel.h|  8 +--
 include/linux/pci.h   | 72 +++
 32 files changed, 190 insertions(+), 178 deletions(-)
 create mode 100644 include/linux/args.h

-- 
2.40.0.1.gaa8946217a0b




Re: Need help on a issue (Unable to schedule guest for Xen on Arm)

2023-03-30 Thread Julien Grall




On 30/03/2023 16:57, Ayan Kumar Halder wrote:

(Apologies, fixed the formatting issue)


Hi,



On 30/03/2023 16:50, Ayan Kumar Halder wrote:

Hi Xen experts,

I need some pointers on an issue I am facing.

I am running my downstream port of Xen on Cortex-R52 hardware. The 
hardware consist of two R52 cores (the second core is in lockstep 
mode). So, currently the hardware does not support SMP.


The issue is that Xen is unable to schedule a guest.


Are you sure about this? Because...

So 
leave_hypervisor_to_guest() ---> check_for_pcpu_work() and this does 
not return.


... leave_hypervisor_to_guest() indicates that a guest vCPU was 
scheduled. Before we return to the guest we always check if there is 
some softirq that need to be handled. So...




Debugging this, I see  check_for_pcpu_work() --> do_softirq() --> 
__do_softirq() --> timer_softirq_action().


.. the fact that check_for_pcpu_work() doesn't return seems to indicate 
that there is a softirq that is always pending. Can you look which one 
it is?




In timer_softirq_action(), the problem I see is that for all the 
timers, "((t = heap[1])->expires < now)" is true.


    while ( (heap_metadata(heap)->size != 0) &&
    ((t = heap[1])->expires < now) )
    {
    remove_from_heap(heap, t); -- So, this gets invoked 
for all the timers.

    execute_timer(ts, t);
    }

So, further below reprogram_timer() gets invoked with timeout = 0 and 
it disables the timer. So, timer_interrupt() is never invoked.


Which is expected because there is no timer armed, so there is no need 
to configure the physical timer.




Can someone give any pointers on what the underlying issue could be 
and how to debug further ?


See above. You could also look why there is no softtimer pending and/or 
where Xen is stuck (e.g. the PC).




I do not observe this behavior while running on R52 FVP. The 
difference is that for most of the timers "((t = heap[1])->expires < 
now)" is false, so reprogram_timer() gets invoked with non zero 
timeout and subsequently, the timer_interrupt() get invoked.

This reads as one of the following:
  1) There is a missing barrier
  2) You didn't properly configure some registers
  3) There is an HW errata (I know that some of the Cortex-A had an 
issue in when reading the Timer counter but seems unlikely here)




Also, looking at the following from xen/arch/arm/time.c.


/* Set the timer to wake us up at a particular time.
  * Timeout is a Xen system time (nanoseconds since boot); 0 disables 
the timer.
  * Returns 1 on success; 0 if the timeout is too soon or is in the 
past. */

int reprogram_timer(s_time_t timeout)
{
     uint64_t deadline;

     if ( timeout == 0 )
     {
     WRITE_SYSREG(0, CNTHP_CTL_EL2);
     return 1; <<-- Shouldn't this 
be 0 as the comment suggets ?


I think 1 is correct because we want to disable the timer so this is a 
success. 0 should be return if this was effectively a timeout.


FWIW, x86 also seems to return 1 when the timeout is 0.


     }

     deadline = ns_to_ticks(timeout) + boot_count;
     WRITE_SYSREG64(deadline, CNTHP_CVAL_EL2);
     WRITE_SYSREG(CNTx_CTL_ENABLE, CNTHP_CTL_EL2);
     isb();

     /* No need to check for timers in the past; the Generic Timer fires
  * on a signed 63-bit comparison. */
     return 1;
}


Kind regards,

Ayan






--
Julien Grall



Re: [PATCH 2/5] efi: only set a console mode if the current one is invalid

2023-03-30 Thread Roger Pau Monné
On Thu, Mar 30, 2023 at 06:07:57PM +0200, Jan Beulich wrote:
> On 30.03.2023 17:44, Roger Pau Monné wrote:
> > On Mon, Dec 05, 2022 at 03:19:13PM +0100, Jan Beulich wrote:
> >> On 23.11.2022 16:45, Roger Pau Monne wrote:
> >>> Do not unconditionally set a mode in efi_console_set_mode(), do so
> >>> only if the currently set mode is not valid.
> >>
> >> You don't say why you want to do so. Furthermore ...
> >>
> >>> --- a/xen/common/efi/boot.c
> >>> +++ b/xen/common/efi/boot.c
> >>> @@ -799,6 +799,11 @@ static void __init efi_console_set_mode(void)
> >>>  UINTN cols, rows, size;
> >>>  unsigned int best, i;
> >>>  
> >>> +/* Only set a mode if the current one is not valid. */
> >>> +if ( StdOut->QueryMode(StdOut, StdOut->Mode->Mode, &cols, &rows) ==
> >>> + EFI_SUCCESS )
> >>> +return;
> >>
> >> ... it might be okay if you put such a check in efi_multiboot2(), but
> >> the call here from efi_start() is specifically guarded by a check of
> >> whether "-basevideo" was passed to xen.efi. This _may_ not be as
> >> relevant anymore today, but it certainly was 20 years ago (recall
> >> that we've inherited this code from a much older project of ours) -
> >> at that time EFI usually started in 80x25 text mode. And I think that
> >> even today when you end up launching xen.efi from the EFI shell,
> >> you'd be stuck with 80x25 text mode on at least some implementations.
> > 
> > Won't you use console=vga vga=gfx-...
> > 
> > To switch to a best mode?
> 
> I don't think "vga=gfx-..." is presently consumed in any way xen.efi.
> Doing so would require a similar hack of peeking at the (ordinary)
> command line options as in legacy booting, except that in xen.efi we
> read that command line from a file, which iirc is done only after
> fiddling with the video mode.

I will only take care of multiboot2, since I don't have a way to test
xen.efi ATM.

> >> Overall, looking at (for now) just the titles of subsequent patches,
> >> I'm not convinced the change here is needed at all. Or if anything it
> >> may want to go at the end, taking action only when "vga=current" was
> >> specified.
> > 
> > I guess I'm slightly confused by the usage of both GOP and StdOut, I
> > would assume if we have a gop, and can correctly initialize it there's
> > no need to fiddle with StdOut also?
> 
> Setting the GOP mode is done last before exiting boot services; this
> may be a graphics mode which doesn't support a text output protocol.

Right, that's what I was missing.  I assumed that all modes available
in GOP would be compatible with the ConOut mode.

Would you be OK with leaving StdOut as-is when booted from multiboot2,
or there's a chance of things not being properly setup?

IMO it's not very friendly to change the StdOut mode if not explicitly
requested, as in the multiboot2 case that gets setup by the
bootloader.

Thanks, Roger.



Re: [PATCH 2/5] efi: only set a console mode if the current one is invalid

2023-03-30 Thread Jan Beulich
On 30.03.2023 17:44, Roger Pau Monné wrote:
> On Mon, Dec 05, 2022 at 03:19:13PM +0100, Jan Beulich wrote:
>> On 23.11.2022 16:45, Roger Pau Monne wrote:
>>> Do not unconditionally set a mode in efi_console_set_mode(), do so
>>> only if the currently set mode is not valid.
>>
>> You don't say why you want to do so. Furthermore ...
>>
>>> --- a/xen/common/efi/boot.c
>>> +++ b/xen/common/efi/boot.c
>>> @@ -799,6 +799,11 @@ static void __init efi_console_set_mode(void)
>>>  UINTN cols, rows, size;
>>>  unsigned int best, i;
>>>  
>>> +/* Only set a mode if the current one is not valid. */
>>> +if ( StdOut->QueryMode(StdOut, StdOut->Mode->Mode, &cols, &rows) ==
>>> + EFI_SUCCESS )
>>> +return;
>>
>> ... it might be okay if you put such a check in efi_multiboot2(), but
>> the call here from efi_start() is specifically guarded by a check of
>> whether "-basevideo" was passed to xen.efi. This _may_ not be as
>> relevant anymore today, but it certainly was 20 years ago (recall
>> that we've inherited this code from a much older project of ours) -
>> at that time EFI usually started in 80x25 text mode. And I think that
>> even today when you end up launching xen.efi from the EFI shell,
>> you'd be stuck with 80x25 text mode on at least some implementations.
> 
> Won't you use console=vga vga=gfx-...
> 
> To switch to a best mode?

I don't think "vga=gfx-..." is presently consumed in any way xen.efi.
Doing so would require a similar hack of peeking at the (ordinary)
command line options as in legacy booting, except that in xen.efi we
read that command line from a file, which iirc is done only after
fiddling with the video mode.

>> Overall, looking at (for now) just the titles of subsequent patches,
>> I'm not convinced the change here is needed at all. Or if anything it
>> may want to go at the end, taking action only when "vga=current" was
>> specified.
> 
> I guess I'm slightly confused by the usage of both GOP and StdOut, I
> would assume if we have a gop, and can correctly initialize it there's
> no need to fiddle with StdOut also?

Setting the GOP mode is done last before exiting boot services; this
may be a graphics mode which doesn't support a text output protocol.

Jan



Re: Need help on a issue (Unable to schedule guest for Xen on Arm)

2023-03-30 Thread Ayan Kumar Halder

(Apologies, fixed the formatting issue)

On 30/03/2023 16:50, Ayan Kumar Halder wrote:

Hi Xen experts,

I need some pointers on an issue I am facing.

I am running my downstream port of Xen on Cortex-R52 hardware. The 
hardware consist of two R52 cores (the second core is in lockstep 
mode). So, currently the hardware does not support SMP.


The issue is that Xen is unable to schedule a guest. So 
leave_hypervisor_to_guest() ---> check_for_pcpu_work() and this does 
not return.


Debugging this, I see  check_for_pcpu_work() --> do_softirq() --> 
__do_softirq() --> timer_softirq_action().


In timer_softirq_action(), the problem I see is that for all the 
timers, "((t = heap[1])->expires < now)" is true.


    while ( (heap_metadata(heap)->size != 0) &&
    ((t = heap[1])->expires < now) )
    {
    remove_from_heap(heap, t); -- So, this gets invoked 
for all the timers.

    execute_timer(ts, t);
    }

So, further below reprogram_timer() gets invoked with timeout = 0 and 
it disables the timer. So, timer_interrupt() is never invoked.


Can someone give any pointers on what the underlying issue could be 
and how to debug further ?


I do not observe this behavior while running on R52 FVP. The 
difference is that for most of the timers "((t = heap[1])->expires < 
now)" is false, so reprogram_timer() gets invoked with non zero 
timeout and subsequently, the timer_interrupt() get invoked.


Also, looking at the following from xen/arch/arm/time.c.


/* Set the timer to wake us up at a particular time.
 * Timeout is a Xen system time (nanoseconds since boot); 0 disables 
the timer.

 * Returns 1 on success; 0 if the timeout is too soon or is in the past. */
int reprogram_timer(s_time_t timeout)
{
    uint64_t deadline;

    if ( timeout == 0 )
    {
    WRITE_SYSREG(0, CNTHP_CTL_EL2);
    return 1; <<-- Shouldn't this 
be 0 as the comment suggets ?

    }

    deadline = ns_to_ticks(timeout) + boot_count;
    WRITE_SYSREG64(deadline, CNTHP_CVAL_EL2);
    WRITE_SYSREG(CNTx_CTL_ENABLE, CNTHP_CTL_EL2);
    isb();

    /* No need to check for timers in the past; the Generic Timer fires
 * on a signed 63-bit comparison. */
    return 1;
}


Kind regards,

Ayan








Need help on a issue (Unable to schedule guest for Xen on Arm)

2023-03-30 Thread Ayan Kumar Halder

Hi Xen experts,

I need some pointers on an issue I am facing.

I am running my downstream port of Xen on Cortex-R52 hardware. The 
hardware consist of two R52 cores (the second core is in lockstep mode). 
So, currently the hardware does not support SMP.


The issue is that Xen is unable to schedule a guest. So 
leave_hypervisor_to_guest() ---> check_for_pcpu_work() and this does not 
return.


Debugging this, I see  check_for_pcpu_work() --> do_softirq() --> 
__do_softirq() --> timer_softirq_action().


In timer_softirq_action(), the problem I see is that for all the timers, 
"((t = heap[1])->expires < now)" is true.


    while ( (heap_metadata(heap)->size != 0) &&
    ((t = heap[1])->expires < now) )
    {
    remove_from_heap(heap, t); -- So, this gets invoked for 
all the timers.

    execute_timer(ts, t);
    }

So, further below reprogram_timer() gets invoked with timeout = 0 and it 
disables the timer. So, timer_interrupt() is never invoked.


Can someone give any pointers on what the underlying issue could be and 
how to debug further ?


I do not observe this behavior while running on R52 FVP. The difference 
is that for most of the timers "((t = heap[1])->expires < now)" is 
false, so reprogram_timer() gets invoked with non zero timeout and 
subsequently, the timer_interrupt() get invoked.


Also, looking at 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/arm/time.c;h=0b482d7db30c89fc70b191722a89dea8a675c2b6;hb=refs/heads/staging#l211


208 
 
/* Set the timer to wake us up at a particular time.
209 
 
 * Timeout is a Xen system time (nanoseconds since boot); 0 disables the timer.
210 
 
 * Returns 1 on success; 0 if the timeout is too soon or is in the past. */
211 
 
int reprogram_timer(s_time_t timeout)
212 
 
{
213 
 
uint64_t deadline;
214 
 

215 
 
if ( timeout == 0 )
216 
 
{
217 
 
WRITE_SYSREG(0, CNTHP_CTL_EL2);
218 
 
return 1; <<-- Shouldn't this 
be 0 as the comment suggets ?
219 
 
}
220 
 

221 
 
deadline = ns_to_ticks(timeout) + boot_count;
222 
 
WRITE_SYSREG64(deadline, CNTHP_CVAL_EL2);
223 
 
WRITE_SYSREG(CNTx_CTL_ENABLE, CNTHP_CTL_EL2);
224 
 
isb();
225 
 

226 
 
/* No need to check for timers in the past; t

Re: [PATCH 2/5] efi: only set a console mode if the current one is invalid

2023-03-30 Thread Roger Pau Monné
On Mon, Dec 05, 2022 at 03:19:13PM +0100, Jan Beulich wrote:
> On 23.11.2022 16:45, Roger Pau Monne wrote:
> > Do not unconditionally set a mode in efi_console_set_mode(), do so
> > only if the currently set mode is not valid.
> 
> You don't say why you want to do so. Furthermore ...
> 
> > --- a/xen/common/efi/boot.c
> > +++ b/xen/common/efi/boot.c
> > @@ -799,6 +799,11 @@ static void __init efi_console_set_mode(void)
> >  UINTN cols, rows, size;
> >  unsigned int best, i;
> >  
> > +/* Only set a mode if the current one is not valid. */
> > +if ( StdOut->QueryMode(StdOut, StdOut->Mode->Mode, &cols, &rows) ==
> > + EFI_SUCCESS )
> > +return;
> 
> ... it might be okay if you put such a check in efi_multiboot2(), but
> the call here from efi_start() is specifically guarded by a check of
> whether "-basevideo" was passed to xen.efi. This _may_ not be as
> relevant anymore today, but it certainly was 20 years ago (recall
> that we've inherited this code from a much older project of ours) -
> at that time EFI usually started in 80x25 text mode. And I think that
> even today when you end up launching xen.efi from the EFI shell,
> you'd be stuck with 80x25 text mode on at least some implementations.

Won't you use console=vga vga=gfx-...

To switch to a best mode?

> Overall, looking at (for now) just the titles of subsequent patches,
> I'm not convinced the change here is needed at all. Or if anything it
> may want to go at the end, taking action only when "vga=current" was
> specified.

I guess I'm slightly confused by the usage of both GOP and StdOut, I
would assume if we have a gop, and can correctly initialize it there's
no need to fiddle with StdOut also?

Thanks, Roger.



Re: [PATCH v7 3/6] PCI: Allow pci_bus_for_each_resource() to take less arguments

2023-03-30 Thread Andy Shevchenko
On Thu, Mar 30, 2023 at 09:24:21PM +0800, kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed various errors such like
> "i40e: probe of :3d:00.0 failed with error -12"
> due to commit (built with gcc-11):
> 
> commit: d23d5938fd7ced817d6aa1ff86cd671ebbaebfc2 ("[PATCH v7 3/6] PCI: Allow 
> pci_bus_for_each_resource() to take less arguments")
> url: 
> https://github.com/intel-lab-lkp/linux/commits/Andy-Shevchenko/kernel-h-Split-out-COUNT_ARGS-and-CONCATENATE/20230324-013857
> base: https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git next
> patch link: 
> https://lore.kernel.org/all/20230323173610.60442-4-andriy.shevche...@linux.intel.com/
> patch subject: [PATCH v7 3/6] PCI: Allow pci_bus_for_each_resource() to take 
> less arguments
> 
> in testcase: boot
> 
> on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 
> 2.10GHz (Cascade Lake) with 512G memory
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):
> 
> 
> If you fix the issue, kindly add following tag
> | Reported-by: kernel test robot 
> | Link: 
> https://lore.kernel.org/oe-lkp/202303302009.55848372-oliver.s...@intel.com

Thanks, that is useful test!

-- 
With Best Regards,
Andy Shevchenko





Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

2023-03-30 Thread Jan Beulich
On 30.03.2023 16:27, Roger Pau Monné wrote:
> On Thu, Mar 30, 2023 at 01:40:43PM +0200, Jan Beulich wrote:
>> On 30.03.2023 11:54, Roger Pau Monné wrote:
>>> On Thu, Mar 30, 2023 at 09:53:23AM +0200, Jan Beulich wrote:
 On 29.03.2023 16:17, Roger Pau Monné wrote:
>>> Patch 8 I'm unsure, I guess it should be up to the user to decide
>>> whether to use -Os or some other optimization?
>>>
>>> I expect introspection users likely care way more about the speed
>>> rather than the size of the generated x86 emulator code?
>>
>> Perhaps. But do we want to introduce another Kconfig control just
>> for that? And wouldn't there likely be other performance concerns,
>> if performance really mattered in the introspection case?
> 
> I don't really have a strong opinion on the usage of -Os or not.  It's
> likely fine as long as we also stay consistent with the flag we use
> when building the test harness and the fuzzer, just in case.

While they might be built with different compilers (and hence things can't
be compared directly anyway), I can certainly make those two use -Os as
well.

Jan



Re: [PATCH RFC 0/9] x86: Merge cpuid and msr policy

2023-03-30 Thread Roger Pau Monné
On Thu, Mar 30, 2023 at 01:59:37PM +0100, Andrew Cooper wrote:
> On 30/03/2023 12:07 pm, Roger Pau Monné wrote:
> > On Wed, Mar 29, 2023 at 09:51:28PM +0100, Andrew Cooper wrote:
> >> tl;dr to add MSR_ARCH_CAPS features sensibly, cpu_{featureset<->policy}() 
> >> need
> >> to not operate on objects of differing lifetimes, so structs
> >> {cpuid,msr}_policy need merging and cpu_policy is the obvious name.
> > So the problem is that there's a chance we might get a cpu_policy
> > object that contains a valid (allocated) cpuid object, but not an msr
> > one?
> 
> No - not cpu_policy.  It is that we can get a cpuid_policy and an
> msr_policy that aren't at the same point in their lifecycle.
> 
> ... which is exactly what happens right now for the raw/host msr right
> now if you featureset_to_policy() to include MSR data.

I see, but that's mostly because we handle the featureset_to_policy()
in two different places for CPUID vs MSR, those need to be unified
into a single helper that does both at the same point.

I assume not having such pointers in side of cpu_policy makes it
clearer that both msr and cpuid should be handled at the same time,
but ultimately this would imply passing a cpu_policy object to
featureset_to_policy() so that both CPUID and MSR sub-structs are
filled from the same featureset.

Sorry, maybe I'm being a bit dull here, just would like to understand
the motivation of the change.

> Merging the two together into cpu_policy causes there to be a single
> object lifecycle.
> 
> 
> It's probably worth repeating the advise from the footnote in
> https://lwn.net/Articles/193245/ again.  Get your datastructures right,
> and the code takes care of itself.  Don't get them right, and the code
> tends to be unmaintainable.
> 
> 
> >> But this does mean that we now have
> >>
> >>   cpu_policy->basic.$X
> >>   cpu_policy->feat.$Y
> >>   cpu_policy->arch_caps.$Z
> > I'm not sure I like the fact that we now can't differentiate between
> > policy fields related to MSRs or CPUID leafs.
> >
> > Isn't there a chance we might in the future get some name space
> > collision by us having decided to unify both?
> 
> The names are chosen by me so far, and the compiler will tell us if
> things actually collide.
> 
> And renaming the existing field is a perfectly acceptable way of
> resolving a conflict which arises in the future.
> 
> But yes - this was the whole point of asking the question.

I think I would prefer to keep the cpu_policy->{cpuid,msr}.
distinction if it doesn't interfere with further work.

Thanks, Roger.



Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

2023-03-30 Thread Roger Pau Monné
On Thu, Mar 30, 2023 at 01:40:43PM +0200, Jan Beulich wrote:
> On 30.03.2023 11:54, Roger Pau Monné wrote:
> > On Thu, Mar 30, 2023 at 09:53:23AM +0200, Jan Beulich wrote:
> >> On 29.03.2023 16:17, Roger Pau Monné wrote:
> > Patch 8 I'm unsure, I guess it should be up to the user to decide
> > whether to use -Os or some other optimization?
> > 
> > I expect introspection users likely care way more about the speed
> > rather than the size of the generated x86 emulator code?
> 
> Perhaps. But do we want to introduce another Kconfig control just
> for that? And wouldn't there likely be other performance concerns,
> if performance really mattered in the introspection case?

I don't really have a strong opinion on the usage of -Os or not.  It's
likely fine as long as we also stay consistent with the flag we use
when building the test harness and the fuzzer, just in case.

Thanks, Roger.



[xen-unstable-smoke test] 180065: tolerable trouble: pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180065 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180065/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 xen  f41c88a6fca59f99a2eb5e7ed3d90ab7bca08b1b
baseline version:
 xen  eef4608fe71feddb5fea86678cf3acaf84d10fd2

Last test of basis   180058  2023-03-29 22:01:58 Z0 days
Testing same since   180065  2023-03-30 12:00:25 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  starved 
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  starved 
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   eef4608fe7..f41c88a6fc  f41c88a6fca59f99a2eb5e7ed3d90ab7bca08b1b -> smoke



Improving website content and an introduction

2023-03-30 Thread Charles-H. Schulz
Hello everyone,

I've recently joined the Xen Project and I'm working at Vates on strategy and
innovation. I will also be formally representing the company at the Advisory
Board. First: I'm very excited to join and do what I can to help Xen, it's an
amazing project!

Second, and on to more concrete things: I'd like to work on improve our
website, both on the content - in fact both the website and the wiki could be
improved- and on the style and looks.

If anyone would like to help and join me in this task I'll be happy to share
the work and the creative side of things as well :)

Please let me know off or on list if you're interested.

All the best,


-- 
Charles-H. Schulz
Chief Strategy Officer - CSO
XCP-ng & Xen Orchestra - Vates solutions


Charles Schulz | Vates Chief Strategy Officer
Mobile: +33 698 655 424
XCP-ng & Xen Orchestra - Vates solutions
w: vates.fr | xcp-ng.org | xen-orchestra.com



Re: [PATCH v3 1/3] xen/riscv: introduce setup_initial_pages

2023-03-30 Thread Oleksii
On Wed, 2023-03-29 at 14:06 +0200, Jan Beulich wrote:
> On 29.03.2023 13:43, Oleksii wrote:
> > On Tue, 2023-03-28 at 17:34 +0200, Jan Beulich wrote:
> > > On 27.03.2023 19:17, Oleksii Kurochko wrote:
> > > > --- /dev/null
> > > > +++ b/xen/arch/riscv/mm.c
> > > > @@ -0,0 +1,277 @@
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#define PGTBL_INITIAL_COUNT 8
> > > 
> > > What does this magic number derive from? (A comment may help.)
> > It can be now 4 tables as it is enough to map 2Mb. 8 it was when I
> > experimented with bigger sizes of Xen binary and verified that
> > logic of
> > working with page tables works.
> 
> Since this is connected to Xen's image size as you say, I guess the
> constant wants to move to a header, and then be used in an assertion
> in xen.lds.S. That way one will become easily aware if/when this 2Mb
> assumption breaks.
then we can not be tied to 2MB but to:
ASSERT(_end - _start <= MB((PAGE_TABLE_INIT_COUNT - 2) * PAGE_SIZE),
"Xen too large for early-boot assumptions")

> 
> > > > +struct mmu_desc {
> > > > +    unsigned long num_levels;
> > > > +    uint32_t pgtbl_count;
> > > 
> > > Nit: Style (as before please avoid fixed width types when their
> > > use
> > > isn't really warranted; afaics unsigned int will do fine here and
> > > elsewhere below).
> > I will change it but I am not sure that I fully understand what is
> > an
> > issue with uint32_t.
> 
> The issue is simply that ./CODING_STYLE says to prefer basic types
> whenever possible.
Thanks.

> 
> > > > +void __init setup_initial_pagetables(void)
> > > > +{
> > > > +    struct mmu_desc mmu_desc = { 0, 0, NULL, 0 };
> > > > +
> > > > +    /*
> > > > + * Access to _{stard, end } is always PC-relative
> > > > + * thereby when access them we will get load adresses
> > > > + * of start and end of Xen
> > > > + * To get linker addresses LOAD_TO_LINK() is required
> > > > + * to use
> > > > + */
> > > > +    unsigned long load_start    = (unsigned long)_start;
> > > > +    unsigned long load_end  = (unsigned long)_end;
> > > > +    unsigned long linker_start  = LOAD_TO_LINK(load_start);
> > > > +    unsigned long linker_end    = LOAD_TO_LINK(load_end);
> > > > +
> > > > +    if ( (linker_start != load_start) &&
> > > > + (linker_start <= load_end) && (load_start <=
> > > > linker_end)
> > > > ) {
> > > > +    early_printk("(XEN) linker and load address ranges
> > > > overlap\n");
> > > > +    die();
> > > > +    }
> > > > +
> > > > +    calc_pgtbl_lvls_num(&mmu_desc);
> > > > +
> > > > +    mmu_desc.pgtbl_base = stage1_pgtbl_root;
> > > > +    mmu_desc.next_pgtbl = stage1_pgtbl_nonroot;
> > > > +
> > > > +    setup_initial_mapping(&mmu_desc,
> > > > +  linker_start,
> > > > +  linker_end,
> > > > +  load_start,
> > > > +  PTE_LEAF_DEFAULT);
> > > > +
> > > > +    setup_ptes_permission(&mmu_desc);
> > > 
> > > ...: Why does this require a 2nd pass / function in the first
> > > place?
> > Probably I misunderstood Julien and it setup_pte_permission can be
> > done
> > in setup_initial_mapping() but here is the reply:
> > https://lore.kernel.org/xen-devel/79e83610-5980-d9b5-7994-6b0cb2b90...@xen.org/
> 
> Hmm, yes, his option 2 looks like what you've implemented. Still I
> don't see why the permissions can't be got right on the first pass.

~ Oleksii



Re: [PATCH v3 1/3] xen/riscv: introduce setup_initial_pages

2023-03-30 Thread Oleksii
On Wed, 2023-03-29 at 14:20 +0200, Jan Beulich wrote:
> On 29.03.2023 13:47, Oleksii wrote:
> > On Tue, 2023-03-28 at 16:44 +0100, Julien Grall wrote:
> > > On 28/03/2023 16:34, Jan Beulich wrote:
> > > > On 27.03.2023 19:17, Oleksii Kurochko wrote:
> > > > > --- a/xen/arch/riscv/include/asm/config.h
> > > > > +++ b/xen/arch/riscv/include/asm/config.h
> > > > > @@ -39,12 +39,25 @@
> > > > >     name:
> > > > >   #endif
> > > > >   
> > > > > -#define XEN_VIRT_START  _AT(UL, 0x8020)
> > > > > +#define ADDRESS_SPACE_END (_AC(-1, UL))
> > > > > +#define SZ_1G 0x4000
> > > > 
> > > > No need for this - please use GB(1) (see xen/config.h) in its
> > > > stead.
> > > > 
> > > > > +#ifdef CONFIG_RISCV_64
> > > > > +#define XEN_VIRT_START    (ADDRESS_SPACE_END - SZ_1G + 1)
> > > > 
> > > > I first thought the " + 1" would be rendering the address
> > > > misaligned.
> > > > May I suggest (ADDRESS_SPACE_END + 1 - SZ_1G) to help avoiding
> > > > this
> > > > impression?
> > > 
> > > I will jump on this to say that I am finding quite difficult to
> > > review 
> > > code using define/variable that contains "end" in their name
> > > because
> > > it 
> > > is never clear whether this is inclusive or exclusive.
> > > 
> > > In this case, it is inclusive, but within the same patch I can
> > > see 
> > > map_end which is exclusive.
> > > 
> > > For this case, my suggestion would be to remove ADDRESS_SPACE_END
> > > and
> > > just hardcode the value where you want to put Xen. For others,
> > > you
> > > could 
> > > use (start, size) to describe a region.
> > Thanks for the suggestion. I'll take it into account.
> > > 
> > > Also, are you sure that all the CPU will be able to support more
> > > full
> > > 64-bit VA space?
> > I am not sure but based on Linux it looks like it is true. ( at
> > least,
> > they use the same definitions for RV64 ).
> 
> Hmm, the spec has this text in a rationalizing remark: "The mapping
> between 64-bit virtual addresses and the 39-bit usable address space
> of Sv39 is not based on zero-extension but instead follows an
> entrenched convention that allows an OS to use one or a few of the
> most-significant bits of a full-size (64-bit) virtual address to
> quickly distinguish user and supervisor address regions."
> 
> Apart from that all descriptions look to assume that the top VA bits
> are simply ignored for the purpose of translating to PA, which would
> mean that any address within the 64-bit address space can be mapped,
> and the same mapping would appear many times in VA space. I wonder
> whether that's really how it's meant to be, or whether I'm
> overlooking
> something, or whether the spec would want clarifying.
> 
You are right the top VA bits are ignored but my understanding is that
you can't map any address within 64-bit address space as the top bits
are ignored. Thereby address 0xFFXX are equal to
0x00XX from HW point of view and the top VA bits are used
only by developer.

> > I am curious how that can be checked.
> 
> Try it out somewhere?
I re-read the initial question and also your answer and it looks like I
misunderstood a little bit the initial question.

So if the question was about if all CPUs support all MMU addressing
modes ( Sv32, Sv39, Sv48, Sv57 ) then an answer is no. There is a rule
that if ,for example, CPU supports Sv57 then it must support lower
modes.


If I understood the question correctly then we can add function which
will write/read SATP_MODE:
csr_write(CSR_SATP,
  ((unsigned long)stage1_pgtbl_root >> PAGE_SHIFT) |
RV_STAGE1_MODE << SATP_MODE_SHIFT); __sfence_vma_all();
csr_write(CSR_SATP, satp);
if ((csr_read(CSR_SATP) >> SATP_MODE_SHIFT) == SATP_MODE_SV57)
{
max_supported_mode = SATP_MODE_SV57;
return
}
... check other modes ...

~ Oleksii
  



Re: [PATCH v3] xen/netback: use same error messages for same errors

2023-03-30 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni :

On Wed, 29 Mar 2023 10:02:59 +0200 you wrote:
> Issue the same error message in case an illegal page boundary crossing
> has been detected in both cases where this is tested.
> 
> Suggested-by: Jan Beulich 
> Signed-off-by: Juergen Gross 
> ---
> V2:
> - new patch
> V3:
> - modify message text (Jan Beulich)
> 
> [...]

Here is the summary with links:
  - [v3] xen/netback: use same error messages for same errors
https://git.kernel.org/netdev/net/c/2eca98e5b24d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html





[ovmf test] 180064: all pass - PUSHED

2023-03-30 Thread osstest service owner
flight 180064 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/180064/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 53eb26b238541799134aa5846530485c915735da
baseline version:
 ovmf e3e88d90e8d777e38120c32c7b3bbfb9bcf99b79

Last test of basis   180060  2023-03-30 02:42:13 Z0 days
Testing same since   180064  2023-03-30 11:13:36 Z0 days1 attempts


People who touched revisions under test:
  Ard Biesheuvel 
  Michael Kubacki 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   e3e88d90e8..53eb26b238  53eb26b238541799134aa5846530485c915735da -> 
xen-tested-master



Re: [PATCH v3 6/6] hw/isa/piix3: Resolve redundant TYPE_PIIX3_XEN_DEVICE

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:21PM +0100, Bernhard Beschow wrote:
> During the last patches, TYPE_PIIX3_XEN_DEVICE turned into a clone of
> TYPE_PIIX3_DEVICE. Remove this redundancy.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v3 5/6] hw/isa/piix3: Resolve redundant k->config_write assignments

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:20PM +0100, Bernhard Beschow wrote:
> The previous patch unified handling of piix3_write_config() accross the
> PIIX3 device models which allows for assigning k->config_write once in the
> base class.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v3 4/6] hw/isa/piix3: Avoid Xen-specific variant of piix3_write_config()

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:19PM +0100, Bernhard Beschow wrote:
> Subscribe to pci_bus_fire_intx_routing_notifier() instead which allows for
> having a common piix3_write_config() for the PIIX3 device models.
> 
> While at it, move the subscription into machine code to facilitate resolving
> TYPE_PIIX3_XEN_DEVICE.
> 
> In a possible future followup, pci_bus_fire_intx_routing_notifier() could
> be adjusted in such a way that subscribing to it doesn't require
> knowledge of the device firing it.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v3 3/6] hw/isa/piix3: Wire up Xen PCI IRQ handling outside of PIIX3

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:18PM +0100, Bernhard Beschow wrote:
> xen_intx_set_irq() doesn't depend on PIIX3State. In order to resolve
> TYPE_PIIX3_XEN_DEVICE and in order to make Xen agnostic about the
> precise south bridge being used, set up Xen's PCI IRQ handling of PIIX3
> in the board.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH v3 2/6] hw/isa/piix3: Reuse piix3_realize() in piix3_xen_realize()

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:17PM +0100, Bernhard Beschow wrote:
> This is a preparational patch for the next one to make the following
> more obvious:
> 
> First, pci_bus_irqs() is now called twice in case of Xen where the
> second call overrides the pci_set_irq_fn with the Xen variant.

pci_bus_irqs() does allocates pci_bus->irq_count, so the second call in
piix3_xen_realize() will leak `pci_bus->irq_count`. Could you look if
pci_bus_irqs_cleanup() can be called before the second pci_bus_irqs()
call, or maybe some other way to avoid the leak?

> Second, pci_bus_set_route_irq_fn() is now also called in Xen mode.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Beside the leak which I think can happen only once, patch is fine:
Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH RFC 0/9] x86: Merge cpuid and msr policy

2023-03-30 Thread Andrew Cooper
On 30/03/2023 12:07 pm, Roger Pau Monné wrote:
> On Wed, Mar 29, 2023 at 09:51:28PM +0100, Andrew Cooper wrote:
>> tl;dr to add MSR_ARCH_CAPS features sensibly, cpu_{featureset<->policy}() 
>> need
>> to not operate on objects of differing lifetimes, so structs
>> {cpuid,msr}_policy need merging and cpu_policy is the obvious name.
> So the problem is that there's a chance we might get a cpu_policy
> object that contains a valid (allocated) cpuid object, but not an msr
> one?

No - not cpu_policy.  It is that we can get a cpuid_policy and an
msr_policy that aren't at the same point in their lifecycle.

... which is exactly what happens right now for the raw/host msr right
now if you featureset_to_policy() to include MSR data.

Merging the two together into cpu_policy causes there to be a single
object lifecycle.


It's probably worth repeating the advise from the footnote in
https://lwn.net/Articles/193245/ again.  Get your datastructures right,
and the code takes care of itself.  Don't get them right, and the code
tends to be unmaintainable.


>> But this does mean that we now have
>>
>>   cpu_policy->basic.$X
>>   cpu_policy->feat.$Y
>>   cpu_policy->arch_caps.$Z
> I'm not sure I like the fact that we now can't differentiate between
> policy fields related to MSRs or CPUID leafs.
>
> Isn't there a chance we might in the future get some name space
> collision by us having decided to unify both?

The names are chosen by me so far, and the compiler will tell us if
things actually collide.

And renaming the existing field is a perfectly acceptable way of
resolving a conflict which arises in the future.

But yes - this was the whole point of asking the question.

~Andrew



Re: [PATCH 9/9] RFC: Everything else

2023-03-30 Thread Jan Beulich
On 30.03.2023 14:31, Andrew Cooper wrote:
> On 30/03/2023 1:06 pm, Jan Beulich wrote:
>> On 30.03.2023 14:01, Andrew Cooper wrote:
>>> On 30/03/2023 11:16 am, Jan Beulich wrote:
> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -893,7 +893,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, 
> size_t size)
>  struct x86_emulate_ctxt ctxt = {
>  .data = &state,
>  .regs = &input.regs,
> -.cpuid = &cp,
> +.cpu_policy = &cp,
 ... this and ...

> --- a/tools/tests/x86_emulator/test_x86_emulator.c
> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
> @@ -909,7 +909,7 @@ int main(int argc, char **argv)
>  
>  ctxt.regs = ®s;
>  ctxt.force_writeback = 0;
> -ctxt.cpuid = &cp;
> +ctxt.cpu_policy = &cp;
 ... this imo want keeping as you have it here.
>>> Ok, so that's a firm "switch to using cpu_policy for emul_ctxt" then.
>>>
>>> Which is fine - in fact it's one I'd already started splitting out of
>>> this patch.
>> Hmm, wait - CPUID "basic" and "feat" and alike uses I still would prefer
>> to see using "cpuid". It's really only such initializers which (imo
>> even logically) want to use the name with the wider coverage.
> 
> So its the other way around and you're saying you don't want the field
> name to change, and you don't want to see
> 
> -#define vcpu_has_fpu() (ctxt->cpuid->basic.fpu)
> +#define vcpu_has_fpu() (ctxt->cpu_policy->basic.fpu)
> 
> in the resulting diff ?

Neither. I want the change as seen in the diff further up, but indeed
I'd prefer if the two quoted diff lines in your most recent reply
weren't there. This would again be by way of using a union, this time
in struct x86_emulate_ctxt:

/* CPU Policy for the domain. */
union {
const struct cpu_policy *cpuid;
const struct cpu_policy *cpu_policy;
};

Having said that: I realize that this is against what C mandates for
the use of unions, but since we (ab)use unions in similar ways in many
other places, I don't think we need to be concerned. Furthermore this
specific use could, aiui, be "legalized" by making the declaration

/* CPU Policy for the domain. */
union {
struct {
const struct cpu_policy *cpuid;
};
struct {
const struct cpu_policy *cpu_policy;
};
};

Jan



Re: [PATCH v3 1/6] include/hw/xen/xen: Rename xen_piix3_set_irq() to xen_intx_set_irq()

2023-03-30 Thread Anthony PERARD
On Sun, Mar 12, 2023 at 01:02:16PM +0100, Bernhard Beschow wrote:
> xen_piix3_set_irq() isn't PIIX specific: PIIX is a single PCI device
> while xen_piix3_set_irq() maps multiple PCI devices to their respective
> IRQs, which is board-specific. Rename xen_piix3_set_irq() to communicate
> this.
> 
> Also rename XEN_PIIX_NUM_PIRQS to XEN_IOAPIC_NUM_PIRQS since the Xen's
> IOAPIC rather than PIIX has this many interrupt routes.
> 
> Signed-off-by: Bernhard Beschow 
> Reviewed-by: Michael S. Tsirkin 

Reviewed-by: Anthony PERARD 

Thanks,

-- 
Anthony PERARD



Re: [PATCH 9/9] RFC: Everything else

2023-03-30 Thread Andrew Cooper
On 30/03/2023 1:06 pm, Jan Beulich wrote:
> On 30.03.2023 14:01, Andrew Cooper wrote:
>> On 30/03/2023 11:16 am, Jan Beulich wrote:
 --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
 +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
 @@ -893,7 +893,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, 
 size_t size)
  struct x86_emulate_ctxt ctxt = {
  .data = &state,
  .regs = &input.regs,
 -.cpuid = &cp,
 +.cpu_policy = &cp,
>>> ... this and ...
>>>
 --- a/tools/tests/x86_emulator/test_x86_emulator.c
 +++ b/tools/tests/x86_emulator/test_x86_emulator.c
 @@ -909,7 +909,7 @@ int main(int argc, char **argv)
  
  ctxt.regs = ®s;
  ctxt.force_writeback = 0;
 -ctxt.cpuid = &cp;
 +ctxt.cpu_policy = &cp;
>>> ... this imo want keeping as you have it here.
>> Ok, so that's a firm "switch to using cpu_policy for emul_ctxt" then.
>>
>> Which is fine - in fact it's one I'd already started splitting out of
>> this patch.
> Hmm, wait - CPUID "basic" and "feat" and alike uses I still would prefer
> to see using "cpuid". It's really only such initializers which (imo
> even logically) want to use the name with the wider coverage.

So its the other way around and you're saying you don't want the field
name to change, and you don't want to see

-#define vcpu_has_fpu() (ctxt->cpuid->basic.fpu)
+#define vcpu_has_fpu() (ctxt->cpu_policy->basic.fpu)

in the resulting diff ?

~Andrew



Re: [PATCH v2 01/13] x86/shadow: rename SHADOW_FOREACH_LE() to FOREACH_PRESENT_LE()

2023-03-30 Thread Andrew Cooper
On 30/03/2023 12:24 pm, Jan Beulich wrote:
> These being local macros, the SHADOW prefix doesn't gain us much. What
> is more important to be aware of at use sites is that the supplied code
> is invoked for present entries only.
>
> While making the adjustment also properly use NULL for the 3rd argument
> at respective invocation sites.
>
> Requested-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



Re: [PATCH RFC 0/9] x86: Merge cpuid and msr policy

2023-03-30 Thread Andrew Cooper
On 30/03/2023 11:18 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> Andrew Cooper (9):
>>   x86: Rename struct cpu_policy to struct old_cpuid_policy
>>   x86: Rename {domctl,sysctl}.cpu_policy.{cpuid,msr_policy} fields
> Nit: I guess the last closing brace wants moving forward a little.

Oops yes.

>>   x86: Rename struct cpuid_policy to struct cpu_policy
>>   x86: Merge struct msr_policy into struct cpu_policy
>>   x86: Merge the system {cpuid,msr} policy objects
>>   x86: Merge a domain's {cpuid,msr} policy objects
>>   x86: Merge xc_cpu_policy's cpuid and msr objects
>>   x86: Drop struct old_cpu_policy
> With the small comments on individual patches taken care of one way or
> another, up to here:
>
> Reviewed-by: Jan Beulich 

Thanks.  I'll fold this in on these patches, but I don't intend to
commit anything until I've got the full series together.

Right now there's still a lot of shuffling of minor things (comments)
between patches, and not-so-minor things (deciding to split things
differently).

~Andrew



Re: [PATCH 9/9] RFC: Everything else

2023-03-30 Thread Jan Beulich
On 30.03.2023 14:01, Andrew Cooper wrote:
> On 30/03/2023 11:16 am, Jan Beulich wrote:
>> On 29.03.2023 22:51, Andrew Cooper wrote:
>>> Looking at this diff, I'm wondering whether keeping
>>>
>>> union {
>>> struct cpu_policy *cpuid;
>>> struct cpu_policy *cpu_policy;
>>> };
>>>
>>> permentantly might be a good idea, because d->arch.cpuid->$X reads rather
>>> better than d->arch.cpu_policy->$X
>>>
>>> Thoughts?
>> I'm not overly fussed, but perhaps yes.
> 
> If we were to do this, we ought to keep d->arch.msr too for the same reason.

I'm not sure this is necessarily a consequence.

> (Honestly - I'm still undecided on whether this is a good idea or not...)
> 
>>  Nevertheless e.g. ...
>>
>>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>>> @@ -893,7 +893,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, 
>>> size_t size)
>>>  struct x86_emulate_ctxt ctxt = {
>>>  .data = &state,
>>>  .regs = &input.regs,
>>> -.cpuid = &cp,
>>> +.cpu_policy = &cp,
>> ... this and ...
>>
>>> --- a/tools/tests/x86_emulator/test_x86_emulator.c
>>> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
>>> @@ -909,7 +909,7 @@ int main(int argc, char **argv)
>>>  
>>>  ctxt.regs = ®s;
>>>  ctxt.force_writeback = 0;
>>> -ctxt.cpuid = &cp;
>>> +ctxt.cpu_policy = &cp;
>> ... this imo want keeping as you have it here.
> 
> Ok, so that's a firm "switch to using cpu_policy for emul_ctxt" then.
> 
> Which is fine - in fact it's one I'd already started splitting out of
> this patch.

Hmm, wait - CPUID "basic" and "feat" and alike uses I still would prefer
to see using "cpuid". It's really only such initializers which (imo
even logically) want to use the name with the wider coverage.

Jan



Re: [PATCH 9/9] RFC: Everything else

2023-03-30 Thread Andrew Cooper
On 30/03/2023 11:16 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> Looking at this diff, I'm wondering whether keeping
>>
>> union {
>> struct cpu_policy *cpuid;
>> struct cpu_policy *cpu_policy;
>> };
>>
>> permentantly might be a good idea, because d->arch.cpuid->$X reads rather
>> better than d->arch.cpu_policy->$X
>>
>> Thoughts?
> I'm not overly fussed, but perhaps yes.

If we were to do this, we ought to keep d->arch.msr too for the same reason.

(Honestly - I'm still undecided on whether this is a good idea or not...)

>  Nevertheless e.g. ...
>
>> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
>> @@ -893,7 +893,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t 
>> size)
>>  struct x86_emulate_ctxt ctxt = {
>>  .data = &state,
>>  .regs = &input.regs,
>> -.cpuid = &cp,
>> +.cpu_policy = &cp,
> ... this and ...
>
>> --- a/tools/tests/x86_emulator/test_x86_emulator.c
>> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
>> @@ -909,7 +909,7 @@ int main(int argc, char **argv)
>>  
>>  ctxt.regs = ®s;
>>  ctxt.force_writeback = 0;
>> -ctxt.cpuid = &cp;
>> +ctxt.cpu_policy = &cp;
> ... this imo want keeping as you have it here.

Ok, so that's a firm "switch to using cpu_policy for emul_ctxt" then.

Which is fine - in fact it's one I'd already started splitting out of
this patch.

~Andrew



Re: [PATCH v3] xen/netback: use same error messages for same errors

2023-03-30 Thread Paolo Abeni
Hello,

On Wed, 2023-03-29 at 10:02 +0200, Juergen Gross wrote:
> Issue the same error message in case an illegal page boundary crossing
> has been detected in both cases where this is tested.
> 
> Suggested-by: Jan Beulich 
> Signed-off-by: Juergen Gross 

As this was intended to be part of:

xen/netback: fix issue introduced recently

I'm going to apply this one on net, unless someone screams very loudly,
very soon :)

thanks!

Paolo




Re: [PATCH 8/9] x86: Drop struct old_cpu_policy

2023-03-30 Thread Andrew Cooper
On 30/03/2023 11:08 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> --- a/tools/libs/guest/xg_cpuid_x86.c
>> +++ b/tools/libs/guest/xg_cpuid_x86.c
>> @@ -868,9 +868,7 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, 
>> xc_cpu_policy_t *host,
>>   xc_cpu_policy_t *guest)
>>  {
>>  struct cpu_policy_errors err = INIT_CPU_POLICY_ERRORS;
>> -struct old_cpu_policy h = { &host->policy, &host->policy };
>> -struct old_cpu_policy g = { &guest->policy, &guest->policy };
>> -int rc = x86_cpu_policies_are_compatible(&h, &g, &err);
>> +int rc = x86_cpu_policies_are_compatible(&host->policy, &host->policy, 
>> &err);
> One of the two surely wants to be &guest->policy?

Ah, yes it does.  The second, specifically.

(One of the problems of having an API that's not used by anything
interesting yet.)

~Andrew



Re: [PATCH 7/9] x86: Merge xc_cpu_policy's cpuid and msr objects

2023-03-30 Thread Andrew Cooper
On 30/03/2023 11:04 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>>  /* Sanity test various invariants we expect in the default/max policies. */
>>  static void test_guest_policies(const struct xc_cpu_policy *max,
>>  const struct xc_cpu_policy *def)
>>  {
>> -const struct cpuid_policy *cm = &max->cpuid;
>> -const struct cpuid_policy *cd = &def->cpuid;
>> -const struct msr_policy *mm = &max->msr;
>> -const struct msr_policy *md = &def->msr;
>> +const struct cpu_policy *m = &max->policy;
>> +const struct cpu_policy *d = &def->policy;
> Looks like you could reduce code churn by keeping "cm" and "cd" as the
> names, which would also be consistent with you continuing to use "cp"
> in hypervisor code.

It's a unit test - I'm not worried about churn; I'm more concerned about
the end readability.

I did consider swapping {d,m} with {max,def} so the logic below reads:

    if ( ((max->feat.raw[0].d | def->feat.raw[0].d) &

but it occurs to me that because the policies are now merged, I can get
rid of all of the xc_cpu_policy passing and use cpu_policy instead. 
(For inspection purposes, we don't care about the serialisation buffers
on the tail of xc_cpu_policy).

It will be a slightly larger change, but the end result will be better IMO.

~Andrew



Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

2023-03-30 Thread Jan Beulich
On 30.03.2023 11:54, Roger Pau Monné wrote:
> On Thu, Mar 30, 2023 at 09:53:23AM +0200, Jan Beulich wrote:
>> On 29.03.2023 16:17, Roger Pau Monné wrote:
>>> Do we expect to be able to split other opcode handling into separate
>>> files?
>>
>> As per above, "expect" is perhaps too optimistic. I'd say "hope", in
>> particular for SIMD code (which I guess is now the main part of the
>> ISA as well as the sources, at least number-of-insns-wise). One
>> possible (likely intermediate) approach might be to move all SIMD code
>> from the huge switch() statement to its own file/function, invoked
>> from that huge switch()'s default: case. It may then still be a big
>> switch() statement in (e.g.) simd.c, but we'd then at least have two
>> of them, each about half the size of what we have right now. Plus it
>> may allow some (most?) of the X86EMUL_NO_SIMD #ifdef-ary to be dropped,
>> as the new file - like fpu.c - could then itself be built only
>> conditionally.
> 
> I don't like the handling of SIMD from a default case in the global
> switch much, as we then could end up chaining all kind of random
> handling in the default case.  It's IMO clearer if we can detect and
> forward insn to the SIMD code when we know it's a SIMD instruction.

Avoiding of which would require ...

>>> I wonder how tied together are the remaining cases, and whether we
>>> will be able to split more of them into separate units?
>>
>> That's the big open question indeed. As per above - with some effort
>> I expect all SIMD code could collectively be moved out; further
>> splitting would likely end up more involved.
>>
>>> Overall I don't think the disintegration makes the code harder, as the split
>>> cases seems to be fully isolated, I do however wonder whether further splits
>>> might not be so isolated?
>>
>> And again - indeed. This series, while already a lot of code churn, is
>> only collecting some of the very low hanging fruit. But at least I
>> hope that the pieces now separated out won't need a lot of touching
>> later on, except of course to add support for new insns.
> 
> OK, I'm a bit concerned that we end up growing duplicated switch
> cases, in the sense that we will have the following:
> 
> switch ( insn )
> {
> case 0x100:
> case 0x1f0:
> case 0x200:
>   x86emul_foo();
> ...
> }
> 
> x86emul_foo()
> {
> switch (insn )
> {
> case 0x100:
> /* Handle. */
>   break;
> 
> case 0x1f0:
> /* Handle. */
>   break;
> 
> case 0x200:
> ...
> }
> }
> 
> So that we would end up having to add the opcodes twice, once in the
> generic switch, and then again at place the insn are actually handled.

... doing exactly this, which looks like we a agree we'd like to avoid.
I'm also not really concerned of "chaining all kind of random handling
in the default case" - where things are terminated with an error code
doesn't really matter. As first step here I'm literally thinking of
splitting the huge switch() into two. The only thing that I can see may
get in the way is the resulting SIMD function requiring like 20
parameters.

> So far the introduced splits seems fine in that they deal with a
> contiguous range of opcodes.
> 
> For patches 1-7:
> 
> Acked-by: Roger Pau Monné 

Thanks!

> Patch 8 I'm unsure, I guess it should be up to the user to decide
> whether to use -Os or some other optimization?
> 
> I expect introspection users likely care way more about the speed
> rather than the size of the generated x86 emulator code?

Perhaps. But do we want to introduce another Kconfig control just
for that? And wouldn't there likely be other performance concerns,
if performance really mattered in the introspection case?

Jan



[PATCH v2 13/13] x86/shadow: adjust monitor table prealloc amount

2023-03-30 Thread Jan Beulich
While 670d6b908ff2 ('x86 shadow: Move the shadow linear mapping for
n-on-3-on-4 shadows so') bumped the amount by one too little for the
32-on-64 case (which luckily was dead code, and hence a bump wasn't
necessary in the first place), 0b841314dace ('x86/shadow:
sh_{make,destroy}_monitor_table() are "even more" HVM-only'), dropping
the dead code, then didn't adjust the amount back. Yet even the original
amount was too high in certain cases. Switch to pre-allocating just as
much as is going to be needed.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -738,7 +738,7 @@ mfn_t sh_make_monitor_table(const struct
 ASSERT(!pagetable_get_pfn(v->arch.hvm.monitor_table));
 
 /* Guarantee we can get the memory we need */
-if ( !shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS) )
+if ( !shadow_prealloc(d, SH_type_monitor_table, shadow_levels < 4 ? 3 : 1) 
)
 return INVALID_MFN;
 
 m4mfn = shadow_alloc(d, SH_type_monitor_table, 0);




[PATCH v2 12/13] x86/shadow: "monitor table" is a HVM-only concept

2023-03-30 Thread Jan Beulich
It looks like in the combination of aff8bf94ce65 ('x86/shadow: only
4-level guest code needs building when !HVM') and 0b841314dace
('x86/shadow: sh_{make,destroy}_monitor_table() are "even more" HVM-
only') I didn't go quite far enough: SH_type_monitor_table is also
effectively unused when !HVM.

The assertion early in sh_destroy_shadow() can have the type dropped
altogether: it shouldn't make it here in the first place. Pages of
this type are freed directly from sh_destroy_monitor_table() only.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1190,7 +1190,6 @@ void sh_destroy_shadow(struct domain *d,
 ASSERT(t == SH_type_fl1_32_shadow  ||
t == SH_type_fl1_pae_shadow ||
t == SH_type_fl1_64_shadow  ||
-   t == SH_type_monitor_table  ||
(is_pv_32bit_domain(d) && t == SH_type_l4_64_shadow) ||
(page_get_owner(mfn_to_page(backpointer(sp))) == d));
 
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -205,8 +205,7 @@ extern void shadow_audit_tables(struct v
 #define SH_type_l4_64_shadow   6U /* shadowing a 64-bit L4 page */
 #define SH_type_max_shadow 6U
 #define SH_type_p2m_table  7U /* in use as the p2m table */
-#define SH_type_monitor_table  8U /* in use as a monitor table */
-#define SH_type_unused 9U
+#define SH_type_unused 8U
 #endif
 
 #ifndef CONFIG_PV32 /* Unused (but uglier to #ifdef above): */




[PATCH v2 11/13] x86/shadow: vCPU-s never have "no mode"

2023-03-30 Thread Jan Beulich
With an initial mode installed by shadow_vcpu_init(), there's no need
for sh_update_paging_modes() to deal with the "mode is still unset"
case. Leave an assertion, though.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1864,6 +1864,8 @@ static void sh_update_paging_modes(struc
 make_cr3(v, mmfn);
 hvm_update_host_cr3(v);
 }
+else if ( !old_mode )
+ASSERT_UNREACHABLE();
 else if ( v->arch.paging.mode != old_mode )
 {
 SHADOW_PRINTK("new paging mode: %pv pe=%d gl=%u "
@@ -1872,11 +1874,10 @@ static void sh_update_paging_modes(struc
   hvm_paging_enabled(v),
   v->arch.paging.mode->guest_levels,
   v->arch.paging.mode->shadow.shadow_levels,
-  old_mode ? old_mode->guest_levels : 0,
-  old_mode ? old_mode->shadow.shadow_levels : 0);
-if ( old_mode &&
- (v->arch.paging.mode->shadow.shadow_levels !=
-  old_mode->shadow.shadow_levels) )
+  old_mode->guest_levels,
+  old_mode->shadow.shadow_levels);
+if ( v->arch.paging.mode->shadow.shadow_levels !=
+ old_mode->shadow.shadow_levels )
 {
 /* Need to make a new monitor table for the new mode */
 mfn_t new_mfn, old_mfn;




[PATCH v2 10/13] x86/shadow: make monitor table create/destroy more consistent

2023-03-30 Thread Jan Beulich
While benign at present, it is still a little fragile to operate on a
wrong "old_mode" value in sh_update_paging_modes(). This can happen when
no monitor table was present initially - we'd create one for the new
mode without updating old_mode. Correct this in two ways, each of which
would be sufficient on its own: Once by adding "else" to the second of
the involved if()s in the function, and then by setting the correct
initial mode for HVM domains in shadow_vcpu_init().

Further use the same predicate (paging_mode_external()) consistently
when dealing with shadow mode init/update/cleanup, rather than a mix of
is_hvm_vcpu() (init), is_hvm_domain() (update), and
paging_mode_external() (cleanup).

Finally drop a redundant is_hvm_domain() from inside the bigger if()
(which is being converted to paging_mode_external()) in
sh_update_paging_modes().

Signed-off-by: Jan Beulich 
---
v2: Style adjustment.

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -129,9 +129,9 @@ void shadow_vcpu_init(struct vcpu *v)
 }
 #endif
 
-v->arch.paging.mode = is_hvm_vcpu(v) ?
-  &SHADOW_INTERNAL_NAME(sh_paging_mode, 3) :
-  &SHADOW_INTERNAL_NAME(sh_paging_mode, 4);
+v->arch.paging.mode = paging_mode_external(v->domain)
+  ? &SHADOW_INTERNAL_NAME(sh_paging_mode, 2)
+  : &SHADOW_INTERNAL_NAME(sh_paging_mode, 4);
 }
 
 #if SHADOW_AUDIT
@@ -1811,7 +1811,7 @@ static void sh_update_paging_modes(struc
 sh_detach_old_tables(v);
 
 #ifdef CONFIG_HVM
-if ( is_hvm_domain(d) )
+if ( paging_mode_external(d) )
 {
 const struct paging_mode *old_mode = v->arch.paging.mode;
 
@@ -1864,13 +1864,12 @@ static void sh_update_paging_modes(struc
 make_cr3(v, mmfn);
 hvm_update_host_cr3(v);
 }
-
-if ( v->arch.paging.mode != old_mode )
+else if ( v->arch.paging.mode != old_mode )
 {
 SHADOW_PRINTK("new paging mode: %pv pe=%d gl=%u "
   "sl=%u (was g=%u s=%u)\n",
   v,
-  is_hvm_domain(d) ? hvm_paging_enabled(v) : 1,
+  hvm_paging_enabled(v),
   v->arch.paging.mode->guest_levels,
   v->arch.paging.mode->shadow.shadow_levels,
   old_mode ? old_mode->guest_levels : 0,




[PATCH v2 09/13] x86/shadow: drop is_hvm_...() where easily possible

2023-03-30 Thread Jan Beulich
Emulation related functions are involved in HVM handling only, and in
some cases they even invoke such checks after having already done things
which are valid for HVM domains only. OOS active also implies HVM. In
sh_remove_all_mappings() one of the two checks is redundant with an
earlier paging_mode_external() one (the other, however, needs to stay).

Signed-off-by: Jan Beulich 
---
v2: Re-base over changes/additions earlier in the series.

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1522,7 +1522,7 @@ int sh_remove_all_mappings(struct domain
&& (page->count_info & PGC_count_mask) <= 3
&& ((page->u.inuse.type_info & PGT_count_mask)
== (is_special_page(page) ||
-   (is_hvm_domain(d) && is_ioreq_server_page(d, page) )
+   is_ioreq_server_page(d, page )
 printk(XENLOG_G_ERR "can't find all mappings of mfn %"PRI_mfn
" (gfn %"PRI_gfn"): c=%lx t=%lx s=%d i=%d\n",
mfn_x(gmfn), gfn_x(gfn),
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -204,10 +204,6 @@ hvm_emulate_write(enum x86_segment seg,
 if ( rc || !bytes )
 return rc;
 
-/* Unaligned writes are only acceptable on HVM */
-if ( (addr & (bytes - 1)) && !is_hvm_vcpu(v)  )
-return X86EMUL_UNHANDLEABLE;
-
 ptr = sh_emulate_map_dest(v, addr, bytes, sh_ctxt);
 if ( IS_ERR(ptr) )
 return ~PTR_ERR(ptr);
@@ -258,10 +254,6 @@ hvm_emulate_cmpxchg(enum x86_segment seg
 if ( rc )
 return rc;
 
-/* Unaligned writes are only acceptable on HVM */
-if ( (addr & (bytes - 1)) && !is_hvm_vcpu(v)  )
-return X86EMUL_UNHANDLEABLE;
-
 ptr = sh_emulate_map_dest(v, addr, bytes, sh_ctxt);
 if ( IS_ERR(ptr) )
 return ~PTR_ERR(ptr);
@@ -457,8 +449,7 @@ static void *sh_emulate_map_dest(struct
 
 #ifndef NDEBUG
 /* We don't emulate user-mode writes to page tables. */
-if ( is_hvm_domain(d) ? hvm_get_cpl(v) == 3
-  : !guest_kernel_mode(v, guest_cpu_user_regs()) )
+if ( hvm_get_cpl(v) == 3 )
 {
 gdprintk(XENLOG_DEBUG, "User-mode write to pagetable reached "
  "emulate_map_dest(). This should never happen!\n");
@@ -487,15 +478,6 @@ static void *sh_emulate_map_dest(struct
 sh_ctxt->mfn[1] = INVALID_MFN;
 map = map_domain_page(sh_ctxt->mfn[0]) + (vaddr & ~PAGE_MASK);
 }
-else if ( !is_hvm_domain(d) )
-{
-/*
- * Cross-page emulated writes are only supported for HVM guests;
- * PV guests ought to know better.
- */
-put_page(mfn_to_page(sh_ctxt->mfn[0]));
-return MAPPING_UNHANDLEABLE;
-}
 else
 {
 /* This write crosses a page boundary. Translate the second page. */
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3438,7 +3438,7 @@ int sh_rm_write_access_from_sl1p(struct
 ASSERT(mfn_valid(smfn));
 
 /* Remember if we've been told that this process is being torn down */
-if ( curr->domain == d && is_hvm_domain(d) )
+if ( curr->domain == d )
 curr->arch.paging.shadow.pagetable_dying
 = mfn_to_page(gmfn)->pagetable_dying;
 
--- a/xen/arch/x86/mm/shadow/oos.c
+++ b/xen/arch/x86/mm/shadow/oos.c
@@ -580,7 +580,6 @@ int sh_unsync(struct vcpu *v, mfn_t gmfn
 if ( (pg->shadow_flags &
   ((SHF_page_type_mask & ~SHF_L1_ANY) | SHF_out_of_sync)) ||
  sh_page_has_multiple_shadows(pg) ||
- !is_hvm_vcpu(v) ||
  !v->domain->arch.paging.shadow.oos_active )
 return 0;
 




[PATCH v2 08/13] x86/shadow: sh_rm_write_access_from_sl1p() is HVM-only

2023-03-30 Thread Jan Beulich
The function is used from (HVM-only) OOS code only - replace the
respective #ifdef inside the function to make this more obvious. (Note
that SHOPT_OUT_OF_SYNC won't be set when !HVM, so the #ifdef surrounding
the function is already sufficient.)

Requested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3429,9 +3429,7 @@ static void cf_check sh_update_cr3(struc
 int sh_rm_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
  mfn_t smfn, unsigned long off)
 {
-#ifdef CONFIG_HVM
 struct vcpu *curr = current;
-#endif
 int r;
 shadow_l1e_t *sl1p, sl1e;
 struct page_info *sp;
@@ -3439,12 +3437,10 @@ int sh_rm_write_access_from_sl1p(struct
 ASSERT(mfn_valid(gmfn));
 ASSERT(mfn_valid(smfn));
 
-#ifdef CONFIG_HVM
 /* Remember if we've been told that this process is being torn down */
 if ( curr->domain == d && is_hvm_domain(d) )
 curr->arch.paging.shadow.pagetable_dying
 = mfn_to_page(gmfn)->pagetable_dying;
-#endif
 
 sp = mfn_to_page(smfn);
 




[PATCH v2 07/13] x86/shadow: move OOS functions to their own file

2023-03-30 Thread Jan Beulich
The code has been identified as HVM-only, and its main functions are
pretty well isolated. Move them to their own file. While moving, besides
making two functions non-static, do a few style adjustments, mainly
comment formatting, but leave the code otherwise untouched.

Signed-off-by: Jan Beulich 
Acked-by: Andrew Cooper 
---
v2: Adjust SPDX to GPL-2.0-only. A few more style adjustments.

--- a/xen/arch/x86/mm/shadow/Makefile
+++ b/xen/arch/x86/mm/shadow/Makefile
@@ -1,6 +1,6 @@
 ifeq ($(CONFIG_SHADOW_PAGING),y)
 obj-y += common.o set.o
-obj-$(CONFIG_HVM) += hvm.o guest_2.o guest_3.o guest_4.o
+obj-$(CONFIG_HVM) += hvm.o guest_2.o guest_3.o guest_4.o oos.o
 obj-$(CONFIG_PV) += pv.o guest_4.o
 else
 obj-y += none.o
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -152,576 +152,6 @@ static int __init cf_check shadow_audit_
 __initcall(shadow_audit_key_init);
 #endif /* SHADOW_AUDIT */
 
-#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
-/**/
-/* Out-of-sync shadows. */
-
-/* From time to time, we let a shadowed pagetable page go out of sync
- * with its shadow: the guest is allowed to write directly to the page,
- * and those writes are not synchronously reflected in the shadow.
- * This lets us avoid many emulations if the guest is writing a lot to a
- * pagetable, but it relaxes a pretty important invariant in the shadow
- * pagetable design.  Therefore, some rules:
- *
- * 1. Only L1 pagetables may go out of sync: any page that is shadowed
- *at at higher level must be synchronously updated.  This makes
- *using linear shadow pagetables much less dangerous.
- *That means that: (a) unsyncing code needs to check for higher-level
- *shadows, and (b) promotion code needs to resync.
- *
- * 2. All shadow operations on a guest page require the page to be brought
- *back into sync before proceeding.  This must be done under the
- *paging lock so that the page is guaranteed to remain synced until
- *the operation completes.
- *
- *Exceptions to this rule: the pagefault and invlpg handlers may
- *update only one entry on an out-of-sync page without resyncing it.
- *
- * 3. Operations on shadows that do not start from a guest page need to
- *be aware that they may be handling an out-of-sync shadow.
- *
- * 4. Operations that do not normally take the paging lock (fast-path
- *#PF handler, INVLPG) must fall back to a locking, syncing version
- *if they see an out-of-sync table.
- *
- * 5. Operations corresponding to guest TLB flushes (MOV CR3, INVLPG)
- *must explicitly resync all relevant pages or update their
- *shadows.
- *
- * Currently out-of-sync pages are listed in a simple open-addressed
- * hash table with a second chance (must resist temptation to radically
- * over-engineer hash tables...)  The virtual address of the access
- * which caused us to unsync the page is also kept in the hash table, as
- * a hint for finding the writable mappings later.
- *
- * We keep a hash per vcpu, because we want as much as possible to do
- * the re-sync on the save vcpu we did the unsync on, so the VA hint
- * will be valid.
- */
-
-static void sh_oos_audit(struct domain *d)
-{
-unsigned int idx, expected_idx, expected_idx_alt;
-struct page_info *pg;
-struct vcpu *v;
-
-for_each_vcpu(d, v)
-{
-for ( idx = 0; idx < SHADOW_OOS_PAGES; idx++ )
-{
-mfn_t *oos = v->arch.paging.shadow.oos;
-if ( mfn_eq(oos[idx], INVALID_MFN) )
-continue;
-
-expected_idx = mfn_x(oos[idx]) % SHADOW_OOS_PAGES;
-expected_idx_alt = ((expected_idx + 1) % SHADOW_OOS_PAGES);
-if ( idx != expected_idx && idx != expected_idx_alt )
-{
-printk("%s: idx %x contains gmfn %lx, expected at %x or %x.\n",
-   __func__, idx, mfn_x(oos[idx]),
-   expected_idx, expected_idx_alt);
-BUG();
-}
-pg = mfn_to_page(oos[idx]);
-if ( !(pg->count_info & PGC_shadowed_pt) )
-{
-printk("%s: idx %x gmfn %lx not a pt (count %lx)\n",
-   __func__, idx, mfn_x(oos[idx]), pg->count_info);
-BUG();
-}
-if ( !(pg->shadow_flags & SHF_out_of_sync) )
-{
-printk("%s: idx %x gmfn %lx not marked oos (flags %x)\n",
-   __func__, idx, mfn_x(oos[idx]), pg->shadow_flags);
-BUG();
-}
-if ( (pg->shadow_flags & SHF_page_type_mask & ~SHF_L1_ANY) )
-{
-printk("%s: idx %x gmfn %lx shadowed as non-l1 (flags %x)\n",
-   __func__, idx, mfn_x(oos[idx]), pg->shadow_flags);
-BUG();
-}
-}
-}
-}
-
-#if SHADOW_AUDIT & SHADOW_AUDIT_ENTRIES
-void oos_audit_hash_is_present(str

[PATCH v2 06/13] x86/shadow: use lighter weight mode checks

2023-03-30 Thread Jan Beulich
shadow_mode_...(), with the exception of shadow_mode_enabled(), are
shorthands for shadow_mode_enabled() && paging_mode_...(). While
potentially useful outside of shadow-internal functions, when we already
know that we're dealing with a domain in shadow mode, the "paging"
checks are sufficient and cheaper. While the "shadow" ones commonly
translate to a MOV/AND/CMP/Jcc sequence, the "paging" ones typically
resolve to just TEST+Jcc.

Signed-off-by: Jan Beulich 
---
v2: Re-base over new earlier patch.

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1856,7 +1856,7 @@ int sh_remove_write_access(struct domain
  * In guest refcounting, we trust Xen to already be restricting
  * all the writes to the guest page tables, so we do not need to
  * do more. */
-if ( !shadow_mode_refcounts(d) )
+if ( !paging_mode_refcounts(d) )
 return 0;
 
 /* Early exit if it's already a pagetable, or otherwise not writeable */
@@ -2088,7 +2088,7 @@ int sh_remove_all_mappings(struct domain
  *   guest pages with an extra reference taken by
  *   prepare_ring_for_helper().
  */
-if ( !(shadow_mode_external(d)
+if ( !(paging_mode_external(d)
&& (page->count_info & PGC_count_mask) <= 3
&& ((page->u.inuse.type_info & PGT_count_mask)
== (is_special_page(page) ||
@@ -2385,8 +2385,8 @@ static void sh_update_paging_modes(struc
 {
 const struct paging_mode *old_mode = v->arch.paging.mode;
 
-ASSERT(shadow_mode_translate(d));
-ASSERT(shadow_mode_external(d));
+ASSERT(paging_mode_translate(d));
+ASSERT(paging_mode_external(d));
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 /* Need to resync all our pages now, because if a page goes out
@@ -2773,7 +2773,7 @@ void shadow_vcpu_teardown(struct vcpu *v
 
 sh_detach_old_tables(v);
 #ifdef CONFIG_HVM
-if ( shadow_mode_external(d) )
+if ( paging_mode_external(d) )
 {
 mfn_t mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
 
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -526,7 +526,7 @@ _sh_propagate(struct vcpu *v,
|| (level == 1
&& page_get_owner(mfn_to_page(target_mfn)) == dom_io);
 if ( mmio_mfn
- && !(level == 1 && (!shadow_mode_refcounts(d)
+ && !(level == 1 && (!paging_mode_refcounts(d)
  || p2mt == p2m_mmio_direct)) )
 {
 ASSERT((ft == ft_prefetch));
@@ -543,7 +543,7 @@ _sh_propagate(struct vcpu *v,
_PAGE_RW | _PAGE_PRESENT);
 if ( guest_nx_enabled(v) )
 pass_thru_flags |= _PAGE_NX_BIT;
-if ( level == 1 && !shadow_mode_refcounts(d) && mmio_mfn )
+if ( level == 1 && !paging_mode_refcounts(d) && mmio_mfn )
 pass_thru_flags |= PAGE_CACHE_ATTRS;
 sflags = gflags & pass_thru_flags;
 
@@ -663,7 +663,7 @@ _sh_propagate(struct vcpu *v,
  * (We handle log-dirty entirely inside the shadow code, without using the
  * p2m_ram_logdirty p2m type: only HAP uses that.)
  */
-if ( level == 1 && unlikely(shadow_mode_log_dirty(d)) && !mmio_mfn )
+if ( level == 1 && unlikely(paging_mode_log_dirty(d)) && !mmio_mfn )
 {
 if ( ft & FETCH_TYPE_WRITE )
 paging_mark_dirty(d, target_mfn);
@@ -819,7 +819,7 @@ do {
 #define FOREACH_PRESENT_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code)\
 do {  \
 int _i, _j;   \
-ASSERT(shadow_mode_external(_dom));   \
+ASSERT(paging_mode_external(_dom));   \
 ASSERT(mfn_to_page(_sl2mfn)->u.sh.type == SH_type_l2_32_shadow);  \
 for ( _j = 0; _j < 4; _j++ )  \
 { \
@@ -845,7 +845,7 @@ do {
 do {   \
 int _i;\
 shadow_l2e_t *_sp = map_domain_page((_sl2mfn));\
-ASSERT(shadow_mode_external(_dom));\
+ASSERT(paging_mode_external(_dom));\
 ASSERT(mfn_to_page(_sl2mfn)->u.sh.type == SH_type_l2_pae_shadow);  \
 for ( _i = 0; _i < SHADOW_L2_PAGETABLE_ENTRIES; _i++ ) \
 {  \
@@ -866,7 +866,7 @@ do {
 unsigned int _i, _end = SHADOW_L2_PAGETABLE_ENTRIES;\
 shadow_l2e_t *_sp = map_domain_page((_sl2mfn)); \
 ASSERT_VALID_L2(mfn_to_page(_sl2mfn)->u.sh.type);   \
-if ( is_pv_32bit_domain(_dom) /* implies !shadow

[PATCH v2 05/13] x86/shadow: don't generate bogus "domain dying" trace entry from sh_page_fault()

2023-03-30 Thread Jan Beulich
When in 3-level guest mode we help a guest to stay alive, we also
shouldn't emit a trace entry to the contrary. Move the invocation up
into the respective #ifdef, noting that while this moves it into the
locked region, emitting trace records with the paging lock held is okay
(as done elsewhere as well), just needlessly increasing lock holding
time a little.

Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -2492,10 +2492,10 @@ static int cf_check sh_page_fault(
 sh_update_cr3(v, 0, false);
 #else
 ASSERT(d->is_shutting_down);
+trace_shadow_gen(TRC_SHADOW_DOMF_DYING, va);
 #endif
 paging_unlock(d);
 put_gfn(d, gfn_x(gfn));
-trace_shadow_gen(TRC_SHADOW_DOMF_DYING, va);
 return 0;
 }
 




[PATCH v2 04/13] x86/shadow: call sh_update_cr3() directly from sh_page_fault()

2023-03-30 Thread Jan Beulich
There's no need for an indirect call here, as the mode is invariant
throughout the entire paging-locked region. All it takes to avoid it is
to have a forward declaration of sh_update_cr3() in place.

Signed-off-by: Jan Beulich 
---
I find this and the respective Win7 related comment suspicious: If we
really need to "fix up" L3 entries "on demand", wouldn't we better retry
the shadow_get_and_create_l1e() rather than exit? The spurious page
fault that the guest observes can, after all, not be known to be non-
fatal inside the guest. That's purely an OS policy.

Furthermore the sh_update_cr3() will also invalidate L3 entries which
were loaded successfully before, but invalidated by the guest
afterwards. I strongly suspect that the described hardware behavior is
_only_ to load previously not-present entries from the PDPT, but not
purge ones already marked present. IOW I think sh_update_cr3() would
need calling in an "incremental" mode here. (The alternative of doing
this in shadow_get_and_create_l3e() instead would likely be more
cumbersome.)

Beyond the "on demand" L3 entry creation I also can't see what guest
actions could lead to the ASSERT() being inapplicable in the PAE case.
The 3-level code in shadow_get_and_create_l2e() doesn't consult guest
PDPTEs, and all other logic is similar to that for other modes.

(See 89329d832aed ["x86 shadow: Update cr3 in PAE mode when guest walk
succeed but shadow walk fails"].)

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -91,6 +91,8 @@ const char *const fetch_type_names[] = {
 # define for_each_shadow_table(v, i) for ( (i) = 0; (i) < 1; ++(i) )
 #endif
 
+static void cf_check sh_update_cr3(struct vcpu *v, int do_locking, bool 
noflush);
+
 /* Helper to perform a local TLB flush. */
 static void sh_flush_local(const struct domain *d)
 {
@@ -2487,7 +2489,7 @@ static int cf_check sh_page_fault(
  * In any case, in the PAE case, the ASSERT is not true; it can
  * happen because of actions the guest is taking. */
 #if GUEST_PAGING_LEVELS == 3
-v->arch.paging.mode->update_cr3(v, 0, false);
+sh_update_cr3(v, 0, false);
 #else
 ASSERT(d->is_shutting_down);
 #endif




[PATCH v2 03/13] x86/shadow: reduce explicit log-dirty recording for HVM

2023-03-30 Thread Jan Beulich
validate_guest_pt_write(), by calling sh_validate_guest_entry(), already
guarantees the needed update of log-dirty information. Move the
operation into the sole code path needing it (when SHOPT_SKIP_VERIFY is
enabled), making clear that only one such call is needed.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -656,6 +656,7 @@ static void sh_emulate_unmap_dest(struct
 {
 /* Writes with this alignment constraint can't possibly cross pages. */
 ASSERT(!mfn_valid(sh_ctxt->mfn[1]));
+paging_mark_dirty(v->domain, sh_ctxt->mfn[0]);
 }
 else
 #endif /* SHADOW_OPTIMIZATIONS & SHOPT_SKIP_VERIFY */
@@ -673,12 +674,10 @@ static void sh_emulate_unmap_dest(struct
 validate_guest_pt_write(v, sh_ctxt->mfn[1], addr + b1, b2);
 }
 
-paging_mark_dirty(v->domain, sh_ctxt->mfn[0]);
 put_page(mfn_to_page(sh_ctxt->mfn[0]));
 
 if ( unlikely(mfn_valid(sh_ctxt->mfn[1])) )
 {
-paging_mark_dirty(v->domain, sh_ctxt->mfn[1]);
 put_page(mfn_to_page(sh_ctxt->mfn[1]));
 vunmap((void *)((unsigned long)addr & PAGE_MASK));
 }




[PATCH v2 02/13] x86/shadow: drop redundant present bit checks from FOREACH_PRESENT_LE() "bodies"

2023-03-30 Thread Jan Beulich
FOREACH_PRESENT_LE(), as their names (now) say, already invoke the
"body" only when the present bit is set; no need to re-do the check.

While there also
- stop open-coding mfn_to_maddr() in code being touched (re-indented)
  anyway,
- stop open-coding mfn_eq() in code being touched or in adjacent code,
- drop local variables when they're no longer used at least twice.

Signed-off-by: Jan Beulich 
Reviewed-by: Andrew Cooper 
---
v2: Re-base over new earlier patch. More mfn_eq().

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1289,12 +1289,8 @@ void sh_destroy_l4_shadow(struct domain
 /* Decrement refcounts of all the old entries */
 sl4mfn = smfn;
 FOREACH_PRESENT_L4E(sl4mfn, sl4e, NULL, 0, d, {
-if ( shadow_l4e_get_flags(*sl4e) & _PAGE_PRESENT )
-{
-sh_put_ref(d, shadow_l4e_get_mfn(*sl4e),
-   (((paddr_t)mfn_x(sl4mfn)) << PAGE_SHIFT)
-   | ((unsigned long)sl4e & ~PAGE_MASK));
-}
+sh_put_ref(d, shadow_l4e_get_mfn(*sl4e),
+   mfn_to_maddr(sl4mfn) | ((unsigned long)sl4e & ~PAGE_MASK));
 });
 
 /* Put the memory back in the pool */
@@ -1320,10 +1316,8 @@ void sh_destroy_l3_shadow(struct domain
 /* Decrement refcounts of all the old entries */
 sl3mfn = smfn;
 FOREACH_PRESENT_L3E(sl3mfn, sl3e, NULL, 0, {
-if ( shadow_l3e_get_flags(*sl3e) & _PAGE_PRESENT )
-sh_put_ref(d, shadow_l3e_get_mfn(*sl3e),
-(((paddr_t)mfn_x(sl3mfn)) << PAGE_SHIFT)
-| ((unsigned long)sl3e & ~PAGE_MASK));
+sh_put_ref(d, shadow_l3e_get_mfn(*sl3e),
+   mfn_to_maddr(sl3mfn) | ((unsigned long)sl3e & ~PAGE_MASK));
 });
 
 /* Put the memory back in the pool */
@@ -1352,10 +1346,8 @@ void sh_destroy_l2_shadow(struct domain
 /* Decrement refcounts of all the old entries */
 sl2mfn = smfn;
 FOREACH_PRESENT_L2E(sl2mfn, sl2e, NULL, 0, d, {
-if ( shadow_l2e_get_flags(*sl2e) & _PAGE_PRESENT )
-sh_put_ref(d, shadow_l2e_get_mfn(*sl2e),
-(((paddr_t)mfn_x(sl2mfn)) << PAGE_SHIFT)
-| ((unsigned long)sl2e & ~PAGE_MASK));
+sh_put_ref(d, shadow_l2e_get_mfn(*sl2e),
+   mfn_to_maddr(sl2mfn) | ((unsigned long)sl2e & ~PAGE_MASK));
 });
 
 /* Put the memory back in the pool */
@@ -1390,11 +1382,10 @@ void sh_destroy_l1_shadow(struct domain
 /* Decrement refcounts of all the old entries */
 mfn_t sl1mfn = smfn;
 FOREACH_PRESENT_L1E(sl1mfn, sl1e, NULL, 0, {
-unsigned int sl1f = shadow_l1e_get_flags(*sl1e);
-
-if ( (sl1f & _PAGE_PRESENT) && !sh_l1e_is_magic(*sl1e) )
+if ( !sh_l1e_is_magic(*sl1e) )
 {
-shadow_vram_put_mfn(shadow_l1e_get_mfn(*sl1e), sl1f,
+shadow_vram_put_mfn(shadow_l1e_get_mfn(*sl1e),
+shadow_l1e_get_flags(*sl1e),
 sl1mfn, sl1e, d);
 shadow_put_page_from_l1e(*sl1e, d);
 }
@@ -3558,7 +3549,6 @@ int cf_check sh_rm_write_access_from_l1(
 {
 shadow_l1e_t *sl1e;
 int done = 0;
-int flags;
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
 struct vcpu *curr = current;
 mfn_t base_sl1mfn = sl1mfn; /* Because sl1mfn changes in the foreach */
@@ -3566,10 +3556,8 @@ int cf_check sh_rm_write_access_from_l1(
 
 FOREACH_PRESENT_L1E(sl1mfn, sl1e, NULL, done,
 {
-flags = shadow_l1e_get_flags(*sl1e);
-if ( (flags & _PAGE_PRESENT)
- && (flags & _PAGE_RW)
- && (mfn_x(shadow_l1e_get_mfn(*sl1e)) == mfn_x(readonly_mfn)) )
+if ( (shadow_l1e_get_flags(*sl1e) & _PAGE_RW) &&
+ mfn_eq(shadow_l1e_get_mfn(*sl1e), readonly_mfn) )
 {
 shadow_l1e_t ro_sl1e = shadow_l1e_remove_flags(*sl1e, _PAGE_RW);
 
@@ -3595,13 +3583,10 @@ int cf_check sh_rm_mappings_from_l1(
 {
 shadow_l1e_t *sl1e;
 int done = 0;
-int flags;
 
 FOREACH_PRESENT_L1E(sl1mfn, sl1e, NULL, done,
 {
-flags = shadow_l1e_get_flags(*sl1e);
-if ( (flags & _PAGE_PRESENT)
- && (mfn_x(shadow_l1e_get_mfn(*sl1e)) == mfn_x(target_mfn)) )
+if ( mfn_eq(shadow_l1e_get_mfn(*sl1e), target_mfn) )
 {
 shadow_set_l1e(d, sl1e, shadow_l1e_empty(), p2m_invalid, sl1mfn);
 if ( sh_check_page_has_no_refs(mfn_to_page(target_mfn)) )
@@ -3646,13 +3631,10 @@ int cf_check sh_remove_l1_shadow(struct
 {
 shadow_l2e_t *sl2e;
 int done = 0;
-int flags;
 
 FOREACH_PRESENT_L2E(sl2mfn, sl2e, NULL, done, d,
 {
-flags = shadow_l2e_get_flags(*sl2e);
-if ( (flags & _PAGE_PRESENT)
- && (mfn_x(shadow_l2e_get_mfn(*sl2e)) == mfn_x(sl1mfn)) )
+if ( mfn_eq(shadow_l2e_get_mfn(*sl2e), sl1mfn) )
 {
 shadow_set_l2e(d, sl2e, shadow

[PATCH v2 01/13] x86/shadow: rename SHADOW_FOREACH_LE() to FOREACH_PRESENT_LE()

2023-03-30 Thread Jan Beulich
These being local macros, the SHADOW prefix doesn't gain us much. What
is more important to be aware of at use sites is that the supplied code
is invoked for present entries only.

While making the adjustment also properly use NULL for the 3rd argument
at respective invocation sites.

Requested-by: Andrew Cooper 
Signed-off-by: Jan Beulich 
---
v2: New.

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -777,7 +777,7 @@ static inline void increment_ptr_to_gues
 }
 
 /* All kinds of l1: touch all entries */
-#define _SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code)\
+#define _FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code)   \
 do {\
 int _i; \
 shadow_l1e_t *_sp = map_domain_page((_sl1mfn)); \
@@ -796,25 +796,25 @@ do {
 
 /* 32-bit l1, on PAE or 64-bit shadows: need to walk both pages of shadow */
 #if GUEST_PAGING_LEVELS == 2 && SHADOW_PAGING_LEVELS > 2
-#define SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p, _done,  _code)\
+#define FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, _done,  _code)   \
 do {\
 int __done = 0; \
-_SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p,  \
+_FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, \
  ({ (__done = _done); }), _code);   \
 _sl1mfn = sh_next_page(_sl1mfn);\
 if ( !__done )  \
-_SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code);   \
+_FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code);  \
 } while (0)
 #else /* Everything else; l1 shadows are only one page */
-#define SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code) \
-   _SHADOW_FOREACH_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code)
+#define FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code)\
+   _FOREACH_PRESENT_L1E(_sl1mfn, _sl1e, _gl1p, _done, _code)
 #endif
 
 
 #if GUEST_PAGING_LEVELS == 2
 
 /* 32-bit l2 on PAE/64: four pages, touch every second entry */
-#define SHADOW_FOREACH_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code) \
+#define FOREACH_PRESENT_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code)\
 do {  \
 int _i, _j;   \
 ASSERT(shadow_mode_external(_dom));   \
@@ -839,7 +839,7 @@ do {
 #elif GUEST_PAGING_LEVELS == 3
 
 /* PAE: touch all entries */
-#define SHADOW_FOREACH_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code)  \
+#define FOREACH_PRESENT_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code) \
 do {   \
 int _i;\
 shadow_l2e_t *_sp = map_domain_page((_sl2mfn));\
@@ -859,7 +859,7 @@ do {
 #else
 
 /* 64-bit l2: touch all entries except for PAE compat guests. */
-#define SHADOW_FOREACH_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code)   \
+#define FOREACH_PRESENT_L2E(_sl2mfn, _sl2e, _gl2p, _done, _dom, _code)  \
 do {\
 unsigned int _i, _end = SHADOW_L2_PAGETABLE_ENTRIES;\
 shadow_l2e_t *_sp = map_domain_page((_sl2mfn)); \
@@ -886,7 +886,7 @@ do {
 #if GUEST_PAGING_LEVELS == 4
 
 /* 64-bit l3: touch all entries */
-#define SHADOW_FOREACH_L3E(_sl3mfn, _sl3e, _gl3p, _done, _code) \
+#define FOREACH_PRESENT_L3E(_sl3mfn, _sl3e, _gl3p, _done, _code)\
 do {\
 int _i; \
 shadow_l3e_t *_sp = map_domain_page((_sl3mfn)); \
@@ -903,7 +903,7 @@ do {
 } while (0)
 
 /* 64-bit l4: avoid Xen mappings */
-#define SHADOW_FOREACH_L4E(_sl4mfn, _sl4e, _gl4p, _done, _dom, _code)   \
+#define FOREACH_PRESENT_L4E(_sl4mfn, _sl4e, _gl4p, _done, _dom, _code)  \
 do {\
 shadow_l4e_t *_sp = map_domain_page((_sl4mfn)); \
 int _xen = !shadow_mode_external(_dom); \
@@ -1288,7 +1288,7 @@ void sh_destroy_l4_shadow(struct domain
 shadow_demote(d, gmfn, t);
 /* Decrement refcounts of all the old entries */
 sl4mfn = smfn;
-SHADOW_FOREACH_L4E(sl4mfn, sl4e, 0, 0, d, {
+FOREACH_PRESENT_L4E(sl4mfn, sl4e, NULL, 0, d, {
 if ( shadow_l4e_get_flags(*sl4e) & _PAGE_PRESENT )
 {
 

[PATCH v2 00/13] x86: assorted shadow mode adjustments

2023-03-30 Thread Jan Beulich
This is kind of fallout from XSA-427 investigations, partly related to
there having been a more intrusive first approach.

Most patches aren't really dependent upon one another, so can probably
go in independently (as they get acked).

A few patches from v1 went in, but there are also three new patches in
v2. See individual patches for what has changed (in response to review
comments).

01: rename SHADOW_FOREACH_LE() to FOREACH_PRESENT_LE()
02: drop redundant present bit checks from SHADOW_FOREACH_LE() "bodies"
03: reduce explicit log-dirty recording for HVM
04: call sh_update_cr3() directly from sh_page_fault()
05: don't generate bogus "domain dying" trace entry from sh_page_fault()
06: use lighter weight mode checks
07: move OOS functions to their own file
08: sh_rm_write_access_from_sl1p() is HVM-only
09: drop is_hvm_...() where easily possible
10: make monitor table create/destroy more consistent
11: vCPU-s never have "no mode"
12: "monitor table" is a HVM-only concept
13: adjust monitor table prealloc amount

Jan



Re: [PATCH 5/9] x86: Merge the system {cpuid,msr} policy objects

2023-03-30 Thread Andrew Cooper
On 30/03/2023 10:40 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> Right now, they're the same underlying type, containing disjoint information.
>>
>> Introduce a new cpu-policy.{h,c} to be the new location for all policy
>> handling logic.  Place the combined objects in __ro_after_init, which has
>> appeared since the original logic was written.
>>
>> As we're trying to phase out the use of struct old_cpu_policy entirely, 
>> rework
>> update_domain_cpu_policy() to not pointer-chase through system_policies[].
>>
>> This in turn allows system_policies[] in sysctl.c to become static and 
>> reduced
>> in scope to XEN_SYSCTL_get_cpu_policy.
>>
>> No practical change, but we do half the amount of compile time space required
>> for the system policies, which saves 6x almost-2k at the moment.
> But what you save here is only what was introduced earlier in the series,
> if I'm not mistaken (just like the "containing disjoint" further up)?

Hmm yeah - I should reword to make it clearer that this is returning to
how it was before the series.

> Also do you mean "halve" in place of "half"?

Yes.

>
>> @@ -391,7 +360,19 @@ long arch_do_sysctl(
>>  
>>  case XEN_SYSCTL_get_cpu_policy:
>>  {
>> -const struct old_cpu_policy *policy;
>> +static const struct cpu_policy *system_policies[6] = {
> Another const wanted here after *?

Will do.

~Andrew



Re: [PATCH 4/9] x86: Merge struct msr_policy into struct cpu_policy

2023-03-30 Thread Jan Beulich
On 30.03.2023 13:11, Andrew Cooper wrote:
> On 30/03/2023 10:30 am, Jan Beulich wrote:
>> On 29.03.2023 22:51, Andrew Cooper wrote:
>>> --- a/xen/include/xen/lib/x86/cpu-policy.h
>>> +++ b/xen/include/xen/lib/x86/cpu-policy.h
>>> @@ -3,7 +3,6 @@
>>>  #define XEN_LIB_X86_POLICIES_H
>>>  
>>>  #include 
>>> -#include 
>>>  
>>>  #define FEATURESET_1d 0 /* 0x0001.edx  */
>>>  #define FEATURESET_1c 1 /* 0x0001.ecx  */
>>> @@ -107,6 +106,9 @@ const char *x86_cpuid_vendor_to_str(unsigned int 
>>> vendor);
>>>   CPUID_GUEST_NR_XSTATE - !!CPUID_GUEST_NR_XSTATE +  \
>>>   CPUID_GUEST_NR_EXTD + 2 /* hv_limit and hv2_limit */ )
>>>  
>>> +/* Maximum number of MSRs written when serialising msr_policy. */
>>> +#define MSR_MAX_SERIALISED_ENTRIES 2
>> The comment better wouldn't refer to msr_policy anymore, I think.
> 
> Ah yes.  (There's so much comment and library cleanup still to do)
> 
>>  I also
>> wonder whether the comment wouldn't better move ...
>>
>>> @@ -324,6 +326,44 @@ struct cpu_policy
>>>  };
>>>  } extd;
>>>  
>>> +/*
>>> + * 0x00ce - MSR_INTEL_PLATFORM_INFO
>> ... e.g. above here, to increase the chance of it being spotted that
>> it needs updating if another MSR is added here.
> 
> I'm not sure about that.  In its current position, it's next to it's
> CPUID partner.
> 
> The unit tests in test-cpu-policy cross-check that we never get -ENOBUFS
> for a MSR_MAX_* sized destination, so I'm not worried about it actually
> getting stale.

Well, I wasn't worried so much about ending up with a bad commit, but
rather that something you could spot on the first pass while writing
new code might cause you a build or test failure later on, resulting
in extra hunting. But this was merely a thought, nothing I would insist
one (the more that your CPUID sibling argument is of course also
relevant).

Jan



Re: [PATCH 6/9] x86: Merge a domain's {cpuid,msr} policy objects

2023-03-30 Thread Andrew Cooper
On 30/03/2023 11:00 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> @@ -573,7 +574,6 @@ int arch_vcpu_create(struct vcpu *v)
>>  /* Idle domain */
>>  v->arch.cr3 = __pa(idle_pg_table);
>>  rc = 0;
>> -v->arch.msrs = ZERO_BLOCK_PTR; /* Catch stray misuses */
>>  }
> Is this intentional? It's a vCPU pointer here, not a domain one.

Ah, no.  And it answers one of my TODO notes that I hadn't got to yet
(of why MSRs was different to CPUID in this case).

It looks like it got caught in my `arch.msrs` rework.  I'll drop this hunk.

>
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -40,11 +40,11 @@
>>  static int update_domain_cpu_policy(struct domain *d,
>>  xen_domctl_cpu_policy_t *xdpc)
>>  {
>> -struct old_cpu_policy new = {};
>> +struct cpu_policy *new;
>>  struct cpu_policy *sys = is_pv_domain(d)
>>  ? (IS_ENABLED(CONFIG_PV)  ?  &pv_max_cpu_policy : NULL)
>>  : (IS_ENABLED(CONFIG_HVM) ? &hvm_max_cpu_policy : NULL);
>> -struct old_cpu_policy old_sys = { sys, sys };
>> +struct old_cpu_policy old_sys = { sys, sys }, old_new;
> Interesting name, but as long as it's transitional only, that's of course
> fine.

Yeah... it was the best I could come up with.

It does get removed in patch 8.

~Andrew



Re: [PATCH 4/9] x86: Merge struct msr_policy into struct cpu_policy

2023-03-30 Thread Andrew Cooper
On 30/03/2023 10:30 am, Jan Beulich wrote:
> On 29.03.2023 22:51, Andrew Cooper wrote:
>> --- a/xen/include/xen/lib/x86/cpu-policy.h
>> +++ b/xen/include/xen/lib/x86/cpu-policy.h
>> @@ -3,7 +3,6 @@
>>  #define XEN_LIB_X86_POLICIES_H
>>  
>>  #include 
>> -#include 
>>  
>>  #define FEATURESET_1d 0 /* 0x0001.edx  */
>>  #define FEATURESET_1c 1 /* 0x0001.ecx  */
>> @@ -107,6 +106,9 @@ const char *x86_cpuid_vendor_to_str(unsigned int vendor);
>>   CPUID_GUEST_NR_XSTATE - !!CPUID_GUEST_NR_XSTATE +  \
>>   CPUID_GUEST_NR_EXTD + 2 /* hv_limit and hv2_limit */ )
>>  
>> +/* Maximum number of MSRs written when serialising msr_policy. */
>> +#define MSR_MAX_SERIALISED_ENTRIES 2
> The comment better wouldn't refer to msr_policy anymore, I think.

Ah yes.  (There's so much comment and library cleanup still to do)

>  I also
> wonder whether the comment wouldn't better move ...
>
>> @@ -324,6 +326,44 @@ struct cpu_policy
>>  };
>>  } extd;
>>  
>> +/*
>> + * 0x00ce - MSR_INTEL_PLATFORM_INFO
> ... e.g. above here, to increase the chance of it being spotted that
> it needs updating if another MSR is added here.

I'm not sure about that.  In its current position, it's next to it's
CPUID partner.

The unit tests in test-cpu-policy cross-check that we never get -ENOBUFS
for a MSR_MAX_* sized destination, so I'm not worried about it actually
getting stale.


But, with the new merged policies, I need to change the serialising
behaviour anyway.

Right now we serialise everything unconditionally because that's the
only way that merging incremental deltas could be made to work, but now
the MSR enumeration bits are in the same structure we can use those
instead, along with the various "clear" passes we already have.

~Andrew



[PATCH v2] x86emul: further correct 64-bit mode zero count repeated string insn handling

2023-03-30 Thread Jan Beulich
In an entirely different context I came across Linux commit 428e3d08574b
("KVM: x86: Fix zero iterations REP-string"), which points out that
we're still doing things wrong: For one, there's no zero-extension at
all on AMD. And then while RCX is zero-extended from 32 bits uniformly
for all string instructions on newer hardware, RSI/RDI are only for MOVS
and STOS on the systems I have access to. (On an old family 0xf system
I've further found that for REP LODS even RCX is not zero-extended.)

Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn 
handling with zero count")
Signed-off-by: Jan Beulich 
---
Partly RFC for none of this being documented anywhere (and it partly
being model specific); inquiry pending.

If I was asked, I would have claimed to have tested all string insns and
for both vendors back at the time. But pretty clearly I didn't, and
instead I did derive uniform behavior from just the MOVS and STOS
observations on just Intel hardware; I'm sorry for that.
---
v2: Re-base over re-ordering against previously 2nd patch in a series.

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -1589,7 +1589,7 @@ static inline void put_loop_count(
 regs->r(cx) = ad_bytes == 4 ? (uint32_t)count : count;
 }
 
-#define get_rep_prefix(using_si, using_di) ({   \
+#define get_rep_prefix(extend_si, extend_di) ({ \
 unsigned long max_reps = 1; \
 if ( rep_prefix() ) \
 max_reps = get_loop_count(&_regs, ad_bytes);\
@@ -1597,14 +1597,14 @@ static inline void put_loop_count(
 {   \
 /*  \
  * Skip the instruction if no repetitions are required, but \
- * zero extend involved registers first when using 32-bit   \
+ * zero extend relevant registers first when using 32-bit   \
  * addressing in 64-bit mode.   \
  */ \
-if ( mode_64bit() && ad_bytes == 4 )\
+if ( !amd_like(ctxt) && mode_64bit() && ad_bytes == 4 ) \
 {   \
 _regs.r(cx) = 0;\
-if ( using_si ) _regs.r(si) = _regs.esi;\
-if ( using_di ) _regs.r(di) = _regs.edi;\
+if ( extend_si ) _regs.r(si) = _regs.esi;   \
+if ( extend_di ) _regs.r(di) = _regs.edi;   \
 }   \
 goto complete_insn; \
 }   \
@@ -4255,7 +4255,7 @@ x86_emulate(
 dst.bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes;
 if ( (rc = ioport_access_check(port, dst.bytes, ctxt, ops)) != 0 )
 goto done;
-nr_reps = get_rep_prefix(false, true);
+nr_reps = get_rep_prefix(false, false);
 dst.mem.off = truncate_ea_and_reps(_regs.r(di), nr_reps, dst.bytes);
 dst.mem.seg = x86_seg_es;
 /* Try the presumably most efficient approach first. */
@@ -4297,7 +4297,7 @@ x86_emulate(
 dst.bytes = !(b & 1) ? 1 : (op_bytes == 8) ? 4 : op_bytes;
 if ( (rc = ioport_access_check(port, dst.bytes, ctxt, ops)) != 0 )
 goto done;
-nr_reps = get_rep_prefix(true, false);
+nr_reps = get_rep_prefix(false, false);
 ea.mem.off = truncate_ea_and_reps(_regs.r(si), nr_reps, dst.bytes);
 /* Try the presumably most efficient approach first. */
 if ( !ops->rep_outs )
@@ -4633,7 +4633,7 @@ x86_emulate(
 case 0xa6 ... 0xa7: /* cmps */ {
 unsigned long next_eip = _regs.r(ip);
 
-get_rep_prefix(true, true);
+get_rep_prefix(false, false);
 src.bytes = dst.bytes = (d & ByteOp) ? 1 : op_bytes;
 if ( (rc = read_ulong(ea.mem.seg, truncate_ea(_regs.r(si)),
   &dst.val, dst.bytes, ctxt, ops)) ||
@@ -4675,7 +4675,7 @@ x86_emulate(
 }
 
 case 0xac ... 0xad: /* lods */
-get_rep_prefix(true, false);
+get_rep_prefix(false, false);
 if ( (rc = read_ulong(ea.mem.seg, truncate_ea(_regs.r(si)),
   &dst.val, dst.bytes, ctxt, ops)) != 0 )
 goto done;
@@ -4686,7 +4686,7 @@ x86_emulate(
 case 0xae ... 0xaf: /* scas */ {
 unsigned long next_eip = _regs.r(ip);
 
-get_rep_prefix(false, true);
+get_rep_prefix(false, false);
 if ( (rc = read_ulong(x86_seg_es, truncate_ea

Re: [PATCH RFC 0/9] x86: Merge cpuid and msr policy

2023-03-30 Thread Roger Pau Monné
On Wed, Mar 29, 2023 at 09:51:28PM +0100, Andrew Cooper wrote:
> tl;dr to add MSR_ARCH_CAPS features sensibly, cpu_{featureset<->policy}() need
> to not operate on objects of differing lifetimes, so structs
> {cpuid,msr}_policy need merging and cpu_policy is the obvious name.

So the problem is that there's a chance we might get a cpu_policy
object that contains a valid (allocated) cpuid object, but not an msr
one?

Or the problem is rather with the domain struct containing separate
cpuid/msr fields not encapsulated in the cpu_policy struct?

I don't think the current set of cpu_policy operations permit you to
operate on incomplete objects anyway.

I assume you are worried about the usage of x86_msr_copy_from_buffer()
for example that could load data into the MSR_ARCH_CAPS field for the
msr object without checking that the corresponding CPUID bit is set?

> But this does mean that we now have
> 
>   cpu_policy->basic.$X
>   cpu_policy->feat.$Y
>   cpu_policy->arch_caps.$Z

I'm not sure I like the fact that we now can't differentiate between
policy fields related to MSRs or CPUID leafs.

Isn't there a chance we might in the future get some name space
collision by us having decided to unify both?

Thanks, Roger.



Re: [PATCH v4 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities

2023-03-30 Thread Jan Beulich
On 30.03.2023 12:41, Luca Fancellu wrote:
> 
> 
>> On 28 Mar 2023, at 11:14, Jan Beulich  wrote:
>>
>> On 27.03.2023 12:59, Luca Fancellu wrote:
>>> --- a/xen/arch/arm/arm64/sve.c
>>> +++ b/xen/arch/arm/arm64/sve.c
>>> @@ -124,3 +124,15 @@ int __init sve_parse_dom0_param(const char *str_begin, 
>>> const char *str_end)
>>> {
>>> return parse_integer("sve", str_begin, str_end, (int*)&opt_dom0_sve);
>>> }
>>> +
>>> +void sve_arch_cap_physinfo(uint32_t *arch_capabilities)
>>> +{
>>> +if ( cpu_has_sve )
>>> +{
>>> +/* Vector length is divided by 128 to save some space */
>>> +uint32_t sve_vl = MASK_INSR(sve_encode_vl(get_sys_vl_len()),
>>> +XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
>>> +
>>> +*arch_capabilities |= sve_vl;
>>> +}
>>> +}
>>
>> I'm again wondering why a separate function is needed, when everything
>> that's needed is ...
>>
>>> --- a/xen/arch/arm/sysctl.c
>>> +++ b/xen/arch/arm/sysctl.c
>>> @@ -11,11 +11,14 @@
>>> #include 
>>> #include 
>>> #include 
>>> +#include 
>>
>> ... becoming available here for use ...
>>
>>> #include 
>>>
>>> void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
>>> {
>>> pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
>>> +
>>> +sve_arch_cap_physinfo(&pi->arch_capabilities);
>>
>> ... here. That would be even more so if, like suggested before,
>> get_sys_vl_len() returned 0 when !cpu_has_sve.
> 
> I’ve had a look on this, I can do everything In arch_do_physinfo if in 
> xen/include/public/sysctl.h
> the XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK is protected by __aarch64__ or __arm__ .
> 
> Do you agree on that?

I don't see the connection, but guarding the #define in the public header
looks not only okay, but even desirable to me.

Jan



Re: [PATCH v4 08/12] xen/physinfo: encode Arm SVE vector length in arch_capabilities

2023-03-30 Thread Luca Fancellu


> On 28 Mar 2023, at 11:14, Jan Beulich  wrote:
> 
> On 27.03.2023 12:59, Luca Fancellu wrote:
>> --- a/xen/arch/arm/arm64/sve.c
>> +++ b/xen/arch/arm/arm64/sve.c
>> @@ -124,3 +124,15 @@ int __init sve_parse_dom0_param(const char *str_begin, 
>> const char *str_end)
>> {
>> return parse_integer("sve", str_begin, str_end, (int*)&opt_dom0_sve);
>> }
>> +
>> +void sve_arch_cap_physinfo(uint32_t *arch_capabilities)
>> +{
>> +if ( cpu_has_sve )
>> +{
>> +/* Vector length is divided by 128 to save some space */
>> +uint32_t sve_vl = MASK_INSR(sve_encode_vl(get_sys_vl_len()),
>> +XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
>> +
>> +*arch_capabilities |= sve_vl;
>> +}
>> +}
> 
> I'm again wondering why a separate function is needed, when everything
> that's needed is ...
> 
>> --- a/xen/arch/arm/sysctl.c
>> +++ b/xen/arch/arm/sysctl.c
>> @@ -11,11 +11,14 @@
>> #include 
>> #include 
>> #include 
>> +#include 
> 
> ... becoming available here for use ...
> 
>> #include 
>> 
>> void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
>> {
>> pi->capabilities |= XEN_SYSCTL_PHYSCAP_hvm | XEN_SYSCTL_PHYSCAP_hap;
>> +
>> +sve_arch_cap_physinfo(&pi->arch_capabilities);
> 
> ... here. That would be even more so if, like suggested before,
> get_sys_vl_len() returned 0 when !cpu_has_sve.

I’ve had a look on this, I can do everything In arch_do_physinfo if in 
xen/include/public/sysctl.h
the XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK is protected by __aarch64__ or __arm__ .

Do you agree on that?

> 
> Jan



[PATCH v5] x86: detect CMOS aliasing on ports other than 0x70/0x71

2023-03-30 Thread Jan Beulich
... in order to also intercept Dom0 accesses through the alias ports.

Also stop intercepting accesses to the CMOS ports if we won't ourselves
use the CMOS RTC, because of there being none.

Note that rtc_init() deliberately uses 16 as the upper loop bound,
despite probe_cmos_alias() using 8: The higher bound is benign now, but
would save us touching the code (or, worse, missing to touch it) in case
the lower one was doubled.

Signed-off-by: Jan Beulich 
---
v5: Simplify logic in is_cmos_port(). Limit the scope of a local
variable. Adjust a comment that's being moved.
v4: Also conditionally mask top bit for guest index port accesses. Add
missing adjustments to rtc_init(). Re-work to avoid recursive
read_lock(). Also adjust guest_io_{read,write}(). Re-base.
v3: Re-base over change to earlier patch.
v2: Re-base.

--- a/xen/arch/x86/hvm/rtc.c
+++ b/xen/arch/x86/hvm/rtc.c
@@ -27,7 +27,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -836,10 +836,18 @@ void rtc_init(struct domain *d)
 
 if ( !has_vrtc(d) )
 {
-if ( is_hardware_domain(d) )
-/* Hardware domain gets mediated access to the physical RTC. */
-register_portio_handler(d, RTC_PORT(0), 2, hw_rtc_io);
-return;
+unsigned int port;
+
+if ( !is_hardware_domain(d) )
+return;
+
+/*
+ * Hardware domain gets mediated access to the physical RTC/CMOS (of
+ * course unless we don't use it ourselves, for there being none).
+ */
+for ( port = RTC_PORT(0); port < RTC_PORT(0) + 0x10; port += 2 )
+if ( is_cmos_port(port, 2, d) )
+register_portio_handler(d, port, 2, hw_rtc_io);
 }
 
 spin_lock_init(&s->lock);
--- a/xen/arch/x86/include/asm/mc146818rtc.h
+++ b/xen/arch/x86/include/asm/mc146818rtc.h
@@ -9,6 +9,10 @@
 
 extern spinlock_t rtc_lock; /* serialize CMOS RAM access */
 
+struct domain;
+bool is_cmos_port(unsigned int port, unsigned int bytes,
+  const struct domain *d);
+
 /**
  * register summary
  **/
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -220,7 +220,7 @@ static bool admin_io_okay(unsigned int p
 return false;
 
 /* We also never permit direct access to the RTC/CMOS registers. */
-if ( port <= RTC_PORT(1) && port + bytes > RTC_PORT(0) )
+if ( is_cmos_port(port, bytes, d) )
 return false;
 
 return ioports_access_permitted(d, port, port + bytes - 1);
@@ -290,7 +290,7 @@ static uint32_t guest_io_read(unsigned i
 {
 sub_data = pv_pit_handler(port, 0, 0);
 }
-else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+else if ( is_cmos_port(port, 1, currd) )
 {
 sub_data = rtc_guest_read(port);
 }
@@ -436,7 +436,7 @@ static void guest_io_write(unsigned int
 {
 pv_pit_handler(port, (uint8_t)data, 1);
 }
-else if ( port == RTC_PORT(0) || port == RTC_PORT(1) )
+else if ( is_cmos_port(port, 1, currd) )
 {
 rtc_guest_write(port, data);
 }
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2131,37 +2131,36 @@ int __hwdom_init xen_in_range(unsigned l
 static int __hwdom_init cf_check io_bitmap_cb(
 unsigned long s, unsigned long e, void *ctx)
 {
-struct domain *d = ctx;
+const struct domain *d = ctx;
 unsigned int i;
 
 ASSERT(e <= INT_MAX);
 for ( i = s; i <= e; i++ )
-__clear_bit(i, d->arch.hvm.io_bitmap);
+/*
+ * Accesses to RTC ports also need to be trapped in order to keep
+ * consistency with hypervisor accesses.
+ */
+if ( !is_cmos_port(i, 1, d) )
+__clear_bit(i, d->arch.hvm.io_bitmap);
 
 return 0;
 }
 
 void __hwdom_init setup_io_bitmap(struct domain *d)
 {
-int rc;
+if ( !is_hvm_domain(d) )
+return;
 
-if ( is_hvm_domain(d) )
-{
-bitmap_fill(d->arch.hvm.io_bitmap, 0x1);
-rc = rangeset_report_ranges(d->arch.ioport_caps, 0, 0x1,
-io_bitmap_cb, d);
-BUG_ON(rc);
-/*
- * NB: we need to trap accesses to 0xcf8 in order to intercept
- * 4 byte accesses, that need to be handled by Xen in order to
- * keep consistency.
- * Access to 1 byte RTC ports also needs to be trapped in order
- * to keep consistency with PV.
- */
-__set_bit(0xcf8, d->arch.hvm.io_bitmap);
-__set_bit(RTC_PORT(0), d->arch.hvm.io_bitmap);
-__set_bit(RTC_PORT(1), d->arch.hvm.io_bitmap);
-}
+bitmap_fill(d->arch.hvm.io_bitmap, 0x1);
+if ( rangeset_report_ranges(d->arch.ioport_caps, 0, 0x1,
+io_bitmap_cb, d) )
+BUG();
+

[PATCH] bump default SeaBIOS version to 1.16.2

2023-03-30 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/Config.mk
+++ b/Config.mk
@@ -225,7 +225,7 @@ MINIOS_UPSTREAM_URL ?= https://xenbits.x
 MINIOS_UPSTREAM_REVISION ?= 5bcb28aaeba1c2506a82fab0cdad0201cd9b54b3
 
 SEABIOS_UPSTREAM_URL ?= https://xenbits.xen.org/git-http/seabios.git
-SEABIOS_UPSTREAM_REVISION ?= rel-1.16.1
+SEABIOS_UPSTREAM_REVISION ?= rel-1.16.2
 
 ETHERBOOT_NICS ?= rtl8139 8086100e
 



[qemu-mainline test] 180057: tolerable trouble: fail/pass/starved - PUSHED

2023-03-30 Thread osstest service owner
flight 180057 qemu-mainline real [real]
flight 180063 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/180057/
http://logs.test-lab.xenproject.org/osstest/logs/180063/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm 12 debian-hvm-install fail pass 
in 180063-retest

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 180047
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 180047
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 180047
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 180047
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 180047
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-raw  14 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  14 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-qcow2  1 build-check(1)   starved  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-rtds  1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-vhd   1 build-check(1)   starved  n/a
 build-armhf-libvirt   1 build-check(1)   starved  n/a
 test-armhf-armhf-xl-credit1   1 build-check(1)   starved  n/a
 build-armhf   2 hosts-allocate   starved  n/a

version targeted for testing:
 qemuuf00506aeca2f6d92318967693f8da8c713c163f3
baseline version:
 qemuud37158bb2425e7ebffb167d611be01f1e9e6c86f

Last test of basis   180047  2023-03-29 02:43:12 Z1 days
Testing same since   180057  2023-03-29 21:10:00 Z0 days1 attempts


People who touched revisions under test:
  Emilio Cota 
  Peter Maydell 
  Philippe Mathieu-Daudé 
  Richard Henderson 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  starved 
 build-i386   pass
 build-amd64-libvirt  pass
 build-arm64-libvirt  pass
 build-armhf-libvirt  starved 
 build-i386-libvi

Re: [PATCH RFC 0/9] x86: Merge cpuid and msr policy

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> tl;dr to add MSR_ARCH_CAPS features sensibly, cpu_{featureset<->policy}() need
> to not operate on objects of differing lifetimes, so structs
> {cpuid,msr}_policy need merging and cpu_policy is the obvious name.
> 
> But this does mean that we now have
> 
>   cpu_policy->basic.$X
>   cpu_policy->feat.$Y
>   cpu_policy->arch_caps.$Z
> 
> and plenty of code now reads
> 
>   d->arch.cpu_policy->feat.$Y
> 
> instead of
> 
>   d->arch.cpuid->feat.$Y
> 
> The latter can be half-fixed with some union magic (see patch 9 commit
> message).  The former can be fixed by putting cpuid/msr infixes in cpu_policy,
> which is doable but very invasive, and would make plenty of code read
> 
>   d->arch.cpu_policy->cpuid.feat.$Y
> 
> and the two obviously shouldn't be done together.
> 
> So, RFC.  Does this code layout look ok?  If we want to make changes with
> naming, now is very much the right time to get them sorted.
> 
> Patches 1-8 are pretty ready to go.  Patch 9 is the remainder to take out the
> temporary hacks, and I'm still in the process of merging the system policy
> derivation.
> 
> Andrew Cooper (9):
>   x86: Rename struct cpu_policy to struct old_cpuid_policy
>   x86: Rename {domctl,sysctl}.cpu_policy.{cpuid,msr_policy} fields

Nit: I guess the last closing brace wants moving forward a little.

>   x86: Rename struct cpuid_policy to struct cpu_policy
>   x86: Merge struct msr_policy into struct cpu_policy
>   x86: Merge the system {cpuid,msr} policy objects
>   x86: Merge a domain's {cpuid,msr} policy objects
>   x86: Merge xc_cpu_policy's cpuid and msr objects
>   x86: Drop struct old_cpu_policy

With the small comments on individual patches taken care of one way or
another, up to here:

Reviewed-by: Jan Beulich 

Jan

>   RFC: Everything else
> 
>  tools/fuzz/cpu-policy/afl-policy-fuzzer.c |  15 +-
>  .../fuzz/x86_instruction_emulator/fuzz-emul.c |   2 +-
>  tools/libs/guest/xg_cpuid_x86.c   |  48 +-
>  tools/libs/guest/xg_private.h |   5 +-
>  tools/tests/cpu-policy/test-cpu-policy.c  |  50 +-
>  tools/tests/tsx/test-tsx.c|  58 +-
>  tools/tests/x86_emulator/Makefile |   2 +-
>  tools/tests/x86_emulator/test_x86_emulator.c  |   2 +-
>  tools/tests/x86_emulator/x86-emulate.c|   2 +-
>  tools/tests/x86_emulator/x86-emulate.h|   2 +-
>  xen/arch/x86/Makefile |   1 +
>  xen/arch/x86/cpu-policy.c |  67 +++
>  xen/arch/x86/cpu/common.c |   4 +-
>  xen/arch/x86/cpu/mcheck/mce_intel.c   |   2 +-
>  xen/arch/x86/cpu/vpmu_intel.c |   4 +-
>  xen/arch/x86/cpuid.c  | 101 ++--
>  xen/arch/x86/domain.c |  18 +-
>  xen/arch/x86/domctl.c |  51 +-
>  xen/arch/x86/hvm/emulate.c|   2 +-
>  xen/arch/x86/hvm/hvm.c|  38 +-
>  xen/arch/x86/hvm/ioreq.c  |   4 +-
>  xen/arch/x86/hvm/mtrr.c   |   2 +-
>  xen/arch/x86/hvm/svm/svm.c|  18 +-
>  xen/arch/x86/hvm/svm/svmdebug.c   |   2 +-
>  xen/arch/x86/hvm/vlapic.c |   2 +-
>  xen/arch/x86/hvm/vmx/vmx.c|  12 +-
>  xen/arch/x86/hvm/vmx/vvmx.c   |   2 +-
>  xen/arch/x86/include/asm/cpu-policy.h |  18 +
>  xen/arch/x86/include/asm/cpuid.h  |  10 -
>  xen/arch/x86/include/asm/domain.h |   4 +-
>  xen/arch/x86/include/asm/guest_pt.h   |   4 +-
>  xen/arch/x86/include/asm/msr.h|  13 +-
>  xen/arch/x86/include/asm/paging.h |   2 +-
>  xen/arch/x86/mm/mem_sharing.c |   3 +-
>  xen/arch/x86/mm/shadow/hvm.c  |   2 +-
>  xen/arch/x86/msr.c|  98 +---
>  xen/arch/x86/pv/domain.c  |   2 +-
>  xen/arch/x86/pv/emul-priv-op.c|   6 +-
>  xen/arch/x86/pv/ro-page-fault.c   |   2 +-
>  xen/arch/x86/sysctl.c |  77 +--
>  xen/arch/x86/traps.c  |   2 +-
>  xen/arch/x86/x86_emulate.c|   2 +-
>  xen/arch/x86/x86_emulate/x86_emulate.c| 166 +++---
>  xen/arch/x86/x86_emulate/x86_emulate.h|   6 +-
>  xen/arch/x86/xstate.c |   4 +-
>  xen/include/public/domctl.h   |   4 +-
>  xen/include/public/sysctl.h   |   4 +-
>  xen/include/xen/lib/x86/cpu-policy.h  | 540 +-
>  xen/include/xen/lib/x86/cpuid.h   | 475 ---
>  xen/include/xen/lib/x86/msr.h | 104 
>  xen/lib/x86/cpuid.c   |  12 +-
>  xen/lib/x86/msr.c |   6 +-
>  xen/lib/x86/policy.c  |   8 +-
>  53 files changed, 986 insertions(+), 1104 deletions(-)
>  create m

Re: [PATCH 9/9] RFC: Everything else

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> Looking at this diff, I'm wondering whether keeping
> 
> union {
> struct cpu_policy *cpuid;
> struct cpu_policy *cpu_policy;
> };
> 
> permentantly might be a good idea, because d->arch.cpuid->$X reads rather
> better than d->arch.cpu_policy->$X
> 
> Thoughts?

I'm not overly fussed, but perhaps yes. Nevertheless e.g. ...

> --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c
> @@ -893,7 +893,7 @@ int LLVMFuzzerTestOneInput(const uint8_t *data_p, size_t 
> size)
>  struct x86_emulate_ctxt ctxt = {
>  .data = &state,
>  .regs = &input.regs,
> -.cpuid = &cp,
> +.cpu_policy = &cp,

... this and ...

> --- a/tools/tests/x86_emulator/test_x86_emulator.c
> +++ b/tools/tests/x86_emulator/test_x86_emulator.c
> @@ -909,7 +909,7 @@ int main(int argc, char **argv)
>  
>  ctxt.regs = ®s;
>  ctxt.force_writeback = 0;
> -ctxt.cpuid = &cp;
> +ctxt.cpu_policy = &cp;

... this imo want keeping as you have it here.

Jan



Re: [PATCH 8/9] x86: Drop struct old_cpu_policy

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> --- a/tools/libs/guest/xg_cpuid_x86.c
> +++ b/tools/libs/guest/xg_cpuid_x86.c
> @@ -868,9 +868,7 @@ bool xc_cpu_policy_is_compatible(xc_interface *xch, 
> xc_cpu_policy_t *host,
>   xc_cpu_policy_t *guest)
>  {
>  struct cpu_policy_errors err = INIT_CPU_POLICY_ERRORS;
> -struct old_cpu_policy h = { &host->policy, &host->policy };
> -struct old_cpu_policy g = { &guest->policy, &guest->policy };
> -int rc = x86_cpu_policies_are_compatible(&h, &g, &err);
> +int rc = x86_cpu_policies_are_compatible(&host->policy, &host->policy, 
> &err);

One of the two surely wants to be &guest->policy?

Jan



Re: [PATCH 7/9] x86: Merge xc_cpu_policy's cpuid and msr objects

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
>  /* Sanity test various invariants we expect in the default/max policies. */
>  static void test_guest_policies(const struct xc_cpu_policy *max,
>  const struct xc_cpu_policy *def)
>  {
> -const struct cpuid_policy *cm = &max->cpuid;
> -const struct cpuid_policy *cd = &def->cpuid;
> -const struct msr_policy *mm = &max->msr;
> -const struct msr_policy *md = &def->msr;
> +const struct cpu_policy *m = &max->policy;
> +const struct cpu_policy *d = &def->policy;

Looks like you could reduce code churn by keeping "cm" and "cd" as the
names, which would also be consistent with you continuing to use "cp"
in hypervisor code.

Jan

>  dump_tsx_details(max, "Max:");
>  dump_tsx_details(def, "Def:");
>  
> -if ( ((cm->feat.raw[0].d | cd->feat.raw[0].d) &
> +if ( ((m->feat.raw[0].d | d->feat.raw[0].d) &
>(bitmaskof(X86_FEATURE_TSX_FORCE_ABORT) |
> bitmaskof(X86_FEATURE_RTM_ALWAYS_ABORT) |
> bitmaskof(X86_FEATURE_SRBDS_CTRL))) ||
> - ((mm->arch_caps.raw | md->arch_caps.raw) & ARCH_CAPS_TSX_CTRL) )
> + ((m->arch_caps.raw | d->arch_caps.raw) & ARCH_CAPS_TSX_CTRL) )
>  fail("  Xen-only TSX controls offered to guest\n");
>  
>  switch ( rtm_behaviour )
>  {
>  case RTM_UD:
> -if ( (cm->feat.raw[0].b | cd->feat.raw[0].b) &
> +if ( (m->feat.raw[0].b | d->feat.raw[0].b) &
>   (bitmaskof(X86_FEATURE_HLE) | bitmaskof(X86_FEATURE_RTM)) )
>   fail("  HLE/RTM offered to guests despite not being 
> available\n");
>  break;
>  
>  case RTM_ABORT:
> -if ( cd->feat.raw[0].b &
> +if ( d->feat.raw[0].b &
>   (bitmaskof(X86_FEATURE_HLE) | bitmaskof(X86_FEATURE_RTM)) )
>   fail("  HLE/RTM offered to guests by default despite not being 
> usable\n");
>  break;
>  
>  case RTM_OK:
> -if ( !cm->feat.rtm || !cd->feat.rtm )
> +if ( !m->feat.rtm || !d->feat.rtm )
>   fail("  RTM not offered to guests despite being available\n");
>  break;
>  }
>  
> -if ( cd->feat.hle )
> +if ( d->feat.hle )
>  fail("  Fail: HLE offered in default policy\n");
>  }
>  




Re: [PATCH 6/9] x86: Merge a domain's {cpuid,msr} policy objects

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> @@ -573,7 +574,6 @@ int arch_vcpu_create(struct vcpu *v)
>  /* Idle domain */
>  v->arch.cr3 = __pa(idle_pg_table);
>  rc = 0;
> -v->arch.msrs = ZERO_BLOCK_PTR; /* Catch stray misuses */
>  }

Is this intentional? It's a vCPU pointer here, not a domain one.

> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -40,11 +40,11 @@
>  static int update_domain_cpu_policy(struct domain *d,
>  xen_domctl_cpu_policy_t *xdpc)
>  {
> -struct old_cpu_policy new = {};
> +struct cpu_policy *new;
>  struct cpu_policy *sys = is_pv_domain(d)
>  ? (IS_ENABLED(CONFIG_PV)  ?  &pv_max_cpu_policy : NULL)
>  : (IS_ENABLED(CONFIG_HVM) ? &hvm_max_cpu_policy : NULL);
> -struct old_cpu_policy old_sys = { sys, sys };
> +struct old_cpu_policy old_sys = { sys, sys }, old_new;

Interesting name, but as long as it's transitional only, that's of course
fine.

Jan



Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

2023-03-30 Thread Roger Pau Monné
On Thu, Mar 30, 2023 at 09:53:23AM +0200, Jan Beulich wrote:
> On 29.03.2023 16:17, Roger Pau Monné wrote:
> > On Tue, Mar 28, 2023 at 04:48:10PM +0200, Jan Beulich wrote:
> >> On 28.03.2023 16:19, Roger Pau Monné wrote:
> >>> On Wed, Jun 15, 2022 at 11:57:54AM +0200, Jan Beulich wrote:
>  ... of the huge monolithic source file. The series is largely code
>  movement and hence has the intention of not incurring any functional
>  change.
> >>>
> >>> I take the intention is to make code simpler and easier to follow by
> >>> splitting it up into smaller files?
> >>
> >> Well, I can't say yes or no to "simpler" or "easier to follow", but
> >> splitting is the goal, in the hope that these may end up as a side
> >> effects. There's always the risk that scattering things around may
> >> also make things less obvious. My main motivation, however, is the
> >> observation that this huge source file alone consumes a fair part
> >> of (non-parallelizable) build time. To the degree that with older
> >> gcc building this one file takes ten (or so) times as long as the
> > 
> > I wouldn't really trade compiler speed for clarity in a piece of code
> > like the x86 emulator, which is already very complex.
> 
> Of course, and I specifically said "main" motivation. The hope is that
> by splitting things become less entangled / convoluted. I guess fpu.c
> is a good example where certain non-trivial macros have isolated use,
> and hence are no longer cluttering other parts of the emulator sources.
> 
> > Do you have some figures of the performance difference this series
> > makes when building the emulator?
> 
> No, I don't. And the difference isn't going to be significant, I expect,
> as the build being slow is - from all I can tell - directly connected to
> the huge switch() statement. Yet the number of cases there shrinks only
> marginally for now. The series is named "a few small steps" for this
> reason, along with others. See below for a first bigger step, which may
> then make a noticeable difference.
> 
> > A couple of notes from someone that's not familiar with the x86
> > emulator.  It would be clearer if the split files where prefixed with
> > opcode_ for those that deal with emulation (blk.c, 0f01.c, ...) IMO,
> > so that you clearly see this is an opcode family that has been split
> > into a separate file, or maybe insn_{opcode,blk,fpu,...}?
> 
> Hmm. For one, "blk" isn't really dealing with any opcode family in
> particular. It contains a helper function for code using the emulator.
> So it falls more in the group of util*.c. For the others may main
> objection would be that I'd prefer to keep file names short. At least
> at this point of splitting I think file names are sufficiently
> descriptive. Nevertheless, insn-0f01.c or opc-0f01.c may be options, if
> we really think we want/need to group files visually. However, I don't
> expect there are going to be more files paralleling 0f01.c et al: The
> opcode groups split out are the ones that are large/heterogeneous
> enough to warrant doing it on this basis. Of course new such groups may
> appear in the ISA down the road.
> 
> FPU is isolated functionally, and I'd expect a simd.c to appear once
> it becomes clear if/how to sensibly split out SIMD code. Unlike fpu.c
> I'd further expect this to (long term) consist of more than just a
> single function, hopefully replacing the massive use of "goto" within
> that big switch statement by function calls (but as said, plans here
> - if one can call it that way in the first place - are very vague at
> this point).
> 
> > I've noticed in some of the newly introduced files the original
> > copyright notice from Keir is lost, I assume that's on purpose because
> > none of the code was contributed by Kier in that file? (see 0f01.c vs
> > 0fae.c for example).
> 
> Right - 0fae.c contains only code which was added later (mostly by me),
> if I'm not mistaken.

OK, just wanted to make sure this wasn't an oversight.

> > Do we expect to be able to split other opcode handling into separate
> > files?
> 
> As per above, "expect" is perhaps too optimistic. I'd say "hope", in
> particular for SIMD code (which I guess is now the main part of the
> ISA as well as the sources, at least number-of-insns-wise). One
> possible (likely intermediate) approach might be to move all SIMD code
> from the huge switch() statement to its own file/function, invoked
> from that huge switch()'s default: case. It may then still be a big
> switch() statement in (e.g.) simd.c, but we'd then at least have two
> of them, each about half the size of what we have right now. Plus it
> may allow some (most?) of the X86EMUL_NO_SIMD #ifdef-ary to be dropped,
> as the new file - like fpu.c - could then itself be built only
> conditionally.

I don't like the handling of SIMD from a default case in the global
switch much, as we then could end up chaining all kind of random
handling in the default case.  It's IMO clearer if we can detect and
forward insn

Re: [PATCH 5/9] x86: Merge the system {cpuid,msr} policy objects

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> Right now, they're the same underlying type, containing disjoint information.
> 
> Introduce a new cpu-policy.{h,c} to be the new location for all policy
> handling logic.  Place the combined objects in __ro_after_init, which has
> appeared since the original logic was written.
> 
> As we're trying to phase out the use of struct old_cpu_policy entirely, rework
> update_domain_cpu_policy() to not pointer-chase through system_policies[].
> 
> This in turn allows system_policies[] in sysctl.c to become static and reduced
> in scope to XEN_SYSCTL_get_cpu_policy.
> 
> No practical change, but we do half the amount of compile time space required
> for the system policies, which saves 6x almost-2k at the moment.

But what you save here is only what was introduced earlier in the series,
if I'm not mistaken (just like the "containing disjoint" further up)? Also 
do you mean "halve" in place of "half"?

> @@ -391,7 +360,19 @@ long arch_do_sysctl(
>  
>  case XEN_SYSCTL_get_cpu_policy:
>  {
> -const struct old_cpu_policy *policy;
> +static const struct cpu_policy *system_policies[6] = {

Another const wanted here after *?

Jan



Re: [PATCH 4/9] x86: Merge struct msr_policy into struct cpu_policy

2023-03-30 Thread Jan Beulich
On 29.03.2023 22:51, Andrew Cooper wrote:
> --- a/xen/include/xen/lib/x86/cpu-policy.h
> +++ b/xen/include/xen/lib/x86/cpu-policy.h
> @@ -3,7 +3,6 @@
>  #define XEN_LIB_X86_POLICIES_H
>  
>  #include 
> -#include 
>  
>  #define FEATURESET_1d 0 /* 0x0001.edx  */
>  #define FEATURESET_1c 1 /* 0x0001.ecx  */
> @@ -107,6 +106,9 @@ const char *x86_cpuid_vendor_to_str(unsigned int vendor);
>   CPUID_GUEST_NR_XSTATE - !!CPUID_GUEST_NR_XSTATE +  \
>   CPUID_GUEST_NR_EXTD + 2 /* hv_limit and hv2_limit */ )
>  
> +/* Maximum number of MSRs written when serialising msr_policy. */
> +#define MSR_MAX_SERIALISED_ENTRIES 2

The comment better wouldn't refer to msr_policy anymore, I think. I also
wonder whether the comment wouldn't better move ...

> @@ -324,6 +326,44 @@ struct cpu_policy
>  };
>  } extd;
>  
> +/*
> + * 0x00ce - MSR_INTEL_PLATFORM_INFO

... e.g. above here, to increase the chance of it being spotted that
it needs updating if another MSR is added here.

Jan



[PATCH v2 13/13] tools/xenstore: remove unused stuff from list.h

2023-03-30 Thread Juergen Gross
Remove the hlist devines/functions and the rcu related functions from
tools/xenstore/list.h, as they are not used.

Signed-off-by: Juergen Gross 
---
 tools/xenstore/list.h | 227 --
 1 file changed, 227 deletions(-)

diff --git a/tools/xenstore/list.h b/tools/xenstore/list.h
index a464a38b61..d722a91220 100644
--- a/tools/xenstore/list.h
+++ b/tools/xenstore/list.h
@@ -88,48 +88,6 @@ static inline void list_add_tail(struct list_head *new, 
struct list_head *head)
__list_add(new, head->prev, head);
 }
 
-/*
- * Insert a new entry between two known consecutive entries. 
- *
- * This is only for internal list manipulation where we know
- * the prev/next entries already!
- */
-static __inline__ void __list_add_rcu(struct list_head * new,
-   struct list_head * prev,
-   struct list_head * next)
-{
-   new->next = next;
-   new->prev = prev;
-   next->prev = new;
-   prev->next = new;
-}
-
-/**
- * list_add_rcu - add a new entry to rcu-protected list
- * @new: new entry to be added
- * @head: list head to add it after
- *
- * Insert a new entry after the specified head.
- * This is good for implementing stacks.
- */
-static __inline__ void list_add_rcu(struct list_head *new, struct list_head 
*head)
-{
-   __list_add_rcu(new, head, head->next);
-}
-
-/**
- * list_add_tail_rcu - add a new entry to rcu-protected list
- * @new: new entry to be added
- * @head: list head to add it before
- *
- * Insert a new entry before the specified head.
- * This is useful for implementing queues.
- */
-static __inline__ void list_add_tail_rcu(struct list_head *new, struct 
list_head *head)
-{
-   __list_add_rcu(new, head->prev, head);
-}
-
 /*
  * Delete a list entry by making the prev/next entries
  * point to each other.
@@ -156,23 +114,6 @@ static inline void list_del(struct list_head *entry)
entry->prev = LIST_POISON2;
 }
 
-/**
- * list_del_rcu - deletes entry from list without re-initialization
- * @entry: the element to delete from the list.
- *
- * Note: list_empty on entry does not return true after this, 
- * the entry is in an undefined state. It is useful for RCU based
- * lockfree traversal.
- *
- * In particular, it means that we can not poison the forward 
- * pointers that may still be used for walking the list.
- */
-static inline void list_del_rcu(struct list_head *entry)
-{
-   __list_del(entry->prev, entry->next);
-   entry->prev = LIST_POISON2;
-}
-
 /**
  * list_del_init - deletes entry from list and reinitialize it.
  * @entry: the element to delete from the list.
@@ -339,172 +280,4 @@ static inline void list_splice_init(struct list_head 
*list,
 &pos->member != (head);\
 pos = n, n = list_entry(n->member.next, typeof(*n), member))
 
-
-/* 
- * Double linked lists with a single pointer list head. 
- * Mostly useful for hash tables where the two pointer list head is 
- * too wasteful.
- * You lose the ability to access the tail in O(1).
- */ 
-
-struct hlist_head { 
-   struct hlist_node *first; 
-}; 
-
-struct hlist_node { 
-   struct hlist_node *next, **pprev; 
-}; 
-
-#define HLIST_HEAD_INIT { .first = NULL } 
-#define HLIST_HEAD(name) struct hlist_head name = {  .first = NULL }
-#define INIT_HLIST_HEAD(ptr) ((ptr)->first = NULL) 
-#define INIT_HLIST_NODE(ptr) ((ptr)->next = NULL, (ptr)->pprev = NULL)
-
-static __inline__ int hlist_unhashed(struct hlist_node *h) 
-{ 
-   return !h->pprev;
-} 
-
-static __inline__ int hlist_empty(struct hlist_head *h) 
-{ 
-   return !h->first;
-} 
-
-static __inline__ void __hlist_del(struct hlist_node *n) 
-{
-   struct hlist_node *next = n->next;
-   struct hlist_node **pprev = n->pprev;
-   *pprev = next;  
-   if (next) 
-   next->pprev = pprev;
-}  
-
-static __inline__ void hlist_del(struct hlist_node *n)
-{
-   __hlist_del(n);
-   n->next = LIST_POISON1;
-   n->pprev = LIST_POISON2;
-}
-
-/**
- * hlist_del_rcu - deletes entry from hash list without re-initialization
- * @entry: the element to delete from the hash list.
- *
- * Note: list_unhashed() on entry does not return true after this, 
- * the entry is in an undefined state. It is useful for RCU based
- * lockfree traversal.
- *
- * In particular, it means that we can not poison the forward
- * pointers that may still be used for walking the hash list.
- */
-static inline void hlist_del_rcu(struct hlist_node *n)
-{
-   __hlist_del(n);
-   n->pprev = LIST_POISON2;
-}
-
-static __inline__ void hlist_del_init(struct hlist_node *n) 
-{
-   if (n->pprev)  {
-   __hlist_del(n);
-   INIT_HLIST_NODE(n);
-   }
-}  
-
-#define hlist_del_rcu_init hlist_del_init
-
-static __inline__ void hlist_add_head(struct hlist_node *n, struct hlist_head 
*h) 
-{ 
-   struct hlist_node *first = h->first;
-   n->next = first; 
-   if (first) 
-   first->pprev = &

  1   2   >