Re: BUG: libxl vuart build order

2020-10-29 Thread Takahiro Akashi
Hi Stefano,

On Thu, Oct 29, 2020 at 07:03:28PM -0700, Stefano Stabellini wrote:
> + xen-devel and libxl maintainers
> 
> In short, there is a regression in libxl with the ARM vuart introduced
> by moving ARM guests to the PVH build.
> 
> 
> On Thu, 29 Oct 2020, Takahiro Akashi wrote:
> > On Wed, Oct 28, 2020 at 02:44:16PM -0700, Stefano Stabellini wrote:
> > > On Wed, 28 Oct 2020, Takahiro Akashi wrote:
> > > > On Tue, Oct 27, 2020 at 09:02:14AM +0900, Takahiro Akashi wrote:
> > > > > On Mon, Oct 26, 2020 at 04:37:30PM -0700, Stefano Stabellini wrote:
> > > > > > 
> > > > > > On Mon, 26 Oct 2020, Takahiro Akashi wrote:
> > > > > > > Stefano,
> > > > > > > 
> > > > > > > # I'm afraid that I have already bothered you with a lot of 
> > > > > > > questions.
> > > > > > > 
> > > > > > > When I looked at Xen's vpl011 implementation, I found
> > > > > > > CR (and LCHR) register is not supported. (trap may cause a data 
> > > > > > > abort).
> > > > > > > On the other hand, for example, linux's pl011 driver surely
> > > > > > > accesses CR (and LCHR) register.
> > > > > > > So I guess that linux won't be able to use pl011 on a Xen guest vm
> > > > > > > if vuart = "sbsa_uart".
> > > > > > > 
> > > > > > > Is this a known issue or do I miss anything?
> > > > > > 
> > > > > > Linux should definitely be able to use it, and in fact, I am using 
> > > > > > it
> > > > > > with Linux in my test environment.
> > > > > > 
> > > > > > I think the confusion comes from the name "vpl011": it is in fact 
> > > > > > not a
> > > > > > full PL011 UART, but an SBSA UART.
> > > > > 
> > > > > Yeah, I have noticed it.
> > > > > 
> > > > > > SBSA UART only implements a subset of
> > > > > > the PL011 registers. The compatible string is "arm,sbsa-uart", also 
> > > > > > see
> > > > > > drivers/tty/serial/amba-pl011.c:sbsa_uart_probe.
> > > > > 
> > > > > Looking closely into the details of implementation, I found
> > > > > that all the accesses to unimplemented registers, including
> > > > > CR, are deliberately avoided in sbsa part of linux driver.
> > > > 
> > > > So I'm now trying to implement "sbsa-uart" driver on U-Boot
> > > > by modifying the existing pl011 driver.
> > > > (Please note the current xen'ized U-Boot utilises a para-virtualized
> > > > console, i.e. with HVM_PARAM_CONSOLE_PFN.)
> > > > 
> > > > So far my all attempts have failed.
> > > > 
> > > > There are a couple of problems, and one of them is how we can
> > > > access vpl011 port (from dom0).
> > > > What I did is:
> > > > - modify U-Boot's pl011 driver
> > > >   (I'm sure that the driver correctly handle a vpl011 device
> > > >   with regard of accessing a proper set of registers.)
> > > > - start U-Boot guest with "vuart=sbsa_uart" by
> > > > xl create uboot.cfg -c
> > > > 
> > > > Then I have seen almost nothing on the screen.
> > > > Digging into vpl011 implementation, I found that all the characters
> > > > written DR register will be directed to a "backend domain" if a guest
> > > > vm is launched by xl command.
> > > > (In case of dom0less, the backend seems to be Xen itself.)
> > > > 
> > > > As a silly experiment, I modified domain_vpl011_init() to always create
> > > > a vpl011 device with "backend_in_domain == false".
> > > > Then, I could see more boot messages from U-Boot, but still fails
> > > > to use the device as a console, I mean, we will lose all the outputs
> > > > after at some point and won't be able to type any keys (at a command 
> > > > prompt).
> > > > (This will be another problem on U-Boot side.)
> > > > 
> > > > My first question here is how we can configure and connect a console
> > > > in this case?
> > > > Should "xl create -c" or "xl console -t vuart" simply work?
> > > 
> > > "xl create -c" creates a guest and connect to the primary console which
> > > is the PV console (i.e. HVM_PARAM_CONSOLE_PFN.)
> > 
> > So in case of vuart, it (console) doesn't work?
> > (Apparently, "xl create" doesn't take '-t' option.)
> > 
> > > To connect to the emulated sbsa uart you need to pass -t vuart. So yes,
> > > "xl console -t vuart domain_name" should get you access to the emulated
> > > sbsa uart. The sbsa uart can also be exposed to dom0less guests; you get
> > > their output by using CTRL-AAA to switch to right domU console.
> > > 
> > > You can add printks to xen/arch/arm/vpl011.c in Xen to see what's
> > > happening on the Xen side. vpl011.c is the emulator.
> > 
> > I'm sure that write to "REG_DR" register is caught by Xen.
> > What I don't understand is
> > if back_in_domain -> no outputs
> > if !back_in_domain -> can see outputs
> > 
> > (As you know, if a guest is created by xl command, back_in_domain
> > is forcedly set to true.)
> > 
> > I looked into xenstore and found that "vuart/0/tty" does not exist,
> > but "console/tty" does.
> > How can this happen for vuart?
> > (I clearly specified, vuart = "sbsa_uart" in Xen config.)
> 
> It looks like we have a bug :-(
> 
> I managed to reproduce the issue. The problem is a 

[qemu-mainline test] 156287: regressions - FAIL

2020-10-29 Thread osstest service owner
flight 156287 qemu-mainline real [real]
flight 156312 qemu-mainline real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156287/
http://logs.test-lab.xenproject.org/osstest/logs/156312/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-libvirt-xsm 14 guest-start  fail REGR. vs. 152631
 test-armhf-armhf-xl-vhd  12 debian-di-installfail REGR. vs. 152631
 test-armhf-armhf-libvirt 14 guest-start  fail REGR. vs. 152631
 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631
 test-armhf-armhf-libvirt-raw 12 debian-di-installfail REGR. vs. 152631
 test-amd64-amd64-xl-qcow2   21 guest-start/debian.repeat fail REGR. vs. 152631
 test-amd64-i386-xl-qemuu-ws16-amd64 12 windows-install   fail REGR. vs. 152631

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152631
 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail  like 152631
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152631
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass

version targeted for testing:
 qemuubbc48d2bcb9711614fbe751c2c5ae13e172fbca8
baseline version:
 qemuu1d806cef0e38b5db8347a8e12f214d543204a314

Last test of basis   152631  2020-08-20 09:07:46 Z   70 days
Failing since152659  2020-08-21 14:07:39 Z   69 days  159 attempts
Testing same since   156287  2020-10-29 05:06:49 Z0 days1 attempts


People who touched revisions under test:
Aaron Lindsay 
  Alberto Garcia 
  Aleksandar Markovic 
  Alex Bennée 
  Alex Williamson 
  Alexander Bulekov 
  Alexey Kirillov 
  Alistair Francis 
  Alistair Francis 
  Amey Narkhede 
  Ana Pazos 
  Andrew Jones 
  Andrey Konovalov 
  Andrey Shinkevich 
  Ani Sinha 
  Anthony PERARD 
  Anton Blanchard 
  Anup Patel 
  Babu Moger 
  BALATON Zoltan 
  Bastian Koppelmann 
  Ben Widawsky 
  Bihong Yu 
  Bin Meng 
  Bruce Rogers 
  Carlo 

Re: [PATCH] tools/python: pass more -rpath-link options to ld

2020-10-29 Thread Marek Marczykowski
On Mon, Oct 19, 2020 at 10:31:37AM +0200, Jan Beulich wrote:
> With the split of libraries, I've observed a number of warnings from
> (old?) ld.
> 
> Signed-off-by: Jan Beulich 
> ---
> It's unclear to me whether this is ld version dependent - the pattern
> of where I've seen such warnings doesn't suggest a clear version
> dependency.
> 
> --- a/tools/python/setup.py
> +++ b/tools/python/setup.py
> @@ -7,10 +7,15 @@ XEN_ROOT = "../.."
>  extra_compile_args  = [ "-fno-strict-aliasing", "-Werror" ]
>  
>  PATH_XEN  = XEN_ROOT + "/tools/include"
> +PATH_LIBXENTOOLCORE = XEN_ROOT + "/tools/libs/toolcore"
>  PATH_LIBXENTOOLLOG = XEN_ROOT + "/tools/libs/toollog"
> +PATH_LIBXENCALL = XEN_ROOT + "/tools/libs/call"
>  PATH_LIBXENEVTCHN = XEN_ROOT + "/tools/libs/evtchn"
> +PATH_LIBXENGNTTAB = XEN_ROOT + "/tools/libs/gnttab"
>  PATH_LIBXENCTRL = XEN_ROOT + "/tools/libs/ctrl"
>  PATH_LIBXENGUEST = XEN_ROOT + "/tools/libs/guest"
> +PATH_LIBXENDEVICEMODEL = XEN_ROOT + "/tools/libs/devicemodel"
> +PATH_LIBXENFOREIGNMEMORY = XEN_ROOT + "/tools/libs/foreignmemory"
>  PATH_XENSTORE = XEN_ROOT + "/tools/libs/store"
>  
>  xc = Extension("xc",
> @@ -24,7 +29,13 @@ xc = Extension("xc",
> library_dirs   = [ PATH_LIBXENCTRL, PATH_LIBXENGUEST ],
> libraries  = [ "xenctrl", "xenguest" ],
> depends= [ PATH_LIBXENCTRL + "/libxenctrl.so", 
> PATH_LIBXENGUEST + "/libxenguest.so" ],
> -   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG 
> ],
> +   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENCALL,
> +  
> "-Wl,-rpath-link="+PATH_LIBXENDEVICEMODEL,
> +  "-Wl,-rpath-link="+PATH_LIBXENEVTCHN,
> +  
> "-Wl,-rpath-link="+PATH_LIBXENFOREIGNMEMORY,
> +  "-Wl,-rpath-link="+PATH_LIBXENGNTTAB,
> +  "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE,
> +  "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG 
> ],

This basically open-codes SHLIB_libxenctrl + SHLIB_libxenguest. Isn't it
better to pass it from make that already has all the dependencies
resolved?

> sources= [ "xen/lowlevel/xc/xc.c" ])
>  
>  xs = Extension("xs",
> @@ -33,6 +44,7 @@ xs = Extension("xs",
> library_dirs   = [ PATH_XENSTORE ],
> libraries  = [ "xenstore" ],
> depends= [ PATH_XENSTORE + "/libxenstore.so" ],
> +   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE 
> ],
> sources= [ "xen/lowlevel/xs/xs.c" ])
>  
>  plat = os.uname()[0]

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature


[xen-unstable-smoke test] 156310: tolerable all pass - PUSHED

2020-10-29 Thread osstest service owner
flight 156310 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156310/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  6e2ee3dfd660d9fde96243da7d565244b4d2f164
baseline version:
 xen  26a8fa494f2f323622b6928bd15921b41818f180

Last test of basis   156304  2020-10-29 20:00:26 Z0 days
Testing same since   156310  2020-10-30 00:01:28 Z0 days1 attempts


People who touched revisions under test:
  Bertrand Marquis 
  Stefano Stabellini 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   26a8fa494f..6e2ee3dfd6  6e2ee3dfd660d9fde96243da7d565244b4d2f164 -> smoke



BUG: libxl vuart build order

2020-10-29 Thread Stefano Stabellini
+ xen-devel and libxl maintainers

In short, there is a regression in libxl with the ARM vuart introduced
by moving ARM guests to the PVH build.


On Thu, 29 Oct 2020, Takahiro Akashi wrote:
> On Wed, Oct 28, 2020 at 02:44:16PM -0700, Stefano Stabellini wrote:
> > On Wed, 28 Oct 2020, Takahiro Akashi wrote:
> > > On Tue, Oct 27, 2020 at 09:02:14AM +0900, Takahiro Akashi wrote:
> > > > On Mon, Oct 26, 2020 at 04:37:30PM -0700, Stefano Stabellini wrote:
> > > > > 
> > > > > On Mon, 26 Oct 2020, Takahiro Akashi wrote:
> > > > > > Stefano,
> > > > > > 
> > > > > > # I'm afraid that I have already bothered you with a lot of 
> > > > > > questions.
> > > > > > 
> > > > > > When I looked at Xen's vpl011 implementation, I found
> > > > > > CR (and LCHR) register is not supported. (trap may cause a data 
> > > > > > abort).
> > > > > > On the other hand, for example, linux's pl011 driver surely
> > > > > > accesses CR (and LCHR) register.
> > > > > > So I guess that linux won't be able to use pl011 on a Xen guest vm
> > > > > > if vuart = "sbsa_uart".
> > > > > > 
> > > > > > Is this a known issue or do I miss anything?
> > > > > 
> > > > > Linux should definitely be able to use it, and in fact, I am using it
> > > > > with Linux in my test environment.
> > > > > 
> > > > > I think the confusion comes from the name "vpl011": it is in fact not 
> > > > > a
> > > > > full PL011 UART, but an SBSA UART.
> > > > 
> > > > Yeah, I have noticed it.
> > > > 
> > > > > SBSA UART only implements a subset of
> > > > > the PL011 registers. The compatible string is "arm,sbsa-uart", also 
> > > > > see
> > > > > drivers/tty/serial/amba-pl011.c:sbsa_uart_probe.
> > > > 
> > > > Looking closely into the details of implementation, I found
> > > > that all the accesses to unimplemented registers, including
> > > > CR, are deliberately avoided in sbsa part of linux driver.
> > > 
> > > So I'm now trying to implement "sbsa-uart" driver on U-Boot
> > > by modifying the existing pl011 driver.
> > > (Please note the current xen'ized U-Boot utilises a para-virtualized
> > > console, i.e. with HVM_PARAM_CONSOLE_PFN.)
> > > 
> > > So far my all attempts have failed.
> > > 
> > > There are a couple of problems, and one of them is how we can
> > > access vpl011 port (from dom0).
> > > What I did is:
> > > - modify U-Boot's pl011 driver
> > >   (I'm sure that the driver correctly handle a vpl011 device
> > >   with regard of accessing a proper set of registers.)
> > > - start U-Boot guest with "vuart=sbsa_uart" by
> > > xl create uboot.cfg -c
> > > 
> > > Then I have seen almost nothing on the screen.
> > > Digging into vpl011 implementation, I found that all the characters
> > > written DR register will be directed to a "backend domain" if a guest
> > > vm is launched by xl command.
> > > (In case of dom0less, the backend seems to be Xen itself.)
> > > 
> > > As a silly experiment, I modified domain_vpl011_init() to always create
> > > a vpl011 device with "backend_in_domain == false".
> > > Then, I could see more boot messages from U-Boot, but still fails
> > > to use the device as a console, I mean, we will lose all the outputs
> > > after at some point and won't be able to type any keys (at a command 
> > > prompt).
> > > (This will be another problem on U-Boot side.)
> > > 
> > > My first question here is how we can configure and connect a console
> > > in this case?
> > > Should "xl create -c" or "xl console -t vuart" simply work?
> > 
> > "xl create -c" creates a guest and connect to the primary console which
> > is the PV console (i.e. HVM_PARAM_CONSOLE_PFN.)
> 
> So in case of vuart, it (console) doesn't work?
> (Apparently, "xl create" doesn't take '-t' option.)
> 
> > To connect to the emulated sbsa uart you need to pass -t vuart. So yes,
> > "xl console -t vuart domain_name" should get you access to the emulated
> > sbsa uart. The sbsa uart can also be exposed to dom0less guests; you get
> > their output by using CTRL-AAA to switch to right domU console.
> > 
> > You can add printks to xen/arch/arm/vpl011.c in Xen to see what's
> > happening on the Xen side. vpl011.c is the emulator.
> 
> I'm sure that write to "REG_DR" register is caught by Xen.
> What I don't understand is
> if back_in_domain -> no outputs
> if !back_in_domain -> can see outputs
> 
> (As you know, if a guest is created by xl command, back_in_domain
> is forcedly set to true.)
> 
> I looked into xenstore and found that "vuart/0/tty" does not exist,
> but "console/tty" does.
> How can this happen for vuart?
> (I clearly specified, vuart = "sbsa_uart" in Xen config.)

It looks like we have a bug :-(

I managed to reproduce the issue. The problem is a race in libxl.

tools/libxc/xc_dom_arm.c:alloc_magic_pages is called first, setting
dom->vuart_gfn.  Then, libxl__build_hvm should be setting
state->vuart_gfn to dom->vuart_gfn (like libxl__build_pv does) but it
doesn't.


---

libxl: set vuart_gfn in libxl__build_hvm

Setting vuart_gfn was missed when switching ARM 

[libvirt test] 156290: regressions - FAIL

2020-10-29 Thread osstest service owner
flight 156290 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156290/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 151777
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 151777

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  1f807631f402210d036ec4803e7adfefa222f786
baseline version:
 libvirt  2c846fa6bcc11929c9fb857a22430fb9945654ad

Last test of basis   151777  2020-07-10 04:19:19 Z  111 days
Failing since151818  2020-07-11 04:18:52 Z  110 days  105 attempts
Testing same since   156273  2020-10-28 04:19:15 Z1 days2 attempts


People who touched revisions under test:
  Adolfo Jayme Barrientos 
  Andika Triwidada 
  Andrea Bolognani 
  Balázs Meskó 
  Bastien Orivel 
  Bihong Yu 
  Binfeng Wu 
  Boris Fiuczynski 
  Christian Ehrhardt 
  Christian Schoenebeck 
  Cole Robinson 
  Collin Walling 
  Cornelia Huck 
  Côme Borsoi 
  Daniel Henrique Barboza 
  Daniel Letai 
  Daniel P. Berrange 
  Daniel P. Berrangé 
  Erik Skultety 
  Fabian Freyer 
  Fangge Jin 
  Fedora Weblate Translation 
  Halil Pasic 
  Han Han 
  Hao Wang 
  Ian Wienand 
  Jamie Strandboge 
  Jamie Strandboge 
  Jean-Baptiste Holcroft 
  Jianan Gao 
  Jim Fehlig 
  Jin Yan 
  Jiri Denemark 
  Jonathon Jongsma 
  Ján Tomko 
  Kashyap Chamarthy 
  Kevin Locke 
  Laine Stump 
  Liao Pingfang 
  Lin Ma 
  Lin Ma 
  Marc Hartmayer 
  Marek Marczykowski-Górecki 
  Markus Schade 
  Martin Kletzander 
  Masayoshi Mizuma 
  Matt Coleman 
  Matt Coleman 
  Mauro Matteo Cascella 
  Michal Privoznik 
  Michał Smyk 
  Milo Casagrande 
  Neal Gompa 
  Nico Pache 
  Nikolay Shirokovskiy 
  Olesya Gerasimenko 
  Patrick Magauran 
  Paulo de Rezende Pinatti 
  Pavel Hrdina 
  Peter Krempa 
  Pino Toscano 
  Pino Toscano 
  Piotr Drąg 
  Prathamesh Chavan 
  Roman Bogorodskiy 
  Roman Bolshakov 
  Ryan Schmidt 
  Sam Hartman 
  Scott Shambarger 
  Sebastian Mitterle 
  Simon Gaiser 
  Stefan Bader 
  Stefan Berger 
  Szymon Scholz 
  Thomas Huth 
  Tim Wiederhake 
  Tomáš Golembiovský 
  Wang Xin 
  Weblate 
  Yang Hang 
  Yanqiu Zhang 
  Yi Li 
  Yi Wang 
  Yuri Chornoivan 
  Zheng Chuan 
  zhenwei pi 
  Zhenyu Zheng 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-arm64-arm64-libvirt-xsm 

[seabios test] 156285: tolerable FAIL - PUSHED

2020-10-29 Thread osstest service owner
flight 156285 seabios real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156285/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 155839
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 155839
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 155839
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 155839
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 155839
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass

version targeted for testing:
 seabios  94f0510dc75e910400aad6c169048d672c8c7193
baseline version:
 seabios  58a44be024f69d2e4d2b58553529230abdd3935e

Last test of basis   155839  2020-10-15 09:39:29 Z   14 days
Testing same since   156285  2020-10-28 19:42:01 Z1 days1 attempts


People who touched revisions under test:
  Alexander Graf 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm pass
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm  pass
 test-amd64-amd64-qemuu-nested-amdfail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-amd64-qemuu-freebsd11-amd64   pass
 test-amd64-amd64-qemuu-freebsd12-amd64   pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-ws16-amd64 fail
 test-amd64-i386-xl-qemuu-ws16-amd64  fail
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrictpass
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict pass
 test-amd64-amd64-qemuu-nested-intel  pass
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/seabios.git
   58a44be..94f0510  94f0510dc75e910400aad6c169048d672c8c7193 -> 
xen-tested-master



Re: [PATCH v2 3/3] xen/arm: Warn user on cpu errata 832075

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Bertrand Marquis wrote:
> Hi Julien,
> 
> > On 28 Oct 2020, at 18:39, Julien Grall  wrote:
> > 
> > Hi Bertrand,
> > 
> > On 26/10/2020 16:21, Bertrand Marquis wrote:
> >> When a Cortex A57 processor is affected by CPU errata 832075, a guest
> >> not implementing the workaround for it could deadlock the system.
> >> Add a warning during boot informing the user that only trusted guests
> >> should be executed on the system.
> >> An equivalent warning is already given to the user by KVM on cores
> >> affected by this errata.
> >> Also taint the hypervisor as unsecure when this errata applies and
> >> mention Cortex A57 r0p0 - r1p2 as not security supported in SUPPORT.md
> >> Signed-off-by: Bertrand Marquis 
> > 
> > Reviewed-by: Julien Grall 
> 
> Thanks
> 
> > 
> > If you don't need to resend the series, then I would be happy to fix the 
> > typo pointed out by George on commit.
> 
> There is only the condensing from Stefano.
> If you can handle that on commit to great but if you need me to send a v3 to 
> make your life easier do not hesitate to tell me

I have just done the committing



[xen-unstable-smoke test] 156304: tolerable all pass - PUSHED

2020-10-29 Thread osstest service owner
flight 156304 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156304/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  26a8fa494f2f323622b6928bd15921b41818f180
baseline version:
 xen  1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74

Last test of basis   156297  2020-10-29 14:01:23 Z0 days
Testing same since   156304  2020-10-29 20:00:26 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   1fd1d4bafd..26a8fa494f  26a8fa494f2f323622b6928bd15921b41818f180 -> smoke



[xen-4.12-testing test] 156280: regressions - FAIL

2020-10-29 Thread osstest service owner
flight 156280 xen-4.12-testing real [real]
flight 156308 xen-4.12-testing real-retest [real]
http://logs.test-lab.xenproject.org/osstest/logs/156280/
http://logs.test-lab.xenproject.org/osstest/logs/156308/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qcow218 guest-saverestore.2  fail REGR. vs. 156035

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156035
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 156035
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156035
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156035
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156035
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156035
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156035
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156035
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 156035
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156035
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156035
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  4100d463dbdd95d85fabe387dd5676bed75f65f7
baseline version:
 xen  0108b011e133915a8ebd33636811d8c141b6e9f3

Last test of basis   156035  2020-10-20 13:36:02 Z9 days
Testing same since   156263  2020-10-27 18:36:53 Z2 days2 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 

RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread David Laight
From: Arvind Sankar
> Sent: 29 October 2020 21:35
> 
> On Thu, Oct 29, 2020 at 09:41:13PM +0100, Thomas Gleixner wrote:
> > On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote:
> > > On 29/10/20 17:56, Arvind Sankar wrote:
> > >>> For those two just add:
> > >>> struct apic *apic = x86_system_apic;
> > >>> before all the assignments.
> > >>> Less churn and much better code.
> > >>>
> > >> Why would it be better code?
> > >>
> > >
> > > I think he means the compiler produces better code, because it won't
> > > read the global variable repeatedly.  Not sure if that's true,(*) but I
> > > think I do prefer that version if Arnd wants to do that tweak.
> >
> > It's not true.
> >
> >  foo *p = bar;
> >
> >  p->a = 1;
> >  p->b = 2;
> >
> > The compiler is free to reload bar after accessing p->a and with
> >
> > bar->a = 1;
> > bar->b = 1;
> >
> > it can either cache bar in a register or reread it after bar->a
> >
> > The generated code is the same as long as there is no reason to reload,
> > e.g. register pressure.
> >
> > Thanks,
> >
> > tglx
> 
> It's not quite the same.
> 
> https://godbolt.org/z/4dzPbM
> 
> With -fno-strict-aliasing, the compiler reloads the pointer if you write
> to the start of what it points to, but not if you write to later
> elements.

I guess it assumes that global data doesn't overlap.

But in general they are sort of opposites:

With the local variable it can reload if it knows the write
cannot have affected the global - but is unlikely to do so.

Using the global it must reload if it is possible the write
might have affected the global.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Oleksandr Tyshchenko
Hi Alex.

[sorry for the possible format issues]


> I assume this is wiring up the required bits of support to handle the
> IOREQ requests in QEMU?


No, it is not related to QEMU. We don't run QEMU in our system. Being
honest I have never tried to run it)
virtio-disk backend PoC is a completely standalone entity (IOREQ server)
which emulates a virtio-mmio disk device.
It is based on code from DEMU (for IOREQ server purposes) and some code
from kvmtool to implement virtio protocol,
disk operations over underlying H/W and Xenbus code to be able to read
configuration from the Xenstore
(it is configured via domain config file). Last patch in this series
(marked as RFC) actually adds required bits to the libxl code.


> We are putting together a PoC demo to show
> a virtio enabled image (AGL) running on both KVM and Xen hypervisors so
> we are keen to see your code as soon as you can share it.


Thank you. Sure, I will provide a branch with code by the end of this week.



>
> I'm currently preparing a patch series for QEMU which fixes the recent
> breakage caused by changes to the build system. As part of that I've
> separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build
> i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to
> create a qemu-aarch64-system binary with just the PV related models in
> it.
>
Does it mean that it will be possible to use QEMU in Xen on Arm just for
"backend provider" purposes?


>
> Talking to Stefano it probably makes sense going forward to introduce a
> new IOREQ aware machine type for QEMU that doesn't bring in the rest of
> the x86 overhead. I was thinking maybe xenvirt?
>
> You've tested with virtio-block but we'd also like to extend this to
> other arbitrary virtio devices. I guess we will need some sort of
> mechanism to inform the QEMU command line where each device sits in the
> virtio-mmio bus so the FDT Xen delivers to the guest matches up with
> what QEMU is ready to serve requests for?
>

I am sorry, I can't provide ideas here, not familiar with QEMU.
But, completely agree, that other virtio devices should be supported.

-- 
Regards,

Oleksandr Tyshchenko


[PATCH 09/36] qdev: Make qdev_get_prop_ptr() get Object* arg

2020-10-29 Thread Eduardo Habkost
Make the code more generic and not specific to TYPE_DEVICE.

Signed-off-by: Eduardo Habkost 
---
Cc: Stefan Berger 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Paolo Bonzini 
Cc: "Daniel P. Berrangé" 
Cc: Eduardo Habkost 
Cc: Richard Henderson 
Cc: David Hildenbrand 
Cc: Cornelia Huck 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: Thomas Huth 
Cc: Matthew Rosato 
Cc: Alex Williamson 
Cc: qemu-de...@nongnu.org
Cc: xen-devel@lists.xenproject.org
Cc: qemu-bl...@nongnu.org
Cc: qemu-s3...@nongnu.org
---
 include/hw/qdev-properties.h |  2 +-
 backends/tpm/tpm_util.c  |  8 ++--
 hw/block/xen-block.c |  6 +--
 hw/core/qdev-properties-system.c | 57 +-
 hw/core/qdev-properties.c| 82 +---
 hw/s390x/css.c   |  5 +-
 hw/s390x/s390-pci-bus.c  |  4 +-
 hw/vfio/pci-quirks.c |  5 +-
 8 files changed, 68 insertions(+), 101 deletions(-)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 0ea822e6a7..0b92cfc761 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -302,7 +302,7 @@ void qdev_prop_set_macaddr(DeviceState *dev, const char 
*name,
const uint8_t *value);
 void qdev_prop_set_enum(DeviceState *dev, const char *name, int value);
 
-void *qdev_get_prop_ptr(DeviceState *dev, Property *prop);
+void *qdev_get_prop_ptr(Object *obj, Property *prop);
 
 void qdev_prop_register_global(GlobalProperty *prop);
 const GlobalProperty *qdev_find_global_prop(DeviceState *dev,
diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c
index b58d298c1a..e91c21dd4a 100644
--- a/backends/tpm/tpm_util.c
+++ b/backends/tpm/tpm_util.c
@@ -35,8 +35,7 @@
 static void get_tpm(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
-TPMBackend **be = qdev_get_prop_ptr(dev, opaque);
+TPMBackend **be = qdev_get_prop_ptr(obj, opaque);
 char *p;
 
 p = g_strdup(*be ? (*be)->id : "");
@@ -49,7 +48,7 @@ static void set_tpm(Object *obj, Visitor *v, const char 
*name, void *opaque,
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-TPMBackend *s, **be = qdev_get_prop_ptr(dev, prop);
+TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop);
 char *str;
 
 if (dev->realized) {
@@ -73,9 +72,8 @@ static void set_tpm(Object *obj, Visitor *v, const char 
*name, void *opaque,
 
 static void release_tpm(Object *obj, const char *name, void *opaque)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-TPMBackend **be = qdev_get_prop_ptr(dev, prop);
+TPMBackend **be = qdev_get_prop_ptr(obj, prop);
 
 if (*be) {
 tpm_backend_reset(*be);
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 8a7a3f5452..1ba9981c08 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -335,9 +335,8 @@ static char *disk_to_vbd_name(unsigned int disk)
 static void xen_block_get_vdev(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-XenBlockVdev *vdev = qdev_get_prop_ptr(dev, prop);
+XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop);
 char *str;
 
 switch (vdev->type) {
@@ -396,9 +395,8 @@ static int vbd_name_to_disk(const char *name, const char 
**endp,
 static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-XenBlockVdev *vdev = qdev_get_prop_ptr(dev, prop);
+XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop);
 char *str, *p;
 const char *end;
 
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index d0fb063a49..c8c73c371b 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -59,9 +59,8 @@ static bool check_prop_still_unset(DeviceState *dev, const 
char *name,
 static void get_drive(Object *obj, Visitor *v, const char *name, void *opaque,
   Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-void **ptr = qdev_get_prop_ptr(dev, prop);
+void **ptr = qdev_get_prop_ptr(obj, prop);
 const char *value;
 char *p;
 
@@ -87,7 +86,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-void **ptr = qdev_get_prop_ptr(dev, prop);
+void **ptr = qdev_get_prop_ptr(obj, prop);
 char *str;
 BlockBackend *blk;
 bool blk_created = false;
@@ -185,7 +184,7 @@ static void release_drive(Object *obj, const char *name, 
void *opaque)
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-BlockBackend **ptr = 

[PATCH 14/36] qdev: Move dev->realized check to qdev_property_set()

2020-10-29 Thread Eduardo Habkost
Every single qdev property setter function manually checks
dev->realized.  We can just check dev->realized inside
qdev_property_set() instead.

The check is being added as a separate function
(qdev_prop_allow_set()) because it will become a callback later.

Signed-off-by: Eduardo Habkost 
---
Cc: Stefan Berger 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Paolo Bonzini 
Cc: "Daniel P. Berrangé" 
Cc: Eduardo Habkost 
Cc: Cornelia Huck 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: Thomas Huth 
Cc: Richard Henderson 
Cc: David Hildenbrand 
Cc: Matthew Rosato 
Cc: Alex Williamson 
Cc: Mark Cave-Ayland 
Cc: Artyom Tarasenko 
Cc: qemu-de...@nongnu.org
Cc: xen-devel@lists.xenproject.org
Cc: qemu-bl...@nongnu.org
Cc: qemu-s3...@nongnu.org
---
 backends/tpm/tpm_util.c  |   6 --
 hw/block/xen-block.c |   5 --
 hw/core/qdev-properties-system.c |  64 ---
 hw/core/qdev-properties.c| 106 ++-
 hw/s390x/css.c   |   6 --
 hw/s390x/s390-pci-bus.c  |   6 --
 hw/vfio/pci-quirks.c |   6 --
 target/sparc/cpu.c   |   6 --
 8 files changed, 18 insertions(+), 187 deletions(-)

diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c
index e91c21dd4a..042cacfcca 100644
--- a/backends/tpm/tpm_util.c
+++ b/backends/tpm/tpm_util.c
@@ -46,16 +46,10 @@ static void get_tpm(Object *obj, Visitor *v, const char 
*name, void *opaque,
 static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
 TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop);
 char *str;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index 1ba9981c08..bd1aef63a7 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -400,11 +400,6 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, 
const char *name,
 char *str, *p;
 const char *end;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index fca1b694ca..60a45f5620 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -92,11 +92,6 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
 bool blk_created = false;
 int ret;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
@@ -228,17 +223,11 @@ static void get_chr(Object *obj, Visitor *v, const char 
*name, void *opaque,
 static void set_chr(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
 CharBackend *be = qdev_get_prop_ptr(obj, prop);
 Chardev *s;
 char *str;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
@@ -309,18 +298,12 @@ static void get_mac(Object *obj, Visitor *v, const char 
*name, void *opaque,
 static void set_mac(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
 MACAddr *mac = qdev_get_prop_ptr(obj, prop);
 int i, pos;
 char *str;
 const char *p;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
@@ -388,7 +371,6 @@ static void get_netdev(Object *obj, Visitor *v, const char 
*name,
 static void set_netdev(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
 NICPeers *peers_ptr = qdev_get_prop_ptr(obj, prop);
 NetClientState **ncs = peers_ptr->ncs;
@@ -396,11 +378,6 @@ static void set_netdev(Object *obj, Visitor *v, const char 
*name,
 int queues, err = 0, i = 0;
 char *str;
 
-if (dev->realized) {
-qdev_prop_set_after_realize(dev, name, errp);
-return;
-}
-
 if (!visit_type_str(v, name, , errp)) {
 return;
 }
@@ -467,18 +444,12 @@ static void get_audiodev(Object *obj, Visitor *v, const 
char* name,
 static void set_audiodev(Object *obj, Visitor *v, const char* name,
  void *opaque, Error **errp)
 {
-DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;

[PATCH 25/36] qdev: Rename qdev_get_prop_ptr() to object_static_prop_ptr()

2020-10-29 Thread Eduardo Habkost
The function will be moved to common QOM code, as it is not
specific to TYPE_DEVICE anymore.

Signed-off-by: Eduardo Habkost 
---
Cc: Stefan Berger 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: Paul Durrant 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: Paolo Bonzini 
Cc: "Daniel P. Berrangé" 
Cc: Eduardo Habkost 
Cc: Cornelia Huck 
Cc: Halil Pasic 
Cc: Christian Borntraeger 
Cc: Richard Henderson 
Cc: David Hildenbrand 
Cc: Thomas Huth 
Cc: Matthew Rosato 
Cc: Alex Williamson 
Cc: qemu-de...@nongnu.org
Cc: xen-devel@lists.xenproject.org
Cc: qemu-bl...@nongnu.org
Cc: qemu-s3...@nongnu.org
---
 include/hw/qdev-properties.h |  2 +-
 backends/tpm/tpm_util.c  |  6 +--
 hw/block/xen-block.c |  4 +-
 hw/core/qdev-properties-system.c | 46 +++
 hw/core/qdev-properties.c| 64 
 hw/s390x/css.c   |  4 +-
 hw/s390x/s390-pci-bus.c  |  4 +-
 hw/vfio/pci-quirks.c |  4 +-
 8 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 0acc92ae2b..4146dac281 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -332,7 +332,7 @@ void qdev_prop_set_macaddr(DeviceState *dev, const char 
*name,
const uint8_t *value);
 void qdev_prop_set_enum(DeviceState *dev, const char *name, int value);
 
-void *qdev_get_prop_ptr(Object *obj, Property *prop);
+void *object_static_prop_ptr(Object *obj, Property *prop);
 
 void qdev_prop_register_global(GlobalProperty *prop);
 const GlobalProperty *qdev_find_global_prop(Object *obj,
diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c
index 042cacfcca..2b5f788861 100644
--- a/backends/tpm/tpm_util.c
+++ b/backends/tpm/tpm_util.c
@@ -35,7 +35,7 @@
 static void get_tpm(Object *obj, Visitor *v, const char *name, void *opaque,
 Error **errp)
 {
-TPMBackend **be = qdev_get_prop_ptr(obj, opaque);
+TPMBackend **be = object_static_prop_ptr(obj, opaque);
 char *p;
 
 p = g_strdup(*be ? (*be)->id : "");
@@ -47,7 +47,7 @@ static void set_tpm(Object *obj, Visitor *v, const char 
*name, void *opaque,
 Error **errp)
 {
 Property *prop = opaque;
-TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop);
+TPMBackend *s, **be = object_static_prop_ptr(obj, prop);
 char *str;
 
 if (!visit_type_str(v, name, , errp)) {
@@ -67,7 +67,7 @@ static void set_tpm(Object *obj, Visitor *v, const char 
*name, void *opaque,
 static void release_tpm(Object *obj, const char *name, void *opaque)
 {
 Property *prop = opaque;
-TPMBackend **be = qdev_get_prop_ptr(obj, prop);
+TPMBackend **be = object_static_prop_ptr(obj, prop);
 
 if (*be) {
 tpm_backend_reset(*be);
diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c
index bd1aef63a7..20985c465a 100644
--- a/hw/block/xen-block.c
+++ b/hw/block/xen-block.c
@@ -336,7 +336,7 @@ static void xen_block_get_vdev(Object *obj, Visitor *v, 
const char *name,
void *opaque, Error **errp)
 {
 Property *prop = opaque;
-XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop);
+XenBlockVdev *vdev = object_static_prop_ptr(obj, prop);
 char *str;
 
 switch (vdev->type) {
@@ -396,7 +396,7 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, 
const char *name,
void *opaque, Error **errp)
 {
 Property *prop = opaque;
-XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop);
+XenBlockVdev *vdev = object_static_prop_ptr(obj, prop);
 char *str, *p;
 const char *end;
 
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index d9355053d2..448d77ecab 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -60,7 +60,7 @@ static void get_drive(Object *obj, Visitor *v, const char 
*name, void *opaque,
   Error **errp)
 {
 Property *prop = opaque;
-void **ptr = qdev_get_prop_ptr(obj, prop);
+void **ptr = object_static_prop_ptr(obj, prop);
 const char *value;
 char *p;
 
@@ -86,7 +86,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-void **ptr = qdev_get_prop_ptr(obj, prop);
+void **ptr = object_static_prop_ptr(obj, prop);
 char *str;
 BlockBackend *blk;
 bool blk_created = false;
@@ -179,7 +179,7 @@ static void release_drive(Object *obj, const char *name, 
void *opaque)
 {
 DeviceState *dev = DEVICE(obj);
 Property *prop = opaque;
-BlockBackend **ptr = qdev_get_prop_ptr(obj, prop);
+BlockBackend **ptr = object_static_prop_ptr(obj, prop);
 
 if (*ptr) {
 AioContext *ctx = blk_get_aio_context(*ptr);
@@ -212,7 +212,7 @@ const PropertyInfo qdev_prop_drive_iothread = {
 static void get_chr(Object *obj, Visitor *v, const char *name, void 

Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Arvind Sankar
On Thu, Oct 29, 2020 at 09:41:13PM +0100, Thomas Gleixner wrote:
> On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote:
> > On 29/10/20 17:56, Arvind Sankar wrote:
> >>> For those two just add:
> >>>   struct apic *apic = x86_system_apic;
> >>> before all the assignments.
> >>> Less churn and much better code.
> >>>
> >> Why would it be better code?
> >> 
> >
> > I think he means the compiler produces better code, because it won't
> > read the global variable repeatedly.  Not sure if that's true,(*) but I
> > think I do prefer that version if Arnd wants to do that tweak.
> 
> It's not true.
> 
>  foo *p = bar;
> 
>  p->a = 1;
>  p->b = 2;
> 
> The compiler is free to reload bar after accessing p->a and with
> 
> bar->a = 1;
> bar->b = 1;
> 
> it can either cache bar in a register or reread it after bar->a
> 
> The generated code is the same as long as there is no reason to reload,
> e.g. register pressure.
> 
> Thanks,
> 
> tglx

It's not quite the same.

https://godbolt.org/z/4dzPbM

With -fno-strict-aliasing, the compiler reloads the pointer if you write
to the start of what it points to, but not if you write to later
elements.



Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote:
> Hi Stefano
> 
> [sorry for the possible format issue]
> 
> On Thu, Oct 29, 2020 at 9:53 PM Stefano Stabellini  
> wrote:
>   On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote:
>   > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu 
>  wrote:
>   >       Hi Oleksandr,
>   >
>   > Hi Masami
>   >
>   > [sorry for the possible format issue]
>   >  
>   >
>   >       I would like to try this on my arm64 board.
>   >
>   > Glad to hear you are interested in this topic. 
>   >  
>   >
>   >       According to your comments in the patch, I made this config 
> file.
>   >       # cat debian.conf
>   >       name = "debian"
>   >       type = "pvh"
>   >       vcpus = 8
>   >       memory = 512
>   >       kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
>   >       ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
>   >       cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
>   >       disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
>   >       vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
>   >       virtio = 1
>   >       vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
>   >
>   >       And tried to boot a DomU, but I got below error.
>   >
>   >       # xl create -c debian.conf
>   >       Parsing config from debian.conf
>   >       libxl: error: libxl_create.c:1863:domcreate_attach_devices: 
> Domain
>   >       1:unable to add virtio_disk devices
>   >       libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
>   >       1:xc_domain_pause failed
>   >       libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get 
> domain
>   >       type for domid=1
>   >       libxl: error: libxl_domain.c:1136:domain_destroy_callback: 
> Domain
>   >       1:Unable to destroy guest
>   >       libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
>   >       1:Destruction of domain failed
>   >
>   >
>   >       Could you tell me how can I test it?
>   >
>   >
>   > I assume it is due to the lack of the virtio-disk backend (which I 
> haven't shared yet as I focused on the IOREQ/DM support on
>   Arm in the
>   > first place).
>   > Could you wait a little bit, I am going to share it soon. 
> 
>   Do you have a quick-and-dirty hack you can share in the meantime? Even
>   just on github as a special branch? It would be very useful to be able
>   to have a test-driver for the new feature.
> 
> Well, I will provide a branch on github with our PoC virtio-disk backend by 
> the end of this week. It will be possible to test this series
> with it. 

Very good, thank you!

Re: Xen on RP4

2020-10-29 Thread Elliott Mitchell
On Wed, Oct 28, 2020 at 05:37:02PM -0700, Stefano Stabellini wrote:
> On Tue, 27 Oct 2020, Elliott Mitchell wrote:
> > On Mon, Oct 26, 2020 at 06:44:27PM +, Julien Grall wrote:
> > > On 26/10/2020 16:03, Elliott Mitchell wrote:
> > > > On Mon, Oct 26, 2020 at 01:31:42PM +, Julien Grall wrote:
> > > >> On 24/10/2020 06:35, Elliott Mitchell wrote:
> > > >>> ACPI has a distinct
> > > >>> means of specifying a limited DMA-width; the above fails, because it
> > > >>> assumes a *device-tree*.
> > > >>
> > > >> Do you know if it would be possible to infer from the ACPI static table
> > > >> the DMA-width?
> > > > 
> > > > Yes, and it is.  Due to not knowing much about ACPI tables I don't know
> > > > what the C code would look like though (problem is which documentation
> > > > should I be looking at first?).
> > > 
> > > What you provided below is an excerpt of the DSDT. AFAIK, DSDT content 
> > > is written in AML. So far the shortest implementation I have seen for 
> > > the AML parser is around 5000 lines (see [1]). It might be possible to 
> > > strip some the code, although I think this will still probably too big 
> > > for a single workaround.
> > > 
> > > What I meant by "static table" is a table that looks like a structure 
> > > and can be parsed in a few lines. If we can't find on contain the DMA 
> > > window, then the next best solution is to find a way to identity the 
> > > platform.
> > > 
> > > I don't know enough ACPI to know if this solution is possible. A good 
> > > starter would probably be the ACPI spec [2].
> > 
> > Okay, that is worse than I had thought (okay, ACPI is impressively
> > complex for something nominally firmware-level).
> >
> > There are strings in the present Tianocore implementation for Raspberry
> > PI 4B which could be targeted.  Notably included in the output during
> > boot listing the tables, "RPIFDN", "RPIFDN RPI" and "RPIFDN RPI4" (I'm
> > unsure how kosher these are as this wsn't implemented nor blessed by the
> > Raspberry PI Foundation).
> > 
> > I strongly dislike this approach as you soon turn the Xen project into a
> > database of hardware.  This is already occurring with
> > xen/arch/arm/platforms and I would love to do something about this.  On
> > that thought, how about utilizing Xen's command-line for this purpose?
> 
> I don't think that a command line option is a good idea: basically it is
> punting to users the task of platform detection. Also, it means that
> users will be necessarily forced to edit the uboot script or grub
> configuration file to boot.

-EINVAL

Many Linux installations (near universal on desktop/server, but may be
uncommon on ARM servers) Xen's command-line comes from grub.cfg.
grub.cfg is in turn created by a series of scripts with several places
for users to modify configuration without breaking things.

The scripts which create grub.cfg could add a "dma_mem=" option to Xen's
command-line based upon what the running kernel reports.  If the kernel
is running on top of Xen, it will still be able to retrieve this
information out of ACPI.

This does mean distributions would need to modify scripts, but that is
doable.  This also means a dumb user could potentially jump in, modify
the value and thus cause unrecoverable breakage.  Yet on the flip side
this also allows for the short-term stop-gap of smart users modifying
the option as appropriate for new hardware.

Certainly it may not be the greatest approach, but it isn't as bad as
you're claiming.


> Note that even if we introduced a new command line, we wouldn't take
> away the need for xen/arch/arm/platforms anyway.

Perhaps, but it could allow for this setting at least to be moved to
somewhere outside of Xen.

I'm inclined to agree with Juergen Gro??, this reads kind of like having
an extra domain run during Xen's initialization which can talk ACPI.


> > Have a procedure of during installation/updates retrieve DMA limitation
> > information from the running OS and the following boot Xen will follow
> > the appropriate setup.  By its nature, Domain 0 will have the information
> > needed, just becomes an issue of how hard that is to retrieve...
> 
> Historically that is what we used to do for many things related to ACPI,
> but unfortunately it leads to a pretty bad architecture where Xen
> depends on Dom0 for booting rather than the other way around. (Dom0
> should be the one requiring Xen for booting, given that Xen is higher
> privilege and boots first.)
> 
> 
> I think the best compromise is still to use an ACPI string to detect the
> platform. For instance, would it be possible to use the OEMID fields in
> RSDT, XSDT, FADT?  Possibly even a combination of them?
> 
> Another option might be to get the platform name from UEFI somehow. 

I included appropriate strings in e-mail.  Suitable strings do appear in
`dmesg`.

Problem is this feels like you're hard-coding a fixed list of platforms
Xen can run on.  Instead values like these should be provided by
firmware.  ACPI 

Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Oleksandr Tyshchenko
Hi Stefano

[sorry for the possible format issue]

On Thu, Oct 29, 2020 at 9:53 PM Stefano Stabellini 
wrote:

> On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote:
> > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu <
> masami.hirama...@linaro.org> wrote:
> >   Hi Oleksandr,
> >
> > Hi Masami
> >
> > [sorry for the possible format issue]
> >
> >
> >   I would like to try this on my arm64 board.
> >
> > Glad to hear you are interested in this topic.
> >
> >
> >   According to your comments in the patch, I made this config file.
> >   # cat debian.conf
> >   name = "debian"
> >   type = "pvh"
> >   vcpus = 8
> >   memory = 512
> >   kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
> >   ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
> >   cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
> >   disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
> >   vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
> >   virtio = 1
> >   vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
> >
> >   And tried to boot a DomU, but I got below error.
> >
> >   # xl create -c debian.conf
> >   Parsing config from debian.conf
> >   libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain
> >   1:unable to add virtio_disk devices
> >   libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
> >   1:xc_domain_pause failed
> >   libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get
> domain
> >   type for domid=1
> >   libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain
> >   1:Unable to destroy guest
> >   libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
> >   1:Destruction of domain failed
> >
> >
> >   Could you tell me how can I test it?
> >
> >
> > I assume it is due to the lack of the virtio-disk backend (which I
> haven't shared yet as I focused on the IOREQ/DM support on Arm in the
> > first place).
> > Could you wait a little bit, I am going to share it soon.
>
> Do you have a quick-and-dirty hack you can share in the meantime? Even
> just on github as a special branch? It would be very useful to be able
> to have a test-driver for the new feature.

Well, I will provide a branch on github with our PoC virtio-disk backend by
the end of this week. It will be possible to test this series with it.


-- 
Regards,

Oleksandr Tyshchenko


Re: Ping: [PATCH v3 0/3] x86: shim building adjustments (plus shadow follow-on)

2020-10-29 Thread Tim Deegan
At 14:40 +0100 on 29 Oct (1603982415), Jan Beulich wrote:
> Tim,
> 
> unless you tell me otherwise I'm intending to commit the latter
> two with Roger's acks some time next week. Of course it would
> still be nice to know your view on the first of the TBDs in
> patch 3 (which may result in further changes, in a follow-up).

Ack, sorry for the dropped patches, and thank you for the ping.  I've
now replied to everything that I think is wating for my review.

Cheers,

Tim.



Re: [PATCH] x86/shadow: correct GFN use by sh_unshadow_for_p2m_change()

2020-10-29 Thread Tim Deegan
At 16:42 +0100 on 28 Oct (1603903365), Jan Beulich wrote:
> Luckily sh_remove_all_mappings()'s use of the parameter is limited to
> generation of log messages. Nevertheless we'd better pass correct GFNs
> around:
> - the incoming GFN, when replacing a large page, may not be large page
>   aligned,
> - incrementing by page-size-scaled values can't be right.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Tim Deegan 

Thanks!

Tim.



Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook

2020-10-29 Thread Tim Deegan
At 10:24 +0100 on 28 Oct (1603880693), Jan Beulich wrote:
> Fair parts of the present handlers are identical; in fact
> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
> common parts right into write_p2m_entry(), splitting the hooks into a
> "pre" one (needed just by shadow code) and a "post" one.
> 
> For the common parts moved I think that the p2m_flush_nestedp2m() is,
> at least from an abstract perspective, also applicable in the shadow
> case. Hence it doesn't get a 3rd hook put in place.
> 
> The initial comment that was in hap_write_p2m_entry() gets dropped: Its
> placement was bogus, and looking back the the commit introducing it
> (dd6de3ab9985 "Implement Nested-on-Nested") I can't see either what use
> of a p2m it was meant to be associated with.
> 
> Signed-off-by: Jan Beulich 

This seems like a good approach to me.  I'm happy with the shadow
parts but am not confident enough on nested p2m any more to have an
opinion on that side. 

Acked-by: Tim Deegan 




Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Thomas Gleixner
On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote:
> On 29/10/20 17:56, Arvind Sankar wrote:
>>> For those two just add:
>>> struct apic *apic = x86_system_apic;
>>> before all the assignments.
>>> Less churn and much better code.
>>>
>> Why would it be better code?
>> 
>
> I think he means the compiler produces better code, because it won't
> read the global variable repeatedly.  Not sure if that's true,(*) but I
> think I do prefer that version if Arnd wants to do that tweak.

It's not true.

 foo *p = bar;

 p->a = 1;
 p->b = 2;

The compiler is free to reload bar after accessing p->a and with

bar->a = 1;
bar->b = 1;

it can either cache bar in a register or reread it after bar->a

The generated code is the same as long as there is no reason to reload,
e.g. register pressure.

Thanks,

tglx



Re: [PATCH 2/5] x86/p2m: collapse the two ->write_p2m_entry() hooks

2020-10-29 Thread Tim Deegan
At 10:22 +0100 on 28 Oct (1603880578), Jan Beulich wrote:
> The struct paging_mode instances get set to the same functions
> regardless of mode by both HAP and shadow code, hence there's no point
> having this hook there. The hook also doesn't need moving elsewhere - we
> can directly use struct p2m_domain's. This merely requires (from a
> strictly formal pov; in practice this may not even be needed) making
> sure we don't end up using safe_write_pte() for nested P2Ms.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH v3 3/3] x86/shadow: sh_{make,destroy}_monitor_table() are "even more" HVM-only

2020-10-29 Thread Tim Deegan
At 10:45 +0200 on 19 Oct (1603104300), Jan Beulich wrote:
> With them depending on just the number of shadow levels, there's no need
> for more than one instance of them, and hence no need for any hook (IOW
> 452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite
> far enough). Move the functions to hvm.c while dropping the dead
> is_pv_32bit_domain() code paths.
>
> While moving the code, replace a stale comment reference to
> sh_install_xen_entries_in_l4(). Doing so made me notice the function
> also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm:
> Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which
> gets done here as well.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

> TBD: In principle both functions could have their first parameter
>  constified. In fact, "destroy" doesn't depend on the vCPU at all
>  and hence could be passed a struct domain *. Not sure whether such
>  an asymmetry would be acceptable.
>  In principle "make" would also not need passing of the number of
>  shadow levels (can be derived from v), which would result in yet
>  another asymmetry.
>  If these asymmetries were acceptable, "make" could then also update
>  v->arch.hvm.monitor_table, instead of doing so at both call sites.

Feel free to add consts, but please don't change the function
parameters any more than that.  I would rather keep them as a matched
pair, and leave the hvm.monitor_table updates in the caller, where
it's easier to see why they're not symmetrical.

Cheers

Tim.



Re: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Bertrand Marquis wrote:
> > On 28 Oct 2020, at 19:12, Julien Grall  wrote:
> > On 26/10/2020 11:03, Rahul Singh wrote:
> >> Hello Julien,
> >>> On 23 Oct 2020, at 4:19 pm, Julien Grall  wrote:
> >>> On 23/10/2020 15:27, Rahul Singh wrote:
>  Hello Julien,
> > On 23 Oct 2020, at 2:00 pm, Julien Grall  wrote:
> > On 23/10/2020 12:35, Rahul Singh wrote:
> >> Hello,
> >>> On 23 Oct 2020, at 1:02 am, Stefano Stabellini 
> >>>  wrote:
> >>> 
> >>> On Thu, 22 Oct 2020, Julien Grall wrote:
> >> On 20/10/2020 16:25, Rahul Singh wrote:
> >>> Add support for ARM architected SMMUv3 implementations. It is 
> >>> based on
> >>> the Linux SMMUv3 driver.
> >>> Major differences between the Linux driver are as follows:
> >>> 1. Only Stage-2 translation is supported as compared to the Linux 
> >>> driver
> >>>that supports both Stage-1 and Stage-2 translations.
> >>> 2. Use P2M  page table instead of creating one as SMMUv3 has the
> >>>capability to share the page tables with the CPU.
> >>> 3. Tasklets is used in place of threaded IRQ's in Linux for event 
> >>> queue
> >>>and priority queue IRQ handling.
> >> 
> >> Tasklets are not a replacement for threaded IRQ. In particular, 
> >> they will
> >> have priority over anything else (IOW nothing will run on the pCPU 
> >> until
> >> they are done).
> >> 
> >> Do you know why Linux is using thread. Is it because of long 
> >> running
> >> operations?
> > 
> > Yes you are right because of long running operations Linux is using 
> > the
> > threaded IRQs.
> > 
> > SMMUv3 reports fault/events bases on memory-based circular buffer 
> > queues not
> > based on the register. As per my understanding, it is 
> > time-consuming to
> > process the memory based queues in interrupt context because of 
> > that Linux
> > is using threaded IRQ to process the faults/events from SMMU.
> > 
> > I didn’t find any other solution in XEN in place of tasklet to 
> > defer the
> > work, that’s why I used tasklet in XEN in replacement of threaded 
> > IRQs. If
> > we do all work in interrupt context we will make XEN less 
> > responsive.
>  
>  So we need to make sure that Xen continue to receives interrupts, 
>  but we also
>  need to make sure that a vCPU bound to the pCPU is also responsive.
>  
> > 
> > If you know another solution in XEN that will be used to defer the 
> > work in
> > the interrupt please let me know I will try to use that.
>  
>  One of my work colleague encountered a similar problem recently. He 
>  had a long
>  running tasklet and wanted to be broken down in smaller chunk.
>  
>  We decided to use a timer to reschedule the taslket in the future. 
>  This allows
>  the scheduler to run other loads (e.g. vCPU) for some time.
>  
>  This is pretty hackish but I couldn't find a better solution as 
>  tasklet have
>  high priority.
>  
>  Maybe the other will have a better idea.
> >>> 
> >>> Julien's suggestion is a good one.
> >>> 
> >>> But I think tasklets can be configured to be called from the 
> >>> idle_loop,
> >>> in which case they are not run in interrupt context?
> >>> 
> >>  Yes you are right tasklet will be scheduled from the idle_loop that 
> >> is not interrupt conext.
> > 
> > This depends on your tasklet. Some will run from the softirq context 
> > which is usually (for Arm) on the return of an exception.
> > 
>  Thanks for the info. I will check and will get better understanding of 
>  the tasklet how it will run in XEN.
> >>> 
> >>> 4. Latest version of the Linux SMMUv3 code implements the 
> >>> commands queue
> >>>access functions based on atomic operations implemented in 
> >>> Linux.
> >> 
> >> Can you provide more details?
> > 
> > I tried to port the latest version of the SMMUv3 code than I 
> > observed that
> > in order to port that code I have to also port atomic operation 
> > implemented
> > in Linux to XEN. As latest Linux code uses atomic operation to 
> > process the
> > command queues 
> > (atomic_cond_read_relaxed(),atomic_long_cond_read_relaxed() ,
> > atomic_fetch_andnot_relaxed()) .
>  
>  Thank you for the explanation. I think it would be best to import 
>  the atomic
>  helpers and use the latest code.
>  
>  This will ensure 

Re: [PATCH v3 2/3] x86/shadow: refactor shadow_vram_{get,put}_l1e()

2020-10-29 Thread Tim Deegan
At 10:44 +0200 on 19 Oct (1603104271), Jan Beulich wrote:
> By passing the functions an MFN and flags, only a single instance of
> each is needed; they were pretty large for being inline functions
> anyway.
> 
> While moving the code, also adjust coding style and add const where
> sensible / possible.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Alex Bennée wrote:
> Oleksandr Tyshchenko  writes:
> > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu <
> > masami.hirama...@linaro.org> wrote:
> >
> >> Hi Oleksandr,
> >>
> > Hi Masami
> >
> > [sorry for the possible format issue]
> >
> >
> >>
> >> I would like to try this on my arm64 board.
> >>
> > Glad to hear you are interested in this topic.
> >
> >
> >>
> >> According to your comments in the patch, I made this config file.
> >> # cat debian.conf
> >> name = "debian"
> >> type = "pvh"
> >> vcpus = 8
> >> memory = 512
> >> kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
> >> ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
> >> cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
> >> disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
> >> vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
> >> virtio = 1
> >> vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
> >>
> >> And tried to boot a DomU, but I got below error.
> >>
> >> # xl create -c debian.conf
> >> Parsing config from debian.conf
> >> libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain
> >> 1:unable to add virtio_disk devices
> >> libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
> >> 1:xc_domain_pause failed
> >> libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain
> >> type for domid=1
> >> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain
> >> 1:Unable to destroy guest
> >> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
> >> 1:Destruction of domain failed
> >>
> >>
> >> Could you tell me how can I test it?
> >>
> >
> > I assume it is due to the lack of the virtio-disk backend (which I haven't
> > shared yet as I focused on the IOREQ/DM support on Arm in the first place).
> > Could you wait a little bit, I am going to share it soon.
> 
> I assume this is wiring up the required bits of support to handle the
> IOREQ requests in QEMU? We are putting together a PoC demo to show
> a virtio enabled image (AGL) running on both KVM and Xen hypervisors so
> we are keen to see your code as soon as you can share it.
> 
> I'm currently preparing a patch series for QEMU which fixes the recent
> breakage caused by changes to the build system. As part of that I've
> separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build
> i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to
> create a qemu-aarch64-system binary with just the PV related models in
> it.
> 
> Talking to Stefano it probably makes sense going forward to introduce a
> new IOREQ aware machine type for QEMU that doesn't bring in the rest of
> the x86 overhead. I was thinking maybe xenvirt?
> 
> You've tested with virtio-block but we'd also like to extend this to
> other arbitrary virtio devices. I guess we will need some sort of
> mechanism to inform the QEMU command line where each device sits in the
> virtio-mmio bus so the FDT Xen delivers to the guest matches up with
> what QEMU is ready to serve requests for?

That would be great. As a reference, given that the domU memory mapping
layout is fixed, on x86 we just made sure that QEMU's idea of where the
devices are is the same of Xen's. What you are suggesting would make it
much more flexible and clearer.

Re: [PATCH] meson.build: fix building of Xen support for aarch64

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Alex Bennée wrote:
> > On Wed, 28 Oct 2020, Alex Bennée wrote:
> >> Xen is supported on aarch64 although weirdly using the i386-softmmu
> >> model. Checking based on the host CPU meant we never enabled Xen
> >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to
> >> make it not seem weird but that will require further build surgery.
> >> 
> >> Signed-off-by: Alex Bennée 
> >> Cc: Masami Hiramatsu 
> >> Cc: Stefano Stabellini 
> >> Cc: Anthony Perard 
> >> Cc: Paul Durrant 
> >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson")
> >> ---
> >>  meson.build | 2 ++
> >>  1 file changed, 2 insertions(+)
> >> 
> >> diff --git a/meson.build b/meson.build
> >> index 835424999d..f1fcbfed4c 100644
> >> --- a/meson.build
> >> +++ b/meson.build
> >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64']
> >>  'CONFIG_HVF': ['x86_64-softmmu'],
> >>  'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
> >>}
> >> +elif cpu in [ 'arm', 'aarch64' ]
> >> +  accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] }
> >>  endif
> >
> > This looks very reasonable -- the patch makes sense.
> >
> >
> > However I have two questions, mostly for my own understanding. I tried
> > to repro the aarch64 build problem but it works at my end, even without
> > this patch.
> 
> Building on a x86 host or with cross compiler?

native build (qemu-user)


> > I wonder why. I suspect it works thanks to these lines in
> > meson.build:
> >
> >   if not get_option('xen').disabled() and 'CONFIG_XEN_BACKEND' in 
> > config_host
> > accelerators += 'CONFIG_XEN'
> > have_xen_pci_passthrough = not 
> > get_option('xen_pci_passthrough').disabled() and targetos == 'linux'
> >   else
> > have_xen_pci_passthrough = false
> >   endif
> >
> > But I am not entirely sure who is adding CONFIG_XEN_BACKEND to
> > config_host.
> 
> The is part of the top level configure check - which basically checks
> for --enable-xen or autodetects the presence of the userspace libraries.
> I'm not sure if we have a slight over proliferation of symbols for Xen
> support (although I'm about to add more).
> 
> > The other question is: does it make sense to print the value of
> > CONFIG_XEN as part of the summary? Something like:
> >
> > diff --git a/meson.build b/meson.build
> > index 47e32e1fcb..c6e7832dc9 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -2070,6 +2070,7 @@ summary_info += {'KVM support':   
> > config_all.has_key('CONFIG_KVM')}
> >  summary_info += {'HAX support':   config_all.has_key('CONFIG_HAX')}
> >  summary_info += {'HVF support':   config_all.has_key('CONFIG_HVF')}
> >  summary_info += {'WHPX support':  config_all.has_key('CONFIG_WHPX')}
> > +summary_info += {'XEN support':  config_all.has_key('CONFIG_XEN')}
> >  summary_info += {'TCG support':   config_all.has_key('CONFIG_TCG')}
> >  if config_all.has_key('CONFIG_TCG')
> >summary_info += {'TCG debug enabled': 
> > config_host.has_key('CONFIG_DEBUG_TCG')}
> >
> >
> > But I realize there is already:
> >
> > summary_info += {'xen support':   
> > config_host.has_key('CONFIG_XEN_BACKEND')}
> >
> > so it would be a bit of a duplicate
> 
> Hmm so what we have is:
> 
>   CONFIG_XEN_BACKEND
> - ensures that appropriate compiler flags are added
> - pegs RAM_ADDR_MAX at UINT64_MAX (instead of UINTPTR_MAX)
>   CONFIG_XEN
> - which controls a bunch of build objects, some of which may be i386 only?
> ./accel/meson.build:15:specific_ss.add_all(when: ['CONFIG_XEN'], if_true: 
> dummy_ss)
> ./accel/stubs/meson.build:2:specific_ss.add(when: 'CONFIG_XEN', if_false: 
> files('xen-stub.c'))
> ./accel/xen/meson.build:1:specific_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen-all.c'))
> ./hw/9pfs/meson.build:17:fs_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen-9p-backend.c'))
> ./hw/block/dataplane/meson.build:2:specific_ss.add(when: 'CONFIG_XEN', 
> if_true: files('xen-block.c'))
> ./hw/block/meson.build:14:softmmu_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen-block.c'))
> ./hw/char/meson.build:23:softmmu_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen_console.c'))
> ./hw/display/meson.build:18:softmmu_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xenfb.c'))
> ./hw/i386/xen/meson.build:1:i386_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen-hvm.c',
>   
>  'xen-mapcache.c',
>   
>  'xen_apic.c',
>   
>  'xen_platform.c',
>   
>  'xen_pvdevice.c')
> ./hw/net/meson.build:2:softmmu_ss.add(when: 'CONFIG_XEN', if_true: 
> files('xen_nic.c'))
> ./hw/usb/meson.build:76:softmmu_ss.add(when: ['CONFIG_USB', 'CONFIG_XEN', 
> libusb], if_true: files('xen-usb.c'))
>  

Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Alex Bennée


Oleksandr Tyshchenko  writes:

> On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu <
> masami.hirama...@linaro.org> wrote:
>
>> Hi Oleksandr,
>>
> Hi Masami
>
> [sorry for the possible format issue]
>
>
>>
>> I would like to try this on my arm64 board.
>>
> Glad to hear you are interested in this topic.
>
>
>>
>> According to your comments in the patch, I made this config file.
>> # cat debian.conf
>> name = "debian"
>> type = "pvh"
>> vcpus = 8
>> memory = 512
>> kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
>> ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
>> cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
>> disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
>> vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
>> virtio = 1
>> vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
>>
>> And tried to boot a DomU, but I got below error.
>>
>> # xl create -c debian.conf
>> Parsing config from debian.conf
>> libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain
>> 1:unable to add virtio_disk devices
>> libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
>> 1:xc_domain_pause failed
>> libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain
>> type for domid=1
>> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain
>> 1:Unable to destroy guest
>> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
>> 1:Destruction of domain failed
>>
>>
>> Could you tell me how can I test it?
>>
>
> I assume it is due to the lack of the virtio-disk backend (which I haven't
> shared yet as I focused on the IOREQ/DM support on Arm in the first place).
> Could you wait a little bit, I am going to share it soon.

I assume this is wiring up the required bits of support to handle the
IOREQ requests in QEMU? We are putting together a PoC demo to show
a virtio enabled image (AGL) running on both KVM and Xen hypervisors so
we are keen to see your code as soon as you can share it.

I'm currently preparing a patch series for QEMU which fixes the recent
breakage caused by changes to the build system. As part of that I've
separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build
i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to
create a qemu-aarch64-system binary with just the PV related models in
it.

Talking to Stefano it probably makes sense going forward to introduce a
new IOREQ aware machine type for QEMU that doesn't bring in the rest of
the x86 overhead. I was thinking maybe xenvirt?

You've tested with virtio-block but we'd also like to extend this to
other arbitrary virtio devices. I guess we will need some sort of
mechanism to inform the QEMU command line where each device sits in the
virtio-mmio bus so the FDT Xen delivers to the guest matches up with
what QEMU is ready to serve requests for?

-- 
Alex Bennée



Re: [PATCH] x86/hvm: process softirq while saving/loading entries

2020-10-29 Thread Roger Pau Monné
On Thu, Oct 29, 2020 at 05:38:17PM +0100, Jan Beulich wrote:
> On 29.10.2020 16:20, Roger Pau Monne wrote:
> > On slow systems with sync_console saving or loading the context of big
> > guests can cause the watchdog to trigger. Fix this by adding a couple
> > of process_pending_softirqs.
> 
> Which raises the question in how far this is then also a problem
> for the caller of the underlying hypercall. IOW I wonder whether
> instead we need to make use of continuations here. Nevertheless

FWIW, I've only hit this with debug builds on boxes that have slow
serial with sync_console enabled, due to the verbose printks.

> 
> > Signed-off-by: Roger Pau Monné 
> 
> Acked-by: Jan Beulich 

Thanks.



[PATCH v1 00/23] reduce overhead during live migration

2020-10-29 Thread Olaf Hering
The current live migration code can easily saturate an 1Gb link.
There is still room for improvement with faster network connections.
Even with this series reviewed and applied.
See description of patch #6.

Olaf

Olaf Hering (23):
  tools: add readv_exact to libxenctrl
  tools: add xc_is_known_page_type to libxenctrl
  tools: use xc_is_known_page_type
  tools: unify type checking for data pfns in migration stream
  tools: show migration transfer rate in send_dirty_pages
  tools/guest: prepare to allocate arrays once
  tools/guest: save: move batch_pfns
  tools/guest: save: move mfns array
  tools/guest: save: move types array
  tools/guest: save: move errors array
  tools/guest: save: move iov array
  tools/guest: save: move rec_pfns array
  tools/guest: save: move guest_data array
  tools/guest: save: move local_pages array
  tools/guest: restore: move pfns array
  tools/guest: restore: move types array
  tools/guest: restore: move mfns array
  tools/guest: restore: move map_errs array
  tools/guest: restore: move mfns array in populate_pfns
  tools/guest: restore: move pfns array in populate_pfns
  tools/guest: restore: split record processing
  tools/guest: restore: split handle_page_data
  tools/guest: restore: write data directly into guest

 tools/libs/ctrl/xc_private.c  |  54 ++-
 tools/libs/ctrl/xc_private.h  |  34 ++
 tools/libs/guest/xg_sr_common.c   |  33 +-
 tools/libs/guest/xg_sr_common.h   |  86 +++-
 tools/libs/guest/xg_sr_restore.c  | 562 +-
 tools/libs/guest/xg_sr_save.c | 158 
 tools/libs/guest/xg_sr_save_x86_hvm.c |   5 +-
 tools/libs/guest/xg_sr_save_x86_pv.c  |  31 +-
 8 files changed, 666 insertions(+), 297 deletions(-)




Re: [PATCH] meson.build: fix building of Xen support for aarch64

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Jason Andryuk wrote:
> On Thu, Oct 29, 2020 at 6:01 AM Alex Bennée  wrote:
> >
> >
> > Stefano Stabellini  writes:
> >
> > > On Wed, 28 Oct 2020, Alex Bennée wrote:
> > >> Xen is supported on aarch64 although weirdly using the i386-softmmu
> > >> model. Checking based on the host CPU meant we never enabled Xen
> > >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to
> > >> make it not seem weird but that will require further build surgery.
> > >>
> > >> Signed-off-by: Alex Bennée 
> > >> Cc: Masami Hiramatsu 
> > >> Cc: Stefano Stabellini 
> > >> Cc: Anthony Perard 
> > >> Cc: Paul Durrant 
> > >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson")
> > >> ---
> > >>  meson.build | 2 ++
> > >>  1 file changed, 2 insertions(+)
> > >>
> > >> diff --git a/meson.build b/meson.build
> > >> index 835424999d..f1fcbfed4c 100644
> > >> --- a/meson.build
> > >> +++ b/meson.build
> > >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64']
> > >>  'CONFIG_HVF': ['x86_64-softmmu'],
> > >>  'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
> > >>}
> > >> +elif cpu in [ 'arm', 'aarch64' ]
> > >> +  accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] }
> > >>  endif
> > >
> > > This looks very reasonable -- the patch makes sense.
> 
> A comment would be useful to explain that Arm needs i386-softmmu for
> the xenpv machine.
> 
> > >
> > > However I have two questions, mostly for my own understanding. I tried
> > > to repro the aarch64 build problem but it works at my end, even without
> > > this patch.
> >
> > Building on a x86 host or with cross compiler?
> >
> > > I wonder why. I suspect it works thanks to these lines in
> > > meson.build:
> 
> I think it's a runtime and not a build problem.  In osstest, Xen
> support was detected and configured, but CONFIG_XEN wasn't set for
> Arm.  So at runtime xen_available() returns 0, and QEMU doesn't start
> with "qemu-system-i386: -xen-domid 1: Option not supported for this
> target"
> 
> I posted my investigation here:
> https://lore.kernel.org/xen-devel/cakf6xpss8kpgovzrkitpz63bhbvbjxrtywdhekzuo2q1kem...@mail.gmail.com/

Right, but strangely I cannot reproduce it here. I am natively building
(qemu-user) on aarch64 and for me CONFIG_XEN gets set.

Re: Xen on RP4

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Jürgen Groß wrote:
> On 29.10.20 01:37, Stefano Stabellini wrote:
> > On Tue, 27 Oct 2020, Elliott Mitchell wrote:
> > > On Mon, Oct 26, 2020 at 06:44:27PM +, Julien Grall wrote:
> > > > On 26/10/2020 16:03, Elliott Mitchell wrote:
> > > > > On Mon, Oct 26, 2020 at 01:31:42PM +, Julien Grall wrote:
> > > > > > On 24/10/2020 06:35, Elliott Mitchell wrote:
> > > > > > > ACPI has a distinct
> > > > > > > means of specifying a limited DMA-width; the above fails, because
> > > > > > > it
> > > > > > > assumes a *device-tree*.
> > > > > > 
> > > > > > Do you know if it would be possible to infer from the ACPI static
> > > > > > table
> > > > > > the DMA-width?
> > > > > 
> > > > > Yes, and it is.  Due to not knowing much about ACPI tables I don't
> > > > > know
> > > > > what the C code would look like though (problem is which documentation
> > > > > should I be looking at first?).
> > > > 
> > > > What you provided below is an excerpt of the DSDT. AFAIK, DSDT content
> > > > is written in AML. So far the shortest implementation I have seen for
> > > > the AML parser is around 5000 lines (see [1]). It might be possible to
> > > > strip some the code, although I think this will still probably too big
> > > > for a single workaround.
> > > > 
> > > > What I meant by "static table" is a table that looks like a structure
> > > > and can be parsed in a few lines. If we can't find on contain the DMA
> > > > window, then the next best solution is to find a way to identity the
> > > > platform.
> > > > 
> > > > I don't know enough ACPI to know if this solution is possible. A good
> > > > starter would probably be the ACPI spec [2].
> > > 
> > > Okay, that is worse than I had thought (okay, ACPI is impressively
> > > complex for something nominally firmware-level).
> > > 
> > > There are strings in the present Tianocore implementation for Raspberry
> > > PI 4B which could be targeted.  Notably included in the output during
> > > boot listing the tables, "RPIFDN", "RPIFDN RPI" and "RPIFDN RPI4" (I'm
> > > unsure how kosher these are as this wsn't implemented nor blessed by the
> > > Raspberry PI Foundation).
> > > 
> > > I strongly dislike this approach as you soon turn the Xen project into a
> > > database of hardware.  This is already occurring with
> > > xen/arch/arm/platforms and I would love to do something about this.  On
> > > that thought, how about utilizing Xen's command-line for this purpose?
> > 
> > I don't think that a command line option is a good idea: basically it is
> > punting to users the task of platform detection. Also, it means that
> > users will be necessarily forced to edit the uboot script or grub
> > configuration file to boot.
> > 
> > Note that even if we introduced a new command line, we wouldn't take
> > away the need for xen/arch/arm/platforms anyway.
> > 
> > It would be far better for Xen to autodetect the platform if we can.
> > 
> > 
> > > Have a procedure of during installation/updates retrieve DMA limitation
> > > information from the running OS and the following boot Xen will follow
> > > the appropriate setup.  By its nature, Domain 0 will have the information
> > > needed, just becomes an issue of how hard that is to retrieve...
> > 
> > Historically that is what we used to do for many things related to ACPI,
> > but unfortunately it leads to a pretty bad architecture where Xen
> > depends on Dom0 for booting rather than the other way around. (Dom0
> > should be the one requiring Xen for booting, given that Xen is higher
> > privilege and boots first.)
> > 
> > 
> > I think the best compromise is still to use an ACPI string to detect the
> > platform. For instance, would it be possible to use the OEMID fields in
> > RSDT, XSDT, FADT?  Possibly even a combination of them?
> > 
> > Another option might be to get the platform name from UEFI somehow.
> 
> What about having a small domain parsing the ACPI booting first and use
> that information for booting dom0?
> 
> That dom would be part of the Xen build and the hypervisor wouldn't need
> to gain all the ACPI AML logic.

That could work, but in practice we don't have such a domain today --
the infrastructure is missing. I wonder whether the bootloader (uboot or
grub) would know about the platform and might be able to pass that
information to Xen somehow.

Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Stefano Stabellini
On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote:
> On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu 
>  wrote:
>   Hi Oleksandr,
> 
> Hi Masami
> 
> [sorry for the possible format issue]
>  
> 
>   I would like to try this on my arm64 board.
> 
> Glad to hear you are interested in this topic. 
>  
> 
>   According to your comments in the patch, I made this config file.
>   # cat debian.conf
>   name = "debian"
>   type = "pvh"
>   vcpus = 8
>   memory = 512
>   kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
>   ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
>   cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
>   disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
>   vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
>   virtio = 1
>   vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
> 
>   And tried to boot a DomU, but I got below error.
> 
>   # xl create -c debian.conf
>   Parsing config from debian.conf
>   libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain
>   1:unable to add virtio_disk devices
>   libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
>   1:xc_domain_pause failed
>   libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain
>   type for domid=1
>   libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain
>   1:Unable to destroy guest
>   libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
>   1:Destruction of domain failed
> 
> 
>   Could you tell me how can I test it?
> 
> 
> I assume it is due to the lack of the virtio-disk backend (which I haven't 
> shared yet as I focused on the IOREQ/DM support on Arm in the
> first place).
> Could you wait a little bit, I am going to share it soon. 

Do you have a quick-and-dirty hack you can share in the meantime? Even
just on github as a special branch? It would be very useful to be able
to have a test-driver for the new feature.

[xen-4.11-testing test] 156277: tolerable FAIL - PUSHED

2020-10-29 Thread osstest service owner
flight 156277 xen-4.11-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156277/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install 
fail like 156034
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install 
fail like 156034
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156034
 test-armhf-armhf-libvirt 16 saverestore-support-checkfail  like 156034
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156034
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156034
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156034
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156034
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156034
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156034
 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail  like 156034
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156034
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156034
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-checkfail   never pass
 test-amd64-i386-xl-pvshim14 guest-start  fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  15 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-arm64-arm64-xl  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  16 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail  never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check 
fail never pass
 test-armhf-armhf-xl-rtds 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass
 test-armhf-armhf-libvirt 15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  14 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  15 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail   never pass

version targeted for testing:
 xen  e274c8bdc12eb596e55233040e8b49da27150f31
baseline version:
 xen  63199dfd3a0418f1677c6ccd7fe05b123af4610a

Last test of basis   156034  2020-10-20 13:35:54 Z9 days
Testing same since   156262  2020-10-27 18:36:52 Z2 days2 attempts


People who touched revisions under test:
  Andrew Cooper 

jobs:
 build-amd64-xsm  pass
 

Re: [PATCH] x86/pv: Drop stale comment in dom0_construct_pv()

2020-10-29 Thread Andrew Cooper
On 29/10/2020 14:37, Jan Beulich wrote:
> On 29.10.2020 15:00, Andrew Cooper wrote:
>> This comment has been around since c/s 1372bca0615 in 2004.  It is stale, as
>> it predates the introduction of struct vcpu.
> That commit only moved it around; it's 22a857bde9b8 afaics from
> early 2003 where it first appeared, where it had a reason:
>
> /*
>  * WARNING: The new domain must have its 'processor' field
>  * filled in by now !!
>  */
> phys_l2tab = ALLOC_FRAME_FROM_DOMAIN();
> l2tab = map_domain_mem(phys_l2tab);
> memcpy(l2tab, idle_pg_table[p->processor], PAGE_SIZE);

Oh yes - my simple search didn't spot the reformat.

>
> But yes, the comment has been stale for a long time, and I've
> been wondering a number of times what it was supposed to tell
> me. (I think it was already stale at the point the comment
> first got altered, in 3072fef54df8.)

Looks like it became stale with 99db02d5097 "Remove CPU-dependent
page-directory entries." which drops the per-cpu idle_pg_table.

>
>> It is not obvious that it was even correct at the time.  Where a vcpu (domain
>> at the time) has been configured to run is unrelated to construct the 
>> domain's
>> initial pagetables, etc.
>>
>> Signed-off-by: Andrew Cooper 
> Acked-by: Jan Beulich 

Thanks.  I'll update the commit message.

~Andrew



[PATCH v2] libxl: Add suppress-vmdesc to QEMU machine

2020-10-29 Thread Jason Andryuk
The device model state saved by QMP xen-save-devices-state doesn't
include the vmdesc json.  When restoring an HVM, xen-load-devices-state
always triggers "Expected vmdescription section, but got 0".  This is
not a problem when restore comes from a file.  However, when QEMU runs
in a linux stubdom and comes over a console, EOF is not received.  This
causes a delay restoring - though it does restore.

Setting suppress-vmdesc skips looking for the vmdesc during restore and
avoids the wait.

QEMU 5.2 enables suppress-vmdesc by default for xenfv, but this change
sets it manually for xenfv and xen_platform_pci=0 when -machine pc is
use.

QEMU commit 9850c6047b8b "migration: Allow to suppress vmdesc
submission" added suppress-vmdesc in QEMU 2.3.

Signed-off-by: Jason Andryuk 

---
QEMU 2.3 came out in 2015, so setting suppress-vmdesc unilaterally
should be okay...  Is this okay?
---
 tools/libs/light/libxl_dm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index d1ff35dda3..3da83259c0 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -1778,9 +1778,9 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 /* Switching here to the machine "pc" which does not add
  * the xen-platform device instead of the default "xenfv" machine.
  */
-machinearg = libxl__strdup(gc, "pc,accel=xen");
+machinearg = libxl__strdup(gc, "pc,accel=xen,suppress-vmdesc=on");
 } else {
-machinearg = libxl__strdup(gc, "xenfv");
+machinearg = libxl__strdup(gc, "xenfv,suppress-vmdesc=on");
 }
 if (b_info->u.hvm.mmio_hole_memkb) {
 uint64_t max_ram_below_4g = (1ULL << 32) -
-- 
2.25.1




Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm

2020-10-29 Thread Oleksandr Tyshchenko
On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu <
masami.hirama...@linaro.org> wrote:

> Hi Oleksandr,
>
Hi Masami

[sorry for the possible format issue]


>
> I would like to try this on my arm64 board.
>
Glad to hear you are interested in this topic.


>
> According to your comments in the patch, I made this config file.
> # cat debian.conf
> name = "debian"
> type = "pvh"
> vcpus = 8
> memory = 512
> kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64"
> ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64"
> cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw"
> disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ]
> vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ]
> virtio = 1
> vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ]
>
> And tried to boot a DomU, but I got below error.
>
> # xl create -c debian.conf
> Parsing config from debian.conf
> libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain
> 1:unable to add virtio_disk devices
> libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain
> 1:xc_domain_pause failed
> libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain
> type for domid=1
> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain
> 1:Unable to destroy guest
> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain
> 1:Destruction of domain failed
>
>
> Could you tell me how can I test it?
>

I assume it is due to the lack of the virtio-disk backend (which I haven't
shared yet as I focused on the IOREQ/DM support on Arm in the first place).
Could you wait a little bit, I am going to share it soon.


-- 
Regards,

Oleksandr Tyshchenko


Re: [PATCH] x86/HVM: send mapcache invalidation request to qemu regardless of preemption

2020-10-29 Thread Andrew Cooper
On 29/10/2020 15:14, Jan Beulich wrote:
> Even if only part of a hypercall completed before getting preempted,
> invalidation ought to occur. Therefore fold the two return statements.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [PATCH 1/2] x86: fix build of PV shim when !GRANT_TABLE

2020-10-29 Thread Andrew Cooper
On 29/10/2020 17:09, Jan Beulich wrote:
> To do its compat translation, shim code needs to include the compat
> header. For this to work, this header first of all needs to be
> generated.
>
> Reported-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [PATCH 2/2] x86: PV shim doesn't need GRANT_TABLE

2020-10-29 Thread Andrew Cooper
On 29/10/2020 17:10, Jan Beulich wrote:
> The only reference into the code controlled by this option is from the
> hypercall table, and that hypercall table entry gets overwritten.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



Re: [PATCH] x86emul: support AVX-VNNI

2020-10-29 Thread Andrew Cooper
On 29/10/2020 15:01, Jan Beulich wrote:
> These are VEX-encoded equivalents of the EVEX-encoded AVX512-VNNI ISA
> extension.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper 



[PATCH v1 21/23] tools/guest: restore: split record processing

2020-10-29 Thread Olaf Hering
handle_page_data must be able to read directly into mapped guest memory.
This will avoid unneccesary memcpy calls for data which can be consumed 
verbatim.

Rearrange the code to allow decisions based on the incoming record.

This change is preparation for future changes in handle_page_data,
no change in behavior is intended.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.c  | 33 -
 tools/libs/guest/xg_sr_common.h  |  4 ++-
 tools/libs/guest/xg_sr_restore.c | 49 ++--
 tools/libs/guest/xg_sr_save.c|  7 -
 4 files changed, 63 insertions(+), 30 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.c b/tools/libs/guest/xg_sr_common.c
index 17567ab133..cabde4ef74 100644
--- a/tools/libs/guest/xg_sr_common.c
+++ b/tools/libs/guest/xg_sr_common.c
@@ -91,26 +91,33 @@ int write_split_record(struct xc_sr_context *ctx, struct 
xc_sr_record *rec,
 return -1;
 }
 
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec)
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr 
*rhdr)
 {
 xc_interface *xch = ctx->xch;
-struct xc_sr_rhdr rhdr;
-size_t datasz;
 
-if ( read_exact(fd, , sizeof(rhdr)) )
+if ( read_exact(fd, rhdr, sizeof(*rhdr)) )
 {
 PERROR("Failed to read Record Header from stream");
 return -1;
 }
 
-if ( rhdr.length > REC_LENGTH_MAX )
+if ( rhdr->length > REC_LENGTH_MAX )
 {
-ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr.type,
-  rec_type_to_str(rhdr.type), rhdr.length, REC_LENGTH_MAX);
+ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr->type,
+  rec_type_to_str(rhdr->type), rhdr->length, REC_LENGTH_MAX);
 return -1;
 }
 
-datasz = ROUNDUP(rhdr.length, REC_ALIGN_ORDER);
+return 0;
+}
+
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr 
*rhdr,
+ struct xc_sr_record *rec)
+{
+xc_interface *xch = ctx->xch;
+size_t datasz;
+
+datasz = ROUNDUP(rhdr->length, REC_ALIGN_ORDER);
 
 if ( datasz )
 {
@@ -119,7 +126,7 @@ int read_record(struct xc_sr_context *ctx, int fd, struct 
xc_sr_record *rec)
 if ( !rec->data )
 {
 ERROR("Unable to allocate %zu bytes for record data (0x%08x, %s)",
-  datasz, rhdr.type, rec_type_to_str(rhdr.type));
+  datasz, rhdr->type, rec_type_to_str(rhdr->type));
 return -1;
 }
 
@@ -128,18 +135,18 @@ int read_record(struct xc_sr_context *ctx, int fd, struct 
xc_sr_record *rec)
 free(rec->data);
 rec->data = NULL;
 PERROR("Failed to read %zu bytes of data for record (0x%08x, %s)",
-   datasz, rhdr.type, rec_type_to_str(rhdr.type));
+   datasz, rhdr->type, rec_type_to_str(rhdr->type));
 return -1;
 }
 }
 else
 rec->data = NULL;
 
-rec->type   = rhdr.type;
-rec->length = rhdr.length;
+rec->type   = rhdr->type;
+rec->length = rhdr->length;
 
 return 0;
-};
+}
 
 static void __attribute__((unused)) build_assertions(void)
 {
diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 3fe665b91d..66d1b0dfe6 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -475,7 +475,9 @@ static inline int write_record(struct xc_sr_context *ctx,
  *
  * On failure, the contents of the record structure are undefined.
  */
-int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec);
+int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr 
*rhdr);
+int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr 
*rhdr,
+ struct xc_sr_record *rec);
 
 /*
  * This would ideally be private in restore.c, but is needed by
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 71b39612ee..93f69b9ba8 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -471,7 +471,7 @@ static int send_checkpoint_dirty_pfn_list(struct 
xc_sr_context *ctx)
 return rc;
 }
 
-static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
+static int process_buffered_record(struct xc_sr_context *ctx, struct 
xc_sr_record *rec);
 static int handle_checkpoint(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx->xch;
@@ -510,7 +510,7 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 
 for ( i = 0; i < ctx->restore.buffered_rec_num; i++ )
 {
-rc = process_record(ctx, >restore.buffered_records[i]);
+rc = process_buffered_record(ctx, 
>restore.buffered_records[i]);
 if ( rc )
 goto err;
 }
@@ -571,10 +571,11 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 return rc;
 }
 
-static int buffer_record(struct 

Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Arvind Sankar
On Thu, Oct 29, 2020 at 05:59:54PM +0100, Paolo Bonzini wrote:
> On 29/10/20 17:56, Arvind Sankar wrote:
> >> For those two just add:
> >>struct apic *apic = x86_system_apic;
> >> before all the assignments.
> >> Less churn and much better code.
> >>
> > Why would it be better code?
> > 
> 
> I think he means the compiler produces better code, because it won't
> read the global variable repeatedly.  Not sure if that's true,(*) but I
> think I do prefer that version if Arnd wants to do that tweak.
> 
> Paolo
> 
> (*) if it is, it might also be due to Linux using -fno-strict-aliasing
> 

Nope, it doesn't read it multiple times. I guess it still assumes that
apic isn't in the middle of what it points to: it would reload the
address if the first element of *apic was modified, but doesn't for
other elements. Interesting.



[PATCH v1 19/23] tools/guest: restore: move mfns array in populate_pfns

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move populate_pfns mfns array into preallocated 
space.
Use some prefix to avoid conflict with an array used in handle_page_data.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  | 2 ++
 tools/libs/guest/xg_sr_restore.c | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index eba3a49877..96a77b5969 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -230,6 +230,8 @@ struct xc_sr_restore_arrays {
 /* process_page_data */
 xen_pfn_t mfns[MAX_BATCH_SIZE];
 int map_errs[MAX_BATCH_SIZE];
+/* populate_pfns */
+xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 94c329032f..85a32aaed2 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -138,12 +138,12 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
   const xen_pfn_t *original_pfns, const uint32_t *types)
 {
 xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = malloc(count * sizeof(*mfns)),
+xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
 *pfns = malloc(count * sizeof(*pfns));
 unsigned int i, nr_pfns = 0;
 int rc = -1;
 
-if ( !mfns || !pfns )
+if ( !pfns )
 {
 ERROR("Failed to allocate %zu bytes for populating the physmap",
   2 * count * sizeof(*mfns));
@@ -191,7 +191,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
 
  err:
 free(pfns);
-free(mfns);
 
 return rc;
 }



[PATCH v1 23/23] tools/guest: restore: write data directly into guest

2020-10-29 Thread Olaf Hering
Read incoming migration stream directly into the guest memory.
This avoids the memory allocation and copying, and the resulting
performance penalty.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  |   1 +
 tools/libs/guest/xg_sr_restore.c | 132 ++-
 2 files changed, 129 insertions(+), 4 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 7ec8867b88..f76af23bcc 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -231,6 +231,7 @@ struct xc_sr_restore_arrays {
 xen_pfn_t mfns[MAX_BATCH_SIZE];
 int map_errs[MAX_BATCH_SIZE];
 void *guest_data[MAX_BATCH_SIZE];
+struct iovec iov[MAX_BATCH_SIZE];
 
 /* populate_pfns */
 xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 060f3d1f4e..2f575d7dd9 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -392,6 +392,122 @@ err:
 return rc;
 }
 
+/*
+ * Handle PAGE_DATA record from the stream.
+ * Given a list of pfns, their types, and a block of page data from the
+ * stream, populate and record their types, map the relevant subset and copy
+ * the data into the guest.
+ */
+static int handle_incoming_page_data(struct xc_sr_context *ctx,
+ struct xc_sr_rhdr *rhdr)
+{
+xc_interface *xch = ctx->xch;
+struct xc_sr_restore_arrays *m = ctx->restore.m;
+struct xc_sr_rec_page_data_header *pages = >pages;
+uint64_t *pfn_nums = m->pages.pfn;
+uint32_t i;
+int rc, iov_idx;
+
+rc = handle_static_data_end_v2(ctx);
+if ( rc )
+goto err;
+
+/* First read and verify the header */
+rc = read_exact(ctx->fd, pages, sizeof(*pages));
+if ( rc )
+{
+PERROR("Could not read rec_pfn header");
+goto err;
+}
+
+if ( verify_rec_page_hdr(ctx, rhdr->length, pages) == false )
+{
+rc = -1;
+goto err;
+}
+
+/* Then read and verify the incoming pfn numbers */
+rc = read_exact(ctx->fd, pfn_nums, sizeof(*pfn_nums) * pages->count);
+if ( rc )
+{
+PERROR("Could not read rec_pfn data");
+goto err;
+}
+
+if ( verify_rec_page_pfns(ctx, rhdr->length, pages) == false )
+{
+rc = -1;
+goto err;
+}
+
+/* Finally read and verify the incoming pfn data */
+rc = map_guest_pages(ctx, pages);
+if ( rc )
+goto err;
+
+/* Prepare read buffers, either guest or throw away memory */
+for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+{
+if ( !m->guest_data[i] )
+continue;
+
+m->iov[iov_idx].iov_len = PAGE_SIZE;
+if ( ctx->restore.verify )
+m->iov[iov_idx].iov_base = ctx->restore.verify_buf + i * PAGE_SIZE;
+else
+m->iov[iov_idx].iov_base = m->guest_data[i];
+iov_idx++;
+}
+
+if ( !iov_idx )
+goto done;
+
+rc = readv_exact(ctx->fd, m->iov, iov_idx);
+if ( rc )
+{
+PERROR("read of %d pages failed", iov_idx);
+goto err;
+}
+
+/* Post-processing of pfn data */
+for ( i = 0, iov_idx = 0; i < pages->count; i++ )
+{
+if ( !m->guest_data[i] )
+continue;
+
+rc = ctx->restore.ops.localise_page(ctx, m->types[i], 
m->iov[iov_idx].iov_base);
+if ( rc )
+{
+ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
+  m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+goto err;
+
+}
+
+if ( ctx->restore.verify )
+{
+if ( memcmp(m->guest_data[i], m->iov[iov_idx].iov_base, PAGE_SIZE) 
)
+{
+ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
+  m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
+}
+}
+
+iov_idx++;
+}
+
+done:
+rc = 0;
+
+err:
+if ( ctx->restore.guest_mapping )
+{
+xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, 
ctx->restore.nr_mapped_pages);
+ctx->restore.guest_mapping = NULL;
+}
+return rc;
+}
+
 /*
  * Handle PAGE_DATA record from an existing buffer
  * Given a list of pfns, their types, and a block of page data from the
@@ -773,11 +889,19 @@ static int process_incoming_record_header(struct 
xc_sr_context *ctx, struct xc_s
 struct xc_sr_record rec;
 int rc;
 
-rc = read_record_data(ctx, ctx->fd, rhdr, );
-if ( rc )
-return rc;
+switch ( rhdr->type )
+{
+case REC_TYPE_PAGE_DATA:
+rc = handle_incoming_page_data(ctx, rhdr);
+break;
+default:
+rc = read_record_data(ctx, ctx->fd, rhdr, );
+if ( rc == 0 )
+rc = process_buffered_record(ctx, );;
+break;
+}
 
-return process_buffered_record(ctx, );
+return rc;
 }
 
 



[PATCH v1 06/23] tools/guest: prepare to allocate arrays once

2020-10-29 Thread Olaf Hering
The hotpath 'send_dirty_pages' is supposed to do just one thing: sending.
The other end 'handle_page_data' is supposed to do just receiving.

But instead both do other costly work like memory allocations and data moving.
Do the allocations once, the array sizes are a compiletime constant.
Avoid unneeded copying of data by receiving data directly into mapped guest 
memory.

This patch is just prepartion, subsequent changes will populate the arrays.

Once all changes are applied, migration of a busy HVM domU changes like that:

Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 
xen_testing):
2020-10-29 10:23:10.711+: xc: show_transfer_rate: 23663128 bytes + 2879563 
pages in 55.324905335 sec, 203 MiB/sec: Internal error
2020-10-29 10:23:35.115+: xc: show_transfer_rate: 16829632 bytes + 2097552 
pages in 24.401179720 sec, 335 MiB/sec: Internal error
2020-10-29 10:23:59.436+: xc: show_transfer_rate: 16829032 bytes + 2097478 
pages in 24.319025928 sec, 336 MiB/sec: Internal error
2020-10-29 10:24:23.844+: xc: show_transfer_rate: 16829024 bytes + 2097477 
pages in 24.406992500 sec, 335 MiB/sec: Internal error
2020-10-29 10:24:48.292+: xc: show_transfer_rate: 16828912 bytes + 2097463 
pages in 24.446489027 sec, 335 MiB/sec: Internal error
2020-10-29 10:25:01.816+: xc: show_transfer_rate: 16836080 bytes + 2098356 
pages in 13.447091818 sec, 609 MiB/sec: Internal error

With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 
xen_unstable):
2020-10-28 21:26:05.074+: xc: show_transfer_rate: 23663128 bytes + 2879563 
pages in 52.564054368 sec, 213 MiB/sec: Internal error
2020-10-28 21:26:23.527+: xc: show_transfer_rate: 16830040 bytes + 2097603 
pages in 18.450592015 sec, 444 MiB/sec: Internal error
2020-10-28 21:26:41.926+: xc: show_transfer_rate: 16830944 bytes + 2097717 
pages in 18.397862306 sec, 445 MiB/sec: Internal error
2020-10-28 21:27:00.339+: xc: show_transfer_rate: 16829176 bytes + 2097498 
pages in 18.411973339 sec, 445 MiB/sec: Internal error
2020-10-28 21:27:18.643+: xc: show_transfer_rate: 16828592 bytes + 2097425 
pages in 18.303326695 sec, 447 MiB/sec: Internal error
2020-10-28 21:27:26.289+: xc: show_transfer_rate: 16835952 bytes + 2098342 
pages in 7.579846749 sec, 1081 MiB/sec: Internal error

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  | 8 
 tools/libs/guest/xg_sr_restore.c | 8 
 tools/libs/guest/xg_sr_save.c| 4 +++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index f3a7a29298..62bc87b5f4 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -211,6 +211,12 @@ static inline int update_blob(struct xc_sr_blob *blob,
 return 0;
 }
 
+struct xc_sr_save_arrays {
+};
+
+struct xc_sr_restore_arrays {
+};
+
 struct xc_sr_context
 {
 xc_interface *xch;
@@ -248,6 +254,7 @@ struct xc_sr_context
 unsigned long *deferred_pages;
 unsigned long nr_deferred_pages;
 xc_hypercall_buffer_t dirty_bitmap_hbuf;
+struct xc_sr_save_arrays *m;
 } save;
 
 struct /* Restore data. */
@@ -299,6 +306,7 @@ struct xc_sr_context
 
 /* Sender has invoked verify mode on the stream. */
 bool verify;
+struct xc_sr_restore_arrays *m;
 } restore;
 };
 
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 0332ae9f32..4a9ece9681 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -739,6 +739,13 @@ static int setup(struct xc_sr_context *ctx)
 }
 ctx->restore.allocated_rec_num = DEFAULT_BUF_RECORDS;
 
+ctx->restore.m = malloc(sizeof(*ctx->restore.m));
+if ( !ctx->restore.m ) {
+ERROR("Unable to allocate memory for arrays");
+rc = -1;
+goto err;
+}
+
  err:
 return rc;
 }
@@ -757,6 +764,7 @@ static void cleanup(struct xc_sr_context *ctx)
 xc_hypercall_buffer_free_pages(
 xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->restore.p2m_size)));
 
+free(ctx->restore.m);
 free(ctx->restore.buffered_records);
 free(ctx->restore.populated_pfns);
 
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index f58a35ddde..1e3c8eff2f 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -853,8 +853,9 @@ static int setup(struct xc_sr_context *ctx)
 ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
   sizeof(*ctx->save.batch_pfns));
 ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
+ctx->save.m = malloc(sizeof(*ctx->save.m));
 
-if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages )
+if ( !ctx->save.m || !ctx->save.batch_pfns || !dirty_bitmap || 
!ctx->save.deferred_pages )
 {
 ERROR("Unable to allocate 

[PATCH v1 09/23] tools/guest: save: move types array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move types array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h | 2 ++
 tools/libs/guest/xg_sr_save.c   | 7 ++-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 0c2bef8f78..3cbadb607b 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -215,6 +215,8 @@ struct xc_sr_save_arrays {
 xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
 /* write_batch: Mfns of the batch pfns. */
 xen_pfn_t mfns[MAX_BATCH_SIZE];
+/* write_batch: Types of the batch pfns. */
+xen_pfn_t types[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index cdab288c3f..6678df95b8 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 static int write_batch(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL;
+xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
 void *guest_mapping = NULL;
 void **guest_data = NULL;
 void **local_pages = NULL;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
 assert(nr_pfns != 0);
 
-/* Types of the batch pfns. */
-types = malloc(nr_pfns * sizeof(*types));
 /* Errors from attempting to map the gfns. */
 errors = malloc(nr_pfns * sizeof(*errors));
 /* Pointers to page data to send.  Mapped gfns or local allocations. */
@@ -116,7 +114,7 @@ static int write_batch(struct xc_sr_context *ctx)
 /* iovec[] for writev(). */
 iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-if ( !types || !errors || !guest_data || !local_pages || !iov )
+if ( !errors || !guest_data || !local_pages || !iov )
 {
 ERROR("Unable to allocate arrays for a batch of %u pages",
   nr_pfns);
@@ -274,7 +272,6 @@ static int write_batch(struct xc_sr_context *ctx)
 free(local_pages);
 free(guest_data);
 free(errors);
-free(types);
 
 return rc;
 }



[PATCH v1 03/23] tools: use xc_is_known_page_type

2020-10-29 Thread Olaf Hering
Verify pfn type on sending side, also verify incoming batch of pfns.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_restore.c | 3 +--
 tools/libs/guest/xg_sr_save.c| 6 ++
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index b57a787519..f1c3169229 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -406,8 +406,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 }
 
 type = (pages->pfn[i] & PAGE_DATA_TYPE_MASK) >> 32;
-if ( ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) >= 5) &&
- ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) <= 8) )
+if ( xc_is_known_page_type(type) == false )
 {
 ERROR("Invalid type %#"PRIx32" for pfn %#"PRIpfn" (index %u)",
   type, pfn, i);
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 2ba7c3200c..044d0ae3aa 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -147,6 +147,12 @@ static int write_batch(struct xc_sr_context *ctx)
 
 for ( i = 0; i < nr_pfns; ++i )
 {
+if ( xc_is_known_page_type(types[i]) == false )
+{
+ERROR("Wrong type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]);
+goto err;
+}
+
 switch ( types[i] )
 {
 case XEN_DOMCTL_PFINFO_BROKEN:



[PATCH v1 17/23] tools/guest: restore: move mfns array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move mfns array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  | 2 ++
 tools/libs/guest/xg_sr_restore.c | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 0ceecb289d..5731a5c186 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -227,6 +227,8 @@ struct xc_sr_restore_arrays {
 /* handle_page_data */
 xen_pfn_t pfns[MAX_BATCH_SIZE];
 uint32_t types[MAX_BATCH_SIZE];
+/* process_page_data */
+xen_pfn_t mfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 7729071e41..3ba089f862 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -205,7 +205,7 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
  xen_pfn_t *pfns, uint32_t *types, void *page_data)
 {
 xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = malloc(count * sizeof(*mfns));
+xen_pfn_t *mfns = ctx->restore.m->mfns;
 int *map_errs = malloc(count * sizeof(*map_errs));
 int rc;
 void *mapping = NULL, *guest_page = NULL;
@@ -213,7 +213,7 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 j,  /* j indexes the subset of pfns we decide to map. */
 nr_pages = 0;
 
-if ( !mfns || !map_errs )
+if ( !map_errs )
 {
 rc = -1;
 ERROR("Failed to allocate %zu bytes to process page data",
@@ -299,7 +299,6 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
 
 free(map_errs);
-free(mfns);
 
 return rc;
 }



[PATCH v1 07/23] tools/guest: save: move batch_pfns

2020-10-29 Thread Olaf Hering
The batch_pfns array is already allocated in advance.
Move it into the preallocated area.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h |  2 +-
 tools/libs/guest/xg_sr_save.c   | 25 +++--
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 62bc87b5f4..c78a07b8f8 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -212,6 +212,7 @@ static inline int update_blob(struct xc_sr_blob *blob,
 }
 
 struct xc_sr_save_arrays {
+xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
@@ -249,7 +250,6 @@ struct xc_sr_context
 
 struct precopy_stats stats;
 
-xen_pfn_t *batch_pfns;
 unsigned int nr_batch_pfns;
 unsigned long *deferred_pages;
 unsigned long nr_deferred_pages;
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 1e3c8eff2f..597e638c59 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -77,7 +77,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 
 /*
  * Writes a batch of memory as a PAGE_DATA record into the stream.  The batch
- * is constructed in ctx->save.batch_pfns.
+ * is constructed in ctx->save.m->batch_pfns.
  *
  * This function:
  * - gets the types for each pfn in the batch.
@@ -128,12 +128,12 @@ static int write_batch(struct xc_sr_context *ctx)
 for ( i = 0; i < nr_pfns; ++i )
 {
 types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
-  ctx->save.batch_pfns[i]);
+  
ctx->save.m->batch_pfns[i]);
 
 /* Likely a ballooned page. */
 if ( mfns[i] == INVALID_MFN )
 {
-set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
+set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages);
 ++ctx->save.nr_deferred_pages;
 }
 }
@@ -179,7 +179,7 @@ static int write_batch(struct xc_sr_context *ctx)
 if ( errors[p] )
 {
 ERROR("Mapping of pfn %#"PRIpfn" (mfn %#"PRIpfn") failed %d",
-  ctx->save.batch_pfns[i], mfns[p], errors[p]);
+  ctx->save.m->batch_pfns[i], mfns[p], errors[p]);
 goto err;
 }
 
@@ -193,7 +193,7 @@ static int write_batch(struct xc_sr_context *ctx)
 {
 if ( rc == -1 && errno == EAGAIN )
 {
-set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages);
+set_bit(ctx->save.m->batch_pfns[i], 
ctx->save.deferred_pages);
 ++ctx->save.nr_deferred_pages;
 types[i] = XEN_DOMCTL_PFINFO_XTAB;
 --nr_pages;
@@ -224,7 +224,7 @@ static int write_batch(struct xc_sr_context *ctx)
 rec.length += nr_pages * PAGE_SIZE;
 
 for ( i = 0; i < nr_pfns; ++i )
-rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.batch_pfns[i];
+rec_pfns[i] = ((uint64_t)(types[i]) << 32) | 
ctx->save.m->batch_pfns[i];
 
 iov[0].iov_base = 
 iov[0].iov_len = sizeof(rec.type);
@@ -296,9 +296,9 @@ static int flush_batch(struct xc_sr_context *ctx)
 
 if ( !rc )
 {
-VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.batch_pfns,
+VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.m->batch_pfns,
 MAX_BATCH_SIZE *
-sizeof(*ctx->save.batch_pfns));
+sizeof(*ctx->save.m->batch_pfns));
 }
 
 return rc;
@@ -315,7 +315,7 @@ static int add_to_batch(struct xc_sr_context *ctx, 
xen_pfn_t pfn)
 rc = flush_batch(ctx);
 
 if ( rc == 0 )
-ctx->save.batch_pfns[ctx->save.nr_batch_pfns++] = pfn;
+ctx->save.m->batch_pfns[ctx->save.nr_batch_pfns++] = pfn;
 
 return rc;
 }
@@ -850,14 +850,12 @@ static int setup(struct xc_sr_context *ctx)
 
 dirty_bitmap = xc_hypercall_buffer_alloc_pages(
 xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size)));
-ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE *
-  sizeof(*ctx->save.batch_pfns));
 ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size);
 ctx->save.m = malloc(sizeof(*ctx->save.m));
 
-if ( !ctx->save.m || !ctx->save.batch_pfns || !dirty_bitmap || 
!ctx->save.deferred_pages )
+if ( !ctx->save.m || !dirty_bitmap || !ctx->save.deferred_pages )
 {
-ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and"
+ERROR("Unable to allocate memory for dirty bitmaps and"
   " deferred pages");
 rc = -1;
 errno = ENOMEM;
@@ -886,7 +884,6 @@ static void cleanup(struct xc_sr_context *ctx)
 xc_hypercall_buffer_free_pages(xch, dirty_bitmap,

[PATCH v1 16/23] tools/guest: restore: move types array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move types array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  |  1 +
 tools/libs/guest/xg_sr_restore.c | 12 +---
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 516d9b9fb5..0ceecb289d 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -226,6 +226,7 @@ struct xc_sr_save_arrays {
 struct xc_sr_restore_arrays {
 /* handle_page_data */
 xen_pfn_t pfns[MAX_BATCH_SIZE];
+uint32_t types[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 7d1447e86c..7729071e41 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -316,7 +316,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 int rc = -1;
 
 xen_pfn_t *pfns = ctx->restore.m->pfns, pfn;
-uint32_t *types = NULL, type;
+uint32_t *types = ctx->restore.m->types, type;
 
 /*
  * v2 compatibility only exists for x86 streams.  This is a bit of a
@@ -363,14 +363,6 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 goto err;
 }
 
-types = malloc(pages->count * sizeof(*types));
-if ( !types )
-{
-ERROR("Unable to allocate enough memory for %u pfns",
-  pages->count);
-goto err;
-}
-
 for ( i = 0; i < pages->count; ++i )
 {
 pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK;
@@ -410,8 +402,6 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 rc = process_page_data(ctx, pages->count, pfns, types,
>pfn[pages->count]);
  err:
-free(types);
-
 return rc;
 }
 



[PATCH v1 08/23] tools/guest: save: move mfns array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move mfns array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h | 2 ++
 tools/libs/guest/xg_sr_save.c   | 7 ++-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index c78a07b8f8..0c2bef8f78 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -213,6 +213,8 @@ static inline int update_blob(struct xc_sr_blob *blob,
 
 struct xc_sr_save_arrays {
 xen_pfn_t batch_pfns[MAX_BATCH_SIZE];
+/* write_batch: Mfns of the batch pfns. */
+xen_pfn_t mfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 597e638c59..cdab288c3f 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx)
 static int write_batch(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = NULL, *types = NULL;
+xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL;
 void *guest_mapping = NULL;
 void **guest_data = NULL;
 void **local_pages = NULL;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
 assert(nr_pfns != 0);
 
-/* Mfns of the batch pfns. */
-mfns = malloc(nr_pfns * sizeof(*mfns));
 /* Types of the batch pfns. */
 types = malloc(nr_pfns * sizeof(*types));
 /* Errors from attempting to map the gfns. */
@@ -118,7 +116,7 @@ static int write_batch(struct xc_sr_context *ctx)
 /* iovec[] for writev(). */
 iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-if ( !mfns || !types || !errors || !guest_data || !local_pages || !iov )
+if ( !types || !errors || !guest_data || !local_pages || !iov )
 {
 ERROR("Unable to allocate arrays for a batch of %u pages",
   nr_pfns);
@@ -277,7 +275,6 @@ static int write_batch(struct xc_sr_context *ctx)
 free(guest_data);
 free(errors);
 free(types);
-free(mfns);
 
 return rc;
 }



[PATCH v1 13/23] tools/guest: save: move guest_data array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move guest_data array into preallocated space.

Because this was allocated with calloc:
Adjust the loop to clear unused entries as needed.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h |  2 ++
 tools/libs/guest/xg_sr_save.c   | 11 ++-
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 81158a4f4d..33e66678c6 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -223,6 +223,8 @@ struct xc_sr_save_arrays {
 struct iovec iov[MAX_BATCH_SIZE + 4];
 /* write_batch */
 uint64_t rec_pfns[MAX_BATCH_SIZE];
+/* write_batch: Pointers to page data to send. Mapped gfns or local 
allocations. */
+void *guest_data[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 4d7fbe09ed..658f834ae8 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -90,7 +90,7 @@ static int write_batch(struct xc_sr_context *ctx)
 xc_interface *xch = ctx->xch;
 xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
 void *guest_mapping = NULL;
-void **guest_data = NULL;
+void **guest_data = ctx->save.m->guest_data;
 void **local_pages = NULL;
 int *errors = ctx->save.m->errors, rc = -1;
 unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
@@ -105,12 +105,10 @@ static int write_batch(struct xc_sr_context *ctx)
 
 assert(nr_pfns != 0);
 
-/* Pointers to page data to send.  Mapped gfns or local allocations. */
-guest_data = calloc(nr_pfns, sizeof(*guest_data));
 /* Pointers to locally allocated pages.  Need freeing. */
 local_pages = calloc(nr_pfns, sizeof(*local_pages));
 
-if ( !guest_data || !local_pages )
+if ( !local_pages )
 {
 ERROR("Unable to allocate arrays for a batch of %u pages",
   nr_pfns);
@@ -166,7 +164,10 @@ static int write_batch(struct xc_sr_context *ctx)
 for ( i = 0, p = 0; i < nr_pfns; ++i )
 {
 if ( page_type_has_stream_data(types[i]) == false )
+{
+guest_data[i] = NULL;
 continue;
+}
 
 if ( errors[p] )
 {
@@ -183,6 +184,7 @@ static int write_batch(struct xc_sr_context *ctx)
 
 if ( rc )
 {
+guest_data[i] = NULL;
 if ( rc == -1 && errno == EAGAIN )
 {
 set_bit(ctx->save.m->batch_pfns[i], 
ctx->save.deferred_pages);
@@ -256,7 +258,6 @@ static int write_batch(struct xc_sr_context *ctx)
 for ( i = 0; local_pages && i < nr_pfns; ++i )
 free(local_pages[i]);
 free(local_pages);
-free(guest_data);
 
 return rc;
 }



[PATCH v1 20/23] tools/guest: restore: move pfns array in populate_pfns

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move populate_pfns' pfns array into 
preallocated space.
Use some prefix to avoid conflict with an array used in handle_page_data.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  |  1 +
 tools/libs/guest/xg_sr_restore.c | 11 +--
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 96a77b5969..3fe665b91d 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -232,6 +232,7 @@ struct xc_sr_restore_arrays {
 int map_errs[MAX_BATCH_SIZE];
 /* populate_pfns */
 xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
+xen_pfn_t pp_pfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 85a32aaed2..71b39612ee 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -139,17 +139,10 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
 {
 xc_interface *xch = ctx->xch;
 xen_pfn_t *mfns = ctx->restore.m->pp_mfns,
-*pfns = malloc(count * sizeof(*pfns));
+*pfns = ctx->restore.m->pp_pfns;
 unsigned int i, nr_pfns = 0;
 int rc = -1;
 
-if ( !pfns )
-{
-ERROR("Failed to allocate %zu bytes for populating the physmap",
-  2 * count * sizeof(*mfns));
-goto err;
-}
-
 for ( i = 0; i < count; ++i )
 {
 if ( (!types ||
@@ -190,8 +183,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
 rc = 0;
 
  err:
-free(pfns);
-
 return rc;
 }
 



[PATCH v1 12/23] tools/guest: save: move rec_pfns array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move rec_pfns array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h |  2 ++
 tools/libs/guest/xg_sr_save.c   | 11 +--
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index f315b4f526..81158a4f4d 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -221,6 +221,8 @@ struct xc_sr_save_arrays {
 int errors[MAX_BATCH_SIZE];
 /* write_batch: iovec[] for writev(). */
 struct iovec iov[MAX_BATCH_SIZE + 4];
+/* write_batch */
+uint64_t rec_pfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 385a591332..4d7fbe09ed 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -96,7 +96,7 @@ static int write_batch(struct xc_sr_context *ctx)
 unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
 unsigned int nr_pfns = ctx->save.nr_batch_pfns;
 void *page, *orig_page;
-uint64_t *rec_pfns = NULL;
+uint64_t *rec_pfns = ctx->save.m->rec_pfns;
 struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
 struct xc_sr_rec_page_data_header hdr = { 0 };
 struct xc_sr_record rec = {
@@ -201,14 +201,6 @@ static int write_batch(struct xc_sr_context *ctx)
 }
 }
 
-rec_pfns = malloc(nr_pfns * sizeof(*rec_pfns));
-if ( !rec_pfns )
-{
-ERROR("Unable to allocate %zu bytes of memory for page data pfn list",
-  nr_pfns * sizeof(*rec_pfns));
-goto err;
-}
-
 hdr.count = nr_pfns;
 
 rec.length = sizeof(hdr);
@@ -259,7 +251,6 @@ static int write_batch(struct xc_sr_context *ctx)
 rc = ctx->save.nr_batch_pfns = 0;
 
  err:
-free(rec_pfns);
 if ( guest_mapping )
 xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
 for ( i = 0; local_pages && i < nr_pfns; ++i )



[PATCH v1 22/23] tools/guest: restore: split handle_page_data

2020-10-29 Thread Olaf Hering
handle_page_data must be able to read directly into mapped guest memory.
This will avoid unneccesary memcpy calls for data that can be consumed verbatim.

Split the various steps of record processing:
- move processing to handle_buffered_page_data
- adjust xenforeignmemory_map to set errno in case of failure
- adjust verify mode to set errno in case of failure

This change is preparation for future changes in handle_page_data,
no change in behavior is intended.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  |   9 +
 tools/libs/guest/xg_sr_restore.c | 343 ---
 2 files changed, 231 insertions(+), 121 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 66d1b0dfe6..7ec8867b88 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -230,9 +230,14 @@ struct xc_sr_restore_arrays {
 /* process_page_data */
 xen_pfn_t mfns[MAX_BATCH_SIZE];
 int map_errs[MAX_BATCH_SIZE];
+void *guest_data[MAX_BATCH_SIZE];
+
 /* populate_pfns */
 xen_pfn_t pp_mfns[MAX_BATCH_SIZE];
 xen_pfn_t pp_pfns[MAX_BATCH_SIZE];
+
+/* Must be the last member */
+struct xc_sr_rec_page_data_header pages;
 };
 
 struct xc_sr_context
@@ -323,7 +328,11 @@ struct xc_sr_context
 
 /* Sender has invoked verify mode on the stream. */
 bool verify;
+void *verify_buf;
+
 struct xc_sr_restore_arrays *m;
+void *guest_mapping;
+uint32_t nr_mapped_pages;
 } restore;
 };
 
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 93f69b9ba8..060f3d1f4e 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -186,123 +186,18 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned 
int count,
 return rc;
 }
 
-/*
- * Given a list of pfns, their types, and a block of page data from the
- * stream, populate and record their types, map the relevant subset and copy
- * the data into the guest.
- */
-static int process_page_data(struct xc_sr_context *ctx, unsigned int count,
- xen_pfn_t *pfns, uint32_t *types, void *page_data)
+static int handle_static_data_end_v2(struct xc_sr_context *ctx)
 {
-xc_interface *xch = ctx->xch;
-xen_pfn_t *mfns = ctx->restore.m->mfns;
-int *map_errs = ctx->restore.m->map_errs;
-int rc;
-void *mapping = NULL, *guest_page = NULL;
-unsigned int i, /* i indexes the pfns from the record. */
-j,  /* j indexes the subset of pfns we decide to map. */
-nr_pages = 0;
-
-rc = populate_pfns(ctx, count, pfns, types);
-if ( rc )
-{
-ERROR("Failed to populate pfns for batch of %u pages", count);
-goto err;
-}
-
-for ( i = 0; i < count; ++i )
-{
-ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
-
-if ( page_type_has_stream_data(types[i]) == true )
-mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
-}
-
-/* Nothing to do? */
-if ( nr_pages == 0 )
-goto done;
-
-mapping = guest_page = xenforeignmemory_map(
-xch->fmem, ctx->domid, PROT_READ | PROT_WRITE,
-nr_pages, mfns, map_errs);
-if ( !mapping )
-{
-rc = -1;
-PERROR("Unable to map %u mfns for %u pages of data",
-   nr_pages, count);
-goto err;
-}
-
-for ( i = 0, j = 0; i < count; ++i )
-{
-if ( page_type_has_stream_data(types[i]) == false )
-continue;
-
-if ( map_errs[j] )
-{
-rc = -1;
-ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") 
failed with %d",
-  pfns[i], mfns[j], types[i], map_errs[j]);
-goto err;
-}
-
-/* Undo page normalisation done by the saver. */
-rc = ctx->restore.ops.localise_page(ctx, types[i], page_data);
-if ( rc )
-{
-ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")",
-  pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
-goto err;
-}
-
-if ( ctx->restore.verify )
-{
-/* Verify mode - compare incoming data to what we already have. */
-if ( memcmp(guest_page, page_data, PAGE_SIZE) )
-ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")",
-  pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT);
-}
-else
-{
-/* Regular mode - copy incoming data into place. */
-memcpy(guest_page, page_data, PAGE_SIZE);
-}
-
-++j;
-guest_page += PAGE_SIZE;
-page_data += PAGE_SIZE;
-}
-
- done:
-rc = 0;
-
- err:
-if ( mapping )
-xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
-
-return rc;
-}
+int rc = 0;
 
-/*
- * Validate a PAGE_DATA record from the stream, and 

[PATCH v1 04/23] tools: unify type checking for data pfns in migration stream

2020-10-29 Thread Olaf Hering
Introduce a helper which decides if a given pfn type has data
for the migration stream.

No change in behavior intended.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  | 17 
 tools/libs/guest/xg_sr_restore.c | 34 +---
 tools/libs/guest/xg_sr_save.c| 14 ++---
 3 files changed, 24 insertions(+), 41 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index cc3ad1c394..70e328e951 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -455,6 +455,23 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
 /* Handle a STATIC_DATA_END record. */
 int handle_static_data_end(struct xc_sr_context *ctx);
 
+static inline bool page_type_has_stream_data(uint32_t type)
+{
+bool ret;
+
+switch (type)
+{
+case XEN_DOMCTL_PFINFO_XTAB:
+case XEN_DOMCTL_PFINFO_XALLOC:
+case XEN_DOMCTL_PFINFO_BROKEN:
+ret = false;
+break;
+default:
+ret = true;
+break;
+}
+return ret;
+}
 #endif
 /*
  * Local variables:
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index f1c3169229..0332ae9f32 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -152,9 +152,8 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int 
count,
 
 for ( i = 0; i < count; ++i )
 {
-if ( (!types || (types &&
- (types[i] != XEN_DOMCTL_PFINFO_XTAB &&
-  types[i] != XEN_DOMCTL_PFINFO_BROKEN))) &&
+if ( (!types ||
+  (types && page_type_has_stream_data(types[i]) == true)) &&
  !pfn_is_populated(ctx, original_pfns[i]) )
 {
 rc = pfn_set_populated(ctx, original_pfns[i]);
@@ -233,25 +232,8 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 {
 ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]);
 
-switch ( types[i] )
-{
-case XEN_DOMCTL_PFINFO_NOTAB:
-
-case XEN_DOMCTL_PFINFO_L1TAB:
-case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-case XEN_DOMCTL_PFINFO_L2TAB:
-case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-case XEN_DOMCTL_PFINFO_L3TAB:
-case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
-case XEN_DOMCTL_PFINFO_L4TAB:
-case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB:
-
+if ( page_type_has_stream_data(types[i]) == true )
 mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]);
-break;
-}
 }
 
 /* Nothing to do? */
@@ -271,14 +253,8 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 
 for ( i = 0, j = 0; i < count; ++i )
 {
-switch ( types[i] )
-{
-case XEN_DOMCTL_PFINFO_XTAB:
-case XEN_DOMCTL_PFINFO_BROKEN:
-case XEN_DOMCTL_PFINFO_XALLOC:
-/* No page data to deal with. */
+if ( page_type_has_stream_data(types[i]) == false )
 continue;
-}
 
 if ( map_errs[j] )
 {
@@ -413,7 +389,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 goto err;
 }
 
-if ( type < XEN_DOMCTL_PFINFO_BROKEN )
+if ( page_type_has_stream_data(type) == true )
 /* NOTAB and all L1 through L4 tables (including pinned) should
  * have a page worth of data in the record. */
 pages_of_data++;
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 044d0ae3aa..0546d3d9e6 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -153,13 +153,8 @@ static int write_batch(struct xc_sr_context *ctx)
 goto err;
 }
 
-switch ( types[i] )
-{
-case XEN_DOMCTL_PFINFO_BROKEN:
-case XEN_DOMCTL_PFINFO_XALLOC:
-case XEN_DOMCTL_PFINFO_XTAB:
+if ( page_type_has_stream_data(types[i]) == false )
 continue;
-}
 
 mfns[nr_pages++] = mfns[i];
 }
@@ -177,13 +172,8 @@ static int write_batch(struct xc_sr_context *ctx)
 
 for ( i = 0, p = 0; i < nr_pfns; ++i )
 {
-switch ( types[i] )
-{
-case XEN_DOMCTL_PFINFO_BROKEN:
-case XEN_DOMCTL_PFINFO_XALLOC:
-case XEN_DOMCTL_PFINFO_XTAB:
+if ( page_type_has_stream_data(types[i]) == false )
 continue;
-}
 
 if ( errors[p] )
 {



[PATCH v1 11/23] tools/guest: save: move iov array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move iov array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h | 2 ++
 tools/libs/guest/xg_sr_save.c   | 7 ++-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 71b676c0e0..f315b4f526 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -219,6 +219,8 @@ struct xc_sr_save_arrays {
 xen_pfn_t types[MAX_BATCH_SIZE];
 /* write_batch: Errors from attempting to map the gfns. */
 int errors[MAX_BATCH_SIZE];
+/* write_batch: iovec[] for writev(). */
+struct iovec iov[MAX_BATCH_SIZE + 4];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index cdc27a9cde..385a591332 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -97,7 +97,7 @@ static int write_batch(struct xc_sr_context *ctx)
 unsigned int nr_pfns = ctx->save.nr_batch_pfns;
 void *page, *orig_page;
 uint64_t *rec_pfns = NULL;
-struct iovec *iov = NULL; int iovcnt = 0;
+struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
 struct xc_sr_rec_page_data_header hdr = { 0 };
 struct xc_sr_record rec = {
 .type = REC_TYPE_PAGE_DATA,
@@ -109,10 +109,8 @@ static int write_batch(struct xc_sr_context *ctx)
 guest_data = calloc(nr_pfns, sizeof(*guest_data));
 /* Pointers to locally allocated pages.  Need freeing. */
 local_pages = calloc(nr_pfns, sizeof(*local_pages));
-/* iovec[] for writev(). */
-iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-if ( !guest_data || !local_pages || !iov )
+if ( !guest_data || !local_pages )
 {
 ERROR("Unable to allocate arrays for a batch of %u pages",
   nr_pfns);
@@ -266,7 +264,6 @@ static int write_batch(struct xc_sr_context *ctx)
 xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped);
 for ( i = 0; local_pages && i < nr_pfns; ++i )
 free(local_pages[i]);
-free(iov);
 free(local_pages);
 free(guest_data);
 



[PATCH v1 01/23] tools: add readv_exact to libxenctrl

2020-10-29 Thread Olaf Hering
Read a batch of iovec's.

In the common case of short reads, finish individual iov's with read_exact.

Signed-off-by: Olaf Hering 
---
 tools/libs/ctrl/xc_private.c | 54 +++-
 tools/libs/ctrl/xc_private.h |  1 +
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/tools/libs/ctrl/xc_private.c b/tools/libs/ctrl/xc_private.c
index d94f846686..a86ed213a5 100644
--- a/tools/libs/ctrl/xc_private.c
+++ b/tools/libs/ctrl/xc_private.c
@@ -659,8 +659,23 @@ int write_exact(int fd, const void *data, size_t size)
 
 #if defined(__MINIOS__)
 /*
- * MiniOS's libc doesn't know about writev(). Implement it as multiple 
write()s.
+ * MiniOS's libc doesn't know about readv/writev().
+ * Implement it as multiple read/write()s.
  */
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+int rc, i;
+
+for ( i = 0; i < iovcnt; ++i )
+{
+rc = read_exact(fd, iov[i].iov_base, iov[i].iov_len);
+if ( rc )
+return rc;
+}
+
+return 0;
+}
+
 int writev_exact(int fd, const struct iovec *iov, int iovcnt)
 {
 int rc, i;
@@ -675,6 +689,44 @@ int writev_exact(int fd, const struct iovec *iov, int 
iovcnt)
 return 0;
 }
 #else
+int readv_exact(int fd, const struct iovec *iov, int iovcnt)
+{
+int rc = 0, idx = 0;
+ssize_t len;
+
+while ( idx < iovcnt )
+{
+len = readv(fd, [idx], min(iovcnt - idx, IOV_MAX));
+if ( len == -1 && errno == EINTR )
+continue;
+if ( len <= 0 )
+{
+rc = -1;
+goto out;
+}
+while ( len > 0 && idx < iovcnt )
+{
+if ( len >= iov[idx].iov_len )
+{
+len -= iov[idx].iov_len;
+}
+else
+{
+void *p = iov[idx].iov_base + len;
+size_t l = iov[idx].iov_len - len;
+
+rc = read_exact(fd, p, l);
+if ( rc )
+goto out;
+len = 0;
+}
+idx++;
+}
+}
+out:
+return rc;
+}
+
 int writev_exact(int fd, const struct iovec *iov, int iovcnt)
 {
 struct iovec *local_iov = NULL;
diff --git a/tools/libs/ctrl/xc_private.h b/tools/libs/ctrl/xc_private.h
index f0b5f83ac8..5d2c7274fb 100644
--- a/tools/libs/ctrl/xc_private.h
+++ b/tools/libs/ctrl/xc_private.h
@@ -441,6 +441,7 @@ int xc_flush_mmu_updates(xc_interface *xch, struct xc_mmu 
*mmu);
 
 /* Return 0 on success; -1 on error setting errno. */
 int read_exact(int fd, void *data, size_t size); /* EOF => -1, errno=0 */
+int readv_exact(int fd, const struct iovec *iov, int iovcnt);
 int write_exact(int fd, const void *data, size_t size);
 int writev_exact(int fd, const struct iovec *iov, int iovcnt);
 



[PATCH v1 14/23] tools/guest: save: move local_pages array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move local_pages array into preallocated space.

Adjust the code to use the src page as is in case of HVM.
In case of PV the page may need to be normalised, use an private memory
area for this purpose.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h   | 22 ++-
 tools/libs/guest/xg_sr_save.c | 25 +++--
 tools/libs/guest/xg_sr_save_x86_hvm.c |  5 +++--
 tools/libs/guest/xg_sr_save_x86_pv.c  | 31 ++-
 4 files changed, 39 insertions(+), 44 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 33e66678c6..2a020fef5c 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -33,16 +33,12 @@ struct xc_sr_save_ops
  * Optionally transform the contents of a page from being specific to the
  * sending environment, to being generic for the stream.
  *
- * The page of data at the end of 'page' may be a read-only mapping of a
- * running guest; it must not be modified.  If no transformation is
- * required, the callee should leave '*pages' untouched.
+ * The page of data '*src' may be a read-only mapping of a running guest;
+ * it must not be modified. If no transformation is required, the callee
+ * should leave '*src' untouched, and return it via '**ptr'.
  *
- * If a transformation is required, the callee should allocate themselves
- * a local page using malloc() and return it via '*page'.
- *
- * The caller shall free() '*page' in all cases.  In the case that the
- * callee encounters an error, it should *NOT* free() the memory it
- * allocated for '*page'.
+ * If a transformation is required, the callee should provide the
+ * transformed page in a private buffer and return it via '**ptr'.
  *
  * It is valid to fail with EAGAIN if the transformation is not able to be
  * completed at this point.  The page shall be retried later.
@@ -50,7 +46,7 @@ struct xc_sr_save_ops
  * @returns 0 for success, -1 for failure, with errno appropriately set.
  */
 int (*normalise_page)(struct xc_sr_context *ctx, xen_pfn_t type,
-  void **page);
+  void *src, unsigned int idx, void **ptr);
 
 /**
  * Set up local environment to save a domain. (Typically querying
@@ -371,6 +367,12 @@ struct xc_sr_context
 
 union
 {
+struct
+{
+/* Used by write_batch for modified pages. */
+void *normalised_pages;
+} save;
+
 struct
 {
 /* State machine for the order of received records. */
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 658f834ae8..804e4ccb3a 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -91,11 +91,10 @@ static int write_batch(struct xc_sr_context *ctx)
 xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types;
 void *guest_mapping = NULL;
 void **guest_data = ctx->save.m->guest_data;
-void **local_pages = NULL;
 int *errors = ctx->save.m->errors, rc = -1;
 unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
 unsigned int nr_pfns = ctx->save.nr_batch_pfns;
-void *page, *orig_page;
+void *src;
 uint64_t *rec_pfns = ctx->save.m->rec_pfns;
 struct iovec *iov = ctx->save.m->iov; int iovcnt = 0;
 struct xc_sr_rec_page_data_header hdr = { 0 };
@@ -105,16 +104,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
 assert(nr_pfns != 0);
 
-/* Pointers to locally allocated pages.  Need freeing. */
-local_pages = calloc(nr_pfns, sizeof(*local_pages));
-
-if ( !local_pages )
-{
-ERROR("Unable to allocate arrays for a batch of %u pages",
-  nr_pfns);
-goto err;
-}
-
 for ( i = 0; i < nr_pfns; ++i )
 {
 types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx,
@@ -176,11 +165,8 @@ static int write_batch(struct xc_sr_context *ctx)
 goto err;
 }
 
-orig_page = page = guest_mapping + (p * PAGE_SIZE);
-rc = ctx->save.ops.normalise_page(ctx, types[i], );
-
-if ( orig_page != page )
-local_pages[i] = page;
+src = guest_mapping + (p * PAGE_SIZE);
+rc = ctx->save.ops.normalise_page(ctx, types[i], src, i, 
_data[i]);
 
 if ( rc )
 {
@@ -195,8 +181,6 @@ static int write_batch(struct xc_sr_context *ctx)
 else
 goto err;
 }
-else
-guest_data[i] = page;
 
 rc = -1;
 ++p;
@@ -255,9 +239,6 @@ static int write_batch(struct xc_sr_context *ctx)
  err:
 if ( guest_mapping )
 

[PATCH v1 10/23] tools/guest: save: move errors array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move errors array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h | 2 ++
 tools/libs/guest/xg_sr_save.c   | 7 ++-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 3cbadb607b..71b676c0e0 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -217,6 +217,8 @@ struct xc_sr_save_arrays {
 xen_pfn_t mfns[MAX_BATCH_SIZE];
 /* write_batch: Types of the batch pfns. */
 xen_pfn_t types[MAX_BATCH_SIZE];
+/* write_batch: Errors from attempting to map the gfns. */
+int errors[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_restore_arrays {
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 6678df95b8..cdc27a9cde 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -92,7 +92,7 @@ static int write_batch(struct xc_sr_context *ctx)
 void *guest_mapping = NULL;
 void **guest_data = NULL;
 void **local_pages = NULL;
-int *errors = NULL, rc = -1;
+int *errors = ctx->save.m->errors, rc = -1;
 unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0;
 unsigned int nr_pfns = ctx->save.nr_batch_pfns;
 void *page, *orig_page;
@@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx)
 
 assert(nr_pfns != 0);
 
-/* Errors from attempting to map the gfns. */
-errors = malloc(nr_pfns * sizeof(*errors));
 /* Pointers to page data to send.  Mapped gfns or local allocations. */
 guest_data = calloc(nr_pfns, sizeof(*guest_data));
 /* Pointers to locally allocated pages.  Need freeing. */
@@ -114,7 +112,7 @@ static int write_batch(struct xc_sr_context *ctx)
 /* iovec[] for writev(). */
 iov = malloc((nr_pfns + 4) * sizeof(*iov));
 
-if ( !errors || !guest_data || !local_pages || !iov )
+if ( !guest_data || !local_pages || !iov )
 {
 ERROR("Unable to allocate arrays for a batch of %u pages",
   nr_pfns);
@@ -271,7 +269,6 @@ static int write_batch(struct xc_sr_context *ctx)
 free(iov);
 free(local_pages);
 free(guest_data);
-free(errors);
 
 return rc;
 }



[PATCH v1 18/23] tools/guest: restore: move map_errs array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move map_errs array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  |  1 +
 tools/libs/guest/xg_sr_restore.c | 12 +---
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 5731a5c186..eba3a49877 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -229,6 +229,7 @@ struct xc_sr_restore_arrays {
 uint32_t types[MAX_BATCH_SIZE];
 /* process_page_data */
 xen_pfn_t mfns[MAX_BATCH_SIZE];
+int map_errs[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 3ba089f862..94c329032f 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -206,21 +206,13 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 {
 xc_interface *xch = ctx->xch;
 xen_pfn_t *mfns = ctx->restore.m->mfns;
-int *map_errs = malloc(count * sizeof(*map_errs));
+int *map_errs = ctx->restore.m->map_errs;
 int rc;
 void *mapping = NULL, *guest_page = NULL;
 unsigned int i, /* i indexes the pfns from the record. */
 j,  /* j indexes the subset of pfns we decide to map. */
 nr_pages = 0;
 
-if ( !map_errs )
-{
-rc = -1;
-ERROR("Failed to allocate %zu bytes to process page data",
-  count * (sizeof(*mfns) + sizeof(*map_errs)));
-goto err;
-}
-
 rc = populate_pfns(ctx, count, pfns, types);
 if ( rc )
 {
@@ -298,8 +290,6 @@ static int process_page_data(struct xc_sr_context *ctx, 
unsigned int count,
 if ( mapping )
 xenforeignmemory_unmap(xch->fmem, mapping, nr_pages);
 
-free(map_errs);
-
 return rc;
 }
 



[PATCH v1 05/23] tools: show migration transfer rate in send_dirty_pages

2020-10-29 Thread Olaf Hering
Show how fast domU pages are transferred in each iteration.

The relevant data is how fast the pfns travel, not so much how much
protocol overhead exists. So the reported MiB/sec is just for pfns.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h |  2 ++
 tools/libs/guest/xg_sr_save.c   | 47 +
 2 files changed, 49 insertions(+)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 70e328e951..f3a7a29298 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -238,6 +238,8 @@ struct xc_sr_context
 bool debug;
 
 unsigned long p2m_size;
+size_t pages_sent;
+size_t overhead_sent;
 
 struct precopy_stats stats;
 
diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c
index 0546d3d9e6..f58a35ddde 100644
--- a/tools/libs/guest/xg_sr_save.c
+++ b/tools/libs/guest/xg_sr_save.c
@@ -1,5 +1,6 @@
 #include 
 #include 
+#include 
 
 #include "xg_sr_common.h"
 
@@ -238,6 +239,8 @@ static int write_batch(struct xc_sr_context *ctx)
 iov[3].iov_len = nr_pfns * sizeof(*rec_pfns);
 
 iovcnt = 4;
+ctx->save.pages_sent += nr_pages;
+ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) + nr_pfns * 
sizeof(*rec_pfns);
 
 if ( nr_pages )
 {
@@ -357,6 +360,43 @@ static int suspend_domain(struct xc_sr_context *ctx)
 return 0;
 }
 
+static void show_transfer_rate(struct xc_sr_context *ctx, struct timespec 
*start)
+{
+xc_interface *xch = ctx->xch;
+struct timespec end = {}, diff = {};
+size_t ms, MiB_sec = ctx->save.pages_sent * PAGE_SIZE;
+
+if (!MiB_sec)
+return;
+
+if ( clock_gettime(CLOCK_MONOTONIC, ) )
+PERROR("clock_gettime");
+
+if ( (end.tv_nsec - start->tv_nsec) < 0 )
+{
+diff.tv_sec = end.tv_sec - start->tv_sec - 1;
+diff.tv_nsec = end.tv_nsec - start->tv_nsec + (1000U*1000U*1000U);
+}
+else
+{
+diff.tv_sec = end.tv_sec - start->tv_sec;
+diff.tv_nsec = end.tv_nsec - start->tv_nsec;
+}
+
+ms = (diff.tv_nsec / (1000U*1000U));
+if (!ms)
+ms = 1;
+ms += (diff.tv_sec * 1000U);
+
+MiB_sec *= 1000U;
+MiB_sec /= ms;
+MiB_sec /= 1024U*1024U;
+
+errno = 0;
+ERROR("%s: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", __func__,
+  ctx->save.overhead_sent, ctx->save.pages_sent, diff.tv_sec, 
diff.tv_nsec, MiB_sec);
+}
+
 /*
  * Send a subset of pages in the guests p2m, according to the dirty bitmap.
  * Used for each subsequent iteration of the live migration loop.
@@ -370,9 +410,15 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
 xen_pfn_t p;
 unsigned long written;
 int rc;
+struct timespec start = {};
 DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
 >save.dirty_bitmap_hbuf);
 
+ctx->save.pages_sent = 0;
+ctx->save.overhead_sent = 0;
+if ( clock_gettime(CLOCK_MONOTONIC, ) )
+PERROR("clock_gettime");
+
 for ( p = 0, written = 0; p < ctx->save.p2m_size; ++p )
 {
 if ( !test_bit(p, dirty_bitmap) )
@@ -396,6 +442,7 @@ static int send_dirty_pages(struct xc_sr_context *ctx,
 if ( written > entries )
 DPRINTF("Bitmap contained more entries than expected...");
 
+show_transfer_rate(ctx, );
 xc_report_progress_step(xch, entries, entries);
 
 return ctx->save.ops.check_vm_state(ctx);



[PATCH v1 15/23] tools/guest: restore: move pfns array

2020-10-29 Thread Olaf Hering
Remove allocation from hotpath, move pfns array into preallocated space.

Signed-off-by: Olaf Hering 
---
 tools/libs/guest/xg_sr_common.h  | 2 ++
 tools/libs/guest/xg_sr_restore.c | 6 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h
index 2a020fef5c..516d9b9fb5 100644
--- a/tools/libs/guest/xg_sr_common.h
+++ b/tools/libs/guest/xg_sr_common.h
@@ -224,6 +224,8 @@ struct xc_sr_save_arrays {
 };
 
 struct xc_sr_restore_arrays {
+/* handle_page_data */
+xen_pfn_t pfns[MAX_BATCH_SIZE];
 };
 
 struct xc_sr_context
diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c
index 4a9ece9681..7d1447e86c 100644
--- a/tools/libs/guest/xg_sr_restore.c
+++ b/tools/libs/guest/xg_sr_restore.c
@@ -315,7 +315,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 unsigned int i, pages_of_data = 0;
 int rc = -1;
 
-xen_pfn_t *pfns = NULL, pfn;
+xen_pfn_t *pfns = ctx->restore.m->pfns, pfn;
 uint32_t *types = NULL, type;
 
 /*
@@ -363,9 +363,8 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 goto err;
 }
 
-pfns = malloc(pages->count * sizeof(*pfns));
 types = malloc(pages->count * sizeof(*types));
-if ( !pfns || !types )
+if ( !types )
 {
 ERROR("Unable to allocate enough memory for %u pfns",
   pages->count);
@@ -412,7 +411,6 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
>pfn[pages->count]);
  err:
 free(types);
-free(pfns);
 
 return rc;
 }



[PATCH v1 02/23] tools: add xc_is_known_page_type to libxenctrl

2020-10-29 Thread Olaf Hering
Users of xc_get_pfn_type_batch may want to sanity check the data
returned by Xen. Add a simple helper for this purpose.

Signed-off-by: Olaf Hering 
---
 tools/libs/ctrl/xc_private.h | 33 +
 1 file changed, 33 insertions(+)

diff --git a/tools/libs/ctrl/xc_private.h b/tools/libs/ctrl/xc_private.h
index 5d2c7274fb..afb08aafe1 100644
--- a/tools/libs/ctrl/xc_private.h
+++ b/tools/libs/ctrl/xc_private.h
@@ -421,6 +421,39 @@ void *xc_map_foreign_ranges(xc_interface *xch, uint32_t 
dom,
 int xc_get_pfn_type_batch(xc_interface *xch, uint32_t dom,
   unsigned int num, xen_pfn_t *);
 
+/* Sanitiy check for types returned by Xen */
+static inline bool xc_is_known_page_type(xen_pfn_t type)
+{
+bool ret;
+
+switch (type)
+{
+case XEN_DOMCTL_PFINFO_NOTAB:
+
+case XEN_DOMCTL_PFINFO_L1TAB:
+case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+case XEN_DOMCTL_PFINFO_L2TAB:
+case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+case XEN_DOMCTL_PFINFO_L3TAB:
+case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+case XEN_DOMCTL_PFINFO_L4TAB:
+case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB:
+
+case XEN_DOMCTL_PFINFO_XTAB:
+case XEN_DOMCTL_PFINFO_XALLOC:
+case XEN_DOMCTL_PFINFO_BROKEN:
+ret = true;
+break;
+default:
+ret = false;
+break;
+}
+return ret;
+}
+
 void bitmap_64_to_byte(uint8_t *bp, const uint64_t *lp, int nbits);
 void bitmap_byte_to_64(uint64_t *lp, const uint8_t *bp, int nbits);
 



Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Arvind Sankar
On Thu, Oct 29, 2020 at 03:05:31PM +, David Laight wrote:
> From: Arnd Bergmann
> > Sent: 28 October 2020 21:21
> > 
> > From: Arnd Bergmann 
> > 
> > There are hundreds of warnings in a W=2 build about a local
> > variable shadowing the global 'apic' definition:
> > 
> > arch/x86/kvm/lapic.h:149:65: warning: declaration of 'apic' shadows a 
> > global declaration [-Wshadow]
> > 
> > Avoid this by renaming the global 'apic' variable to the more descriptive
> > 'x86_system_apic'. It was originally called 'genapic', but both that
> > and the current 'apic' seem to be a little overly generic for a global
> > variable.
> > 
> > Fixes: c48f14966cc4 ("KVM: inline kvm_apic_present() and 
> > kvm_lapic_enabled()")
> > Fixes: c8d46cf06dc2 ("x86: rename 'genapic' to 'apic'")
> > Signed-off-by: Arnd Bergmann 
> > ---
> > v2: rename the global instead of the local variable in the header
> ...
> > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> > index 284e73661a18..33e2dc78ca11 100644
> > --- a/arch/x86/hyperv/hv_apic.c
> > +++ b/arch/x86/hyperv/hv_apic.c
> > @@ -259,14 +259,14 @@ void __init hv_apic_init(void)
> > /*
> >  * Set the IPI entry points.
> >  */
> > -   orig_apic = *apic;
> > -
> > -   apic->send_IPI = hv_send_ipi;
> > -   apic->send_IPI_mask = hv_send_ipi_mask;
> > -   apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself;
> > -   apic->send_IPI_allbutself = hv_send_ipi_allbutself;
> > -   apic->send_IPI_all = hv_send_ipi_all;
> > -   apic->send_IPI_self = hv_send_ipi_self;
> > +   orig_apic = *x86_system_apic;
> > +
> > +   x86_system_apic->send_IPI = hv_send_ipi;
> > +   x86_system_apic->send_IPI_mask = hv_send_ipi_mask;
> > +   x86_system_apic->send_IPI_mask_allbutself = 
> > hv_send_ipi_mask_allbutself;
> > +   x86_system_apic->send_IPI_allbutself = hv_send_ipi_allbutself;
> > +   x86_system_apic->send_IPI_all = hv_send_ipi_all;
> > +   x86_system_apic->send_IPI_self = hv_send_ipi_self;
> > }
> > 
> > if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) {
> > @@ -285,10 +285,10 @@ void __init hv_apic_init(void)
> >  */
> > apic_set_eoi_write(hv_apic_eoi_write);
> > if (!x2apic_enabled()) {
> > -   apic->read  = hv_apic_read;
> > -   apic->write = hv_apic_write;
> > -   apic->icr_write = hv_apic_icr_write;
> > -   apic->icr_read  = hv_apic_icr_read;
> > +   x86_system_apic->read  = hv_apic_read;
> > +   x86_system_apic->write = hv_apic_write;
> > +   x86_system_apic->icr_write = hv_apic_icr_write;
> > +   x86_system_apic->icr_read  = hv_apic_icr_read;
> > }
> 
> For those two just add:
>   struct apic *apic = x86_system_apic;
> before all the assignments.
> Less churn and much better code.
> 

Why would it be better code?



Re: [PATCH v1 4/4] xen/pci: solve compilation error when memory paging is not enabled.

2020-10-29 Thread Jan Beulich
On 29.10.2020 17:58, Rahul Singh wrote:
>> On 28 Oct 2020, at 3:13 pm, Rahul Singh  wrote:
>>> On 28 Oct 2020, at 11:56 am, Jan Beulich  wrote:
>>> On 26.10.2020 18:17, Rahul Singh wrote:
 --- a/xen/drivers/passthrough/pci.c
 +++ b/xen/drivers/passthrough/pci.c
 @@ -1419,13 +1419,15 @@ static int assign_device(struct domain *d, u16 
 seg, u8 bus, u8 devfn, u32 flag)
if ( !is_iommu_enabled(d) )
return 0;

 -/* Prevent device assign if mem paging or mem sharing have been 
 +#if defined(CONFIG_HAS_MEM_PAGING) || defined(CONFIG_MEM_SHARING)
 +/* Prevent device assign if mem paging or mem sharing have been
 * enabled for this domain */
if ( d != dom_io &&
 unlikely(mem_sharing_enabled(d) ||
  vm_event_check_ring(d->vm_event_paging) ||
  p2m_get_hostp2m(d)->global_logdirty) )
return -EXDEV;
 +#endif
>>>
>>> Besides this also disabling mem-sharing and log-dirty related
>>> logic, I don't think the change is correct: Each item being
>>> checked needs individually disabling depending on its associated
>>> CONFIG_*. For this, perhaps you want to introduce something
>>> like mem_paging_enabled(d), to avoid the need for #ifdef here?
>>>
>>
>> Ok I will fix that in next version. 
> 
> I just check and found out that mem-sharing , men-paging and log-dirty is x86 
> specific and not implemented for ARM.
> Is that will be ok if I move above code to x86 specific directory and 
> introduce new function arch_pcidev_is_assignable() that will test if pcidev 
> is assignable or not ?

As an immediate workaround - perhaps (long term the individual
pieces should still be individually checked here, as they're
not inherently x86-specific). Since there's no device property
involved here, the suggested name looks misleading. Perhaps
arch_iommu_usable(d)?

Jan



[PATCH 2/2] x86: PV shim doesn't need GRANT_TABLE

2020-10-29 Thread Jan Beulich
The only reference into the code controlled by this option is from the
hypercall table, and that hypercall table entry gets overwritten.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/configs/pvshim_defconfig
+++ b/xen/arch/x86/configs/pvshim_defconfig
@@ -9,6 +9,7 @@ CONFIG_EXPERT=y
 CONFIG_SCHED_NULL=y
 # Disable features not used by the PV shim
 # CONFIG_XEN_SHSTK is not set
+# CONFIG_GRANT_TABLE is not set
 # CONFIG_HYPFS is not set
 # CONFIG_BIGMEM is not set
 # CONFIG_KEXEC is not set




[PATCH 1/2] x86: fix build of PV shim when !GRANT_TABLE

2020-10-29 Thread Jan Beulich
To do its compat translation, shim code needs to include the compat
header. For this to work, this header first of all needs to be
generated.

Reported-by: Andrew Cooper 
Signed-off-by: Jan Beulich 

--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -24,6 +24,7 @@ headers-$(CONFIG_X86) += compat/arch
 headers-$(CONFIG_ARGO)+= compat/argo.h
 headers-$(CONFIG_PV)  += compat/callback.h
 headers-$(CONFIG_GRANT_TABLE) += compat/grant_table.h
+headers-$(CONFIG_PV_SHIM) += compat/grant_table.h
 headers-$(CONFIG_HVM) += compat/hvm/dm_op.h
 headers-$(CONFIG_HVM) += compat/hvm/hvm_op.h
 headers-$(CONFIG_HVM) += compat/hvm/hvm_vcpu.h




[PATCH 0/2] x86: PV shim vs grant table

2020-10-29 Thread Jan Beulich
While Andrew reported a randconfig build failure, I started wondering
why we'd ever build in a way different from what had failed to build.
Fix the build and then switch the shim config file accordingly.

1: fix build of PV shim when !GRANT_TABLE
2: PV shim doesn't need GRANT_TABLE

Jan



Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Paolo Bonzini
On 29/10/20 17:56, Arvind Sankar wrote:
>> For those two just add:
>>  struct apic *apic = x86_system_apic;
>> before all the assignments.
>> Less churn and much better code.
>>
> Why would it be better code?
> 

I think he means the compiler produces better code, because it won't
read the global variable repeatedly.  Not sure if that's true,(*) but I
think I do prefer that version if Arnd wants to do that tweak.

Paolo

(*) if it is, it might also be due to Linux using -fno-strict-aliasing




Re: [PATCH v1 4/4] xen/pci: solve compilation error when memory paging is not enabled.

2020-10-29 Thread Rahul Singh
Hello Jan,

> On 28 Oct 2020, at 3:13 pm, Rahul Singh  wrote:
> 
> Hello Jan,
> 
>> On 28 Oct 2020, at 11:56 am, Jan Beulich  wrote:
>> 
>> On 26.10.2020 18:17, Rahul Singh wrote:
>>> --- a/xen/drivers/passthrough/pci.c
>>> +++ b/xen/drivers/passthrough/pci.c
>>> @@ -1419,13 +1419,15 @@ static int assign_device(struct domain *d, u16 seg, 
>>> u8 bus, u8 devfn, u32 flag)
>>>if ( !is_iommu_enabled(d) )
>>>return 0;
>>> 
>>> -/* Prevent device assign if mem paging or mem sharing have been 
>>> +#if defined(CONFIG_HAS_MEM_PAGING) || defined(CONFIG_MEM_SHARING)
>>> +/* Prevent device assign if mem paging or mem sharing have been
>>> * enabled for this domain */
>>>if ( d != dom_io &&
>>> unlikely(mem_sharing_enabled(d) ||
>>>  vm_event_check_ring(d->vm_event_paging) ||
>>>  p2m_get_hostp2m(d)->global_logdirty) )
>>>return -EXDEV;
>>> +#endif
>> 
>> Besides this also disabling mem-sharing and log-dirty related
>> logic, I don't think the change is correct: Each item being
>> checked needs individually disabling depending on its associated
>> CONFIG_*. For this, perhaps you want to introduce something
>> like mem_paging_enabled(d), to avoid the need for #ifdef here?
>> 
> 
> Ok I will fix that in next version. 

I just check and found out that mem-sharing , men-paging and log-dirty is x86 
specific and not implemented for ARM.
Is that will be ok if I move above code to x86 specific directory and introduce 
new function arch_pcidev_is_assignable() that will test if pcidev is assignable 
or not ?

> 
>> Jan

Regards,
Rahul




[xen-unstable-smoke test] 156297: tolerable all pass - PUSHED

2020-10-29 Thread osstest service owner
flight 156297 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156297/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  15 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  16 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  15 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  16 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74
baseline version:
 xen  16a20963b3209788f2c0d3a3eebb7d92f03f5883

Last test of basis   156260  2020-10-27 18:01:29 Z1 days
Testing same since   156297  2020-10-29 14:01:23 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   16a20963b3..1fd1d4bafd  1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74 -> smoke



Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread Thomas Gleixner
Arnd,

On Thu, Oct 29 2020 at 10:51, Arnd Bergmann wrote:
> On Thu, Oct 29, 2020 at 8:04 AM Paolo Bonzini  wrote:
>> On 28/10/20 22:20, Arnd Bergmann wrote:
>> > Avoid this by renaming the global 'apic' variable to the more descriptive
>> > 'x86_system_apic'. It was originally called 'genapic', but both that
>> > and the current 'apic' seem to be a little overly generic for a global
>> > variable.
>>
>> The 'apic' affects only the current CPU, so one of 'x86_local_apic',
>> 'x86_lapic' or 'x86_apic' is probably preferrable.
>
> Ok, I'll change it to x86_local_apic then, unless someone else has
> a preference between them.

x86_local_apic is fine with me.

> I think ideally there would be no global variable, withall accesses
> encapsulated in function calls, possibly using static_call() optimizations
> if any of them are performance critical.

There are a bunch, yes.

> It doesn't seem hard to do, but I'd rather leave that change to
> an x86 person ;-)

It's not hard, but I'm not really sure whether it buys much.

Can you please redo that against tip x86/apic. Much of what you are
touching got a major overhaul.

Thanks,

tglx



[PATCH] tools: add noidentpt domain config option

2020-10-29 Thread Tamas K Lengyel
The Identity Pagetable is currently being created for all HVM VMs during setup.
This was only necessary for running HVM guests on Intel CPUs where EPT was
available but unrestricted guest mode was not.

Add option to skip the creation of the Identity Pagetable via the "noidentpt"
xl config option. This allows a system administrator to skip this step when
the hardware is known to have the required features.

Signed-off-by: Tamas K Lengyel 
---
Further context: while running VM forks a significant bottle-neck was
identified when HVM_PARAM_IDENT_PT is getting copied from the parent VM. This
is due to the fact that setting this parameter requires obtaining a Xen-wide
lock (domctl_lock_acquire). When running several VM forks in parallel during
fuzzing the fork reset hypercall can fail due to the lock being taken by
another fork that's being reset at the same time. This whole situation can be
avoided if there is no Identity-map pagetable to begin with as on modern CPUs
this identity-map pagetable is not actually required.
---
 docs/man/xl.cfg.5.pod.in |  5 +
 tools/include/xenguest.h |  1 +
 tools/libs/guest/xg_dom_x86.c| 31 +--
 tools/libs/light/libxl_create.c  |  2 ++
 tools/libs/light/libxl_dom.c |  2 ++
 tools/libs/light/libxl_types.idl |  1 +
 tools/xl/xl_parse.c  |  2 ++
 7 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 0532739c1f..4d992fe346 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -587,6 +587,11 @@ which are incompatible with migration. Currently this is 
limited to
 enabling the invariant TSC feature flag in CPUID results when TSC is
 not emulated.
 
+=item B
+
+Disable the creation of the Identity-map Pagetable that was required to run HVM
+guests on Intel CPUs with EPT where no unrestricted guest mode was available.
+
 =item B
 
 Specify that this domain is a driver domain. This enables certain
diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h
index a9984dbea5..998a8c57ba 100644
--- a/tools/include/xenguest.h
+++ b/tools/include/xenguest.h
@@ -26,6 +26,7 @@
 
 #define XCFLAGS_LIVE  (1 << 0)
 #define XCFLAGS_DEBUG (1 << 1)
+#define XCFLAGS_NOIDENTPT (1 << 2)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index 2953aeb90b..827bea7eb7 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -718,20 +718,23 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom)
 goto out;
 }
 
-/*
- * Identity-map page table is required for running with CR0.PG=0 when
- * using Intel EPT. Create a 32-bit non-PAE page directory of superpages.
- */
-if ( (ident_pt = xc_map_foreign_range(
-  xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
-  special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
-goto error_out;
-for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
-ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
-   _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
-munmap(ident_pt, PAGE_SIZE);
-xc_hvm_param_set(xch, domid, HVM_PARAM_IDENT_PT,
- special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT);
+if ( !(dom->flags & XCFLAGS_NOIDENTPT) )
+{
+/*
+ * Identity-map page table is required for running with CR0.PG=0 when
+ * using Intel EPT. Create a 32-bit non-PAE page directory of 
superpages.
+ */
+if ( (ident_pt = xc_map_foreign_range(
+  xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE,
+  special_pfn(SPECIALPAGE_IDENT_PT))) == NULL )
+goto error_out;
+for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ )
+ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
+   _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
+munmap(ident_pt, PAGE_SIZE);
+xc_hvm_param_set(xch, domid, HVM_PARAM_IDENT_PT,
+ special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT);
+}
 
 dom->console_pfn = special_pfn(SPECIALPAGE_CONSOLE);
 xc_clear_domain_page(dom->xch, dom->guest_domid, dom->console_pfn);
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 321a13e519..62b06b359c 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -256,6 +256,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 
 libxl_defbool_setdefault(_info->disable_migrate, false);
 
+libxl_defbool_setdefault(_info->disable_identpt, false);
+
 for (i = 0 ; i < b_info->num_iomem; i++)
 if (b_info->iomem[i].gfn == LIBXL_INVALID_GFN)
 b_info->iomem[i].gfn = b_info->iomem[i].start;
diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c
index 

Re: [PATCH] x86/hvm: process softirq while saving/loading entries

2020-10-29 Thread Jan Beulich
On 29.10.2020 16:20, Roger Pau Monne wrote:
> On slow systems with sync_console saving or loading the context of big
> guests can cause the watchdog to trigger. Fix this by adding a couple
> of process_pending_softirqs.

Which raises the question in how far this is then also a problem
for the caller of the underlying hypercall. IOW I wonder whether
instead we need to make use of continuations here. Nevertheless
...

> Signed-off-by: Roger Pau Monné 

Acked-by: Jan Beulich 

Jan



RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread David Laight
From: Arnd Bergmann
> Sent: 29 October 2020 09:51
...
> I think ideally there would be no global variable, withall accesses
> encapsulated in function calls, possibly using static_call() optimizations
> if any of them are performance critical.

There isn't really a massive difference between global variables
and global access functions.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


[PATCH v1] xl: fix description of migrate --debug

2020-10-29 Thread Olaf Hering
xl migrate --debug used to track every pfn in every batch of pages.
But these times are gone. Adjust the help text to tell what --debug
is supposed to do today.

Signed-off-by: Olaf Hering 
---
 docs/man/xl.1.pod.in   | 4 +++-
 tools/xl/xl_cmdtable.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in
index 4bde0672fa..d0f50f0b4a 100644
--- a/docs/man/xl.1.pod.in
+++ b/docs/man/xl.1.pod.in
@@ -488,7 +488,9 @@ Include timestamps in output.
 
 =item B<--debug>
 
-Display huge (!) amount of debug information during the migration process.
+Verify transferred domU page data. All memory will be transferred one more
+time to the destination host while the domU is paused, and compare with
+the result of the inital transfer while the domU was still running.
 
 =item B<-p>
 
diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c
index fd2dc0aef2..af160dde42 100644
--- a/tools/xl/xl_cmdtable.c
+++ b/tools/xl/xl_cmdtable.c
@@ -168,7 +168,7 @@ struct cmd_spec cmd_table[] = {
   "-e  Do not wait in the background (on ) for the 
death\n"
   "of the domain.\n"
   "-T  Show timestamps during the migration process.\n"
-  "--debug Print huge (!) amount of debug during the migration 
process.\n"
+  "--debug Verify transferred domU page data.\n"
   "-p  Do not unpause domain after migrating it.\n"
   "-D  Preserve the domain id"
 },



Re: [PATCH 2/2] tools/libs: fix uninstall rule for header files

2020-10-29 Thread Bertrand Marquis



> On 19 Oct 2020, at 08:21, Jan Beulich  wrote:
> 
> This again was working right only as long as $(LIBHEADER) consisted of
> just one entry.
> 
> Signed-off-by: Jan Beulich 
Reviewed-by: Bertrand Marquis 

The change is obviously fixing a bug :-) and the double $ is required to 
protect from make.

Cheers
Bertrand


> ---
> An alternative would be to use $(addprefix ) without any shell loop.
> 
> --- a/tools/libs/libs.mk
> +++ b/tools/libs/libs.mk
> @@ -107,7 +107,7 @@ install: build
> .PHONY: uninstall
> uninstall:
>   rm -f $(DESTDIR)$(PKG_INSTALLDIR)/$(LIB_FILE_NAME).pc
> - for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$(LIBHEADER); 
> done
> + for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$$i; done
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR)
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR).$(MINOR)
> 
> 




[PATCH] x86/hvm: process softirq while saving/loading entries

2020-10-29 Thread Roger Pau Monne
On slow systems with sync_console saving or loading the context of big
guests can cause the watchdog to trigger. Fix this by adding a couple
of process_pending_softirqs.

Signed-off-by: Roger Pau Monné 
---
 xen/arch/x86/hvm/save.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c
index a2c56fbc1e..584620985b 100644
--- a/xen/arch/x86/hvm/save.c
+++ b/xen/arch/x86/hvm/save.c
@@ -21,6 +21,7 @@
  */
 
 #include 
+#include 
 #include 
 
 #include 
@@ -255,6 +256,7 @@ int hvm_save(struct domain *d, hvm_domain_context_t *h)
v, i);
 return -ENODATA;
 }
+process_pending_softirqs();
 }
 }
 else
@@ -268,6 +270,7 @@ int hvm_save(struct domain *d, hvm_domain_context_t *h)
d->domain_id, i);
 return -ENODATA;
 }
+process_pending_softirqs();
 }
 }
 
@@ -341,6 +344,7 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h)
d->domain_id, desc->typecode, desc->instance);
 return -1;
 }
+process_pending_softirqs();
 }
 
 /* Not reached */
-- 
2.29.0




[PATCH] x86/HVM: send mapcache invalidation request to qemu regardless of preemption

2020-10-29 Thread Jan Beulich
Even if only part of a hypercall completed before getting preempted,
invalidation ought to occur. Therefore fold the two return statements.

Signed-off-by: Jan Beulich 
---
Split off from "x86/HVM: refine when to send mapcache invalidation
request to qemu".

--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -326,14 +326,11 @@ int hvm_hypercall(struct cpu_user_regs *
 
 HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu -> %lx", eax, regs->rax);
 
-if ( curr->hcall_preempted )
-return HVM_HCALL_preempted;
-
 if ( unlikely(currd->arch.hvm.qemu_mapcache_invalidate) &&
  test_and_clear_bool(currd->arch.hvm.qemu_mapcache_invalidate) )
 send_invalidate_req();
 
-return HVM_HCALL_completed;
+return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed;
 }
 
 /*



RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header

2020-10-29 Thread David Laight
From: Arnd Bergmann
> Sent: 28 October 2020 21:21
> 
> From: Arnd Bergmann 
> 
> There are hundreds of warnings in a W=2 build about a local
> variable shadowing the global 'apic' definition:
> 
> arch/x86/kvm/lapic.h:149:65: warning: declaration of 'apic' shadows a global 
> declaration [-Wshadow]
> 
> Avoid this by renaming the global 'apic' variable to the more descriptive
> 'x86_system_apic'. It was originally called 'genapic', but both that
> and the current 'apic' seem to be a little overly generic for a global
> variable.
> 
> Fixes: c48f14966cc4 ("KVM: inline kvm_apic_present() and kvm_lapic_enabled()")
> Fixes: c8d46cf06dc2 ("x86: rename 'genapic' to 'apic'")
> Signed-off-by: Arnd Bergmann 
> ---
> v2: rename the global instead of the local variable in the header
...
> diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> index 284e73661a18..33e2dc78ca11 100644
> --- a/arch/x86/hyperv/hv_apic.c
> +++ b/arch/x86/hyperv/hv_apic.c
> @@ -259,14 +259,14 @@ void __init hv_apic_init(void)
>   /*
>* Set the IPI entry points.
>*/
> - orig_apic = *apic;
> -
> - apic->send_IPI = hv_send_ipi;
> - apic->send_IPI_mask = hv_send_ipi_mask;
> - apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself;
> - apic->send_IPI_allbutself = hv_send_ipi_allbutself;
> - apic->send_IPI_all = hv_send_ipi_all;
> - apic->send_IPI_self = hv_send_ipi_self;
> + orig_apic = *x86_system_apic;
> +
> + x86_system_apic->send_IPI = hv_send_ipi;
> + x86_system_apic->send_IPI_mask = hv_send_ipi_mask;
> + x86_system_apic->send_IPI_mask_allbutself = 
> hv_send_ipi_mask_allbutself;
> + x86_system_apic->send_IPI_allbutself = hv_send_ipi_allbutself;
> + x86_system_apic->send_IPI_all = hv_send_ipi_all;
> + x86_system_apic->send_IPI_self = hv_send_ipi_self;
>   }
> 
>   if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) {
> @@ -285,10 +285,10 @@ void __init hv_apic_init(void)
>*/
>   apic_set_eoi_write(hv_apic_eoi_write);
>   if (!x2apic_enabled()) {
> - apic->read  = hv_apic_read;
> - apic->write = hv_apic_write;
> - apic->icr_write = hv_apic_icr_write;
> - apic->icr_read  = hv_apic_icr_read;
> + x86_system_apic->read  = hv_apic_read;
> + x86_system_apic->write = hv_apic_write;
> + x86_system_apic->icr_write = hv_apic_icr_write;
> + x86_system_apic->icr_read  = hv_apic_icr_read;
>   }

For those two just add:
struct apic *apic = x86_system_apic;
before all the assignments.
Less churn and much better code.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)




[PATCH] x86emul: support AVX-VNNI

2020-10-29 Thread Jan Beulich
These are VEX-encoded equivalents of the EVEX-encoded AVX512-VNNI ISA
extension.

Signed-off-by: Jan Beulich 
---
SDE: -spr

--- a/tools/libs/light/libxl_cpuid.c
+++ b/tools/libs/light/libxl_cpuid.c
@@ -226,6 +226,7 @@ int libxl_cpuid_parse_config(libxl_cpuid
 {"core-caps",0x0007,  0, CPUID_REG_EDX, 30,  1},
 {"ssbd", 0x0007,  0, CPUID_REG_EDX, 31,  1},
 
+{"avx-vnni", 0x0007,  1, CPUID_REG_EAX,  4,  1},
 {"avx512-bf16",  0x0007,  1, CPUID_REG_EAX,  5,  1},
 
 {"lahfsahf", 0x8001, NA, CPUID_REG_ECX,  0,  1},
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -175,7 +175,7 @@ static const char *const str_7d0[32] =
 
 static const char *const str_7a1[32] =
 {
-/* 4 */ [ 5] = "avx512-bf16",
+[ 4] = "avx-vnni",  [ 5] = "avx512-bf16",
 };
 
 static const struct {
--- a/tools/tests/x86_emulator/predicates.c
+++ b/tools/tests/x86_emulator/predicates.c
@@ -1335,6 +1335,10 @@ static const struct vex {
 { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */
 { { 0x46 }, 2, T, R, pfx_66, W0, Ln }, /* vpsravd */
 { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */
+{ { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */
+{ { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */
+{ { 0x52 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssd */
+{ { 0x53 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssds */
 { { 0x58 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastd */
 { { 0x59 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastq */
 { { 0x5a }, 2, F, R, pfx_66, W0, L1 }, /* vbroadcasti128 */
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -5028,6 +5028,61 @@ int main(int argc, char **argv)
 printf("okay\n");
 }
 
+printf("%-40s", "Testing vpdpwssd (%ecx),%{y,z}mmA,%{y,z}mmB...");
+if ( stack_exec && cpu_has_avx512_vnni && cpu_has_avx_vnni )
+{
+/* Do the same operation two ways and compare the results. */
+decl_insn(vpdpwssd_vex1);
+decl_insn(vpdpwssd_vex2);
+decl_insn(vpdpwssd_evex);
+
+for ( i = 0; i < 24; ++i )
+res[i] = i | (~i << 16);
+
+asm volatile ( "vmovdqu32 32(%0), %%zmm1\n\t"
+   "vextracti64x4 $1, %%zmm1, %%ymm2\n\t"
+   "vpxor %%xmm0, %%xmm0, %%xmm3\n\t"
+   "vpxor %%xmm0, %%xmm0, %%xmm4\n\t"
+   "vpxor %%xmm0, %%xmm0, %%xmm5\n"
+   put_insn(vpdpwssd_vex1,
+/* %{vex%} vpdpwssd (%1), %%ymm1, %%ymm3" */
+".byte 0xc4, 0xe2, 0x75, 0x52, 0x19") "\n"
+   put_insn(vpdpwssd_vex2,
+/* "%{vex%} vpdpwssd 32(%1), %%ymm2, %%ymm4" */
+".byte 0xc4, 0xe2, 0x6d, 0x52, 0x61, 0x20") 
"\n"
+   put_insn(vpdpwssd_evex,
+/* "vpdpwssd (%1), %%zmm1, %%zmm5" */
+".byte 0x62, 0xf2, 0x75, 0x48, 0x52, 0x29")
+   :: "r" (res), "c" (NULL) );
+
+set_insn(vpdpwssd_vex1);
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(, );
+if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_vex1) )
+goto fail;
+
+set_insn(vpdpwssd_vex2);
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(, );
+if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_vex2) )
+goto fail;
+
+set_insn(vpdpwssd_evex);
+regs.ecx = (unsigned long)res;
+rc = x86_emulate(, );
+if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_evex) )
+goto fail;
+
+asm ( "vinserti64x4 $1, %%ymm4, %%zmm3, %%zmm0\n\t"
+  "vpcmpeqd %%zmm0, %%zmm5, %%k0\n\t"
+  "kmovw %%k0, %0" : "=g" (rc) );
+if ( rc != 0x )
+goto fail;
+printf("okay\n");
+}
+else
+printf("skipped\n");
+
 printf("%-40s", "Testing invpcid 16(%ecx),%%edx...");
 if ( stack_exec )
 {
--- a/tools/tests/x86_emulator/x86-emulate.h
+++ b/tools/tests/x86_emulator/x86-emulate.h
@@ -170,6 +170,7 @@ static inline bool xcr0_mask(uint64_t ma
 #define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6))
 #define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && 
xcr0_mask(0xe6))
 #define cpu_has_serialize  cp.feat.serialize
+#define cpu_has_avx_vnni   (cp.feat.avx_vnni && xcr0_mask(6))
 #define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6))
 
 #define cpu_has_xgetbv1   (cpu_has_xsave && cp.xstate.xgetbv1)
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -2008,6 +2008,7 @@ amd_like(const struct x86_emulate_ctxt *
 #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps)
 #define vcpu_has_avx512_vp2intersect() 

Re: [PATCH 12/12] xen/cpupool: make per-cpupool sched-gran hypfs node writable

2020-10-29 Thread Jürgen Groß

On 29.10.20 15:58, Jan Beulich wrote:

On 26.10.2020 10:13, Juergen Gross wrote:

@@ -1088,13 +1098,58 @@ static int cpupool_gran_read(const struct hypfs_entry 
*entry,
  return copy_to_guest(uaddr, name, strlen(name) + 1) ? -EFAULT : 0;
  }
  
+static int cpupool_gran_write(struct hypfs_entry_leaf *leaf,

+  XEN_GUEST_HANDLE_PARAM(void) uaddr,
+  unsigned int ulen)
+{
+const struct hypfs_dyndir_id *data;
+struct cpupool *cpupool;
+enum sched_gran gran;
+unsigned int sched_gran;
+char name[SCHED_GRAN_NAME_LEN];
+int ret = 0;
+
+if ( ulen > SCHED_GRAN_NAME_LEN )
+return -ENOSPC;
+
+if ( copy_from_guest(name, uaddr, ulen) )
+return -EFAULT;
+
+sched_gran = sched_gran_get(name, ) ? 0
+ : cpupool_check_granularity(gran);
+if ( memchr(name, 0, ulen) != (name + ulen - 1) || sched_gran == 0 )
+return -EINVAL;


I guess the memchr() check wants to happen before the call to
sched_gran_get()?


Yes.


Juergen



Re: [PATCH 12/12] xen/cpupool: make per-cpupool sched-gran hypfs node writable

2020-10-29 Thread Jan Beulich
On 26.10.2020 10:13, Juergen Gross wrote:
> @@ -1088,13 +1098,58 @@ static int cpupool_gran_read(const struct hypfs_entry 
> *entry,
>  return copy_to_guest(uaddr, name, strlen(name) + 1) ? -EFAULT : 0;
>  }
>  
> +static int cpupool_gran_write(struct hypfs_entry_leaf *leaf,
> +  XEN_GUEST_HANDLE_PARAM(void) uaddr,
> +  unsigned int ulen)
> +{
> +const struct hypfs_dyndir_id *data;
> +struct cpupool *cpupool;
> +enum sched_gran gran;
> +unsigned int sched_gran;
> +char name[SCHED_GRAN_NAME_LEN];
> +int ret = 0;
> +
> +if ( ulen > SCHED_GRAN_NAME_LEN )
> +return -ENOSPC;
> +
> +if ( copy_from_guest(name, uaddr, ulen) )
> +return -EFAULT;
> +
> +sched_gran = sched_gran_get(name, ) ? 0
> + : 
> cpupool_check_granularity(gran);
> +if ( memchr(name, 0, ulen) != (name + ulen - 1) || sched_gran == 0 )
> +return -EINVAL;

I guess the memchr() check wants to happen before the call to
sched_gran_get()?

Jan



Re: [PATCH 20/33] docs: ABI: testing: make the files compatible with ReST output

2020-10-29 Thread Jonathan Cameron
On Wed, 28 Oct 2020 15:23:18 +0100
Mauro Carvalho Chehab  wrote:

> From: Mauro Carvalho Chehab 
> 
> Some files over there won't parse well by Sphinx.
> 
> Fix them.
> 
> Signed-off-by: Mauro Carvalho Chehab 
> Signed-off-by: Mauro Carvalho Chehab 

Query below...  I'm going to guess a rebase issue?

Other than that
Acked-by: Jonathan Cameron  # for IIO


> diff --git a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 
> b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
> index b7259234ad70..a10a4de3e5fe 100644
> --- a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
> +++ b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
> @@ -3,67 +3,85 @@ KernelVersion:  4.11
>  Contact: benjamin.gaign...@st.com
>  Description:
>   Reading returns the list possible master modes which are:
> - - "reset" : The UG bit from the TIMx_EGR register is
> +
> +
> + - "reset"
> + The UG bit from the TIMx_EGR register is
>   used as trigger output (TRGO).
> - - "enable": The Counter Enable signal CNT_EN is used
> + - "enable"
> + The Counter Enable signal CNT_EN is used
>   as trigger output.
> - - "update": The update event is selected as trigger output.
> + - "update"
> + The update event is selected as trigger output.
>   For instance a master timer can then be used
>   as a prescaler for a slave timer.
> - - "compare_pulse" : The trigger output send a positive pulse
> - when the CC1IF flag is to be set.
> - - "OC1REF": OC1REF signal is used as trigger output.
> - - "OC2REF": OC2REF signal is used as trigger output.
> - - "OC3REF": OC3REF signal is used as trigger output.
> - - "OC4REF": OC4REF signal is used as trigger output.
> + - "compare_pulse"
> + The trigger output send a positive pulse
> + when the CC1IF flag is to be set.
> + - "OC1REF"
> + OC1REF signal is used as trigger output.
> + - "OC2REF"
> + OC2REF signal is used as trigger output.
> + - "OC3REF"
> + OC3REF signal is used as trigger output.
> + - "OC4REF"
> + OC4REF signal is used as trigger output.
> +
>   Additional modes (on TRGO2 only):
> - - "OC5REF": OC5REF signal is used as trigger output.
> - - "OC6REF": OC6REF signal is used as trigger output.
> +
> + - "OC5REF"
> + OC5REF signal is used as trigger output.
> + - "OC6REF"
> + OC6REF signal is used as trigger output.
>   - "compare_pulse_OC4REF":
> -   OC4REF rising or falling edges generate pulses.
> + OC4REF rising or falling edges generate pulses.
>   - "compare_pulse_OC6REF":
> -   OC6REF rising or falling edges generate pulses.
> + OC6REF rising or falling edges generate pulses.
>   - "compare_pulse_OC4REF_r_or_OC6REF_r":
> -   OC4REF or OC6REF rising edges generate pulses.
> + OC4REF or OC6REF rising edges generate pulses.
>   - "compare_pulse_OC4REF_r_or_OC6REF_f":
> -   OC4REF rising or OC6REF falling edges generate pulses.
> + OC4REF rising or OC6REF falling edges generate
> + pulses.
>   - "compare_pulse_OC5REF_r_or_OC6REF_r":
> -   OC5REF or OC6REF rising edges generate pulses.
> + OC5REF or OC6REF rising edges generate pulses.
>   - "compare_pulse_OC5REF_r_or_OC6REF_f":
> -   OC5REF rising or OC6REF falling edges generate pulses.
> + OC5REF rising or OC6REF falling edges generate
> + pulses.
>  
> - +---+   +-++-+
> - | Prescaler +-> | Counter |+-> | Master  | TRGO(2)
> - +---+   +--++-+|-> | Control +-->
> -||  ||  +-+
> - +--v+-+ OCxREF ||  +-+
> - | Chx compare +--> | Output  | ChX
> - +---+-+ |  | Control +-->
> -   . |   |  +-+
> -   . |   |.
> - 

Re: [PATCH] xen: add support for automatic debug key actions in case of crash

2020-10-29 Thread Jan Beulich
On 29.10.2020 15:40, Jürgen Groß wrote:
> On 29.10.20 15:22, Jan Beulich wrote:
>> On 22.10.2020 16:39, Juergen Gross wrote:
>>> @@ -507,6 +509,41 @@ void __init initialize_keytable(void)
>>>   }
>>>   }
>>>   
>>> +#define CRASHACTION_SIZE  32
>>> +static char crash_debug_panic[CRASHACTION_SIZE];
>>> +static char crash_debug_hwdom[CRASHACTION_SIZE];
>>> +static char crash_debug_watchdog[CRASHACTION_SIZE];
>>> +static char crash_debug_kexeccmd[CRASHACTION_SIZE];
>>> +static char crash_debug_debugkey[CRASHACTION_SIZE];
>>> +
>>> +static char *crash_action[CRASHREASON_N] = {
>>> +[CRASHREASON_PANIC] = crash_debug_panic,
>>> +[CRASHREASON_HWDOM] = crash_debug_hwdom,
>>> +[CRASHREASON_WATCHDOG] = crash_debug_watchdog,
>>> +[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd,
>>> +[CRASHREASON_DEBUGKEY] = crash_debug_debugkey,
>>> +};
>>> +
>>> +string_runtime_param("crash-debug-panic", crash_debug_panic);
>>> +string_runtime_param("crash-debug-hwdom", crash_debug_hwdom);
>>> +string_runtime_param("crash-debug-watchdog", crash_debug_watchdog);
>>> +string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd);
>>> +string_runtime_param("crash-debug-debugkey", crash_debug_debugkey);
>>
>> In general I'm not in favor of this (or similar) needing
>> five new command line options, instead of just one. The problem
>> with e.g.
>>
>> crash-debug=panic:rq,watchdog:0d
>>
>> is that ',' (or any other separator chosen) could in principle
>> also be a debug key. It would still be possible to use
>>
>> crash-debug=panic:rq crash-debug=watchdog:0d
>>
>> though. Thoughts?
> 
> OTOH the runtime parameters are much easier addressable that way.

Ah yes, I can see this as a reason. Would make we wonder whether
command line and runtime handling may want disconnecting in this
specific case then. (But I can also see the argument of this
being too much overhead then.)

>>> +void keyhandler_crash_action(enum crash_reason reason)
>>> +{
>>> +const char *action = crash_action[reason];
>>> +struct cpu_user_regs *regs = get_irq_regs() ? : guest_cpu_user_regs();
>>> +
>>> +while ( *action ) {
>>> +if ( *action == '.' )
>>> +mdelay(1000);
>>> +else
>>> +handle_keypress(*action, regs);
>>> +action++;
>>> +}
>>> +}
>>
>> I think only diagnostic keys should be permitted here.
> 
> While I understand that other keys could produce nonsense or do harm,
> I'm not sure we should really prohibit them. Allowing them would e.g.
> allow to do just a reboot without kdump for one type of crashes.

Ah yes, that's a fair point.

>>> --- a/xen/include/xen/kexec.h
>>> +++ b/xen/include/xen/kexec.h
>>> @@ -1,6 +1,8 @@
>>>   #ifndef __XEN_KEXEC_H__
>>>   #define __XEN_KEXEC_H__
>>>   
>>> +#include 
>>
>> Could we go without this, utilizing the gcc extension of forward
>> declared enums? Otoh ...
>>
>>> @@ -82,7 +84,11 @@ void vmcoreinfo_append_str(const char *fmt, ...)
>>>   #define kexecing 0
>>>   
>>>   static inline void kexec_early_calculations(void) {}
>>> -static inline void kexec_crash(void) {}
>>> +static inline void kexec_crash(enum crash_reason reason)
>>> +{
>>> +keyhandler_crash_action(reason);
>>> +}
>>
>> ... if this is to be an inline function and not just a #define,
>> it'll need the declaration of the function to have been seen.
> 
> And even being a #define all users of kexec_crash() would need to
> #include keyhandler.h (and I'm not sure there are any source files
> including kexec.h which don't use kexec_crash()).

About as many which do as ones which don't. But there's no
generally accessible header which includes xen/kexec.h, so perhaps
the extra dependency indeed isn't all this problematic.

Jan



Re: [PATCH] xen: add support for automatic debug key actions in case of crash

2020-10-29 Thread Jürgen Groß

On 29.10.20 15:22, Jan Beulich wrote:

On 22.10.2020 16:39, Juergen Gross wrote:

When the host crashes it would sometimes be nice to have additional
debug data available which could be produced via debug keys, but
halting the server for manual intervention might be impossible due to
the need to reboot/kexec rather sooner than later.

Add support for automatic debug key actions in case of crashes which
can be activated via boot- or runtime-parameter.


While I can certainly see this possibly being a useful thing in
certain situations, I'm uncertain it's going to be helpful in at
least a fair set of cases. What output to request very much
depends on the nature of the problem one is running into, and
the more keys one adds "just in case", the longer the reboot
latency, and the higher the risk (see also below) of the output
generation actually causing further problems.


The obvious case is a watchdog induced crash: at least 2 sets of dom0
state will help in many cases.


IOW I'm neither fully convinced that we want this, nor fully
opposed.


Depending on the type of crash the desired data might be different, so
support different settings for the possible types of crashes.

The parameter is "crash-debug" with the following syntax:

   crash-debug-=

with  being one of:

   panic, hwdom, watchdog, kexeccmd, debugkey

and  a sequence of debug key characters with '.' having the
special semantics of a 1 second pause.


1 second is a whole lot of time. To get two successive sets
of data, a much shorter delay (if any) would normally suffice.


Yes, I'd be fine to trade that for a shorter period of time.


Also, while '.' may seem like a good choice right now, with the
shortage of characters we may want to put a real handler behind
it at come point. The one character that clearly won't make much
sense to use in this context is 'h', but that's awful as a (kind
of) separator. Could we perhaps replace 'h' by '?', freeing up
'h' and allowing '?' to be used for this purpose here?


Fine with me. Another possibility would be to add '\' as an escape
character with '\.' meaning "debug-key .".


--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -574,6 +574,29 @@ reduction of features at Xen's disposal to manage guests.
  ### cpuinfo (x86)
  > `= `
  
+### crash-debug-debugkey

+### crash-debug-hwdom
+### crash-debug-kexeccmd
+### crash-debug-panic
+### crash-debug-watchdog
+> `= `
+
+> Can be modified at runtime
+
+Specify debug-key actions in cases of crashes. Each of the parameters applies
+to a different crash reason. The `` is a sequence of debug key
+characters, with `.` having the special meaning of a 1 second pause.
+
+So e.g. `crash-debug-watchdog=0.0r` would dump dom0 state twice with a
+second between the two state dumps, followed by the run queues of the
+hypervisor, if the system crashes due to a watchdog timeout.
+
+These parameters should be used carefully, as e.g. specifying
+`crash-debug-debugkey=C` would result in an endless loop. Depending on the
+reason of the system crash it might happen that triggering some debug key
+action will result in a hang instead of dumping data and then doing a
+reboot or crash dump.


I think it would be useful if the flavors were (briefly)
explained: At the very least "debugkey" doesn't directly fit "in
cases of crashes", as there's no crash involved. kexec_crash()
instead gets invoked without there having been any crash.


Yes, and having some additional state generate for this case might
help diagnosis.



You may also want to point out that this is a best effort thing
only - system state at the point of a crash may be such that the
attempt of handling one or the debug keys would have further bad
effects on the system, including that the actual kexec may then
never occur.


True.




@@ -507,6 +509,41 @@ void __init initialize_keytable(void)
  }
  }
  
+#define CRASHACTION_SIZE  32

+static char crash_debug_panic[CRASHACTION_SIZE];
+static char crash_debug_hwdom[CRASHACTION_SIZE];
+static char crash_debug_watchdog[CRASHACTION_SIZE];
+static char crash_debug_kexeccmd[CRASHACTION_SIZE];
+static char crash_debug_debugkey[CRASHACTION_SIZE];
+
+static char *crash_action[CRASHREASON_N] = {
+[CRASHREASON_PANIC] = crash_debug_panic,
+[CRASHREASON_HWDOM] = crash_debug_hwdom,
+[CRASHREASON_WATCHDOG] = crash_debug_watchdog,
+[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd,
+[CRASHREASON_DEBUGKEY] = crash_debug_debugkey,
+};
+
+string_runtime_param("crash-debug-panic", crash_debug_panic);
+string_runtime_param("crash-debug-hwdom", crash_debug_hwdom);
+string_runtime_param("crash-debug-watchdog", crash_debug_watchdog);
+string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd);
+string_runtime_param("crash-debug-debugkey", crash_debug_debugkey);


In general I'm not in favor of this (or similar) needing
five new command line options, instead of just one. The problem
with e.g.

crash-debug=panic:rq,watchdog:0d

is that ',' 

Re: [PATCH] x86/pv: Drop stale comment in dom0_construct_pv()

2020-10-29 Thread Jan Beulich
On 29.10.2020 15:00, Andrew Cooper wrote:
> This comment has been around since c/s 1372bca0615 in 2004.  It is stale, as
> it predates the introduction of struct vcpu.

That commit only moved it around; it's 22a857bde9b8 afaics from
early 2003 where it first appeared, where it had a reason:

/*
 * WARNING: The new domain must have its 'processor' field
 * filled in by now !!
 */
phys_l2tab = ALLOC_FRAME_FROM_DOMAIN();
l2tab = map_domain_mem(phys_l2tab);
memcpy(l2tab, idle_pg_table[p->processor], PAGE_SIZE);

But yes, the comment has been stale for a long time, and I've
been wondering a number of times what it was supposed to tell
me. (I think it was already stale at the point the comment
first got altered, in 3072fef54df8.)

> It is not obvious that it was even correct at the time.  Where a vcpu (domain
> at the time) has been configured to run is unrelated to construct the domain's
> initial pagetables, etc.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Jan Beulich 

Jan



Re: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver

2020-10-29 Thread Bertrand Marquis
Hi Julien,

> On 28 Oct 2020, at 19:12, Julien Grall  wrote:
> 
> 
> 
> On 26/10/2020 11:03, Rahul Singh wrote:
>> Hello Julien,
> 
> Hi Rahul,
> 
>>> On 23 Oct 2020, at 4:19 pm, Julien Grall  wrote:
>>> 
>>> 
>>> 
>>> On 23/10/2020 15:27, Rahul Singh wrote:
 Hello Julien,
> On 23 Oct 2020, at 2:00 pm, Julien Grall  wrote:
> 
> 
> 
> On 23/10/2020 12:35, Rahul Singh wrote:
>> Hello,
>>> On 23 Oct 2020, at 1:02 am, Stefano Stabellini  
>>> wrote:
>>> 
>>> On Thu, 22 Oct 2020, Julien Grall wrote:
>> On 20/10/2020 16:25, Rahul Singh wrote:
>>> Add support for ARM architected SMMUv3 implementations. It is based 
>>> on
>>> the Linux SMMUv3 driver.
>>> Major differences between the Linux driver are as follows:
>>> 1. Only Stage-2 translation is supported as compared to the Linux 
>>> driver
>>>that supports both Stage-1 and Stage-2 translations.
>>> 2. Use P2M  page table instead of creating one as SMMUv3 has the
>>>capability to share the page tables with the CPU.
>>> 3. Tasklets is used in place of threaded IRQ's in Linux for event 
>>> queue
>>>and priority queue IRQ handling.
>> 
>> Tasklets are not a replacement for threaded IRQ. In particular, they 
>> will
>> have priority over anything else (IOW nothing will run on the pCPU 
>> until
>> they are done).
>> 
>> Do you know why Linux is using thread. Is it because of long running
>> operations?
> 
> Yes you are right because of long running operations Linux is using 
> the
> threaded IRQs.
> 
> SMMUv3 reports fault/events bases on memory-based circular buffer 
> queues not
> based on the register. As per my understanding, it is time-consuming 
> to
> process the memory based queues in interrupt context because of that 
> Linux
> is using threaded IRQ to process the faults/events from SMMU.
> 
> I didn’t find any other solution in XEN in place of tasklet to defer 
> the
> work, that’s why I used tasklet in XEN in replacement of threaded 
> IRQs. If
> we do all work in interrupt context we will make XEN less responsive.
 
 So we need to make sure that Xen continue to receives interrupts, but 
 we also
 need to make sure that a vCPU bound to the pCPU is also responsive.
 
> 
> If you know another solution in XEN that will be used to defer the 
> work in
> the interrupt please let me know I will try to use that.
 
 One of my work colleague encountered a similar problem recently. He 
 had a long
 running tasklet and wanted to be broken down in smaller chunk.
 
 We decided to use a timer to reschedule the taslket in the future. 
 This allows
 the scheduler to run other loads (e.g. vCPU) for some time.
 
 This is pretty hackish but I couldn't find a better solution as 
 tasklet have
 high priority.
 
 Maybe the other will have a better idea.
>>> 
>>> Julien's suggestion is a good one.
>>> 
>>> But I think tasklets can be configured to be called from the idle_loop,
>>> in which case they are not run in interrupt context?
>>> 
>>  Yes you are right tasklet will be scheduled from the idle_loop that is 
>> not interrupt conext.
> 
> This depends on your tasklet. Some will run from the softirq context 
> which is usually (for Arm) on the return of an exception.
> 
 Thanks for the info. I will check and will get better understanding of the 
 tasklet how it will run in XEN.
>>> 
>>> 4. Latest version of the Linux SMMUv3 code implements the commands 
>>> queue
>>>access functions based on atomic operations implemented in Linux.
>> 
>> Can you provide more details?
> 
> I tried to port the latest version of the SMMUv3 code than I observed 
> that
> in order to port that code I have to also port atomic operation 
> implemented
> in Linux to XEN. As latest Linux code uses atomic operation to 
> process the
> command queues 
> (atomic_cond_read_relaxed(),atomic_long_cond_read_relaxed() ,
> atomic_fetch_andnot_relaxed()) .
 
 Thank you for the explanation. I think it would be best to import the 
 atomic
 helpers and use the latest code.
 
 This will ensure that we don't re-introduce bugs and also buy us some 
 time
 before the Linux and Xen driver diverge again too much.
 
 Stefano, what do you think?
>>> 
>>> I think you are right.
>> Yes, I agree with 

Re: [PATCH] xen: add support for automatic debug key actions in case of crash

2020-10-29 Thread Jan Beulich
On 22.10.2020 16:39, Juergen Gross wrote:
> When the host crashes it would sometimes be nice to have additional
> debug data available which could be produced via debug keys, but
> halting the server for manual intervention might be impossible due to
> the need to reboot/kexec rather sooner than later.
> 
> Add support for automatic debug key actions in case of crashes which
> can be activated via boot- or runtime-parameter.

While I can certainly see this possibly being a useful thing in
certain situations, I'm uncertain it's going to be helpful in at
least a fair set of cases. What output to request very much
depends on the nature of the problem one is running into, and
the more keys one adds "just in case", the longer the reboot
latency, and the higher the risk (see also below) of the output
generation actually causing further problems.

IOW I'm neither fully convinced that we want this, nor fully
opposed.

> Depending on the type of crash the desired data might be different, so
> support different settings for the possible types of crashes.
> 
> The parameter is "crash-debug" with the following syntax:
> 
>   crash-debug-=
> 
> with  being one of:
> 
>   panic, hwdom, watchdog, kexeccmd, debugkey
> 
> and  a sequence of debug key characters with '.' having the
> special semantics of a 1 second pause.

1 second is a whole lot of time. To get two successive sets
of data, a much shorter delay (if any) would normally suffice.

Also, while '.' may seem like a good choice right now, with the
shortage of characters we may want to put a real handler behind
it at come point. The one character that clearly won't make much
sense to use in this context is 'h', but that's awful as a (kind
of) separator. Could we perhaps replace 'h' by '?', freeing up
'h' and allowing '?' to be used for this purpose here?

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -574,6 +574,29 @@ reduction of features at Xen's disposal to manage guests.
>  ### cpuinfo (x86)
>  > `= `
>  
> +### crash-debug-debugkey
> +### crash-debug-hwdom
> +### crash-debug-kexeccmd
> +### crash-debug-panic
> +### crash-debug-watchdog
> +> `= `
> +
> +> Can be modified at runtime
> +
> +Specify debug-key actions in cases of crashes. Each of the parameters applies
> +to a different crash reason. The `` is a sequence of debug key
> +characters, with `.` having the special meaning of a 1 second pause.
> +
> +So e.g. `crash-debug-watchdog=0.0r` would dump dom0 state twice with a
> +second between the two state dumps, followed by the run queues of the
> +hypervisor, if the system crashes due to a watchdog timeout.
> +
> +These parameters should be used carefully, as e.g. specifying
> +`crash-debug-debugkey=C` would result in an endless loop. Depending on the
> +reason of the system crash it might happen that triggering some debug key
> +action will result in a hang instead of dumping data and then doing a
> +reboot or crash dump.

I think it would be useful if the flavors were (briefly)
explained: At the very least "debugkey" doesn't directly fit "in
cases of crashes", as there's no crash involved. kexec_crash()
instead gets invoked without there having been any crash.

You may also want to point out that this is a best effort thing
only - system state at the point of a crash may be such that the
attempt of handling one or the debug keys would have further bad
effects on the system, including that the actual kexec may then
never occur.

> @@ -507,6 +509,41 @@ void __init initialize_keytable(void)
>  }
>  }
>  
> +#define CRASHACTION_SIZE  32
> +static char crash_debug_panic[CRASHACTION_SIZE];
> +static char crash_debug_hwdom[CRASHACTION_SIZE];
> +static char crash_debug_watchdog[CRASHACTION_SIZE];
> +static char crash_debug_kexeccmd[CRASHACTION_SIZE];
> +static char crash_debug_debugkey[CRASHACTION_SIZE];
> +
> +static char *crash_action[CRASHREASON_N] = {
> +[CRASHREASON_PANIC] = crash_debug_panic,
> +[CRASHREASON_HWDOM] = crash_debug_hwdom,
> +[CRASHREASON_WATCHDOG] = crash_debug_watchdog,
> +[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd,
> +[CRASHREASON_DEBUGKEY] = crash_debug_debugkey,
> +};
> +
> +string_runtime_param("crash-debug-panic", crash_debug_panic);
> +string_runtime_param("crash-debug-hwdom", crash_debug_hwdom);
> +string_runtime_param("crash-debug-watchdog", crash_debug_watchdog);
> +string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd);
> +string_runtime_param("crash-debug-debugkey", crash_debug_debugkey);

In general I'm not in favor of this (or similar) needing
five new command line options, instead of just one. The problem
with e.g.

crash-debug=panic:rq,watchdog:0d

is that ',' (or any other separator chosen) could in principle
also be a debug key. It would still be possible to use

crash-debug=panic:rq crash-debug=watchdog:0d

though. Thoughts?

> +void keyhandler_crash_action(enum crash_reason reason)
> +{
> +const char *action = crash_action[reason];
> +

[PATCH] x86/pv: Drop stale comment in dom0_construct_pv()

2020-10-29 Thread Andrew Cooper
This comment has been around since c/s 1372bca0615 in 2004.  It is stale, as
it predates the introduction of struct vcpu.

It is not obvious that it was even correct at the time.  Where a vcpu (domain
at the time) has been configured to run is unrelated to construct the domain's
initial pagetables, etc.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Roger Pau Monné 
CC: Wei Liu 

Almost...  I'm not entirely sure NUMA memory allocation is plumbed through
correctly, but even that still has nothing to do with v->processor
---
 xen/arch/x86/pv/dom0_build.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c
index d79503d6a9..f7165309a2 100644
--- a/xen/arch/x86/pv/dom0_build.c
+++ b/xen/arch/x86/pv/dom0_build.c
@@ -616,7 +616,6 @@ int __init dom0_construct_pv(struct domain *d,
 v->arch.pv.event_callback_cs= FLAT_COMPAT_KERNEL_CS;
 }
 
-/* WARNING: The new domain must have its 'processor' field filled in! */
 if ( !is_pv_32bit_domain(d) )
 {
 maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l4_page_table;
-- 
2.11.0




Ping²: [PATCH] x86/PV: make post-migration page state consistent

2020-10-29 Thread Jan Beulich
On 11.09.2020 12:34, Jan Beulich wrote:
> When a page table page gets de-validated, its type reference count drops
> to zero (and PGT_validated gets cleared), but its type remains intact.
> XEN_DOMCTL_getpageframeinfo3, therefore, so far reported prior usage for
> such pages. An intermediate write to such a page via e.g.
> MMU_NORMAL_PT_UPDATE, however, would transition the page's type to
> PGT_writable_page, thus altering what XEN_DOMCTL_getpageframeinfo3 would
> return. In libxc the decision which pages to normalize / localize
> depends solely on the type returned from the domctl. As a result without
> further precautions the guest won't be able to tell whether such a page
> has had its (apparent) PTE entries transitioned to the new MFNs.
> 
> Add a check of PGT_validated, thus consistently avoiding normalization /
> localization in the tool stack.
> 
> Alongside using XEN_DOMCTL_PFINFO_NOTAB instead of plain zero for the
> change at hand, also change the variable's initializer to use this
> constant, too. Take the opportunity and also adjust its type.
> 
> Signed-off-by: Jan Beulich 

I think I did address all questions here.

Jan

> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -215,7 +215,8 @@ long arch_do_domctl(
>  
>  for ( i = 0; i < num; ++i )
>  {
> -unsigned long gfn = 0, type = 0;
> +unsigned long gfn = 0;
> +unsigned int type = XEN_DOMCTL_PFINFO_NOTAB;
>  struct page_info *page;
>  p2m_type_t t;
>  
> @@ -255,6 +256,8 @@ long arch_do_domctl(
>  
>  if ( page->u.inuse.type_info & PGT_pinned )
>  type |= XEN_DOMCTL_PFINFO_LPINTAB;
> +else if ( !(page->u.inuse.type_info & PGT_validated) )
> +type = XEN_DOMCTL_PFINFO_NOTAB;
>  
>  if ( page->count_info & PGC_broken )
>  type = XEN_DOMCTL_PFINFO_BROKEN;
> 




Ping: [PATCH] tools/python: pass more -rpath-link options to ld

2020-10-29 Thread Jan Beulich
On 19.10.2020 10:31, Jan Beulich wrote:
> With the split of libraries, I've observed a number of warnings from
> (old?) ld.
> 
> Signed-off-by: Jan Beulich 

Marek?

> ---
> It's unclear to me whether this is ld version dependent - the pattern
> of where I've seen such warnings doesn't suggest a clear version
> dependency.
> 
> --- a/tools/python/setup.py
> +++ b/tools/python/setup.py
> @@ -7,10 +7,15 @@ XEN_ROOT = "../.."
>  extra_compile_args  = [ "-fno-strict-aliasing", "-Werror" ]
>  
>  PATH_XEN  = XEN_ROOT + "/tools/include"
> +PATH_LIBXENTOOLCORE = XEN_ROOT + "/tools/libs/toolcore"
>  PATH_LIBXENTOOLLOG = XEN_ROOT + "/tools/libs/toollog"
> +PATH_LIBXENCALL = XEN_ROOT + "/tools/libs/call"
>  PATH_LIBXENEVTCHN = XEN_ROOT + "/tools/libs/evtchn"
> +PATH_LIBXENGNTTAB = XEN_ROOT + "/tools/libs/gnttab"
>  PATH_LIBXENCTRL = XEN_ROOT + "/tools/libs/ctrl"
>  PATH_LIBXENGUEST = XEN_ROOT + "/tools/libs/guest"
> +PATH_LIBXENDEVICEMODEL = XEN_ROOT + "/tools/libs/devicemodel"
> +PATH_LIBXENFOREIGNMEMORY = XEN_ROOT + "/tools/libs/foreignmemory"
>  PATH_XENSTORE = XEN_ROOT + "/tools/libs/store"
>  
>  xc = Extension("xc",
> @@ -24,7 +29,13 @@ xc = Extension("xc",
> library_dirs   = [ PATH_LIBXENCTRL, PATH_LIBXENGUEST ],
> libraries  = [ "xenctrl", "xenguest" ],
> depends= [ PATH_LIBXENCTRL + "/libxenctrl.so", 
> PATH_LIBXENGUEST + "/libxenguest.so" ],
> -   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG 
> ],
> +   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENCALL,
> +  
> "-Wl,-rpath-link="+PATH_LIBXENDEVICEMODEL,
> +  "-Wl,-rpath-link="+PATH_LIBXENEVTCHN,
> +  
> "-Wl,-rpath-link="+PATH_LIBXENFOREIGNMEMORY,
> +  "-Wl,-rpath-link="+PATH_LIBXENGNTTAB,
> +  "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE,
> +  "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG 
> ],
> sources= [ "xen/lowlevel/xc/xc.c" ])
>  
>  xs = Extension("xs",
> @@ -33,6 +44,7 @@ xs = Extension("xs",
> library_dirs   = [ PATH_XENSTORE ],
> libraries  = [ "xenstore" ],
> depends= [ PATH_XENSTORE + "/libxenstore.so" ],
> +   extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE 
> ],
> sources= [ "xen/lowlevel/xs/xs.c" ])
>  
>  plat = os.uname()[0]
> 




Ping: [PATCH 2/2] tools/libs: fix uninstall rule for header files

2020-10-29 Thread Jan Beulich
On 19.10.2020 09:21, Jan Beulich wrote:
> This again was working right only as long as $(LIBHEADER) consisted of
> just one entry.
> 
> Signed-off-by: Jan Beulich 

While patch 1 has become irrelevant with Juergen's more complete
change, this one is still applicable afaict.

Jan

> ---
> An alternative would be to use $(addprefix ) without any shell loop.
> 
> --- a/tools/libs/libs.mk
> +++ b/tools/libs/libs.mk
> @@ -107,7 +107,7 @@ install: build
>  .PHONY: uninstall
>  uninstall:
>   rm -f $(DESTDIR)$(PKG_INSTALLDIR)/$(LIB_FILE_NAME).pc
> - for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$(LIBHEADER); 
> done
> + for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$$i; done
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR)
>   rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR).$(MINOR)
> 
> 




[linux-linus test] 156269: regressions - FAIL

2020-10-29 Thread osstest service owner
flight 156269 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/156269/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. 
vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-intel  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-debianhvm-amd64  7 xen-install  fail REGR. vs. 152332
 test-amd64-coresched-i386-xl  7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemut-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemut-ws16-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-xsm   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt   7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-qemuu-rhel6hvm-amd  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 
152332
 test-amd64-i386-examine   6 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-raw7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-pvshim 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-freebsd10-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-shadow 7 xen-install  fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332
 test-amd64-i386-freebsd10-i386  7 xen-installfail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-win7-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 xen-install   fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332
 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. 
vs. 152332
 test-arm64-arm64-xl-credit1   8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-xsm   8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-libvirt-xsm  8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-examine  8 reboot   fail REGR. vs. 152332
 test-amd64-amd64-amd64-pvgrub 20 guest-stop  fail REGR. vs. 152332
 test-amd64-amd64-i386-pvgrub 19 guest-localmigrate/x10   fail REGR. vs. 152332
 test-armhf-armhf-xl-credit1   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-libvirt  8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-examine  8 reboot   fail REGR. vs. 152332
 test-armhf-armhf-xl-cubietruck  8 xen-boot   fail REGR. vs. 152332
 test-armhf-armhf-xl-multivcpu  8 xen-bootfail REGR. vs. 152332
 test-armhf-armhf-libvirt-raw  8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-seattle   8 xen-boot fail REGR. vs. 152332
 test-arm64-arm64-xl-credit2   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-credit2   8 xen-boot fail REGR. vs. 152332
 test-armhf-armhf-xl-vhd   8 xen-boot fail REGR. vs. 152332

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds  8 xen-boot fail REGR. vs. 152332

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-xl  11 leak-check/basis(11)fail blocked in 152332
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332
 test-amd64-amd64-libvirt 15 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail   never pass
 

Ping: [PATCH v3 0/3] x86: shim building adjustments (plus shadow follow-on)

2020-10-29 Thread Jan Beulich
Tim,

On 19.10.2020 10:42, Jan Beulich wrote:
> The change addressing the shadow related build issue noticed by
> Andrew went in already. The build breakage goes beyond this
> specific combination though - PV_SHIM_EXCLUSIVE plus HVM is
> similarly an issue. This is what the 1st patch tries to take care
> of, in a shape already on irc noticed to be controversial. I'm
> submitting the change nevertheless because for the moment there
> looks to be a majority in favor of going this route. One argument
> not voiced there yet: What good does it do to allow a user to
> enable HVM when then on the resulting hypervisor they still can't
> run HVM guests (for the hypervisor still being a dedicated PV
> shim one). On top of this, the alternative approach is likely
> going to get ugly.
> 
> The shadow related adjustments are here merely because the want
> to make them was noticed in the context of the patch which has
> already gone in.
> 
> 1: don't permit HVM and PV_SHIM_EXCLUSIVE at the same time
> 2: refactor shadow_vram_{get,put}_l1e()
> 3: sh_{make,destroy}_monitor_table() are "even more" HVM-only

unless you tell me otherwise I'm intending to commit the latter
two with Roger's acks some time next week. Of course it would
still be nice to know your view on the first of the TBDs in
patch 3 (which may result in further changes, in a follow-up).

Jan



Re: [PATCH] meson.build: fix building of Xen support for aarch64

2020-10-29 Thread Jason Andryuk
On Thu, Oct 29, 2020 at 6:01 AM Alex Bennée  wrote:
>
>
> Stefano Stabellini  writes:
>
> > On Wed, 28 Oct 2020, Alex Bennée wrote:
> >> Xen is supported on aarch64 although weirdly using the i386-softmmu
> >> model. Checking based on the host CPU meant we never enabled Xen
> >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to
> >> make it not seem weird but that will require further build surgery.
> >>
> >> Signed-off-by: Alex Bennée 
> >> Cc: Masami Hiramatsu 
> >> Cc: Stefano Stabellini 
> >> Cc: Anthony Perard 
> >> Cc: Paul Durrant 
> >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson")
> >> ---
> >>  meson.build | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/meson.build b/meson.build
> >> index 835424999d..f1fcbfed4c 100644
> >> --- a/meson.build
> >> +++ b/meson.build
> >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64']
> >>  'CONFIG_HVF': ['x86_64-softmmu'],
> >>  'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'],
> >>}
> >> +elif cpu in [ 'arm', 'aarch64' ]
> >> +  accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] }
> >>  endif
> >
> > This looks very reasonable -- the patch makes sense.

A comment would be useful to explain that Arm needs i386-softmmu for
the xenpv machine.

> >
> > However I have two questions, mostly for my own understanding. I tried
> > to repro the aarch64 build problem but it works at my end, even without
> > this patch.
>
> Building on a x86 host or with cross compiler?
>
> > I wonder why. I suspect it works thanks to these lines in
> > meson.build:

I think it's a runtime and not a build problem.  In osstest, Xen
support was detected and configured, but CONFIG_XEN wasn't set for
Arm.  So at runtime xen_available() returns 0, and QEMU doesn't start
with "qemu-system-i386: -xen-domid 1: Option not supported for this
target"

I posted my investigation here:
https://lore.kernel.org/xen-devel/cakf6xpss8kpgovzrkitpz63bhbvbjxrtywdhekzuo2q1kem...@mail.gmail.com/

Regards,
Jason



  1   2   >