Re: BUG: libxl vuart build order
Hi Stefano, On Thu, Oct 29, 2020 at 07:03:28PM -0700, Stefano Stabellini wrote: > + xen-devel and libxl maintainers > > In short, there is a regression in libxl with the ARM vuart introduced > by moving ARM guests to the PVH build. > > > On Thu, 29 Oct 2020, Takahiro Akashi wrote: > > On Wed, Oct 28, 2020 at 02:44:16PM -0700, Stefano Stabellini wrote: > > > On Wed, 28 Oct 2020, Takahiro Akashi wrote: > > > > On Tue, Oct 27, 2020 at 09:02:14AM +0900, Takahiro Akashi wrote: > > > > > On Mon, Oct 26, 2020 at 04:37:30PM -0700, Stefano Stabellini wrote: > > > > > > > > > > > > On Mon, 26 Oct 2020, Takahiro Akashi wrote: > > > > > > > Stefano, > > > > > > > > > > > > > > # I'm afraid that I have already bothered you with a lot of > > > > > > > questions. > > > > > > > > > > > > > > When I looked at Xen's vpl011 implementation, I found > > > > > > > CR (and LCHR) register is not supported. (trap may cause a data > > > > > > > abort). > > > > > > > On the other hand, for example, linux's pl011 driver surely > > > > > > > accesses CR (and LCHR) register. > > > > > > > So I guess that linux won't be able to use pl011 on a Xen guest vm > > > > > > > if vuart = "sbsa_uart". > > > > > > > > > > > > > > Is this a known issue or do I miss anything? > > > > > > > > > > > > Linux should definitely be able to use it, and in fact, I am using > > > > > > it > > > > > > with Linux in my test environment. > > > > > > > > > > > > I think the confusion comes from the name "vpl011": it is in fact > > > > > > not a > > > > > > full PL011 UART, but an SBSA UART. > > > > > > > > > > Yeah, I have noticed it. > > > > > > > > > > > SBSA UART only implements a subset of > > > > > > the PL011 registers. The compatible string is "arm,sbsa-uart", also > > > > > > see > > > > > > drivers/tty/serial/amba-pl011.c:sbsa_uart_probe. > > > > > > > > > > Looking closely into the details of implementation, I found > > > > > that all the accesses to unimplemented registers, including > > > > > CR, are deliberately avoided in sbsa part of linux driver. > > > > > > > > So I'm now trying to implement "sbsa-uart" driver on U-Boot > > > > by modifying the existing pl011 driver. > > > > (Please note the current xen'ized U-Boot utilises a para-virtualized > > > > console, i.e. with HVM_PARAM_CONSOLE_PFN.) > > > > > > > > So far my all attempts have failed. > > > > > > > > There are a couple of problems, and one of them is how we can > > > > access vpl011 port (from dom0). > > > > What I did is: > > > > - modify U-Boot's pl011 driver > > > > (I'm sure that the driver correctly handle a vpl011 device > > > > with regard of accessing a proper set of registers.) > > > > - start U-Boot guest with "vuart=sbsa_uart" by > > > > xl create uboot.cfg -c > > > > > > > > Then I have seen almost nothing on the screen. > > > > Digging into vpl011 implementation, I found that all the characters > > > > written DR register will be directed to a "backend domain" if a guest > > > > vm is launched by xl command. > > > > (In case of dom0less, the backend seems to be Xen itself.) > > > > > > > > As a silly experiment, I modified domain_vpl011_init() to always create > > > > a vpl011 device with "backend_in_domain == false". > > > > Then, I could see more boot messages from U-Boot, but still fails > > > > to use the device as a console, I mean, we will lose all the outputs > > > > after at some point and won't be able to type any keys (at a command > > > > prompt). > > > > (This will be another problem on U-Boot side.) > > > > > > > > My first question here is how we can configure and connect a console > > > > in this case? > > > > Should "xl create -c" or "xl console -t vuart" simply work? > > > > > > "xl create -c" creates a guest and connect to the primary console which > > > is the PV console (i.e. HVM_PARAM_CONSOLE_PFN.) > > > > So in case of vuart, it (console) doesn't work? > > (Apparently, "xl create" doesn't take '-t' option.) > > > > > To connect to the emulated sbsa uart you need to pass -t vuart. So yes, > > > "xl console -t vuart domain_name" should get you access to the emulated > > > sbsa uart. The sbsa uart can also be exposed to dom0less guests; you get > > > their output by using CTRL-AAA to switch to right domU console. > > > > > > You can add printks to xen/arch/arm/vpl011.c in Xen to see what's > > > happening on the Xen side. vpl011.c is the emulator. > > > > I'm sure that write to "REG_DR" register is caught by Xen. > > What I don't understand is > > if back_in_domain -> no outputs > > if !back_in_domain -> can see outputs > > > > (As you know, if a guest is created by xl command, back_in_domain > > is forcedly set to true.) > > > > I looked into xenstore and found that "vuart/0/tty" does not exist, > > but "console/tty" does. > > How can this happen for vuart? > > (I clearly specified, vuart = "sbsa_uart" in Xen config.) > > It looks like we have a bug :-( > > I managed to reproduce the issue. The problem is a
[qemu-mainline test] 156287: regressions - FAIL
flight 156287 qemu-mainline real [real] flight 156312 qemu-mainline real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/156287/ http://logs.test-lab.xenproject.org/osstest/logs/156312/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-arm64-arm64-libvirt-xsm 14 guest-start fail REGR. vs. 152631 test-armhf-armhf-xl-vhd 12 debian-di-installfail REGR. vs. 152631 test-armhf-armhf-libvirt 14 guest-start fail REGR. vs. 152631 test-amd64-amd64-libvirt-vhd 19 guest-start/debian.repeat fail REGR. vs. 152631 test-armhf-armhf-libvirt-raw 12 debian-di-installfail REGR. vs. 152631 test-amd64-amd64-xl-qcow2 21 guest-start/debian.repeat fail REGR. vs. 152631 test-amd64-i386-xl-qemuu-ws16-amd64 12 windows-install fail REGR. vs. 152631 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152631 test-armhf-armhf-xl-rtds 18 guest-start/debian.repeatfail like 152631 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 152631 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152631 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152631 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass version targeted for testing: qemuubbc48d2bcb9711614fbe751c2c5ae13e172fbca8 baseline version: qemuu1d806cef0e38b5db8347a8e12f214d543204a314 Last test of basis 152631 2020-08-20 09:07:46 Z 70 days Failing since152659 2020-08-21 14:07:39 Z 69 days 159 attempts Testing same since 156287 2020-10-29 05:06:49 Z0 days1 attempts People who touched revisions under test: Aaron Lindsay Alberto Garcia Aleksandar Markovic Alex Bennée Alex Williamson Alexander Bulekov Alexey Kirillov Alistair Francis Alistair Francis Amey Narkhede Ana Pazos Andrew Jones Andrey Konovalov Andrey Shinkevich Ani Sinha Anthony PERARD Anton Blanchard Anup Patel Babu Moger BALATON Zoltan Bastian Koppelmann Ben Widawsky Bihong Yu Bin Meng Bruce Rogers Carlo
Re: [PATCH] tools/python: pass more -rpath-link options to ld
On Mon, Oct 19, 2020 at 10:31:37AM +0200, Jan Beulich wrote: > With the split of libraries, I've observed a number of warnings from > (old?) ld. > > Signed-off-by: Jan Beulich > --- > It's unclear to me whether this is ld version dependent - the pattern > of where I've seen such warnings doesn't suggest a clear version > dependency. > > --- a/tools/python/setup.py > +++ b/tools/python/setup.py > @@ -7,10 +7,15 @@ XEN_ROOT = "../.." > extra_compile_args = [ "-fno-strict-aliasing", "-Werror" ] > > PATH_XEN = XEN_ROOT + "/tools/include" > +PATH_LIBXENTOOLCORE = XEN_ROOT + "/tools/libs/toolcore" > PATH_LIBXENTOOLLOG = XEN_ROOT + "/tools/libs/toollog" > +PATH_LIBXENCALL = XEN_ROOT + "/tools/libs/call" > PATH_LIBXENEVTCHN = XEN_ROOT + "/tools/libs/evtchn" > +PATH_LIBXENGNTTAB = XEN_ROOT + "/tools/libs/gnttab" > PATH_LIBXENCTRL = XEN_ROOT + "/tools/libs/ctrl" > PATH_LIBXENGUEST = XEN_ROOT + "/tools/libs/guest" > +PATH_LIBXENDEVICEMODEL = XEN_ROOT + "/tools/libs/devicemodel" > +PATH_LIBXENFOREIGNMEMORY = XEN_ROOT + "/tools/libs/foreignmemory" > PATH_XENSTORE = XEN_ROOT + "/tools/libs/store" > > xc = Extension("xc", > @@ -24,7 +29,13 @@ xc = Extension("xc", > library_dirs = [ PATH_LIBXENCTRL, PATH_LIBXENGUEST ], > libraries = [ "xenctrl", "xenguest" ], > depends= [ PATH_LIBXENCTRL + "/libxenctrl.so", > PATH_LIBXENGUEST + "/libxenguest.so" ], > - extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG > ], > + extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENCALL, > + > "-Wl,-rpath-link="+PATH_LIBXENDEVICEMODEL, > + "-Wl,-rpath-link="+PATH_LIBXENEVTCHN, > + > "-Wl,-rpath-link="+PATH_LIBXENFOREIGNMEMORY, > + "-Wl,-rpath-link="+PATH_LIBXENGNTTAB, > + "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE, > + "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG > ], This basically open-codes SHLIB_libxenctrl + SHLIB_libxenguest. Isn't it better to pass it from make that already has all the dependencies resolved? > sources= [ "xen/lowlevel/xc/xc.c" ]) > > xs = Extension("xs", > @@ -33,6 +44,7 @@ xs = Extension("xs", > library_dirs = [ PATH_XENSTORE ], > libraries = [ "xenstore" ], > depends= [ PATH_XENSTORE + "/libxenstore.so" ], > + extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE > ], > sources= [ "xen/lowlevel/xs/xs.c" ]) > > plat = os.uname()[0] -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing? signature.asc Description: PGP signature
[xen-unstable-smoke test] 156310: tolerable all pass - PUSHED
flight 156310 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/156310/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 6e2ee3dfd660d9fde96243da7d565244b4d2f164 baseline version: xen 26a8fa494f2f323622b6928bd15921b41818f180 Last test of basis 156304 2020-10-29 20:00:26 Z0 days Testing same since 156310 2020-10-30 00:01:28 Z0 days1 attempts People who touched revisions under test: Bertrand Marquis Stefano Stabellini jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 26a8fa494f..6e2ee3dfd6 6e2ee3dfd660d9fde96243da7d565244b4d2f164 -> smoke
BUG: libxl vuart build order
+ xen-devel and libxl maintainers In short, there is a regression in libxl with the ARM vuart introduced by moving ARM guests to the PVH build. On Thu, 29 Oct 2020, Takahiro Akashi wrote: > On Wed, Oct 28, 2020 at 02:44:16PM -0700, Stefano Stabellini wrote: > > On Wed, 28 Oct 2020, Takahiro Akashi wrote: > > > On Tue, Oct 27, 2020 at 09:02:14AM +0900, Takahiro Akashi wrote: > > > > On Mon, Oct 26, 2020 at 04:37:30PM -0700, Stefano Stabellini wrote: > > > > > > > > > > On Mon, 26 Oct 2020, Takahiro Akashi wrote: > > > > > > Stefano, > > > > > > > > > > > > # I'm afraid that I have already bothered you with a lot of > > > > > > questions. > > > > > > > > > > > > When I looked at Xen's vpl011 implementation, I found > > > > > > CR (and LCHR) register is not supported. (trap may cause a data > > > > > > abort). > > > > > > On the other hand, for example, linux's pl011 driver surely > > > > > > accesses CR (and LCHR) register. > > > > > > So I guess that linux won't be able to use pl011 on a Xen guest vm > > > > > > if vuart = "sbsa_uart". > > > > > > > > > > > > Is this a known issue or do I miss anything? > > > > > > > > > > Linux should definitely be able to use it, and in fact, I am using it > > > > > with Linux in my test environment. > > > > > > > > > > I think the confusion comes from the name "vpl011": it is in fact not > > > > > a > > > > > full PL011 UART, but an SBSA UART. > > > > > > > > Yeah, I have noticed it. > > > > > > > > > SBSA UART only implements a subset of > > > > > the PL011 registers. The compatible string is "arm,sbsa-uart", also > > > > > see > > > > > drivers/tty/serial/amba-pl011.c:sbsa_uart_probe. > > > > > > > > Looking closely into the details of implementation, I found > > > > that all the accesses to unimplemented registers, including > > > > CR, are deliberately avoided in sbsa part of linux driver. > > > > > > So I'm now trying to implement "sbsa-uart" driver on U-Boot > > > by modifying the existing pl011 driver. > > > (Please note the current xen'ized U-Boot utilises a para-virtualized > > > console, i.e. with HVM_PARAM_CONSOLE_PFN.) > > > > > > So far my all attempts have failed. > > > > > > There are a couple of problems, and one of them is how we can > > > access vpl011 port (from dom0). > > > What I did is: > > > - modify U-Boot's pl011 driver > > > (I'm sure that the driver correctly handle a vpl011 device > > > with regard of accessing a proper set of registers.) > > > - start U-Boot guest with "vuart=sbsa_uart" by > > > xl create uboot.cfg -c > > > > > > Then I have seen almost nothing on the screen. > > > Digging into vpl011 implementation, I found that all the characters > > > written DR register will be directed to a "backend domain" if a guest > > > vm is launched by xl command. > > > (In case of dom0less, the backend seems to be Xen itself.) > > > > > > As a silly experiment, I modified domain_vpl011_init() to always create > > > a vpl011 device with "backend_in_domain == false". > > > Then, I could see more boot messages from U-Boot, but still fails > > > to use the device as a console, I mean, we will lose all the outputs > > > after at some point and won't be able to type any keys (at a command > > > prompt). > > > (This will be another problem on U-Boot side.) > > > > > > My first question here is how we can configure and connect a console > > > in this case? > > > Should "xl create -c" or "xl console -t vuart" simply work? > > > > "xl create -c" creates a guest and connect to the primary console which > > is the PV console (i.e. HVM_PARAM_CONSOLE_PFN.) > > So in case of vuart, it (console) doesn't work? > (Apparently, "xl create" doesn't take '-t' option.) > > > To connect to the emulated sbsa uart you need to pass -t vuart. So yes, > > "xl console -t vuart domain_name" should get you access to the emulated > > sbsa uart. The sbsa uart can also be exposed to dom0less guests; you get > > their output by using CTRL-AAA to switch to right domU console. > > > > You can add printks to xen/arch/arm/vpl011.c in Xen to see what's > > happening on the Xen side. vpl011.c is the emulator. > > I'm sure that write to "REG_DR" register is caught by Xen. > What I don't understand is > if back_in_domain -> no outputs > if !back_in_domain -> can see outputs > > (As you know, if a guest is created by xl command, back_in_domain > is forcedly set to true.) > > I looked into xenstore and found that "vuart/0/tty" does not exist, > but "console/tty" does. > How can this happen for vuart? > (I clearly specified, vuart = "sbsa_uart" in Xen config.) It looks like we have a bug :-( I managed to reproduce the issue. The problem is a race in libxl. tools/libxc/xc_dom_arm.c:alloc_magic_pages is called first, setting dom->vuart_gfn. Then, libxl__build_hvm should be setting state->vuart_gfn to dom->vuart_gfn (like libxl__build_pv does) but it doesn't. --- libxl: set vuart_gfn in libxl__build_hvm Setting vuart_gfn was missed when switching ARM
[libvirt test] 156290: regressions - FAIL
flight 156290 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/156290/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777 build-arm64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-armhf-libvirt 6 libvirt-buildfail REGR. vs. 151777 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a version targeted for testing: libvirt 1f807631f402210d036ec4803e7adfefa222f786 baseline version: libvirt 2c846fa6bcc11929c9fb857a22430fb9945654ad Last test of basis 151777 2020-07-10 04:19:19 Z 111 days Failing since151818 2020-07-11 04:18:52 Z 110 days 105 attempts Testing same since 156273 2020-10-28 04:19:15 Z1 days2 attempts People who touched revisions under test: Adolfo Jayme Barrientos Andika Triwidada Andrea Bolognani Balázs Meskó Bastien Orivel Bihong Yu Binfeng Wu Boris Fiuczynski Christian Ehrhardt Christian Schoenebeck Cole Robinson Collin Walling Cornelia Huck Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé Erik Skultety Fabian Freyer Fangge Jin Fedora Weblate Translation Halil Pasic Han Han Hao Wang Ian Wienand Jamie Strandboge Jamie Strandboge Jean-Baptiste Holcroft Jianan Gao Jim Fehlig Jin Yan Jiri Denemark Jonathon Jongsma Ján Tomko Kashyap Chamarthy Kevin Locke Laine Stump Liao Pingfang Lin Ma Lin Ma Marc Hartmayer Marek Marczykowski-Górecki Markus Schade Martin Kletzander Masayoshi Mizuma Matt Coleman Matt Coleman Mauro Matteo Cascella Michal Privoznik Michał Smyk Milo Casagrande Neal Gompa Nico Pache Nikolay Shirokovskiy Olesya Gerasimenko Patrick Magauran Paulo de Rezende Pinatti Pavel Hrdina Peter Krempa Pino Toscano Pino Toscano Piotr Drąg Prathamesh Chavan Roman Bogorodskiy Roman Bolshakov Ryan Schmidt Sam Hartman Scott Shambarger Sebastian Mitterle Simon Gaiser Stefan Bader Stefan Berger Szymon Scholz Thomas Huth Tim Wiederhake Tomáš Golembiovský Wang Xin Weblate Yang Hang Yanqiu Zhang Yi Li Yi Wang Yuri Chornoivan Zheng Chuan zhenwei pi Zhenyu Zheng jobs: build-amd64-xsm pass build-arm64-xsm pass build-i386-xsm pass build-amd64 pass build-arm64 pass build-armhf pass build-i386 pass build-amd64-libvirt fail build-arm64-libvirt fail build-armhf-libvirt fail build-i386-libvirt fail build-amd64-pvopspass build-arm64-pvopspass build-armhf-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm blocked test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked test-amd64-amd64-libvirt-xsm blocked test-arm64-arm64-libvirt-xsm
[seabios test] 156285: tolerable FAIL - PUSHED
flight 156285 seabios real [real] http://logs.test-lab.xenproject.org/osstest/logs/156285/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 155839 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 155839 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 155839 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 155839 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 155839 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass version targeted for testing: seabios 94f0510dc75e910400aad6c169048d672c8c7193 baseline version: seabios 58a44be024f69d2e4d2b58553529230abdd3935e Last test of basis 155839 2020-10-15 09:39:29 Z 14 days Testing same since 156285 2020-10-28 19:42:01 Z1 days1 attempts People who touched revisions under test: Alexander Graf jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmpass test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm pass test-amd64-i386-xl-qemuu-debianhvm-i386-xsm pass test-amd64-amd64-qemuu-nested-amdfail test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass test-amd64-amd64-qemuu-freebsd11-amd64 pass test-amd64-amd64-qemuu-freebsd12-amd64 pass test-amd64-amd64-xl-qemuu-win7-amd64 fail test-amd64-i386-xl-qemuu-win7-amd64 fail test-amd64-amd64-xl-qemuu-ws16-amd64 fail test-amd64-i386-xl-qemuu-ws16-amd64 fail test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrictpass test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict pass test-amd64-amd64-qemuu-nested-intel pass test-amd64-i386-qemuu-rhel6hvm-intel pass test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow pass test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/seabios.git 58a44be..94f0510 94f0510dc75e910400aad6c169048d672c8c7193 -> xen-tested-master
Re: [PATCH v2 3/3] xen/arm: Warn user on cpu errata 832075
On Thu, 29 Oct 2020, Bertrand Marquis wrote: > Hi Julien, > > > On 28 Oct 2020, at 18:39, Julien Grall wrote: > > > > Hi Bertrand, > > > > On 26/10/2020 16:21, Bertrand Marquis wrote: > >> When a Cortex A57 processor is affected by CPU errata 832075, a guest > >> not implementing the workaround for it could deadlock the system. > >> Add a warning during boot informing the user that only trusted guests > >> should be executed on the system. > >> An equivalent warning is already given to the user by KVM on cores > >> affected by this errata. > >> Also taint the hypervisor as unsecure when this errata applies and > >> mention Cortex A57 r0p0 - r1p2 as not security supported in SUPPORT.md > >> Signed-off-by: Bertrand Marquis > > > > Reviewed-by: Julien Grall > > Thanks > > > > > If you don't need to resend the series, then I would be happy to fix the > > typo pointed out by George on commit. > > There is only the condensing from Stefano. > If you can handle that on commit to great but if you need me to send a v3 to > make your life easier do not hesitate to tell me I have just done the committing
[xen-unstable-smoke test] 156304: tolerable all pass - PUSHED
flight 156304 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/156304/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 26a8fa494f2f323622b6928bd15921b41818f180 baseline version: xen 1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74 Last test of basis 156297 2020-10-29 14:01:23 Z0 days Testing same since 156304 2020-10-29 20:00:26 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Jan Beulich jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 1fd1d4bafd..26a8fa494f 26a8fa494f2f323622b6928bd15921b41818f180 -> smoke
[xen-4.12-testing test] 156280: regressions - FAIL
flight 156280 xen-4.12-testing real [real] flight 156308 xen-4.12-testing real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/156280/ http://logs.test-lab.xenproject.org/osstest/logs/156308/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-qcow218 guest-saverestore.2 fail REGR. vs. 156035 Tests which did not succeed, but are not blocking: test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156035 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 156035 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156035 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156035 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156035 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156035 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156035 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156035 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 156035 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156035 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156035 test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass version targeted for testing: xen 4100d463dbdd95d85fabe387dd5676bed75f65f7 baseline version: xen 0108b011e133915a8ebd33636811d8c141b6e9f3 Last test of basis 156035 2020-10-20 13:36:02 Z9 days Testing same since 156263 2020-10-27 18:36:53 Z2 days2 attempts People who touched revisions under test: Andrew Cooper jobs:
RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
From: Arvind Sankar > Sent: 29 October 2020 21:35 > > On Thu, Oct 29, 2020 at 09:41:13PM +0100, Thomas Gleixner wrote: > > On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote: > > > On 29/10/20 17:56, Arvind Sankar wrote: > > >>> For those two just add: > > >>> struct apic *apic = x86_system_apic; > > >>> before all the assignments. > > >>> Less churn and much better code. > > >>> > > >> Why would it be better code? > > >> > > > > > > I think he means the compiler produces better code, because it won't > > > read the global variable repeatedly. Not sure if that's true,(*) but I > > > think I do prefer that version if Arnd wants to do that tweak. > > > > It's not true. > > > > foo *p = bar; > > > > p->a = 1; > > p->b = 2; > > > > The compiler is free to reload bar after accessing p->a and with > > > > bar->a = 1; > > bar->b = 1; > > > > it can either cache bar in a register or reread it after bar->a > > > > The generated code is the same as long as there is no reason to reload, > > e.g. register pressure. > > > > Thanks, > > > > tglx > > It's not quite the same. > > https://godbolt.org/z/4dzPbM > > With -fno-strict-aliasing, the compiler reloads the pointer if you write > to the start of what it points to, but not if you write to later > elements. I guess it assumes that global data doesn't overlap. But in general they are sort of opposites: With the local variable it can reload if it knows the write cannot have affected the global - but is unlikely to do so. Using the global it must reload if it is possible the write might have affected the global. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
Hi Alex. [sorry for the possible format issues] > I assume this is wiring up the required bits of support to handle the > IOREQ requests in QEMU? No, it is not related to QEMU. We don't run QEMU in our system. Being honest I have never tried to run it) virtio-disk backend PoC is a completely standalone entity (IOREQ server) which emulates a virtio-mmio disk device. It is based on code from DEMU (for IOREQ server purposes) and some code from kvmtool to implement virtio protocol, disk operations over underlying H/W and Xenbus code to be able to read configuration from the Xenstore (it is configured via domain config file). Last patch in this series (marked as RFC) actually adds required bits to the libxl code. > We are putting together a PoC demo to show > a virtio enabled image (AGL) running on both KVM and Xen hypervisors so > we are keen to see your code as soon as you can share it. Thank you. Sure, I will provide a branch with code by the end of this week. > > I'm currently preparing a patch series for QEMU which fixes the recent > breakage caused by changes to the build system. As part of that I've > separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build > i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to > create a qemu-aarch64-system binary with just the PV related models in > it. > Does it mean that it will be possible to use QEMU in Xen on Arm just for "backend provider" purposes? > > Talking to Stefano it probably makes sense going forward to introduce a > new IOREQ aware machine type for QEMU that doesn't bring in the rest of > the x86 overhead. I was thinking maybe xenvirt? > > You've tested with virtio-block but we'd also like to extend this to > other arbitrary virtio devices. I guess we will need some sort of > mechanism to inform the QEMU command line where each device sits in the > virtio-mmio bus so the FDT Xen delivers to the guest matches up with > what QEMU is ready to serve requests for? > I am sorry, I can't provide ideas here, not familiar with QEMU. But, completely agree, that other virtio devices should be supported. -- Regards, Oleksandr Tyshchenko
[PATCH 09/36] qdev: Make qdev_get_prop_ptr() get Object* arg
Make the code more generic and not specific to TYPE_DEVICE. Signed-off-by: Eduardo Habkost --- Cc: Stefan Berger Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paul Durrant Cc: Kevin Wolf Cc: Max Reitz Cc: Paolo Bonzini Cc: "Daniel P. Berrangé" Cc: Eduardo Habkost Cc: Richard Henderson Cc: David Hildenbrand Cc: Cornelia Huck Cc: Halil Pasic Cc: Christian Borntraeger Cc: Thomas Huth Cc: Matthew Rosato Cc: Alex Williamson Cc: qemu-de...@nongnu.org Cc: xen-devel@lists.xenproject.org Cc: qemu-bl...@nongnu.org Cc: qemu-s3...@nongnu.org --- include/hw/qdev-properties.h | 2 +- backends/tpm/tpm_util.c | 8 ++-- hw/block/xen-block.c | 6 +-- hw/core/qdev-properties-system.c | 57 +- hw/core/qdev-properties.c| 82 +--- hw/s390x/css.c | 5 +- hw/s390x/s390-pci-bus.c | 4 +- hw/vfio/pci-quirks.c | 5 +- 8 files changed, 68 insertions(+), 101 deletions(-) diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h index 0ea822e6a7..0b92cfc761 100644 --- a/include/hw/qdev-properties.h +++ b/include/hw/qdev-properties.h @@ -302,7 +302,7 @@ void qdev_prop_set_macaddr(DeviceState *dev, const char *name, const uint8_t *value); void qdev_prop_set_enum(DeviceState *dev, const char *name, int value); -void *qdev_get_prop_ptr(DeviceState *dev, Property *prop); +void *qdev_get_prop_ptr(Object *obj, Property *prop); void qdev_prop_register_global(GlobalProperty *prop); const GlobalProperty *qdev_find_global_prop(DeviceState *dev, diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c index b58d298c1a..e91c21dd4a 100644 --- a/backends/tpm/tpm_util.c +++ b/backends/tpm/tpm_util.c @@ -35,8 +35,7 @@ static void get_tpm(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); -TPMBackend **be = qdev_get_prop_ptr(dev, opaque); +TPMBackend **be = qdev_get_prop_ptr(obj, opaque); char *p; p = g_strdup(*be ? (*be)->id : ""); @@ -49,7 +48,7 @@ static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque, { DeviceState *dev = DEVICE(obj); Property *prop = opaque; -TPMBackend *s, **be = qdev_get_prop_ptr(dev, prop); +TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop); char *str; if (dev->realized) { @@ -73,9 +72,8 @@ static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque, static void release_tpm(Object *obj, const char *name, void *opaque) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; -TPMBackend **be = qdev_get_prop_ptr(dev, prop); +TPMBackend **be = qdev_get_prop_ptr(obj, prop); if (*be) { tpm_backend_reset(*be); diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c index 8a7a3f5452..1ba9981c08 100644 --- a/hw/block/xen-block.c +++ b/hw/block/xen-block.c @@ -335,9 +335,8 @@ static char *disk_to_vbd_name(unsigned int disk) static void xen_block_get_vdev(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; -XenBlockVdev *vdev = qdev_get_prop_ptr(dev, prop); +XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop); char *str; switch (vdev->type) { @@ -396,9 +395,8 @@ static int vbd_name_to_disk(const char *name, const char **endp, static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; -XenBlockVdev *vdev = qdev_get_prop_ptr(dev, prop); +XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop); char *str, *p; const char *end; diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index d0fb063a49..c8c73c371b 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -59,9 +59,8 @@ static bool check_prop_still_unset(DeviceState *dev, const char *name, static void get_drive(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; -void **ptr = qdev_get_prop_ptr(dev, prop); +void **ptr = qdev_get_prop_ptr(obj, prop); const char *value; char *p; @@ -87,7 +86,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const char *name, { DeviceState *dev = DEVICE(obj); Property *prop = opaque; -void **ptr = qdev_get_prop_ptr(dev, prop); +void **ptr = qdev_get_prop_ptr(obj, prop); char *str; BlockBackend *blk; bool blk_created = false; @@ -185,7 +184,7 @@ static void release_drive(Object *obj, const char *name, void *opaque) { DeviceState *dev = DEVICE(obj); Property *prop = opaque; -BlockBackend **ptr =
[PATCH 14/36] qdev: Move dev->realized check to qdev_property_set()
Every single qdev property setter function manually checks dev->realized. We can just check dev->realized inside qdev_property_set() instead. The check is being added as a separate function (qdev_prop_allow_set()) because it will become a callback later. Signed-off-by: Eduardo Habkost --- Cc: Stefan Berger Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paul Durrant Cc: Kevin Wolf Cc: Max Reitz Cc: Paolo Bonzini Cc: "Daniel P. Berrangé" Cc: Eduardo Habkost Cc: Cornelia Huck Cc: Halil Pasic Cc: Christian Borntraeger Cc: Thomas Huth Cc: Richard Henderson Cc: David Hildenbrand Cc: Matthew Rosato Cc: Alex Williamson Cc: Mark Cave-Ayland Cc: Artyom Tarasenko Cc: qemu-de...@nongnu.org Cc: xen-devel@lists.xenproject.org Cc: qemu-bl...@nongnu.org Cc: qemu-s3...@nongnu.org --- backends/tpm/tpm_util.c | 6 -- hw/block/xen-block.c | 5 -- hw/core/qdev-properties-system.c | 64 --- hw/core/qdev-properties.c| 106 ++- hw/s390x/css.c | 6 -- hw/s390x/s390-pci-bus.c | 6 -- hw/vfio/pci-quirks.c | 6 -- target/sparc/cpu.c | 6 -- 8 files changed, 18 insertions(+), 187 deletions(-) diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c index e91c21dd4a..042cacfcca 100644 --- a/backends/tpm/tpm_util.c +++ b/backends/tpm/tpm_util.c @@ -46,16 +46,10 @@ static void get_tpm(Object *obj, Visitor *v, const char *name, void *opaque, static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop); char *str; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c index 1ba9981c08..bd1aef63a7 100644 --- a/hw/block/xen-block.c +++ b/hw/block/xen-block.c @@ -400,11 +400,6 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name, char *str, *p; const char *end; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index fca1b694ca..60a45f5620 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -92,11 +92,6 @@ static void set_drive_helper(Object *obj, Visitor *v, const char *name, bool blk_created = false; int ret; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } @@ -228,17 +223,11 @@ static void get_chr(Object *obj, Visitor *v, const char *name, void *opaque, static void set_chr(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; CharBackend *be = qdev_get_prop_ptr(obj, prop); Chardev *s; char *str; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } @@ -309,18 +298,12 @@ static void get_mac(Object *obj, Visitor *v, const char *name, void *opaque, static void set_mac(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; MACAddr *mac = qdev_get_prop_ptr(obj, prop); int i, pos; char *str; const char *p; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } @@ -388,7 +371,6 @@ static void get_netdev(Object *obj, Visitor *v, const char *name, static void set_netdev(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque; NICPeers *peers_ptr = qdev_get_prop_ptr(obj, prop); NetClientState **ncs = peers_ptr->ncs; @@ -396,11 +378,6 @@ static void set_netdev(Object *obj, Visitor *v, const char *name, int queues, err = 0, i = 0; char *str; -if (dev->realized) { -qdev_prop_set_after_realize(dev, name, errp); -return; -} - if (!visit_type_str(v, name, , errp)) { return; } @@ -467,18 +444,12 @@ static void get_audiodev(Object *obj, Visitor *v, const char* name, static void set_audiodev(Object *obj, Visitor *v, const char* name, void *opaque, Error **errp) { -DeviceState *dev = DEVICE(obj); Property *prop = opaque;
[PATCH 25/36] qdev: Rename qdev_get_prop_ptr() to object_static_prop_ptr()
The function will be moved to common QOM code, as it is not specific to TYPE_DEVICE anymore. Signed-off-by: Eduardo Habkost --- Cc: Stefan Berger Cc: Stefano Stabellini Cc: Anthony Perard Cc: Paul Durrant Cc: Kevin Wolf Cc: Max Reitz Cc: Paolo Bonzini Cc: "Daniel P. Berrangé" Cc: Eduardo Habkost Cc: Cornelia Huck Cc: Halil Pasic Cc: Christian Borntraeger Cc: Richard Henderson Cc: David Hildenbrand Cc: Thomas Huth Cc: Matthew Rosato Cc: Alex Williamson Cc: qemu-de...@nongnu.org Cc: xen-devel@lists.xenproject.org Cc: qemu-bl...@nongnu.org Cc: qemu-s3...@nongnu.org --- include/hw/qdev-properties.h | 2 +- backends/tpm/tpm_util.c | 6 +-- hw/block/xen-block.c | 4 +- hw/core/qdev-properties-system.c | 46 +++ hw/core/qdev-properties.c| 64 hw/s390x/css.c | 4 +- hw/s390x/s390-pci-bus.c | 4 +- hw/vfio/pci-quirks.c | 4 +- 8 files changed, 67 insertions(+), 67 deletions(-) diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h index 0acc92ae2b..4146dac281 100644 --- a/include/hw/qdev-properties.h +++ b/include/hw/qdev-properties.h @@ -332,7 +332,7 @@ void qdev_prop_set_macaddr(DeviceState *dev, const char *name, const uint8_t *value); void qdev_prop_set_enum(DeviceState *dev, const char *name, int value); -void *qdev_get_prop_ptr(Object *obj, Property *prop); +void *object_static_prop_ptr(Object *obj, Property *prop); void qdev_prop_register_global(GlobalProperty *prop); const GlobalProperty *qdev_find_global_prop(Object *obj, diff --git a/backends/tpm/tpm_util.c b/backends/tpm/tpm_util.c index 042cacfcca..2b5f788861 100644 --- a/backends/tpm/tpm_util.c +++ b/backends/tpm/tpm_util.c @@ -35,7 +35,7 @@ static void get_tpm(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { -TPMBackend **be = qdev_get_prop_ptr(obj, opaque); +TPMBackend **be = object_static_prop_ptr(obj, opaque); char *p; p = g_strdup(*be ? (*be)->id : ""); @@ -47,7 +47,7 @@ static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { Property *prop = opaque; -TPMBackend *s, **be = qdev_get_prop_ptr(obj, prop); +TPMBackend *s, **be = object_static_prop_ptr(obj, prop); char *str; if (!visit_type_str(v, name, , errp)) { @@ -67,7 +67,7 @@ static void set_tpm(Object *obj, Visitor *v, const char *name, void *opaque, static void release_tpm(Object *obj, const char *name, void *opaque) { Property *prop = opaque; -TPMBackend **be = qdev_get_prop_ptr(obj, prop); +TPMBackend **be = object_static_prop_ptr(obj, prop); if (*be) { tpm_backend_reset(*be); diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c index bd1aef63a7..20985c465a 100644 --- a/hw/block/xen-block.c +++ b/hw/block/xen-block.c @@ -336,7 +336,7 @@ static void xen_block_get_vdev(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { Property *prop = opaque; -XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop); +XenBlockVdev *vdev = object_static_prop_ptr(obj, prop); char *str; switch (vdev->type) { @@ -396,7 +396,7 @@ static void xen_block_set_vdev(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { Property *prop = opaque; -XenBlockVdev *vdev = qdev_get_prop_ptr(obj, prop); +XenBlockVdev *vdev = object_static_prop_ptr(obj, prop); char *str, *p; const char *end; diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index d9355053d2..448d77ecab 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -60,7 +60,7 @@ static void get_drive(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { Property *prop = opaque; -void **ptr = qdev_get_prop_ptr(obj, prop); +void **ptr = object_static_prop_ptr(obj, prop); const char *value; char *p; @@ -86,7 +86,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const char *name, { DeviceState *dev = DEVICE(obj); Property *prop = opaque; -void **ptr = qdev_get_prop_ptr(obj, prop); +void **ptr = object_static_prop_ptr(obj, prop); char *str; BlockBackend *blk; bool blk_created = false; @@ -179,7 +179,7 @@ static void release_drive(Object *obj, const char *name, void *opaque) { DeviceState *dev = DEVICE(obj); Property *prop = opaque; -BlockBackend **ptr = qdev_get_prop_ptr(obj, prop); +BlockBackend **ptr = object_static_prop_ptr(obj, prop); if (*ptr) { AioContext *ctx = blk_get_aio_context(*ptr); @@ -212,7 +212,7 @@ const PropertyInfo qdev_prop_drive_iothread = { static void get_chr(Object *obj, Visitor *v, const char *name, void
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
On Thu, Oct 29, 2020 at 09:41:13PM +0100, Thomas Gleixner wrote: > On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote: > > On 29/10/20 17:56, Arvind Sankar wrote: > >>> For those two just add: > >>> struct apic *apic = x86_system_apic; > >>> before all the assignments. > >>> Less churn and much better code. > >>> > >> Why would it be better code? > >> > > > > I think he means the compiler produces better code, because it won't > > read the global variable repeatedly. Not sure if that's true,(*) but I > > think I do prefer that version if Arnd wants to do that tweak. > > It's not true. > > foo *p = bar; > > p->a = 1; > p->b = 2; > > The compiler is free to reload bar after accessing p->a and with > > bar->a = 1; > bar->b = 1; > > it can either cache bar in a register or reread it after bar->a > > The generated code is the same as long as there is no reason to reload, > e.g. register pressure. > > Thanks, > > tglx It's not quite the same. https://godbolt.org/z/4dzPbM With -fno-strict-aliasing, the compiler reloads the pointer if you write to the start of what it points to, but not if you write to later elements.
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote: > Hi Stefano > > [sorry for the possible format issue] > > On Thu, Oct 29, 2020 at 9:53 PM Stefano Stabellini > wrote: > On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote: > > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu > wrote: > > Hi Oleksandr, > > > > Hi Masami > > > > [sorry for the possible format issue] > > > > > > I would like to try this on my arm64 board. > > > > Glad to hear you are interested in this topic. > > > > > > According to your comments in the patch, I made this config > file. > > # cat debian.conf > > name = "debian" > > type = "pvh" > > vcpus = 8 > > memory = 512 > > kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" > > ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" > > cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" > > disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] > > vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] > > virtio = 1 > > vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] > > > > And tried to boot a DomU, but I got below error. > > > > # xl create -c debian.conf > > Parsing config from debian.conf > > libxl: error: libxl_create.c:1863:domcreate_attach_devices: > Domain > > 1:unable to add virtio_disk devices > > libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain > > 1:xc_domain_pause failed > > libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get > domain > > type for domid=1 > > libxl: error: libxl_domain.c:1136:domain_destroy_callback: > Domain > > 1:Unable to destroy guest > > libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > > 1:Destruction of domain failed > > > > > > Could you tell me how can I test it? > > > > > > I assume it is due to the lack of the virtio-disk backend (which I > haven't shared yet as I focused on the IOREQ/DM support on > Arm in the > > first place). > > Could you wait a little bit, I am going to share it soon. > > Do you have a quick-and-dirty hack you can share in the meantime? Even > just on github as a special branch? It would be very useful to be able > to have a test-driver for the new feature. > > Well, I will provide a branch on github with our PoC virtio-disk backend by > the end of this week. It will be possible to test this series > with it. Very good, thank you!
Re: Xen on RP4
On Wed, Oct 28, 2020 at 05:37:02PM -0700, Stefano Stabellini wrote: > On Tue, 27 Oct 2020, Elliott Mitchell wrote: > > On Mon, Oct 26, 2020 at 06:44:27PM +, Julien Grall wrote: > > > On 26/10/2020 16:03, Elliott Mitchell wrote: > > > > On Mon, Oct 26, 2020 at 01:31:42PM +, Julien Grall wrote: > > > >> On 24/10/2020 06:35, Elliott Mitchell wrote: > > > >>> ACPI has a distinct > > > >>> means of specifying a limited DMA-width; the above fails, because it > > > >>> assumes a *device-tree*. > > > >> > > > >> Do you know if it would be possible to infer from the ACPI static table > > > >> the DMA-width? > > > > > > > > Yes, and it is. Due to not knowing much about ACPI tables I don't know > > > > what the C code would look like though (problem is which documentation > > > > should I be looking at first?). > > > > > > What you provided below is an excerpt of the DSDT. AFAIK, DSDT content > > > is written in AML. So far the shortest implementation I have seen for > > > the AML parser is around 5000 lines (see [1]). It might be possible to > > > strip some the code, although I think this will still probably too big > > > for a single workaround. > > > > > > What I meant by "static table" is a table that looks like a structure > > > and can be parsed in a few lines. If we can't find on contain the DMA > > > window, then the next best solution is to find a way to identity the > > > platform. > > > > > > I don't know enough ACPI to know if this solution is possible. A good > > > starter would probably be the ACPI spec [2]. > > > > Okay, that is worse than I had thought (okay, ACPI is impressively > > complex for something nominally firmware-level). > > > > There are strings in the present Tianocore implementation for Raspberry > > PI 4B which could be targeted. Notably included in the output during > > boot listing the tables, "RPIFDN", "RPIFDN RPI" and "RPIFDN RPI4" (I'm > > unsure how kosher these are as this wsn't implemented nor blessed by the > > Raspberry PI Foundation). > > > > I strongly dislike this approach as you soon turn the Xen project into a > > database of hardware. This is already occurring with > > xen/arch/arm/platforms and I would love to do something about this. On > > that thought, how about utilizing Xen's command-line for this purpose? > > I don't think that a command line option is a good idea: basically it is > punting to users the task of platform detection. Also, it means that > users will be necessarily forced to edit the uboot script or grub > configuration file to boot. -EINVAL Many Linux installations (near universal on desktop/server, but may be uncommon on ARM servers) Xen's command-line comes from grub.cfg. grub.cfg is in turn created by a series of scripts with several places for users to modify configuration without breaking things. The scripts which create grub.cfg could add a "dma_mem=" option to Xen's command-line based upon what the running kernel reports. If the kernel is running on top of Xen, it will still be able to retrieve this information out of ACPI. This does mean distributions would need to modify scripts, but that is doable. This also means a dumb user could potentially jump in, modify the value and thus cause unrecoverable breakage. Yet on the flip side this also allows for the short-term stop-gap of smart users modifying the option as appropriate for new hardware. Certainly it may not be the greatest approach, but it isn't as bad as you're claiming. > Note that even if we introduced a new command line, we wouldn't take > away the need for xen/arch/arm/platforms anyway. Perhaps, but it could allow for this setting at least to be moved to somewhere outside of Xen. I'm inclined to agree with Juergen Gro??, this reads kind of like having an extra domain run during Xen's initialization which can talk ACPI. > > Have a procedure of during installation/updates retrieve DMA limitation > > information from the running OS and the following boot Xen will follow > > the appropriate setup. By its nature, Domain 0 will have the information > > needed, just becomes an issue of how hard that is to retrieve... > > Historically that is what we used to do for many things related to ACPI, > but unfortunately it leads to a pretty bad architecture where Xen > depends on Dom0 for booting rather than the other way around. (Dom0 > should be the one requiring Xen for booting, given that Xen is higher > privilege and boots first.) > > > I think the best compromise is still to use an ACPI string to detect the > platform. For instance, would it be possible to use the OEMID fields in > RSDT, XSDT, FADT? Possibly even a combination of them? > > Another option might be to get the platform name from UEFI somehow. I included appropriate strings in e-mail. Suitable strings do appear in `dmesg`. Problem is this feels like you're hard-coding a fixed list of platforms Xen can run on. Instead values like these should be provided by firmware. ACPI
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
Hi Stefano [sorry for the possible format issue] On Thu, Oct 29, 2020 at 9:53 PM Stefano Stabellini wrote: > On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote: > > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu < > masami.hirama...@linaro.org> wrote: > > Hi Oleksandr, > > > > Hi Masami > > > > [sorry for the possible format issue] > > > > > > I would like to try this on my arm64 board. > > > > Glad to hear you are interested in this topic. > > > > > > According to your comments in the patch, I made this config file. > > # cat debian.conf > > name = "debian" > > type = "pvh" > > vcpus = 8 > > memory = 512 > > kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" > > ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" > > cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" > > disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] > > vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] > > virtio = 1 > > vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] > > > > And tried to boot a DomU, but I got below error. > > > > # xl create -c debian.conf > > Parsing config from debian.conf > > libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain > > 1:unable to add virtio_disk devices > > libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain > > 1:xc_domain_pause failed > > libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get > domain > > type for domid=1 > > libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain > > 1:Unable to destroy guest > > libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > > 1:Destruction of domain failed > > > > > > Could you tell me how can I test it? > > > > > > I assume it is due to the lack of the virtio-disk backend (which I > haven't shared yet as I focused on the IOREQ/DM support on Arm in the > > first place). > > Could you wait a little bit, I am going to share it soon. > > Do you have a quick-and-dirty hack you can share in the meantime? Even > just on github as a special branch? It would be very useful to be able > to have a test-driver for the new feature. Well, I will provide a branch on github with our PoC virtio-disk backend by the end of this week. It will be possible to test this series with it. -- Regards, Oleksandr Tyshchenko
Re: Ping: [PATCH v3 0/3] x86: shim building adjustments (plus shadow follow-on)
At 14:40 +0100 on 29 Oct (1603982415), Jan Beulich wrote: > Tim, > > unless you tell me otherwise I'm intending to commit the latter > two with Roger's acks some time next week. Of course it would > still be nice to know your view on the first of the TBDs in > patch 3 (which may result in further changes, in a follow-up). Ack, sorry for the dropped patches, and thank you for the ping. I've now replied to everything that I think is wating for my review. Cheers, Tim.
Re: [PATCH] x86/shadow: correct GFN use by sh_unshadow_for_p2m_change()
At 16:42 +0100 on 28 Oct (1603903365), Jan Beulich wrote: > Luckily sh_remove_all_mappings()'s use of the parameter is limited to > generation of log messages. Nevertheless we'd better pass correct GFNs > around: > - the incoming GFN, when replacing a large page, may not be large page > aligned, > - incrementing by page-size-scaled values can't be right. > > Signed-off-by: Jan Beulich Reviewed-by: Tim Deegan Thanks! Tim.
Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook
At 10:24 +0100 on 28 Oct (1603880693), Jan Beulich wrote: > Fair parts of the present handlers are identical; in fact > nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move > common parts right into write_p2m_entry(), splitting the hooks into a > "pre" one (needed just by shadow code) and a "post" one. > > For the common parts moved I think that the p2m_flush_nestedp2m() is, > at least from an abstract perspective, also applicable in the shadow > case. Hence it doesn't get a 3rd hook put in place. > > The initial comment that was in hap_write_p2m_entry() gets dropped: Its > placement was bogus, and looking back the the commit introducing it > (dd6de3ab9985 "Implement Nested-on-Nested") I can't see either what use > of a p2m it was meant to be associated with. > > Signed-off-by: Jan Beulich This seems like a good approach to me. I'm happy with the shadow parts but am not confident enough on nested p2m any more to have an opinion on that side. Acked-by: Tim Deegan
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
On Thu, Oct 29 2020 at 17:59, Paolo Bonzini wrote: > On 29/10/20 17:56, Arvind Sankar wrote: >>> For those two just add: >>> struct apic *apic = x86_system_apic; >>> before all the assignments. >>> Less churn and much better code. >>> >> Why would it be better code? >> > > I think he means the compiler produces better code, because it won't > read the global variable repeatedly. Not sure if that's true,(*) but I > think I do prefer that version if Arnd wants to do that tweak. It's not true. foo *p = bar; p->a = 1; p->b = 2; The compiler is free to reload bar after accessing p->a and with bar->a = 1; bar->b = 1; it can either cache bar in a register or reread it after bar->a The generated code is the same as long as there is no reason to reload, e.g. register pressure. Thanks, tglx
Re: [PATCH 2/5] x86/p2m: collapse the two ->write_p2m_entry() hooks
At 10:22 +0100 on 28 Oct (1603880578), Jan Beulich wrote: > The struct paging_mode instances get set to the same functions > regardless of mode by both HAP and shadow code, hence there's no point > having this hook there. The hook also doesn't need moving elsewhere - we > can directly use struct p2m_domain's. This merely requires (from a > strictly formal pov; in practice this may not even be needed) making > sure we don't end up using safe_write_pte() for nested P2Ms. > > Signed-off-by: Jan Beulich Acked-by: Tim Deegan
Re: [PATCH v3 3/3] x86/shadow: sh_{make,destroy}_monitor_table() are "even more" HVM-only
At 10:45 +0200 on 19 Oct (1603104300), Jan Beulich wrote: > With them depending on just the number of shadow levels, there's no need > for more than one instance of them, and hence no need for any hook (IOW > 452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite > far enough). Move the functions to hvm.c while dropping the dead > is_pv_32bit_domain() code paths. > > While moving the code, replace a stale comment reference to > sh_install_xen_entries_in_l4(). Doing so made me notice the function > also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm: > Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which > gets done here as well. > > Signed-off-by: Jan Beulich Acked-by: Tim Deegan > TBD: In principle both functions could have their first parameter > constified. In fact, "destroy" doesn't depend on the vCPU at all > and hence could be passed a struct domain *. Not sure whether such > an asymmetry would be acceptable. > In principle "make" would also not need passing of the number of > shadow levels (can be derived from v), which would result in yet > another asymmetry. > If these asymmetries were acceptable, "make" could then also update > v->arch.hvm.monitor_table, instead of doing so at both call sites. Feel free to add consts, but please don't change the function parameters any more than that. I would rather keep them as a matched pair, and leave the hvm.monitor_table updates in the caller, where it's easier to see why they're not symmetrical. Cheers Tim.
Re: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver
On Thu, 29 Oct 2020, Bertrand Marquis wrote: > > On 28 Oct 2020, at 19:12, Julien Grall wrote: > > On 26/10/2020 11:03, Rahul Singh wrote: > >> Hello Julien, > >>> On 23 Oct 2020, at 4:19 pm, Julien Grall wrote: > >>> On 23/10/2020 15:27, Rahul Singh wrote: > Hello Julien, > > On 23 Oct 2020, at 2:00 pm, Julien Grall wrote: > > On 23/10/2020 12:35, Rahul Singh wrote: > >> Hello, > >>> On 23 Oct 2020, at 1:02 am, Stefano Stabellini > >>> wrote: > >>> > >>> On Thu, 22 Oct 2020, Julien Grall wrote: > >> On 20/10/2020 16:25, Rahul Singh wrote: > >>> Add support for ARM architected SMMUv3 implementations. It is > >>> based on > >>> the Linux SMMUv3 driver. > >>> Major differences between the Linux driver are as follows: > >>> 1. Only Stage-2 translation is supported as compared to the Linux > >>> driver > >>>that supports both Stage-1 and Stage-2 translations. > >>> 2. Use P2M page table instead of creating one as SMMUv3 has the > >>>capability to share the page tables with the CPU. > >>> 3. Tasklets is used in place of threaded IRQ's in Linux for event > >>> queue > >>>and priority queue IRQ handling. > >> > >> Tasklets are not a replacement for threaded IRQ. In particular, > >> they will > >> have priority over anything else (IOW nothing will run on the pCPU > >> until > >> they are done). > >> > >> Do you know why Linux is using thread. Is it because of long > >> running > >> operations? > > > > Yes you are right because of long running operations Linux is using > > the > > threaded IRQs. > > > > SMMUv3 reports fault/events bases on memory-based circular buffer > > queues not > > based on the register. As per my understanding, it is > > time-consuming to > > process the memory based queues in interrupt context because of > > that Linux > > is using threaded IRQ to process the faults/events from SMMU. > > > > I didn’t find any other solution in XEN in place of tasklet to > > defer the > > work, that’s why I used tasklet in XEN in replacement of threaded > > IRQs. If > > we do all work in interrupt context we will make XEN less > > responsive. > > So we need to make sure that Xen continue to receives interrupts, > but we also > need to make sure that a vCPU bound to the pCPU is also responsive. > > > > > If you know another solution in XEN that will be used to defer the > > work in > > the interrupt please let me know I will try to use that. > > One of my work colleague encountered a similar problem recently. He > had a long > running tasklet and wanted to be broken down in smaller chunk. > > We decided to use a timer to reschedule the taslket in the future. > This allows > the scheduler to run other loads (e.g. vCPU) for some time. > > This is pretty hackish but I couldn't find a better solution as > tasklet have > high priority. > > Maybe the other will have a better idea. > >>> > >>> Julien's suggestion is a good one. > >>> > >>> But I think tasklets can be configured to be called from the > >>> idle_loop, > >>> in which case they are not run in interrupt context? > >>> > >> Yes you are right tasklet will be scheduled from the idle_loop that > >> is not interrupt conext. > > > > This depends on your tasklet. Some will run from the softirq context > > which is usually (for Arm) on the return of an exception. > > > Thanks for the info. I will check and will get better understanding of > the tasklet how it will run in XEN. > >>> > >>> 4. Latest version of the Linux SMMUv3 code implements the > >>> commands queue > >>>access functions based on atomic operations implemented in > >>> Linux. > >> > >> Can you provide more details? > > > > I tried to port the latest version of the SMMUv3 code than I > > observed that > > in order to port that code I have to also port atomic operation > > implemented > > in Linux to XEN. As latest Linux code uses atomic operation to > > process the > > command queues > > (atomic_cond_read_relaxed(),atomic_long_cond_read_relaxed() , > > atomic_fetch_andnot_relaxed()) . > > Thank you for the explanation. I think it would be best to import > the atomic > helpers and use the latest code. > > This will ensure
Re: [PATCH v3 2/3] x86/shadow: refactor shadow_vram_{get,put}_l1e()
At 10:44 +0200 on 19 Oct (1603104271), Jan Beulich wrote: > By passing the functions an MFN and flags, only a single instance of > each is needed; they were pretty large for being inline functions > anyway. > > While moving the code, also adjust coding style and add const where > sensible / possible. > > Signed-off-by: Jan Beulich Acked-by: Tim Deegan
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
On Thu, 29 Oct 2020, Alex Bennée wrote: > Oleksandr Tyshchenko writes: > > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu < > > masami.hirama...@linaro.org> wrote: > > > >> Hi Oleksandr, > >> > > Hi Masami > > > > [sorry for the possible format issue] > > > > > >> > >> I would like to try this on my arm64 board. > >> > > Glad to hear you are interested in this topic. > > > > > >> > >> According to your comments in the patch, I made this config file. > >> # cat debian.conf > >> name = "debian" > >> type = "pvh" > >> vcpus = 8 > >> memory = 512 > >> kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" > >> ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" > >> cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" > >> disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] > >> vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] > >> virtio = 1 > >> vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] > >> > >> And tried to boot a DomU, but I got below error. > >> > >> # xl create -c debian.conf > >> Parsing config from debian.conf > >> libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain > >> 1:unable to add virtio_disk devices > >> libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain > >> 1:xc_domain_pause failed > >> libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain > >> type for domid=1 > >> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain > >> 1:Unable to destroy guest > >> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > >> 1:Destruction of domain failed > >> > >> > >> Could you tell me how can I test it? > >> > > > > I assume it is due to the lack of the virtio-disk backend (which I haven't > > shared yet as I focused on the IOREQ/DM support on Arm in the first place). > > Could you wait a little bit, I am going to share it soon. > > I assume this is wiring up the required bits of support to handle the > IOREQ requests in QEMU? We are putting together a PoC demo to show > a virtio enabled image (AGL) running on both KVM and Xen hypervisors so > we are keen to see your code as soon as you can share it. > > I'm currently preparing a patch series for QEMU which fixes the recent > breakage caused by changes to the build system. As part of that I've > separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build > i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to > create a qemu-aarch64-system binary with just the PV related models in > it. > > Talking to Stefano it probably makes sense going forward to introduce a > new IOREQ aware machine type for QEMU that doesn't bring in the rest of > the x86 overhead. I was thinking maybe xenvirt? > > You've tested with virtio-block but we'd also like to extend this to > other arbitrary virtio devices. I guess we will need some sort of > mechanism to inform the QEMU command line where each device sits in the > virtio-mmio bus so the FDT Xen delivers to the guest matches up with > what QEMU is ready to serve requests for? That would be great. As a reference, given that the domU memory mapping layout is fixed, on x86 we just made sure that QEMU's idea of where the devices are is the same of Xen's. What you are suggesting would make it much more flexible and clearer.
Re: [PATCH] meson.build: fix building of Xen support for aarch64
On Thu, 29 Oct 2020, Alex Bennée wrote: > > On Wed, 28 Oct 2020, Alex Bennée wrote: > >> Xen is supported on aarch64 although weirdly using the i386-softmmu > >> model. Checking based on the host CPU meant we never enabled Xen > >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to > >> make it not seem weird but that will require further build surgery. > >> > >> Signed-off-by: Alex Bennée > >> Cc: Masami Hiramatsu > >> Cc: Stefano Stabellini > >> Cc: Anthony Perard > >> Cc: Paul Durrant > >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson") > >> --- > >> meson.build | 2 ++ > >> 1 file changed, 2 insertions(+) > >> > >> diff --git a/meson.build b/meson.build > >> index 835424999d..f1fcbfed4c 100644 > >> --- a/meson.build > >> +++ b/meson.build > >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64'] > >> 'CONFIG_HVF': ['x86_64-softmmu'], > >> 'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'], > >>} > >> +elif cpu in [ 'arm', 'aarch64' ] > >> + accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] } > >> endif > > > > This looks very reasonable -- the patch makes sense. > > > > > > However I have two questions, mostly for my own understanding. I tried > > to repro the aarch64 build problem but it works at my end, even without > > this patch. > > Building on a x86 host or with cross compiler? native build (qemu-user) > > I wonder why. I suspect it works thanks to these lines in > > meson.build: > > > > if not get_option('xen').disabled() and 'CONFIG_XEN_BACKEND' in > > config_host > > accelerators += 'CONFIG_XEN' > > have_xen_pci_passthrough = not > > get_option('xen_pci_passthrough').disabled() and targetos == 'linux' > > else > > have_xen_pci_passthrough = false > > endif > > > > But I am not entirely sure who is adding CONFIG_XEN_BACKEND to > > config_host. > > The is part of the top level configure check - which basically checks > for --enable-xen or autodetects the presence of the userspace libraries. > I'm not sure if we have a slight over proliferation of symbols for Xen > support (although I'm about to add more). > > > The other question is: does it make sense to print the value of > > CONFIG_XEN as part of the summary? Something like: > > > > diff --git a/meson.build b/meson.build > > index 47e32e1fcb..c6e7832dc9 100644 > > --- a/meson.build > > +++ b/meson.build > > @@ -2070,6 +2070,7 @@ summary_info += {'KVM support': > > config_all.has_key('CONFIG_KVM')} > > summary_info += {'HAX support': config_all.has_key('CONFIG_HAX')} > > summary_info += {'HVF support': config_all.has_key('CONFIG_HVF')} > > summary_info += {'WHPX support': config_all.has_key('CONFIG_WHPX')} > > +summary_info += {'XEN support': config_all.has_key('CONFIG_XEN')} > > summary_info += {'TCG support': config_all.has_key('CONFIG_TCG')} > > if config_all.has_key('CONFIG_TCG') > >summary_info += {'TCG debug enabled': > > config_host.has_key('CONFIG_DEBUG_TCG')} > > > > > > But I realize there is already: > > > > summary_info += {'xen support': > > config_host.has_key('CONFIG_XEN_BACKEND')} > > > > so it would be a bit of a duplicate > > Hmm so what we have is: > > CONFIG_XEN_BACKEND > - ensures that appropriate compiler flags are added > - pegs RAM_ADDR_MAX at UINT64_MAX (instead of UINTPTR_MAX) > CONFIG_XEN > - which controls a bunch of build objects, some of which may be i386 only? > ./accel/meson.build:15:specific_ss.add_all(when: ['CONFIG_XEN'], if_true: > dummy_ss) > ./accel/stubs/meson.build:2:specific_ss.add(when: 'CONFIG_XEN', if_false: > files('xen-stub.c')) > ./accel/xen/meson.build:1:specific_ss.add(when: 'CONFIG_XEN', if_true: > files('xen-all.c')) > ./hw/9pfs/meson.build:17:fs_ss.add(when: 'CONFIG_XEN', if_true: > files('xen-9p-backend.c')) > ./hw/block/dataplane/meson.build:2:specific_ss.add(when: 'CONFIG_XEN', > if_true: files('xen-block.c')) > ./hw/block/meson.build:14:softmmu_ss.add(when: 'CONFIG_XEN', if_true: > files('xen-block.c')) > ./hw/char/meson.build:23:softmmu_ss.add(when: 'CONFIG_XEN', if_true: > files('xen_console.c')) > ./hw/display/meson.build:18:softmmu_ss.add(when: 'CONFIG_XEN', if_true: > files('xenfb.c')) > ./hw/i386/xen/meson.build:1:i386_ss.add(when: 'CONFIG_XEN', if_true: > files('xen-hvm.c', > > 'xen-mapcache.c', > > 'xen_apic.c', > > 'xen_platform.c', > > 'xen_pvdevice.c') > ./hw/net/meson.build:2:softmmu_ss.add(when: 'CONFIG_XEN', if_true: > files('xen_nic.c')) > ./hw/usb/meson.build:76:softmmu_ss.add(when: ['CONFIG_USB', 'CONFIG_XEN', > libusb], if_true: files('xen-usb.c')) >
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
Oleksandr Tyshchenko writes: > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu < > masami.hirama...@linaro.org> wrote: > >> Hi Oleksandr, >> > Hi Masami > > [sorry for the possible format issue] > > >> >> I would like to try this on my arm64 board. >> > Glad to hear you are interested in this topic. > > >> >> According to your comments in the patch, I made this config file. >> # cat debian.conf >> name = "debian" >> type = "pvh" >> vcpus = 8 >> memory = 512 >> kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" >> ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" >> cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" >> disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] >> vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] >> virtio = 1 >> vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] >> >> And tried to boot a DomU, but I got below error. >> >> # xl create -c debian.conf >> Parsing config from debian.conf >> libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain >> 1:unable to add virtio_disk devices >> libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain >> 1:xc_domain_pause failed >> libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain >> type for domid=1 >> libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain >> 1:Unable to destroy guest >> libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain >> 1:Destruction of domain failed >> >> >> Could you tell me how can I test it? >> > > I assume it is due to the lack of the virtio-disk backend (which I haven't > shared yet as I focused on the IOREQ/DM support on Arm in the first place). > Could you wait a little bit, I am going to share it soon. I assume this is wiring up the required bits of support to handle the IOREQ requests in QEMU? We are putting together a PoC demo to show a virtio enabled image (AGL) running on both KVM and Xen hypervisors so we are keen to see your code as soon as you can share it. I'm currently preparing a patch series for QEMU which fixes the recent breakage caused by changes to the build system. As part of that I've separated CONFIG_XEN and CONFIG_XEN_HVM so it's possible to build i386-softmmu with unneeded for ARM bits. Hopefully this will allow me to create a qemu-aarch64-system binary with just the PV related models in it. Talking to Stefano it probably makes sense going forward to introduce a new IOREQ aware machine type for QEMU that doesn't bring in the rest of the x86 overhead. I was thinking maybe xenvirt? You've tested with virtio-block but we'd also like to extend this to other arbitrary virtio devices. I guess we will need some sort of mechanism to inform the QEMU command line where each device sits in the virtio-mmio bus so the FDT Xen delivers to the guest matches up with what QEMU is ready to serve requests for? -- Alex Bennée
Re: [PATCH] x86/hvm: process softirq while saving/loading entries
On Thu, Oct 29, 2020 at 05:38:17PM +0100, Jan Beulich wrote: > On 29.10.2020 16:20, Roger Pau Monne wrote: > > On slow systems with sync_console saving or loading the context of big > > guests can cause the watchdog to trigger. Fix this by adding a couple > > of process_pending_softirqs. > > Which raises the question in how far this is then also a problem > for the caller of the underlying hypercall. IOW I wonder whether > instead we need to make use of continuations here. Nevertheless FWIW, I've only hit this with debug builds on boxes that have slow serial with sync_console enabled, due to the verbose printks. > > > Signed-off-by: Roger Pau Monné > > Acked-by: Jan Beulich Thanks.
[PATCH v1 00/23] reduce overhead during live migration
The current live migration code can easily saturate an 1Gb link. There is still room for improvement with faster network connections. Even with this series reviewed and applied. See description of patch #6. Olaf Olaf Hering (23): tools: add readv_exact to libxenctrl tools: add xc_is_known_page_type to libxenctrl tools: use xc_is_known_page_type tools: unify type checking for data pfns in migration stream tools: show migration transfer rate in send_dirty_pages tools/guest: prepare to allocate arrays once tools/guest: save: move batch_pfns tools/guest: save: move mfns array tools/guest: save: move types array tools/guest: save: move errors array tools/guest: save: move iov array tools/guest: save: move rec_pfns array tools/guest: save: move guest_data array tools/guest: save: move local_pages array tools/guest: restore: move pfns array tools/guest: restore: move types array tools/guest: restore: move mfns array tools/guest: restore: move map_errs array tools/guest: restore: move mfns array in populate_pfns tools/guest: restore: move pfns array in populate_pfns tools/guest: restore: split record processing tools/guest: restore: split handle_page_data tools/guest: restore: write data directly into guest tools/libs/ctrl/xc_private.c | 54 ++- tools/libs/ctrl/xc_private.h | 34 ++ tools/libs/guest/xg_sr_common.c | 33 +- tools/libs/guest/xg_sr_common.h | 86 +++- tools/libs/guest/xg_sr_restore.c | 562 +- tools/libs/guest/xg_sr_save.c | 158 tools/libs/guest/xg_sr_save_x86_hvm.c | 5 +- tools/libs/guest/xg_sr_save_x86_pv.c | 31 +- 8 files changed, 666 insertions(+), 297 deletions(-)
Re: [PATCH] meson.build: fix building of Xen support for aarch64
On Thu, 29 Oct 2020, Jason Andryuk wrote: > On Thu, Oct 29, 2020 at 6:01 AM Alex Bennée wrote: > > > > > > Stefano Stabellini writes: > > > > > On Wed, 28 Oct 2020, Alex Bennée wrote: > > >> Xen is supported on aarch64 although weirdly using the i386-softmmu > > >> model. Checking based on the host CPU meant we never enabled Xen > > >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to > > >> make it not seem weird but that will require further build surgery. > > >> > > >> Signed-off-by: Alex Bennée > > >> Cc: Masami Hiramatsu > > >> Cc: Stefano Stabellini > > >> Cc: Anthony Perard > > >> Cc: Paul Durrant > > >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson") > > >> --- > > >> meson.build | 2 ++ > > >> 1 file changed, 2 insertions(+) > > >> > > >> diff --git a/meson.build b/meson.build > > >> index 835424999d..f1fcbfed4c 100644 > > >> --- a/meson.build > > >> +++ b/meson.build > > >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64'] > > >> 'CONFIG_HVF': ['x86_64-softmmu'], > > >> 'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'], > > >>} > > >> +elif cpu in [ 'arm', 'aarch64' ] > > >> + accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] } > > >> endif > > > > > > This looks very reasonable -- the patch makes sense. > > A comment would be useful to explain that Arm needs i386-softmmu for > the xenpv machine. > > > > > > > However I have two questions, mostly for my own understanding. I tried > > > to repro the aarch64 build problem but it works at my end, even without > > > this patch. > > > > Building on a x86 host or with cross compiler? > > > > > I wonder why. I suspect it works thanks to these lines in > > > meson.build: > > I think it's a runtime and not a build problem. In osstest, Xen > support was detected and configured, but CONFIG_XEN wasn't set for > Arm. So at runtime xen_available() returns 0, and QEMU doesn't start > with "qemu-system-i386: -xen-domid 1: Option not supported for this > target" > > I posted my investigation here: > https://lore.kernel.org/xen-devel/cakf6xpss8kpgovzrkitpz63bhbvbjxrtywdhekzuo2q1kem...@mail.gmail.com/ Right, but strangely I cannot reproduce it here. I am natively building (qemu-user) on aarch64 and for me CONFIG_XEN gets set.
Re: Xen on RP4
On Thu, 29 Oct 2020, Jürgen Groß wrote: > On 29.10.20 01:37, Stefano Stabellini wrote: > > On Tue, 27 Oct 2020, Elliott Mitchell wrote: > > > On Mon, Oct 26, 2020 at 06:44:27PM +, Julien Grall wrote: > > > > On 26/10/2020 16:03, Elliott Mitchell wrote: > > > > > On Mon, Oct 26, 2020 at 01:31:42PM +, Julien Grall wrote: > > > > > > On 24/10/2020 06:35, Elliott Mitchell wrote: > > > > > > > ACPI has a distinct > > > > > > > means of specifying a limited DMA-width; the above fails, because > > > > > > > it > > > > > > > assumes a *device-tree*. > > > > > > > > > > > > Do you know if it would be possible to infer from the ACPI static > > > > > > table > > > > > > the DMA-width? > > > > > > > > > > Yes, and it is. Due to not knowing much about ACPI tables I don't > > > > > know > > > > > what the C code would look like though (problem is which documentation > > > > > should I be looking at first?). > > > > > > > > What you provided below is an excerpt of the DSDT. AFAIK, DSDT content > > > > is written in AML. So far the shortest implementation I have seen for > > > > the AML parser is around 5000 lines (see [1]). It might be possible to > > > > strip some the code, although I think this will still probably too big > > > > for a single workaround. > > > > > > > > What I meant by "static table" is a table that looks like a structure > > > > and can be parsed in a few lines. If we can't find on contain the DMA > > > > window, then the next best solution is to find a way to identity the > > > > platform. > > > > > > > > I don't know enough ACPI to know if this solution is possible. A good > > > > starter would probably be the ACPI spec [2]. > > > > > > Okay, that is worse than I had thought (okay, ACPI is impressively > > > complex for something nominally firmware-level). > > > > > > There are strings in the present Tianocore implementation for Raspberry > > > PI 4B which could be targeted. Notably included in the output during > > > boot listing the tables, "RPIFDN", "RPIFDN RPI" and "RPIFDN RPI4" (I'm > > > unsure how kosher these are as this wsn't implemented nor blessed by the > > > Raspberry PI Foundation). > > > > > > I strongly dislike this approach as you soon turn the Xen project into a > > > database of hardware. This is already occurring with > > > xen/arch/arm/platforms and I would love to do something about this. On > > > that thought, how about utilizing Xen's command-line for this purpose? > > > > I don't think that a command line option is a good idea: basically it is > > punting to users the task of platform detection. Also, it means that > > users will be necessarily forced to edit the uboot script or grub > > configuration file to boot. > > > > Note that even if we introduced a new command line, we wouldn't take > > away the need for xen/arch/arm/platforms anyway. > > > > It would be far better for Xen to autodetect the platform if we can. > > > > > > > Have a procedure of during installation/updates retrieve DMA limitation > > > information from the running OS and the following boot Xen will follow > > > the appropriate setup. By its nature, Domain 0 will have the information > > > needed, just becomes an issue of how hard that is to retrieve... > > > > Historically that is what we used to do for many things related to ACPI, > > but unfortunately it leads to a pretty bad architecture where Xen > > depends on Dom0 for booting rather than the other way around. (Dom0 > > should be the one requiring Xen for booting, given that Xen is higher > > privilege and boots first.) > > > > > > I think the best compromise is still to use an ACPI string to detect the > > platform. For instance, would it be possible to use the OEMID fields in > > RSDT, XSDT, FADT? Possibly even a combination of them? > > > > Another option might be to get the platform name from UEFI somehow. > > What about having a small domain parsing the ACPI booting first and use > that information for booting dom0? > > That dom would be part of the Xen build and the hypervisor wouldn't need > to gain all the ACPI AML logic. That could work, but in practice we don't have such a domain today -- the infrastructure is missing. I wonder whether the bootloader (uboot or grub) would know about the platform and might be able to pass that information to Xen somehow.
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
On Thu, 29 Oct 2020, Oleksandr Tyshchenko wrote: > On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu > wrote: > Hi Oleksandr, > > Hi Masami > > [sorry for the possible format issue] > > > I would like to try this on my arm64 board. > > Glad to hear you are interested in this topic. > > > According to your comments in the patch, I made this config file. > # cat debian.conf > name = "debian" > type = "pvh" > vcpus = 8 > memory = 512 > kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" > ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" > cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" > disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] > vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] > virtio = 1 > vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] > > And tried to boot a DomU, but I got below error. > > # xl create -c debian.conf > Parsing config from debian.conf > libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain > 1:unable to add virtio_disk devices > libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain > 1:xc_domain_pause failed > libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain > type for domid=1 > libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain > 1:Unable to destroy guest > libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > 1:Destruction of domain failed > > > Could you tell me how can I test it? > > > I assume it is due to the lack of the virtio-disk backend (which I haven't > shared yet as I focused on the IOREQ/DM support on Arm in the > first place). > Could you wait a little bit, I am going to share it soon. Do you have a quick-and-dirty hack you can share in the meantime? Even just on github as a special branch? It would be very useful to be able to have a test-driver for the new feature.
[xen-4.11-testing test] 156277: tolerable FAIL - PUSHED
flight 156277 xen-4.11-testing real [real] http://logs.test-lab.xenproject.org/osstest/logs/156277/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install fail like 156034 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 12 debian-hvm-install fail like 156034 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 156034 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 156034 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 156034 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 156034 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 156034 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 156034 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 156034 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 156034 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 156034 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 156034 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 156034 test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass version targeted for testing: xen e274c8bdc12eb596e55233040e8b49da27150f31 baseline version: xen 63199dfd3a0418f1677c6ccd7fe05b123af4610a Last test of basis 156034 2020-10-20 13:35:54 Z9 days Testing same since 156262 2020-10-27 18:36:52 Z2 days2 attempts People who touched revisions under test: Andrew Cooper jobs: build-amd64-xsm pass
Re: [PATCH] x86/pv: Drop stale comment in dom0_construct_pv()
On 29/10/2020 14:37, Jan Beulich wrote: > On 29.10.2020 15:00, Andrew Cooper wrote: >> This comment has been around since c/s 1372bca0615 in 2004. It is stale, as >> it predates the introduction of struct vcpu. > That commit only moved it around; it's 22a857bde9b8 afaics from > early 2003 where it first appeared, where it had a reason: > > /* > * WARNING: The new domain must have its 'processor' field > * filled in by now !! > */ > phys_l2tab = ALLOC_FRAME_FROM_DOMAIN(); > l2tab = map_domain_mem(phys_l2tab); > memcpy(l2tab, idle_pg_table[p->processor], PAGE_SIZE); Oh yes - my simple search didn't spot the reformat. > > But yes, the comment has been stale for a long time, and I've > been wondering a number of times what it was supposed to tell > me. (I think it was already stale at the point the comment > first got altered, in 3072fef54df8.) Looks like it became stale with 99db02d5097 "Remove CPU-dependent page-directory entries." which drops the per-cpu idle_pg_table. > >> It is not obvious that it was even correct at the time. Where a vcpu (domain >> at the time) has been configured to run is unrelated to construct the >> domain's >> initial pagetables, etc. >> >> Signed-off-by: Andrew Cooper > Acked-by: Jan Beulich Thanks. I'll update the commit message. ~Andrew
[PATCH v2] libxl: Add suppress-vmdesc to QEMU machine
The device model state saved by QMP xen-save-devices-state doesn't include the vmdesc json. When restoring an HVM, xen-load-devices-state always triggers "Expected vmdescription section, but got 0". This is not a problem when restore comes from a file. However, when QEMU runs in a linux stubdom and comes over a console, EOF is not received. This causes a delay restoring - though it does restore. Setting suppress-vmdesc skips looking for the vmdesc during restore and avoids the wait. QEMU 5.2 enables suppress-vmdesc by default for xenfv, but this change sets it manually for xenfv and xen_platform_pci=0 when -machine pc is use. QEMU commit 9850c6047b8b "migration: Allow to suppress vmdesc submission" added suppress-vmdesc in QEMU 2.3. Signed-off-by: Jason Andryuk --- QEMU 2.3 came out in 2015, so setting suppress-vmdesc unilaterally should be okay... Is this okay? --- tools/libs/light/libxl_dm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c index d1ff35dda3..3da83259c0 100644 --- a/tools/libs/light/libxl_dm.c +++ b/tools/libs/light/libxl_dm.c @@ -1778,9 +1778,9 @@ static int libxl__build_device_model_args_new(libxl__gc *gc, /* Switching here to the machine "pc" which does not add * the xen-platform device instead of the default "xenfv" machine. */ -machinearg = libxl__strdup(gc, "pc,accel=xen"); +machinearg = libxl__strdup(gc, "pc,accel=xen,suppress-vmdesc=on"); } else { -machinearg = libxl__strdup(gc, "xenfv"); +machinearg = libxl__strdup(gc, "xenfv,suppress-vmdesc=on"); } if (b_info->u.hvm.mmio_hole_memkb) { uint64_t max_ram_below_4g = (1ULL << 32) - -- 2.25.1
Re: [PATCH V2 00/23] IOREQ feature (+ virtio-mmio) on Arm
On Thu, Oct 29, 2020 at 9:42 AM Masami Hiramatsu < masami.hirama...@linaro.org> wrote: > Hi Oleksandr, > Hi Masami [sorry for the possible format issue] > > I would like to try this on my arm64 board. > Glad to hear you are interested in this topic. > > According to your comments in the patch, I made this config file. > # cat debian.conf > name = "debian" > type = "pvh" > vcpus = 8 > memory = 512 > kernel = "/opt/agl/vmlinuz-5.9.0-1-arm64" > ramdisk = "/opt/agl/initrd.img-5.9.0-1-arm64" > cmdline= "console=hvc0 earlyprintk=xen root=/dev/xvda1 rw" > disk = [ '/opt/agl/debian.qcow2,qcow2,hda' ] > vif = [ 'mac=00:16:3E:74:3d:76,bridge=xenbr0' ] > virtio = 1 > vdisk = [ 'backend=Dom0, disks=ro:/dev/sda1' ] > > And tried to boot a DomU, but I got below error. > > # xl create -c debian.conf > Parsing config from debian.conf > libxl: error: libxl_create.c:1863:domcreate_attach_devices: Domain > 1:unable to add virtio_disk devices > libxl: error: libxl_domain.c:1218:destroy_domid_pci_done: Domain > 1:xc_domain_pause failed > libxl: error: libxl_dom.c:39:libxl__domain_type: unable to get domain > type for domid=1 > libxl: error: libxl_domain.c:1136:domain_destroy_callback: Domain > 1:Unable to destroy guest > libxl: error: libxl_domain.c:1063:domain_destroy_cb: Domain > 1:Destruction of domain failed > > > Could you tell me how can I test it? > I assume it is due to the lack of the virtio-disk backend (which I haven't shared yet as I focused on the IOREQ/DM support on Arm in the first place). Could you wait a little bit, I am going to share it soon. -- Regards, Oleksandr Tyshchenko
Re: [PATCH] x86/HVM: send mapcache invalidation request to qemu regardless of preemption
On 29/10/2020 15:14, Jan Beulich wrote: > Even if only part of a hypercall completed before getting preempted, > invalidation ought to occur. Therefore fold the two return statements. > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper
Re: [PATCH 1/2] x86: fix build of PV shim when !GRANT_TABLE
On 29/10/2020 17:09, Jan Beulich wrote: > To do its compat translation, shim code needs to include the compat > header. For this to work, this header first of all needs to be > generated. > > Reported-by: Andrew Cooper > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper
Re: [PATCH 2/2] x86: PV shim doesn't need GRANT_TABLE
On 29/10/2020 17:10, Jan Beulich wrote: > The only reference into the code controlled by this option is from the > hypercall table, and that hypercall table entry gets overwritten. > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper
Re: [PATCH] x86emul: support AVX-VNNI
On 29/10/2020 15:01, Jan Beulich wrote: > These are VEX-encoded equivalents of the EVEX-encoded AVX512-VNNI ISA > extension. > > Signed-off-by: Jan Beulich Acked-by: Andrew Cooper
[PATCH v1 21/23] tools/guest: restore: split record processing
handle_page_data must be able to read directly into mapped guest memory. This will avoid unneccesary memcpy calls for data which can be consumed verbatim. Rearrange the code to allow decisions based on the incoming record. This change is preparation for future changes in handle_page_data, no change in behavior is intended. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.c | 33 - tools/libs/guest/xg_sr_common.h | 4 ++- tools/libs/guest/xg_sr_restore.c | 49 ++-- tools/libs/guest/xg_sr_save.c| 7 - 4 files changed, 63 insertions(+), 30 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.c b/tools/libs/guest/xg_sr_common.c index 17567ab133..cabde4ef74 100644 --- a/tools/libs/guest/xg_sr_common.c +++ b/tools/libs/guest/xg_sr_common.c @@ -91,26 +91,33 @@ int write_split_record(struct xc_sr_context *ctx, struct xc_sr_record *rec, return -1; } -int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec) +int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr) { xc_interface *xch = ctx->xch; -struct xc_sr_rhdr rhdr; -size_t datasz; -if ( read_exact(fd, , sizeof(rhdr)) ) +if ( read_exact(fd, rhdr, sizeof(*rhdr)) ) { PERROR("Failed to read Record Header from stream"); return -1; } -if ( rhdr.length > REC_LENGTH_MAX ) +if ( rhdr->length > REC_LENGTH_MAX ) { -ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr.type, - rec_type_to_str(rhdr.type), rhdr.length, REC_LENGTH_MAX); +ERROR("Record (0x%08x, %s) length %#x exceeds max (%#x)", rhdr->type, + rec_type_to_str(rhdr->type), rhdr->length, REC_LENGTH_MAX); return -1; } -datasz = ROUNDUP(rhdr.length, REC_ALIGN_ORDER); +return 0; +} + +int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr, + struct xc_sr_record *rec) +{ +xc_interface *xch = ctx->xch; +size_t datasz; + +datasz = ROUNDUP(rhdr->length, REC_ALIGN_ORDER); if ( datasz ) { @@ -119,7 +126,7 @@ int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec) if ( !rec->data ) { ERROR("Unable to allocate %zu bytes for record data (0x%08x, %s)", - datasz, rhdr.type, rec_type_to_str(rhdr.type)); + datasz, rhdr->type, rec_type_to_str(rhdr->type)); return -1; } @@ -128,18 +135,18 @@ int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec) free(rec->data); rec->data = NULL; PERROR("Failed to read %zu bytes of data for record (0x%08x, %s)", - datasz, rhdr.type, rec_type_to_str(rhdr.type)); + datasz, rhdr->type, rec_type_to_str(rhdr->type)); return -1; } } else rec->data = NULL; -rec->type = rhdr.type; -rec->length = rhdr.length; +rec->type = rhdr->type; +rec->length = rhdr->length; return 0; -}; +} static void __attribute__((unused)) build_assertions(void) { diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 3fe665b91d..66d1b0dfe6 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -475,7 +475,9 @@ static inline int write_record(struct xc_sr_context *ctx, * * On failure, the contents of the record structure are undefined. */ -int read_record(struct xc_sr_context *ctx, int fd, struct xc_sr_record *rec); +int read_record_header(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr); +int read_record_data(struct xc_sr_context *ctx, int fd, struct xc_sr_rhdr *rhdr, + struct xc_sr_record *rec); /* * This would ideally be private in restore.c, but is needed by diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 71b39612ee..93f69b9ba8 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -471,7 +471,7 @@ static int send_checkpoint_dirty_pfn_list(struct xc_sr_context *ctx) return rc; } -static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec); +static int process_buffered_record(struct xc_sr_context *ctx, struct xc_sr_record *rec); static int handle_checkpoint(struct xc_sr_context *ctx) { xc_interface *xch = ctx->xch; @@ -510,7 +510,7 @@ static int handle_checkpoint(struct xc_sr_context *ctx) for ( i = 0; i < ctx->restore.buffered_rec_num; i++ ) { -rc = process_record(ctx, >restore.buffered_records[i]); +rc = process_buffered_record(ctx, >restore.buffered_records[i]); if ( rc ) goto err; } @@ -571,10 +571,11 @@ static int handle_checkpoint(struct xc_sr_context *ctx) return rc; } -static int buffer_record(struct
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
On Thu, Oct 29, 2020 at 05:59:54PM +0100, Paolo Bonzini wrote: > On 29/10/20 17:56, Arvind Sankar wrote: > >> For those two just add: > >>struct apic *apic = x86_system_apic; > >> before all the assignments. > >> Less churn and much better code. > >> > > Why would it be better code? > > > > I think he means the compiler produces better code, because it won't > read the global variable repeatedly. Not sure if that's true,(*) but I > think I do prefer that version if Arnd wants to do that tweak. > > Paolo > > (*) if it is, it might also be due to Linux using -fno-strict-aliasing > Nope, it doesn't read it multiple times. I guess it still assumes that apic isn't in the middle of what it points to: it would reload the address if the first element of *apic was modified, but doesn't for other elements. Interesting.
[PATCH v1 19/23] tools/guest: restore: move mfns array in populate_pfns
Remove allocation from hotpath, move populate_pfns mfns array into preallocated space. Use some prefix to avoid conflict with an array used in handle_page_data. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_restore.c | 5 ++--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index eba3a49877..96a77b5969 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -230,6 +230,8 @@ struct xc_sr_restore_arrays { /* process_page_data */ xen_pfn_t mfns[MAX_BATCH_SIZE]; int map_errs[MAX_BATCH_SIZE]; +/* populate_pfns */ +xen_pfn_t pp_mfns[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 94c329032f..85a32aaed2 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -138,12 +138,12 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, const xen_pfn_t *original_pfns, const uint32_t *types) { xc_interface *xch = ctx->xch; -xen_pfn_t *mfns = malloc(count * sizeof(*mfns)), +xen_pfn_t *mfns = ctx->restore.m->pp_mfns, *pfns = malloc(count * sizeof(*pfns)); unsigned int i, nr_pfns = 0; int rc = -1; -if ( !mfns || !pfns ) +if ( !pfns ) { ERROR("Failed to allocate %zu bytes for populating the physmap", 2 * count * sizeof(*mfns)); @@ -191,7 +191,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, err: free(pfns); -free(mfns); return rc; }
[PATCH v1 23/23] tools/guest: restore: write data directly into guest
Read incoming migration stream directly into the guest memory. This avoids the memory allocation and copying, and the resulting performance penalty. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 1 + tools/libs/guest/xg_sr_restore.c | 132 ++- 2 files changed, 129 insertions(+), 4 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 7ec8867b88..f76af23bcc 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -231,6 +231,7 @@ struct xc_sr_restore_arrays { xen_pfn_t mfns[MAX_BATCH_SIZE]; int map_errs[MAX_BATCH_SIZE]; void *guest_data[MAX_BATCH_SIZE]; +struct iovec iov[MAX_BATCH_SIZE]; /* populate_pfns */ xen_pfn_t pp_mfns[MAX_BATCH_SIZE]; diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 060f3d1f4e..2f575d7dd9 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -392,6 +392,122 @@ err: return rc; } +/* + * Handle PAGE_DATA record from the stream. + * Given a list of pfns, their types, and a block of page data from the + * stream, populate and record their types, map the relevant subset and copy + * the data into the guest. + */ +static int handle_incoming_page_data(struct xc_sr_context *ctx, + struct xc_sr_rhdr *rhdr) +{ +xc_interface *xch = ctx->xch; +struct xc_sr_restore_arrays *m = ctx->restore.m; +struct xc_sr_rec_page_data_header *pages = >pages; +uint64_t *pfn_nums = m->pages.pfn; +uint32_t i; +int rc, iov_idx; + +rc = handle_static_data_end_v2(ctx); +if ( rc ) +goto err; + +/* First read and verify the header */ +rc = read_exact(ctx->fd, pages, sizeof(*pages)); +if ( rc ) +{ +PERROR("Could not read rec_pfn header"); +goto err; +} + +if ( verify_rec_page_hdr(ctx, rhdr->length, pages) == false ) +{ +rc = -1; +goto err; +} + +/* Then read and verify the incoming pfn numbers */ +rc = read_exact(ctx->fd, pfn_nums, sizeof(*pfn_nums) * pages->count); +if ( rc ) +{ +PERROR("Could not read rec_pfn data"); +goto err; +} + +if ( verify_rec_page_pfns(ctx, rhdr->length, pages) == false ) +{ +rc = -1; +goto err; +} + +/* Finally read and verify the incoming pfn data */ +rc = map_guest_pages(ctx, pages); +if ( rc ) +goto err; + +/* Prepare read buffers, either guest or throw away memory */ +for ( i = 0, iov_idx = 0; i < pages->count; i++ ) +{ +if ( !m->guest_data[i] ) +continue; + +m->iov[iov_idx].iov_len = PAGE_SIZE; +if ( ctx->restore.verify ) +m->iov[iov_idx].iov_base = ctx->restore.verify_buf + i * PAGE_SIZE; +else +m->iov[iov_idx].iov_base = m->guest_data[i]; +iov_idx++; +} + +if ( !iov_idx ) +goto done; + +rc = readv_exact(ctx->fd, m->iov, iov_idx); +if ( rc ) +{ +PERROR("read of %d pages failed", iov_idx); +goto err; +} + +/* Post-processing of pfn data */ +for ( i = 0, iov_idx = 0; i < pages->count; i++ ) +{ +if ( !m->guest_data[i] ) +continue; + +rc = ctx->restore.ops.localise_page(ctx, m->types[i], m->iov[iov_idx].iov_base); +if ( rc ) +{ +ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")", + m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT); +goto err; + +} + +if ( ctx->restore.verify ) +{ +if ( memcmp(m->guest_data[i], m->iov[iov_idx].iov_base, PAGE_SIZE) ) +{ +ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")", + m->pfns[i], m->types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT); +} +} + +iov_idx++; +} + +done: +rc = 0; + +err: +if ( ctx->restore.guest_mapping ) +{ +xenforeignmemory_unmap(xch->fmem, ctx->restore.guest_mapping, ctx->restore.nr_mapped_pages); +ctx->restore.guest_mapping = NULL; +} +return rc; +} + /* * Handle PAGE_DATA record from an existing buffer * Given a list of pfns, their types, and a block of page data from the @@ -773,11 +889,19 @@ static int process_incoming_record_header(struct xc_sr_context *ctx, struct xc_s struct xc_sr_record rec; int rc; -rc = read_record_data(ctx, ctx->fd, rhdr, ); -if ( rc ) -return rc; +switch ( rhdr->type ) +{ +case REC_TYPE_PAGE_DATA: +rc = handle_incoming_page_data(ctx, rhdr); +break; +default: +rc = read_record_data(ctx, ctx->fd, rhdr, ); +if ( rc == 0 ) +rc = process_buffered_record(ctx, );; +break; +} -return process_buffered_record(ctx, ); +return rc; }
[PATCH v1 06/23] tools/guest: prepare to allocate arrays once
The hotpath 'send_dirty_pages' is supposed to do just one thing: sending. The other end 'handle_page_data' is supposed to do just receiving. But instead both do other costly work like memory allocations and data moving. Do the allocations once, the array sizes are a compiletime constant. Avoid unneeded copying of data by receiving data directly into mapped guest memory. This patch is just prepartion, subsequent changes will populate the arrays. Once all changes are applied, migration of a busy HVM domU changes like that: Without this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_testing): 2020-10-29 10:23:10.711+: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 55.324905335 sec, 203 MiB/sec: Internal error 2020-10-29 10:23:35.115+: xc: show_transfer_rate: 16829632 bytes + 2097552 pages in 24.401179720 sec, 335 MiB/sec: Internal error 2020-10-29 10:23:59.436+: xc: show_transfer_rate: 16829032 bytes + 2097478 pages in 24.319025928 sec, 336 MiB/sec: Internal error 2020-10-29 10:24:23.844+: xc: show_transfer_rate: 16829024 bytes + 2097477 pages in 24.406992500 sec, 335 MiB/sec: Internal error 2020-10-29 10:24:48.292+: xc: show_transfer_rate: 16828912 bytes + 2097463 pages in 24.446489027 sec, 335 MiB/sec: Internal error 2020-10-29 10:25:01.816+: xc: show_transfer_rate: 16836080 bytes + 2098356 pages in 13.447091818 sec, 609 MiB/sec: Internal error With this series, from sr650 to sr950 (xen-4.15.20201027T173911.16a20963b3 xen_unstable): 2020-10-28 21:26:05.074+: xc: show_transfer_rate: 23663128 bytes + 2879563 pages in 52.564054368 sec, 213 MiB/sec: Internal error 2020-10-28 21:26:23.527+: xc: show_transfer_rate: 16830040 bytes + 2097603 pages in 18.450592015 sec, 444 MiB/sec: Internal error 2020-10-28 21:26:41.926+: xc: show_transfer_rate: 16830944 bytes + 2097717 pages in 18.397862306 sec, 445 MiB/sec: Internal error 2020-10-28 21:27:00.339+: xc: show_transfer_rate: 16829176 bytes + 2097498 pages in 18.411973339 sec, 445 MiB/sec: Internal error 2020-10-28 21:27:18.643+: xc: show_transfer_rate: 16828592 bytes + 2097425 pages in 18.303326695 sec, 447 MiB/sec: Internal error 2020-10-28 21:27:26.289+: xc: show_transfer_rate: 16835952 bytes + 2098342 pages in 7.579846749 sec, 1081 MiB/sec: Internal error Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 8 tools/libs/guest/xg_sr_restore.c | 8 tools/libs/guest/xg_sr_save.c| 4 +++- 3 files changed, 19 insertions(+), 1 deletion(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index f3a7a29298..62bc87b5f4 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -211,6 +211,12 @@ static inline int update_blob(struct xc_sr_blob *blob, return 0; } +struct xc_sr_save_arrays { +}; + +struct xc_sr_restore_arrays { +}; + struct xc_sr_context { xc_interface *xch; @@ -248,6 +254,7 @@ struct xc_sr_context unsigned long *deferred_pages; unsigned long nr_deferred_pages; xc_hypercall_buffer_t dirty_bitmap_hbuf; +struct xc_sr_save_arrays *m; } save; struct /* Restore data. */ @@ -299,6 +306,7 @@ struct xc_sr_context /* Sender has invoked verify mode on the stream. */ bool verify; +struct xc_sr_restore_arrays *m; } restore; }; diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 0332ae9f32..4a9ece9681 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -739,6 +739,13 @@ static int setup(struct xc_sr_context *ctx) } ctx->restore.allocated_rec_num = DEFAULT_BUF_RECORDS; +ctx->restore.m = malloc(sizeof(*ctx->restore.m)); +if ( !ctx->restore.m ) { +ERROR("Unable to allocate memory for arrays"); +rc = -1; +goto err; +} + err: return rc; } @@ -757,6 +764,7 @@ static void cleanup(struct xc_sr_context *ctx) xc_hypercall_buffer_free_pages( xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->restore.p2m_size))); +free(ctx->restore.m); free(ctx->restore.buffered_records); free(ctx->restore.populated_pfns); diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index f58a35ddde..1e3c8eff2f 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -853,8 +853,9 @@ static int setup(struct xc_sr_context *ctx) ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE * sizeof(*ctx->save.batch_pfns)); ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size); +ctx->save.m = malloc(sizeof(*ctx->save.m)); -if ( !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages ) +if ( !ctx->save.m || !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages ) { ERROR("Unable to allocate
[PATCH v1 09/23] tools/guest: save: move types array
Remove allocation from hotpath, move types array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 7 ++- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 0c2bef8f78..3cbadb607b 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -215,6 +215,8 @@ struct xc_sr_save_arrays { xen_pfn_t batch_pfns[MAX_BATCH_SIZE]; /* write_batch: Mfns of the batch pfns. */ xen_pfn_t mfns[MAX_BATCH_SIZE]; +/* write_batch: Types of the batch pfns. */ +xen_pfn_t types[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index cdab288c3f..6678df95b8 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx) static int write_batch(struct xc_sr_context *ctx) { xc_interface *xch = ctx->xch; -xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL; +xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types; void *guest_mapping = NULL; void **guest_data = NULL; void **local_pages = NULL; @@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx) assert(nr_pfns != 0); -/* Types of the batch pfns. */ -types = malloc(nr_pfns * sizeof(*types)); /* Errors from attempting to map the gfns. */ errors = malloc(nr_pfns * sizeof(*errors)); /* Pointers to page data to send. Mapped gfns or local allocations. */ @@ -116,7 +114,7 @@ static int write_batch(struct xc_sr_context *ctx) /* iovec[] for writev(). */ iov = malloc((nr_pfns + 4) * sizeof(*iov)); -if ( !types || !errors || !guest_data || !local_pages || !iov ) +if ( !errors || !guest_data || !local_pages || !iov ) { ERROR("Unable to allocate arrays for a batch of %u pages", nr_pfns); @@ -274,7 +272,6 @@ static int write_batch(struct xc_sr_context *ctx) free(local_pages); free(guest_data); free(errors); -free(types); return rc; }
[PATCH v1 03/23] tools: use xc_is_known_page_type
Verify pfn type on sending side, also verify incoming batch of pfns. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_restore.c | 3 +-- tools/libs/guest/xg_sr_save.c| 6 ++ 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index b57a787519..f1c3169229 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -406,8 +406,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) } type = (pages->pfn[i] & PAGE_DATA_TYPE_MASK) >> 32; -if ( ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) >= 5) && - ((type >> XEN_DOMCTL_PFINFO_LTAB_SHIFT) <= 8) ) +if ( xc_is_known_page_type(type) == false ) { ERROR("Invalid type %#"PRIx32" for pfn %#"PRIpfn" (index %u)", type, pfn, i); diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 2ba7c3200c..044d0ae3aa 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -147,6 +147,12 @@ static int write_batch(struct xc_sr_context *ctx) for ( i = 0; i < nr_pfns; ++i ) { +if ( xc_is_known_page_type(types[i]) == false ) +{ +ERROR("Wrong type %#"PRIpfn" for pfn %#"PRIpfn, types[i], mfns[i]); +goto err; +} + switch ( types[i] ) { case XEN_DOMCTL_PFINFO_BROKEN:
[PATCH v1 17/23] tools/guest: restore: move mfns array
Remove allocation from hotpath, move mfns array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_restore.c | 5 ++--- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 0ceecb289d..5731a5c186 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -227,6 +227,8 @@ struct xc_sr_restore_arrays { /* handle_page_data */ xen_pfn_t pfns[MAX_BATCH_SIZE]; uint32_t types[MAX_BATCH_SIZE]; +/* process_page_data */ +xen_pfn_t mfns[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 7729071e41..3ba089f862 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -205,7 +205,7 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, xen_pfn_t *pfns, uint32_t *types, void *page_data) { xc_interface *xch = ctx->xch; -xen_pfn_t *mfns = malloc(count * sizeof(*mfns)); +xen_pfn_t *mfns = ctx->restore.m->mfns; int *map_errs = malloc(count * sizeof(*map_errs)); int rc; void *mapping = NULL, *guest_page = NULL; @@ -213,7 +213,7 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, j, /* j indexes the subset of pfns we decide to map. */ nr_pages = 0; -if ( !mfns || !map_errs ) +if ( !map_errs ) { rc = -1; ERROR("Failed to allocate %zu bytes to process page data", @@ -299,7 +299,6 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, xenforeignmemory_unmap(xch->fmem, mapping, nr_pages); free(map_errs); -free(mfns); return rc; }
[PATCH v1 07/23] tools/guest: save: move batch_pfns
The batch_pfns array is already allocated in advance. Move it into the preallocated area. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 +- tools/libs/guest/xg_sr_save.c | 25 +++-- 2 files changed, 12 insertions(+), 15 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 62bc87b5f4..c78a07b8f8 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -212,6 +212,7 @@ static inline int update_blob(struct xc_sr_blob *blob, } struct xc_sr_save_arrays { +xen_pfn_t batch_pfns[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { @@ -249,7 +250,6 @@ struct xc_sr_context struct precopy_stats stats; -xen_pfn_t *batch_pfns; unsigned int nr_batch_pfns; unsigned long *deferred_pages; unsigned long nr_deferred_pages; diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 1e3c8eff2f..597e638c59 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -77,7 +77,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx) /* * Writes a batch of memory as a PAGE_DATA record into the stream. The batch - * is constructed in ctx->save.batch_pfns. + * is constructed in ctx->save.m->batch_pfns. * * This function: * - gets the types for each pfn in the batch. @@ -128,12 +128,12 @@ static int write_batch(struct xc_sr_context *ctx) for ( i = 0; i < nr_pfns; ++i ) { types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx, - ctx->save.batch_pfns[i]); + ctx->save.m->batch_pfns[i]); /* Likely a ballooned page. */ if ( mfns[i] == INVALID_MFN ) { -set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages); +set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages); ++ctx->save.nr_deferred_pages; } } @@ -179,7 +179,7 @@ static int write_batch(struct xc_sr_context *ctx) if ( errors[p] ) { ERROR("Mapping of pfn %#"PRIpfn" (mfn %#"PRIpfn") failed %d", - ctx->save.batch_pfns[i], mfns[p], errors[p]); + ctx->save.m->batch_pfns[i], mfns[p], errors[p]); goto err; } @@ -193,7 +193,7 @@ static int write_batch(struct xc_sr_context *ctx) { if ( rc == -1 && errno == EAGAIN ) { -set_bit(ctx->save.batch_pfns[i], ctx->save.deferred_pages); +set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages); ++ctx->save.nr_deferred_pages; types[i] = XEN_DOMCTL_PFINFO_XTAB; --nr_pages; @@ -224,7 +224,7 @@ static int write_batch(struct xc_sr_context *ctx) rec.length += nr_pages * PAGE_SIZE; for ( i = 0; i < nr_pfns; ++i ) -rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.batch_pfns[i]; +rec_pfns[i] = ((uint64_t)(types[i]) << 32) | ctx->save.m->batch_pfns[i]; iov[0].iov_base = iov[0].iov_len = sizeof(rec.type); @@ -296,9 +296,9 @@ static int flush_batch(struct xc_sr_context *ctx) if ( !rc ) { -VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.batch_pfns, +VALGRIND_MAKE_MEM_UNDEFINED(ctx->save.m->batch_pfns, MAX_BATCH_SIZE * -sizeof(*ctx->save.batch_pfns)); +sizeof(*ctx->save.m->batch_pfns)); } return rc; @@ -315,7 +315,7 @@ static int add_to_batch(struct xc_sr_context *ctx, xen_pfn_t pfn) rc = flush_batch(ctx); if ( rc == 0 ) -ctx->save.batch_pfns[ctx->save.nr_batch_pfns++] = pfn; +ctx->save.m->batch_pfns[ctx->save.nr_batch_pfns++] = pfn; return rc; } @@ -850,14 +850,12 @@ static int setup(struct xc_sr_context *ctx) dirty_bitmap = xc_hypercall_buffer_alloc_pages( xch, dirty_bitmap, NRPAGES(bitmap_size(ctx->save.p2m_size))); -ctx->save.batch_pfns = malloc(MAX_BATCH_SIZE * - sizeof(*ctx->save.batch_pfns)); ctx->save.deferred_pages = bitmap_alloc(ctx->save.p2m_size); ctx->save.m = malloc(sizeof(*ctx->save.m)); -if ( !ctx->save.m || !ctx->save.batch_pfns || !dirty_bitmap || !ctx->save.deferred_pages ) +if ( !ctx->save.m || !dirty_bitmap || !ctx->save.deferred_pages ) { -ERROR("Unable to allocate memory for dirty bitmaps, batch pfns and" +ERROR("Unable to allocate memory for dirty bitmaps and" " deferred pages"); rc = -1; errno = ENOMEM; @@ -886,7 +884,6 @@ static void cleanup(struct xc_sr_context *ctx) xc_hypercall_buffer_free_pages(xch, dirty_bitmap,
[PATCH v1 16/23] tools/guest: restore: move types array
Remove allocation from hotpath, move types array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 1 + tools/libs/guest/xg_sr_restore.c | 12 +--- 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 516d9b9fb5..0ceecb289d 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -226,6 +226,7 @@ struct xc_sr_save_arrays { struct xc_sr_restore_arrays { /* handle_page_data */ xen_pfn_t pfns[MAX_BATCH_SIZE]; +uint32_t types[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 7d1447e86c..7729071e41 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -316,7 +316,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) int rc = -1; xen_pfn_t *pfns = ctx->restore.m->pfns, pfn; -uint32_t *types = NULL, type; +uint32_t *types = ctx->restore.m->types, type; /* * v2 compatibility only exists for x86 streams. This is a bit of a @@ -363,14 +363,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) goto err; } -types = malloc(pages->count * sizeof(*types)); -if ( !types ) -{ -ERROR("Unable to allocate enough memory for %u pfns", - pages->count); -goto err; -} - for ( i = 0; i < pages->count; ++i ) { pfn = pages->pfn[i] & PAGE_DATA_PFN_MASK; @@ -410,8 +402,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) rc = process_page_data(ctx, pages->count, pfns, types, >pfn[pages->count]); err: -free(types); - return rc; }
[PATCH v1 08/23] tools/guest: save: move mfns array
Remove allocation from hotpath, move mfns array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 7 ++- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index c78a07b8f8..0c2bef8f78 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -213,6 +213,8 @@ static inline int update_blob(struct xc_sr_blob *blob, struct xc_sr_save_arrays { xen_pfn_t batch_pfns[MAX_BATCH_SIZE]; +/* write_batch: Mfns of the batch pfns. */ +xen_pfn_t mfns[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 597e638c59..cdab288c3f 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -88,7 +88,7 @@ static int write_checkpoint_record(struct xc_sr_context *ctx) static int write_batch(struct xc_sr_context *ctx) { xc_interface *xch = ctx->xch; -xen_pfn_t *mfns = NULL, *types = NULL; +xen_pfn_t *mfns = ctx->save.m->mfns, *types = NULL; void *guest_mapping = NULL; void **guest_data = NULL; void **local_pages = NULL; @@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx) assert(nr_pfns != 0); -/* Mfns of the batch pfns. */ -mfns = malloc(nr_pfns * sizeof(*mfns)); /* Types of the batch pfns. */ types = malloc(nr_pfns * sizeof(*types)); /* Errors from attempting to map the gfns. */ @@ -118,7 +116,7 @@ static int write_batch(struct xc_sr_context *ctx) /* iovec[] for writev(). */ iov = malloc((nr_pfns + 4) * sizeof(*iov)); -if ( !mfns || !types || !errors || !guest_data || !local_pages || !iov ) +if ( !types || !errors || !guest_data || !local_pages || !iov ) { ERROR("Unable to allocate arrays for a batch of %u pages", nr_pfns); @@ -277,7 +275,6 @@ static int write_batch(struct xc_sr_context *ctx) free(guest_data); free(errors); free(types); -free(mfns); return rc; }
[PATCH v1 13/23] tools/guest: save: move guest_data array
Remove allocation from hotpath, move guest_data array into preallocated space. Because this was allocated with calloc: Adjust the loop to clear unused entries as needed. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 11 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 81158a4f4d..33e66678c6 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -223,6 +223,8 @@ struct xc_sr_save_arrays { struct iovec iov[MAX_BATCH_SIZE + 4]; /* write_batch */ uint64_t rec_pfns[MAX_BATCH_SIZE]; +/* write_batch: Pointers to page data to send. Mapped gfns or local allocations. */ +void *guest_data[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 4d7fbe09ed..658f834ae8 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -90,7 +90,7 @@ static int write_batch(struct xc_sr_context *ctx) xc_interface *xch = ctx->xch; xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types; void *guest_mapping = NULL; -void **guest_data = NULL; +void **guest_data = ctx->save.m->guest_data; void **local_pages = NULL; int *errors = ctx->save.m->errors, rc = -1; unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0; @@ -105,12 +105,10 @@ static int write_batch(struct xc_sr_context *ctx) assert(nr_pfns != 0); -/* Pointers to page data to send. Mapped gfns or local allocations. */ -guest_data = calloc(nr_pfns, sizeof(*guest_data)); /* Pointers to locally allocated pages. Need freeing. */ local_pages = calloc(nr_pfns, sizeof(*local_pages)); -if ( !guest_data || !local_pages ) +if ( !local_pages ) { ERROR("Unable to allocate arrays for a batch of %u pages", nr_pfns); @@ -166,7 +164,10 @@ static int write_batch(struct xc_sr_context *ctx) for ( i = 0, p = 0; i < nr_pfns; ++i ) { if ( page_type_has_stream_data(types[i]) == false ) +{ +guest_data[i] = NULL; continue; +} if ( errors[p] ) { @@ -183,6 +184,7 @@ static int write_batch(struct xc_sr_context *ctx) if ( rc ) { +guest_data[i] = NULL; if ( rc == -1 && errno == EAGAIN ) { set_bit(ctx->save.m->batch_pfns[i], ctx->save.deferred_pages); @@ -256,7 +258,6 @@ static int write_batch(struct xc_sr_context *ctx) for ( i = 0; local_pages && i < nr_pfns; ++i ) free(local_pages[i]); free(local_pages); -free(guest_data); return rc; }
[PATCH v1 20/23] tools/guest: restore: move pfns array in populate_pfns
Remove allocation from hotpath, move populate_pfns' pfns array into preallocated space. Use some prefix to avoid conflict with an array used in handle_page_data. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 1 + tools/libs/guest/xg_sr_restore.c | 11 +-- 2 files changed, 2 insertions(+), 10 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 96a77b5969..3fe665b91d 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -232,6 +232,7 @@ struct xc_sr_restore_arrays { int map_errs[MAX_BATCH_SIZE]; /* populate_pfns */ xen_pfn_t pp_mfns[MAX_BATCH_SIZE]; +xen_pfn_t pp_pfns[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 85a32aaed2..71b39612ee 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -139,17 +139,10 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, { xc_interface *xch = ctx->xch; xen_pfn_t *mfns = ctx->restore.m->pp_mfns, -*pfns = malloc(count * sizeof(*pfns)); +*pfns = ctx->restore.m->pp_pfns; unsigned int i, nr_pfns = 0; int rc = -1; -if ( !pfns ) -{ -ERROR("Failed to allocate %zu bytes for populating the physmap", - 2 * count * sizeof(*mfns)); -goto err; -} - for ( i = 0; i < count; ++i ) { if ( (!types || @@ -190,8 +183,6 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, rc = 0; err: -free(pfns); - return rc; }
[PATCH v1 12/23] tools/guest: save: move rec_pfns array
Remove allocation from hotpath, move rec_pfns array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 11 +-- 2 files changed, 3 insertions(+), 10 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index f315b4f526..81158a4f4d 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -221,6 +221,8 @@ struct xc_sr_save_arrays { int errors[MAX_BATCH_SIZE]; /* write_batch: iovec[] for writev(). */ struct iovec iov[MAX_BATCH_SIZE + 4]; +/* write_batch */ +uint64_t rec_pfns[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 385a591332..4d7fbe09ed 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -96,7 +96,7 @@ static int write_batch(struct xc_sr_context *ctx) unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0; unsigned int nr_pfns = ctx->save.nr_batch_pfns; void *page, *orig_page; -uint64_t *rec_pfns = NULL; +uint64_t *rec_pfns = ctx->save.m->rec_pfns; struct iovec *iov = ctx->save.m->iov; int iovcnt = 0; struct xc_sr_rec_page_data_header hdr = { 0 }; struct xc_sr_record rec = { @@ -201,14 +201,6 @@ static int write_batch(struct xc_sr_context *ctx) } } -rec_pfns = malloc(nr_pfns * sizeof(*rec_pfns)); -if ( !rec_pfns ) -{ -ERROR("Unable to allocate %zu bytes of memory for page data pfn list", - nr_pfns * sizeof(*rec_pfns)); -goto err; -} - hdr.count = nr_pfns; rec.length = sizeof(hdr); @@ -259,7 +251,6 @@ static int write_batch(struct xc_sr_context *ctx) rc = ctx->save.nr_batch_pfns = 0; err: -free(rec_pfns); if ( guest_mapping ) xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped); for ( i = 0; local_pages && i < nr_pfns; ++i )
[PATCH v1 22/23] tools/guest: restore: split handle_page_data
handle_page_data must be able to read directly into mapped guest memory. This will avoid unneccesary memcpy calls for data that can be consumed verbatim. Split the various steps of record processing: - move processing to handle_buffered_page_data - adjust xenforeignmemory_map to set errno in case of failure - adjust verify mode to set errno in case of failure This change is preparation for future changes in handle_page_data, no change in behavior is intended. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 9 + tools/libs/guest/xg_sr_restore.c | 343 --- 2 files changed, 231 insertions(+), 121 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 66d1b0dfe6..7ec8867b88 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -230,9 +230,14 @@ struct xc_sr_restore_arrays { /* process_page_data */ xen_pfn_t mfns[MAX_BATCH_SIZE]; int map_errs[MAX_BATCH_SIZE]; +void *guest_data[MAX_BATCH_SIZE]; + /* populate_pfns */ xen_pfn_t pp_mfns[MAX_BATCH_SIZE]; xen_pfn_t pp_pfns[MAX_BATCH_SIZE]; + +/* Must be the last member */ +struct xc_sr_rec_page_data_header pages; }; struct xc_sr_context @@ -323,7 +328,11 @@ struct xc_sr_context /* Sender has invoked verify mode on the stream. */ bool verify; +void *verify_buf; + struct xc_sr_restore_arrays *m; +void *guest_mapping; +uint32_t nr_mapped_pages; } restore; }; diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 93f69b9ba8..060f3d1f4e 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -186,123 +186,18 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, return rc; } -/* - * Given a list of pfns, their types, and a block of page data from the - * stream, populate and record their types, map the relevant subset and copy - * the data into the guest. - */ -static int process_page_data(struct xc_sr_context *ctx, unsigned int count, - xen_pfn_t *pfns, uint32_t *types, void *page_data) +static int handle_static_data_end_v2(struct xc_sr_context *ctx) { -xc_interface *xch = ctx->xch; -xen_pfn_t *mfns = ctx->restore.m->mfns; -int *map_errs = ctx->restore.m->map_errs; -int rc; -void *mapping = NULL, *guest_page = NULL; -unsigned int i, /* i indexes the pfns from the record. */ -j, /* j indexes the subset of pfns we decide to map. */ -nr_pages = 0; - -rc = populate_pfns(ctx, count, pfns, types); -if ( rc ) -{ -ERROR("Failed to populate pfns for batch of %u pages", count); -goto err; -} - -for ( i = 0; i < count; ++i ) -{ -ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]); - -if ( page_type_has_stream_data(types[i]) == true ) -mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]); -} - -/* Nothing to do? */ -if ( nr_pages == 0 ) -goto done; - -mapping = guest_page = xenforeignmemory_map( -xch->fmem, ctx->domid, PROT_READ | PROT_WRITE, -nr_pages, mfns, map_errs); -if ( !mapping ) -{ -rc = -1; -PERROR("Unable to map %u mfns for %u pages of data", - nr_pages, count); -goto err; -} - -for ( i = 0, j = 0; i < count; ++i ) -{ -if ( page_type_has_stream_data(types[i]) == false ) -continue; - -if ( map_errs[j] ) -{ -rc = -1; -ERROR("Mapping pfn %#"PRIpfn" (mfn %#"PRIpfn", type %#"PRIx32") failed with %d", - pfns[i], mfns[j], types[i], map_errs[j]); -goto err; -} - -/* Undo page normalisation done by the saver. */ -rc = ctx->restore.ops.localise_page(ctx, types[i], page_data); -if ( rc ) -{ -ERROR("Failed to localise pfn %#"PRIpfn" (type %#"PRIx32")", - pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT); -goto err; -} - -if ( ctx->restore.verify ) -{ -/* Verify mode - compare incoming data to what we already have. */ -if ( memcmp(guest_page, page_data, PAGE_SIZE) ) -ERROR("verify pfn %#"PRIpfn" failed (type %#"PRIx32")", - pfns[i], types[i] >> XEN_DOMCTL_PFINFO_LTAB_SHIFT); -} -else -{ -/* Regular mode - copy incoming data into place. */ -memcpy(guest_page, page_data, PAGE_SIZE); -} - -++j; -guest_page += PAGE_SIZE; -page_data += PAGE_SIZE; -} - - done: -rc = 0; - - err: -if ( mapping ) -xenforeignmemory_unmap(xch->fmem, mapping, nr_pages); - -return rc; -} +int rc = 0; -/* - * Validate a PAGE_DATA record from the stream, and
[PATCH v1 04/23] tools: unify type checking for data pfns in migration stream
Introduce a helper which decides if a given pfn type has data for the migration stream. No change in behavior intended. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 17 tools/libs/guest/xg_sr_restore.c | 34 +--- tools/libs/guest/xg_sr_save.c| 14 ++--- 3 files changed, 24 insertions(+), 41 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index cc3ad1c394..70e328e951 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -455,6 +455,23 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, /* Handle a STATIC_DATA_END record. */ int handle_static_data_end(struct xc_sr_context *ctx); +static inline bool page_type_has_stream_data(uint32_t type) +{ +bool ret; + +switch (type) +{ +case XEN_DOMCTL_PFINFO_XTAB: +case XEN_DOMCTL_PFINFO_XALLOC: +case XEN_DOMCTL_PFINFO_BROKEN: +ret = false; +break; +default: +ret = true; +break; +} +return ret; +} #endif /* * Local variables: diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index f1c3169229..0332ae9f32 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -152,9 +152,8 @@ int populate_pfns(struct xc_sr_context *ctx, unsigned int count, for ( i = 0; i < count; ++i ) { -if ( (!types || (types && - (types[i] != XEN_DOMCTL_PFINFO_XTAB && - types[i] != XEN_DOMCTL_PFINFO_BROKEN))) && +if ( (!types || + (types && page_type_has_stream_data(types[i]) == true)) && !pfn_is_populated(ctx, original_pfns[i]) ) { rc = pfn_set_populated(ctx, original_pfns[i]); @@ -233,25 +232,8 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, { ctx->restore.ops.set_page_type(ctx, pfns[i], types[i]); -switch ( types[i] ) -{ -case XEN_DOMCTL_PFINFO_NOTAB: - -case XEN_DOMCTL_PFINFO_L1TAB: -case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB: - -case XEN_DOMCTL_PFINFO_L2TAB: -case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB: - -case XEN_DOMCTL_PFINFO_L3TAB: -case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB: - -case XEN_DOMCTL_PFINFO_L4TAB: -case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB: - +if ( page_type_has_stream_data(types[i]) == true ) mfns[nr_pages++] = ctx->restore.ops.pfn_to_gfn(ctx, pfns[i]); -break; -} } /* Nothing to do? */ @@ -271,14 +253,8 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, for ( i = 0, j = 0; i < count; ++i ) { -switch ( types[i] ) -{ -case XEN_DOMCTL_PFINFO_XTAB: -case XEN_DOMCTL_PFINFO_BROKEN: -case XEN_DOMCTL_PFINFO_XALLOC: -/* No page data to deal with. */ +if ( page_type_has_stream_data(types[i]) == false ) continue; -} if ( map_errs[j] ) { @@ -413,7 +389,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) goto err; } -if ( type < XEN_DOMCTL_PFINFO_BROKEN ) +if ( page_type_has_stream_data(type) == true ) /* NOTAB and all L1 through L4 tables (including pinned) should * have a page worth of data in the record. */ pages_of_data++; diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 044d0ae3aa..0546d3d9e6 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -153,13 +153,8 @@ static int write_batch(struct xc_sr_context *ctx) goto err; } -switch ( types[i] ) -{ -case XEN_DOMCTL_PFINFO_BROKEN: -case XEN_DOMCTL_PFINFO_XALLOC: -case XEN_DOMCTL_PFINFO_XTAB: +if ( page_type_has_stream_data(types[i]) == false ) continue; -} mfns[nr_pages++] = mfns[i]; } @@ -177,13 +172,8 @@ static int write_batch(struct xc_sr_context *ctx) for ( i = 0, p = 0; i < nr_pfns; ++i ) { -switch ( types[i] ) -{ -case XEN_DOMCTL_PFINFO_BROKEN: -case XEN_DOMCTL_PFINFO_XALLOC: -case XEN_DOMCTL_PFINFO_XTAB: +if ( page_type_has_stream_data(types[i]) == false ) continue; -} if ( errors[p] ) {
[PATCH v1 11/23] tools/guest: save: move iov array
Remove allocation from hotpath, move iov array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 7 ++- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 71b676c0e0..f315b4f526 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -219,6 +219,8 @@ struct xc_sr_save_arrays { xen_pfn_t types[MAX_BATCH_SIZE]; /* write_batch: Errors from attempting to map the gfns. */ int errors[MAX_BATCH_SIZE]; +/* write_batch: iovec[] for writev(). */ +struct iovec iov[MAX_BATCH_SIZE + 4]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index cdc27a9cde..385a591332 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -97,7 +97,7 @@ static int write_batch(struct xc_sr_context *ctx) unsigned int nr_pfns = ctx->save.nr_batch_pfns; void *page, *orig_page; uint64_t *rec_pfns = NULL; -struct iovec *iov = NULL; int iovcnt = 0; +struct iovec *iov = ctx->save.m->iov; int iovcnt = 0; struct xc_sr_rec_page_data_header hdr = { 0 }; struct xc_sr_record rec = { .type = REC_TYPE_PAGE_DATA, @@ -109,10 +109,8 @@ static int write_batch(struct xc_sr_context *ctx) guest_data = calloc(nr_pfns, sizeof(*guest_data)); /* Pointers to locally allocated pages. Need freeing. */ local_pages = calloc(nr_pfns, sizeof(*local_pages)); -/* iovec[] for writev(). */ -iov = malloc((nr_pfns + 4) * sizeof(*iov)); -if ( !guest_data || !local_pages || !iov ) +if ( !guest_data || !local_pages ) { ERROR("Unable to allocate arrays for a batch of %u pages", nr_pfns); @@ -266,7 +264,6 @@ static int write_batch(struct xc_sr_context *ctx) xenforeignmemory_unmap(xch->fmem, guest_mapping, nr_pages_mapped); for ( i = 0; local_pages && i < nr_pfns; ++i ) free(local_pages[i]); -free(iov); free(local_pages); free(guest_data);
[PATCH v1 01/23] tools: add readv_exact to libxenctrl
Read a batch of iovec's. In the common case of short reads, finish individual iov's with read_exact. Signed-off-by: Olaf Hering --- tools/libs/ctrl/xc_private.c | 54 +++- tools/libs/ctrl/xc_private.h | 1 + 2 files changed, 54 insertions(+), 1 deletion(-) diff --git a/tools/libs/ctrl/xc_private.c b/tools/libs/ctrl/xc_private.c index d94f846686..a86ed213a5 100644 --- a/tools/libs/ctrl/xc_private.c +++ b/tools/libs/ctrl/xc_private.c @@ -659,8 +659,23 @@ int write_exact(int fd, const void *data, size_t size) #if defined(__MINIOS__) /* - * MiniOS's libc doesn't know about writev(). Implement it as multiple write()s. + * MiniOS's libc doesn't know about readv/writev(). + * Implement it as multiple read/write()s. */ +int readv_exact(int fd, const struct iovec *iov, int iovcnt) +{ +int rc, i; + +for ( i = 0; i < iovcnt; ++i ) +{ +rc = read_exact(fd, iov[i].iov_base, iov[i].iov_len); +if ( rc ) +return rc; +} + +return 0; +} + int writev_exact(int fd, const struct iovec *iov, int iovcnt) { int rc, i; @@ -675,6 +689,44 @@ int writev_exact(int fd, const struct iovec *iov, int iovcnt) return 0; } #else +int readv_exact(int fd, const struct iovec *iov, int iovcnt) +{ +int rc = 0, idx = 0; +ssize_t len; + +while ( idx < iovcnt ) +{ +len = readv(fd, [idx], min(iovcnt - idx, IOV_MAX)); +if ( len == -1 && errno == EINTR ) +continue; +if ( len <= 0 ) +{ +rc = -1; +goto out; +} +while ( len > 0 && idx < iovcnt ) +{ +if ( len >= iov[idx].iov_len ) +{ +len -= iov[idx].iov_len; +} +else +{ +void *p = iov[idx].iov_base + len; +size_t l = iov[idx].iov_len - len; + +rc = read_exact(fd, p, l); +if ( rc ) +goto out; +len = 0; +} +idx++; +} +} +out: +return rc; +} + int writev_exact(int fd, const struct iovec *iov, int iovcnt) { struct iovec *local_iov = NULL; diff --git a/tools/libs/ctrl/xc_private.h b/tools/libs/ctrl/xc_private.h index f0b5f83ac8..5d2c7274fb 100644 --- a/tools/libs/ctrl/xc_private.h +++ b/tools/libs/ctrl/xc_private.h @@ -441,6 +441,7 @@ int xc_flush_mmu_updates(xc_interface *xch, struct xc_mmu *mmu); /* Return 0 on success; -1 on error setting errno. */ int read_exact(int fd, void *data, size_t size); /* EOF => -1, errno=0 */ +int readv_exact(int fd, const struct iovec *iov, int iovcnt); int write_exact(int fd, const void *data, size_t size); int writev_exact(int fd, const struct iovec *iov, int iovcnt);
[PATCH v1 14/23] tools/guest: save: move local_pages array
Remove allocation from hotpath, move local_pages array into preallocated space. Adjust the code to use the src page as is in case of HVM. In case of PV the page may need to be normalised, use an private memory area for this purpose. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 22 ++- tools/libs/guest/xg_sr_save.c | 25 +++-- tools/libs/guest/xg_sr_save_x86_hvm.c | 5 +++-- tools/libs/guest/xg_sr_save_x86_pv.c | 31 ++- 4 files changed, 39 insertions(+), 44 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 33e66678c6..2a020fef5c 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -33,16 +33,12 @@ struct xc_sr_save_ops * Optionally transform the contents of a page from being specific to the * sending environment, to being generic for the stream. * - * The page of data at the end of 'page' may be a read-only mapping of a - * running guest; it must not be modified. If no transformation is - * required, the callee should leave '*pages' untouched. + * The page of data '*src' may be a read-only mapping of a running guest; + * it must not be modified. If no transformation is required, the callee + * should leave '*src' untouched, and return it via '**ptr'. * - * If a transformation is required, the callee should allocate themselves - * a local page using malloc() and return it via '*page'. - * - * The caller shall free() '*page' in all cases. In the case that the - * callee encounters an error, it should *NOT* free() the memory it - * allocated for '*page'. + * If a transformation is required, the callee should provide the + * transformed page in a private buffer and return it via '**ptr'. * * It is valid to fail with EAGAIN if the transformation is not able to be * completed at this point. The page shall be retried later. @@ -50,7 +46,7 @@ struct xc_sr_save_ops * @returns 0 for success, -1 for failure, with errno appropriately set. */ int (*normalise_page)(struct xc_sr_context *ctx, xen_pfn_t type, - void **page); + void *src, unsigned int idx, void **ptr); /** * Set up local environment to save a domain. (Typically querying @@ -371,6 +367,12 @@ struct xc_sr_context union { +struct +{ +/* Used by write_batch for modified pages. */ +void *normalised_pages; +} save; + struct { /* State machine for the order of received records. */ diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 658f834ae8..804e4ccb3a 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -91,11 +91,10 @@ static int write_batch(struct xc_sr_context *ctx) xen_pfn_t *mfns = ctx->save.m->mfns, *types = ctx->save.m->types; void *guest_mapping = NULL; void **guest_data = ctx->save.m->guest_data; -void **local_pages = NULL; int *errors = ctx->save.m->errors, rc = -1; unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0; unsigned int nr_pfns = ctx->save.nr_batch_pfns; -void *page, *orig_page; +void *src; uint64_t *rec_pfns = ctx->save.m->rec_pfns; struct iovec *iov = ctx->save.m->iov; int iovcnt = 0; struct xc_sr_rec_page_data_header hdr = { 0 }; @@ -105,16 +104,6 @@ static int write_batch(struct xc_sr_context *ctx) assert(nr_pfns != 0); -/* Pointers to locally allocated pages. Need freeing. */ -local_pages = calloc(nr_pfns, sizeof(*local_pages)); - -if ( !local_pages ) -{ -ERROR("Unable to allocate arrays for a batch of %u pages", - nr_pfns); -goto err; -} - for ( i = 0; i < nr_pfns; ++i ) { types[i] = mfns[i] = ctx->save.ops.pfn_to_gfn(ctx, @@ -176,11 +165,8 @@ static int write_batch(struct xc_sr_context *ctx) goto err; } -orig_page = page = guest_mapping + (p * PAGE_SIZE); -rc = ctx->save.ops.normalise_page(ctx, types[i], ); - -if ( orig_page != page ) -local_pages[i] = page; +src = guest_mapping + (p * PAGE_SIZE); +rc = ctx->save.ops.normalise_page(ctx, types[i], src, i, _data[i]); if ( rc ) { @@ -195,8 +181,6 @@ static int write_batch(struct xc_sr_context *ctx) else goto err; } -else -guest_data[i] = page; rc = -1; ++p; @@ -255,9 +239,6 @@ static int write_batch(struct xc_sr_context *ctx) err: if ( guest_mapping )
[PATCH v1 10/23] tools/guest: save: move errors array
Remove allocation from hotpath, move errors array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 7 ++- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 3cbadb607b..71b676c0e0 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -217,6 +217,8 @@ struct xc_sr_save_arrays { xen_pfn_t mfns[MAX_BATCH_SIZE]; /* write_batch: Types of the batch pfns. */ xen_pfn_t types[MAX_BATCH_SIZE]; +/* write_batch: Errors from attempting to map the gfns. */ +int errors[MAX_BATCH_SIZE]; }; struct xc_sr_restore_arrays { diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 6678df95b8..cdc27a9cde 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -92,7 +92,7 @@ static int write_batch(struct xc_sr_context *ctx) void *guest_mapping = NULL; void **guest_data = NULL; void **local_pages = NULL; -int *errors = NULL, rc = -1; +int *errors = ctx->save.m->errors, rc = -1; unsigned int i, p, nr_pages = 0, nr_pages_mapped = 0; unsigned int nr_pfns = ctx->save.nr_batch_pfns; void *page, *orig_page; @@ -105,8 +105,6 @@ static int write_batch(struct xc_sr_context *ctx) assert(nr_pfns != 0); -/* Errors from attempting to map the gfns. */ -errors = malloc(nr_pfns * sizeof(*errors)); /* Pointers to page data to send. Mapped gfns or local allocations. */ guest_data = calloc(nr_pfns, sizeof(*guest_data)); /* Pointers to locally allocated pages. Need freeing. */ @@ -114,7 +112,7 @@ static int write_batch(struct xc_sr_context *ctx) /* iovec[] for writev(). */ iov = malloc((nr_pfns + 4) * sizeof(*iov)); -if ( !errors || !guest_data || !local_pages || !iov ) +if ( !guest_data || !local_pages || !iov ) { ERROR("Unable to allocate arrays for a batch of %u pages", nr_pfns); @@ -271,7 +269,6 @@ static int write_batch(struct xc_sr_context *ctx) free(iov); free(local_pages); free(guest_data); -free(errors); return rc; }
[PATCH v1 18/23] tools/guest: restore: move map_errs array
Remove allocation from hotpath, move map_errs array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 1 + tools/libs/guest/xg_sr_restore.c | 12 +--- 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 5731a5c186..eba3a49877 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -229,6 +229,7 @@ struct xc_sr_restore_arrays { uint32_t types[MAX_BATCH_SIZE]; /* process_page_data */ xen_pfn_t mfns[MAX_BATCH_SIZE]; +int map_errs[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 3ba089f862..94c329032f 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -206,21 +206,13 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, { xc_interface *xch = ctx->xch; xen_pfn_t *mfns = ctx->restore.m->mfns; -int *map_errs = malloc(count * sizeof(*map_errs)); +int *map_errs = ctx->restore.m->map_errs; int rc; void *mapping = NULL, *guest_page = NULL; unsigned int i, /* i indexes the pfns from the record. */ j, /* j indexes the subset of pfns we decide to map. */ nr_pages = 0; -if ( !map_errs ) -{ -rc = -1; -ERROR("Failed to allocate %zu bytes to process page data", - count * (sizeof(*mfns) + sizeof(*map_errs))); -goto err; -} - rc = populate_pfns(ctx, count, pfns, types); if ( rc ) { @@ -298,8 +290,6 @@ static int process_page_data(struct xc_sr_context *ctx, unsigned int count, if ( mapping ) xenforeignmemory_unmap(xch->fmem, mapping, nr_pages); -free(map_errs); - return rc; }
[PATCH v1 05/23] tools: show migration transfer rate in send_dirty_pages
Show how fast domU pages are transferred in each iteration. The relevant data is how fast the pfns travel, not so much how much protocol overhead exists. So the reported MiB/sec is just for pfns. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_save.c | 47 + 2 files changed, 49 insertions(+) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 70e328e951..f3a7a29298 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -238,6 +238,8 @@ struct xc_sr_context bool debug; unsigned long p2m_size; +size_t pages_sent; +size_t overhead_sent; struct precopy_stats stats; diff --git a/tools/libs/guest/xg_sr_save.c b/tools/libs/guest/xg_sr_save.c index 0546d3d9e6..f58a35ddde 100644 --- a/tools/libs/guest/xg_sr_save.c +++ b/tools/libs/guest/xg_sr_save.c @@ -1,5 +1,6 @@ #include #include +#include #include "xg_sr_common.h" @@ -238,6 +239,8 @@ static int write_batch(struct xc_sr_context *ctx) iov[3].iov_len = nr_pfns * sizeof(*rec_pfns); iovcnt = 4; +ctx->save.pages_sent += nr_pages; +ctx->save.overhead_sent += sizeof(rec) + sizeof(hdr) + nr_pfns * sizeof(*rec_pfns); if ( nr_pages ) { @@ -357,6 +360,43 @@ static int suspend_domain(struct xc_sr_context *ctx) return 0; } +static void show_transfer_rate(struct xc_sr_context *ctx, struct timespec *start) +{ +xc_interface *xch = ctx->xch; +struct timespec end = {}, diff = {}; +size_t ms, MiB_sec = ctx->save.pages_sent * PAGE_SIZE; + +if (!MiB_sec) +return; + +if ( clock_gettime(CLOCK_MONOTONIC, ) ) +PERROR("clock_gettime"); + +if ( (end.tv_nsec - start->tv_nsec) < 0 ) +{ +diff.tv_sec = end.tv_sec - start->tv_sec - 1; +diff.tv_nsec = end.tv_nsec - start->tv_nsec + (1000U*1000U*1000U); +} +else +{ +diff.tv_sec = end.tv_sec - start->tv_sec; +diff.tv_nsec = end.tv_nsec - start->tv_nsec; +} + +ms = (diff.tv_nsec / (1000U*1000U)); +if (!ms) +ms = 1; +ms += (diff.tv_sec * 1000U); + +MiB_sec *= 1000U; +MiB_sec /= ms; +MiB_sec /= 1024U*1024U; + +errno = 0; +ERROR("%s: %zu bytes + %zu pages in %ld.%09ld sec, %zu MiB/sec", __func__, + ctx->save.overhead_sent, ctx->save.pages_sent, diff.tv_sec, diff.tv_nsec, MiB_sec); +} + /* * Send a subset of pages in the guests p2m, according to the dirty bitmap. * Used for each subsequent iteration of the live migration loop. @@ -370,9 +410,15 @@ static int send_dirty_pages(struct xc_sr_context *ctx, xen_pfn_t p; unsigned long written; int rc; +struct timespec start = {}; DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap, >save.dirty_bitmap_hbuf); +ctx->save.pages_sent = 0; +ctx->save.overhead_sent = 0; +if ( clock_gettime(CLOCK_MONOTONIC, ) ) +PERROR("clock_gettime"); + for ( p = 0, written = 0; p < ctx->save.p2m_size; ++p ) { if ( !test_bit(p, dirty_bitmap) ) @@ -396,6 +442,7 @@ static int send_dirty_pages(struct xc_sr_context *ctx, if ( written > entries ) DPRINTF("Bitmap contained more entries than expected..."); +show_transfer_rate(ctx, ); xc_report_progress_step(xch, entries, entries); return ctx->save.ops.check_vm_state(ctx);
[PATCH v1 15/23] tools/guest: restore: move pfns array
Remove allocation from hotpath, move pfns array into preallocated space. Signed-off-by: Olaf Hering --- tools/libs/guest/xg_sr_common.h | 2 ++ tools/libs/guest/xg_sr_restore.c | 6 ++ 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/libs/guest/xg_sr_common.h b/tools/libs/guest/xg_sr_common.h index 2a020fef5c..516d9b9fb5 100644 --- a/tools/libs/guest/xg_sr_common.h +++ b/tools/libs/guest/xg_sr_common.h @@ -224,6 +224,8 @@ struct xc_sr_save_arrays { }; struct xc_sr_restore_arrays { +/* handle_page_data */ +xen_pfn_t pfns[MAX_BATCH_SIZE]; }; struct xc_sr_context diff --git a/tools/libs/guest/xg_sr_restore.c b/tools/libs/guest/xg_sr_restore.c index 4a9ece9681..7d1447e86c 100644 --- a/tools/libs/guest/xg_sr_restore.c +++ b/tools/libs/guest/xg_sr_restore.c @@ -315,7 +315,7 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) unsigned int i, pages_of_data = 0; int rc = -1; -xen_pfn_t *pfns = NULL, pfn; +xen_pfn_t *pfns = ctx->restore.m->pfns, pfn; uint32_t *types = NULL, type; /* @@ -363,9 +363,8 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) goto err; } -pfns = malloc(pages->count * sizeof(*pfns)); types = malloc(pages->count * sizeof(*types)); -if ( !pfns || !types ) +if ( !types ) { ERROR("Unable to allocate enough memory for %u pfns", pages->count); @@ -412,7 +411,6 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec) >pfn[pages->count]); err: free(types); -free(pfns); return rc; }
[PATCH v1 02/23] tools: add xc_is_known_page_type to libxenctrl
Users of xc_get_pfn_type_batch may want to sanity check the data returned by Xen. Add a simple helper for this purpose. Signed-off-by: Olaf Hering --- tools/libs/ctrl/xc_private.h | 33 + 1 file changed, 33 insertions(+) diff --git a/tools/libs/ctrl/xc_private.h b/tools/libs/ctrl/xc_private.h index 5d2c7274fb..afb08aafe1 100644 --- a/tools/libs/ctrl/xc_private.h +++ b/tools/libs/ctrl/xc_private.h @@ -421,6 +421,39 @@ void *xc_map_foreign_ranges(xc_interface *xch, uint32_t dom, int xc_get_pfn_type_batch(xc_interface *xch, uint32_t dom, unsigned int num, xen_pfn_t *); +/* Sanitiy check for types returned by Xen */ +static inline bool xc_is_known_page_type(xen_pfn_t type) +{ +bool ret; + +switch (type) +{ +case XEN_DOMCTL_PFINFO_NOTAB: + +case XEN_DOMCTL_PFINFO_L1TAB: +case XEN_DOMCTL_PFINFO_L1TAB | XEN_DOMCTL_PFINFO_LPINTAB: + +case XEN_DOMCTL_PFINFO_L2TAB: +case XEN_DOMCTL_PFINFO_L2TAB | XEN_DOMCTL_PFINFO_LPINTAB: + +case XEN_DOMCTL_PFINFO_L3TAB: +case XEN_DOMCTL_PFINFO_L3TAB | XEN_DOMCTL_PFINFO_LPINTAB: + +case XEN_DOMCTL_PFINFO_L4TAB: +case XEN_DOMCTL_PFINFO_L4TAB | XEN_DOMCTL_PFINFO_LPINTAB: + +case XEN_DOMCTL_PFINFO_XTAB: +case XEN_DOMCTL_PFINFO_XALLOC: +case XEN_DOMCTL_PFINFO_BROKEN: +ret = true; +break; +default: +ret = false; +break; +} +return ret; +} + void bitmap_64_to_byte(uint8_t *bp, const uint64_t *lp, int nbits); void bitmap_byte_to_64(uint64_t *lp, const uint8_t *bp, int nbits);
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
On Thu, Oct 29, 2020 at 03:05:31PM +, David Laight wrote: > From: Arnd Bergmann > > Sent: 28 October 2020 21:21 > > > > From: Arnd Bergmann > > > > There are hundreds of warnings in a W=2 build about a local > > variable shadowing the global 'apic' definition: > > > > arch/x86/kvm/lapic.h:149:65: warning: declaration of 'apic' shadows a > > global declaration [-Wshadow] > > > > Avoid this by renaming the global 'apic' variable to the more descriptive > > 'x86_system_apic'. It was originally called 'genapic', but both that > > and the current 'apic' seem to be a little overly generic for a global > > variable. > > > > Fixes: c48f14966cc4 ("KVM: inline kvm_apic_present() and > > kvm_lapic_enabled()") > > Fixes: c8d46cf06dc2 ("x86: rename 'genapic' to 'apic'") > > Signed-off-by: Arnd Bergmann > > --- > > v2: rename the global instead of the local variable in the header > ... > > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c > > index 284e73661a18..33e2dc78ca11 100644 > > --- a/arch/x86/hyperv/hv_apic.c > > +++ b/arch/x86/hyperv/hv_apic.c > > @@ -259,14 +259,14 @@ void __init hv_apic_init(void) > > /* > > * Set the IPI entry points. > > */ > > - orig_apic = *apic; > > - > > - apic->send_IPI = hv_send_ipi; > > - apic->send_IPI_mask = hv_send_ipi_mask; > > - apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself; > > - apic->send_IPI_allbutself = hv_send_ipi_allbutself; > > - apic->send_IPI_all = hv_send_ipi_all; > > - apic->send_IPI_self = hv_send_ipi_self; > > + orig_apic = *x86_system_apic; > > + > > + x86_system_apic->send_IPI = hv_send_ipi; > > + x86_system_apic->send_IPI_mask = hv_send_ipi_mask; > > + x86_system_apic->send_IPI_mask_allbutself = > > hv_send_ipi_mask_allbutself; > > + x86_system_apic->send_IPI_allbutself = hv_send_ipi_allbutself; > > + x86_system_apic->send_IPI_all = hv_send_ipi_all; > > + x86_system_apic->send_IPI_self = hv_send_ipi_self; > > } > > > > if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { > > @@ -285,10 +285,10 @@ void __init hv_apic_init(void) > > */ > > apic_set_eoi_write(hv_apic_eoi_write); > > if (!x2apic_enabled()) { > > - apic->read = hv_apic_read; > > - apic->write = hv_apic_write; > > - apic->icr_write = hv_apic_icr_write; > > - apic->icr_read = hv_apic_icr_read; > > + x86_system_apic->read = hv_apic_read; > > + x86_system_apic->write = hv_apic_write; > > + x86_system_apic->icr_write = hv_apic_icr_write; > > + x86_system_apic->icr_read = hv_apic_icr_read; > > } > > For those two just add: > struct apic *apic = x86_system_apic; > before all the assignments. > Less churn and much better code. > Why would it be better code?
Re: [PATCH v1 4/4] xen/pci: solve compilation error when memory paging is not enabled.
On 29.10.2020 17:58, Rahul Singh wrote: >> On 28 Oct 2020, at 3:13 pm, Rahul Singh wrote: >>> On 28 Oct 2020, at 11:56 am, Jan Beulich wrote: >>> On 26.10.2020 18:17, Rahul Singh wrote: --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -1419,13 +1419,15 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag) if ( !is_iommu_enabled(d) ) return 0; -/* Prevent device assign if mem paging or mem sharing have been +#if defined(CONFIG_HAS_MEM_PAGING) || defined(CONFIG_MEM_SHARING) +/* Prevent device assign if mem paging or mem sharing have been * enabled for this domain */ if ( d != dom_io && unlikely(mem_sharing_enabled(d) || vm_event_check_ring(d->vm_event_paging) || p2m_get_hostp2m(d)->global_logdirty) ) return -EXDEV; +#endif >>> >>> Besides this also disabling mem-sharing and log-dirty related >>> logic, I don't think the change is correct: Each item being >>> checked needs individually disabling depending on its associated >>> CONFIG_*. For this, perhaps you want to introduce something >>> like mem_paging_enabled(d), to avoid the need for #ifdef here? >>> >> >> Ok I will fix that in next version. > > I just check and found out that mem-sharing , men-paging and log-dirty is x86 > specific and not implemented for ARM. > Is that will be ok if I move above code to x86 specific directory and > introduce new function arch_pcidev_is_assignable() that will test if pcidev > is assignable or not ? As an immediate workaround - perhaps (long term the individual pieces should still be individually checked here, as they're not inherently x86-specific). Since there's no device property involved here, the suggested name looks misleading. Perhaps arch_iommu_usable(d)? Jan
[PATCH 2/2] x86: PV shim doesn't need GRANT_TABLE
The only reference into the code controlled by this option is from the hypercall table, and that hypercall table entry gets overwritten. Signed-off-by: Jan Beulich --- a/xen/arch/x86/configs/pvshim_defconfig +++ b/xen/arch/x86/configs/pvshim_defconfig @@ -9,6 +9,7 @@ CONFIG_EXPERT=y CONFIG_SCHED_NULL=y # Disable features not used by the PV shim # CONFIG_XEN_SHSTK is not set +# CONFIG_GRANT_TABLE is not set # CONFIG_HYPFS is not set # CONFIG_BIGMEM is not set # CONFIG_KEXEC is not set
[PATCH 1/2] x86: fix build of PV shim when !GRANT_TABLE
To do its compat translation, shim code needs to include the compat header. For this to work, this header first of all needs to be generated. Reported-by: Andrew Cooper Signed-off-by: Jan Beulich --- a/xen/include/Makefile +++ b/xen/include/Makefile @@ -24,6 +24,7 @@ headers-$(CONFIG_X86) += compat/arch headers-$(CONFIG_ARGO)+= compat/argo.h headers-$(CONFIG_PV) += compat/callback.h headers-$(CONFIG_GRANT_TABLE) += compat/grant_table.h +headers-$(CONFIG_PV_SHIM) += compat/grant_table.h headers-$(CONFIG_HVM) += compat/hvm/dm_op.h headers-$(CONFIG_HVM) += compat/hvm/hvm_op.h headers-$(CONFIG_HVM) += compat/hvm/hvm_vcpu.h
[PATCH 0/2] x86: PV shim vs grant table
While Andrew reported a randconfig build failure, I started wondering why we'd ever build in a way different from what had failed to build. Fix the build and then switch the shim config file accordingly. 1: fix build of PV shim when !GRANT_TABLE 2: PV shim doesn't need GRANT_TABLE Jan
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
On 29/10/20 17:56, Arvind Sankar wrote: >> For those two just add: >> struct apic *apic = x86_system_apic; >> before all the assignments. >> Less churn and much better code. >> > Why would it be better code? > I think he means the compiler produces better code, because it won't read the global variable repeatedly. Not sure if that's true,(*) but I think I do prefer that version if Arnd wants to do that tweak. Paolo (*) if it is, it might also be due to Linux using -fno-strict-aliasing
Re: [PATCH v1 4/4] xen/pci: solve compilation error when memory paging is not enabled.
Hello Jan, > On 28 Oct 2020, at 3:13 pm, Rahul Singh wrote: > > Hello Jan, > >> On 28 Oct 2020, at 11:56 am, Jan Beulich wrote: >> >> On 26.10.2020 18:17, Rahul Singh wrote: >>> --- a/xen/drivers/passthrough/pci.c >>> +++ b/xen/drivers/passthrough/pci.c >>> @@ -1419,13 +1419,15 @@ static int assign_device(struct domain *d, u16 seg, >>> u8 bus, u8 devfn, u32 flag) >>>if ( !is_iommu_enabled(d) ) >>>return 0; >>> >>> -/* Prevent device assign if mem paging or mem sharing have been >>> +#if defined(CONFIG_HAS_MEM_PAGING) || defined(CONFIG_MEM_SHARING) >>> +/* Prevent device assign if mem paging or mem sharing have been >>> * enabled for this domain */ >>>if ( d != dom_io && >>> unlikely(mem_sharing_enabled(d) || >>> vm_event_check_ring(d->vm_event_paging) || >>> p2m_get_hostp2m(d)->global_logdirty) ) >>>return -EXDEV; >>> +#endif >> >> Besides this also disabling mem-sharing and log-dirty related >> logic, I don't think the change is correct: Each item being >> checked needs individually disabling depending on its associated >> CONFIG_*. For this, perhaps you want to introduce something >> like mem_paging_enabled(d), to avoid the need for #ifdef here? >> > > Ok I will fix that in next version. I just check and found out that mem-sharing , men-paging and log-dirty is x86 specific and not implemented for ARM. Is that will be ok if I move above code to x86 specific directory and introduce new function arch_pcidev_is_assignable() that will test if pcidev is assignable or not ? > >> Jan Regards, Rahul
[xen-unstable-smoke test] 156297: tolerable all pass - PUSHED
flight 156297 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/156297/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74 baseline version: xen 16a20963b3209788f2c0d3a3eebb7d92f03f5883 Last test of basis 156260 2020-10-27 18:01:29 Z1 days Testing same since 156297 2020-10-29 14:01:23 Z0 days1 attempts People who touched revisions under test: Jan Beulich jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 16a20963b3..1fd1d4bafd 1fd1d4bafdf6f9f8fe5ca9b947f016a7aae92a74 -> smoke
Re: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
Arnd, On Thu, Oct 29 2020 at 10:51, Arnd Bergmann wrote: > On Thu, Oct 29, 2020 at 8:04 AM Paolo Bonzini wrote: >> On 28/10/20 22:20, Arnd Bergmann wrote: >> > Avoid this by renaming the global 'apic' variable to the more descriptive >> > 'x86_system_apic'. It was originally called 'genapic', but both that >> > and the current 'apic' seem to be a little overly generic for a global >> > variable. >> >> The 'apic' affects only the current CPU, so one of 'x86_local_apic', >> 'x86_lapic' or 'x86_apic' is probably preferrable. > > Ok, I'll change it to x86_local_apic then, unless someone else has > a preference between them. x86_local_apic is fine with me. > I think ideally there would be no global variable, withall accesses > encapsulated in function calls, possibly using static_call() optimizations > if any of them are performance critical. There are a bunch, yes. > It doesn't seem hard to do, but I'd rather leave that change to > an x86 person ;-) It's not hard, but I'm not really sure whether it buys much. Can you please redo that against tip x86/apic. Much of what you are touching got a major overhaul. Thanks, tglx
[PATCH] tools: add noidentpt domain config option
The Identity Pagetable is currently being created for all HVM VMs during setup. This was only necessary for running HVM guests on Intel CPUs where EPT was available but unrestricted guest mode was not. Add option to skip the creation of the Identity Pagetable via the "noidentpt" xl config option. This allows a system administrator to skip this step when the hardware is known to have the required features. Signed-off-by: Tamas K Lengyel --- Further context: while running VM forks a significant bottle-neck was identified when HVM_PARAM_IDENT_PT is getting copied from the parent VM. This is due to the fact that setting this parameter requires obtaining a Xen-wide lock (domctl_lock_acquire). When running several VM forks in parallel during fuzzing the fork reset hypercall can fail due to the lock being taken by another fork that's being reset at the same time. This whole situation can be avoided if there is no Identity-map pagetable to begin with as on modern CPUs this identity-map pagetable is not actually required. --- docs/man/xl.cfg.5.pod.in | 5 + tools/include/xenguest.h | 1 + tools/libs/guest/xg_dom_x86.c| 31 +-- tools/libs/light/libxl_create.c | 2 ++ tools/libs/light/libxl_dom.c | 2 ++ tools/libs/light/libxl_types.idl | 1 + tools/xl/xl_parse.c | 2 ++ 7 files changed, 30 insertions(+), 14 deletions(-) diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in index 0532739c1f..4d992fe346 100644 --- a/docs/man/xl.cfg.5.pod.in +++ b/docs/man/xl.cfg.5.pod.in @@ -587,6 +587,11 @@ which are incompatible with migration. Currently this is limited to enabling the invariant TSC feature flag in CPUID results when TSC is not emulated. +=item B + +Disable the creation of the Identity-map Pagetable that was required to run HVM +guests on Intel CPUs with EPT where no unrestricted guest mode was available. + =item B Specify that this domain is a driver domain. This enables certain diff --git a/tools/include/xenguest.h b/tools/include/xenguest.h index a9984dbea5..998a8c57ba 100644 --- a/tools/include/xenguest.h +++ b/tools/include/xenguest.h @@ -26,6 +26,7 @@ #define XCFLAGS_LIVE (1 << 0) #define XCFLAGS_DEBUG (1 << 1) +#define XCFLAGS_NOIDENTPT (1 << 2) #define X86_64_B_SIZE 64 #define X86_32_B_SIZE 32 diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c index 2953aeb90b..827bea7eb7 100644 --- a/tools/libs/guest/xg_dom_x86.c +++ b/tools/libs/guest/xg_dom_x86.c @@ -718,20 +718,23 @@ static int alloc_magic_pages_hvm(struct xc_dom_image *dom) goto out; } -/* - * Identity-map page table is required for running with CR0.PG=0 when - * using Intel EPT. Create a 32-bit non-PAE page directory of superpages. - */ -if ( (ident_pt = xc_map_foreign_range( - xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE, - special_pfn(SPECIALPAGE_IDENT_PT))) == NULL ) -goto error_out; -for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ ) -ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER | - _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); -munmap(ident_pt, PAGE_SIZE); -xc_hvm_param_set(xch, domid, HVM_PARAM_IDENT_PT, - special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT); +if ( !(dom->flags & XCFLAGS_NOIDENTPT) ) +{ +/* + * Identity-map page table is required for running with CR0.PG=0 when + * using Intel EPT. Create a 32-bit non-PAE page directory of superpages. + */ +if ( (ident_pt = xc_map_foreign_range( + xch, domid, PAGE_SIZE, PROT_READ | PROT_WRITE, + special_pfn(SPECIALPAGE_IDENT_PT))) == NULL ) +goto error_out; +for ( i = 0; i < PAGE_SIZE / sizeof(*ident_pt); i++ ) +ident_pt[i] = ((i << 22) | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER | + _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE); +munmap(ident_pt, PAGE_SIZE); +xc_hvm_param_set(xch, domid, HVM_PARAM_IDENT_PT, + special_pfn(SPECIALPAGE_IDENT_PT) << PAGE_SHIFT); +} dom->console_pfn = special_pfn(SPECIALPAGE_CONSOLE); xc_clear_domain_page(dom->xch, dom->guest_domid, dom->console_pfn); diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c index 321a13e519..62b06b359c 100644 --- a/tools/libs/light/libxl_create.c +++ b/tools/libs/light/libxl_create.c @@ -256,6 +256,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc, libxl_defbool_setdefault(_info->disable_migrate, false); +libxl_defbool_setdefault(_info->disable_identpt, false); + for (i = 0 ; i < b_info->num_iomem; i++) if (b_info->iomem[i].gfn == LIBXL_INVALID_GFN) b_info->iomem[i].gfn = b_info->iomem[i].start; diff --git a/tools/libs/light/libxl_dom.c b/tools/libs/light/libxl_dom.c index
Re: [PATCH] x86/hvm: process softirq while saving/loading entries
On 29.10.2020 16:20, Roger Pau Monne wrote: > On slow systems with sync_console saving or loading the context of big > guests can cause the watchdog to trigger. Fix this by adding a couple > of process_pending_softirqs. Which raises the question in how far this is then also a problem for the caller of the underlying hypercall. IOW I wonder whether instead we need to make use of continuations here. Nevertheless ... > Signed-off-by: Roger Pau Monné Acked-by: Jan Beulich Jan
RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
From: Arnd Bergmann > Sent: 29 October 2020 09:51 ... > I think ideally there would be no global variable, withall accesses > encapsulated in function calls, possibly using static_call() optimizations > if any of them are performance critical. There isn't really a massive difference between global variables and global access functions. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
[PATCH v1] xl: fix description of migrate --debug
xl migrate --debug used to track every pfn in every batch of pages. But these times are gone. Adjust the help text to tell what --debug is supposed to do today. Signed-off-by: Olaf Hering --- docs/man/xl.1.pod.in | 4 +++- tools/xl/xl_cmdtable.c | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/docs/man/xl.1.pod.in b/docs/man/xl.1.pod.in index 4bde0672fa..d0f50f0b4a 100644 --- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -488,7 +488,9 @@ Include timestamps in output. =item B<--debug> -Display huge (!) amount of debug information during the migration process. +Verify transferred domU page data. All memory will be transferred one more +time to the destination host while the domU is paused, and compare with +the result of the inital transfer while the domU was still running. =item B<-p> diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index fd2dc0aef2..af160dde42 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -168,7 +168,7 @@ struct cmd_spec cmd_table[] = { "-e Do not wait in the background (on ) for the death\n" "of the domain.\n" "-T Show timestamps during the migration process.\n" - "--debug Print huge (!) amount of debug during the migration process.\n" + "--debug Verify transferred domU page data.\n" "-p Do not unpause domain after migrating it.\n" "-D Preserve the domain id" },
Re: [PATCH 2/2] tools/libs: fix uninstall rule for header files
> On 19 Oct 2020, at 08:21, Jan Beulich wrote: > > This again was working right only as long as $(LIBHEADER) consisted of > just one entry. > > Signed-off-by: Jan Beulich Reviewed-by: Bertrand Marquis The change is obviously fixing a bug :-) and the double $ is required to protect from make. Cheers Bertrand > --- > An alternative would be to use $(addprefix ) without any shell loop. > > --- a/tools/libs/libs.mk > +++ b/tools/libs/libs.mk > @@ -107,7 +107,7 @@ install: build > .PHONY: uninstall > uninstall: > rm -f $(DESTDIR)$(PKG_INSTALLDIR)/$(LIB_FILE_NAME).pc > - for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$(LIBHEADER); > done > + for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$$i; done > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR) > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR).$(MINOR) > >
[PATCH] x86/hvm: process softirq while saving/loading entries
On slow systems with sync_console saving or loading the context of big guests can cause the watchdog to trigger. Fix this by adding a couple of process_pending_softirqs. Signed-off-by: Roger Pau Monné --- xen/arch/x86/hvm/save.c | 4 1 file changed, 4 insertions(+) diff --git a/xen/arch/x86/hvm/save.c b/xen/arch/x86/hvm/save.c index a2c56fbc1e..584620985b 100644 --- a/xen/arch/x86/hvm/save.c +++ b/xen/arch/x86/hvm/save.c @@ -21,6 +21,7 @@ */ #include +#include #include #include @@ -255,6 +256,7 @@ int hvm_save(struct domain *d, hvm_domain_context_t *h) v, i); return -ENODATA; } +process_pending_softirqs(); } } else @@ -268,6 +270,7 @@ int hvm_save(struct domain *d, hvm_domain_context_t *h) d->domain_id, i); return -ENODATA; } +process_pending_softirqs(); } } @@ -341,6 +344,7 @@ int hvm_load(struct domain *d, hvm_domain_context_t *h) d->domain_id, desc->typecode, desc->instance); return -1; } +process_pending_softirqs(); } /* Not reached */ -- 2.29.0
[PATCH] x86/HVM: send mapcache invalidation request to qemu regardless of preemption
Even if only part of a hypercall completed before getting preempted, invalidation ought to occur. Therefore fold the two return statements. Signed-off-by: Jan Beulich --- Split off from "x86/HVM: refine when to send mapcache invalidation request to qemu". --- a/xen/arch/x86/hvm/hypercall.c +++ b/xen/arch/x86/hvm/hypercall.c @@ -326,14 +326,11 @@ int hvm_hypercall(struct cpu_user_regs * HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%lu -> %lx", eax, regs->rax); -if ( curr->hcall_preempted ) -return HVM_HCALL_preempted; - if ( unlikely(currd->arch.hvm.qemu_mapcache_invalidate) && test_and_clear_bool(currd->arch.hvm.qemu_mapcache_invalidate) ) send_invalidate_req(); -return HVM_HCALL_completed; +return curr->hcall_preempted ? HVM_HCALL_preempted : HVM_HCALL_completed; } /*
RE: [PATCH] [v2] x86: apic: avoid -Wshadow warning in header
From: Arnd Bergmann > Sent: 28 October 2020 21:21 > > From: Arnd Bergmann > > There are hundreds of warnings in a W=2 build about a local > variable shadowing the global 'apic' definition: > > arch/x86/kvm/lapic.h:149:65: warning: declaration of 'apic' shadows a global > declaration [-Wshadow] > > Avoid this by renaming the global 'apic' variable to the more descriptive > 'x86_system_apic'. It was originally called 'genapic', but both that > and the current 'apic' seem to be a little overly generic for a global > variable. > > Fixes: c48f14966cc4 ("KVM: inline kvm_apic_present() and kvm_lapic_enabled()") > Fixes: c8d46cf06dc2 ("x86: rename 'genapic' to 'apic'") > Signed-off-by: Arnd Bergmann > --- > v2: rename the global instead of the local variable in the header ... > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c > index 284e73661a18..33e2dc78ca11 100644 > --- a/arch/x86/hyperv/hv_apic.c > +++ b/arch/x86/hyperv/hv_apic.c > @@ -259,14 +259,14 @@ void __init hv_apic_init(void) > /* >* Set the IPI entry points. >*/ > - orig_apic = *apic; > - > - apic->send_IPI = hv_send_ipi; > - apic->send_IPI_mask = hv_send_ipi_mask; > - apic->send_IPI_mask_allbutself = hv_send_ipi_mask_allbutself; > - apic->send_IPI_allbutself = hv_send_ipi_allbutself; > - apic->send_IPI_all = hv_send_ipi_all; > - apic->send_IPI_self = hv_send_ipi_self; > + orig_apic = *x86_system_apic; > + > + x86_system_apic->send_IPI = hv_send_ipi; > + x86_system_apic->send_IPI_mask = hv_send_ipi_mask; > + x86_system_apic->send_IPI_mask_allbutself = > hv_send_ipi_mask_allbutself; > + x86_system_apic->send_IPI_allbutself = hv_send_ipi_allbutself; > + x86_system_apic->send_IPI_all = hv_send_ipi_all; > + x86_system_apic->send_IPI_self = hv_send_ipi_self; > } > > if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { > @@ -285,10 +285,10 @@ void __init hv_apic_init(void) >*/ > apic_set_eoi_write(hv_apic_eoi_write); > if (!x2apic_enabled()) { > - apic->read = hv_apic_read; > - apic->write = hv_apic_write; > - apic->icr_write = hv_apic_icr_write; > - apic->icr_read = hv_apic_icr_read; > + x86_system_apic->read = hv_apic_read; > + x86_system_apic->write = hv_apic_write; > + x86_system_apic->icr_write = hv_apic_icr_write; > + x86_system_apic->icr_read = hv_apic_icr_read; > } For those two just add: struct apic *apic = x86_system_apic; before all the assignments. Less churn and much better code. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
[PATCH] x86emul: support AVX-VNNI
These are VEX-encoded equivalents of the EVEX-encoded AVX512-VNNI ISA extension. Signed-off-by: Jan Beulich --- SDE: -spr --- a/tools/libs/light/libxl_cpuid.c +++ b/tools/libs/light/libxl_cpuid.c @@ -226,6 +226,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"core-caps",0x0007, 0, CPUID_REG_EDX, 30, 1}, {"ssbd", 0x0007, 0, CPUID_REG_EDX, 31, 1}, +{"avx-vnni", 0x0007, 1, CPUID_REG_EAX, 4, 1}, {"avx512-bf16", 0x0007, 1, CPUID_REG_EAX, 5, 1}, {"lahfsahf", 0x8001, NA, CPUID_REG_ECX, 0, 1}, --- a/tools/misc/xen-cpuid.c +++ b/tools/misc/xen-cpuid.c @@ -175,7 +175,7 @@ static const char *const str_7d0[32] = static const char *const str_7a1[32] = { -/* 4 */ [ 5] = "avx512-bf16", +[ 4] = "avx-vnni", [ 5] = "avx512-bf16", }; static const struct { --- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -1335,6 +1335,10 @@ static const struct vex { { { 0x45 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsrlv{d,q} */ { { 0x46 }, 2, T, R, pfx_66, W0, Ln }, /* vpsravd */ { { 0x47 }, 2, T, R, pfx_66, Wn, Ln }, /* vpsllv{d,q} */ +{ { 0x50 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusd */ +{ { 0x51 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpbusds */ +{ { 0x52 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssd */ +{ { 0x53 }, 2, T, R, pfx_66, W0, Ln }, /* vpdpwssds */ { { 0x58 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastd */ { { 0x59 }, 2, T, R, pfx_66, W0, Ln }, /* vpbroadcastq */ { { 0x5a }, 2, F, R, pfx_66, W0, L1 }, /* vbroadcasti128 */ --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -5028,6 +5028,61 @@ int main(int argc, char **argv) printf("okay\n"); } +printf("%-40s", "Testing vpdpwssd (%ecx),%{y,z}mmA,%{y,z}mmB..."); +if ( stack_exec && cpu_has_avx512_vnni && cpu_has_avx_vnni ) +{ +/* Do the same operation two ways and compare the results. */ +decl_insn(vpdpwssd_vex1); +decl_insn(vpdpwssd_vex2); +decl_insn(vpdpwssd_evex); + +for ( i = 0; i < 24; ++i ) +res[i] = i | (~i << 16); + +asm volatile ( "vmovdqu32 32(%0), %%zmm1\n\t" + "vextracti64x4 $1, %%zmm1, %%ymm2\n\t" + "vpxor %%xmm0, %%xmm0, %%xmm3\n\t" + "vpxor %%xmm0, %%xmm0, %%xmm4\n\t" + "vpxor %%xmm0, %%xmm0, %%xmm5\n" + put_insn(vpdpwssd_vex1, +/* %{vex%} vpdpwssd (%1), %%ymm1, %%ymm3" */ +".byte 0xc4, 0xe2, 0x75, 0x52, 0x19") "\n" + put_insn(vpdpwssd_vex2, +/* "%{vex%} vpdpwssd 32(%1), %%ymm2, %%ymm4" */ +".byte 0xc4, 0xe2, 0x6d, 0x52, 0x61, 0x20") "\n" + put_insn(vpdpwssd_evex, +/* "vpdpwssd (%1), %%zmm1, %%zmm5" */ +".byte 0x62, 0xf2, 0x75, 0x48, 0x52, 0x29") + :: "r" (res), "c" (NULL) ); + +set_insn(vpdpwssd_vex1); +regs.ecx = (unsigned long)res; +rc = x86_emulate(, ); +if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_vex1) ) +goto fail; + +set_insn(vpdpwssd_vex2); +regs.ecx = (unsigned long)res; +rc = x86_emulate(, ); +if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_vex2) ) +goto fail; + +set_insn(vpdpwssd_evex); +regs.ecx = (unsigned long)res; +rc = x86_emulate(, ); +if ( rc != X86EMUL_OKAY || !check_eip(vpdpwssd_evex) ) +goto fail; + +asm ( "vinserti64x4 $1, %%ymm4, %%zmm3, %%zmm0\n\t" + "vpcmpeqd %%zmm0, %%zmm5, %%k0\n\t" + "kmovw %%k0, %0" : "=g" (rc) ); +if ( rc != 0x ) +goto fail; +printf("okay\n"); +} +else +printf("skipped\n"); + printf("%-40s", "Testing invpcid 16(%ecx),%%edx..."); if ( stack_exec ) { --- a/tools/tests/x86_emulator/x86-emulate.h +++ b/tools/tests/x86_emulator/x86-emulate.h @@ -170,6 +170,7 @@ static inline bool xcr0_mask(uint64_t ma #define cpu_has_avx512_4fmaps (cp.feat.avx512_4fmaps && xcr0_mask(0xe6)) #define cpu_has_avx512_vp2intersect (cp.feat.avx512_vp2intersect && xcr0_mask(0xe6)) #define cpu_has_serialize cp.feat.serialize +#define cpu_has_avx_vnni (cp.feat.avx_vnni && xcr0_mask(6)) #define cpu_has_avx512_bf16 (cp.feat.avx512_bf16 && xcr0_mask(0xe6)) #define cpu_has_xgetbv1 (cpu_has_xsave && cp.xstate.xgetbv1) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -2008,6 +2008,7 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps) #define vcpu_has_avx512_vp2intersect()
Re: [PATCH 12/12] xen/cpupool: make per-cpupool sched-gran hypfs node writable
On 29.10.20 15:58, Jan Beulich wrote: On 26.10.2020 10:13, Juergen Gross wrote: @@ -1088,13 +1098,58 @@ static int cpupool_gran_read(const struct hypfs_entry *entry, return copy_to_guest(uaddr, name, strlen(name) + 1) ? -EFAULT : 0; } +static int cpupool_gran_write(struct hypfs_entry_leaf *leaf, + XEN_GUEST_HANDLE_PARAM(void) uaddr, + unsigned int ulen) +{ +const struct hypfs_dyndir_id *data; +struct cpupool *cpupool; +enum sched_gran gran; +unsigned int sched_gran; +char name[SCHED_GRAN_NAME_LEN]; +int ret = 0; + +if ( ulen > SCHED_GRAN_NAME_LEN ) +return -ENOSPC; + +if ( copy_from_guest(name, uaddr, ulen) ) +return -EFAULT; + +sched_gran = sched_gran_get(name, ) ? 0 + : cpupool_check_granularity(gran); +if ( memchr(name, 0, ulen) != (name + ulen - 1) || sched_gran == 0 ) +return -EINVAL; I guess the memchr() check wants to happen before the call to sched_gran_get()? Yes. Juergen
Re: [PATCH 12/12] xen/cpupool: make per-cpupool sched-gran hypfs node writable
On 26.10.2020 10:13, Juergen Gross wrote: > @@ -1088,13 +1098,58 @@ static int cpupool_gran_read(const struct hypfs_entry > *entry, > return copy_to_guest(uaddr, name, strlen(name) + 1) ? -EFAULT : 0; > } > > +static int cpupool_gran_write(struct hypfs_entry_leaf *leaf, > + XEN_GUEST_HANDLE_PARAM(void) uaddr, > + unsigned int ulen) > +{ > +const struct hypfs_dyndir_id *data; > +struct cpupool *cpupool; > +enum sched_gran gran; > +unsigned int sched_gran; > +char name[SCHED_GRAN_NAME_LEN]; > +int ret = 0; > + > +if ( ulen > SCHED_GRAN_NAME_LEN ) > +return -ENOSPC; > + > +if ( copy_from_guest(name, uaddr, ulen) ) > +return -EFAULT; > + > +sched_gran = sched_gran_get(name, ) ? 0 > + : > cpupool_check_granularity(gran); > +if ( memchr(name, 0, ulen) != (name + ulen - 1) || sched_gran == 0 ) > +return -EINVAL; I guess the memchr() check wants to happen before the call to sched_gran_get()? Jan
Re: [PATCH 20/33] docs: ABI: testing: make the files compatible with ReST output
On Wed, 28 Oct 2020 15:23:18 +0100 Mauro Carvalho Chehab wrote: > From: Mauro Carvalho Chehab > > Some files over there won't parse well by Sphinx. > > Fix them. > > Signed-off-by: Mauro Carvalho Chehab > Signed-off-by: Mauro Carvalho Chehab Query below... I'm going to guess a rebase issue? Other than that Acked-by: Jonathan Cameron # for IIO > diff --git a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 > b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 > index b7259234ad70..a10a4de3e5fe 100644 > --- a/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 > +++ b/Documentation/ABI/testing/sysfs-bus-iio-timer-stm32 > @@ -3,67 +3,85 @@ KernelVersion: 4.11 > Contact: benjamin.gaign...@st.com > Description: > Reading returns the list possible master modes which are: > - - "reset" : The UG bit from the TIMx_EGR register is > + > + > + - "reset" > + The UG bit from the TIMx_EGR register is > used as trigger output (TRGO). > - - "enable": The Counter Enable signal CNT_EN is used > + - "enable" > + The Counter Enable signal CNT_EN is used > as trigger output. > - - "update": The update event is selected as trigger output. > + - "update" > + The update event is selected as trigger output. > For instance a master timer can then be used > as a prescaler for a slave timer. > - - "compare_pulse" : The trigger output send a positive pulse > - when the CC1IF flag is to be set. > - - "OC1REF": OC1REF signal is used as trigger output. > - - "OC2REF": OC2REF signal is used as trigger output. > - - "OC3REF": OC3REF signal is used as trigger output. > - - "OC4REF": OC4REF signal is used as trigger output. > + - "compare_pulse" > + The trigger output send a positive pulse > + when the CC1IF flag is to be set. > + - "OC1REF" > + OC1REF signal is used as trigger output. > + - "OC2REF" > + OC2REF signal is used as trigger output. > + - "OC3REF" > + OC3REF signal is used as trigger output. > + - "OC4REF" > + OC4REF signal is used as trigger output. > + > Additional modes (on TRGO2 only): > - - "OC5REF": OC5REF signal is used as trigger output. > - - "OC6REF": OC6REF signal is used as trigger output. > + > + - "OC5REF" > + OC5REF signal is used as trigger output. > + - "OC6REF" > + OC6REF signal is used as trigger output. > - "compare_pulse_OC4REF": > - OC4REF rising or falling edges generate pulses. > + OC4REF rising or falling edges generate pulses. > - "compare_pulse_OC6REF": > - OC6REF rising or falling edges generate pulses. > + OC6REF rising or falling edges generate pulses. > - "compare_pulse_OC4REF_r_or_OC6REF_r": > - OC4REF or OC6REF rising edges generate pulses. > + OC4REF or OC6REF rising edges generate pulses. > - "compare_pulse_OC4REF_r_or_OC6REF_f": > - OC4REF rising or OC6REF falling edges generate pulses. > + OC4REF rising or OC6REF falling edges generate > + pulses. > - "compare_pulse_OC5REF_r_or_OC6REF_r": > - OC5REF or OC6REF rising edges generate pulses. > + OC5REF or OC6REF rising edges generate pulses. > - "compare_pulse_OC5REF_r_or_OC6REF_f": > - OC5REF rising or OC6REF falling edges generate pulses. > + OC5REF rising or OC6REF falling edges generate > + pulses. > > - +---+ +-++-+ > - | Prescaler +-> | Counter |+-> | Master | TRGO(2) > - +---+ +--++-+|-> | Control +--> > -|| || +-+ > - +--v+-+ OCxREF || +-+ > - | Chx compare +--> | Output | ChX > - +---+-+ | | Control +--> > - . | | +-+ > - . | |. > -
Re: [PATCH] xen: add support for automatic debug key actions in case of crash
On 29.10.2020 15:40, Jürgen Groß wrote: > On 29.10.20 15:22, Jan Beulich wrote: >> On 22.10.2020 16:39, Juergen Gross wrote: >>> @@ -507,6 +509,41 @@ void __init initialize_keytable(void) >>> } >>> } >>> >>> +#define CRASHACTION_SIZE 32 >>> +static char crash_debug_panic[CRASHACTION_SIZE]; >>> +static char crash_debug_hwdom[CRASHACTION_SIZE]; >>> +static char crash_debug_watchdog[CRASHACTION_SIZE]; >>> +static char crash_debug_kexeccmd[CRASHACTION_SIZE]; >>> +static char crash_debug_debugkey[CRASHACTION_SIZE]; >>> + >>> +static char *crash_action[CRASHREASON_N] = { >>> +[CRASHREASON_PANIC] = crash_debug_panic, >>> +[CRASHREASON_HWDOM] = crash_debug_hwdom, >>> +[CRASHREASON_WATCHDOG] = crash_debug_watchdog, >>> +[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd, >>> +[CRASHREASON_DEBUGKEY] = crash_debug_debugkey, >>> +}; >>> + >>> +string_runtime_param("crash-debug-panic", crash_debug_panic); >>> +string_runtime_param("crash-debug-hwdom", crash_debug_hwdom); >>> +string_runtime_param("crash-debug-watchdog", crash_debug_watchdog); >>> +string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd); >>> +string_runtime_param("crash-debug-debugkey", crash_debug_debugkey); >> >> In general I'm not in favor of this (or similar) needing >> five new command line options, instead of just one. The problem >> with e.g. >> >> crash-debug=panic:rq,watchdog:0d >> >> is that ',' (or any other separator chosen) could in principle >> also be a debug key. It would still be possible to use >> >> crash-debug=panic:rq crash-debug=watchdog:0d >> >> though. Thoughts? > > OTOH the runtime parameters are much easier addressable that way. Ah yes, I can see this as a reason. Would make we wonder whether command line and runtime handling may want disconnecting in this specific case then. (But I can also see the argument of this being too much overhead then.) >>> +void keyhandler_crash_action(enum crash_reason reason) >>> +{ >>> +const char *action = crash_action[reason]; >>> +struct cpu_user_regs *regs = get_irq_regs() ? : guest_cpu_user_regs(); >>> + >>> +while ( *action ) { >>> +if ( *action == '.' ) >>> +mdelay(1000); >>> +else >>> +handle_keypress(*action, regs); >>> +action++; >>> +} >>> +} >> >> I think only diagnostic keys should be permitted here. > > While I understand that other keys could produce nonsense or do harm, > I'm not sure we should really prohibit them. Allowing them would e.g. > allow to do just a reboot without kdump for one type of crashes. Ah yes, that's a fair point. >>> --- a/xen/include/xen/kexec.h >>> +++ b/xen/include/xen/kexec.h >>> @@ -1,6 +1,8 @@ >>> #ifndef __XEN_KEXEC_H__ >>> #define __XEN_KEXEC_H__ >>> >>> +#include >> >> Could we go without this, utilizing the gcc extension of forward >> declared enums? Otoh ... >> >>> @@ -82,7 +84,11 @@ void vmcoreinfo_append_str(const char *fmt, ...) >>> #define kexecing 0 >>> >>> static inline void kexec_early_calculations(void) {} >>> -static inline void kexec_crash(void) {} >>> +static inline void kexec_crash(enum crash_reason reason) >>> +{ >>> +keyhandler_crash_action(reason); >>> +} >> >> ... if this is to be an inline function and not just a #define, >> it'll need the declaration of the function to have been seen. > > And even being a #define all users of kexec_crash() would need to > #include keyhandler.h (and I'm not sure there are any source files > including kexec.h which don't use kexec_crash()). About as many which do as ones which don't. But there's no generally accessible header which includes xen/kexec.h, so perhaps the extra dependency indeed isn't all this problematic. Jan
Re: [PATCH] xen: add support for automatic debug key actions in case of crash
On 29.10.20 15:22, Jan Beulich wrote: On 22.10.2020 16:39, Juergen Gross wrote: When the host crashes it would sometimes be nice to have additional debug data available which could be produced via debug keys, but halting the server for manual intervention might be impossible due to the need to reboot/kexec rather sooner than later. Add support for automatic debug key actions in case of crashes which can be activated via boot- or runtime-parameter. While I can certainly see this possibly being a useful thing in certain situations, I'm uncertain it's going to be helpful in at least a fair set of cases. What output to request very much depends on the nature of the problem one is running into, and the more keys one adds "just in case", the longer the reboot latency, and the higher the risk (see also below) of the output generation actually causing further problems. The obvious case is a watchdog induced crash: at least 2 sets of dom0 state will help in many cases. IOW I'm neither fully convinced that we want this, nor fully opposed. Depending on the type of crash the desired data might be different, so support different settings for the possible types of crashes. The parameter is "crash-debug" with the following syntax: crash-debug-= with being one of: panic, hwdom, watchdog, kexeccmd, debugkey and a sequence of debug key characters with '.' having the special semantics of a 1 second pause. 1 second is a whole lot of time. To get two successive sets of data, a much shorter delay (if any) would normally suffice. Yes, I'd be fine to trade that for a shorter period of time. Also, while '.' may seem like a good choice right now, with the shortage of characters we may want to put a real handler behind it at come point. The one character that clearly won't make much sense to use in this context is 'h', but that's awful as a (kind of) separator. Could we perhaps replace 'h' by '?', freeing up 'h' and allowing '?' to be used for this purpose here? Fine with me. Another possibility would be to add '\' as an escape character with '\.' meaning "debug-key .". --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -574,6 +574,29 @@ reduction of features at Xen's disposal to manage guests. ### cpuinfo (x86) > `= ` +### crash-debug-debugkey +### crash-debug-hwdom +### crash-debug-kexeccmd +### crash-debug-panic +### crash-debug-watchdog +> `= ` + +> Can be modified at runtime + +Specify debug-key actions in cases of crashes. Each of the parameters applies +to a different crash reason. The `` is a sequence of debug key +characters, with `.` having the special meaning of a 1 second pause. + +So e.g. `crash-debug-watchdog=0.0r` would dump dom0 state twice with a +second between the two state dumps, followed by the run queues of the +hypervisor, if the system crashes due to a watchdog timeout. + +These parameters should be used carefully, as e.g. specifying +`crash-debug-debugkey=C` would result in an endless loop. Depending on the +reason of the system crash it might happen that triggering some debug key +action will result in a hang instead of dumping data and then doing a +reboot or crash dump. I think it would be useful if the flavors were (briefly) explained: At the very least "debugkey" doesn't directly fit "in cases of crashes", as there's no crash involved. kexec_crash() instead gets invoked without there having been any crash. Yes, and having some additional state generate for this case might help diagnosis. You may also want to point out that this is a best effort thing only - system state at the point of a crash may be such that the attempt of handling one or the debug keys would have further bad effects on the system, including that the actual kexec may then never occur. True. @@ -507,6 +509,41 @@ void __init initialize_keytable(void) } } +#define CRASHACTION_SIZE 32 +static char crash_debug_panic[CRASHACTION_SIZE]; +static char crash_debug_hwdom[CRASHACTION_SIZE]; +static char crash_debug_watchdog[CRASHACTION_SIZE]; +static char crash_debug_kexeccmd[CRASHACTION_SIZE]; +static char crash_debug_debugkey[CRASHACTION_SIZE]; + +static char *crash_action[CRASHREASON_N] = { +[CRASHREASON_PANIC] = crash_debug_panic, +[CRASHREASON_HWDOM] = crash_debug_hwdom, +[CRASHREASON_WATCHDOG] = crash_debug_watchdog, +[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd, +[CRASHREASON_DEBUGKEY] = crash_debug_debugkey, +}; + +string_runtime_param("crash-debug-panic", crash_debug_panic); +string_runtime_param("crash-debug-hwdom", crash_debug_hwdom); +string_runtime_param("crash-debug-watchdog", crash_debug_watchdog); +string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd); +string_runtime_param("crash-debug-debugkey", crash_debug_debugkey); In general I'm not in favor of this (or similar) needing five new command line options, instead of just one. The problem with e.g. crash-debug=panic:rq,watchdog:0d is that ','
Re: [PATCH] x86/pv: Drop stale comment in dom0_construct_pv()
On 29.10.2020 15:00, Andrew Cooper wrote: > This comment has been around since c/s 1372bca0615 in 2004. It is stale, as > it predates the introduction of struct vcpu. That commit only moved it around; it's 22a857bde9b8 afaics from early 2003 where it first appeared, where it had a reason: /* * WARNING: The new domain must have its 'processor' field * filled in by now !! */ phys_l2tab = ALLOC_FRAME_FROM_DOMAIN(); l2tab = map_domain_mem(phys_l2tab); memcpy(l2tab, idle_pg_table[p->processor], PAGE_SIZE); But yes, the comment has been stale for a long time, and I've been wondering a number of times what it was supposed to tell me. (I think it was already stale at the point the comment first got altered, in 3072fef54df8.) > It is not obvious that it was even correct at the time. Where a vcpu (domain > at the time) has been configured to run is unrelated to construct the domain's > initial pagetables, etc. > > Signed-off-by: Andrew Cooper Acked-by: Jan Beulich Jan
Re: [XEN PATCH v1] xen/arm : Add support for SMMUv3 driver
Hi Julien, > On 28 Oct 2020, at 19:12, Julien Grall wrote: > > > > On 26/10/2020 11:03, Rahul Singh wrote: >> Hello Julien, > > Hi Rahul, > >>> On 23 Oct 2020, at 4:19 pm, Julien Grall wrote: >>> >>> >>> >>> On 23/10/2020 15:27, Rahul Singh wrote: Hello Julien, > On 23 Oct 2020, at 2:00 pm, Julien Grall wrote: > > > > On 23/10/2020 12:35, Rahul Singh wrote: >> Hello, >>> On 23 Oct 2020, at 1:02 am, Stefano Stabellini >>> wrote: >>> >>> On Thu, 22 Oct 2020, Julien Grall wrote: >> On 20/10/2020 16:25, Rahul Singh wrote: >>> Add support for ARM architected SMMUv3 implementations. It is based >>> on >>> the Linux SMMUv3 driver. >>> Major differences between the Linux driver are as follows: >>> 1. Only Stage-2 translation is supported as compared to the Linux >>> driver >>>that supports both Stage-1 and Stage-2 translations. >>> 2. Use P2M page table instead of creating one as SMMUv3 has the >>>capability to share the page tables with the CPU. >>> 3. Tasklets is used in place of threaded IRQ's in Linux for event >>> queue >>>and priority queue IRQ handling. >> >> Tasklets are not a replacement for threaded IRQ. In particular, they >> will >> have priority over anything else (IOW nothing will run on the pCPU >> until >> they are done). >> >> Do you know why Linux is using thread. Is it because of long running >> operations? > > Yes you are right because of long running operations Linux is using > the > threaded IRQs. > > SMMUv3 reports fault/events bases on memory-based circular buffer > queues not > based on the register. As per my understanding, it is time-consuming > to > process the memory based queues in interrupt context because of that > Linux > is using threaded IRQ to process the faults/events from SMMU. > > I didn’t find any other solution in XEN in place of tasklet to defer > the > work, that’s why I used tasklet in XEN in replacement of threaded > IRQs. If > we do all work in interrupt context we will make XEN less responsive. So we need to make sure that Xen continue to receives interrupts, but we also need to make sure that a vCPU bound to the pCPU is also responsive. > > If you know another solution in XEN that will be used to defer the > work in > the interrupt please let me know I will try to use that. One of my work colleague encountered a similar problem recently. He had a long running tasklet and wanted to be broken down in smaller chunk. We decided to use a timer to reschedule the taslket in the future. This allows the scheduler to run other loads (e.g. vCPU) for some time. This is pretty hackish but I couldn't find a better solution as tasklet have high priority. Maybe the other will have a better idea. >>> >>> Julien's suggestion is a good one. >>> >>> But I think tasklets can be configured to be called from the idle_loop, >>> in which case they are not run in interrupt context? >>> >> Yes you are right tasklet will be scheduled from the idle_loop that is >> not interrupt conext. > > This depends on your tasklet. Some will run from the softirq context > which is usually (for Arm) on the return of an exception. > Thanks for the info. I will check and will get better understanding of the tasklet how it will run in XEN. >>> >>> 4. Latest version of the Linux SMMUv3 code implements the commands >>> queue >>>access functions based on atomic operations implemented in Linux. >> >> Can you provide more details? > > I tried to port the latest version of the SMMUv3 code than I observed > that > in order to port that code I have to also port atomic operation > implemented > in Linux to XEN. As latest Linux code uses atomic operation to > process the > command queues > (atomic_cond_read_relaxed(),atomic_long_cond_read_relaxed() , > atomic_fetch_andnot_relaxed()) . Thank you for the explanation. I think it would be best to import the atomic helpers and use the latest code. This will ensure that we don't re-introduce bugs and also buy us some time before the Linux and Xen driver diverge again too much. Stefano, what do you think? >>> >>> I think you are right. >> Yes, I agree with
Re: [PATCH] xen: add support for automatic debug key actions in case of crash
On 22.10.2020 16:39, Juergen Gross wrote: > When the host crashes it would sometimes be nice to have additional > debug data available which could be produced via debug keys, but > halting the server for manual intervention might be impossible due to > the need to reboot/kexec rather sooner than later. > > Add support for automatic debug key actions in case of crashes which > can be activated via boot- or runtime-parameter. While I can certainly see this possibly being a useful thing in certain situations, I'm uncertain it's going to be helpful in at least a fair set of cases. What output to request very much depends on the nature of the problem one is running into, and the more keys one adds "just in case", the longer the reboot latency, and the higher the risk (see also below) of the output generation actually causing further problems. IOW I'm neither fully convinced that we want this, nor fully opposed. > Depending on the type of crash the desired data might be different, so > support different settings for the possible types of crashes. > > The parameter is "crash-debug" with the following syntax: > > crash-debug-= > > with being one of: > > panic, hwdom, watchdog, kexeccmd, debugkey > > and a sequence of debug key characters with '.' having the > special semantics of a 1 second pause. 1 second is a whole lot of time. To get two successive sets of data, a much shorter delay (if any) would normally suffice. Also, while '.' may seem like a good choice right now, with the shortage of characters we may want to put a real handler behind it at come point. The one character that clearly won't make much sense to use in this context is 'h', but that's awful as a (kind of) separator. Could we perhaps replace 'h' by '?', freeing up 'h' and allowing '?' to be used for this purpose here? > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -574,6 +574,29 @@ reduction of features at Xen's disposal to manage guests. > ### cpuinfo (x86) > > `= ` > > +### crash-debug-debugkey > +### crash-debug-hwdom > +### crash-debug-kexeccmd > +### crash-debug-panic > +### crash-debug-watchdog > +> `= ` > + > +> Can be modified at runtime > + > +Specify debug-key actions in cases of crashes. Each of the parameters applies > +to a different crash reason. The `` is a sequence of debug key > +characters, with `.` having the special meaning of a 1 second pause. > + > +So e.g. `crash-debug-watchdog=0.0r` would dump dom0 state twice with a > +second between the two state dumps, followed by the run queues of the > +hypervisor, if the system crashes due to a watchdog timeout. > + > +These parameters should be used carefully, as e.g. specifying > +`crash-debug-debugkey=C` would result in an endless loop. Depending on the > +reason of the system crash it might happen that triggering some debug key > +action will result in a hang instead of dumping data and then doing a > +reboot or crash dump. I think it would be useful if the flavors were (briefly) explained: At the very least "debugkey" doesn't directly fit "in cases of crashes", as there's no crash involved. kexec_crash() instead gets invoked without there having been any crash. You may also want to point out that this is a best effort thing only - system state at the point of a crash may be such that the attempt of handling one or the debug keys would have further bad effects on the system, including that the actual kexec may then never occur. > @@ -507,6 +509,41 @@ void __init initialize_keytable(void) > } > } > > +#define CRASHACTION_SIZE 32 > +static char crash_debug_panic[CRASHACTION_SIZE]; > +static char crash_debug_hwdom[CRASHACTION_SIZE]; > +static char crash_debug_watchdog[CRASHACTION_SIZE]; > +static char crash_debug_kexeccmd[CRASHACTION_SIZE]; > +static char crash_debug_debugkey[CRASHACTION_SIZE]; > + > +static char *crash_action[CRASHREASON_N] = { > +[CRASHREASON_PANIC] = crash_debug_panic, > +[CRASHREASON_HWDOM] = crash_debug_hwdom, > +[CRASHREASON_WATCHDOG] = crash_debug_watchdog, > +[CRASHREASON_KEXECCMD] = crash_debug_kexeccmd, > +[CRASHREASON_DEBUGKEY] = crash_debug_debugkey, > +}; > + > +string_runtime_param("crash-debug-panic", crash_debug_panic); > +string_runtime_param("crash-debug-hwdom", crash_debug_hwdom); > +string_runtime_param("crash-debug-watchdog", crash_debug_watchdog); > +string_runtime_param("crash-debug-kexeccmd", crash_debug_kexeccmd); > +string_runtime_param("crash-debug-debugkey", crash_debug_debugkey); In general I'm not in favor of this (or similar) needing five new command line options, instead of just one. The problem with e.g. crash-debug=panic:rq,watchdog:0d is that ',' (or any other separator chosen) could in principle also be a debug key. It would still be possible to use crash-debug=panic:rq crash-debug=watchdog:0d though. Thoughts? > +void keyhandler_crash_action(enum crash_reason reason) > +{ > +const char *action = crash_action[reason]; > +
[PATCH] x86/pv: Drop stale comment in dom0_construct_pv()
This comment has been around since c/s 1372bca0615 in 2004. It is stale, as it predates the introduction of struct vcpu. It is not obvious that it was even correct at the time. Where a vcpu (domain at the time) has been configured to run is unrelated to construct the domain's initial pagetables, etc. Signed-off-by: Andrew Cooper --- CC: Jan Beulich CC: Roger Pau Monné CC: Wei Liu Almost... I'm not entirely sure NUMA memory allocation is plumbed through correctly, but even that still has nothing to do with v->processor --- xen/arch/x86/pv/dom0_build.c | 1 - 1 file changed, 1 deletion(-) diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c index d79503d6a9..f7165309a2 100644 --- a/xen/arch/x86/pv/dom0_build.c +++ b/xen/arch/x86/pv/dom0_build.c @@ -616,7 +616,6 @@ int __init dom0_construct_pv(struct domain *d, v->arch.pv.event_callback_cs= FLAT_COMPAT_KERNEL_CS; } -/* WARNING: The new domain must have its 'processor' field filled in! */ if ( !is_pv_32bit_domain(d) ) { maddr_to_page(mpt_alloc)->u.inuse.type_info = PGT_l4_page_table; -- 2.11.0
Ping²: [PATCH] x86/PV: make post-migration page state consistent
On 11.09.2020 12:34, Jan Beulich wrote: > When a page table page gets de-validated, its type reference count drops > to zero (and PGT_validated gets cleared), but its type remains intact. > XEN_DOMCTL_getpageframeinfo3, therefore, so far reported prior usage for > such pages. An intermediate write to such a page via e.g. > MMU_NORMAL_PT_UPDATE, however, would transition the page's type to > PGT_writable_page, thus altering what XEN_DOMCTL_getpageframeinfo3 would > return. In libxc the decision which pages to normalize / localize > depends solely on the type returned from the domctl. As a result without > further precautions the guest won't be able to tell whether such a page > has had its (apparent) PTE entries transitioned to the new MFNs. > > Add a check of PGT_validated, thus consistently avoiding normalization / > localization in the tool stack. > > Alongside using XEN_DOMCTL_PFINFO_NOTAB instead of plain zero for the > change at hand, also change the variable's initializer to use this > constant, too. Take the opportunity and also adjust its type. > > Signed-off-by: Jan Beulich I think I did address all questions here. Jan > --- a/xen/arch/x86/domctl.c > +++ b/xen/arch/x86/domctl.c > @@ -215,7 +215,8 @@ long arch_do_domctl( > > for ( i = 0; i < num; ++i ) > { > -unsigned long gfn = 0, type = 0; > +unsigned long gfn = 0; > +unsigned int type = XEN_DOMCTL_PFINFO_NOTAB; > struct page_info *page; > p2m_type_t t; > > @@ -255,6 +256,8 @@ long arch_do_domctl( > > if ( page->u.inuse.type_info & PGT_pinned ) > type |= XEN_DOMCTL_PFINFO_LPINTAB; > +else if ( !(page->u.inuse.type_info & PGT_validated) ) > +type = XEN_DOMCTL_PFINFO_NOTAB; > > if ( page->count_info & PGC_broken ) > type = XEN_DOMCTL_PFINFO_BROKEN; >
Ping: [PATCH] tools/python: pass more -rpath-link options to ld
On 19.10.2020 10:31, Jan Beulich wrote: > With the split of libraries, I've observed a number of warnings from > (old?) ld. > > Signed-off-by: Jan Beulich Marek? > --- > It's unclear to me whether this is ld version dependent - the pattern > of where I've seen such warnings doesn't suggest a clear version > dependency. > > --- a/tools/python/setup.py > +++ b/tools/python/setup.py > @@ -7,10 +7,15 @@ XEN_ROOT = "../.." > extra_compile_args = [ "-fno-strict-aliasing", "-Werror" ] > > PATH_XEN = XEN_ROOT + "/tools/include" > +PATH_LIBXENTOOLCORE = XEN_ROOT + "/tools/libs/toolcore" > PATH_LIBXENTOOLLOG = XEN_ROOT + "/tools/libs/toollog" > +PATH_LIBXENCALL = XEN_ROOT + "/tools/libs/call" > PATH_LIBXENEVTCHN = XEN_ROOT + "/tools/libs/evtchn" > +PATH_LIBXENGNTTAB = XEN_ROOT + "/tools/libs/gnttab" > PATH_LIBXENCTRL = XEN_ROOT + "/tools/libs/ctrl" > PATH_LIBXENGUEST = XEN_ROOT + "/tools/libs/guest" > +PATH_LIBXENDEVICEMODEL = XEN_ROOT + "/tools/libs/devicemodel" > +PATH_LIBXENFOREIGNMEMORY = XEN_ROOT + "/tools/libs/foreignmemory" > PATH_XENSTORE = XEN_ROOT + "/tools/libs/store" > > xc = Extension("xc", > @@ -24,7 +29,13 @@ xc = Extension("xc", > library_dirs = [ PATH_LIBXENCTRL, PATH_LIBXENGUEST ], > libraries = [ "xenctrl", "xenguest" ], > depends= [ PATH_LIBXENCTRL + "/libxenctrl.so", > PATH_LIBXENGUEST + "/libxenguest.so" ], > - extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG > ], > + extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENCALL, > + > "-Wl,-rpath-link="+PATH_LIBXENDEVICEMODEL, > + "-Wl,-rpath-link="+PATH_LIBXENEVTCHN, > + > "-Wl,-rpath-link="+PATH_LIBXENFOREIGNMEMORY, > + "-Wl,-rpath-link="+PATH_LIBXENGNTTAB, > + "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE, > + "-Wl,-rpath-link="+PATH_LIBXENTOOLLOG > ], > sources= [ "xen/lowlevel/xc/xc.c" ]) > > xs = Extension("xs", > @@ -33,6 +44,7 @@ xs = Extension("xs", > library_dirs = [ PATH_XENSTORE ], > libraries = [ "xenstore" ], > depends= [ PATH_XENSTORE + "/libxenstore.so" ], > + extra_link_args= [ "-Wl,-rpath-link="+PATH_LIBXENTOOLCORE > ], > sources= [ "xen/lowlevel/xs/xs.c" ]) > > plat = os.uname()[0] >
Ping: [PATCH 2/2] tools/libs: fix uninstall rule for header files
On 19.10.2020 09:21, Jan Beulich wrote: > This again was working right only as long as $(LIBHEADER) consisted of > just one entry. > > Signed-off-by: Jan Beulich While patch 1 has become irrelevant with Juergen's more complete change, this one is still applicable afaict. Jan > --- > An alternative would be to use $(addprefix ) without any shell loop. > > --- a/tools/libs/libs.mk > +++ b/tools/libs/libs.mk > @@ -107,7 +107,7 @@ install: build > .PHONY: uninstall > uninstall: > rm -f $(DESTDIR)$(PKG_INSTALLDIR)/$(LIB_FILE_NAME).pc > - for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$(LIBHEADER); > done > + for i in $(LIBHEADER); do rm -f $(DESTDIR)$(includedir)/$$i; done > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR) > rm -f $(DESTDIR)$(libdir)/lib$(LIB_FILE_NAME).so.$(MAJOR).$(MINOR) > >
[linux-linus test] 156269: regressions - FAIL
flight 156269 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/156269/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-xsm7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-intel 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-coresched-i386-xl 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemut-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemut-ws16-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-amd64-i386-libvirt-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt 7 xen-install fail REGR. vs. 152332 test-amd64-i386-qemuu-rhel6hvm-amd 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-examine 6 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-raw7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-pvshim 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-shadow 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-debianhvm-i386-xsm 7 xen-install fail REGR. vs. 152332 test-amd64-i386-freebsd10-i386 7 xen-installfail REGR. vs. 152332 test-amd64-i386-xl-qemuu-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemut-win7-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-xl-qemuu-ovmf-amd64 7 xen-install fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 10 xen-install/src_host fail REGR. vs. 152332 test-amd64-i386-libvirt-pair 11 xen-install/dst_host fail REGR. vs. 152332 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 7 xen-install fail REGR. vs. 152332 test-arm64-arm64-xl-credit1 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-xsm 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-libvirt-xsm 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-examine 8 reboot fail REGR. vs. 152332 test-amd64-amd64-amd64-pvgrub 20 guest-stop fail REGR. vs. 152332 test-amd64-amd64-i386-pvgrub 19 guest-localmigrate/x10 fail REGR. vs. 152332 test-armhf-armhf-xl-credit1 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-libvirt 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-examine 8 reboot fail REGR. vs. 152332 test-armhf-armhf-xl-cubietruck 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-xl-multivcpu 8 xen-bootfail REGR. vs. 152332 test-armhf-armhf-libvirt-raw 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-seattle 8 xen-boot fail REGR. vs. 152332 test-arm64-arm64-xl-credit2 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-xl 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-xl-credit2 8 xen-boot fail REGR. vs. 152332 test-armhf-armhf-xl-vhd 8 xen-boot fail REGR. vs. 152332 Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds 8 xen-boot fail REGR. vs. 152332 Tests which did not succeed, but are not blocking: test-arm64-arm64-xl 11 leak-check/basis(11)fail blocked in 152332 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 152332 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 152332 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 152332 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 152332 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 152332 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass
Ping: [PATCH v3 0/3] x86: shim building adjustments (plus shadow follow-on)
Tim, On 19.10.2020 10:42, Jan Beulich wrote: > The change addressing the shadow related build issue noticed by > Andrew went in already. The build breakage goes beyond this > specific combination though - PV_SHIM_EXCLUSIVE plus HVM is > similarly an issue. This is what the 1st patch tries to take care > of, in a shape already on irc noticed to be controversial. I'm > submitting the change nevertheless because for the moment there > looks to be a majority in favor of going this route. One argument > not voiced there yet: What good does it do to allow a user to > enable HVM when then on the resulting hypervisor they still can't > run HVM guests (for the hypervisor still being a dedicated PV > shim one). On top of this, the alternative approach is likely > going to get ugly. > > The shadow related adjustments are here merely because the want > to make them was noticed in the context of the patch which has > already gone in. > > 1: don't permit HVM and PV_SHIM_EXCLUSIVE at the same time > 2: refactor shadow_vram_{get,put}_l1e() > 3: sh_{make,destroy}_monitor_table() are "even more" HVM-only unless you tell me otherwise I'm intending to commit the latter two with Roger's acks some time next week. Of course it would still be nice to know your view on the first of the TBDs in patch 3 (which may result in further changes, in a follow-up). Jan
Re: [PATCH] meson.build: fix building of Xen support for aarch64
On Thu, Oct 29, 2020 at 6:01 AM Alex Bennée wrote: > > > Stefano Stabellini writes: > > > On Wed, 28 Oct 2020, Alex Bennée wrote: > >> Xen is supported on aarch64 although weirdly using the i386-softmmu > >> model. Checking based on the host CPU meant we never enabled Xen > >> support. It would be nice to enable CONFIG_XEN for aarch64-softmmu to > >> make it not seem weird but that will require further build surgery. > >> > >> Signed-off-by: Alex Bennée > >> Cc: Masami Hiramatsu > >> Cc: Stefano Stabellini > >> Cc: Anthony Perard > >> Cc: Paul Durrant > >> Fixes: 8a19980e3f ("configure: move accelerator logic to meson") > >> --- > >> meson.build | 2 ++ > >> 1 file changed, 2 insertions(+) > >> > >> diff --git a/meson.build b/meson.build > >> index 835424999d..f1fcbfed4c 100644 > >> --- a/meson.build > >> +++ b/meson.build > >> @@ -81,6 +81,8 @@ if cpu in ['x86', 'x86_64'] > >> 'CONFIG_HVF': ['x86_64-softmmu'], > >> 'CONFIG_WHPX': ['i386-softmmu', 'x86_64-softmmu'], > >>} > >> +elif cpu in [ 'arm', 'aarch64' ] > >> + accelerator_targets += { 'CONFIG_XEN': ['i386-softmmu'] } > >> endif > > > > This looks very reasonable -- the patch makes sense. A comment would be useful to explain that Arm needs i386-softmmu for the xenpv machine. > > > > However I have two questions, mostly for my own understanding. I tried > > to repro the aarch64 build problem but it works at my end, even without > > this patch. > > Building on a x86 host or with cross compiler? > > > I wonder why. I suspect it works thanks to these lines in > > meson.build: I think it's a runtime and not a build problem. In osstest, Xen support was detected and configured, but CONFIG_XEN wasn't set for Arm. So at runtime xen_available() returns 0, and QEMU doesn't start with "qemu-system-i386: -xen-domid 1: Option not supported for this target" I posted my investigation here: https://lore.kernel.org/xen-devel/cakf6xpss8kpgovzrkitpz63bhbvbjxrtywdhekzuo2q1kem...@mail.gmail.com/ Regards, Jason