[PATCH] tools/xenstore: use talloc_asprintf_append() in do_control_help()
Instead of calculating the length of all help output and then allocating the space for it, just use talloc_asprintf_append() to expand the text as needed. Signed-off-by: Juergen Gross --- tools/xenstore/xenstored_control.c | 27 +++ 1 file changed, 7 insertions(+), 20 deletions(-) diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c index adb8d51b04..61bcbc069d 100644 --- a/tools/xenstore/xenstored_control.c +++ b/tools/xenstore/xenstored_control.c @@ -853,36 +853,23 @@ static struct cmd_s cmds[] = { static int do_control_help(void *ctx, struct connection *conn, char **vec, int num) { - int cmd, len = 0; + int cmd; char *resp; if (num) return EINVAL; - for (cmd = 0; cmd < ARRAY_SIZE(cmds); cmd++) { - len += strlen(cmds[cmd].cmd) + 1; - len += strlen(cmds[cmd].pars) + 1; - } - len++; - - resp = talloc_array(ctx, char, len); + resp = talloc_asprintf(ctx, "%s", ""); if (!resp) return ENOMEM; - - len = 0; for (cmd = 0; cmd < ARRAY_SIZE(cmds); cmd++) { - strcpy(resp + len, cmds[cmd].cmd); - len += strlen(cmds[cmd].cmd); - resp[len] = '\t'; - len++; - strcpy(resp + len, cmds[cmd].pars); - len += strlen(cmds[cmd].pars); - resp[len] = '\n'; - len++; + resp = talloc_asprintf_append(resp, "%s\t%s\n", + cmds[cmd].cmd, cmds[cmd].pars); + if (!resp) + return ENOMEM; } - resp[len] = 0; - send_reply(conn, XS_CONTROL, resp, len); + send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1); return 0; } -- 2.31.1
[qemu-mainline test] 167751: tolerable FAIL - PUSHED
flight 167751 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/167751/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167745 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167745 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167745 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167745 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167745 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167745 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167745 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167745 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass version targeted for testing: qemuu5e0214cdeee17de949f2565f4429c15173179ae3 baseline version: qemuu0dabdd6b3a7ead1183d6f26eaded7d0c332e4cc7 Last test of basis 167745 2022-01-19 05:43:42 Z0 days Testing same since 167751 2022-01-19 19:38:19 Z0 days1 attempts People who touched revisions under test: Alex Bennée Beraldo Leal Bernhard Beschow B
[ovmf test] 167754: all pass - PUSHED
flight 167754 ovmf real [real] http://logs.test-lab.xenproject.org/osstest/logs/167754/ Perfect :-) All tests in this flight passed as required version targeted for testing: ovmf 772c5bb8dcb9841f1839dec02f33324e31b36d25 baseline version: ovmf 5801910013757bd626f67ed77eea6c16a176eebf Last test of basis 167729 2022-01-17 22:40:27 Z2 days Testing same since 167754 2022-01-20 01:56:46 Z0 days1 attempts People who touched revisions under test: Michael Kubacki jobs: build-amd64-xsm pass build-i386-xsm pass build-amd64 pass build-i386 pass build-amd64-libvirt pass build-i386-libvirt pass build-amd64-pvopspass build-i386-pvops pass test-amd64-amd64-xl-qemuu-ovmf-amd64 pass test-amd64-i386-xl-qemuu-ovmf-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/osstest/ovmf.git 5801910013..772c5bb8dc 772c5bb8dcb9841f1839dec02f33324e31b36d25 -> xen-tested-master
[xen-unstable test] 167749: tolerable FAIL - PUSHED
flight 167749 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/167749/ Failures :-/ but no regressions. Regressions which are regarded as allowable (not blocking): test-armhf-armhf-xl-rtds18 guest-start/debian.repeat fail REGR. vs. 167743 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167743 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167743 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167743 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167743 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167743 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167743 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167743 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167743 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167743 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167743 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167743 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167743 test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass version
Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver
On Wed, 19 Jan 2022, Oleksii Moisieiev wrote: > On Thu, Jan 06, 2022 at 04:04:34PM +, Julien Grall wrote: > > On 06/01/2022 15:43, Oleksii Moisieiev wrote: > > > On Thu, Jan 06, 2022 at 02:02:10PM +, Julien Grall wrote: > > > > On 06/01/2022 13:53, Oleksii Moisieiev wrote: > > > > > Hi Julien, > > > > > > > > > > On Mon, Jan 03, 2022 at 01:14:17PM +, Julien Grall wrote: > > > > > > Hi, > > > > > > > > > > > > On 24/12/2021 17:02, Oleksii Moisieiev wrote: > > > > > > > On Fri, Dec 24, 2021 at 03:42:42PM +0100, Julien Grall wrote: > > > > > > > > On 20/12/2021 16:41, Oleksii Moisieiev wrote: > > > > > > > > > > 2) What are the expected memory attribute for the > > > > > > > > > > regions? > > > > > > > > > > > > > > > > > > xen uses iommu_permit_access to pass agent page to the guest. > > > > > > > > > So guest can access the page directly. > > > > > > > > > > > > > > > > I think you misunderstood my comment. Memory can be mapped with > > > > > > > > various type > > > > > > > > (e.g. Device, Memory) and attribute (cacheable, > > > > > > > > non-cacheable...). What will > > > > > > > > the firmware expect? What will the guest OS usually? > > > > > > > > > > > > > > > > The reason I am asking is the attributes have to matched to > > > > > > > > avoid any > > > > > > > > coherency issues. At the moment, if you use > > > > > > > > XEN_DOMCTL_memory_mapping, Xen > > > > > > > > will configure the stage-2 to use Device nGnRnE. As the result, > > > > > > > > the result > > > > > > > > memory access will be Device nGnRnE which is very strict. > > > > > > > > :w > > > > > > > > > > > > > > Let me share with you the configuration example: > > > > > > > scmi expects memory to be configured in the device-tree: > > > > > > > > > > > > > > cpu_scp_shm: scp-shmem@0xXXX { > > > > > > > compatible = "arm,scmi-shmem"; > > > > > > > reg = <0x0 0xXX 0x0 0x1000>; > > > > > > > }; > > > > > > > > > > > > > > where XX address I allocate in alloc_magic_pages function. > > > > > > > > > > > > The goal of alloc_magic_pages() is to allocate RAM. However, what > > > > > > you want > > > > > > is a guest physical address and then map a part of the shared page. > > > > > > > > > > Do you mean that I can't use alloc_magic_pages to allocate shared > > > > > memory region for SCMI? > > > > Correct. alloc_magic_pages() will allocate a RAM page and then assign > > > > to the > > > > guest. From your description, this is not what you want because you will > > > > call XEN_DOMCTL_memory_mapping (and therefore replace the mapping). > > > > > > > > > > Ok thanks, I will refactor this part in v2. > > > > > > > > > > > > > > > > > > > > I can see two options here: > > > > > > 1) Hardcode the SCMI region in the memory map > > > > > > 2) Create a new region in the memory map that can be used for > > > > > > reserving > > > > > > memory for mapping. > > > > > > > > > > Could you please explain what do you mean under the "new region in the > > > > > memory map"? > > > > > > > > I mean reserving some guest physical address that could be used for map > > > > host > > > > physical address (e.g. SCMI region, GIC CPU interface...). > > > > > > > > So rather than hardcoding the address, we have something more flexible. > > > > > > > > > > Ok, I will fix that in v2. > > > > Just for avoidance of doubt. I was clarify option 2 and not requesting to > > implement. That said, if you want to implement option 2 I would be happy to > > review it. > > > > I'm thinking about the best way to reserve address for the domain. > We have xen_pfn_t shared_info_pfn in struct xc_dom_image which is not > used for Arm architecture. It can be set from shared_info_arm callback, > defined in xg_dom_arm.c. > I can use shared_info to store address of the SCMI and then use map_sci_page > to > call XEN_DOMCTL_memory_mapping. > > This will allow me to reuse existing functionality and do not allocate > extra RAM. > > What do you think about that? I cannot speak for Julien but I think he meant something else (Julien please correct me if I am wrong.) Exposing addresses via device tree is not a problem. Normally we pick a fixed address for guest resources, for instance GUEST_GICD_BASE, see xen/include/public/arch-arm.h. We could do that for SCMI as well and it is basically approach 1). However, it is a bit inflexible and could cause issues with things like direct-map (https://marc.info/?l=xen-devel&m=163997768108997). A more flexible way would be for the SCMI guest address to be dynamically generated somehow. I am not sure how Julien envisioned the address to be generated exactly. Thanks to Oleksandr's work we have a way to find large regions of "free" address space. It is currently used for grant-table mappings. Maybe we could use a subset of it for SCMI? It might be best to wait for Julien's answer as he might have a better idea.
[xen-unstable-smoke test] 167753: tolerable all pass - PUSHED
flight 167753 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/167753/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 9b7cdb83fccf59912e56714dd79dbfab57518a65 baseline version: xen 2fc98a9587704b3cdedfe3ae2a6104e7d9e251bd Last test of basis 167748 2022-01-19 13:00:35 Z0 days Testing same since 167753 2022-01-19 23:01:48 Z0 days1 attempts People who touched revisions under test: Jan Beulich Wei Liu jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 2fc98a9587..9b7cdb83fc 9b7cdb83fccf59912e56714dd79dbfab57518a65 -> smoke
Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs
On Wed, 19 Jan 2022, Oleksii Moisieiev wrote: > On Fri, Dec 24, 2021 at 02:30:50PM +0100, Julien Grall wrote: > > Hi, > > > > On 23/12/2021 20:06, Stefano Stabellini wrote: > > > On Wed, 22 Dec 2021, Stefano Stabellini wrote: > > > > # Future Ideas > > > > > > > > A great suggestion by Julien is to start supporting the dom0less partial > > > > device tree format in xl/libxl as well so that we can have a single > > > > "device_tree" option in dom.cfg instead of 4 (device_tree, iomem, irqs, > > > > dtdev). > > > > > > > > Even with that implemented, the user has still to provide a partial dtb, > > > > xen,reg and xen,path. I think this is a great step forward and we should > > > > do that, if nothing else to make it easier to switch between dom0less > > > > and normal domU configurations. But the number of options and > > > > information that the user has to provide is still similar to what we > > > > have today. > > > > > > I have just realized that if we start to parse the partial DTB in > > > xl/libxl the same way that we do for dom0less guests (parse "xen,path", > > > "xen,reg", and "interrupts", making dtdev, irqs and iomem optional) > > > actually we can achieve the goal below thanks to the combination: > > > "xen,path" + "xen,force-assign-without-iommu". > > > > > > In other words, with dom0less we already have a way to specify the link > > > to the host node even if the device is not a DMA master. We can do that > > > by specifying both xen,path and xen,force-assign-without-iommu for a > > > device. > > > > > > This is just FYI. I am not suggesting we should introduce dom0less-style > > > partial DTBs in order to get SCMI support in guests (although it would > > > be great to have). I think the best way forward for this series is one > > > of the combinations below, like a) + d), or a) + c) > > > > I strongly prefer a) + c) because a warning is easy to miss/ignore. At least > > with the extra property the user made an action to think about it and agree > > that this is the way do it. > > > > It is also easier to spot if we ask the user to provide the configuration > > file. > > > > Let me share my thoughts about c), which is: > c) require force-assign-without-iommu="true" in dom.cfg > > Adding this parameter to domain config means removing > xen,force-assign-without-iommu param from partial DTB. Why? No I don't think so. > This will affect dom0less configuration, which I can't test for now > without extra effort. We are just talking about adding: force-assign-without-iommu="true" to the xl config file. For instance: dtdev = [ "/amba/serial@ff00" ] iomem = ["0xff000,1"] force-assign-without-iommu="true" It is unrelated to the dom0less partial DTB. There is no need to test dom0less to do this.
Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver
On Wed, 19 Jan 2022, Oleksii Moisieiev wrote: > On Wed, Dec 22, 2021 at 06:23:24PM -0800, Stefano Stabellini wrote: > > On Wed, 22 Dec 2021, Oleksii Moisieiev wrote: > > > On Tue, Dec 21, 2021 at 01:22:50PM -0800, Stefano Stabellini wrote: > > > > On Tue, 21 Dec 2021, Oleksii Moisieiev wrote: > > > > > Hi Stefano, > > > > > > > > > > On Mon, Dec 20, 2021 at 04:52:01PM -0800, Stefano Stabellini wrote: > > > > > > On Mon, 20 Dec 2021, Oleksii Moisieiev wrote: > > > > > > > Hi Stefano, > > > > > > > > > > > > > > On Fri, Dec 17, 2021 at 06:14:55PM -0800, Stefano Stabellini > > > > > > > wrote: > > > > > > > > On Tue, 14 Dec 2021, Oleksii Moisieiev wrote: > > > > > > > > > This is the implementation of SCI interface, called SCMI-SMC > > > > > > > > > driver, > > > > > > > > > which works as the mediator between XEN Domains and Firmware > > > > > > > > > (SCP, ATF etc). > > > > > > > > > This allows devices from the Domains to work with clocks, > > > > > > > > > resets and > > > > > > > > > power-domains without access to CPG. > > > > > > > > > > > > > > > > > > The following features are implemented: > > > > > > > > > - request SCMI channels from ATF and pass channels to Domains; > > > > > > > > > - set device permissions for Domains based on the Domain > > > > > > > > > partial > > > > > > > > > device-tree. Devices with permissions are able to work with > > > > > > > > > clocks, > > > > > > > > > resets and power-domains via SCMI; > > > > > > > > > - redirect scmi messages from Domains to ATF. > > > > > > > > > > > > > > > > > > Signed-off-by: Oleksii Moisieiev > > > > > > > > > --- > > > > > > > > > xen/arch/arm/Kconfig | 2 + > > > > > > > > > xen/arch/arm/sci/Kconfig | 10 + > > > > > > > > > xen/arch/arm/sci/Makefile | 1 + > > > > > > > > > xen/arch/arm/sci/scmi_smc.c | 795 > > > > > > > > > ++ > > > > > > > > > xen/include/public/arch-arm.h | 1 + > > > > > > > > > 5 files changed, 809 insertions(+) > > > > > > > > > create mode 100644 xen/arch/arm/sci/Kconfig > > > > > > > > > create mode 100644 xen/arch/arm/sci/scmi_smc.c > > > > > > > > > > > > > > > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > > > > > > > > > index 186e1db389..02d96c6cfc 100644 > > > > > > > > > --- a/xen/arch/arm/Kconfig > > > > > > > > > +++ b/xen/arch/arm/Kconfig > > > > > > > > > @@ -114,6 +114,8 @@ config SCI > > > > > > > > > support. It allows guests to control system > > > > > > > > > resourcess via one of > > > > > > > > > SCI mediators implemented in XEN. > > > > > > > > > > > > > > > > > > +source "arch/arm/sci/Kconfig" > > > > > > > > > + > > > > > > > > > endmenu > > > > > > > > > > > > > > > > > > menu "ARM errata workaround via the alternative framework" > > > > > > > > > diff --git a/xen/arch/arm/sci/Kconfig > > > > > > > > > b/xen/arch/arm/sci/Kconfig > > > > > > > > > new file mode 100644 > > > > > > > > > index 00..9563067ddc > > > > > > > > > --- /dev/null > > > > > > > > > +++ b/xen/arch/arm/sci/Kconfig > > > > > > > > > @@ -0,0 +1,10 @@ > > > > > > > > > +config SCMI_SMC > > > > > > > > > + bool "Enable SCMI-SMC mediator driver" > > > > > > > > > + default n > > > > > > > > > + depends on SCI > > > > > > > > > + ---help--- > > > > > > > > > + > > > > > > > > > + Enables mediator in XEN to pass SCMI requests from > > > > > > > > > Domains to ATF. > > > > > > > > > + This feature allows drivers from Domains to work with > > > > > > > > > System > > > > > > > > > + Controllers (such as power,resets,clock etc.). SCP is > > > > > > > > > used as transport > > > > > > > > > + for communication. > > > > > > > > > diff --git a/xen/arch/arm/sci/Makefile > > > > > > > > > b/xen/arch/arm/sci/Makefile > > > > > > > > > index 837dc7492b..67f2611872 100644 > > > > > > > > > --- a/xen/arch/arm/sci/Makefile > > > > > > > > > +++ b/xen/arch/arm/sci/Makefile > > > > > > > > > @@ -1 +1,2 @@ > > > > > > > > > obj-y += sci.o > > > > > > > > > +obj-$(CONFIG_SCMI_SMC) += scmi_smc.o > > > > > > > > > diff --git a/xen/arch/arm/sci/scmi_smc.c > > > > > > > > > b/xen/arch/arm/sci/scmi_smc.c > > > > > > > > > new file mode 100644 > > > > > > > > > index 00..2eb01ea82d > > > > > > > > > --- /dev/null > > > > > > > > > +++ b/xen/arch/arm/sci/scmi_smc.c > > > > > > > > > @@ -0,0 +1,795 @@ > > > > > > > > > +/* > > > > > > > > > + * xen/arch/arm/sci/scmi_smc.c > > > > > > > > > + * > > > > > > > > > + * SCMI mediator driver, using SCP as transport. > > > > > > > > > + * > > > > > > > > > + * Oleksii Moisieiev > > > > > > > > > + * Copyright (C) 2021, EPAM Systems. > > > > > > > > > + * > > > > > > > > > + * This program is free software; you can redistribute it > > > > > > > > > and/or modify > > > > > > > > > + * it under the terms of the GNU General Public License as > > > > > > > > > published by > > > > > > > > > + * the Free Software Foundation; either version 2
[linux-linus test] 167747: regressions - FAIL
flight 167747 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/167747/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-pygrub 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-xl-qemut-debianhvm-amd64 8 xen-bootfail REGR. vs. 167684 test-amd64-amd64-freebsd11-amd64 8 xen-boot fail REGR. vs. 167684 test-armhf-armhf-xl-credit2 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-examine 8 reboot fail REGR. vs. 167684 test-amd64-amd64-examine-bios 8 reboot fail REGR. vs. 167684 test-amd64-amd64-examine-uefi 8 reboot fail REGR. vs. 167684 test-amd64-coresched-amd64-xl 8 xen-bootfail REGR. vs. 167684 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167684 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167684 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167684 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167684 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167684 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167684 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167684 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167684 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass version targeted for testing: linux1d1df41c5a33359a00e919d54eaebfb789711fdc baseline version: linux455e73a07f6e288b0061dfcf4fcf54f
[qemu-mainline test] 167745: tolerable FAIL - PUSHED
flight 167745 qemu-mainline real [real] flight 167750 qemu-mainline real-retest [real] http://logs.test-lab.xenproject.org/osstest/logs/167745/ http://logs.test-lab.xenproject.org/osstest/logs/167750/ Failures :-/ but no regressions. Tests which are failing intermittently (not blocking): test-arm64-arm64-xl-vhd 8 xen-bootfail pass in 167750-retest Tests which did not succeed, but are not blocking: test-arm64-arm64-xl-vhd 14 migrate-support-check fail in 167750 never pass test-arm64-arm64-xl-vhd 15 saverestore-support-check fail in 167750 never pass test-amd64-amd64-xl-rtds 20 guest-localmigrate/x10 fail like 167741 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167741 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167741 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167741 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167741 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167741 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167741 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167741 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167741 test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass version targeted for testing: qemuu0dabdd6b3a7ead1183d6f26eaded7d0c332e4cc7 baseline version: qemuu8b846207151955a
Re: Regression booting Linux kernel 5.16.1 under Xen: Failed to update kernel mapping for mfn=6930582 pfn=524288
On 18/01/2022 14:55, Sander Eikelenboom wrote: On 18/01/2022 12:13, Juergen Gross wrote: On 18.01.22 11:53, Sander Eikelenboom wrote: L.S., Both Linux kernel 5.16.0 and 5.16.1 fail to boot as Dom0 under xen-unstable and crash early in boot on my hardware. With Linux 5.15.13 it boots fine. Serial log is below. ... (XEN) [001b1ffedeb8] Command line: dom0_mem=2048M,max:2048M loglvl=all guest_loglvl=all console_timestamps=datems vga=gfx-1280x1024x32 no-cpuidle com1=38400,8n1 console=vga,com1 ivrs_ioapic[6]=00:14.0 iommu=on,verbose,debug conring_size=128k ucode=scan sched=credit2 gnttab_max_frames=64 reboot=a ... mapping kernel into physical memory about to get started... [ 0.00] Linux version 5.16.1-20220118-doflr-mac80211debug+ (root@serveerstertje) (gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP PREEMPT Tue Jan 18 10:49:09 CET 2022 [ 0.00] Command line: root=/dev/mapper/serveerstertje_ssd-root ro verbose earlyprintk=xen mem=2048M console=hvc0 scsi_mod.use_blk_mq=1 console=tty0 acpi_enforce_resources=lax max_loop=30 loop_max_part=10 r8169.use_dac=1 loglevel=10 nomodeset net.ifnames=0 biosdevname=0 xen-pciback.hide=(00:14.2)(04:00.*)(08:00.0)(09:00.*)(0a:00.0)(0d:00.0) ... [ 0.135670] [ cut here ] [ 0.135690] Failed to update kernel mapping for mfn=6930582 pfn=524288 [ 0.135701] WARNING: CPU: 0 PID: 0 at arch/x86/xen/setup.c:312 xen_remap_memory+0x191/0x26c There seems to be a problem with rearranging the dom0 memory layout to match that of the host. Especially it is going sideways when trying to remap a memory frame to PFN 524288 (0x8). You are limiting dom0 memory to 2 GB via Xen command line, but in addition to that you specify "mem=2048M" for the dom0 kernel, too. This should cap memory from PFN 0x8 onwards, hence the failure when dom0 tries to remap a frame into this area. Can you please try removing the "mem=2048M" from the dom0 parameters? I think there has been a kernel commit fixing the correct handling of the "mem=" parameter in 5.16, which might be the reason why you are seeing the crash only now. Juergen Ah that sounds plausible, will test tomorrow and report back. Thanks once again ! -- Sander Hi Juergen, Just tested without "mem=2048M" and with that the machine boots fine. Thanks for the help and sorry for the noise ! -- Sander
Re: [PATCH v2 1/4] x86/guest: Introduce {get,set}_reg() infrastructure
On 19/01/2022 13:28, Jan Beulich wrote: > On 17.01.2022 19:34, Andrew Cooper wrote: >> --- a/xen/arch/x86/hvm/hvm.c >> +++ b/xen/arch/x86/hvm/hvm.c >> @@ -3744,6 +3744,28 @@ int hvm_msr_write_intercept(unsigned int msr, >> uint64_t msr_content, >> return X86EMUL_EXCEPTION; >> } >> >> +uint64_t hvm_get_reg(struct vcpu *v, unsigned int reg) >> +{ >> +ASSERT(v == current || !vcpu_runnable(v)); >> + >> +switch ( reg ) >> +{ >> +default: >> +return alternative_call(hvm_funcs.get_reg, v, reg); >> +} >> +} >> + >> +void hvm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val) >> +{ >> +ASSERT(v == current || !vcpu_runnable(v)); >> + >> +switch ( reg ) >> +{ >> +default: >> +return alternative_vcall(hvm_funcs.set_reg, v, reg, val); > I'm inclined to ask to drop "return" from here. It's a tossup between this, and a following break. I was guestimating based on the subsequent patches, because there is isn't a plausible use for common logic following the switch statement. > Also, for both functions, without it being clear for what kind of > registers beyond MSRs this may want using down the road, I wonder > whether uint64_t is actually wide enough. The tsc scaling/offset registers will probably be the easiest to move, because they're driven almost exclusively from common code. nhvm_vcpu_p2m_base() too, because it is only read, and is trivial. cr2 would be easy example, except it's probably not useful to split out of the general cr paths. Another MSR example which is simple to move (and drop hooks) is get_shadow_gs_base(). The segment registers are the only obvious examples which don't fit into uint64_t. As a tangent, code generation for get/set_sreg() would probably be far better if get() returned by value, and set() took by value. struct segment_register is only a pair of registers, and the optimiser can probably keep most callsites from spilling to the stack. >> --- a/xen/arch/x86/hvm/svm/svm.c >> +++ b/xen/arch/x86/hvm/svm/svm.c >> @@ -2469,6 +2469,33 @@ static bool svm_get_pending_event(struct vcpu *v, >> struct x86_event *info) >> return true; >> } >> >> +static uint64_t svm_get_reg(struct vcpu *v, unsigned int reg) >> +{ >> +struct domain *d = v->domain; >> + >> +switch ( reg ) >> +{ >> +default: >> +printk(XENLOG_G_ERR "%s(%pv, 0x%08x) Bad register\n", >> + __func__, v, reg); > Is __func__ actually of much use here and in similar further places? Yes. Admittedly moreso before I added the domain_crash(), but it is information not printed. It is specifically useful because nothing in the domain_crash() path distinguishes PV and HVM guests, meaning that the output is ambiguous at a glance when investigating customer logs. VTx vs SVM is less ambiguous at a glance because Intel vs AMD information is plentiful in dmesg, but there's no harm making it clearer. >> @@ -852,6 +867,15 @@ static inline int hvm_vmtrace_get_option( >> return -EOPNOTSUPP; >> } >> >> +static inline uint64_t pv_get_reg(struct vcpu *v, unsigned int reg) >> +{ >> +ASSERT_UNREACHABLE(); >> +} >> +static inline void pv_set_reg(struct vcpu *v, unsigned int reg, uint64_t >> val) >> +{ >> +ASSERT_UNREACHABLE(); >> +} > Were these meant to have hvm_ prefixes? Oops yes. I'm not entirely sure if the stubs are necessary, given our usual DCE rule. I'll try some !PV and !HVM builds and see whether I can drop them entirely. > With at least this last aspect addressed > Reviewed-by: Jan Beulich Thanks. ~Andrew
Re: [PATCH v2 4/4] x86/spec-ctrl: Fix NMI race condition with VT-x MSR_SPEC_CTRL handling
On 19/01/2022 13:42, Jan Beulich wrote: > On 17.01.2022 19:34, Andrew Cooper wrote: >> --- a/xen/arch/x86/hvm/vmx/entry.S >> +++ b/xen/arch/x86/hvm/vmx/entry.S >> @@ -35,7 +35,14 @@ ENTRY(vmx_asm_vmexit_handler) >> >> /* SPEC_CTRL_ENTRY_FROM_VMXReq: b=curr %rsp=regs/cpuinfo, Clob: >> acd */ >> ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM >> -ALTERNATIVE "", DO_SPEC_CTRL_ENTRY_FROM_HVM, X86_FEATURE_SC_MSR_HVM >> + >> +.macro restore_spec_ctrl >> +mov$MSR_SPEC_CTRL, %ecx >> +movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax >> +xor%edx, %edx >> +wrmsr >> +.endm >> +ALTERNATIVE "", restore_spec_ctrl, X86_FEATURE_SC_MSR_HVM >> /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */ >> >> /* Hardware clears MSR_DEBUGCTL on VMExit. Reinstate it if >> debugging Xen. */ >> @@ -82,8 +89,7 @@ UNLIKELY_END(realmode) >> mov VCPUMSR_spec_ctrl_raw(%rax), %eax >> >> /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ >> -/* SPEC_CTRL_EXIT_TO_VMX Req: a=spec_ctrl %rsp=regs/cpuinfo, >> Clob: cd */ >> -ALTERNATIVE "", DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_HVM >> +/* SPEC_CTRL_EXIT_TO_VMX Req: %rsp=regs/cpuinfo >> Clob:*/ >> ALTERNATIVE "", __stringify(verw CPUINFO_verw_sel(%rsp)), >> X86_FEATURE_SC_VERW_HVM > I notice you did update this clobber remark, but what about the one further > up in context? What about it? It still clobbers %eax, %ecx and %edx. >> --- a/xen/arch/x86/include/asm/msr.h >> +++ b/xen/arch/x86/include/asm/msr.h >> @@ -287,7 +287,15 @@ extern struct msr_policy raw_msr_policy, >> /* Container object for per-vCPU MSRs */ >> struct vcpu_msrs >> { >> -/* 0x0048 - MSR_SPEC_CTRL */ >> +/* >> + * 0x0048 - MSR_SPEC_CTRL >> + * >> + * For PV guests, this holds the guest kernel value. It is accessed on >> + * every entry/exit path. >> + * >> + * For VT-x guests, the guest value is held in the MSR guest load/save >> + * list. >> + */ >> struct { >> uint32_t raw; >> } spec_ctrl; > To stand a chance of noticing bad use of this field for VT-x guests, would > it make sense to store some sentinel value into this field for all involved > vCPU-s? The usage is going to get more complicated before we're done. I'd like to wait until the churn reduces before applying invariants like that. ~Andrew
Re: [PATCH v2 5/4] x86/hvm: Drop hvm_{get,set}_guest_bndcfgs() and use {get,set}_regs() instead
On 19/01/2022 13:50, Jan Beulich wrote: > On 17.01.2022 20:25, Andrew Cooper wrote: >> hvm_{get,set}_guest_bndcfgs() are thin wrappers around accessing MSR_BNDCFGS. >> >> MPX was implemented on Skylake uarch CPUs and dropped in subsequent CPUs, and >> is disabled by default in Xen VMs. >> >> It would be nice to move all the logic into vmx_msr_{read,write}_intercept(), >> but the common HVM migration code uses guest_{rd,wr}msr(). Therefore, use >> {get,set}_regs() to reduce the quantity of "common" HVM code. >> >> In lieu of having hvm_set_guest_bndcfgs() split out, use some #ifdef >> CONFIG_HVM in guest_wrmsr(). In vmx_{get,set}_regs(), split the switch >> statements into two depending on whether the require remote VMCS acquisition >> or not. >> >> Signed-off-by: Andrew Cooper > Reviewed-by: Jan Beulich > > One remark: > >> @@ -323,10 +324,9 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t >> *val) >> break; >> >> case MSR_IA32_BNDCFGS: >> -if ( !cp->feat.mpx || !is_hvm_domain(d) || >> - !hvm_get_guest_bndcfgs(v, val) ) >> +if ( !cp->feat.mpx ) /* Implies Intel HVM only */ > Wouldn't it make sense to accompany this comment by ... > >> goto gp_fault; >> -break; > ASSERT(is_hvm_domain(d)); > > (and then the same on the "set" path)? So this is the reason for the default logic in the {get,set}_reg() path. The absence of MSR_BNDCFGS in the PV and SVM paths will cause the VM to be crashed cleanly. If you're on a VMX on a non-MPX capable system, the VMREAD/VMWRITE will hit a BUG (which in due course I want to downgrade to a domain crash). It's a bit more friendly than an ASSERT() (doesn't take the system down), is present in release builds too, and more precise as it excludes SVM too. ~Andrew P.S. I'm still trying to decide on an acceptable name to hide { ASSERT_UNREACHABLE(); gprink(); domain_crash() } behind, so we can downgrade more BUG()/etc to more runtime-friendly options.
[xen-unstable-smoke test] 167748: tolerable all pass - PUSHED
flight 167748 xen-unstable-smoke real [real] http://logs.test-lab.xenproject.org/osstest/logs/167748/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass version targeted for testing: xen 2fc98a9587704b3cdedfe3ae2a6104e7d9e251bd baseline version: xen 444597436d08ccae6d210a2b1b877fef636796ea Last test of basis 167740 2022-01-18 18:01:37 Z0 days Testing same since 167748 2022-01-19 13:00:35 Z0 days1 attempts People who touched revisions under test: Alexander Monakov Andrew Cooper Anthony PERARD Artem Bityutskiy Chen Yu Jan Beulich Juergen Gross Rafael J. Wysocki Roger Pau Monné Zhang Rui jobs: build-arm64-xsm pass build-amd64 pass build-armhf pass build-amd64-libvirt pass test-armhf-armhf-xl pass test-arm64-arm64-xl-xsm pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-amd64-libvirt pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 444597436d..2fc98a9587 2fc98a9587704b3cdedfe3ae2a6104e7d9e251bd -> smoke
Re: [PATCH 11/19] rnbd-src: remove struct rnbd_dev_blk_io
On Tue, Jan 18, 2022 at 8:20 AM Christoph Hellwig wrote: > > Only the priv field of rnbd_dev_blk_io is used, so store the value of > that in bio->bi_private directly and remove the entire bio_set overhead. > > Signed-off-by: Christoph Hellwig there is one typo in the subject line, should be rnbd-srv. > --- > drivers/block/rnbd/rnbd-srv-dev.c | 4 +--- > drivers/block/rnbd/rnbd-srv-dev.h | 13 ++--- > drivers/block/rnbd/rnbd-srv.c | 30 +- > drivers/block/rnbd/rnbd-srv.h | 1 - > 4 files changed, 8 insertions(+), 40 deletions(-) > > diff --git a/drivers/block/rnbd/rnbd-srv-dev.c > b/drivers/block/rnbd/rnbd-srv-dev.c > index 98d3e591a0885..c5d0a03911659 100644 > --- a/drivers/block/rnbd/rnbd-srv-dev.c > +++ b/drivers/block/rnbd/rnbd-srv-dev.c > @@ -12,8 +12,7 @@ > #include "rnbd-srv-dev.h" > #include "rnbd-log.h" > > -struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags, > - struct bio_set *bs) > +struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags) > { > struct rnbd_dev *dev; > int ret; > @@ -30,7 +29,6 @@ struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t > flags, > > dev->blk_open_flags = flags; > bdevname(dev->bdev, dev->name); > - dev->ibd_bio_set = bs; > > return dev; > > diff --git a/drivers/block/rnbd/rnbd-srv-dev.h > b/drivers/block/rnbd/rnbd-srv-dev.h > index 1a14ece0be726..2c3df02b5e8ec 100644 > --- a/drivers/block/rnbd/rnbd-srv-dev.h > +++ b/drivers/block/rnbd/rnbd-srv-dev.h > @@ -14,25 +14,16 @@ > > struct rnbd_dev { > struct block_device *bdev; > - struct bio_set *ibd_bio_set; > fmode_t blk_open_flags; > charname[BDEVNAME_SIZE]; > }; > > -struct rnbd_dev_blk_io { > - struct rnbd_dev *dev; > - void *priv; > - /* have to be last member for front_pad usage of bioset_init */ > - struct bio bio; > -}; > - > /** > * rnbd_dev_open() - Open a device > + * @path: path to open > * @flags: open flags > - * @bs:bio_set to use during block io, > */ > -struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags, > - struct bio_set *bs); > +struct rnbd_dev *rnbd_dev_open(const char *path, fmode_t flags); > > /** > * rnbd_dev_close() - Close a device > diff --git a/drivers/block/rnbd/rnbd-srv.c b/drivers/block/rnbd/rnbd-srv.c > index 65c670e96075b..b1ac1414b56d5 100644 > --- a/drivers/block/rnbd/rnbd-srv.c > +++ b/drivers/block/rnbd/rnbd-srv.c > @@ -116,9 +116,7 @@ rnbd_get_sess_dev(int dev_id, struct rnbd_srv_session > *srv_sess) > > static void rnbd_dev_bi_end_io(struct bio *bio) > { > - struct rnbd_dev_blk_io *io = bio->bi_private; > - > - rnbd_endio(io->priv, blk_status_to_errno(bio->bi_status)); > + rnbd_endio(bio->bi_private, blk_status_to_errno(bio->bi_status)); > bio_put(bio); > } > > @@ -131,7 +129,6 @@ static int process_rdma(struct rnbd_srv_session *srv_sess, > struct rnbd_srv_sess_dev *sess_dev; > u32 dev_id; > int err; > - struct rnbd_dev_blk_io *io; > struct bio *bio; > short prio; > > @@ -152,20 +149,16 @@ static int process_rdma(struct rnbd_srv_session > *srv_sess, > priv->sess_dev = sess_dev; > priv->id = id; > > - bio = bio_alloc_bioset(GFP_KERNEL, 1, > sess_dev->rnbd_dev->ibd_bio_set); > + bio = bio_alloc(GFP_KERNEL, 1); > if (bio_add_page(bio, virt_to_page(data), datalen, > offset_in_page(data))) { > rnbd_srv_err(sess_dev, "Failed to map data to bio\n"); > err = -EINVAL; > - goto sess_dev_put; > + goto bio_put; ok, bio_put is used here, I think it's better the move to patch 10. > } > > - io = container_of(bio, struct rnbd_dev_blk_io, bio); > - io->dev = sess_dev->rnbd_dev; > - io->priv = priv; > - > bio->bi_end_io = rnbd_dev_bi_end_io; > - bio->bi_private = io; > + bio->bi_private = priv; > bio->bi_opf = rnbd_to_bio_flags(le32_to_cpu(msg->rw)); > bio->bi_iter.bi_sector = le64_to_cpu(msg->sector); > bio->bi_iter.bi_size = le32_to_cpu(msg->bi_size); > @@ -180,7 +173,6 @@ static int process_rdma(struct rnbd_srv_session *srv_sess, > > bio_put: > bio_put(bio); > -sess_dev_put: > rnbd_put_sess_dev(sess_dev); > err: > kfree(priv); > @@ -261,7 +253,6 @@ static void destroy_sess(struct rnbd_srv_session > *srv_sess) > > out: > xa_destroy(&srv_sess->index_idr); > - bioset_exit(&srv_sess->sess_bio_set); > > pr_info("RTRS Session %s disconnected\n", srv_sess->sessname); > > @@ -290,16 +281,6 @@ static int create_sess(struct rtrs_srv_sess *rtrs) > return -ENOMEM; > > srv_sess->queue_depth = rtrs_srv_get_queue_depth(rtrs); > -
[xen-unstable test] 167743: tolerable FAIL - PUSHED
flight 167743 xen-unstable real [real] http://logs.test-lab.xenproject.org/osstest/logs/167743/ Failures :-/ but no regressions. Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167736 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167736 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167736 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop fail like 167736 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop fail like 167736 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167736 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167736 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop fail like 167736 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167736 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop fail like 167736 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167736 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167736 test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-libvirt-xsm 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-i386-xl-pvshim14 guest-start fail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-i386-libvirt-raw 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass version targeted for testing: xen 444597436d08ccae6d210a2b1b877fef636796ea baseline version: xen ad47bc9a9742e73b
Re: [PATCH v2 5/4] x86/hvm: Drop hvm_{get,set}_guest_bndcfgs() and use {get,set}_regs() instead
On 17.01.2022 20:25, Andrew Cooper wrote: > hvm_{get,set}_guest_bndcfgs() are thin wrappers around accessing MSR_BNDCFGS. > > MPX was implemented on Skylake uarch CPUs and dropped in subsequent CPUs, and > is disabled by default in Xen VMs. > > It would be nice to move all the logic into vmx_msr_{read,write}_intercept(), > but the common HVM migration code uses guest_{rd,wr}msr(). Therefore, use > {get,set}_regs() to reduce the quantity of "common" HVM code. > > In lieu of having hvm_set_guest_bndcfgs() split out, use some #ifdef > CONFIG_HVM in guest_wrmsr(). In vmx_{get,set}_regs(), split the switch > statements into two depending on whether the require remote VMCS acquisition > or not. > > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich One remark: > @@ -323,10 +324,9 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t > *val) > break; > > case MSR_IA32_BNDCFGS: > -if ( !cp->feat.mpx || !is_hvm_domain(d) || > - !hvm_get_guest_bndcfgs(v, val) ) > +if ( !cp->feat.mpx ) /* Implies Intel HVM only */ Wouldn't it make sense to accompany this comment by ... > goto gp_fault; > -break; ASSERT(is_hvm_domain(d)); (and then the same on the "set" path)? Jan
Re: [PATCH v2 4/4] x86/spec-ctrl: Fix NMI race condition with VT-x MSR_SPEC_CTRL handling
On 17.01.2022 19:34, Andrew Cooper wrote: > --- a/xen/arch/x86/hvm/vmx/entry.S > +++ b/xen/arch/x86/hvm/vmx/entry.S > @@ -35,7 +35,14 @@ ENTRY(vmx_asm_vmexit_handler) > > /* SPEC_CTRL_ENTRY_FROM_VMXReq: b=curr %rsp=regs/cpuinfo, Clob: > acd */ > ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM > -ALTERNATIVE "", DO_SPEC_CTRL_ENTRY_FROM_HVM, X86_FEATURE_SC_MSR_HVM > + > +.macro restore_spec_ctrl > +mov$MSR_SPEC_CTRL, %ecx > +movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax > +xor%edx, %edx > +wrmsr > +.endm > +ALTERNATIVE "", restore_spec_ctrl, X86_FEATURE_SC_MSR_HVM > /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */ > > /* Hardware clears MSR_DEBUGCTL on VMExit. Reinstate it if > debugging Xen. */ > @@ -82,8 +89,7 @@ UNLIKELY_END(realmode) > mov VCPUMSR_spec_ctrl_raw(%rax), %eax > > /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ > -/* SPEC_CTRL_EXIT_TO_VMX Req: a=spec_ctrl %rsp=regs/cpuinfo, Clob: > cd */ > -ALTERNATIVE "", DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_HVM > +/* SPEC_CTRL_EXIT_TO_VMX Req: %rsp=regs/cpuinfo Clob: >*/ > ALTERNATIVE "", __stringify(verw CPUINFO_verw_sel(%rsp)), > X86_FEATURE_SC_VERW_HVM I notice you did update this clobber remark, but what about the one further up in context? > --- a/xen/arch/x86/include/asm/msr.h > +++ b/xen/arch/x86/include/asm/msr.h > @@ -287,7 +287,15 @@ extern struct msr_policy raw_msr_policy, > /* Container object for per-vCPU MSRs */ > struct vcpu_msrs > { > -/* 0x0048 - MSR_SPEC_CTRL */ > +/* > + * 0x0048 - MSR_SPEC_CTRL > + * > + * For PV guests, this holds the guest kernel value. It is accessed on > + * every entry/exit path. > + * > + * For VT-x guests, the guest value is held in the MSR guest load/save > + * list. > + */ > struct { > uint32_t raw; > } spec_ctrl; To stand a chance of noticing bad use of this field for VT-x guests, would it make sense to store some sentinel value into this field for all involved vCPU-s? Jan
Re: [PATCH v6 10/12] libs/{light,guest}: implement xc_cpuid_apply_policy in libxl
On Tue, Jan 18, 2022 at 01:37:18PM +, Andrew Cooper wrote: > On 17/01/2022 09:48, Roger Pau Monne wrote: > > diff --git a/tools/libs/light/libxl_cpuid.c b/tools/libs/light/libxl_cpuid.c > > index e1acf6648d..7dcdb35a4c 100644 > > --- a/tools/libs/light/libxl_cpuid.c > > +++ b/tools/libs/light/libxl_cpuid.c > > @@ -454,6 +456,41 @@ int libxl__cpuid_legacy(libxl_ctx *ctx, uint32_t > > domid, bool restore, > > */ > > bool nested_virt = info->nested_hvm.val > 0; > > > > +policy = xc_cpu_policy_init(); > > +if (!policy) { > > +LOGED(ERROR, domid, "Failed to init CPU policy"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > + > > +r = xc_cpu_policy_get_domain(ctx->xch, domid, policy); > > +if (r) { > > +LOGED(ERROR, domid, "Failed to fetch domain CPU policy"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > + > > +if (restore) { > > +/* > > + * Make sure the policy is compatible with pre Xen 4.13. Note that > > + * newer Xen versions will pass policy data on the restore stream, > > so > > + * any adjustments done here will be superseded. > > + */ > > +r = xc_cpu_policy_make_compat_4_12(ctx->xch, policy, hvm); > > One hidden host policy acquisition, and ... > > > +if (r) { > > +LOGED(ERROR, domid, "Failed to setup compatible CPU policy"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > +} > > + > > +r = xc_cpu_policy_legacy_topology(ctx->xch, policy, hvm); > > ... one host featureset and ... (final comment) > > > +if (r) { > > +LOGED(ERROR, domid, "Failed to setup CPU policy topology"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > + > > /* > > * For PV guests, PAE is Xen-controlled (it is the 'p' that > > differentiates > > * the xen-3.0-x86_32 and xen-3.0-x86_32p ABIs). It is mandatory as > > Xen > > @@ -466,6 +503,13 @@ int libxl__cpuid_legacy(libxl_ctx *ctx, uint32_t > > domid, bool restore, > > */ > > if (info->type == LIBXL_DOMAIN_TYPE_HVM) > > pae = libxl_defbool_val(info->u.hvm.pae); > > +rc = libxl_cpuid_parse_config(&info->cpuid, GCSPRINTF("pae=%d", pae)); > > +if (rc) { > > +LOGD(ERROR, domid, "Failed to set PAE CPUID flag"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > + > > > > /* > > * Advertising Invariant TSC to a guest means that the TSC frequency > > won't > > @@ -481,14 +525,50 @@ int libxl__cpuid_legacy(libxl_ctx *ctx, uint32_t > > domid, bool restore, > > */ > > itsc = (libxl_defbool_val(info->disable_migrate) || > > info->tsc_mode == LIBXL_TSC_MODE_ALWAYS_EMULATE); > > +rc = libxl_cpuid_parse_config(&info->cpuid, GCSPRINTF("invtsc=%d", > > itsc)); > > +if (rc) { > > +LOGD(ERROR, domid, "Failed to set Invariant TSC CPUID flag"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > > > -r = xc_cpuid_apply_policy(ctx->xch, domid, restore, NULL, 0, > > - pae, itsc, nested_virt, info->cpuid); > > -if (r) > > -LOGEVD(ERROR, -r, domid, "Failed to apply CPUID policy"); > > +/* Set Nested virt CPUID bits for HVM. */ > > +if (hvm) { > > +rc = libxl_cpuid_parse_config(&info->cpuid, GCSPRINTF("vmx=%d", > > + > > nested_virt)); > > +if (rc) { > > +LOGD(ERROR, domid, "Failed to set VMX CPUID flag"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > + > > +rc = libxl_cpuid_parse_config(&info->cpuid, GCSPRINTF("svm=%d", > > + > > nested_virt)); > > +if (rc) { > > +LOGD(ERROR, domid, "Failed to set SVM CPUID flag"); > > +rc = ERROR_FAIL; > > +goto out; > > +} > > +} > > libxl_cpuid_parse_config() is a large overhead, and cannot suffer errors > because libxl crashes rather than failing memory allocations. The > invasiveness would be reduced substantially by having: > > str = GCSPRINTF("pae=%d,invtsc=%d", pae, itsc); > if ( hvm ) > append GCSPRINTF("vmx=%d,svm=%d", nested_virt, nested_virt); > > and a single call to libxl_cpuid_parse_config(). > > > Depending on how much we care, these need inserting at the head of the > info->cpuid list so they get processed in the same order as before. > > > + > > +/* Apply the bits from info->cpuid if any. */ > > 'if any' is stale now. > > > +r = xc_cpu_policy_apply_cpuid(ctx->xch, policy, info->cpuid, hvm); > > ... and both a host and default policy. > > That is a lot of overhead added behind the scenes. It would be rather > better to have this function obtain the host policy and pass it to all 3 > helpers. Default policy is harder to judge, but it would avoid needing
Re: [PATCH v2 2/4] x86/msr: Split MSR_SPEC_CTRL handling
On 17.01.2022 19:34, Andrew Cooper wrote: > In order to fix a VT-x bug, and support MSR_SPEC_CTRL on AMD, move > MSR_SPEC_CTRL handling into the new {get,set}_reg() infrastructure. > > Duplicate the msrs->spec_ctrl.raw accesses in the PV and VT-x paths for now. > The SVM path is currently unreachable because of the CPUID policy. > > No functional change. > > Signed-off-by: Andrew Cooper Reviewed-by: Jan Beulich
Re: [PATCH v2 1/4] x86/guest: Introduce {get,set}_reg() infrastructure
On 17.01.2022 19:34, Andrew Cooper wrote: > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -3744,6 +3744,28 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t > msr_content, > return X86EMUL_EXCEPTION; > } > > +uint64_t hvm_get_reg(struct vcpu *v, unsigned int reg) > +{ > +ASSERT(v == current || !vcpu_runnable(v)); > + > +switch ( reg ) > +{ > +default: > +return alternative_call(hvm_funcs.get_reg, v, reg); > +} > +} > + > +void hvm_set_reg(struct vcpu *v, unsigned int reg, uint64_t val) > +{ > +ASSERT(v == current || !vcpu_runnable(v)); > + > +switch ( reg ) > +{ > +default: > +return alternative_vcall(hvm_funcs.set_reg, v, reg, val); I'm inclined to ask to drop "return" from here. Also, for both functions, without it being clear for what kind of registers beyond MSRs this may want using down the road, I wonder whether uint64_t is actually wide enough. > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -2469,6 +2469,33 @@ static bool svm_get_pending_event(struct vcpu *v, > struct x86_event *info) > return true; > } > > +static uint64_t svm_get_reg(struct vcpu *v, unsigned int reg) > +{ > +struct domain *d = v->domain; > + > +switch ( reg ) > +{ > +default: > +printk(XENLOG_G_ERR "%s(%pv, 0x%08x) Bad register\n", > + __func__, v, reg); Is __func__ actually of much use here and in similar further places? > @@ -852,6 +867,15 @@ static inline int hvm_vmtrace_get_option( > return -EOPNOTSUPP; > } > > +static inline uint64_t pv_get_reg(struct vcpu *v, unsigned int reg) > +{ > +ASSERT_UNREACHABLE(); > +} > +static inline void pv_set_reg(struct vcpu *v, unsigned int reg, uint64_t val) > +{ > +ASSERT_UNREACHABLE(); > +} Were these meant to have hvm_ prefixes? With at least this last aspect addressed Reviewed-by: Jan Beulich Jan
Re: improve the bio allocation interface
On 1/18/22 12:19 AM, Christoph Hellwig wrote: > Hi Jens, > > this series is posted early because it has wide-ranging changes and > could use some early ACKs before -rc1. > > It changes the interface to the bio allocators to always pass a > block_device and the operation, which is information needed for every > bio submitted through bio_submit. This means the fields can be > directly initialized in bio_init instead of first being zeroed and > thus should help to micro-optimize even better than the __bio_set_dev > that Pavel proposed while also cleaning up code. Looks pretty straight forward from the block core point of view. Didn't look too closely at the fs/driver changes yet. -- Jens Axboe
Re: [PATCH] fix wrong function declaration
Hi Juergen, > On 19 Jan 2022, at 13:06, Juergen Gross wrote: > > Coverity spotted a wrong function declaration, fix it. > > Coverity-Id: 1497423 > Signed-off-by: Juergen Gross Reviewed-by: Bertrand Marquis Cheers Bertrand > --- > include/balloon.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/balloon.h b/include/balloon.h > index 8f7c8bd8..510e475a 100644 > --- a/include/balloon.h > +++ b/include/balloon.h > @@ -50,7 +50,7 @@ static inline int chk_free_pages(unsigned long needed) > { > return needed <= nr_free_pages; > } > -static inline balloon_set_nr_pages(unsigned long pages, unsigned long pfn) { > } > +static inline void balloon_set_nr_pages(unsigned long pages, unsigned long > pfn) { } > > #endif /* CONFIG_BALLOON */ > #endif /* _BALLOON_H_ */ > -- > 2.31.1 > >
[libvirt test] 167744: regressions - FAIL
flight 167744 libvirt real [real] http://logs.test-lab.xenproject.org/osstest/logs/167744/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-arm64-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-i386-libvirt6 libvirt-buildfail REGR. vs. 151777 build-armhf-libvirt 6 libvirt-buildfail REGR. vs. 151777 build-amd64-libvirt 6 libvirt-buildfail REGR. vs. 151777 Tests which did not succeed, but are not blocking: test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-pair 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-vhd 1 build-check(1) blocked n/a test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-libvirt-pair 1 build-check(1) blocked n/a test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a test-amd64-i386-libvirt-raw 1 build-check(1) blocked n/a test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a test-arm64-arm64-libvirt 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-qcow2 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-raw 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-raw 1 build-check(1) blocked n/a test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a test-armhf-armhf-libvirt 1 build-check(1) blocked n/a test-armhf-armhf-libvirt-qcow2 1 build-check(1) blocked n/a version targeted for testing: libvirt ae8add363a54e30ebf1c71e8644fe626a5029c33 baseline version: libvirt 2c846fa6bcc11929c9fb857a22430fb9945654ad Last test of basis 151777 2020-07-10 04:19:19 Z 558 days Failing since151818 2020-07-11 04:18:52 Z 557 days 539 attempts Testing same since 167744 2022-01-19 04:20:09 Z0 days1 attempts People who touched revisions under test: Adolfo Jayme Barrientos Aleksandr Alekseev Aleksei Zakharov Andika Triwidada Andrea Bolognani Ani Sinha Balázs Meskó Barrett Schonefeld Bastian Germann Bastien Orivel BiaoXiang Ye Bihong Yu Binfeng Wu Bjoern Walk Boris Fiuczynski Brian Turek Bruno Haible Chris Mayo Christian Borntraeger Christian Ehrhardt Christian Kirbach Christian Schoenebeck Cole Robinson Collin Walling Cornelia Huck Cédric Bosdonnat Côme Borsoi Daniel Henrique Barboza Daniel Letai Daniel P. Berrange Daniel P. Berrangé Didik Supriadi dinglimin Divya Garg Dmitrii Shcherbakov Dmytro Linkin Eiichi Tsukata Eric Farman Erik Skultety Fabian Affolter Fabian Freyer Fabiano Fidêncio Fangge Jin Farhan Ali Fedora Weblate Translation Franck Ridel Gavi Teitz gongwei Guoyi Tu Göran Uddeborg Halil Pasic Han Han Hao Wang Hela Basa Helmut Grohne Hiroki Narukawa Hyman Huang(黄勇) Ian Wienand Ioanna Alifieraki Ivan Teterevkov Jakob Meng Jamie Strandboge Jamie Strandboge Jan Kuparinen jason lee Jean-Baptiste Holcroft Jia Zhou Jianan Gao Jim Fehlig Jin Yan Jinsheng Zhang Jiri Denemark Joachim Falk John Ferlan Jonathan Watt Jonathon Jongsma Julio Faracco Justin Gatzen Ján Tomko Kashyap Chamarthy Kevin Locke Koichi Murase Kristina Hanicova Laine Stump Laszlo Ersek Lee Yarwood Lei Yang Liao Pingfang Lin Ma Lin Ma Lin Ma Liu Yiding Luke Yue Luyao Zhong Marc Hartmayer Marc-André Lureau Marek Marczykowski-Górecki Markus Schade Martin Kletzander Masayoshi Mizuma Matej Cepl Matt Coleman Matt Coleman Mauro Matteo Cascella Meina Li Michal Privoznik Michał Smyk Milo Casagrande Moshe Levi Muha Aliss Nathan Neal Gompa Nick Chevsky Nick Shyrokovskiy Nickys Music Group Nico Pache Nikolay Shirokovskiy Olaf Hering Olesya Gerasimenko Or Ozeri Orion Poplawski Pany Patrick Magauran Paulo de Rezende Pinatti Pavel Hrdina Peng Liang Peter Krempa Pino Toscano Pino Toscano Piotr Drąg Prathamesh Chavan Praveen K Paladugu Richard W.M. Jones Ricky Tigg Robin Lee Rohit Kumar Roman Bogorodskiy Roman Bolshakov Ryan Gahagan Ryan Schmidt Sam Hartman Scott Shambarger Sebastian Mitterle SeongHyun Jo Shalini Chellathurai Saroja Shaojun Yang Shi Lei simmon Simon Chopin Simon Gaiser Simon Rowe Stefan Bader Stefan Berger Stefan Berger Stef
[PATCH] fix wrong function declaration
Coverity spotted a wrong function declaration, fix it. Coverity-Id: 1497423 Signed-off-by: Juergen Gross --- include/balloon.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/balloon.h b/include/balloon.h index 8f7c8bd8..510e475a 100644 --- a/include/balloon.h +++ b/include/balloon.h @@ -50,7 +50,7 @@ static inline int chk_free_pages(unsigned long needed) { return needed <= nr_free_pages; } -static inline balloon_set_nr_pages(unsigned long pages, unsigned long pfn) { } +static inline void balloon_set_nr_pages(unsigned long pages, unsigned long pfn) { } #endif /* CONFIG_BALLOON */ #endif /* _BALLOON_H_ */ -- 2.31.1
[linux-linus test] 167742: regressions - FAIL
flight 167742 linux-linus real [real] http://logs.test-lab.xenproject.org/osstest/logs/167742/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-pygrub 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-freebsd11-amd64 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict 8 xen-boot fail REGR. vs. 167684 test-amd64-amd64-xl-qemut-debianhvm-amd64 8 xen-bootfail REGR. vs. 167684 test-amd64-amd64-examine 8 reboot fail REGR. vs. 167684 test-amd64-amd64-examine-bios 8 reboot fail REGR. vs. 167684 test-amd64-amd64-examine-uefi 8 reboot fail REGR. vs. 167684 test-amd64-coresched-amd64-xl 8 xen-bootfail REGR. vs. 167684 Tests which are failing intermittently (not blocking): test-armhf-armhf-xl-rtds 18 guest-start/debian.repeat fail pass in 167735 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stopfail like 167684 test-armhf-armhf-libvirt 16 saverestore-support-checkfail like 167684 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 167684 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stopfail like 167684 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stopfail like 167684 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stopfail like 167684 test-armhf-armhf-libvirt-qcow2 15 saverestore-support-check fail like 167684 test-armhf-armhf-libvirt-raw 15 saverestore-support-checkfail like 167684 test-amd64-amd64-libvirt 15 migrate-support-checkfail never pass test-amd64-amd64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 15 migrate-support-checkfail never pass test-arm64-arm64-xl-seattle 16 saverestore-support-checkfail never pass test-arm64-arm64-xl 15 migrate-support-checkfail never pass test-arm64-arm64-xl 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-credit2 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit2 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 15 migrate-support-checkfail never pass test-arm64-arm64-xl-credit1 16 saverestore-support-checkfail never pass test-arm64-arm64-libvirt-xsm 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-thunderx 15 migrate-support-checkfail never pass test-arm64-arm64-xl-thunderx 16 saverestore-support-checkfail never pass test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass test-amd64-amd64-libvirt-qcow2 14 migrate-support-checkfail never pass test-amd64-amd64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 14 migrate-support-checkfail never pass test-arm64-arm64-libvirt-raw 15 saverestore-support-checkfail never pass test-arm64-arm64-xl-vhd 14 migrate-support-checkfail never pass test-arm64-arm64-xl-vhd 15 saverestore-support-checkfail never pass test-armhf-armhf-xl-credit1 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit1 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-multivcpu 15 migrate-support-checkfail never pass test-armhf-armhf-xl-multivcpu 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-rtds 15 migrate-support-checkfail never pass test-armhf-armhf-xl-rtds 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 15 migrate-support-checkfail never pass test-armhf-armhf-xl-credit2 16 saverestore-support-checkfail never pass test-armhf-armhf-xl-cubietruck 15 migrate-support-checkfail never pass test-armhf-armhf-xl-cubietruck 16 saverestore-support-checkfail never pass test-arm64-arm64-xl-xsm 15 migrate-support-checkfail never pass test-arm64-arm64-xl-xsm 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-qcow2 14 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 15 migrate-support-checkfail never pass test-armhf-armhf-xl-arndale 16 saverestore-support-checkfail never pass test-armhf-armhf-xl 15 migrate-support-checkfail never pass test-armhf-armhf-xl 16 saverestore-support-checkfail never pass test-armhf-armhf-libvirt-raw 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd 14 migrate-support-checkfail never pass test-armhf-armhf-xl-vhd
Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver
On Wed, Dec 22, 2021 at 06:23:24PM -0800, Stefano Stabellini wrote: > On Wed, 22 Dec 2021, Oleksii Moisieiev wrote: > > On Tue, Dec 21, 2021 at 01:22:50PM -0800, Stefano Stabellini wrote: > > > On Tue, 21 Dec 2021, Oleksii Moisieiev wrote: > > > > Hi Stefano, > > > > > > > > On Mon, Dec 20, 2021 at 04:52:01PM -0800, Stefano Stabellini wrote: > > > > > On Mon, 20 Dec 2021, Oleksii Moisieiev wrote: > > > > > > Hi Stefano, > > > > > > > > > > > > On Fri, Dec 17, 2021 at 06:14:55PM -0800, Stefano Stabellini wrote: > > > > > > > On Tue, 14 Dec 2021, Oleksii Moisieiev wrote: > > > > > > > > This is the implementation of SCI interface, called SCMI-SMC > > > > > > > > driver, > > > > > > > > which works as the mediator between XEN Domains and Firmware > > > > > > > > (SCP, ATF etc). > > > > > > > > This allows devices from the Domains to work with clocks, > > > > > > > > resets and > > > > > > > > power-domains without access to CPG. > > > > > > > > > > > > > > > > The following features are implemented: > > > > > > > > - request SCMI channels from ATF and pass channels to Domains; > > > > > > > > - set device permissions for Domains based on the Domain partial > > > > > > > > device-tree. Devices with permissions are able to work with > > > > > > > > clocks, > > > > > > > > resets and power-domains via SCMI; > > > > > > > > - redirect scmi messages from Domains to ATF. > > > > > > > > > > > > > > > > Signed-off-by: Oleksii Moisieiev > > > > > > > > --- > > > > > > > > xen/arch/arm/Kconfig | 2 + > > > > > > > > xen/arch/arm/sci/Kconfig | 10 + > > > > > > > > xen/arch/arm/sci/Makefile | 1 + > > > > > > > > xen/arch/arm/sci/scmi_smc.c | 795 > > > > > > > > ++ > > > > > > > > xen/include/public/arch-arm.h | 1 + > > > > > > > > 5 files changed, 809 insertions(+) > > > > > > > > create mode 100644 xen/arch/arm/sci/Kconfig > > > > > > > > create mode 100644 xen/arch/arm/sci/scmi_smc.c > > > > > > > > > > > > > > > > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig > > > > > > > > index 186e1db389..02d96c6cfc 100644 > > > > > > > > --- a/xen/arch/arm/Kconfig > > > > > > > > +++ b/xen/arch/arm/Kconfig > > > > > > > > @@ -114,6 +114,8 @@ config SCI > > > > > > > > support. It allows guests to control system > > > > > > > > resourcess via one of > > > > > > > > SCI mediators implemented in XEN. > > > > > > > > > > > > > > > > +source "arch/arm/sci/Kconfig" > > > > > > > > + > > > > > > > > endmenu > > > > > > > > > > > > > > > > menu "ARM errata workaround via the alternative framework" > > > > > > > > diff --git a/xen/arch/arm/sci/Kconfig b/xen/arch/arm/sci/Kconfig > > > > > > > > new file mode 100644 > > > > > > > > index 00..9563067ddc > > > > > > > > --- /dev/null > > > > > > > > +++ b/xen/arch/arm/sci/Kconfig > > > > > > > > @@ -0,0 +1,10 @@ > > > > > > > > +config SCMI_SMC > > > > > > > > + bool "Enable SCMI-SMC mediator driver" > > > > > > > > + default n > > > > > > > > + depends on SCI > > > > > > > > + ---help--- > > > > > > > > + > > > > > > > > + Enables mediator in XEN to pass SCMI requests from > > > > > > > > Domains to ATF. > > > > > > > > + This feature allows drivers from Domains to work with > > > > > > > > System > > > > > > > > + Controllers (such as power,resets,clock etc.). SCP is > > > > > > > > used as transport > > > > > > > > + for communication. > > > > > > > > diff --git a/xen/arch/arm/sci/Makefile > > > > > > > > b/xen/arch/arm/sci/Makefile > > > > > > > > index 837dc7492b..67f2611872 100644 > > > > > > > > --- a/xen/arch/arm/sci/Makefile > > > > > > > > +++ b/xen/arch/arm/sci/Makefile > > > > > > > > @@ -1 +1,2 @@ > > > > > > > > obj-y += sci.o > > > > > > > > +obj-$(CONFIG_SCMI_SMC) += scmi_smc.o > > > > > > > > diff --git a/xen/arch/arm/sci/scmi_smc.c > > > > > > > > b/xen/arch/arm/sci/scmi_smc.c > > > > > > > > new file mode 100644 > > > > > > > > index 00..2eb01ea82d > > > > > > > > --- /dev/null > > > > > > > > +++ b/xen/arch/arm/sci/scmi_smc.c > > > > > > > > @@ -0,0 +1,795 @@ > > > > > > > > +/* > > > > > > > > + * xen/arch/arm/sci/scmi_smc.c > > > > > > > > + * > > > > > > > > + * SCMI mediator driver, using SCP as transport. > > > > > > > > + * > > > > > > > > + * Oleksii Moisieiev > > > > > > > > + * Copyright (C) 2021, EPAM Systems. > > > > > > > > + * > > > > > > > > + * This program is free software; you can redistribute it > > > > > > > > and/or modify > > > > > > > > + * it under the terms of the GNU General Public License as > > > > > > > > published by > > > > > > > > + * the Free Software Foundation; either version 2 of the > > > > > > > > License, or > > > > > > > > + * (at your option) any later version. > > > > > > > > + * > > > > > > > > + * This program is distributed in the hope that it will be > > > > > > > > useful, > > > > > > > > + * but WITHOUT ANY WARRANTY; without even the impli
Re: [PATCH 2/5] x86/mwait-idle: add SnowRidge C-state table
On Tue, Jan 18, 2022 at 03:05:54PM +0100, Jan Beulich wrote: > On 18.01.2022 11:17, Roger Pau Monné wrote: > > On Mon, Sep 06, 2021 at 03:00:46PM +0200, Jan Beulich wrote: > >> --- a/xen/arch/x86/cpu/mwait-idle.c > >> +++ b/xen/arch/x86/cpu/mwait-idle.c > >> @@ -742,6 +742,32 @@ static const struct cpuidle_state dnv_cs > >>{} > >> }; > >> > >> +/* > >> + * Note, depending on HW and FW revision, SnowRidge SoC may or may not > >> support > >> + * C6, and this is indicated in the CPUID mwait leaf. > >> + */ > >> +static const struct cpuidle_state snr_cstates[] = { > >> + { > >> + .name = "C1", > > > > We usually use names like "C1-SNR" or similar in other cpuidle_state > > structs. Is using plain "C1" intentional here? > > I don't know. We're simply following the Linux side change. If we're > unhappy with their naming (it indeed looks inconsistent), then we > ought to make a separate patch on top (and maybe submit that also to > Linux). Looks like at some point Linux dropped the '-SNR' and similar suffixes from the state names, so we should likely import that patch as a pre-req for consistency? It's commit: de09cdd09fa1 intel_idle: stop exposing platform acronyms in sysfs > > >> @@ -954,6 +980,11 @@ static const struct idle_cpu idle_cpu_dn > >>.disable_promotion_to_c1e = 1, > >> }; > >> > >> +static const struct idle_cpu idle_cpu_snr = { > >> + .state_table = snr_cstates, > >> + .disable_promotion_to_c1e = true, > > > > This likely wants to be 1 because the type is bool_t. > > bool_t is just an alias of bool, so "true" ought to be fine. We may > want to follow Linux in switching to bool altogether - iirc we didn't > have bool yet at the time the driver (or the first commit needing it) > was ported. I guess I'll make a patch ... OK, thanks! I guess for the patch itself then: Acked-by: Roger Pau Monné Would be nice to get those things fixed for consistency. Roger.
Re: [PATCH v4] xen/arm: Allow QEMU platform to be built with GICv2
Bertrand Marquis 于2022年1月18日周二 17:17写道: > > Hi Dongju, > > > On 18 Jan 2022, at 08:58, Dongjiu Geng wrote: > > > > Bertrand Marquis 于2022年1月18日周二 16:48写道: > >> > >> Hi Dongju, > >> > >>> On 18 Jan 2022, at 08:45, Dongjiu Geng wrote: > >>> > >>> Julien Grall 于2022年1月17日周一 22:16写道: > > Hi, > > On 17/01/2022 10:40, Dongjiu Geng wrote: > > It turns out that QEMU has been supporting GICv2 virtualization since > > v3.1.0. So remove the dependencies on GICv3. > > > Technically, the current form of CONFIG_QEMU allows the same binary to > boot on QEMU with GICv2 or GICv3. > > > If we want to use GICv3, > > we can select the QEMU_LEGACY configuration. > > AFAIK, GICv3 is not a legacy feature... So it feels a bit odd to name it > like that (see more below). > >>> > >>> Legacy means QEMU platform only supports GICV3, now it can support > >>> both GICv2 and GICv3. The scope of support has been expanded > >>> Not mean GICv3 is a legacy feature. > >> > >> You might be misleading a bit here. > >> In the current configuration, Xen support GICv2, GICv3 and vgic. > >> The only thing not supported is actually the new VGIC but this is an > >> unsupported feature not fully functional which shall be used with caution. > >> > >> What issue exactly do you have in Qemu configured for gicv2 when you use > >> the default configuration ? > > > > I want to use NEW_VGIC with GICv2, but QEMU only select GICV3, when > > GICv3 is select, the NEW_VGIC can not be used. I try the NEW_VGIC > > with GICv2, not found issue. so I want to remove this limitation. > > If you think we should not support NEW_VGIC feature, we can ignore > > this patch. thanks! > > Supporting GICv2 makes sense but using NEW_VGIC in Xen might not as it is not > security supported and does not support ITS and MSIs. It is surely that NEW_VGIC not support ITS and MSI. but I think QEMU platform should not limit user select it. Selecting GICv2、GICv3 or NEW_VGIC may be chosen by users. But I find user can select it at all. > > Do you have a reason to use the NEW_VGIC implementation instead of the > standard one ? I add some features which is ported from KVM, NEW_VGIC is refereed to KVM,so it easily integrate > > Cheers > Bertrand > > > > >> > >> Cheers > >> Bertrand > >> > >>> > > > > > Signed-off-by: Dongjiu Geng > > --- > > xen/arch/arm/platforms/Kconfig | 10 +- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/xen/arch/arm/platforms/Kconfig > > b/xen/arch/arm/platforms/Kconfig > > index c93a6b2756..41e82a42ee 100644 > > --- a/xen/arch/arm/platforms/Kconfig > > +++ b/xen/arch/arm/platforms/Kconfig > > @@ -13,7 +13,15 @@ config ALL_PLAT > > automatically select any of the related drivers. > > > > config QEMU > > - bool "QEMU aarch virt machine support" > > + bool "QEMU aarch virt machine support >= v3.1.0" > > This is a bit misleading. A user may select this thinking that this will > select GICv3. However, this will not. > > This also raises the question of what is the default GIC version in QEMU > (i.e. if you don't pass anything on the command line)? If this is GICv3, > then I am afraid that this patch would be a no-go for me. > > Looking at overall discussion, you seem to push the patch only to allow > building a tiny Xen for QEMU and the new vGIC. > > The default Xen (i.e. make defconfig) will also work on QEMU. Given that > the new vGIC is a still in development, I am seriously considering to > say that if you want to try it then you have to use the default > configuration. > > @Dongjiu, is there any reason why you want to use the tiny QEMU config > rather than the default configuration? > >>> > >>> Hi Julien, thanks for the reply, I do not use the tiny QEMU config. I > >>> used the default configuration and selected the platform as QEMU. > >>> But QEMU platform only slects GICV3, so I submit this patch to remove > >>> the limitation because I want to use VGIC. VGIC can already support > >>> GICv2 now. > >>> > > @Bertrand, @Stefano, what do you think? > > Cheers, > > -- > Julien Grall >
Re: [PATCH v6 08/12] libs/guest: rework xc_cpuid_xend_policy
On Tue, Jan 18, 2022 at 02:06:25PM +, Andrew Cooper wrote: > On 17/01/2022 09:48, Roger Pau Monne wrote: > > Rename xc_cpuid_xend_policy to xc_cpu_policy_apply_cpuid > > I'm not convinced by this name. 'xend' is important because it's the > format of the data passed, which 'cpuid' is not. The format would be the libxc format now. Having xend here is a layering violation IMO. Note this function goes away in patch 11, so I'm unsure there's a lot of value in discussing over the name of a function that's about to disappear. > It is a horribly inefficient format, and really doesn't want to survive > cleanup to the internals of libxl. Even when moved to the internals of libxl the format is not exposed to users of libxl (it's a libxl__cpuid_policy in libxl_internal.h), so we are free to modify it at our own will. I would defer the changes to the format to a separate series, there's enough churn here already. My aim was to provide a new interface while keeping the functional changes to a minimum. > > diff --git a/tools/libs/guest/xg_cpuid_x86.c > > b/tools/libs/guest/xg_cpuid_x86.c > > index e7ae133d8d..9060a2f763 100644 > > --- a/tools/libs/guest/xg_cpuid_x86.c > > +++ b/tools/libs/guest/xg_cpuid_x86.c > > @@ -254,144 +254,107 @@ int xc_set_domain_cpu_policy(xc_interface *xch, > > uint32_t domid, > > return ret; > > } > > > > -static int compare_leaves(const void *l, const void *r) > > -{ > > -const xen_cpuid_leaf_t *lhs = l; > > -const xen_cpuid_leaf_t *rhs = r; > > - > > -if ( lhs->leaf != rhs->leaf ) > > -return lhs->leaf < rhs->leaf ? -1 : 1; > > - > > -if ( lhs->subleaf != rhs->subleaf ) > > -return lhs->subleaf < rhs->subleaf ? -1 : 1; > > - > > -return 0; > > -} > > - > > -static xen_cpuid_leaf_t *find_leaf( > > -xen_cpuid_leaf_t *leaves, unsigned int nr_leaves, > > -const struct xc_xend_cpuid *xend) > > -{ > > -const xen_cpuid_leaf_t key = { xend->leaf, xend->subleaf }; > > - > > -return bsearch(&key, leaves, nr_leaves, sizeof(*leaves), > > compare_leaves); > > -} > > - > > -static int xc_cpuid_xend_policy( > > -xc_interface *xch, uint32_t domid, const struct xc_xend_cpuid *xend) > > +int xc_cpu_policy_apply_cpuid(xc_interface *xch, xc_cpu_policy_t *policy, > > + const struct xc_xend_cpuid *cpuid, bool hvm) > > { > > int rc; > > -xc_dominfo_t di; > > -unsigned int nr_leaves, nr_msrs; > > -uint32_t err_leaf = -1, err_subleaf = -1, err_msr = -1; > > -/* > > - * Three full policies. The host, default for the domain type, > > - * and domain current. > > - */ > > -xen_cpuid_leaf_t *host = NULL, *def = NULL, *cur = NULL; > > -unsigned int nr_host, nr_def, nr_cur; > > +xc_cpu_policy_t *host = NULL, *def = NULL; > > > > -if ( xc_domain_getinfo(xch, domid, 1, &di) != 1 || > > - di.domid != domid ) > > -{ > > -ERROR("Failed to obtain d%d info", domid); > > -rc = -ESRCH; > > -goto fail; > > -} > > - > > -rc = xc_cpu_policy_get_size(xch, &nr_leaves, &nr_msrs); > > -if ( rc ) > > -{ > > -PERROR("Failed to obtain policy info size"); > > -rc = -errno; > > -goto fail; > > -} > > - > > -rc = -ENOMEM; > > -if ( (host = calloc(nr_leaves, sizeof(*host))) == NULL || > > - (def = calloc(nr_leaves, sizeof(*def))) == NULL || > > - (cur = calloc(nr_leaves, sizeof(*cur))) == NULL ) > > -{ > > -ERROR("Unable to allocate memory for %u CPUID leaves", nr_leaves); > > -goto fail; > > -} > > - > > -/* Get the domain's current policy. */ > > -nr_msrs = 0; > > -nr_cur = nr_leaves; > > -rc = get_domain_cpu_policy(xch, domid, &nr_cur, cur, &nr_msrs, NULL); > > -if ( rc ) > > +host = xc_cpu_policy_init(); > > +def = xc_cpu_policy_init(); > > +if ( !host || !def ) > > { > > -PERROR("Failed to obtain d%d current policy", domid); > > -rc = -errno; > > -goto fail; > > +PERROR("Failed to init policies"); > > +rc = -ENOMEM; > > +goto out; > > } > > > > /* Get the domain type's default policy. */ > > -nr_msrs = 0; > > -nr_def = nr_leaves; > > -rc = get_system_cpu_policy(xch, di.hvm ? > > XEN_SYSCTL_cpu_policy_hvm_default > > +rc = xc_cpu_policy_get_system(xch, hvm ? > > XEN_SYSCTL_cpu_policy_hvm_default > > : > > XEN_SYSCTL_cpu_policy_pv_default, > > - &nr_def, def, &nr_msrs, NULL); > > + def); > > if ( rc ) > > { > > -PERROR("Failed to obtain %s def policy", di.hvm ? "hvm" : "pv"); > > -rc = -errno; > > -goto fail; > > +PERROR("Failed to obtain %s def policy", hvm ? "hvm" : "pv"); > > +goto out; > > } > > > > /* Get the host policy. */ > > -nr_msrs = 0; > > -nr_host = nr_le
Re: [RFC v1 3/5] xen/arm: introduce SCMI-SMC mediator driver
Hi Julien, On Thu, Jan 06, 2022 at 04:04:34PM +, Julien Grall wrote: > Hi, > > On 06/01/2022 15:43, Oleksii Moisieiev wrote: > > On Thu, Jan 06, 2022 at 02:02:10PM +, Julien Grall wrote: > > > > > > > > > On 06/01/2022 13:53, Oleksii Moisieiev wrote: > > > > Hi Julien, > > > > > > Hi, > > > > > > > > > > > On Mon, Jan 03, 2022 at 01:14:17PM +, Julien Grall wrote: > > > > > Hi, > > > > > > > > > > On 24/12/2021 17:02, Oleksii Moisieiev wrote: > > > > > > On Fri, Dec 24, 2021 at 03:42:42PM +0100, Julien Grall wrote: > > > > > > > On 20/12/2021 16:41, Oleksii Moisieiev wrote: > > > > > > > > > 2) What are the expected memory attribute for the > > > > > > > > > regions? > > > > > > > > > > > > > > > > xen uses iommu_permit_access to pass agent page to the guest. > > > > > > > > So guest can access the page directly. > > > > > > > > > > > > > > I think you misunderstood my comment. Memory can be mapped with > > > > > > > various type > > > > > > > (e.g. Device, Memory) and attribute (cacheable, > > > > > > > non-cacheable...). What will > > > > > > > the firmware expect? What will the guest OS usually? > > > > > > > > > > > > > > The reason I am asking is the attributes have to matched to avoid > > > > > > > any > > > > > > > coherency issues. At the moment, if you use > > > > > > > XEN_DOMCTL_memory_mapping, Xen > > > > > > > will configure the stage-2 to use Device nGnRnE. As the result, > > > > > > > the result > > > > > > > memory access will be Device nGnRnE which is very strict. > > > > > > > > > > > > > > > > > > > Let me share with you the configuration example: > > > > > > scmi expects memory to be configured in the device-tree: > > > > > > > > > > > > cpu_scp_shm: scp-shmem@0xXXX { > > > > > > compatible = "arm,scmi-shmem"; > > > > > > reg = <0x0 0xXX 0x0 0x1000>; > > > > > > }; > > > > > > > > > > > > where XX address I allocate in alloc_magic_pages function. > > > > > > > > > > The goal of alloc_magic_pages() is to allocate RAM. However, what you > > > > > want > > > > > is a guest physical address and then map a part of the shared page. > > > > > > > > Do you mean that I can't use alloc_magic_pages to allocate shared > > > > memory region for SCMI? > > > Correct. alloc_magic_pages() will allocate a RAM page and then assign to > > > the > > > guest. From your description, this is not what you want because you will > > > call XEN_DOMCTL_memory_mapping (and therefore replace the mapping). > > > > > > > Ok thanks, I will refactor this part in v2. > > > > > > > > > > > > > > > > I can see two options here: > > > > > 1) Hardcode the SCMI region in the memory map > > > > > 2) Create a new region in the memory map that can be used for > > > > > reserving > > > > > memory for mapping. > > > > > > > > Could you please explain what do you mean under the "new region in the > > > > memory map"? > > > > > > I mean reserving some guest physical address that could be used for map > > > host > > > physical address (e.g. SCMI region, GIC CPU interface...). > > > > > > So rather than hardcoding the address, we have something more flexible. > > > > > > > Ok, I will fix that in v2. > > Just for avoidance of doubt. I was clarify option 2 and not requesting to > implement. That said, if you want to implement option 2 I would be happy to > review it. > I'm thinking about the best way to reserve address for the domain. We have xen_pfn_t shared_info_pfn in struct xc_dom_image which is not used for Arm architecture. It can be set from shared_info_arm callback, defined in xg_dom_arm.c. I can use shared_info to store address of the SCMI and then use map_sci_page to call XEN_DOMCTL_memory_mapping. This will allow me to reuse existing functionality and do not allocate extra RAM. What do you think about that? -- Best regards, Oleksii. > > > > > > > > > > > > > > > > We still have plenty of space in the guest memory map. So the former > > > > > is > > > > > probably going to be fine for now. > > > > > > > > > > > Then I get paddr of the scmi channel for this domain and use > > > > > > XEN_DOMCTL_memory_mapping to map scmi channel address to gfn. > > > > > >> Hope I wass able to answer your question. > > > > > > > > > > What you provided me is how the guest OS will locate the shared > > > > > memory. This > > > > > still doesn't tell me which memory attribute will be used to map the > > > > > page in > > > > > Stage-1 (guest page-tables). > > > > > > > > > > To find that out, you want to look at the driver and how the mapping > > > > > is > > > > > done. The Linux driver (drivers/firmware/arm_scmi) is using > > > > > devm_ioremap() > > > > > (see smc_chan_setup()). > > > > > > > > > > Under the hood, the function devm_ioremap() is using PROT_DEVICE_nGnRE > > > > > (arm64) which is one of the most restrictive memory attribute. > > > > > > > > > > This means the firmware should be able to deal with the most > > > > > restrict
[xen-unstable-coverity test] 167746: all pass - PUSHED
flight 167746 xen-unstable-coverity real [real] http://logs.test-lab.xenproject.org/osstest/logs/167746/ Perfect :-) All tests in this flight passed as required version targeted for testing: xen 444597436d08ccae6d210a2b1b877fef636796ea baseline version: xen 9ce0a5e207f3968e65d0af33a15bee5bdf5c8a7f Last test of basis 167714 2022-01-16 09:19:39 Z3 days Testing same since 167746 2022-01-19 09:19:44 Z0 days1 attempts People who touched revisions under test: Andrew Cooper Jan Beulich Juergen Gross Lukasz Hawrylko Nick Rosbrook jobs: coverity-amd64 pass sg-report-flight on osstest.test-lab.xenproject.org logs: /home/logs/logs images: /home/logs/images Logs, config files, etc. are available at http://logs.test-lab.xenproject.org/osstest/logs Explanation of these reports, and of osstest in general, is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master Test harness code can be found at http://xenbits.xen.org/gitweb?p=osstest.git;a=summary Pushing revision : To xenbits.xen.org:/home/xen/git/xen.git 9ce0a5e207..444597436d 444597436d08ccae6d210a2b1b877fef636796ea -> coverity-tested/smoke
Re: [RFC v1 5/5] xen/arm: add SCI mediator support for DomUs
Hi Julien, On Fri, Dec 24, 2021 at 02:30:50PM +0100, Julien Grall wrote: > Hi, > > On 23/12/2021 20:06, Stefano Stabellini wrote: > > On Wed, 22 Dec 2021, Stefano Stabellini wrote: > > > # Future Ideas > > > > > > A great suggestion by Julien is to start supporting the dom0less partial > > > device tree format in xl/libxl as well so that we can have a single > > > "device_tree" option in dom.cfg instead of 4 (device_tree, iomem, irqs, > > > dtdev). > > > > > > Even with that implemented, the user has still to provide a partial dtb, > > > xen,reg and xen,path. I think this is a great step forward and we should > > > do that, if nothing else to make it easier to switch between dom0less > > > and normal domU configurations. But the number of options and > > > information that the user has to provide is still similar to what we > > > have today. > > > > I have just realized that if we start to parse the partial DTB in > > xl/libxl the same way that we do for dom0less guests (parse "xen,path", > > "xen,reg", and "interrupts", making dtdev, irqs and iomem optional) > > actually we can achieve the goal below thanks to the combination: > > "xen,path" + "xen,force-assign-without-iommu". > > > > In other words, with dom0less we already have a way to specify the link > > to the host node even if the device is not a DMA master. We can do that > > by specifying both xen,path and xen,force-assign-without-iommu for a > > device. > > > > This is just FYI. I am not suggesting we should introduce dom0less-style > > partial DTBs in order to get SCMI support in guests (although it would > > be great to have). I think the best way forward for this series is one > > of the combinations below, like a) + d), or a) + c) > > I strongly prefer a) + c) because a warning is easy to miss/ignore. At least > with the extra property the user made an action to think about it and agree > that this is the way do it. > > It is also easier to spot if we ask the user to provide the configuration > file. > Let me share my thoughts about c), which is: c) require force-assign-without-iommu="true" in dom.cfg Adding this parameter to domain config means removing xen,force-assign-without-iommu param from partial DTB. This will affect dom0less configuration, which I can't test for now without extra effort. What I suggest is to implement a) + d) in this patch series, which is: a) extend dtdev to cover all devices, including non-DMA masters d) or print a warning like: "WARNING: device assignment safety for device XXX cannot be verified. Please make sure XXX is not a DMA mastering device." And introduce a) + c) with the next patch series where dom0less scmi support will be done. Maybe leave a comment in code that force-assign-without-iommu config parameter should be implemened. What do you think about this? -- Best regards, Oleksii.
Re: [PATCH 1/2] tools/libs/light: numa placement: don't try to free a NULL list of vcpus
On Mon, 2022-01-17 at 15:56 +, Anthony PERARD wrote: > On Fri, Jan 14, 2022 at 11:22:00PM +, Dario Faggioli wrote: > > > > Also, if we go that way, I guess we want to change > > libxl_cputopology_list_free(), libxl_pcitopology_list_free(), > > libxl_numainfo_list_free(), libxl_dominfo_list_free(), > > libxl_vminfo_list_free() and libxl_cpupoolinfo_list_free(), don't > > we? > > I actually don't know if that would be useful. Those functions do > work > as expected (I think) with the right parameters: they do nothing when > called with (NULL, 0). free(NULL) does nothing. > Right. > So checking for NULL before using `nr` would probably just be a > workaround for programming mistake in the function allocating the > object > or some missing initialisation in the caller. So I don't think it is > important anymore. > Agreed. That's why, as I said, I though about doing something like that, but ended up deciding not to. > > > Also I think it is better to keep the free in the exit path at > > > the > > > end > > > of the loop. > > > > > Can I ask why as, as I was trying to say above, if we are sent > > directly > > to next by one of the goto, we do know that vinfo is NULL and > > libxl_vcpuinfo_list_free() will be basically a NOP, however it is > > implemented. > > > > Of course, you're the maintainer of this code, so I'll do like that > > if > > you ask... I'm just curious. :-) > > Freeing resources should always been attempted, even if that mean to > check whether there's something to free before calling the associated > free function. Imagine someone adding some code that could fail after > the libxl_list_vcpu(), then when that new code fails it would `goto > next;` and the allocated data from libxl_list_vcpu() would never be > freed. > Yeah, sure, whoever adds code that does a 'goto next' with an allocated list, should also realize that libxl_vcpuinfo_list_free() needs to be moved after 'next' itself, as a consequence of the very change being done, and this seems fair to me. But, at the end of the day, it's not a big deal at all. Thanks for satisfying my curiosity and, since you also agree on... > > Actually, if you really think that the call to > > libxl_vcpuinfo_list_free() should stay where it is, I can just drop > > this patch and go on with patch 2 only, which is enough for fixing > > the > > crash. > > This patch is just a workaround a bug in libxl_list_vcpu(), so I > think > it can be dropped. > ...this, I'll just drop this patch and proceed with just what, in this series, was patch 2. Thanks and Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ --- <> (Raistlin Majere) signature.asc Description: This is a digitally signed message part
Re: [PATCH] x86/time: further improve TSC / CPU freq calibration accuracy
On 19.01.2022 09:46, Roger Pau Monné wrote: > On Tue, Jan 18, 2022 at 03:26:36PM +0100, Jan Beulich wrote: >> On 18.01.2022 15:05, Roger Pau Monné wrote: >>> On Tue, Jan 18, 2022 at 02:39:03PM +0100, Jan Beulich wrote: On 18.01.2022 13:45, Roger Pau Monné wrote: > On Thu, Jan 13, 2022 at 02:41:31PM +0100, Jan Beulich wrote: >> Calibration logic assumes that the platform timer (HPET or ACPI PM >> timer) and the TSC are read at about the same time. This assumption may >> not hold when a long latency event (e.g. SMI or NMI) occurs between the >> two reads. Reduce the risk of reading uncorrelated values by doing at >> least four pairs of reads, using the tuple where the delta between the >> enclosing TSC reads was smallest. From the fourth iteration onwards bail >> if the new TSC delta isn't better (smaller) than the best earlier one. > > Do you have some measurements as to whether this makes the frequency > detection any more accurate? In the normal case (no SMI or alike) I haven't observed any further improvement on top of the earlier patch. >> Signed-off-by: Jan Beulich >> --- >> Obviously (I think) instead of having both read_{hpet,pmtmr}_and_tsc() >> the calibration logic could be folded between HPET and PMTMR, at the > > As an intermediate solution you could have a read_counter_and_tsc I > would think? Not sure in which way you view this as "intermediate". >>> >>> As in not unifying both calibration logic into a single function, but >>> rather just use a generic function to read the counter and TSC. >>> >>> With your original remark I assumed that you wanted to unify all the >>> calibration code in init_hpet and init_pmtimer into a generic >>> function, but maybe I've misunderstood. >> >> Oh, I see. I have to admit that I see little value (at this point) to >> break out just the pair-read logic. While I did say we have similar >> issues elsewhere, my initial analysis has left me with the impression >> that those other cases are sufficiently different for such a helper to >> be of no use there. >> >> expense of a couple more indirect calls (which may not be that much of a >> problem as this is all boot-time only). Really such folding would have >> been possible already before, just that the amount of duplicate code >> hasn't been this large so far. IOW if so desired, then perhaps the >> folding should be a separate prereq patch. > > You could even pass a platform_timesource as a parameter to that > generic read counter and TSC helper. This was the plan, yes, in case I was asked to follow that route (or if I decided myself that I'd prefer that significantly over the redundancy). >>> >>> I think having a generic read_counter_and_tsc would be better than >>> having read_{hpet,pmtmr}_and_tsc. Btw, I think I will want to break out the full calibration logic, to then further generalize it to make suitable for LAPIC timer calibration (then no longer against the PIT, but against the earlier chosen platform timer). >> --- a/xen/arch/x86/time.c >> +++ b/xen/arch/x86/time.c >> @@ -392,9 +392,36 @@ static u64 read_hpet_count(void) >> return hpet_read32(HPET_COUNTER); >> } >> >> +static uint32_t __init read_hpet_and_tsc(uint64_t *tsc) >> +{ >> +uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0; >> +uint32_t best = best; >> +unsigned int i; >> + >> +for ( i = 0; ; ++i ) >> +{ >> +uint32_t hpet = hpet_read32(HPET_COUNTER); >> +uint64_t tsc_cur = rdtsc_ordered(); >> +uint64_t tsc_delta = tsc_cur - tsc_prev; >> + >> +if ( tsc_delta < tsc_min ) >> +{ >> +tsc_min = tsc_delta; >> +*tsc = tsc_cur; >> +best = hpet; >> +} >> +else if ( i > 2 ) >> +break; >> + >> +tsc_prev = tsc_cur; > > Would it be better to set tsc_prev to the current TSC value here in > order to minimize the window you are measuring? Or else tsc_delta will > also account for the time in the loop code, and there's more > likeliness from being interrupted? I did consider this (or some variant thereof), but did for the moment conclude that it wouldn't make enough of a difference. If you think it would be meaningfully better that way, I'll switch. Both here and for the aspect above, please express you preference (i.e. not in the form of a question). >>> >>> Let's leave it as-is (just one TSC read per loop iteration). >>> > I guess being interrupted four times so that all loops are bad is very > unlikely. > > Since this loop could in theory run multiple times, do we need to > check that the counter doesn't overflow? Or else the elapsed counter > value would be miscalculated
Re: [PATCH] x86/time: further improve TSC / CPU freq calibration accuracy
On Tue, Jan 18, 2022 at 03:26:36PM +0100, Jan Beulich wrote: > On 18.01.2022 15:05, Roger Pau Monné wrote: > > On Tue, Jan 18, 2022 at 02:39:03PM +0100, Jan Beulich wrote: > >> On 18.01.2022 13:45, Roger Pau Monné wrote: > >>> On Thu, Jan 13, 2022 at 02:41:31PM +0100, Jan Beulich wrote: > Calibration logic assumes that the platform timer (HPET or ACPI PM > timer) and the TSC are read at about the same time. This assumption may > not hold when a long latency event (e.g. SMI or NMI) occurs between the > two reads. Reduce the risk of reading uncorrelated values by doing at > least four pairs of reads, using the tuple where the delta between the > enclosing TSC reads was smallest. From the fourth iteration onwards bail > if the new TSC delta isn't better (smaller) than the best earlier one. > >>> > >>> Do you have some measurements as to whether this makes the frequency > >>> detection any more accurate? > >> > >> In the normal case (no SMI or alike) I haven't observed any further > >> improvement on top of the earlier patch. > >> > Signed-off-by: Jan Beulich > --- > Obviously (I think) instead of having both read_{hpet,pmtmr}_and_tsc() > the calibration logic could be folded between HPET and PMTMR, at the > >>> > >>> As an intermediate solution you could have a read_counter_and_tsc I > >>> would think? > >> > >> Not sure in which way you view this as "intermediate". > > > > As in not unifying both calibration logic into a single function, but > > rather just use a generic function to read the counter and TSC. > > > > With your original remark I assumed that you wanted to unify all the > > calibration code in init_hpet and init_pmtimer into a generic > > function, but maybe I've misunderstood. > > Oh, I see. I have to admit that I see little value (at this point) to > break out just the pair-read logic. While I did say we have similar > issues elsewhere, my initial analysis has left me with the impression > that those other cases are sufficiently different for such a helper to > be of no use there. > > expense of a couple more indirect calls (which may not be that much of a > problem as this is all boot-time only). Really such folding would have > been possible already before, just that the amount of duplicate code > hasn't been this large so far. IOW if so desired, then perhaps the > folding should be a separate prereq patch. > >>> > >>> You could even pass a platform_timesource as a parameter to that > >>> generic read counter and TSC helper. > >> > >> This was the plan, yes, in case I was asked to follow that route (or > >> if I decided myself that I'd prefer that significantly over the > >> redundancy). > > > > I think having a generic read_counter_and_tsc would be better than > > having read_{hpet,pmtmr}_and_tsc. > > > >> > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -392,9 +392,36 @@ static u64 read_hpet_count(void) > return hpet_read32(HPET_COUNTER); > } > > +static uint32_t __init read_hpet_and_tsc(uint64_t *tsc) > +{ > +uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0; > +uint32_t best = best; > +unsigned int i; > + > +for ( i = 0; ; ++i ) > +{ > +uint32_t hpet = hpet_read32(HPET_COUNTER); > +uint64_t tsc_cur = rdtsc_ordered(); > +uint64_t tsc_delta = tsc_cur - tsc_prev; > + > +if ( tsc_delta < tsc_min ) > +{ > +tsc_min = tsc_delta; > +*tsc = tsc_cur; > +best = hpet; > +} > +else if ( i > 2 ) > +break; > + > +tsc_prev = tsc_cur; > >>> > >>> Would it be better to set tsc_prev to the current TSC value here in > >>> order to minimize the window you are measuring? Or else tsc_delta will > >>> also account for the time in the loop code, and there's more > >>> likeliness from being interrupted? > >> > >> I did consider this (or some variant thereof), but did for the moment > >> conclude that it wouldn't make enough of a difference. If you think > >> it would be meaningfully better that way, I'll switch. > >> > >> Both here and for the aspect above, please express you preference (i.e. > >> not in the form of a question). > > > > Let's leave it as-is (just one TSC read per loop iteration). > > > >>> I guess being interrupted four times so that all loops are bad is very > >>> unlikely. > >>> > >>> Since this loop could in theory run multiple times, do we need to > >>> check that the counter doesn't overflow? Or else the elapsed counter > >>> value would be miscalculated. > >> > >> I don't think I see any issue with overflow here. It's the callers > >> who need to care about that. And I don't think this function will run > >> for long enough such that a counter would not only wrap, but also > >> reach its initial count again.
Re: [patch] genirq/msi: Populate sysfs entry only once
Hi Thomas, On 2022/01/19 8:59, Thomas Gleixner wrote: Kunihiko, On Wed, Jan 12 2022 at 09:05, Kunihiko Hayashi wrote: Is this fix the same as below? https://marc.info/?l=linux-kernel&m=164061119923119&w=2 pretty much the same, but I missed that patch. I was off for 2+ weeks and on return Boris poked me about this issue and I fixed it. Then I went ahead and marked all vacation mail read as I always do :) So sorry for not noticing that patch. No problem. If this issue wansn't resolved, the PCIe controller wouldn't work properly, so I'm relieved to solve the issue and get your response. Thank you, --- Best Regards Kunihiko Hayashi
RE: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node structure
Hi Jan, > -Original Message- > From: Jan Beulich > Sent: 2022年1月19日 15:55 > To: Wei Chen > Cc: Bertrand Marquis ; xen- > de...@lists.xenproject.org; sstabell...@kernel.org; jul...@xen.org > Subject: Re: [PATCH 07/37] xen/x86: use paddr_t for addresses in NUMA node > structure > > On 19.01.2022 07:33, Wei Chen wrote: > >> From: Jan Beulich > >> Sent: 2022年1月18日 23:23 > >> > >> On 23.09.2021 14:02, Wei Chen wrote: > >>> @@ -249,24 +250,26 @@ static int __init numa_emulation(u64 start_pfn, > >> u64 end_pfn) > >>> void __init numa_initmem_init(unsigned long start_pfn, unsigned long > >> end_pfn) > >>> { > >>> int i; > >>> +paddr_t start, end; > >>> > >>> #ifdef CONFIG_NUMA_EMU > >>> if ( numa_fake && !numa_emulation(start_pfn, end_pfn) ) > >>> return; > >>> #endif > >>> > >>> +start = pfn_to_paddr(start_pfn); > >>> +end = pfn_to_paddr(end_pfn); > >> > >> Nit: Would be slightly neater if these were the initializers of the > >> variables. > > > > But if this function returns in numa_fake && !numa_emulation, > > will the two pfn_to_paddr operations be waste? > > And what harm would that do? Ok, two or three instructions waste in init time will not take too much harm. I will move them above with initializers. > > >>> @@ -441,7 +441,7 @@ void __init srat_parse_regions(u64 addr) > >>> acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) > >>> return; > >>> > >>> - srat_region_mask = pdx_init_mask(addr); > >>> + srat_region_mask = pdx_init_mask((u64)addr); > >> > >> I don't see the need for a cast here. > >> > > > > current addr type has been changed to paddr_t, but pdx_init_mask > > accept parameter type is u64. I know paddr_t is a typedef of > > u64 on Arm64/32, or unsinged long on x86. In current stage, > > their machine byte-lengths are the same. But in case of future > > changes I think an explicit case here maybe better? > > It may only ever be an up-cast, yet the compiler would do a widening > conversion (according to the usual type conversion rules) for us > anyway no matter whether there's a cast. Down-casts (in the general > compiler case, i.e. considering a wider set than just gcc and clang) > sometimes need making explicit to silence compiler warnings about > truncation, but I've not observed any compiler warning when widening > values. Ok, I will drop that cast. > > Jan
RE: [PATCH 04/37] xen: introduce an arch helper for default dma zone status
Hi Jan, > -Original Message- > From: Jan Beulich > Sent: 2022年1月19日 15:50 > To: Wei Chen > Cc: Bertrand Marquis ; xen- > de...@lists.xenproject.org; sstabell...@kernel.org; jul...@xen.org > Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default dma > zone status > > On 19.01.2022 03:49, Wei Chen wrote: > > Hi Jan, > > > >> -Original Message- > >> From: Jan Beulich > >> Sent: 2022年1月18日 22:16 > >> To: Wei Chen > >> Cc: Bertrand Marquis ; xen- > >> de...@lists.xenproject.org; sstabell...@kernel.org; jul...@xen.org > >> Subject: Re: [PATCH 04/37] xen: introduce an arch helper for default > dma > >> zone status > >> > >> On 18.01.2022 10:20, Wei Chen wrote: > From: Jan Beulich > Sent: 2022年1月18日 16:16 > > On 18.01.2022 08:51, Wei Chen wrote: > >> From: Jan Beulich > >> Sent: 2022年1月18日 0:11 > >> On 23.09.2021 14:02, Wei Chen wrote: > >>> In current code, when Xen is running in a multiple nodes NUMA > >>> system, it will set dma_bitsize in end_boot_allocator to reserve > >>> some low address memory for DMA. > >>> > >>> There are some x86 implications in current implementation. Becuase > >>> on x86, memory starts from 0. On a multiple nodes NUMA system, if > >>> a single node contains the majority or all of the DMA memory. x86 > >>> prefer to give out memory from non-local allocations rather than > >>> exhausting the DMA memory ranges. Hence x86 use dma_bitsize to set > >>> aside some largely arbitrary amount memory for DMA memory ranges. > >>> The allocations from these memory ranges would happen only after > >>> exhausting all other nodes' memory. > >>> > >>> But the implications are not shared across all architectures. For > >>> example, Arm doesn't have these implications. So in this patch, we > >>> introduce an arch_have_default_dmazone helper for arch to > determine > >>> that it need to set dma_bitsize for reserve DMA allocations or not. > >> > >> How would Arm guarantee availability of memory below a certain > >> boundary for limited-capability devices? Or is there no need > >> because there's an assumption that I/O for such devices would > >> always pass through an IOMMU, lifting address size restrictions? > >> (I guess in a !PV build on x86 we could also get rid of such a > >> reservation.) > > > > On Arm, we still can have some devices with limited DMA capability. > > And we also don't force all such devices to use IOMMU. This devices > > will affect the dma_bitsize. Like RPi platform, it sets its > >> dma_bitsize > > to 30. But in multiple NUMA nodes system, Arm doesn't have a default > > DMA zone. Multiple nodes is not a constraint on dma_bitsize. And > some > > previous discussions are placed here [1]. > > I'm afraid that doesn't give me more clues. For example, in the mail > being replied to there I find "That means, only first 4GB memory can > be used for DMA." Yet that's not an implication from setting > dma_bitsize. DMA is fine to occur to any address. The special address > range is being held back in case in particular Dom0 is in need of > such > a range to perform I/O to _some_ devices. > >>> > >>> I am sorry that my last reply hasn't given you more clues. On Arm, > only > >>> Dom0 can have DMA without IOMMU. So when we allocate memory for Dom0, > >>> we're trying to allocate memory under 4GB or in the range of > dma_bitsize > >>> indicated. I think these operations meet your above Dom0 special > address > >>> range description. As we have already allocated memory for DMA, so I > >>> think we don't need a DMA zone in page allocation. I am not sure is > that > >>> answers your earlier question? > >> > >> I view all of this as flawed, or as a workaround at best. Xen shouldn't > >> make assumptions on what Dom0 may need. Instead Dom0 should make > >> arrangements such that it can do I/O to/from all devices of interest. > >> This may involve arranging for address restricted buffers. And for this > >> to be possible, Xen would need to have available some suitable memory. > >> I understand this is complicated by the fact that despite being HVM- > like, > >> due to the lack of an IOMMU in front of certain devices address > >> restrictions on Dom0 address space alone (i.e. without any Xen > >> involvement) won't help ... > >> > > > > I agree with you that the current implementation is probably the best > > kind of workaround. Do you have some suggestions for this patch to > > address above comments? Or should I just need to modify the commit log > > to contain some of our above discussions? > > Extending the description is my primary request, or else we may end up > having the same discussion every time you submit a new version. As to > improving the situation such that preferably per-arch customization > wouldn't be necessary - that's perhaps better to be thought about by >
Re: statistical time calibration
On 18.01.2022 16:03, Jan Beulich wrote: > Hello, > > Roger pointer me to a FreeBSD commit [1] introducing such there. While > we don't start at 2000ms (but rather at 50), this still looked interesting > enough to take a closer look. I think I've mostly understood the idea and > implementation now, with the exception of three things: > > 1) When deciding whether to increment "passes", both variance values have > an arbitrary value of 4 added to them. There's a sentence about this in > the earlier (big) comment, but it lacks any justification as to the chosen > value. What's worse, variance is not a plain number, but a quantity in the > same units as the base values. While not relevant for the eventual usage, I'd like to correct myself here: The unit of variance (and covariance) is the square of the base unit (assuming, in the covariance case, the units of both values are the same, as is the case here, where fundamentally both use Hz, and just the scales may - and typically will - be different). Which ... > Since typically both clocks will run at > very difference frequencies, using the same (constant) value here has much > more of an effect on the lower frequency clock's value than on the higher > frequency one's. ... means the difference in (relative) effect on the two values is even more significant. Jan > 2) The second of the "important formulas" is nothing I could recall or was > able to look up. All I could find are somewhat similar, but still > sufficiently different ones. Perhaps my "introductory statistics" have > meanwhile been too long ago ... (In this context I'd like to also mention > that it took me quite a while to prove to myself that the degenerate case > of, in particular, the first iteration wouldn't lead to an early exit > from the function.) > > 3) At the bottom of the loop there is some delaying logic, leading to > later data points coming in closer succession than earlier ones. I'm > afraid I don't understand the "theoretical risk of aliasing", and hence > I'm seeing more risks than benefits from this construct. > > Beyond that there are implementation aspects that I'm not happy with, > like aforementioned delay loop not dealing with a TSC which did start > from a large "negative" value, and which hence would eventually wrap. Nor > is the SMI (or other long latency events) aspect being taken care of. But > any such concern could of course be dealt with as we port over this > logic, if we decided we want to go that route. > > My main concern is with the goal of reaching accuracy of 1PPM, and the > loop ending only after a full second (if I got that right) if that > accuracy cannot be reached. Afaict there's no guarantee that 1PPM is > reachable. My recent observations suggest that with HPET that's > feasible (but only barely), but with PMTMR it might be more like 3 or > more. > > The other slight concern I have, as previously voiced on IRC, is the use > of floating point here. > > Jan > > [1] > https://cgit.freebsd.org/src/commit/?id=c2705ceaeb09d8579661097fd358ffb5defb5624 > >
RE: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
Hi Jan, > -Original Message- > From: Jan Beulich > Sent: 2022年1月19日 16:01 > To: Wei Chen > Cc: Bertrand Marquis ; xen- > de...@lists.xenproject.org; sstabell...@kernel.org; jul...@xen.org > Subject: Re: [PATCH 08/37] xen/x86: add detection of discontinous node > memory range > > On 19.01.2022 08:33, Wei Chen wrote: > >> From: Jan Beulich > >> Sent: 2022年1月19日 0:13 > >> > >> On 23.09.2021 14:02, Wei Chen wrote: > >>> One NUMA node may contain several memory blocks. In current Xen > >>> code, Xen will maintain a node memory range for each node to cover > >>> all its memory blocks. But here comes the problem, in the gap of > >>> one node's two memory blocks, if there are some memory blocks don't > >>> belong to this node (remote memory blocks). This node's memory range > >>> will be expanded to cover these remote memory blocks. > >>> > >>> One node's memory range contains othe nodes' memory, this is obviously > >>> not very reasonable. This means current NUMA code only can support > >>> node has continous memory blocks. However, on a physical machine, the > >>> addresses of multiple nodes can be interleaved. > >>> > > > > I will adjust above paragraph to: > > ... This means current NUMA code only can support node has no interlaced > > memory blocks. ... > > > >>> So in this patch, we add code to detect discontinous memory blocks > >>> for one node. NUMA initializtion will be failed and error messages > >>> will be printed when Xen detect such hardware configuration. > > > > I will adjust above paragraph to: > > So in this patch, we add code to detect interleave of different nodes' > > memory blocks. NUMA initialization will be ... > > Taking just this part of your reply (the issue continues later), may I > ask that you use a consistent term throughout this single patch? Mixing > "interlace" and "interleave" like you do may make people wonder whether > the two are intended to express slightly different aspects. Personally, > as per my suggestion, I'd prefer "interleave", but I'm not a native > speaker. > Sorry, I am not a native speaker too, I had checked dict for interlaced before I used it. https://www.merriam-webster.com/thesaurus/interlaced Obviously, I'm probably using it incorrectly and making it harder to understand, I will use "interleave" in my patches. Thanks, Wei Chen > Jan
Re: [PATCH 08/37] xen/x86: add detection of discontinous node memory range
On 19.01.2022 08:33, Wei Chen wrote: >> From: Jan Beulich >> Sent: 2022年1月19日 0:13 >> >> On 23.09.2021 14:02, Wei Chen wrote: >>> One NUMA node may contain several memory blocks. In current Xen >>> code, Xen will maintain a node memory range for each node to cover >>> all its memory blocks. But here comes the problem, in the gap of >>> one node's two memory blocks, if there are some memory blocks don't >>> belong to this node (remote memory blocks). This node's memory range >>> will be expanded to cover these remote memory blocks. >>> >>> One node's memory range contains othe nodes' memory, this is obviously >>> not very reasonable. This means current NUMA code only can support >>> node has continous memory blocks. However, on a physical machine, the >>> addresses of multiple nodes can be interleaved. >>> > > I will adjust above paragraph to: > ... This means current NUMA code only can support node has no interlaced > memory blocks. ... > >>> So in this patch, we add code to detect discontinous memory blocks >>> for one node. NUMA initializtion will be failed and error messages >>> will be printed when Xen detect such hardware configuration. > > I will adjust above paragraph to: > So in this patch, we add code to detect interleave of different nodes' > memory blocks. NUMA initialization will be ... Taking just this part of your reply (the issue continues later), may I ask that you use a consistent term throughout this single patch? Mixing "interlace" and "interleave" like you do may make people wonder whether the two are intended to express slightly different aspects. Personally, as per my suggestion, I'd prefer "interleave", but I'm not a native speaker. Jan