Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored
On Tue, Jan 06, Ian Jackson wrote: Olaf Hering writes ([PATCH 7/7] tools/hotplug: add wrapper to start xenstored): The shell wrapper in xenstored.service does not handle XENSTORE_TRACE. ... +XENSTORED_LIBEXEC = xenstored.sh Should be in /etc as previously discussed. Previously I wrote: Bottom line: as relevant maintainer, I'm afraid I'm going to insist that this script be in /etc. I'm disappointed. It is not acceptable to resubmit a change ignoring such unequivocal feedback. Plain /etc wont work, I think. /etc/xen/scripts perhaps? But see my other reply to IanC, maybe there is a way to avoid the wrapper. And after having some time to think about this: If one has a need to adjust something, then this could be done in the xencommons script right away. In other words, the modification can be done there instead of calling the wrapper. Nacked-by: Ian Jackson ian.jack...@eu.citrix.com +hotplug/Linux/xenstored.sh Although many of the existing hotplug scripts have this notion of calling things foo.sh because they happen to be written in shell, I think this is bad practice. I would prefer xenstored-wrap or some such. (My co-maintainers may disagree...) But this is a bit of a bikeshed issue. I agree. Initally I had xenstored-launcher in mind. echo -n Starting $XENSTORED... - $XENSTORED --pid-file /var/run/xenstored.pid $XENSTORED_ARGS + XENSTORED=$XENSTORED \ + XENSTORED_TRACE=$XENSTORED_TRACE \ + XENSTORED_ARGS=$XENSTORED_ARGS \ + ${LIBEXEC_BIN}/xenstored.sh --pid-file /var/run/xenstored.pid It might be easier to . xenstore-wrap. Failing that using `export' will avoid this rather odd and repetitive style. I think thats a good idea. Something like this may work, doing the . and the exec in the subshell: ( set -- --pid-file /var/run/xenstored.pid . 
xenstored.sh ) diff --git a/tools/hotplug/Linux/xenstored.sh.in b/tools/hotplug/Linux/xenstored.sh.in new file mode 100644 index 000..dc806ee --- /dev/null +++ b/tools/hotplug/Linux/xenstored.sh.in @@ -0,0 +1,6 @@ +#!/bin/sh +if test -n $XENSTORED_TRACE +then + XENSTORED_ARGS= -T /var/log/xen/xenstored-trace.log +fi +exec $XENSTORED $@ $XENSTORED_ARGS This should probably have around $@ just in case. Ok. I will wait for results from SELinux testing before respinning this patch. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Crash on efi_reset_machine on Lenovo ThinkCentre m93 (Haswell)
On 05.01.15 at 16:04, konrad.w...@oracle.com wrote:
> Odd -
> (XEN) [ Xen-4.4.1 x86_64 debug=n Not tainted ]
> (XEN) CPU:0
> (XEN) RIP:e008:[d5fd8412] d5fd8412

the address here ...

> (XEN) [ Xen-4.5.0-rc-lK x86_64 debug=y Tainted:C ]
> (XEN) CPU:0
> (XEN) RIP:e008:[d5fd83d0] d5fd83d0

... is different from the one here, yet one would think the firmware always does the same thing on a given runtime services call. Except if the page here wasn't marked for runtime use in the memory map (which you didn't provide).

> I hadn't dug deep enough in this to figure out how it works on Linux but was wondering if anybody else had seen this?

Iirc on Linux rebooting via runtime services is only possible when explicitly asking for it on the command line. In any event this very much looks like a firmware issue (and knowing what's at the addresses in question would be interesting), and the only workaround I can think of would be to use no-efi-rs.

Jan
Re: [Xen-devel] Xen 4.5 Development Update (GA slip by a week)
On Tue, 2015-01-06 at 12:23 -0800, Suriyan Ramasami wrote: On Tue, Jan 6, 2015 at 10:54 AM, Suriyan Ramasami suriya...@gmail.com wrote: On Tue, Jan 6, 2015 at 10:51 AM, Lars Kurth lars.kurth@gmail.com wrote: On 6 Jan 2015, at 18:08, Suriyan Ramasami suriya...@gmail.com wrote: I shall try my hand at updating the information again. If this needs to be done (yesterday), then as a temporary solution we could just delete this information for now, and I shall work on it soon. Ideally it needs to be done by next Wed. If you have the content, you could send it to me and I can fix the page Thanks Lars for the offer. I just started editing the page, and looks simple enough to get some content out there. I should have updated the relevant information in a few hours from now. Please do help me in fixing the page once its done (if it needs fixing) Thanks! - Suriyan Hello Lars/Ian, I have updated the wiki somewhat to an OK level of information. Thanks, I've removed the todo banner since there is some content now (I've not reviewed it in detail, since I don't know about the h/w, but it all looks very plausible). Earlier you said: I did have specific content for this wiki page, as the Arndale XEN information is Linaro centric and hence not applicable to the OdroidXU - especially the boot loader part and the linux dom0 part. The rest of the information is quite similar. I think that some of the Linaro specific stuff on the Arndale page is no longer needed, e.g. I run mainline u-boot and Linux just fine on one these days. So I think it is probably worth trying to refactor into a generic Exynos part and board specific details, but that's not urgent for the 4.5 release. Anyway, not asking you to tackle all that. I've added it to http://wiki.xen.org/wiki/Xen_Document_Days/TODO, along with my previous suggestion to get rid of the duplicated lists on the main ARM page. I might try and have a go at that on the next doc day. Ian. 
Re: [Xen-devel] [PATCH 0/7 v3] tools/hotplug: systemd changes for 4.5
On Mon, Jan 05, Konrad Rzeszutek Wilk wrote:
> +Release Issues
> +==
> +
> +While we did the utmost to get a release out, there are certain
> +fixes which were not complete on time. As such please reference this
> +section if you are running into trouble.
> +
> +* systemd not working with Fedora Core 20, 21 or later (systemctl
> + reports xenstore failing to start).
> +
> + Systemd support is now part of Xen source code. While utmost work has
> + been done to make the systemd files compatible across all the
> + distributions, there might issues when using systemd files from
> + Xen sources. The work-around is to define an mount entry in
> + /etc/fstab as follow:
> +
> + tmpfs /var/lib/xenstored tmpfs
> + mode=755,context=system_u:object_r:xenstored_var_lib_t:s0 0 0
> +

Shouldn't this go into a new SELinux section in the INSTALL file? It's my understanding that the reported SELinux failure is not only related to the context= mount option, but also to the socket passing from systemd.

Olaf
[Xen-devel] qemu upstream arm compile error failed
I tried to use upstream qemu on a Xen Dom0 (ARM) to get PVFB, PVKBD and PVMOUSE working. However, I got a compile error right at the beginning. Here are the detailed steps:

1. Download the source code from git://git.qemu-project.org/qemu.git
2. cd qemu
3. Configure using the command ./configure --cross-prefix=arm-linux-gnueabihf- --target-list=arm-softmmu --enable-fdt
4. make

However, I got the following compile error:

  GEN   tests/test-qmp-commands.h
  GEN   tests/test-qapi-event.h
  CC    tests/qemu-iotests/socket_scm_helper.o
  LINK  tests/qemu-iotests/socket_scm_helper
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lgthread-2.0
/usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lglib-2.0
collect2: error: ld returned 1 exit status
make: *** [tests/qemu-iotests/socket_scm_helper] Error 1

It looks like the build process needs some libraries that are missing from my gcc-cross/arm-linux-gnueabihf/4.8 environment. What is the reason, and what is the solution?

Thanks
Regards,
Mao
[Xen-devel] collect Dirty bitmap from Xen source code
I am working on Xen live migration. I want to print the dirty bitmap, which is declared as XEN_GUEST_HANDLE_64(uint8) dirty_bitmap in shadow.h. Dirty bit tracking is declared as:

int shadow_track_dirty_vram(struct domain *d, unsigned long first_pfn, unsigned long nr, XEN_GUEST_HANDLE_64(uint8) dirty_bitmap);

Can anyone help me print the dirty bitmap? I would like to collect it from the log files, or otherwise print it directly.
Re: [Xen-devel] qemu upstream arm compile error failed
On Wed, 2015-01-07 at 18:02 +0800, Mao Mingy wrote:
> I tried to use the qemu upstream on xen Dom0 (arm arch) to get the PVFB, PVKBD and PVMOUSE work.

With Xen 4.5 qemu is automatically built for ARM, no need to do it separately AFAIK.

Also note that you need to build qemu for i386, even on ARM (since the PV backends are currently entwined with x86 for historical reasons). This doesn't matter since qemu does no CPU emulation under Xen anyway.

Lastly:

> /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lgthread-2.0
> /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lglib-2.0

I think it is pretty obvious from these error messages that you are missing some build dependencies; this isn't a Xen specific issue AFAICT. The solution is to make ARM versions of those libraries available in your cross environment.

Ian.
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 06.01.15 at 12:55, andrew.coop...@citrix.com wrote:
> On 06/01/15 02:18, Boris Ostrovsky wrote:
>> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
>> index 5f295f3..a5eb81c 100644
>> --- a/xen/include/xen/pci.h
>> +++ b/xen/include/xen/pci.h
>> @@ -56,6 +56,8 @@ struct pci_dev {
>>      u8 phantom_stride;
>>
>> +    int node; /* NUMA node */
>> +
>>      enum pdev_type {
>>          DEV_TYPE_PCI_UNKNOWN,
>>          DEV_TYPE_PCIe_ENDPOINT,
>> @@ -102,7 +104,8 @@ void setup_hwdom_pci_devices(struct domain *,
>>  int pci_release_devices(struct domain *d);
>>  int pci_add_segment(u16 seg);
>>  const unsigned long *pci_get_ro_map(u16 seg);
>> -int pci_add_device(u16 seg, u8 bus, u8 devfn, const struct pci_dev_info *);
>> +int pci_add_device(u16 seg, u8 bus, u8 devfn,
>> +                   const struct pci_dev_info *, int);
>
> Please use parameter names in definitions.

For the added parameter - yes. For the pre-existing pointer one I don't see a strong need to do so (and there was no name there before).

Jan
Re: [Xen-devel] [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
On 07/01/2015 08:18, Andy Lutomirski wrote:
>>> Thus far, I've been told unambiguously that a guest can't observe pvti while it's being written, and I think you're now telling me that this isn't true and that a guest *can* observe pvti while it's being written while the low bit of the version field is not set. If so, this is rather strongly incompatible with the spec in the KVM docs.
>>
>> Where am I saying that?
>
> I thought the conclusion from what you and Marcelo pointed out about the code was that, once the first vCPU updated its pvti, it could start running guest code while the other vCPUs are still updating pvti, so its guest code can observe the other vCPUs mid-update.

Ah, in that sense you're right. However, each VCPU cannot observe _its own_ pvti entry while it's being written (no matter what's in the low bit of the version field).

Paolo
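For readers following the version-field argument, here is a minimal sketch in C of the odd/even convention the thread refers to: the writer makes the version odd before touching the data and even again afterwards, and a reader retries whenever it sees an odd or changed version. The struct layout, the function names and the retry limit are all hypothetical simplifications, not the KVM or Xen ABI, and the fences are compiler barriers only. The sketch shows what the convention promises when the writer follows it; it does not settle whether a host actually follows it for every vCPU's entry, which is the point under dispute above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one per-vCPU pvti record. */
struct pvti_sketch {
    volatile uint32_t version;        /* odd while an update is in progress */
    volatile uint64_t system_time;
    volatile uint64_t tsc_timestamp;
};

struct pvti_snapshot {
    uint64_t system_time;
    uint64_t tsc_timestamp;
};

/* Writer side: bump to odd, update the payload, bump back to even. */
static void pvti_write(struct pvti_sketch *p, uint64_t st, uint64_t tsc)
{
    assert(p != NULL);
    p->version++;                                  /* now odd */
    atomic_signal_fence(memory_order_seq_cst);     /* compiler barrier only */
    p->system_time = st;
    p->tsc_timestamp = tsc;
    atomic_signal_fence(memory_order_seq_cst);
    p->version++;                                  /* even again */
}

/*
 * Reader side: accept a snapshot only if the version was even and did not
 * change across the read. Returns 1 on success, 0 if max_tries ran out.
 */
static int pvti_read(const struct pvti_sketch *p, struct pvti_snapshot *out,
                     int max_tries)
{
    assert(p != NULL && out != NULL);
    for (int i = 0; i < max_tries; i++) {
        uint32_t v1 = p->version;
        atomic_signal_fence(memory_order_seq_cst);
        out->system_time = p->system_time;
        out->tsc_timestamp = p->tsc_timestamp;
        atomic_signal_fence(memory_order_seq_cst);
        uint32_t v2 = p->version;
        if (!(v1 & 1) && v1 == v2)
            return 1;
    }
    return 0;
}
```

A reader that only ever looks at its own entry never needs the retry path if, as Paolo says, a vCPU cannot run while its own entry is being written; the retry matters for entries belonging to other vCPUs.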
Re: [Xen-devel] [PATCH v2 4/4] libxl: Add interface for querying hypervisor about PCI topology
On Mon, 2015-01-05 at 21:18 -0500, Boris Ostrovsky wrote: .. and use this new interface to display it along with CPU topology and NUMA information when 'xl info -n' command is issued Also, can we see how an `xl info -n' looks, on an IONUMA system? @@ -195,6 +195,24 @@ int xc_cputopoinfo(xc_interface *xch, return 0; } +int xc_pcitopoinfo(xc_interface *xch, + xc_pcitopoinfo_t *put_info) +{ +int ret; +DECLARE_SYSCTL; + +sysctl.cmd = XEN_SYSCTL_pcitopoinfo; + +memcpy(sysctl.u.pcitopoinfo, put_info, sizeof(*put_info)); + +if ( (ret = do_sysctl(xch, sysctl)) != 0 ) +return ret; + +memcpy(put_info, sysctl.u.pcitopoinfo, sizeof(*put_info)); + +return 0; +} @@ -5121,6 +5121,64 @@ libxl_cputopology *libxl_get_cpu_topology(libxl_ctx *ctx, int *nb_cpu_out) return ret; } +libxl_pcitopology *libxl_get_pci_topology(libxl_ctx *ctx, int *num_devs) +{ +GC_INIT(ctx); +xc_pcitopoinfo_t tinfo; +DECLARE_HYPERCALL_BUFFER(xen_sysctl_pcitopo_t, pcitopo); +libxl_pcitopology *ret = NULL; +int i, rc; + I see from where this comes from. However, at least from new functions, I think we should avoid using things like DECLARE_HYPERCALL_BUFFER etc., in libxl. They belong in libxc, IMO. This is basically what Andrew was doing here (although that was on xc_{topology,numa}info: http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/hwloc-support-experimental http://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=blobdiff;f=tools/libxc/xc_misc.c;h=4f672cead5a4d2b1fc23bcddb6fb49e4d6ec72b5;hp=330345459176961cacc7fda506e843831fe7156a;hb=3585994405b6a73c137309dd4be91f48c71e4903;hpb=9a80d5056766535ac624774b96495f8b97b1d28b Basically, what I'm suggesting is to have xc_pcitopoinfo() to do most of the work, with libxl_get_pci_topology being just a wrapper to it. 
In fact, if you grep/cscope for things like DECLARE_HYPERCALL_BUFFER in tools/libxl, the *only* results are these:

libxl.c libxl_get_cpu_topology 5076 DECLARE_HYPERCALL_BUFFER(xc_cpu_to_core_t, coremap);
libxl.c libxl_get_cpu_topology 5077 DECLARE_HYPERCALL_BUFFER(xc_cpu_to_socket_t, socketmap);
libxl.c libxl_get_cpu_topology 5078 DECLARE_HYPERCALL_BUFFER(xc_cpu_to_node_t, nodemap);
libxl.c libxl_get_numainfo 5142 DECLARE_HYPERCALL_BUFFER(xc_node_to_memsize_t, memsize);
libxl.c libxl_get_numainfo 5143 DECLARE_HYPERCALL_BUFFER(xc_node_to_memfree_t, memfree);
libxl.c libxl_get_numainfo 5144 DECLARE_HYPERCALL_BUFFER(uint32_t, node_dists);

And I think we should work toward removing these too, rather than adding new ones! :-)

Regards,
Dario

--
This happens because I choose it to happen! (Raistlin Majere)
- Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:
> @@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>          }
>          else
>              pdev_info.is_virtfn = 0;
> -        ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
> +
> +        if ( add.flags & XEN_PCI_DEV_PXM )
> +        {
> +            uint32_t pxm;
> +            int optarr_off = offsetof(struct physdev_pci_device_add, optarr) /

unsigned int or size_t.

> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -56,6 +56,8 @@ struct pci_dev {
>      u8 phantom_stride;
>
> +    int node; /* NUMA node */

I thought I asked about this on v1 already: Does this really need to be an int, when commonly node numbers are stored in u8/unsigned char? Shrinking the field size would prevent the structure size from growing...

Of course an additional question would be whether the node wouldn't better go into struct arch_pci_dev - that depends on whether we expect ARM to be using NUMA...

Jan
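The point about field size can be illustrated with a small C sketch. The two structs below are hypothetical stand-ins, not the real struct pci_dev: they only mimic a one-byte member followed by the new node field and an enum. With a one-byte node, the field can sit in padding that would otherwise follow phantom_stride; with an int it needs its own aligned slot. Exact sizes depend on the ABI, so none are asserted here beyond the ordering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pdev_type_sketch { SKETCH_TYPE_UNKNOWN, SKETCH_TYPE_ENDPOINT };

/* Hypothetical layout with the node stored as an int. */
struct dev_int_node {
    uint8_t phantom_stride;
    int node;                       /* needs its own aligned slot */
    enum pdev_type_sketch type;
};

/* Hypothetical layout with the node stored in a single byte. */
struct dev_u8_node {
    uint8_t phantom_stride;
    uint8_t node;                   /* can reuse padding after phantom_stride */
    enum pdev_type_sketch type;
};

/* Bytes saved by the narrower field on the ABI this is compiled for. */
static size_t bytes_saved_by_u8_node(void)
{
    assert(sizeof(struct dev_u8_node) <= sizeof(struct dev_int_node));
    return sizeof(struct dev_int_node) - sizeof(struct dev_u8_node);
}
```

Whether the real structure actually has spare padding at that spot is something only the full definition in xen/include/xen/pci.h can answer.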
Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec
On Mon, Jan 05, Vitaly Kuznetsov wrote: Wei Liu wei.l...@citrix.com writes: Olaf mentioned his concern about handling ballooned pages in 20141211153029.ga1...@aepfle.de. Is that point moot now? Well, the limitation is real and some guest-side handling will be required in case we want to support kexec with ballooning. But as David validly mentioned It's the responsibility of the guest to ensure it either doesn't kexec when it is ballooned or that the kexec kernel can handle this. Not sure if we can (and need to) do anything hypevisor- or toolstack-side. One approach would be to mark all pages as some sort of populate-on-demand first. Then copy the existing assigned pages from domA to domB and update the page type. The remaining pages are likely ballooned. Once the guest tries to access them this should give the hypervisor and/or toolstack a chance to assign a real RAM page to them. I mean, if a host-assisted approach for kexec is implemented then this approach must also cover ballooning. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 2/4] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
On 06.01.15 at 14:41, andrew.coop...@citrix.com wrote:
> On 06/01/15 02:18, Boris Ostrovsky wrote:
>> Instead of copying data for each field in xen_sysctl_topologyinfo separately put cpu/socket/node into a single structure and do a single copy for each processor.
>>
>> There is also no need to copy whole op to user at the end, max_cpu_index is sufficient
>>
>> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact that these are used for CPU topology. Subsequent patch will add support for PCI topology sysctl.
>>
>> Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
>
> If we are going to change the hypercall, then can we see about making it a stable interface (i.e. not a sysctl/domctl)? There are non-toolstack components which might want/need access to this information. (i.e. I am still looking for a reasonable way to get this information from Xen in hwloc)

In which case leaving the sysctl alone and just adding a new non-sysctl interface should be considered.

Jan
Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -365,6 +365,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>      }
>      break;
>  #ifdef HAS_PCI
> +    case XEN_SYSCTL_pcitopoinfo:
> +    {
> +        xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
> +
> +        if ( guest_handle_is_null(ti->pcitopo) ||
> +             (ti->first_dev >= ti->num_devs) )
> +        {
> +            ret = -EINVAL;
> +            break;
> +        }
> +
> +        for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
> +        {
> +            xen_sysctl_pcitopo_t pcitopo;
> +            struct pci_dev *pdev;
> +
> +            if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
> +                                        ti->first_dev, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
> +
> +            spin_lock(&pcidevs_lock);
> +            pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
> +                                pcitopo.pcidev.devfn);
> +            if ( !pdev || (pdev->node == NUMA_NO_NODE) )
> +                pcitopo.node = INVALID_TOPOLOGY_ID;
> +            else
> +                pcitopo.node = pdev->node;

Are hypervisor-internal node numbers really meaningful to the caller?

> +            spin_unlock(&pcidevs_lock);
> +
> +            if ( copy_to_guest_offset(ti->pcitopo, ti->first_dev,

__copy_to_guest_offset()

> +                                      &pcitopo, 1) )
> +            {
> +                ret = -EFAULT;
> +                break;
> +            }
> +
> +            if ( hypercall_preempt_check() )
> +                break;

You didn't increment ->first_dev yet, i.e. you'd start with the same index again when continuing later, and in the end you may not make any forward progress.

> @@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
>
>  /* XEN_SYSCTL_cputopoinfo */
> -#define INVALID_TOPOLOGY_ID (~0U)
> +#define INVALID_TOPOLOGY_ID (~0U) /* Also used by pcitopo */

Better extend the preceding comment.
> @@ -492,6 +493,36 @@ struct xen_sysctl_cputopoinfo {
>  typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>
> +/* XEN_SYSCTL_pcitopoinfo */
> +struct xen_sysctl_pcitopo {
> +    struct physdev_pci_device pcidev;
> +    uint32_t node;
> +};
> +typedef struct xen_sysctl_pcitopo xen_sysctl_pcitopo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopo_t);
> +
> +struct xen_sysctl_pcitopoinfo {
> +    /* IN: Size of pcitopo array */
> +    uint32_t num_devs;
> +
> +    /*
> +     * IN/OUT: First element of pcitopo array that needs to be processed by
> +     * hypervisor.
> +     * This is used primarily by hypercall continuations and callers will
> +     * typically set it to zero
> +     */
> +    uint32_t first_dev;
> +
> +    /*
> +     * If not NULL, filled with node identifier for each pcidev

The "If not NULL" would be meaningful only if NULL had a special meaning.

Jan
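The forward-progress concern raised in this review (the preemption check breaking out before first_dev has been advanced) can be shown with a small stand-alone C sketch. It is not the Xen code: the request struct, the fake per-device lookup and the budget parameter that stands in for hypercall_preempt_check() are all hypothetical. The only point it makes is ordering: the index is advanced before the loop considers yielding, so a re-issued call resumes at the next element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down request: array size and resume index. */
struct topo_req_sketch {
    uint32_t num_devs;
    uint32_t first_dev;   /* IN/OUT: next element to process */
};

/*
 * Processes elements starting at req->first_dev. Returns 1 if it yielded
 * early and the caller should call again, 0 when everything is done.
 * 'budget' models a preemption check firing after that many elements.
 */
static int process_batch(struct topo_req_sketch *req, uint32_t *node_out,
                         uint32_t budget)
{
    uint32_t done = 0;

    assert(req != NULL && node_out != NULL);
    while (req->first_dev < req->num_devs) {
        /* Pretend lookup: derive a value from the index. */
        node_out[req->first_dev] = req->first_dev * 10;

        /* Advance first, so a continuation never repeats this element. */
        req->first_dev++;

        if (++done >= budget && req->first_dev < req->num_devs)
            return 1;
    }
    return 0;
}
```

A caller simply loops on the return value, for example `while (process_batch(&req, out, 2)) { }`, and each call is guaranteed to handle at least one new element.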
Re: [Xen-devel] [PATCH 1/7] tools/hotplug: remove SELinux options from var-lib-xenstored.mount
On Tue, Jan 06, Ian Campbell wrote:
> On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:
> ...
> Acked-by: Ian Campbell ian.campb...@citrix.com
> (on commit s/Appearently/Apparently/; s/non-existant/non-existent/ in the commit log)

I also made typos in other commit messages. Should I resend the entire series, or will this be done during commit?

Olaf
Re: [Xen-devel] xl only waits 33 seconds for ballooning to complete
On Tue, 2015-01-06 at 14:17 -0700, Mike Latimer wrote:
> Hi,
>
> In a previous post (1), I mentioned issues seen while ballooning a large amount of memory. In the current code, the ballooning process only has 33 seconds to complete, or the xl operation (i.e. domain create) will fail. When a lot of ballooning is required, or the host is very slow to balloon memory, this delay is not sufficient.
>
> The code involved is tools/libxl/xl_cmdimpl.c:freemem. This function retries 3 times, and each retry includes a 10 second delay in libxl_wait_for_free_memory and a 1 second delay in libxl_wait_for_memory_target.
>
> Is there a better approach, which would account for ballooning operations that take a much longer time to complete? The easiest option is to simply increase the retry count, but that would again leave us with a fixed window of time for an operation to complete.
>
> It seems like something that monitors the balloon process, and continues to wait if it is progressing, might be a better approach.

That's exactly what I was about to suggest as I read the penultimate paragraph, i.e. keep waiting so long as some reasonable delta occurs on each iteration.

Ian.

> Any ideas?
>
> Thanks,
> Mike
>
> 1. http://lists.xen.org/archives/html/xen-devel/2014-12/msg01443.html
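A minimal sketch, in C, of the "keep waiting while there is progress" idea discussed in this thread. It is not the xl or libxl implementation: the probe callback, the fake_host helper and every parameter name are hypothetical, and a real loop would sleep between probes. Instead of a fixed number of retries, the loop gives up only after max_stalls consecutive probes in which free memory failed to grow by at least min_delta_kib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe: returns the currently free memory in KiB. */
typedef uint64_t (*free_mem_probe_fn)(void *ctx);

/*
 * Waits until the probe reports at least need_kib. Returns 0 on success,
 * -1 if max_stalls consecutive probes showed less than min_delta_kib of
 * progress since the last probe that counted as progress.
 */
static int wait_for_free_memory_sketch(free_mem_probe_fn probe, void *ctx,
                                       uint64_t need_kib,
                                       uint64_t min_delta_kib,
                                       int max_stalls)
{
    assert(probe != NULL);
    uint64_t mark = probe(ctx);   /* last value that counted as progress */
    int stalls = 0;

    while (mark < need_kib) {
        /* Real code would sleep here before probing again. */
        uint64_t now = probe(ctx);

        if (now >= need_kib)
            return 0;
        if (now >= mark + min_delta_kib) {
            mark = now;           /* progress: reset the stall counter */
            stalls = 0;
        } else if (++stalls >= max_stalls) {
            return -1;            /* no meaningful progress for too long */
        }
    }
    return 0;
}

/* Fake host for trying the loop out: frees step_kib on every probe. */
struct fake_host {
    uint64_t free_kib;
    uint64_t step_kib;
};

static uint64_t fake_probe(void *ctx)
{
    struct fake_host *h = ctx;
    uint64_t cur = h->free_kib;
    h->free_kib += h->step_kib;
    return cur;
}
```

With this shape a slow but steadily ballooning host is allowed to finish however long it takes, while a host that has stopped making progress still fails after a bounded number of probes.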
Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored
On Tue, Jan 06, Ian Campbell wrote: On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote: The shell wrapper in xenstored.service does not handle XENSTORE_TRACE. Create a separate wrapper script which is used in the sysv runlevel script and in systemd xenstored.service. It preserves existing behaviour by handling the XENSTORE_TRACE boolean. It also implements the handling of XENSTORED_ARGS=. This variable has to be added to sysconfig/xencommons. Why don't we just drop XENSTORED_* in favour of XENSTORED_ARGS (with an example in the sysconfig file of enabling tracing if you like)? After having two weeks to think about this I came to the same conclusion. I think whatever the outcome is, the boolean should be removed. The sysconfig file should get a XENSTORED_ARGS= along with a help text which mentions -T /path and xenstored --help to get other options because there is no man page. Going to a wrapper script just to make some fairly uncommon debugging option marginally more convenient seems like overkill to me, plus XENSTORED_ARGS would allow for passing other useful options to xenstored. If I recall correctly the point of the current 'sh -c exec ...' stunt was to expand the XENSTORE variable from the sysconfig file. But this approach leads to failures with SELinux because the socket passing does not work this way. Up to now I have not seen a success report for selinux+systemd+xenstored. Maybe its already somewhere in the other unread mails. In my cover letter I provided some possible ways to handle selinux+systemd+xenstored. Ideally the approach Exec=/usr/bin/env $XENSTORED --no-fork $XENSTORED_ARGS works because it means its possible to select a binary via the sysconfig file. But it also means the XENSTORE_TRACE boolean has to be removed in favour of the plain XENSTORED_ARGS= approach mentioned above. Finally this would avoid the need for a wrapper script. Hopefully someone with access to a SELinux enabled system will report which approach actually works. 
The wrapper uses exec unconditionally. This works because the systemd service file passes --no-fork, which has the desired effect that the binary launched by systemd becomes the final daemon process. The sysv script does not pass --no-fork, which causes xenstored to fork internally to return to the caller of the wrapper script. The place of the wrapper is currently LIBEXEC_BIN, it has to be decided what the final location is supposed to be. IanJ wants it in /etc. If we go this route then I agree with Ian J. (/etc/xen/scripts, I suppose). I have not heard back which location has to be used. If /etc/xen/scripts is the place, so be it. I thought this is just for hotplug scripts. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 for 4.5] x86/VPMU: Clear last_vcpu when destroying VPMU
From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com] Sent: Tuesday, January 06, 2015 12:33 AM On Thu, Dec 18, 2014 at 01:06:40PM -0500, Boris Ostrovsky wrote: We need to make sure that last_vcpu is not pointing to VCPU whose VPMU is being destroyed. Otherwise we may try to dereference it in the future, when VCPU is gone. We have to do this via IPI since otherwise there is a (somewheat theoretical) chance that between test and subsequent clearing of last_vcpu the remote processor (i.e. vpmu-last_pcpu) might do both vpmu_load() and then vpmu_save() for another VCPU. The former will clear last_vcpu and the latter will set it to something else. Performing this operation via IPI will guarantee that nothing can happen on the remote processor between testing and clearing of last_vcpu. We should also check for VPMU_CONTEXT_ALLOCATED in vpmu_destroy() to avoid unnecessary percpu tests and arch-specific destroy ops. Thus checks in AMD and Intel routines are no longer needed. Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com --- xen/arch/x86/hvm/svm/vpmu.c |3 --- xen/arch/x86/hvm/vmx/vpmu_core2.c |2 -- xen/arch/x86/hvm/vpmu.c | 20 3 files changed, 20 insertions(+), 5 deletions(-) CC-ing the rest of the maintainers (Intel ones, since Boris is on the AMD side). I am OK with this patch going in Xen 4.5 as for one thing to actually use vpmu you have to pass 'vpmu=1' on the Xen command line. 
Aka, Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked-by: Kevin Tian kevin.t...@intel.com Changes in v4: * Back to v2's IPI implementation Changes in v3: * Use cmpxchg instead of IPI * Use correct routine names in commit message * Remove duplicate test for VPMU_CONTEXT_ALLOCATED in arch-specific destroy ops Changes in v2: * Test last_vcpu locally before IPI * Don't handle local pcpu as a special case --- on_selected_cpus will take care of that * Dont't cast variables unnecessarily diff --git a/xen/arch/x86/hvm/svm/vpmu.c b/xen/arch/x86/hvm/svm/vpmu.c index 8e07a98..4c448bb 100644 --- a/xen/arch/x86/hvm/svm/vpmu.c +++ b/xen/arch/x86/hvm/svm/vpmu.c @@ -403,9 +403,6 @@ static void amd_vpmu_destroy(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); -if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) -return; - if ( ((struct amd_vpmu_context *)vpmu-context)-msr_bitmap_set ) amd_vpmu_unset_msr_bitmap(v); diff --git a/xen/arch/x86/hvm/vmx/vpmu_core2.c b/xen/arch/x86/hvm/vmx/vpmu_core2.c index 68b6272..590c2a9 100644 --- a/xen/arch/x86/hvm/vmx/vpmu_core2.c +++ b/xen/arch/x86/hvm/vmx/vpmu_core2.c @@ -818,8 +818,6 @@ static void core2_vpmu_destroy(struct vcpu *v) struct vpmu_struct *vpmu = vcpu_vpmu(v); struct core2_vpmu_context *core2_vpmu_cxt = vpmu-context; -if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) -return; xfree(core2_vpmu_cxt-pmu_enable); xfree(vpmu-context); if ( cpu_has_vmx_msr_bitmap ) diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c index 1df74c2..37f0d9f 100644 --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -247,10 +247,30 @@ void vpmu_initialise(struct vcpu *v) } } +static void vpmu_clear_last(void *arg) +{ +if ( this_cpu(last_vcpu) == arg ) +this_cpu(last_vcpu) = NULL; +} + void vpmu_destroy(struct vcpu *v) { struct vpmu_struct *vpmu = vcpu_vpmu(v); +if ( !vpmu_is_set(vpmu, VPMU_CONTEXT_ALLOCATED) ) +return; + +/* + * Need to clear last_vcpu in case it points to v. 
+ * We can check here non-atomically whether it is 'v' since + * last_vcpu can never become 'v' again at this point. + * We will test it again in vpmu_clear_last() with interrupts + * disabled to make sure we don't clear someone else. + */ +if ( per_cpu(last_vcpu, vpmu-last_pcpu) == v ) +on_selected_cpus(cpumask_of(vpmu-last_pcpu), + vpmu_clear_last, v, 1); + if ( vpmu-arch_vpmu_ops vpmu-arch_vpmu_ops-arch_vpmu_destroy ) vpmu-arch_vpmu_ops-arch_vpmu_destroy(v); } -- 1.7.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/4] x86: expose CMT L3 event mask to user space
On 23.12.14 at 09:54, chao.p.p...@linux.intel.com wrote:
> --- a/xen/arch/x86/sysctl.c
> +++ b/xen/arch/x86/sysctl.c
> @@ -157,6 +157,11 @@ long arch_do_sysctl(
>              sysctl->u.psr_cmt_op.u.data = (ret ? 0 : info.size);
>              break;
>          }
> +        case XEN_SYSCTL_PSR_CMT_get_l3_event_mask:
> +        {
> +            sysctl->u.psr_cmt_op.u.data = psr_cmt->l3.features;
> +            break;
> +        }

Stray figure braces. Other than that
Acked-by: Jan Beulich jbeul...@suse.com
Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec
On 07/01/15 09:10, Olaf Hering wrote:
> On Mon, Jan 05, Vitaly Kuznetsov wrote:
>> Wei Liu wei.l...@citrix.com writes:
>>> Olaf mentioned his concern about handling ballooned pages in 20141211153029.ga1...@aepfle.de. Is that point moot now?
>>
>> Well, the limitation is real and some guest-side handling will be required in case we want to support kexec with ballooning. But as David validly mentioned "It's the responsibility of the guest to ensure it either doesn't kexec when it is ballooned or that the kexec kernel can handle this". Not sure if we can (and need to) do anything hypervisor- or toolstack-side.
>
> One approach would be to mark all pages as some sort of populate-on-demand first. Then copy the existing assigned pages from domA to domB and update the page type. The remaining pages are likely ballooned. Once the guest tries to access them this should give the hypervisor and/or toolstack a chance to assign a real RAM page to them.
>
> I mean, if a host-assisted approach for kexec is implemented then this approach must also cover ballooning.

It is not possible for the hypervisor or toolstack to do what you want because there may not be enough free memory to repopulate the new domain.

The guest can handle this by:

1. Not ballooning (this is common in cloud environments).
2. Reducing the balloon prior to kexec.
3. Running the kexec'd image in a reserved chunk of memory (the crash kernel case).
4. Providing balloon information to the kexec'd image.

None of these require any additional hypervisor or toolstack support and 1-3 are trivial for a guest to implement.

David
Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec
Olaf Hering o...@aepfle.de writes:
> On Mon, Jan 05, Vitaly Kuznetsov wrote:
>> Wei Liu wei.l...@citrix.com writes:
>>> Olaf mentioned his concern about handling ballooned pages in 20141211153029.ga1...@aepfle.de. Is that point moot now?
>>
>> Well, the limitation is real and some guest-side handling will be required in case we want to support kexec with ballooning. But as David validly mentioned "It's the responsibility of the guest to ensure it either doesn't kexec when it is ballooned or that the kexec kernel can handle this". Not sure if we can (and need to) do anything hypervisor- or toolstack-side.
>
> One approach would be to mark all pages as some sort of populate-on-demand first. Then copy the existing assigned pages from domA to domB and update the page type. The remaining pages are likely ballooned. Once the guest tries to access them this should give the hypervisor and/or toolstack a chance to assign a real RAM page to them.

The thing is .. we don't have these pages when kexec is being performed, they are already ballooned out and the hypervisor doesn't have the knowledge of which GFNs should be re-populated. I think it is possible to keep track of all pages the guest balloons out for this purpose, but ..

> I mean, if a host-assisted approach for kexec is implemented then this approach must also cover ballooning.

I don't see why solving the issue hypervisor-side is a must. When the guest performs kdump we don't care about the ballooning as we have a separate memory area which is supposed to have no ballooned out pages. When we do kexec nothing stops us from asking the balloon driver to bring everything back; it is fine to perform non-trivial work before kexec (e.g. we shut down all the devices).

But, as I said, I'll try playing with ballooning to make these thoughts not purely theoretical.

--
Vitaly
Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec
On Wed, Jan 07, David Vrabel wrote:
> 2. Reducing the balloon prior to kexec.

We carry a patch for kexec(1) which balloons up before doing the actual kexec call. I propose getting such a change into the upstream kexec tools if that is indeed the way to go. The benefit is that the guest waits until every ballooned page is populated. If the host is short on memory, the guest will hang instead of crashing after kexec.

https://build.opensuse.org/package/view_file/Kernel:kdump/kexec-tools/kexec-tools-xen-balloon-up.patch

Olaf
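The "wait until every ballooned page is populated" behaviour described above can be sketched as a simple polling loop. This is not the openSUSE patch, only an illustration under stated assumptions: the sysfs directory is where I would expect the Linux Xen balloon driver to expose its counters, and `read_kb`, `balloon_is_deflated` and `wait_until_populated` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed sysfs location of the Linux Xen balloon driver; verify locally. */
#define BALLOON_DIR "/sys/devices/system/xen_memory/xen_memory0"

/* Read a single unsigned value (in kB) from a sysfs-style text file. */
bool read_kb(const char *path, unsigned long long *kb)
{
    FILE *f = fopen(path, "r");
    bool ok;

    if (!f)
        return false;
    ok = (fscanf(f, "%llu", kb) == 1);
    fclose(f);
    return ok;
}

/* The guest counts as fully populated once current memory reaches the goal. */
bool balloon_is_deflated(unsigned long long current_kb,
                         unsigned long long wanted_kb)
{
    return current_kb >= wanted_kb;
}

/*
 * Poll the "current" counter once per second until it reaches wanted_kb.
 * Returns 0 on success and -1 if max_polls attempts were not enough.
 * A caller that prefers to hang rather than crash would poll without limit.
 */
int wait_until_populated(const char *current_path,
                         unsigned long long wanted_kb, int max_polls)
{
    unsigned long long cur;

    for (int i = 0; i < max_polls; i++) {
        if (read_kb(current_path, &cur) && balloon_is_deflated(cur, wanted_kb))
            return 0;
        sleep(1);
    }
    return -1;
}
```

A caller would first ask the balloon driver for the full allocation and then call something like `wait_until_populated(BALLOON_DIR "/info/current_kb", wanted_kb, 60)` before issuing the kexec call; the file name under `BALLOON_DIR` is again an assumption.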
[Xen-devel] [PATCH v2 3/5] tools: correct coding style for psr
- space: remove space after '(' or before ')' in 'if' condition; - indention: align function definition/call arguments; Signed-off-by: Chao Peng chao.p.p...@linux.intel.com --- tools/libxc/include/xenctrl.h |8 tools/libxc/xc_psr.c | 10 +- tools/libxl/libxl.h | 11 +++ tools/libxl/libxl_psr.c | 11 +++ tools/libxl/xl_cmdimpl.c | 11 ++- 5 files changed, 29 insertions(+), 22 deletions(-) diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index 96b357c..c6e9e3e 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -2693,15 +2693,15 @@ typedef enum xc_psr_cmt_type xc_psr_cmt_type; int xc_psr_cmt_attach(xc_interface *xch, uint32_t domid); int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid); int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid, -uint32_t *rmid); + uint32_t *rmid); int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid); int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch, -uint32_t *upscaling_factor); + uint32_t *upscaling_factor); int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask); int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, uint32_t *l3_cache_size); -int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, -uint32_t cpu, uint32_t psr_cmt_type, uint64_t *monitor_data); +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu, +uint32_t psr_cmt_type, uint64_t *monitor_data); int xc_psr_cmt_enabled(xc_interface *xch); #endif diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c index e76a0f9..e3ecc41 100644 --- a/tools/libxc/xc_psr.c +++ b/tools/libxc/xc_psr.c @@ -47,7 +47,7 @@ int xc_psr_cmt_detach(xc_interface *xch, uint32_t domid) } int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid, -uint32_t *rmid) + uint32_t *rmid) { int rc; DECLARE_DOMCTL; @@ -88,7 +88,7 @@ int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid) } int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch, 
-uint32_t *upscaling_factor) + uint32_t *upscaling_factor) { static int val = 0; int rc; @@ -137,7 +137,7 @@ int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask) } int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, - uint32_t *l3_cache_size) + uint32_t *l3_cache_size) { static int val = 0; int rc; @@ -162,8 +162,8 @@ int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu, return rc; } -int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, -uint32_t cpu, xc_psr_cmt_type type, uint64_t *monitor_data) +int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid, uint32_t cpu, +xc_psr_cmt_type type, uint64_t *monitor_data) { xc_resource_op_t op; xc_resource_entry_t entries[2]; diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index 42ace76..d84ff7f 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1455,10 +1455,13 @@ int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid); int libxl_psr_cmt_enabled(libxl_ctx *ctx); int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type); int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid); -int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid, -uint32_t *l3_cache_size); -int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, uint32_t domid, -uint32_t socketid, uint32_t *l3_cache_occupancy); +int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, +uint32_t socketid, +uint32_t *l3_cache_size); +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, + uint32_t domid, + uint32_t socketid, + uint32_t *l3_cache_occupancy); #endif /* misc */ diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c index 3018a0d..0f2c7e0 100644 --- a/tools/libxl/libxl_psr.c +++ b/tools/libxl/libxl_psr.c @@ -153,8 +153,9 @@ int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid) return rc; } -int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid, - uint32_t *l3_cache_size) +int 
libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, +uint32_t socketid, +uint32_t *l3_cache_size) { GC_INIT(ctx); @@ -178,8 +179,10 @@ out: return rc; } -int
[Xen-devel] [PATCH v2 1/5] x86: expose CMT L3 event mask to user space
L3 event mask indicates the event types supported in host, including cache occupancy event as well as local/total memory bandwidth events for Memory Bandwidth Monitoring (MBM). Expose it so all these events can be monitored in user space.

Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Jan Beulich jbeul...@suse.com
---
 xen/arch/x86/sysctl.c       |    3 +++
 xen/include/public/sysctl.h |    1 +
 2 files changed, 4 insertions(+)

diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index 57ad992..611a291 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -157,6 +157,9 @@ long arch_do_sysctl(
             sysctl->u.psr_cmt_op.u.data = (ret ? 0 : info.size);
             break;
         }
+        case XEN_SYSCTL_PSR_CMT_get_l3_event_mask:
+            sysctl->u.psr_cmt_op.u.data = psr_cmt->l3.features;
+            break;
         default:
             sysctl->u.psr_cmt_op.u.data = 0;
             ret = -ENOSYS;
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index b3713b3..8552dc6 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -641,6 +641,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_coverage_op_t);
 /* The L3 cache size is returned in KB unit */
 #define XEN_SYSCTL_PSR_CMT_get_l3_cache_size         2
 #define XEN_SYSCTL_PSR_CMT_enabled                   3
+#define XEN_SYSCTL_PSR_CMT_get_l3_event_mask         4
 struct xen_sysctl_psr_cmt_op {
     uint32_t cmd;       /* IN: XEN_SYSCTL_PSR_CMT_* */
     uint32_t flags;     /* padding variable, may be extended for future use */
--
1.7.9.5
Re: [Xen-devel] Xen 4.5 Development Update (GA slip by a week)
On Tue, Jan 06, Konrad Rzeszutek Wilk wrote:
> There is only one outstanding patch and that is #7 "tools/hotplug: add
> wrapper to start xenstored". Olaf is back tomorrow so it might make it
> .. or not.

See my other replies. I think once we know how to deal with SELinux and systemd, this change may not be needed anymore.

Olaf
Re: [Xen-devel] [PATCH v2 3/5] tools: correct coding style for psr
On Wed, Jan 07, 2015 at 07:12:03PM +0800, Chao Peng wrote:
> - space: remove space after '(' or before ')' in 'if' condition;
> - indentation: align function definition/call arguments;
>
> Signed-off-by: Chao Peng chao.p.p...@linux.intel.com

Acked-by: Wei Liu wei.l...@citrix.com
Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask
On 07/01/15 11:12, Chao Peng wrote:
> This is the tools side wrapper for XEN_SYSCTL_PSR_CMT_get_l3_event_mask
> of XEN_SYSCTL_psr_cmt_op.
>
> Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
> ---
>  tools/libxc/include/xenctrl.h |    1 +
>  tools/libxc/xc_psr.c          |   24 ++++++++++++++++++++++++
>  tools/libxl/libxl.h           |    1 +
>  tools/libxl/libxl_psr.c       |   18 ++++++++++++++++++
>  4 files changed, 44 insertions(+)
>
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 0ad8b8d..96b357c 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -2697,6 +2697,7 @@ int xc_psr_cmt_get_domain_rmid(xc_interface *xch, uint32_t domid,
>  int xc_psr_cmt_get_total_rmid(xc_interface *xch, uint32_t *total_rmid);
>  int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
>      uint32_t *upscaling_factor);
> +int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask);
>  int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
>      uint32_t *l3_cache_size);
>  int xc_psr_cmt_get_data(xc_interface *xch, uint32_t rmid,
> diff --git a/tools/libxc/xc_psr.c b/tools/libxc/xc_psr.c
> index 872e6dc..e76a0f9 100644
> --- a/tools/libxc/xc_psr.c
> +++ b/tools/libxc/xc_psr.c
> @@ -112,6 +112,30 @@ int xc_psr_cmt_get_l3_upscaling_factor(xc_interface *xch,
>      return rc;
>  }
>
> +int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask)
> +{
> +    static int val = 0;

This should be uint32_t rather than int.

I am somewhat concerned about multithreaded use of libxc, but this is not the first issue in libxc, and probably shouldn't be held against this patch. As the result of the hypercall is going to be the same, the worst that a race could achieve is a wasted hypercall.

> +    int rc;
> +    DECLARE_SYSCTL;
> +
> +    if ( val )
> +    {
> +        *event_mask = val;
> +        return 0;
> +    }
> +
> +    sysctl.cmd = XEN_SYSCTL_psr_cmt_op;
> +    sysctl.u.psr_cmt_op.cmd =
> +        XEN_SYSCTL_PSR_CMT_get_l3_event_mask;
> +    sysctl.u.psr_cmt_op.flags = 0;
> +
> +    rc = xc_sysctl(xch, &sysctl);
> +    if ( !rc )
> +        val = *event_mask = sysctl.u.psr_cmt_op.u.data;
> +
> +    return rc;
> +}
> +
>  int xc_psr_cmt_get_l3_cache_size(xc_interface *xch, uint32_t cpu,
>      uint32_t *l3_cache_size)
>  {
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 0a123f1..42ace76 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -1453,6 +1453,7 @@ int libxl_psr_cmt_attach(libxl_ctx *ctx, uint32_t domid);
>  int libxl_psr_cmt_detach(libxl_ctx *ctx, uint32_t domid);
>  int libxl_psr_cmt_domain_attached(libxl_ctx *ctx, uint32_t domid);
>  int libxl_psr_cmt_enabled(libxl_ctx *ctx);
> +int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type);
>  int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid);
>  int libxl_psr_cmt_get_l3_cache_size(libxl_ctx *ctx, uint32_t socketid,
>      uint32_t *l3_cache_size);
> diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
> index 0437465..3018a0d 100644
> --- a/tools/libxl/libxl_psr.c
> +++ b/tools/libxl/libxl_psr.c
> @@ -120,6 +120,24 @@ int libxl_psr_cmt_enabled(libxl_ctx *ctx)
>      return xc_psr_cmt_enabled(ctx->xch);
>  }
>
> +int libxl_psr_cmt_type_supported(libxl_ctx *ctx, libxl_psr_cmt_type type)
> +{
> +    GC_INIT(ctx);
> +    uint32_t event_mask;
> +    int ret;

The libxl CODING_STYLE states that this ret should be rc.

~Andrew

> +
> +    ret = xc_psr_cmt_get_l3_event_mask(ctx->xch, &event_mask);
> +    if (ret < 0) {
> +        libxl__psr_cmt_log_err_msg(gc, errno);
> +        ret = 0;
> +    } else {
> +        ret = event_mask & (1 << (type - 1));
> +    }
> +
> +    GC_FREE;
> +    return ret;
> +}
> +
>  int libxl_psr_cmt_get_total_rmid(libxl_ctx *ctx, uint32_t *total_rmid)
>  {
>      GC_INIT(ctx);
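As a standalone illustration of the two points raised in this review, the sketch below caches the mask in a `uint32_t` (rather than an `int`) and tests one event type against it. It is not the libxc or libxl code: `fake_get_l3_event_mask` stands in for the real hypercall, the mask value `0x7` is made up, and the "type N lives in bit N-1" rule is taken from the patch under review.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the real hypercall; counts how often it is invoked. */
static unsigned int fake_hypercalls;

static int fake_get_l3_event_mask(uint32_t *mask)
{
    fake_hypercalls++;
    *mask = 0x7;            /* made-up value: three event types supported */
    return 0;
}

/*
 * Fetch the mask once and serve later calls from the cache. The cache has
 * the same type as the value it stores. Without locking, two racing threads
 * could both see val == 0 and both issue the call; they would store the same
 * result, so the cost is one wasted call.
 */
int get_l3_event_mask_cached(uint32_t *event_mask)
{
    static uint32_t val;    /* 0 means "not fetched yet" */
    int rc;

    if (val) {
        *event_mask = val;
        return 0;
    }

    rc = fake_get_l3_event_mask(event_mask);
    if (!rc)
        val = *event_mask;

    return rc;
}

/* Event type N (counting from 1) is reported in bit N-1 of the mask. */
bool event_type_supported(uint32_t event_mask, unsigned int type)
{
    return type >= 1 && type <= 32 &&
           (event_mask & (UINT32_C(1) << (type - 1))) != 0;
}
```

The range check on `type` is an addition of this sketch; it avoids shifting by a negative or oversized amount if a caller passes an unexpected value.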
[Xen-devel] check-headers error in staging
After upgrade to current staging, my test packages fail to build: [ 289s] + make -j4 -k -C tools/include/xen-foreign [ 289s] make: Entering directory `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign' [ 289s] python mkheader.py arm32 arm32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h [ 289s] python mkheader.py arm64 arm64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h [ 289s] python mkheader.py x86_32 x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/ [ 289s] python mkheader.py x86_64 x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/ [ 289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64 [ 289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c [ 289s] ./checker tmp.size [ 289s] diff -u reference.size tmp.size [ 289s] --- reference.size 2015-01-07 10:28:57.0 + [ 289s] +++ tmp.size 2015-01-07 12:28:33.564911299 + [ 289s] @@ -9,6 +9,6 @@ [ 289s] arch_vcpu_info| 0 0 24 16 [ 289s] vcpu_time_info| 32 32 32 32 [ 289s] vcpu_info | 48 48 64 64 [ 289s] -arch_shared_info | 0 0 268 280 [ 289s] -shared_info |1088108825843368 [ 289s] +arch_shared_info | 0 0 24 48 [ 289s] +shared_info |1088108823403136 [ 289s] [ 289s] make: *** [check-headers] Error 1 this was the 
last successful build: xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538

this build fails: xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8

So it looks like something between git commits dcd8486..dd94cac causes this failure.

Olaf
Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use
On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:
> On 07/01/15 12:00, Ian Campbell wrote:
>> On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
>>> From: Jenny Herbert jennifer.herb...@citrix.com
>>>
>>> Introduce gnttab_unmap_refs_async() that can be used to safely unmap
>>> pages that may be in use (ref count > 1). If the pages are in use the
>>> unmap is deferred and retried later. This polling is not very clever
>>> but it should be good enough if the cases where the delay is necessary
>>> are rare.
>>>
>>> This is needed to allow block backends using grant mapping to safely
>>> use network storage (block or filesystem based such as iSCSI or NFS).
>>>
>>> The network storage driver may complete a block request whilst there
>>> is a queued network packet retry (because the ack from the remote end
>>> races with deciding to queue the retry). The pages for the retried
>>> packet would be grant unmapped and the network driver (or hardware)
>>> would access the unmapped page.
>>
>> I thought this had been solved a little while ago by mapping a scratch
>> page on unmap even for kernel space grant mappings, but both the design
>> doc and here imply not (i.e. the scratch is for user grant mappings
>> only), so I must be misremembering.
>
> It was only for user grant mappings and it did not fix the case where
> the page being unmapped was currently DMA mapped. This could have
> resulted in the NIC transmitting sensitive data. e.g.,
>
> 1. iscsi queues a retransmit with page P (frame F).
> 2. NIC driver DMA maps and queues frame F on h/w.
> 3. iscsi completes the I/O.
> 4. page P is unmapped.
> 5. response is sent to guest
> 6. guest reuses frame F.
> 7. NIC transmits frame F.

You don't actually need Xen to expose this sort of thing; any userspace which reuses a buffer after write() (to e.g. NFS) has completed might expose the new data on the wire in a retransmit.

It's certainly much easier to expose with Xen, I'll grant (no pun intended) you that.

> We don't use this safe unmap mechanism for netback because the zero
> copy stuff means we don't need it and the polling on the unmap is high
> latency and only good enough if the polling is needed very rarely.

I'd think it should be rare for netback too unless your network is made of wet string.

Ian.
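The "defer and retry while the page is still in use" idea discussed in this thread can be modelled in a few lines. This is a toy simulation, not the kernel's gnttab_unmap_refs_async(): `struct toy_page`, `try_unmap`, `tick` and `unmap_deferred` are hypothetical names, and the loop stands in for whatever delayed retry the real code schedules.

```c
#include <stdbool.h>

/* Toy stand-in for a grant-mapped page; not the kernel's struct page. */
struct toy_page {
    int refcount;   /* 1 means only the grant mapping itself holds the page */
    bool mapped;
};

/* Unmap only when nobody else (e.g. a queued retransmit) holds the page. */
bool try_unmap(struct toy_page *p)
{
    if (p->refcount > 1)
        return false;       /* still in use: the caller must retry later */
    p->mapped = false;
    return true;
}

/* Model one other user dropping its reference (e.g. the NIC finishing DMA). */
void tick(struct toy_page *p)
{
    if (p->refcount > 1)
        p->refcount--;
}

/*
 * Keep retrying until the unmap succeeds. Returns the number of deferrals
 * that were needed, or -1 if max_retries was exhausted with the page still
 * mapped.
 */
int unmap_deferred(struct toy_page *p, int max_retries)
{
    for (int tries = 0; tries <= max_retries; tries++) {
        if (try_unmap(p))
            return tries;
        tick(p);            /* time passes before the next attempt */
    }
    return -1;
}
```

With this model, step 4 of the sequence above (page P is unmapped) cannot happen until the NIC has dropped its reference in step 7, which is the ordering the patch is after. The cost is the polling latency mentioned in the thread.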
Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use
On 07/01/15 13:24, Ian Campbell wrote:
> On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:
>> We don't use this safe unmap mechanism for netback because the zero
>> copy stuff means we don't need it and the polling on the unmap is high
>> latency and only good enough if the polling is needed very rarely.
>
> I'd think it should be rare for netback too unless your network is made
> of wet string.

Hmmm. Yes, this might be worth looking at but I would prefer to do it as a follow-up to this series.

David
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 01/07/2015 04:06 AM, Jan Beulich wrote: On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote: @@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg) } else pdev_info.is_virtfn = 0; -ret = pci_add_device(add.seg, add.bus, add.devfn, pdev_info); + +if ( add.flags XEN_PCI_DEV_PXM ) +{ +uint32_t pxm; +int optarr_off = offsetof(struct physdev_pci_device_add, optarr) / unsigned int or size_t. --- a/xen/include/xen/pci.h +++ b/xen/include/xen/pci.h @@ -56,6 +56,8 @@ struct pci_dev { u8 phantom_stride; +int node; /* NUMA node */ I thought I asked about this on v1 already: Does this really need to be an int, when commonly node numbers are stored in u8/unsigned char? Shrinking the field size would prevent the structure size from growing... I kept this field as an int to be able to store NUMA_NO_NODE which I thought to be (int)-1. But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to (int)-1 by pxm_to_node(). Given that there is a number of tests for NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() return u8 as well? Of course an additional question would be whether the node wouldn't better go into struct arch_pci_dev - that depends on whether we expect ARM to be using NUMA... Since we have CPU topology in common code I thought this would be arch-independent as well. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
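The mismatch between a byte-sized NUMA_NO_NODE (0xff) and an int field holding -1 is easy to reproduce in isolation. The sketch below is only an illustration of that pitfall: `nodeid8_t` and the three helpers are hypothetical names, not Xen's, and the `int8_t` cast relies on the usual two's-complement behaviour of mainstream compilers.

```c
#include <stdint.h>

#define NUMA_NO_NODE 0xff   /* value quoted in the thread */

/* Hypothetical byte-sized node id, as commonly used for node numbers. */
typedef uint8_t nodeid8_t;

/*
 * Sign-extending the byte turns 0xff into (int)-1, which is the promotion
 * the thread attributes to pxm_to_node(). Out-of-range conversion to int8_t
 * is implementation-defined in C; gcc and clang wrap as expected.
 */
int node_as_int(nodeid8_t n)
{
    return (int8_t)n;
}

/* Keeping the byte type makes the comparison with NUMA_NO_NODE just work. */
int node_is_valid_u8(nodeid8_t n)
{
    return n != NUMA_NO_NODE;
}

/* With an int field holding -1, the same comparison silently misses. */
int node_is_valid_int_buggy(int n)
{
    return n != NUMA_NO_NODE;   /* -1 != 255, so "no node" looks valid */
}
```

This is why the two representations should not be mixed: either the field and the lookup both use the byte type, or every test has to compare against -1 instead of NUMA_NO_NODE.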
Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology
On 01/07/2015 04:21 AM, Jan Beulich wrote: On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote: --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -365,6 +365,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) } break; #ifdef HAS_PCI +case XEN_SYSCTL_pcitopoinfo: +{ +xen_sysctl_pcitopoinfo_t *ti = op-u.pcitopoinfo; + +if ( guest_handle_is_null(ti-pcitopo) || + (ti-first_dev = ti-num_devs) ) +{ +ret = -EINVAL; +break; +} + +for ( ; ti-first_dev ti-num_devs; ti-first_dev++ ) +{ +xen_sysctl_pcitopo_t pcitopo; +struct pci_dev *pdev; + +if ( copy_from_guest_offset(pcitopo, ti-pcitopo, +ti-first_dev, 1) ) +{ +ret = -EFAULT; +break; +} + +spin_lock(pcidevs_lock); +pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus, +pcitopo.pcidev.devfn); +if ( !pdev || (pdev-node == NUMA_NO_NODE) ) +pcitopo.node = INVALID_TOPOLOGY_ID; +else +pcitopo.node = pdev-node; Are hypervisor-internal node numbers really meaningful to the caller? This is the same information (pxm - node mapping ) that we provide in XEN_SYSCTL_topologyinfo (renamed in this series to XEN_SYSCTL_cputopoinfo). Given that I expect the two topologies to be used together I think the answer is yes. @@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t; DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t); /* XEN_SYSCTL_cputopoinfo */ -#define INVALID_TOPOLOGY_ID (~0U) +#define INVALID_TOPOLOGY_ID (~0U) /* Also used by pcitopo */ Better extend the preceding comment. I mentioned it to Wei yesterday that the file is structured (or at least appears to me to be structured) in such a way that these top comments mark sections of definitions for each sysctl. And so I thought that I'd be breaking this convention if I were to extend the comment. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:
> From: Jenny Herbert jennifer.herb...@citrix.com
>
> Introduce gnttab_unmap_refs_async() that can be used to safely unmap
> pages that may be in use (ref count > 1). If the pages are in use the
> unmap is deferred and retried later. This polling is not very clever
> but it should be good enough if the cases where the delay is necessary
> are rare.
>
> This is needed to allow block backends using grant mapping to safely
> use network storage (block or filesystem based such as iSCSI or NFS).
>
> The network storage driver may complete a block request whilst there is
> a queued network packet retry (because the ack from the remote end
> races with deciding to queue the retry). The pages for the retried
> packet would be grant unmapped and the network driver (or hardware)
> would access the unmapped page.

I thought this had been solved a little while ago by mapping a scratch page on unmap even for kernel space grant mappings, but both the design doc and here imply not (i.e. the scratch is for user grant mappings only), so I must be misremembering.

Regardless, this approach seems likely to be far better...
Re: [Xen-devel] [PATCH v2 4/5] tools: code refactoring for MBM
On Wed, Jan 07, 2015 at 07:12:04PM +0800, Chao Peng wrote: [...] -int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, - uint32_t domid, - uint32_t socketid, - uint32_t *l3_cache_occupancy) +static int libxl__psr_cmt_get_l3_monitoring_data(libxl__gc *gc, + uint32_t domid, + xc_psr_cmt_type type, + uint32_t socketid, + uint64_t *data) { -GC_INIT(ctx); - unsigned int rmid; -uint32_t upscaling_factor; -uint64_t monitor_data; int cpu, rc; -xc_psr_cmt_type type; -rc = xc_psr_cmt_get_domain_rmid(ctx-xch, domid, rmid); +rc = xc_psr_cmt_get_domain_rmid(CTX-xch, domid, rmid); if (rc 0 || rmid == 0) { LOGE(ERROR, fail to get the domain rmid, or domain is not attached with platform QoS monitoring service); -rc = ERROR_FAIL; -goto out; +return ERROR_FAIL; Please retain the goto out idiom if possible. } cpu = libxl__pick_socket_cpu(gc, socketid); if (cpu 0) { LOGE(ERROR, failed to get socket cpu); -rc = ERROR_FAIL; -goto out; +return ERROR_FAIL; } -type = XC_PSR_CMT_L3_OCCUPANCY; -rc = xc_psr_cmt_get_data(ctx-xch, rmid, cpu, type, monitor_data); +rc = xc_psr_cmt_get_data(CTX-xch, rmid, cpu, type, data); if (rc 0) { LOGE(ERROR, failed to get monitoring data); -rc = ERROR_FAIL; -goto out; +return ERROR_FAIL; } +return rc; +} + +int libxl_psr_cmt_get_cache_occupancy(libxl_ctx *ctx, + uint32_t domid, + uint32_t socketid, + uint32_t *l3_cache_occupancy) +{ +GC_INIT(ctx); +uint64_t data; +uint32_t upscaling_factor; +int rc; + +rc= libxl__psr_cmt_get_l3_monitoring_data(gc, domid, rc = Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
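For readers unfamiliar with the "goto out" idiom requested in this review, here is a minimal standalone sketch. It is not libxl code: `get_data`, the `cleanups` counter and the value `-3` for `ERROR_FAIL` are placeholders chosen for the example.

```c
#include <stdlib.h>

#define ERROR_FAIL (-3)     /* placeholder error code for this sketch */

static int cleanups;        /* counts how often the exit path ran */

/*
 * Every failure sets rc and jumps to "out", so the cleanup code is written
 * exactly once and cannot be skipped by an early return.
 */
int get_data(int have_rmid, int have_cpu, unsigned long *data)
{
    int rc;
    char *scratch = malloc(16);     /* stands in for any per-call resource */

    if (!scratch) {
        rc = ERROR_FAIL;
        goto out;
    }

    if (!have_rmid) {
        rc = ERROR_FAIL;
        goto out;
    }

    if (!have_cpu) {
        rc = ERROR_FAIL;
        goto out;
    }

    *data = 42;                     /* made-up result */
    rc = 0;

out:
    free(scratch);                  /* free(NULL) is a no-op */
    cleanups++;
    return rc;
}
```

The benefit shows up when a resource is added later: only the `out:` path needs to change, whereas a function with several `return ERROR_FAIL;` statements needs every exit audited.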
Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.
Maybe we should just change the document to clarify that an election is only needed if the previous maintainer steps down, which is what I think the intention really was. Seems reasonable to me, presumably some existing mechanism (i.e. common sense...) exists if the incumbent goes off the rails or disappears without resigning etc. Ian, thanks for digging out the link. Looking through the thread we said, the consensus seems to have been to handle stable tree maintainers like maintainers. Each stable branch has a maintainer who is nominated/volunteers according to the Maintainer Election process described in the project governance document doesn't imply a formal election. The governance document states * Nomination: A maintainer should nominate himself by proposing a patch to the MAINTAINERS file or mailing a nomination to the project's mailing list. Alternatively another maintainer may nominate a community member. A nomination should explain the contributions of proposed maintainer to the project as well as a scope (set of owned components). Where the case is not obvious, evidence such as specific patches and other evidence supporting the nomination should be cited. * Confirmation: Normally, there is no need for a direct election to confirm a new maintainer. Discussion should happen on the mailing list using the principles of consensus decision making. If there is disagreement or doubt, the project lead or a committer should ask the community manager to arrange a more formal vote So I would propose to replace Each stable branch has a maintainer who is nominated/volunteers according to the Maintainer Election process described in the project governance document. This will resulting in the MAINTAINERS file in the relevant branch being patched to include the maintainer. with Each stable branch has a maintainer who is nominated/volunteers according to the Maintainer Election process described in the project governance document. 
This means that the stable branch maintainer nominates himself or is nominated by another maintainer on the mailing list or through a patch to the MAINTAINERS file on the relevant branch. The principles of consensus decision making are applied, unless there is disagreement, in which case a formal election may be needed. The MAINTAINERS file in the relevant branch will be patched to include the stable branch maintainer. That would mean that we don't have to go through this discussion again for 4.6. Lars On 07/01/2015 12:17, Ian Campbell ian.campb...@citrix.com wrote: On Wed, 2015-01-07 at 11:59 +, Lars Kurth wrote: On 07/01/2015 11:26, Ian Campbell ian.campb...@citrix.com wrote: On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote: Hello, Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases: Each stable branch has a maintainer who is nominated/volunteers according to the Maintainer Election process described in the project governance document [http://www.xenproject.org/governance.html]. This will resulting in the MAINTAINERS file in the relevant branch being patched to include the maintainer. For the past year or so Jan Beulich has been the stable tree maintainer. Since Xen 4.5 has branched that opens up a new stable tree and we can also stop maintaining Xen 4.3 stable tree. The nominations are open - please volunteer yourself. In case nobody volunteers I can also take the role. I ask folks to finish voting/nominating by Jan 14th so that when Xen 4.5 comes out we have an viable stable tree maintainer. I'm not sure how voting is supposed to proceed with multiple nominations (and with the deadline for nominations apparently being the same as for voting), Actually, it is questionable whether there are multiple nominations. Andrew said If Jan wants a break, I would be happy to volunteer. True, and Konrad said if nobody else Still, my +1 for Jan stands. 
I am also not convinced that we need an election, unless the existing maintainer wants to steps down. We never had one in the past. And we don't have an explicit nomination for Release Managers unless the existing RM steps down. I can't find the mailing list discussion which led to http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases (the link in the change history seems to be wrong). http://lists.xen.org/archives/html/xen-devel/2012-11/msg01391.html perhaps? Maybe we should just change the document to clarify that an election is only needed if the previous maintainer steps down, which is what I think the intention really was. Seems reasonable to me, presumably some existing mechanism (i.e. common sense...) exists if the incumbent goes off the rails or disappears without resigning etc. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests
On Wed, 2015-01-07 at 13:23 +, David Vrabel wrote: On 07/01/15 12:11, Ian Campbell wrote: On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote: In an x86 PV guest, get_user_pages_fast() on a userspace address range containing foreign mappings does not work correctly because the M2P lookup of the MFN from a userspace PTE may return the wrong page. Force get_user_pages_fast() to fail on such addresses by marking the PTEs as special. If Xen has XENFEAT_gnttab_map_avail_bits (available since at least 4.0), http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0 pvpops already requires = 4.0 too, which matches my recollection (something to do with a new APIC interface which upstream insisted on during upstreaming, IIRC), but both could be out of date... gntdev is usable by driver domains and useful for inter-domain comms so it isn't limited to dom0 use only and Linux still needs to run on Xen 3.2 (I think that's the oldest still available on AWS). Ah yes, driver domains... Because of the m2p override limitation, gntdev is currently unsafe[1] to use by untrusted userspace apps so there's no (new) security issues here. However, we could disable gntdev if this feature is messing unless overridden by a module option. Opinions on this? If it is exploitable by untrusted apps in the new form (the race between mmap and the pte update still is, right?), then that might be best, or only allow root to open it? David [1] mapping a ref twice or a two refs for the same frame can corrupt kernel state is various exciting ways because of messed up page ref counts. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 06/12] xen: mark grant mapped pages as foreign
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote: From: Jenny Herbert jennifer.herb...@citrix.com Use the foreign page flag to mark pages that have a grant map. Use page-private to store information of the grant (the granting domain and the grant reference). Signed-off-by: Jenny Herbert jenny.herb...@citrix.com Signed-off-by: David Vrabel david.vra...@citrix.com --- arch/x86/xen/p2m.c| 50 ++--- include/xen/grant_table.h | 13 2 files changed, 56 insertions(+), 7 deletions(-) diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c index 0d70814..22624a3 100644 --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -648,6 +648,43 @@ bool set_phys_to_machine(unsigned long pfn, unsigned long mfn) return true; } +static int +init_page_grant_ref(struct page *p, domid_t domid, grant_ref_t grantref) I'd be inclined to add map to the names somewhere, otherwise people might thing they need to call this when allocating a grant in the f.e. or other things. +{ +#ifdef CONFIG_X86_64 Rather than suggesting to add CONFIG_ARM_64 here I'll suggest BITS_PER_LONG = 64. + uint64_t gref; + uint64_t* gref_p = gref; +#else + uint64_t* gref_p = kmalloc(sizeof(uint64_t), GFP_KERNEL); Might this allocation be happening during e.g. swapping? I suppose it is backend only, and swapping to a loopback vbd would be pretty mad. If you can figure a reasonable use case for that you might want some extra GFP flags? Might this be hot enough to warrant using a specific kmem_cache? + if (!gref) + return -ENOMEM; + uint64_t* gref = gref_p; +#endif + + *gref_p = ((uint64_t) grantref 32) | domid; + p-private = gref; There is a set_page_private() macro, which doesn't seem to do much but I suppose you should use it (and page_private() for accessing, if you don't already). 
@@ -182,4 +183,16 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops, void gnttab_batch_map(struct gnttab_map_grant_ref *batch, unsigned count); void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count); +static inline void +get_page_grant_ref(struct page *p, domid_t* domid, grant_ref_t* grantref) +{ BUG_ON(!PageBlah(p))? +#ifdef CONFIG_X86_64 + uint64_t gref = p-private; +#else + uint64_t gref = *p-private; +#endif + *domid = gref 0x; + *grantref = gref 32; +} + #endif /* __ASM_GNTTAB_H__ */ ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
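The packing scheme under review (grant reference in the upper 32 bits, domain id in the low bits of one 64-bit word) can be checked in isolation with a round trip. This is a sketch under assumptions: the typedef widths are what I would expect from Xen's public headers, the `0xffff` mask follows from a 16-bit `domid_t` rather than from the quoted patch (where the constant is cut off), and `pack_grant`/`unpack_grant` are hypothetical names.

```c
#include <stdint.h>

typedef uint16_t domid_t;       /* assumed width */
typedef uint32_t grant_ref_t;   /* assumed width */

/* Store the grant reference in bits 63..32 and the domid in bits 15..0. */
uint64_t pack_grant(domid_t domid, grant_ref_t ref)
{
    return ((uint64_t)ref << 32) | domid;
}

/* Recover both fields from the packed word. */
void unpack_grant(uint64_t packed, domid_t *domid, grant_ref_t *ref)
{
    *domid = (domid_t)(packed & 0xffff);
    *ref = (grant_ref_t)(packed >> 32);
}
```

On a 64-bit build the packed word fits directly in an `unsigned long` such as the page's private field; on 32-bit it does not, which is why the patch falls back to a separately allocated `uint64_t` there.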
Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.
On Wed, 2015-01-07 at 11:59 +, Lars Kurth wrote: On 07/01/2015 11:26, Ian Campbell ian.campb...@citrix.com wrote: On Tue, 2015-01-06 at 11:15 -0500, Konrad Rzeszutek Wilk wrote: Hello, Per http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases: Each stable branch has a maintainer who is nominated/volunteers according to the Maintainer Election process described in the project governance document [http://www.xenproject.org/governance.html]. This will resulting in the MAINTAINERS file in the relevant branch being patched to include the maintainer. For the past year or so Jan Beulich has been the stable tree maintainer. Since Xen 4.5 has branched that opens up a new stable tree and we can also stop maintaining Xen 4.3 stable tree. The nominations are open - please volunteer yourself. In case nobody volunteers I can also take the role. I ask folks to finish voting/nominating by Jan 14th so that when Xen 4.5 comes out we have an viable stable tree maintainer. I'm not sure how voting is supposed to proceed with multiple nominations (and with the deadline for nominations apparently being the same as for voting), Actually, it is questionable whether there are multiple nominations. Andrew said If Jan wants a break, I would be happy to volunteer. True, and Konrad said if nobody else Still, my +1 for Jan stands. I am also not convinced that we need an election, unless the existing maintainer wants to steps down. We never had one in the past. And we don't have an explicit nomination for Release Managers unless the existing RM steps down. I can't find the mailing list discussion which led to http://wiki.xenproject.org/wiki/Xen_Project_Maintenance_Releases (the link in the change history seems to be wrong). http://lists.xen.org/archives/html/xen-devel/2012-11/msg01391.html perhaps? Maybe we should just change the document to clarify that an election is only needed if the previous maintainer steps down, which is what I think the intention really was. 
Seems reasonable to me, presumably some existing mechanism (i.e. common sense...) exists if the incumbent goes off the rails or disappears without resigning etc. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] community develop/contributor call
Hi all,

For a few years now I've been running a monthly phone call between various organisations which contribute to Xen, the so-called TCT Call[0]. However since then the Xen Project has moved to the Linux Foundation and the introduction of the Advisory Board means this call isn't really useful in its current form any more.

Rather than just cancelling it I've been thinking of repurposing it into a more development focused call for all contributors (maintainers, patch submitters etc).

The idea would be to provide a regular slot where technical and development topics can be discussed on an on-demand basis without the need to set up an ad-hoc call each time something comes up. In other words to provide a forum for topics such as thrashing out a more complex design for a new feature, building consensus on how to move forward with a stalled patch series, discussing difficult freeze exception requests during release periods and cases where on-list discussion has stalled or those involved feel the need to have a live chat about it for some reason.

My intention would be the call would be targeted at maintainers and people who are themselves working on the code/design, although anyone will be welcome to dial in and take part.

The plan would be to post a call for agenda items to xen-devel a week beforehand with the call being cancelled the day before if no topics were proposed (I have no problem with the call only happening occasionally, there's no point in having it if there are no topics, but if it's useful a couple of times a year it's worth having IMHO).

I will try and reach out to anyone who should be involved with a proposed topic to try and ensure that the call has the right people on it in order to make progress. In the event that some of the participants feel that a topic is still being usefully discussed on list and that discussion should remain there then I'll mediate to either gain consensus one way or another or unblock things some other way etc.

I would expect people to decide based on the published agenda each month whether they want/need to attend. I hope that at least the maintainers would make an effort to attend if there were topics covering their areas (and as above I will chase where needed).

In the short term I intend to stick with the same time (5PM UK, on the second Wednesday of the month) and cadence (monthly) as before, but I might revisit the timing, in particular if we find the proposed topics make the time problematic for key people who want to be involved (e.g. folks in the Far East, for whom 5PM UK is pretty antisocial).

Since that schedule puts a call a week today (14th Jan), I'll send out a call for agenda later today.

Ian.

[0] http://wiki.xen.org/wiki/Xen_Technical_Coordination_Team_%28TCT%29_Meeting
Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests
On 07/01/15 12:11, Ian Campbell wrote:
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:

In an x86 PV guest, get_user_pages_fast() on a userspace address range containing foreign mappings does not work correctly because the M2P lookup of the MFN from a userspace PTE may return the wrong page. Force get_user_pages_fast() to fail on such addresses by marking the PTEs as special.

If Xen has XENFEAT_gnttab_map_avail_bits (available since at least 4.0),

http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0 pvops already requires >= 4.0 too, which matches my recollection (something to do with a new APIC interface which upstream insisted on during upstreaming, IIRC), but both could be out of date...

gntdev is usable by driver domains and useful for inter-domain comms so it isn't limited to dom0 use only and Linux still needs to run on Xen 3.2 (I think that's the oldest still available on AWS).

Because of the m2p override limitation, gntdev is currently unsafe[1] to use by untrusted userspace apps so there are no (new) security issues here. However, we could disable gntdev if this feature is missing unless overridden by a module option. Opinions on this?

David

[1] mapping a ref twice or two refs for the same frame can corrupt kernel state in various exciting ways because of messed up page ref counts.
Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use
On Wed, 2015-01-07 at 13:30 +, David Vrabel wrote:
On 07/01/15 13:24, Ian Campbell wrote:
On Wed, 2015-01-07 at 13:07 +, David Vrabel wrote:

We don't use this safe unmap mechanism for netback because the zero copy stuff means we don't need it and the polling on the unmap is high latency and only good enough if the polling is needed very rarely.

I'd think it should be rare for netback too unless your network is made of wet string.

Hmmm. Yes, this might be worth looking at but I would prefer to do it as a follow-up to this series.

Sure.
[Xen-devel] [linux-3.10 test] 33190: regressions - trouble: blocked/broken/fail/pass
flight 33190 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33190/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install  fail REGR. vs. 26303
 build-armhf-libvirt  5 libvirt-build  fail in 33120 REGR. vs. 26303
 build-i386-libvirt  5 libvirt-build  fail in 33120 REGR. vs. 26303
 build-amd64-libvirt  5 libvirt-build  fail in 33120 REGR. vs. 26303

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl  3 host-install(3)  broken pass in 33120

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64  11 rumpuserxen-demo-xenstorels/xenstorels  fail blocked in 26303
 build-armhf-libvirt  3 host-install(3)  broken blocked in 26303
 build-i386-libvirt  3 host-install(3)  broken blocked in 26303
 build-amd64-libvirt  3 host-install(3)  broken blocked in 26303
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  10 guest-localmigrate  fail blocked in 26303
 test-amd64-i386-pair  17 guest-migrate/src_host/dst_host  fail like 26303
 test-amd64-amd64-xl-winxpsp3  7 windows-install  fail like 26303

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail never pass
 test-amd64-i386-libvirt  1 build-check(1)  blocked n/a
 test-amd64-amd64-libvirt  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-pvh-amd  9 guest-start  fail never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start  fail never pass
 test-armhf-armhf-libvirt  1 build-check(1)  blocked n/a
 test-amd64-i386-xl-qemut-win7-amd64  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  14 guest-stop  fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64  14 guest-stop  fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64  14 guest-stop  fail never pass
 test-amd64-amd64-xl-win7-amd64  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3  14 guest-stop  fail never pass
 test-armhf-armhf-xl  5 xen-boot  fail in 33120 never pass

version targeted for testing:
 linux  a472efc75989c7092187fe00f0400e02c495c436
baseline version:
 linux  be67db109090b17b56eb8eb2190cd70700f107aa

816 people touched revisions under test, not listing them all

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386  pass
 build-amd64-libvirt  broken
 build-armhf-libvirt  broken
 build-i386-libvirt  broken
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen  pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  broken
 test-amd64-i386-xl  pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd  pass
 test-amd64-i386-qemut-rhel6hvm-amd  pass
 test-amd64-i386-qemuu-rhel6hvm-amd  pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 07/01/15 14:42, Boris Ostrovsky wrote:
On 01/07/2015 04:06 AM, Jan Beulich wrote:
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:

@@ -618,7 +620,22 @@ ret_t do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
         }
         else
             pdev_info.is_virtfn = 0;
-        ret = pci_add_device(add.seg, add.bus, add.devfn, &pdev_info);
+
+        if ( add.flags & XEN_PCI_DEV_PXM )
+        {
+            uint32_t pxm;
+            int optarr_off = offsetof(struct physdev_pci_device_add, optarr) /

unsigned int or size_t.

--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -56,6 +56,8 @@ struct pci_dev {
     u8 phantom_stride;
+    int node; /* NUMA node */

I thought I asked about this on v1 already: Does this really need to be an int, when commonly node numbers are stored in u8/unsigned char? Shrinking the field size would prevent the structure size from growing...

I kept this field as an int to be able to store NUMA_NO_NODE which I thought to be (int)-1. But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to (int)-1 by pxm_to_node(). Given that there are a number of tests for NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() return u8 as well?

I noticed this as well, and found it quite counterintuitive. I would suggest fixing NUMA_NO_NODE to -1 and removing some of the type-punning.

~Andrew
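To make the NUMA_NO_NODE concern above concrete, here is a minimal sketch of the pitfall. The TOY_* names and toy_u8_matches() are hypothetical stand-ins invented for illustration, not the Xen definitions: they only model a sentinel that is 0xff when stored in a u8 but -1 when handed around as an int.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration only (not the Xen macros). */
#define TOY_NO_NODE_U8   0xff   /* sentinel as stored in a u8 field */
#define TOY_NO_NODE_INT  (-1)   /* sentinel as seen through an int */

/*
 * Compare a node number stored in a u8 against a sentinel.
 * 'stored' is promoted to int in the range 0..255 before the
 * comparison, so it can never compare equal to -1.
 */
static int toy_u8_matches(uint8_t stored, int sentinel)
{
    return stored == sentinel;
}

/*
 * Example:
 *   uint8_t node = (uint8_t)TOY_NO_NODE_INT;    truncates to 0xff
 *   toy_u8_matches(node, TOY_NO_NODE_U8)   is 1
 *   toy_u8_matches(node, TOY_NO_NODE_INT)  is 0
 */
```

So once the field is a u8, every test has to use the 0xff form (or the sentinel has to be redefined consistently), which is the mismatch being discussed.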
[Xen-devel] [PATCH 2/3] x86/xen: add extra memory for remapped frames during setup
If the non-RAM regions in the e820 memory map are larger than the size of the initial balloon, a BUG was triggered as the frames are remapped beyond the limit of the linear p2m. The frames are remapped into the initial balloon area (xen_extra_mem) but not enough of this is available.

Ensure enough extra memory regions are added for these remapped frames.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 arch/x86/xen/setup.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 664dffc..feb6d86 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -366,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
 static unsigned long __init xen_set_identity_and_remap_chunk(
 	const struct e820entry *list, size_t map_size, unsigned long start_pfn,
 	unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-	unsigned long *released)
+	unsigned long *released, unsigned long *remapped)
 {
 	unsigned long pfn;
 	unsigned long i = 0;
@@ -404,6 +404,7 @@ static unsigned long __init xen_set_identity_and_remap_chunk(
 		/* Update variables to reflect new mappings. */
 		i += size;
 		remap_pfn += size;
+		*remapped += size;
 	}

 	/*
@@ -420,12 +421,13 @@ static unsigned long __init xen_set_identity_and_remap_chunk(
 static void __init xen_set_identity_and_remap(
 	const struct e820entry *list, size_t map_size, unsigned long nr_pages,
-	unsigned long *released)
+	unsigned long *released, unsigned long *remapped)
 {
 	phys_addr_t start = 0;
 	unsigned long last_pfn = nr_pages;
 	const struct e820entry *entry;
 	unsigned long num_released = 0;
+	unsigned long num_remapped = 0;
 	int i;

 	/*
@@ -452,12 +454,13 @@ static void __init xen_set_identity_and_remap(
 			last_pfn = xen_set_identity_and_remap_chunk(
 						list, map_size, start_pfn,
 						end_pfn, nr_pages, last_pfn,
-						&num_released);
+						&num_released, &num_remapped);
 			start = end;
 		}
 	}

 	*released = num_released;
+	*remapped = num_remapped;

 	pr_info("Released %ld page(s)\n", num_released);
 }
@@ -577,6 +580,7 @@ char * __init xen_memory_setup(void)
 	struct xen_memory_map memmap;
 	unsigned long max_pages;
 	unsigned long extra_pages = 0;
+	unsigned long remapped_pages;
 	int i;
 	int op;

@@ -626,9 +630,10 @@ char * __init xen_memory_setup(void)
 	 * underlying RAM.
 	 */
 	xen_set_identity_and_remap(map, memmap.nr_entries, max_pfn,
-				   &xen_released_pages);
+				   &xen_released_pages, &remapped_pages);

 	extra_pages += xen_released_pages;
+	extra_pages += remapped_pages;

 	/*
 	 * Clamp the amount of extra memory to a EXTRA_MEM_RATIO
--
1.7.10.4
[Xen-devel] [PATCH 1/3] x86/xen: don't count how many PFNs are identity mapped
This accounting is just used to print a diagnostic message that isn't very useful.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 arch/x86/xen/setup.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index dfd77de..664dffc 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -229,15 +229,14 @@ static int __init xen_free_mfn(unsigned long mfn)
  * as a fallback if the remapping fails.
  */
 static void __init xen_set_identity_and_release_chunk(unsigned long start_pfn,
-	unsigned long end_pfn, unsigned long nr_pages, unsigned long *identity,
-	unsigned long *released)
+	unsigned long end_pfn, unsigned long nr_pages, unsigned long *released)
 {
-	unsigned long len = 0;
 	unsigned long pfn, end;
 	int ret;

 	WARN_ON(start_pfn > end_pfn);

+	/* Release pages first. */
 	end = min(end_pfn, nr_pages);
 	for (pfn = start_pfn; pfn < end; pfn++) {
 		unsigned long mfn = pfn_to_mfn(pfn);
@@ -250,16 +249,14 @@ static void __init xen_set_identity_and_release_chunk(unsigned long start_pfn,
 		WARN(ret != 1, "Failed to release pfn %lx err=%d\n", pfn, ret);

 		if (ret == 1) {
+			(*released)++;
 			if (!__set_phys_to_machine(pfn, INVALID_P2M_ENTRY))
 				break;
-			len++;
 		} else
 			break;
 	}

-	/* Need to release pages first */
-	*released += len;
-	*identity += set_phys_range_identity(start_pfn, end_pfn);
+	set_phys_range_identity(start_pfn, end_pfn);
 }

 /*
@@ -318,7 +315,6 @@ static void __init xen_do_set_identity_and_remap_chunk(
 	unsigned long ident_pfn_iter, remap_pfn_iter;
 	unsigned long ident_end_pfn = start_pfn + size;
 	unsigned long left = size;
-	unsigned long ident_cnt = 0;
 	unsigned int i, chunk;

 	WARN_ON(size == 0);
@@ -347,8 +343,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
 		xen_remap_mfn = mfn;

 		/* Set identity map */
-		ident_cnt += set_phys_range_identity(ident_pfn_iter,
-			ident_pfn_iter + chunk);
+		set_phys_range_identity(ident_pfn_iter, ident_pfn_iter + chunk);

 		left -= chunk;
 	}
@@ -371,7 +366,7 @@ static void __init xen_do_set_identity_and_remap_chunk(
 static unsigned long __init xen_set_identity_and_remap_chunk(
 	const struct e820entry *list, size_t map_size, unsigned long start_pfn,
 	unsigned long end_pfn, unsigned long nr_pages, unsigned long remap_pfn,
-	unsigned long *identity, unsigned long *released)
+	unsigned long *released)
 {
 	unsigned long pfn;
 	unsigned long i = 0;
@@ -386,8 +381,7 @@ static unsigned long __init xen_set_identity_and_remap_chunk(
 		/* Do not remap pages beyond the current allocation */
 		if (cur_pfn >= nr_pages) {
 			/* Identity map remaining pages */
-			*identity += set_phys_range_identity(cur_pfn,
-				cur_pfn + size);
+			set_phys_range_identity(cur_pfn, cur_pfn + size);
 			break;
 		}
 		if (cur_pfn + size > nr_pages)
@@ -398,7 +392,7 @@ static unsigned long __init xen_set_identity_and_remap_chunk(
 		if (!remap_range_size) {
 			pr_warning("Unable to find available pfn range, not remapping identity pages\n");
 			xen_set_identity_and_release_chunk(cur_pfn,
-				cur_pfn + left, nr_pages, identity, released);
+				cur_pfn + left, nr_pages, released);
 			break;
 		}
 		/* Adjust size to fit in current e820 RAM region */
@@ -410,7 +404,6 @@ static unsigned long __init xen_set_identity_and_remap_chunk(
 		/* Update variables to reflect new mappings. */
 		i += size;
 		remap_pfn += size;
-		*identity += size;
 	}

 	/*
@@ -430,7 +423,6 @@ static void __init xen_set_identity_and_remap(
 	unsigned long *released)
 {
 	phys_addr_t start = 0;
-	unsigned long identity = 0;
 	unsigned long last_pfn = nr_pages;
 	const struct e820entry *entry;
 	unsigned long num_released = 0;
@@ -460,14 +452,13 @@ static void __init xen_set_identity_and_remap(
 			last_pfn = xen_set_identity_and_remap_chunk(
 						list, map_size, start_pfn,
 						end_pfn, nr_pages, last_pfn,
-						&identity, &num_released);
+
[Xen-devel] [PATCH 3/3] x86/xen: optimize get_phys_to_machine()
The page table walk is only needed to distinguish between identity and missing, both of which have INVALID_P2M_ENTRY.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 arch/x86/xen/p2m.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index edbc7a6..a848201 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -405,8 +405,7 @@ void __init xen_vmalloc_p2m_tree(void)

 unsigned long get_phys_to_machine(unsigned long pfn)
 {
-	pte_t *ptep;
-	unsigned int level;
+	unsigned long mfn;

 	if (unlikely(pfn >= xen_p2m_size)) {
 		if (pfn < xen_max_p2m_pfn)
@@ -414,19 +413,26 @@ unsigned long get_phys_to_machine(unsigned long pfn)

 		return IDENTITY_FRAME(pfn);
 	}
+
+	mfn = xen_p2m_addr[pfn];

-	ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), &level);
-	BUG_ON(!ptep || level != PG_LEVEL_4K);
+	if (unlikely(mfn == INVALID_P2M_ENTRY)) {
+		pte_t *ptep;
+		unsigned int level;

-	/*
-	 * The INVALID_P2M_ENTRY is filled in both p2m_*identity
-	 * and in p2m_*missing, so returning the INVALID_P2M_ENTRY
-	 * would be wrong.
-	 */
-	if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
-		return IDENTITY_FRAME(pfn);
+		ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), &level);
+		BUG_ON(!ptep || level != PG_LEVEL_4K);
+
+		/*
+		 * The INVALID_P2M_ENTRY is filled in both p2m_*identity
+		 * and in p2m_*missing, so returning the INVALID_P2M_ENTRY
+		 * would be wrong.
+		 */
+		if (pte_pfn(*ptep) == PFN_DOWN(__pa(p2m_identity)))
+			return IDENTITY_FRAME(pfn);
+	}

-	return xen_p2m_addr[pfn];
+	return mfn;
 }
 EXPORT_SYMBOL_GPL(get_phys_to_machine);
--
1.7.10.4
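For readers skimming the patch above, the shape of the optimization can be sketched outside the kernel as follows. Everything here is a toy model with hypothetical names (toy_p2m, TOY_INVALID_ENTRY, toy_get_phys_to_machine): the per-slot kind array stands in for the page table walk, the out-of-range case is simplified to always return an identity frame, and the counter exists only to show how often the slow path runs.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical toy definitions; they only mimic the shape of the kernel's. */
#define TOY_INVALID_ENTRY   (~0UL)
#define TOY_IDENTITY_BIT    (1UL << (sizeof(unsigned long) * CHAR_BIT - 2))
#define TOY_IDENTITY_FRAME(pfn) ((pfn) | TOY_IDENTITY_BIT)

enum toy_slot { TOY_SLOT_MISSING, TOY_SLOT_IDENTITY };

struct toy_p2m {
    const unsigned long *entries;   /* flat pfn -> mfn array */
    const enum toy_slot *kinds;     /* stand-in for the page table walk */
    unsigned long size;             /* number of entries */
};

/*
 * Read the flat array first; only when the entry is the invalid marker
 * consult the slow source to tell "identity" from "missing".
 * *slow_walks counts how often the slow path was taken.
 */
static unsigned long toy_get_phys_to_machine(const struct toy_p2m *p2m,
                                             unsigned long pfn,
                                             unsigned int *slow_walks)
{
    unsigned long mfn;

    if (pfn >= p2m->size)
        return TOY_IDENTITY_FRAME(pfn);   /* simplified out-of-range case */

    mfn = p2m->entries[pfn];
    if (mfn == TOY_INVALID_ENTRY) {
        (*slow_walks)++;
        if (p2m->kinds[pfn] == TOY_SLOT_IDENTITY)
            return TOY_IDENTITY_FRAME(pfn);
    }
    return mfn;
}
```

With a table where most entries are valid, the slow path is skipped for nearly every lookup, which is the point of the change.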
Re: [Xen-devel] [PATCH v5 0/9] toolstack-based approach to pvhvm guest kexec
Jan Beulich jbeul...@suse.com writes:
On 07.01.15 at 11:41, david.vra...@citrix.com wrote:
On 07/01/15 09:10, Olaf Hering wrote:
On Mon, Jan 05, Vitaly Kuznetsov wrote:
Wei Liu wei.l...@citrix.com writes:

Olaf mentioned his concern about handling ballooned pages in 20141211153029.ga1...@aepfle.de. Is that point moot now?

Well, the limitation is real and some guest-side handling will be required in case we want to support kexec with ballooning. But as David validly mentioned "It's the responsibility of the guest to ensure it either doesn't kexec when it is ballooned or that the kexec kernel can handle this". Not sure if we can (and need to) do anything hypervisor- or toolstack-side.

One approach would be to mark all pages as some sort of populate-on-demand first. Then copy the existing assigned pages from domA to domB and update the page type. The remaining pages are likely ballooned. Once the guest tries to access them this should give the hypervisor and/or toolstack a chance to assign a real RAM page to them. I mean, if a host-assisted approach for kexec is implemented then this approach must also cover ballooning.

It is not possible for the hypervisor or toolstack to do what you want because there may not be enough free memory to repopulate the new domain. The guest can handle this by:

1. Not ballooning (this is common in cloud environments).
2. Reducing the balloon prior to kexec.

Which may fail because again there may not be enough memory to claim back from the hypervisor.

Yes, but it may be better to cancel kexec at this point.

-- Vitaly
Re: [Xen-devel] [PATCH 08/12] xen/grant-table: add a mechanism to safely unmap pages that are in use
On 07/01/15 12:00, Ian Campbell wrote:
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:

From: Jenny Herbert jennifer.herb...@citrix.com

Introduce gnttab_unmap_refs_async() that can be used to safely unmap pages that may be in use (ref count > 1). If the pages are in use the unmap is deferred and retried later. This polling is not very clever but it should be good enough if the cases where the delay is necessary are rare.

This is needed to allow block backends using grant mapping to safely use network storage (block or filesystem based such as iSCSI or NFS).

The network storage driver may complete a block request whilst there is a queued network packet retry (because the ack from the remote end races with deciding to queue the retry). The pages for the retried packet would be grant unmapped and the network driver (or hardware) would access the unmapped page.

I thought this had been solved a little while ago by mapping a scratch page on unmap even for kernel space grant mappings, but both the design doc and here imply not (i.e. the scratch is for user grant mappings only), so I must be misremembering.

It was only for user grant mappings and it did not fix the case where the page being unmapped was currently DMA mapped. This could have resulted in the NIC transmitting sensitive data. e.g.,

1. iscsi queues a retransmit with page P (frame F).
2. NIC driver DMA maps and queues frame F on h/w.
3. iscsi completes the I/O.
4. page P is unmapped.
5. response is sent to guest
6. guest reuses frame F.
7. NIC transmits frame F.

We don't use this safe unmap mechanism for netback because the zero copy stuff means we don't need it and the polling on the unmap is high latency and only good enough if the polling is needed very rarely.

David
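The "defer and retry while the page is still in use" idea described above can be sketched in plain C. This is only a toy model under stated assumptions: toy_page, toy_try_unmap(), toy_unmap_async() and the delay callback are hypothetical names, the retry loop stands in for whatever deferred-work mechanism the real gnttab_unmap_refs_async() uses, and a reference count above 1 is taken to mean "someone else (e.g. a NIC with the frame DMA mapped) still holds the page".

```c
#include <stdbool.h>

/* Hypothetical toy page: not the kernel's struct page. */
struct toy_page {
    int refcount;   /* 1 means only the mapper itself holds a reference */
    bool mapped;
};

/* Models "wait a while"; another user may drop its reference meanwhile. */
typedef void (*toy_delay_fn)(struct toy_page *pg);

/* One attempt: unmap only if nobody else still references the page. */
static bool toy_try_unmap(struct toy_page *pg)
{
    if (pg->refcount > 1)
        return false;       /* still in use: unmapping now would be unsafe */
    pg->mapped = false;
    return true;
}

/*
 * Poll until the unmap succeeds or max_polls attempts have been made.
 * Returns the number of attempts used, or -1 if the page stayed busy.
 */
static int toy_unmap_async(struct toy_page *pg, int max_polls,
                           toy_delay_fn delay)
{
    for (int polls = 1; polls <= max_polls; polls++) {
        if (toy_try_unmap(pg))
            return polls;
        delay(pg);          /* deferred retry, modelled synchronously */
    }
    return -1;
}

/* Example delay: the other user finishes and drops one reference. */
static void toy_other_user_finishes(struct toy_page *pg)
{
    if (pg->refcount > 1)
        pg->refcount--;
}
```

In the seven-step sequence above, the unmap in step 4 would fail its first attempt because the NIC still holds frame F, and would only complete after step 7. That also shows why the approach costs latency when retries are common.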
Re: [Xen-devel] [PATCH 1/7] tools/hotplug: remove SELinux options from var-lib-xenstored.mount
On Wed, Jan 07, 2015 at 09:31:50AM +, Ian Campbell wrote:
On Wed, 2015-01-07 at 10:23 +0100, Olaf Hering wrote:
On Tue, Jan 06, Ian Campbell wrote:
On Fri, 2014-12-19 at 12:25 +0100, Olaf Hering wrote:
...
Acked-by: Ian Campbell ian.campb...@citrix.com
(on commit s/Appearently/Apparently/; s/non-existant/non-existent/ in the commit log)

I also made typos in other commit messages. Should I resend the entire series, or will this be done during commit?

Looks like Konrad already committed, I don't know if he fixed the typos (I suppose it doesn't matter now either way).

I did the changes you pointed out here.

Ian.
Re: [Xen-devel] [PATCH 11/12] xen/gntdev: mark userspace PTEs as special on x86 PV guests
On Tue, 2015-01-06 at 18:57 +, David Vrabel wrote:

In an x86 PV guest, get_user_pages_fast() on a userspace address range containing foreign mappings does not work correctly because the M2P lookup of the MFN from a userspace PTE may return the wrong page. Force get_user_pages_fast() to fail on such addresses by marking the PTEs as special.

If Xen has XENFEAT_gnttab_map_avail_bits (available since at least 4.0),

http://wiki.xenproject.org/wiki/Xen_Kernel_Feature_Matrix says the dom0 pvops already requires >= 4.0 too, which matches my recollection (something to do with a new APIC interface which upstream insisted on during upstreaming, IIRC), but both could be out of date...

Ian.
Re: [Xen-devel] [Bugfix] x86/apic: Fix xen IRQ allocation failure caused by commit b81975eade8c
On Wed, Jan 07, 2015 at 02:13:49PM +0800, Jiang Liu wrote:

Commit b81975eade8c (x86, irq: Clean up irqdomain transition code) breaks xen IRQ allocation because xen_smp_prepare_cpus() doesn't invoke setup_IO_APIC(), so no irqdomains are created for IOAPICs and mp_map_pin_to_irq() fails at the very beginning.

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2369,31 +2369,29 @@ static void ioapic_destroy_irqdomain(int idx)
 	ioapics[idx].pin_info = NULL;
 }

-void __init setup_IO_APIC(void)
+void __init setup_IO_APIC(bool xen_smp)
 {
 	int ioapic;

-	/*
-	 * calling enable_IO_APIC() is moved to setup_local_APIC for BP
-	 */
-	io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
+	if (!xen_smp) {
+		apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
+		io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
+
+		/* Set up IO-APIC IRQ routing. */
+		x86_init.mpparse.setup_ioapic_ids();
+		sync_Arb_IDs();
+	}

Is there a specific reason that this cannot run in all cases? What I am asking is why are we doing a special case here? The description at the top implied that we were just missing a call to setup_IO_APIC..

-	apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
 	for_each_ioapic(ioapic)
 		BUG_ON(mp_irqdomain_create(ioapic));
-
-	/*
-	 * Set up IO-APIC IRQ routing.
-	 */
-	x86_init.mpparse.setup_ioapic_ids();
-
-	sync_Arb_IDs();
 	setup_IO_APIC_irqs();
-	init_IO_APIC_traps();
-	if (nr_legacy_irqs())
-		check_timer();
-
 	ioapic_initialized = 1;
+
+	if (!xen_smp) {
+		init_IO_APIC_traps();
+		if (nr_legacy_irqs())
+			check_timer();
+	}
 }

 /*
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 4c071aeb8417..7eb0283901fa 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -326,7 +326,10 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 		xen_raw_printk(m);
 		panic(m);
+	} else {
+		setup_IO_APIC(true);
 	}
+
 	xen_init_lock_cpu(0);

 	smp_store_boot_cpu_info();
--
1.7.10.4
Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored
On Wed, Jan 07, 2015 at 10:49:38AM +0100, Olaf Hering wrote:
On Tue, Jan 06, Ian Jackson wrote:
Olaf Hering writes ([PATCH 7/7] tools/hotplug: add wrapper to start xenstored):

The shell wrapper in xenstored.service does not handle XENSTORE_TRACE.
...
+XENSTORED_LIBEXEC = xenstored.sh

Should be in /etc as previously discussed. Previously I wrote:

Bottom line: as relevant maintainer, I'm afraid I'm going to insist that this script be in /etc.

I'm disappointed. It is not acceptable to resubmit a change ignoring such unequivocal feedback.

Plain /etc won't work, I think. /etc/xen/scripts perhaps? But see my other reply to IanC, maybe there is a way to avoid the wrapper.

And after having some time to think about this: If one has a need to adjust something, then this could be done in the xencommons script right away. In other words, the modification can be done there instead of calling the wrapper.

Nacked-by: Ian Jackson ian.jack...@eu.citrix.com

+hotplug/Linux/xenstored.sh

Although many of the existing hotplug scripts have this notion of calling things foo.sh because they happen to be written in shell, I think this is bad practice. I would prefer xenstored-wrap or some such. (My co-maintainers may disagree...) But this is a bit of a bikeshed issue.

I agree. Initially I had xenstored-launcher in mind.

 echo -n Starting $XENSTORED...
- $XENSTORED --pid-file /var/run/xenstored.pid $XENSTORED_ARGS
+ XENSTORED=$XENSTORED \
+ XENSTORED_TRACE=$XENSTORED_TRACE \
+ XENSTORED_ARGS=$XENSTORED_ARGS \
+ ${LIBEXEC_BIN}/xenstored.sh --pid-file /var/run/xenstored.pid

It might be easier to . xenstore-wrap. Failing that using `export' will avoid this rather odd and repetitive style.

I think that's a good idea. Something like this may work, doing the . and the exec in the subshell:

( set -- --pid-file /var/run/xenstored.pid
  . xenstored.sh )

diff --git a/tools/hotplug/Linux/xenstored.sh.in b/tools/hotplug/Linux/xenstored.sh.in
new file mode 100644
index 000..dc806ee
--- /dev/null
+++ b/tools/hotplug/Linux/xenstored.sh.in
@@ -0,0 +1,6 @@
+#!/bin/sh
+if test -n "$XENSTORED_TRACE"
+then
+	XENSTORED_ARGS=" -T /var/log/xen/xenstored-trace.log"
+fi
+exec $XENSTORED $@ $XENSTORED_ARGS

This should probably have " " around $@ just in case.

Ok. I will wait for results from SELinux testing before respinning this patch.

It did work for me (I did a 'Tested-by' in my email). Please do keep in mind that today is the last day for commits for Xen 4.5. No pressure :-)
Re: [Xen-devel] [PATCH 0/7 v3] tools/hotplug: systemd changes for 4.5
On Wed, Jan 07, 2015 at 10:53:06AM +0100, Olaf Hering wrote:
On Mon, Jan 05, Konrad Rzeszutek Wilk wrote:

+Release Issues
+==
+
+While we did the utmost to get a release out, there are certain
+fixes which were not complete on time. As such please reference this
+section if you are running into trouble.
+
+* systemd not working with Fedora Core 20, 21 or later (systemctl
+  reports xenstore failing to start).
+
+  Systemd support is now part of Xen source code. While utmost work has
+  been done to make the systemd files compatible across all the
+  distributions, there might be issues when using systemd files from
+  Xen sources. The work-around is to define a mount entry in
+  /etc/fstab as follows:
+
+  tmpfs /var/lib/xenstored tmpfs
+  mode=755,context=system_u:object_r:xenstored_var_lib_t:s0 0 0
+

Shouldn't this go into a new SELinux section in the INSTALL file?

It is going in the web-page for 'Release Issues' and such.

It's my understanding that the reported SELinux failure is not only related to the context= mount option, but also to the socket passing from systemd.

I couldn't spot any errors in SELinux for this. Perhaps I had misconfigured?

Olaf
Re: [Xen-devel] [PATCH 1/2] tools/hotplug: introduce XENSTORED_ARGS= in sysconfig file.
On Wed, Jan 07, Ian Campbell wrote:
On Wed, 2015-01-07 at 16:49 +, Ian Jackson wrote:

Certainly removing this feature this late in the 4.5 release cycle is not appropriate.

I agree that faffing around with the initscripts/systemd units at the eleventh hour seems liable to leave us with a release where xenstored doesn't start or something.

What about staging, is that how it is supposed to look in 4.6? Or should I rather work on the wrapper script so that XENSTORED_TRACE has to be used? I would like to see patch #1 in 4.5 as the proper way to pass additional options to xenstored.

Olaf
[Xen-devel] [xen-4.2-testing test] 33227: regressions - FAIL
flight 33227 xen-4.2-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33227/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xend-qemut-winxpsp3  5 xen-boot  fail REGR. vs. 32291
 test-amd64-amd64-pv  2 hosts-allocate  running in 33184 [st=running!]
 test-amd64-amd64-xl  2 hosts-allocate  running in 33184 [st=running!]

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-credit2  7 debian-install  fail pass in 33184
 test-amd64-i386-xl-win7-amd64  7 windows-install  fail pass in 33184
 test-amd64-i386-qemuu-freebsd10-amd64  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-i386-qemut-rhel6hvm-amd  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-i386-qemuu-freebsd10-i386  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-amd64-xl-sedf-pin  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-i386-xl  15 guest-localmigrate/x10  fail in 33184 pass in 33227
 test-amd64-i386-pair  4 host-install/dst_host(4)  broken in 33184 pass in 33227
 test-amd64-i386-pair  3 host-install/src_host(3)  broken in 33184 pass in 33227
 test-amd64-i386-xend-qemut-winxpsp3  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-amd64-xl-win7-amd64  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  3 host-install(3)  broken in 33184 pass in 33227
 test-i386-i386-xl-qemut-winxpsp3  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-amd64-pair  3 host-install/src_host(3)  broken in 33184 pass in 33227
 test-amd64-amd64-pair  4 host-install/dst_host(4)  broken in 33184 pass in 33227
 test-amd64-i386-xend-winxpsp3  3 host-install(3)  broken in 33184 pass in 33227
 test-amd64-amd64-xl-qemut-win7-amd64  3 host-install(3)  broken in 33184 pass in 33227

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-qemut-rhel6hvm-intel  3 host-install(3)  broken in 33184 like 32162

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)  blocked n/a
 test-i386-i386-rumpuserxen-i386  1 build-check(1)  blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)  blocked n/a
 build-amd64-rumpuserxen  5 rumpuserxen-build  fail never pass
 test-i386-i386-libvirt  9 guest-start  fail never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 debian-hvm-install  fail never pass
 test-amd64-amd64-libvirt  9 guest-start  fail never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  7 debian-hvm-install  fail never pass
 build-i386-rumpuserxen  5 rumpuserxen-build  fail never pass
 test-amd64-i386-libvirt  9 guest-start  fail never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start  fail never pass
 test-amd64-amd64-xl-win7-amd64  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64  14 guest-stop  fail never pass
 test-i386-i386-xl-qemut-winxpsp3  14 guest-stop  fail never pass
 test-amd64-i386-xend-winxpsp3  17 leak-check/check  fail never pass
 test-i386-i386-xl-qemuu-winxpsp3  14 guest-stop  fail never pass
 test-amd64-amd64-xl-winxpsp3  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64  14 guest-stop  fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  14 guest-stop  fail never pass
 test-i386-i386-xl-winxpsp3  14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3  14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64  14 guest-stop  fail in 33184 never pass

version targeted for testing:
 xen  95af3f09eeef089e0100a8518f7ca75206e33c7c
baseline version:
 xen  353de6b221c2d0fb59edfceb1f535357e4d84825

People who touched revisions under test:
 Mihai Donțu mdo...@bitdefender.com
 Răzvan Cojocaru rcojoc...@bitdefender.com

jobs:
 build-amd64  pass
 build-i386  pass
 build-amd64-libvirt  pass
 build-i386-libvirt  pass
 build-amd64-pvops
Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask
On Wed, Jan 07, 2015 at 09:43:44PM +, Ian Jackson wrote: Andrew Cooper writes (Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask): Other culprits are xc_get_max_nodes(), xc_get_max_cpus(), 4 instances in xc_psr.c and most things in xc_offline_page.c which appears to have static structures for domain context. The pluggable loader infrastructure in xc_dom.c also appears to be thread-unsafe. xc_dom_decompress_unsafe.c also uses static data, but "unsafe" in the name might be a sufficient guard? I will look at these tomorrow. No aggressively optimising compiler is going to perform partial writes on a naturally aligned integer, so I stand by my comment when applied to the common case. You misunderstand. An aggressively optimising compiler might be able to prove (perhaps through whole program analysis - we have link-time optimisation nowadays) various falsehoods about the way these variables are used. The resulting generated machine code might be arbitrarily bad, up to and including missing important parts of the whole program. Does this really happen, as for this example? I'm not aware of any compilers which currently take advantage of thread safety bugs (really, just spec-violations) but I think this is just a matter of time. I think this can be the issue. At least we should be very careful. Or we could build a common mechanism in libxc to cache something safely, and use that to replace all the static places we currently have. Personally, though, I agree with making xc re-entrant. The cache responsibility can then be moved to the top caller. For this example, remove the static in libxc, and the caching of the constant hypercall return value is then moved to xl. It can just 'cache' the value in a local variable. This is not the most performant way, though, as at least one hypercall is needed for each xl command invocation. Chao Ian.
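To make the "move the cache to the top caller" suggestion concrete, here is a minimal sketch. It is not libxc code: `fake_hypercall_get_mask()`, `struct mask_cache` and `get_event_mask()` are hypothetical names invented for illustration, and the mask value is made up. The point is only that the library function keeps no static state, so any sharing (and any locking) becomes the caller's decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a hypercall whose result never changes.
 * The counter exists only so the example can show how often it runs. */
int fake_hypercall_count;

static int fake_hypercall_get_mask(uint32_t *mask)
{
    fake_hypercall_count++;
    *mask = 0x7;            /* made-up value */
    return 0;
}

/* Cache owned by the caller (e.g. a local variable in the tool),
 * instead of a static variable hidden inside the library. */
struct mask_cache {
    bool valid;
    uint32_t mask;
};

/* Re-entrant: touches only the state passed in by the caller. */
int get_event_mask(struct mask_cache *cache, uint32_t *mask)
{
    if (!cache->valid) {
        int rc = fake_hypercall_get_mask(&cache->mask);
        if (rc)
            return rc;
        cache->valid = true;
    }
    *mask = cache->mask;
    return 0;
}
```

A short-lived command would declare one `struct mask_cache` on its stack, which matches the observation above that each invocation then pays for at least one hypercall.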
Re: [Xen-devel] [OSSTEST PATCH 3/4] Add nested testcase of installing L2 guest VM
-Original Message- From: Pang, LongtaoX Sent: Wednesday, January 07, 2015 11:53 AM To: 'Wei Liu' Cc: xen-devel@lists.xen.org; ian.jack...@eu.citrix.com; ian.campb...@citrix.com; Hu, Robert; Zheng, Di Subject: RE: [OSSTEST PATCH 3/4] Add nested testcase of installing L2 guest VM -Original Message- From: Wei Liu [mailto:wei.l...@citrix.com] Sent: Wednesday, January 07, 2015 12:52 AM To: Pang, LongtaoX Cc: Wei Liu; xen-devel@lists.xen.org; ian.jack...@eu.citrix.com; ian.campb...@citrix.com; Hu, Robert; Zheng, Di Subject: Re: [OSSTEST PATCH 3/4] Add nested testcase of installing L2 guest VM On Tue, Jan 06, 2015 at 03:28:43AM +, Pang, LongtaoX wrote: -Original Message- From: Wei Liu [mailto:wei.l...@citrix.com] Sent: Thursday, December 11, 2014 7:44 PM To: Pang, LongtaoX Cc: xen-devel@lists.xen.org; ian.jack...@eu.citrix.com; ian.campb...@citrix.com; wei.l...@citrix.com; Hu, Robert; Zheng, Di Subject: Re: [OSSTEST PATCH 3/4] Add nested testcase of installing L2 guest VM On Wed, Dec 10, 2014 at 04:07:39PM +0800, longtao.pang wrote: From: longtao.pang longtaox.p...@intel.com This patch is used for installing L2 guest VM inside L1 guest VM. --- sg-run-job|2 + ts-debian-install | 166 + 2 files changed, 132 insertions(+), 36 deletions(-) diff --git a/sg-run-job b/sg-run-job index e513bd1..85f7b22 100755 --- a/sg-run-job +++ b/sg-run-job @@ -292,6 +292,8 @@ proc need-hosts/test-nested {} {return host} proc run-job/test-nested {} { run-ts . = ts-debian-hvm-install + host + nested + nested_L1 run-ts . = ts-xen-install + host + nested + nested_build +run-ts . = ts-debian-install + host + nested + amd64 + nested_L2 +run-ts . = ts-guest-destroy + host nested It would also be possible to run ts-debian-hvm-install as L2. That would suite this test case better -- it's testing nested HVM. There's no need to remove the PV test case though. [Pang, LongtaoX] [Pang, LongtaoX] Thanks for checking. 
We used ts-debian-hvm-install for installing the L1 HVM guest via an ISO image, because we will build Xen, the Xen tools and the dom0 kernel inside it, and then we will install the L2 guest inside L1. But the L2 guest is just a native OS, so we think using ts-debian-install is enough for installing L2 and will make it easy to control. ts-debian-install installs a L2 PV guest, which should work even without nested HVM enabled for your L1 HVM guest. You're testing nested HVM, so I think it makes more sense to install a L2 HVM guest. [Pang, LongtaoX] Thanks Wei, I will try to re-use the ts-debian-hvm-install script for L2, though it may make this script complicated. If it works, it will no longer be necessary to modify and use ts-debian-install. [Pang, LongtaoX] Hi Wei, regarding the ts-debian-hvm-install script: too many of its parameters, functions, structures and variables are not suited to L2 installation, so if I re-use and modify it for L2, it will make the script more convoluted and hard to maintain later. So, I plan to write a new script similar to ts-debian-hvm-install, called ts-debian-hvm-install-L2, for L2 guest installation. If you have any concerns or other opinions, please tell me, thanks. [...] +sub start () { +my $cfg_xend= "/etc/xen/$guesthost.cfg"; +my $cmd= toolstack()->{Command}." create ".$cfg_xend; +target_cmd_root($ho, $cmd, 30); +my $domains = target_cmd_output_root($ho, toolstack()->{Command}." list"); +logm("guest state is\n$domains"); } I think we already have a guest start script? This hunk is going to break easily if we're more flexible about the toolstack (we already have a partially working libvirt test case). [Pang, LongtaoX] Thanks for checking. I have tried to use ts-guest-start to start the guest, but it may not be suitable here, because some functions and parameters in the script are not necessary here. If we use the script we will have to modify it again, which may impact other test jobs. So I created a function here to start the L2 guest.
Then you need to keep an eye on the ongoing work from Ian Campbell to factor out abstraction layer of toolstack and rebase accordingly. Wei.
Re: [Xen-devel] [PATCH] VT-d: don't crash when PTE bits 52 and up are non-zero
From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Wednesday, January 07, 2015 6:16 PM On 23.12.14 at 07:52, kevin.t...@intel.com wrote: From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Friday, December 19, 2014 7:26 PM This can (and will) be legitimately the case when sharing page tables with EPT (more of a problem before p2m_access_rwx became zero, but still possible even now when other than that is the default for a guest), leading to an unconditional crash (in print_vtd_entries()) when a DMA remapping fault occurs. could you elaborate the scenarios when bits 52+ are non-zero? Signed-off-by: Jan Beulich jbeul...@suse.com but the changes looks reasonable to me. Signed-off-by: Kevin Tian kevin.t...@intel.com I translated this to a Reviewed-by, as S-o-b doesn't seem to make sense here. Konrad - please indicate whether this can also go into 4.5. Jan sorry for typo. it should be Acked-by. :-) Thanks Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] dom0 as pvh boot problem
On Tue, Jan 6, 2015 at 10:39 PM, Zhang, Yang Z yang.z.zh...@intel.com wrote: Elena Ufimtseva wrote on 2015-01-07: On Mon, Jan 5, 2015 at 7:53 AM, Zhang, Yang Z yang.z.zh...@intel.com wrote: Jan Beulich wrote on 2015-01-05: Elena Ufimtseva ufimts...@gmail.com 01/02/15 7:32 PM The last successful command is the reading status register of second IOMMU unit: snip from iommu_enable_translation() in ./xen/drivers/passthrough/vtd/iommu.c 746:sts = dmar_readl(iommu-reg, DMAR_GSTS_REG); 747:dmar_writel(iommu-reg, DMAR_GCMD_REG, sts | DMA_GCMD_TE); /snip After dmar_writel for second iommu the machine hangs. That's rather odd - you say it doesn't even reach the IOMMU_WAIT_OP()right after that? That would suggest a fault or other abnormal condition raised by the translation enabling (i.e. some problem with the page tables, albeit that should then have been a problem for the first IOMMU already). Yet an eventual fault can't be delivered at that point due to interrupts being disabled. Perhaps the VT-d maintainers (now Cc-ed) have some suggestion as to what's going on or how to diagnose. I am curious why pv dom0 boot fine. Will pvh dom0 share EPT table with VT-d? Maybe try with disable sharing to see whether helps. Yes, it is interesting. I am working on pvh guest boot under pv dom0. Hi Yang Somehow I dropped the list from this conversation. Adding back. We can start from this point. It is not very hard to find the difference on IOMMU setup when booting pvh dom0 and pv dom0. Also, have you seen any difference on IOMMU register on working and non-working case? I compared the registers and I did not find any difference in pv or pvh case. I did it for both iommus, before enabling each unit and after. They are the same. If you would like, I can post the dumps. As for the differences between setting up iommu, I will check and get back with this. Thank you for your advice. I tried booting with non shared EPTs, but with the same result. 
Jan Best regards, Yang Best regards, Yang -- Elena
Re: [Xen-devel] dom0 as pvh boot problem
On Wed, Jan 7, 2015 at 2:51 AM, Jan Beulich jbeul...@suse.com wrote: On 06.01.15 at 23:20, ufimts...@gmail.com wrote: On Mon, Jan 5, 2015 at 3:44 AM, Jan Beulich jbeul...@suse.com wrote: Elena Ufimtseva ufimts...@gmail.com 01/02/15 7:32 PM The last successful command is the reading status register of second IOMMU unit: snip from iommu_enable_translation() in ./xen/drivers/passthrough/vtd/iommu.c 746:sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); 747:dmar_writel(iommu->reg, DMAR_GCMD_REG, sts | DMA_GCMD_TE); /snip After dmar_writel for second iommu the machine hangs. That's rather odd - you say it doesn't even reach the IOMMU_WAIT_OP() right after that? That's odd, the last tests I did show that it does complete the write to the control register of the second drhd, but I cannot say if it reaches IOMMU_WAIT_OP() as right after this write it hangs. I tried to enable the iommus in reverse order with the same result. The same being it hanging on the second one being enabled, or no hanging on the one getting enabled first? The same meant the same behaviour. It does not matter whether iommu#1 (numbered in the order of enumeration) is enabled first or after iommu#0: the machine hangs after enabling translation by writing to the command reg. of iommu#1. That would suggest a fault or other abnormal condition raised by the translation enabling (i.e. some problem with the page tables, albeit that should then have been a problem for the first IOMMU already). I wonder if such a problem can be diagnosed without interrupts. Maybe it is reflected in the error logging event registers? Sure - you'd need to poll those registers, which you can't if the CPU really hangs. Or maybe you could try to monitor the IOMMU state from another CPU... Thanks, Jan, for the advice. Yes, I can do this, pin dom0 to one cpu and poll the registers from another. Jan -- Elena
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 07.01.15 at 15:47, andrew.coop...@citrix.com wrote: On 07/01/15 14:42, Boris Ostrovsky wrote: I kept this field as an int to be able to store NUMA_NO_NODE which I thought to be (int)-1. But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to (int)-1 by pxm_to_node(). Given that there is a number of tests for NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() return u8 as well? I noticed this as well, and found it quite counter intuitive. I would suggest fixing NUMA_NO_NODE to -1 and removing some of the type-punning. I have to admit that I see no value in wasting 4 bytes for something that for the foreseeable future won't exceed 1 byte. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
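The mismatch being discussed can be shown in a few lines. This is only an illustration of the integer conversions involved, not the Xen implementation: `node_as_int()` and `node_as_u8()` are made-up helpers, and only the 0xff value of NUMA_NO_NODE is taken from the message above.

```c
#include <stdint.h>

#define NUMA_NO_NODE 0xFF   /* value as described in the thread */

/* Hypothetical lookup that returns the stored byte through a signed
 * 8-bit type: the stored 0xff comes back as (int)-1 on the usual
 * two's-complement targets. */
int node_as_int(uint8_t stored)
{
    return (int8_t)stored;
}

/* Hypothetical lookup that keeps the value unsigned, so it can be
 * compared against NUMA_NO_NODE directly. */
uint8_t node_as_u8(uint8_t stored)
{
    return stored;
}
```

With the signed variant, a test written as `node == NUMA_NO_NODE` compares -1 with 255 and never matches, which is why the thread suggests settling on a single representation.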
[Xen-devel] [Patch V4] expand x86 arch_shared_info to support linear p2m list
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list currently contains the mfn of the top level page frame of the 3 level p2m tree, which is used by the Xen tools during saving and restoring (and live migration) of pv domains and for crash dump analysis. With three levels of the p2m tree it is possible to support up to 512 GB of RAM for a 64 bit pv domain. A 32 bit pv domain can support more, as each memory page can hold 1024 instead of 512 entries, leading to a limit of 4 TB. To be able to support more RAM on x86-64 switch to an additional virtual mapped p2m list. This patch expands struct arch_shared_info with a new p2m list virtual address, the root of the page table root and a p2m generation count. The new information is indicated by the domain to be valid by storing a non-zero value into the page table root member. To avoid build failures in the tools directory the checked structure sizes must be adapted, too. Signed-off-by: Juergen Gross jgr...@suse.com --- tools/include/xen-foreign/reference.size | 4 ++-- xen/include/public/arch-x86/xen.h| 36 +--- 2 files changed, 35 insertions(+), 5 deletions(-) diff --git a/tools/include/xen-foreign/reference.size b/tools/include/xen-foreign/reference.size index 60ee262..ffe319e 100644 --- a/tools/include/xen-foreign/reference.size +++ b/tools/include/xen-foreign/reference.size @@ -9,6 +9,6 @@ vcpu_guest_context| 344 34428005168 arch_vcpu_info| 0 0 24 16 vcpu_time_info| 32 32 32 32 vcpu_info | 48 48 64 64 -arch_shared_info | 0 0 268 280 -shared_info |1088108825843368 +arch_shared_info | 0 0 24 48 +shared_info |1088108823403136 diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch-x86/xen.h index f35804b..c5e880b 100644 --- a/xen/include/public/arch-x86/xen.h +++ b/xen/include/public/arch-x86/xen.h @@ -220,11 +220,41 @@ typedef struct vcpu_guest_context vcpu_guest_context_t; DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t); struct arch_shared_info { -unsigned long max_pfn; /* max pfn that 
appears in table */ -/* Frame containing list of mfns containing list of mfns containing p2m. */ +/* + * Number of valid entries in the p2m table(s) anchored at + * pfn_to_mfn_frame_list_list and/or p2m_vaddr. + */ +unsigned long max_pfn; +/* + * Frame containing list of mfns containing list of mfns containing p2m. + * A value of 0 indicates it has not yet been set up, ~0 indicates it has + * been set to invalid e.g. due to the p2m being too large for the 3-level + * p2m tree. In this case the linear mapper p2m list anchored at p2m_vaddr + * is to be used. + */ xen_pfn_t pfn_to_mfn_frame_list_list; unsigned long nmi_reason; -uint64_t pad[32]; +/* + * Following three fields are valid if p2m_cr3 contains a value different + * from 0. + * p2m_cr3 is the root of the address space where p2m_vaddr is valid. + * p2m_cr3 is in the same format as a cr3 value in the vcpu register state + * and holds the folded machine frame number (via xen_pfn_to_cr3) of a + * L3 or L4 page table. + * p2m_vaddr holds the virtual address of the linear p2m list. All entries + * in the range [0...max_pfn[ are accessible via this pointer. + * p2m_generation will be incremented by the guest before and after each + * change of the mappings of the p2m list. p2m_generation starts at 0 and + * a value with the least significant bit set indicates that a mapping + * update is in progress. This allows guest external software (e.g. in Dom0) + * to verify that read mappings are consistent and whether they have changed + * since the last check. + * Modifying a p2m element in the linear p2m list is allowed via an atomic + * write only. + */ +unsigned long p2m_cr3; /* cr3 value of the p2m address space */ +unsigned long p2m_vaddr; /* virtual address of the p2m list */ +unsigned long p2m_generation; /* generation count of p2m mapping */ }; typedef struct arch_shared_info arch_shared_info_t; -- 2.1.2 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
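The p2m_generation comment describes a sequence-counter protocol: an odd value means an update is in progress, and a reader should accept a value only if the counter was even and unchanged across the read. Below is a rough reader-side sketch under simplifying assumptions. `struct p2m_view` and `p2m_read_consistent()` are hypothetical; a real Dom0 tool would be looking at mapped guest memory rather than a local array, and the retry limit of 16 is arbitrary.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical local view of the fields a reader would consult. */
struct p2m_view {
    _Atomic unsigned long generation;  /* mirrors p2m_generation */
    const unsigned long *list;         /* mirrors the list at p2m_vaddr */
    unsigned long max_pfn;             /* number of valid entries */
};

/* Returns true and stores the entry if a consistent read succeeded. */
bool p2m_read_consistent(struct p2m_view *v, unsigned long pfn,
                         unsigned long *mfn)
{
    for (int tries = 0; tries < 16; tries++) {
        unsigned long before = atomic_load(&v->generation);

        if (before & 1)
            continue;               /* mapping update in progress */
        if (pfn >= v->max_pfn)
            return false;           /* outside [0, max_pfn) */

        unsigned long val = v->list[pfn];
        unsigned long after = atomic_load(&v->generation);

        if (before == after) {      /* nothing changed while reading */
            *mfn = val;
            return true;
        }
    }
    return false;                   /* gave up after repeated changes */
}
```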
[Xen-devel] [Patch V4] support guest virtual mapped p2m list
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list currently contains the mfn of the top level page frame of the 3 level p2m tree, which is used by the Xen tools during saving and restoring (and live migration) of pv domains and for crash dump analysis. With three levels of the p2m tree it is possible to support up to 512 GB of RAM for a 64 bit pv domain. A 32 bit pv domain can support more, as each memory page can hold 1024 instead of 512 entries, leading to a limit of 4 TB. To be able to support more RAM on x86-64 switch to a virtual mapped p2m list. Changes in V4: - adjust structure sizes in tools/include/xen-foreign/reference.size Changes in V3: - removed XENFEAT_virtual_p2m completely as the linear p2m list and the 3 level p2m tree can be used in parallel unless the domain size exceeds the limit mentioned above - updated comments to reflect the parallel use of both p2m schemes Changes in V2: - add new structure member p2m_generation in arch_shared_info - rename structure member referencing the p2m address space to p2m_cr3 - add some comments - removed patches 2-4 as overriding missing XENFEAT_virtual_p2m will be done via kernel parameter (patch 2 will be resent after Xen 4.5 is out) *** BLURB HERE *** Juergen Gross (1): expand x86 arch_shared_info to support linear p2m list tools/include/xen-foreign/reference.size | 4 ++-- xen/include/public/arch-x86/xen.h| 36 +--- 2 files changed, 35 insertions(+), 5 deletions(-) -- 2.1.2 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
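The 512 GB and 4 TB figures follow from the tree geometry. The small calculation below reproduces them, assuming 4 KiB pages and 8-byte (64 bit) or 4-byte (32 bit) entries, which is what gives 512 and 1024 entries per page; `p2m_tree_max_bytes()` is a name made up for this sketch.

```c
#include <stdint.h>

#define P2M_PAGE_BYTES 4096ULL   /* assumption: 4 KiB pages */

/* Bytes of guest RAM addressable through a p2m tree with the given
 * entry size and number of levels: every level multiplies the reach
 * by the number of entries per page, and each leaf entry describes
 * one page. */
uint64_t p2m_tree_max_bytes(unsigned entry_size, unsigned levels)
{
    uint64_t entries_per_page = P2M_PAGE_BYTES / entry_size;
    uint64_t pages = 1;

    for (unsigned i = 0; i < levels; i++)
        pages *= entries_per_page;

    return pages * P2M_PAGE_BYTES;
}
```

For three levels this gives 512^3 pages of 4 KiB, i.e. 2^39 bytes, for 64 bit, and 1024^3 pages, i.e. 2^42 bytes, for 32 bit.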
[Xen-devel] [PATCH 0/4] dt-uart: cleanups, bugfixes and /chosen/stdout-path support
Before I put on my air tanks and go looking in my QUEUE-4.6 email folder I wanted to start the year by doing some actual programming, and this seemed like an afternoon's hacking... The two main changes here are: * a bugfix to deal with DT paths which contain a comma, which are perfectly valid and quite common. * support for the /chosen/stdout-path device tree property, which allows Xen to find a boot console without manual configuration if presented with a suitable device tree. Juno and Seattle both contain the node in the dt supplied with the upstream Linux kernel (not sure about their factory kernel), as do a bunch of non-virt capable arm devices, but with stdout-path support being added to Linux in v3.19-rc1 I expect that number will soon grow. I think dt-uart: use ':' as default separator between path and options should be a candidate for stable backport, and if it makes the backport easier I'd be inclined to take dt-uart: Clarify log messages at init time too. Ian.
[Xen-devel] [PATCH 3/4] dt-uart: use ':' as default separator between path and options
',' is a valid character in a device-tree path (see ePAPR v1.1 Table 2-1), in fact ',' is actually pretty common in node names. Using ',' as a separator breaks for example on fast models. If you use the full path (/smb/motherboard/iofpga@3,/uart@09) rather than the alias then earlyprintk gives: (XEN) Looking for UART console /smb/motherboard/iofpga@3 (XEN) Unable to find device /smb/motherboard/iofpga@3 (XEN) Bad console= option 'dtuart' I actually noticed this on Jetson where the uart is /serial@0,70006300 and there happened to be no alias defined. Instead use ':' as the separator, it is defined to terminate the path in the context of /chosen/stdout-path (Table 3-4) which is pretty closely analogous to the dtuart= option and so makes a pretty good choice (especially since the next patch adds support for stdout-path). We still handle ',' for backwards compatibility. Note that this introduces a wrinkle in that in order to specify a dtuart path containing a ',' with no options you need to append an otherwise pointless ':' (or use an alias with no ',' in it). Additionally, expand the buffer for the dtuart option, a path can be far longer than 30 characters (in fact the maximum size of a single node name is 31, so it's not even necessarily enough for an alias). 128 is completely arbitrary and allows for paths at least 8 deep even with worst case node names. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- I've retained the handling of ',' for compatibility, but I'm almost inclined to just drop it, if not now then in a release or two. 
--- docs/misc/xen-command-line.markdown |9 - xen/drivers/char/dt-uart.c |6 -- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 152ae03..f7cb6d9 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -550,13 +550,20 @@ Pin dom0 vcpus to their respective pcpus Flag that makes a 64bit dom0 boot in PVH mode. No 32bit support at present. ### dtuart (ARM) - `= path [,options]` + `= path [:options]` Default: `` Specify the full path in the device tree for the UART. If the path doesn't start with `/`, it is assumed to be an alias. The options are device specific. +The path should be separated from the options with a `:`. For +backwards compatibility `,` is also supported, however this is +deprecated because `,` is a valid character in a device tree path. + +To specify a path containing a `,` with no options simply append a `:` +to the path. + ### e820-mtrr-clip `= boolean` diff --git a/xen/drivers/char/dt-uart.c b/xen/drivers/char/dt-uart.c index 04dbb97..54e65fc 100644 --- a/xen/drivers/char/dt-uart.c +++ b/xen/drivers/char/dt-uart.c @@ -31,7 +31,7 @@ * doesn't start with '/', we assuming that it's an alias. * @options: UART speficic options (see in each UART driver) */ -static char __initdata opt_dtuart[30] = ; +static char __initdata opt_dtuart[256] = ; string_param(dtuart, opt_dtuart); void __init dt_uart_init(void) @@ -50,7 +50,9 @@ void __init dt_uart_init(void) return; } -options = strchr(opt_dtuart, ','); +options = strchr(opt_dtuart, ':'); +if ( !options ) +options = strchr(opt_dtuart, ','); if ( options != NULL ) *(options++) = '\0'; else -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
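The separator logic in the hunk above can be tried outside the hypervisor. This standalone sketch mirrors the strchr() sequence from the patch; the wrapper name `split_dtuart()` is made up, and the option string "115200" used when exercising it is only a placeholder, since options are device specific.

```c
#include <string.h>

/* Split "path[:options]" in place, falling back to ',' as the legacy
 * separator. Returns a pointer to the options, or to an empty string
 * if there are none; the path is left NUL-terminated in opt. */
char *split_dtuart(char *opt)
{
    char *options = strchr(opt, ':');

    if (!options)
        options = strchr(opt, ',');   /* backwards compatibility */

    if (options) {
        *options++ = '\0';
        return options;
    }
    return opt + strlen(opt);         /* no separator: empty options */
}
```

Feeding it "/serial@0,70006300" without a trailing ':' splits the path at the comma, which is the wrinkle the commit message warns about; "/serial@0,70006300:" keeps the path whole.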
[Xen-devel] [PATCH 4/4] dt-uart: support /chosen/stdout-path property.
ePAPR v1.1 section 3.5 defines the /chosen/stdout-path property to refer to the device to be used for boot console output, so if no dtuart property is given try to use that instead. This will make Xen find a suitable console by default on DT platforms which include this property. As it happens the dtuart option has the exact same syntax as stdout-path, so we can just copy the value into that buffer if it is empty. FWIW support for this was added to Linux in v3.19-rc1 (7914a7c5651a of: support passing console options with stdout-path) and a fairly large number of the dts files shipped with Linux have already included a stdout-path property for quite a while now. Since there is a base of existing device trees with the property, we do not support the legacy ',' options separator so we remain compatible. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- xen/arch/arm/domain_build.c |2 ++ xen/drivers/char/dt-uart.c | 29 +++-- 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index de180d8..c33a73c 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -424,6 +424,7 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo, * bootargs (from module #1, above). * * remove bootargs, xen,dom0-bootargs, xen,xen-bootargs, * linux,initrd-start and linux,initrd-end. + * * remove stdout-path. 
* * remove bootargs, linux,uefi-system-table, * linux,uefi-mmap-start, linux,uefi-mmap-size, * linux,uefi-mmap-desc-size, and linux,uefi-mmap-desc-ver @@ -434,6 +435,7 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo, if ( dt_property_name_is_equal(prop, xen,xen-bootargs) || dt_property_name_is_equal(prop, linux,initrd-start) || dt_property_name_is_equal(prop, linux,initrd-end) || + dt_property_name_is_equal(prop, stdout-path) || dt_property_name_is_equal(prop, linux,uefi-system-table) || dt_property_name_is_equal(prop, linux,uefi-mmap-start) || dt_property_name_is_equal(prop, linux,uefi-mmap-size) || diff --git a/xen/drivers/char/dt-uart.c b/xen/drivers/char/dt-uart.c index 54e65fc..08b0d76 100644 --- a/xen/drivers/char/dt-uart.c +++ b/xen/drivers/char/dt-uart.c @@ -22,6 +22,7 @@ #include xen/console.h #include xen/device_tree.h #include xen/serial.h +#include xen/errno.h /* * Configure UART port with a string: @@ -38,7 +39,7 @@ void __init dt_uart_init(void) { struct dt_device_node *dev; int ret; -const char *devpath = opt_dtuart; +const char *devpath = opt_dtuart, *stdout = NULL; char *options; if ( !console_has(dtuart) ) @@ -46,12 +47,36 @@ void __init dt_uart_init(void) if ( !strcmp(opt_dtuart, ) ) { +struct dt_device_node *chosen = dt_find_node_by_path(/chosen); + +if ( chosen ) +{ +ret = dt_property_read_string(chosen, stdout-path, stdout); +if ( ret = 0 ) +{ +printk(Taking dtuart configuration from /chosen/stdout-path\n); +strlcpy(opt_dtuart, stdout, sizeof(opt_dtuart)); +} +else if ( ret != -EINVAL /* Not present */ ) +printk(Failed to read /chosen/stdout-path (%d)\n, ret); +} +} + +if ( !strcmp(opt_dtuart, ) ) +{ printk(No dtuart path configured\n); return; } options = strchr(opt_dtuart, ':'); -if ( !options ) +/* + * Support ',' as a legacy separator for path from command line + * only, since there is no legacy on the Xen side with stdout-path + * and looking for ',' there would render us incompatible with + * many of the device 
tree files out there which already include a + * stdout-path. + */ +if ( !options !stdout ) options = strchr(opt_dtuart, ','); if ( options != NULL ) *(options++) = '\0'; -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 2/4] dt-uart: Clarify log messages at init time.
- Don't log at all if console=dtuart (the default) was not present, in that case the user has asked for something else, no need for every other driver to tell them this. - Use dtuart in all other messages, rather than just console or uart. - Be more explicit if we are exiting because dtuart= wasn't given. - Log the options which we've parsed. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- xen/drivers/char/dt-uart.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/xen/drivers/char/dt-uart.c b/xen/drivers/char/dt-uart.c index 45a87a6..04dbb97 100644 --- a/xen/drivers/char/dt-uart.c +++ b/xen/drivers/char/dt-uart.c @@ -41,9 +41,12 @@ void __init dt_uart_init(void) const char *devpath = opt_dtuart; char *options; -if ( !console_has(dtuart) || !strcmp(opt_dtuart, ) ) +if ( !console_has(dtuart) ) +return; /* Not for us */ + +if ( !strcmp(opt_dtuart, ) ) { -printk(No console\n); +printk(No dtuart path configured\n); return; } @@ -53,7 +56,7 @@ void __init dt_uart_init(void) else options = ; -printk(Looking for UART console %s\n, devpath); +printk(Looking for dtuart at \%s\, options \%s\\n, devpath, options); if ( *devpath == '/' ) dev = dt_find_node_by_path(devpath); else @@ -68,7 +71,7 @@ void __init dt_uart_init(void) ret = device_init(dev, DEVICE_SERIAL, options); if ( ret ) -printk(Unable to initialize serial: %d\n, ret); +printk(Unable to initialize dtuart: %d\n, ret); } /* -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4] xmalloc: add support for checking the pool integrity
On 16.12.14 at 20:33, mdo...@bitdefender.com wrote: Implemented xmem_pool_check(), xmem_pool_check_locked() and xmem_pool_check_unlocked() to verify the integrity of the TLSF matrix. Signed-off-by: Mihai Donțu mdo...@bitdefender.com Andrew, Julien, having gone through the discussion following this patch submission once again just now, I wonder where we are: If you're objecting to any part of the patch as is, please call this out clearly so that Mihai has a way to address your concerns. Otherwise please state clearly that you don't object to the patch going in as is. Thanks, Jan
Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology
On 01/07/2015 10:17 AM, Jan Beulich wrote:
On 07.01.15 at 15:55, boris.ostrov...@oracle.com wrote:
On 01/07/2015 04:21 AM, Jan Beulich wrote:
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:

+for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
+{
+    xen_sysctl_pcitopo_t pcitopo;
+    struct pci_dev *pdev;
+
+    if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
+                                ti->first_dev, 1) )
+    {
+        ret = -EFAULT;
+        break;
+    }
+
+    spin_lock(&pcidevs_lock);
+    pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
+                        pcitopo.pcidev.devfn);
+    if ( !pdev || (pdev->node == NUMA_NO_NODE) )
+        pcitopo.node = INVALID_TOPOLOGY_ID;
+    else
+        pcitopo.node = pdev->node;

Are hypervisor-internal node numbers really meaningful to the caller?

This is the same information (pxm -> node mapping) that we provide in XEN_SYSCTL_topologyinfo (renamed in this series to XEN_SYSCTL_cputopoinfo). Given that I expect the two topologies to be used together I think the answer is yes.

Building your argumentation on potentially mis-designed existing interfaces is bogus. The question is - what use is a Xen internal node number to a caller of a particular hypercall (other than it being purely informational, e.g. for printing human readable output)?

Just as with knowing CPU/memory topology --- this will help with placing a guest if we know what proximity domain both the device and the CPUs/memory belong to. Exposing PXM values to the caller would be as good as those internal node IDs. The only (I think) problem is that PXMs are not necessarily zero-based and may not be contiguous, and so we need to have some sort of a common mapping for both CPUs and devices. And the hypervisor provides such a mapping in a persistent way. And if we are going to keep this as a sysctl then we need to be consistent with what we do now for CPUs, which is pxm2node[]. Or change the CPU topology sysctl as well, which I don't think is a good idea.

In particular if we were to introduce a new non-sysctl interface, determining whether the hypervisor internal representation is really the right one to expose here should be one of the most important design aspects.

Yes.

I personally think that exposing e.g. the firmware determined (and hence hopefully stable across reboots) PXM would be more reasonable.

Again, the main argument that I see against using PXM values directly is the fact that they are not zero-based and may be non-contiguous.

-boris
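The mapping argument above (PXM values may be sparse, node IDs are dense and zero-based) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `demo_` names, the table size of 256 and the first-seen allocation order are hypothetical and are not taken from the Xen source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_PXM  256   /* assumed size of the PXM value space */
#define DEMO_NO_NODE  0xFF  /* sentinel: no node assigned */

/* Hypothetical table mapping sparse PXM values to dense node IDs. */
struct demo_pxm_map {
    uint8_t  pxm2node[DEMO_MAX_PXM];
    unsigned nr_nodes;
};

/* Mark every PXM as unmapped. */
void demo_pxm_map_init(struct demo_pxm_map *m)
{
    assert(m != NULL);
    memset(m->pxm2node, DEMO_NO_NODE, sizeof(m->pxm2node));
    m->nr_nodes = 0;
}

/* Return the node ID for a PXM, handing out the next free ID on first use. */
uint8_t demo_pxm_to_node(struct demo_pxm_map *m, unsigned pxm)
{
    assert(m != NULL);
    if (pxm >= DEMO_MAX_PXM)
        return DEMO_NO_NODE;
    if (m->pxm2node[pxm] == DEMO_NO_NODE) {
        if (m->nr_nodes >= DEMO_NO_NODE)
            return DEMO_NO_NODE;            /* no IDs left */
        m->pxm2node[pxm] = (uint8_t)m->nr_nodes++;
    }
    return m->pxm2node[pxm];
}
```

With such a table, a CPU and a PCI device that report the same PXM end up with the same node ID, which is the consistency property the reply argues for. The IDs depend on the order in which PXMs are first seen, which is why they are not stable across reboots in the way the firmware PXM values are hoped to be.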
Re: [Xen-devel] check-headers error in staging
On 07/01/15 12:37, Olaf Hering wrote:

After upgrade to current staging, my test packages fail to build:

[ 289s] + make -j4 -k -C tools/include/xen-foreign
[ 289s] make: Entering directory `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
[ 289s] python mkheader.py arm32 arm32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[ 289s] python mkheader.py arm64 arm64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[ 289s] python mkheader.py x86_32 x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[ 289s] python mkheader.py x86_64 x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[ 289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
[ 289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
[ 289s] ./checker tmp.size
[ 289s] diff -u reference.size tmp.size
[ 289s] --- reference.size 2015-01-07 10:28:57.0 +
[ 289s] +++ tmp.size 2015-01-07 12:28:33.564911299 +
[ 289s] @@ -9,6 +9,6 @@
[ 289s]  arch_vcpu_info    |       0       0      24      16
[ 289s]  vcpu_time_info    |      32      32      32      32
[ 289s]  vcpu_info         |      48      48      64      64
[ 289s] -arch_shared_info  |       0       0     268     280
[ 289s] -shared_info       |    1088    1088    2584    3368
[ 289s] +arch_shared_info  |       0       0      24      48
[ 289s] +shared_info       |    1088    1088    2340    3136
[ 289s]
[ 289s] make: *** [check-headers] Error 1

this was the last successful build:
xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538

this build fails:
xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8

So it looks like something between git commits dcd8486..dd94cac causes this failure.

It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear p2m list". There is a shrink to the size of arch_shared_info, but it was argued and accepted as safe to do.

~Andrew
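As a quick cross-check of the size table in the log above, the following sketch compares how much each structure shrank per architecture. It assumes the four fused columns split as 1088/1088/2584/3368 and 1088/1088/2340/3136, and that the column order follows the mkchecker.py arguments (arm32, arm64, x86_32, x86_64); all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed column order: arm32, arm64, x86_32, x86_64. */
enum { DEMO_NR_ARCHES = 4 };

/* Sizes in bytes as read from the reference.size / tmp.size diff. */
static const int demo_arch_shared_info_old[DEMO_NR_ARCHES] = { 0, 0, 268, 280 };
static const int demo_arch_shared_info_new[DEMO_NR_ARCHES] = { 0, 0,  24,  48 };
static const int demo_shared_info_old[DEMO_NR_ARCHES] = { 1088, 1088, 2584, 3368 };
static const int demo_shared_info_new[DEMO_NR_ARCHES] = { 1088, 1088, 2340, 3136 };

/* Bytes by which a structure shrank on one architecture. */
int demo_shrink(const int *old_sz, const int *new_sz, size_t arch)
{
    assert(arch < DEMO_NR_ARCHES);
    return old_sz[arch] - new_sz[arch];
}

/* Returns 1 if shared_info shrank by exactly as much as the embedded
 * arch_shared_info on every architecture, 0 otherwise. */
int demo_shrink_is_explained(void)
{
    for (size_t a = 0; a < DEMO_NR_ARCHES; a++) {
        if (demo_shrink(demo_shared_info_old, demo_shared_info_new, a) !=
            demo_shrink(demo_arch_shared_info_old, demo_arch_shared_info_new, a))
            return 0;
    }
    return 1;
}
```

Under those assumptions the deltas match on every architecture (244 bytes on x86_32, 232 bytes on x86_64, none on ARM), which is consistent with the explanation that the whole difference comes from the arch_shared_info change.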
Re: [Xen-devel] check-headers error in staging
On 07.01.15 at 13:45, andrew.coop...@citrix.com wrote:
On 07/01/15 12:37, Olaf Hering wrote:

After upgrade to current staging, my test packages fail to build:

[ 289s] + make -j4 -k -C tools/include/xen-foreign
[ 289s] make: Entering directory `/usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign'
[ 289s] python mkheader.py arm32 arm32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[ 289s] python mkheader.py arm64 arm64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-arm.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/xen.h
[ 289s] python mkheader.py x86_32 x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_32.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[ 289s] python mkheader.py x86_64 x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/public/arch-x86/xen-x86_64.h /usr/src/packages/BUILD/xen-4.6.0.30084/non-dbg/tools/include/xen-foreign/../../../xen/include/
[ 289s] python mkchecker.py checker.c arm32 arm64 x86_32 x86_64
[ 289s] gcc -Wall -Werror -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -Wdeclaration-after-statement -o checker checker.c
[ 289s] ./checker tmp.size
[ 289s] diff -u reference.size tmp.size
[ 289s] --- reference.size 2015-01-07 10:28:57.0 +
[ 289s] +++ tmp.size 2015-01-07 12:28:33.564911299 +
[ 289s] @@ -9,6 +9,6 @@
[ 289s]  arch_vcpu_info    |       0       0      24      16
[ 289s]  vcpu_time_info    |      32      32      32      32
[ 289s]  vcpu_info         |      48      48      64      64
[ 289s] -arch_shared_info  |       0       0     268     280
[ 289s] -shared_info       |    1088    1088    2584    3368
[ 289s] +arch_shared_info  |       0       0      24      48
[ 289s] +shared_info       |    1088    1088    2340    3136
[ 289s]
[ 289s] make: *** [check-headers] Error 1

this was the last successful build:
xen_hg_changeset Mon Dec 15 17:40:12 2014 + hg: 30046:cefc36150538

this build fails:
xen_hg_changeset Wed Jan 07 11:28:57 2015 +0100 hg: 30084:88d114af72d8

So it looks like something between git commits dcd8486..dd94cac causes this failure.

It will be c/s dac6d3b1a "expand x86 arch_shared_info to support linear p2m list". There is a shrink to the size of arch_shared_info, but it was argued and accepted as safe to do.

So it looks like I should revert that one then, as it'll cause an unconditional build failure in the tools part of the build afaict. Plus it's not clear how to properly express the now variable size to the checking logic, i.e. resolving the issue may take some time.

Jan
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 07.01.15 at 15:42, boris.ostrov...@oracle.com wrote:
On 01/07/2015 04:06 AM, Jan Beulich wrote:
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:

--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -56,6 +56,8 @@ struct pci_dev {
     u8 phantom_stride;
+    int node; /* NUMA node */

I thought I asked about this on v1 already: Does this really need to be an int, when commonly node numbers are stored in u8/unsigned char? Shrinking the field size would prevent the structure size from growing...

I kept this field as an int to be able to store NUMA_NO_NODE which I thought to be (int)-1. But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to (int)-1 by pxm_to_node(). Given that there are a number of tests for NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() return u8 as well?

I think that would make sense, together with fixing up one of the three callers in VT-d code (from alloc_pgtable_maddr()); the other two look correct already.

Of course an additional question would be whether the node wouldn't better go into struct arch_pci_dev - that depends on whether we expect ARM to be using NUMA...

Since we have CPU topology in common code I thought this would be arch-independent as well.

Not sure what you're referring to here: What common piece of data stores the node of a particular CPU? cpu_to_node[] clearly is x86-specific.

Jan
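The sentinel mismatch described above (a stored 0xff that turns into (int)-1 on the way out of the lookup) can be reproduced in isolation. This is a minimal sketch, not the hypervisor's pxm_to_node(): the table, its size and every `demo_` name are hypothetical, and the conversion through int8_t relies on the usual two's-complement behaviour of common compilers.

```c
#include <stdint.h>

#define DEMO_NUMA_NO_NODE 0xFF  /* sentinel value quoted in the thread */
#define DEMO_NR_PXM       4     /* hypothetical table size */

/* Hypothetical PXM table: entry 2 has no node assigned. */
static const uint8_t demo_pxm2node[DEMO_NR_PXM] = { 0, 1, DEMO_NUMA_NO_NODE, 2 };

/* int-returning lookup: the stored 0xff is sign-extended to -1. */
int demo_pxm_to_node_int(unsigned pxm)
{
    if (pxm >= DEMO_NR_PXM)
        return -1;
    return (int8_t)demo_pxm2node[pxm];
}

/* u8-returning lookup: the stored 0xff is returned unchanged. */
uint8_t demo_pxm_to_node_u8(unsigned pxm)
{
    if (pxm >= DEMO_NR_PXM)
        return DEMO_NUMA_NO_NODE;
    return demo_pxm2node[pxm];
}

/* What a caller sees when it compares the int result with the sentinel. */
int demo_int_result_matches_sentinel(unsigned pxm)
{
    return demo_pxm_to_node_int(pxm) == DEMO_NUMA_NO_NODE;
}
```

In this sketch a caller that tests the int result against the 0xff sentinel never sees a match for the unmapped entry, while the u8 variant compares equal. That is the kind of inconsistency the proposal to return u8 would remove.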
Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology
On 07.01.15 at 15:55, boris.ostrov...@oracle.com wrote:
On 01/07/2015 04:21 AM, Jan Beulich wrote:
On 06.01.15 at 03:18, boris.ostrov...@oracle.com wrote:

+for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
+{
+    xen_sysctl_pcitopo_t pcitopo;
+    struct pci_dev *pdev;
+
+    if ( copy_from_guest_offset(&pcitopo, ti->pcitopo,
+                                ti->first_dev, 1) )
+    {
+        ret = -EFAULT;
+        break;
+    }
+
+    spin_lock(&pcidevs_lock);
+    pdev = pci_get_pdev(pcitopo.pcidev.seg, pcitopo.pcidev.bus,
+                        pcitopo.pcidev.devfn);
+    if ( !pdev || (pdev->node == NUMA_NO_NODE) )
+        pcitopo.node = INVALID_TOPOLOGY_ID;
+    else
+        pcitopo.node = pdev->node;

Are hypervisor-internal node numbers really meaningful to the caller?

This is the same information (pxm -> node mapping) that we provide in XEN_SYSCTL_topologyinfo (renamed in this series to XEN_SYSCTL_cputopoinfo). Given that I expect the two topologies to be used together I think the answer is yes.

Building your argumentation on potentially mis-designed existing interfaces is bogus. The question is - what use is a Xen internal node number to a caller of a particular hypercall (other than it being purely informational, e.g. for printing human readable output)? In particular if we were to introduce a new non-sysctl interface, determining whether the hypervisor internal representation is really the right one to expose here should be one of the most important design aspects. I personally think that exposing e.g. the firmware determined (and hence hopefully stable across reboots) PXM would be more reasonable.

@@ -463,7 +464,7 @@ typedef struct xen_sysctl_lockprof_op xen_sysctl_lockprof_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_lockprof_op_t);
 /* XEN_SYSCTL_cputopoinfo */
-#define INVALID_TOPOLOGY_ID (~0U)
+#define INVALID_TOPOLOGY_ID (~0U) /* Also used by pcitopo */

Better extend the preceding comment.

I mentioned it to Wei yesterday that the file is structured (or at least appears to me to be structured) in such a way that these top comments mark sections of definitions for each sysctl. And so I thought that I'd be breaking this convention if I were to extend the comment.

Yeah, I saw that discussion after having replied. Looking more closely, I think the placement of the INVALID_TOPOLOGY_ID definition is sub-optimal when you want to use it in a second place. Move it more towards the beginning of the header, leave the current comment as is, and if you feel so add a new comment ahead of it explaining which operations are using it. And then, if we go the non-sysctl route, the definition would likely need moving into xen.h anyway.

Jan
Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored
Olaf Hering writes (Re: [PATCH 7/7] tools/hotplug: add wrapper to start xenstored):

If I recall correctly the point of the current 'sh -c exec ...' stunt was to expand the XENSTORE variable from the sysconfig file. But this approach leads to failures with SELinux because the socket passing does not work this way. Up to now I have not seen a success report for selinux+systemd+xenstored. Maybe it's already somewhere in the other unread mails.

The selinux policy should follow the actual code, not vice versa. That is, if the approach which we select (based on all the other criteria) is not compatible with existing selinux policies, this should be fixed by changing the selinux policies.

Since the selinux policies are not in xen.git, and are not maintained as part of the Xen Project, there is no reason to delay introducing changes in xen.git#master which are known to be incompatible with some selinux policies.

My conclusion therefore is that selinux policies are an irrelevant consideration when deciding what the scripts, systemd integration, etc. should look like in xen.git#master. (And what applies to xen.git#master applies to the as-yet-unreleased xen.git#staging-4.5 too.)

Hopefully someone with access to a SELinux enabled system will report which approach actually works.

I have concluded that the right approach is to disregard selinux. Developers of selinux-enforcing setups should update the selinux policies to support what the upstream Xen Project code does.

Thanks, Ian.
Re: [Xen-devel] [PATCH] README, xen/Makefile: Update to Xen 4.5.0
On Tue, Jan 06, 2015 at 09:14:22PM +0200, Pasi Kärkkäinen wrote:
On Tue, Jan 06, 2015 at 01:21:58PM -0500, Konrad Rzeszutek Wilk wrote:
On Tue, Jan 06, 2015 at 06:06:23PM +, Ian Jackson wrote:
Konrad Rzeszutek Wilk writes ([Xen-devel] [PATCH] README, xen/Makefile: Update to Xen 4.5.0):

-The 4.3 release offers a number of improvements, including NUMA
-scheduling affinity, openvswitch integration, and defaulting to
-qemu-xen rather than qemu-traditional for non-stubdom guests.
-(qemu-xen is kept very close to the upstream project.) We also have a
-number of updates to vTPM, and improvements to XSM and Flask to allow
-greater disaggregation. Additionally, 4.3 contains a basic version of
-Xen for the new ARM server architecture, both 32- and 64-bit. And as
-always, there are a number of performance, stability, and security
+The 4.5 release offers a number of improvements: including shedding

Should read

+The 4.5 release offers a number of improvements, including: shedding

(note two punctuation changes) and the list items should all be separated with semicolons IMO.

Thank you for your update. I've incorporated feedback from all folks I hope. Would this be satisfactory?

Do we want to mention things like HVM guest direct kernel boot, or HVM guests MMIO hole resize support?

"and support in QEMU for expanding the PCI hole." which is in the README

HVM guest direct kernel boot - I forgot about that!

And I think there were optimizations to oxenstored to support up to 1000 VMs per host..

.. and that.

diff --git a/README b/README
index 4a9cac1..e2c9e83 100644
--- a/README
+++ b/README
@@ -43,8 +43,9 @@
 guests; and lower interrupt latency. The toolstack has expanded to include support for: VM Generation ID (a Windows 2012 Server requirement); Remus initial support (for high availability) in libxl (since xend has been removed); libxenlight JSON
-support and persistent configuration support, and systemd support; and
-support in QEMU for expanding the PCI hole.
+support, HVM guest direct kernel boot, and persistent configuration
+support; systemd support; optimizations in oxenstored to support more
+than 1000+ VMs; and support in QEMU for expanding the PCI hole.
 Lastly, we have removed the Python toolstack (xend).
[Xen-devel] [CALL-FOR-AGENDA] Monthly Xen.org Technical Call (2015-01-14)
The first Xen technical call will be at:

Wed 14 Jan 17:00:00 GMT 2015
`date -d @1421254800`

See http://lists.xen.org/archives/html/xen-devel/2015-01/msg00414.html for more information on the call.

Please let me know (CC-ing the list) any topics which you would like to discuss. It might be useful to include:
* References to any relevant/recent mailing list threads;
* Other people who you think should be involved in the discussion (and CC them);

If you would like to attend then please let me know so I can send you the dial in details.
Re: [Xen-devel] [CALL-FOR-AGENDA] Monthly Xen.org Technical Call (2015-01-14)
On Wed, Jan 07, 2015 at 03:32:15PM +, Ian Campbell wrote:

The first Xen technical call will be at:

Wed 14 Jan 17:00:00 GMT 2015
`date -d @1421254800`

See http://lists.xen.org/archives/html/xen-devel/2015-01/msg00414.html for more information on the call.

Please let me know (CC-ing the list) any topics which you would like to discuss. It might be useful to include:
* References to any relevant/recent mailing list threads;
* Other people who you think should be involved in the discussion (and CC them);

- Xen 4.5 retrospective (aka - what was good, bad, etc - the idea is to improve in the future)

If you would like to attend then please let me know so I can send you the dial in details.
[Xen-devel] [PATCH 1/4] dt-uart: add an emacs magic block
Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
 xen/drivers/char/dt-uart.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/xen/drivers/char/dt-uart.c b/xen/drivers/char/dt-uart.c
index fa92b5c..45a87a6 100644
--- a/xen/drivers/char/dt-uart.c
+++ b/xen/drivers/char/dt-uart.c
@@ -70,3 +70,13 @@ void __init dt_uart_init(void)
     if ( ret )
         printk("Unable to initialize serial: %d\n", ret);
 }
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
--
1.7.10.4
Re: [Xen-devel] [PATCH v2 1/4] pci: Do not ignore device's PXM information
On 07.01.15 at 16:34, boris.ostrov...@oracle.com wrote:
On 01/07/2015 10:07 AM, Jan Beulich wrote:
On 07.01.15 at 15:47, andrew.coop...@citrix.com wrote:
On 07/01/15 14:42, Boris Ostrovsky wrote:

I kept this field as an int to be able to store NUMA_NO_NODE which I thought to be (int)-1. But now I see that NUMA_NO_NODE is, in fact, 0xff but is promoted to (int)-1 by pxm_to_node(). Given that there is a number of tests for NUMA_NO_NODE and not for (int)-1, should we then make pxm_to_node() return u8 as well?

I noticed this as well, and found it quite counter intuitive. I would suggest fixing NUMA_NO_NODE to -1 and removing some of the type-punning.

I have to admit that I see no value in wasting 4 bytes for something that for the foreseeable future won't exceed 1 byte.

The downside of going to u8 is that we'd be limiting number of nodes to 254, which is somewhat awkward. OTOH we already do this by testing nodeID against 0xff in various places.

With NODES_SHIFT being 6 and hence MAX_NUMNODES being 0x40, we can't reach 254 right now anyway.

Jan
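The closing arithmetic can be written down as a small sketch. The shift value 6 and the 0xff sentinel are taken from the message above; the `DEMO_` and `demo_` names are hypothetical stand-ins and not the hypervisor's definitions.

```c
#define DEMO_NODES_SHIFT   6                          /* value quoted in the thread */
#define DEMO_MAX_NUMNODES  (1u << DEMO_NODES_SHIFT)   /* 0x40, i.e. 64 nodes */
#define DEMO_NUMA_NO_NODE  0xFFu                      /* u8 sentinel */

/* Highest node ID that can occur with the shift above. */
unsigned demo_max_valid_node(void)
{
    return DEMO_MAX_NUMNODES - 1;
}

/* 1 if a node ID is valid and cannot be confused with the sentinel. */
int demo_node_fits_u8(unsigned node)
{
    return node < DEMO_MAX_NUMNODES && node != DEMO_NUMA_NO_NODE;
}
```

With a shift of 6 the largest node ID is 63, well below 0xff, so a u8 field with 0xff reserved as the sentinel loses nothing at that setting.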
Re: [Xen-devel] [win-pv-devel] XenBus_AddWatch
Yea, I debugged through Harper's code and it seems like the error response is coming from XenStore on Dom0. I've been looking at it, and OXenStored is running on Dom0. Of all the languages possible, they decided to write xenstore in OCaml. I am not very good at OCaml and don't know anyone that is, so I could work with someone to help them fix this bug if there is interest.

I cannot watch a xenstore node, even when the DomU that I'm watching from owns the node and allows full access to everyone else that wants to access the node. Permissions are bU where U is the DomID of the DomU doing the watching.

Is there any requirement for consistency of programming languages and use of common languages across the different subprojects in Xen?

On Tue, Jan 6, 2015 at 5:34 AM, Paul Durrant paul.durr...@citrix.com wrote:

-Original Message-
From: win-pv-devel-boun...@lists.xenproject.org [mailto:win-pv-devel-boun...@lists.xenproject.org] On Behalf Of hanji unit
Sent: 31 December 2014 15:15
To: win-pv-de...@lists.xenproject.org; xen-devel@lists.xen.org
Subject: [win-pv-devel] XenBus_AddWatch

Hello, I am calling the XenBus_AddWatch API from a DomU guest in the win-pvdrivers xenpci driver, and noticed that I am not able to watch xenstore entries that are outside of the DomU's xenstore tree. For example, the following call in XenPci_EvtDeviceD0EntryPostInterruptsEnabled fails with response=EIO even if the xenstore permissions for Container and Container/DomU in xenstore are both b0:

response = XenBus_AddWatch(xpdd, XN_BASE_GLOBAL, "Container/DomU", MyCallback, xpdd);

However, the following call works and it watches a xenstore entry relative to the DomU's xenstore tree:

response = XenBus_AddWatch(xpdd, XBT_NIL, "DomU", MyCallback, xpdd);

Writing to an entry outside the DomU's tree is allowed if permissions are b0:

result = XnWriteString(xpdd, XN_BASE_GLOBAL, "Container/DomU", buffer);

It seems like DomUs should be allowed to watch xenstore entries outside their trees. Is this a bug or is it by design?

Hi, The API you're referring to is part of James Harper's GPLPV drivers I believe, which I'm no expert on. I would imagine the API simply passes error codes back from xenstored though, so you should probably check xenstored-access.log, although EIO does sound like an odd code to get back.

Paul

Thanks.

___ win-pv-devel mailing list win-pv-de...@lists.xenproject.org http://lists.xenproject.org/cgi-bin/mailman/listinfo/win-pv-devel
Re: [Xen-devel] [PATCH RFC] libxl: fix paths in capability string
Wei Liu wrote:

Jim, another idea: if those strings are likely to be wrong and in fact not used, can we just not print them?

IMO they are useful, particularly when they are correct :-). They allow users to see which emulators are available and their complete path, which in turn can be directly used in the emulator element in domainXML.

I do agree that a configure option for libvirt is the way to go here, allowing packagers to specify where these binaries exist. I recall discussions at a past Xen hackathon around the preference to use the distro's qemu instead of the one provided by Xen. A configure option for libvirt would facilitate that.

Regards, Jim
Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask
Andrew Cooper writes (Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask):

Other culprits are xc_get_max_nodes(), xc_get_max_cpus(), 4 instances in xc_psr.c and most things in xc_offline_page.c which appears to have static structures for domain context. The pluggable loader infrastructure in xc_dom.c also appears to be thread-unsafe. xc_dom_decompress_unsafe.c also uses static data, but unsafe in the name might be a sufficient guard?

I will look at these tomorrow.

No aggressively optimising compiler is going to perform partial writes on a naturally aligned integer, so I stand by my comment when applied to the common case.

You misunderstand. An aggressively optimising compiler might be able to prove (perhaps through whole program analysis - we have link-time optimisation nowadays) various falsehoods about the way these variables are used. The resulting generated machine code might be arbitrarily bad, up to and including missing important parts of the whole program.

I'm not aware of any compilers which currently take advantage of thread safety bugs (really, just spec-violations) but I think this is just a matter of time.

Ian.
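The distinction drawn above (a plain static variable shared between threads is a data race by the language rules, whether or not the hardware tears the write) can be illustrated with C11 atomics. This is a sketch under stated assumptions and not libxc code: `demo_query_max_cpus()` is a hypothetical stand-in for an idempotent query, and the value 8 is made up.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for an expensive, idempotent query. */
static int demo_query_max_cpus(void)
{
    return 8;
}

/* Lazily filled cache; 0 means "not queried yet". */
static atomic_int demo_cached_max_cpus;

/*
 * Return the cached value, querying on first use. Several threads may
 * race to fill the cache, but every access is atomic and every writer
 * stores the same value, so the behaviour stays defined.
 */
int demo_get_max_cpus(void)
{
    int v = atomic_load_explicit(&demo_cached_max_cpus, memory_order_relaxed);

    if (v == 0) {
        v = demo_query_max_cpus();
        atomic_store_explicit(&demo_cached_max_cpus, v, memory_order_relaxed);
    }
    return v;
}
```

Relaxed ordering is enough here only because the cached integer is the whole of the shared state. A cache that publishes a pointer to a larger structure would need release/acquire ordering or a lock.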
[Xen-devel] Mapping Data between Dom0 and DomU
Hello,

My name is Christian and I am new to Xen development, and I've been struggling a bit. I'm trying to develop a device driver so that a Windows 2012 Server VM has a way to send about 10MB of data to a CentOS VM. There is no real device on the backend, I just need a way to constantly send a large buffer between DomU and Dom0 using a zero copy solution.

I wrote a Linux device driver (simple char device) in Dom0 which allocates the memory and reserves the pages. However, I'm not sure how to offer these pages to the Windows VM. I also have a Windows device driver that I've installed in the Windows VM, but again I'm not sure how to connect the two drivers.

I've been browsing the QEMU source code, the Windows PV source code, the Xen source code, and the Definitive Guide to the Xen Hypervisor for answers. Unfortunately, I am a bit lost. I'm hoping someone with some experience with shared data communication between Dom0 and DomU can point me at some source files to study in order to figure out how to implement what I'm trying to do. A tutorial, sample code, and/or general direction would be great.

Thanks for your help,
Christian
[Xen-devel] [qemu-mainline bisection] complete test-amd64-i386-xl-qemuu-winxpsp3
branch xen-unstable xen branch xen-unstable job test-amd64-i386-xl-qemuu-winxpsp3 test windows-install Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: qemu git://xenbits.xen.org/staging/qemu-xen-unstable.git Tree: qemuu git://git.qemu.org/qemu.git Tree: xen git://xenbits.xen.org/xen.git *** Found and reproduced problem changeset *** Bug is in tree: qemuu git://git.qemu.org/qemu.git Bug introduced: 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6 Bug not present: 60fb1a87b47b14e4ea67043aa56f353e77fbd70a commit 49d2e648e8087d154d8bf8b91f27c8e05e79d5a6 Author: Marcel Apfelbaum marce...@redhat.com Date: Tue Dec 16 16:58:05 2014 + machine: remove qemu_machine_opts global list QEMU has support for options per machine, keeping a global list of options is no longer necessary. Signed-off-by: Marcel Apfelbaum marce...@redhat.com Reviewed-by: Alexander Graf ag...@suse.de Reviewed-by: Greg Bellows greg.bell...@linaro.org Message-id: 1418217570-15517-2-git-send-email-marce...@redhat.com Signed-off-by: Peter Maydell peter.mayd...@linaro.org For bisection revision-tuple graph see: http://www.chiark.greenend.org.uk/~xensrcts/results/bisect.qemu-mainline.test-amd64-i386-xl-qemuu-winxpsp3.windows-install.html Revision IDs in each graph node refer, respectively, to the Trees above. Searching for failure / basis pass: 33123 fail [host=lace-bug] / 32598 ok. 
Failure / basis pass flights: 33123 / 32598 (tree with no url: seabios) Tree: linux git://xenbits.xen.org/linux-pvops.git Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git Tree: qemu git://xenbits.xen.org/staging/qemu-xen-unstable.git Tree: qemuu git://git.qemu.org/qemu.git Tree: xen git://xenbits.xen.org/xen.git Latest 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b Basis pass 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 7e58e2ac7778cca3234c33387e49577bb7732714 36174af3fbeb1b662c0eadbfa193e77f68cc955b Generating revisions with ./adhoc-revtuple-generator git://xenbits.xen.org/linux-pvops.git#83a926f7a4e39fb6be0576024e67fe161593defa-83a926f7a4e39fb6be0576024e67fe161593defa git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/staging/qemu-xen-unstable.git#b0d42741f8e9a00854c3b3faca1da84bfc69bf22-b0d42741f8e9a00854c3b3faca1da84bfc69bf22 git://git.qemu.org/qemu.git#7e58e2ac7778cca3234c33387e49577bb7732714-ab0302ee764fd702465aef6d88612cdff4302809 git://xenbits.xen.org/xen.git#36174af3fbeb1b662c0eadbfa193e77f68cc955b-36174af3fbeb1b662c0eadbfa193e77f68cc955b + exec + sh -xe + cd /export/home/osstest/repos/qemu + git remote set-url origin git://drall.uk.xensource.com:9419/git://git.qemu.org/qemu.git + git fetch -p origin +refs/heads/*:refs/remotes/origin/* + exec + sh -xe + cd /export/home/osstest/repos/qemu + git remote set-url origin git://drall.uk.xensource.com:9419/git://git.qemu.org/qemu.git + git fetch -p origin +refs/heads/*:refs/remotes/origin/* Loaded 1005 nodes in revision graph Searching for test results: 32585 pass irrelevant 32598 pass 83a926f7a4e39fb6be0576024e67fe161593defa 
c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 7e58e2ac7778cca3234c33387e49577bb7732714 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32611 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32626 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32689 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32659 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32876 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22 ab0302ee764fd702465aef6d88612cdff4302809 36174af3fbeb1b662c0eadbfa193e77f68cc955b 32854 fail 83a926f7a4e39fb6be0576024e67fe161593defa c530a75c1e6a472b0eb9558310b518f0dfcd8860 b0d42741f8e9a00854c3b3faca1da84bfc69bf22
Re: [Xen-devel] Mapping Data between Dom0 and DomU
On 08/01/2015 01:22, Christian Refvik wrote:

Hello,

My name is Christian and I am new to Xen development, and I've been struggling a bit. I'm trying to develop a device driver so that a Windows 2012 Server VM has a way to send about 10MB of data to a CentOS VM. There is no real device on the backend, I just need a way to constantly send a large buffer between DomU and Dom0 using a zero copy solution.

I wrote a Linux device driver (simple char device) in Dom0 which allocates the memory and reserves the pages. However, I'm not sure how to offer these pages to the Windows VM. I also have a Windows device driver that I've installed in the Windows VM, but again I'm not sure how to connect the two drivers.

I've been browsing the QEMU source code, the Windows PV source code, the Xen source code, and the Definitive Guide to the Xen Hypervisor for answers. Unfortunately, I am a bit lost. I'm hoping someone with some experience with shared data communication between Dom0 and DomU can point me at some source files to study in order to figure out how to implement what I'm trying to do. A tutorial, sample code, and/or general direction would be great.

Thanks for your help,
Christian

You will want to use grants, and in particular grant mappings, which is the Xen interface for creating shared memory between VMs.

The basic premise is that domain A nominates some of its pages to be grantable. Domain A gives the grant references to domain B (usually negotiated via xenstore), which allows domain B to make a mapping of domain A's nominated pages.

A relevant bit of code to look at would be tools/libvchan/, which as far as I am aware does pretty much what you describe (but without the Windows support).

~Andrew
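The grant premise described above (nominate a page, hand over a reference, map by reference) can be modelled as a toy, in-process table. This is only a sketch of the bookkeeping: it makes no hypercalls, shares no memory between real domains, and every `demo_` name is hypothetical rather than part of any Xen interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_GRANT_ENTRIES 8
#define DEMO_INVALID_REF   UINT32_MAX

/* One slot of domain A's toy grant table. */
struct demo_grant_entry {
    int      in_use;
    uint16_t peer_domid;   /* the only domain allowed to map this page */
    void    *page;
};

struct demo_grant_table {
    struct demo_grant_entry e[DEMO_GRANT_ENTRIES];
};

/* Domain A nominates a page for one peer; returns the grant reference. */
uint32_t demo_grant_access(struct demo_grant_table *gt, uint16_t peer, void *page)
{
    assert(gt != NULL && page != NULL);
    for (uint32_t ref = 0; ref < DEMO_GRANT_ENTRIES; ref++) {
        if (!gt->e[ref].in_use) {
            gt->e[ref].in_use = 1;
            gt->e[ref].peer_domid = peer;
            gt->e[ref].page = page;
            return ref;
        }
    }
    return DEMO_INVALID_REF;   /* table full */
}

/* Domain B maps by reference; only the nominated peer succeeds. */
void *demo_map_grant(struct demo_grant_table *gt, uint16_t caller, uint32_t ref)
{
    assert(gt != NULL);
    if (ref >= DEMO_GRANT_ENTRIES || !gt->e[ref].in_use ||
        gt->e[ref].peer_domid != caller)
        return NULL;
    return gt->e[ref].page;
}
```

In a real setup the reference would be published through xenstore and the mapping would be performed by the hypervisor on behalf of the peer domain. tools/libvchan/ is the place to look for an actual implementation of that flow.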
[Xen-devel] Question about significant network performance difference after pin RX netback to vcpu0
Hi,

I am trying to test the single-queue networking performance for netback/netfront in upstream. My testing environment is as follows:

1. Using pkt-gen to send a single UDP flow from one host to a vm which runs on another XEN host. The two hosts are connected with a 10GE network (Intel 82599 NIC, which has multiqueue support)
2. The receiver XEN host uses xen 4.4, and dom0's OS is Linux 3.17.4 which already has multiqueue netback support
3. The receiver XEN host's CPU is NUMA, and cpu0-cpu7 belong to the same node
4. The receiver Dom0 has 6 VCPU and 6G Memory, and the vcpu/mem is pinned to host by using dom0_vcpu_num=4 dom0_vcpus_pin dom0_mem=6144M args during boot
5. The receiver VM has 4 vcpu and 4G memory, and vcpu0 is pinned to pcpu 6 which means the vcpu0 is running on the same numa node as dom0

During testing, I have modified the sender host's IP address to make the receiver host and vm behave as follows:

1. Queue 5 of the NIC in Dom0 handles the flow, which can be confirmed from /proc/interrupts. As a result, the ksoftirqd/5 process will run with some cpu usage during testing, which can be confirmed from the result of the top command
2. Queue 2 of the netback vif handles the flow, which can be confirmed from the result of the top command in Dom0 (which shows that the vif1.0-guest-q2 process runs with some cpu usage)
3. The ksoftirqd/0 (which is running on vcpu0) in DomU runs with some high cpu usage, and it seems that this process handles the soft interrupt of RX

However, I find some strange phenomena as follows:

1. All RX interrupts of the netback vif are only sent to queue 0, which can be confirmed from the content of the /proc/interrupts file in Dom0. But rather than ksoftirqd/0, it seems that ksoftirqd/2 will handle the soft interrupt and run with high cpu usage when the vif1.0-guest-q2 process runs on vcpu2. But why doesn't ksoftirqd/0 handle the soft interrupt, given that the RX interrupts are sent to queue 0? I have also tried to make another queue of the netback VIF (e.g. vif1.0-guest-q1, vif1.0-guest-q3, etc) handle the flow, but all RX interrupts are still only sent to vcpu0. I am wondering what the reason for this is.
2. The RX interrupts are sent to queue 2 of the vnic in DomU, but it seems that it is ksoftirqd/0 rather than ksoftirqd/2 that handles the soft interrupt.
3. If and only if I pin the vif1.0-guest-q2 process to vcpu0 of Dom0, the throughput of the flow is higher (it can improve by as much as double). This is the strangest phenomenon I find, and I am wondering what the reason for it is.

Could anyone give me some hint about it? If some things are unclear, please tell me and I will give a more detailed description. Thanks.

---
Best Regards
Trump
Re: [Xen-devel] [PATCH] VT-d: don't crash when PTE bits 52 and up are non-zero
On Wed, Jan 07, 2015 at 10:15:39AM +, Jan Beulich wrote:
On 23.12.14 at 07:52, kevin.t...@intel.com wrote:
From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Friday, December 19, 2014 7:26 PM

This can (and will) be legitimately the case when sharing page tables with EPT (more of a problem before p2m_access_rwx became zero, but still possible even now when other than that is the default for a guest), leading to an unconditional crash (in print_vtd_entries()) when a DMA remapping fault occurs.

could you elaborate the scenarios when bits 52+ are non-zero?

Signed-off-by: Jan Beulich jbeul...@suse.com

but the changes looks reasonable to me.

Signed-off-by: Kevin Tian kevin.t...@intel.com

I translated this to a Reviewed-by, as S-o-b doesn't seem to make sense here.

Konrad - please indicate whether this can also go into 4.5.

Yes.

Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com

Jan
Re: [Xen-devel] [PATCH v2 2/4] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient
On 07.01.15 at 15:45, boris.ostrov...@oracle.com wrote:
On 01/07/2015 04:12 AM, Jan Beulich wrote:
On 06.01.15 at 14:41, andrew.coop...@citrix.com wrote:
On 06/01/15 02:18, Boris Ostrovsky wrote:

Instead of copying data for each field in xen_sysctl_topologyinfo separately put cpu/socket/node into a single structure and do a single copy for each processor.

There is also no need to copy whole op to user at the end, max_cpu_index is sufficient

Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact that these are used for CPU topology. Subsequent patch will add support for PCI topology sysctl.

Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com

If we are going to change the hypercall, then can we see about making it a stable interface (i.e. not a sysctl/domctl)? There are non-toolstack components which might want/need access to this information. (i.e. I am still looking for a reasonable way to get this information from Xen in hwloc)

In which case leaving the sysctl alone and just adding a new non-sysctl interface should be considered.

I'd expect IO NUMA information to be used together with CPU topology information so I am not sure how useful this would be. Unless we create a similar interface for that (CPU/memory) as well.

Creating a new CPU topology interface while leaving alone the current sysctl was what I meant to suggest. The I/O topology one would then become a sibling to the CPU one.

Jan
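The approach in the patch description quoted above (one structure per processor, copied once, instead of one copy per field) can be sketched in plain C. This is only an illustration under stated assumptions: `sketch_cputopo` and `sketch_fill_topology` are hypothetical names, and `memcpy()` merely stands in for whatever copy-to-caller step the hypervisor performs; it is not the actual sysctl layout.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-CPU record: core, socket and node travel together. */
struct sketch_cputopo {
    uint32_t core;
    uint32_t socket;
    uint32_t node;
};

/*
 * Fill at most dst_len records from three per-field arrays, doing one
 * copy per processor. memcpy() is an assumed stand-in for the copy into
 * the caller's buffer. Returns the number of records written.
 */
unsigned int sketch_fill_topology(struct sketch_cputopo *dst,
                                  unsigned int dst_len,
                                  const uint32_t *core,
                                  const uint32_t *socket,
                                  const uint32_t *node,
                                  unsigned int nr_cpus)
{
    unsigned int i, n = nr_cpus < dst_len ? nr_cpus : dst_len;

    for (i = 0; i < n; i++) {
        struct sketch_cputopo rec = { core[i], socket[i], node[i] };

        memcpy(&dst[i], &rec, sizeof(rec)); /* single copy for this CPU */
    }
    return n;
}
```

With three separate arrays the same loop would need three copies per processor; grouping the fields reduces that to one, at the cost of fixing the record layout in the interface.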
Re: [Xen-devel] [PATCH 3/3] x86/xen: optimize get_phys_to_machine()
On 07/01/15 15:18, Juergen Gross wrote:
On 01/07/2015 03:47 PM, David Vrabel wrote:

The page table walk is only needed to distinguish between identity and missing, both of which have INVALID_P2M_ENTRY.

As get_phys_to_machine is called by __pfn_to_mfn() only which already checks for mfn == INVALID_P2M_ENTRY this optimization will have an effect only in the early boot case with pfn >= xen_p2m_size. I doubt this is necessary.

Doh. Now I remember suggesting this optimization before and getting the same answer from you. I'll drop this patch.

David
Re: [Xen-devel] [Bugfix] x86/apic: Fix xen IRQ allocation failure caused by commit b81975eade8c
On 2015/1/7 22:50, Konrad Rzeszutek Wilk wrote:
On Wed, Jan 07, 2015 at 02:13:49PM +0800, Jiang Liu wrote:

Commit b81975eade8c (x86, irq: Clean up irqdomain transition code) breaks xen IRQ allocation because xen_smp_prepare_cpus() doesn't invoke setup_IO_APIC(), so no irqdomains created for IOAPICs and mp_map_pin_to_irq() fails at the very beginning.

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2369,31 +2369,29 @@ static void ioapic_destroy_irqdomain(int idx)
     ioapics[idx].pin_info = NULL;
 }
-void __init setup_IO_APIC(void)
+void __init setup_IO_APIC(bool xen_smp)
 {
     int ioapic;
-    /*
-     * calling enable_IO_APIC() is moved to setup_local_APIC for BP
-     */
-    io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
+    if (!xen_smp) {
+        apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
+        io_apic_irqs = nr_legacy_irqs() ? ~PIC_IRQS : ~0UL;
+
+        /* Set up IO-APIC IRQ routing. */
+        x86_init.mpparse.setup_ioapic_ids();
+        sync_Arb_IDs();
+    }

Is there a specific reason that this cannot run in all cases? What I am asking is why are we doing a special case here? The description at the top implied that we were just missing a call to setup_IO_APIC..

Hi Konrad,
I'm not very familiar with Xen IRQ yet, so I just enabled the code to create irqdomains for IOAPICs and keep other part as is. I will try to check whether we could enable other part all together:)
Regards!
Gerry

-    apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
     for_each_ioapic(ioapic)
         BUG_ON(mp_irqdomain_create(ioapic));
-
-    /*
-     * Set up IO-APIC IRQ routing.
-     */
-    x86_init.mpparse.setup_ioapic_ids();
-
-    sync_Arb_IDs();
     setup_IO_APIC_irqs();
-    init_IO_APIC_traps();
-    if (nr_legacy_irqs())
-        check_timer();
-
     ioapic_initialized = 1;
+
+    if (!xen_smp) {
+        init_IO_APIC_traps();
+        if (nr_legacy_irqs())
+            check_timer();
+    }
 }
 /*
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 4c071aeb8417..7eb0283901fa 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -326,7 +326,10 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
         xen_raw_printk(m);
         panic(m);
+    } else {
+        setup_IO_APIC(true);
     }
+
     xen_init_lock_cpu(0);
     smp_store_boot_cpu_info();
--
1.7.10.4
Re: [Xen-devel] [PATCH 7/7] tools/hotplug: add wrapper to start xenstored
On Wed, Jan 07, 2015 at 03:27:15PM +, Ian Jackson wrote:
Olaf Hering writes (Re: [PATCH 7/7] tools/hotplug: add wrapper to start xenstored):

If I recall correctly the point of the current 'sh -c exec ...' stunt was to expand the XENSTORE variable from the sysconfig file. But this approach leads to failures with SELinux because the socket passing does not work this way. Up to now I have not seen a success report for selinux+systemd+xenstored. Maybe it's already somewhere in the other unread mails.

The selinux policy should follow the actual code, not vice versa. That is, if the approach which we select (based on all the other criteria) is not compatible with existing selinux policies, this should be fixed by changing the selinux policies.

Since the selinux policies are not in xen.git, and are not maintained as part of the Xen Project, there is no reason to delay introducing changes in xen.git#master which are known to be incompatible with some selinux policies.

My conclusion therefore is that selinux policies are an irrelevant consideration when deciding what the scripts, systemd integration, etc. should look like in xen.git#master. (And what applies to xen.git#master applies to the as-yet-unreleased xen.git#staging-4.5 too.)

Hopefully someone with access to a SELinux enabled system will report which approach actually works.

I have concluded that the right approach is to disregard selinux. Developers of selinux-enforcing setups should update the selinux policies to support what the upstream Xen Project code does.

... which is none. We don't ship any SELinux policies.

Anyhow I concur with the sentiment, which is why I was aiming at just having a release note about the SELinux part - and having this patch not worry that much about SELinux and instead be satisfactory to you and IanC.

Olaf, that hopefully would make it easier for you to come up with a nice patch ?

Thanks,
Ian.
Re: [Xen-devel] [PATCH 4/4] dt-uart: support /chosen/stdout-path property.
On Wed, 2015-01-07 at 16:42 +, Julien Grall wrote:

+
+    if ( chosen )
+    {
+        ret = dt_property_read_string(chosen, "stdout-path", &stdout);
+        if ( ret >= 0 )
+        {
+            printk("Taking dtuart configuration from /chosen/stdout-path\n");
+            strlcpy(opt_dtuart, stdout, sizeof(opt_dtuart));

The final string in opt_dtuart may be truncated if stdout is bigger than 255 characters. I would add a check to avoid hours of debugging later.

Good point. I suppose it may as well warn and continue: hypothetically the truncation might only affect some non-critical options so the console might actually work, so we might as well try.

Ian.
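The "warn and continue" check suggested above could look roughly like the following. This is a minimal user-space sketch, not the Xen code: `sketch_copy_warn_truncated` is a hypothetical helper, `fprintf()` stands in for `printk()`, and the copy is done by hand so the example does not depend on a libc that provides `strlcpy()`.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst (dstsize bytes, always NUL-terminated when
 * dstsize > 0). Returns 1 and prints a warning if src did not fit,
 * 0 otherwise. The caller carries on in both cases.
 */
int sketch_copy_warn_truncated(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);
    size_t n;

    if (dstsize == 0)
        return 1;

    n = len < dstsize ? len : dstsize - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';

    if (len >= dstsize) {
        /* Warn rather than fail: the truncated part may be non-critical. */
        fprintf(stderr, "WARNING: stdout-path truncated to %zu characters\n", n);
        return 1;
    }
    return 0;
}
```

In code that already uses a BSD-style `strlcpy()`, the same check is usually written by comparing its return value (the length of the source string) against the destination size.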
Re: [Xen-devel] [PATCH 1/2] libxl_internal: lock_carefd -> carefd
Wei Liu writes ([PATCH 1/2] libxl_internal: lock_carefd -> carefd):

lock_ prefix is redundant.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

(for 4.6)
Re: [Xen-devel] [PATCH 1/2] tools/hotplug: introduce XENSTORED_ARGS= in sysconfig file.
Olaf Hering writes ([PATCH 1/2] tools/hotplug: introduce XENSTORED_ARGS= in sysconfig file.):

It is already used in the runlevel script and the service file. It is supposed to replace XENSTORED_TRACE= boolean, which can't be easily supported in the xenstored.service file.

I don't think it is right to desupport XENSTORED_TRACE= which has been supported in sysconfig.xencommons since at least Xen 4.1. Certainly removing this feature this late in the 4.5 release cycle is not appropriate.

Sorry,
Ian.
Re: [Xen-devel] Nominations for Xen 4.5 stable tree maintainer.
On Wed, 2015-01-07 at 16:33 +, Ian Jackson wrote:

for form's sake I don't want to just say +1 without giving others an opportunity to (self-)nominate.

I'd assumed I could change my vote any time up to the deadline, but it's also true that an existing +1 vote might have a chilling effect on others stepping forward which I hadn't considered.

Ian.
Re: [Xen-devel] [PATCH 2/2] libxl_internal: comment on domain userdata unlock function
Wei Liu writes ([PATCH 2/2] libxl_internal: comment on domain userdata unlock function):

Discuss why we need to unlink file path before closes fd.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

subject to minor grammar complaint:

Potential backport candidate for 4.5.1 ?

diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 9d8025d..a70214b 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -458,6 +458,20 @@ out:
 void libxl__unlock_domain_userdata(libxl__domain_userdata_lock *lock)
 {
+    /* It's important to unlink the file before closing fd to avoid
+     * such race (if close before unlink):

to avoid the following race.

Such must refer to a thing which precedes.

Ian.
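The ordering the comment argues for (unlink first, then close) can be illustrated with a small file-lock sketch. It is written under assumptions: the names `sketch_lock`, `sketch_lock_acquire` and `sketch_lock_release` are hypothetical, `flock()` is used as the locking primitive, and the acquire side re-checks that the locked descriptor still refers to the file at the path. It is not a copy of the libxl implementation.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

struct sketch_lock {
    int fd;
    const char *path;
};

/* Take an exclusive lock on path, retrying if the file was unlinked
 * by a previous holder between our open() and flock(). */
int sketch_lock_acquire(struct sketch_lock *l, const char *path)
{
    for (;;) {
        struct stat by_fd, by_path;
        int fd = open(path, O_RDWR | O_CREAT, 0666);

        if (fd < 0)
            return -1;
        if (flock(fd, LOCK_EX) < 0 || fstat(fd, &by_fd) < 0) {
            close(fd);
            return -1;
        }
        if (stat(path, &by_path) == 0) {
            if (by_fd.st_dev == by_path.st_dev &&
                by_fd.st_ino == by_path.st_ino) {
                l->fd = fd;       /* we hold the lock on the live file */
                l->path = path;
                return 0;
            }
        } else if (errno != ENOENT) {
            close(fd);
            return -1;
        }
        close(fd);                /* stale file: try again */
    }
}

/* Unlink while the lock is still held, then close. Closing first would
 * let another process lock the file just before we unlink it, after
 * which a third process could create and lock a fresh file. */
void sketch_lock_release(struct sketch_lock *l)
{
    unlink(l->path);
    close(l->fd);
    l->fd = -1;
}
```

With this order, any process that locked the old file after the unlink notices the inode mismatch (or the missing path) in the acquire loop and retries, so at most one process believes it holds the lock.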
Re: [Xen-devel] [PATCH v2 3/4] sysctl: Add sysctl interface for querying PCI topology
On 07.01.15 at 16:54, boris.ostrov...@oracle.com wrote:
On 01/07/2015 10:17 AM, Jan Beulich wrote:

I personally think that exposing e.g. the firmware determined (and hence hopefully stable across reboots) PXM would be more reasonable.

Again, the main argument that I see against using PXM values directly is the fact that it's not zero-based/non-contiguous.

I have to admit that I can't see why either of the aspects would matter. One thing coming to mind though is that the memory allocation interfaces want Xen node numbers passed in.

Jan
Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask
On Wed, Jan 07, 2015 at 04:37:40PM +, Ian Jackson wrote:
Andrew Cooper writes (Re: [Xen-devel] [PATCH v2 2/5] tools: add routine to get CMT L3 event mask):
On 07/01/15 11:12, Chao Peng wrote:

+int xc_psr_cmt_get_l3_event_mask(xc_interface *xch, uint32_t *event_mask)
+{
+    static int val = 0;

This should be uint32_t rather than int.

I am somewhat concerned about multithreaded use of libxc, but this is not the first issue in libxc, and probably shouldn't be held against this patch.

On the contrary, this is quite bad. libxc should be properly reentrant and I think it generally is. If you are aware of other global variables of this kind please do ...

xc_misc.c has two similar variables.

... I have just seen that the existing version of xc_psr.c has this problem already. IMO this is a serious bug. Why was it made static before ?

As the result of the hypercall is going to be the same, the worst that a race could achieve is a wasted hypercall.

This kind of analysis is unfounded in the presence of modern compilers with aggressive optimisations. At the very least, if you're going to do some caching like this, it needs a lock around it.

Ian.
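A cached value with "a lock around it", as asked for above, might be sketched as follows. Everything here is an assumption made for illustration: `sketch_query_event_mask()` is a hypothetical stand-in for the hypercall, the returned mask value is made up, and a POSIX mutex is only one possible choice of lock. It is not the libxc code.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t sketch_mask_lock = PTHREAD_MUTEX_INITIALIZER;
static int sketch_mask_valid;        /* protected by sketch_mask_lock */
static uint32_t sketch_mask_cache;   /* protected by sketch_mask_lock */
static int sketch_query_count;       /* counts calls to the slow path */

/* Hypothetical stand-in for the hypercall that fetches the mask. */
static int sketch_query_event_mask(uint32_t *mask)
{
    sketch_query_count++;
    *mask = 0x7;                     /* made-up value for the sketch */
    return 0;
}

/* Return the event mask, querying it at most once on success. All
 * reads and writes of the cache happen with the mutex held. */
int sketch_get_l3_event_mask(uint32_t *event_mask)
{
    int rc = 0;

    pthread_mutex_lock(&sketch_mask_lock);
    if (!sketch_mask_valid) {
        rc = sketch_query_event_mask(&sketch_mask_cache);
        if (rc == 0)
            sketch_mask_valid = 1;
    }
    if (rc == 0)
        *event_mask = sketch_mask_cache;
    pthread_mutex_unlock(&sketch_mask_lock);

    return rc;
}
```

The alternative that avoids shared state altogether is to drop the static variable and issue the query on every call, which keeps the function reentrant at the cost of a repeated hypercall.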
Re: [Xen-devel] [PATCH 2/2] libxl_internal: comment on domain userdata unlock function
On Wed, Jan 07, 2015 at 04:52:00PM +, Ian Jackson wrote:
Wei Liu writes ([PATCH 2/2] libxl_internal: comment on domain userdata unlock function):

Discuss why we need to unlink file path before closes fd.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

subject to minor grammar complaint:

Potential backport candidate for 4.5.1 ?

Sure, if you feel this is important.

diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index 9d8025d..a70214b 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -458,6 +458,20 @@ out:
 void libxl__unlock_domain_userdata(libxl__domain_userdata_lock *lock)
 {
+    /* It's important to unlink the file before closing fd to avoid
+     * such race (if close before unlink):

to avoid the following race.

Such must refer to a thing which precedes.

I suppose you (or Ian C) can fix this when committing? Do I need to resend?

Wei.

Ian.
Re: [Xen-devel] [PATCH RFC] xen-time: decreasing the rating of the xen clocksource below that of the tsc clocksource for dom0's
On 07.01.15 at 17:30, ian.campb...@citrix.com wrote:
On Wed, 2015-01-07 at 17:16 +0100, Imre Palik wrote:
From: Palik, Imre im...@amazon.de

In Dom0's the use of the TSC clocksource (whenever it is stable enough to be used) instead of the Xen clocksource should not cause any issues, as Dom0 VMs are never live-migrated.

Is this still true given that dom0's vcpus are migrated amongst pcpus on the host? The TSCs are not synchronised on some generations of hardware, so the result there would be the TSC appearing to do very odd things under dom0's feet. Does Linux cope with that or does it not matter for some other reason?

Indeed. The textual qualification above (whenever it is stable enough) isn't being expressed in the code change at all.

Jan