Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote: Changes in v7: * Break from the loop when -ENODEV is encountered This seems pretty inefficient for the caller. Returning a bad identifier other than XEN_INVALID_NODE_ID would seem better to me. --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -411,6 +411,65 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) break; #endif +#ifdef HAS_PCI +case XEN_SYSCTL_pcitopoinfo: +{ +xen_sysctl_pcitopoinfo_t *ti = op-u.pcitopoinfo; +unsigned dev_cnt = 0; + +if ( guest_handle_is_null(ti-devs) || + guest_handle_is_null(ti-nodes) || + (ti-first_dev ti-num_devs) ) +{ +ret = -EINVAL; +break; +} + +while ( ti-first_dev ti-num_devs ) +{ +physdev_pci_device_t dev; +uint32_t node; +struct pci_dev *pdev; + +if ( copy_from_guest_offset(dev, ti-devs, ti-first_dev, 1) ) +{ +ret = -EFAULT; +break; +} + +spin_lock(pcidevs_lock); +pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn); +if ( !pdev ) +{ +ret = -ENODEV; +node = XEN_INVALID_NODE_ID; +} +else if ( pdev-node == NUMA_NO_NODE ) +node = XEN_INVALID_NODE_ID; +else +node = pdev-node; +spin_unlock(pcidevs_lock); + +if ( copy_to_guest_offset(ti-nodes, ti-first_dev, node, 1) ) +{ +ret = -EFAULT; +break; +} + +ti-first_dev++; + +if ( (ret == -ENODEV) || If despite the earlier comment you want to stay with this model, I think you should keep first_dev pointing at the bad slot (but also see below), leaving it to the caller to increment past it (consider this being function 0 and the next slots being the other functions - in that case it's clear that the caller would want to skip more than just this one slot). + ((++dev_cnt 0x3f) hypercall_preempt_check()) ) +break; +} + +if ( (!ret || (ret == -ENODEV)) + __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) ) +ret = -EFAULT; +} +break; +#endif With the continuation-less model now used I don't think it makes sense to have first_dev and num_devs - for re-invocation all the caller needs to do is increment the buffer pointer suitably. I.e. you can get away with just a count of devices afaict. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v2 05/11] vmx: add new data structure member to support PML
On 04/17/2015 10:31 AM, Kai Huang wrote: On 04/17/2015 06:39 AM, Tian, Kevin wrote: From: Kai Huang [mailto:kai.hu...@linux.intel.com] Sent: Wednesday, April 15, 2015 3:04 PM A new 4K page pointer is added to arch_vmx_struct as PML buffer for vcpu. And a new 'status' field is added to vmx_domain to indicate whether PML is enabled for the domain or not. The 'status' field also can be used for further similiar purpose. not sure about the last sentence. what's the similar purpose to whether PML is enabled? :-) I mean potentially there might be such feature in the future, and I can't give you an example right now. If you are just commenting the description here but fine with the current code, I can remove that last sentence if you like. Or do you suggest to just use a bool_t pml_enabled? I am fine with both, but looks there's no objection from others so I intend to keep it as 'unsigned int status', if you agree. Hi Kevin, What's your opinion here? Is 'unsigned int status' OK to you? Note both new members don't have to be initialized to zero explicitly as both vcpu and domain structure are zero-ed when they are created. no initialization in this patch, so why explaining it here? OK. Looks it's a common sense to all of you so I'll just remove this sentence. Signed-off-by: Kai Huang kai.hu...@linux.intel.com --- xen/include/asm-x86/hvm/vmx/vmcs.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h index f831a78..2c679ac 100644 --- a/xen/include/asm-x86/hvm/vmx/vmcs.h +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h @@ -70,8 +70,12 @@ struct ept_data { cpumask_var_t synced_mask; }; +#define _VMX_DOMAIN_PML_ENABLED0 +#define VMX_DOMAIN_PML_ENABLED (1ul _VMX_DOMAIN_PML_ENABLED) struct vmx_domain { unsigned long apic_access_mfn; +/* VMX_DOMAIN_* */ +unsigned long status; }; struct pi_desc { @@ -142,6 +146,9 @@ struct arch_vmx_struct { /* Bitmap to control vmexit policy for Non-root VMREAD/VMWRITE */ struct page_info *vmread_bitmap; struct page_info *vmwrite_bitmap; + +#define NR_PML_ENTRIES 512 +struct page_info *pml_pg; move the macro out of the structure. OK. I will move it just above the declaration of struct arch_vmx_struct. and is pml_buffer or pml_buf more clear? To me pml_buffer or pml_buf is more likely a virtual address you can access the buffer directly, while pml_pg indicates it's a pointer of struct page_info. If you you look at patch 6, you can find statements like: uint64_t *pml_buf; pml_buf = __map_domain_page(v-arch.hvm_vmx.pml_pg); So I intend to keep it. And this one? Are you OK with 'pml_pg'? Thanks, -Kai Thanks, -Kai }; int vmx_create_vmcs(struct vcpu *v); -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants
On Thu, Mar 26, 2015 at 04:00:54AM +0100, Luis R. Rodriguez wrote: On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote: Hi Luis, This seems OK to me, Great. but I'm curious about a few things. On Fri, Mar 20, 2015 at 6:17 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This allows drivers to take advantage of write-combining when possible. Ideally we'd have pci_read_bases() just peg an IORESOURCE_WC flag for us We do set IORESOURCE_PREFETCH. Do you mean something different? I did not think we had a WC IORESOURCE flag. Are you saying that we can use IORESOURCE_PREFETCH for that purpose? If so then great. As I read a PCI BAR can have PCI_BASE_ADDRESS_MEM_PREFETCH and when that's the case we peg IORESOURCE_PREFETCH. That seems to be what I want indeed. Questions below. but where exactly video devices memory lie varies *largely* and at times things are mixed with MMIO registers, sometimes we can address the changes in drivers, other times the change requires intrusive changes. What does a video device address have to do with this? I do see that if a BAR maps only a frame buffer, the device might be able to mark it prefetchable, while if the BAR mapped both a frame buffer and some registers, it might not be able to make it prefetchable. But that doesn't seem like it depends on the *address*. I meant the offsets for each of those, either registers or framebuffer, and that typically they are mixed (primarily on older devices), so indeed your summary of the problem is what I meant. Let's remember that we are trying to take advantage of PAT here when available and avoid MTRR in that case, do we know that the same PCI BARs that have always historically used MTRRs had IORESOURCE_PREFETCH set, is that a fair assumption ? I realize they are different things -- but its precisely why I ask. pci_iomap_range() already makes a cacheable mapping if IORESOURCE_CACHEABLE; I'm guessing that you would like it to automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g., if (flags IORESOURCE_CACHEABLE) return ioremap(start, len); if (flags IORESOURCE_PREFETCH) return ioremap_wc(start, len); return ioremap_nocache(start, len); Indeed, that's exactly what I think we should strive towards. Is there a reason not to do that? This depends on the exact defintion of IORESOURCE_PREFETCH and PCI_BASE_ADDRESS_MEM_PREFETCH and how they are used all over and accross *all devices*. This didn't look promising for starters: include/uapi/linux/pci_regs.h:#define PCI_BASE_ADDRESS_MEM_PREFETCH0x08 /* prefetchable? */ PCI_BASE_ADDRESS_MEM_PREFETCH seems to be BAR specific, so a few questions: 1) Can we rest assured for instance that if we check for PCI_BASE_ADDRESS_MEM_PREFETCH and if set that it will *only* be set on a full PCI BAR if the full PCI BAR does want WC? If not this can regress functionality. That seems risky. It however would not be risky if we used another API that did look for IORESOURCE_PREFETCH and if so use ioremap_wc() -- that way only drivers we know that do use the full PCI bar would use this API. There's a bit of a problem with this though: 2) Do we know that if a *full PCI BAR* is used for WC that PCI_BASE_ADDRESS_MEM_PREFETCH *was* definitely set for the PCI BAR? If so then the API usage would be restricted only to devices that we know *do* adhere to this. That reduces the possible uses for older drivers and can create regressions if used loosely without verification... but.. 3) If from now on we get folks to commit to uset PCI_BASE_ADDRESS_MEM_PREFETCH for full PCI BARs that do want WC perhaps newer devices / drivers will use this very consistently ? Can we bank on that and is it worth it ? 4) If a PCI BAR *does not* have PCI_BASE_ADDRESS_MEM_PREFETCH do we know it must not never want WC ? If we don't have certainty on any of the above I'm afraid we can't do much right now but perhaps we can push towards better use of PCI_BASE_ADDRESS_MEM_PREFETCH and hope folks will only use this for the full PCI BAR only if WC is desired. Thoughts? Bjorn, now that you're done schooling me on English, any thoughts on the above? Although there is also arch_phys_wc_add() that makes use of architecture specific write-combinging alternatives (MTRR on x86 when a system does not have PAT) we void polluting pci_iomap() space with it and force drivers and subsystems that want to use it to be explicit. There are a few motivations for this: a) Take advantage of PAT when available b) Help bury MTRR code away, MTRR is architecture specific and on x86 its replaced by PAT c) Help with the goal of eventually using _PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (de33c442e) ... +void __iomem *pci_iomap_wc_range(struct pci_dev
Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants
On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote: pci_iomap_range() already makes a cacheable mapping if IORESOURCE_CACHEABLE; I'm guessing that you would like it to automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g., if (flags IORESOURCE_CACHEABLE) return ioremap(start, len); if (flags IORESOURCE_PREFETCH) return ioremap_wc(start, len); return ioremap_nocache(start, len); Is there a reason not to do that? I think that's wrong and will break a bunch of things. PCI prefetch bit merely means bridges can combine writes and prefetch reads. Prefetch does not affect ordering rules and does not allow writes to be collapsed. WC is stronger: it allows collapsing and changes ordering rules. WC can also hurt latency as small writes are buffered. To summarise, driver needs to know what it's doing, we can't set WC in the pci core automatically. -- MST ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants
On Tue, Apr 21, 2015 at 07:52:49PM +0200, Luis R. Rodriguez wrote: On Thu, Mar 26, 2015 at 04:00:54AM +0100, Luis R. Rodriguez wrote: On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote: Hi Luis, This seems OK to me, Great. but I'm curious about a few things. On Fri, Mar 20, 2015 at 6:17 PM, Luis R. Rodriguez mcg...@do-not-panic.com wrote: From: Luis R. Rodriguez mcg...@suse.com This allows drivers to take advantage of write-combining when possible. Ideally we'd have pci_read_bases() just peg an IORESOURCE_WC flag for us We do set IORESOURCE_PREFETCH. Do you mean something different? I did not think we had a WC IORESOURCE flag. Are you saying that we can use IORESOURCE_PREFETCH for that purpose? If so then great. As I read a PCI BAR can have PCI_BASE_ADDRESS_MEM_PREFETCH and when that's the case we peg IORESOURCE_PREFETCH. That seems to be what I want indeed. Questions below. but where exactly video devices memory lie varies *largely* and at times things are mixed with MMIO registers, sometimes we can address the changes in drivers, other times the change requires intrusive changes. What does a video device address have to do with this? I do see that if a BAR maps only a frame buffer, the device might be able to mark it prefetchable, while if the BAR mapped both a frame buffer and some registers, it might not be able to make it prefetchable. But that doesn't seem like it depends on the *address*. I meant the offsets for each of those, either registers or framebuffer, and that typically they are mixed (primarily on older devices), so indeed your summary of the problem is what I meant. Let's remember that we are trying to take advantage of PAT here when available and avoid MTRR in that case, do we know that the same PCI BARs that have always historically used MTRRs had IORESOURCE_PREFETCH set, is that a fair assumption ? I realize they are different things -- but its precisely why I ask. pci_iomap_range() already makes a cacheable mapping if IORESOURCE_CACHEABLE; I'm guessing that you would like it to automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g., if (flags IORESOURCE_CACHEABLE) return ioremap(start, len); if (flags IORESOURCE_PREFETCH) return ioremap_wc(start, len); return ioremap_nocache(start, len); Indeed, that's exactly what I think we should strive towards. Is there a reason not to do that? This depends on the exact defintion of IORESOURCE_PREFETCH and PCI_BASE_ADDRESS_MEM_PREFETCH and how they are used all over and accross *all devices*. This didn't look promising for starters: include/uapi/linux/pci_regs.h:#define PCI_BASE_ADDRESS_MEM_PREFETCH 0x08/* prefetchable? */ PCI_BASE_ADDRESS_MEM_PREFETCH seems to be BAR specific, so a few questions: 1) Can we rest assured for instance that if we check for PCI_BASE_ADDRESS_MEM_PREFETCH and if set that it will *only* be set on a full PCI BAR if the full PCI BAR does want WC? If not this can regress functionality. That seems risky. It however would not be risky if we used another API that did look for IORESOURCE_PREFETCH and if so use ioremap_wc() -- that way only drivers we know that do use the full PCI bar would use this API. There's a bit of a problem with this though: 2) Do we know that if a *full PCI BAR* is used for WC that PCI_BASE_ADDRESS_MEM_PREFETCH *was* definitely set for the PCI BAR? If so then the API usage would be restricted only to devices that we know *do* adhere to this. That reduces the possible uses for older drivers and can create regressions if used loosely without verification... but.. In theory, PCI spec says this about prefetch memory: Bridges are permitted to merge writes into this range (refer to Section 3.2.6). Exceptions could be: - devices not behind a bridge (e.g. intergrated in a root complex) - devices behind a virtual bridge from same vendor (which know bridge won't prefetch) I worry that WC might also cause more reordering though. I don't remember this is true, off-hand. Bridges can only reorder transactions according to very specific rules. 3) If from now on we get folks to commit to uset PCI_BASE_ADDRESS_MEM_PREFETCH for full PCI BARs that do want WC perhaps newer devices / drivers will use this very consistently ? Can we bank on that and is it worth it ? Unfortunately there's a separate good reason to set memory as prefetcheable: it's the only way to get 64 bit addresses for devices behind bridges. So WC might be *safe* for prefetch BARs, but might not be a good idea. 4) If a PCI BAR *does not* have PCI_BASE_ADDRESS_MEM_PREFETCH do we know it must not never want WC ? That's not true I think. It means device can't allow prefetch but maybe it does allow
[Xen-devel] [PATCH] raisin: Some git-checkout improvements
1. Switch local variables to lower-case and declare them local. 2. Cloning git trees from remote repos is often a very long operation. Allow the user to specify a faster git cache as a prefix. 3. At the moment you can either check out a specific changeset or master, but you can't check out a different branch, because git doesn't always look in origin/ for the branch. If the initial git checkout $tag fails, try checking out origin/$tag before giving up. Signed-off-by: George Dunlap george.dun...@eu.citrix.com --- CC: Stefano Stabellini stefano.stabell...@citrix.com --- defconfig | 3 +++ lib/git-checkout.sh | 33 - 2 files changed, 23 insertions(+), 13 deletions(-) diff --git a/defconfig b/defconfig index d3ef283..38c7455 100644 --- a/defconfig +++ b/defconfig @@ -35,3 +35,6 @@ QEMU_TRADITIONAL_REVISION=master SEABIOS_REVISION=master GRUB_REVISION=master LIBVIRT_REVISION=master + +# Git prefix. Use this if you have a git caching proxy. +#GIT_PREFIX= diff --git a/lib/git-checkout.sh b/lib/git-checkout.sh index 2ca8f25..b033504 100755 --- a/lib/git-checkout.sh +++ b/lib/git-checkout.sh @@ -1,32 +1,39 @@ #!/usr/bin/env bash function git-checkout() { +local tree +local tag +local dir + if [[ $# -lt 3 ]] then echo Usage: $0 tree tag dir exit 1 fi -TREE=$1 -TAG=$2 -DIR=$3 +tree=$1 +tag=$2 +dir=$3 + +tree=${GIT_PREFIX}$tree set -e -if [[ ! -d $DIR-remote ]] +if [[ ! -d $dir-remote ]] then - rm -rf $DIR-remote $DIR-remote.tmp - mkdir -p $DIR-remote.tmp; rmdir $DIR-remote.tmp - $GIT clone $TREE $DIR-remote.tmp - if [[ $TAG ]] + rm -rf $dir-remote $dir-remote.tmp + mkdir -p $dir-remote.tmp; rmdir $dir-remote.tmp + $GIT clone $tree $dir-remote.tmp + if [[ $tag ]] then - cd $DIR-remote.tmp + cd $dir-remote.tmp $GIT branch -D dummy /dev/null 21 ||: - $GIT checkout -b dummy $TAG + $GIT checkout -b dummy $tag \ + || $GIT checkout -b dummy origin/$tag cd .. fi - mv $DIR-remote.tmp $DIR-remote + mv $dir-remote.tmp $dir-remote fi -rm -f $DIR -ln -sf $DIR-remote $DIR +rm -f $dir +ln -sf $dir-remote $dir } -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants
On Tue, Apr 21, 2015 at 12:25 PM, Michael S. Tsirkin m...@redhat.com wrote: To summarise, driver needs to know what it's doing, we can't set WC in the pci core automatically. Thanks, I'll document this and proceed with device driver helpers to aid with this. Luis ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments
On 21/04/2015 01:35, Andy Lutomirski wrote: On 04/20/2015 10:09 AM, Andrew Cooper wrote: There appears to be no formal statement of what pv_irq_ops.save_fl() is supposed to return precisely. Native returns the full flags, while lguest and Xen only return the Interrupt Flag, and both have comments by the implementations stating that only the Interrupt Flag is looked at. This may have been true when initially implemented, but no longer is. To make matters worse, the Xen PVOP leaves the upper bits undefined, making the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV guests on Broadwell hardware. The BUG_ON() is consistent for an individual build, but not consistent for all builds. It has also been a sitting timebomb since SMAP support was introduced. Use native_save_fl() instead, which will obtain an accurate view of the AC flag. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Thomas Gleixner t...@linutronix.de CC: Ingo Molnar mi...@redhat.com CC: H. Peter Anvin h...@zytor.com CC: x...@kernel.org CC: linux-ker...@vger.kernel.org CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Boris Ostrovsky boris.ostrov...@oracle.com CC: David Vrabel david.vra...@citrix.com CC: xen-devel xen-devel@lists.xen.org CC: Rusty Russell ru...@rustcorp.com.au CC: lgu...@lists.ozlabs.org --- This patch is RFC because I am not certain that native_save_fl() is necessarily the correct solution on lguest, but it does seem that setup_smap() wants to check the actual AC bit, rather than an idealised value. A different approach, given the dual nature of the AC flag now is to gate setup_smap() on a kernel rpl of 0. SMAP necessarily can't be used in a paravirtual situation where the kernel runs in cpl 0. Another different approach would be to formally state that pv_irq_ops.save_fl() needs to return all the flags, which would make local_irq_save() safe to use in this circumstance, but that makes a hotpath longer for the sake of a single boot time check. ...which reminds me: Why does native_restore_fl restore anything other than IF? A branch and sti should be considerably faster than popf. I was wondering about the performance aspect, given a comment in your patch which removed sysret64, but hadn't had time to investigate yet. Unfortunately, irq_save()/irq_enable()/irq_restore() appears to be a used pattern in the kernel, making the irq_restore() disable interrupts. The performance improvement might be worth explicitly moving the onus into the caller with irq_maybe_disable()/irq_maybe_enable(), but that does involve altering a lot of common code for an architecture specific gain. Also, if we did this, could Xen use PVI and then use native_restore_fl and avoid lots of pvops? Xen HVM guests already use the native pvops in this area, so would benefit from any improvement. PV guests on the other hand run with cpl 0 and instead have a writeable mask in a piece of shared memory with Xen, and need the pvop. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG
In 952944f7 QEMU_TAG update my tag update script mangled the machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first assignment to QEMU_TRADITIONAL_REVISION it found rather than the one which ought to have been replaced. The result was that: * From that commit on, QEMU_TAG was no longer honoured although QEMU_TRADITIONAL_REVISION still was * That particular update to QEMU_TRADITIONAL_REVISION's default value was effective * The next attempt to update QEMU_TRADITIONAL_REVISION, in 1fc3aeb3 libxl: use new QEMU xenstore protocol was totally ineffective. Fix this by restoring the transfer from QEMU_TAG. The effects are: * Once more, honour QEMU_TAG. * Belatedly apply the qemu-trad change part of libxl: use new QEMU xenstore protocol. (I have also fixed my script to not do this again.) Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: George Dunlap george.dun...@eu.citrix.com CC: Jan Beulich jbeul...@suse.com Reported-by: Jan Beulich jbeul...@suse.com --- Config.mk |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/Config.mk b/Config.mk index e5976fc..28b77d6 100644 --- a/Config.mk +++ b/Config.mk @@ -237,9 +237,7 @@ ifneq (,$(CONFIG_QEMU)) QEMU_TRADITIONAL_LOC ?= $(CONFIG_QEMU) endif ifneq (,$(QEMU_TAG)) -QEMU_TRADITIONAL_REVISION ?= ab42b4408cb4fc4f869d73218e3d2034e6f5e8ac -# Tue Mar 31 16:27:45 2015 +0100 -# xen: limit guest control of PCI command register +QEMU_TRADITIONAL_REVISION ?= $(QEMU_TAG) endif ifeq ($(GIT_HTTP),y) -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x?
On Tue, 21 Apr 2015, 신정섭 wrote: Thanks your reply! I Think I find a Interrupt mechanism problem in Xen ARM 4.5 and I fix that simply. After fix it my irq routing code is working well on Xen ARM 4.5 too. I will start new thread about that problem. please confirm please. Confirm what? You are welcome to start a new thread an any interrupt bugs you might have found. Thanks -Original Message- From: Stefano Stabellinistefano.stabell...@eu.citrix.com To: 신정섭supsup5...@naver.com; Cc: Stefano Stabellinistefano.stabell...@eu.citrix.com; Ian Campbellian.campb...@citrix.com; xen-devel@lists.xen.org; Sent: 2015-04-21 (화) 19:13:54 Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x? On Tue, 21 Apr 2015, 신정섭 wrote: I have a one more question. In Xen ARM 4.5, All SPI is routed to pcpu0 that run domain0's vcpu0. If domain0's vcpu0 run on pcpu0 All SPI is routed to pcpu0 If domain0's vcpu0 run on pcpu1 All SPI is routed to pcpu1 That is correct. these mean that Xen ARM 4.5 can inject spi only to domain0's vcpu0 and Xen ARM 4.5 cannot inject spi to domain0's vcpu1. Right? No, that is wrong. If the guest requests the spis to be routed to another vcpu, writing the appropriate values to the virtual GICD, then Xen will route the spis to the pcpu running the requested vcpu. So if your guest is Linux and you echo 2 /proc/irq/SPI_NUMBER/smp_affinity then you should see that Xen will start injecting the SPI to vcpu1. And is this reason ARM 4.5 don't use maintanance interrupt? No, that is just a performance optimization. Thanks -Original Message- From: Stefano Stabellinistefano.stabell...@eu.citrix.com To: 신정섭supsup5...@naver.com; Cc: Stefano Stabellinistefano.stabell...@eu.citrix.com; Ian Campbellian.campb...@citrix.com; xen-devel@lists.xen.org; Sent: 2015-04-21 (화) 02:25:09 Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x? On Mon, 20 Apr 2015, 신정섭 wrote: Thanks your rely. But sorry i can't understand your explanation fully. I don't want to change GICD setting. I only want to change target Domain0' vcpu injected SPI. vcpu0 or vcpu1. I understand like below. In Xen4.4, vgic_vcpu_inject_irq() can inject SPI to any Domain0's vcpu on any pcpu. But int Xen4.5 vgic_vcpu_inject_irq() can inject SPI on only pcpu that receive SPI from GICD. Right? Yes, if you meant the virtual GICD (not the physical GICD). I'll repeat: In Xen 4.5 vgic_vcpu_inject_irq can inject a given SPI only to the pcpu that is set to run the vcpu that should receive the interrupt, as per the vGICD configuration. So if you echo VCPU_NUMBER /proc/irq/IRQ_NUMBER/smp_affinity in the guest, it should work and it should have a concrete effect in the delivery of the physical interrupt. -Original Message- From: Stefano Stabellinistefano.stabell...@eu.citrix.com To: 신정섭supsup5...@naver.com; Cc: Ian Campbellian.campb...@citrix.com; xen-devel@lists.xen.org; Stefano Stabellinistefano.stabell...@eu.citrix.com; Sent: 2015-04-20 (월) 19:49:50 Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x? In Xen 4.5 we rely on the fact that the physical irq is routed to the physical cpu running the vcpu of the domain that needs to receive the corresponding virq. So if you want to inject IRQ 100 to CPU 1, while Dom0 is set to receive vIRQ 100 (virtual irq corresponding to IRQ 100) to vcpu0, running on CPU 0, that won't work. On Sat, 18 Apr 2015, 신정섭 wrote: NO Peripheral IRQ routing means that Xen select itself one of domain0's vCPU to inject periperal IRQ. So below Simple peripheral IRQ routing Code is a Example of Peripheral IRQ routing. periperal IRQ is injected to Domain0' vcpu0 or vcpu1 without vGIC Information. I know that periperal IRQ can be process on any cpu in linux. So All Domain0's vcpu can process periperal IRQ injected by Xen. On Xen 4.4.1 my simple Simple peripheral irq routing Code is working well. (below) But Xen 4.5.0 it dosen't. -Original Message- From: Ian Campbellian.campb...@citrix.com To: 신정섭supsup5...@naver.com; Cc: xen-devel@lists.xen.org; Stefano Stabellinistefano.stabell...@eu.citrix.com; Sent: 2015-04-17 (금) 18:49:39 Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x? On Fri, 2015-04-17 at 11:36 +0900, 신정섭 wrote: I'm studying periperal irq routing to Domain0's vCPU What do you mean by peripheral irq routing? Do you mean
Re: [Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG
On 21.04.15 at 12:33, ian.jack...@eu.citrix.com wrote: In 952944f7 QEMU_TAG update my tag update script mangled the machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first assignment to QEMU_TRADITIONAL_REVISION it found rather than the one which ought to have been replaced. The result was that: * From that commit on, QEMU_TAG was no longer honoured although QEMU_TRADITIONAL_REVISION still was * That particular update to QEMU_TRADITIONAL_REVISION's default value was effective * The next attempt to update QEMU_TRADITIONAL_REVISION, in 1fc3aeb3 libxl: use new QEMU xenstore protocol was totally ineffective. Fix this by restoring the transfer from QEMU_TAG. The effects are: * Once more, honour QEMU_TAG. * Belatedly apply the qemu-trad change part of libxl: use new QEMU xenstore protocol. (I have also fixed my script to not do this again.) Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Reported-by: Jan Beulich jbeul...@suse.com Acked-by: Jan Beulich jbeul...@suse.com (and please also for 4.5) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 5/7] Add new script to customize nested test configuration
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote: 1. In this script, make some appropriate runvars which selecthost would recognise. 2. Prepare the configurations for installing L2 guest VM. 3. Create a lv disk in L0 and hot-attach it to L1, need to restart L1 to make the block disk to be recognized by L1 system, then using this disk to create a VG that used for installing L2. Signed-off-by: longtao.pang longtaox.p...@intel.com --- Changes in v8: 1. Replace '$nested_host' by '$l1-{Guest}'. --- ts-nested-setup | 52 1 file changed, 52 insertions(+) create mode 100755 ts-nested-setup diff --git a/ts-nested-setup b/ts-nested-setup new file mode 100755 index 000..41d5e80 --- /dev/null +++ b/ts-nested-setup @@ -0,0 +1,52 @@ +#!/usr/bin/perl -w +# This is part of osstest, an automated testing framework for Xen. +# Copyright (C) 2015 Intel Inc. +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU Affero General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU Affero General Public License for more details. +# +# You should have received a copy of the GNU Affero General Public License +# along with this program. If not, see http://www.gnu.org/licenses/. + +use strict qw(vars); +use DBI; +use Osstest; +use Osstest::Debian; +use Osstest::TestSupport; + +tsreadconfig(); +our ($l0,$l1) = ts_get_host_guest(@ARGV); + +guest_check_ip($l1); +target_cmd_root($l1, mkdir -p /home/osstest/.ssh cp /root/.ssh/authorized_keys /home/osstest/.ssh/); +my $url = $c{WebspaceUrl}.$c{WebspaceCommon}.$l0-{Name}._.'overlay.tar'; +target_cmd_root($l1, END); +wget -O overlay.tar $url +tar -xf overlay.tar -C / +rm overlay.tar -f +update-rc.d osstest-confirm-booted start 99 2 . +END I cc'd you on some patches which I think should help avoid this duplication. + +store_runvar('nested_l1',$l1-{Guest}); I'm not sure what this is for and it would normally be wrong to hardcode nested_l1 like that, where is it used? Most places you seem to use nestedl1, not nested_l1. I think you ended up with this due to not using guest_var and the various hardcoding you've done elsewhere. I suspect with all that fixed and make-flight updated accordingly you won't need this var any more. +store_runvar($l1-{Guest}_ip,$l1-{Ip}); + +my $l2_disk_mb = 2; +my $l2_lv_name = 'nestedl2-disk'; +my $vgname = $l0-{Name}; +$vgname .= .$c{TestHostDomain} if ($l0-{Suite} =~ m/lenny/); +target_cmd_root($l0, END); +lvremove -f /dev/${vgname}/${l2_lv_name} ||: +lvcreate -L ${l2_disk_mb}M -n $l2_lv_name $vgname +dd if=/dev/zero of=/dev/${vgname}/${l2_lv_name} count=10 I think you can do all of this by passing a suitable l2 $gho to prepareguest_part_lvmdisk, can't you? I think you can get that by my $l2 = selectguest($ARGV[], $l1). Where ARGV[] is a new parameter passed by sg-run-job i.e. nestedl2, i.e. the one after whatever ts_get_host_guest consumes at the top of the file (so ARGV[2] perhaps?). Once you have that $l2 you can then use guest_var for e.g. the size, which will be good because it will be picked up by your modifications to ts-debian-hvm-install such that they automatically match. +xl block-attach $l1-{Name} /dev/${vgname}/${l2_lv_name},raw,sdb,rw You use sdb here, but xvdb below. I think xvdb would work here, or to avoid HVM confusion wrt SCSI vs PV perhaps use xvde throughout (since it has no emulated counterpart by default IIRC)? +END +target_install_packages_norec($l1, qw(lvm2 rsync genisoimage)); +target_reboot($l1); +target_cmd_root($l1, pvcreate /dev/xvdb vgcreate ${l2_lv_name}_vg /dev/xvdb); ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments
Andy Lutomirski l...@kernel.org writes: On 04/20/2015 10:09 AM, Andrew Cooper wrote: There appears to be no formal statement of what pv_irq_ops.save_fl() is supposed to return precisely. Native returns the full flags, while lguest and Xen only return the Interrupt Flag, and both have comments by the implementations stating that only the Interrupt Flag is looked at. This may have been true when initially implemented, but no longer is. To make matters worse, the Xen PVOP leaves the upper bits undefined, making the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV guests on Broadwell hardware. The BUG_ON() is consistent for an individual build, but not consistent for all builds. It has also been a sitting timebomb since SMAP support was introduced. Use native_save_fl() instead, which will obtain an accurate view of the AC flag. That should work for lguest. Indeed, it does (in practice those bits are 0). Tested-by: Rusty Russell ru...@rustcorp.com.au (lguest) Thanks, Rusty. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Thomas Gleixner t...@linutronix.de CC: Ingo Molnar mi...@redhat.com CC: H. Peter Anvin h...@zytor.com CC: x...@kernel.org CC: linux-ker...@vger.kernel.org CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com CC: Boris Ostrovsky boris.ostrov...@oracle.com CC: David Vrabel david.vra...@citrix.com CC: xen-devel xen-devel@lists.xen.org CC: Rusty Russell ru...@rustcorp.com.au CC: lgu...@lists.ozlabs.org --- This patch is RFC because I am not certain that native_save_fl() is necessarily the correct solution on lguest, but it does seem that setup_smap() wants to check the actual AC bit, rather than an idealised value. A different approach, given the dual nature of the AC flag now is to gate setup_smap() on a kernel rpl of 0. SMAP necessarily can't be used in a paravirtual situation where the kernel runs in cpl 0. Another different approach would be to formally state that pv_irq_ops.save_fl() needs to return all the flags, which would make local_irq_save() safe to use in this circumstance, but that makes a hotpath longer for the sake of a single boot time check. ...which reminds me: Why does native_restore_fl restore anything other than IF? A branch and sti should be considerably faster than popf. Also, if we did this, could Xen use PVI and then use native_restore_fl and avoid lots of pvops? --Andy ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v20 02/13] x86/VPMU: Add public xenpmu.h
On 20.04.15 at 18:38, boris.ostrov...@oracle.com wrote: On 04/20/2015 04:50 AM, Jan Beulich wrote: On 09.04.15 at 17:44, boris.ostrov...@oracle.com wrote: --- /dev/null +++ b/xen/include/public/pmu.h @@ -0,0 +1,38 @@ +#ifndef __XEN_PUBLIC_PMU_H__ +#define __XEN_PUBLIC_PMU_H__ + +#include xen.h +#if defined(__i386__) || defined(__x86_64__) +#include arch-x86/pmu.h +#elif defined (__arm__) || defined (__aarch64__) +#include arch-arm.h +#else +#error Unsupported architecture +#endif + +#define XENPMU_VER_MAJ0 +#define XENPMU_VER_MIN1 + + +/* Shared between hypervisor and PV domain */ +struct xen_pmu_data { Iirc this sharing is r/o - if so, please state so in the comment. If not, please extend the comment to briefly explain why writable sharing is safe/secure. This data structure is writeable by guest (specifically, PMU registers and APIC_LVTPC). There is a flag (PMU_CACHED, which is part of this structure) that the hypervisor sets to let the guest know that it can write those fields without having to trap. When the guest is done, it issues XENPMU_flush command and the hypervisor writes out those values to HW. I'll update the comments to make this clear. I think you'll actually want to state for each of the fields who reads and who writes them. In particular for (I hope) obvious reasons some (most?) of the fields would apparently need to be documented write-only by the hypervisor. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments
On 20.04.15 at 19:09, andrew.coop...@citrix.com wrote: A different approach, given the dual nature of the AC flag now is to gate setup_smap() on a kernel rpl of 0. SMAP necessarily can't be used in a paravirtual situation where the kernel runs in cpl 0. Can't isn't true here - for 64-bit PV Xen guests, which already toggle between two page table variants for kernel and user mode, it would be possible (but perhaps expensive) to mimic the needed behavior by introducing a 3rd set of page tables, containing only the kernel mappings. You may recall that I had even posted an RFC patch to tat effect about a year ago. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/configure.ac: generate Paths.mk if it's not available
On Tue, 2015-04-21 at 12:30 +0100, Wei Liu wrote: On Tue, Apr 21, 2015 at 11:54:27AM +0100, Ian Campbell wrote: On Tue, 2015-04-21 at 11:16 +0100, Wei Liu wrote: On Mon, Apr 20, 2015 at 03:07:38PM +0100, Wei Liu wrote: Xen toolstack references many variables in Paths.mk when building and installing, so tools' configure should generate Paths.mk if it's not available. Also make inclusion of Paths.mk mandatory in Tools.mk. Hmm... I just discovered that docs build also involves Paths.mk. This patch is ugly enough that I don't want to duplicate it for docs. So advise on how to fix this would be much appreciated. I wasn't terribly happy with having more than one place update Paths.mk already. Or we can state clear that anyone who builds Xen from source needs to run ./configure in top level directory, not the ones in subsystems. I think that's essentially what we've done so far, but it's not terribly satisfactory I'll admit. Is this the only issue which prevents this? Perhaps change each subsystem to generate+consume its own Paths ${subsys}.mk instead of a single global one? Either in config/Paths ${subsys.mk} or in ${subsys}/Paths.mk. If you invoke from the top-level then they will all end up with the same contents, but so what... I can try to refactor Paths.mk.in into several files. I was suggesting a single input but create multiple outputs, otherwise we have to keep all the inputs in sync. I expect the majority of paths are common to all sub components. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test
On Tue, 2015-04-21 at 13:33 +0100, Ian Jackson wrote: Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test): On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote: --- a/Osstest/TestSupport.pm +++ b/Osstest/TestSupport.pm ... +if ( $r{${name}_ip} ) { +$setprop-(IpAddr, $r{${name}_ip}); +} This is one for Ian I think, but I suspect this should use ${ident} rather than ${name}. Yes. ${ident} is e.g. 'host' or 'srchost' or 'nestedl1' it is the prefer used on the runvar names. ${name} is the specific value assigned, i.e. an actual host name. For such properties we usually prefer ${ident}_foo. Ian, correct me if I'm wrong please. Ian C is right. But, I think actually the principle behind this change is wrong. It seems to be copying information from runvars into the in-memory data structure for host properties. But host properties are (by definition) matters of (fixed) configuration, not runtime definition. Remember that here the host is actual an L1 virtual machine, whose IP is not fixed (since it mac address isn't). I don't know if that affects your opinion at all? I haven't looked at the rest of the series, but the same effect should be achieved in a different way. Note that a $ho is a hash which already contains (after selecthost, say) an element with key `Ip'. So the right place to do this is indeed probably somewhere around selectguest or selecthost, but the information should be put into $ho-{Ip} without going via host properties. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)
On Tue, 21 Apr 2015, Fabio Fantoni wrote: Il 21/04/2015 12:49, Stefano Stabellini ha scritto: On Mon, 20 Apr 2015, Fabio Fantoni wrote: I updated xen and qemu from xen 4.5.0 with its upstream qemu included to xen 4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use revision master). After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times with same result. In the domU's qemu log: qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top == (((mbinptr) (((char *) ((av)-bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd old_size == 0) || ((unsigned long) (old_size) = (unsigned long)__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 * (sizeof(size_t))) - 1)) ~((2 * (sizeof(size_t))) - 1))) ((old_top)-size 0x1) ((unsigned long)old_end pagemask) == 0)' failed. Killing all inferiors In attachment the full backtrace of qemu crash. With a fast search after I saw the backtrace I found a probable cause of regression (I'm not sure): http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa spice: make sure we don't overflow ssd-buf Added also qemu-devel and spice-devel as cc. If you need more informations/tests tell me and I'll post them. Maybe you could try to revert the offending commit (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the crash? Thanks for your reply. I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to found another problem that cause very bad performance without errors or nothing in logs :( I don't know if if xen related, kernel related or other for now. About this regression with spice I'll do further tests in next days (probably starting reverting the spice patch in qemu) but any help is appreciated. Based on data I have for now is possible that the problem is that qemu try to allocate other ram or videoram after domU create but with xen is not possible? In the spice related patch I saw something about dynamic allocation for example. It is probably caused by a commit in the range: 1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4 there are only 10 commits in that range. By using git bisect you should be able to narrow it down in just 3 tests. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/asm/irq: Don't use POPF but STI
On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote: From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001 From: Ingo Molnar mi...@kernel.org Date: Tue, 21 Apr 2015 13:32:13 +0200 Subject: [PATCH] x86/asm/irq: Don't use POPF but STI So because the POPF instruction is slow and STI is faster on essentially all x86 CPUs that matter, instead of: 81891848: 9d popfq we can do: 81661a2e: 41 f7 c4 00 02 00 00test $0x200,%r12d 81661a35: 74 01 je 81661a38 snd_pcm_stream_unlock_irqrestore+0x28 81661a37: fb sti 81661a38: This bloats the kernel a bit, by about 1K on the 64-bit defconfig: textdata bss dec hex filename 122586341812120 1085440 15156194 e743e2 vmlinux.before 122595821812120 1085440 15157142 e74796 vmlinux.after the other cost is the extra branching, adding extra pressure to the branch prediction hardware and also potential branch misses. Do we care? After we enable interrupts, we'll most likely go somewhere cache cold anyway, so the branch misses will happen anyway. The question is, would the cost drop from POPF - STI cover the increase in branch misses overhead? Hmm, interesting. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv3 1/4] x86: provide xadd()
On 21.04.15 at 14:36, david.vra...@citrix.com wrote: On 21/04/15 11:36, Jan Beulich wrote: On 21.04.15 at 12:11, david.vra...@citrix.com wrote: +static always_inline unsigned long __xadd( +volatile void *ptr, unsigned long v, int size) +{ +switch ( size ) +{ +case 1: +asm volatile ( lock; xaddb %b0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); +return v; This doesn't seem to guarantee to return the old value: When the passed in v has more than 8 significant bits (which will get ignored as input), nothing will zap those bits from the register. Same for the 16-bit case obviously. +#define xadd(ptr, v) ({ \ +__xadd((ptr), (unsigned long)(v), sizeof(*(ptr))); \ +}) Assuming only xadd() is supposed to be used directly, perhaps the easiest would be to cast v to typeof(*(ptr)) (instead of unsigned long) here? I don't see how this helps. Did you perhaps mean cast the result? #define xadd(ptr, v) ({\ (typeof *(ptr))__xadd(ptr, (unsigned long)(v), \ sizeof(*(ptr))); \ }) Casting the result would work too; casting the input would have the same effect because (as said) the actual xadd doesn't alter bits 8...63 (or 16...63 in the 16-bit case), i.e. whether zero extension happens before or after doing the xadd doesn't matter. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 21/04/15 13:56, Boris Ostrovsky wrote: On 04/21/2015 03:01 AM, Jan Beulich wrote: On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote: Changes in v7: * Break from the loop when -ENODEV is encountered This seems pretty inefficient for the caller. Returning a bad identifier other than XEN_INVALID_NODE_ID would seem better to me. That would mean that there is really no reason to return -ENODEV, which I thought you wanted to see. In that case (i.e. if no error needs to be returned) yes, a new token would make things simpler. --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. True, it *is* safe, but why then cputopoinfo and numainfo are in this list? They look to be safe as well. This list includes not yet audited. It is quite likely that some entries in the list are safe. fwiw, neither cputopo nor numainfo are currently safe for very large systems. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] VT-d: replace bogus gprintk()
From: Jan Beulich [mailto:jbeul...@suse.com] Sent: Tuesday, April 21, 2015 3:31 PM Just like the other messages in this function this one should be issued through plain printk() - the current vCPU is irrelevant here. (Noticed while backporting to older trees, which don't have gprintk().) Signed-off-by: Jan Beulich jbeul...@suse.com Acked-by: Kevin Tian kevin.t...@intel.com --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -858,8 +858,8 @@ static int iommu_page_fault_do_one(struc break; } -gprintk(XENLOG_G_WARNING VTDPREFIX, %s: reason %02x - %s\n, -kind, fault_reason, reason); +printk(XENLOG_G_WARNING VTDPREFIX %s: reason %02x - %s\n, + kind, fault_reason, reason); if ( iommu_verbose fault_type == DMA_REMAP ) print_vtd_entries(iommu, PCI_BUS(source_id), PCI_DEVFN2(source_id), ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 3/9] xen/arm: Data abort exception (R/W) mem_access events
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote: This patch enables to store, set, check and deliver LPAE R/W mem_events. As the LPAE PTE's lack enough available software programmable bits, we store the permissions in a Radix tree. The tree is only looked at if mem_access_enabled is turned on. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 4/9] xen/arm: Instruction prefetch abort (X) mem_access event handling
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote: Add missing structure definition for iabt and update the trap handling mechanism to only inject the exception if the mem_access checker decides to do so. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de I'm not super thrilled about the TLB flush here, but it does seem to be unavoidable, at least with our combined level of cunning right at the moment, so: Acked-by: Ian Campbell ian.campb...@citrix.com Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv3 1/4] x86: provide xadd()
On 21.04.15 at 15:15, david.vra...@citrix.com wrote: On 21/04/15 14:12, Jan Beulich wrote: On 21.04.15 at 14:36, david.vra...@citrix.com wrote: On 21/04/15 11:36, Jan Beulich wrote: On 21.04.15 at 12:11, david.vra...@citrix.com wrote: +static always_inline unsigned long __xadd( +volatile void *ptr, unsigned long v, int size) +{ +switch ( size ) +{ +case 1: +asm volatile ( lock; xaddb %b0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); +return v; This doesn't seem to guarantee to return the old value: When the passed in v has more than 8 significant bits (which will get ignored as input), nothing will zap those bits from the register. Same for the 16-bit case obviously. +#define xadd(ptr, v) ({ \ +__xadd((ptr), (unsigned long)(v), sizeof(*(ptr))); \ +}) Assuming only xadd() is supposed to be used directly, perhaps the easiest would be to cast v to typeof(*(ptr)) (instead of unsigned long) here? I don't see how this helps. Did you perhaps mean cast the result? #define xadd(ptr, v) ({\ (typeof *(ptr))__xadd(ptr, (unsigned long)(v), \ sizeof(*(ptr))); \ }) Casting the result would work too; casting the input would have the same effect because (as said) the actual xadd doesn't alter bits 8...63 (or 16...63 in the 16-bit case), i.e. whether zero extension happens before or after doing the xadd doesn't matter. Oh yes, of course. Any preference to which method? Not really - I guess the way you proposed it is the more obvious one. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 04/21/2015 03:01 AM, Jan Beulich wrote: On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote: Changes in v7: * Break from the loop when -ENODEV is encountered This seems pretty inefficient for the caller. Returning a bad identifier other than XEN_INVALID_NODE_ID would seem better to me. That would mean that there is really no reason to return -ENODEV, which I thought you wanted to see. In that case (i.e. if no error needs to be returned) yes, a new token would make things simpler. --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. True, it *is* safe, but why then cputopoinfo and numainfo are in this list? They look to be safe as well. -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v2 05/11] vmx: add new data structure member to support PML
From: Kai Huang [mailto:kai.hu...@linux.intel.com] Sent: Tuesday, April 21, 2015 2:05 PM On 04/17/2015 10:31 AM, Kai Huang wrote: On 04/17/2015 06:39 AM, Tian, Kevin wrote: From: Kai Huang [mailto:kai.hu...@linux.intel.com] Sent: Wednesday, April 15, 2015 3:04 PM A new 4K page pointer is added to arch_vmx_struct as PML buffer for vcpu. And a new 'status' field is added to vmx_domain to indicate whether PML is enabled for the domain or not. The 'status' field also can be used for further similiar purpose. not sure about the last sentence. what's the similar purpose to whether PML is enabled? :-) I mean potentially there might be such feature in the future, and I can't give you an example right now. If you are just commenting the description here but fine with the current code, I can remove that last sentence if you like. Or do you suggest to just use a bool_t pml_enabled? I am fine with both, but looks there's no objection from others so I intend to keep it as 'unsigned int status', if you agree. Hi Kevin, What's your opinion here? Is 'unsigned int status' OK to you? yes Note both new members don't have to be initialized to zero explicitly as both vcpu and domain structure are zero-ed when they are created. no initialization in this patch, so why explaining it here? OK. Looks it's a common sense to all of you so I'll just remove this sentence. Signed-off-by: Kai Huang kai.hu...@linux.intel.com --- xen/include/asm-x86/hvm/vmx/vmcs.h | 7 +++ 1 file changed, 7 insertions(+) diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h index f831a78..2c679ac 100644 --- a/xen/include/asm-x86/hvm/vmx/vmcs.h +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h @@ -70,8 +70,12 @@ struct ept_data { cpumask_var_t synced_mask; }; +#define _VMX_DOMAIN_PML_ENABLED0 +#define VMX_DOMAIN_PML_ENABLED (1ul _VMX_DOMAIN_PML_ENABLED) struct vmx_domain { unsigned long apic_access_mfn; +/* VMX_DOMAIN_* */ +unsigned long status; }; struct pi_desc { @@ -142,6 +146,9 @@ struct arch_vmx_struct { /* Bitmap to control vmexit policy for Non-root VMREAD/VMWRITE */ struct page_info *vmread_bitmap; struct page_info *vmwrite_bitmap; + +#define NR_PML_ENTRIES 512 +struct page_info *pml_pg; move the macro out of the structure. OK. I will move it just above the declaration of struct arch_vmx_struct. and is pml_buffer or pml_buf more clear? To me pml_buffer or pml_buf is more likely a virtual address you can access the buffer directly, while pml_pg indicates it's a pointer of struct page_info. If you you look at patch 6, you can find statements like: uint64_t *pml_buf; pml_buf = __map_domain_page(v-arch.hvm_vmx.pml_pg); So I intend to keep it. And this one? Are you OK with 'pml_pg'? good to me too. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote: -paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) +static paddr_t __p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) I'd have been inclined to make this take a p2m_domain* rather than a domain*, on the basis that the caller must have one in hand to have locked it. I don't think __p2m_lookup uses d other than to find the p2m. Not worth changing unless there is some other reason to resend though, so with or without that: Acked-by: Ian Campbell ian.campb...@citrix.com Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 21.04.15 at 14:56, boris.ostrov...@oracle.com wrote: On 04/21/2015 03:01 AM, Jan Beulich wrote: On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote: Changes in v7: * Break from the loop when -ENODEV is encountered This seems pretty inefficient for the caller. Returning a bad identifier other than XEN_INVALID_NODE_ID would seem better to me. That would mean that there is really no reason to return -ENODEV, which I thought you wanted to see. In that case (i.e. if no error needs to be returned) yes, a new token would make things simpler. All I asked for was to make the two cases (no device and device with no known node) distinguishable. --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. True, it *is* safe, but why then cputopoinfo and numainfo are in this list? They look to be safe as well. Because at the time the list got composed we weren't able to spend time looking at _all_ of the operations, and hence we simply added them all (with a few exceptions on the domctl side iirc). Quite likely the two you mention could be removed from the list, but such needs to happen with proper reasoning in the patch description. Feel free to submit a patch. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 04/21/2015 09:14 AM, Andrew Cooper wrote: On 21/04/15 13:56, Boris Ostrovsky wrote: On 04/21/2015 03:01 AM, Jan Beulich wrote: --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. True, it *is* safe, but why then cputopoinfo and numainfo are in this list? They look to be safe as well. This list includes not yet audited. It is quite likely that some entries in the list are safe. fwiw, neither cputopo nor numainfo are currently safe for very large systems. Why would safety of an operation depend on system size? And how are these two unsafe? (Because maybe then PCI topology query is unsafe in the same manner) -boris -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree
On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote: I think we need to disable the build on architectures other than x86, see grub for example Eventually we might want to build our own grub on ARM in order to pick up Fu Wei's multiboot for arm64 patches, until they enter distros? Or maybe Raisin on UEFI should be calling efibootmgr to register Xen directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it currently works even on x86. Do we? There's no reason a blktap2 kernel module couldn't be built on ARM, is there? Maybe not, but I am pretty sure that it doesn't work at the moment. I don't think that the userspace stuff even compiles on ARM. Eventually we might have blktap on ARM, but I don't want to enable stuff in Raisin that we know it does not work. Especially if it is already to a greater or lesser extent deprecated (in favour of eventual blktap3) even on x86. And in any case the long-term plan is to get blktap3 working as well, which doesn't require kernel-space drivers. Or do you mean on systems other than Linux? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] effect of ticks in xen credit scheduler
Hi, I am student starting to work with Xen. I am using Xen 4.2.0. I am trying to understand the functioning of the credit scheduler in Xen. I try to modify the scheduler number of ticks per timeslice (form 3 to 1). I wonder that the is no difference in the scheduling of my virtual machines (No change in CPU allocation % and also application performance in the virtual machine). I just wanted to know what is the use of tich during time slice? Why do xen define 3 ticks per timeslice? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/configure.ac: generate Paths.mk if it's not available
On Mon, Apr 20, 2015 at 03:07:38PM +0100, Wei Liu wrote: Xen toolstack references many variables in Paths.mk when building and installing, so tools' configure should generate Paths.mk if it's not available. Also make inclusion of Paths.mk mandatory in Tools.mk. Hmm... I just discovered that docs build also involves Paths.mk. This patch is ugly enough that I don't want to duplicate it for docs. So advise on how to fix this would be much appreciated. Or we can state clear that anyone who builds Xen from source needs to run ./configure in top level directory, not the ones in subsystems. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455
On 21.04.15 at 10:24, li...@eikelenboom.it wrote: Tuesday, April 21, 2015, 10:11:07 AM, you wrote: Interesting - didn't you say that as a side effect of Andrew's patch you saw massive log spam? If you mean these: (XEN) [2015-04-12 14:55:20.226] p2m.c:884:d0v0 gfn_to_mfn failed! gfn=001ed type:4 [...] Those were actually due to Konrad's kernel patch that was on the devel-4.1 branch that has already been dropped. (commit 22d8a8938407cb1342af763e937fdf9ee8daf24a 'xen/pciback: Don't disable PCI_COMMAND on PCI device reset.') Ah, okay. Iirc there was no progress towards a resolution there yet? For the rest there is some extra log spam now, since the memory maps now are done in very small chunks (the hypercall continuation stuff working?): (XEN) [2015-04-21 08:04:01.207] memory_map:add: dom20 gfn=ec780 mfn=cc780 nr=40 [...] Don't know if that makes much sense anymore (unless specifically enabled if you want such detail .. and the whole range with perhaps a start and finish message is not enough) The hypervisor can't really tell whether a re-invocation of said hypercall is a continuation or a new request. Hence we can only either drop the message altogether or live with it being spammy on large regions (it's a XENLOG_G_INFO one anyway, so not enabled by default, and if enabled usually rate limited). Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 13/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDRA
On Tue, 2015-04-21 at 13:26 +0500, Julien Grall wrote: Hi Ian, On 17/04/2015 16:51, Ian Campbell wrote: Furthermore, this is the only registers not handled on AArch32 for this bit. This is rather strange to list them while you didn't do it for the trace registers. My intention was that every register trapped by a bit which we set be listed somewhere, to make it easier to cross reference with the docs and check we haven't accidentally forgotten something (as opposed to deliberately ignoring as indicated by these comments). You seem to be saying I've missed some trace registers, which ones? I meant that you didn't list the trace registers trapped but unhandled. Although I wasn't able to find a list, is it trace module specific? If so maybe a comment would be good? I think maybe you are talking about the things trapped by CPTR_EL2.TTA rather than MDCR_EL2.TDRA (the subject of this patch)? The table referenced for CPTR_EL2.TTA just says All implemented trace registers, rather than listing anything specific. I could add a similar comment to the relevant patch. Looks like HCR_EL2.TIDCP is similarly lacking a comment for the unhandled ones. I'll add one. Regards, ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 07/13] x86: dynamically get/set CBM for a domain
On Mon, Apr 20, 2015 at 04:52:09PM +0100, Andrew Cooper wrote: On 17/04/15 15:33, Chao Peng wrote: For CAT, COS is maintained in hypervisor only while CBM is exposed to user space directly to allow getting/setting domain's cache capacity. For each specified CBM, hypervisor will either use a existed COS which has the same CBM or allocate a new one if the same CBM is not found. If the allocation fails because of no enough COS available then error is returned. The getting/setting are always operated on a specified socket. For multiple sockets system, the interface may be called several times. Signed-off-by: Chao Peng chao.p.p...@linux.intel.com --- Changes in v5: * Add spin_lock to protect cbm_map. --- +for ( cos = 0; cos = info-cos_max; cos++ ) +{ +/* If still not found, then keep unused one. */ +if ( !find cos != 0 map[cos].ref == 0 ) +find = map + cos; +else if ( map[cos].cbm == cbm ) +{ +if ( unlikely(cos == old_cos) ) +return 0; +find = map + cos; +break; +} +} + +/* If old cos is referred only by the domain, then use it. */ +if ( !find map[old_cos].ref == 1 ) +find = map + old_cos; + +if ( !find ) +return -EUSERS; + +cos = find - map; +if ( find-cbm != cbm ) +{ +ret = write_l3_cbm(socket, cos, cbm); +if ( ret ) +return ret; +find-cbm = cbm; +} + +spin_lock(info-cbm_lock); +find-ref++; +map[old_cos].ref--; +spin_unlock(info-cbm_lock); The spinlock must cover read accesses as well, or old_cos is liable to be stale by this point. You mean map[old_cos].ref and find-ref are stale, right? old_cos itself seems not need to be protected. It might be better to split into a rw_lock as it is read often but modifications should be very rare. NP, thanks. Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCHv3 3/4] xen: use ticket locks for spin locks
Replace the byte locks with ticket locks. Ticket locks are: a) fair; and b) peform better when contented since they spin without an atomic operation. The lock is split into two ticket values: head and tail. A locker acquires a ticket by (atomically) increasing tail and using the previous tail value. A CPU holds the lock if its ticket == head. The lock is released by increasing head. Architectures need only provide an xadd() implementation. Signed-off-by: David Vrabel david.vra...@citrix.com --- xen/common/spinlock.c | 110 xen/include/xen/spinlock.h | 16 +-- 2 files changed, 72 insertions(+), 54 deletions(-) diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c index 5fd8b1c..a2170d2 100644 --- a/xen/common/spinlock.c +++ b/xen/common/spinlock.c @@ -115,16 +115,32 @@ void spin_debug_disable(void) #endif +static always_inline spinlock_tickets_t observe_lock(spinlock_tickets_t *t) +{ +spinlock_tickets_t v; + +smp_rmb(); +v.head_tail = read_atomic(t-head_tail); +return v; +} + +static always_inline u16 observe_head(spinlock_tickets_t *t) +{ +smp_rmb(); +return read_atomic(t-head); +} + void _spin_lock(spinlock_t *lock) { +spinlock_tickets_t tickets = { .tail = 1, }; LOCK_PROFILE_VAR; check_lock(lock-debug); -while ( unlikely(!_raw_spin_trylock(lock-raw)) ) +tickets.head_tail = xadd(lock-tickets.head_tail, tickets.head_tail); +while ( tickets.tail != observe_head(lock-tickets) ) { LOCK_PROFILE_BLOCK; -while ( likely(_raw_spin_is_locked(lock-raw)) ) -cpu_relax(); +cpu_relax(); } LOCK_PROFILE_GOT; preempt_disable(); @@ -132,76 +148,58 @@ void _spin_lock(spinlock_t *lock) void _spin_lock_irq(spinlock_t *lock) { -LOCK_PROFILE_VAR; - ASSERT(local_irq_is_enabled()); local_irq_disable(); -check_lock(lock-debug); -while ( unlikely(!_raw_spin_trylock(lock-raw)) ) -{ -LOCK_PROFILE_BLOCK; -local_irq_enable(); -while ( likely(_raw_spin_is_locked(lock-raw)) ) -cpu_relax(); -local_irq_disable(); -} -LOCK_PROFILE_GOT; -preempt_disable(); +_spin_lock(lock); } unsigned long _spin_lock_irqsave(spinlock_t *lock) { unsigned long flags; -LOCK_PROFILE_VAR; local_irq_save(flags); -check_lock(lock-debug); -while ( unlikely(!_raw_spin_trylock(lock-raw)) ) -{ -LOCK_PROFILE_BLOCK; -local_irq_restore(flags); -while ( likely(_raw_spin_is_locked(lock-raw)) ) -cpu_relax(); -local_irq_disable(); -} -LOCK_PROFILE_GOT; -preempt_disable(); +_spin_lock(lock); return flags; } void _spin_unlock(spinlock_t *lock) { +smp_mb(); preempt_enable(); LOCK_PROFILE_REL; -_raw_spin_unlock(lock-raw); +lock-tickets.head++; } void _spin_unlock_irq(spinlock_t *lock) { -preempt_enable(); -LOCK_PROFILE_REL; -_raw_spin_unlock(lock-raw); +_spin_unlock(lock); local_irq_enable(); } void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags) { -preempt_enable(); -LOCK_PROFILE_REL; -_raw_spin_unlock(lock-raw); +_spin_unlock(lock); local_irq_restore(flags); } int _spin_is_locked(spinlock_t *lock) { check_lock(lock-debug); -return _raw_spin_is_locked(lock-raw); +return lock-tickets.head != lock-tickets.tail; } int _spin_trylock(spinlock_t *lock) { +spinlock_tickets_t old, new; + check_lock(lock-debug); -if ( !_raw_spin_trylock(lock-raw) ) +old = observe_lock(lock-tickets); +if ( old.head != old.tail ) +return 0; +new = old; +new.tail++; +if ( cmpxchg(lock-tickets.head_tail, old.head_tail, new.head_tail) + != old.head_tail ) return 0; #ifdef LOCK_PROFILE if (lock-profile) @@ -213,27 +211,32 @@ int _spin_trylock(spinlock_t *lock) void _spin_barrier(spinlock_t *lock) { +spinlock_tickets_t sample; #ifdef LOCK_PROFILE s_time_t block = NOW(); -u64 loop = 0; +#endif check_barrier(lock-debug); -do { smp_mb(); loop++;} while ( _raw_spin_is_locked(lock-raw) ); -if ((loop 1) lock-profile) +sample = observe_lock(lock-tickets); +if (sample.head != sample.tail) { -lock-profile-time_block += NOW() - block; -lock-profile-block_cnt++; -} -#else -check_barrier(lock-debug); -do { smp_mb(); } while ( _raw_spin_is_locked(lock-raw) ); +while (observe_head(lock-tickets) != sample.tail) +{ +#ifdef LOCK_PROFILE +if (lock-profile) +{ +lock-profile-time_block += NOW() - block; +lock-profile-block_cnt++; +} #endif +} +} smp_mb(); } int _spin_trylock_recursive(spinlock_t *lock) { -int cpu = smp_processor_id(); +unsigned int cpu = smp_processor_id(); /* Don't
[Xen-devel] [PATCHv3 4/4] x86, arm: remove asm/spinlock.h from all architectures
Now that all architecture use a common ticket lock implementation for spinlocks, remove the architecture specific byte lock implementations. Signed-off-by: David Vrabel david.vra...@citrix.com Reviewed-by: Tim Deegan t...@xen.org Acked-by: Jan Beulich jbeul...@suse.com --- xen/arch/arm/README.LinuxPrimitives | 28 --- xen/include/asm-arm/arm32/spinlock.h | 66 -- xen/include/asm-arm/arm64/spinlock.h | 63 xen/include/asm-arm/spinlock.h | 23 xen/include/asm-x86/spinlock.h | 37 --- xen/include/xen/spinlock.h |1 - 6 files changed, 218 deletions(-) delete mode 100644 xen/include/asm-arm/arm32/spinlock.h delete mode 100644 xen/include/asm-arm/arm64/spinlock.h delete mode 100644 xen/include/asm-arm/spinlock.h delete mode 100644 xen/include/asm-x86/spinlock.h diff --git a/xen/arch/arm/README.LinuxPrimitives b/xen/arch/arm/README.LinuxPrimitives index 7f33fc7..3115f51 100644 --- a/xen/arch/arm/README.LinuxPrimitives +++ b/xen/arch/arm/README.LinuxPrimitives @@ -25,16 +25,6 @@ linux/arch/arm64/include/asm/atomic.h xen/include/asm-arm/arm64/atomic.h - -spinlocks: last sync @ v3.16-rc6 (last commit: 95c4189689f9) - -linux/arch/arm64/include/asm/spinlock.h xen/include/asm-arm/arm64/spinlock.h - -Skipped: - 5686b06 arm64: lockref: add support for lockless lockrefs using cmpxchg - 52ea2a5 arm64: locks: introduce ticket-based spinlock implementation - -- - mem*: last sync @ v3.16-rc6 (last commit: d875c9b37240) linux/arch/arm64/lib/memchr.S xen/arch/arm/arm64/lib/memchr.S @@ -103,24 +93,6 @@ linux/arch/arm/include/asm/atomic.h xen/include/asm-arm/arm32/atomic.h - -spinlocks: last sync: 15e7e5c1ebf5 - -linux/arch/arm/include/asm/spinlock.h xen/include/asm-arm/arm32/spinlock.h - -*** Linux has switched to ticket locks but we still use bitlocks. - -resync to v3.14-rc7: - - 7c8746a ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev - 0cbad9c ARM: 7854/1: lockref: add support for lockless lockrefs using cmpxchg64 - 9bb17be ARM: locks: prefetch the destination word for write prior to strex - 27a8479 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock. - 00efaa0 ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free lo - afa31d8 ARM: 7811/1: locks: use early clobber in arch_spin_trylock - 73a6fdc ARM: spinlock: use inner-shareable dsb variant prior to sev instruction - -- - mem*: last sync @ v3.16-rc6 (last commit: d98b90ea22b0) linux/arch/arm/lib/copy_template.S xen/arch/arm/arm32/lib/copy_template.S diff --git a/xen/include/asm-arm/arm32/spinlock.h b/xen/include/asm-arm/arm32/spinlock.h deleted file mode 100644 index bc0343c..000 --- a/xen/include/asm-arm/arm32/spinlock.h +++ /dev/null @@ -1,66 +0,0 @@ -#ifndef __ASM_ARM32_SPINLOCK_H -#define __ASM_ARM32_SPINLOCK_H - -static inline void dsb_sev(void) -{ -__asm__ __volatile__ ( -dsb\n -sev\n -); -} - -typedef struct { -volatile unsigned int lock; -} raw_spinlock_t; - -#define _RAW_SPIN_LOCK_UNLOCKED { 0 } - -#define _raw_spin_is_locked(x) ((x)-lock != 0) - -static always_inline void _raw_spin_unlock(raw_spinlock_t *lock) -{ -ASSERT(_raw_spin_is_locked(lock)); - -smp_mb(); - -__asm__ __volatile__( - str %1, [%0]\n -: -: r (lock-lock), r (0) -: cc); - -dsb_sev(); -} - -static always_inline int _raw_spin_trylock(raw_spinlock_t *lock) -{ -unsigned long contended, res; - -do { -__asm__ __volatile__( - ldrex %0, [%2]\n - teq %0, #0\n - strexeq %1, %3, [%2]\n - movne %1, #0\n -: =r (contended), =r (res) -: r (lock-lock), r (1) -: cc); -} while (res); - -if (!contended) { -smp_mb(); -return 1; -} else { -return 0; -} -} - -#endif /* __ASM_SPINLOCK_H */ -/* - * Local variables: - * mode: C - * c-file-style: BSD - * c-basic-offset: 4 - * indent-tabs-mode: nil - * End: - */ diff --git a/xen/include/asm-arm/arm64/spinlock.h b/xen/include/asm-arm/arm64/spinlock.h deleted file mode 100644 index 5ae034d..000 --- a/xen/include/asm-arm/arm64/spinlock.h +++ /dev/null @@ -1,63 +0,0 @@ -/* - * Derived from Linux arch64 spinlock.h which is: - * Copyright (C) 2012 ARM Ltd. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the
[Xen-devel] [PATCHv3 2/4] arm: provide xadd()
xadd() atomically adds a value and returns the previous value. This is needed to implement ticket locks. This generic arm implementation uses the GCC __sync_fetch_and_add() builtin. This builtin resulted in suitable inlined asm for GCC 4.8.3 (arm64) and GCC 4.6.3 (arm32). Signed-off-by: David Vrabel david.vra...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com --- xen/include/asm-arm/system.h |2 ++ 1 file changed, 2 insertions(+) diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h index ce3d38a..367af1d 100644 --- a/xen/include/asm-arm/system.h +++ b/xen/include/asm-arm/system.h @@ -51,6 +51,8 @@ # error unknown ARM variant #endif +#define xadd(x, v) __sync_fetch_and_add(x, v) + extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next); #endif -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCHv3 1/4] x86: provide xadd()
xadd() atomically adds a value and returns the previous value. This is needed to implement ticket locks. Signed-off-by: David Vrabel david.vra...@citrix.com --- xen/include/asm-x86/system.h | 47 ++ 1 file changed, 47 insertions(+) diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h index 7111329..f244c8d 100644 --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -118,6 +118,53 @@ static always_inline unsigned long __cmpxchg( }) /* + * Undefined symbol to cause link failure if a wrong size is used with + * xadd(). + */ +extern unsigned long __bad_xadd_size(void); + +static always_inline unsigned long __xadd( +volatile void *ptr, unsigned long v, int size) +{ +switch ( size ) +{ +case 1: +asm volatile ( lock; xaddb %b0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); +return v; +case 2: +asm volatile ( lock; xaddw %w0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); +return v; +case 4: +asm volatile ( lock; xaddl %k0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); +return v; +case 8: +asm volatile ( lock; xaddq %q0,%1 + : +r (v), +m (*__xg((volatile void *)ptr)) + :: memory); + +return v; +default: +return __bad_xadd_size(); +} +} + +/* + * Atomically add @v to the 1, 2, 4, or 8 byte value at @ptr. Returns + * the previous value. + * + * This is a full memory barrier. + */ +#define xadd(ptr, v) ({ \ +__xadd((ptr), (unsigned long)(v), sizeof(*(ptr))); \ +}) + +/* * Both Intel and AMD agree that, from a programmer's viewpoint: * Loads cannot be reordered relative to other loads. * Stores cannot be reordered relative to other stores. -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT
On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: This is the xc/xl changes to support Intel Cache Allocation Technology(CAT). Two commands are introduced: - xl psr-cat-hwinfo Show CAT hardware information. Examples: [root@vmm-psr vmm]# xl psr-cat-hwinfo Cache Allocation Technology (CAT): Socket ID : 0 L3 Cache: 12288KB Maximum COS : 15 CBM length : 12 Default CBM : 0xfff Or, you can rename the psr-cmt-hwinfo command, added in the previous patch, to 'psr-hwinfo' and make it accept options, e.g., -m, --cmt show Cache Monitoring Technology (CMT) hardware info -c, --cat show Cache Allocation Technology (CAT) hardware info By default (i.e., no options provided), it can just print all the hw info. Not a big deal, but I think that would make a better command line interface. Tools' maintainers' call, I guess. Thanks for suggestion. --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h + +int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid, + xc_psr_cat_type type, uint32_t target, + uint64_t data); +int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid, + xc_psr_cat_type type, uint32_t target, + uint64_t *data); So, for this twos, 'target' is the socket you want to act on. +int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket, + uint32_t *cos_max, uint32_t *cbm_len); While here you use 'socket', to mean the same thing. That looks rather inconsistent. Since it's a socket we are talking about, why not 'socket' everywhere? The idea behind here is: All the places that appear as 'target' imply there are possible values other than just socket (e.g. considering L2 Cache Allocation in the future). So 'target' is always paired with a 'psr_cat_type' parameter. For routines that only work for L3 (e.g. xc_psr_cat_get_l3_info) then 'socket' is used. I admit that it looks inconsistent, perhaps rename all 'socket' to 'target'? --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h +#ifdef LIBXL_HAVE_PSR_CAT + +#define LIBXL_PSR_TARGET_ALL (~0U) +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid, + libxl_psr_cbm_type type, uint32_t target, + uint64_t cbm); +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid, + libxl_psr_cbm_type type, uint32_t target, + uint64_t *cbm_r); + The same applies here: I'd rename taget to socket. Then there's that LIBXL_PSR_TARGET_ALL. So, target (or socket, if you'll rename it) is an integer representing _which_one_ socket to act on. However, there is a special value to mean all sockets. Another possibility would be to offer an API that natively allows for operating on multiple sockets, by using libxl_bitmap-s, as it happens in many other places, in libxl itself (e.g., setting and getting vcpu affinity). That means target would become a libxl_bitmap, and, in the implementation, you'll apply the operation on all the sockets corresponding to a set bit in there. Only one bit set means just that socket, all bits means all sockets. This looks like a better interface to me (no need for special ~0U values), it'd make the implementation more linear, and is more consistent with how other similar situations are handled in libxl. However, I appreciate that one may find it overkill... I guess it depends whether we expect the prevalent usage pattern to be almost always about single sockets --and maybe on all sockets, from time to time-- or if we see value in being able to specify more than one and less than all sockets. For instance, now that we have vNUMA, if a domain has 4 vNUMA nodes, each one mapped to a physical NUMA node, it looks to me like it would make sense to set CAT to, say, 0x0F, on the sockets corresponding to the physical NODEs. With the interface in this patch, that would require calling libxl_psr_cat_set_cbm() 4 times, with the libxl_bitmap approach, just once, after setting up the bitmap properly. Thoughts? I do like this suggestion and I have ever considered it actually. The only thing prevents me is that we need an extra _get_socket_count in xl for TARGET_ALL case. So libxl__count_physical_sockets is needed to be public. If Ian/Wei have no concerns for this, then I'm glad to do this. --- a/tools/libxl/libxl_psr.c +++ b/tools/libxl/libxl_psr.c +int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info, + uint32_t *nr) +{ +GC_INIT(ctx); +int rc, r; +uint32_t i, nr_sockets; +libxl_psr_cat_info *ptr; + +rc = libxl__count_physical_sockets(gc, nr_sockets);
Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree
On 04/21/2015 10:25 AM, Ian Campbell wrote: On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote: I think we need to disable the build on architectures other than x86, see grub for example Eventually we might want to build our own grub on ARM in order to pick up Fu Wei's multiboot for arm64 patches, until they enter distros? Or maybe Raisin on UEFI should be calling efibootmgr to register Xen directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it currently works even on x86. Do we? There's no reason a blktap2 kernel module couldn't be built on ARM, is there? Maybe not, but I am pretty sure that it doesn't work at the moment. I don't think that the userspace stuff even compiles on ARM. Eventually we might have blktap on ARM, but I don't want to enable stuff in Raisin that we know it does not work. Especially if it is already to a greater or lesser extent deprecated (in favour of eventual blktap3) even on x86. So from my discussions w/ the XenServer guys, it seems that: 1. The master branch of the blktap.git repo contains support for *both* blktap3 and blktap2.5 (with a kernel module) 2. XenServer uses blktap3 for guest access, but still use the blktap2.5 w/ kernel module for dom0 access to guest disks, to avoid the possibility of hitting a scalability limit due to grant references. So from raisin's perspective, the only difference between blktap2.5 and blktap3 is using the master branch rather than the blktap2 branch of the repo. Whether we maintain support for blktap2.5 in libxl is a matter for the Xen maintainers; but if xapi is ever going to start using libxl, it will certainly need to be able to do so. (Dave / David, please correct me if I'm wrong.) That said, there's no harm in disabling it on ARM to begin with, and enabling it once blktap3 works. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 2/7] Changes to support '/boot' leading paths of kernel, xen, in grub
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote: Support situations of grub that have vmlinuz and other things starting with path of '/boot' rather than '/'. Signed-off-by: longtao.pang longtaox.p...@intel.com Acked-by: Ian Campbell ian.campb...@citrix.com --- Osstest/Debian.pm |8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm index 378af54..1a57cdd 100644 --- a/Osstest/Debian.pm +++ b/Osstest/Debian.pm @@ -449,21 +449,21 @@ sub setupboot_grub2 () { if (m/^submenu\s+[\'\](.*)[\'\].*\{\s*$/) { $submenu={ StartLine =$.}; } -if (m/^\s*multiboot\s*\/(xen\-[0-9][-+.0-9a-z]*\S+)/) { +if (m/^\s*multiboot\s*(?:\/boot)?\/(xen\S+)/) { die unless $entry; $entry-{Hv}= $1; } -if (m/^\s*multiboot\s*\/(vmlinu[xz]-(\S+))/) { +if (m/^\s*multiboot\s*(?:\/boot)?\/(vmlinu[xz]-(\S+))/) { die unless $entry; $entry-{KernOnly}= $1; $entry-{KernVer}= $2; } -if (m/^\s*module\s*\/(vmlinu[xz]-(\S+))/) { +if (m/^\s*module\s*(?:\/boot)?\/(vmlinu[xz]-(\S+))/) { die unless $entry; $entry-{KernDom0}= $1; $entry-{KernVer}= $2; } -if (m/^\s*module\s*\/(initrd\S+)/) { +if (m/^\s*module\s*(?:\/boot)?\/(initrd\S+)/) { $entry-{Initrd}= $1; } if (m/^\s*module\s*\/(xenpolicy\S+)/) { ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 1/7] parsing grub which has 'submenu' primitive
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote: From a hvm kernel build from Linux stable Kernel tree, the auto generated grub2 menu will have 'submenu' primitive, upon the 'menuentry' items. Xen boot entries will be grouped into a submenu. This patch adds capability to support such grub formats. Signed-off-by: longtao.pang longtaox.p...@intel.com I think this won't work with nested submenus, but I assume we don't see them at least with Jessie so we can live with this. I'm also not sure it would cope with a submenu declared within a function, but again I expect we aren't seeing those in practice. So given this handles the grub.cfg files we see in practice: Acked-by: Ian Campbell ian.campb...@citrix.com I'd also like to see some of our delta in overlay/etc/grub.d/20_linux_xen which was added to remove the possibility of submenus to be removed, see below. Wei, per the comment below I think we should update the grub BR with a fixed patch, like the attached. Ian. - From e082a82a3036152797447ac4c809e3b67e68cd12 Mon Sep 17 00:00:00 2001 From: Ian Campbell ian.campb...@citrix.com Date: Tue, 21 Apr 2015 11:06:06 +0100 Subject: [PATCH] grub: remove patch to disable submenu from 20_linux_xen overlay setupboot_grub2 now supports submenus, so we can reduce our delta vs upstream a bit. I started by extracting 20_linux_xen from http://snapshot.debian.org/archive/debian/20130703T094657Z/pool/main/g/grub2/grub-common_1.99-27%2Bdeb7u2_amd64.deb and then applying the patch at http://savannah.gnu.org/file/grub.patch?file_id=32276 (the patch from grub bug #42420 at http://savannah.gnu.org/bugs/?43420) and reinstating the comment at the top of the file (modified to drop the reference to the Debian bug. This left me with some spurious changes: @@ -93,7 +93,7 @@ linux_entry () if test ! -e ${xen_dirname}/${xenpolicy} ; then return fi - xen_args=`echo $xen_args flask=enforcing` + xen_args=`echo $xen_args flask_enabled=1 flask_enforcing=1` if ${recovery} ; then title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux %s (recovery mode)) else @@ -137,7 +137,6 @@ EOF echo'$message' module ${rel_dirname}/${xenpolicy} EOF - fi cat EOF } EOF I think these are bugs in the patch in the grub BTS, which were fixed while iterating over the XSM series in osstest but didn't make it into the upstream version, the fixes to those bugs are reverted byu the above. So I have manually reverted them. Signed-off-by: Ian Campbell ian.campb...@citrix.com --- Wei, if you agree wrt those changes I'll update the bug, or perhaps you want to? --- overlay/etc/grub.d/20_linux_xen | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/overlay/etc/grub.d/20_linux_xen b/overlay/etc/grub.d/20_linux_xen index 5315e2a..aaead1b 100755 --- a/overlay/etc/grub.d/20_linux_xen +++ b/overlay/etc/grub.d/20_linux_xen @@ -1,7 +1,7 @@ #! /bin/sh # Copied from the identically named file in grub-common 1.99-27+deb7u2. -# This version fixed Debian bug #690538 and GRUB bug #43420. +# This version fixes GRUB bug #43420. set -e @@ -173,6 +173,7 @@ while [ x${xen_list} != x ] ; do xen_dirname=`dirname ${current_xen}` rel_xen_dirname=`make_system_path_relative_to_its_root $xen_dirname` xen_version=`echo $xen_basename | sed -e s,.gz$,,g;s,^xen-,,g` +echo submenu \Xen ${xen_version}\ { while [ x$list != x ] ; do linux=`version_find_latest $list` echo Found linux image: $linux 2 @@ -214,5 +215,6 @@ while [ x${xen_list} != x ] ; do list=`echo $list | tr ' ' '\n' | grep -vx $linux | tr '\n' ' '` done +echo } xen_list=`echo $xen_list | tr ' ' '\n' | grep -vx $current_xen | tr '\n' ' '` done -- 2.1.4 --- /home/ianc/tmp/x/etc/grub.d/20_linux_xen 2013-07-03 04:39:20.0 +0100 +++ overlay/etc/grub.d/20_linux_xen 2015-04-21 11:09:57.777812773 +0100 @@ -81,10 +85,27 @@ recovery=$4 args=$5 xen_args=$6 - if ${recovery} ; then -title=$(gettext_quoted %s, with Xen %s and Linux %s (recovery mode)) + xsm=$7 + # If user wants to enable XSM support, make sure there's + # corresponding policy file. + if ${xsm} ; then + xenpolicy=`echo xenpolicy-$xen_version` + if test ! -e ${xen_dirname}/${xenpolicy} ; then + return + fi + xen_args=`echo $xen_args flask=enforcing` + if ${recovery} ; then + title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux %s (recovery mode)) + else + title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux %s) + fi else -title=$(gettext_quoted %s, with Xen %s and Linux %s) + xenpolicy= + if ${recovery} ; then + title=$(gettext_quoted %s, with Xen %s and Linux %s (recovery mode)) + else + title=$(gettext_quoted %s, with Xen %s and Linux %s) + fi fi printf menuentry '${title}' ${CLASS}
Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455
On 20.04.15 at 20:50, li...@eikelenboom.it wrote: Monday, April 20, 2015, 6:11:42 PM, you wrote: On 16.04.15 at 11:28, t...@xen.org wrote: At 22:35 +0100 on 11 Apr (1428791713), Andrew Cooper wrote: At least we now understand why it happens. I will defer to others CC'd on this thread for their opinions in the matter. The patch semes like a pretty good check to me, though I'm not convinced it's race-free. At the least I'd cache the m2p lookup so we use the same value for the checks and the map_page() call. And IMO update_paging_mode() ought to log and reject bogus GFNs as well. could you give the patch below a try, namely also in the context of seeing again the issue originally fixed by Andrew's initial patch? Please make sure you try a debug build and you have iommu=debug on the Xen command line. I'm running with it now, have seen no issues so far ! Interesting - didn't you say that as a side effect of Andrew's patch you saw massive log spam? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 13/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDRA
Hi Ian, On 17/04/2015 16:51, Ian Campbell wrote: Furthermore, this is the only registers not handled on AArch32 for this bit. This is rather strange to list them while you didn't do it for the trace registers. My intention was that every register trapped by a bit which we set be listed somewhere, to make it easier to cross reference with the docs and check we haven't accidentally forgotten something (as opposed to deliberately ignoring as indicated by these comments). You seem to be saying I've missed some trace registers, which ones? I meant that you didn't list the trace registers trapped but unhandled. Although I wasn't able to find a list, is it trace module specific? If so maybe a comment would be good? Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 03/13] x86: detect and initialize Intel CAT feature
On Mon, Apr 20, 2015 at 06:13:12PM +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: --- a/xen/arch/x86/psr.c +++ b/xen/arch/x86/psr.c @@ -19,14 +19,25 @@ #include asm/psr.h #define PSR_CMT(10) +#define PSR_CAT(11) + +struct psr_cat_socket_info { +bool_t initialized; +bool_t enabled; +unsigned int cbm_len; +unsigned int cos_max; +}; Can't we ditch 'initialized' from within the struct and have a (global) bitmap, with one bit for each socket, expressing the same? And that also for 'enabled'. It's probably, at least up to a certain extent, a matter of taste (and I personally think it will look better), but it also should produce tighter code... NP, thanks. Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 11/13] tools/libxl: add command to show CMT hardware info
On Tue, Apr 21, 2015 at 02:37:54AM +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: Add dedicated one to show hardware information. [root@vmm-psr]xl psr-cmt-hwinfo Cache Monitoring Technology (CMT): Enabled : 1 Total RMID : 63 Supported monitor types: cache-occupancy total-mem-bandwidth local-mem-bandwidth Nice. Signed-off-by: Chao Peng chao.p.p...@linux.intel.com --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -8014,6 +8014,36 @@ out: } #ifdef LIBXL_HAVE_PSR_CMT +static int psr_cmt_hwinfo(void) +{ +int rc; +int enabled; +uint32_t total_rmid; + I think you should still have something like: SWITCH_FOREACH_OPT(opt, , NULL, psr-cmt-hwinfo, 0) { /* No options */ } Or `xl psr-cmt-hwinfo -h' wouldn't work, would it? With that done: Reviewed-by: Dario Faggioli dario.faggi...@citrix.com Agreed, I will add it. Chao ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree
On Tue, 21 Apr 2015, George Dunlap wrote: On 04/21/2015 10:25 AM, Ian Campbell wrote: On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote: I think we need to disable the build on architectures other than x86, see grub for example Eventually we might want to build our own grub on ARM in order to pick up Fu Wei's multiboot for arm64 patches, until they enter distros? Or maybe Raisin on UEFI should be calling efibootmgr to register Xen directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it currently works even on x86. Do we? There's no reason a blktap2 kernel module couldn't be built on ARM, is there? Maybe not, but I am pretty sure that it doesn't work at the moment. I don't think that the userspace stuff even compiles on ARM. Eventually we might have blktap on ARM, but I don't want to enable stuff in Raisin that we know it does not work. Especially if it is already to a greater or lesser extent deprecated (in favour of eventual blktap3) even on x86. So from my discussions w/ the XenServer guys, it seems that: 1. The master branch of the blktap.git repo contains support for *both* blktap3 and blktap2.5 (with a kernel module) 2. XenServer uses blktap3 for guest access, but still use the blktap2.5 w/ kernel module for dom0 access to guest disks, to avoid the possibility of hitting a scalability limit due to grant references. So from raisin's perspective, the only difference between blktap2.5 and blktap3 is using the master branch rather than the blktap2 branch of the repo. That is not entirely true: compiling and installing a kernel module is quite different from userspace stuff, at least in terms of dependencies and installation paths. Whether we maintain support for blktap2.5 in libxl is a matter for the Xen maintainers; but if xapi is ever going to start using libxl, it will certainly need to be able to do so. (Dave / David, please correct me if I'm wrong.) That said, there's no harm in disabling it on ARM to begin with, and enabling it once blktap3 works. Yes, I would the code in Raisin to actually work :-) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote: 1. Designate vif model by make-flight. 2. In L2 installation context, its host (L1) IP address is not queried from DNS, but after running ts-nested-setup + host + nested, L1 IP is stored in runvar. Signed-off-by: longtao.pang longtaox.p...@intel.com --- Changes in v8: Remove the unnecessary symbol of ';' in 'TestSupport.pm'. --- Osstest/TestSupport.pm |8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm index 1bde67d..4cdacfc 100644 --- a/Osstest/TestSupport.pm +++ b/Osstest/TestSupport.pm @@ -818,6 +818,10 @@ sub selecthost ($) { logm(Host $name is in HostGroup $ho-{Properties}{HostGroup}); } +if ( $r{${name}_ip} ) { +$setprop-(IpAddr, $r{${name}_ip}); +} This is one for Ian I think, but I suspect this should use ${ident} rather than ${name}. ${ident} is e.g. 'host' or 'srchost' or 'nestedl1' it is the prefer used on the runvar names. ${name} is the specific value assigned, i.e. an actual host name. For such properties we usually prefer ${ident}_foo. Ian, correct me if I'm wrong please. + # Next, we use the config file's general properites as defaults foreach my $k (keys %c) { next unless $k =~ m/^HostProp_([A-Z].*)$/; @@ -1548,11 +1552,13 @@ sub prepareguest_part_xencfg ($) { my $oncrash= $xopts-{OnCrash} || 'preserve'; my $vcpus= guest_var($gho, 'vcpus', $xopts-{DefVcpus} || 2); my $xoptcfg= $xopts-{ExtraConfig}; +my $vif= guest_var($gho, 'vifmodel',''); +my $vifmodel= $vif ? model=$vif : ''; $xoptcfg='' unless defined $xoptcfg; my $xencfg= END; name= '$gho-{Name}' memory = ${ram_mb} -vif = [ 'type=ioemu,mac=$gho-{Ether}' ] +vif = [ 'type=ioemu,${vifmodel},mac=$gho-{Ether}' ] If $vifmodel=='' then this will end up with a literal ,, in the configuration. I've not checked if this is allowed but for forms sake I would suggest to add the necessary , to $vifmodel when it is != ''. e.g. +my $vifmodel= $vif ? ,model=$vif : ''; [...] +vif = [ 'type=ioemu,mac=$gho-{Ether}${vifmodel}' ] Ian ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 16:24, ian.campb...@citrix.com wrote: On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? It is, but do we really want to introduce other than just compat mode helper interfaces (i.e. leaving the current ones alone, and perhaps even making the new ones tools only) if we really care about such setups in the first place? Jan At the moment I just followed Andrew's advice and will introduce a new XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as xen_pfn_t. The old memops I'll leave untouched if that's OK. Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] raisin: introduce ovmf
On 04/21/2015 03:55 PM, Stefano Stabellini wrote: diff --git a/components/series b/components/series index f0f3cfa..fe9092a 100644 --- a/components/series +++ b/components/series @@ -1,4 +1,5 @@ seabios +ovmf xen qemu qemu_traditional diff --git a/components/xen b/components/xen index b3426f0..b3a0c96 100644 --- a/components/xen +++ b/components/xen @@ -29,7 +29,8 @@ function xen_build() { cd xen-dir ./configure --prefix=$PREFIX --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \ --disable-qemu-traditional --enable-rombios \ ---with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin +--with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin \ +--with-system-ovmf=$BASEDIR/ovmf-dir/ovmf.bin Does this still work if you don't build ovmf? -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed
On Tue, 2015-04-21 at 16:00 +0100, Andrew Cooper wrote: On 21/04/15 15:12, Ian Campbell wrote: On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote: Nothing expensive depends on these results. But are those uses a problem? Less fluff in the build logs, and less load moving files around the filesystem. OK, then: Acked-by: Ian Campbell ian.campb...@citrix.com Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]
Prashant Sreedharan writes (Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]): On Fri, 2015-04-17 at 15:12 -0400, David Miller wrote: From: Konrad Rzeszutek Wilk konrad.w...@oracle.com Date: Fri, 17 Apr 2015 15:04:48 -0400 A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). ... As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson ian.jack...@eu.citrix.com Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Thanks. Tested-by: Ian Jackson ian.jack...@eu.citrix.com I'd appreciate it if this patch could make its way into the 3.14.y stable branch, as well as all the other places it's applicable of course. I've included another copy below for convenience, with acks etc. from this email thread folded in. I have tested 3.14.34 plus just this patch, with my usual kernel configuration input and the affected machine, and this fixes the problem completely AFAICT. I have tested both baremetal 32-bit with `iommu=soft swiotlb=force' (which previously corrupted all received network packets) and 32-bit Linux under 64-bit Xen without any special options (which previously gave 25-30% packet loss). Thanks, Ian. From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk konrad.w...@oracle.com Date: Fri, 17 Apr 2015 14:55:47 -0400 Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected A huge amount of NIC drivers use the DMA API, however if compiled under 32-bit an very important part of the DMA API can be ommitted leading to the drivers not working at all (especially if used with 'swiotlb=force iommu=soft'). As Prashant Sreedharan explains it: the driver [tg3] uses DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma mapping and dma_unmap_addr() to get the mapping value. On most of the platforms this is a no-op, but ... with iommu=soft and swiotlb=force this house keeping is required, ... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the DMA address. As such enable this even when using 32-bit kernels. Reported-by: Ian Jackson ian.jack...@eu.citrix.com Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Acked-by: David S. Miller da...@davemloft.net Acked-by: Prashant Sreedharan prash...@broadcom.com Tested-by: Ian Jackson ian.jack...@eu.citrix.com --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index b7d31ca..570c71d 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -177,7 +177,7 @@ config SBUS config NEED_DMA_MAP_STATE def_bool y - depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG + depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB config NEED_SG_DMA_LENGTH def_bool y -- 2.1.0 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT
On Tue, 2015-04-21 at 17:49 +0800, Chao Peng wrote: On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: This is the xc/xl changes to support Intel Cache Allocation Technology(CAT). Two commands are introduced: - xl psr-cat-hwinfo Show CAT hardware information. Examples: [root@vmm-psr vmm]# xl psr-cat-hwinfo Cache Allocation Technology (CAT): Socket ID : 0 L3 Cache: 12288KB Maximum COS : 15 CBM length : 12 Default CBM : 0xfff Or, you can rename the psr-cmt-hwinfo command, added in the previous patch, to 'psr-hwinfo' and make it accept options, e.g., -m, --cmt show Cache Monitoring Technology (CMT) hardware info -c, --cat show Cache Allocation Technology (CAT) hardware info By default (i.e., no options provided), it can just print all the hw info. Not a big deal, but I think that would make a better command line interface. Tools' maintainers' call, I guess. Thanks for suggestion. --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h + +int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid, + xc_psr_cat_type type, uint32_t target, + uint64_t data); +int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid, + xc_psr_cat_type type, uint32_t target, + uint64_t *data); So, for this twos, 'target' is the socket you want to act on. +int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket, + uint32_t *cos_max, uint32_t *cbm_len); While here you use 'socket', to mean the same thing. That looks rather inconsistent. Since it's a socket we are talking about, why not 'socket' everywhere? The idea behind here is: All the places that appear as 'target' imply there are possible values other than just socket (e.g. considering L2 Cache Allocation in the future). So 'target' is always paired with a 'psr_cat_type' parameter. Mmm... I understand your concerns. So, sticking to the future L2 CAT support example, what would 'target' mean in that case, a pCPU? I'll have to be something that makes it possible to 'identify' an L2, as much as socket identifies an L3... Is this the case? I'm not sure. My own concerns are that, if I look at the prototypes of this functions, it's not that evident what values should I use for the type and target parameters, whether there are constraints/relationships among them, etc. That applies to both the xc_* functions above and the libxl_* functions below, IMO. Libxc is not a stable interface, so we could even just forget about this, design this very interface _only_ for what we have now and deal with different CBM types when we'll be introducing them. However, libxl *does* have a stable API, so we still need to solve the issue at that level. For routines that only work for L3 (e.g. xc_psr_cat_get_l3_info) then 'socket' is used. I admit that it looks inconsistent, perhaps rename all 'socket' to 'target'? No, IMO, that is one good inconsistency, as it allows, at least for that function, to easily figure out what one should pass to the function by means of that parameter! :-) I really am not sure, and probably would have to know in what way(s) 'target' would change its meaning, depending on the value of 'type' (as asked above)... Probably what I'd do is leave parameters names as they are, but write a few doc-comments to explain how to use them, especially at the libxl level (libxc is a lot less critical, from this respect, I think). Regards, Dario signature.asc Description: This is a digitally signed message part ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] ts-xen-*: Avoid use of uninitialised $r{enable_xsm}
Ian Campbell writes ([PATCH OSSTEST] ts-xen-*: Avoid use of uninitialised $r{enable_xsm}): Use of uninitialized value $r{enable_xsm} in pattern match (m//) at ./ts-xen-install line 49. Acked-by: Ian Jackson ian.jack...@eu.citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Tue, Apr 21, 2015 at 5:13 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 16:42, tkleng...@sec.in.tum.de wrote: On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 16:24, ian.campb...@citrix.com wrote: On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? It is, but do we really want to introduce other than just compat mode helper interfaces (i.e. leaving the current ones alone, and perhaps even making the new ones tools only) if we really care about such setups in the first place? At the moment I just followed Andrew's advice and will introduce a new XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as xen_pfn_t. The old memops I'll leave untouched if that's OK. For this specific one - is there a reasonable use case? Other than for host PFN, we have control over guest ones, and I'm not sure managing a guest with GPFNs extending past 4 billion can be expected to work if only this one hypercall got fixed. IOW I'm expecting to NAK any such addition without proper rationale. Jan AFAIK everything works for me as it is already without this patch. I'm not sure if the cornercase of pfn's 32-bit wide on ARM64 is actually something that would work/is supported at the moment. Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] ts-host-install: Arrange for ssh logins to have no corefile size limit
Ian Campbell writes ([PATCH OSSTEST] ts-host-install: Arrange for ssh logins to have no corefile size limit): Collect the output of cat /proc/self/limits so we get some clue if this isn't working for some reason. Acked-by: Ian Jackson ian.jack...@eu.citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST] ts-libvirt-build: set coredump ulimit to unlimited for libvirtd
Ian Campbell writes ([PATCH OSSTEST] ts-libvirt-build: set coredump ulimit to unlimited for libvirtd): Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Ian Jackson ian.jack...@eu.citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Tue, 2015-04-21 at 15:29 +0100, Jan Beulich wrote: On 21.04.15 at 16:24, ian.campb...@citrix.com wrote: On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? It is, but do we really want to introduce other than just compat mode helper interfaces (i.e. leaving the current ones alone, and perhaps even making the new ones tools only) if we really care about such setups in the first place? IIRC the original impetus for at least one part of this interface was for in guest use. I don't recall what the use was, I think it was the max pfn one which was used though. Perhaps in that case we can assume that a signed long is sufficient for any pfn they might see. But is that true? A 32-bit guest could see PFN2^31 I think. Perhaps we should add these fixed version as a tools interface and consider only doing a fixed guest visible version of the max pfn one? Modulo confirming that I'm not misremembering... Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/5] raisin: introduce ovmf
Add a component to build ovmf and pass the output binary to the xen build. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- components/ovmf | 60 + components/series |1 + components/xen|3 ++- defconfig |6 -- 4 files changed, 67 insertions(+), 3 deletions(-) create mode 100644 components/ovmf diff --git a/components/ovmf b/components/ovmf new file mode 100644 index 000..a59a771 --- /dev/null +++ b/components/ovmf @@ -0,0 +1,60 @@ +#!/usr/bin/env bash + +function ovmf_check_package() { +local DEP_Debian_common=build-essential nasm uuid-dev python iasl +local DEP_Debian_x86_32=$DEP_Debian_common +local DEP_Debian_x86_64=$DEP_Debian_common +local DEP_Debian_arm32=$DEP_Debian_common +local DEP_Debian_arm64=$DEP_Debian_common + +local DEP_Fedora_common=make gcc gcc-c++ nasm libuuid-devel python acpica-tools +local DEP_Fedora_x86_32=$DEP_Fedora_common +local DEP_Fedora_x86_64=$DEP_Fedora_common + + +if [[ $RAISIN_ARCH != x86_64 ]] +then +echo ovmf is only supported on x86_64 +return +fi +echo Checking OVMF dependencies +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH +} + + +function ovmf_build() { +if [[ $RAISIN_ARCH != x86_64 ]] +then +echo ovmf is only supported on x86_64 +return +fi + +cd $BASEDIR +git-checkout $OVMF_URL $OVMF_REVISION ovmf-dir +cd ovmf-dir + +make -C BaseTools/Source/C +OvmfPkg/build.sh -a X64 -b RELEASE -n 4 +cp Build/OvmfX64/RELEASE_GCC*/FV/OVMF.fd ovmf.bin + +cd $BASEDIR +} + +function ovmf_clean() { +cd $BASEDIR +if [[ -d ovmf-dir ]] +then +cd ovmf-dir +$GIT clean -fdx +cd .. +rm -rf ovmf-dir +fi +} + +function ovmf_configure() { +: +} + +function ovmf_unconfigure() { +: +} diff --git a/components/series b/components/series index f0f3cfa..fe9092a 100644 --- a/components/series +++ b/components/series @@ -1,4 +1,5 @@ seabios +ovmf xen qemu qemu_traditional diff --git a/components/xen b/components/xen index b3426f0..b3a0c96 100644 --- a/components/xen +++ b/components/xen @@ -29,7 +29,8 @@ function xen_build() { cd xen-dir ./configure --prefix=$PREFIX --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \ --disable-qemu-traditional --enable-rombios \ ---with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin +--with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin \ +--with-system-ovmf=$BASEDIR/ovmf-dir/ovmf.bin $RAISIN_MAKE $RAISIN_MAKE install DESTDIR=$INST_DIR cd $BASEDIR diff --git a/defconfig b/defconfig index d3ef283..7d2a3f7 100644 --- a/defconfig +++ b/defconfig @@ -1,12 +1,12 @@ # Config variables for raisin # Components -## All components: seabios xen qemu qemu_traditional grub libvirt +## All components: seabios ovmf xen qemu qemu_traditional grub libvirt ## Core xen functionality: xen ## Remove a component from the list below, if you want to disable it ## You can manually overwrite this list using the COMPONENTS ## environmental variable. -ENABLED_COMPONENTS=seabios xen qemu qemu_traditional grub libvirt +ENABLED_COMPONENTS=seabios ovmf xen qemu qemu_traditional grub libvirt # Build config ## Make command to run @@ -27,6 +27,7 @@ QEMU_TRADITIONAL_URL=git://xenbits.xen.org/qemu-xen-unstable.git SEABIOS_URL=git://xenbits.xen.org/seabios.git GRUB_URL=git://git.savannah.gnu.org/grub.git LIBVIRT_URL=git://libvirt.org/libvirt.git +OVMF_URL=git://xenbits.xen.org/ovmf.git # Software versions. XEN_REVISION=master @@ -35,3 +36,4 @@ QEMU_TRADITIONAL_REVISION=master SEABIOS_REVISION=master GRUB_REVISION=master LIBVIRT_REVISION=master +OVMF_REVISION=master -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/7] docs/build: Move install checks into individual build targets
On 21/04/15 15:23, Ian Campbell wrote: On Tue, 2015-04-21 at 15:18 +0100, Jan Beulich wrote: On 21.04.15 at 16:15, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 13:01 +0100, Andrew Cooper wrote: On 20/04/15 12:57, Jan Beulich wrote: On 20.04.15 at 12:49, andrew.coop...@citrix.com wrote: @@ -103,12 +88,20 @@ install: install-man-pages install-html # Individual file build targets man1/%.1: man/%.pod.1 Makefile +ifdef POD2MAN Perhaps better to use ifneq($(POD2MAN),) in such cases? I was following the prevailing style, but can update all instances. FWIW, it is not currently an issue. I don't really mind, ifneq seems more prevalent in other makefiles, I'm not sure why this one uses ifdef nor really what the implications are. The difference becomes visible when the variable is defined but expands to nothing. And I view it as more useful if as empty variable means none than if that results in presumably very strange build errors. True. Andy, will you fix these up then? Yes - will fix in v2. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/5] raisin: rename ARCH to RAISIN_ARCH
Rename exported environmental variable ARCH to RAISIN_ARCH not to conflict with any component Makefile variables. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- README |2 +- components/grub |6 +++--- components/libvirt |2 +- components/qemu |2 +- components/qemu_traditional |2 +- components/seabios |6 +++--- components/xen |2 +- lib/common-functions.sh |6 +++--- scripts/mkdeb |8 scripts/mkrpm |2 +- 10 files changed, 19 insertions(+), 19 deletions(-) diff --git a/README b/README index ab8d3e7..b7832da 100644 --- a/README +++ b/README @@ -85,7 +85,7 @@ RAISIN_MAKE: make command to run SUDO: sudo command to run DISTRO: which Linux distribution we are running on, Debian, Fedora, etc PKGTYPE: which package format is used by the distro, rpm, deb, etc -ARCH: which architecture we are running on, x86_64, arm32, etc. +RAISIN_ARCH: which architecture we are running on, x86_64, arm32, etc. BASEDIR: absolute path of the raisin directory PREFIX: installation PREFIX INST_DIR: absolute path of the installation directory diff --git a/components/grub b/components/grub index 7c14a62..fa72b99 100644 --- a/components/grub +++ b/components/grub @@ -17,13 +17,13 @@ function grub_check_package() { local DEP_CentOS_x86_64=$DEP_Fedora_x86_64 -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo grub is only supported on x86_32 and x86_64 return fi echo Checking Grub dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } @@ -41,7 +41,7 @@ function grub_build() { -m ../memdisk.tar -o grub-i386-xen grub-core/*mod cp grub-i386-xen $INST_DIR/$PREFIX/lib/xen/boot ## GRUB64 -if [[ $ARCH = x86_64 ]] +if [[ $RAISIN_ARCH = x86_64 ]] then $RAISIN_MAKE clean ./configure --target=amd64 --with-platform=xen diff --git a/components/libvirt b/components/libvirt index 5fb1a41..a554643 100644 --- a/components/libvirt +++ b/components/libvirt @@ -23,7 +23,7 @@ function libvirt_check_package() { local DEP_CentOS_x86_64=$DEP_Fedora_x86_64 echo Checking Libvirt dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function libvirt_build() { diff --git a/components/qemu b/components/qemu index 00aeeb6..72cfec1 100644 --- a/components/qemu +++ b/components/qemu @@ -12,7 +12,7 @@ function qemu_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common echo Checking QEMU dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function qemu_build() { diff --git a/components/qemu_traditional b/components/qemu_traditional index 500cbed..b338007 100644 --- a/components/qemu_traditional +++ b/components/qemu_traditional @@ -13,7 +13,7 @@ function qemu_traditional_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common echo Checking QEMU dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function qemu_traditional_build() { diff --git a/components/seabios b/components/seabios index 960a538..ed2c7d2 100644 --- a/components/seabios +++ b/components/seabios @@ -12,18 +12,18 @@ function seabios_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo seabios is only supported on x86_32 and x86_64 return fi echo Checking SeaBIOS dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function seabios_build() { -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo seabios is only supported on x86_32 and x86_64 return diff --git a/components/xen b/components/xen index b7258b0..b3426f0 100644 --- a/components/xen +++ b/components/xen @@ -20,7 +20,7 @@ function xen_check_package() { local DEP_CentOS_x86_64=$DEP_CentOS_x86_32 glibc-devel.i686 echo Checking Xen dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function xen_build() { diff --git a/lib/common-functions.sh b/lib/common-functions.sh index c19046b..186a53b 100644 --- a/lib/common-functions.sh +++ b/lib/common-functions.sh @@ -41,7 +41,7 @@ function common_init() { get_components _verbose_echo Distro: $DISTRO -_verbose_echo Arch: $ARCH +_verbose_echo Arch: $RAISIN_ARCH _verbose_echo Components: $COMPONENTS for f in $COMPONENTS @@ -161,7 +161,7 @@ function get_distro() { }
[Xen-devel] [PATCH 2/5] raisin: remove duplicate source config in raise
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- raise | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/raise b/raise index bce6908..68dbfd8 100755 --- a/raise +++ b/raise @@ -17,9 +17,12 @@ _help() { } # Include your defaults -if [[ -e ./config ]] ; then -. ./config +if [[ ! -e ./config ]] +then + echo No config file found, copying default config + cp defconfig config fi +source ./config # To use this as a library, set RAISIN_PATH appropriately [[ -z $RAISIN_PATH ]] RAISIN_PATH=$PWD/lib @@ -29,14 +32,6 @@ source ${RAISIN_PATH}/common-functions.sh source ${RAISIN_PATH}/git-checkout.sh source ${RAISIN_PATH}/commands.sh -# Include your defaults -if [[ ! -e ./config ]] ; then - echo No config file found, copying default config - cp defconfig config -fi - -source ./config - # Set up basic functionality common_init -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 5/5] raisin: build linux
Add a component, disabled by default, to build a linux kernel with the Xen kconfig options enabled. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- components/linux | 120 + components/series |1 + defconfig |4 +- 3 files changed, 124 insertions(+), 1 deletion(-) create mode 100644 components/linux diff --git a/components/linux b/components/linux new file mode 100644 index 000..f90a894 --- /dev/null +++ b/components/linux @@ -0,0 +1,120 @@ +#!/usr/bin/env bash + +function linux_check_package() { +local DEP_Debian_common=build-essential bc openssl +local DEP_Debian_x86_32=$DEP_Debian_common +local DEP_Debian_x86_64=$DEP_Debian_common +local DEP_Debian_arm32=$DEP_Debian_common +local DEP_Debian_arm64=$DEP_Debian_common + +local DEP_Fedora_common=make gcc bc openssl +local DEP_Fedora_x86_32=$DEP_Fedora_common +local DEP_Fedora_x86_64=$DEP_Fedora_common + +local DEP_CentOS_common=$DEP_Fedora_common +local DEP_CentOS_x86_32=$DEP_Fedora_x86_32 +local DEP_CentOS_x86_64=$DEP_Fedora_x86_64 + +echo Checking Linux dependencies +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH +} + +function _xenify_config() { +echo CONFIG_HYPERVISOR_GUEST=y $1 +echo CONFIG_PARAVIRT=y $1 +echo CONFIG_PARAVIRT_SPINLOCKS=y $1 +echo CONFIG_XEN=y $1 +echo CONFIG_XEN_DOM0=y $1 +echo CONFIG_XEN_PVHVM=y $1 +echo CONFIG_XEN_SAVE_RESTORE=y $1 +echo CONFIG_XEN_DEBUG_FS=y $1 +echo CONFIG_XEN_PVH=y $1 +echo CONFIG_PARAVIRT_CLOCK=y $1 +echo CONFIG_BALLOON_COMPACTION=y $1 +echo CONFIG_XEN_PCIDEV_FRONTEND=y $1 +echo CONFIG_XEN_BLKDEV_FRONTEND=y $1 +echo CONFIG_XEN_BLKDEV_BACKEND=y $1 +echo CONFIG_XEN_SCSI_FRONTEND=y $1 +echo CONFIG_XEN_NETDEV_FRONTEND=y $1 +echo CONFIG_XEN_NETDEV_BACKEND=y $1 +echo CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y $1 +echo CONFIG_XEN_FBDEV_FRONTEND=y $1 +echo CONFIG_HVC_XEN=y $1 +echo CONFIG_HVC_XEN_FRONTEND=y $1 +echo CONFIG_XEN_BALLOON=y $1 +echo CONFIG_XEN_SCRUB_PAGES=y $1 +echo CONFIG_XEN_DEV_EVTCHN=y $1 +echo CONFIG_XEN_BACKEND=y $1 +echo CONFIG_XENFS=y $1 +echo CONFIG_XEN_COMPAT_XENFS=y $1 +echo CONFIG_XEN_SYS_HYPERVISOR=y $1 +echo CONFIG_XEN_XENBUS_FRONTEND=y $1 +echo CONFIG_XEN_GNTDEV=y $1 +echo CONFIG_XEN_GRANT_DEV_ALLOC=y $1 +echo CONFIG_SWIOTLB_XEN=y $1 +echo CONFIG_XEN_PCIDEV_BACKEND=y $1 +echo CONFIG_XEN_PRIVCMD=y $1 +echo CONFIG_XEN_HAVE_PVMMU=y $1 +echo CONFIG_XEN_ACPI_PROCESSOR=y $1 +echo CONFIG_XEN_EFI=y $1 +echo CONFIG_XEN_AUTO_XLATE=y $1 +echo CONFIG_BRIDGE=y $1 +} + +function linux_build() { +local vmlinuz + +cd $BASEDIR +git-checkout $LINUX_URL $LINUX_REVISION linux-dir +cd linux-dir + +if [[ ! -e .config ]] +then +if [[ -e /boot/config-`uname -r` ]] +then +cp /boot/config-`uname -r` .config +else +$RAISIN_MAKE defconfig +fi +_xenify_config .config +$RAISIN_MAKE olddefconfig +fi + +$RAISIN_MAKE +$RAISIN_MAKE modules_install INSTALL_MOD_PATH=$INST_DIR + +mkdir -p $INST_DIR/boot/xen +vmlinuz=$INST_DIR/boot/xen/vmlinuz-$RAISIN_ARCH-$LINUX_REVISION-`date +%Y%m%d.%H%M%S` + +if [[ $RAISIN_ARCH = x86_64 || $RAISIN_ARCH = x86_32 ]] +then +cp arch/x86/boot/bzImage $vmlinuz +elif [[ $RAISIN_ARCH = arm32 ]] +then +cp arch/arm/boot/zImage $vmlinuz +elif [[ $RAISIN_ARCH = arm64 ]] +then +cp arch/x86/boot/Image.gz $vmlinuz +fi + +cd .. +} + +function linux_clean() { +cd $BASEDIR +if [[ -d linux-dir ]] +then +cd linux-dir +$RAISIN_MAKE distclean +cd .. +rm -rf linux-dir +fi +} + +function linux_configure() { +: +} + +function linux_unconfigure() { +: +} diff --git a/components/series b/components/series index fe9092a..928e78e 100644 --- a/components/series +++ b/components/series @@ -5,3 +5,4 @@ qemu qemu_traditional grub libvirt +linux diff --git a/defconfig b/defconfig index 7d2a3f7..b4ed94d 100644 --- a/defconfig +++ b/defconfig @@ -1,7 +1,7 @@ # Config variables for raisin # Components -## All components: seabios ovmf xen qemu qemu_traditional grub libvirt +## All components: seabios ovmf xen qemu qemu_traditional grub libvirt linux ## Core xen functionality: xen ## Remove a component from the list below, if you want to disable it ## You can manually overwrite this list using the COMPONENTS @@ -28,6 +28,7 @@ SEABIOS_URL=git://xenbits.xen.org/seabios.git GRUB_URL=git://git.savannah.gnu.org/grub.git LIBVIRT_URL=git://libvirt.org/libvirt.git OVMF_URL=git://xenbits.xen.org/ovmf.git +LINUX_URL=git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git # Software versions. XEN_REVISION=master @@ -37,3 +38,4 @@ SEABIOS_REVISION=master
[Xen-devel] [PATCH 1/5] raisin: introduce _verbose_echo
A new utility function to make the code more readable and compact: prints a message if VERBOSE = 1. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- lib/common-functions.sh | 46 -- 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/lib/common-functions.sh b/lib/common-functions.sh index dc3a2bb..c19046b 100644 --- a/lib/common-functions.sh +++ b/lib/common-functions.sh @@ -1,5 +1,12 @@ #!/usr/bin/env bash +function _verbose_echo() { +if [[ $VERBOSE -eq 1 ]] +then +echo $* +fi +} + # Executed once at the beginning of the script function common_init() { export BASEDIR=`pwd` @@ -33,12 +40,9 @@ function common_init() { get_arch get_components -if [[ $VERBOSE -eq 1 ]] -then -echo Distro: $DISTRO -echo Arch: $ARCH -echo Components: $COMPONENTS -fi +_verbose_echo Distro: $DISTRO +_verbose_echo Arch: $ARCH +_verbose_echo Components: $COMPONENTS for f in $COMPONENTS do @@ -62,10 +66,7 @@ function get_components() { if eval [[ ! -z \$$capital_REVISION ]] then COMPONENTS=$COMPONENTS $component -if [[ $VERBOSE -eq 1 ]] -then -echo Found component $component -fi +_verbose_echo Found component $component fi done fi @@ -166,10 +167,7 @@ function get_arch() { } function _check-package-deb() { -if [[ $VERBOSE -eq 1 ]] -then -echo Checking for package ${args[0]} -fi +_verbose_echo Checking for package ${args[0]} if dpkg -s $1 2/dev/null | grep -q Status:.*installed then @@ -184,10 +182,7 @@ function _install-package-deb() { } function _check-package-rpm() { -if [[ $VERBOSE -eq 1 ]] -then -echo Checking for package $1 -fi +_verbose_echo Checking for package $1 if rpm -q $1 21 /dev/null then @@ -273,22 +268,13 @@ function for_each_component () { done if [[ $found -eq 0 ]] then -if [[ $VERBOSE -eq 1 ]] -then -echo $component is disabled -fi +_verbose_echo $component is disabled continue fi -if [[ $VERBOSE -eq 1 ]] -then -echo calling $component_$1 -fi +_verbose_echo calling $component_$1 $component_$1 -if [[ $VERBOSE -eq 1 ]] -then -echo $component_$1 done -fi +_verbose_echo $component_$1 done done } -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed
On 21/04/15 15:12, Ian Campbell wrote: On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote: Nothing expensive depends on these results. But are those uses a problem? Less fluff in the build logs, and less load moving files around the filesystem. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/5] raisin: introduce _verbose_echo
On 04/21/2015 03:54 PM, Stefano Stabellini wrote: A new utility function to make the code more readable and compact: prints a message if VERBOSE = 1. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- lib/common-functions.sh | 46 -- 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/lib/common-functions.sh b/lib/common-functions.sh index dc3a2bb..c19046b 100644 --- a/lib/common-functions.sh +++ b/lib/common-functions.sh @@ -1,5 +1,12 @@ #!/usr/bin/env bash +function _verbose_echo() { +if [[ $VERBOSE -eq 1 ]] +then +echo $* +fi +} Why _? Don't we have VERBOSE in other parts of the code that could call this? -George + # Executed once at the beginning of the script function common_init() { export BASEDIR=`pwd` @@ -33,12 +40,9 @@ function common_init() { get_arch get_components -if [[ $VERBOSE -eq 1 ]] -then -echo Distro: $DISTRO -echo Arch: $ARCH -echo Components: $COMPONENTS -fi +_verbose_echo Distro: $DISTRO +_verbose_echo Arch: $ARCH +_verbose_echo Components: $COMPONENTS for f in $COMPONENTS do @@ -62,10 +66,7 @@ function get_components() { if eval [[ ! -z \$$capital_REVISION ]] then COMPONENTS=$COMPONENTS $component -if [[ $VERBOSE -eq 1 ]] -then -echo Found component $component -fi +_verbose_echo Found component $component fi done fi @@ -166,10 +167,7 @@ function get_arch() { } function _check-package-deb() { -if [[ $VERBOSE -eq 1 ]] -then -echo Checking for package ${args[0]} -fi +_verbose_echo Checking for package ${args[0]} if dpkg -s $1 2/dev/null | grep -q Status:.*installed then @@ -184,10 +182,7 @@ function _install-package-deb() { } function _check-package-rpm() { -if [[ $VERBOSE -eq 1 ]] -then -echo Checking for package $1 -fi +_verbose_echo Checking for package $1 if rpm -q $1 21 /dev/null then @@ -273,22 +268,13 @@ function for_each_component () { done if [[ $found -eq 0 ]] then -if [[ $VERBOSE -eq 1 ]] -then -echo $component is disabled -fi +_verbose_echo $component is disabled continue fi -if [[ $VERBOSE -eq 1 ]] -then -echo calling $component_$1 -fi +_verbose_echo calling $component_$1 $component_$1 -if [[ $VERBOSE -eq 1 ]] -then -echo $component_$1 done -fi +_verbose_echo $component_$1 done done } ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] raisin: build linux
On 04/21/2015 03:55 PM, Stefano Stabellini wrote: Add a component, disabled by default, to build a linux kernel with the Xen kconfig options enabled. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: George Dunlap george.dun...@eu.citrix.com I'm wondering if at some point we might want to give people the option to pass in a specific Linux config; but that can be a feature for another patch. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 2/5] raisin: remove duplicate source config in raise
On 04/21/2015 03:55 PM, Stefano Stabellini wrote: Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com I take it this got here by a bad merge or somethign? Acked-by: George Dunlap george.dun...@eu.citrix.com --- raise | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/raise b/raise index bce6908..68dbfd8 100755 --- a/raise +++ b/raise @@ -17,9 +17,12 @@ _help() { } # Include your defaults -if [[ -e ./config ]] ; then -. ./config +if [[ ! -e ./config ]] +then + echo No config file found, copying default config + cp defconfig config fi +source ./config # To use this as a library, set RAISIN_PATH appropriately [[ -z $RAISIN_PATH ]] RAISIN_PATH=$PWD/lib @@ -29,14 +32,6 @@ source ${RAISIN_PATH}/common-functions.sh source ${RAISIN_PATH}/git-checkout.sh source ${RAISIN_PATH}/commands.sh -# Include your defaults -if [[ ! -e ./config ]] ; then - echo No config file found, copying default config - cp defconfig config -fi - -source ./config - # Set up basic functionality common_init ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH] x86/asm/irq: Don't use POPF but STI
* Borislav Petkov b...@alien8.de wrote: On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote: From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001 From: Ingo Molnar mi...@kernel.org Date: Tue, 21 Apr 2015 13:32:13 +0200 Subject: [PATCH] x86/asm/irq: Don't use POPF but STI So because the POPF instruction is slow and STI is faster on essentially all x86 CPUs that matter, instead of: 81891848: 9d popfq we can do: 81661a2e: 41 f7 c4 00 02 00 00test $0x200,%r12d 81661a35: 74 01 je 81661a38 snd_pcm_stream_unlock_irqrestore+0x28 81661a37: fb sti 81661a38: This bloats the kernel a bit, by about 1K on the 64-bit defconfig: textdata bss dec hex filename 122586341812120 1085440 15156194 e743e2 vmlinux.before 122595821812120 1085440 15157142 e74796 vmlinux.after the other cost is the extra branching, adding extra pressure to the branch prediction hardware and also potential branch misses. Do we care? [...] Only if it makes stuff faster. [...] After we enable interrupts, we'll most likely go somewhere cache cold anyway, so the branch misses will happen anyway. The question is, would the cost drop from POPF - STI cover the increase in branch misses overhead? Hmm, interesting. So there's a few places where the POPF is a STI in 100% of the cases. It's probably a win there. But my main worry would be sites that are 'multi use', such as locking APIs - for example spin_unlock_irqrestore(): those tend to be called from different code paths, and each one has a different IRQ flags state. For example scheduler wakeups done from irqs-off codepaths (it's very common), or from irqs-on codepaths (that's very common as well). In the former case we won't have a STI, in the latter case we will - and both would hit a POPF at the end of the critical section. The probability of a branch prediction miss is high in this case. So the question is, is the POPF/STI performance difference higher than the average cost of branch misses. If yes, then the change is probably a win. If not, then it's probably a loss. My gut feeling is that we should let the hardware do it, i.e. we should continue to use POPF - but I can be convinced ... Thanks, Ingo ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3] xen: sched_rt: print useful affinity info when dumping
In fact, printing the cpupool's CPU online mask for each vCPU is just redundant, as that is the same for all the vCPUs of all the domains in the same cpupool, while hard affinity is already part of the output of dumping domains info. Instead, print the intersection between hard affinity and online CPUs, which is --in case of this scheduler-- the effective affinity always used for the vCPUs. This change also takes the chance to add a scratch cpumask area, to avoid having to either put one (more) cpumask_t on the stack, or dynamically allocate it within the dumping routine. (The former being bad because hypervisor stack size is limited, the latter because dynamic allocations can fail, if the hypervisor was built for a large enough number of CPUs.) We allocate such scratch area, for all pCPUs, when the first instance of the RTDS scheduler is activated and, in order not to loose track/leak it if other instances are activated in new cpupools, and when the last instance is deactivated, we (sort of) refcount it. Such scratch area can be used to kill most of the cpumasks{_var}_t local variables in other functions in the file, but that is *NOT* done in this chage. Finally, convert the file to use keyhandler scratch, instead of open coded string buffers. Signed-off-by: Dario Faggioli dario.faggi...@citrix.com Cc: George Dunlap george.dun...@eu.citrix.com Cc: Meng Xu xumengpa...@gmail.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org --- Changes from v2: * added refcounting, to avoid leaking the scratch cpumasks * added some ASSERT()s to validate the refcounting Changes from v1: * improved changelog; * made a local variable to point to the correct scratch mask, as suggested during review. --- xen/common/sched_rt.c | 64 ++--- 1 file changed, 55 insertions(+), 9 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index 7c39a9e..79cbb67 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -124,6 +124,24 @@ #define TRC_RTDS_BUDGET_REPLENISH TRC_SCHED_CLASS_EVT(RTDS, 4) #define TRC_RTDS_SCHED_TASKLETTRC_SCHED_CLASS_EVT(RTDS, 5) + /* + * Useful to avoid too many cpumask_var_t on the stack. + */ +static cpumask_t **_cpumask_scratch; +#define cpumask_scratch _cpumask_scratch[smp_processor_id()] + +/* + * We want to only allocate the _cpumask_scratch array the first time an + * instance of this scheduler is used, and avoid reallocating and leaking + * the old one when more instance are activated inside new cpupools. We + * also want to get rid of it when the last instance is de-inited. + * + * So we (sort of) reference count the number of initialized instances. This + * does not need to happen via atomic_t refcounters, as it only happens either + * during boot, or under the protection of the cpupool_lock spinlock. + */ +static unsigned int nr_rt_ops; + /* * Systme-wide private data, include global RunQueue/DepletedQ * Global lock is referenced by schedule_data.schedule_lock from all @@ -218,8 +236,7 @@ __q_elem(struct list_head *elem) static void rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) { -char cpustr[1024]; -cpumask_t *cpupool_mask; +cpumask_t *cpupool_mask, *mask; ASSERT(svc != NULL); /* idle vcpu */ @@ -229,10 +246,22 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) return; } -cpumask_scnprintf(cpustr, sizeof(cpustr), svc-vcpu-cpu_hard_affinity); +/* + * We can't just use 'cpumask_scratch' because the dumping can + * happen from a pCPU outside of this scheduler's cpupool, and + * hence it's not right to use the pCPU's scratch mask (which + * may even not exist!). On the other hand, it is safe to use + * svc-vcpu-processor's own scratch space, since we hold the + * runqueue lock. + */ +mask = _cpumask_scratch[svc-vcpu-processor]; + +cpupool_mask = cpupool_scheduler_cpumask(svc-vcpu-domain-cpupool); +cpumask_and(mask, cpupool_mask, svc-vcpu-cpu_hard_affinity); +cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask); printk([%5d.%-2u] cpu %u, (%PRI_stime, %PRI_stime), cur_b=%PRI_stime cur_d=%PRI_stime last_start=%PRI_stime\n -\t\t onQ=%d runnable=%d cpu_hard_affinity=%s , +\t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n, svc-vcpu-domain-domain_id, svc-vcpu-vcpu_id, svc-vcpu-processor, @@ -243,11 +272,8 @@ rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc) svc-last_start, __vcpu_on_q(svc), vcpu_runnable(svc-vcpu), -cpustr); -memset(cpustr, 0, sizeof(cpustr)); -cpupool_mask = cpupool_scheduler_cpumask(svc-vcpu-domain-cpupool); -cpumask_scnprintf(cpustr, sizeof(cpustr), cpupool_mask); -printk(cpupool=%s\n, cpustr); +svc-flags, +
Re: [Xen-devel] [PATCH v10 3/6] Support for BIOS interrupt handler
On Mon, Mar 30, 2015 at 05:09:47AM +, Xu, Quan wrote: From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com] On 03/27/2015 03:58 AM, Xu, Quan wrote: From: Xu, Quan From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com] On 03/26/2015 07:01 AM, Xu, Quan wrote: From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com] On 03/25/2015 06:42 PM, Kevin O'Connor wrote: On Tue, Mar 24, 2015 at 11:10:03AM -0400, Stefan Berger wrote: On 03/23/2015 08:13 PM, Kevin O'Connor wrote: Because of the mixed 16bit/32bit code in SeaBIOS, all assembler must use size suffixes - so the above should be roll instead of rol. Ok, fixed. As before - both issues are minor and can be addressed after merge (as long as there is agreement that the sha1.c file can be licensed as LGPLv3). It can have that license. I can post v11 or you can modify it, either way is fine. Thanks. I pushed the first three patches into a test branch at: https://github.com/KevinOConnor/seabios/tree/tcg-testing I'd like to get confirmation that this works for the Xen requirements before merging. I don't use Xen. I hope that Quan will provide feedback. Stefan Sure, I am glad to help you test it :):) Try to https://github.com/KevinOConnor/seabios/tree/tcg-testing ?? Yes. [...] I will go through all of these seabios patch, and try to make it compatible for Xen vTPM. What's the status of this? Is it safe to push forward parts of Stefan's patches, or does that represent a regression for Xen? -Kevin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 3/5] libxl: add support for vscsi
On Fri, Apr 17, 2015 at 08:30:58AM +, Olaf Hering wrote: Port pvscsi support from xend to libxl: vscsi=['pdev,vdev{,options}'] xl scsi-attach xl scsi-detach xl scsi-list Signed-off-by: Olaf Hering o...@aepfle.de Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com Cc: Ian Campbell ian.campb...@citrix.com Cc: Wei Liu wei.l...@citrix.com --- docs/man/xl.cfg.pod.5| 55 +++ docs/man/xl.pod.1| 18 + tools/libxl/Makefile | 2 + tools/libxl/libxl.c | 441 tools/libxl/libxl.h | 27 ++ tools/libxl/libxl_create.c | 1 + tools/libxl/libxl_device.c | 2 + tools/libxl/libxl_internal.h | 16 + tools/libxl/libxl_types.idl | 56 +++ tools/libxl/libxl_types_internal.idl | 1 + tools/libxl/libxl_vscsi.c| 271 + tools/libxl/libxlu_vscsi.c | 750 +++ tools/libxl/libxlutil.h | 21 + tools/libxl/xl.h | 3 + tools/libxl/xl_cmdimpl.c | 184 - tools/libxl/xl_cmdtable.c| 15 + 16 files changed, 1862 insertions(+), 1 deletion(-) diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5 index f936dfc..d395e56 100644 --- a/docs/man/xl.cfg.pod.5 +++ b/docs/man/xl.cfg.pod.5 @@ -510,6 +510,61 @@ value is optional if this is a guest domain. =back +=item Bvscsi=[ VSCSI_SPEC_STRING, VSCSI_SPEC_STRING, ...] + +Specifies the PVSCSI devices to be provided to the guest. PVSCSI passes +dom0 SCSI devices as-is to the guest. s/dom0/backend/ As in theory you could run an HVM guest with an SCSI device passed in - and use said backed to pass the SCSI device to another guest. 'as-is' ? I thought there were some filtering done in the backend? + +Each VSCSI_SPEC_STRING consists of pdev,vdev[,options]. +'pdev' describes the physical device, preferable in a persistant format. What is 'persistent' format? +'vdev' is the domU device in vHOST:CHANNEL:TARGET:LUN notation, all integers. +'option' lists additional flags which a backend may recognize. + +The supported values for pdev and option depends on the used backend driver: + +=over 4 + +=item BLinux pvops s/pvops// + +=over 4 + +=item Cpdev + +The backend driver in the pvops kernel is part of the Linux-IO Target framework +(LIO). As such the SCSI devices have to be configured first with the tools +provided by this framework, such as a xen-scsiback aware targetcli. The pdev +in domU.cfg has to refer to a config item in that framework instead of the raw +device. Ususally this is a WWN in the form of na.WWN:LUN. Usually. + +=item Coption + +No options recognized. + +=back + +=item BLinux xenlinux xenlinux? Classic Xen? + +=over 4 + +=item Cpdev + +The dom0 device in either /dev/scsidev or pHOST:CHANNEL:TARGET:LUN notation. + +Its recommended to use persistant names /dev/disk/by-*/* to refer to a pdev. +The toolstack will translate this internally to h:c:t:l notation, which is how +the backend driver will access the device. Using the h:c:t:l notation for +pdev in domU.cfg is discouraged because this value will change across reboots, +depending on the detection order in the OS. + +=item Coption + +Currently only the option value feature-host is recognized. SCSI command +emulation in backend driver is bypassed when feature-host is specified. + +=back + +=back + =item Bvfb=[ VFB_SPEC_STRING, VFB_SPEC_STRING, ...] Specifies the paravirtual framebuffer devices which should be supplied diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1 index 16783c8..19bdbfa 100644 --- a/docs/man/xl.pod.1 +++ b/docs/man/xl.pod.1 @@ -1328,6 +1328,24 @@ List virtual trusted platform modules for a domain. =back +=head2 PVSCSI DEVICES + +=over 4 + +=item Bscsi-attach Idomain-id Ipdev Ivdev,I[feature-host] + +Creates a new vscsi device in the domain specified by Idomain-id. + +=item Bscsi-detach Idomain-id Ivdev + +Removes the vscsi device from domain specified by Idomain-id. + +=item Bscsi-list Idomain-id I[domain-id] ... + +List vscsi devices for the domain specified by Idomain-id. + +=back + =head1 PCI PASS-THROUGH =over 4 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile index 1b16598..79b3867 100644 --- a/tools/libxl/Makefile +++ b/tools/libxl/Makefile @@ -91,6 +91,7 @@ endif LIBXL_LIBS += -lyajl LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \ + libxl_vscsi.o \ libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \ libxl_internal.o libxl_utils.o libxl_uuid.o \ libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \ @@ -122,6 +123,7 @@ AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h
Re: [Xen-devel] [PATCH v5 13/13] docs: add xl-psr.markdown
On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: @@ -1534,7 +1540,8 @@ Show CAT hardware information. =item Bpsr-cat-cbm-set [IOPTIONS] Idomain-id Icbm -Set cache capacity bitmasks(CBM) for a domain. +Set cache capacity bitmasks(CBM) for a domain. For how to specify Icbm +please refer to the link above. I think the link above is to far. I think say something more explicit like Lxl-psr.txt ? BOPTIONS @@ -1575,6 +1582,7 @@ And the following documents on the xen.org website: Lhttp://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html Lhttp://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt Lhttp://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt +Lhttp://xenbits.xen.org/docs/unstable/misc/xl-psr.html For systems that don't automatically bring CPU online: diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown new file mode 100644 index 000..d167b84 --- /dev/null +++ b/docs/misc/xl-psr.markdown @@ -0,0 +1,134 @@ +# Intel Platform Shared Resource Monitoring/Control in xl + +This document introduces Intel Platform Shared Resource Monitoring/Control +technologies, their basic concepts and the xl interfaces. + +## Cache Monitoring Technology (CMT) + +Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell +and later server platforms that allows an OS or Hypervisor/VMM to determine +the usage of cache(currently only L3 cache supported) by applications running ^space before ( please. +For example, assuming a system with 8 portions and 3 domains: + +A CBM of 0xff for every domain means each domain can access the +whole cache. This is the default. + +Giving one domain a CBM of 0x0f and the other two domain's 0xf0 +means that the first domain gets exclusive access to half of the +cache (half of the portions) and the other two will share the +other half. + +Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0 +would give the first domain exclusive access to half the cache, +and the other two exclusive access to one quarter each. How does markdown render this? I think you might want to start each para with a * to make it a bullet list. Other than those minor things all looks good, thanks. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/5] vscsiif.h: fix WWN notation for p-dev property
On Tue, Apr 21, Konrad Rzeszutek Wilk wrote: On Fri, Apr 17, 2015 at 08:30:56AM +, Olaf Hering wrote: The pvops kernel expects either naa.WWN:LUN or h:c:t:l in the p-dev property. Add the missing :LUN part to the comment. What about the older kernels that had said driver? Which older kernels? There is just 3.18+ of pvops. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On 21.04.15 at 16:33, tkleng...@sec.in.tum.de wrote: On Tue, Apr 21, 2015 at 4:14 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? The privcmd driver on Linux certainly does return an int via the ioctl. That's clearly a bug in Linux: static long privcmd_ioctl(struct file *file, unsigned int cmd, unsigned long data) { int ret = -ENOSYS; void __user *udata = (void __user *) data; switch (cmd) { case IOCTL_PRIVCMD_HYPERCALL: ret = privcmd_ioctl_hypercall(udata); break; [...] return ret; } Why in the world is ret an int? That's certainly not something inherited from XenoLinux, where all involved types are long. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test
On Tue, 2015-04-21 at 15:30 +0100, Ian Jackson wrote: Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test): On Tue, 2015-04-21 at 14:28 +0100, Ian Jackson wrote: I'm only objecting to the involvement of the host property machinery. I'm afraid I'm not 100% sure I understand, but do you just mean that longtao should assign to $ho-{Ip} directly within selecthost instead of using $setprop to set the IpAddr hostprop and therefore indirectly -{Ip}? Yes. IOW the existing code in selecthost which looks up the IpAddr host prop into $ho-{IpStatic} and then into $ho-{Ip} should prefer the runvar to the host db if it is set? Yes. Good. Longtao, do you understand what is needed here? BTW, what is the difference between -{Ip} and -{IpStatic} and should the runvar be reflected in both or just in -{Ip}? AFAICT the only difference is that only IpStatic is used for expanding `ipaddr' and `ipaddrhex' in the patterns for dhcp pxe configurations. There is an incipient problem here: our existing setup does not require that osstest has complete control of the pxe config file for every host on the network. And indeed the Citrix (Cambridge) instance lacks such control. Instead there are some symlinks etc. The mg-hosts mkpxedir subcommand arranges to delegate the DHCP from root to the osstest user (and group), by making an appropriate subdirectory with the right permissions, and a suitable symlink. It's not quite clear to me how this should work for nested HVM hosts. I think this is OK with the current design because the nested HVM host is not PXE booted, it is installed as a guest but preseeded to look a lot like a host (quite a lot of sharing in the preseed creation) and then tailored into a more host-like thing by things like this IP addr runvar. So I think from what you are saying that the -{Ip} and -{IpStatic} distinction here is correct and a host which is really a guest should not have nor use -{IpStatic} (we want e.g. attempts to create a pxe dir to fail). Longtao, the conclusion is that the runvar should only appear in -{Ip}. -{IpStatic} should be left untouched (i.e. not initialised). Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 0/5] raisin: introduce ovmf and linux
Hi all, this patch series includes a few clean-ups and introduces a component to build ovmf and a component to build a linux kernel. The latter is not enabled by default, and could probably benefit from a bit more work on the kconfig options to really be useful, but it is a start. Stefano Stabellini (5): raisin: introduce _verbose_echo raisin: remove duplicate source config in raise raisin: rename ARCH to RAISIN_ARCH raisin: introduce ovmf raisin: build linux README |2 +- components/grub |6 +-- components/libvirt |2 +- components/linux| 120 +++ components/ovmf | 60 ++ components/qemu |2 +- components/qemu_traditional |2 +- components/seabios |6 +-- components/series |2 + components/xen |5 +- defconfig |8 ++- lib/common-functions.sh | 50 +++--- raise | 15 ++ scripts/mkdeb |8 +-- scripts/mkrpm |2 +- 15 files changed, 229 insertions(+), 61 deletions(-) create mode 100644 components/linux create mode 100644 components/ovmf Cheers, Stefano ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/5] raisin: rename ARCH to RAISIN_ARCH
On 04/21/2015 03:55 PM, Stefano Stabellini wrote: Rename exported environmental variable ARCH to RAISIN_ARCH not to conflict with any component Makefile variables. Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Acked-by: George Dunlap george.dun...@eu.citrix.com --- README |2 +- components/grub |6 +++--- components/libvirt |2 +- components/qemu |2 +- components/qemu_traditional |2 +- components/seabios |6 +++--- components/xen |2 +- lib/common-functions.sh |6 +++--- scripts/mkdeb |8 scripts/mkrpm |2 +- 10 files changed, 19 insertions(+), 19 deletions(-) diff --git a/README b/README index ab8d3e7..b7832da 100644 --- a/README +++ b/README @@ -85,7 +85,7 @@ RAISIN_MAKE: make command to run SUDO: sudo command to run DISTRO: which Linux distribution we are running on, Debian, Fedora, etc PKGTYPE: which package format is used by the distro, rpm, deb, etc -ARCH: which architecture we are running on, x86_64, arm32, etc. +RAISIN_ARCH: which architecture we are running on, x86_64, arm32, etc. BASEDIR: absolute path of the raisin directory PREFIX: installation PREFIX INST_DIR: absolute path of the installation directory diff --git a/components/grub b/components/grub index 7c14a62..fa72b99 100644 --- a/components/grub +++ b/components/grub @@ -17,13 +17,13 @@ function grub_check_package() { local DEP_CentOS_x86_64=$DEP_Fedora_x86_64 -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo grub is only supported on x86_32 and x86_64 return fi echo Checking Grub dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } @@ -41,7 +41,7 @@ function grub_build() { -m ../memdisk.tar -o grub-i386-xen grub-core/*mod cp grub-i386-xen $INST_DIR/$PREFIX/lib/xen/boot ## GRUB64 -if [[ $ARCH = x86_64 ]] +if [[ $RAISIN_ARCH = x86_64 ]] then $RAISIN_MAKE clean ./configure --target=amd64 --with-platform=xen diff --git a/components/libvirt b/components/libvirt index 5fb1a41..a554643 100644 --- a/components/libvirt +++ b/components/libvirt @@ -23,7 +23,7 @@ function libvirt_check_package() { local DEP_CentOS_x86_64=$DEP_Fedora_x86_64 echo Checking Libvirt dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function libvirt_build() { diff --git a/components/qemu b/components/qemu index 00aeeb6..72cfec1 100644 --- a/components/qemu +++ b/components/qemu @@ -12,7 +12,7 @@ function qemu_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common echo Checking QEMU dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function qemu_build() { diff --git a/components/qemu_traditional b/components/qemu_traditional index 500cbed..b338007 100644 --- a/components/qemu_traditional +++ b/components/qemu_traditional @@ -13,7 +13,7 @@ function qemu_traditional_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common echo Checking QEMU dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function qemu_traditional_build() { diff --git a/components/seabios b/components/seabios index 960a538..ed2c7d2 100644 --- a/components/seabios +++ b/components/seabios @@ -12,18 +12,18 @@ function seabios_check_package() { local DEP_Fedora_x86_64=$DEP_Fedora_common -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo seabios is only supported on x86_32 and x86_64 return fi echo Checking SeaBIOS dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function seabios_build() { -if [[ $ARCH != x86_64 $ARCH != x86_32 ]] +if [[ $RAISIN_ARCH != x86_64 $RAISIN_ARCH != x86_32 ]] then echo seabios is only supported on x86_32 and x86_64 return diff --git a/components/xen b/components/xen index b7258b0..b3426f0 100644 --- a/components/xen +++ b/components/xen @@ -20,7 +20,7 @@ function xen_check_package() { local DEP_CentOS_x86_64=$DEP_CentOS_x86_32 glibc-devel.i686 echo Checking Xen dependencies -eval check-package \$DEP_$DISTRO_$ARCH +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH } function xen_build() { diff --git a/lib/common-functions.sh b/lib/common-functions.sh index c19046b..186a53b 100644 --- a/lib/common-functions.sh +++ b/lib/common-functions.sh @@ -41,7 +41,7 @@ function common_init() {
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On 21.04.15 at 16:42, tkleng...@sec.in.tum.de wrote: On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 16:24, ian.campb...@citrix.com wrote: On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? It is, but do we really want to introduce other than just compat mode helper interfaces (i.e. leaving the current ones alone, and perhaps even making the new ones tools only) if we really care about such setups in the first place? At the moment I just followed Andrew's advice and will introduce a new XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as xen_pfn_t. The old memops I'll leave untouched if that's OK. For this specific one - is there a reasonable use case? Other than for host PFN, we have control over guest ones, and I'm not sure managing a guest with GPFNs extending past 4 billion can be expected to work if only this one hypercall got fixed. IOW I'm expecting to NAK any such addition without proper rationale. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH OSSTEST v2] Arrange for core dumps to be placed in /var/core and collect them
Ian Campbell writes ([PATCH OSSTEST v2] Arrange for core dumps to be placed in /var/core and collect them): Refactor the $kvp_replace helper in ts-xen-install into a generic helper (which requires using ::EO and ::EI for namespacing) for use with target_editfile and use it to edit /etc/sysctl.conf to set kernel.core_pattern on boot. Tested in standalone mode by installing and running a C program containing *(int *)0 = 1; which, after running ulimit -c unlimited produces the expected core file. ts-logs-capture when run in standalone mode then picks them up. I've not yet figured out how to make the desired rlimit take affect for all processes (including e.g. daemons spawned on boot). Likely this will involve some combination of pam_limits.so PAM module and adding explicit ulimit calls to the initscripts which we care about (primarily xencommons and libvirt initscripts). Signed-off-by: Ian Campbell ian.campb...@citrix.com Acked-by: Ian Jackson ian.jack...@eu.citrix.com --- v2: Add /var/core to ts-leak-check I'm going to push this and the two related patches, and the $r{enable_xsm} one, RSN. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 11/13] tools/libxl: add command to show CMT hardware info
On Tue, 2015-04-21 at 02:37 +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: Add dedicated one to show hardware information. [root@vmm-psr]xl psr-cmt-hwinfo Cache Monitoring Technology (CMT): Enabled : 1 Total RMID : 63 Supported monitor types: cache-occupancy total-mem-bandwidth local-mem-bandwidth Nice. Signed-off-by: Chao Peng chao.p.p...@linux.intel.com --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -8014,6 +8014,36 @@ out: } #ifdef LIBXL_HAVE_PSR_CMT +static int psr_cmt_hwinfo(void) +{ +int rc; +int enabled; +uint32_t total_rmid; + I think you should still have something like: SWITCH_FOREACH_OPT(opt, , NULL, psr-cmt-hwinfo, 0) { /* No options */ } Or `xl psr-cmt-hwinfo -h' wouldn't work, would it? Yes, that is neded. With that done: Reviewed-by: Dario Faggioli dario.faggi...@citrix.com Likewise: Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? (Tamas, while I had asked you to avoid Cc-ing me on ARM only patches, I would have been able to spot this earlier if you had Cc-ed me on this non-ARM change. Again, please Cc maintainers, but especially on larger series make sure you don't Cc everyone on every patch. I'm sure avoiding the extra mail load is being appreciated not just by me.) Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 6/7] docs/build: Move install checks into individual build targets
On Mon, 2015-04-20 at 13:01 +0100, Andrew Cooper wrote: On 20/04/15 12:57, Jan Beulich wrote: On 20.04.15 at 12:49, andrew.coop...@citrix.com wrote: @@ -103,12 +88,20 @@ install: install-man-pages install-html # Individual file build targets man1/%.1: man/%.pod.1 Makefile +ifdef POD2MAN Perhaps better to use ifneq($(POD2MAN),) in such cases? I was following the prevailing style, but can update all instances. FWIW, it is not currently an issue. I don't really mind, ifneq seems more prevalent in other makefiles, I'm not sure why this one uses ifdef nor really what the implications are. @$(INSTALL_DIR) $(@D) $(POD2MAN) --release=$(VERSION) --name=$* -s 1 -c Xen $ $@ +else + @echo pod2man not installed; skipping $(@F) Why do you strip off the directory part? Leaving it in place would make the output even less ambiguous (and hence more helpful). Can do. Thanks. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages
On Tue, 2015-04-21 at 16:10 +0200, Tamas K Lengyel wrote: On Tue, Apr 21, 2015 at 3:14 PM, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote: -paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) +static paddr_t __p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) I'd have been inclined to make this take a p2m_domain* rather than a domain*, on the basis that the caller must have one in hand to have locked it. I don't think __p2m_lookup uses d other than to find the p2m. Not worth changing unless there is some other reason to resend though, so with or without that: Acked-by: Ian Campbell ian.campb...@citrix.com Ian. Thanks, I made the change as I'll have another round anyway to introduce XENMEM_maximum_gpfn2 in the series. Great, thanks. FWIW there was another similar wrapper in another patch later on, but that one did use d so I didn't suggest the same change there for that reason. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On 21.04.15 at 16:24, ian.campb...@citrix.com wrote: On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64 bit host (maximum pfn more than 2^32)? It is, but do we really want to introduce other than just compat mode helper interfaces (i.e. leaving the current ones alone, and perhaps even making the new ones tools only) if we really care about such setups in the first place? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology
On 21/04/15 14:13, Boris Ostrovsky wrote: On 04/21/2015 09:14 AM, Andrew Cooper wrote: On 21/04/15 13:56, Boris Ostrovsky wrote: On 04/21/2015 03:01 AM, Jan Beulich wrote: --- a/docs/misc/xsm-flask.txt +++ b/docs/misc/xsm-flask.txt @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h) * XEN_SYSCTL_cpupool_op * XEN_SYSCTL_scheduler_op * XEN_SYSCTL_coverage_op + * XEN_SYSCTL_pcitopoinfo No additions to this list are permitted. Either the new sub-op is disaggregation safe (which it looks to be), or it can't be accepted. True, it *is* safe, but why then cputopoinfo and numainfo are in this list? They look to be safe as well. This list includes not yet audited. It is quite likely that some entries in the list are safe. fwiw, neither cputopo nor numainfo are currently safe for very large systems. Why would safety of an operation depend on system size? And how are these two unsafe? (Because maybe then PCI topology query is unsafe in the same manner) PCI topology has continuations in it. cpu/numa info is bounded by host values, but does attempt to cover all cpus/nodes in a single pass, which will get progressively longer as the server gets larger. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Wg-test-framework] osstest outage over the weekend
Don Koch writes (Re: [Wg-test-framework] osstest outage over the weekend): Sorry. I had gotten the impression that you had some other tool that was handling the DHCP addresses for the test machines. I guess what you meant (and I misinterpreted) was that it was handling figuring out which addresses were handed out to which machines. As a result, I thought that the dhcp server wasn't really in the production loop. What I have is a system for automatically producing dhcp server configuration so that test boxes can be automatically provisioned. You can see a hook for this in dhcpd.conf, which says: include /etc/dhcp/testhosts.dhcp.conf; There is an analogous hook in the dns zonefile. I will refrain from modifying it from here on. Thanks. In general this applies to everything in the colo, both hardware and software: Unless the resource in question is specific to one of the test boxes that have not yet been handed over to me, it should not be touched without consulting me. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG
Jan Beulich writes (Re: [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG): On 21.04.15 at 12:33, ian.jack...@eu.citrix.com wrote: In 952944f7 QEMU_TAG update my tag update script mangled the machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first assignment to QEMU_TRADITIONAL_REVISION it found rather than the one which ought to have been replaced. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com Reported-by: Jan Beulich jbeul...@suse.com Acked-by: Jan Beulich jbeul...@suse.com Thanks. Sorry for the breakage. (and please also for 4.5) Indeed. See below, which I have just pushed (without waiting for your further ack). Ian. From d41906197d9a89355f43d4359d795b1d0257a53a Mon Sep 17 00:00:00 2001 From: Ian Jackson ian.jack...@eu.citrix.com Date: Tue, 21 Apr 2015 14:47:38 +0100 Subject: [PATCH] Config.mk: Fix QEMU_TAG and QEMU_TRADITIONAL_REVISION handling In 2417e243 QEMU_TAG update my tag update script mangled the machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first assignment to QEMU_TRADITIONAL_REVISION it found rather than the one which ought to have been replaced. The result was that from that commit on, QEMU_TAG was no longer honoured although QEMU_TRADITIONAL_REVISION still was. Fix this by restoring the transfer from QEMU_TAG. This fix is analogous to 5d4c0952 Config.mk: Fix (and, effectively, update) QEMU_TAG from xen.git#staging, but in 4.5 there has not been a subsequent attempt to update the qemu revision, so both settings of QEMU_TRADITIONAL_REVISION are the same, and restoring the proper behaviour will not have the side effect of making a previous qemu tag update effective. Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: George Dunlap george.dun...@eu.citrix.com CC: Jan Beulich jbeul...@suse.com Reported-by: Jan Beulich jbeul...@suse.com --- Config.mk |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/Config.mk b/Config.mk index cb1d23b..08df07d 100644 --- a/Config.mk +++ b/Config.mk @@ -237,9 +237,7 @@ ifneq (,$(CONFIG_QEMU)) QEMU_TRADITIONAL_LOC ?= $(CONFIG_QEMU) endif ifneq (,$(QEMU_TAG)) -QEMU_TRADITIONAL_REVISION ?= 62e41581f69c3fd4a8f829a773015eb4c17f1f3e -# Tue Mar 31 16:27:45 2015 +0100 -# xen: limit guest control of PCI command register +QEMU_TRADITIONAL_REVISION ?= $(QEMU_TAG) endif ifeq ($(GIT_HTTP),y) -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT
On Tue, 2015-04-21 at 17:49 +0800, Chao Peng wrote: On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote: On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote: This is the xc/xl changes to support Intel Cache Allocation Technology(CAT). Two commands are introduced: - xl psr-cat-hwinfo Show CAT hardware information. Examples: [root@vmm-psr vmm]# xl psr-cat-hwinfo Cache Allocation Technology (CAT): Socket ID : 0 L3 Cache: 12288KB Maximum COS : 15 CBM length : 12 Default CBM : 0xfff Or, you can rename the psr-cmt-hwinfo command, added in the previous patch, to 'psr-hwinfo' and make it accept options, e.g., -m, --cmt show Cache Monitoring Technology (CMT) hardware info -c, --cat show Cache Allocation Technology (CAT) hardware info By default (i.e., no options provided), it can just print all the hw info. Not a big deal, but I think that would make a better command line interface. Tools' maintainers' call, I guess. Thanks for suggestion. FWIW I think this is a good idea. --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h +#ifdef LIBXL_HAVE_PSR_CAT + +#define LIBXL_PSR_TARGET_ALL (~0U) +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid, + libxl_psr_cbm_type type, uint32_t target, + uint64_t cbm); +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid, + libxl_psr_cbm_type type, uint32_t target, + uint64_t *cbm_r); + The same applies here: I'd rename taget to socket. Then there's that LIBXL_PSR_TARGET_ALL. So, target (or socket, if you'll rename it) is an integer representing _which_one_ socket to act on. However, there is a special value to mean all sockets. Another possibility would be to offer an API that natively allows for operating on multiple sockets, by using libxl_bitmap-s, as it happens in many other places, in libxl itself (e.g., setting and getting vcpu affinity). That means target would become a libxl_bitmap, and, in the implementation, you'll apply the operation on all the sockets corresponding to a set bit in there. Only one bit set means just that socket, all bits means all sockets. This looks like a better interface to me (no need for special ~0U values), it'd make the implementation more linear, and is more consistent with how other similar situations are handled in libxl. However, I appreciate that one may find it overkill... I guess it depends whether we expect the prevalent usage pattern to be almost always about single sockets --and maybe on all sockets, from time to time-- or if we see value in being able to specify more than one and less than all sockets. For instance, now that we have vNUMA, if a domain has 4 vNUMA nodes, each one mapped to a physical NUMA node, it looks to me like it would make sense to set CAT to, say, 0x0F, on the sockets corresponding to the physical NODEs. With the interface in this patch, that would require calling libxl_psr_cat_set_cbm() 4 times, with the libxl_bitmap approach, just once, after setting up the bitmap properly. Thoughts? I do like this suggestion and I have ever considered it actually. The only thing prevents me is that we need an extra _get_socket_count in xl for TARGET_ALL case. So libxl__count_physical_sockets is needed to be public. If Ian/Wei have no concerns for this, then I'm glad to do this. I don't think you need libxl__count_physical_sockets for this, the existing xl code for similar things just uses libxl_bitmap_set_any to handle the all case. With that in mind Dario's suggestion does seem like a good improvement to the interface. Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/7] docs/build: Move two rules for consistency, and comment sections
On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote: No functional change. Signed-off-by: Andrew Cooper andrew.coop...@citrix.com Acked-by: Ian Campbell ian.campb...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed
On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote: Nothing expensive depends on these results. But are those uses a problem? Also prefer $(INSTALL_DATA) over cp to get correct file attributes (see fb33b2b docs: make .txt files over-writable when building from r/o sources) Signed-off-by: Andrew Cooper andrew.coop...@citrix.com CC: Ian Campbell ian.campb...@citrix.com CC: Ian Jackson ian.jack...@eu.citrix.com CC: Wei Liu wei.l...@citrix.com --- docs/Makefile | 21 +++-- 1 file changed, 7 insertions(+), 14 deletions(-) diff --git a/docs/Makefile b/docs/Makefile index 0b458f1..d31b36f 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -116,8 +116,7 @@ html/index.html: $(DOC_HTML) $(CURDIR)/gen-html-index INDEX html/%.html: %.markdown $(INSTALL_DIR) $(@D) ifdef MARKDOWN - $(MARKDOWN) $ $@.tmp ; \ - $(call move-if-changed,$@.tmp,$@) + $(MARKDOWN) $ $@ else @echo markdown not installed; skipping $*.html. endif @@ -129,8 +128,7 @@ html/%.txt: %.txt html/man/%.1.html: man/%.pod.1 Makefile $(INSTALL_DIR) $(@D) ifdef POD2HTML - $(POD2HTML) --infile=$ --outfile=$@.tmp - $(call move-if-changed,$@.tmp,$@) + $(POD2HTML) --infile=$ --outfile=$@ else @echo pod2html not installed; skipping $. endif @@ -138,8 +136,7 @@ endif html/man/%.5.html: man/%.pod.5 Makefile $(INSTALL_DIR) $(@D) ifdef POD2HTML - $(POD2HTML) --infile=$ --outfile=$@.tmp - $(call move-if-changed,$@.tmp,$@) + $(POD2HTML) --infile=$ --outfile=$@ else @echo pod2html not installed; skipping $. endif @@ -161,19 +158,16 @@ html/hypercall/%/index.html: $(CURDIR)/xen-headers Makefile txt/%.txt: %.txt $(INSTALL_DIR) $(@D) - cp $ $@.tmp - $(call move-if-changed,$@.tmp,$@) + $(INSTALL_DATA) $ $@ txt/%.txt: %.markdown $(INSTALL_DIR) $(@D) - cp $ $@.tmp - $(call move-if-changed,$@.tmp,$@) + $(INSTALL_DATA) $ $@ txt/man/%.1.txt: man/%.pod.1 Makefile $(INSTALL_DIR) $(@D) ifdef POD2TEXT - $(POD2TEXT) $ $@.tmp - $(call move-if-changed,$@.tmp,$@) + $(POD2TEXT) $ $@ else @echo pod2text not installed; skipping $. endif @@ -181,8 +175,7 @@ endif txt/man/%.5.txt: man/%.pod.5 Makefile $(INSTALL_DIR) $(@D) ifdef POD2TEXT - $(POD2TEXT) $ $@.tmp - $(call move-if-changed,$@.tmp,$@) + $(POD2TEXT) $ $@ else @echo pod2text not installed; skipping $. endif ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values
On Tue, Apr 21, 2015 at 4:14 PM, Jan Beulich jbeul...@suse.com wrote: On 21.04.15 at 15:23, ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote: On 20/04/15 16:06, Tamas K Lengyel wrote: The current implementation of three memops, XENMEM_current_reservation, XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an int. However, in ARM64 we could potentially have 36-bit pfn's, thus in preparation for the ARM patch, in this patch we update the existing memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info as a uin64_t. This patch also adds error checking on the toolside in case the memop fails. Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API. You cannot make adjustments like this, but you can add a brand new op with appropriate parameters and list the old ops as deprecated. Right. For the benefit of callers using the old API it seems what we usually do is rename the old op XENMEM_foo_compat and use the name with a new number for the new functionality, then use a __XEN_INTERFACE_VERSION__ to #define back to the old name. The handling of __HYPERVISOR_sched_op in public/xen.h seems like a reasonable example, I couldn't find one specifically for the memory ops. And there's no need to afaict: This complication isn't needed in the first place. The patch's context already makes this clear: --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg) Note the long return type. Yet the patch description, for whatever reason, claims the hypercall to only return an int. Maybe because (as pointed out before) the respective Linux hypercall stub wrongly an int return type? The privcmd driver on Linux certainly does return an int via the ioctl. (Tamas, while I had asked you to avoid Cc-ing me on ARM only patches, I would have been able to spot this earlier if you had Cc-ed me on this non-ARM change. Again, please Cc maintainers, but especially on larger series make sure you don't Cc everyone on every patch. I'm sure avoiding the extra mail load is being appreciated not just by me.) Jan Ack, I took you off only after you repeatedly asked me to exclude you from this series. I agree that patches should be only cc-d to the respective maintainers and not all maintainers who are involved in the series. I'll look into how that could be automated because doing that by hand is not very pleasant. Thanks, Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test
Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test): On Tue, 2015-04-21 at 13:33 +0100, Ian Jackson wrote: It seems to be copying information from runvars into the in-memory data structure for host properties. But host properties are (by definition) matters of (fixed) configuration, not runtime definition. Remember that here the host is actual an L1 virtual machine, whose IP is not fixed (since it mac address isn't). Indeed. I don't know if that affects your opinion at all? No, it doesn't. I agree that the IP address should be dynamically figured out, and passing it through a runvar is fine. I'm only objecting to the involvement of the host property machinery. Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [Wg-test-framework] osstest outage over the weekend
On Mon, 20 Apr 2015 06:40:18 -0400 Ian Jackson ian.jack...@eu.citrix.com wrote: osstest service user writes ([qemu-mainline test] 50651: trouble: blocked/broken): flight 50651 qemu-mainline real [real] http://logs.test-lab.xenproject.org/osstest/logs/50651/ Failures and problems with tests :-( Tests which did not succeed and are blocking, including tests which could not be run: build-armhf-xsm 3 host-install(3)broken REGR. vs. 50391 build-armhf 3 host-install(3)broken REGR. vs. 50391 build-i386-pvops 3 host-install(3)broken REGR. vs. 50391 build-amd64-xsm 3 host-install(3)broken REGR. vs. 50391 build-i386-xsm3 host-install(3)broken REGR. vs. 50391 build-amd64-pvops 3 host-install(3)broken REGR. vs. 50391 build-amd64 3 host-install(3)broken REGR. vs. 50391 build-i3863 host-install(3)broken REGR. vs. 50391 build-armhf-pvops 3 host-install(3)broken REGR. vs. 50391 This was due to our contractors / hardware suppliers making an ill-advised change to the DHCP server configuration, which I have now reverted. All jobs started between some time around 2100 UTC on Friday and around 1030 UTC today failed in this way. The effect on individual flights can be mixed and depends on the period during which the flight was running. Don: This is a live, production system. Please do not make ANY changes with global effect, no matter how minor, without prior consultation. Sorry. I had gotten the impression that you had some other tool that was handling the DHCP addresses for the test machines. I guess what you meant (and I misinterpreted) was that it was handling figuring out which addresses were handed out to which machines. As a result, I thought that the dhcp server wasn't really in the production loop. I will refrain from modifying it from here on. -d Also if you do change anything you should let me know. I assume it was you rather than Paul. I only know exactly what you changed from looking at the git logs (thanks to etckeeper!) Thanks, Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v4 1/5] vscsiif.h: fix WWN notation for p-dev property
On Fri, Apr 17, 2015 at 08:30:56AM +, Olaf Hering wrote: The pvops kernel expects either naa.WWN:LUN or h:c:t:l in the p-dev property. Add the missing :LUN part to the comment. What about the older kernels that had said driver? Signed-off-by: Olaf Hering o...@aepfle.de Cc: Ian Campbell ian.campb...@citrix.com Cc: Ian Jackson ian.jack...@eu.citrix.com Cc: Jan Beulich jbeul...@suse.com Cc: Keir Fraser k...@xen.org Cc: Tim Deegan t...@xen.org --- xen/include/public/io/vscsiif.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/include/public/io/vscsiif.h b/xen/include/public/io/vscsiif.h index 7a1db05..e8e38a9 100644 --- a/xen/include/public/io/vscsiif.h +++ b/xen/include/public/io/vscsiif.h @@ -60,7 +60,7 @@ * * A string specifying the backend device: either a 4-tuple h:c:t:l * (host, controller, target, lun, all integers), or a WWN (e.g. - * naa.60014054ac780582). + * naa.60014054ac780582:0). * * v-dev * Values: string ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages
On Tue, Apr 21, 2015 at 3:14 PM, Ian Campbell ian.campb...@citrix.com wrote: On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote: -paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) +static paddr_t __p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t) I'd have been inclined to make this take a p2m_domain* rather than a domain*, on the basis that the caller must have one in hand to have locked it. I don't think __p2m_lookup uses d other than to find the p2m. Not worth changing unless there is some other reason to resend though, so with or without that: Acked-by: Ian Campbell ian.campb...@citrix.com Ian. Thanks, I made the change as I'll have another round anyway to introduce XENMEM_maximum_gpfn2 in the series. Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel