Re: [Xen-devel] [PATCH v17 13/23] x86/VPMU: Interface for setting PMU mode and flags
On 01/30/2015 08:31 AM, Jan Beulich wrote: On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote: +static long vpmu_unload_next(void *arg) +{ +struct vcpu *last; +int ret; +unsigned int thiscpu = smp_processor_id(); + +if ( thiscpu != vpmu_next_unload_cpu ) +{ +/* Continuation thread may have been moved due to CPU hot-unplug */ +vpmu_mode = (unsigned long)arg; +vpmu_first_unload_cpu = VPMU_INVALID_CPU; +return -EAGAIN; +} + +local_irq_disable(); /* so that last_vcpu doesn't change under us. */ + +last = this_cpu(last_vcpu); +if ( last ) +{ +vpmu_unload(vcpu_vpmu(last)); +this_cpu(last_vcpu) = NULL; +} So you do this for last_vcpu here, but ... +static int vpmu_unload_all(unsigned long old_mode) +{ +int ret = 0; + +vpmu_unload(vcpu_vpmu(current)); ... for current here, also without clearing this_cpu(last_vcpu). Is that really correct? No, I should clear it here as well. +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg) +{ +int ret; +struct xen_pmu_params pmu_params; + +ret = xsm_pmu_op(XSM_OTHER, current-domain, op); +if ( ret ) +return ret; + +switch ( op ) +{ +case XENPMU_mode_set: +{ +unsigned int old_mode; +static DEFINE_SPINLOCK(xenpmu_mode_lock); + +if ( copy_from_guest(pmu_params, arg, 1) ) +return -EFAULT; + +if ( pmu_params.val ~(XENPMU_MODE_SELF | XENPMU_MODE_HV) ) +return -EINVAL; + +/* 32-bit dom0 can only sample itself. */ +if ( is_pv_32bit_vcpu(current) (pmu_params.val XENPMU_MODE_HV) ) +return -EINVAL; + +/* + * Return error if someone else is in the middle of changing mode --- + * this is most likely indication of two system administrators + * working against each other. + */ +if ( !spin_trylock(xenpmu_mode_lock) ) +return -EAGAIN; +if ( vpmu_first_unload_cpu != VPMU_INVALID_CPU ) +{ +spin_unlock(xenpmu_mode_lock); +return -EAGAIN; +} + +old_mode = vpmu_mode; +vpmu_mode = pmu_params.val; + +if ( vpmu_mode == XENPMU_MODE_OFF ) +{ +/* Make sure all (non-dom0) VCPUs have unloaded their VPMUs. */ +ret = vpmu_unload_all(old_mode); +if ( ret ) +vpmu_mode = old_mode; +} + +spin_unlock(xenpmu_mode_lock); + +break; +} + +case XENPMU_mode_get: +memset(pmu_params, 0, sizeof(pmu_params)); +pmu_params.val = vpmu_mode; + +pmu_params.version.maj = XENPMU_VER_MAJ; +pmu_params.version.min = XENPMU_VER_MIN; + +if ( copy_to_guest(arg, pmu_params, 1) ) +return -EFAULT; +break; + +case XENPMU_feature_set: +if ( copy_from_guest(pmu_params, arg, 1) ) +return -EFAULT; + +if ( pmu_params.val ~XENPMU_FEATURE_INTEL_BTS ) +return -EINVAL; + +vpmu_features = pmu_params.val; +break; + +case XENPMU_feature_get: +pmu_params.val = vpmu_features; +if ( copy_field_to_guest(arg, pmu_params, val) ) +return -EFAULT; +break; + +default: +ret = -EINVAL; +} + +return ret; +} Throughout this function, the pad field isn't being checked to be zero (and XENPMU_feature_get also doesn't clear it, but that seems secondary, at least as long as it's being declared IN only). As I'm sure I said before - we ought to do this in order to be able to assign meaning to the field later on. Perhaps it would even better be renamed to e.g. mbz. If we decide to start using this filed then presumably we will have to bump interface version. If pad field becomes something else it will only be considered when minor version is 1. (And I should probably add major version check.) -boris ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [xen-unstable test] 33931: regressions - FAIL
flight 33931 xen-unstable real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/33931/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-amd64-xl-winxpsp3 7 windows-install fail REGR. vs. 33637 Regressions which are regarded as allowable (not blocking): test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 33637 Tests which did not succeed, but are not blocking: test-amd64-i386-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass test-armhf-armhf-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pvh-amd 9 guest-start fail never pass test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass test-amd64-amd64-libvirt 9 guest-start fail never pass test-amd64-amd64-xl-pvh-intel 9 guest-start fail never pass test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass version targeted for testing: xen 9f7798eb6ea20ba94617772ed0f3b8862b826533 baseline version: xen 7106c691a6332cffab4037186d1caa3012ae051e People who touched revisions under test: Andrew Cooper andrew.coop...@citrix.com Boris Ostrovsky boris.ostrov...@oracle.com Chao Peng chao.p.p...@linux.intel.com Dan Carpenter dan.carpen...@oracle.com Daniel De Graaf dgde...@tycho.nsa.gov David Scott dave.sc...@citrix.com David Vrabel david.vra...@citrix.com Dietmar Hahn dietmar.h...@ts.fujitsu.com Fabio Fantoni fabio.fant...@m2r.biz Ian Campbell ian.campb...@citrix.com Ian Jackson ian.jack...@eu.citrix.com Jan Beulich jbeul...@suse.com Julien Grall julien.gr...@linaro.org Kevin Tian kevin.t...@intel.com Konrad Rzeszutek Wilk konrad.w...@oracle.com Oleksandr Tyshchenko oleksandr.tyshche...@globallogic.com Quan Xu quan...@intel.com Roger Pau Monné roger@citrix.com Sander Eikelenboom li...@eikelenboom.it Stefano Stabellini stefano.stabell...@eu.citrix.com Tim Deegan t...@xen.org Wei Liu wei.l...@citrix.com jobs: build-amd64 pass build-armhf pass build-i386 pass build-amd64-libvirt pass build-armhf-libvirt pass build-i386-libvirt pass build-amd64-oldkern pass build-i386-oldkern pass build-amd64-pvopspass build-armhf-pvopspass build-i386-pvops pass build-amd64-rumpuserxen pass build-i386-rumpuserxen pass test-amd64-amd64-xl pass test-armhf-armhf-xl pass test-amd64-i386-xl pass test-amd64-amd64-xl-pvh-amd fail test-amd64-i386-rhel6hvm-amd pass test-amd64-i386-qemut-rhel6hvm-amd pass test-amd64-i386-qemuu-rhel6hvm-amd pass test-amd64-amd64-xl-qemut-debianhvm-amd64pass test-amd64-i386-xl-qemut-debianhvm-amd64 pass test-amd64-amd64-xl-qemuu-debianhvm-amd64pass test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
On 30/01/15 15:46, Julien Grall wrote: Hello Denys, On 30/01/15 15:40, Denis Drozdov wrote: From: denys drozdov denys.droz...@globallogic.com Update RT scheduler to run on arm platform You need to give more background of the problem (i.e why you have to disable the IRQ on ARM). As the scheduler is common with x86, I would expect the problem is also happening on this architecture. Changing a spinlock_irq into an irqsave is safe, functionally speaking, but it is concerning that the scheduler appears to be called in different interrupt states between architectures. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues
On Fri, Jan 30, 2015 at 03:34:19PM +, Jan Beulich wrote: On 30.01.15 at 16:09, daniel.ki...@oracle.com wrote: I suppose that we should provide additional kexec hypercall function which will return info about RS. kexec-tools should load new kernel as usual and add relevant argument to its command line. Most things are in place so we should just learn kexec-tools to do new things. There is a reason why the RS table info can't currently be obtained via a hypercall - Dom0 has nothing to do with it. Plus any kexec-ed kernel (Linux or other) will, under EFI, want this information, so a mechanism by which to pass the information to the secondary kernel without exposing it to entities not having a need to know would be preferable (albeit I have no idea so far how that might look like). Currently, anybody able to use HYPERVISOR_memory_op hypercall on dom0 is able to get full machine memory map. So, what is the problem with exposing more details about RS? Do you think about security? I suppose that we expose to dom0 other hardware details which potentially could be used in malicious way and RS details exposure will not undermine our security so strong. Plus, this still doesn't in any way deal with the aspect that was so far discussed in this thread - SetVirtualAddressMap() being callable only once. Well, we must live with it because this is part of UEFI spec. Or change UEFI spec. Still thinking why spec does not allow OS do this operation more then once. OTH, Konrad pointed out a solution (workaround) for this issue used in Linux which seems sensible and could be used by us too. Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] p2m: p2m_mmio_direct set RW permissions
On 30.01.15 at 17:34, elena.ufimts...@oracle.com wrote: On Tue, Jan 27, 2015 at 08:24:36AM +, Jan Beulich wrote: On 26.01.15 at 18:30, elena.ufimts...@oracle.com wrote: On Mon, Jan 26, 2015 at 05:06:12PM +, Jan Beulich wrote: On 26.01.15 at 17:57, elena.ufimts...@oracle.com wrote: On Fri, Jan 23, 2015 at 10:50:23AM +, Jan Beulich wrote: On 22.01.15 at 18:34, elena.ufimts...@oracle.com wrote: (XEN) d56f - d5fff000 (reserved) So this is where one of the RMRRs sits in (and also where the faults occur according to the two numbers you sent earlier, which - as others have already said - is an indication of the reported RMRRs being incomplete), ... (XEN) d5fff000 - d600 (usable) (XEN) d700 - df20 (reserved) ... and this is the exact range of the other one. But the usable entry between them is a sign of the firmware not doing the best job in assigning resources. I don't, btw, think that blindly mapping all the reserved regions into PVH Dom0's P2M would be (or is, if that's what's happening today) correct - these regions are named reserved for a reason. In the case here it's actually RAM, not MMIO, and Dom0 (as well as Xen) has no business accessing these (for others this may be different, e.g. the LAPIC and IO-APIC ones below, but Xen learns/knows of them by means different from looking at the memory map). I understand this this. At the same time I think pv dom0 does exactly this blind mapping. I also tried to map these regions as read-only and that worked. Can be this an option for these ram regions? No - they're reserved, so there shouldn't be _any_ access to them. The only possible workaround I see as acceptable would be the already proposed addition of a command line option allowing to specify additional RMRR-like regions. Understood. And I am guessing the permissions overloading option should be available as well? For example, RW or R only. RMRRs are always mapped with RW. That's an option, but not a requirement imo. Why can be this a platform quirk? Did you mean can't? If not, I don't understand the question. If so, remember that we're talking about RAM allocated by the firmware for special purposes. Hence such a quirk would need tailoring to any particular published firmware version, and would need to take into account any differences in the memory use that may result from setting firmware options differently (and assuming that the memory use is deterministic across boots when the options don't change - I've seen systems where memory use differed between warm and cold boots). IOW, not something that's likely to be practical. What about pv dom0 case? pv dom0 maps these reserved memory ranges with RW access rights. Should be this unified in some way? I'm again having some difficulty relating the question to the context given. But anyway - where do you see Dom0 getting reserved memory regions mapped RW unconditionally? The fact that iommu_inclusive_mapping=1 by default is to cover firmware bugs afaik. I could see us altering this at some point to only behave that way on older systems. But that's a decision mainly to be made by the VT-d maintainers. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
Julien, The details of this issue can be found in the current mailing list. Please have a look at http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html. This is exactly the same behaviour we are observing when running on our arm setup. The fix changes spin_lock_irq/spin_unlock_irq to spin_lock_irq_save/spin_lock_irq_restore, since context_save executed right from IRQ level. The arm interrupt handling differs from x86. ARM is handling context_saved with IRQ disabled in CPSR, though x86 has IRQs on. Thus it is failing on ASSERT inside spin_lock_irq when running on ARM. I guess it should work on x86, so this issue is ARM platform specific. On Fri, Jan 30, 2015 at 6:10 PM, Julien Grall julien.gr...@linaro.org wrote: On 30/01/15 15:56, Andrew Cooper wrote: On 30/01/15 15:46, Julien Grall wrote: Hello Denys, On 30/01/15 15:40, Denis Drozdov wrote: From: denys drozdov denys.droz...@globallogic.com Update RT scheduler to run on arm platform You need to give more background of the problem (i.e why you have to disable the IRQ on ARM). As the scheduler is common with x86, I would expect the problem is also happening on this architecture. Changing a spinlock_irq into an irqsave is safe, functionally speaking, but it is concerning that the scheduler appears to be called in different interrupt states between architectures. For instance credit2 is also using vcpu_schedule_lock_irq... Although, IIRC, it's not used by default. So it has to be fixed too on ARM. In any case, the commit message needs more background such as stack trace and/or explaining why we have to take spinlock_irq. This may also need to document context_saved to explain the IRQs may not be enabled when this function is called. Regards, -- Julien Grall -- Denis Drozdov GlobalLogic M +(380)988969537 S denis.drozdov www.globallogic.com http://www.globallogic.com/ http://www.globallogic.com/email_disclaimer.txt ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 07/12] xen: Remove mem_event
On 01/29/2015 04:46 PM, Tamas K Lengyel wrote: This is the third and final patch in the mem_event - vm_event renaming process which removes all references and remaining code of mem_event from the Xen source. Signed-off-by: Tamas K Lengyel tamas.leng...@zentific.com --- MAINTAINERS| 1 - docs/misc/xsm-flask.txt| 1 - tools/libxc/Makefile | 1 - tools/libxc/xc_mem_access.c| 16 +- tools/libxc/xc_mem_event.c | 162 - tools/libxc/xc_private.h | 12 - xen/common/Makefile| 1 - xen/common/mem_event.c | 738 - xen/include/public/domctl.h| 90 + xen/include/public/mem_event.h | 192 --- xen/include/xen/mem_event.h| 143 xen/include/xen/sched.h| 41 --- xen/include/xsm/dummy.h| 12 - xen/include/xsm/xsm.h | 12 - xen/xsm/dummy.c| 2 - xen/xsm/flask/hooks.c | 12 - 16 files changed, 14 insertions(+), 1422 deletions(-) delete mode 100644 tools/libxc/xc_mem_event.c delete mode 100644 xen/common/mem_event.c delete mode 100644 xen/include/public/mem_event.h delete mode 100644 xen/include/xen/mem_event.h You should also remove mem_event from xen/xsm/flask/policy/access_vectors in this patch. -- Daniel De Graaf National Security Agency ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
On 30/01/15 15:56, Andrew Cooper wrote: On 30/01/15 15:46, Julien Grall wrote: Hello Denys, On 30/01/15 15:40, Denis Drozdov wrote: From: denys drozdov denys.droz...@globallogic.com Update RT scheduler to run on arm platform You need to give more background of the problem (i.e why you have to disable the IRQ on ARM). As the scheduler is common with x86, I would expect the problem is also happening on this architecture. Changing a spinlock_irq into an irqsave is safe, functionally speaking, but it is concerning that the scheduler appears to be called in different interrupt states between architectures. For instance credit2 is also using vcpu_schedule_lock_irq... Although, IIRC, it's not used by default. So it has to be fixed too on ARM. In any case, the commit message needs more background such as stack trace and/or explaining why we have to take spinlock_irq. This may also need to document context_saved to explain the IRQs may not be enabled when this function is called. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
On 30/01/15 16:19, Denys Drozdov wrote: Julien, The details of this issue can be found in the current mailing list. Please have a look athttp://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html. This is exactly the same behaviour we are observing when running on our arm setup. I understood the problem... I just pointed out that the commit message is not useful. Hence the request of more background. If we have to read the log in a couple of years time. We should be able to understand the problem/solution with the commit message. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 3/3] xen/arm: Introduce support for Renesas R-Car Gen2 platform
On 30/01/15 16:44, Oleksandr Tyshchenko wrote: On Fri, Jan 30, 2015 at 5:52 PM, Julien Grall julien.gr...@linaro.org wrote: Hi Oleksandr, Hi Julien On 30/01/15 15:38, Oleksandr Tyshchenko wrote: For example, such a simple thing as a formula for calculating divider value for uart baudrate can differs from one family to another despite the fact that these families use the same UART IP block named SCIF (clock may be differs - ext/int, another freq). Or FIFO size, etc. IHMO this look more configuration than logic change in the UART driver. We don't ask you to support all the configuration of the UART right now ;). However, if someone comes with a new platform using the SCIF, I expect him to reuse and fix this driver rather than creating a new one. FWIW, the compatible string is renesas,scif not rcar smth. So, SCIF sounds a better name for this UART. ok Just to clarify: - Should I remove all references to R-Car Gen2 from UART stuff? Yes please. - Can I keep your Reviewed-by after there mechanical changes. Sure. It would be nice to hear your and Ian's opinions about board file. I'm not sure about this one. I will let Ian decides. Regards, -- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues
On 30.01.15 at 16:09, daniel.ki...@oracle.com wrote: I suppose that we should provide additional kexec hypercall function which will return info about RS. kexec-tools should load new kernel as usual and add relevant argument to its command line. Most things are in place so we should just learn kexec-tools to do new things. There is a reason why the RS table info can't currently be obtained via a hypercall - Dom0 has nothing to do with it. Plus any kexec-ed kernel (Linux or other) will, under EFI, want this information, so a mechanism by which to pass the information to the secondary kernel without exposing it to entities not having a need to know would be preferable (albeit I have no idea so far how that might look like). Plus, this still doesn't in any way deal with the aspect that was so far discussed in this thread - SetVirtualAddressMap() being callable only once. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 3/3] xen/arm: Introduce support for Renesas R-Car Gen2 platform
On Fri, Jan 30, 2015 at 5:52 PM, Julien Grall julien.gr...@linaro.org wrote: Hi Oleksandr, Hi Julien On 30/01/15 15:38, Oleksandr Tyshchenko wrote: For example, such a simple thing as a formula for calculating divider value for uart baudrate can differs from one family to another despite the fact that these families use the same UART IP block named SCIF (clock may be differs - ext/int, another freq). Or FIFO size, etc. IHMO this look more configuration than logic change in the UART driver. We don't ask you to support all the configuration of the UART right now ;). However, if someone comes with a new platform using the SCIF, I expect him to reuse and fix this driver rather than creating a new one. FWIW, the compatible string is renesas,scif not rcar smth. So, SCIF sounds a better name for this UART. ok Just to clarify: - Should I remove all references to R-Car Gen2 from UART stuff? - Can I keep your Reviewed-by after there mechanical changes. It would be nice to hear your and Ian's opinions about board file. Regards, -- Julien Grall -- Oleksandr Tyshchenko | Embedded Dev GlobalLogic www.globallogic.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 2/3] x86/VPMU: Disable VPMU when NMI watchdog is on
NMI watchdog sets APIC_LVTPC register to generate an NMI when PMU counter overflow occurs. This may be overwritten by VPMU code later, effectively turning off the watchdog. We should disable VPMU when NMI watchdog is running. Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com --- docs/misc/xen-command-line.markdown | 2 ++ xen/arch/x86/hvm/vpmu.c | 13 + 2 files changed, 15 insertions(+) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 2274e74..bc316be 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1346,6 +1346,8 @@ wrong behaviour (see handle\_pmc\_quirk()). If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS) feature is switched on on Intel processors supporting this feature. +Note that if **watchdog** option is also specified vpmu will be turned off. + *Warning:* As the BTS virtualisation is not 100% safe and because of the nehalem quirk don't use the vpmu flag on production systems with Intel cpus! diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c index 63b2158..b2e8e65 100644 --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -24,6 +24,7 @@ #include asm/regs.h #include asm/types.h #include asm/msr.h +#include asm/nmi.h #include asm/hvm/support.h #include asm/hvm/vmx/vmx.h #include asm/hvm/vmx/vmcs.h @@ -288,3 +289,15 @@ void vpmu_dump(struct vcpu *v) vpmu-arch_vpmu_ops-arch_vpmu_dump(v); } +static int __init vpmu_init(void) +{ +/* NMI watchdog uses LVTPC and HW counter */ +if ( opt_watchdog opt_vpmu_enabled ) +{ +printk(XENLOG_WARNING NMI watchdog is enabled. Turning VPMU off.\n); +opt_vpmu_enabled = 0; +} + +return 0; +} +__initcall(vpmu_init); -- 1.8.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 1/3] x86/nmi: Cleanup usage of nmi_watchdog
Use NMI_NONE when testing whether NMI watchdog is off. Remove unused NMI_INVALID macro. Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com --- xen/arch/x86/nmi.c | 4 ++-- xen/arch/x86/traps.c | 3 ++- xen/include/asm-x86/apic.h | 1 - 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c index 98c1e15..2ab97a0 100644 --- a/xen/arch/x86/nmi.c +++ b/xen/arch/x86/nmi.c @@ -148,7 +148,7 @@ int __init check_nmi_watchdog (void) int cpu; bool_t ok = 1; -if ( !nmi_watchdog ) +if ( nmi_watchdog == NMI_NONE ) return 0; printk(Testing NMI watchdog on all CPUs:); @@ -361,7 +361,7 @@ static int __pminit setup_p4_watchdog(void) void __pminit setup_apic_nmi_watchdog(void) { -if (!nmi_watchdog) +if ( nmi_watchdog == NMI_NONE ) return; switch (boot_cpu_data.x86_vendor) { diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index ec324b0..f5516dc 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -3385,7 +3385,8 @@ void do_nmi(const struct cpu_user_regs *regs) if ( nmi_callback(regs, cpu) ) return; -if ( !nmi_watchdog || (!nmi_watchdog_tick(regs) watchdog_force) ) +if ( (nmi_watchdog == NMI_NONE) || + (!nmi_watchdog_tick(regs) watchdog_force) ) handle_unknown = 1; /* Only the BSP gets external NMIs from the system. */ diff --git a/xen/include/asm-x86/apic.h b/xen/include/asm-x86/apic.h index 6697245..be9a535 100644 --- a/xen/include/asm-x86/apic.h +++ b/xen/include/asm-x86/apic.h @@ -221,7 +221,6 @@ extern unsigned int nmi_watchdog; #define NMI_NONE 0 #define NMI_IO_APIC1 #define NMI_LOCAL_APIC 2 -#define NMI_INVALID3 #else /* !CONFIG_X86_LOCAL_APIC */ static inline int lapic_suspend(void) {return 0;} -- 1.8.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] rcu_sched self-detect stall when disable vif device
On 28/01/15 17:27, Julien Grall wrote: On 28/01/15 17:06, David Vrabel wrote: On 28/01/15 16:45, Julien Grall wrote: On 27/01/15 16:53, Wei Liu wrote: On Tue, Jan 27, 2015 at 04:47:45PM +, Julien Grall wrote: On 27/01/15 16:45, Wei Liu wrote: On Tue, Jan 27, 2015 at 04:03:52PM +, Julien Grall wrote: Hi, While I'm working on support for 64K page in netfront, I got an rcu_sced self-detect message. It happens when netback is disabling the vif device due to an error. I'm using Linux 3.19-rc5 on seattle (ARM64). Any idea why the processor is stucked in xenvif_rx_queue_purge? When you try to release a SKB, core network driver need to enter some RCU cirital region to clean up. dst_release for one, calls call_rcu. But this message shouldn't happen in normal condition or because of netfront. Right? Never saw report like this before, even in the case that netfront is buggy. This is only happening when preemption is not enabled (i.e CONFIG_PREEMPT_NONE in the config file) in the backend kernel. When the vif is disabled, the loop in xenvif_kthread_guest_rx turned into an infinite loop. In my case, the code executed looks like: 1. for (;;) { 2. xenvif_wait_for_rx_work(queue); 3. 4. if (kthread_should_stop()) 5. break; 6. 7. if (unlikely(vif-disabled queue-id == 0) { 8. xenvif_carrier_off(vif); 9. xenvif_rx_queue_purge(queue); 10. continue; 11. } 12. } The wait on line 2 will return directly because the vif is disabled (see xenvif_have_rx_work) We are on queue 0, so the condition on line 7 is true. Therefore we will loop on line 10. And so on... On platform where preemption is not enabled, this thread will never yield/give the hand to another thread (unless the domain is destroyed). I'm not sure why we have a continue in the vif-disabled case and not just a break. Can you try that? So I applied this small patches: diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 908e65e..9448c6c 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -2110,7 +2110,7 @@ int xenvif_kthread_guest_rx(void *data) if (unlikely(vif-disabled queue-id == 0)) { xenvif_carrier_off(vif); xenvif_rx_queue_purge(queue); - continue; + break; } if (!skb_queue_empty(queue-rx_queue)) How about this? 8-- xen-netback: stop the guest rx thread after a fatal error After commit e9d8b2c2968499c1f96563e6522c56958d5a1d0d (xen-netback: disable rogue vif in kthread context), a fatal (protocol) error would leave the guest Rx thread spinning, wasting CPU time. Commit ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e (xen-netback: reintroduce guest Rx stall detection) made this even worse by removing a cond_resched() from this path. A fatal error is non-recoverable so just allow the guest Rx thread to exit. This requires taking additional refs to the task so the thread exiting early is handled safely. Signed-off-by: David Vrabel david.vra...@citrix.com diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c index 9259a73..037f74f 100644 --- a/drivers/net/xen-netback/interface.c +++ b/drivers/net/xen-netback/interface.c @@ -578,6 +578,7 @@ int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref, goto err_rx_unbind; } queue-task = task; + get_task_struct(task); task = kthread_create(xenvif_dealloc_kthread, (void *)queue, %s-dealloc, queue-name); @@ -634,6 +635,7 @@ void xenvif_disconnect(struct xenvif *vif) if (queue-task) { kthread_stop(queue-task); + put_task_struct(queue-task); queue-task = NULL; } diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c index 908e65e..c8ce701 100644 --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -2109,8 +2109,7 @@ int xenvif_kthread_guest_rx(void *data) */ if (unlikely(vif-disabled queue-id == 0)) { xenvif_carrier_off(vif); - xenvif_rx_queue_purge(queue); - continue; + break; } if (!skb_queue_empty(queue-rx_queue)) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC] p2m: p2m_mmio_direct set RW permissions
On Tue, Jan 27, 2015 at 08:24:36AM +, Jan Beulich wrote: On 26.01.15 at 18:30, elena.ufimts...@oracle.com wrote: On Mon, Jan 26, 2015 at 05:06:12PM +, Jan Beulich wrote: On 26.01.15 at 17:57, elena.ufimts...@oracle.com wrote: On Fri, Jan 23, 2015 at 10:50:23AM +, Jan Beulich wrote: On 22.01.15 at 18:34, elena.ufimts...@oracle.com wrote: (XEN) d56f - d5fff000 (reserved) So this is where one of the RMRRs sits in (and also where the faults occur according to the two numbers you sent earlier, which - as others have already said - is an indication of the reported RMRRs being incomplete), ... (XEN) d5fff000 - d600 (usable) (XEN) d700 - df20 (reserved) ... and this is the exact range of the other one. But the usable entry between them is a sign of the firmware not doing the best job in assigning resources. I don't, btw, think that blindly mapping all the reserved regions into PVH Dom0's P2M would be (or is, if that's what's happening today) correct - these regions are named reserved for a reason. In the case here it's actually RAM, not MMIO, and Dom0 (as well as Xen) has no business accessing these (for others this may be different, e.g. the LAPIC and IO-APIC ones below, but Xen learns/knows of them by means different from looking at the memory map). I understand this this. At the same time I think pv dom0 does exactly this blind mapping. I also tried to map these regions as read-only and that worked. Can be this an option for these ram regions? No - they're reserved, so there shouldn't be _any_ access to them. The only possible workaround I see as acceptable would be the already proposed addition of a command line option allowing to specify additional RMRR-like regions. Understood. And I am guessing the permissions overloading option should be available as well? For example, RW or R only. RMRRs are always mapped with RW. That's an option, but not a requirement imo. Why can be this a platform quirk? Did you mean can't? If not, I don't understand the question. If so, remember that we're talking about RAM allocated by the firmware for special purposes. Hence such a quirk would need tailoring to any particular published firmware version, and would need to take into account any differences in the memory use that may result from setting firmware options differently (and assuming that the memory use is deterministic across boots when the options don't change - I've seen systems where memory use differed between warm and cold boots). IOW, not something that's likely to be practical. Jan What about pv dom0 case? pv dom0 maps these reserved memory ranges with RW access rights. Should be this unified in some way? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor
On Fri, Jan 30, 2015 at 08:37:33PM +, Andrew Cooper wrote: On 30/01/15 19:51, Luis R. Rodriguez wrote: On Fri, Jan 30, 2015 at 02:23:48PM +, Andrew Cooper wrote: On 30/01/15 01:14, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com There are several ways that a Xen system can update the CPU microcode on a pvops kernel [0] now, the preferred way is through the early microcode update mechanism. At run time folks should use this new xenmicrocode tool and use the same CPU microcode file as present on /lib/firmware. Some distributions may use the historic sysfs rescan interface, users of that mechanism should be aware that the interface will not be available when using Xen and as such should first check the presence of the interface before usage, as an alternative this xenmicrocode tool can be used on priviledged domains. Folks wishing to update CPU microcode at run time should be aware that not all CPU microcode can be updated on a system and should take care to ensure that only known-to-work and supported CPU microcode updates are used [0]. To avoid issues with delays on the hypercall / microcode update this implementation will quiesce all domains prior to updating the microcode, and then only queisce'd domains will be unpaused once done. [0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update Based on original work by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Andrew Cooper andrew.coop...@citrix.com Cc: Borislav Petkov b...@suse.de Cc: Takashi Iwai ti...@suse.de Cc: Olaf Hering oher...@suse.de Cc: Jan Beulich jbeul...@suse.com Cc: Jason Douglas jdoug...@suse.com Cc: Juergen Gross jgr...@suse.com Cc: Michal Marek mma...@suse.cz Cc: Henrique de Moraes Holschuh h...@debian.org Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Luis R. Rodriguez mcg...@suse.com --- Just wrote this, haven't tested it. This does some queiscing prior to doing the microcode update. The quiescing is done by pausing all domains. Once the microcode update is done we only unpause domains which we queisced as part of our work. Let me know if this is on the right track to help avoid issues noted on the list. There is also a TOCTOU race with your paused check, which itself is buggy as it you should unconditionally pause all VMs (userspace pause refcounting has been fixed for a little while now). Also, currently __domain_pause_by_systemcontroller() has a limit to 255 guests. Are the fixes that you describe to the refcounting sufficient to remove that limitation from __domain_pause_by_systemcontroller()? The limit is the number of concurrent userspace refs taken on an individual domain. I.e. you can plausibly have 255 different bits of the toolstack each taking a pause reference for a specific reason. 255 was chosen an arbitrary limit, which is used to prevent buggy toolstacks from being able to overflow the refcounts used by the scheduler by using the pause domain hypercall 4 billion times. My implementation uses 1024 but has no check on nb_domain (the amount of domains set) so that requires fixing as well but I figure we should also review the above first too. Artificial limits bother me and I went with 1024 as I also saw that limit used elsewhere, not sure if it was a stack concern or what. However, xenmicrocode (even indirectly via xc_microcode_update()) is not in a position to safely pause all domains as there is no interlock to preventing a new domain being created. Even if all domains do get successfully paused, I did think about this and figured we could use this as a segway into a discussion about how we would want to implement this sort of interlocking or see if there is anything available already for this sort of thing. Also, are there other future users of this that could benefit from it ? If so then perhaps we can wrap the requirements up together. the idle loops are substantially less trivial than ideal. Interesting, can you elaborate on the possible issues that might creep up on the idle loops while a guest is paused during a microcode update? Any single issue would suffice, just curious. Do we need something more intricate than pause implemented then? Using suspend seemed rather grotesque to shove down a guest's throat. If pause / suspend do not suffice perhaps some new artificial virtual tempory quiesce is needed to ensure integrity here, which would address some of the idle concerns you highligted might creep up. The toolstack should not hack around hypervisor bugs, and indeed is not capable of doing so. Agreed. I figured I'd at least do what I can with what is available and use this as a discussoin for what is the Right Thing To Do (TM) in the future. The right thing to do is to fix the hypercall implementation, at which
Re: [Xen-devel] [PATCHv5 00/14] xen: fix many long-standing grant mapping bugs
On Tue, Jan 27, 2015 at 04:44:02PM +, David Vrabel wrote: This series fixes a number of long-standing bugs in the handling of grant maps. Refer to the following for all the details. http://xenbits.xen.org/people/dvrabel/grant-improvements-C.pdf In summary, the important uses that this enables are: 1. Block backends can use networked storage safely. 2. Block backends can use network storage provided by other guests on the same host. 3. User space block backends can use direct I/O or asynchronous I/O. The first two patches are the core MM changes necessary. These have already been acked. Patches #3 and #4 remove existing (broken) mechanisms. This does temporarily break some previously working use cases, but it does make the subsequent additions much easier to review. Only patches #3, #4, and #13 need acks from other Xen maintainers. I took a look at the patches and they look good to me. As a happy side effect, performance is also likely to be improved in some areas (but I've not got any measurements yet). User space backends using grant mapping should see some good improvements from reduced overheads and better unmap batching. VIF to VIF network traffic may also see a small improvement. Finally, thanks to Jenny who did much of the implementation. Changes in v5: - Fix gntdev for auto-xlate guests. Changes in v4: - Removed xen_blkbk_unmap_done(). - Fixed bug in blkback when using mixed persistent and non-persistent grants. Changes in v3: - find_page renamed to find_special_page. - Added documentation for mm changes. - Fixed mangled forward port of blkback's safe unmap patch. - Export gnttab_alloc_pages() and gnttab_free_pages(). - Fix 32-bit build. Changes in v2: - Add find_page() VMA op instead of pages field (so struct vm_area_struct doesn't increase in size. - Make foreign page tracking arch-independant and improve the API. - Alloc extra memory (for 32-bit archs) for the (domain, gref) when allocating the page (instead of during the map). - Convert gntdev's lock to a mutex. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor
On 01/30/2015 03:37 PM, Andrew Cooper wrote: The actions Xen needs to take are: - Copy the buffer into Xen. - Scan the buffer for the correct patch Why not have the toolstack search for the right patch? Hypervisor will verify that it's appropriate but won't have to spend time scanning the (potentially large) buffer. (The logic for scanning the buffer is in the hypervisor code but we could move it to a library) -boris - Rendezvous all online cpus in an IPI to apply the patch, and keep the processors in until all have completed the patch. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor
On 30/01/2015 21:24, Boris Ostrovsky wrote: On 01/30/2015 03:37 PM, Andrew Cooper wrote: The actions Xen needs to take are: - Copy the buffer into Xen. - Scan the buffer for the correct patch Why not have the toolstack search for the right patch? Hypervisor will verify that it's appropriate but won't have to spend time scanning the (potentially large) buffer. (The logic for scanning the buffer is in the hypervisor code but we could move it to a library) a) Xen already needs to scan for the boot time patching and b) because the interface already has this property. One could certainly prepare the patch in userspace and that would be more efficient (patches welcome if you are feeling particularly keen), but I am not going to loose sleep over it. It is likely that the majority of users will delay a microcode update until a maintenance window and plan time for a reboot. ~Andrew ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 13/18] efi: create efi_tables()
..which collects system tables data. We need this to support multiboot2 protocol on EFI platforms. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/common/efi/boot.c | 51 - 1 file changed, 29 insertions(+), 22 deletions(-) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 114019e..cf0fbc2 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -707,6 +707,34 @@ static UINTN __init efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, return gop_mode; } +static void __init efi_tables(void) +{ +unsigned int i; + +/* Obtain basic table pointers. */ +for ( i = 0; i efi_num_ct; ++i ) +{ +static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID; +static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID; +static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID; +static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID; + +if ( match_guid(acpi2_guid, efi_ct[i].VendorGuid) ) + efi.acpi20 = (long)efi_ct[i].VendorTable; +if ( match_guid(acpi_guid, efi_ct[i].VendorGuid) ) + efi.acpi = (long)efi_ct[i].VendorTable; +if ( match_guid(mps_guid, efi_ct[i].VendorGuid) ) + efi.mps = (long)efi_ct[i].VendorTable; +if ( match_guid(smbios_guid, efi_ct[i].VendorGuid) ) + efi.smbios = (long)efi_ct[i].VendorTable; +} + +#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */ +if (efi.smbios != EFI_INVALID_TABLE_ADDR) +dmi_efi_get_table((void *)(long)efi.smbios); +#endif +} + static void __init setup_efi_pci(void) { EFI_STATUS status; @@ -1027,28 +1055,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) /* XXX Collect EDID info. */ efi_arch_cpu(); -/* Obtain basic table pointers. */ -for ( i = 0; i efi_num_ct; ++i ) -{ -static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID; -static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID; -static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID; -static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID; - -if ( match_guid(acpi2_guid, efi_ct[i].VendorGuid) ) - efi.acpi20 = (long)efi_ct[i].VendorTable; -if ( match_guid(acpi_guid, efi_ct[i].VendorGuid) ) - efi.acpi = (long)efi_ct[i].VendorTable; -if ( match_guid(mps_guid, efi_ct[i].VendorGuid) ) - efi.mps = (long)efi_ct[i].VendorTable; -if ( match_guid(smbios_guid, efi_ct[i].VendorGuid) ) - efi.smbios = (long)efi_ct[i].VendorTable; -} - -#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */ -if (efi.smbios != EFI_INVALID_TABLE_ADDR) -dmi_efi_get_table((void *)(long)efi.smbios); -#endif +efi_tables(); /* Collect PCI ROM contents. */ setup_efi_pci(); -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 09/18] efi: create efi_init()
..which initializes basic EFI variables. We need this to support multiboot2 protocol on EFI platforms. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/common/efi/boot.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 89d081d..1bf88e4 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -586,6 +586,21 @@ static char *__init get_value(const struct file *cfg, const char *section, return NULL; } +static void __init efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) +{ +efi_ih = ImageHandle; +efi_bs = SystemTable-BootServices; +efi_rs = SystemTable-RuntimeServices; +efi_ct = SystemTable-ConfigurationTable; +efi_num_ct = SystemTable-NumberOfTableEntries; +efi_version = SystemTable-Hdr.Revision; +efi_fw_vendor = SystemTable-FirmwareVendor; +efi_fw_revision = SystemTable-FirmwareRevision; + +StdOut = SystemTable-ConOut; +StdErr = SystemTable-StdErr ?: StdOut; +} + static void __init setup_efi_pci(void) { EFI_STATUS status; @@ -713,17 +728,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable) efi_loader = 1; #endif -efi_ih = ImageHandle; -efi_bs = SystemTable-BootServices; -efi_rs = SystemTable-RuntimeServices; -efi_ct = SystemTable-ConfigurationTable; -efi_num_ct = SystemTable-NumberOfTableEntries; -efi_version = SystemTable-Hdr.Revision; -efi_fw_vendor = SystemTable-FirmwareVendor; -efi_fw_revision = SystemTable-FirmwareRevision; +efi_init(ImageHandle, SystemTable); -StdOut = SystemTable-ConOut; -StdErr = SystemTable-StdErr ?: StdOut; use_cfg_file = efi_arch_use_config_file(SystemTable); status = efi_bs-HandleProtocol(ImageHandle, loaded_image_guid, -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 03/18] x86/boot: use %ecx instead of %eax
Use %ecx instead of %eax to store low memory upper limit from EBDA. This way we do not wipe multiboot protocol identifier. It is needed in reloc() to differentiate between multiboot (v1) and multiboot2 protocol. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/boot/head.S | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index c99f739..6180783 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -86,14 +86,14 @@ __start: jne not_multiboot /* Set up trampoline segment 64k below EBDA */ -movzwl 0x40e,%eax /* EBDA segment */ -cmp $0xa000,%eax/* sanity check (high) */ +movzwl 0x40e,%ecx /* EBDA segment */ +cmp $0xa000,%ecx/* sanity check (high) */ jae 0f -cmp $0x4000,%eax/* sanity check (low) */ +cmp $0x4000,%ecx/* sanity check (low) */ jae 1f 0: -movzwl 0x413,%eax /* use base memory size on failure */ -shl $10-4,%eax +movzwl 0x413,%ecx /* use base memory size on failure */ +shl $10-4,%ecx 1: /* * Compare the value in the BDA with the information from the @@ -105,21 +105,22 @@ __start: cmp $0x100,%edx /* is the multiboot value too small? */ jb 2f /* if so, do not use it */ shl $10-4,%edx -cmp %eax,%edx /* compare with BDA value */ -cmovb %edx,%eax /* and use the smaller */ +cmp %ecx,%edx /* compare with BDA value */ +cmovb %edx,%ecx /* and use the smaller */ 2: /* Reserve 64kb for the trampoline */ -sub $0x1000,%eax +sub $0x1000,%ecx /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */ -xor %al, %al -shl $4, %eax -mov %eax,sym_phys(trampoline_phys) +xor %cl, %cl +shl $4, %ecx +mov %ecx,sym_phys(trampoline_phys) /* Save the Multiboot info struct (after relocation) for later use. */ mov $sym_phys(cpu0_stack)+1024,%esp -push%ebx -callreloc +mov %ecx,%eax +push%ebx/* Multiboot information address */ +callreloc /* %eax contains trampoline address */ mov %eax,sym_phys(multiboot_ptr) /* Initialize BSS (no nasty surprises!) */ -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 02/18] x86/boot/reloc: create generic alloc and copy functions
Create generic alloc and copy functions. We need separate tools for memory allocation and copy to provide multiboot2 protocol support. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/arch/x86/boot/reloc.c | 52 - 1 file changed, 33 insertions(+), 19 deletions(-) diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c index 63045c0..f964cda 100644 --- a/xen/arch/x86/boot/reloc.c +++ b/xen/arch/x86/boot/reloc.c @@ -33,9 +33,10 @@ asm ( typedef unsigned int u32; #include ../../../include/xen/multiboot.h -static void *reloc_mbi_struct(void *old, unsigned int bytes) +static u32 alloc_struct(u32 bytes) { -void *new; +u32 s; + asm( call 1f \n 1: pop %%edx \n @@ -43,50 +44,63 @@ static void *reloc_mbi_struct(void *old, unsigned int bytes) sub %1,%0 \n and $~15,%0 \n mov %0,alloc-1b(%%edx) \n -mov %0,%%edi\n -rep movsb \n - : =r (new), +c (bytes), +S (old) - : : edx, edi, memory); -return new; + : =r (s) : r (bytes) : edx, memory); + +return s; +} + +static u32 copy_struct(u32 src, u32 bytes) +{ +u32 dst, dst_asm; + +dst = alloc_struct(bytes); +dst_asm = dst; + +asm volatile(rep movsb : +S (src), +D (dst_asm), +c (bytes) : : memory); + +return dst; } -static char *reloc_mbi_string(char *old) +static u32 copy_string(u32 src) { char *p; -for ( p = old; *p != '\0'; p++ ) + +if ( src == 0 ) +return 0; + +for ( p = (char *)src; *p != '\0'; p++ ) continue; -return reloc_mbi_struct(old, p - old + 1); + +return copy_struct(src, p - (char *)src + 1); } multiboot_info_t *reloc(multiboot_info_t *mbi_old) { -multiboot_info_t *mbi = reloc_mbi_struct(mbi_old, sizeof(*mbi)); +multiboot_info_t *mbi = (multiboot_info_t *)copy_struct((u32)mbi_old, sizeof(*mbi)); int i; if ( mbi-flags MBI_CMDLINE ) -mbi-cmdline = (u32)reloc_mbi_string((char *)mbi-cmdline); +mbi-cmdline = copy_string(mbi-cmdline); if ( mbi-flags MBI_MODULES ) { -module_t *mods = reloc_mbi_struct( -(module_t *)mbi-mods_addr, mbi-mods_count * sizeof(module_t)); +module_t *mods = (module_t *)copy_struct( +mbi-mods_addr, mbi-mods_count * sizeof(module_t)); mbi-mods_addr = (u32)mods; for ( i = 0; i mbi-mods_count; i++ ) { if ( mods[i].string ) -mods[i].string = (u32)reloc_mbi_string((char *)mods[i].string); +mods[i].string = copy_string(mods[i].string); } } if ( mbi-flags MBI_MEMMAP ) -mbi-mmap_addr = (u32)reloc_mbi_struct( -(memory_map_t *)mbi-mmap_addr, mbi-mmap_length); +mbi-mmap_addr = copy_struct(mbi-mmap_addr, mbi-mmap_length); if ( mbi-flags MBI_LOADERNAME ) -mbi-boot_loader_name = (u32)reloc_mbi_string( -(char *)mbi-boot_loader_name); +mbi-boot_loader_name = copy_string(mbi-boot_loader_name); /* Mask features we don't understand or don't relocate. */ mbi-flags = (MBI_MEMLIMITS | -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 07/18] efi: run EFI specific code on EFI platform only
Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/arch/x86/shutdown.c |3 ++- xen/common/efi/boot.c|5 + xen/common/efi/runtime.c |6 ++ 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c index 21f6cf5..1c8336f 100644 --- a/xen/arch/x86/shutdown.c +++ b/xen/arch/x86/shutdown.c @@ -504,7 +504,8 @@ void machine_restart(unsigned int delay_millisecs) tboot_shutdown(TB_SHUTDOWN_REBOOT); } -efi_reset_system(reboot_mode != 0); +if ( efi_platform ) +efi_reset_system(reboot_mode != 0); /* Rebooting needs to touch the page at absolute address 0. */ *((unsigned short *)__va(0x472)) = reboot_mode; diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index 8aafcfd..89d081d 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -1154,6 +1154,11 @@ void __init efi_init_memory(void) } *extra, *extra_head = NULL; #endif +#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */ +if ( !efi_platform ) +return; +#endif + printk(XENLOG_INFO EFI memory map:\n); for ( i = 0; i efi_memmap_size; i += efi_mdesc_size ) { diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c index b229c69..7c767e4 100644 --- a/xen/common/efi/runtime.c +++ b/xen/common/efi/runtime.c @@ -164,6 +164,9 @@ int efi_get_info(uint32_t idx, union xenpf_efi_info *info) { unsigned int i, n; +if ( !efi_platform ) +return -ENOSYS; + switch ( idx ) { case XEN_FW_EFI_VERSION: @@ -298,6 +301,9 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op) EFI_STATUS status = EFI_NOT_STARTED; int rc = 0; +if ( !efi_platform ) +return -ENOSYS; + switch ( op-function ) { case XEN_EFI_get_time: -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH v3 12/13] xen/iommu: smmu: Add Xen specific code to be able to use the driver
The main goal is to modify as little the Linux code to be able to port easily new feature added in Linux repo for the driver. To achieve that we: - Add helpers to Linux function not implemented on Xen - Add callbacks used by Xen to do our own stuff and call Linux ones - Only modify when required the code which comes from Linux. If so a comment has been added with /* Xen: ... */ explaining why it's necessary. The support for PCI has been commented because it's not yet supported by Xen ARM and therefore won't compile. Signed-off-by: Julien Grall julien.gr...@linaro.org --- Changes in v2: - Add the ACCESS_ONCE definition in the drivers. The patch to introduce the one in common code has been dropped. - The include xen/device.h has been dropped in favor of asm/device.h --- xen/drivers/passthrough/arm/Makefile | 1 + xen/drivers/passthrough/arm/smmu.c | 678 +++ 2 files changed, 612 insertions(+), 67 deletions(-) diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile index 0484b79..f4cd26e 100644 --- a/xen/drivers/passthrough/arm/Makefile +++ b/xen/drivers/passthrough/arm/Makefile @@ -1 +1,2 @@ obj-y += iommu.o +obj-y += smmu.o diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c index 8a6514f..373eee8 100644 --- a/xen/drivers/passthrough/arm/smmu.c +++ b/xen/drivers/passthrough/arm/smmu.c @@ -18,6 +18,13 @@ * * Author: Will Deacon will.dea...@arm.com * + * Based on Linux drivers/iommu/arm-smmu.c + * = commit e6b5be2be4e30037eb551e0ed09dd97bd00d85d3 + * + * Xen modification: + * Julien Grall julien.gr...@linaro.org + * Copyright (C) 2014 Linaro Limited. + * * This driver currently supports: * - SMMUv1 and v2 implementations * - Stream-matching and stream-indexing @@ -28,26 +35,164 @@ * - Context fault reporting */ -#define pr_fmt(fmt) arm-smmu: fmt -#include linux/delay.h -#include linux/dma-mapping.h -#include linux/err.h -#include linux/interrupt.h -#include linux/io.h -#include linux/iommu.h -#include linux/mm.h -#include linux/module.h -#include linux/of.h -#include linux/pci.h -#include linux/platform_device.h -#include linux/slab.h -#include linux/spinlock.h -#include linux/bitops.h +#include xen/config.h +#include xen/delay.h +#include xen/errno.h +#include xen/err.h +#include xen/irq.h +#include xen/lib.h +#include xen/list.h +#include xen/mm.h +#include xen/vmap.h +#include xen/rbtree.h +#include xen/sched.h +#include xen/sizes.h +#include asm/atomic.h +#include asm/device.h +#include asm/io.h +#include asm/platform.h + +/* Xen: The below defines are redefined within the file. Undef it */ +#undef SCTLR_AFE +#undef SCTLR_TRE +#undef SCTLR_M +#undef TTBCR_EAE + +/* Alias to Xen device tree helpers */ +#define device_node dt_device_node +#define of_phandle_args dt_phandle_args +#define of_device_id dt_device_match +#define of_match_node dt_match_node +#define of_property_read_u32(np, pname, out) (!dt_property_read_u32(np, pname, out)) +#define of_property_read_bool dt_property_read_bool +#define of_parse_phandle_with_args dt_parse_phandle_with_args + +/* Xen: Helpers to get device MMIO and IRQs */ +struct resource +{ + u64 addr; + u64 size; + unsigned int type; +}; + +#define resource_size(res) (res)-size; + +#define platform_device dt_device_node + +#define IORESOURCE_MEM 0 +#define IORESOURCE_IRQ 1 + +static struct resource *platform_get_resource(struct platform_device *pdev, + unsigned int type, + unsigned int num) +{ + /* +* The resource is only used between 2 calls of platform_get_resource. +* It's quite ugly but it's avoid to add too much code in the part +* imported from Linux +*/ + static struct resource res; + int ret = 0; + + res.type = type; + + switch (type) { + case IORESOURCE_MEM: + ret = dt_device_get_address(pdev, num, res.addr, res.size); + + return ((ret) ? NULL : res); + + case IORESOURCE_IRQ: + ret = platform_get_irq(pdev, num); + if (ret 0) + return NULL; + + res.addr = ret; + res.size = 1; + + return res; + + default: + return NULL; + } +} + +/* Alias to Xen IRQ functions */ +#define request_irq(irq, func, flags, name, dev) request_irq(irq, flags, func, name, dev) +#define free_irq release_irq + +/* + * Device logger functions + * TODO: Handle PCI + */ +#define dev_print(dev, lvl, fmt, ...) \ +printk(lvl smmu: %s: fmt, dt_node_full_name(dev_to_dt(dev)), ## __VA_ARGS__) + +#define dev_dbg(dev, fmt, ...) dev_print(dev, XENLOG_DEBUG, fmt, ## __VA_ARGS__) +#define dev_notice(dev, fmt, ...) dev_print(dev,
[Xen-devel] [PATCH v3 02/13] xen/arm: vgic: Drop unecessary include asm/device.h
The header asm/device.h has been included in the vgic code during splitting to support multiple version. But no code within those files requires it. Signed-off-by: Julien Grall julien.gr...@linaro.org Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com --- Changes in v3: - Add Stefano's ack --- xen/arch/arm/vgic-v2.c | 1 - xen/arch/arm/vgic-v3.c | 1 - 2 files changed, 2 deletions(-) diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c index 515faf7..cd8f7d1 100644 --- a/xen/arch/arm/vgic-v2.c +++ b/xen/arch/arm/vgic-v2.c @@ -26,7 +26,6 @@ #include xen/sched.h #include asm/current.h -#include asm/device.h #include asm/mmio.h #include asm/gic.h diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c index bece189..cab3b63 100644 --- a/xen/arch/arm/vgic-v3.c +++ b/xen/arch/arm/vgic-v3.c @@ -27,7 +27,6 @@ #include xen/sched.h #include xen/sizes.h #include asm/current.h -#include asm/device.h #include asm/mmio.h #include asm/gic_v3_defs.h #include asm/gic.h -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send
On 01/30/15 05:53, Paul Durrant wrote: -Original Message- From: Don Slutz [mailto:dsl...@verizon.com] Sent: 30 January 2015 00:53 To: xen-devel@lists.xen.org Cc: Andrew Cooper; George Dunlap; Ian Campbell; Ian Jackson; Jan Beulich; Keir (Xen.org); Stefano Stabellini; Wei Liu; Paul Durrant; Don Slutz Subject: [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send This saves a VMENTRY and a VMEXIT since we not longer retry the ioport read. hvmemul_do_io() will retry in order to complete an ioport read when hvm_send_assist_req() is successful. Signed-off-by: Don Slutz dsl...@verizon.com --- xen/arch/x86/hvm/hvm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index 6f7b407..bad410e 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p) break; } -return 1; +return 0; /* implicitly bins the i/o operation */ Actually that comment is not right. The operation is not binned; it's just already been done. Will adjust, I expect the return to go away. -Don Slutz Paul } bool_t hvm_send_assist_req(ioreq_t *p) -- 1.8.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v3 2/3] x86/VPMU: Disable VPMU when NMI watchdog is on
On 30/01/15 17:15, Boris Ostrovsky wrote: NMI watchdog sets APIC_LVTPC register to generate an NMI when PMU counter overflow occurs. This may be overwritten by VPMU code later, effectively turning off the watchdog. We should disable VPMU when NMI watchdog is running. Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- docs/misc/xen-command-line.markdown | 2 ++ xen/arch/x86/hvm/vpmu.c | 13 + 2 files changed, 15 insertions(+) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index 2274e74..bc316be 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1346,6 +1346,8 @@ wrong behaviour (see handle\_pmc\_quirk()). If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS) feature is switched on on Intel processors supporting this feature. +Note that if **watchdog** option is also specified vpmu will be turned off. + *Warning:* As the BTS virtualisation is not 100% safe and because of the nehalem quirk don't use the vpmu flag on production systems with Intel cpus! diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c index 63b2158..b2e8e65 100644 --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -24,6 +24,7 @@ #include asm/regs.h #include asm/types.h #include asm/msr.h +#include asm/nmi.h #include asm/hvm/support.h #include asm/hvm/vmx/vmx.h #include asm/hvm/vmx/vmcs.h @@ -288,3 +289,15 @@ void vpmu_dump(struct vcpu *v) vpmu-arch_vpmu_ops-arch_vpmu_dump(v); } +static int __init vpmu_init(void) +{ +/* NMI watchdog uses LVTPC and HW counter */ +if ( opt_watchdog opt_vpmu_enabled ) +{ +printk(XENLOG_WARNING NMI watchdog is enabled. Turning VPMU off.\n); +opt_vpmu_enabled = 0; +} + +return 0; +} +__initcall(vpmu_init); ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 04/18] xen/x86: add multiboot2 protocol support
On 30/01/15 17:54, Daniel Kiper wrote: Add multiboot2 protocol support. Alter min memory limit handling as we now may not find it from either multiboot (v1) or multiboot2. This way we are laying the foundation for EFI + GRUB2 + Xen development. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com I have not reviewed the multiboot2 protocol in detail, but it appears sane and I presume it is suitably tested and works. As far as the mb1/mb2 interaction goes, this is looking far better. Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/boot/Makefile|3 +- xen/arch/x86/boot/head.S | 104 -- xen/arch/x86/boot/reloc.c | 174 - xen/arch/x86/x86_64/asm-offsets.c |6 ++ xen/include/xen/multiboot2.h | 169 +++ 5 files changed, 429 insertions(+), 27 deletions(-) create mode 100644 xen/include/xen/multiboot2.h diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile index 5fdb5ae..0efa7d2 100644 --- a/xen/arch/x86/boot/Makefile +++ b/xen/arch/x86/boot/Makefile @@ -1,6 +1,7 @@ obj-bin-y += head.o -RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h $(BASEDIR)/include/xen/multiboot.h +RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h $(BASEDIR)/include/xen/compiler.h \ + $(BASEDIR)/include/xen/multiboot.h $(BASEDIR)/include/xen/multiboot2.h head.o: reloc.S diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index 6180783..7861057 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -1,5 +1,6 @@ #include xen/config.h #include xen/multiboot.h +#include xen/multiboot2.h #include public/xen.h #include asm/asm_defns.h #include asm/desc.h @@ -32,6 +33,59 @@ ENTRY(start) .long MULTIBOOT_HEADER_FLAGS /* Checksum: must be the negated sum of the first two fields. */ .long -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS) + +/*** MULTIBOOT2 HEADER / +/* Some ideas are taken from grub-2.00/grub-core/tests/boot/kernel-i386.S file. */ +.align MULTIBOOT2_HEADER_ALIGN + +.Lmultiboot2_header: +/* Magic number indicating a Multiboot2 header. */ +.long MULTIBOOT2_HEADER_MAGIC +/* Architecture: i386. */ +.long MULTIBOOT2_ARCHITECTURE_I386 +/* Multiboot2 header length. */ +.long .Lmultiboot2_header_end - .Lmultiboot2_header +/* Multiboot2 header checksum. */ +.long -(MULTIBOOT2_HEADER_MAGIC + MULTIBOOT2_ARCHITECTURE_I386 + \ +(.Lmultiboot2_header_end - .Lmultiboot2_header)) + +/* Multiboot2 tags... */ +.Lmultiboot2_info_req: +/* Multiboot2 information request tag. */ +.short MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST +.short MULTIBOOT2_HEADER_TAG_REQUIRED +.long .Lmultiboot2_info_req_end - .Lmultiboot2_info_req +.long MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO +.long MULTIBOOT2_TAG_TYPE_MMAP +.Lmultiboot2_info_req_end: + +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_MODULE_ALIGN +.short MULTIBOOT2_HEADER_TAG_REQUIRED +.long 8 /* Tag size. */ + +/* Console flags tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_CONSOLE_FLAGS +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 12 /* Tag size. */ +.long MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED + +/* Framebuffer tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_FRAMEBUFFER +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 20 /* Tag size. */ +.long 0 /* Number of the columns - no preference. */ +.long 0 /* Number of the lines - no preference. */ +.long 0 /* Number of bits per pixel - no preference. */ + +/* Multiboot2 header end tag. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_END +.short 0 +.long 8 /* Tag size. */ +.Lmultiboot2_header_end: .section .init.rodata, a, @progbits .align 4 @@ -81,10 +135,51 @@ __start: mov %ecx,%es mov %ecx,%ss +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */ +xor %edx,%edx + /* Check for Multiboot bootloader */ cmp $MULTIBOOT_BOOTLOADER_MAGIC,%eax -jne not_multiboot +je multiboot1_proto +/* Check for Multiboot2 bootloader */ +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax +je multiboot2_proto + +jmp not_multiboot + +multiboot1_proto: +/* Get mem_lower from Multiboot information */ +testb $MBI_MEMLIMITS,(%ebx) +jz trampoline_setup/* not available? BDA value will be
Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms
On 30/01/15 17:54, Daniel Kiper wrote: Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/arch/x86/boot/head.S | 174 +++-- xen/arch/x86/efi/efi-boot.h | 29 +++ xen/arch/x86/setup.c | 23 ++--- xen/arch/x86/x86_64/asm-offsets.c |2 + xen/common/efi/boot.c | 11 +++ 5 files changed, 222 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index 7861057..89f5aa7 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -8,6 +8,7 @@ #include asm/page.h #include asm/msr.h #include asm/cpufeature.h +#include asm/processor.h .text .code32 @@ -57,6 +58,9 @@ ENTRY(start) .long .Lmultiboot2_info_req_end - .Lmultiboot2_info_req .long MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO .long MULTIBOOT2_TAG_TYPE_MMAP +.long MULTIBOOT2_TAG_TYPE_EFI_BS +.long MULTIBOOT2_TAG_TYPE_EFI64 +.long MULTIBOOT2_TAG_TYPE_EFI64_IH .Lmultiboot2_info_req_end: .align MULTIBOOT2_TAG_ALIGN @@ -80,6 +84,19 @@ ENTRY(start) .long 0 /* Number of the lines - no preference. */ .long 0 /* Number of bits per pixel - no preference. */ +/* Do not disable EFI boot services. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_EFI_BS +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 8 /* Tag size. */ + +/* EFI64 entry point. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64 +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 12 /* Tag size. */ +.long sym_phys(__efi64_start) + /* Multiboot2 header end tag. */ .align MULTIBOOT2_TAG_ALIGN .short MULTIBOOT2_HEADER_TAG_END @@ -94,6 +111,17 @@ ENTRY(start) gdt_boot_descr: .word 6*8-1 .long sym_phys(trampoline_gdt) +.long 0 /* Needed for 64-bit lgdt */ + +cs32_switch_addr: +.long sym_phys(cs32_switch) +.long BOOT_CS32 .word ljmpl refers to an m32:16 not an m32:32 + +efi_st: +.quad 0 + +efi_ih: +.quad 0 .Lbad_cpu_msg: .asciz ERR: Not a 64-bit CPU! .Lbad_ldr_msg: .asciz ERR: Not a Multiboot bootloader! @@ -124,6 +152,133 @@ print_err: .Lhalt: hlt jmp .Lhalt +.code64 + +__efi64_start: +cld + +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */ +xor %edx,%edx + +/* Check for Multiboot2 bootloader */ +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax +je efi_multiboot2_proto + +jmp not_multiboot not_multiboot is 32bit code. You must drop out of 64bit before using it, or make a 64bit variant. + +efi_multiboot2_proto: +/* Skip Multiboot2 information fixed part */ +lea MB2_fixed_sizeof(%ebx),%ecx + +0: +/* Get mem_lower from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx) +jne 1f + +mov MB2_mem_lower(%ecx),%edx +jmp 4f + +1: +/* Get EFI SystemTable address from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx) +jne 2f + +lea MB2_efi64_st(%ecx),%esi +lea efi_st(%rip),%edi +movsq This is complete overkill for copying a 64bit variable out of the tag and into a local variable. Just use a plain 64bit load and store. + +/* Do not go into real mode on EFI platform */ +movb$1,skip_realmode(%rip) + +jmp 4f + +2: +/* Get EFI ImageHandle address from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,(%ecx) +jne 3f + +lea MB2_efi64_ih(%ecx),%esi +lea efi_ih(%rip),%edi +movsq And here. +jmp 4f + +3: +/* Is it the end of Multiboot2 information? */ +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx) +je run_bs + +4: +/* Go to next Multiboot2 information tag */ +add MB2_tag_size(%ecx),%ecx +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx +jmp 0b + +run_bs: +push%rax +push%rdx Does the EFI spec guarantee that we have a good stack to use at this point? + +/* Initialize BSS (no nasty surprises!) */ +lea __bss_start(%rip),%rdi +lea _end(%rip),%rcx +sub %rdi,%rcx +xor %rax,%rax xor %eax,%eax is shorter. +rep stosb It would be more efficient to make sure that the linker aligns __bss_start and _end on 8 byte boundaries, and use stosq instead. + +mov efi_ih(%rip),%rdi /* EFI ImageHandle */
Re: [Xen-devel] [PATCH] libxl: Wait for ballooning if free memory is increasing
On Thursday, January 29, 2015 10:14:26 AM Ian Campbell wrote: I'm thinking it would be clearer if the comment and the condition were logically inverted. e.g.: /* * If the amount of free mem has increased on this iteration (i.e. * some progress has been made) then reset the retry counter. */ if (freemem_kb freemem_kb_prev) { retries = MAX_RETRIES; free_memkb_prev = free_memkb; } else { retires--; } Thanks for the comments, Ian. Inverting the logic makes sense, and I'll send a v2 shortly. Given that this new loop can take significantly longer to fail I wonder if we should add some progress logging? xl has an xtl logger instance available so using xtl_progress might be an easy option. Maybe a separate patch though. xtl_progress looks interesting. I'll do some additional testing before I submit a patch containing this improvement. -Mike ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 01/18] x86/boot/reloc: mask out MBI_BOOTDEV from mbi flags
..because it is ignored by Xen. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/arch/x86/boot/reloc.c |1 - 1 file changed, 1 deletion(-) diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c index f971920..63045c0 100644 --- a/xen/arch/x86/boot/reloc.c +++ b/xen/arch/x86/boot/reloc.c @@ -90,7 +90,6 @@ multiboot_info_t *reloc(multiboot_info_t *mbi_old) /* Mask features we don't understand or don't relocate. */ mbi-flags = (MBI_MEMLIMITS | - MBI_BOOTDEV | MBI_CMDLINE | MBI_MODULES | MBI_MEMMAP | -- 1.7.10.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 02/18] x86/boot/reloc: create generic alloc and copy functions
On 30/01/15 17:54, Daniel Kiper wrote: Create generic alloc and copy functions. We need separate tools for memory allocation and copy to provide multiboot2 protocol support. Signed-off-by: Daniel Kiper daniel.ki...@oracle.com Reviewed-by: Andrew Cooper andrew.coop...@citrix.com --- xen/arch/x86/boot/reloc.c | 52 - 1 file changed, 33 insertions(+), 19 deletions(-) diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c index 63045c0..f964cda 100644 --- a/xen/arch/x86/boot/reloc.c +++ b/xen/arch/x86/boot/reloc.c @@ -33,9 +33,10 @@ asm ( typedef unsigned int u32; #include ../../../include/xen/multiboot.h -static void *reloc_mbi_struct(void *old, unsigned int bytes) +static u32 alloc_struct(u32 bytes) { -void *new; +u32 s; + asm( call 1f \n 1: pop %%edx \n @@ -43,50 +44,63 @@ static void *reloc_mbi_struct(void *old, unsigned int bytes) sub %1,%0 \n and $~15,%0 \n mov %0,alloc-1b(%%edx) \n -mov %0,%%edi\n -rep movsb \n - : =r (new), +c (bytes), +S (old) - : : edx, edi, memory); -return new; + : =r (s) : r (bytes) : edx, memory); + +return s; +} + +static u32 copy_struct(u32 src, u32 bytes) +{ +u32 dst, dst_asm; + +dst = alloc_struct(bytes); +dst_asm = dst; + +asm volatile(rep movsb : +S (src), +D (dst_asm), +c (bytes) : : memory); + +return dst; } -static char *reloc_mbi_string(char *old) +static u32 copy_string(u32 src) { char *p; -for ( p = old; *p != '\0'; p++ ) + +if ( src == 0 ) +return 0; + +for ( p = (char *)src; *p != '\0'; p++ ) continue; -return reloc_mbi_struct(old, p - old + 1); + +return copy_struct(src, p - (char *)src + 1); } multiboot_info_t *reloc(multiboot_info_t *mbi_old) { -multiboot_info_t *mbi = reloc_mbi_struct(mbi_old, sizeof(*mbi)); +multiboot_info_t *mbi = (multiboot_info_t *)copy_struct((u32)mbi_old, sizeof(*mbi)); int i; if ( mbi-flags MBI_CMDLINE ) -mbi-cmdline = (u32)reloc_mbi_string((char *)mbi-cmdline); +mbi-cmdline = copy_string(mbi-cmdline); if ( mbi-flags MBI_MODULES ) { -module_t *mods = reloc_mbi_struct( -(module_t *)mbi-mods_addr, mbi-mods_count * sizeof(module_t)); +module_t *mods = (module_t *)copy_struct( +mbi-mods_addr, mbi-mods_count * sizeof(module_t)); mbi-mods_addr = (u32)mods; for ( i = 0; i mbi-mods_count; i++ ) { if ( mods[i].string ) -mods[i].string = (u32)reloc_mbi_string((char *)mods[i].string); +mods[i].string = copy_string(mods[i].string); } } if ( mbi-flags MBI_MEMMAP ) -mbi-mmap_addr = (u32)reloc_mbi_struct( -(memory_map_t *)mbi-mmap_addr, mbi-mmap_length); +mbi-mmap_addr = copy_struct(mbi-mmap_addr, mbi-mmap_length); if ( mbi-flags MBI_LOADERNAME ) -mbi-boot_loader_name = (u32)reloc_mbi_string( -(char *)mbi-boot_loader_name); +mbi-boot_loader_name = copy_string(mbi-boot_loader_name); /* Mask features we don't understand or don't relocate. */ mbi-flags = (MBI_MEMLIMITS | ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send
On 01/30/15 05:24, Jan Beulich wrote: On 30.01.15 at 01:52, dsl...@verizon.com wrote: --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p) break; } -return 1; +return 0; /* implicitly bins the i/o operation */ } This change points out that having hvm_complete_assist_req() be a separate function yet having only a single caller, and it returning non-void with only a single possible return value isn't the best arrangement. I think this should be brought back into hvm_send_assist_req(), by inverting the if() condition there. Unless there are intentions for it to have another caller, but in that case it should still be made return void, with the caller choosing what to return. Sounds good to me will do. -Don Slutz Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms
On 30/01/2015 23:43, Daniel Kiper wrote: On Fri, Jan 30, 2015 at 07:06:53PM +, Andrew Cooper wrote: On 30/01/15 17:54, Daniel Kiper wrote: + +efi_multiboot2_proto: +/* Skip Multiboot2 information fixed part */ +lea MB2_fixed_sizeof(%ebx),%ecx + +0: +/* Get mem_lower from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx) +jne 1f + +mov MB2_mem_lower(%ecx),%edx +jmp 4f + +1: +/* Get EFI SystemTable address from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx) +jne 2f + +lea MB2_efi64_st(%ecx),%esi +lea efi_st(%rip),%edi +movsq This is complete overkill for copying a 64bit variable out of the tag and into a local variable. Just use a plain 64bit load and store. I am not sure what do you mean by 64bit load and store but I have just realized that we do not need these variables. They are remnants from early developments when I thought that we need ImageHandle and SystemTable here and later somewhere else. mov MB2_efi64_st(%rcx), %rdi mov %rdi, efi_st(%rip) But if they are not needed, drop the code completely. +jmp 4f + +3: +/* Is it the end of Multiboot2 information? */ +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx) +je run_bs + +4: +/* Go to next Multiboot2 information tag */ +add MB2_tag_size(%ecx),%ecx +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx +jmp 0b + +run_bs: +push%rax +push%rdx Does the EFI spec guarantee that we have a good stack to use at this point? Unified Extensible Firmware Interface Specification, Version 2.4 Errata B, section 2.3.4, x64 Platforms says: During boot services time the processor is in the following execution mode: ..., 128 KiB, or more, of available stack space. GRUB2 uses this stack too and do not move it to different memory region. So, I think that here we are on safe side. Sounds ok then. +/* Initialize BSS (no nasty surprises!) */ +lea __bss_start(%rip),%rdi +lea _end(%rip),%rcx +sub %rdi,%rcx +xor %rax,%rax xor %eax,%eax is shorter. +rep stosb It would be more efficient to make sure that the linker aligns __bss_start and _end on 8 byte boundaries, and use stosq instead. Right but just for this. Is it pays? We do this only once. The BSS in Xen is 300k. It is absolutely better to clear it 8 bytes at a time rather than 1. However, if you wish... +mov efi_ih(%rip),%rdi /* EFI ImageHandle */ +mov efi_st(%rip),%rsi /* EFI SystemTable */ +callefi_multiboot2 + +pop %rcx +pop %rax + +shl $10-4,%rcx /* Convert multiboot2.mem_lower to bytes/16 */ + +cli This looks suspiciously out of place. Surely the EFI spec doesn't permit entry with interrupts enabled? Unified Extensible Firmware Interface Specification, Version 2.4 Errata B, section 2.3.4, x64 Platforms says: During boot services time the processor is in the following execution mode: ..., Interrupts are enabled–though no interrupt services are supported other than the UEFI boot services timer functions (All loaded device drivers are serviced synchronously by “polling.”). So, I think that we should use BS with interrupts enabled and disable them after ExitBootServices(). Hmmm... Now I think that we should use cli immediately after efi_multiboot2() call. I presume then that the firmware has set up a valid idt somewhere and is actually serving any interrupts we get. diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c index f8be3dd..c5725ca 100644 --- a/xen/common/efi/boot.c +++ b/xen/common/efi/boot.c @@ -75,6 +75,17 @@ static size_t wstrlen(const CHAR16 * s); static int set_color(u32 mask, int bpp, u8 *pos, u8 *sz); static bool_t match_guid(const EFI_GUID *guid1, const EFI_GUID *guid2); +static void efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable); +static void efi_console_set_mode(void); +static EFI_GRAPHICS_OUTPUT_PROTOCOL *efi_get_gop(void); +static UINTN efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, + UINTN cols, UINTN rows, UINTN depth); +static void efi_tables(void); +static void setup_efi_pci(void); +static void efi_variables(void); +static void efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop_mode); +static void efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable); + If any of these forward declarations are needed, they should be They are needed because efi-boot.h is included before above mentioned functions definitions. introduced in the appropriate create efi_$FOO patch. However, I can't I thought about that during development.
Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms
On Fri, Jan 30, 2015 at 07:06:53PM +, Andrew Cooper wrote: On 30/01/15 17:54, Daniel Kiper wrote: Signed-off-by: Daniel Kiper daniel.ki...@oracle.com --- xen/arch/x86/boot/head.S | 174 +++-- xen/arch/x86/efi/efi-boot.h | 29 +++ xen/arch/x86/setup.c | 23 ++--- xen/arch/x86/x86_64/asm-offsets.c |2 + xen/common/efi/boot.c | 11 +++ 5 files changed, 222 insertions(+), 17 deletions(-) diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S index 7861057..89f5aa7 100644 --- a/xen/arch/x86/boot/head.S +++ b/xen/arch/x86/boot/head.S @@ -8,6 +8,7 @@ #include asm/page.h #include asm/msr.h #include asm/cpufeature.h +#include asm/processor.h .text .code32 @@ -57,6 +58,9 @@ ENTRY(start) .long .Lmultiboot2_info_req_end - .Lmultiboot2_info_req .long MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO .long MULTIBOOT2_TAG_TYPE_MMAP +.long MULTIBOOT2_TAG_TYPE_EFI_BS +.long MULTIBOOT2_TAG_TYPE_EFI64 +.long MULTIBOOT2_TAG_TYPE_EFI64_IH .Lmultiboot2_info_req_end: .align MULTIBOOT2_TAG_ALIGN @@ -80,6 +84,19 @@ ENTRY(start) .long 0 /* Number of the lines - no preference. */ .long 0 /* Number of bits per pixel - no preference. */ +/* Do not disable EFI boot services. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_EFI_BS +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 8 /* Tag size. */ + +/* EFI64 entry point. */ +.align MULTIBOOT2_TAG_ALIGN +.short MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64 +.short MULTIBOOT2_HEADER_TAG_OPTIONAL +.long 12 /* Tag size. */ +.long sym_phys(__efi64_start) + /* Multiboot2 header end tag. */ .align MULTIBOOT2_TAG_ALIGN .short MULTIBOOT2_HEADER_TAG_END @@ -94,6 +111,17 @@ ENTRY(start) gdt_boot_descr: .word 6*8-1 .long sym_phys(trampoline_gdt) +.long 0 /* Needed for 64-bit lgdt */ + +cs32_switch_addr: +.long sym_phys(cs32_switch) +.long BOOT_CS32 .word ljmpl refers to an m32:16 not an m32:32 + +efi_st: +.quad 0 + +efi_ih: +.quad 0 .Lbad_cpu_msg: .asciz ERR: Not a 64-bit CPU! .Lbad_ldr_msg: .asciz ERR: Not a Multiboot bootloader! @@ -124,6 +152,133 @@ print_err: .Lhalt: hlt jmp .Lhalt +.code64 + +__efi64_start: +cld + +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */ +xor %edx,%edx + +/* Check for Multiboot2 bootloader */ +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax +je efi_multiboot2_proto + +jmp not_multiboot not_multiboot is 32bit code. You must drop out of 64bit before using it, or make a 64bit variant. + +efi_multiboot2_proto: +/* Skip Multiboot2 information fixed part */ +lea MB2_fixed_sizeof(%ebx),%ecx + +0: +/* Get mem_lower from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx) +jne 1f + +mov MB2_mem_lower(%ecx),%edx +jmp 4f + +1: +/* Get EFI SystemTable address from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx) +jne 2f + +lea MB2_efi64_st(%ecx),%esi +lea efi_st(%rip),%edi +movsq This is complete overkill for copying a 64bit variable out of the tag and into a local variable. Just use a plain 64bit load and store. I am not sure what do you mean by 64bit load and store but I have just realized that we do not need these variables. They are remnants from early developments when I thought that we need ImageHandle and SystemTable here and later somewhere else. +/* Do not go into real mode on EFI platform */ +movb$1,skip_realmode(%rip) + +jmp 4f + +2: +/* Get EFI ImageHandle address from Multiboot2 information */ +cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,(%ecx) +jne 3f + +lea MB2_efi64_ih(%ecx),%esi +lea efi_ih(%rip),%edi +movsq And here. Ditto. +jmp 4f + +3: +/* Is it the end of Multiboot2 information? */ +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx) +je run_bs + +4: +/* Go to next Multiboot2 information tag */ +add MB2_tag_size(%ecx),%ecx +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx +jmp 0b + +run_bs: +push%rax +push%rdx Does the EFI spec guarantee that we have a good stack to
Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough
-Original Message- From: xen-devel-boun...@lists.xen.org [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Wei Liu Sent: Friday, January 30, 2015 8:26 PM To: Chen, Tiejun Cc: Wei Liu; ian.campb...@citrix.com; m...@redhat.com; Ian Jackson; qemu-de...@nongnu.org; xen-devel@lists.xen.org; Gerd Hoffmann Subject: Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote: [...] Just remember to handle old option in libxl if your old option is already released by some older version of QEMUs. I just drop that old option, -gfx_passthru, if we're under qemu upstream circumstance, like this, The question is, is there any version of qemu upstream that has been released that has the old option (-gfx-passthru)? No. Just now we're starting to support IGD passthrough in qemu upstream. Right, as of QEMU 2.2.0 there's no support of IGD passthrough in QMEU upstream. Just a question: Now what features do vt-d support? Thanks. -Quan This gives us a situation that we need to support both the old (-gfx-passthru) and new (-igd-passthru) options. Presumably we (libxl) would need to fork a qemu process to determine which option it has and pass the right one. Or you can try to keep both old and new option at the same time but Yeah, actually I also have considered to keep both two options at the same time. Its really friendly to any qemu version. deprecate the old one. Then in a few qemu release cycles later (or This should be like 'accel=kvm' versus 'enable-kvm' in qemu upstream. They're coexisted now but just the former is a modern option. probably one year or two?) you can finally remove the old one. The point is that to give downstream (in this case, Xen) time to cope with the change. Here I'm fine to this way. So Gerd, So you don't actually need to ask Gerd this question because there is no old option to keep in qemu upstream. Libxl (or any sensible toolstack) will just do the right thing to either pass -igd-passthru (or whatever you guys agree upon) to qemu upstream or pass -gfx-passthru to qemu traditional. :-) Wei. What about this? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers
Up to now the pvscsi drivers haven't supported domain suspend and resume. When a domain with an assigned pvscsi device was suspended and resumed again, it was not able to use the device any more: trying to do so resulted in hanging processes. Support suspend and resume of pvscsi devices. Signed-off-by: Juergen Gross jgr...@suse.com diff -r 578e5aea3cbb drivers/xen/scsiback/xenbus.c --- a/drivers/xen/scsiback/xenbus.c Mon Jan 19 11:51:46 2015 +0100 +++ b/drivers/xen/scsiback/xenbus.c Fri Jan 30 13:57:29 2015 +0100 @@ -167,33 +167,48 @@ static void scsiback_do_lun_hotplug(stru switch (op) { case VSCSIBACK_OP_ADD_OR_DEL_LUN: - if (device_state == XenbusStateInitialising) { + switch (device_state) { + case XenbusStateInitialising: + case XenbusStateConnected: sdev = scsiback_get_scsi_device(phy); - if (!sdev) - xenbus_printf(XBT_NIL, dev-nodename, state_str, - %d, XenbusStateClosed); - else { - err = scsiback_add_translation_entry(be-info, sdev, vir); - if (!err) { - if (xenbus_printf(XBT_NIL, dev-nodename, state_str, - %d, XenbusStateInitialised)) { - printk(KERN_ERR scsiback: xenbus_printf error %s\n, state_str); - scsiback_del_translation_entry(be-info, vir); - } - } else { - scsi_device_put(sdev); - xenbus_printf(XBT_NIL, dev-nodename, state_str, - %d, XenbusStateClosed); - } + if (!sdev) { + xenbus_printf(XBT_NIL, dev-nodename, + state_str, + %d, XenbusStateClosed); + break; } - } + if (scsiback_add_translation_entry(be-info, + sdev, vir)) { + scsi_device_put(sdev); + if (device_state == XenbusStateConnected) + break; + xenbus_printf(XBT_NIL, dev-nodename, + state_str, + %d, XenbusStateClosed); + break; + } + if (!xenbus_printf(XBT_NIL, dev-nodename, + state_str, %d, + XenbusStateInitialised)) + break; + printk(KERN_ERR scsiback: xenbus_printf error %s\n, + state_str); + scsiback_del_translation_entry(be-info, vir); + break; - if (device_state == XenbusStateClosing) { - if (!scsiback_del_translation_entry(be-info, vir)) { - if (xenbus_printf(XBT_NIL, dev-nodename, state_str, - %d, XenbusStateClosed)) - printk(KERN_ERR scsiback: xenbus_printf error %s\n, state_str); - } + case XenbusStateClosing: + if (scsiback_del_translation_entry(be-info, + vir)) + break; + if (xenbus_printf(XBT_NIL, dev-nodename, + state_str, %d, + XenbusStateClosed)) + printk(KERN_ERR scsiback: xenbus_printf error %s\n, + state_str); + break; + + default: + break; }
Re: [Xen-devel] [RFC V10 0/4] domain snapshot document
On Fri, Jan 30, 2015 at 03:01:11PM +0100, Fabio Fantoni wrote: Il 30/01/2015 14:38, Wei Liu ha scritto: On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote: On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote: Changes to V9: This looks good to me, thanks. Ian/Wei do you have any comments? xapi folks? Looks good to me. Wei. I looked to the project of domain snapshot and seems good. I have only a question. I saw this: Disk snapshot can be created by many external tools, like qemu-img, vhd-util and lvm, etc. Wouldn't it be good to also use btrfs for external disks snapshot? Even if btrfs snapshots works on subvolumes (directories) it is possible to atomically creates a copy-on-write snapshot of a file (with domUs disks on file, including raw) with btrfs clone operation (reflink systemcall, eg: cp --reflink source dest) I think that etc. potentially covers all other external tools, including btrfs. You can list all possible external tools in the world. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V10 0/4] domain snapshot document
On Fri, Jan 30, 2015 at 02:05:05PM +, Wei Liu wrote: On Fri, Jan 30, 2015 at 03:01:11PM +0100, Fabio Fantoni wrote: Il 30/01/2015 14:38, Wei Liu ha scritto: On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote: On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote: Changes to V9: This looks good to me, thanks. Ian/Wei do you have any comments? xapi folks? Looks good to me. Wei. I looked to the project of domain snapshot and seems good. I have only a question. I saw this: Disk snapshot can be created by many external tools, like qemu-img, vhd-util and lvm, etc. Wouldn't it be good to also use btrfs for external disks snapshot? Even if btrfs snapshots works on subvolumes (directories) it is possible to atomically creates a copy-on-write snapshot of a file (with domUs disks on file, including raw) with btrfs clone operation (reflink systemcall, eg: cp --reflink source dest) I think that etc. potentially covers all other external tools, including btrfs. You can list all possible external tools in the world. ^^^ can't Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] ocaml/xenctrl: Make failwith_xc() thread safe
On Fri, Jan 30, 2015 at 02:19:53PM +, Dave Scott wrote: Looks ok to me. Signed-off-by: David Scott dave.sc...@citrix.com I think this should be an Acked-by. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux-2.6.18] xen: mark pvscsi frontend request consumed only after last read
On 30.01.15 at 14:51, jgr...@suse.com.non-mime.internet wrote: A request in the ring buffer mustn't be read after it has been marked as consumed. Otherwise it might already have been reused by the frontend without violating the ring protocol. To avoid inconsistencies in the backend only work on a private copy of the request. This will ensure a malicious guest not being able to bypass consistency checks of the backend by modifying an active request. I'm not convinced we need this in this version of the driver: c/s 590:c4134d1a3e3f took care of reading each ring_req field just once. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH linux-2.6.18] xen: mark pvscsi frontend request consumed only after last read
A request in the ring buffer mustn't be read after it has been marked as consumed. Otherwise it might already have been reused by the frontend without violating the ring protocol. To avoid inconsistencies in the backend only work on a private copy of the request. This will ensure a malicious guest not being able to bypass consistency checks of the backend by modifying an active request. Signed-off-by: Juergen Gross jgr...@suse.com diff -r 578e5aea3cbb drivers/xen/scsiback/scsiback.c --- a/drivers/xen/scsiback/scsiback.c Mon Jan 19 11:51:46 2015 +0100 +++ b/drivers/xen/scsiback/scsiback.c Fri Jan 30 14:43:29 2015 +0100 @@ -579,7 +579,7 @@ invalid_value: static int _scsiback_do_cmd_fn(struct vscsibk_info *info) { struct vscsiif_back_ring *ring = info-ring; - vscsiif_request_t *ring_req; + vscsiif_request_t ring_req; pending_req_t *pending_req; RING_IDX rc, rp; @@ -609,10 +609,10 @@ static int _scsiback_do_cmd_fn(struct vs break; } - ring_req = RING_GET_REQUEST(ring, rc); + ring_req = *RING_GET_REQUEST(ring, rc); ring-req_cons = ++rc; - err = prepare_pending_reqs(info, ring_req, + err = prepare_pending_reqs(info, ring_req, pending_req); switch (err ?: pending_req-act) { case VSCSIIF_ACT_SCSI_CDB: ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V10 0/4] domain snapshot document
Il 30/01/2015 14:38, Wei Liu ha scritto: On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote: On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote: Changes to V9: This looks good to me, thanks. Ian/Wei do you have any comments? xapi folks? Looks good to me. Wei. I looked to the project of domain snapshot and seems good. I have only a question. I saw this: Disk snapshot can be created by many external tools, like qemu-img, vhd-util and lvm, etc. Wouldn't it be good to also use btrfs for external disks snapshot? Even if btrfs snapshots works on subvolumes (directories) it is possible to atomically creates a copy-on-write snapshot of a file (with domUs disks on file, including raw) with btrfs clone operation (reflink systemcall, eg: cp --reflink source dest) ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues
On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote: blinks kexec is OS agnostic? Yes. Extra information to be passed to (e.g., a Linux kernel being kexec'd) can be supplied in an additional segment and marshalled into a suitable format/location for the exec'd kernel by purgatory (or similar). There shouldn't be a need to extend the hypervisor ABI for this. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor
On 01/29/2015 08:14 PM, Luis R. Rodriguez wrote: From: Luis R. Rodriguez mcg...@suse.com There are several ways that a Xen system can update the CPU microcode on a pvops kernel [0] now, the preferred way is through the early microcode update mechanism. At run time folks should use this new xenmicrocode tool and use the same CPU microcode file as present on /lib/firmware. Some distributions may use the historic sysfs rescan interface, users of that mechanism should be aware that the interface will not be available when using Xen and as such should first check the presence of the interface before usage, as an alternative this xenmicrocode tool can be used on priviledged domains. Folks wishing to update CPU microcode at run time should be aware that not all CPU microcode can be updated on a system and should take care to ensure that only known-to-work and supported CPU microcode updates are used [0]. To avoid issues with delays on the hypercall / microcode update this implementation will quiesce all domains prior to updating the microcode, and then only queisce'd domains will be unpaused once done. [0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update Based on original work by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Andrew Cooper andrew.coop...@citrix.com Cc: Borislav Petkov b...@suse.de Cc: Takashi Iwai ti...@suse.de Cc: Olaf Hering oher...@suse.de Cc: Jan Beulich jbeul...@suse.com Cc: Jason Douglas jdoug...@suse.com Cc: Juergen Gross jgr...@suse.com Cc: Michal Marek mma...@suse.cz Cc: Henrique de Moraes Holschuh h...@debian.org Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com Signed-off-by: Luis R. Rodriguez mcg...@suse.com --- Just wrote this, haven't tested it. This does some queiscing prior to doing the microcode update. The quiescing is done by pausing all domains. Once the microcode update is done we only unpause domains which we queisced as part of our work. Let me know if this is on the right track to help avoid issues noted on the list. tools/libxc/include/xenctrl.h | 2 + tools/libxc/xc_misc.c | 102 ++ tools/misc/Makefile | 4 ++ tools/misc/xenmicrocode.c | 101 + 4 files changed, 209 insertions(+) create mode 100644 tools/misc/xenmicrocode.c diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h index 0ad8b8d..d5db0b8 100644 --- a/tools/libxc/include/xenctrl.h +++ b/tools/libxc/include/xenctrl.h @@ -55,6 +55,7 @@ #include xen/foreign/x86_32.h #include xen/foreign/x86_64.h #include xen/arch-x86/xen-mca.h +#include xen/platform.h #endif #define XC_PAGE_SHIFT 12 @@ -2145,6 +2146,7 @@ int xc_cpuid_apply_policy(xc_interface *xch, void xc_cpuid_to_str(const unsigned int *regs, char **strs); /* some strs[] may be NULL if ENOMEM */ int xc_mca_op(xc_interface *xch, struct xen_mc *mc); +int xc_microcode_update(xc_interface *xch, uint8_t *fbuf, size_t len); #endif struct xc_px_val { diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c index e253a58..6137e24 100644 --- a/tools/libxc/xc_misc.c +++ b/tools/libxc/xc_misc.c @@ -20,6 +20,7 @@ #include xc_private.h #include xen/hvm/hvm_op.h +#include xen/platform.h int xc_get_max_cpus(xc_interface *xch) { @@ -250,6 +251,107 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc) xc_hypercall_bounce_post(xch, mc); return ret; } + +struct xc_quiesce_request { +int num_quiesced; +int domids[1024]; /* never domid 0 */ +}; + +/* + * Do not pause already paused domains, and allow us to + * unpause only quiesced domains. + */ +static int quiesce_all_domains(xc_interface *xch, + struct xc_quiesce_request *quiesce_request) +{ +xc_domaininfo_t info[1024]; +int i, nb_domain; + +nb_domain = xc_domain_getinfolist(xch, 0, 1024, info); +if ( nb_domain 0 ) +{ + return -1; +} + +for ( i = 0; i nb_domain; i++ ) +{ +if ( info[i].domain == 0 ) + continue; + if ( info[i].flags XEN_DOMINF_paused ) + continue; + +xc_domain_pause(xch, info[i].domain); You are not handling errors returned by xc_domain_pause(). So then you will try to unpause a domain that may not have been paused and (I think more importantly) may proceed with microcode update while not all domains are paused. + + quiesce_request-domids[quiesce_request-num_quiesced] = info[i].domain; + quiesce_request-num_quiesced++; +} + +return 0; +} + +static void unquiesce_all_domains(xc_interface *xch, + struct xc_quiesce_request *quiesce_request) +{ +int i; + +for ( i = 0; i quiesce_request-num_quiesced; i++ ) +{ + xc_domain_unpause(xch, quiesce_request-domids[i]); Same here --- you may fail unpausing a domain and
Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers
On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote: @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc return more_to_do; } +void scsifront_finish_all(struct vscsifrnt_info *info) +{ + unsigned i; + struct vscsiif_response resp; + scsifront_ring_drain(info); Shouldn't you at least issue some kind of warning when this returns non-zero? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v17 14/23] x86/VPMU: Initialize VPMUs with __initcall
On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote: --- a/xen/arch/x86/hvm/vpmu.c +++ b/xen/arch/x86/hvm/vpmu.c @@ -497,3 +497,39 @@ long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg) return ret; } + +static int __init vpmu_init(void) +{ +int vendor = current_cpu_data.x86_vendor; + +if ( vpmu_mode == XENPMU_MODE_OFF ) +{ +printk(XENLOG_INFO VPMU: disabled\n); +return 0; +} + +switch ( vendor ) +{ +case X86_VENDOR_AMD: +if ( amd_vpmu_init() ) + vpmu_mode = XENPMU_MODE_OFF; +break; +case X86_VENDOR_INTEL: +if ( core2_vpmu_init() ) + vpmu_mode = XENPMU_MODE_OFF; +break; +default: +printk(XENLOG_WARNING VPMU: Unknown CPU vendor: %d\n, vendor); +vpmu_mode = XENPMU_MODE_OFF; +break; return 0; (i.e. avoid printing another message below) +} + +if ( vpmu_mode == XENPMU_MODE_OFF ) +printk(XENLOG_WARNING VPMU: Disabling due to initialization error\n); We repeatedly find that not printing at least a vague indication of what went wrong makes problem analysis quite a bit more difficult. It won't cost much to include the actual error code here. +else +printk(XENLOG_INFO VPMU: version %s.%s\n, + __stringify(XENPMU_VER_MAJ), __stringify(XENPMU_VER_MIN)); %s and __stringify()? Either print the numbers with %d or %u, or use __stringify() to avoid any argument besides the format string. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers
On 01/30/2015 03:46 PM, Jan Beulich wrote: On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote: @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc return more_to_do; } +void scsifront_finish_all(struct vscsifrnt_info *info) +{ + unsigned i; + struct vscsiif_response resp; + scsifront_ring_drain(info); Shouldn't you at least issue some kind of warning when this returns non-zero? If a warning should be issued, then this should be done after the following loop in case of at least one request terminated there. I'm really not sure, whether a warning is required here. If you like, I can add one. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues
On 30/01/15 14:52, Konrad Rzeszutek Wilk wrote: On Fri, Jan 30, 2015 at 02:40:46PM +, David Vrabel wrote: On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote: blinks kexec is OS agnostic? Yes. Extra information to be passed to (e.g., a Linux kernel being kexec'd) can be supplied in an additional segment and marshalled into a suitable format/location for the exec'd kernel by purgatory (or similar). There shouldn't be a need to extend the hypervisor ABI for this. How are you planning to make kexec work under EFI? I don't know at this time. David ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers
On 30.01.15 at 15:54, jgr...@suse.com wrote: On 01/30/2015 03:46 PM, Jan Beulich wrote: On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote: @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc return more_to_do; } +void scsifront_finish_all(struct vscsifrnt_info *info) +{ + unsigned i; + struct vscsiif_response resp; + scsifront_ring_drain(info); Shouldn't you at least issue some kind of warning when this returns non-zero? If a warning should be issued, then this should be done after the following loop in case of at least one request terminated there. I'm really not sure, whether a warning is required here. If you like, I can add one. I'm not sure, I'm merely asking because I saw the function return value being ignored here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [ovmf test] 33958: regressions - FAIL
flight 33958 ovmf real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/33958/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: build-amd64 5 xen-build fail REGR. vs. 33686 build-i3865 xen-build fail REGR. vs. 33686 Tests which did not succeed, but are not blocking: test-amd64-amd64-xl-credit2 1 build-check(1) blocked n/a test-amd64-amd64-xl-multivcpu 1 build-check(1) blocked n/a build-amd64-libvirt 1 build-check(1) blocked n/a build-i386-libvirt1 build-check(1) blocked n/a test-amd64-amd64-libvirt 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-intel 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-amd64-xl-pvh-amd 1 build-check(1) blocked n/a test-amd64-i386-qemut-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-i386 1 build-check(1) blocked n/a test-amd64-amd64-xl-sedf-pin 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-libvirt 1 build-check(1) blocked n/a test-amd64-i386-rhel6hvm-intel 1 build-check(1) blocked n/a test-amd64-i386-qemuu-rhel6hvm-amd 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-ovmf-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-amd64-xl-sedf 1 build-check(1) blocked n/a test-amd64-i386-freebsd10-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl 1 build-check(1) blocked n/a test-amd64-i386-pair 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-debianhvm-amd64 1 build-check(1)blocked n/a test-amd64-amd64-xl-qemut-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-pcipt-intel 1 build-check(1) blocked n/a test-amd64-i386-xl-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-debianhvm-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-pair 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-win7-amd64 1 build-check(1) blocked n/a test-amd64-amd64-xl-win7-amd64 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-amd64-xl-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xl-winxpsp3-vcpus1 1 build-check(1) blocked n/a test-amd64-i386-xl-qemut-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemut-winxpsp3 1 build-check(1) blocked n/a test-amd64-amd64-xl-qemuu-winxpsp3 1 build-check(1) blocked n/a version targeted for testing: ovmf 49a228ca6980b197bb854dc88c909d712959cb47 baseline version: ovmf 447d264115c476142f884af0be287622cd244423 People who touched revisions under test: Gao, Liming liming@intel.com Long, Qin qin.l...@intel.com Yao, Jiewen jiewen@intel.com Aaron Pop aar...@ami.com Abner Chang abner.ch...@hp.com Alex Williamson alex.william...@redhat.com Anderw Fish af...@apple.com Andrew Fish af...@apple.com Anthony PERARD anthony.per...@citrix.com Ard Biesheuvel ard.biesheu...@linaro.org Ari Zigler a...@mellanox.com Brendan Jackman brendan.jack...@arm.com Bruce Cran bruce.c...@gmail.com Cecil Sheng cecil.sh...@hp.com Chao Zhang chao.b.zh...@intel.com Chao, Zhang chao.b.zh...@intel.com Chen Fan chen.fan.f...@cn.fujitsu.com Chris Phillips chr...@hp.com Chris Ruffin chris.ruf...@intel.com Cinnamon Shia cinnamon.s...@hp.com Daryl McDaniel daryl.mcdan...@intel.com Daryl McDaniel daryl.mcdan...@intel.com daryl.mcdaniel daryl.mcdan...@intel.com daryl.mcdan...@intel.com darylm503 darylm503@Edk2
Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers
On 01/30/2015 04:05 PM, Jan Beulich wrote: On 30.01.15 at 15:54, jgr...@suse.com wrote: On 01/30/2015 03:46 PM, Jan Beulich wrote: On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote: @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc return more_to_do; } +void scsifront_finish_all(struct vscsifrnt_info *info) +{ + unsigned i; + struct vscsiif_response resp; + scsifront_ring_drain(info); Shouldn't you at least issue some kind of warning when this returns non-zero? If a warning should be issued, then this should be done after the following loop in case of at least one request terminated there. I'm really not sure, whether a warning is required here. If you like, I can add one. I'm not sure, I'm merely asking because I saw the function return value being ignored here. I think it can be 0 only. We are handling resume, so the ring which is being drained will no longer be filled by the backend. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues
On Fri, Jan 30, 2015 at 02:57:32PM +, David Vrabel wrote: On 30/01/15 14:52, Konrad Rzeszutek Wilk wrote: On Fri, Jan 30, 2015 at 02:40:46PM +, David Vrabel wrote: On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote: blinks kexec is OS agnostic? Yes. Extra information to be passed to (e.g., a Linux kernel being kexec'd) can be supplied in an additional segment and marshalled into a suitable format/location for the exec'd kernel by purgatory (or similar). There shouldn't be a need to extend the hypervisor ABI for this. How are you planning to make kexec work under EFI? I don't know at this time. I suppose that we should provide additional kexec hypercall function which will return info about RS. kexec-tools should load new kernel as usual and add relevant argument to its command line. Most things are in place so we should just learn kexec-tools to do new things. Daniel ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] open-coded page list manipulation in shadow code
On 29.01.15 at 18:30, t...@xen.org wrote: OK, here's what I have. It keeps the mechanism the same, threading on the main page list entry, but tidies it up to use the page_list operations rather than open-coding. I've tested it (lightly - basically jsut boot testing) with 32bit, 32pae and 64bit Windows VMs (all SMP), with bigmem=y and bigmem=n. It was developed on top of your bigmem series, but it should apply OK before patch 3/4, removing the need to nobble shadow-paging in that patch. Thanks, looks quite okay, just a couple of remarks. @@ -1525,6 +1495,14 @@ mfn_t shadow_alloc(struct domain *d, if ( shadow_type = SH_type_min_shadow shadow_type = SH_type_max_shadow ) sp-u.sh.head = 1; + +#ifndef PAGE_LIST_NULL +/* The temporary list-head is on our stack. Blank out the + * pointers to it in the shadows, just to get a clean failure if + * we accidentally follow them. */ +tmp_list.next-prev = tmp_list.prev-next = NULL; +#endif Perhaps better to use LIST_POISON{1,2} here, so we don't point into PV VA space? --- a/xen/arch/x86/mm/shadow/private.h +++ b/xen/arch/x86/mm/shadow/private.h @@ -318,6 +318,33 @@ static inline int mfn_oos_may_write(mfn_t gmfn) } #endif /* (SHADOW_OPTIMIZATIONS SHOPT_OUT_OF_SYNC) */ +/* Figure out the size (in pages) of a given shadow type */ +static inline u32 +shadow_size(unsigned int shadow_type) +{ +static const u32 type_to_size[SH_type_unused] = { +1, /* SH_type_none */ +2, /* SH_type_l1_32_shadow */ +2, /* SH_type_fl1_32_shadow */ +4, /* SH_type_l2_32_shadow */ +1, /* SH_type_l1_pae_shadow */ +1, /* SH_type_fl1_pae_shadow */ +1, /* SH_type_l2_pae_shadow */ +1, /* SH_type_l2h_pae_shadow */ +1, /* SH_type_l1_64_shadow */ +1, /* SH_type_fl1_64_shadow */ +1, /* SH_type_l2_64_shadow */ +1, /* SH_type_l2h_64_shadow */ +1, /* SH_type_l3_64_shadow */ +1, /* SH_type_l4_64_shadow */ +1, /* SH_type_p2m_table */ +1, /* SH_type_monitor_table */ +1 /* SH_type_oos_snapshot */ +}; +ASSERT(shadow_type SH_type_unused); +return type_to_size[shadow_type]; +} Isn't this going to lead to multiple instances of that static array? @@ -668,46 +697,40 @@ static inline int sh_pin(struct vcpu *v, mfn_t smfn) * of pinned shadows, and release the extra ref. */ static inline void sh_unpin(struct vcpu *v, mfn_t smfn) { -struct page_list_head h, *pin_list; -struct page_info *sp; - +struct page_list_head tmp_list, *pin_list; +struct page_info *sp, *next; +unsigned int i, head_type; + ASSERT(mfn_valid(smfn)); sp = mfn_to_page(smfn); +head_type = sp-u.sh.type; ASSERT(sh_type_is_pinnable(v, sp-u.sh.type)); ASSERT(sp-u.sh.head); -/* Treat the up-to-four pages of the shadow as a unit in the list ops */ -h.next = h.tail = sp; -if ( sp-u.sh.type == SH_type_l2_32_shadow ) -{ -h.tail = pdx_to_page(h.tail-list.next); -h.tail = pdx_to_page(h.tail-list.next); -h.tail = pdx_to_page(h.tail-list.next); -ASSERT(h.tail-u.sh.type == SH_type_l2_32_shadow); -} -pin_list = v-domain-arch.paging.shadow.pinned_shadows; - if ( !sp-u.sh.pinned ) return; - sp-u.sh.pinned = 0; -/* Cut the sub-list out of the list of pinned shadows */ -if ( pin_list-next == h.next pin_list-tail == h.tail ) -pin_list-next = pin_list-tail = NULL; -else +/* Cut the sub-list out of the list of pinned shadows, + * stitching it back into a list fragment of its own. */ +pin_list = v-domain-arch.paging.shadow.pinned_shadows; +INIT_PAGE_LIST_HEAD(tmp_list); +for ( i = 0; i shadow_size(head_type); i++ ) { -if ( pin_list-next == h.next ) -pin_list-next = page_list_next(h.tail, pin_list); -else -page_list_prev(h.next, pin_list)-list.next = h.tail-list.next; -if ( pin_list-tail == h.tail ) -pin_list-tail = page_list_prev(h.next, pin_list); -else -page_list_next(h.tail, pin_list)-list.prev = h.next-list.prev; +ASSERT(sp-u.sh.type == head_type); +ASSERT(!i || !sp-u.sh.head); +next = page_list_next(sp, pin_list); +page_list_del(sp, pin_list); +page_list_add_tail(sp, tmp_list); +sp = next; } -h.tail-list.next = h.next-list.prev = PAGE_LIST_NULL; - +#ifndef PAGE_LIST_NULL +/* The temporary list-head is on our stack. Blank out the + * pointers to it in the shadows, just to get a clean failure if + * we accidentally follow them. */ +tmp_list.next-prev = tmp_list.prev-next = NULL; +#endif Same here. Perhaps worth putting into another inline helper? Jan
Re: [Xen-devel] [PATCH 2/4] x86/mm: allow for building without shadow mode support
On 29.01.15 at 18:34, t...@xen.org wrote: At 11:10 + on 28 Jan (1422439811), Andrew Cooper wrote: On 28/01/15 08:11, Jan Beulich wrote: --- a/xen/arch/x86/mm/shadow/Makefile +++ b/xen/arch/x86/mm/shadow/Makefile @@ -1,4 +1,5 @@ -obj-$(x86_64) += common.o guest_2.o guest_3.o guest_4.o +obj-y:= none.o +obj-$(shadow-paging) := common.o guest_2.o guest_3.o guest_4.o Can this be ifeq($(shadow-paging),y) obj-y := common.o guest_2.o guest_3.o guest_4.o else obj-y := none.o endif Rather than relying on the double := to clobber none.o and prevent a link failure ? +1 to this too, for readability. As you both ask for it I'll do it, but very reluctantly, as to me this makes it worse to read, not better. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] Security patching of XenProject's wiki, mail, blog, etc
Hi Xen developers, Just a heads up as we will be applying some security patches to our VMs hosting several XenProject services like wiki, mail, blog, etc. These updates will require rebooting the VMs so the services will be down for a short period of time (5 - 10 minutes) within the next hour. Cheers, Birin ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 0/3] A few VPMU/watchdog-related patches
On 29.01.15 at 18:58, boris.ostrov...@oracle.com wrote: The first patch is a light cleanup of nmi_watchdog usage. (I removed NMI_INVALID definition but left NMI_IO_APIC. Do we actually need it? I don't see anyone using it except for one place, which doesn't really use it neither) As we really should gain back support for NMI_IO_APIC (considering that there are quite a few systems where the LAPIC variant doesn't work), I think it shouldn't be dropped. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] open-coded page list manipulation in shadow code
At 08:36 + on 30 Jan (1422603387), Jan Beulich wrote: On 29.01.15 at 18:30, t...@xen.org wrote: OK, here's what I have. It keeps the mechanism the same, threading on the main page list entry, but tidies it up to use the page_list operations rather than open-coding. I've tested it (lightly - basically jsut boot testing) with 32bit, 32pae and 64bit Windows VMs (all SMP), with bigmem=y and bigmem=n. It was developed on top of your bigmem series, but it should apply OK before patch 3/4, removing the need to nobble shadow-paging in that patch. Thanks, looks quite okay, just a couple of remarks. @@ -1525,6 +1495,14 @@ mfn_t shadow_alloc(struct domain *d, if ( shadow_type = SH_type_min_shadow shadow_type = SH_type_max_shadow ) sp-u.sh.head = 1; + +#ifndef PAGE_LIST_NULL +/* The temporary list-head is on our stack. Blank out the + * pointers to it in the shadows, just to get a clean failure if + * we accidentally follow them. */ +tmp_list.next-prev = tmp_list.prev-next = NULL; +#endif Perhaps better to use LIST_POISON{1,2} here, so we don't point into PV VA space? Yep, will do. --- a/xen/arch/x86/mm/shadow/private.h +++ b/xen/arch/x86/mm/shadow/private.h @@ -318,6 +318,33 @@ static inline int mfn_oos_may_write(mfn_t gmfn) } #endif /* (SHADOW_OPTIMIZATIONS SHOPT_OUT_OF_SYNC) */ +/* Figure out the size (in pages) of a given shadow type */ +static inline u32 +shadow_size(unsigned int shadow_type) +{ +static const u32 type_to_size[SH_type_unused] = { +1, /* SH_type_none */ +2, /* SH_type_l1_32_shadow */ +2, /* SH_type_fl1_32_shadow */ +4, /* SH_type_l2_32_shadow */ +1, /* SH_type_l1_pae_shadow */ +1, /* SH_type_fl1_pae_shadow */ +1, /* SH_type_l2_pae_shadow */ +1, /* SH_type_l2h_pae_shadow */ +1, /* SH_type_l1_64_shadow */ +1, /* SH_type_fl1_64_shadow */ +1, /* SH_type_l2_64_shadow */ +1, /* SH_type_l2h_64_shadow */ +1, /* SH_type_l3_64_shadow */ +1, /* SH_type_l4_64_shadow */ +1, /* SH_type_p2m_table */ +1, /* SH_type_monitor_table */ +1 /* SH_type_oos_snapshot */ +}; +ASSERT(shadow_type SH_type_unused); +return type_to_size[shadow_type]; +} Isn't this going to lead to multiple instances of that static array? Urgh, maybe. I'd have thought this would end up as a common symbol (if that's the term I mean - one where the linker will merge identical copies) but I didn't check what actually happens. I'll move it back into a .c and just have the lookup function in the header. +#ifndef PAGE_LIST_NULL +/* The temporary list-head is on our stack. Blank out the + * pointers to it in the shadows, just to get a clean failure if + * we accidentally follow them. */ +tmp_list.next-prev = tmp_list.prev-next = NULL; +#endif Same here. Perhaps worth putting into another inline helper? Yep, will do so for v2. Cheers, Tim. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 4/5] hvm_complete_assist_req: We should not be able to get here on IOREQ_TYPE_PCI_CONFIG
On 30.01.15 at 01:52, dsl...@verizon.com wrote: According to your 0/5 Suggested-by: Paul Durrant paul.durr...@citrix.com ? Jan Signed-off-by: Don Slutz dsl...@verizon.com --- xen/arch/x86/hvm/hvm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index c7984d1..6f7b407 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2573,6 +2573,7 @@ bool_t hvm_send_assist_req_to_ioreq_server(struct hvm_ioreq_server *s, static bool_t hvm_complete_assist_req(ioreq_t *p) { +ASSERT(p-type != IOREQ_TYPE_PCI_CONFIG); switch ( p-type ) { case IOREQ_TYPE_COPY: -- 1.8.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send
On 30.01.15 at 01:52, dsl...@verizon.com wrote: --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p) break; } -return 1; +return 0; /* implicitly bins the i/o operation */ } This change points out that having hvm_complete_assist_req() be a separate function yet having only a single caller, and it returning non-void with only a single possible return value isn't the best arrangement. I think this should be brought back into hvm_send_assist_req(), by inverting the if() condition there. Unless there are intentions for it to have another caller, but in that case it should still be made return void, with the caller choosing what to return. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] New IOREQ type -- IOREQ_TYPE_VMWARE_PORT
-Original Message- From: Don Slutz [mailto:dsl...@verizon.com] Sent: 29 January 2015 19:41 To: Paul Durrant; Don Slutz; qemu-de...@nongnu.org; Stefano Stabellini Cc: Peter Maydell; Olaf Hering; Alexey Kardashevskiy; Stefan Weil; Michael Tokarev; Alexander Graf; Gerd Hoffmann; Stefan Hajnoczi; Paolo Bonzini; xen-devel Subject: New IOREQ type -- IOREQ_TYPE_VMWARE_PORT On 01/29/15 07:09, Paul Durrant wrote: ... Given that IIRC you are using a new dedicated IOREQ type, I think there needs to be something that allows an emulator to register for this IOREQ type. How about adding a new type to those defined for HVMOP_map_io_range_to_ioreq_server for your case? (In your case the start and end values in the hypercall would be meaningless but it could be used to steer hvm_select_ioreq_server() into sending all emulation requests or your new type to QEMU. This is an interesting idea. Will need to spend more time on it. Actually such a mechanism could be used to steer IOREQ_TYPE_TIMEOFFSET requests as, with the new QEMU patches, they are going nowhere. Upstream QEMU (as default) used to ignore them anyway, which is why I didn't bother with such a patch to Xen before but since you now need one maybe you could add that too? I think it would not be that hard. Will look into adding it. Currently I do not see how hvm_do_resume() works with 2 ioreq servers. It looks like to me that if a vpcu (like 0) needs to wait for the 2nd ioreq server, hvm_do_resume() will check the 1st ioreq server and return as if the ioreq is done. What am I missing? hvm_do_resume() walks the ioreq server list looking at the IOREQ state in the shared page of each server in turn. If no IOREQ was sent to that server then then state will be IOREQ_NONE and hvm_wait_for_io() will return 1 immediately so the outer loop in hvm_do_resume() will move on to the next server. If a state of IOREQ_READY or IOREQ_INPROCESS is found then the vcpu blocks on the relevant event channel until the state transitions to IORESP_READY. The IOREQ is then completed and the loop moves on to the next server. Normally an IOREQ would only be directed at one server and indeed IOREQs that are issued for emulation requests (i.e. when io_state != HVMIO_none) fall into this category but there is one case of a broadcast IOREQ, which is the INVALIDATE IOREQ (sent to tell emulators to invalidate any mappings of guest memory they may have cached) and that is why hvm_do_resume() has to iterate over all servers. Does that make sense? Paul -Don Slutz ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 3/5] hvmemul_do_io: If the send to the ioreq server failed do not retry.
On 30.01.15 at 01:52, dsl...@verizon.com wrote: I.E. do just what no backing DM does. _If_ this is correct, the if() modified here should be folded with the one a few lines up. But looking at the description of the commit that introduced this (bac0999325 x86 hvm: Do not incorrectly retire an instruction emulation..., almost immediately modified by f20f3c8ece x86 hvm: On failed hvm_send_assist_req(), io emulation...) I doubt this is really what we want, or at the very least your change description should explain what was wrong with the original commit. Jan --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -228,7 +228,11 @@ static int hvmemul_do_io( { rc = X86EMUL_RETRY; if ( !hvm_send_assist_req(p) ) +{ +/* Since the send failed, do not retry */ +rc = X86EMUL_OKAY; vio-io_state = HVMIO_none; +} else if ( p_data == NULL ) rc = X86EMUL_OKAY; } -- 1.8.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 05/29] ArmVirtualizationPkg: allow patchable PCD for device tree base address
On 01/26/15 20:03, Ard Biesheuvel wrote: To allow a runtime self relocating PrePi instance to discover the base address of the device tree at runtime, allow the use of a patchable PCD for gArmVirtualizationTokenSpaceGuid.PcdDeviceTreeInitialBaseAddress. We will not be using the build time patch tool in this case, but using a patchable PCD will make the build system aware that its value is not a compile time constant. Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org --- ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec | 2 +- ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec b/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec index 99411548aff6..d83117fc6abe 100644 --- a/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec +++ b/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec @@ -34,7 +34,7 @@ gArmVirtualizationTokenSpaceGuid = { 0x0B6F5CA7, 0x4F53, 0x445A, { 0xB7, 0x6E, 0x2E, 0x36, 0x5B, 0x80, 0x63, 0x66 } } gEarlyPL011BaseAddressGuid = { 0xB199DEA9, 0xFD5C, 0x4A84, { 0x80, 0x82, 0x2F, 0x41, 0x70, 0x78, 0x03, 0x05 } } -[PcdsFixedAtBuild] +[PcdsFixedAtBuild, PcdsPatchableInModule] # # This is the physical address where the device tree is expected to be stored # upon first entry into UEFI. This needs to be a FixedAtBuild PCD, so that we diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c index aa4ced4582e8..3e3074af72f1 100644 --- a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c +++ b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c @@ -96,7 +96,7 @@ ArmPlatformInitializeSystemMemory ( ASSERT (HobData != NULL); *HobData = 0; - DeviceTreeBase = (VOID *)(UINTN)FixedPcdGet64 (PcdDeviceTreeInitialBaseAddress); + DeviceTreeBase = (VOID *)(UINTN)PcdGet64 (PcdDeviceTreeInitialBaseAddress); ASSERT (DeviceTreeBase != NULL); // Reviewed-by: Laszlo Ersek ler...@redhat.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] stubdom vtpm build failure in staging
On Thu, Jan 29, Ian Campbell wrote: This compiles for me on Debian with gcc 4.7.2-5. Same here. I'm honestly not sure what the C standard says about duplicate but identical typedefs -- Ian? How would that matter now? ;-) Old gcc can not deal with it, the commits cause a regression for gcc-4.5 and older. Olaf ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl
On 01/30/2015 12:38 PM, Tamas K Lengyel wrote: On Fri, Jan 30, 2015 at 8:58 AM, Razvan Cojocaru rcojoc...@bitdefender.com wrote: On 01/29/2015 11:46 PM, Tamas K Lengyel wrote: diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index c7a0bde..3b58700 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1445,15 +1445,6 @@ void p2m_vm_event_emulate_check(struct vcpu *v, const vm_event_response_t *rsp) } } -void p2m_setup_introspection(struct domain *d) -{ -if ( hvm_funcs.enable_msr_exit_interception ) -{ -d-arch.hvm_domain.introspection_enabled = 1; -hvm_funcs.enable_msr_exit_interception(d); -} -} - bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla, struct npfec npfec, vm_event_request_t **req_ptr) I see that introspection_enabled is no longer assigned here ... Introspection_enabled is getting deprecated in this patch and is moved into the monitor_op domctl. Moved, right, but where? This patch removes the d-arch.hvm_domain.introspection_enabled = 1 assignment from here but AFAICT it doesn't move it anwyhere else. It just removes it, so introspection_enabled is always 0. I couldn't find any place in this series where introspection_enabled = 1. Could you please point it out to me if I've missed it by accident? diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c index 0db899e..0b30750 100644 --- a/xen/common/vm_event.c +++ b/xen/common/vm_event.c @@ -617,16 +617,10 @@ int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec, switch( vec-op ) { case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE: -case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION: { rc = vm_event_enable(d, vec, ved, _VPF_mem_access, HVM_PARAM_MONITOR_RING_PFN, mem_access_notification); - -if ( vec-op == XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION - !rc ) -p2m_setup_introspection(d); - } break; @@ -635,7 +629,6 @@ int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec, if ( ved-ring_page ) { rc = vm_event_disable(d, ved); -d-arch.hvm_domain.introspection_enabled = 0; } } break; ... nor here. Patch 6/12 checks it but doesn't set it. Patch 5/12 sets it to 0 (which could account for the removal of the assignment in vm_event.c) but never to 1. A few important things depend on it being enabled: it becomes impossible to disable interception for a select set of MSRs, optimization for RET instructions emulation is disabled, and othere places in p2m.c makes use of the flag as well. Is there some place in the code, untouched by this series, where introspection_enabled is being set to 1? It is moved into the monitor_op domctl when mov_to_msr trapping is enabled. The reason of having introspection_enabled AFAIU was to reenable trapping MSR's that were disabled shortly after boot. Thus, an option field is present in the monitor_op when enabling mov_to_msr trapping: extended_capture. Let me know if this still achieves the same effect as before! No, there are several places where introspection_enabled is needed, as I've said before. Not just the MSRs. One important place is in hvmemul_virtual_to_linear(), in xen/arch/x86/hvm/emulate.c, where we disable optimizations for REP instructions (in today's version of mainline Xen, at line 413). There are also places in patches yet to be published I've worked on where we gate things on introspection_enabled being != 0, so please don't remove it or have it misbehave. I'll look into the extended_capture option in case it will allow future removal of the MSR special case for introspection, but a flag like that is necessary and can't simply be deprecated and removed. Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 13/29] ArmVirtualizationPkg: allow patchable PCD for FV base address
On 01/26/15 20:03, Ard Biesheuvel wrote: Allow the use of a patchable PCD for gArmTokenSpaceGuid.PcdFvBaseAddress by moving it from the [FixedPcd] to the [Pcd] section in the INF file of PlatformPeiLib. Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org --- ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf b/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf index 96019e4009ff..1fca9b28f9e2 100644 --- a/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf +++ b/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf @@ -37,9 +37,11 @@ FdtLib [FixedPcd] - gArmTokenSpaceGuid.PcdFvBaseAddress gArmTokenSpaceGuid.PcdFvSize + +[Pcd] gArmVirtualizationTokenSpaceGuid.PcdDeviceTreeInitialBaseAddress + gArmTokenSpaceGuid.PcdFvBaseAddress [Guids] gEarlyPL011BaseAddressGuid It also seems to change the PCD type of PcdDeviceTreeInitialBaseAddress. Care to mention that too in the commit message, just for completeness? Reviewed-by: Laszlo Ersek ler...@redhat.com Thanks Laszlo ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 3/3] xen: support suspend/resume in pvscsi frontend
Up to now the pvscsi frontend hasn't supported domain suspend and resume. When a domain with an assigned pvscsi device was suspended and resumed again, it was not able to use the device any more: trying to do so resulted in hanging processes. Support suspend and resume of pvscsi devices. Signed-off-by: Juergen Gross jgr...@suse.com --- drivers/scsi/xen-scsifront.c | 189 --- 1 file changed, 162 insertions(+), 27 deletions(-) diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c index 34199d2..b32157b 100644 --- a/drivers/scsi/xen-scsifront.c +++ b/drivers/scsi/xen-scsifront.c @@ -63,6 +63,7 @@ #define VSCSIFRONT_OP_ADD_LUN 1 #define VSCSIFRONT_OP_DEL_LUN 2 +#define VSCSIFRONT_OP_READD_LUN3 /* Tuning point. */ #define VSCSIIF_DEFAULT_CMD_PER_LUN 10 @@ -113,8 +114,13 @@ struct vscsifrnt_info { DECLARE_BITMAP(shadow_free_bitmap, VSCSIIF_MAX_REQS); struct vscsifrnt_shadow *shadow[VSCSIIF_MAX_REQS]; + /* Following items are protected by the host lock. */ wait_queue_head_t wq_sync; + wait_queue_head_t wq_pause; unsigned int wait_ring_available:1; + unsigned int waiting_pause:1; + unsigned int pause:1; + unsigned callers; char dev_state_path[64]; struct task_struct *curr; @@ -274,31 +280,31 @@ static void scsifront_sync_cmd_done(struct vscsifrnt_info *info, wake_up(shadow-wq_reset); } -static int scsifront_cmd_done(struct vscsifrnt_info *info) +static void scsifront_do_response(struct vscsifrnt_info *info, + struct vscsiif_response *ring_rsp) +{ + if (WARN(ring_rsp-rqid = VSCSIIF_MAX_REQS || +test_bit(ring_rsp-rqid, info-shadow_free_bitmap), +illegal rqid %u returned by backend!\n, ring_rsp-rqid)) + return; + + if (info-shadow[ring_rsp-rqid]-act == VSCSIIF_ACT_SCSI_CDB) + scsifront_cdb_cmd_done(info, ring_rsp); + else + scsifront_sync_cmd_done(info, ring_rsp); +} + +static int scsifront_ring_drain(struct vscsifrnt_info *info) { struct vscsiif_response *ring_rsp; RING_IDX i, rp; int more_to_do = 0; - unsigned long flags; - - spin_lock_irqsave(info-host-host_lock, flags); rp = info-ring.sring-rsp_prod; rmb(); /* ordering required respective to dom0 */ for (i = info-ring.rsp_cons; i != rp; i++) { - ring_rsp = RING_GET_RESPONSE(info-ring, i); - - if (WARN(ring_rsp-rqid = VSCSIIF_MAX_REQS || -test_bit(ring_rsp-rqid, info-shadow_free_bitmap), -illegal rqid %u returned by backend!\n, -ring_rsp-rqid)) - continue; - - if (info-shadow[ring_rsp-rqid]-act == VSCSIIF_ACT_SCSI_CDB) - scsifront_cdb_cmd_done(info, ring_rsp); - else - scsifront_sync_cmd_done(info, ring_rsp); + scsifront_do_response(info, ring_rsp); } info-ring.rsp_cons = i; @@ -308,6 +314,18 @@ static int scsifront_cmd_done(struct vscsifrnt_info *info) else info-ring.sring-rsp_event = i + 1; + return more_to_do; +} + +static int scsifront_cmd_done(struct vscsifrnt_info *info) +{ + int more_to_do; + unsigned long flags; + + spin_lock_irqsave(info-host-host_lock, flags); + + more_to_do = scsifront_ring_drain(info); + info-wait_ring_available = 0; spin_unlock_irqrestore(info-host-host_lock, flags); @@ -328,6 +346,24 @@ static irqreturn_t scsifront_irq_fn(int irq, void *dev_id) return IRQ_HANDLED; } +static void scsifront_finish_all(struct vscsifrnt_info *info) +{ + unsigned i; + struct vscsiif_response resp; + + scsifront_ring_drain(info); + + for (i = 0; i VSCSIIF_MAX_REQS; i++) { + if (test_bit(i, info-shadow_free_bitmap)) + continue; + resp.rqid = i; + resp.sense_len = 0; + resp.rslt = DID_RESET 16; + resp.residual_len = 0; + scsifront_do_response(info, resp); + } +} + static int map_data_for_request(struct vscsifrnt_info *info, struct scsi_cmnd *sc, struct vscsiif_request *ring_req, @@ -475,6 +511,27 @@ static struct vscsiif_request *scsifront_command2ring( return ring_req; } +static int scsifront_enter(struct vscsifrnt_info *info) +{ + if (info-pause) + return 1; + info-callers++; + return 0; +} + +static void scsifront_return(struct vscsifrnt_info *info) +{ + info-callers--; + if (info-callers) + return; + + if (!info-waiting_pause) + return; + + info-waiting_pause = 0; + wake_up(info-wq_pause); +}
[Xen-devel] [PATCH 2/3] xen: scsiback: add LUN of restored domain
When a xen domain is being restored the LUN state of a pvscsi device is Connected and not Initialising as in case of attaching a new pvscsi LUN. This must be taken into account when adding a new pvscsi device for a domain as otherwise the pvscsi LUN won't be connected to the SCSI target associated with it. Signed-off-by: Juergen Gross jgr...@suse.com --- drivers/xen/xen-scsiback.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c index 6457784..4290921 100644 --- a/drivers/xen/xen-scsiback.c +++ b/drivers/xen/xen-scsiback.c @@ -993,7 +993,7 @@ found: } static void scsiback_do_add_lun(struct vscsibk_info *info, const char *state, - char *phy, struct ids_tuple *vir) + char *phy, struct ids_tuple *vir, int try) { if (!scsiback_add_translation_entry(info, phy, vir)) { if (xenbus_printf(XBT_NIL, info-dev-nodename, state, @@ -1001,7 +1001,7 @@ static void scsiback_do_add_lun(struct vscsibk_info *info, const char *state, pr_err(xen-pvscsi: xenbus_printf error %s\n, state); scsiback_del_translation_entry(info, vir); } - } else { + } else if (!try) { xenbus_printf(XBT_NIL, info-dev-nodename, state, %d, XenbusStateClosed); } @@ -1061,10 +1061,19 @@ static void scsiback_do_1lun_hotplug(struct vscsibk_info *info, int op, switch (op) { case VSCSIBACK_OP_ADD_OR_DEL_LUN: - if (device_state == XenbusStateInitialising) - scsiback_do_add_lun(info, state, phy, vir); - if (device_state == XenbusStateClosing) + switch (device_state) { + case XenbusStateInitialising: + scsiback_do_add_lun(info, state, phy, vir, 0); + break; + case XenbusStateConnected: + scsiback_do_add_lun(info, state, phy, vir, 1); + break; + case XenbusStateClosing: scsiback_do_del_lun(info, state, vir); + break; + default: + break; + } break; case VSCSIBACK_OP_UPDATEDEV_STATE: -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read
A request in the ring buffer mustn't be read after it has been marked as consumed. Otherwise it might already have been reused by the frontend without violating the ring protocol. To avoid inconsistencies in the backend only work on a private copy of the request. This will ensure a malicious guest not being able to bypass consistency checks of the backend by modifying an active request. Signed-off-by: Juergen Gross jgr...@suse.com --- drivers/xen/xen-scsiback.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c index e999496e..6457784 100644 --- a/drivers/xen/xen-scsiback.c +++ b/drivers/xen/xen-scsiback.c @@ -708,7 +708,7 @@ static int prepare_pending_reqs(struct vscsibk_info *info, static int scsiback_do_cmd_fn(struct vscsibk_info *info) { struct vscsiif_back_ring *ring = info-ring; - struct vscsiif_request *ring_req; + struct vscsiif_request ring_req; struct vscsibk_pend *pending_req; RING_IDX rc, rp; int err, more_to_do; @@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) if (!pending_req) return 1; - ring_req = RING_GET_REQUEST(ring, rc); + memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req)); ring-req_cons = ++rc; - act = ring_req-act; - err = prepare_pending_reqs(info, ring_req, pending_req); + act = ring_req.act; + err = prepare_pending_reqs(info, ring_req, pending_req); if (err) { switch (err) { case -ENODEV: @@ -756,7 +756,7 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) switch (act) { case VSCSIIF_ACT_SCSI_CDB: - if (scsiback_gnttab_data_map(ring_req, pending_req)) { + if (scsiback_gnttab_data_map(ring_req, pending_req)) { scsiback_fast_flush_area(pending_req); scsiback_do_resp_with_sense(NULL, DRIVER_ERROR 24, 0, pending_req); @@ -767,7 +767,7 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) break; case VSCSIIF_ACT_SCSI_ABORT: scsiback_device_action(pending_req, TMR_ABORT_TASK, - ring_req-ref_rqid); + ring_req.ref_rqid); break; case VSCSIIF_ACT_SCSI_RESET: scsiback_device_action(pending_req, TMR_LUN_RESET, 0); -- 2.1.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] tools/xenctrl: correct some function declarations
On Fri, Jan 30, 2015 at 03:32:26PM +0800, Tiejun Chen wrote: When commit 6865e52b78f4, PCI multi-seg: adjust domctl interface, is introduced, we missed to sync that head file. Signed-off-by: Tiejun Chen tiejun.c...@intel.com Acked-by: Wei Liu wei.l...@citrix.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v17 13/23] x86/VPMU: Interface for setting PMU mode and flags
On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote: +static long vpmu_unload_next(void *arg) +{ +struct vcpu *last; +int ret; +unsigned int thiscpu = smp_processor_id(); + +if ( thiscpu != vpmu_next_unload_cpu ) +{ +/* Continuation thread may have been moved due to CPU hot-unplug */ +vpmu_mode = (unsigned long)arg; +vpmu_first_unload_cpu = VPMU_INVALID_CPU; +return -EAGAIN; +} + +local_irq_disable(); /* so that last_vcpu doesn't change under us. */ + +last = this_cpu(last_vcpu); +if ( last ) +{ +vpmu_unload(vcpu_vpmu(last)); +this_cpu(last_vcpu) = NULL; +} So you do this for last_vcpu here, but ... +static int vpmu_unload_all(unsigned long old_mode) +{ +int ret = 0; + +vpmu_unload(vcpu_vpmu(current)); ... for current here, also without clearing this_cpu(last_vcpu). Is that really correct? +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg) +{ +int ret; +struct xen_pmu_params pmu_params; + +ret = xsm_pmu_op(XSM_OTHER, current-domain, op); +if ( ret ) +return ret; + +switch ( op ) +{ +case XENPMU_mode_set: +{ +unsigned int old_mode; +static DEFINE_SPINLOCK(xenpmu_mode_lock); + +if ( copy_from_guest(pmu_params, arg, 1) ) +return -EFAULT; + +if ( pmu_params.val ~(XENPMU_MODE_SELF | XENPMU_MODE_HV) ) +return -EINVAL; + +/* 32-bit dom0 can only sample itself. */ +if ( is_pv_32bit_vcpu(current) (pmu_params.val XENPMU_MODE_HV) ) +return -EINVAL; + +/* + * Return error if someone else is in the middle of changing mode --- + * this is most likely indication of two system administrators + * working against each other. + */ +if ( !spin_trylock(xenpmu_mode_lock) ) +return -EAGAIN; +if ( vpmu_first_unload_cpu != VPMU_INVALID_CPU ) +{ +spin_unlock(xenpmu_mode_lock); +return -EAGAIN; +} + +old_mode = vpmu_mode; +vpmu_mode = pmu_params.val; + +if ( vpmu_mode == XENPMU_MODE_OFF ) +{ +/* Make sure all (non-dom0) VCPUs have unloaded their VPMUs. */ +ret = vpmu_unload_all(old_mode); +if ( ret ) +vpmu_mode = old_mode; +} + +spin_unlock(xenpmu_mode_lock); + +break; +} + +case XENPMU_mode_get: +memset(pmu_params, 0, sizeof(pmu_params)); +pmu_params.val = vpmu_mode; + +pmu_params.version.maj = XENPMU_VER_MAJ; +pmu_params.version.min = XENPMU_VER_MIN; + +if ( copy_to_guest(arg, pmu_params, 1) ) +return -EFAULT; +break; + +case XENPMU_feature_set: +if ( copy_from_guest(pmu_params, arg, 1) ) +return -EFAULT; + +if ( pmu_params.val ~XENPMU_FEATURE_INTEL_BTS ) +return -EINVAL; + +vpmu_features = pmu_params.val; +break; + +case XENPMU_feature_get: +pmu_params.val = vpmu_features; +if ( copy_field_to_guest(arg, pmu_params, val) ) +return -EFAULT; +break; + +default: +ret = -EINVAL; +} + +return ret; +} Throughout this function, the pad field isn't being checked to be zero (and XENPMU_feature_get also doesn't clear it, but that seems secondary, at least as long as it's being declared IN only). As I'm sure I said before - we ought to do this in order to be able to assign meaning to the field later on. Perhaps it would even better be renamed to e.g. mbz. @@ -144,6 +145,9 @@ do_tmem_op( extern long do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg); +extern long +do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg); Despite there being numerous bad examples - can op please be unsigned int? --- a/xen/include/xsm/xsm.h +++ b/xen/include/xsm/xsm.h @@ -173,6 +173,7 @@ struct xsm_operations { int (*unbind_pt_irq) (struct domain *d, struct xen_domctl_bind_pt_irq *bind); int (*ioport_permission) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow); int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow); +int (*pmu_op) (struct domain *d, int op); And then here too? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC V10 0/4] domain snapshot document
On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote: On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote: Changes to V9: This looks good to me, thanks. Ian/Wei do you have any comments? xapi folks? Looks good to me. Wei. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl
There are also places in patches yet to be published I've worked on where we gate things on introspection_enabled being != 0, so please don't remove it or have it misbehave. I'll look into the extended_capture option in case it will allow future removal of the MSR special case for introspection, but a flag like that is necessary and can't simply be deprecated and removed. Ack, the plan was actually to replace all references to arch.hvm_domain.introspection_enabled with arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to actually fully follow through that plan but that's the intention at least. So the functionality would remain, it would just be worked into the coherent settings field with every other type of events. If you take a look at the next iteration of this series at https://github.com/tklengyel/xen/compare/mem_event_cleanup4?expand=1 I updated this patch to implement what I described above. Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 0/5] Split off mini-os to a separate tree
On Mon, Jan 26, 2015 at 09:53:25AM +, Thomas Leonard wrote: [ Cc += Anil ] On 25 January 2015 at 18:25, Wei Liu wei.l...@citrix.com wrote: Cc Ian and Ian and some folks who might be interested in this work. On Sun, Jan 25, 2015 at 06:13:41PM +, Wei Liu wrote: There has been increasing use of mini-os in unikernel world and basically everybody has their own fork of mini-os. That way going foward is going to cause fragmentation of the community. We would like to split off mini-os tree so that everybody can upstream their changes and work on a common code base. Later I would also like to setup a proper push gate for mini-os. Stubdom's build environment is known to be very fragile, so basically all the real work is done in top level Makefile. I use following runes to split off mini-os: git filter-branch --tag-name-filter cat \ --subdirectory-filter extras/mini-os/ -- --all # There is already a tag name 4.3.0-rc2 which points to the same commit. git tag -d xen-4.3.0-rc2 # Add xen- prefix to all tags for t in `git tag`; do git tag xen-$t $t; git tag -d $t ; done git gc --aggressive The tree can be found at: git://xenbits.xen.org/people/liuw/mini-os.git master Note that mini-os cannot build on its own due to the limitation of it's own build system. Splitting it off it's the first step towards fixing that problem In case it's useful: for the standalone version of Mini-OS used by Mirage, I had to include these directories too: 1. xen/include/public 2. xen/common/libfdt 3. xen/include/xen/libfdt Is libfdt required? I see inclusion of libfdt.h in arch/arm/mm.c but I couldn't find trace to link libfdt in Makefile. And I ran a simple test to compile minios in Xen tree on an arm board it doesn't compile. If arm support is still WIP I'm going to ignore libfdt for now. Wei. 4. config ( https://github.com/talex5/xen/tree/minios-releases ) Will there be a separate mailing list for Mini-OS? That would be very useful for people who don't want to keep up with xen-devel. Thanks! -- Dr Thomas Leonardhttp://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl
Ack, the plan was actually to replace all references to arch.hvm_domain.introspection_enabled with arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to actually fully follow through that plan but that's the intention at least. So the functionality would remain, it would just be worked into the coherent settings field with every other type of events. I see, but is it conceivable that some vm_event consumer does want to enable arch.monitor_options.mov_to_msr.extended_capture but not be interested in doing full-blown introspection (for example, is fine with having the REP optimization enabled)? I'm not sure I follow what the difference here between mov_to_msr.extended_capture and what you refer to as full-blown introspection. My understanding was that without gating some MSR's via introspection_enabled, they are never trapped into the hypervisor when they get written to, while other MSR do. Thus the two options: mov_to_msr.enabled for basic trapping, and mov_to_msr.extended_capture for gating the extended set of MSRs to be also trappable. What you're proposing here (as far as introspection_enabled is concerned) is, if I understand correctly, basically renaming introspection_enabled to mov_to_msr.extended_capture. I can see how that would seem to simplify some things, but it might not look very intuitive to future developers, and it will definitely complicate other things down the road. No, not just simply renaming it. So far the options for the various VM events were scattered all over the place, yours defined directly on hvm_domain, others in hvm parameters. Now there is an assigned spot for all current - and future - events to store their settings: in arch_domain monitor_settings. This will work for PV domains as well, while keeping it on the arch_domain layer will avoid defining options on ARM that are architecture specific (mov_to_cr0/3/4 etc.). Furthermore, setting these options was also a mess which I try to address in this patch: some were enabled via hvm_op hypercalls, yours via domctl. Now everything is moved into one spot: monitor_op domctl. An example is havig the guest trigger a vm_event, requested by the privileged userspace application. In our case, it was VMCALL in the original series, and the patch has been eventually dropped from the conversation - to be reworked at a later time. But we do need it, and our patch now does its thing gated on introspection_enabled. It's hard to connect that logic to mov_to_msr.extended_capture. I guess we could keep the flag and move it to arch.monitor_options if you prefer? And setting it could cause mov_to_msr.extended_capture and assorted flags to be set also (some sort of umbrella setting)? Yes, that would be the logic going forward - all VM event related options and settings should be stored in this structure. Also, I don't see a problem with having an event or setting that enables multiple at the same time. Thanks, Tamas Thanks, Razvan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read
On 01/30/2015 12:47 PM, Jan Beulich wrote: On 30.01.15 at 12:21, jgr...@suse.com wrote: @@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) if (!pending_req) return 1; - ring_req = RING_GET_REQUEST(ring, rc); + memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req)); I'd recommend the type safe *ring_req = *RING_GET_REQUEST(ring, rc) here. I think I'll use ring_req = *RING_GET_REQUEST(ring, rc) :-) ring-req_cons = ++rc; - act = ring_req-act; - err = prepare_pending_reqs(info, ring_req, pending_req); + act = ring_req.act; Is this helper variable then still needed? No, you're right. Will delete it. Juergen ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
From: denys drozdov denys.droz...@globallogic.com Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b Signed-off-by: denys drozdov denys.droz...@globallogic.com --- xen/common/sched_rt.c |7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c index e70d6c7..1ab0a62 100644 --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc) struct rt_vcpu *snext = NULL; struct rt_dom *sdom = NULL; struct rt_private *prv = rt_priv(ops); -cpumask_t *online; -spinlock_t *lock = vcpu_schedule_lock_irq(vc); +cpumask_t *online; +unsigned long flags; +spinlock_t *lock = vcpu_schedule_lock_irqsave(vc, flags); clear_bit(__RTDS_scheduled, svc-flags); /* not insert idle vcpu to runq */ @@ -1032,7 +1033,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc) runq_tickle(ops, snext); } out: -vcpu_schedule_unlock_irq(lock, vc); +vcpu_schedule_unlock_irqrestore(lock, flags, vc); } /* -- 1.7.9.5 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v2 16/29] Ovmf/Xen: fix pointer to int cast in XenBusDxe
On 01/26/15 20:03, Ard Biesheuvel wrote: On ARM, xen_pfn_t is 64 bits but the size of a pointer is only 32 bits, so casting between them needs to go via (UINTN). Also move the xen_pfn_t cast outside the shift so that we can avoid shifting 64-bit quantities on 32-bit architectures, which may require runtime library support. Contributed-under: TianoCore Contribution Agreement 1.0 Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org --- OvmfPkg/XenBusDxe/GrantTable.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/OvmfPkg/XenBusDxe/GrantTable.c b/OvmfPkg/XenBusDxe/GrantTable.c index 37d3bf786c64..8405edc51bc4 100644 --- a/OvmfPkg/XenBusDxe/GrantTable.c +++ b/OvmfPkg/XenBusDxe/GrantTable.c @@ -160,7 +160,7 @@ XenGrantTableInit ( Parameters.domid = DOMID_SELF; Parameters.idx = Index; Parameters.space = XENMAPSPACE_grant_table; -Parameters.gpfn = (((xen_pfn_t) GrantTable) EFI_PAGE_SHIFT) + Index; +Parameters.gpfn = (xen_pfn_t) ((UINTN) GrantTable EFI_PAGE_SHIFT) + Index; ReturnCode = XenHypercallMemoryOp (Dev, XENMEM_add_to_physmap, Parameters); if (ReturnCode != 0) { DEBUG ((EFI_D_ERROR, Xen GrantTable, add_to_physmap hypercall error: %d\n, ReturnCode)); @@ -182,7 +182,7 @@ XenGrantTableDeinit ( for (Index = NR_GRANT_FRAMES - 1; Index = 0; Index--) { Parameters.domid = DOMID_SELF; -Parameters.gpfn = (((xen_pfn_t) GrantTable) EFI_PAGE_SHIFT) + Index; +Parameters.gpfn = (xen_pfn_t) ((UINTN) GrantTable EFI_PAGE_SHIFT) + Index; DEBUG ((EFI_D_INFO, Xen GrantTable, removing %X\n, Parameters.gpfn)); ReturnCode = XenHypercallMemoryOp (Dev, XENMEM_remove_from_physmap, Parameters); if (ReturnCode != 0) { One edk2 coding style guideline that I consistently reject never apply is the first space in (type) expr1 op expr2 We had an obscure bug due to this guideline earlier, which I luckily managed to catch through review, before it could do damage. The problem with the first space is that it actively obscures the fact that the cast operator has one of the strongest operator bindings. Most of the time, the above means ((type)expr1) op expr2 and not (type)(expr1 op expr2) as the space would incorrectly suggest. Your patch is fine, but I had to think thrice. :) Reviewed-by: Laszlo Ersek ler...@redhat.com ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/balloon: cancel ballooning if adding new memory failed
On Thu, 2015-01-29 at 18:01 +, David Vrabel wrote: On 29/01/15 13:36, Ian Campbell wrote: On Mon, 2014-09-01 at 18:52 +0100, David Vrabel wrote: If the balloon driver is adding additional memory regions to the balloon and add_memory() fails it will likely continuously fail so cancel the balloon operation. Signed-off-by: David Vrabel david.vra...@citrix.com https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776448 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304001 seem to suggest this should be a candidate for stable backports? It's up to the distro kernel maintainer to request it if they think it is important. Which is what I thought I was doing. Surely you don't think I should just ask stable@ without conferring with the relevant maintainers first? Ian. ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl
On 01/30/2015 01:45 PM, Tamas K Lengyel wrote: Ack, the plan was actually to replace all references to arch.hvm_domain.introspection_enabled with arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to actually fully follow through that plan but that's the intention at least. So the functionality would remain, it would just be worked into the coherent settings field with every other type of events. I see, but is it conceivable that some vm_event consumer does want to enable arch.monitor_options.mov_to_msr.extended_capture but not be interested in doing full-blown introspection (for example, is fine with having the REP optimization enabled)? I'm not sure I follow what the difference here between mov_to_msr.extended_capture and what you refer to as full-blown introspection. My understanding was that without gating some MSR's via introspection_enabled, they are never trapped into the hypervisor when they get written to, while other MSR do. Thus the two options: mov_to_msr.enabled for basic trapping, and mov_to_msr.extended_capture for gating the extended set of MSRs to be also trappable. Well, write interception for the MSRs we're interested in is enabled _initally_, but then vmx_disable_intercept_for_msr() gets called and we're left without these events, so yes, in a nutshell we do want to keep write events for more MSRs than the default allows around. The initial, very basic patch: http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00310.html The point was to keep receiving these events even after the HV decides the corresponding MSR writes have become unnecessarily expensive to intercept, but only if the vm_event consumer is interested in guest introspection. By full-blown introspection I mean that, in order to do its job properly, guest introspection needs a bunch of requirements to be met (not just the proper MSR write events to be sent), not least of which is to send an event per each write performed in a REP situation (not just an event per page touched) - which is why, gated on introspection being enabled, we disable the REP emulation optimization. It also needs a number of other things, which will surely become RFCs here at some point. Hence my previous comments: full-blown vm_event-based introspection is more than the MSR trick, so it might be slightly confusing to gate all of them on mov_to_msr.extended_capture. What you're proposing here (as far as introspection_enabled is concerned) is, if I understand correctly, basically renaming introspection_enabled to mov_to_msr.extended_capture. I can see how that would seem to simplify some things, but it might not look very intuitive to future developers, and it will definitely complicate other things down the road. No, not just simply renaming it. So far the options for the various VM events were scattered all over the place, yours defined directly on hvm_domain, others in hvm parameters. Now there is an assigned spot for all current - and future - events to store their settings: in arch_domain monitor_settings. This will work for PV domains as well, while keeping it on the arch_domain layer will avoid defining options on ARM that are architecture specific (mov_to_cr0/3/4 etc.). Furthermore, setting these options was also a mess which I try to address in this patch: some were enabled via hvm_op hypercalls, yours via domctl. Now everything is moved into one spot: monitor_op domctl. Yes, and that's great! Your efforts are appreciated, and I'm sorry to have inadvertently contributed to the mess. I was simply talking about the way introspection_enabled is being used: you were taking before about replacing all instances of introspection_enabled with mov_to_msr.enabled (which is functionally speaking, renaming), and my only points have been that: A) in this series introspection_enabled has neither been removed nor is it of any use, since it's always 0, and B) exclusive use of mov_to_msr.extended_capture where introspection_enabled has been used before might not logically cover all the cases needed for proper guest introspection (not sure that gating the code that disables the REP emulation optimization on mov_to_msr.extended_capture wouldn't look strange to someone looking at the code for the first time). I hope I've been able to explain myself better this time, perhaps my concerns are not shared by others, I'm certainly open to discussion. An example is havig the guest trigger a vm_event, requested by the privileged userspace application. In our case, it was VMCALL in the original series, and the patch has been eventually dropped from the conversation - to be reworked at a later time. But we do need it, and our patch now does its thing gated on introspection_enabled. It's hard to connect that logic to mov_to_msr.extended_capture. I guess we could keep the flag and move it to arch.monitor_options if you prefer? And setting it could cause mov_to_msr.extended_capture and
Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough
On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote: [...] Just remember to handle old option in libxl if your old option is already released by some older version of QEMUs. I just drop that old option, -gfx_passthru, if we're under qemu upstream circumstance, like this, The question is, is there any version of qemu upstream that has been released that has the old option (-gfx-passthru)? No. Just now we're starting to support IGD passthrough in qemu upstream. Right, as of QEMU 2.2.0 there's no support of IGD passthrough in QMEU upstream. This gives us a situation that we need to support both the old (-gfx-passthru) and new (-igd-passthru) options. Presumably we (libxl) would need to fork a qemu process to determine which option it has and pass the right one. Or you can try to keep both old and new option at the same time but Yeah, actually I also have considered to keep both two options at the same time. Its really friendly to any qemu version. deprecate the old one. Then in a few qemu release cycles later (or This should be like 'accel=kvm' versus 'enable-kvm' in qemu upstream. They're coexisted now but just the former is a modern option. probably one year or two?) you can finally remove the old one. The point is that to give downstream (in this case, Xen) time to cope with the change. Here I'm fine to this way. So Gerd, So you don't actually need to ask Gerd this question because there is no old option to keep in qemu upstream. Libxl (or any sensible toolstack) will just do the right thing to either pass -igd-passthru (or whatever you guys agree upon) to qemu upstream or pass -gfx-passthru to qemu traditional. :-) Wei. What about this? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [linux-3.14 test] 33918: regressions - FAIL
flight 33918 linux-3.14 real [real] http://www.chiark.greenend.org.uk/~xensrcts/logs/33918/ Regressions :-( Tests which did not succeed and are blocking, including tests which could not be run: test-amd64-i386-xl-qemuu-ovmf-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-qemuu-winxpsp3 5 xen-bootfail REGR. vs. 33341 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-qemut-winxpsp3 5 xen-bootfail REGR. vs. 33341 test-amd64-amd64-rumpuserxen-amd64 7 rumpuserxen-demo-setup fail REGR. vs. 33341 test-amd64-i386-xl-multivcpu 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-rumpuserxen-i386 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-winxpsp3 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-win7-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-pvh-amd 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemuu-winxpsp3 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-freebsd10-i386 5 xen-bootfail REGR. vs. 33341 test-amd64-amd64-xl-qemut-debianhvm-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemuu-win7-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-win7-amd64 7 windows-install fail REGR. vs. 33341 test-amd64-i386-qemuu-rhel6hvm-amd 7 redhat-install fail REGR. vs. 33341 test-amd64-amd64-xl-qemuu-debianhvm-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-qemut-rhel6hvm-amd 5 xen-bootfail REGR. vs. 33341 test-amd64-amd64-xl-pvh-intel 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-winxpsp3 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemut-debianhvm-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-freebsd10-amd64 4 xen-installfail REGR. vs. 33341 test-amd64-i386-xl5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemut-winxpsp3 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-qemuu-win7-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-qemut-rhel6hvm-intel 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-libvirt 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-rhel6hvm-amd 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-winxpsp3-vcpus1 5 xen-bootfail REGR. vs. 33341 test-amd64-i386-xl-qemuu-debianhvm-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-qemuu-rhel6hvm-intel 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-credit27 debian-installfail REGR. vs. 33341 test-amd64-amd64-xl-qemuu-ovmf-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-libvirt 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-pair16 guest-start/debianfail REGR. vs. 33341 test-amd64-amd64-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-pair 8 xen-boot/dst_host fail REGR. vs. 33341 test-amd64-i386-pair 7 xen-boot/src_host fail REGR. vs. 33341 Regressions which are regarded as allowable (not blocking): test-amd64-amd64-xl-sedf 5 xen-boot fail REGR. vs. 33341 test-amd64-i386-rhel6hvm-intel 5 xen-boot fail like 33296 test-amd64-amd64-xl-sedf-pin 5 xen-boot fail REGR. vs. 33341 test-amd64-amd64-xl-pcipt-intel 5 xen-boot fail REGR. vs. 33341 Tests which did not succeed, but are not blocking: test-armhf-armhf-libvirt 9 guest-start fail never pass test-armhf-armhf-xl 10 migrate-support-checkfail never pass version targeted for testing: linux4d7313cd7cbe56ccb511eca23ef5bba7f10ffcb8 baseline version: linuxc3b70f0bbb6a883f4afa639286043d3f71fbbddf People who touched revisions under test: Ahmed S. Darwish ahmed.darw...@valeo.com Alan Stern st...@rowland.harvard.edu Alex Deucher alexander.deuc...@amd.com Alex Williamson alex.william...@redhat.com Alexander Fomichev git.u...@gmail.com Alexander Y. Fomichev git.u...@gmail.com Amit Virdi amit.vi...@st.com Andrew Jackson andrew.jack...@arm.com Andrew Morton a...@linux-foundation.org Andy Lutomirski l...@amacapital.net Anton Blanchard an...@samba.org Antonio Borneo borneo.anto...@gmail.com Antonio Quartulli anto...@meshcoding.com Ard Biesheuvel ard.biesheu...@linaro.org Arik Nemtsov a...@wizery.com Arik Nemtsov arikx.nemt...@intel.com Arnaldo
Re: [Xen-devel] [PATCH v2 12/29] ArmVirtualizationPkg: implement custom MemoryInitPeiLib
On 01/26/15 20:03, Ard Biesheuvel wrote: This implements a MemoryInitPeiLib instance that differs from the stock ArmPlatformPkg version only in the fact that it does not remove the memory used by the flash device (FD). The reason is that, when using PrePi, the DXE core is started immediately and never returns so there is no reason to preserve any of the memory that the flash device occupied originally, and it is preferable to release is so that the OS loader can reuse it. This is especially important for the relocatable PrePi configuration, which is aimed at being launched from a boot loader that itself adheres to the Linux arm64 boot protocol. Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org --- .../Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c | 91 .../Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf | 66 +++ 2 files changed, 157 insertions(+) diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c new file mode 100644 index ..5f6cd059c47f --- /dev/null +++ b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c @@ -0,0 +1,91 @@ +/** @file +* +* Copyright (c) 2011-2014, ARM Limited. All rights reserved. +* Copyright (c) 2014, Linaro Limited. All rights reserved. +* +* This program and the accompanying materials +* are licensed and made available under the terms and conditions of the BSD License +* which accompanies this distribution. The full text of the license may be found at +* http://opensource.org/licenses/bsd-license.php +* +* THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN AS IS BASIS, +* WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED. +* +**/ + +#include PiPei.h + +#include Library/ArmPlatformLib.h +#include Library/DebugLib.h +#include Library/HobLib.h +#include Library/MemoryAllocationLib.h +#include Library/PcdLib.h + +VOID +BuildMemoryTypeInformationHob ( + VOID + ); + +VOID +InitMmu ( + VOID + ) +{ + ARM_MEMORY_REGION_DESCRIPTOR *MemoryTable; + VOID *TranslationTableBase; + UINTN TranslationTableSize; + RETURN_STATUS Status; + + // Get Virtual Memory Map from the Platform Library + ArmPlatformGetVirtualMemoryMap (MemoryTable); + + //Note: Because we called PeiServicesInstallPeiMemory() before to call InitMmu() the MMU Page Table resides in + // DRAM (even at the top of DRAM as it is the first permanent memory allocation) + Status = ArmConfigureMmu (MemoryTable, TranslationTableBase, TranslationTableSize); + if (EFI_ERROR (Status)) { +DEBUG ((EFI_D_ERROR, Error: Failed to enable MMU\n)); + } +} + +EFI_STATUS +EFIAPI +MemoryPeim ( + IN EFI_PHYSICAL_ADDRESS UefiMemoryBase, + IN UINT64 UefiMemorySize + ) +{ + EFI_RESOURCE_ATTRIBUTE_TYPE ResourceAttributes; + + // Ensure PcdSystemMemorySize has been set + ASSERT (PcdGet64 (PcdSystemMemorySize) != 0); + + // + // Now, the permanent memory has been installed, we can call AllocatePages() + // + ResourceAttributes = ( + EFI_RESOURCE_ATTRIBUTE_PRESENT | + EFI_RESOURCE_ATTRIBUTE_INITIALIZED | + EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE | + EFI_RESOURCE_ATTRIBUTE_WRITE_COMBINEABLE | + EFI_RESOURCE_ATTRIBUTE_WRITE_THROUGH_CACHEABLE | + EFI_RESOURCE_ATTRIBUTE_WRITE_BACK_CACHEABLE | + EFI_RESOURCE_ATTRIBUTE_TESTED + ); + + BuildResourceDescriptorHob ( + EFI_RESOURCE_SYSTEM_MEMORY, + ResourceAttributes, + PcdGet64 (PcdSystemMemoryBase), + PcdGet64 (PcdSystemMemorySize) + ); + + // Build Memory Allocation Hob + InitMmu (); + + if (FeaturePcdGet (PcdPrePiProduceMemoryTypeInformationHob)) { +// Optional feature that helps prevent EFI memory map fragmentation. +BuildMemoryTypeInformationHob (); + } + + return EFI_SUCCESS; +} diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf new file mode 100644 index ..fcdae06de7c2 --- /dev/null +++ b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf @@ -0,0 +1,66 @@ +#/** @file +# +# Copyright (c) 2011-2014, ARM Ltd. All rights reserved.BR +# Copyright (c) 2014,
Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl
On Fri, Jan 30, 2015 at 12:07 PM, Razvan Cojocaru rcojoc...@bitdefender.com wrote: On 01/30/2015 12:38 PM, Tamas K Lengyel wrote: On Fri, Jan 30, 2015 at 8:58 AM, Razvan Cojocaru rcojoc...@bitdefender.com wrote: On 01/29/2015 11:46 PM, Tamas K Lengyel wrote: diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c index c7a0bde..3b58700 100644 --- a/xen/arch/x86/mm/p2m.c +++ b/xen/arch/x86/mm/p2m.c @@ -1445,15 +1445,6 @@ void p2m_vm_event_emulate_check(struct vcpu *v, const vm_event_response_t *rsp) } } -void p2m_setup_introspection(struct domain *d) -{ -if ( hvm_funcs.enable_msr_exit_interception ) -{ -d-arch.hvm_domain.introspection_enabled = 1; -hvm_funcs.enable_msr_exit_interception(d); -} -} - bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla, struct npfec npfec, vm_event_request_t **req_ptr) I see that introspection_enabled is no longer assigned here ... Introspection_enabled is getting deprecated in this patch and is moved into the monitor_op domctl Moved, right, but where? This patch removes the d-arch.hvm_domain.introspection_enabled = 1 assignment from here but AFAICT it doesn't move it anwyhere else. It just removes it, so introspection_enabled is always 0. Actually it should have been removed altogether from the HVM domain definition as the same information should be stored in arch_domain's monitor_options struct. I couldn't find any place in this series where introspection_enabled = 1. Could you please point it out to me if I've missed it by accident? diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c index 0db899e..0b30750 100644 --- a/xen/common/vm_event.c +++ b/xen/common/vm_event.c @@ -617,16 +617,10 @@ int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec, switch( vec-op ) { case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE: -case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION: { rc = vm_event_enable(d, vec, ved, _VPF_mem_access, HVM_PARAM_MONITOR_RING_PFN, mem_access_notification); - -if ( vec-op == XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION - !rc ) -p2m_setup_introspection(d); - } break; @@ -635,7 +629,6 @@ int vm_event_domctl(struct domain *d, xen_domctl_vm_event_op_t *vec, if ( ved-ring_page ) { rc = vm_event_disable(d, ved); -d-arch.hvm_domain.introspection_enabled = 0; } } break; ... nor here. Patch 6/12 checks it but doesn't set it. Patch 5/12 sets it to 0 (which could account for the removal of the assignment in vm_event.c) but never to 1. A few important things depend on it being enabled: it becomes impossible to disable interception for a select set of MSRs, optimization for RET instructions emulation is disabled, and othere places in p2m.c makes use of the flag as well. Is there some place in the code, untouched by this series, where introspection_enabled is being set to 1? It is moved into the monitor_op domctl when mov_to_msr trapping is enabled. The reason of having introspection_enabled AFAIU was to reenable trapping MSR's that were disabled shortly after boot. Thus, an option field is present in the monitor_op when enabling mov_to_msr trapping: extended_capture. Let me know if this still achieves the same effect as before! No, there are several places where introspection_enabled is needed, as I've said before. Not just the MSRs. One important place is in hvmemul_virtual_to_linear(), in xen/arch/x86/hvm/emulate.c, where we disable optimizations for REP instructions (in today's version of mainline Xen, at line 413). There are also places in patches yet to be published I've worked on where we gate things on introspection_enabled being != 0, so please don't remove it or have it misbehave. I'll look into the extended_capture option in case it will allow future removal of the MSR special case for introspection, but a flag like that is necessary and can't simply be deprecated and removed. Ack, the plan was actually to replace all references to arch.hvm_domain.introspection_enabled with arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to actually fully follow through that plan but that's the intention at least. So the functionality would remain, it would just be worked into the coherent settings field with every other type of events. Thanks, Razvan Thanks, Tamas ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read
On 30.01.15 at 12:21, jgr...@suse.com wrote: @@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info) if (!pending_req) return 1; - ring_req = RING_GET_REQUEST(ring, rc); + memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req)); I'd recommend the type safe *ring_req = *RING_GET_REQUEST(ring, rc) here. ring-req_cons = ++rc; - act = ring_req-act; - err = prepare_pending_reqs(info, ring_req, pending_req); + act = ring_req.act; Is this helper variable then still needed? Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH RFC 0/5] Split off mini-os to a separate tree
On 30 January 2015 at 11:26, Wei Liu wei.l...@citrix.com wrote: On Mon, Jan 26, 2015 at 09:53:25AM +, Thomas Leonard wrote: [ Cc += Anil ] On 25 January 2015 at 18:25, Wei Liu wei.l...@citrix.com wrote: Cc Ian and Ian and some folks who might be interested in this work. On Sun, Jan 25, 2015 at 06:13:41PM +, Wei Liu wrote: There has been increasing use of mini-os in unikernel world and basically everybody has their own fork of mini-os. That way going foward is going to cause fragmentation of the community. We would like to split off mini-os tree so that everybody can upstream their changes and work on a common code base. Later I would also like to setup a proper push gate for mini-os. Stubdom's build environment is known to be very fragile, so basically all the real work is done in top level Makefile. I use following runes to split off mini-os: git filter-branch --tag-name-filter cat \ --subdirectory-filter extras/mini-os/ -- --all # There is already a tag name 4.3.0-rc2 which points to the same commit. git tag -d xen-4.3.0-rc2 # Add xen- prefix to all tags for t in `git tag`; do git tag xen-$t $t; git tag -d $t ; done git gc --aggressive The tree can be found at: git://xenbits.xen.org/people/liuw/mini-os.git master Note that mini-os cannot build on its own due to the limitation of it's own build system. Splitting it off it's the first step towards fixing that problem In case it's useful: for the standalone version of Mini-OS used by Mirage, I had to include these directories too: 1. xen/include/public 2. xen/common/libfdt 3. xen/include/xen/libfdt Is libfdt required? I see inclusion of libfdt.h in arch/arm/mm.c but I couldn't find trace to link libfdt in Makefile. And I ran a simple test to compile minios in Xen tree on an arm board it doesn't compile. If arm support is still WIP I'm going to ignore libfdt for now. The ARM port works fine, but you need to apply the last couple of patches in the series (the rest are already in). The latest versions are on my next branch here: https://github.com/talex5/xen/commits/next If you want to test it, you need these two: - mini-os: arm: interrupt controller - mini-os: arm: build system Wei. 4. config ( https://github.com/talex5/xen/tree/minios-releases ) Will there be a separate mailing list for Mini-OS? That would be very useful for people who don't want to keep up with xen-devel. Thanks! -- Dr Thomas Leonardhttp://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA -- Dr Thomas Leonardhttp://0install.net/ GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm
On 30.01.15 at 13:30, denys.droz...@globallogic.com wrote: From: denys drozdov denys.droz...@globallogic.com There's a description missing here of _what_ (case) you are fixing. Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b And the purpose of this line is unclear. --- a/xen/common/sched_rt.c +++ b/xen/common/sched_rt.c @@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc) struct rt_vcpu *snext = NULL; struct rt_dom *sdom = NULL; struct rt_private *prv = rt_priv(ops); -cpumask_t *online; -spinlock_t *lock = vcpu_schedule_lock_irq(vc); +cpumask_t *online; You're introducing a trailing blank here. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel