Re: [Xen-devel] [PATCH v17 13/23] x86/VPMU: Interface for setting PMU mode and flags

2015-01-30 Thread Boris Ostrovsky

On 01/30/2015 08:31 AM, Jan Beulich wrote:

On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote:

+static long vpmu_unload_next(void *arg)
+{
+struct vcpu *last;
+int ret;
+unsigned int thiscpu = smp_processor_id();
+
+if ( thiscpu != vpmu_next_unload_cpu )
+{
+/* Continuation thread may have been moved due to CPU hot-unplug */
+vpmu_mode = (unsigned long)arg;
+vpmu_first_unload_cpu = VPMU_INVALID_CPU;
+return -EAGAIN;
+}
+
+local_irq_disable(); /* so that last_vcpu doesn't change under us. */
+
+last = this_cpu(last_vcpu);
+if ( last )
+{
+vpmu_unload(vcpu_vpmu(last));
+this_cpu(last_vcpu) = NULL;
+}

So you do this for last_vcpu here, but ...



+static int vpmu_unload_all(unsigned long old_mode)
+{
+int ret = 0;
+
+vpmu_unload(vcpu_vpmu(current));

... for current here, also without clearing this_cpu(last_vcpu). Is
that really correct?


No, I should clear it here as well.




+long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
+{
+int ret;
+struct xen_pmu_params pmu_params;
+
+ret = xsm_pmu_op(XSM_OTHER, current-domain, op);
+if ( ret )
+return ret;
+
+switch ( op )
+{
+case XENPMU_mode_set:
+{
+unsigned int old_mode;
+static DEFINE_SPINLOCK(xenpmu_mode_lock);
+
+if ( copy_from_guest(pmu_params, arg, 1) )
+return -EFAULT;
+
+if ( pmu_params.val  ~(XENPMU_MODE_SELF | XENPMU_MODE_HV) )
+return -EINVAL;
+
+/* 32-bit dom0 can only sample itself. */
+if ( is_pv_32bit_vcpu(current)  (pmu_params.val  XENPMU_MODE_HV) )
+return -EINVAL;
+
+/*
+ * Return error if someone else is in the middle of changing mode ---
+ * this is most likely indication of two system administrators
+ * working against each other.
+ */
+if ( !spin_trylock(xenpmu_mode_lock) )
+return -EAGAIN;
+if ( vpmu_first_unload_cpu != VPMU_INVALID_CPU )
+{
+spin_unlock(xenpmu_mode_lock);
+return -EAGAIN;
+}
+
+old_mode = vpmu_mode;
+vpmu_mode = pmu_params.val;
+
+if ( vpmu_mode == XENPMU_MODE_OFF )
+{
+/* Make sure all (non-dom0) VCPUs have unloaded their VPMUs. */
+ret = vpmu_unload_all(old_mode);
+if ( ret )
+vpmu_mode = old_mode;
+}
+
+spin_unlock(xenpmu_mode_lock);
+
+break;
+}
+
+case XENPMU_mode_get:
+memset(pmu_params, 0, sizeof(pmu_params));
+pmu_params.val = vpmu_mode;
+
+pmu_params.version.maj = XENPMU_VER_MAJ;
+pmu_params.version.min = XENPMU_VER_MIN;
+
+if ( copy_to_guest(arg, pmu_params, 1) )
+return -EFAULT;
+break;
+
+case XENPMU_feature_set:
+if ( copy_from_guest(pmu_params, arg, 1) )
+return -EFAULT;
+
+if ( pmu_params.val  ~XENPMU_FEATURE_INTEL_BTS )
+return -EINVAL;
+
+vpmu_features = pmu_params.val;
+break;
+
+case XENPMU_feature_get:
+pmu_params.val = vpmu_features;
+if ( copy_field_to_guest(arg, pmu_params, val) )
+return -EFAULT;
+break;
+
+default:
+ret = -EINVAL;
+}
+
+return ret;
+}

Throughout this function, the pad field isn't being checked to be
zero (and XENPMU_feature_get also doesn't clear it, but that seems
secondary, at least as long as it's being declared IN only). As I'm
sure I said before - we ought to do this in order to be able to assign
meaning to the field later on. Perhaps it would even better be
renamed to e.g. mbz.


If we decide to start using this filed then presumably we will have to 
bump interface version. If pad field becomes something else it will only 
be considered when minor version is  1. (And I should probably add 
major version check.)


-boris

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 33931: regressions - FAIL

2015-01-30 Thread xen . org
flight 33931 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33931/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-winxpsp3  7 windows-install   fail REGR. vs. 33637

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 33637

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt   9 guest-start  fail   never pass
 test-armhf-armhf-xl  10 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt  9 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-libvirt  9 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stopfail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass

version targeted for testing:
 xen  9f7798eb6ea20ba94617772ed0f3b8862b826533
baseline version:
 xen  7106c691a6332cffab4037186d1caa3012ae051e


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Boris Ostrovsky boris.ostrov...@oracle.com
  Chao Peng chao.p.p...@linux.intel.com
  Dan Carpenter dan.carpen...@oracle.com
  Daniel De Graaf dgde...@tycho.nsa.gov
  David Scott dave.sc...@citrix.com
  David Vrabel david.vra...@citrix.com
  Dietmar Hahn dietmar.h...@ts.fujitsu.com
  Fabio Fantoni fabio.fant...@m2r.biz
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Julien Grall julien.gr...@linaro.org
  Kevin Tian kevin.t...@intel.com
  Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Oleksandr Tyshchenko oleksandr.tyshche...@globallogic.com
  Quan Xu quan...@intel.com
  Roger Pau Monné roger@citrix.com
  Sander Eikelenboom li...@eikelenboom.it
  Stefano Stabellini stefano.stabell...@eu.citrix.com
  Tim Deegan t...@xen.org
  Wei Liu wei.l...@citrix.com


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 

Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Andrew Cooper
On 30/01/15 15:46, Julien Grall wrote:
 Hello Denys,

 On 30/01/15 15:40, Denis Drozdov wrote:
 From: denys drozdov denys.droz...@globallogic.com

 Update RT scheduler to run on arm platform
 You need to give more background of the problem (i.e why you have to
 disable the IRQ on ARM).

 As the scheduler is common with x86, I would expect the problem is also
 happening on this architecture.

Changing a spinlock_irq into an irqsave is safe, functionally speaking,
but it is concerning that the scheduler appears to be called in
different interrupt states between architectures.

~Andrew


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues

2015-01-30 Thread Daniel Kiper
On Fri, Jan 30, 2015 at 03:34:19PM +, Jan Beulich wrote:
  On 30.01.15 at 16:09, daniel.ki...@oracle.com wrote:
  I suppose that we should provide additional kexec hypercall
  function which will return info about RS. kexec-tools should
  load new kernel as usual and add relevant argument to its
  command line. Most things are in place so we should just
  learn kexec-tools to do new things.

 There is a reason why the RS table info can't currently be
 obtained via a hypercall - Dom0 has nothing to do with it. Plus any
 kexec-ed kernel (Linux or other) will, under EFI, want this
 information, so a mechanism by which to pass the information to
 the secondary kernel without exposing it to entities not having a
 need to know would be preferable (albeit I have no idea so far
 how that might look like).

Currently, anybody able to use HYPERVISOR_memory_op hypercall on dom0
is able to get full machine memory map. So, what is the problem with
exposing more details about RS? Do you think about security? I suppose
that we expose to dom0 other hardware details which potentially could
be used in malicious way and RS details exposure will not undermine
our security so strong.

 Plus, this still doesn't in any way deal with the aspect that was
 so far discussed in this thread - SetVirtualAddressMap() being
 callable only once.

Well, we must live with it because this is part of UEFI spec. Or change
UEFI spec. Still thinking why spec does not allow OS do this operation
more then once. OTH, Konrad pointed out a solution (workaround) for this
issue used in Linux which seems sensible and could be used by us too.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC] p2m: p2m_mmio_direct set RW permissions

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 17:34, elena.ufimts...@oracle.com wrote:
 On Tue, Jan 27, 2015 at 08:24:36AM +, Jan Beulich wrote:
  On 26.01.15 at 18:30, elena.ufimts...@oracle.com wrote:
  On Mon, Jan 26, 2015 at 05:06:12PM +, Jan Beulich wrote:
   On 26.01.15 at 17:57, elena.ufimts...@oracle.com wrote:
   On Fri, Jan 23, 2015 at 10:50:23AM +, Jan Beulich wrote:
On 22.01.15 at 18:34, elena.ufimts...@oracle.com wrote:
(XEN)  d56f - d5fff000 (reserved)
   
   So this is where one of the RMRRs sits in (and also where
   the faults occur according to the two numbers you sent
   earlier, which - as others have already said - is an indication
   of the reported RMRRs being incomplete), ...
   
(XEN)  d5fff000 - d600 (usable)
(XEN)  d700 - df20 (reserved)
   
   ... and this is the exact range of the other one. But the usable
   entry between them is a sign of the firmware not doing the
   best job in assigning resources.
   
   I don't, btw, think that blindly mapping all the reserved regions
   into PVH Dom0's P2M would be (or is, if that's what's happening
   today) correct - these regions are named reserved for a
   reason. In the case here it's actually RAM, not MMIO, and
   Dom0 (as well as Xen) has no business accessing these (for others
   this may be different, e.g. the LAPIC and IO-APIC ones below,
   but Xen learns/knows of them by means different from looking
   at the memory map).
   
   I understand this this. At the same time I think pv dom0 does exactly
   this blind mapping. I also tried to map these regions as read-only and
   that worked. Can be this an option for these ram regions?
  
  No - they're reserved, so there shouldn't be _any_ access to them.
  The only possible workaround I see as acceptable would be the
  already proposed addition of a command line option allowing to
  specify additional RMRR-like regions.
  
  
  Understood. And I am guessing the permissions overloading option should
  be available as well? For example, RW or R only. RMRRs are always mapped 
  with RW.
 
 That's an option, but not a requirement imo.
 
  Why can be this a platform quirk?
 
 Did you mean can't? If not, I don't understand the question. If so,
 remember that we're talking about RAM allocated by the firmware
 for special purposes. Hence such a quirk would need tailoring to any
 particular published firmware version, and would need to take into
 account any differences in the memory use that may result from
 setting firmware options differently (and assuming that the memory
 use is deterministic across boots when the options don't change -
 I've seen systems where memory use differed between warm and
 cold boots). IOW, not something that's likely to be practical.
 
 What about pv dom0 case? pv dom0 maps these reserved memory ranges with
 RW access rights. Should be this unified in some way?

I'm again having some difficulty relating the question to the context
given. But anyway - where do you see Dom0 getting reserved
memory regions mapped RW unconditionally? The fact that
iommu_inclusive_mapping=1 by default is to cover firmware bugs
afaik. I could see us altering this at some point to only behave that
way on older systems. But that's a decision mainly to be made by
the VT-d maintainers.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Denys Drozdov
Julien,

The details of this issue can be found in the current mailing list.
Please have a look at
http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html.
This is exactly the same behaviour we are observing when running on our arm
setup.

The fix changes spin_lock_irq/spin_unlock_irq to
spin_lock_irq_save/spin_lock_irq_restore,
since context_save executed right from IRQ level. The arm interrupt
handling differs from x86. ARM is handling  context_saved with IRQ disabled
in CPSR, though x86 has IRQs on. Thus it is failing on ASSERT inside
spin_lock_irq when running on ARM. I guess it should work on x86, so this
issue is ARM platform specific.

On Fri, Jan 30, 2015 at 6:10 PM, Julien Grall julien.gr...@linaro.org
wrote:

 On 30/01/15 15:56, Andrew Cooper wrote:
  On 30/01/15 15:46, Julien Grall wrote:
  Hello Denys,
 
  On 30/01/15 15:40, Denis Drozdov wrote:
  From: denys drozdov denys.droz...@globallogic.com
 
  Update RT scheduler to run on arm platform
  You need to give more background of the problem (i.e why you have to
  disable the IRQ on ARM).
 
  As the scheduler is common with x86, I would expect the problem is also
  happening on this architecture.
 
  Changing a spinlock_irq into an irqsave is safe, functionally speaking,
  but it is concerning that the scheduler appears to be called in
  different interrupt states between architectures.

 For instance credit2 is also using vcpu_schedule_lock_irq... Although,
 IIRC, it's not used by default. So it has to be fixed too on ARM.

 In any case, the commit message needs more background such as stack
 trace and/or explaining why we have to take spinlock_irq.

 This may also need to document context_saved to explain the IRQs may not
 be enabled when this function is called.

 Regards,

 --
 Julien Grall




-- 

Denis Drozdov
GlobalLogic
M +(380)988969537  S denis.drozdov
www.globallogic.com
http://www.globallogic.com/
http://www.globallogic.com/email_disclaimer.txt
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 07/12] xen: Remove mem_event

2015-01-30 Thread Daniel De Graaf

On 01/29/2015 04:46 PM, Tamas K Lengyel wrote:

This is the third and final patch in the mem_event - vm_event renaming process
which removes all references and remaining code of mem_event from the Xen
source.

Signed-off-by: Tamas K Lengyel tamas.leng...@zentific.com
---
  MAINTAINERS|   1 -
  docs/misc/xsm-flask.txt|   1 -
  tools/libxc/Makefile   |   1 -
  tools/libxc/xc_mem_access.c|  16 +-
  tools/libxc/xc_mem_event.c | 162 -
  tools/libxc/xc_private.h   |  12 -
  xen/common/Makefile|   1 -
  xen/common/mem_event.c | 738 -
  xen/include/public/domctl.h|  90 +
  xen/include/public/mem_event.h | 192 ---
  xen/include/xen/mem_event.h| 143 
  xen/include/xen/sched.h|  41 ---
  xen/include/xsm/dummy.h|  12 -
  xen/include/xsm/xsm.h  |  12 -
  xen/xsm/dummy.c|   2 -
  xen/xsm/flask/hooks.c  |  12 -
  16 files changed, 14 insertions(+), 1422 deletions(-)
  delete mode 100644 tools/libxc/xc_mem_event.c
  delete mode 100644 xen/common/mem_event.c
  delete mode 100644 xen/include/public/mem_event.h
  delete mode 100644 xen/include/xen/mem_event.h


You should also remove mem_event from xen/xsm/flask/policy/access_vectors
in this patch.

--
Daniel De Graaf
National Security Agency

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Julien Grall
On 30/01/15 15:56, Andrew Cooper wrote:
 On 30/01/15 15:46, Julien Grall wrote:
 Hello Denys,

 On 30/01/15 15:40, Denis Drozdov wrote:
 From: denys drozdov denys.droz...@globallogic.com

 Update RT scheduler to run on arm platform
 You need to give more background of the problem (i.e why you have to
 disable the IRQ on ARM).

 As the scheduler is common with x86, I would expect the problem is also
 happening on this architecture.
 
 Changing a spinlock_irq into an irqsave is safe, functionally speaking,
 but it is concerning that the scheduler appears to be called in
 different interrupt states between architectures.

For instance credit2 is also using vcpu_schedule_lock_irq... Although,
IIRC, it's not used by default. So it has to be fixed too on ARM.

In any case, the commit message needs more background such as stack
trace and/or explaining why we have to take spinlock_irq.

This may also need to document context_saved to explain the IRQs may not
be enabled when this function is called.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Julien Grall
On 30/01/15 16:19, Denys Drozdov wrote:
 Julien,
 
 The details of this issue can be found in the current mailing list.
 Please have a look
 athttp://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html
 http://lists.xen.org/archives/html/xen-devel/2014-09/msg04250.html.
 This is exactly the same behaviour we are observing when running on our
 arm setup.

I understood the problem... I just pointed out that the commit message
is not useful. Hence the request of more background.

If we have to read the log in a couple of years time. We should be able
to understand the problem/solution with the commit message.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 3/3] xen/arm: Introduce support for Renesas R-Car Gen2 platform

2015-01-30 Thread Julien Grall
On 30/01/15 16:44, Oleksandr Tyshchenko wrote:
 On Fri, Jan 30, 2015 at 5:52 PM, Julien Grall julien.gr...@linaro.org wrote:
 Hi Oleksandr,
 Hi Julien
 

 On 30/01/15 15:38, Oleksandr Tyshchenko wrote:
 For example, such a simple thing as a formula for calculating divider
 value for uart baudrate can differs from one family to
 another despite the fact that these families use the same UART IP
 block named SCIF (clock may be differs - ext/int, another freq).
 Or FIFO size, etc.

 IHMO this look more configuration than logic change in the UART driver.
 We don't ask you to support all the configuration of the UART right now
 ;). However, if someone comes with a new platform using the SCIF, I
 expect him to reuse and fix this driver rather than creating a new one.

 FWIW, the compatible string is renesas,scif not rcar smth. So, SCIF
 sounds a better name for this UART.
 ok
 
 Just to clarify:
 - Should I remove all references to R-Car Gen2 from UART stuff?

Yes please.

 - Can I keep your Reviewed-by after there mechanical changes.

Sure.

 It would be nice to hear your and Ian's opinions about board file.

I'm not sure about this one. I will let Ian decides.

Regards,

-- 
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 16:09, daniel.ki...@oracle.com wrote:
 I suppose that we should provide additional kexec hypercall
 function which will return info about RS. kexec-tools should
 load new kernel as usual and add relevant argument to its
 command line. Most things are in place so we should just
 learn kexec-tools to do new things.

There is a reason why the RS table info can't currently be
obtained via a hypercall - Dom0 has nothing to do with it. Plus any
kexec-ed kernel (Linux or other) will, under EFI, want this
information, so a mechanism by which to pass the information to
the secondary kernel without exposing it to entities not having a
need to know would be preferable (albeit I have no idea so far
how that might look like).

Plus, this still doesn't in any way deal with the aspect that was
so far discussed in this thread - SetVirtualAddressMap() being
callable only once.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 3/3] xen/arm: Introduce support for Renesas R-Car Gen2 platform

2015-01-30 Thread Oleksandr Tyshchenko
On Fri, Jan 30, 2015 at 5:52 PM, Julien Grall julien.gr...@linaro.org wrote:
 Hi Oleksandr,
Hi Julien


 On 30/01/15 15:38, Oleksandr Tyshchenko wrote:
 For example, such a simple thing as a formula for calculating divider
 value for uart baudrate can differs from one family to
 another despite the fact that these families use the same UART IP
 block named SCIF (clock may be differs - ext/int, another freq).
 Or FIFO size, etc.

 IHMO this look more configuration than logic change in the UART driver.
 We don't ask you to support all the configuration of the UART right now
 ;). However, if someone comes with a new platform using the SCIF, I
 expect him to reuse and fix this driver rather than creating a new one.

 FWIW, the compatible string is renesas,scif not rcar smth. So, SCIF
 sounds a better name for this UART.
ok

Just to clarify:
- Should I remove all references to R-Car Gen2 from UART stuff?
- Can I keep your Reviewed-by after there mechanical changes.

It would be nice to hear your and Ian's opinions about board file.


 Regards,

 --
 Julien Grall



-- 

Oleksandr Tyshchenko | Embedded Dev
GlobalLogic
www.globallogic.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 2/3] x86/VPMU: Disable VPMU when NMI watchdog is on

2015-01-30 Thread Boris Ostrovsky
NMI watchdog sets APIC_LVTPC register to generate an NMI when PMU counter
overflow occurs. This may be overwritten by VPMU code later, effectively
turning off the watchdog.

We should disable VPMU when NMI watchdog is running.

Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
---
 docs/misc/xen-command-line.markdown |  2 ++
 xen/arch/x86/hvm/vpmu.c | 13 +
 2 files changed, 15 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 2274e74..bc316be 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1346,6 +1346,8 @@ wrong behaviour (see handle\_pmc\_quirk()).
 If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS)
 feature is switched on on Intel processors supporting this feature.
 
+Note that if **watchdog** option is also specified vpmu will be turned off.
+
 *Warning:*
 As the BTS virtualisation is not 100% safe and because of the nehalem quirk
 don't use the vpmu flag on production systems with Intel cpus!
diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c
index 63b2158..b2e8e65 100644
--- a/xen/arch/x86/hvm/vpmu.c
+++ b/xen/arch/x86/hvm/vpmu.c
@@ -24,6 +24,7 @@
 #include asm/regs.h
 #include asm/types.h
 #include asm/msr.h
+#include asm/nmi.h
 #include asm/hvm/support.h
 #include asm/hvm/vmx/vmx.h
 #include asm/hvm/vmx/vmcs.h
@@ -288,3 +289,15 @@ void vpmu_dump(struct vcpu *v)
 vpmu-arch_vpmu_ops-arch_vpmu_dump(v);
 }
 
+static int __init vpmu_init(void)
+{
+/* NMI watchdog uses LVTPC and HW counter */
+if ( opt_watchdog  opt_vpmu_enabled )
+{
+printk(XENLOG_WARNING NMI watchdog is enabled. Turning VPMU off.\n);
+opt_vpmu_enabled = 0;
+}
+
+return 0;
+}
+__initcall(vpmu_init);
-- 
1.8.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 1/3] x86/nmi: Cleanup usage of nmi_watchdog

2015-01-30 Thread Boris Ostrovsky
Use NMI_NONE when testing whether NMI watchdog is off.

Remove unused NMI_INVALID macro.

Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com
---
 xen/arch/x86/nmi.c | 4 ++--
 xen/arch/x86/traps.c   | 3 ++-
 xen/include/asm-x86/apic.h | 1 -
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
index 98c1e15..2ab97a0 100644
--- a/xen/arch/x86/nmi.c
+++ b/xen/arch/x86/nmi.c
@@ -148,7 +148,7 @@ int __init check_nmi_watchdog (void)
 int cpu;
 bool_t ok = 1;
 
-if ( !nmi_watchdog )
+if ( nmi_watchdog == NMI_NONE )
 return 0;
 
 printk(Testing NMI watchdog on all CPUs:);
@@ -361,7 +361,7 @@ static int __pminit setup_p4_watchdog(void)
 
 void __pminit setup_apic_nmi_watchdog(void)
 {
-if (!nmi_watchdog)
+if ( nmi_watchdog == NMI_NONE )
 return;
 
 switch (boot_cpu_data.x86_vendor) {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index ec324b0..f5516dc 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3385,7 +3385,8 @@ void do_nmi(const struct cpu_user_regs *regs)
 if ( nmi_callback(regs, cpu) )
 return;
 
-if ( !nmi_watchdog || (!nmi_watchdog_tick(regs)  watchdog_force) )
+if ( (nmi_watchdog == NMI_NONE) ||
+ (!nmi_watchdog_tick(regs)  watchdog_force) )
 handle_unknown = 1;
 
 /* Only the BSP gets external NMIs from the system. */
diff --git a/xen/include/asm-x86/apic.h b/xen/include/asm-x86/apic.h
index 6697245..be9a535 100644
--- a/xen/include/asm-x86/apic.h
+++ b/xen/include/asm-x86/apic.h
@@ -221,7 +221,6 @@ extern unsigned int nmi_watchdog;
 #define NMI_NONE   0
 #define NMI_IO_APIC1
 #define NMI_LOCAL_APIC 2
-#define NMI_INVALID3
 
 #else /* !CONFIG_X86_LOCAL_APIC */
 static inline int lapic_suspend(void) {return 0;}
-- 
1.8.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] rcu_sched self-detect stall when disable vif device

2015-01-30 Thread David Vrabel
On 28/01/15 17:27, Julien Grall wrote:
 On 28/01/15 17:06, David Vrabel wrote:
 On 28/01/15 16:45, Julien Grall wrote:
 On 27/01/15 16:53, Wei Liu wrote:
 On Tue, Jan 27, 2015 at 04:47:45PM +, Julien Grall wrote:
 On 27/01/15 16:45, Wei Liu wrote:
 On Tue, Jan 27, 2015 at 04:03:52PM +, Julien Grall wrote:
 Hi,

 While I'm working on support for 64K page in netfront, I got
 an rcu_sced self-detect message. It happens when netback is
 disabling the vif device due to an error.

 I'm using Linux 3.19-rc5 on seattle (ARM64). Any idea why
 the processor is stucked in xenvif_rx_queue_purge?


 When you try to release a SKB, core network driver need to enter some
 RCU cirital region to clean up. dst_release for one, calls call_rcu.

 But this message shouldn't happen in normal condition or because of
 netfront. Right?


 Never saw  report like this before, even in the case that netfront is
 buggy.

 This is only happening when preemption is not enabled (i.e
 CONFIG_PREEMPT_NONE in the config file) in the backend kernel.

 When the vif is disabled, the loop in xenvif_kthread_guest_rx turned
 into an infinite loop. In my case, the code executed looks like:


  1. for (;;) {
  2. xenvif_wait_for_rx_work(queue);
  3.
  4. if (kthread_should_stop())
  5. break;
  6.
  7. if (unlikely(vif-disabled  queue-id == 0) {
  8. xenvif_carrier_off(vif);
  9. xenvif_rx_queue_purge(queue);
 10. continue;
 11. }
 12. }

 The wait on line 2 will return directly because the vif is disabled
 (see xenvif_have_rx_work)

 We are on queue 0, so the condition on line 7 is true. Therefore we will
 loop on line 10. And so on...

 On platform where preemption is not enabled, this thread will never
 yield/give the hand to another thread (unless the domain is destroyed).

 I'm not sure why we have a continue in the vif-disabled case and not
 just a break.  Can you try that?
 
 So I applied this small patches:
 
 diff --git a/drivers/net/xen-netback/netback.c 
 b/drivers/net/xen-netback/netback.c
 index 908e65e..9448c6c 100644
 --- a/drivers/net/xen-netback/netback.c
 +++ b/drivers/net/xen-netback/netback.c
 @@ -2110,7 +2110,7 @@ int xenvif_kthread_guest_rx(void *data)
 if (unlikely(vif-disabled  queue-id == 0)) {
 xenvif_carrier_off(vif);
 xenvif_rx_queue_purge(queue);
 -   continue;
 +   break;
 }
  
 if (!skb_queue_empty(queue-rx_queue))

How about this?

8--
xen-netback: stop the guest rx thread after a fatal error

After commit e9d8b2c2968499c1f96563e6522c56958d5a1d0d (xen-netback:
disable rogue vif in kthread context), a fatal (protocol) error would
leave the guest Rx thread spinning, wasting CPU time.  Commit
ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e (xen-netback: reintroduce
guest Rx stall detection) made this even worse by removing a
cond_resched() from this path.

A fatal error is non-recoverable so just allow the guest Rx thread to
exit.  This requires taking additional refs to the task so the thread
exiting early is handled safely.

Signed-off-by: David Vrabel david.vra...@citrix.com

diff --git a/drivers/net/xen-netback/interface.c
b/drivers/net/xen-netback/interface.c
index 9259a73..037f74f 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -578,6 +578,7 @@ int xenvif_connect(struct xenvif_queue *queue,
unsigned long tx_ring_ref,
goto err_rx_unbind;
}
queue-task = task;
+   get_task_struct(task);

task = kthread_create(xenvif_dealloc_kthread,
  (void *)queue, %s-dealloc, queue-name);
@@ -634,6 +635,7 @@ void xenvif_disconnect(struct xenvif *vif)

if (queue-task) {
kthread_stop(queue-task);
+   put_task_struct(queue-task);
queue-task = NULL;
}

diff --git a/drivers/net/xen-netback/netback.c
b/drivers/net/xen-netback/netback.c
index 908e65e..c8ce701 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -2109,8 +2109,7 @@ int xenvif_kthread_guest_rx(void *data)
 */
if (unlikely(vif-disabled  queue-id == 0)) {
xenvif_carrier_off(vif);
-   xenvif_rx_queue_purge(queue);
-   continue;
+   break;
}

if (!skb_queue_empty(queue-rx_queue))


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC] p2m: p2m_mmio_direct set RW permissions

2015-01-30 Thread Elena Ufimtseva
On Tue, Jan 27, 2015 at 08:24:36AM +, Jan Beulich wrote:
  On 26.01.15 at 18:30, elena.ufimts...@oracle.com wrote:
  On Mon, Jan 26, 2015 at 05:06:12PM +, Jan Beulich wrote:
   On 26.01.15 at 17:57, elena.ufimts...@oracle.com wrote:
   On Fri, Jan 23, 2015 at 10:50:23AM +, Jan Beulich wrote:
On 22.01.15 at 18:34, elena.ufimts...@oracle.com wrote:
(XEN)  d56f - d5fff000 (reserved)
   
   So this is where one of the RMRRs sits in (and also where
   the faults occur according to the two numbers you sent
   earlier, which - as others have already said - is an indication
   of the reported RMRRs being incomplete), ...
   
(XEN)  d5fff000 - d600 (usable)
(XEN)  d700 - df20 (reserved)
   
   ... and this is the exact range of the other one. But the usable
   entry between them is a sign of the firmware not doing the
   best job in assigning resources.
   
   I don't, btw, think that blindly mapping all the reserved regions
   into PVH Dom0's P2M would be (or is, if that's what's happening
   today) correct - these regions are named reserved for a
   reason. In the case here it's actually RAM, not MMIO, and
   Dom0 (as well as Xen) has no business accessing these (for others
   this may be different, e.g. the LAPIC and IO-APIC ones below,
   but Xen learns/knows of them by means different from looking
   at the memory map).
   
   I understand this this. At the same time I think pv dom0 does exactly
   this blind mapping. I also tried to map these regions as read-only and
   that worked. Can be this an option for these ram regions?
  
  No - they're reserved, so there shouldn't be _any_ access to them.
  The only possible workaround I see as acceptable would be the
  already proposed addition of a command line option allowing to
  specify additional RMRR-like regions.
  
  
  Understood. And I am guessing the permissions overloading option should
  be available as well? For example, RW or R only. RMRRs are always mapped 
  with RW.
 
 That's an option, but not a requirement imo.
 
  Why can be this a platform quirk?
 
 Did you mean can't? If not, I don't understand the question. If so,
 remember that we're talking about RAM allocated by the firmware
 for special purposes. Hence such a quirk would need tailoring to any
 particular published firmware version, and would need to take into
 account any differences in the memory use that may result from
 setting firmware options differently (and assuming that the memory
 use is deterministic across boots when the options don't change -
 I've seen systems where memory use differed between warm and
 cold boots). IOW, not something that's likely to be practical.
 
 Jan

What about pv dom0 case? pv dom0 maps these reserved memory ranges with
RW access rights. Should be this unified in some way?

 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor

2015-01-30 Thread Luis R. Rodriguez
On Fri, Jan 30, 2015 at 08:37:33PM +, Andrew Cooper wrote:
 On 30/01/15 19:51, Luis R. Rodriguez wrote:
  On Fri, Jan 30, 2015 at 02:23:48PM +, Andrew Cooper wrote:
  On 30/01/15 01:14, Luis R. Rodriguez wrote:
  From: Luis R. Rodriguez mcg...@suse.com
 
  There are several ways that a Xen system can update the
  CPU microcode on a pvops kernel [0] now, the preferred way
  is through the early microcode update mechanism. At run
  time folks should use this new xenmicrocode tool and use
  the same CPU microcode file as present on /lib/firmware.
 
  Some distributions may use the historic sysfs rescan interface,
  users of that mechanism should be aware that the interface will
  not be available when using Xen and as such should first check
  the presence of the interface before usage, as an alternative
  this xenmicrocode tool can be used on priviledged domains.
 
  Folks wishing to update CPU microcode at run time should be
  aware that not all CPU microcode can be updated on a system
  and should take care to ensure that only known-to-work and
  supported CPU microcode updates are used [0]. To avoid issues
  with delays on the hypercall / microcode update this
  implementation will quiesce all domains prior to updating
  the microcode, and then only queisce'd domains will be
  unpaused once done.
 
  [0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update
 
  Based on original work by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Cc: Andrew Cooper andrew.coop...@citrix.com
  Cc: Borislav Petkov b...@suse.de
  Cc: Takashi Iwai ti...@suse.de
  Cc: Olaf Hering oher...@suse.de
  Cc: Jan Beulich jbeul...@suse.com
  Cc: Jason Douglas jdoug...@suse.com
  Cc: Juergen Gross jgr...@suse.com
  Cc: Michal Marek mma...@suse.cz
  Cc: Henrique de Moraes Holschuh h...@debian.org
  Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Signed-off-by: Luis R. Rodriguez mcg...@suse.com
  ---
 
  Just wrote this, haven't tested it. This does some queiscing prior
  to doing the microcode update. The quiescing is done by pausing
  all domains. Once the microcode update is done we only unpause
  domains which we queisced as part of our work. Let me know if this
  is on the right track to help avoid issues noted on the list.
  There is also a TOCTOU race with your paused check, which itself is
  buggy as it you should unconditionally pause all VMs (userspace pause
  refcounting has been fixed for a little while now).
  Also, currently __domain_pause_by_systemcontroller() has a limit to 255
  guests. Are the fixes that you describe to the refcounting sufficient
  to remove that limitation from __domain_pause_by_systemcontroller()?
 
 The limit is the number of concurrent userspace refs taken on an
 individual domain.  I.e. you can plausibly have 255 different bits of
 the toolstack each taking a pause reference for a specific reason.
 
 255 was chosen an arbitrary limit, which is used to prevent buggy
 toolstacks from being able to overflow the refcounts used by the
 scheduler by using the pause domain hypercall 4 billion times.
 
 
  My implementation uses 1024 but has no check on nb_domain (the amount of
  domains set) so that requires fixing as well but I figure we should also
  review the above first too. Artificial limits bother me and I went with
  1024 as I also saw that limit used elsewhere, not sure if it was a stack
  concern or what.
 
  However, xenmicrocode (even indirectly via xc_microcode_update()) is not
  in a position to safely pause all domains as there is no interlock to
  preventing a new domain being created.  Even if all domains do get
  successfully paused, 
  I did think about this and figured we could use this as a segway into
  a discussion about how we would want to implement this sort of
  interlocking or see if there is anything available already for this
  sort of thing. Also, are there other future users of this that could benefit
  from it ? If so then perhaps we can wrap the requirements up together.
 
  the idle loops are substantially less trivial than
  ideal.
  Interesting, can you elaborate on the possible issues that might creep
  up on the idle loops while a guest is paused during a microcode update?
  Any single issue would suffice, just curious.
 
  Do we need something more intricate than pause implemented then? Using
  suspend seemed rather grotesque to shove down a guest's throat. If
  pause / suspend do not suffice perhaps some new artificial virtual
  tempory quiesce is needed to ensure integrity here, which would address
  some of the idle concerns you highligted might creep up.
 
  The toolstack should not hack around hypervisor bugs, and indeed is not
  capable of doing so.
  Agreed. I figured I'd at least do what I can with what is available
  and use this as a discussoin for what is the Right Thing To Do (TM)
  in the future.
 
 The right thing to do is to fix the hypercall implementation, at which

Re: [Xen-devel] [PATCHv5 00/14] xen: fix many long-standing grant mapping bugs

2015-01-30 Thread Konrad Rzeszutek Wilk
On Tue, Jan 27, 2015 at 04:44:02PM +, David Vrabel wrote:
 This series fixes a number of long-standing bugs in the handling of
 grant maps.  Refer to the following for all the details.
 
   http://xenbits.xen.org/people/dvrabel/grant-improvements-C.pdf
 
 In summary, the important uses that this enables are:
 
 1. Block backends can use networked storage safely.
 
 2. Block backends can use network storage provided by other guests on
the same host.
 
 3. User space block backends can use direct I/O or asynchronous I/O.
 
 The first two patches are the core MM changes necessary.  These have
 already been acked.
 
 Patches #3 and #4 remove existing (broken) mechanisms.  This does
 temporarily break some previously working use cases, but it does make
 the subsequent additions much easier to review.
 
 Only patches #3, #4, and #13 need acks from other Xen maintainers.

I took a look at the patches and they look good to me.

 
 As a happy side effect, performance is also likely to be improved in
 some areas (but I've not got any measurements yet).  User space
 backends using grant mapping should see some good improvements from
 reduced overheads and better unmap batching.  VIF to VIF network
 traffic may also see a small improvement.
 
 Finally, thanks to Jenny who did much of the implementation.
 
 Changes in v5:
 - Fix gntdev for auto-xlate guests.
 
 Changes in v4:
 - Removed xen_blkbk_unmap_done().
 - Fixed bug in blkback when using mixed persistent and non-persistent
   grants.
 
 Changes in v3:
 - find_page renamed to find_special_page.
 - Added documentation for mm changes.
 - Fixed mangled forward port of blkback's safe unmap patch.
 - Export gnttab_alloc_pages() and gnttab_free_pages().
 - Fix 32-bit build.
 
 Changes in v2:
 - Add find_page() VMA op instead of pages field (so struct
   vm_area_struct doesn't increase in size.
 - Make foreign page tracking arch-independant and improve the API.
 - Alloc extra memory (for 32-bit archs) for the (domain, gref) when
   allocating the page (instead of during the map).
 - Convert gntdev's lock to a mutex.
 
 David
 
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor

2015-01-30 Thread Boris Ostrovsky

On 01/30/2015 03:37 PM, Andrew Cooper wrote:



The actions Xen needs to take are:

- Copy the buffer into Xen.
- Scan the buffer for the correct patch



Why not have the toolstack search for the right patch? Hypervisor will 
verify that it's appropriate but won't have to spend time scanning the 
(potentially large) buffer.


(The logic for scanning the buffer is in the hypervisor code but we 
could move it to a library)


-boris



- Rendezvous all online cpus in an IPI to apply the patch, and keep the
processors in until all have completed the patch.




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor

2015-01-30 Thread Andrew Cooper
On 30/01/2015 21:24, Boris Ostrovsky wrote:
 On 01/30/2015 03:37 PM, Andrew Cooper wrote:


 The actions Xen needs to take are:

 - Copy the buffer into Xen.
 - Scan the buffer for the correct patch


 Why not have the toolstack search for the right patch? Hypervisor will
 verify that it's appropriate but won't have to spend time scanning the
 (potentially large) buffer.

 (The logic for scanning the buffer is in the hypervisor code but we
 could move it to a library)

a) Xen already needs to scan for the boot time patching and b) because
the interface already has this property.

One could certainly prepare the patch in userspace and that would be
more efficient (patches welcome if you are feeling particularly keen),
but I am not going to loose sleep over it.  It is likely that the
majority of users will delay a microcode update until a maintenance
window and plan time for a reboot.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 13/18] efi: create efi_tables()

2015-01-30 Thread Daniel Kiper
..which collects system tables data. We need this to support
multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
 xen/common/efi/boot.c |   51 -
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 114019e..cf0fbc2 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -707,6 +707,34 @@ static UINTN __init 
efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop,
 return gop_mode;
 }
 
+static void __init efi_tables(void)
+{
+unsigned int i;
+
+/* Obtain basic table pointers. */
+for ( i = 0; i  efi_num_ct; ++i )
+{
+static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID;
+static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID;
+static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID;
+static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID;
+
+if ( match_guid(acpi2_guid, efi_ct[i].VendorGuid) )
+  efi.acpi20 = (long)efi_ct[i].VendorTable;
+if ( match_guid(acpi_guid, efi_ct[i].VendorGuid) )
+  efi.acpi = (long)efi_ct[i].VendorTable;
+if ( match_guid(mps_guid, efi_ct[i].VendorGuid) )
+  efi.mps = (long)efi_ct[i].VendorTable;
+if ( match_guid(smbios_guid, efi_ct[i].VendorGuid) )
+  efi.smbios = (long)efi_ct[i].VendorTable;
+}
+
+#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
+if (efi.smbios != EFI_INVALID_TABLE_ADDR)
+dmi_efi_get_table((void *)(long)efi.smbios);
+#endif
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -1027,28 +1055,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
 /* XXX Collect EDID info. */
 efi_arch_cpu();
 
-/* Obtain basic table pointers. */
-for ( i = 0; i  efi_num_ct; ++i )
-{
-static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID;
-static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID;
-static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID;
-static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID;
-
-if ( match_guid(acpi2_guid, efi_ct[i].VendorGuid) )
-  efi.acpi20 = (long)efi_ct[i].VendorTable;
-if ( match_guid(acpi_guid, efi_ct[i].VendorGuid) )
-  efi.acpi = (long)efi_ct[i].VendorTable;
-if ( match_guid(mps_guid, efi_ct[i].VendorGuid) )
-  efi.mps = (long)efi_ct[i].VendorTable;
-if ( match_guid(smbios_guid, efi_ct[i].VendorGuid) )
-  efi.smbios = (long)efi_ct[i].VendorTable;
-}
-
-#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
-if (efi.smbios != EFI_INVALID_TABLE_ADDR)
-dmi_efi_get_table((void *)(long)efi.smbios);
-#endif
+efi_tables();
 
 /* Collect PCI ROM contents. */
 setup_efi_pci();
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 09/18] efi: create efi_init()

2015-01-30 Thread Daniel Kiper
..which initializes basic EFI variables. We need this to support
multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
 xen/common/efi/boot.c |   26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 89d081d..1bf88e4 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -586,6 +586,21 @@ static char *__init get_value(const struct file *cfg, 
const char *section,
 return NULL;
 }
 
+static void __init efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
+{
+efi_ih = ImageHandle;
+efi_bs = SystemTable-BootServices;
+efi_rs = SystemTable-RuntimeServices;
+efi_ct = SystemTable-ConfigurationTable;
+efi_num_ct = SystemTable-NumberOfTableEntries;
+efi_version = SystemTable-Hdr.Revision;
+efi_fw_vendor = SystemTable-FirmwareVendor;
+efi_fw_revision = SystemTable-FirmwareRevision;
+
+StdOut = SystemTable-ConOut;
+StdErr = SystemTable-StdErr ?: StdOut;
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -713,17 +728,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
*SystemTable)
 efi_loader = 1;
 #endif
 
-efi_ih = ImageHandle;
-efi_bs = SystemTable-BootServices;
-efi_rs = SystemTable-RuntimeServices;
-efi_ct = SystemTable-ConfigurationTable;
-efi_num_ct = SystemTable-NumberOfTableEntries;
-efi_version = SystemTable-Hdr.Revision;
-efi_fw_vendor = SystemTable-FirmwareVendor;
-efi_fw_revision = SystemTable-FirmwareRevision;
+efi_init(ImageHandle, SystemTable);
 
-StdOut = SystemTable-ConOut;
-StdErr = SystemTable-StdErr ?: StdOut;
 use_cfg_file = efi_arch_use_config_file(SystemTable);
 
 status = efi_bs-HandleProtocol(ImageHandle, loaded_image_guid,
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 03/18] x86/boot: use %ecx instead of %eax

2015-01-30 Thread Daniel Kiper
Use %ecx instead of %eax to store low memory upper limit from EBDA.
This way we do not wipe multiboot protocol identifier. It is needed
in reloc() to differentiate between multiboot (v1) and
multiboot2 protocol.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/boot/head.S |   27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index c99f739..6180783 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -86,14 +86,14 @@ __start:
 jne not_multiboot
 
 /* Set up trampoline segment 64k below EBDA */
-movzwl  0x40e,%eax  /* EBDA segment */
-cmp $0xa000,%eax/* sanity check (high) */
+movzwl  0x40e,%ecx  /* EBDA segment */
+cmp $0xa000,%ecx/* sanity check (high) */
 jae 0f
-cmp $0x4000,%eax/* sanity check (low) */
+cmp $0x4000,%ecx/* sanity check (low) */
 jae 1f
 0:
-movzwl  0x413,%eax  /* use base memory size on failure */
-shl $10-4,%eax
+movzwl  0x413,%ecx  /* use base memory size on failure */
+shl $10-4,%ecx
 1:
 /*
  * Compare the value in the BDA with the information from the
@@ -105,21 +105,22 @@ __start:
 cmp $0x100,%edx /* is the multiboot value too small? */
 jb  2f  /* if so, do not use it */
 shl $10-4,%edx
-cmp %eax,%edx   /* compare with BDA value */
-cmovb   %edx,%eax   /* and use the smaller */
+cmp %ecx,%edx   /* compare with BDA value */
+cmovb   %edx,%ecx   /* and use the smaller */
 
 2:  /* Reserve 64kb for the trampoline */
-sub $0x1000,%eax
+sub $0x1000,%ecx
 
 /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
-xor %al, %al
-shl $4, %eax
-mov %eax,sym_phys(trampoline_phys)
+xor %cl, %cl
+shl $4, %ecx
+mov %ecx,sym_phys(trampoline_phys)
 
 /* Save the Multiboot info struct (after relocation) for later use. */
 mov $sym_phys(cpu0_stack)+1024,%esp
-push%ebx
-callreloc
+mov %ecx,%eax
+push%ebx/* Multiboot information address */
+callreloc   /* %eax contains trampoline address */
 mov %eax,sym_phys(multiboot_ptr)
 
 /* Initialize BSS (no nasty surprises!) */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 02/18] x86/boot/reloc: create generic alloc and copy functions

2015-01-30 Thread Daniel Kiper
Create generic alloc and copy functions. We need
separate tools for memory allocation and copy to
provide multiboot2 protocol support.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
 xen/arch/x86/boot/reloc.c |   52 -
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index 63045c0..f964cda 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -33,9 +33,10 @@ asm (
 typedef unsigned int u32;
 #include ../../../include/xen/multiboot.h
 
-static void *reloc_mbi_struct(void *old, unsigned int bytes)
+static u32 alloc_struct(u32 bytes)
 {
-void *new;
+u32 s;
+
 asm(
 call 1f  \n
 1:  pop  %%edx   \n
@@ -43,50 +44,63 @@ static void *reloc_mbi_struct(void *old, unsigned int bytes)
 sub  %1,%0   \n
 and  $~15,%0 \n
 mov  %0,alloc-1b(%%edx)  \n
-mov  %0,%%edi\n
-rep  movsb   \n
-   : =r (new), +c (bytes), +S (old)
-   : : edx, edi, memory);
-return new;
+   : =r (s) : r (bytes) : edx, memory);
+
+return s;
+}
+
+static u32 copy_struct(u32 src, u32 bytes)
+{
+u32 dst, dst_asm;
+
+dst = alloc_struct(bytes);
+dst_asm = dst;
+
+asm volatile(rep movsb : +S (src), +D (dst_asm), +c (bytes) : : 
memory);
+
+return dst;
 }
 
-static char *reloc_mbi_string(char *old)
+static u32 copy_string(u32 src)
 {
 char *p;
-for ( p = old; *p != '\0'; p++ )
+
+if ( src == 0 )
+return 0;
+
+for ( p = (char *)src; *p != '\0'; p++ )
 continue;
-return reloc_mbi_struct(old, p - old + 1);
+
+return copy_struct(src, p - (char *)src + 1);
 }
 
 multiboot_info_t *reloc(multiboot_info_t *mbi_old)
 {
-multiboot_info_t *mbi = reloc_mbi_struct(mbi_old, sizeof(*mbi));
+multiboot_info_t *mbi = (multiboot_info_t *)copy_struct((u32)mbi_old, 
sizeof(*mbi));
 int i;
 
 if ( mbi-flags  MBI_CMDLINE )
-mbi-cmdline = (u32)reloc_mbi_string((char *)mbi-cmdline);
+mbi-cmdline = copy_string(mbi-cmdline);
 
 if ( mbi-flags  MBI_MODULES )
 {
-module_t *mods = reloc_mbi_struct(
-(module_t *)mbi-mods_addr, mbi-mods_count * sizeof(module_t));
+module_t *mods = (module_t *)copy_struct(
+mbi-mods_addr, mbi-mods_count * sizeof(module_t));
 
 mbi-mods_addr = (u32)mods;
 
 for ( i = 0; i  mbi-mods_count; i++ )
 {
 if ( mods[i].string )
-mods[i].string = (u32)reloc_mbi_string((char *)mods[i].string);
+mods[i].string = copy_string(mods[i].string);
 }
 }
 
 if ( mbi-flags  MBI_MEMMAP )
-mbi-mmap_addr = (u32)reloc_mbi_struct(
-(memory_map_t *)mbi-mmap_addr, mbi-mmap_length);
+mbi-mmap_addr = copy_struct(mbi-mmap_addr, mbi-mmap_length);
 
 if ( mbi-flags  MBI_LOADERNAME )
-mbi-boot_loader_name = (u32)reloc_mbi_string(
-(char *)mbi-boot_loader_name);
+mbi-boot_loader_name = copy_string(mbi-boot_loader_name);
 
 /* Mask features we don't understand or don't relocate. */
 mbi-flags = (MBI_MEMLIMITS |
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 07/18] efi: run EFI specific code on EFI platform only

2015-01-30 Thread Daniel Kiper
Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
 xen/arch/x86/shutdown.c  |3 ++-
 xen/common/efi/boot.c|5 +
 xen/common/efi/runtime.c |6 ++
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/shutdown.c b/xen/arch/x86/shutdown.c
index 21f6cf5..1c8336f 100644
--- a/xen/arch/x86/shutdown.c
+++ b/xen/arch/x86/shutdown.c
@@ -504,7 +504,8 @@ void machine_restart(unsigned int delay_millisecs)
 tboot_shutdown(TB_SHUTDOWN_REBOOT);
 }
 
-efi_reset_system(reboot_mode != 0);
+if ( efi_platform )
+efi_reset_system(reboot_mode != 0);
 
 /* Rebooting needs to touch the page at absolute address 0. */
 *((unsigned short *)__va(0x472)) = reboot_mode;
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 8aafcfd..89d081d 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -1154,6 +1154,11 @@ void __init efi_init_memory(void)
 } *extra, *extra_head = NULL;
 #endif
 
+#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
+if ( !efi_platform )
+return;
+#endif
+
 printk(XENLOG_INFO EFI memory map:\n);
 for ( i = 0; i  efi_memmap_size; i += efi_mdesc_size )
 {
diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index b229c69..7c767e4 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -164,6 +164,9 @@ int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
 {
 unsigned int i, n;
 
+if ( !efi_platform )
+return -ENOSYS;
+
 switch ( idx )
 {
 case XEN_FW_EFI_VERSION:
@@ -298,6 +301,9 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op)
 EFI_STATUS status = EFI_NOT_STARTED;
 int rc = 0;
 
+if ( !efi_platform )
+return -ENOSYS;
+
 switch ( op-function )
 {
 case XEN_EFI_get_time:
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 12/13] xen/iommu: smmu: Add Xen specific code to be able to use the driver

2015-01-30 Thread Julien Grall
The main goal is to modify as little the Linux code to be able to port
easily new feature added in Linux repo for the driver.

To achieve that we:
- Add helpers to Linux function not implemented on Xen
- Add callbacks used by Xen to do our own stuff and call Linux ones
- Only modify when required the code which comes from Linux. If so a
comment has been added with /* Xen: ... */ explaining why it's
necessary.

The support for PCI has been commented because it's not yet supported by
Xen ARM and therefore won't compile.

Signed-off-by: Julien Grall julien.gr...@linaro.org

---
Changes in v2:
- Add the ACCESS_ONCE definition in the drivers. The patch to
introduce the one in common code has been dropped.
- The include xen/device.h has been dropped in favor of
asm/device.h
---
 xen/drivers/passthrough/arm/Makefile |   1 +
 xen/drivers/passthrough/arm/smmu.c   | 678 +++
 2 files changed, 612 insertions(+), 67 deletions(-)

diff --git a/xen/drivers/passthrough/arm/Makefile 
b/xen/drivers/passthrough/arm/Makefile
index 0484b79..f4cd26e 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1 +1,2 @@
 obj-y += iommu.o
+obj-y += smmu.o
diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 8a6514f..373eee8 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -18,6 +18,13 @@
  *
  * Author: Will Deacon will.dea...@arm.com
  *
+ * Based on Linux drivers/iommu/arm-smmu.c
+ * = commit e6b5be2be4e30037eb551e0ed09dd97bd00d85d3
+ *
+ * Xen modification:
+ * Julien Grall julien.gr...@linaro.org
+ * Copyright (C) 2014 Linaro Limited.
+ *
  * This driver currently supports:
  * - SMMUv1 and v2 implementations
  * - Stream-matching and stream-indexing
@@ -28,26 +35,164 @@
  * - Context fault reporting
  */
 
-#define pr_fmt(fmt) arm-smmu:  fmt
 
-#include linux/delay.h
-#include linux/dma-mapping.h
-#include linux/err.h
-#include linux/interrupt.h
-#include linux/io.h
-#include linux/iommu.h
-#include linux/mm.h
-#include linux/module.h
-#include linux/of.h
-#include linux/pci.h
-#include linux/platform_device.h
-#include linux/slab.h
-#include linux/spinlock.h
-#include linux/bitops.h
+#include xen/config.h
+#include xen/delay.h
+#include xen/errno.h
+#include xen/err.h
+#include xen/irq.h
+#include xen/lib.h
+#include xen/list.h
+#include xen/mm.h
+#include xen/vmap.h
+#include xen/rbtree.h
+#include xen/sched.h
+#include xen/sizes.h
+#include asm/atomic.h
+#include asm/device.h
+#include asm/io.h
+#include asm/platform.h
+
+/* Xen: The below defines are redefined within the file. Undef it */
+#undef SCTLR_AFE
+#undef SCTLR_TRE
+#undef SCTLR_M
+#undef TTBCR_EAE
+
+/* Alias to Xen device tree helpers */
+#define device_node dt_device_node
+#define of_phandle_args dt_phandle_args
+#define of_device_id dt_device_match
+#define of_match_node dt_match_node
+#define of_property_read_u32(np, pname, out) (!dt_property_read_u32(np, pname, 
out))
+#define of_property_read_bool dt_property_read_bool
+#define of_parse_phandle_with_args dt_parse_phandle_with_args
+
+/* Xen: Helpers to get device MMIO and IRQs */
+struct resource
+{
+   u64 addr;
+   u64 size;
+   unsigned int type;
+};
+
+#define resource_size(res) (res)-size;
+
+#define platform_device dt_device_node
+
+#define IORESOURCE_MEM 0
+#define IORESOURCE_IRQ 1
+
+static struct resource *platform_get_resource(struct platform_device *pdev,
+ unsigned int type,
+ unsigned int num)
+{
+   /*
+* The resource is only used between 2 calls of platform_get_resource.
+* It's quite ugly but it's avoid to add too much code in the part
+* imported from Linux
+*/
+   static struct resource res;
+   int ret = 0;
+
+   res.type = type;
+
+   switch (type) {
+   case IORESOURCE_MEM:
+   ret = dt_device_get_address(pdev, num, res.addr, res.size);
+
+   return ((ret) ? NULL : res);
+
+   case IORESOURCE_IRQ:
+   ret = platform_get_irq(pdev, num);
+   if (ret  0)
+   return NULL;
+
+   res.addr = ret;
+   res.size = 1;
+
+   return res;
+
+   default:
+   return NULL;
+   }
+}
+
+/* Alias to Xen IRQ functions */
+#define request_irq(irq, func, flags, name, dev) request_irq(irq, flags, func, 
name, dev)
+#define free_irq release_irq
+
+/*
+ * Device logger functions
+ * TODO: Handle PCI
+ */
+#define dev_print(dev, lvl, fmt, ...)  
\
+printk(lvl smmu: %s:  fmt, dt_node_full_name(dev_to_dt(dev)), ## 
__VA_ARGS__)
+
+#define dev_dbg(dev, fmt, ...) dev_print(dev, XENLOG_DEBUG, fmt, ## 
__VA_ARGS__)
+#define dev_notice(dev, fmt, ...) dev_print(dev, 

[Xen-devel] [PATCH v3 02/13] xen/arm: vgic: Drop unecessary include asm/device.h

2015-01-30 Thread Julien Grall
The header asm/device.h has been included in the vgic code during
splitting to support multiple version. But no code within those files
requires it.

Signed-off-by: Julien Grall julien.gr...@linaro.org
Acked-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

---
Changes in v3:
- Add Stefano's ack
---
 xen/arch/arm/vgic-v2.c | 1 -
 xen/arch/arm/vgic-v3.c | 1 -
 2 files changed, 2 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 515faf7..cd8f7d1 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -26,7 +26,6 @@
 #include xen/sched.h
 
 #include asm/current.h
-#include asm/device.h
 
 #include asm/mmio.h
 #include asm/gic.h
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index bece189..cab3b63 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -27,7 +27,6 @@
 #include xen/sched.h
 #include xen/sizes.h
 #include asm/current.h
-#include asm/device.h
 #include asm/mmio.h
 #include asm/gic_v3_defs.h
 #include asm/gic.h
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send

2015-01-30 Thread Don Slutz
On 01/30/15 05:53, Paul Durrant wrote:
 -Original Message-
 From: Don Slutz [mailto:dsl...@verizon.com]
 Sent: 30 January 2015 00:53
 To: xen-devel@lists.xen.org
 Cc: Andrew Cooper; George Dunlap; Ian Campbell; Ian Jackson; Jan Beulich;
 Keir (Xen.org); Stefano Stabellini; Wei Liu; Paul Durrant; Don Slutz
 Subject: [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send

 This saves a VMENTRY and a VMEXIT since we not longer retry the
 ioport read.

 hvmemul_do_io() will retry in order to complete an ioport read when
 hvm_send_assist_req() is successful.

 Signed-off-by: Don Slutz dsl...@verizon.com
 ---
  xen/arch/x86/hvm/hvm.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
 index 6f7b407..bad410e 100644
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p)
  break;
  }

 -return 1;
 +return 0; /* implicitly bins the i/o operation */
 
 Actually that comment is not right. The operation is not binned; it's just 
 already been done.
 

Will adjust, I expect the return to go away.
   -Don Slutz

   Paul
 
  }

  bool_t hvm_send_assist_req(ioreq_t *p)
 --
 1.8.4
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 2/3] x86/VPMU: Disable VPMU when NMI watchdog is on

2015-01-30 Thread Andrew Cooper
On 30/01/15 17:15, Boris Ostrovsky wrote:
 NMI watchdog sets APIC_LVTPC register to generate an NMI when PMU counter
 overflow occurs. This may be overwritten by VPMU code later, effectively
 turning off the watchdog.

 We should disable VPMU when NMI watchdog is running.

 Signed-off-by: Boris Ostrovsky boris.ostrov...@oracle.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

 ---
  docs/misc/xen-command-line.markdown |  2 ++
  xen/arch/x86/hvm/vpmu.c | 13 +
  2 files changed, 15 insertions(+)

 diff --git a/docs/misc/xen-command-line.markdown 
 b/docs/misc/xen-command-line.markdown
 index 2274e74..bc316be 100644
 --- a/docs/misc/xen-command-line.markdown
 +++ b/docs/misc/xen-command-line.markdown
 @@ -1346,6 +1346,8 @@ wrong behaviour (see handle\_pmc\_quirk()).
  If 'vpmu=bts' is specified the virtualisation of the Branch Trace Store (BTS)
  feature is switched on on Intel processors supporting this feature.
  
 +Note that if **watchdog** option is also specified vpmu will be turned off.
 +
  *Warning:*
  As the BTS virtualisation is not 100% safe and because of the nehalem quirk
  don't use the vpmu flag on production systems with Intel cpus!
 diff --git a/xen/arch/x86/hvm/vpmu.c b/xen/arch/x86/hvm/vpmu.c
 index 63b2158..b2e8e65 100644
 --- a/xen/arch/x86/hvm/vpmu.c
 +++ b/xen/arch/x86/hvm/vpmu.c
 @@ -24,6 +24,7 @@
  #include asm/regs.h
  #include asm/types.h
  #include asm/msr.h
 +#include asm/nmi.h
  #include asm/hvm/support.h
  #include asm/hvm/vmx/vmx.h
  #include asm/hvm/vmx/vmcs.h
 @@ -288,3 +289,15 @@ void vpmu_dump(struct vcpu *v)
  vpmu-arch_vpmu_ops-arch_vpmu_dump(v);
  }
  
 +static int __init vpmu_init(void)
 +{
 +/* NMI watchdog uses LVTPC and HW counter */
 +if ( opt_watchdog  opt_vpmu_enabled )
 +{
 +printk(XENLOG_WARNING NMI watchdog is enabled. Turning VPMU 
 off.\n);
 +opt_vpmu_enabled = 0;
 +}
 +
 +return 0;
 +}
 +__initcall(vpmu_init);


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 04/18] xen/x86: add multiboot2 protocol support

2015-01-30 Thread Andrew Cooper
On 30/01/15 17:54, Daniel Kiper wrote:
 Add multiboot2 protocol support. Alter min memory limit handling as we
 now may not find it from either multiboot (v1) or multiboot2.

 This way we are laying the foundation for EFI + GRUB2 + Xen development.

 Signed-off-by: Daniel Kiper daniel.ki...@oracle.com

I have not reviewed the multiboot2 protocol in detail, but it appears
sane and I presume it is suitably tested and works.

As far as the mb1/mb2 interaction goes, this is looking far better.

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

 ---
  xen/arch/x86/boot/Makefile|3 +-
  xen/arch/x86/boot/head.S  |  104 --
  xen/arch/x86/boot/reloc.c |  174 
 -
  xen/arch/x86/x86_64/asm-offsets.c |6 ++
  xen/include/xen/multiboot2.h  |  169 +++
  5 files changed, 429 insertions(+), 27 deletions(-)
  create mode 100644 xen/include/xen/multiboot2.h

 diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
 index 5fdb5ae..0efa7d2 100644
 --- a/xen/arch/x86/boot/Makefile
 +++ b/xen/arch/x86/boot/Makefile
 @@ -1,6 +1,7 @@
  obj-bin-y += head.o
  
 -RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h 
 $(BASEDIR)/include/xen/multiboot.h
 +RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h 
 $(BASEDIR)/include/xen/compiler.h \
 +  $(BASEDIR)/include/xen/multiboot.h 
 $(BASEDIR)/include/xen/multiboot2.h
  
  head.o: reloc.S
  
 diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
 index 6180783..7861057 100644
 --- a/xen/arch/x86/boot/head.S
 +++ b/xen/arch/x86/boot/head.S
 @@ -1,5 +1,6 @@
  #include xen/config.h
  #include xen/multiboot.h
 +#include xen/multiboot2.h
  #include public/xen.h
  #include asm/asm_defns.h
  #include asm/desc.h
 @@ -32,6 +33,59 @@ ENTRY(start)
  .long   MULTIBOOT_HEADER_FLAGS
  /* Checksum: must be the negated sum of the first two fields. */
  .long   -(MULTIBOOT_HEADER_MAGIC + MULTIBOOT_HEADER_FLAGS)
 + 
 +/*** MULTIBOOT2 HEADER /
 +/* Some ideas are taken from grub-2.00/grub-core/tests/boot/kernel-i386.S 
 file. */
 +.align  MULTIBOOT2_HEADER_ALIGN
 +
 +.Lmultiboot2_header:
 +/* Magic number indicating a Multiboot2 header. */
 +.long   MULTIBOOT2_HEADER_MAGIC
 +/* Architecture: i386. */
 +.long   MULTIBOOT2_ARCHITECTURE_I386
 +/* Multiboot2 header length. */
 +.long   .Lmultiboot2_header_end - .Lmultiboot2_header
 +/* Multiboot2 header checksum. */
 +.long   -(MULTIBOOT2_HEADER_MAGIC + MULTIBOOT2_ARCHITECTURE_I386 + \
 +(.Lmultiboot2_header_end - .Lmultiboot2_header))
 +
 +/* Multiboot2 tags... */
 +.Lmultiboot2_info_req:
 +/* Multiboot2 information request tag. */
 +.short  MULTIBOOT2_HEADER_TAG_INFORMATION_REQUEST
 +.short  MULTIBOOT2_HEADER_TAG_REQUIRED
 +.long   .Lmultiboot2_info_req_end - .Lmultiboot2_info_req
 +.long   MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
 +.long   MULTIBOOT2_TAG_TYPE_MMAP
 +.Lmultiboot2_info_req_end:
 +
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_MODULE_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_REQUIRED
 +.long   8 /* Tag size. */
 +
 +/* Console flags tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_CONSOLE_FLAGS
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   12 /* Tag size. */
 +.long   MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED
 +
 +/* Framebuffer tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_FRAMEBUFFER
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   20 /* Tag size. */
 +.long   0 /* Number of the columns - no preference. */
 +.long   0 /* Number of the lines - no preference. */
 +.long   0 /* Number of bits per pixel - no preference. */
 +
 +/* Multiboot2 header end tag. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_END
 +.short  0
 +.long   8 /* Tag size. */
 +.Lmultiboot2_header_end:
  
  .section .init.rodata, a, @progbits
  .align 4
 @@ -81,10 +135,51 @@ __start:
  mov %ecx,%es
  mov %ecx,%ss
  
 +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */
 +xor %edx,%edx
 +
  /* Check for Multiboot bootloader */
  cmp $MULTIBOOT_BOOTLOADER_MAGIC,%eax
 -jne not_multiboot
 +je  multiboot1_proto
  
 +/* Check for Multiboot2 bootloader */
 +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
 +je  multiboot2_proto
 +
 +jmp not_multiboot
 +
 +multiboot1_proto:
 +/* Get mem_lower from Multiboot information */
 +testb   $MBI_MEMLIMITS,(%ebx)
 +jz  trampoline_setup/* not available? BDA value will be 

Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms

2015-01-30 Thread Andrew Cooper
On 30/01/15 17:54, Daniel Kiper wrote:
 Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
 ---
  xen/arch/x86/boot/head.S  |  174 
 +++--
  xen/arch/x86/efi/efi-boot.h   |   29 +++
  xen/arch/x86/setup.c  |   23 ++---
  xen/arch/x86/x86_64/asm-offsets.c |2 +
  xen/common/efi/boot.c |   11 +++
  5 files changed, 222 insertions(+), 17 deletions(-)

 diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
 index 7861057..89f5aa7 100644
 --- a/xen/arch/x86/boot/head.S
 +++ b/xen/arch/x86/boot/head.S
 @@ -8,6 +8,7 @@
  #include asm/page.h
  #include asm/msr.h
  #include asm/cpufeature.h
 +#include asm/processor.h
  
  .text
  .code32
 @@ -57,6 +58,9 @@ ENTRY(start)
  .long   .Lmultiboot2_info_req_end - .Lmultiboot2_info_req
  .long   MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
  .long   MULTIBOOT2_TAG_TYPE_MMAP
 +.long   MULTIBOOT2_TAG_TYPE_EFI_BS
 +.long   MULTIBOOT2_TAG_TYPE_EFI64
 +.long   MULTIBOOT2_TAG_TYPE_EFI64_IH
  .Lmultiboot2_info_req_end:
  
  .align  MULTIBOOT2_TAG_ALIGN
 @@ -80,6 +84,19 @@ ENTRY(start)
  .long   0 /* Number of the lines - no preference. */
  .long   0 /* Number of bits per pixel - no preference. */
  
 +/* Do not disable EFI boot services. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_EFI_BS
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   8 /* Tag size. */
 +
 +/* EFI64 entry point. */
 +.align  MULTIBOOT2_TAG_ALIGN
 +.short  MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64
 +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
 +.long   12 /* Tag size. */
 +.long   sym_phys(__efi64_start)
 +
  /* Multiboot2 header end tag. */
  .align  MULTIBOOT2_TAG_ALIGN
  .short  MULTIBOOT2_HEADER_TAG_END
 @@ -94,6 +111,17 @@ ENTRY(start)
  gdt_boot_descr:
  .word   6*8-1
  .long   sym_phys(trampoline_gdt)
 +.long   0 /* Needed for 64-bit lgdt */
 +
 +cs32_switch_addr:
 +.long   sym_phys(cs32_switch)
 +.long   BOOT_CS32

.word

ljmpl refers to an m32:16 not an m32:32

 +
 +efi_st:
 +.quad   0
 +
 +efi_ih:
 +.quad   0
  
  .Lbad_cpu_msg: .asciz ERR: Not a 64-bit CPU!
  .Lbad_ldr_msg: .asciz ERR: Not a Multiboot bootloader!
 @@ -124,6 +152,133 @@ print_err:
  .Lhalt: hlt
  jmp .Lhalt
  
 +.code64
 +
 +__efi64_start:
 +cld
 +
 +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value */
 +xor %edx,%edx
 +
 +/* Check for Multiboot2 bootloader */
 +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
 +je  efi_multiboot2_proto
 +
 +jmp not_multiboot

not_multiboot is 32bit code.  You must drop out of 64bit before using
it, or make a 64bit variant.

 +
 +efi_multiboot2_proto:
 +/* Skip Multiboot2 information fixed part */
 +lea MB2_fixed_sizeof(%ebx),%ecx
 +
 +0:
 +/* Get mem_lower from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx)
 +jne 1f
 +
 +mov MB2_mem_lower(%ecx),%edx
 +jmp 4f
 +
 +1:
 +/* Get EFI SystemTable address from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx)
 +jne 2f
 +
 +lea MB2_efi64_st(%ecx),%esi
 +lea efi_st(%rip),%edi
 +movsq

This is complete overkill for copying a 64bit variable out of the tag
and into a local variable.  Just use a plain 64bit load and store.

 +
 +/* Do not go into real mode on EFI platform */
 +movb$1,skip_realmode(%rip)
 +
 +jmp 4f
 +
 +2:
 +/* Get EFI ImageHandle address from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,(%ecx)
 +jne 3f
 +
 +lea MB2_efi64_ih(%ecx),%esi
 +lea efi_ih(%rip),%edi
 +movsq

And here.

 +jmp 4f
 +
 +3:
 +/* Is it the end of Multiboot2 information? */
 +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx)
 +je  run_bs
 +
 +4:
 +/* Go to next Multiboot2 information tag */
 +add MB2_tag_size(%ecx),%ecx
 +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx
 +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx
 +jmp 0b
 +
 +run_bs:
 +push%rax
 +push%rdx

Does the EFI spec guarantee that we have a good stack to use at this point?

 +
 +/* Initialize BSS (no nasty surprises!) */
 +lea __bss_start(%rip),%rdi
 +lea _end(%rip),%rcx
 +sub %rdi,%rcx
 +xor %rax,%rax

xor %eax,%eax is shorter.

 +rep stosb

It would be more efficient to make sure that the linker aligns
__bss_start and _end on 8 byte boundaries, and use stosq instead.

 +
 +mov efi_ih(%rip),%rdi   /* EFI ImageHandle */
 

Re: [Xen-devel] [PATCH] libxl: Wait for ballooning if free memory is increasing

2015-01-30 Thread Mike Latimer
On Thursday, January 29, 2015 10:14:26 AM Ian Campbell wrote:
 I'm thinking it would be clearer if the comment and the condition were
 logically inverted. e.g.:
 
 /*
  * If the amount of free mem has increased on this iteration (i.e.
  * some progress has been made) then reset the retry counter.
  */
 if (freemem_kb  freemem_kb_prev) {
 retries = MAX_RETRIES;
 free_memkb_prev = free_memkb;
 } else {
 retires--;
 }

Thanks for the comments, Ian. Inverting the logic makes sense, and I'll send a 
v2 shortly.

   Given that this new loop can take significantly longer to fail I wonder
   if we should add some progress logging? xl has an xtl logger instance
   available so using xtl_progress might be an easy option. Maybe a
   separate patch though.

xtl_progress looks interesting. I'll do some additional testing before I 
submit a patch containing this improvement.

-Mike

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 01/18] x86/boot/reloc: mask out MBI_BOOTDEV from mbi flags

2015-01-30 Thread Daniel Kiper
..because it is ignored by Xen.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
 xen/arch/x86/boot/reloc.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
index f971920..63045c0 100644
--- a/xen/arch/x86/boot/reloc.c
+++ b/xen/arch/x86/boot/reloc.c
@@ -90,7 +90,6 @@ multiboot_info_t *reloc(multiboot_info_t *mbi_old)
 
 /* Mask features we don't understand or don't relocate. */
 mbi-flags = (MBI_MEMLIMITS |
-   MBI_BOOTDEV |
MBI_CMDLINE |
MBI_MODULES |
MBI_MEMMAP |
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 02/18] x86/boot/reloc: create generic alloc and copy functions

2015-01-30 Thread Andrew Cooper
On 30/01/15 17:54, Daniel Kiper wrote:
 Create generic alloc and copy functions. We need
 separate tools for memory allocation and copy to
 provide multiboot2 protocol support.

 Signed-off-by: Daniel Kiper daniel.ki...@oracle.com

Reviewed-by: Andrew Cooper andrew.coop...@citrix.com

 ---
  xen/arch/x86/boot/reloc.c |   52 
 -
  1 file changed, 33 insertions(+), 19 deletions(-)

 diff --git a/xen/arch/x86/boot/reloc.c b/xen/arch/x86/boot/reloc.c
 index 63045c0..f964cda 100644
 --- a/xen/arch/x86/boot/reloc.c
 +++ b/xen/arch/x86/boot/reloc.c
 @@ -33,9 +33,10 @@ asm (
  typedef unsigned int u32;
  #include ../../../include/xen/multiboot.h
  
 -static void *reloc_mbi_struct(void *old, unsigned int bytes)
 +static u32 alloc_struct(u32 bytes)
  {
 -void *new;
 +u32 s;
 +
  asm(
  call 1f  \n
  1:  pop  %%edx   \n
 @@ -43,50 +44,63 @@ static void *reloc_mbi_struct(void *old, unsigned int 
 bytes)
  sub  %1,%0   \n
  and  $~15,%0 \n
  mov  %0,alloc-1b(%%edx)  \n
 -mov  %0,%%edi\n
 -rep  movsb   \n
 -   : =r (new), +c (bytes), +S (old)
 - : : edx, edi, memory);
 -return new;
 +   : =r (s) : r (bytes) : edx, memory);
 +
 +return s;
 +}
 +
 +static u32 copy_struct(u32 src, u32 bytes)
 +{
 +u32 dst, dst_asm;
 +
 +dst = alloc_struct(bytes);
 +dst_asm = dst;
 +
 +asm volatile(rep movsb : +S (src), +D (dst_asm), +c (bytes) : : 
 memory);
 +
 +return dst;
  }
  
 -static char *reloc_mbi_string(char *old)
 +static u32 copy_string(u32 src)
  {
  char *p;
 -for ( p = old; *p != '\0'; p++ )
 +
 +if ( src == 0 )
 +return 0;
 +
 +for ( p = (char *)src; *p != '\0'; p++ )
  continue;
 -return reloc_mbi_struct(old, p - old + 1);
 +
 +return copy_struct(src, p - (char *)src + 1);
  }
  
  multiboot_info_t *reloc(multiboot_info_t *mbi_old)
  {
 -multiboot_info_t *mbi = reloc_mbi_struct(mbi_old, sizeof(*mbi));
 +multiboot_info_t *mbi = (multiboot_info_t *)copy_struct((u32)mbi_old, 
 sizeof(*mbi));
  int i;
  
  if ( mbi-flags  MBI_CMDLINE )
 -mbi-cmdline = (u32)reloc_mbi_string((char *)mbi-cmdline);
 +mbi-cmdline = copy_string(mbi-cmdline);
  
  if ( mbi-flags  MBI_MODULES )
  {
 -module_t *mods = reloc_mbi_struct(
 -(module_t *)mbi-mods_addr, mbi-mods_count * sizeof(module_t));
 +module_t *mods = (module_t *)copy_struct(
 +mbi-mods_addr, mbi-mods_count * sizeof(module_t));
  
  mbi-mods_addr = (u32)mods;
  
  for ( i = 0; i  mbi-mods_count; i++ )
  {
  if ( mods[i].string )
 -mods[i].string = (u32)reloc_mbi_string((char 
 *)mods[i].string);
 +mods[i].string = copy_string(mods[i].string);
  }
  }
  
  if ( mbi-flags  MBI_MEMMAP )
 -mbi-mmap_addr = (u32)reloc_mbi_struct(
 -(memory_map_t *)mbi-mmap_addr, mbi-mmap_length);
 +mbi-mmap_addr = copy_struct(mbi-mmap_addr, mbi-mmap_length);
  
  if ( mbi-flags  MBI_LOADERNAME )
 -mbi-boot_loader_name = (u32)reloc_mbi_string(
 -(char *)mbi-boot_loader_name);
 +mbi-boot_loader_name = copy_string(mbi-boot_loader_name);
  
  /* Mask features we don't understand or don't relocate. */
  mbi-flags = (MBI_MEMLIMITS |


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send

2015-01-30 Thread Don Slutz
On 01/30/15 05:24, Jan Beulich wrote:
 On 30.01.15 at 01:52, dsl...@verizon.com wrote:
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p)
  break;
  }
  
 -return 1;
 +return 0; /* implicitly bins the i/o operation */
  }
 
 This change points out that having hvm_complete_assist_req() be a
 separate function yet having only a single caller, and it returning
 non-void with only a single possible return value isn't the best
 arrangement. I think this should be brought back into
 hvm_send_assist_req(), by inverting the if() condition there. Unless
 there are intentions for it to have another caller, but in that case
 it should still be made return void, with the caller choosing what to
 return.
 

Sounds good to me will do.
   -Don Slutz

 Jan
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms

2015-01-30 Thread Andrew Cooper
On 30/01/2015 23:43, Daniel Kiper wrote:
 On Fri, Jan 30, 2015 at 07:06:53PM +, Andrew Cooper wrote:
 On 30/01/15 17:54, Daniel Kiper wrote:
 +
 +efi_multiboot2_proto:
 +/* Skip Multiboot2 information fixed part */
 +lea MB2_fixed_sizeof(%ebx),%ecx
 +
 +0:
 +/* Get mem_lower from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx)
 +jne 1f
 +
 +mov MB2_mem_lower(%ecx),%edx
 +jmp 4f
 +
 +1:
 +/* Get EFI SystemTable address from Multiboot2 information */
 +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx)
 +jne 2f
 +
 +lea MB2_efi64_st(%ecx),%esi
 +lea efi_st(%rip),%edi
 +movsq
 This is complete overkill for copying a 64bit variable out of the tag
 and into a local variable.  Just use a plain 64bit load and store.
 I am not sure what do you mean by 64bit load and store but I have
 just realized that we do not need these variables. They are remnants
 from early developments when I thought that we need ImageHandle
 and SystemTable here and later somewhere else.

mov MB2_efi64_st(%rcx), %rdi
mov %rdi, efi_st(%rip)

But if they are not needed, drop the code completely.

 +jmp 4f
 +
 +3:
 +/* Is it the end of Multiboot2 information? */
 +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx)
 +je  run_bs
 +
 +4:
 +/* Go to next Multiboot2 information tag */
 +add MB2_tag_size(%ecx),%ecx
 +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx
 +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx
 +jmp 0b
 +
 +run_bs:
 +push%rax
 +push%rdx
 Does the EFI spec guarantee that we have a good stack to use at this point?
 Unified Extensible Firmware Interface Specification, Version 2.4 Errata B,
 section 2.3.4, x64 Platforms says: During boot services time the processor
 is in the following execution mode: ..., 128 KiB, or more, of available
 stack space. GRUB2 uses this stack too and do not move it to different
 memory region. So, I think that here we are on safe side.

Sounds ok then.


 +/* Initialize BSS (no nasty surprises!) */
 +lea __bss_start(%rip),%rdi
 +lea _end(%rip),%rcx
 +sub %rdi,%rcx
 +xor %rax,%rax
 xor %eax,%eax is shorter.

 +rep stosb
 It would be more efficient to make sure that the linker aligns
 __bss_start and _end on 8 byte boundaries, and use stosq instead.
 Right but just for this. Is it pays? We do this only once.

The BSS in Xen is 300k.  It is absolutely better to clear it 8 bytes at
a time rather than 1.

 However, if you wish...

 +mov efi_ih(%rip),%rdi   /* EFI ImageHandle */
 +mov efi_st(%rip),%rsi   /* EFI SystemTable */
 +callefi_multiboot2
 +
 +pop %rcx
 +pop %rax
 +
 +shl $10-4,%rcx  /* Convert multiboot2.mem_lower to 
 bytes/16 */
 +
 +cli
 This looks suspiciously out of place.  Surely the EFI spec doesn't
 permit entry with interrupts enabled?
 Unified Extensible Firmware Interface Specification, Version 2.4 Errata B,
 section 2.3.4, x64 Platforms says: During boot services time the processor
 is in the following execution mode: ..., Interrupts are enabled–though no
 interrupt services are supported other than the UEFI boot services timer
 functions (All loaded device drivers are serviced synchronously by 
 “polling.”).
 So, I think that we should use BS with interrupts enabled and disable
 them after ExitBootServices(). Hmmm... Now I think that we should use
 cli immediately after efi_multiboot2() call.

I presume then that the firmware has set up a valid idt somewhere and is
actually serving any interrupts we get.

 diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
 index f8be3dd..c5725ca 100644
 --- a/xen/common/efi/boot.c
 +++ b/xen/common/efi/boot.c
 @@ -75,6 +75,17 @@ static size_t wstrlen(const CHAR16 * s);
  static int set_color(u32 mask, int bpp, u8 *pos, u8 *sz);
  static bool_t match_guid(const EFI_GUID *guid1, const EFI_GUID *guid2);

 +static void efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable);
 +static void efi_console_set_mode(void);
 +static EFI_GRAPHICS_OUTPUT_PROTOCOL *efi_get_gop(void);
 +static UINTN efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop,
 +   UINTN cols, UINTN rows, UINTN depth);
 +static void efi_tables(void);
 +static void setup_efi_pci(void);
 +static void efi_variables(void);
 +static void efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN 
 gop_mode);
 +static void efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE 
 *SystemTable);
 +
 If any of these forward declarations are needed, they should be
 They are needed because efi-boot.h is included before above
 mentioned functions definitions.

 introduced in the appropriate create efi_$FOO patch.  However, I can't
 I thought about that during development. 

Re: [Xen-devel] [PATCH 18/18] x86: add multiboot2 protocol support for EFI platforms

2015-01-30 Thread Daniel Kiper
On Fri, Jan 30, 2015 at 07:06:53PM +, Andrew Cooper wrote:
 On 30/01/15 17:54, Daniel Kiper wrote:
  Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
  ---
   xen/arch/x86/boot/head.S  |  174 
  +++--
   xen/arch/x86/efi/efi-boot.h   |   29 +++
   xen/arch/x86/setup.c  |   23 ++---
   xen/arch/x86/x86_64/asm-offsets.c |2 +
   xen/common/efi/boot.c |   11 +++
   5 files changed, 222 insertions(+), 17 deletions(-)
 
  diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
  index 7861057..89f5aa7 100644
  --- a/xen/arch/x86/boot/head.S
  +++ b/xen/arch/x86/boot/head.S
  @@ -8,6 +8,7 @@
   #include asm/page.h
   #include asm/msr.h
   #include asm/cpufeature.h
  +#include asm/processor.h
 
   .text
   .code32
  @@ -57,6 +58,9 @@ ENTRY(start)
   .long   .Lmultiboot2_info_req_end - .Lmultiboot2_info_req
   .long   MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO
   .long   MULTIBOOT2_TAG_TYPE_MMAP
  +.long   MULTIBOOT2_TAG_TYPE_EFI_BS
  +.long   MULTIBOOT2_TAG_TYPE_EFI64
  +.long   MULTIBOOT2_TAG_TYPE_EFI64_IH
   .Lmultiboot2_info_req_end:
 
   .align  MULTIBOOT2_TAG_ALIGN
  @@ -80,6 +84,19 @@ ENTRY(start)
   .long   0 /* Number of the lines - no preference. */
   .long   0 /* Number of bits per pixel - no preference. */
 
  +/* Do not disable EFI boot services. */
  +.align  MULTIBOOT2_TAG_ALIGN
  +.short  MULTIBOOT2_HEADER_TAG_EFI_BS
  +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
  +.long   8 /* Tag size. */
  +
  +/* EFI64 entry point. */
  +.align  MULTIBOOT2_TAG_ALIGN
  +.short  MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64
  +.short  MULTIBOOT2_HEADER_TAG_OPTIONAL
  +.long   12 /* Tag size. */
  +.long   sym_phys(__efi64_start)
  +
   /* Multiboot2 header end tag. */
   .align  MULTIBOOT2_TAG_ALIGN
   .short  MULTIBOOT2_HEADER_TAG_END
  @@ -94,6 +111,17 @@ ENTRY(start)
   gdt_boot_descr:
   .word   6*8-1
   .long   sym_phys(trampoline_gdt)
  +.long   0 /* Needed for 64-bit lgdt */
  +
  +cs32_switch_addr:
  +.long   sym_phys(cs32_switch)
  +.long   BOOT_CS32

 .word

 ljmpl refers to an m32:16 not an m32:32

  +
  +efi_st:
  +.quad   0
  +
  +efi_ih:
  +.quad   0
 
   .Lbad_cpu_msg: .asciz ERR: Not a 64-bit CPU!
   .Lbad_ldr_msg: .asciz ERR: Not a Multiboot bootloader!
  @@ -124,6 +152,133 @@ print_err:
   .Lhalt: hlt
   jmp .Lhalt
 
  +.code64
  +
  +__efi64_start:
  +cld
  +
  +/* Bootloaders may set multiboot[12].mem_lower to a nonzero value 
  */
  +xor %edx,%edx
  +
  +/* Check for Multiboot2 bootloader */
  +cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
  +je  efi_multiboot2_proto
  +
  +jmp not_multiboot

 not_multiboot is 32bit code.  You must drop out of 64bit before using
 it, or make a 64bit variant.

  +
  +efi_multiboot2_proto:
  +/* Skip Multiboot2 information fixed part */
  +lea MB2_fixed_sizeof(%ebx),%ecx
  +
  +0:
  +/* Get mem_lower from Multiboot2 information */
  +cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,(%ecx)
  +jne 1f
  +
  +mov MB2_mem_lower(%ecx),%edx
  +jmp 4f
  +
  +1:
  +/* Get EFI SystemTable address from Multiboot2 information */
  +cmpl$MULTIBOOT2_TAG_TYPE_EFI64,(%ecx)
  +jne 2f
  +
  +lea MB2_efi64_st(%ecx),%esi
  +lea efi_st(%rip),%edi
  +movsq

 This is complete overkill for copying a 64bit variable out of the tag
 and into a local variable.  Just use a plain 64bit load and store.

I am not sure what do you mean by 64bit load and store but I have
just realized that we do not need these variables. They are remnants
from early developments when I thought that we need ImageHandle
and SystemTable here and later somewhere else.

  +/* Do not go into real mode on EFI platform */
  +movb$1,skip_realmode(%rip)
  +
  +jmp 4f
  +
  +2:
  +/* Get EFI ImageHandle address from Multiboot2 information */
  +cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,(%ecx)
  +jne 3f
  +
  +lea MB2_efi64_ih(%ecx),%esi
  +lea efi_ih(%rip),%edi
  +movsq

 And here.

Ditto.

  +jmp 4f
  +
  +3:
  +/* Is it the end of Multiboot2 information? */
  +cmpl$MULTIBOOT2_TAG_TYPE_END,(%ecx)
  +je  run_bs
  +
  +4:
  +/* Go to next Multiboot2 information tag */
  +add MB2_tag_size(%ecx),%ecx
  +add $(MULTIBOOT2_TAG_ALIGN-1),%ecx
  +and $~(MULTIBOOT2_TAG_ALIGN-1),%ecx
  +jmp 0b
  +
  +run_bs:
  +push%rax
  +push%rdx

 Does the EFI spec guarantee that we have a good stack to 

Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough

2015-01-30 Thread Xu, Quan


 -Original Message-
 From: xen-devel-boun...@lists.xen.org
 [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Wei Liu
 Sent: Friday, January 30, 2015 8:26 PM
 To: Chen, Tiejun
 Cc: Wei Liu; ian.campb...@citrix.com; m...@redhat.com; Ian Jackson;
 qemu-de...@nongnu.org; xen-devel@lists.xen.org; Gerd Hoffmann
 Subject: Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine
 property to support IGD GFX passthrough
 
 On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote:
 [...]
  
  Just remember to handle old option in libxl if your old option is
  already released by some older version of QEMUs.
  
  I just drop that old option, -gfx_passthru, if we're under qemu
  upstream circumstance, like this,
  
  
  The question is, is there any version of qemu upstream that has been
  released that has the old option (-gfx-passthru)?
 
  No. Just now we're starting to support IGD passthrough in qemu upstream.
 
 
 Right, as of QEMU 2.2.0 there's no support of IGD passthrough in QMEU
 upstream.
 

Just a question:
   Now what features do vt-d support? Thanks.

-Quan

  
  This gives us a situation that we need to support both the old
  (-gfx-passthru) and new (-igd-passthru) options. Presumably we
  (libxl) would need to fork a qemu process to determine which option
  it has and pass the right one.
  
  Or you can try to keep both old and new option at the same time but
 
  Yeah, actually I also have considered to keep both two options at the
  same time. Its really friendly to any qemu version.
 
  deprecate the old one. Then in a few qemu release cycles later (or
 
  This should be like 'accel=kvm' versus 'enable-kvm' in qemu upstream.
  They're coexisted now but just the former is a modern option.
 
  probably one year or two?) you can finally remove the old one. The
  point is that to give downstream (in this case, Xen) time to cope
  with the change.
 
  Here I'm fine to this way.
 
  So Gerd,
 
 
 So you don't actually need to ask Gerd this question because there is no old
 option to keep in qemu upstream.
 
 Libxl (or any sensible toolstack) will just do the right thing to either pass
 -igd-passthru (or whatever you guys agree upon) to qemu upstream or pass
 -gfx-passthru to qemu traditional. :-)
 
 Wei.
 
  What about this?
 
 
 
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers

2015-01-30 Thread Juergen Gross
Up to now the pvscsi drivers haven't supported domain suspend and
resume. When a domain with an assigned pvscsi device was suspended
and resumed again, it was not able to use the device any more: trying
to do so resulted in hanging processes.

Support suspend and resume of pvscsi devices.

Signed-off-by: Juergen Gross jgr...@suse.com

diff -r 578e5aea3cbb drivers/xen/scsiback/xenbus.c
--- a/drivers/xen/scsiback/xenbus.c Mon Jan 19 11:51:46 2015 +0100
+++ b/drivers/xen/scsiback/xenbus.c Fri Jan 30 13:57:29 2015 +0100
@@ -167,33 +167,48 @@ static void scsiback_do_lun_hotplug(stru
 
switch (op) {
case VSCSIBACK_OP_ADD_OR_DEL_LUN:
-   if (device_state == XenbusStateInitialising) {
+   switch (device_state) {
+   case XenbusStateInitialising:
+   case XenbusStateConnected:
sdev = scsiback_get_scsi_device(phy);
-   if (!sdev)
-   xenbus_printf(XBT_NIL, dev-nodename, 
state_str, 
-   %d, 
XenbusStateClosed);
-   else {
-   err = 
scsiback_add_translation_entry(be-info, sdev, vir);
-   if (!err) {
-   if (xenbus_printf(XBT_NIL, 
dev-nodename, state_str, 
-   %d, 
XenbusStateInitialised)) {
-   printk(KERN_ERR 
scsiback: xenbus_printf error %s\n, state_str);
-   
scsiback_del_translation_entry(be-info, vir);
-   }
-   } else {
-   scsi_device_put(sdev);
-   xenbus_printf(XBT_NIL, 
dev-nodename, state_str, 
-   %d, 
XenbusStateClosed);
-   }
+   if (!sdev) {
+   xenbus_printf(XBT_NIL, dev-nodename,
+ state_str,
+ %d, XenbusStateClosed);
+   break;
}
-   }
+   if (scsiback_add_translation_entry(be-info,
+   sdev, vir)) {
+   scsi_device_put(sdev);
+   if (device_state == 
XenbusStateConnected)
+   break;
+   xenbus_printf(XBT_NIL, dev-nodename,
+ state_str,
+ %d, XenbusStateClosed);
+   break;
+   }
+   if (!xenbus_printf(XBT_NIL, dev-nodename,
+ state_str, %d,
+ XenbusStateInitialised))
+   break;
+   printk(KERN_ERR scsiback: xenbus_printf error 
%s\n,
+  state_str);
+   scsiback_del_translation_entry(be-info, vir);
+   break;
 
-   if (device_state == XenbusStateClosing) {
-   if (!scsiback_del_translation_entry(be-info, 
vir)) {
-   if (xenbus_printf(XBT_NIL, 
dev-nodename, state_str, 
-   %d, 
XenbusStateClosed))
-   printk(KERN_ERR scsiback: 
xenbus_printf error %s\n, state_str);
-   }
+   case XenbusStateClosing:
+   if (scsiback_del_translation_entry(be-info,
+  vir))
+   break;
+   if (xenbus_printf(XBT_NIL, dev-nodename,
+ state_str, %d,
+ XenbusStateClosed))
+   printk(KERN_ERR scsiback: 
xenbus_printf error %s\n,
+  state_str);
+   break;
+
+   default:
+   break;
}

Re: [Xen-devel] [RFC V10 0/4] domain snapshot document

2015-01-30 Thread Wei Liu
On Fri, Jan 30, 2015 at 03:01:11PM +0100, Fabio Fantoni wrote:
 Il 30/01/2015 14:38, Wei Liu ha scritto:
 On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote:
 On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote:
 Changes to V9:
 This looks good to me, thanks.
 
 Ian/Wei do you have any comments? xapi folks?
 
 Looks good to me.
 
 Wei.
 
 I looked to the project of domain snapshot and seems good.
 
 I have only a question.
 I saw this:
 Disk snapshot can be created by many external tools, like qemu-img,
 vhd-util and lvm, etc.
 Wouldn't it be good to also use btrfs for external disks snapshot?
 Even if btrfs snapshots works on subvolumes (directories) it is possible to
 atomically creates a copy-on-write snapshot of a file (with domUs disks on
 file, including raw) with btrfs clone operation (reflink systemcall, eg: cp
 --reflink source dest)
 

I think that etc. potentially covers all other external tools,
including btrfs. You can list all possible external tools in the world.

Wei.

 
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V10 0/4] domain snapshot document

2015-01-30 Thread Wei Liu
On Fri, Jan 30, 2015 at 02:05:05PM +, Wei Liu wrote:
 On Fri, Jan 30, 2015 at 03:01:11PM +0100, Fabio Fantoni wrote:
  Il 30/01/2015 14:38, Wei Liu ha scritto:
  On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote:
  On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote:
  Changes to V9:
  This looks good to me, thanks.
  
  Ian/Wei do you have any comments? xapi folks?
  
  Looks good to me.
  
  Wei.
  
  I looked to the project of domain snapshot and seems good.
  
  I have only a question.
  I saw this:
  Disk snapshot can be created by many external tools, like qemu-img,
  vhd-util and lvm, etc.
  Wouldn't it be good to also use btrfs for external disks snapshot?
  Even if btrfs snapshots works on subvolumes (directories) it is possible to
  atomically creates a copy-on-write snapshot of a file (with domUs disks on
  file, including raw) with btrfs clone operation (reflink systemcall, eg: cp
  --reflink source dest)
  
 
 I think that etc. potentially covers all other external tools,
 including btrfs. You can list all possible external tools in the world.
   ^^^
   can't
 
 Wei.
 
  
  
  ___
  Xen-devel mailing list
  Xen-devel@lists.xen.org
  http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] ocaml/xenctrl: Make failwith_xc() thread safe

2015-01-30 Thread Wei Liu
On Fri, Jan 30, 2015 at 02:19:53PM +, Dave Scott wrote:
 
 Looks ok to me.
 
 Signed-off-by: David Scott dave.sc...@citrix.com
 

I think this should be an Acked-by.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH linux-2.6.18] xen: mark pvscsi frontend request consumed only after last read

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 14:51, jgr...@suse.com.non-mime.internet wrote:
 A request in the ring buffer mustn't be read after it has been marked
 as consumed. Otherwise it might already have been reused by the
 frontend without violating the ring protocol.
 
 To avoid inconsistencies in the backend only work on a private copy
 of the request. This will ensure a malicious guest not being able to
 bypass consistency checks of the backend by modifying an active
 request.

I'm not convinced we need this in this version of the driver: c/s
590:c4134d1a3e3f took care of reading each ring_req field just
once.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH linux-2.6.18] xen: mark pvscsi frontend request consumed only after last read

2015-01-30 Thread Juergen Gross
A request in the ring buffer mustn't be read after it has been marked
as consumed. Otherwise it might already have been reused by the
frontend without violating the ring protocol.

To avoid inconsistencies in the backend only work on a private copy
of the request. This will ensure a malicious guest not being able to
bypass consistency checks of the backend by modifying an active
request.

Signed-off-by: Juergen Gross jgr...@suse.com

diff -r 578e5aea3cbb drivers/xen/scsiback/scsiback.c
--- a/drivers/xen/scsiback/scsiback.c   Mon Jan 19 11:51:46 2015 +0100
+++ b/drivers/xen/scsiback/scsiback.c   Fri Jan 30 14:43:29 2015 +0100
@@ -579,7 +579,7 @@ invalid_value:
 static int _scsiback_do_cmd_fn(struct vscsibk_info *info)
 {
struct vscsiif_back_ring *ring = info-ring;
-   vscsiif_request_t  *ring_req;
+   vscsiif_request_t ring_req;
 
pending_req_t *pending_req;
RING_IDX rc, rp;
@@ -609,10 +609,10 @@ static int _scsiback_do_cmd_fn(struct vs
break;
}
 
-   ring_req = RING_GET_REQUEST(ring, rc);
+   ring_req = *RING_GET_REQUEST(ring, rc);
ring-req_cons = ++rc;
 
-   err = prepare_pending_reqs(info, ring_req,
+   err = prepare_pending_reqs(info, ring_req,
pending_req);
switch (err ?: pending_req-act) {
case VSCSIIF_ACT_SCSI_CDB:

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V10 0/4] domain snapshot document

2015-01-30 Thread Fabio Fantoni

Il 30/01/2015 14:38, Wei Liu ha scritto:

On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote:

On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote:

Changes to V9:

This looks good to me, thanks.

Ian/Wei do you have any comments? xapi folks?


Looks good to me.

Wei.


I looked to the project of domain snapshot and seems good.

I have only a question.
I saw this:

Disk snapshot can be created by many external tools, like qemu-img,
vhd-util and lvm, etc.

Wouldn't it be good to also use btrfs for external disks snapshot?
Even if btrfs snapshots works on subvolumes (directories) it is possible 
to atomically creates a copy-on-write snapshot of a file (with domUs 
disks on file, including raw) with btrfs clone operation (reflink 
systemcall, eg: cp --reflink source dest)





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues

2015-01-30 Thread David Vrabel
On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote:
 
 blinks kexec is OS agnostic?

Yes.

Extra information to be passed to (e.g., a Linux kernel being kexec'd)
can be supplied in an additional segment and marshalled into a suitable
format/location for the exec'd kernel by purgatory (or similar).

There shouldn't be a need to extend the hypervisor ABI for this.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2] misc/xenmicrocode: Upload /lib/firmware/some blob to the hypervisor

2015-01-30 Thread Boris Ostrovsky

On 01/29/2015 08:14 PM, Luis R. Rodriguez wrote:

From: Luis R. Rodriguez mcg...@suse.com

There are several ways that a Xen system can update the
CPU microcode on a pvops kernel [0] now, the preferred way
is through the early microcode update mechanism. At run
time folks should use this new xenmicrocode tool and use
the same CPU microcode file as present on /lib/firmware.

Some distributions may use the historic sysfs rescan interface,
users of that mechanism should be aware that the interface will
not be available when using Xen and as such should first check
the presence of the interface before usage, as an alternative
this xenmicrocode tool can be used on priviledged domains.

Folks wishing to update CPU microcode at run time should be
aware that not all CPU microcode can be updated on a system
and should take care to ensure that only known-to-work and
supported CPU microcode updates are used [0]. To avoid issues
with delays on the hypercall / microcode update this
implementation will quiesce all domains prior to updating
the microcode, and then only queisce'd domains will be
unpaused once done.

[0] http://wiki.xenproject.org/wiki/XenParavirtOps/microcode_update

Based on original work by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Cc: Andrew Cooper andrew.coop...@citrix.com
Cc: Borislav Petkov b...@suse.de
Cc: Takashi Iwai ti...@suse.de
Cc: Olaf Hering oher...@suse.de
Cc: Jan Beulich jbeul...@suse.com
Cc: Jason Douglas jdoug...@suse.com
Cc: Juergen Gross jgr...@suse.com
Cc: Michal Marek mma...@suse.cz
Cc: Henrique de Moraes Holschuh h...@debian.org
Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Signed-off-by: Luis R. Rodriguez mcg...@suse.com
---

Just wrote this, haven't tested it. This does some queiscing prior
to doing the microcode update. The quiescing is done by pausing
all domains. Once the microcode update is done we only unpause
domains which we queisced as part of our work. Let me know if this
is on the right track to help avoid issues noted on the list.

  tools/libxc/include/xenctrl.h |   2 +
  tools/libxc/xc_misc.c | 102 ++
  tools/misc/Makefile   |   4 ++
  tools/misc/xenmicrocode.c | 101 +
  4 files changed, 209 insertions(+)
  create mode 100644 tools/misc/xenmicrocode.c

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 0ad8b8d..d5db0b8 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -55,6 +55,7 @@
  #include xen/foreign/x86_32.h
  #include xen/foreign/x86_64.h
  #include xen/arch-x86/xen-mca.h
+#include xen/platform.h
  #endif

  #define XC_PAGE_SHIFT   12
@@ -2145,6 +2146,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
  void xc_cpuid_to_str(const unsigned int *regs,
   char **strs); /* some strs[] may be NULL if ENOMEM */
  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
+int xc_microcode_update(xc_interface *xch, uint8_t *fbuf, size_t len);
  #endif

  struct xc_px_val {
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index e253a58..6137e24 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -20,6 +20,7 @@

  #include xc_private.h
  #include xen/hvm/hvm_op.h
+#include xen/platform.h

  int xc_get_max_cpus(xc_interface *xch)
  {
@@ -250,6 +251,107 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
  xc_hypercall_bounce_post(xch, mc);
  return ret;
  }
+
+struct xc_quiesce_request {
+int num_quiesced;
+int domids[1024]; /* never domid 0 */
+};
+
+/*
+ * Do not pause already paused domains, and allow us to
+ * unpause only quiesced domains.
+ */
+static int quiesce_all_domains(xc_interface *xch,
+  struct xc_quiesce_request *quiesce_request)
+{
+xc_domaininfo_t info[1024];
+int i, nb_domain;
+
+nb_domain = xc_domain_getinfolist(xch, 0, 1024, info);
+if ( nb_domain  0 )
+{
+   return -1;
+}
+
+for ( i = 0; i  nb_domain; i++ )
+{
+if ( info[i].domain == 0 )
+   continue;
+   if ( info[i].flags  XEN_DOMINF_paused )
+   continue;
+
+xc_domain_pause(xch, info[i].domain);



You are not handling errors returned by xc_domain_pause(). So then you 
will try to unpause a domain that may not have been paused and (I think 
more importantly) may proceed with microcode update while not all 
domains are paused.




+
+   quiesce_request-domids[quiesce_request-num_quiesced] = info[i].domain;
+   quiesce_request-num_quiesced++;
+}
+
+return 0;
+}
+
+static void unquiesce_all_domains(xc_interface *xch,
+  struct xc_quiesce_request *quiesce_request)
+{
+int i;
+
+for ( i = 0; i  quiesce_request-num_quiesced; i++ )
+{
+   xc_domain_unpause(xch, quiesce_request-domids[i]);



Same here --- you may fail unpausing a domain and 

Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote:
 @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc
   return more_to_do;
  }
  
 +void scsifront_finish_all(struct vscsifrnt_info *info)
 +{
 + unsigned i;
 + struct vscsiif_response resp;
  
 + scsifront_ring_drain(info);

Shouldn't you at least issue some kind of warning when this returns
non-zero?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v17 14/23] x86/VPMU: Initialize VPMUs with __initcall

2015-01-30 Thread Jan Beulich
 On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote:
 --- a/xen/arch/x86/hvm/vpmu.c
 +++ b/xen/arch/x86/hvm/vpmu.c
 @@ -497,3 +497,39 @@ long do_xenpmu_op(int op, 
 XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
  
  return ret;
  }
 +
 +static int __init vpmu_init(void)
 +{
 +int vendor = current_cpu_data.x86_vendor;
 +
 +if ( vpmu_mode == XENPMU_MODE_OFF )
 +{
 +printk(XENLOG_INFO VPMU: disabled\n);
 +return 0;
 +}
 +
 +switch ( vendor )
 +{
 +case X86_VENDOR_AMD:
 +if ( amd_vpmu_init() )
 +   vpmu_mode = XENPMU_MODE_OFF;
 +break;
 +case X86_VENDOR_INTEL:
 +if ( core2_vpmu_init() )
 +   vpmu_mode = XENPMU_MODE_OFF;
 +break;
 +default:
 +printk(XENLOG_WARNING VPMU: Unknown CPU vendor: %d\n, vendor);
 +vpmu_mode = XENPMU_MODE_OFF;
 +break;

return 0;

(i.e. avoid printing another message below)

 +}
 +
 +if ( vpmu_mode == XENPMU_MODE_OFF )
 +printk(XENLOG_WARNING VPMU: Disabling due to initialization 
 error\n);

We repeatedly find that not printing at least a vague indication of
what went wrong makes problem analysis quite a bit more difficult.
It won't cost much to include the actual error code here. 

 +else
 +printk(XENLOG_INFO VPMU: version %s.%s\n,
 +   __stringify(XENPMU_VER_MAJ), __stringify(XENPMU_VER_MIN));

%s and __stringify()? Either print the numbers with %d or %u, or
use __stringify() to avoid any argument besides the format string.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers

2015-01-30 Thread Juergen Gross

On 01/30/2015 03:46 PM, Jan Beulich wrote:

On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote:

@@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc
return more_to_do;
  }

+void scsifront_finish_all(struct vscsifrnt_info *info)
+{
+   unsigned i;
+   struct vscsiif_response resp;

+   scsifront_ring_drain(info);


Shouldn't you at least issue some kind of warning when this returns
non-zero?


If a warning should be issued, then this should be done after the
following loop in case of at least one request terminated there.

I'm really not sure, whether a warning is required here. If you like,
I can add one.

Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues

2015-01-30 Thread David Vrabel
On 30/01/15 14:52, Konrad Rzeszutek Wilk wrote:
 On Fri, Jan 30, 2015 at 02:40:46PM +, David Vrabel wrote:
 On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote:

 blinks kexec is OS agnostic?

 Yes.

 Extra information to be passed to (e.g., a Linux kernel being kexec'd)
 can be supplied in an additional segment and marshalled into a suitable
 format/location for the exec'd kernel by purgatory (or similar).

 There shouldn't be a need to extend the hypervisor ABI for this.
 
 How are you planning to make kexec work under EFI?

I don't know at this time.

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 15:54, jgr...@suse.com wrote:
 On 01/30/2015 03:46 PM, Jan Beulich wrote:
 On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote:
 @@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc
 return more_to_do;
   }

 +void scsifront_finish_all(struct vscsifrnt_info *info)
 +{
 +   unsigned i;
 +   struct vscsiif_response resp;

 +   scsifront_ring_drain(info);

 Shouldn't you at least issue some kind of warning when this returns
 non-zero?
 
 If a warning should be issued, then this should be done after the
 following loop in case of at least one request terminated there.
 
 I'm really not sure, whether a warning is required here. If you like,
 I can add one.

I'm not sure, I'm merely asking because I saw the function return
value being ignored here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 33958: regressions - FAIL

2015-01-30 Thread xen . org
flight 33958 ovmf real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33958/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64   5 xen-build fail REGR. vs. 33686
 build-i3865 xen-build fail REGR. vs. 33686

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 build-amd64-libvirt   1 build-check(1)   blocked  n/a
 build-i386-libvirt1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1) blocked n/a
 test-amd64-i386-rhel6hvm-amd  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-amd64-i386-rhel6hvm-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-i386-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)blocked n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pcipt-intel  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-winxpsp3   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked n/a
 test-amd64-amd64-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1) blocked n/a
 test-amd64-i386-xl-winxpsp3-vcpus1  1 build-check(1)   blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a

version targeted for testing:
 ovmf 49a228ca6980b197bb854dc88c909d712959cb47
baseline version:
 ovmf 447d264115c476142f884af0be287622cd244423


People who touched revisions under test:
  Gao, Liming liming@intel.com
  Long, Qin qin.l...@intel.com
  Yao, Jiewen jiewen@intel.com
  Aaron Pop aar...@ami.com
  Abner Chang abner.ch...@hp.com
  Alex Williamson alex.william...@redhat.com
  Anderw Fish af...@apple.com
  Andrew Fish af...@apple.com
  Anthony PERARD anthony.per...@citrix.com
  Ard Biesheuvel ard.biesheu...@linaro.org
  Ari Zigler a...@mellanox.com
  Brendan Jackman brendan.jack...@arm.com
  Bruce Cran bruce.c...@gmail.com
  Cecil Sheng cecil.sh...@hp.com
  Chao Zhang chao.b.zh...@intel.com
  Chao, Zhang chao.b.zh...@intel.com
  Chen Fan chen.fan.f...@cn.fujitsu.com
  Chris Phillips chr...@hp.com
  Chris Ruffin chris.ruf...@intel.com
  Cinnamon Shia cinnamon.s...@hp.com
  Daryl McDaniel  daryl.mcdan...@intel.com
  Daryl McDaniel daryl.mcdan...@intel.com
  daryl.mcdaniel daryl.mcdan...@intel.com
  daryl.mcdan...@intel.com
  darylm503 darylm503@Edk2
  

Re: [Xen-devel] [PATCH linux-2.6.18] support suspend/resume in pvscsi drivers

2015-01-30 Thread Juergen Gross

On 01/30/2015 04:05 PM, Jan Beulich wrote:

On 30.01.15 at 15:54, jgr...@suse.com wrote:

On 01/30/2015 03:46 PM, Jan Beulich wrote:

On 30.01.15 at 14:52, jgr...@suse.com.non-mime.internet wrote:

@@ -231,8 +242,23 @@ static int scsifront_cmd_done(struct vsc
return more_to_do;
   }

+void scsifront_finish_all(struct vscsifrnt_info *info)
+{
+   unsigned i;
+   struct vscsiif_response resp;

+   scsifront_ring_drain(info);


Shouldn't you at least issue some kind of warning when this returns
non-zero?


If a warning should be issued, then this should be done after the
following loop in case of at least one request terminated there.

I'm really not sure, whether a warning is required here. If you like,
I can add one.


I'm not sure, I'm merely asking because I saw the function return
value being ignored here.


I think it can be 0 only. We are handling resume, so the ring which is
being drained will no longer be filled by the backend.

Juergen


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is: kexec EFI Was: Re: EFI GetNextVariableName crashes when running under Xen, but not under Linux. efi-rs=0 works. No memmap issues

2015-01-30 Thread Daniel Kiper
On Fri, Jan 30, 2015 at 02:57:32PM +, David Vrabel wrote:
 On 30/01/15 14:52, Konrad Rzeszutek Wilk wrote:
  On Fri, Jan 30, 2015 at 02:40:46PM +, David Vrabel wrote:
  On 30/01/15 14:17, Konrad Rzeszutek Wilk wrote:
 
  blinks kexec is OS agnostic?
 
  Yes.
 
  Extra information to be passed to (e.g., a Linux kernel being kexec'd)
  can be supplied in an additional segment and marshalled into a suitable
  format/location for the exec'd kernel by purgatory (or similar).
 
  There shouldn't be a need to extend the hypervisor ABI for this.
 
  How are you planning to make kexec work under EFI?

 I don't know at this time.

I suppose that we should provide additional kexec hypercall
function which will return info about RS. kexec-tools should
load new kernel as usual and add relevant argument to its
command line. Most things are in place so we should just
learn kexec-tools to do new things.

Daniel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] open-coded page list manipulation in shadow code

2015-01-30 Thread Jan Beulich
 On 29.01.15 at 18:30, t...@xen.org wrote:
 OK, here's what I have.  It keeps the mechanism the same, threading on
 the main page list entry, but tidies it up to use the page_list operations
 rather than open-coding.  I've tested it (lightly - basically jsut
 boot testing) with 32bit, 32pae and 64bit Windows VMs (all SMP), with
 bigmem=y and bigmem=n.
 
 It was developed on top of your bigmem series, but it should apply OK
 before patch 3/4, removing the need to nobble shadow-paging in that
 patch.

Thanks, looks quite okay, just a couple of remarks.

 @@ -1525,6 +1495,14 @@ mfn_t shadow_alloc(struct domain *d,
  if ( shadow_type = SH_type_min_shadow 
shadow_type = SH_type_max_shadow )
  sp-u.sh.head = 1;
 +
 +#ifndef PAGE_LIST_NULL
 +/* The temporary list-head is on our stack.  Blank out the
 + * pointers to it in the shadows, just to get a clean failure if
 + * we accidentally follow them. */
 +tmp_list.next-prev = tmp_list.prev-next = NULL;
 +#endif

Perhaps better to use LIST_POISON{1,2} here, so we don't point into
PV VA space?

 --- a/xen/arch/x86/mm/shadow/private.h
 +++ b/xen/arch/x86/mm/shadow/private.h
 @@ -318,6 +318,33 @@ static inline int mfn_oos_may_write(mfn_t gmfn)
  }
  #endif /* (SHADOW_OPTIMIZATIONS  SHOPT_OUT_OF_SYNC) */
  
 +/* Figure out the size (in pages) of a given shadow type */
 +static inline u32
 +shadow_size(unsigned int shadow_type) 
 +{
 +static const u32 type_to_size[SH_type_unused] = {
 +1, /* SH_type_none   */
 +2, /* SH_type_l1_32_shadow   */
 +2, /* SH_type_fl1_32_shadow  */
 +4, /* SH_type_l2_32_shadow   */
 +1, /* SH_type_l1_pae_shadow  */
 +1, /* SH_type_fl1_pae_shadow */
 +1, /* SH_type_l2_pae_shadow  */
 +1, /* SH_type_l2h_pae_shadow */
 +1, /* SH_type_l1_64_shadow   */
 +1, /* SH_type_fl1_64_shadow  */
 +1, /* SH_type_l2_64_shadow   */
 +1, /* SH_type_l2h_64_shadow  */
 +1, /* SH_type_l3_64_shadow   */
 +1, /* SH_type_l4_64_shadow   */
 +1, /* SH_type_p2m_table  */
 +1, /* SH_type_monitor_table  */
 +1  /* SH_type_oos_snapshot   */
 +};
 +ASSERT(shadow_type  SH_type_unused);
 +return type_to_size[shadow_type];
 +}

Isn't this going to lead to multiple instances of that static array?

 @@ -668,46 +697,40 @@ static inline int sh_pin(struct vcpu *v, mfn_t smfn)
   * of pinned shadows, and release the extra ref. */
  static inline void sh_unpin(struct vcpu *v, mfn_t smfn)
  {
 -struct page_list_head h, *pin_list;
 -struct page_info *sp;
 -
 +struct page_list_head tmp_list, *pin_list;
 +struct page_info *sp, *next;
 +unsigned int i, head_type;
 +
  ASSERT(mfn_valid(smfn));
  sp = mfn_to_page(smfn);
 +head_type = sp-u.sh.type;
  ASSERT(sh_type_is_pinnable(v, sp-u.sh.type));
  ASSERT(sp-u.sh.head);
  
 -/* Treat the up-to-four pages of the shadow as a unit in the list ops */
 -h.next = h.tail = sp; 
 -if ( sp-u.sh.type == SH_type_l2_32_shadow ) 
 -{
 -h.tail = pdx_to_page(h.tail-list.next);
 -h.tail = pdx_to_page(h.tail-list.next);
 -h.tail = pdx_to_page(h.tail-list.next);
 -ASSERT(h.tail-u.sh.type == SH_type_l2_32_shadow); 
 -}
 -pin_list = v-domain-arch.paging.shadow.pinned_shadows;
 -
  if ( !sp-u.sh.pinned )
  return;
 -
  sp-u.sh.pinned = 0;
  
 -/* Cut the sub-list out of the list of pinned shadows */
 -if ( pin_list-next == h.next  pin_list-tail == h.tail )
 -pin_list-next = pin_list-tail = NULL;
 -else 
 +/* Cut the sub-list out of the list of pinned shadows,
 + * stitching it back into a list fragment of its own. */
 +pin_list = v-domain-arch.paging.shadow.pinned_shadows;
 +INIT_PAGE_LIST_HEAD(tmp_list);
 +for ( i = 0; i  shadow_size(head_type); i++ )
  {
 -if ( pin_list-next == h.next )
 -pin_list-next = page_list_next(h.tail, pin_list);
 -else
 -page_list_prev(h.next, pin_list)-list.next = h.tail-list.next;
 -if ( pin_list-tail == h.tail )
 -pin_list-tail = page_list_prev(h.next, pin_list);
 -else
 -page_list_next(h.tail, pin_list)-list.prev = h.next-list.prev;
 +ASSERT(sp-u.sh.type == head_type);
 +ASSERT(!i || !sp-u.sh.head);
 +next = page_list_next(sp, pin_list);
 +page_list_del(sp, pin_list);
 +page_list_add_tail(sp, tmp_list);
 +sp = next;
  }
 -h.tail-list.next = h.next-list.prev = PAGE_LIST_NULL;
 -
 +#ifndef PAGE_LIST_NULL
 +/* The temporary list-head is on our stack.  Blank out the
 + * pointers to it in the shadows, just to get a clean failure if
 + * we accidentally follow them. */
 +tmp_list.next-prev = tmp_list.prev-next = NULL;
 +#endif

Same here. Perhaps worth putting into another inline helper?

Jan


Re: [Xen-devel] [PATCH 2/4] x86/mm: allow for building without shadow mode support

2015-01-30 Thread Jan Beulich
 On 29.01.15 at 18:34, t...@xen.org wrote:
 At 11:10 + on 28 Jan (1422439811), Andrew Cooper wrote:
 On 28/01/15 08:11, Jan Beulich wrote:
  --- a/xen/arch/x86/mm/shadow/Makefile
  +++ b/xen/arch/x86/mm/shadow/Makefile
  @@ -1,4 +1,5 @@
  -obj-$(x86_64) += common.o guest_2.o guest_3.o guest_4.o
  +obj-y:= none.o
  +obj-$(shadow-paging) := common.o guest_2.o guest_3.o guest_4.o
   
 
 Can this be
 
 ifeq($(shadow-paging),y)
 obj-y := common.o guest_2.o guest_3.o guest_4.o
 else
 obj-y := none.o
 endif
 
 Rather than relying on the double := to clobber none.o and prevent a
 link failure ?
 
 +1 to this too, for readability.

As you both ask for it I'll do it, but very reluctantly, as to me this
makes it worse to read, not better.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] Security patching of XenProject's wiki, mail, blog, etc

2015-01-30 Thread Birin Sanchez
Hi Xen developers,

Just a heads up as we will be applying some security patches to our VMs
hosting several XenProject services like wiki, mail, blog, etc.

These updates will require rebooting the VMs so the services will be
down for a short period of time (5 - 10 minutes) within the next hour.

Cheers,

Birin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/3] A few VPMU/watchdog-related patches

2015-01-30 Thread Jan Beulich
 On 29.01.15 at 18:58, boris.ostrov...@oracle.com wrote:
 The first patch is a light cleanup of nmi_watchdog usage. (I removed
 NMI_INVALID definition but left NMI_IO_APIC. Do we actually need it? I
 don't see anyone using it except for one place, which doesn't really
 use it neither)

As we really should gain back support for NMI_IO_APIC (considering
that there are quite a few systems where the LAPIC variant doesn't
work), I think it shouldn't be dropped.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] open-coded page list manipulation in shadow code

2015-01-30 Thread Tim Deegan
At 08:36 + on 30 Jan (1422603387), Jan Beulich wrote:
  On 29.01.15 at 18:30, t...@xen.org wrote:
  OK, here's what I have.  It keeps the mechanism the same, threading on
  the main page list entry, but tidies it up to use the page_list operations
  rather than open-coding.  I've tested it (lightly - basically jsut
  boot testing) with 32bit, 32pae and 64bit Windows VMs (all SMP), with
  bigmem=y and bigmem=n.
  
  It was developed on top of your bigmem series, but it should apply OK
  before patch 3/4, removing the need to nobble shadow-paging in that
  patch.
 
 Thanks, looks quite okay, just a couple of remarks.
 
  @@ -1525,6 +1495,14 @@ mfn_t shadow_alloc(struct domain *d,
   if ( shadow_type = SH_type_min_shadow 
 shadow_type = SH_type_max_shadow )
   sp-u.sh.head = 1;
  +
  +#ifndef PAGE_LIST_NULL
  +/* The temporary list-head is on our stack.  Blank out the
  + * pointers to it in the shadows, just to get a clean failure if
  + * we accidentally follow them. */
  +tmp_list.next-prev = tmp_list.prev-next = NULL;
  +#endif
 
 Perhaps better to use LIST_POISON{1,2} here, so we don't point into
 PV VA space?

Yep, will do.

  --- a/xen/arch/x86/mm/shadow/private.h
  +++ b/xen/arch/x86/mm/shadow/private.h
  @@ -318,6 +318,33 @@ static inline int mfn_oos_may_write(mfn_t gmfn)
   }
   #endif /* (SHADOW_OPTIMIZATIONS  SHOPT_OUT_OF_SYNC) */
   
  +/* Figure out the size (in pages) of a given shadow type */
  +static inline u32
  +shadow_size(unsigned int shadow_type) 
  +{
  +static const u32 type_to_size[SH_type_unused] = {
  +1, /* SH_type_none   */
  +2, /* SH_type_l1_32_shadow   */
  +2, /* SH_type_fl1_32_shadow  */
  +4, /* SH_type_l2_32_shadow   */
  +1, /* SH_type_l1_pae_shadow  */
  +1, /* SH_type_fl1_pae_shadow */
  +1, /* SH_type_l2_pae_shadow  */
  +1, /* SH_type_l2h_pae_shadow */
  +1, /* SH_type_l1_64_shadow   */
  +1, /* SH_type_fl1_64_shadow  */
  +1, /* SH_type_l2_64_shadow   */
  +1, /* SH_type_l2h_64_shadow  */
  +1, /* SH_type_l3_64_shadow   */
  +1, /* SH_type_l4_64_shadow   */
  +1, /* SH_type_p2m_table  */
  +1, /* SH_type_monitor_table  */
  +1  /* SH_type_oos_snapshot   */
  +};
  +ASSERT(shadow_type  SH_type_unused);
  +return type_to_size[shadow_type];
  +}
 
 Isn't this going to lead to multiple instances of that static array?

Urgh, maybe.  I'd have thought this would end up as a common symbol
(if that's the term I mean - one where the linker will merge identical
copies) but I didn't check what actually happens.  I'll move it back
into a .c and just have the lookup function in the header.

  +#ifndef PAGE_LIST_NULL
  +/* The temporary list-head is on our stack.  Blank out the
  + * pointers to it in the shadows, just to get a clean failure if
  + * we accidentally follow them. */
  +tmp_list.next-prev = tmp_list.prev-next = NULL;
  +#endif
 
 Same here. Perhaps worth putting into another inline helper?

Yep, will do so for v2.

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/5] hvm_complete_assist_req: We should not be able to get here on IOREQ_TYPE_PCI_CONFIG

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 01:52, dsl...@verizon.com wrote:

According to your 0/5
Suggested-by: Paul Durrant paul.durr...@citrix.com
?

Jan

 Signed-off-by: Don Slutz dsl...@verizon.com
 ---
  xen/arch/x86/hvm/hvm.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
 index c7984d1..6f7b407 100644
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2573,6 +2573,7 @@ bool_t hvm_send_assist_req_to_ioreq_server(struct 
 hvm_ioreq_server *s,
  
  static bool_t hvm_complete_assist_req(ioreq_t *p)
  {
 +ASSERT(p-type != IOREQ_TYPE_PCI_CONFIG);
  switch ( p-type )
  {
  case IOREQ_TYPE_COPY:
 -- 
 1.8.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] hvm_complete_assist_req: Tell caller we failed to send

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 01:52, dsl...@verizon.com wrote:
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2599,7 +2599,7 @@ static bool_t hvm_complete_assist_req(ioreq_t *p)
  break;
  }
  
 -return 1;
 +return 0; /* implicitly bins the i/o operation */
  }

This change points out that having hvm_complete_assist_req() be a
separate function yet having only a single caller, and it returning
non-void with only a single possible return value isn't the best
arrangement. I think this should be brought back into
hvm_send_assist_req(), by inverting the if() condition there. Unless
there are intentions for it to have another caller, but in that case
it should still be made return void, with the caller choosing what to
return.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] New IOREQ type -- IOREQ_TYPE_VMWARE_PORT

2015-01-30 Thread Paul Durrant
 -Original Message-
 From: Don Slutz [mailto:dsl...@verizon.com]
 Sent: 29 January 2015 19:41
 To: Paul Durrant; Don Slutz; qemu-de...@nongnu.org; Stefano Stabellini
 Cc: Peter Maydell; Olaf Hering; Alexey Kardashevskiy; Stefan Weil; Michael
 Tokarev; Alexander Graf; Gerd Hoffmann; Stefan Hajnoczi; Paolo Bonzini;
 xen-devel
 Subject: New IOREQ type -- IOREQ_TYPE_VMWARE_PORT
 
  On 01/29/15 07:09, Paul Durrant wrote:
 ...
  Given that IIRC you are using a new dedicated IOREQ type, I
  think there needs to be something that allows an emulator to
  register for this IOREQ type. How about adding a new type to
  those defined for HVMOP_map_io_range_to_ioreq_server for your
  case? (In your case the start and end values in the hypercall
  would be meaningless but it could be used to steer
  hvm_select_ioreq_server() into sending all emulation requests or
  your new type to QEMU.
 
 
 This is an interesting idea.  Will need to spend more time on it.
 
 
  Actually such a mechanism could be used
  to steer IOREQ_TYPE_TIMEOFFSET requests as, with the new QEMU
  patches, they are going nowhere. Upstream QEMU (as default) used
  to ignore them anyway, which is why I didn't bother with such a
  patch to Xen before but since you now need one maybe you could
  add that too?
 
 
 I think it would not be that hard.  Will look into adding it.
 
 
 Currently I do not see how hvm_do_resume() works with 2 ioreq servers.
 It looks like to me that if a vpcu (like 0) needs to wait for the
 2nd ioreq server, hvm_do_resume() will check the 1st ioreq server
 and return as if the ioreq is done.  What am I missing?
 

hvm_do_resume() walks the ioreq server list looking at the IOREQ state in the 
shared page of each server in turn. If no IOREQ was sent to that server then 
then state will be IOREQ_NONE and hvm_wait_for_io() will return 1 immediately 
so the outer loop in hvm_do_resume() will move on to the next server. If a 
state of IOREQ_READY or IOREQ_INPROCESS is found then the vcpu blocks on the 
relevant event channel until the state transitions to IORESP_READY. The IOREQ 
is then completed and the loop moves on to the next server.
Normally an IOREQ would only be directed at one server and indeed IOREQs that 
are issued for emulation requests (i.e. when io_state != HVMIO_none) fall into 
this category but there is one case of a broadcast IOREQ, which is the 
INVALIDATE IOREQ (sent to tell emulators to invalidate any mappings of guest 
memory they may have cached) and that is why hvm_do_resume() has to iterate 
over all servers.

Does that make sense?

  Paul

-Don Slutz


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/5] hvmemul_do_io: If the send to the ioreq server failed do not retry.

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 01:52, dsl...@verizon.com wrote:
 I.E. do just what no backing DM does.

_If_ this is correct, the if() modified here should be folded with the
one a few lines up. But looking at the description of the commit that
introduced this (bac0999325 x86 hvm: Do not incorrectly retire an
instruction emulation..., almost immediately modified by f20f3c8ece
x86 hvm: On failed hvm_send_assist_req(), io emulation...) I doubt
this is really what we want, or at the very least your change
description should explain what was wrong with the original commit.

Jan

 --- a/xen/arch/x86/hvm/emulate.c
 +++ b/xen/arch/x86/hvm/emulate.c
 @@ -228,7 +228,11 @@ static int hvmemul_do_io(
  {
  rc = X86EMUL_RETRY;
  if ( !hvm_send_assist_req(p) )
 +{
 +/* Since the send failed, do not retry */
 +rc = X86EMUL_OKAY;
  vio-io_state = HVMIO_none;
 +}
  else if ( p_data == NULL )
  rc = X86EMUL_OKAY;
  }
 -- 
 1.8.4




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 05/29] ArmVirtualizationPkg: allow patchable PCD for device tree base address

2015-01-30 Thread Laszlo Ersek
On 01/26/15 20:03, Ard Biesheuvel wrote:
 To allow a runtime self relocating PrePi instance to discover the base
 address of the device tree at runtime, allow the use of a patchable PCD
 for gArmVirtualizationTokenSpaceGuid.PcdDeviceTreeInitialBaseAddress.
 We will not be using the build time patch tool in this case, but using
 a patchable PCD will make the build system aware that its value is not
 a compile time constant.
 
 Contributed-under: TianoCore Contribution Agreement 1.0
 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org
 ---
  ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec 
| 2 +-
  
 ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c
  | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec 
 b/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec
 index 99411548aff6..d83117fc6abe 100644
 --- a/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec
 +++ b/ArmPlatformPkg/ArmVirtualizationPkg/ArmVirtualizationPkg.dec
 @@ -34,7 +34,7 @@
gArmVirtualizationTokenSpaceGuid = { 0x0B6F5CA7, 0x4F53, 0x445A, { 0xB7, 
 0x6E, 0x2E, 0x36, 0x5B, 0x80, 0x63, 0x66 } }
gEarlyPL011BaseAddressGuid   = { 0xB199DEA9, 0xFD5C, 0x4A84, { 0x80, 
 0x82, 0x2F, 0x41, 0x70, 0x78, 0x03, 0x05 } }
  
 -[PcdsFixedAtBuild]
 +[PcdsFixedAtBuild, PcdsPatchableInModule]
#
# This is the physical address where the device tree is expected to be 
 stored
# upon first entry into UEFI. This needs to be a FixedAtBuild PCD, so that 
 we
 diff --git 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c
  
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c
 index aa4ced4582e8..3e3074af72f1 100644
 --- 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c
 +++ 
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationPlatformLib/Virt.c
 @@ -96,7 +96,7 @@ ArmPlatformInitializeSystemMemory (
ASSERT (HobData != NULL);
*HobData = 0;
  
 -  DeviceTreeBase = (VOID *)(UINTN)FixedPcdGet64 
 (PcdDeviceTreeInitialBaseAddress);
 +  DeviceTreeBase = (VOID *)(UINTN)PcdGet64 (PcdDeviceTreeInitialBaseAddress);
ASSERT (DeviceTreeBase != NULL);
  
//
 

Reviewed-by: Laszlo Ersek ler...@redhat.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] stubdom vtpm build failure in staging

2015-01-30 Thread Olaf Hering
On Thu, Jan 29, Ian Campbell wrote:

 This compiles for me on Debian with gcc 4.7.2-5.

Same here.

 I'm honestly not sure what the C standard says about duplicate but
 identical typedefs -- Ian?

How would that matter now? ;-) Old gcc can not deal with it, the commits
cause a regression for gcc-4.5 and older.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl

2015-01-30 Thread Razvan Cojocaru
On 01/30/2015 12:38 PM, Tamas K Lengyel wrote:
 On Fri, Jan 30, 2015 at 8:58 AM, Razvan Cojocaru
 rcojoc...@bitdefender.com wrote:
 On 01/29/2015 11:46 PM, Tamas K Lengyel wrote:
 diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
 index c7a0bde..3b58700 100644
 --- a/xen/arch/x86/mm/p2m.c
 +++ b/xen/arch/x86/mm/p2m.c
 @@ -1445,15 +1445,6 @@ void p2m_vm_event_emulate_check(struct vcpu *v, 
 const vm_event_response_t *rsp)
  }
  }

 -void p2m_setup_introspection(struct domain *d)
 -{
 -if ( hvm_funcs.enable_msr_exit_interception )
 -{
 -d-arch.hvm_domain.introspection_enabled = 1;
 -hvm_funcs.enable_msr_exit_interception(d);
 -}
 -}
 -
  bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
  struct npfec npfec,
  vm_event_request_t **req_ptr)

 I see that introspection_enabled is no longer assigned here ...
 
 Introspection_enabled is getting deprecated in this patch and is moved
 into the monitor_op domctl.

Moved, right, but where?

This patch removes the d-arch.hvm_domain.introspection_enabled = 1
assignment from here but AFAICT it doesn't move it anwyhere else. It
just removes it, so introspection_enabled is always 0.

I couldn't find any place in this series where introspection_enabled =
1. Could you please point it out to me if I've missed it by accident?

 diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
 index 0db899e..0b30750 100644
 --- a/xen/common/vm_event.c
 +++ b/xen/common/vm_event.c
 @@ -617,16 +617,10 @@ int vm_event_domctl(struct domain *d, 
 xen_domctl_vm_event_op_t *vec,
  switch( vec-op )
  {
  case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE:
 -case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION:
  {
  rc = vm_event_enable(d, vec, ved, _VPF_mem_access,
  HVM_PARAM_MONITOR_RING_PFN,
  mem_access_notification);
 -
 -if ( vec-op == 
 XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION
 -  !rc )
 -p2m_setup_introspection(d);
 -
  }
  break;

 @@ -635,7 +629,6 @@ int vm_event_domctl(struct domain *d, 
 xen_domctl_vm_event_op_t *vec,
  if ( ved-ring_page )
  {
  rc = vm_event_disable(d, ved);
 -d-arch.hvm_domain.introspection_enabled = 0;
  }
  }
  break;

 ... nor here. Patch 6/12 checks it but doesn't set it. Patch 5/12 sets
 it to 0 (which could account for the removal of the assignment in
 vm_event.c) but never to 1. A few important things depend on it being
 enabled: it becomes impossible to disable interception for a select set
 of MSRs, optimization for RET instructions emulation is disabled, and
 othere places in p2m.c makes use of the flag as well.

 Is there some place in the code, untouched by this series, where
 introspection_enabled is being set to 1?
 
 It is moved into the monitor_op domctl when mov_to_msr trapping is
 enabled. The reason of having introspection_enabled AFAIU was to
 reenable trapping MSR's that were disabled shortly after boot. Thus,
 an option field is present in the monitor_op when enabling mov_to_msr
 trapping: extended_capture. Let me know if this still achieves the
 same effect as before!

No, there are several places where introspection_enabled is needed, as
I've said before. Not just the MSRs.

One important place is in hvmemul_virtual_to_linear(), in
xen/arch/x86/hvm/emulate.c, where we disable optimizations for REP
instructions (in today's version of mainline Xen, at line 413).

There are also places in patches yet to be published I've worked on
where we gate things on introspection_enabled being != 0, so please
don't remove it or have it misbehave.

I'll look into the extended_capture option in case it will allow future
removal of the MSR special case for introspection, but a flag like that
is necessary and can't simply be deprecated and removed.


Thanks,
Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 13/29] ArmVirtualizationPkg: allow patchable PCD for FV base address

2015-01-30 Thread Laszlo Ersek
On 01/26/15 20:03, Ard Biesheuvel wrote:
 Allow the use of a patchable PCD for gArmTokenSpaceGuid.PcdFvBaseAddress
 by moving it from the [FixedPcd] to the [Pcd] section in the INF file of
 PlatformPeiLib.
 
 Contributed-under: TianoCore Contribution Agreement 1.0
 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org
 ---
  
 ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf 
 | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)
 
 diff --git 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf
  
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf
 index 96019e4009ff..1fca9b28f9e2 100644
 --- 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf
 +++ 
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformPeiLib/PlatformPeiLib.inf
 @@ -37,9 +37,11 @@
FdtLib
  
  [FixedPcd]
 -  gArmTokenSpaceGuid.PcdFvBaseAddress
gArmTokenSpaceGuid.PcdFvSize
 +
 +[Pcd]
gArmVirtualizationTokenSpaceGuid.PcdDeviceTreeInitialBaseAddress
 +  gArmTokenSpaceGuid.PcdFvBaseAddress
  
  [Guids]
gEarlyPL011BaseAddressGuid
 

It also seems to change the PCD type of PcdDeviceTreeInitialBaseAddress.
Care to mention that too in the commit message, just for completeness?

Reviewed-by: Laszlo Ersek ler...@redhat.com

Thanks
Laszlo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 3/3] xen: support suspend/resume in pvscsi frontend

2015-01-30 Thread Juergen Gross
Up to now the pvscsi frontend hasn't supported domain suspend and
resume. When a domain with an assigned pvscsi device was suspended
and resumed again, it was not able to use the device any more: trying
to do so resulted in hanging processes.

Support suspend and resume of pvscsi devices.

Signed-off-by: Juergen Gross jgr...@suse.com
---
 drivers/scsi/xen-scsifront.c | 189 ---
 1 file changed, 162 insertions(+), 27 deletions(-)

diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
index 34199d2..b32157b 100644
--- a/drivers/scsi/xen-scsifront.c
+++ b/drivers/scsi/xen-scsifront.c
@@ -63,6 +63,7 @@
 
 #define VSCSIFRONT_OP_ADD_LUN  1
 #define VSCSIFRONT_OP_DEL_LUN  2
+#define VSCSIFRONT_OP_READD_LUN3
 
 /* Tuning point. */
 #define VSCSIIF_DEFAULT_CMD_PER_LUN 10
@@ -113,8 +114,13 @@ struct vscsifrnt_info {
DECLARE_BITMAP(shadow_free_bitmap, VSCSIIF_MAX_REQS);
struct vscsifrnt_shadow *shadow[VSCSIIF_MAX_REQS];
 
+   /* Following items are protected by the host lock. */
wait_queue_head_t wq_sync;
+   wait_queue_head_t wq_pause;
unsigned int wait_ring_available:1;
+   unsigned int waiting_pause:1;
+   unsigned int pause:1;
+   unsigned callers;
 
char dev_state_path[64];
struct task_struct *curr;
@@ -274,31 +280,31 @@ static void scsifront_sync_cmd_done(struct vscsifrnt_info 
*info,
wake_up(shadow-wq_reset);
 }
 
-static int scsifront_cmd_done(struct vscsifrnt_info *info)
+static void scsifront_do_response(struct vscsifrnt_info *info,
+ struct vscsiif_response *ring_rsp)
+{
+   if (WARN(ring_rsp-rqid = VSCSIIF_MAX_REQS ||
+test_bit(ring_rsp-rqid, info-shadow_free_bitmap),
+illegal rqid %u returned by backend!\n, ring_rsp-rqid))
+   return;
+
+   if (info-shadow[ring_rsp-rqid]-act == VSCSIIF_ACT_SCSI_CDB)
+   scsifront_cdb_cmd_done(info, ring_rsp);
+   else
+   scsifront_sync_cmd_done(info, ring_rsp);
+}
+
+static int scsifront_ring_drain(struct vscsifrnt_info *info)
 {
struct vscsiif_response *ring_rsp;
RING_IDX i, rp;
int more_to_do = 0;
-   unsigned long flags;
-
-   spin_lock_irqsave(info-host-host_lock, flags);
 
rp = info-ring.sring-rsp_prod;
rmb();  /* ordering required respective to dom0 */
for (i = info-ring.rsp_cons; i != rp; i++) {
-
ring_rsp = RING_GET_RESPONSE(info-ring, i);
-
-   if (WARN(ring_rsp-rqid = VSCSIIF_MAX_REQS ||
-test_bit(ring_rsp-rqid, info-shadow_free_bitmap),
-illegal rqid %u returned by backend!\n,
-ring_rsp-rqid))
-   continue;
-
-   if (info-shadow[ring_rsp-rqid]-act == VSCSIIF_ACT_SCSI_CDB)
-   scsifront_cdb_cmd_done(info, ring_rsp);
-   else
-   scsifront_sync_cmd_done(info, ring_rsp);
+   scsifront_do_response(info, ring_rsp);
}
 
info-ring.rsp_cons = i;
@@ -308,6 +314,18 @@ static int scsifront_cmd_done(struct vscsifrnt_info *info)
else
info-ring.sring-rsp_event = i + 1;
 
+   return more_to_do;
+}
+
+static int scsifront_cmd_done(struct vscsifrnt_info *info)
+{
+   int more_to_do;
+   unsigned long flags;
+
+   spin_lock_irqsave(info-host-host_lock, flags);
+
+   more_to_do = scsifront_ring_drain(info);
+
info-wait_ring_available = 0;
 
spin_unlock_irqrestore(info-host-host_lock, flags);
@@ -328,6 +346,24 @@ static irqreturn_t scsifront_irq_fn(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+static void scsifront_finish_all(struct vscsifrnt_info *info)
+{
+   unsigned i;
+   struct vscsiif_response resp;
+
+   scsifront_ring_drain(info);
+
+   for (i = 0; i  VSCSIIF_MAX_REQS; i++) {
+   if (test_bit(i, info-shadow_free_bitmap))
+   continue;
+   resp.rqid = i;
+   resp.sense_len = 0;
+   resp.rslt = DID_RESET  16;
+   resp.residual_len = 0;
+   scsifront_do_response(info, resp);
+   }
+}
+
 static int map_data_for_request(struct vscsifrnt_info *info,
struct scsi_cmnd *sc,
struct vscsiif_request *ring_req,
@@ -475,6 +511,27 @@ static struct vscsiif_request *scsifront_command2ring(
return ring_req;
 }
 
+static int scsifront_enter(struct vscsifrnt_info *info)
+{
+   if (info-pause)
+   return 1;
+   info-callers++;
+   return 0;
+}
+
+static void scsifront_return(struct vscsifrnt_info *info)
+{
+   info-callers--;
+   if (info-callers)
+   return;
+
+   if (!info-waiting_pause)
+   return;
+
+   info-waiting_pause = 0;
+   wake_up(info-wq_pause);
+}

[Xen-devel] [PATCH 2/3] xen: scsiback: add LUN of restored domain

2015-01-30 Thread Juergen Gross
When a xen domain is being restored the LUN state of a pvscsi device
is Connected and not Initialising as in case of attaching a new
pvscsi LUN.

This must be taken into account when adding a new pvscsi device for
a domain as otherwise the pvscsi LUN won't be connected to the
SCSI target associated with it.

Signed-off-by: Juergen Gross jgr...@suse.com
---
 drivers/xen/xen-scsiback.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 6457784..4290921 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -993,7 +993,7 @@ found:
 }
 
 static void scsiback_do_add_lun(struct vscsibk_info *info, const char *state,
-   char *phy, struct ids_tuple *vir)
+   char *phy, struct ids_tuple *vir, int try)
 {
if (!scsiback_add_translation_entry(info, phy, vir)) {
if (xenbus_printf(XBT_NIL, info-dev-nodename, state,
@@ -1001,7 +1001,7 @@ static void scsiback_do_add_lun(struct vscsibk_info 
*info, const char *state,
pr_err(xen-pvscsi: xenbus_printf error %s\n, state);
scsiback_del_translation_entry(info, vir);
}
-   } else {
+   } else if (!try) {
xenbus_printf(XBT_NIL, info-dev-nodename, state,
  %d, XenbusStateClosed);
}
@@ -1061,10 +1061,19 @@ static void scsiback_do_1lun_hotplug(struct 
vscsibk_info *info, int op,
 
switch (op) {
case VSCSIBACK_OP_ADD_OR_DEL_LUN:
-   if (device_state == XenbusStateInitialising)
-   scsiback_do_add_lun(info, state, phy, vir);
-   if (device_state == XenbusStateClosing)
+   switch (device_state) {
+   case XenbusStateInitialising:
+   scsiback_do_add_lun(info, state, phy, vir, 0);
+   break;
+   case XenbusStateConnected:
+   scsiback_do_add_lun(info, state, phy, vir, 1);
+   break;
+   case XenbusStateClosing:
scsiback_do_del_lun(info, state, vir);
+   break;
+   default:
+   break;
+   }
break;
 
case VSCSIBACK_OP_UPDATEDEV_STATE:
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read

2015-01-30 Thread Juergen Gross
A request in the ring buffer mustn't be read after it has been marked
as consumed. Otherwise it might already have been reused by the
frontend without violating the ring protocol.

To avoid inconsistencies in the backend only work on a private copy
of the request. This will ensure a malicious guest not being able to
bypass consistency checks of the backend by modifying an active
request.

Signed-off-by: Juergen Gross jgr...@suse.com
---
 drivers/xen/xen-scsiback.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index e999496e..6457784 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -708,7 +708,7 @@ static int prepare_pending_reqs(struct vscsibk_info *info,
 static int scsiback_do_cmd_fn(struct vscsibk_info *info)
 {
struct vscsiif_back_ring *ring = info-ring;
-   struct vscsiif_request *ring_req;
+   struct vscsiif_request ring_req;
struct vscsibk_pend *pending_req;
RING_IDX rc, rp;
int err, more_to_do;
@@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
if (!pending_req)
return 1;
 
-   ring_req = RING_GET_REQUEST(ring, rc);
+   memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req));
ring-req_cons = ++rc;
 
-   act = ring_req-act;
-   err = prepare_pending_reqs(info, ring_req, pending_req);
+   act = ring_req.act;
+   err = prepare_pending_reqs(info, ring_req, pending_req);
if (err) {
switch (err) {
case -ENODEV:
@@ -756,7 +756,7 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
 
switch (act) {
case VSCSIIF_ACT_SCSI_CDB:
-   if (scsiback_gnttab_data_map(ring_req, pending_req)) {
+   if (scsiback_gnttab_data_map(ring_req, pending_req)) {
scsiback_fast_flush_area(pending_req);
scsiback_do_resp_with_sense(NULL,
DRIVER_ERROR  24, 0, pending_req);
@@ -767,7 +767,7 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
break;
case VSCSIIF_ACT_SCSI_ABORT:
scsiback_device_action(pending_req, TMR_ABORT_TASK,
-   ring_req-ref_rqid);
+   ring_req.ref_rqid);
break;
case VSCSIIF_ACT_SCSI_RESET:
scsiback_device_action(pending_req, TMR_LUN_RESET, 0);
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/xenctrl: correct some function declarations

2015-01-30 Thread Wei Liu
On Fri, Jan 30, 2015 at 03:32:26PM +0800, Tiejun Chen wrote:
 When commit 6865e52b78f4, PCI multi-seg: adjust domctl interface,
 is introduced, we missed to sync that head file.
 
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

Acked-by: Wei Liu wei.l...@citrix.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v17 13/23] x86/VPMU: Interface for setting PMU mode and flags

2015-01-30 Thread Jan Beulich
 On 05.01.15 at 22:44, boris.ostrov...@oracle.com wrote:
 +static long vpmu_unload_next(void *arg)
 +{
 +struct vcpu *last;
 +int ret;
 +unsigned int thiscpu = smp_processor_id();
 +
 +if ( thiscpu != vpmu_next_unload_cpu )
 +{
 +/* Continuation thread may have been moved due to CPU hot-unplug */
 +vpmu_mode = (unsigned long)arg;
 +vpmu_first_unload_cpu = VPMU_INVALID_CPU;
 +return -EAGAIN;
 +}
 +
 +local_irq_disable(); /* so that last_vcpu doesn't change under us. */
 +
 +last = this_cpu(last_vcpu);
 +if ( last )
 +{
 +vpmu_unload(vcpu_vpmu(last));
 +this_cpu(last_vcpu) = NULL;
 +}

So you do this for last_vcpu here, but ...


 +static int vpmu_unload_all(unsigned long old_mode)
 +{
 +int ret = 0;
 +
 +vpmu_unload(vcpu_vpmu(current));

... for current here, also without clearing this_cpu(last_vcpu). Is
that really correct?

 +long do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg)
 +{
 +int ret;
 +struct xen_pmu_params pmu_params;
 +
 +ret = xsm_pmu_op(XSM_OTHER, current-domain, op);
 +if ( ret )
 +return ret;
 +
 +switch ( op )
 +{
 +case XENPMU_mode_set:
 +{
 +unsigned int old_mode;
 +static DEFINE_SPINLOCK(xenpmu_mode_lock);
 +
 +if ( copy_from_guest(pmu_params, arg, 1) )
 +return -EFAULT;
 +
 +if ( pmu_params.val  ~(XENPMU_MODE_SELF | XENPMU_MODE_HV) )
 +return -EINVAL;
 +
 +/* 32-bit dom0 can only sample itself. */
 +if ( is_pv_32bit_vcpu(current)  (pmu_params.val  XENPMU_MODE_HV) )
 +return -EINVAL;
 +
 +/*
 + * Return error if someone else is in the middle of changing mode ---
 + * this is most likely indication of two system administrators
 + * working against each other.
 + */
 +if ( !spin_trylock(xenpmu_mode_lock) )
 +return -EAGAIN;
 +if ( vpmu_first_unload_cpu != VPMU_INVALID_CPU )
 +{
 +spin_unlock(xenpmu_mode_lock);
 +return -EAGAIN;
 +}
 +
 +old_mode = vpmu_mode;
 +vpmu_mode = pmu_params.val;
 +
 +if ( vpmu_mode == XENPMU_MODE_OFF )
 +{
 +/* Make sure all (non-dom0) VCPUs have unloaded their VPMUs. */
 +ret = vpmu_unload_all(old_mode);
 +if ( ret )
 +vpmu_mode = old_mode;
 +}
 +
 +spin_unlock(xenpmu_mode_lock);
 +
 +break;
 +}
 +
 +case XENPMU_mode_get:
 +memset(pmu_params, 0, sizeof(pmu_params));
 +pmu_params.val = vpmu_mode;
 +
 +pmu_params.version.maj = XENPMU_VER_MAJ;
 +pmu_params.version.min = XENPMU_VER_MIN;
 +
 +if ( copy_to_guest(arg, pmu_params, 1) )
 +return -EFAULT;
 +break;
 +
 +case XENPMU_feature_set:
 +if ( copy_from_guest(pmu_params, arg, 1) )
 +return -EFAULT;
 +
 +if ( pmu_params.val  ~XENPMU_FEATURE_INTEL_BTS )
 +return -EINVAL;
 +
 +vpmu_features = pmu_params.val;
 +break;
 +
 +case XENPMU_feature_get:
 +pmu_params.val = vpmu_features;
 +if ( copy_field_to_guest(arg, pmu_params, val) )
 +return -EFAULT;
 +break;
 +
 +default:
 +ret = -EINVAL;
 +}
 +
 +return ret;
 +}

Throughout this function, the pad field isn't being checked to be
zero (and XENPMU_feature_get also doesn't clear it, but that seems
secondary, at least as long as it's being declared IN only). As I'm
sure I said before - we ought to do this in order to be able to assign
meaning to the field later on. Perhaps it would even better be
renamed to e.g. mbz.

 @@ -144,6 +145,9 @@ do_tmem_op(
  extern long
  do_xenoprof_op(int op, XEN_GUEST_HANDLE_PARAM(void) arg);
  
 +extern long
 +do_xenpmu_op(int op, XEN_GUEST_HANDLE_PARAM(xen_pmu_params_t) arg);

Despite there being numerous bad examples - can op please be
unsigned int?

 --- a/xen/include/xsm/xsm.h
 +++ b/xen/include/xsm/xsm.h
 @@ -173,6 +173,7 @@ struct xsm_operations {
  int (*unbind_pt_irq) (struct domain *d, struct xen_domctl_bind_pt_irq 
 *bind);
  int (*ioport_permission) (struct domain *d, uint32_t s, uint32_t e, 
 uint8_t allow);
  int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t 
 allow);
 +int (*pmu_op) (struct domain *d, int op);

And then here too?

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC V10 0/4] domain snapshot document

2015-01-30 Thread Wei Liu
On Thu, Jan 29, 2015 at 04:36:43PM +, Ian Campbell wrote:
 On Mon, 2015-01-26 at 11:25 +0800, Chunyan Liu wrote:
  Changes to V9:
 
 This looks good to me, thanks.
 
 Ian/Wei do you have any comments? xapi folks?
 

Looks good to me.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl

2015-01-30 Thread Tamas K Lengyel

 There are also places in patches yet to be published I've worked on
 where we gate things on introspection_enabled being != 0, so please
 don't remove it or have it misbehave.

 I'll look into the extended_capture option in case it will allow future
 removal of the MSR special case for introspection, but a flag like that
 is necessary and can't simply be deprecated and removed.

 Ack, the plan was actually to replace all references to
 arch.hvm_domain.introspection_enabled with
 arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to
 actually fully follow through that plan but that's the intention at
 least. So the functionality would remain, it would just be worked into
 the coherent settings field with every other type of events.

If you take a look at the next iteration of this series at
https://github.com/tklengyel/xen/compare/mem_event_cleanup4?expand=1 I
updated this patch to implement what I described above.

Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 0/5] Split off mini-os to a separate tree

2015-01-30 Thread Wei Liu
On Mon, Jan 26, 2015 at 09:53:25AM +, Thomas Leonard wrote:
 [ Cc += Anil ]
 
 On 25 January 2015 at 18:25, Wei Liu wei.l...@citrix.com wrote:
  Cc Ian and Ian and some folks who might be interested in this work.
 
  On Sun, Jan 25, 2015 at 06:13:41PM +, Wei Liu wrote:
  There has been increasing use of mini-os in unikernel world and basically
  everybody has their own fork of mini-os. That way going foward is going
  to cause fragmentation of the community.
 
  We would like to split off mini-os tree so that everybody can upstream 
  their
  changes and work on a common code base. Later I would also like to setup
  a proper push gate for mini-os.
 
  Stubdom's build environment is known to be very fragile, so basically all 
  the
  real work is done in top level Makefile.
 
  I use following runes to split off mini-os:
 
git filter-branch --tag-name-filter cat \
  --subdirectory-filter extras/mini-os/ -- --all
 
# There is already a tag name 4.3.0-rc2 which points to the same commit.
git tag -d xen-4.3.0-rc2
 
# Add xen- prefix to all tags
for t in `git tag`; do git tag xen-$t $t; git tag -d $t ; done
 
git gc --aggressive
 
  The tree can be found at:
git://xenbits.xen.org/people/liuw/mini-os.git master
 
  Note that mini-os cannot build on its own due to the limitation of it's own
  build system. Splitting it off it's the first step towards fixing that 
  problem
 
 In case it's useful: for the standalone version of Mini-OS used by
 Mirage, I had to include these directories too:
 
 1. xen/include/public
 2. xen/common/libfdt
 3. xen/include/xen/libfdt

Is libfdt required? I see inclusion of libfdt.h in arch/arm/mm.c but I
couldn't find trace to link libfdt in Makefile. And I ran a simple test
to compile minios in Xen tree on an arm board it doesn't compile.

If arm support is still WIP I'm going to ignore libfdt for now.

Wei.

 4. config
 
 ( https://github.com/talex5/xen/tree/minios-releases )
 
 Will there be a separate mailing list for Mini-OS? That would be very
 useful for people who don't want to keep up with xen-devel.
 
 Thanks!
 
 
 -- 
 Dr Thomas Leonardhttp://0install.net/
 GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
 GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl

2015-01-30 Thread Tamas K Lengyel

 Ack, the plan was actually to replace all references to
 arch.hvm_domain.introspection_enabled with
 arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to
 actually fully follow through that plan but that's the intention at
 least. So the functionality would remain, it would just be worked into
 the coherent settings field with every other type of events.

 I see, but is it conceivable that some vm_event consumer does want to
 enable arch.monitor_options.mov_to_msr.extended_capture but not be
 interested in doing full-blown introspection (for example, is fine with
 having the REP optimization enabled)?

I'm not sure I follow what the difference here between
mov_to_msr.extended_capture and what you refer to as full-blown
introspection. My understanding was that without gating some MSR's
via introspection_enabled, they are never trapped into the hypervisor
when they get written to, while other MSR do. Thus the two options:
mov_to_msr.enabled for basic trapping, and mov_to_msr.extended_capture
for gating the extended set of MSRs to be also trappable.


 What you're proposing here (as far as introspection_enabled is
 concerned) is, if I understand correctly, basically renaming
 introspection_enabled to mov_to_msr.extended_capture. I can see how that
 would seem to simplify some things, but it might not look very intuitive
 to future developers, and it will definitely complicate other things
 down the road.

No, not just simply renaming it. So far the options for the various VM
events were scattered all over the place, yours defined directly on
hvm_domain, others in hvm parameters. Now there is an assigned spot
for all current - and future - events to store their settings: in
arch_domain monitor_settings. This will work for PV domains as well,
while keeping it on the arch_domain layer will avoid defining options
on ARM that are architecture specific (mov_to_cr0/3/4 etc.).

Furthermore, setting these options was also a mess which I try to
address in this patch: some were enabled via hvm_op hypercalls, yours
via domctl. Now everything is moved into one spot: monitor_op domctl.


 An example is havig the guest trigger a vm_event, requested by the
 privileged userspace application. In our case, it was VMCALL in the
 original series, and the patch has been eventually dropped from the
 conversation - to be reworked at a later time. But we do need it, and
 our patch now does its thing gated on introspection_enabled. It's hard
 to connect that logic to mov_to_msr.extended_capture.

 I guess we could keep the flag and move it to arch.monitor_options if
 you prefer? And setting it could cause mov_to_msr.extended_capture and
 assorted flags to be set also (some sort of umbrella setting)?

Yes, that would be the logic going forward - all VM event related
options and settings should be stored in this structure. Also, I don't
see a problem with having an event or setting that enables multiple at
the same time.

Thanks,
Tamas



 Thanks,
 Razvan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read

2015-01-30 Thread Juergen Gross

On 01/30/2015 12:47 PM, Jan Beulich wrote:

On 30.01.15 at 12:21, jgr...@suse.com wrote:

@@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
if (!pending_req)
return 1;

-   ring_req = RING_GET_REQUEST(ring, rc);
+   memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req));


I'd recommend the type safe *ring_req = *RING_GET_REQUEST(ring, rc)
here.


I think I'll use ring_req = *RING_GET_REQUEST(ring, rc) :-)




ring-req_cons = ++rc;

-   act = ring_req-act;
-   err = prepare_pending_reqs(info, ring_req, pending_req);
+   act = ring_req.act;


Is this helper variable then still needed?


No, you're right. Will delete it.


Juergen

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Denis Drozdov
From: denys drozdov denys.droz...@globallogic.com

Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b
Signed-off-by: denys drozdov denys.droz...@globallogic.com
---
 xen/common/sched_rt.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index e70d6c7..1ab0a62 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct vcpu 
*vc)
 struct rt_vcpu *snext = NULL;
 struct rt_dom *sdom = NULL;
 struct rt_private *prv = rt_priv(ops);
-cpumask_t *online;
-spinlock_t *lock = vcpu_schedule_lock_irq(vc);
+cpumask_t *online; 
+unsigned long flags;
+spinlock_t *lock = vcpu_schedule_lock_irqsave(vc, flags);
 
 clear_bit(__RTDS_scheduled, svc-flags);
 /* not insert idle vcpu to runq */
@@ -1032,7 +1033,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu 
*vc)
 runq_tickle(ops, snext);
 }
 out:
-vcpu_schedule_unlock_irq(lock, vc);
+vcpu_schedule_unlock_irqrestore(lock, flags, vc);
 }
 
 /*
-- 
1.7.9.5


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 16/29] Ovmf/Xen: fix pointer to int cast in XenBusDxe

2015-01-30 Thread Laszlo Ersek
On 01/26/15 20:03, Ard Biesheuvel wrote:
 On ARM, xen_pfn_t is 64 bits but the size of a pointer is only
 32 bits, so casting between them needs to go via (UINTN). Also
 move the xen_pfn_t cast outside the shift so that we can avoid
 shifting 64-bit quantities on 32-bit architectures, which may
 require runtime library support.
 
 Contributed-under: TianoCore Contribution Agreement 1.0
 Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org
 ---
  OvmfPkg/XenBusDxe/GrantTable.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/OvmfPkg/XenBusDxe/GrantTable.c b/OvmfPkg/XenBusDxe/GrantTable.c
 index 37d3bf786c64..8405edc51bc4 100644
 --- a/OvmfPkg/XenBusDxe/GrantTable.c
 +++ b/OvmfPkg/XenBusDxe/GrantTable.c
 @@ -160,7 +160,7 @@ XenGrantTableInit (
  Parameters.domid = DOMID_SELF;
  Parameters.idx = Index;
  Parameters.space = XENMAPSPACE_grant_table;
 -Parameters.gpfn = (((xen_pfn_t) GrantTable)  EFI_PAGE_SHIFT) + Index;
 +Parameters.gpfn = (xen_pfn_t) ((UINTN) GrantTable  EFI_PAGE_SHIFT) + 
 Index;
  ReturnCode = XenHypercallMemoryOp (Dev, XENMEM_add_to_physmap, 
 Parameters);
  if (ReturnCode != 0) {
DEBUG ((EFI_D_ERROR, Xen GrantTable, add_to_physmap hypercall error: 
 %d\n, ReturnCode));
 @@ -182,7 +182,7 @@ XenGrantTableDeinit (
  
for (Index = NR_GRANT_FRAMES - 1; Index = 0; Index--) {
  Parameters.domid = DOMID_SELF;
 -Parameters.gpfn = (((xen_pfn_t) GrantTable)  EFI_PAGE_SHIFT) + Index;
 +Parameters.gpfn = (xen_pfn_t) ((UINTN) GrantTable  EFI_PAGE_SHIFT) + 
 Index;
  DEBUG ((EFI_D_INFO, Xen GrantTable, removing %X\n, Parameters.gpfn));
  ReturnCode = XenHypercallMemoryOp (Dev, XENMEM_remove_from_physmap, 
 Parameters);
  if (ReturnCode != 0) {
 

One edk2 coding style guideline that I consistently reject  never apply
is the first space in

  (type) expr1 op expr2

We had an obscure bug due to this guideline earlier, which I luckily
managed to catch through review, before it could do damage.

The problem with the first space is that it actively obscures the fact
that the cast operator has one of the strongest operator bindings. Most
of the time, the above means

  ((type)expr1) op expr2

and not

  (type)(expr1 op expr2)

as the space would incorrectly suggest.

Your patch is fine, but I had to think thrice. :)

Reviewed-by: Laszlo Ersek ler...@redhat.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/balloon: cancel ballooning if adding new memory failed

2015-01-30 Thread Ian Campbell
On Thu, 2015-01-29 at 18:01 +, David Vrabel wrote:
 On 29/01/15 13:36, Ian Campbell wrote:
  On Mon, 2014-09-01 at 18:52 +0100, David Vrabel wrote:
  If the balloon driver is adding additional memory regions to the
  balloon and add_memory() fails it will likely continuously fail so
  cancel the balloon operation.
 
  Signed-off-by: David Vrabel david.vra...@citrix.com
  
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776448 and
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1304001 seem to
  suggest this should be a candidate for stable backports?
 
 It's up to the distro kernel maintainer to request it if they think it
 is important.

Which is what I thought I was doing. Surely you don't think I should
just ask stable@ without conferring with the relevant maintainers first?

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl

2015-01-30 Thread Razvan Cojocaru
On 01/30/2015 01:45 PM, Tamas K Lengyel wrote:

 Ack, the plan was actually to replace all references to
 arch.hvm_domain.introspection_enabled with
 arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to
 actually fully follow through that plan but that's the intention at
 least. So the functionality would remain, it would just be worked into
 the coherent settings field with every other type of events.

 I see, but is it conceivable that some vm_event consumer does want to
 enable arch.monitor_options.mov_to_msr.extended_capture but not be
 interested in doing full-blown introspection (for example, is fine with
 having the REP optimization enabled)?
 
 I'm not sure I follow what the difference here between
 mov_to_msr.extended_capture and what you refer to as full-blown
 introspection. My understanding was that without gating some MSR's
 via introspection_enabled, they are never trapped into the hypervisor
 when they get written to, while other MSR do. Thus the two options:
 mov_to_msr.enabled for basic trapping, and mov_to_msr.extended_capture
 for gating the extended set of MSRs to be also trappable.

Well, write interception for the MSRs we're interested in is enabled
_initally_, but then vmx_disable_intercept_for_msr() gets called and
we're left without these events, so yes, in a nutshell we do want to
keep write events for more MSRs than the default allows around.

The initial, very basic patch:

http://lists.xenproject.org/archives/html/xen-devel/2014-07/msg00310.html

The point was to keep receiving these events even after the HV decides
the corresponding MSR writes have become unnecessarily expensive to
intercept, but only if the vm_event consumer is interested in guest
introspection.

By full-blown introspection I mean that, in order to do its job
properly, guest introspection needs a bunch of requirements to be met
(not just the proper MSR write events to be sent), not least of which is
to send an event per each write performed in a REP situation (not just
an event per page touched) - which is why, gated on introspection being
enabled, we disable the REP emulation optimization.

It also needs a number of other things, which will surely become RFCs
here at some point.

Hence my previous comments: full-blown vm_event-based introspection is
more than the MSR trick, so it might be slightly confusing to gate all
of them on mov_to_msr.extended_capture.

 What you're proposing here (as far as introspection_enabled is
 concerned) is, if I understand correctly, basically renaming
 introspection_enabled to mov_to_msr.extended_capture. I can see how that
 would seem to simplify some things, but it might not look very intuitive
 to future developers, and it will definitely complicate other things
 down the road.
 
 No, not just simply renaming it. So far the options for the various VM
 events were scattered all over the place, yours defined directly on
 hvm_domain, others in hvm parameters. Now there is an assigned spot
 for all current - and future - events to store their settings: in
 arch_domain monitor_settings. This will work for PV domains as well,
 while keeping it on the arch_domain layer will avoid defining options
 on ARM that are architecture specific (mov_to_cr0/3/4 etc.).
 
 Furthermore, setting these options was also a mess which I try to
 address in this patch: some were enabled via hvm_op hypercalls, yours
 via domctl. Now everything is moved into one spot: monitor_op domctl.

Yes, and that's great! Your efforts are appreciated, and I'm sorry to
have inadvertently contributed to the mess.

I was simply talking about the way introspection_enabled is being used:
you were taking before about replacing all instances of
introspection_enabled with mov_to_msr.enabled (which is functionally
speaking, renaming), and my only points
have been that:

A) in this series introspection_enabled has neither been removed nor is
it of any use, since it's always 0, and

B) exclusive use of mov_to_msr.extended_capture where
introspection_enabled has been used before might not logically cover all
the cases needed for proper guest introspection (not sure that gating
the code that disables the REP emulation optimization on
mov_to_msr.extended_capture wouldn't look strange to someone looking at
the code for the first time).

I hope I've been able to explain myself better this time, perhaps my
concerns are not shared by others, I'm certainly open to discussion.

 An example is havig the guest trigger a vm_event, requested by the
 privileged userspace application. In our case, it was VMCALL in the
 original series, and the patch has been eventually dropped from the
 conversation - to be reworked at a later time. But we do need it, and
 our patch now does its thing gated on introspection_enabled. It's hard
 to connect that logic to mov_to_msr.extended_capture.

 I guess we could keep the flag and move it to arch.monitor_options if
 you prefer? And setting it could cause mov_to_msr.extended_capture and
 

Re: [Xen-devel] [Qemu-devel] [RFC][PATCH 1/1] libxl: add one machine property to support IGD GFX passthrough

2015-01-30 Thread Wei Liu
On Fri, Jan 30, 2015 at 08:56:48AM +0800, Chen, Tiejun wrote:
[...]
 
 Just remember to handle old option in libxl if your old option is already
 released by some older version of QEMUs.
 
 I just drop that old option, -gfx_passthru, if we're under qemu upstream
 circumstance, like this,
 
 
 The question is, is there any version of qemu upstream that has
 been released that has the old option (-gfx-passthru)?
 
 No. Just now we're starting to support IGD passthrough in qemu upstream.
 

Right, as of QEMU 2.2.0 there's no support of IGD passthrough in QMEU
upstream.

 
 This gives us a situation that we need to support both the old
 (-gfx-passthru) and new (-igd-passthru) options. Presumably we (libxl)
 would need to fork a qemu process to determine which option it has and
 pass the right one.
 
 Or you can try to keep both old and new option at the same time but
 
 Yeah, actually I also have considered to keep both two options at the same
 time. Its really friendly to any qemu version.
 
 deprecate the old one. Then in a few qemu release cycles later (or
 
 This should be like 'accel=kvm' versus 'enable-kvm' in qemu upstream.
 They're coexisted now but just the former is a modern option.
 
 probably one year or two?) you can finally remove the old one. The point
 is that to give downstream (in this case, Xen) time to cope with the
 change.
 
 Here I'm fine to this way.
 
 So Gerd,
 

So you don't actually need to ask Gerd this question because there is no
old option to keep in qemu upstream.

Libxl (or any sensible toolstack) will just do the right thing to either
pass -igd-passthru (or whatever you guys agree upon) to qemu upstream or
pass -gfx-passthru to qemu traditional. :-)

Wei.

 What about this?
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.14 test] 33918: regressions - FAIL

2015-01-30 Thread xen . org
flight 33918 linux-3.14 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/33918/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-ovmf-amd64  5 xen-boot   fail REGR. vs. 33341
 test-amd64-amd64-xl-qemuu-winxpsp3  5 xen-bootfail REGR. vs. 33341
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  5 xen-boot  fail REGR. vs. 33341
 test-amd64-amd64-xl-qemut-winxpsp3  5 xen-bootfail REGR. vs. 33341
 test-amd64-amd64-rumpuserxen-amd64 7 rumpuserxen-demo-setup fail REGR. vs. 
33341
 test-amd64-i386-xl-multivcpu  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-rumpuserxen-i386  5 xen-boot  fail REGR. vs. 33341
 test-amd64-amd64-xl-winxpsp3  5 xen-boot  fail REGR. vs. 33341
 test-amd64-amd64-xl   5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-win7-amd64  5 xen-boot fail REGR. vs. 33341
 test-amd64-amd64-xl-pvh-amd   5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-qemuu-winxpsp3  5 xen-boot fail REGR. vs. 33341
 test-amd64-i386-freebsd10-i386  5 xen-bootfail REGR. vs. 33341
 test-amd64-amd64-xl-qemut-debianhvm-amd64  5 xen-boot fail REGR. vs. 33341
 test-amd64-i386-xl-qemuu-win7-amd64  5 xen-boot   fail REGR. vs. 33341
 test-amd64-amd64-xl-win7-amd64  7 windows-install fail REGR. vs. 33341
 test-amd64-i386-qemuu-rhel6hvm-amd  7 redhat-install  fail REGR. vs. 33341
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  5 xen-boot fail REGR. vs. 33341
 test-amd64-i386-qemut-rhel6hvm-amd  5 xen-bootfail REGR. vs. 33341
 test-amd64-amd64-xl-pvh-intel  5 xen-boot fail REGR. vs. 33341
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-winxpsp3   5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-qemut-debianhvm-amd64  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-freebsd10-amd64  4 xen-installfail REGR. vs. 33341
 test-amd64-i386-xl5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-qemut-winxpsp3  5 xen-boot fail REGR. vs. 33341
 test-amd64-amd64-xl-qemuu-win7-amd64  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-qemut-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-libvirt   5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-rhel6hvm-amd  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-winxpsp3-vcpus1  5 xen-bootfail REGR. vs. 33341
 test-amd64-i386-xl-qemuu-debianhvm-amd64  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-qemuu-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-credit27 debian-installfail REGR. vs. 33341
 test-amd64-amd64-xl-qemuu-ovmf-amd64  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-xl-qemut-win7-amd64  5 xen-boot   fail REGR. vs. 33341
 test-amd64-amd64-libvirt  5 xen-boot  fail REGR. vs. 33341
 test-amd64-amd64-pair16 guest-start/debianfail REGR. vs. 33341
 test-amd64-amd64-xl-qemut-win7-amd64  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-pair  8 xen-boot/dst_host fail REGR. vs. 33341
 test-amd64-i386-pair  7 xen-boot/src_host fail REGR. vs. 33341

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf  5 xen-boot  fail REGR. vs. 33341
 test-amd64-i386-rhel6hvm-intel  5 xen-boot fail like 33296
 test-amd64-amd64-xl-sedf-pin  5 xen-boot  fail REGR. vs. 33341
 test-amd64-amd64-xl-pcipt-intel  5 xen-boot   fail REGR. vs. 33341

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt  9 guest-start  fail   never pass
 test-armhf-armhf-xl  10 migrate-support-checkfail   never pass

version targeted for testing:
 linux4d7313cd7cbe56ccb511eca23ef5bba7f10ffcb8
baseline version:
 linuxc3b70f0bbb6a883f4afa639286043d3f71fbbddf


People who touched revisions under test:
  Ahmed S. Darwish ahmed.darw...@valeo.com
  Alan Stern st...@rowland.harvard.edu
  Alex Deucher alexander.deuc...@amd.com
  Alex Williamson alex.william...@redhat.com
  Alexander Fomichev git.u...@gmail.com
  Alexander Y. Fomichev git.u...@gmail.com
  Amit Virdi amit.vi...@st.com
  Andrew Jackson andrew.jack...@arm.com
  Andrew Morton a...@linux-foundation.org
  Andy Lutomirski l...@amacapital.net
  Anton Blanchard an...@samba.org
  Antonio Borneo borneo.anto...@gmail.com
  Antonio Quartulli anto...@meshcoding.com
  Ard Biesheuvel ard.biesheu...@linaro.org
  Arik Nemtsov a...@wizery.com
  Arik Nemtsov arikx.nemt...@intel.com
  Arnaldo 

Re: [Xen-devel] [PATCH v2 12/29] ArmVirtualizationPkg: implement custom MemoryInitPeiLib

2015-01-30 Thread Laszlo Ersek
On 01/26/15 20:03, Ard Biesheuvel wrote:
 This implements a MemoryInitPeiLib instance that differs from the
 stock ArmPlatformPkg version only in the fact that it does not remove
 the memory used by the flash device (FD). The reason is that, when using
 PrePi, the DXE core is started immediately and never returns so there is
 no reason to preserve any of the memory that the flash device occupied
 originally, and it is preferable to release is so that the OS loader
 can reuse it. This is especially important for the relocatable PrePi
 configuration, which is aimed at being launched from a boot loader that
 itself adheres to the Linux arm64 boot protocol.
 
 Contributed-under: TianoCore Contribution Agreement 1.0
 Signed-off-by: Ard Biesheuvel ard.biesheu...@linaro.org
 ---
  
 .../Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c
   | 91 
  
 .../Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf
 | 66 +++
  2 files changed, 157 insertions(+)
 
 diff --git 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c
  
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c
 new file mode 100644
 index ..5f6cd059c47f
 --- /dev/null
 +++ 
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.c
 @@ -0,0 +1,91 @@
 +/** @file
 +*
 +*  Copyright (c) 2011-2014, ARM Limited. All rights reserved.
 +*  Copyright (c) 2014, Linaro Limited. All rights reserved.
 +*
 +*  This program and the accompanying materials
 +*  are licensed and made available under the terms and conditions of the BSD 
 License
 +*  which accompanies this distribution.  The full text of the license may be 
 found at
 +*  http://opensource.org/licenses/bsd-license.php
 +*
 +*  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN AS IS BASIS,
 +*  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR 
 IMPLIED.
 +*
 +**/
 +
 +#include PiPei.h
 +
 +#include Library/ArmPlatformLib.h
 +#include Library/DebugLib.h
 +#include Library/HobLib.h
 +#include Library/MemoryAllocationLib.h
 +#include Library/PcdLib.h
 +
 +VOID
 +BuildMemoryTypeInformationHob (
 +  VOID
 +  );
 +
 +VOID
 +InitMmu (
 +  VOID
 +  )
 +{
 +  ARM_MEMORY_REGION_DESCRIPTOR  *MemoryTable;
 +  VOID  *TranslationTableBase;
 +  UINTN TranslationTableSize;
 +  RETURN_STATUS Status;
 +
 +  // Get Virtual Memory Map from the Platform Library
 +  ArmPlatformGetVirtualMemoryMap (MemoryTable);
 +
 +  //Note: Because we called PeiServicesInstallPeiMemory() before to call 
 InitMmu() the MMU Page Table resides in
 +  //  DRAM (even at the top of DRAM as it is the first permanent memory 
 allocation)
 +  Status = ArmConfigureMmu (MemoryTable, TranslationTableBase, 
 TranslationTableSize);
 +  if (EFI_ERROR (Status)) {
 +DEBUG ((EFI_D_ERROR, Error: Failed to enable MMU\n));
 +  }
 +}
 +
 +EFI_STATUS
 +EFIAPI
 +MemoryPeim (
 +  IN EFI_PHYSICAL_ADDRESS   UefiMemoryBase,
 +  IN UINT64 UefiMemorySize
 +  )
 +{
 +  EFI_RESOURCE_ATTRIBUTE_TYPE ResourceAttributes;
 +
 +  // Ensure PcdSystemMemorySize has been set
 +  ASSERT (PcdGet64 (PcdSystemMemorySize) != 0);
 +
 +  //
 +  // Now, the permanent memory has been installed, we can call 
 AllocatePages()
 +  //
 +  ResourceAttributes = (
 +  EFI_RESOURCE_ATTRIBUTE_PRESENT |
 +  EFI_RESOURCE_ATTRIBUTE_INITIALIZED |
 +  EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE |
 +  EFI_RESOURCE_ATTRIBUTE_WRITE_COMBINEABLE |
 +  EFI_RESOURCE_ATTRIBUTE_WRITE_THROUGH_CACHEABLE |
 +  EFI_RESOURCE_ATTRIBUTE_WRITE_BACK_CACHEABLE |
 +  EFI_RESOURCE_ATTRIBUTE_TESTED
 +  );
 +
 +  BuildResourceDescriptorHob (
 +  EFI_RESOURCE_SYSTEM_MEMORY,
 +  ResourceAttributes,
 +  PcdGet64 (PcdSystemMemoryBase),
 +  PcdGet64 (PcdSystemMemorySize)
 +  );
 +
 +  // Build Memory Allocation Hob
 +  InitMmu ();
 +
 +  if (FeaturePcdGet (PcdPrePiProduceMemoryTypeInformationHob)) {
 +// Optional feature that helps prevent EFI memory map fragmentation.
 +BuildMemoryTypeInformationHob ();
 +  }
 +
 +  return EFI_SUCCESS;
 +}
 diff --git 
 a/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf
  
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf
 new file mode 100644
 index ..fcdae06de7c2
 --- /dev/null
 +++ 
 b/ArmPlatformPkg/ArmVirtualizationPkg/Library/ArmVirtualizationMemoryInitPeiLib/ArmVirtualizationMemoryInitPeiLib.inf
 @@ -0,0 +1,66 @@
 +#/** @file
 +#
 +#  Copyright (c) 2011-2014, ARM Ltd. All rights reserved.BR
 +#  Copyright (c) 2014, 

Re: [Xen-devel] [RFC PATCH V3 10/12] xen: Introduce monitor_op domctl

2015-01-30 Thread Tamas K Lengyel
On Fri, Jan 30, 2015 at 12:07 PM, Razvan Cojocaru
rcojoc...@bitdefender.com wrote:
 On 01/30/2015 12:38 PM, Tamas K Lengyel wrote:
 On Fri, Jan 30, 2015 at 8:58 AM, Razvan Cojocaru
 rcojoc...@bitdefender.com wrote:
 On 01/29/2015 11:46 PM, Tamas K Lengyel wrote:
 diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
 index c7a0bde..3b58700 100644
 --- a/xen/arch/x86/mm/p2m.c
 +++ b/xen/arch/x86/mm/p2m.c
 @@ -1445,15 +1445,6 @@ void p2m_vm_event_emulate_check(struct vcpu *v, 
 const vm_event_response_t *rsp)
  }
  }

 -void p2m_setup_introspection(struct domain *d)
 -{
 -if ( hvm_funcs.enable_msr_exit_interception )
 -{
 -d-arch.hvm_domain.introspection_enabled = 1;
 -hvm_funcs.enable_msr_exit_interception(d);
 -}
 -}
 -
  bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
  struct npfec npfec,
  vm_event_request_t **req_ptr)

 I see that introspection_enabled is no longer assigned here ...

 Introspection_enabled is getting deprecated in this patch and is moved
 into the monitor_op domctl

 Moved, right, but where?

 This patch removes the d-arch.hvm_domain.introspection_enabled = 1
 assignment from here but AFAICT it doesn't move it anwyhere else. It
 just removes it, so introspection_enabled is always 0.

Actually it should have been removed altogether from the HVM domain
definition as the same information should be stored in arch_domain's
monitor_options struct.


 I couldn't find any place in this series where introspection_enabled =
 1. Could you please point it out to me if I've missed it by accident?

 diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
 index 0db899e..0b30750 100644
 --- a/xen/common/vm_event.c
 +++ b/xen/common/vm_event.c
 @@ -617,16 +617,10 @@ int vm_event_domctl(struct domain *d, 
 xen_domctl_vm_event_op_t *vec,
  switch( vec-op )
  {
  case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE:
 -case XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION:
  {
  rc = vm_event_enable(d, vec, ved, _VPF_mem_access,
  HVM_PARAM_MONITOR_RING_PFN,
  mem_access_notification);
 -
 -if ( vec-op == 
 XEN_DOMCTL_VM_EVENT_OP_MONITOR_ENABLE_INTROSPECTION
 -  !rc )
 -p2m_setup_introspection(d);
 -
  }
  break;

 @@ -635,7 +629,6 @@ int vm_event_domctl(struct domain *d, 
 xen_domctl_vm_event_op_t *vec,
  if ( ved-ring_page )
  {
  rc = vm_event_disable(d, ved);
 -d-arch.hvm_domain.introspection_enabled = 0;
  }
  }
  break;

 ... nor here. Patch 6/12 checks it but doesn't set it. Patch 5/12 sets
 it to 0 (which could account for the removal of the assignment in
 vm_event.c) but never to 1. A few important things depend on it being
 enabled: it becomes impossible to disable interception for a select set
 of MSRs, optimization for RET instructions emulation is disabled, and
 othere places in p2m.c makes use of the flag as well.

 Is there some place in the code, untouched by this series, where
 introspection_enabled is being set to 1?

 It is moved into the monitor_op domctl when mov_to_msr trapping is
 enabled. The reason of having introspection_enabled AFAIU was to
 reenable trapping MSR's that were disabled shortly after boot. Thus,
 an option field is present in the monitor_op when enabling mov_to_msr
 trapping: extended_capture. Let me know if this still achieves the
 same effect as before!

 No, there are several places where introspection_enabled is needed, as
 I've said before. Not just the MSRs.

 One important place is in hvmemul_virtual_to_linear(), in
 xen/arch/x86/hvm/emulate.c, where we disable optimizations for REP
 instructions (in today's version of mainline Xen, at line 413).

 There are also places in patches yet to be published I've worked on
 where we gate things on introspection_enabled being != 0, so please
 don't remove it or have it misbehave.

 I'll look into the extended_capture option in case it will allow future
 removal of the MSR special case for introspection, but a flag like that
 is necessary and can't simply be deprecated and removed.

Ack, the plan was actually to replace all references to
arch.hvm_domain.introspection_enabled with
arch.monitor_options.mov_to_msr.extended_capture. I see I forgot to
actually fully follow through that plan but that's the intention at
least. So the functionality would remain, it would just be worked into
the coherent settings field with every other type of events.



 Thanks,
 Razvan

Thanks,
Tamas

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/3] xen: mark pvscsi frontend request consumed only after last read

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 12:21, jgr...@suse.com wrote:
 @@ -734,11 +734,11 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
   if (!pending_req)
   return 1;
  
 - ring_req = RING_GET_REQUEST(ring, rc);
 + memcpy(ring_req, RING_GET_REQUEST(ring, rc), sizeof(ring_req));

I'd recommend the type safe *ring_req = *RING_GET_REQUEST(ring, rc)
here.

   ring-req_cons = ++rc;
  
 - act = ring_req-act;
 - err = prepare_pending_reqs(info, ring_req, pending_req);
 + act = ring_req.act;

Is this helper variable then still needed?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 0/5] Split off mini-os to a separate tree

2015-01-30 Thread Thomas Leonard
On 30 January 2015 at 11:26, Wei Liu wei.l...@citrix.com wrote:
 On Mon, Jan 26, 2015 at 09:53:25AM +, Thomas Leonard wrote:
 [ Cc += Anil ]

 On 25 January 2015 at 18:25, Wei Liu wei.l...@citrix.com wrote:
  Cc Ian and Ian and some folks who might be interested in this work.
 
  On Sun, Jan 25, 2015 at 06:13:41PM +, Wei Liu wrote:
  There has been increasing use of mini-os in unikernel world and basically
  everybody has their own fork of mini-os. That way going foward is going
  to cause fragmentation of the community.
 
  We would like to split off mini-os tree so that everybody can upstream 
  their
  changes and work on a common code base. Later I would also like to setup
  a proper push gate for mini-os.
 
  Stubdom's build environment is known to be very fragile, so basically all 
  the
  real work is done in top level Makefile.
 
  I use following runes to split off mini-os:
 
git filter-branch --tag-name-filter cat \
  --subdirectory-filter extras/mini-os/ -- --all
 
# There is already a tag name 4.3.0-rc2 which points to the same commit.
git tag -d xen-4.3.0-rc2
 
# Add xen- prefix to all tags
for t in `git tag`; do git tag xen-$t $t; git tag -d $t ; done
 
git gc --aggressive
 
  The tree can be found at:
git://xenbits.xen.org/people/liuw/mini-os.git master
 
  Note that mini-os cannot build on its own due to the limitation of it's 
  own
  build system. Splitting it off it's the first step towards fixing that 
  problem

 In case it's useful: for the standalone version of Mini-OS used by
 Mirage, I had to include these directories too:

 1. xen/include/public
 2. xen/common/libfdt
 3. xen/include/xen/libfdt

 Is libfdt required? I see inclusion of libfdt.h in arch/arm/mm.c but I
 couldn't find trace to link libfdt in Makefile. And I ran a simple test
 to compile minios in Xen tree on an arm board it doesn't compile.

 If arm support is still WIP I'm going to ignore libfdt for now.

The ARM port works fine, but you need to apply the last couple of
patches in the series (the rest are already in). The latest versions
are on my next branch here:

  https://github.com/talex5/xen/commits/next

If you want to test it, you need these two:

- mini-os: arm: interrupt controller
- mini-os: arm: build system

 Wei.

 4. config

 ( https://github.com/talex5/xen/tree/minios-releases )

 Will there be a separate mailing list for Mini-OS? That would be very
 useful for people who don't want to keep up with xen-devel.

 Thanks!


 --
 Dr Thomas Leonardhttp://0install.net/
 GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
 GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA



-- 
Dr Thomas Leonardhttp://0install.net/
GPG: 9242 9807 C985 3C07 44A6  8B9A AE07 8280 59A5 3CC1
GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] xen/arm: Fix rtds scheduler for arm

2015-01-30 Thread Jan Beulich
 On 30.01.15 at 13:30, denys.droz...@globallogic.com wrote:
 From: denys drozdov denys.droz...@globallogic.com

There's a description missing here of _what_ (case) you are fixing.

 Change-Id: I9b315f213775b8410fe75cd96968dcb213ea287b

And the purpose of this line is unclear.

 --- a/xen/common/sched_rt.c
 +++ b/xen/common/sched_rt.c
 @@ -1010,8 +1010,9 @@ rt_context_saved(const struct scheduler *ops, struct 
 vcpu *vc)
  struct rt_vcpu *snext = NULL;
  struct rt_dom *sdom = NULL;
  struct rt_private *prv = rt_priv(ops);
 -cpumask_t *online;
 -spinlock_t *lock = vcpu_schedule_lock_irq(vc);
 +cpumask_t *online; 

You're introducing a trailing blank here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel