Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Jan Beulich
 On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote:
 Changes in v7:
 * Break from the loop when -ENODEV is encountered

This seems pretty inefficient for the caller. Returning a bad
identifier other than XEN_INVALID_NODE_ID would seem better
to me.

 --- a/docs/misc/xsm-flask.txt
 +++ b/docs/misc/xsm-flask.txt
 @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
   * XEN_SYSCTL_cpupool_op
   * XEN_SYSCTL_scheduler_op
   * XEN_SYSCTL_coverage_op
 + * XEN_SYSCTL_pcitopoinfo

No additions to this list are permitted. Either the new sub-op is
disaggregation safe (which it looks to be), or it can't be accepted.

 --- a/xen/common/sysctl.c
 +++ b/xen/common/sysctl.c
 @@ -411,6 +411,65 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
 u_sysctl)
  break;
  #endif
  
 +#ifdef HAS_PCI
 +case XEN_SYSCTL_pcitopoinfo:
 +{
 +xen_sysctl_pcitopoinfo_t *ti = op-u.pcitopoinfo;
 +unsigned dev_cnt = 0;
 +
 +if ( guest_handle_is_null(ti-devs) ||
 + guest_handle_is_null(ti-nodes) ||
 + (ti-first_dev  ti-num_devs) )
 +{
 +ret = -EINVAL;
 +break;
 +}
 +
 +while ( ti-first_dev  ti-num_devs )
 +{
 +physdev_pci_device_t dev;
 +uint32_t node;
 +struct pci_dev *pdev;
 +
 +if ( copy_from_guest_offset(dev, ti-devs, ti-first_dev, 1) )
 +{
 +ret = -EFAULT;
 +break;
 +}
 +
 +spin_lock(pcidevs_lock);
 +pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
 +if ( !pdev )
 +{
 +ret = -ENODEV;
 +node = XEN_INVALID_NODE_ID;
 +}
 +else if ( pdev-node == NUMA_NO_NODE )
 +node = XEN_INVALID_NODE_ID;
 +else
 +node = pdev-node;
 +spin_unlock(pcidevs_lock);
 +
 +if ( copy_to_guest_offset(ti-nodes, ti-first_dev, node, 1) )
 +{
 +ret = -EFAULT;
 +break;
 +}
 +
 +ti-first_dev++;
 +
 +if ( (ret == -ENODEV) ||

If despite the earlier comment you want to stay with this model, I think
you should keep first_dev pointing at the bad slot (but also see below),
leaving it to the caller to increment past it (consider this being function
0 and the next slots being the other functions - in that case it's clear
that the caller would want to skip more than just this one slot).

 + ((++dev_cnt  0x3f)  hypercall_preempt_check()) )
 +break;
 +}
 +
 +if ( (!ret || (ret == -ENODEV)) 
 + __copy_field_to_guest(u_sysctl, op, u.pcitopoinfo.first_dev) )
 +ret = -EFAULT;
 +}
 +break;
 +#endif

With the continuation-less model now used I don't think it makes
sense to have first_dev and num_devs - for re-invocation all the
caller needs to do is increment the buffer pointer suitably. I.e.
you can get away with just a count of devices afaict.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v2 05/11] vmx: add new data structure member to support PML

2015-04-21 Thread Kai Huang



On 04/17/2015 10:31 AM, Kai Huang wrote:



On 04/17/2015 06:39 AM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Wednesday, April 15, 2015 3:04 PM

A new 4K page pointer is added to arch_vmx_struct as PML buffer for 
vcpu.

And a
new 'status' field is added to vmx_domain to indicate whether PML is 
enabled

for
the domain or not. The 'status' field also can be used for further 
similiar

purpose.
not sure about the last sentence. what's the similar purpose to 
whether PML

is enabled? :-)
I mean potentially there might be such feature in the future, and I 
can't give you an example right now. If you are just commenting the 
description here but fine with the current code, I can remove that 
last sentence if you like. Or do you suggest to just use a bool_t 
pml_enabled? I am fine with both, but looks there's no objection from 
others so I intend to keep it as 'unsigned int status', if you agree.

Hi Kevin,

What's your opinion here? Is 'unsigned int status' OK to you?





Note both new members don't have to be initialized to zero 
explicitly as both

vcpu and domain structure are zero-ed when they are created.

no initialization in this patch, so why explaining it here?
OK. Looks it's a common sense to all of you so I'll just remove this 
sentence.





Signed-off-by: Kai Huang kai.hu...@linux.intel.com
---
  xen/include/asm-x86/hvm/vmx/vmcs.h | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h
b/xen/include/asm-x86/hvm/vmx/vmcs.h
index f831a78..2c679ac 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -70,8 +70,12 @@ struct ept_data {
  cpumask_var_t synced_mask;
  };

+#define _VMX_DOMAIN_PML_ENABLED0
+#define VMX_DOMAIN_PML_ENABLED (1ul 
_VMX_DOMAIN_PML_ENABLED)
  struct vmx_domain {
  unsigned long apic_access_mfn;
+/* VMX_DOMAIN_* */
+unsigned long status;
  };

  struct pi_desc {
@@ -142,6 +146,9 @@ struct arch_vmx_struct {
  /* Bitmap to control vmexit policy for Non-root VMREAD/VMWRITE */
  struct page_info *vmread_bitmap;
  struct page_info *vmwrite_bitmap;
+
+#define NR_PML_ENTRIES   512
+struct page_info *pml_pg;

move the macro out of the structure.

OK. I will move it just above the declaration of struct arch_vmx_struct.


and is pml_buffer or pml_buf more clear?


To me pml_buffer or pml_buf is more likely a virtual address you can 
access the buffer directly, while pml_pg indicates it's a pointer of 
struct page_info. If you you look at patch 6, you can find statements 
like:


uint64_t *pml_buf;

pml_buf = __map_domain_page(v-arch.hvm_vmx.pml_pg);

So I intend to keep it.

And this one? Are you OK with 'pml_pg'?

Thanks,
-Kai


Thanks,
-Kai



  };

  int vmx_create_vmcs(struct vcpu *v);
--
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-04-21 Thread Luis R. Rodriguez
On Thu, Mar 26, 2015 at 04:00:54AM +0100, Luis R. Rodriguez wrote:
 On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote:
  Hi Luis,
  
  This seems OK to me, 
 
 Great.
 
  but I'm curious about a few things.
  
  On Fri, Mar 20, 2015 at 6:17 PM, Luis R. Rodriguez
  mcg...@do-not-panic.com wrote:
   From: Luis R. Rodriguez mcg...@suse.com
  
   This allows drivers to take advantage of write-combining
   when possible. Ideally we'd have pci_read_bases() just
   peg an IORESOURCE_WC flag for us
  
  We do set IORESOURCE_PREFETCH.  Do you mean something different?
 
 I did not think we had a WC IORESOURCE flag. Are you saying that we can use
 IORESOURCE_PREFETCH for that purpose? If so then great.  As I read a PCI BAR
 can have PCI_BASE_ADDRESS_MEM_PREFETCH and when that's the case we peg
 IORESOURCE_PREFETCH. That seems to be what I want indeed. Questions below.
 
but where exactly
   video devices memory lie varies *largely* and at times things
   are mixed with MMIO registers, sometimes we can address
   the changes in drivers, other times the change requires
   intrusive changes.
  
  What does a video device address have to do with this?  I do see that
  if a BAR maps only a frame buffer, the device might be able to mark it
  prefetchable, while if the BAR mapped both a frame buffer and some
  registers, it might not be able to make it prefetchable.  But that
  doesn't seem like it depends on the *address*.
 
 I meant the offsets for each of those, either registers or framebuffer,
 and that typically they are mixed (primarily on older devices), so indeed your
 summary of the problem is what I meant. Let's remember that we are trying to
 take advantage of PAT here when available and avoid MTRR in that case, do we
 know that the same PCI BARs that have always historically used MTRRs had
 IORESOURCE_PREFETCH set, is that a fair assumption ? I realize they are
 different things -- but its precisely why I ask.
 
  pci_iomap_range() already makes a cacheable mapping if
  IORESOURCE_CACHEABLE; I'm guessing that you would like it to
  automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g.,
  
if (flags  IORESOURCE_CACHEABLE)
  return ioremap(start, len);
if (flags  IORESOURCE_PREFETCH)
  return ioremap_wc(start, len);
return ioremap_nocache(start, len);
 
 Indeed, that's exactly what I think we should strive towards.
 
  Is there a reason not to do that?
 
 This depends on the exact defintion of IORESOURCE_PREFETCH and
 PCI_BASE_ADDRESS_MEM_PREFETCH and how they are used all over and
 accross *all devices*. This didn't look promising for starters:
 
 include/uapi/linux/pci_regs.h:#define  PCI_BASE_ADDRESS_MEM_PREFETCH0x08  
   /* prefetchable? */
 
 PCI_BASE_ADDRESS_MEM_PREFETCH seems to be BAR specific, so a few questions:
 
 1) Can we rest assured for instance that if we check for
 PCI_BASE_ADDRESS_MEM_PREFETCH and if set that it will *only* be set on a full
 PCI BAR if the full PCI BAR does want WC? If not this can regress
 functionality. That seems risky. It however would not be risky if we used
 another API that did look for IORESOURCE_PREFETCH and if so use ioremap_wc() 
 --
 that way only drivers we know that do use the full PCI bar would use this API.
 There's a bit of a problem with this though:
 
 2) Do we know that if a *full PCI BAR* is used for WC that
 PCI_BASE_ADDRESS_MEM_PREFETCH *was* definitely set for the PCI BAR? If so then
 the API usage would be restricted only to devices that we know *do* adhere to
 this. That reduces the possible uses for older drivers and can create
 regressions if used loosely without verification... but..
 
 3) If from now on we get folks to commit to uset PCI_BASE_ADDRESS_MEM_PREFETCH
 for full PCI BARs that do want WC perhaps newer devices / drivers will use
 this very consistently ? Can we bank on that and is it worth it ?
 
 4) If a PCI BAR *does not* have PCI_BASE_ADDRESS_MEM_PREFETCH do we know it
 must not never want WC ?
 
 If we don't have certainty on any of the above I'm afraid we can't do much
 right now but perhaps we can push towards better use of 
 PCI_BASE_ADDRESS_MEM_PREFETCH
 and hope folks will only use this for the full PCI BAR only if WC is desired.
 
 Thoughts?

Bjorn, now that you're done schooling me on English, any thoughts on the above?

   Although there is also arch_phys_wc_add() that makes use of
   architecture specific write-combinging alternatives (MTRR on
   x86 when a system does not have PAT) we void polluting
   pci_iomap() space with it and force drivers and subsystems
   that want to use it to be explicit.
  
   There are a few motivations for this:
  
   a) Take advantage of PAT when available
  
   b) Help bury MTRR code away, MTRR is architecture specific and on
  x86 its replaced by PAT
  
   c) Help with the goal of eventually using _PAGE_CACHE_UC over
  _PAGE_CACHE_UC_MINUS on x86 on ioremap_nocache() (de33c442e)
   ...
  
   +void __iomem *pci_iomap_wc_range(struct pci_dev 

Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-04-21 Thread Michael S. Tsirkin
On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote:
 pci_iomap_range() already makes a cacheable mapping if
 IORESOURCE_CACHEABLE; I'm guessing that you would like it to
 automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g.,
 
   if (flags  IORESOURCE_CACHEABLE)
 return ioremap(start, len);
   if (flags  IORESOURCE_PREFETCH)
 return ioremap_wc(start, len);
   return ioremap_nocache(start, len);
 
 Is there a reason not to do that?

I think that's wrong and will break a bunch of things.
PCI prefetch bit merely means bridges can combine writes and prefetch
reads.  Prefetch does not affect ordering rules and does not allow
writes to be collapsed.

WC is stronger: it allows collapsing and changes ordering rules.

WC can also hurt latency as small writes are buffered.

To summarise, driver needs to know what it's doing,
we can't set WC in the pci core automatically.

-- 
MST

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-04-21 Thread Michael S. Tsirkin
On Tue, Apr 21, 2015 at 07:52:49PM +0200, Luis R. Rodriguez wrote:
 On Thu, Mar 26, 2015 at 04:00:54AM +0100, Luis R. Rodriguez wrote:
  On Mon, Mar 23, 2015 at 12:20:47PM -0500, Bjorn Helgaas wrote:
   Hi Luis,
   
   This seems OK to me, 
  
  Great.
  
   but I'm curious about a few things.
   
   On Fri, Mar 20, 2015 at 6:17 PM, Luis R. Rodriguez
   mcg...@do-not-panic.com wrote:
From: Luis R. Rodriguez mcg...@suse.com
   
This allows drivers to take advantage of write-combining
when possible. Ideally we'd have pci_read_bases() just
peg an IORESOURCE_WC flag for us
   
   We do set IORESOURCE_PREFETCH.  Do you mean something different?
  
  I did not think we had a WC IORESOURCE flag. Are you saying that we can use
  IORESOURCE_PREFETCH for that purpose? If so then great.  As I read a PCI BAR
  can have PCI_BASE_ADDRESS_MEM_PREFETCH and when that's the case we peg
  IORESOURCE_PREFETCH. That seems to be what I want indeed. Questions below.
  
 but where exactly
video devices memory lie varies *largely* and at times things
are mixed with MMIO registers, sometimes we can address
the changes in drivers, other times the change requires
intrusive changes.
   
   What does a video device address have to do with this?  I do see that
   if a BAR maps only a frame buffer, the device might be able to mark it
   prefetchable, while if the BAR mapped both a frame buffer and some
   registers, it might not be able to make it prefetchable.  But that
   doesn't seem like it depends on the *address*.
  
  I meant the offsets for each of those, either registers or framebuffer,
  and that typically they are mixed (primarily on older devices), so indeed 
  your
  summary of the problem is what I meant. Let's remember that we are trying to
  take advantage of PAT here when available and avoid MTRR in that case, do we
  know that the same PCI BARs that have always historically used MTRRs had
  IORESOURCE_PREFETCH set, is that a fair assumption ? I realize they are
  different things -- but its precisely why I ask.
  
   pci_iomap_range() already makes a cacheable mapping if
   IORESOURCE_CACHEABLE; I'm guessing that you would like it to
   automatically use WC if the BAR if IORESOURCE_PREFETCH, e.g.,
   
 if (flags  IORESOURCE_CACHEABLE)
   return ioremap(start, len);
 if (flags  IORESOURCE_PREFETCH)
   return ioremap_wc(start, len);
 return ioremap_nocache(start, len);
  
  Indeed, that's exactly what I think we should strive towards.
  
   Is there a reason not to do that?
  
  This depends on the exact defintion of IORESOURCE_PREFETCH and
  PCI_BASE_ADDRESS_MEM_PREFETCH and how they are used all over and
  accross *all devices*. This didn't look promising for starters:
  
  include/uapi/linux/pci_regs.h:#define  PCI_BASE_ADDRESS_MEM_PREFETCH
  0x08/* prefetchable? */
  
  PCI_BASE_ADDRESS_MEM_PREFETCH seems to be BAR specific, so a few questions:
  
  1) Can we rest assured for instance that if we check for
  PCI_BASE_ADDRESS_MEM_PREFETCH and if set that it will *only* be set on a 
  full
  PCI BAR if the full PCI BAR does want WC? If not this can regress
  functionality. That seems risky. It however would not be risky if we used
  another API that did look for IORESOURCE_PREFETCH and if so use 
  ioremap_wc() --
  that way only drivers we know that do use the full PCI bar would use this 
  API.
  There's a bit of a problem with this though:
  
  2) Do we know that if a *full PCI BAR* is used for WC that
  PCI_BASE_ADDRESS_MEM_PREFETCH *was* definitely set for the PCI BAR? If so 
  then
  the API usage would be restricted only to devices that we know *do* adhere 
  to
  this. That reduces the possible uses for older drivers and can create
  regressions if used loosely without verification... but..
  

In theory, PCI spec says this about prefetch memory:
Bridges are permitted to merge writes into this range (refer to Section 
3.2.6).

Exceptions could be:
- devices not behind a bridge (e.g. intergrated in a root
  complex)
- devices behind a virtual bridge from same vendor
  (which know bridge won't prefetch)

I worry that WC might also cause more reordering though.  I don't
remember this is true, off-hand.  Bridges can only reorder transactions
according to very specific rules.

  3) If from now on we get folks to commit to uset 
  PCI_BASE_ADDRESS_MEM_PREFETCH
  for full PCI BARs that do want WC perhaps newer devices / drivers will use
  this very consistently ? Can we bank on that and is it worth it ?

Unfortunately there's a separate good reason to set memory as prefetcheable:
it's the only way to get 64 bit addresses for devices behind bridges.
So WC might be *safe* for prefetch BARs, but might not be a good idea.

  
  4) If a PCI BAR *does not* have PCI_BASE_ADDRESS_MEM_PREFETCH do we know it
  must not never want WC ?

That's not true I think. It means device can't allow prefetch but maybe
it does allow 

[Xen-devel] [PATCH] raisin: Some git-checkout improvements

2015-04-21 Thread George Dunlap
1. Switch local variables to lower-case and declare them local.

2. Cloning git trees from remote repos is often a very long operation.
Allow the user to specify a faster git cache as a prefix.

3. At the moment you can either check out a specific changeset or
master, but you can't check out a different branch, because git
doesn't always look in origin/ for the branch.  If the initial git
checkout $tag fails, try checking out origin/$tag before giving up.

Signed-off-by: George Dunlap george.dun...@eu.citrix.com
---
CC: Stefano Stabellini stefano.stabell...@citrix.com
---
 defconfig   |  3 +++
 lib/git-checkout.sh | 33 -
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/defconfig b/defconfig
index d3ef283..38c7455 100644
--- a/defconfig
+++ b/defconfig
@@ -35,3 +35,6 @@ QEMU_TRADITIONAL_REVISION=master
 SEABIOS_REVISION=master
 GRUB_REVISION=master
 LIBVIRT_REVISION=master
+
+# Git prefix.  Use this if you have a git caching proxy.
+#GIT_PREFIX=
diff --git a/lib/git-checkout.sh b/lib/git-checkout.sh
index 2ca8f25..b033504 100755
--- a/lib/git-checkout.sh
+++ b/lib/git-checkout.sh
@@ -1,32 +1,39 @@
 #!/usr/bin/env bash
 
 function git-checkout() {
+local tree
+local tag
+local dir
+
 if [[ $# -lt 3 ]]
 then
echo Usage: $0 tree tag dir
exit 1
 fi
 
-TREE=$1
-TAG=$2
-DIR=$3
+tree=$1
+tag=$2
+dir=$3
+
+tree=${GIT_PREFIX}$tree
 
 set -e
 
-if [[ ! -d $DIR-remote ]]
+if [[ ! -d $dir-remote ]]
 then
-   rm -rf $DIR-remote $DIR-remote.tmp
-   mkdir -p $DIR-remote.tmp; rmdir $DIR-remote.tmp
-   $GIT clone $TREE $DIR-remote.tmp
-   if [[ $TAG ]]
+   rm -rf $dir-remote $dir-remote.tmp
+   mkdir -p $dir-remote.tmp; rmdir $dir-remote.tmp
+   $GIT clone $tree $dir-remote.tmp
+   if [[ $tag ]]
then
-   cd $DIR-remote.tmp
+   cd $dir-remote.tmp
$GIT branch -D dummy /dev/null 21 ||:
-   $GIT checkout -b dummy $TAG
+   $GIT checkout -b dummy $tag \
+   || $GIT checkout -b dummy origin/$tag
cd ..
fi
-   mv $DIR-remote.tmp $DIR-remote
+   mv $dir-remote.tmp $dir-remote
 fi
-rm -f $DIR
-ln -sf $DIR-remote $DIR
+rm -f $dir
+ln -sf $dir-remote $dir
 }
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v1 05/47] pci: add pci_iomap_wc() variants

2015-04-21 Thread Luis R. Rodriguez
On Tue, Apr 21, 2015 at 12:25 PM, Michael S. Tsirkin m...@redhat.com wrote:
 To summarise, driver needs to know what it's doing,
 we can't set WC in the pci core automatically.

Thanks, I'll document this and proceed with device driver helpers to
aid with this.

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments

2015-04-21 Thread Andrew Cooper
On 21/04/2015 01:35, Andy Lutomirski wrote:
 On 04/20/2015 10:09 AM, Andrew Cooper wrote:
 There appears to be no formal statement of what pv_irq_ops.save_fl() is
 supposed to return precisely.  Native returns the full flags, while
 lguest and
 Xen only return the Interrupt Flag, and both have comments by the
 implementations stating that only the Interrupt Flag is looked at. 
 This may
 have been true when initially implemented, but no longer is.

 To make matters worse, the Xen PVOP leaves the upper bits undefined,
 making
 the BUG_ON() undefined behaviour.  Experimentally, this now trips for
 32bit PV
 guests on Broadwell hardware.  The BUG_ON() is consistent for an
 individual
 build, but not consistent for all builds.  It has also been a sitting
 timebomb
 since SMAP support was introduced.

 Use native_save_fl() instead, which will obtain an accurate view of
 the AC
 flag.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Thomas Gleixner t...@linutronix.de
 CC: Ingo Molnar mi...@redhat.com
 CC: H. Peter Anvin h...@zytor.com
 CC: x...@kernel.org
 CC: linux-ker...@vger.kernel.org
 CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 CC: Boris Ostrovsky boris.ostrov...@oracle.com
 CC: David Vrabel david.vra...@citrix.com
 CC: xen-devel xen-devel@lists.xen.org
 CC: Rusty Russell ru...@rustcorp.com.au
 CC: lgu...@lists.ozlabs.org

 ---
 This patch is RFC because I am not certain that native_save_fl() is
 necessarily the correct solution on lguest, but it does seem that
 setup_smap()
 wants to check the actual AC bit, rather than an idealised value.

 A different approach, given the dual nature of the AC flag now is to
 gate
 setup_smap() on a kernel rpl of 0.  SMAP necessarily can't be used in a
 paravirtual situation where the kernel runs in cpl  0.

 Another different approach would be to formally state that
 pv_irq_ops.save_fl() needs to return all the flags, which would make
 local_irq_save() safe to use in this circumstance, but that makes a
 hotpath
 longer for the sake of a single boot time check.

 ...which reminds me:

 Why does native_restore_fl restore anything other than IF?  A branch
 and sti should be considerably faster than popf.

I was wondering about the performance aspect, given a comment in your
patch which removed sysret64, but hadn't had time to investigate yet.

Unfortunately, irq_save()/irq_enable()/irq_restore() appears to be a
used pattern in the kernel, making the irq_restore() disable interrupts.

The performance improvement might be worth explicitly moving the onus
into the caller with irq_maybe_disable()/irq_maybe_enable(), but that
does involve altering a lot of common code for an architecture specific
gain.


 Also, if we did this, could Xen use PVI and then use native_restore_fl
 and avoid lots of pvops?

Xen HVM guests already use the native pvops in this area, so would
benefit from any improvement.  PV guests on the other hand run with cpl
 0 and instead have a writeable mask in a piece of shared memory with
Xen, and need the pvop.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG

2015-04-21 Thread Ian Jackson
In 952944f7 QEMU_TAG update my tag update script mangled the
machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
which ought to have been replaced.

The result was that:
 * From that commit on, QEMU_TAG was no longer honoured although
   QEMU_TRADITIONAL_REVISION still was
 * That particular update to QEMU_TRADITIONAL_REVISION's default
   value was effective
 * The next attempt to update QEMU_TRADITIONAL_REVISION, in
   1fc3aeb3 libxl: use new QEMU xenstore protocol was totally
   ineffective.

Fix this by restoring the transfer from QEMU_TAG.  The effects are:
 * Once more, honour QEMU_TAG.
 * Belatedly apply the qemu-trad change part of libxl: use new QEMU
   xenstore protocol.

(I have also fixed my script to not do this again.)

Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: George Dunlap george.dun...@eu.citrix.com
CC: Jan Beulich jbeul...@suse.com
Reported-by: Jan Beulich jbeul...@suse.com
---
 Config.mk |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/Config.mk b/Config.mk
index e5976fc..28b77d6 100644
--- a/Config.mk
+++ b/Config.mk
@@ -237,9 +237,7 @@ ifneq (,$(CONFIG_QEMU))
 QEMU_TRADITIONAL_LOC ?= $(CONFIG_QEMU)
 endif
 ifneq (,$(QEMU_TAG))
-QEMU_TRADITIONAL_REVISION ?= ab42b4408cb4fc4f869d73218e3d2034e6f5e8ac
-# Tue Mar 31 16:27:45 2015 +0100
-# xen: limit guest control of PCI command register
+QEMU_TRADITIONAL_REVISION ?= $(QEMU_TAG)
 endif
 
 ifeq ($(GIT_HTTP),y)
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's vCPU on Xen ARM 4.5.x?

2015-04-21 Thread Stefano Stabellini
On Tue, 21 Apr 2015, 신정섭 wrote:
 Thanks your reply!

  

 I Think I find a Interrupt mechanism problem in Xen ARM 4.5 and I fix that 
 simply.

 After fix it my irq routing code is working well on Xen ARM 4.5 too.

  

 I will start new thread about that problem.

 please confirm please.

Confirm what? You are welcome to start a new thread an any interrupt
bugs you might have found.


  

 Thanks

  

 -Original Message-
 From: Stefano Stabellinistefano.stabell...@eu.citrix.com
 To: 신정섭supsup5...@naver.com;
 Cc: Stefano Stabellinistefano.stabell...@eu.citrix.com; Ian 
 Campbellian.campb...@citrix.com; xen-devel@lists.xen.org;
 Sent: 2015-04-21 (화) 19:13:54
 Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's 
 vCPU on Xen ARM 4.5.x?
  

 On Tue, 21 Apr 2015, 신정섭 wrote:
  I have a one more question. 
 
   
 
  In Xen ARM 4.5, All SPI is routed to pcpu0 that run domain0's vcpu0.
 
  If domain0's vcpu0 run on pcpu0 All SPI is routed to pcpu0
 
  If domain0's vcpu0 run on pcpu1 All SPI is routed to pcpu1

 That is correct.


  these mean that Xen ARM 4.5 can inject spi only to domain0's vcpu0
 
  and Xen ARM 4.5 cannot inject spi to domain0's vcpu1.
 
  Right?

 No, that is wrong. If the guest requests the spis to be routed to
 another vcpu, writing the appropriate values to the virtual GICD, then
 Xen will route the spis to the pcpu running the requested vcpu.

 So if your guest is Linux and you

 echo 2  /proc/irq/SPI_NUMBER/smp_affinity

 then you should see that Xen will start injecting the SPI to vcpu1.



  And is this reason ARM 4.5 don't use maintanance interrupt?

 No, that is just a performance optimization.


  
 
  Thanks
 
   
 
  -Original Message-
  From: Stefano Stabellinistefano.stabell...@eu.citrix.com
  To: 신정섭supsup5...@naver.com;
  Cc: Stefano Stabellinistefano.stabell...@eu.citrix.com; Ian 
  Campbellian.campb...@citrix.com; xen-devel@lists.xen.org;
  Sent: 2015-04-21 (화) 02:25:09
  Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any Domain0's 
  vCPU on Xen ARM 4.5.x?
   
 
  On Mon, 20 Apr 2015, 신정섭 wrote:
   Thanks your rely. But sorry i can't understand your explanation fully.
  
    
  
   I don't want to change GICD setting. I only want to change target 
   Domain0' vcpu injected SPI. vcpu0 or vcpu1.
  
    
  
   I understand like below.
  
   In Xen4.4, vgic_vcpu_inject_irq() can inject SPI to any Domain0's vcpu on 
   any pcpu.
  
   But int Xen4.5 vgic_vcpu_inject_irq() can inject SPI on only pcpu that 
   receive SPI from GICD.
  
   Right?
 
  Yes, if you meant the virtual GICD (not the physical GICD).
 
  I'll repeat:
 
  In Xen 4.5 vgic_vcpu_inject_irq can inject a given SPI only to the pcpu
  that is set to run the vcpu that should receive the interrupt, as per
  the vGICD configuration.
 
  So if you
 
  echo VCPU_NUMBER  /proc/irq/IRQ_NUMBER/smp_affinity
 
  in the guest, it should work and it should have a concrete effect in the
  delivery of the physical interrupt.
 
 
    
  
    
  
   -Original Message-
   From: Stefano Stabellinistefano.stabell...@eu.citrix.com
   To: 신정섭supsup5...@naver.com;
   Cc: Ian Campbellian.campb...@citrix.com; xen-devel@lists.xen.org; 
   Stefano
 Stabellinistefano.stabell...@eu.citrix.com;
   Sent: 2015-04-20 (월) 19:49:50
   Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any 
   Domain0's vCPU on Xen ARM 4.5.x?
    
  
   In Xen 4.5 we rely on the fact that the physical irq is routed to the
   physical cpu running the vcpu of the domain that needs to receive the
   corresponding virq.
  
   So if you want to inject IRQ 100 to CPU 1, while Dom0 is set to receive
   vIRQ 100 (virtual irq corresponding to IRQ 100) to vcpu0, running on
   CPU 0, that won't work.
  
  
   On Sat, 18 Apr 2015, 신정섭 wrote:
NO
   
 
   
Peripheral IRQ routing means that  
   
Xen select itself one of domain0's vCPU to inject periperal IRQ.
   
 
   
So below Simple peripheral IRQ routing Code is a Example of Peripheral 
IRQ routing.
   
periperal IRQ is injected to Domain0' vcpu0 or vcpu1 without vGIC 
Information.
   
 
   
I know that periperal IRQ can be process on any cpu in linux.
   
So All Domain0's vcpu can process periperal IRQ injected by Xen.
   
 
   
On Xen 4.4.1 my simple Simple peripheral irq routing Code is working 
well. (below)
   
But Xen 4.5.0 it dosen't.
   
 
   
 
   
-Original Message-
From: Ian Campbellian.campb...@citrix.com
To: 신정섭supsup5...@naver.com;
Cc: xen-devel@lists.xen.org; Stefano 
Stabellinistefano.stabell...@eu.citrix.com;
Sent: 2015-04-17 (금) 18:49:39
Subject: Re: [Xen-devel] Is it ok to routing periperal irq to any 
Domain0's vCPU on Xen ARM 4.5.x?
 
   
On Fri, 2015-04-17 at 11:36 +0900, 신정섭 wrote:


 I'm studying periperal irq routing to Domain0's vCPU
   
What do you mean by peripheral irq routing? Do you mean 

Re: [Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 12:33, ian.jack...@eu.citrix.com wrote:
 In 952944f7 QEMU_TAG update my tag update script mangled the
 machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
 assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
 which ought to have been replaced.
 
 The result was that:
  * From that commit on, QEMU_TAG was no longer honoured although
QEMU_TRADITIONAL_REVISION still was
  * That particular update to QEMU_TRADITIONAL_REVISION's default
value was effective
  * The next attempt to update QEMU_TRADITIONAL_REVISION, in
1fc3aeb3 libxl: use new QEMU xenstore protocol was totally
ineffective.
 
 Fix this by restoring the transfer from QEMU_TAG.  The effects are:
  * Once more, honour QEMU_TAG.
  * Belatedly apply the qemu-trad change part of libxl: use new QEMU
xenstore protocol.
 
 (I have also fixed my script to not do this again.)
 
 Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com
 Reported-by: Jan Beulich jbeul...@suse.com

Acked-by: Jan Beulich jbeul...@suse.com

(and please also for 4.5)


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 5/7] Add new script to customize nested test configuration

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote:
 1. In this script, make some appropriate runvars which selecthost would
 recognise.
 2. Prepare the configurations for installing L2 guest VM.
 3. Create a lv disk in L0 and hot-attach it to L1, need to restart L1 to
 make the block disk to be recognized by L1 system, then using this disk
 to create a VG that used for installing L2.
 
 Signed-off-by: longtao.pang longtaox.p...@intel.com
 ---
 Changes in v8:
 1. Replace '$nested_host' by '$l1-{Guest}'.
 ---
  ts-nested-setup |   52 
  1 file changed, 52 insertions(+)
  create mode 100755 ts-nested-setup
 
 diff --git a/ts-nested-setup b/ts-nested-setup
 new file mode 100755
 index 000..41d5e80
 --- /dev/null
 +++ b/ts-nested-setup
 @@ -0,0 +1,52 @@
 +#!/usr/bin/perl -w
 +# This is part of osstest, an automated testing framework for Xen.
 +# Copyright (C) 2015 Intel Inc.
 +#
 +# This program is free software: you can redistribute it and/or modify
 +# it under the terms of the GNU Affero General Public License as published by
 +# the Free Software Foundation, either version 3 of the License, or
 +# (at your option) any later version.
 +#
 +# This program is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU Affero General Public License for more details.
 +#
 +# You should have received a copy of the GNU Affero General Public License
 +# along with this program.  If not, see http://www.gnu.org/licenses/.
 +
 +use strict qw(vars);
 +use DBI;
 +use Osstest;
 +use Osstest::Debian;
 +use Osstest::TestSupport;
 +
 +tsreadconfig();
 +our ($l0,$l1) = ts_get_host_guest(@ARGV);
 +
 +guest_check_ip($l1);
 +target_cmd_root($l1, mkdir -p /home/osstest/.ssh  cp 
 /root/.ssh/authorized_keys /home/osstest/.ssh/);
 +my $url = $c{WebspaceUrl}.$c{WebspaceCommon}.$l0-{Name}._.'overlay.tar';
 +target_cmd_root($l1, END);
 +wget -O overlay.tar $url
 +tar -xf overlay.tar -C /
 +rm overlay.tar -f
 +update-rc.d osstest-confirm-booted start 99 2 .
 +END

I cc'd you on some patches which I think should help avoid this
duplication.

 +
 +store_runvar('nested_l1',$l1-{Guest});

I'm not sure what this is for and it would normally be wrong to hardcode
nested_l1 like that, where is it used?

Most places you seem to use nestedl1, not nested_l1. I think you ended
up with this due to not using guest_var and the various hardcoding
you've done elsewhere. I suspect with all that fixed and make-flight
updated accordingly you won't need this var any more.

 +store_runvar($l1-{Guest}_ip,$l1-{Ip});
 +
 +my $l2_disk_mb = 2;
 +my $l2_lv_name = 'nestedl2-disk';
 +my $vgname = $l0-{Name};
 +$vgname .= .$c{TestHostDomain} if ($l0-{Suite} =~ m/lenny/);
 +target_cmd_root($l0, END);
 +lvremove -f /dev/${vgname}/${l2_lv_name} ||:
 +lvcreate -L ${l2_disk_mb}M -n $l2_lv_name $vgname
 +dd if=/dev/zero of=/dev/${vgname}/${l2_lv_name} count=10

I think you can do all of this by passing a suitable l2 $gho to
prepareguest_part_lvmdisk, can't you?

I think you can get that by my $l2 = selectguest($ARGV[], $l1).

Where ARGV[] is a new parameter passed by sg-run-job i.e. nestedl2,
i.e. the one after whatever ts_get_host_guest consumes at the top of the
file (so ARGV[2] perhaps?).

Once you have that $l2 you can then use guest_var for e.g. the size,
which will be good because it will be picked up by your modifications to
ts-debian-hvm-install such that they automatically match.

 +xl block-attach $l1-{Name} /dev/${vgname}/${l2_lv_name},raw,sdb,rw

You use sdb here, but xvdb below. I think xvdb would work here, or to
avoid HVM confusion wrt SCSI vs PV perhaps use xvde throughout (since it
has no emulated counterpart by default IIRC)?

 +END
 +target_install_packages_norec($l1, qw(lvm2 rsync genisoimage));
 +target_reboot($l1);
 +target_cmd_root($l1, pvcreate /dev/xvdb  vgcreate ${l2_lv_name}_vg 
 /dev/xvdb);



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments

2015-04-21 Thread Rusty Russell
Andy Lutomirski l...@kernel.org writes:
 On 04/20/2015 10:09 AM, Andrew Cooper wrote:
 There appears to be no formal statement of what pv_irq_ops.save_fl() is
 supposed to return precisely.  Native returns the full flags, while lguest 
 and
 Xen only return the Interrupt Flag, and both have comments by the
 implementations stating that only the Interrupt Flag is looked at.  This may
 have been true when initially implemented, but no longer is.

 To make matters worse, the Xen PVOP leaves the upper bits undefined, making
 the BUG_ON() undefined behaviour.  Experimentally, this now trips for 32bit 
 PV
 guests on Broadwell hardware.  The BUG_ON() is consistent for an individual
 build, but not consistent for all builds.  It has also been a sitting 
 timebomb
 since SMAP support was introduced.

 Use native_save_fl() instead, which will obtain an accurate view of the AC
 flag.

That should work for lguest.  Indeed, it does (in practice those bits
are 0).

Tested-by: Rusty Russell ru...@rustcorp.com.au (lguest)

Thanks,
Rusty.

 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Thomas Gleixner t...@linutronix.de
 CC: Ingo Molnar mi...@redhat.com
 CC: H. Peter Anvin h...@zytor.com
 CC: x...@kernel.org
 CC: linux-ker...@vger.kernel.org
 CC: Konrad Rzeszutek Wilk konrad.w...@oracle.com
 CC: Boris Ostrovsky boris.ostrov...@oracle.com
 CC: David Vrabel david.vra...@citrix.com
 CC: xen-devel xen-devel@lists.xen.org
 CC: Rusty Russell ru...@rustcorp.com.au
 CC: lgu...@lists.ozlabs.org

 ---
 This patch is RFC because I am not certain that native_save_fl() is
 necessarily the correct solution on lguest, but it does seem that 
 setup_smap()
 wants to check the actual AC bit, rather than an idealised value.

 A different approach, given the dual nature of the AC flag now is to gate
 setup_smap() on a kernel rpl of 0.  SMAP necessarily can't be used in a
 paravirtual situation where the kernel runs in cpl  0.

 Another different approach would be to formally state that
 pv_irq_ops.save_fl() needs to return all the flags, which would make
 local_irq_save() safe to use in this circumstance, but that makes a hotpath
 longer for the sake of a single boot time check.

 ...which reminds me:

 Why does native_restore_fl restore anything other than IF?  A branch and 
 sti should be considerably faster than popf.

 Also, if we did this, could Xen use PVI and then use native_restore_fl 
 and avoid lots of pvops?

 --Andy

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v20 02/13] x86/VPMU: Add public xenpmu.h

2015-04-21 Thread Jan Beulich
 On 20.04.15 at 18:38, boris.ostrov...@oracle.com wrote:
 On 04/20/2015 04:50 AM, Jan Beulich wrote:
 On 09.04.15 at 17:44, boris.ostrov...@oracle.com wrote:
 --- /dev/null
 +++ b/xen/include/public/pmu.h
 @@ -0,0 +1,38 @@
 +#ifndef __XEN_PUBLIC_PMU_H__
 +#define __XEN_PUBLIC_PMU_H__
 +
 +#include xen.h
 +#if defined(__i386__) || defined(__x86_64__)
 +#include arch-x86/pmu.h
 +#elif defined (__arm__) || defined (__aarch64__)
 +#include arch-arm.h
 +#else
 +#error Unsupported architecture
 +#endif
 +
 +#define XENPMU_VER_MAJ0
 +#define XENPMU_VER_MIN1
 +
 +
 +/* Shared between hypervisor and PV domain */
 +struct xen_pmu_data {
 Iirc this sharing is r/o - if so, please state so in the comment. If not,
 please extend the comment to briefly explain why writable sharing
 is safe/secure.
 
 This data structure is writeable by guest (specifically, PMU registers 
 and APIC_LVTPC). There is a flag (PMU_CACHED, which is part of this 
 structure) that the hypervisor sets to let the guest know that it can 
 write those fields without having to trap. When the guest is done, it 
 issues XENPMU_flush command and the hypervisor writes out those values 
 to HW.
 
 I'll update the comments to make this clear.

I think you'll actually want to state for each of the fields who reads
and who writes them. In particular for (I hope) obvious reasons
some (most?) of the fields would apparently need to be documented
write-only by the hypervisor.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] [RFC] x86/cpu: Fix SMAP check in PVOPS environments

2015-04-21 Thread Jan Beulich
 On 20.04.15 at 19:09, andrew.coop...@citrix.com wrote:
 A different approach, given the dual nature of the AC flag now is to gate
 setup_smap() on a kernel rpl of 0.  SMAP necessarily can't be used in a
 paravirtual situation where the kernel runs in cpl  0.

Can't isn't true here - for 64-bit PV Xen guests, which already
toggle between two page table variants for kernel and user mode,
it would be possible (but perhaps expensive) to mimic the needed
behavior by introducing a 3rd set of page tables, containing only
the kernel mappings. You may recall that I had even posted an
RFC patch to tat effect about a year ago.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/configure.ac: generate Paths.mk if it's not available

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 12:30 +0100, Wei Liu wrote:
 On Tue, Apr 21, 2015 at 11:54:27AM +0100, Ian Campbell wrote:
  On Tue, 2015-04-21 at 11:16 +0100, Wei Liu wrote:
   On Mon, Apr 20, 2015 at 03:07:38PM +0100, Wei Liu wrote:
Xen toolstack references many variables in Paths.mk when building and
installing, so tools' configure should generate Paths.mk if it's not
available. Also make inclusion of Paths.mk mandatory in Tools.mk.
   
   Hmm... I just discovered that docs build also involves Paths.mk.  This
   patch is ugly enough that I don't want to duplicate it for docs.
   So advise on how to fix this would be much appreciated.
  
  I wasn't terribly happy with having more than one place update Paths.mk
  already.
  
   Or we can state clear that anyone who builds Xen from source needs to
   run ./configure in top level directory, not the ones in subsystems.
  
  I think that's essentially what we've done so far, but it's not terribly
  satisfactory I'll admit. Is this the only issue which prevents this?
  
  Perhaps change each subsystem to generate+consume its own Paths
  ${subsys}.mk instead of a single global one? Either in config/Paths
  ${subsys.mk} or in ${subsys}/Paths.mk.
  
  If you invoke from the top-level then they will all end up with the same
  contents, but so what...
  
 
 I can try to refactor Paths.mk.in into several files.

I was suggesting a single input but create multiple outputs, otherwise
we have to keep all the inputs in sync.

I expect the majority of paths are common to all sub components.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 13:33 +0100, Ian Jackson wrote:
 Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in 
 TestSupport.pm for nested test):
  On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote:
   --- a/Osstest/TestSupport.pm
   +++ b/Osstest/TestSupport.pm
 ...
   +if ( $r{${name}_ip} ) {
   +$setprop-(IpAddr, $r{${name}_ip});
   +}
  
  This is one for Ian I think, but I suspect this should use ${ident}
  rather than ${name}. 
 
 Yes.
 
  ${ident} is e.g. 'host' or 'srchost' or 'nestedl1' it is the prefer used
  on the runvar names. ${name} is the specific value assigned, i.e. an
  actual host name.
  
  For such properties we usually prefer ${ident}_foo. Ian, correct me if
  I'm wrong please.
 
 Ian C is right.
 
 
 But, I think actually the principle behind this change is wrong.
 
 It seems to be copying information from runvars into the in-memory
 data structure for host properties.  But host properties are (by
 definition) matters of (fixed) configuration, not runtime definition.

Remember that here the host is actual an L1 virtual machine, whose IP
is not fixed (since it mac address isn't).

I don't know if that affects your opinion at all?

 I haven't looked at the rest of the series, but the same effect should
 be achieved in a different way.  Note that a $ho is a hash which
 already contains (after selecthost, say) an element with key `Ip'.
 
 So the right place to do this is indeed probably somewhere around
 selectguest or selecthost, but the information should be put into
 $ho-{Ip} without going via host properties.
 
 Thanks,
 Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: qemu crash of hvm domUs with spice (backtrace included)

2015-04-21 Thread Stefano Stabellini
On Tue, 21 Apr 2015, Fabio Fantoni wrote:
 Il 21/04/2015 12:49, Stefano Stabellini ha scritto:
  On Mon, 20 Apr 2015, Fabio Fantoni wrote:
   I updated xen and qemu from xen 4.5.0 with its upstream qemu included to
   xen
   4.5.1-pre with qemu upstream from stable-4.5 (changed Config.mk to use
   revision master).
   After few minutes I booted windows 7 64 bit domU qemu crash, tried 2 times
   with same result.
   
   In the domU's qemu log:
qemu-system-i386: malloc.c:3096: sYSMALLOc: Assertion `(old_top ==
(((mbinptr) (((char *) ((av)-bins[((1) - 1) * 2])) -
__builtin_offsetof
(struct malloc_chunk, fd  old_size == 0) || ((unsigned long)
(old_size) = (unsigned long)__builtin_offsetof (struct
malloc_chunk,
fd_nextsize))+((2 * (sizeof(size_t))) - 1))  ~((2 * (sizeof(size_t))) -
1)))  ((old_top)-size  0x1)  ((unsigned long)old_end  pagemask)
==
0)' failed.
Killing all inferiors
   In attachment the full backtrace of qemu crash.
   
   With a fast search after I saw the backtrace I found a probable cause of
   regression (I'm not sure):
   http://xenbits.xen.org/gitweb/?p=staging/qemu-upstream-4.5-testing.git;a=commit;h=5c3402816aaddb15156c69df73c54abe4e1c76aa
   spice: make sure we don't overflow ssd-buf
   
   Added also qemu-devel and spice-devel as cc.
   
   If you need more informations/tests tell me and I'll post them.
Maybe you could try to revert the offending commit
  (5c3402816aaddb15156c69df73c54abe4e1c76aa)? Or even better bisect the
  crash?
 Thanks for your reply.
 
 I reverted to 4.5.0 on dom0 for now on that system because I'm busy trying to
 found another problem that cause very bad performance without errors or
 nothing in logs :( I don't know if if xen related, kernel related or other for
 now.
 
 About this regression with spice I'll do further tests in next days (probably
 starting reverting the spice patch in qemu) but any help is appreciated.
 Based on data I have for now is possible that the problem is that qemu try to
 allocate other ram or videoram after domU create but with xen is not possible?
 In the spice related patch I saw something about dynamic allocation for
 example.

It is probably caused by a commit in the range:

1ebb75b1fee779621b63e84fefa7b07354c43a99..0b8fb1ec3d666d1eb8bbff56c76c5e6daa2789e4

there are only 10 commits in that range. By using git bisect you should
be able to narrow it down in just 3 tests.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] x86/asm/irq: Don't use POPF but STI

2015-04-21 Thread Borislav Petkov
On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote:
 From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001
 From: Ingo Molnar mi...@kernel.org
 Date: Tue, 21 Apr 2015 13:32:13 +0200
 Subject: [PATCH] x86/asm/irq: Don't use POPF but STI
 
 So because the POPF instruction is slow and STI is faster on 
 essentially all x86 CPUs that matter, instead of:
 
   81891848:   9d  popfq
 
 we can do:
 
   81661a2e:   41 f7 c4 00 02 00 00test   $0x200,%r12d
   81661a35:   74 01   je 81661a38 
 snd_pcm_stream_unlock_irqrestore+0x28
   81661a37:   fb  sti
   81661a38:
 
 This bloats the kernel a bit, by about 1K on the 64-bit defconfig:
 
textdata bss dec hex filename
122586341812120 1085440 15156194 e743e2 vmlinux.before
122595821812120 1085440 15157142 e74796 vmlinux.after
 
 the other cost is the extra branching, adding extra pressure to the
 branch prediction hardware and also potential branch misses.

Do we care? After we enable interrupts, we'll most likely go somewhere
cache cold anyway, so the branch misses will happen anyway.

The question is, would the cost drop from POPF - STI cover the increase
in branch misses overhead?

Hmm, interesting.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv3 1/4] x86: provide xadd()

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 14:36, david.vra...@citrix.com wrote:
 On 21/04/15 11:36, Jan Beulich wrote:
 On 21.04.15 at 12:11, david.vra...@citrix.com wrote:
 +static always_inline unsigned long __xadd(
 +volatile void *ptr, unsigned long v, int size)
 +{
 +switch ( size )
 +{
 +case 1:
 +asm volatile ( lock; xaddb %b0,%1
 +   : +r (v), +m (*__xg((volatile void *)ptr))
 +   :: memory);
 +return v;
 
 This doesn't seem to guarantee to return the old value: When the
 passed in v has more than 8 significant bits (which will get ignored
 as input), nothing will zap those bits from the register. Same for
 the 16-bit case obviously.
 
 +#define xadd(ptr, v) ({ \
 +__xadd((ptr), (unsigned long)(v), sizeof(*(ptr)));  \
 +})
 
 Assuming only xadd() is supposed to be used directly, perhaps
 the easiest would be to cast v to typeof(*(ptr)) (instead of
 unsigned long) here?
 
 I don't see how this helps.  Did you perhaps mean cast the result?
 
 #define xadd(ptr, v) ({\
 (typeof *(ptr))__xadd(ptr, (unsigned long)(v), \
   sizeof(*(ptr))); \
 })

Casting the result would work too; casting the input would have
the same effect because (as said) the actual xadd doesn't alter
bits 8...63 (or 16...63 in the 16-bit case), i.e. whether zero
extension happens before or after doing the xadd doesn't matter.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Andrew Cooper
On 21/04/15 13:56, Boris Ostrovsky wrote:

 On 04/21/2015 03:01 AM, Jan Beulich wrote:
 On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote:
 Changes in v7:
 * Break from the loop when -ENODEV is encountered
 This seems pretty inefficient for the caller. Returning a bad
 identifier other than XEN_INVALID_NODE_ID would seem better
 to me.

 That would mean that there is really no reason to return -ENODEV,
 which I thought you wanted to see. In that case (i.e. if no error
 needs to be returned) yes, a new token would make things simpler.


 --- a/docs/misc/xsm-flask.txt
 +++ b/docs/misc/xsm-flask.txt
 @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
* XEN_SYSCTL_cpupool_op
* XEN_SYSCTL_scheduler_op
* XEN_SYSCTL_coverage_op
 + * XEN_SYSCTL_pcitopoinfo
 No additions to this list are permitted. Either the new sub-op is
 disaggregation safe (which it looks to be), or it can't be accepted.

 True, it *is* safe, but why then cputopoinfo and numainfo are in this
 list? They look to be safe as well.

This list includes not yet audited.  It is quite likely that some
entries in the list are safe.

fwiw, neither cputopo nor numainfo are currently safe for very large
systems.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] VT-d: replace bogus gprintk()

2015-04-21 Thread Tian, Kevin
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Tuesday, April 21, 2015 3:31 PM
 
 Just like the other messages in this function this one should be issued
 through plain printk() - the current vCPU is irrelevant here. (Noticed
 while backporting to older trees, which don't have gprintk().)
 
 Signed-off-by: Jan Beulich jbeul...@suse.com

Acked-by: Kevin Tian kevin.t...@intel.com

 
 --- a/xen/drivers/passthrough/vtd/iommu.c
 +++ b/xen/drivers/passthrough/vtd/iommu.c
 @@ -858,8 +858,8 @@ static int iommu_page_fault_do_one(struc
  break;
  }
 
 -gprintk(XENLOG_G_WARNING VTDPREFIX, %s: reason %02x - %s\n,
 -kind, fault_reason, reason);
 +printk(XENLOG_G_WARNING VTDPREFIX %s: reason %02x - %s\n,
 +   kind, fault_reason, reason);
 
  if ( iommu_verbose  fault_type == DMA_REMAP )
  print_vtd_entries(iommu, PCI_BUS(source_id),
 PCI_DEVFN2(source_id),
 
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 3/9] xen/arm: Data abort exception (R/W) mem_access events

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote:
 This patch enables to store, set, check and deliver LPAE R/W mem_events.
 As the LPAE PTE's lack enough available software programmable bits,
 we store the permissions in a Radix tree. The tree is only looked at if
 mem_access_enabled is turned on.
 
 Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de

Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 4/9] xen/arm: Instruction prefetch abort (X) mem_access event handling

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote:
 Add missing structure definition for iabt and update the trap handling
 mechanism to only inject the exception if the mem_access checker
 decides to do so.
 
 Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de

I'm not super thrilled about the TLB flush here, but it does seem to be
unavoidable, at least with our combined level of cunning right at the
moment, so:

Acked-by: Ian Campbell ian.campb...@citrix.com

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCHv3 1/4] x86: provide xadd()

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 15:15, david.vra...@citrix.com wrote:
 On 21/04/15 14:12, Jan Beulich wrote:
 On 21.04.15 at 14:36, david.vra...@citrix.com wrote:
 On 21/04/15 11:36, Jan Beulich wrote:
 On 21.04.15 at 12:11, david.vra...@citrix.com wrote:
 +static always_inline unsigned long __xadd(
 +volatile void *ptr, unsigned long v, int size)
 +{
 +switch ( size )
 +{
 +case 1:
 +asm volatile ( lock; xaddb %b0,%1
 +   : +r (v), +m (*__xg((volatile void *)ptr))
 +   :: memory);
 +return v;

 This doesn't seem to guarantee to return the old value: When the
 passed in v has more than 8 significant bits (which will get ignored
 as input), nothing will zap those bits from the register. Same for
 the 16-bit case obviously.

 +#define xadd(ptr, v) ({ \
 +__xadd((ptr), (unsigned long)(v), sizeof(*(ptr)));  \
 +})

 Assuming only xadd() is supposed to be used directly, perhaps
 the easiest would be to cast v to typeof(*(ptr)) (instead of
 unsigned long) here?

 I don't see how this helps.  Did you perhaps mean cast the result?

 #define xadd(ptr, v) ({\
 (typeof *(ptr))__xadd(ptr, (unsigned long)(v), \
   sizeof(*(ptr))); \
 })
 
 Casting the result would work too; casting the input would have
 the same effect because (as said) the actual xadd doesn't alter
 bits 8...63 (or 16...63 in the 16-bit case), i.e. whether zero
 extension happens before or after doing the xadd doesn't matter.
 
 Oh yes, of course.  Any preference to which method?

Not really - I guess the way you proposed it is the more obvious
one.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
 On 20/04/15 16:06, Tamas K Lengyel wrote:
  The current implementation of three memops, XENMEM_current_reservation,
  XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
  int. However, in ARM64 we could potentially have 36-bit pfn's, thus
  in preparation for the ARM patch, in this patch we update the existing
  memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info
  as a uin64_t.
 
  This patch also adds error checking on the toolside in case the memop
  fails.
 
  Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
 
 XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
 
 You cannot make adjustments like this, but you can add a brand new op
 with appropriate parameters and list the old ops as deprecated.

Right. For the benefit of callers using the old API it seems what we
usually do is rename the old op XENMEM_foo_compat and use the name with
a new number for the new functionality, then use a
__XEN_INTERFACE_VERSION__ to #define back to the old name.

The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
reasonable example, I couldn't find one specifically for the memory ops.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Boris Ostrovsky


On 04/21/2015 03:01 AM, Jan Beulich wrote:

On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote:

Changes in v7:
* Break from the loop when -ENODEV is encountered

This seems pretty inefficient for the caller. Returning a bad
identifier other than XEN_INVALID_NODE_ID would seem better
to me.


That would mean that there is really no reason to return -ENODEV, which 
I thought you wanted to see. In that case (i.e. if no error needs to be 
returned) yes, a new token would make things simpler.





--- a/docs/misc/xsm-flask.txt
+++ b/docs/misc/xsm-flask.txt
@@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
   * XEN_SYSCTL_cpupool_op
   * XEN_SYSCTL_scheduler_op
   * XEN_SYSCTL_coverage_op
+ * XEN_SYSCTL_pcitopoinfo

No additions to this list are permitted. Either the new sub-op is
disaggregation safe (which it looks to be), or it can't be accepted.


True, it *is* safe, but why then cputopoinfo and numainfo are in this 
list? They look to be safe as well.



-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v2 05/11] vmx: add new data structure member to support PML

2015-04-21 Thread Tian, Kevin
 From: Kai Huang [mailto:kai.hu...@linux.intel.com]
 Sent: Tuesday, April 21, 2015 2:05 PM
 
 
 On 04/17/2015 10:31 AM, Kai Huang wrote:
 
 
  On 04/17/2015 06:39 AM, Tian, Kevin wrote:
  From: Kai Huang [mailto:kai.hu...@linux.intel.com]
  Sent: Wednesday, April 15, 2015 3:04 PM
 
  A new 4K page pointer is added to arch_vmx_struct as PML buffer for
  vcpu.
  And a
  new 'status' field is added to vmx_domain to indicate whether PML is
  enabled
  for
  the domain or not. The 'status' field also can be used for further
  similiar
  purpose.
  not sure about the last sentence. what's the similar purpose to
  whether PML
  is enabled? :-)
  I mean potentially there might be such feature in the future, and I
  can't give you an example right now. If you are just commenting the
  description here but fine with the current code, I can remove that
  last sentence if you like. Or do you suggest to just use a bool_t
  pml_enabled? I am fine with both, but looks there's no objection from
  others so I intend to keep it as 'unsigned int status', if you agree.
 Hi Kevin,
 
 What's your opinion here? Is 'unsigned int status' OK to you?

yes

 
 
 
  Note both new members don't have to be initialized to zero
  explicitly as both
  vcpu and domain structure are zero-ed when they are created.
  no initialization in this patch, so why explaining it here?
  OK. Looks it's a common sense to all of you so I'll just remove this
  sentence.
 
 
  Signed-off-by: Kai Huang kai.hu...@linux.intel.com
  ---
xen/include/asm-x86/hvm/vmx/vmcs.h | 7 +++
1 file changed, 7 insertions(+)
 
  diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h
  b/xen/include/asm-x86/hvm/vmx/vmcs.h
  index f831a78..2c679ac 100644
  --- a/xen/include/asm-x86/hvm/vmx/vmcs.h
  +++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
  @@ -70,8 +70,12 @@ struct ept_data {
cpumask_var_t synced_mask;
};
 
  +#define _VMX_DOMAIN_PML_ENABLED0
  +#define VMX_DOMAIN_PML_ENABLED (1ul 
  _VMX_DOMAIN_PML_ENABLED)
struct vmx_domain {
unsigned long apic_access_mfn;
  +/* VMX_DOMAIN_* */
  +unsigned long status;
};
 
struct pi_desc {
  @@ -142,6 +146,9 @@ struct arch_vmx_struct {
/* Bitmap to control vmexit policy for Non-root
 VMREAD/VMWRITE */
struct page_info *vmread_bitmap;
struct page_info *vmwrite_bitmap;
  +
  +#define NR_PML_ENTRIES   512
  +struct page_info *pml_pg;
  move the macro out of the structure.
  OK. I will move it just above the declaration of struct arch_vmx_struct.
 
  and is pml_buffer or pml_buf more clear?
 
  To me pml_buffer or pml_buf is more likely a virtual address you can
  access the buffer directly, while pml_pg indicates it's a pointer of
  struct page_info. If you you look at patch 6, you can find statements
  like:
 
  uint64_t *pml_buf;
 
  pml_buf = __map_domain_page(v-arch.hvm_vmx.pml_pg);
 
  So I intend to keep it.
 And this one? Are you OK with 'pml_pg'?
 

good to me too.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote:
 -paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
 +static paddr_t __p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)

I'd have been inclined to make this take a p2m_domain* rather than a
domain*, on the basis that the caller must have one in hand to have
locked it. I don't think __p2m_lookup uses d other than to find the p2m.

Not worth changing unless there is some other reason to resend though,
so with or without that:

Acked-by: Ian Campbell ian.campb...@citrix.com

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 14:56, boris.ostrov...@oracle.com wrote:

 On 04/21/2015 03:01 AM, Jan Beulich wrote:
 On 17.04.15 at 18:59, boris.ostrov...@oracle.com wrote:
 Changes in v7:
 * Break from the loop when -ENODEV is encountered
 This seems pretty inefficient for the caller. Returning a bad
 identifier other than XEN_INVALID_NODE_ID would seem better
 to me.
 
 That would mean that there is really no reason to return -ENODEV, which 
 I thought you wanted to see. In that case (i.e. if no error needs to be 
 returned) yes, a new token would make things simpler.

All I asked for was to make the two cases (no device and device
with no known node) distinguishable.

 --- a/docs/misc/xsm-flask.txt
 +++ b/docs/misc/xsm-flask.txt
 @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
* XEN_SYSCTL_cpupool_op
* XEN_SYSCTL_scheduler_op
* XEN_SYSCTL_coverage_op
 + * XEN_SYSCTL_pcitopoinfo
 No additions to this list are permitted. Either the new sub-op is
 disaggregation safe (which it looks to be), or it can't be accepted.
 
 True, it *is* safe, but why then cputopoinfo and numainfo are in this 
 list? They look to be safe as well.

Because at the time the list got composed we weren't able to
spend time looking at _all_ of the operations, and hence we
simply added them all (with a few exceptions on the domctl side
iirc). Quite likely the two you mention could be removed from
the list, but such needs to happen with proper reasoning in the
patch description. Feel free to submit a patch.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Boris Ostrovsky


On 04/21/2015 09:14 AM, Andrew Cooper wrote:

On 21/04/15 13:56, Boris Ostrovsky wrote:

On 04/21/2015 03:01 AM, Jan Beulich wrote:

--- a/docs/misc/xsm-flask.txt
+++ b/docs/misc/xsm-flask.txt
@@ -121,6 +121,7 @@ __HYPERVISOR_sysctl (xen/include/public/sysctl.h)
* XEN_SYSCTL_cpupool_op
* XEN_SYSCTL_scheduler_op
* XEN_SYSCTL_coverage_op
+ * XEN_SYSCTL_pcitopoinfo

No additions to this list are permitted. Either the new sub-op is
disaggregation safe (which it looks to be), or it can't be accepted.

True, it *is* safe, but why then cputopoinfo and numainfo are in this
list? They look to be safe as well.

This list includes not yet audited.  It is quite likely that some
entries in the list are safe.

fwiw, neither cputopo nor numainfo are currently safe for very large
systems.


Why would safety of an operation depend on system size? And how are 
these two unsafe? (Because maybe then PCI topology query is unsafe in 
the same manner)


-boris


-boris


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote:
   I think we need to disable the build on architectures other than x86,
   see grub for example

Eventually we might want to build our own grub on ARM in order to pick
up Fu Wei's multiboot for arm64 patches, until they enter distros?

Or maybe Raisin on UEFI should be calling efibootmgr to register Xen
directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it
currently works even on x86.

  Do we?  There's no reason a blktap2 kernel module couldn't be built on
  ARM, is there?
 
 Maybe not, but I am pretty sure that it doesn't work at the moment. I
 don't think that the userspace stuff even compiles on ARM.
 Eventually we might have blktap on ARM, but I don't want to enable
 stuff in Raisin that we know it does not work.

Especially if it is already to a greater or lesser extent deprecated (in
favour of eventual blktap3) even on x86.

  And in any case the long-term plan is to get blktap3 working as well,
  which doesn't require kernel-space drivers.
  
  Or do you mean on systems other than Linux?
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] effect of ticks in xen credit scheduler

2015-04-21 Thread teabe

Hi,
I am student starting to work with Xen. I am using Xen 4.2.0. I am 
trying to understand the functioning of  the credit scheduler  in Xen. I 
try to modify the scheduler number of ticks per timeslice (form 3 to 1). 
I wonder that the is no difference in the scheduling of my virtual 
machines (No change in CPU allocation % and also application performance 
in the virtual machine). I just wanted to know what is the use of tich 
during time slice? Why do xen define 3 ticks per timeslice?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] tools/configure.ac: generate Paths.mk if it's not available

2015-04-21 Thread Wei Liu
On Mon, Apr 20, 2015 at 03:07:38PM +0100, Wei Liu wrote:
 Xen toolstack references many variables in Paths.mk when building and
 installing, so tools' configure should generate Paths.mk if it's not
 available. Also make inclusion of Paths.mk mandatory in Tools.mk.

Hmm... I just discovered that docs build also involves Paths.mk.  This
patch is ugly enough that I don't want to duplicate it for docs.
So advise on how to fix this would be much appreciated.

Or we can state clear that anyone who builds Xen from source needs to
run ./configure in top level directory, not the ones in subsystems.

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 10:24, li...@eikelenboom.it wrote:
 Tuesday, April 21, 2015, 10:11:07 AM, you wrote:
 Interesting - didn't you say that as a side effect of Andrew's patch
 you saw massive log spam?
 
 If you mean these:
 
 (XEN) [2015-04-12 14:55:20.226] p2m.c:884:d0v0 gfn_to_mfn failed! 
 gfn=001ed type:4
 [...]
 
 Those were actually due to Konrad's kernel patch that was on the devel-4.1 
 branch that has already been dropped. 
 (commit 22d8a8938407cb1342af763e937fdf9ee8daf24a
  'xen/pciback: Don't disable PCI_COMMAND on PCI device reset.')

Ah, okay. Iirc there was no progress towards a resolution there yet?

 For the rest there is some extra log spam now, since the memory maps now are 
 done 
 in very small chunks (the hypercall continuation stuff working?):
 (XEN) [2015-04-21 08:04:01.207] memory_map:add: dom20 gfn=ec780 mfn=cc780 
 nr=40
 [...]
 Don't know if that makes much sense anymore (unless specifically enabled if 
 you 
 want such detail .. and the whole range with perhaps a start and finish 
 message 
 is not enough)

The hypervisor can't really tell whether a re-invocation of said
hypercall is a continuation or a new request. Hence we can only
either drop the message altogether or live with it being spammy
on large regions (it's a XENLOG_G_INFO one anyway, so not
enabled by default, and if enabled usually rate limited).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 13/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDRA

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 13:26 +0500, Julien Grall wrote:
 Hi Ian,
 
 On 17/04/2015 16:51, Ian Campbell wrote:
  Furthermore, this is the only registers not handled on AArch32 for this
  bit. This is rather strange to list them while you didn't do it for the
  trace registers.
 
  My intention was that every register trapped by a bit which we set be
  listed somewhere, to make it easier to cross reference with the docs and
  check we haven't accidentally forgotten something (as opposed to
  deliberately ignoring as indicated by these comments).
 
  You seem to be saying I've missed some trace registers, which ones?
 
 I meant that you didn't list the trace registers trapped but unhandled. 
 Although I wasn't able to find a list, is it trace module specific? If 
 so maybe a comment would be good?

I think maybe you are talking about the things trapped by CPTR_EL2.TTA
rather than MDCR_EL2.TDRA (the subject of this patch)?

The table referenced for CPTR_EL2.TTA just says All implemented trace
registers, rather than listing anything specific. I could add a similar
comment to the relevant patch.

Looks like HCR_EL2.TIDCP is similarly lacking a comment for the
unhandled ones. I'll add one.

 
 Regards,
 



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 07/13] x86: dynamically get/set CBM for a domain

2015-04-21 Thread Chao Peng
On Mon, Apr 20, 2015 at 04:52:09PM +0100, Andrew Cooper wrote:
 On 17/04/15 15:33, Chao Peng wrote:
  For CAT, COS is maintained in hypervisor only while CBM is exposed to
  user space directly to allow getting/setting domain's cache capacity.
  For each specified CBM, hypervisor will either use a existed COS which
  has the same CBM or allocate a new one if the same CBM is not found. If
  the allocation fails because of no enough COS available then error is
  returned. The getting/setting are always operated on a specified socket.
  For multiple sockets system, the interface may be called several times.
 
  Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
  ---
  Changes in v5:
  * Add spin_lock to protect cbm_map.
  ---
  +for ( cos = 0; cos = info-cos_max; cos++ )
  +{
  +/* If still not found, then keep unused one. */
  +if ( !find  cos != 0  map[cos].ref == 0 )
  +find = map + cos;
  +else if ( map[cos].cbm == cbm )
  +{
  +if ( unlikely(cos == old_cos) )
  +return 0;
  +find = map + cos;
  +break;
  +}
  +}
  +
  +/* If old cos is referred only by the domain, then use it. */
  +if ( !find  map[old_cos].ref == 1 )
  +find = map + old_cos;
  +
  +if ( !find )
  +return -EUSERS;
  +
  +cos = find - map;
  +if ( find-cbm != cbm )
  +{
  +ret = write_l3_cbm(socket, cos, cbm);
  +if ( ret )
  +return ret;
  +find-cbm = cbm;
  +}
  +
  +spin_lock(info-cbm_lock);
  +find-ref++;
  +map[old_cos].ref--;
  +spin_unlock(info-cbm_lock);
 
 The spinlock must cover read accesses as well, or old_cos is liable to
 be stale by this point.

You mean map[old_cos].ref and find-ref are stale, right? old_cos itself
seems not need to be protected.
 
 It might be better to split into a rw_lock as it is read often but
 modifications should be very rare.

NP, thanks.

Chao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 3/4] xen: use ticket locks for spin locks

2015-04-21 Thread David Vrabel
Replace the byte locks with ticket locks.  Ticket locks are: a) fair;
and b) peform better when contented since they spin without an atomic
operation.

The lock is split into two ticket values: head and tail.  A locker
acquires a ticket by (atomically) increasing tail and using the
previous tail value.  A CPU holds the lock if its ticket == head.  The
lock is released by increasing head.

Architectures need only provide an xadd() implementation.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/common/spinlock.c  |  110 
 xen/include/xen/spinlock.h |   16 +--
 2 files changed, 72 insertions(+), 54 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 5fd8b1c..a2170d2 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -115,16 +115,32 @@ void spin_debug_disable(void)
 
 #endif
 
+static always_inline spinlock_tickets_t observe_lock(spinlock_tickets_t *t)
+{
+spinlock_tickets_t v;
+
+smp_rmb();
+v.head_tail = read_atomic(t-head_tail);
+return v;
+}
+
+static always_inline u16 observe_head(spinlock_tickets_t *t)
+{
+smp_rmb();
+return read_atomic(t-head);
+}
+
 void _spin_lock(spinlock_t *lock)
 {
+spinlock_tickets_t tickets = { .tail = 1, };
 LOCK_PROFILE_VAR;
 
 check_lock(lock-debug);
-while ( unlikely(!_raw_spin_trylock(lock-raw)) )
+tickets.head_tail = xadd(lock-tickets.head_tail, tickets.head_tail);
+while ( tickets.tail != observe_head(lock-tickets) )
 {
 LOCK_PROFILE_BLOCK;
-while ( likely(_raw_spin_is_locked(lock-raw)) )
-cpu_relax();
+cpu_relax();
 }
 LOCK_PROFILE_GOT;
 preempt_disable();
@@ -132,76 +148,58 @@ void _spin_lock(spinlock_t *lock)
 
 void _spin_lock_irq(spinlock_t *lock)
 {
-LOCK_PROFILE_VAR;
-
 ASSERT(local_irq_is_enabled());
 local_irq_disable();
-check_lock(lock-debug);
-while ( unlikely(!_raw_spin_trylock(lock-raw)) )
-{
-LOCK_PROFILE_BLOCK;
-local_irq_enable();
-while ( likely(_raw_spin_is_locked(lock-raw)) )
-cpu_relax();
-local_irq_disable();
-}
-LOCK_PROFILE_GOT;
-preempt_disable();
+_spin_lock(lock);
 }
 
 unsigned long _spin_lock_irqsave(spinlock_t *lock)
 {
 unsigned long flags;
-LOCK_PROFILE_VAR;
 
 local_irq_save(flags);
-check_lock(lock-debug);
-while ( unlikely(!_raw_spin_trylock(lock-raw)) )
-{
-LOCK_PROFILE_BLOCK;
-local_irq_restore(flags);
-while ( likely(_raw_spin_is_locked(lock-raw)) )
-cpu_relax();
-local_irq_disable();
-}
-LOCK_PROFILE_GOT;
-preempt_disable();
+_spin_lock(lock);
 return flags;
 }
 
 void _spin_unlock(spinlock_t *lock)
 {
+smp_mb();
 preempt_enable();
 LOCK_PROFILE_REL;
-_raw_spin_unlock(lock-raw);
+lock-tickets.head++;
 }
 
 void _spin_unlock_irq(spinlock_t *lock)
 {
-preempt_enable();
-LOCK_PROFILE_REL;
-_raw_spin_unlock(lock-raw);
+_spin_unlock(lock);
 local_irq_enable();
 }
 
 void _spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
 {
-preempt_enable();
-LOCK_PROFILE_REL;
-_raw_spin_unlock(lock-raw);
+_spin_unlock(lock);
 local_irq_restore(flags);
 }
 
 int _spin_is_locked(spinlock_t *lock)
 {
 check_lock(lock-debug);
-return _raw_spin_is_locked(lock-raw);
+return lock-tickets.head != lock-tickets.tail;
 }
 
 int _spin_trylock(spinlock_t *lock)
 {
+spinlock_tickets_t old, new;
+
 check_lock(lock-debug);
-if ( !_raw_spin_trylock(lock-raw) )
+old = observe_lock(lock-tickets);
+if ( old.head != old.tail )
+return 0;
+new = old;
+new.tail++;
+if ( cmpxchg(lock-tickets.head_tail, old.head_tail, new.head_tail)
+ != old.head_tail )
 return 0;
 #ifdef LOCK_PROFILE
 if (lock-profile)
@@ -213,27 +211,32 @@ int _spin_trylock(spinlock_t *lock)
 
 void _spin_barrier(spinlock_t *lock)
 {
+spinlock_tickets_t sample;
 #ifdef LOCK_PROFILE
 s_time_t block = NOW();
-u64  loop = 0;
+#endif
 
 check_barrier(lock-debug);
-do { smp_mb(); loop++;} while ( _raw_spin_is_locked(lock-raw) );
-if ((loop  1)  lock-profile)
+sample = observe_lock(lock-tickets);
+if (sample.head != sample.tail)
 {
-lock-profile-time_block += NOW() - block;
-lock-profile-block_cnt++;
-}
-#else
-check_barrier(lock-debug);
-do { smp_mb(); } while ( _raw_spin_is_locked(lock-raw) );
+while (observe_head(lock-tickets) != sample.tail)
+{
+#ifdef LOCK_PROFILE
+if (lock-profile)
+{
+lock-profile-time_block += NOW() - block;
+lock-profile-block_cnt++;
+}
 #endif
+}
+}
 smp_mb();
 }
 
 int _spin_trylock_recursive(spinlock_t *lock)
 {
-int cpu = smp_processor_id();
+unsigned int cpu = smp_processor_id();
 
 /* Don't 

[Xen-devel] [PATCHv3 4/4] x86, arm: remove asm/spinlock.h from all architectures

2015-04-21 Thread David Vrabel
Now that all architecture use a common ticket lock implementation for
spinlocks, remove the architecture specific byte lock implementations.

Signed-off-by: David Vrabel david.vra...@citrix.com
Reviewed-by: Tim Deegan t...@xen.org
Acked-by: Jan Beulich jbeul...@suse.com
---
 xen/arch/arm/README.LinuxPrimitives  |   28 ---
 xen/include/asm-arm/arm32/spinlock.h |   66 --
 xen/include/asm-arm/arm64/spinlock.h |   63 
 xen/include/asm-arm/spinlock.h   |   23 
 xen/include/asm-x86/spinlock.h   |   37 ---
 xen/include/xen/spinlock.h   |1 -
 6 files changed, 218 deletions(-)
 delete mode 100644 xen/include/asm-arm/arm32/spinlock.h
 delete mode 100644 xen/include/asm-arm/arm64/spinlock.h
 delete mode 100644 xen/include/asm-arm/spinlock.h
 delete mode 100644 xen/include/asm-x86/spinlock.h

diff --git a/xen/arch/arm/README.LinuxPrimitives 
b/xen/arch/arm/README.LinuxPrimitives
index 7f33fc7..3115f51 100644
--- a/xen/arch/arm/README.LinuxPrimitives
+++ b/xen/arch/arm/README.LinuxPrimitives
@@ -25,16 +25,6 @@ linux/arch/arm64/include/asm/atomic.h   
xen/include/asm-arm/arm64/atomic.h
 
 -
 
-spinlocks: last sync @ v3.16-rc6 (last commit: 95c4189689f9)
-
-linux/arch/arm64/include/asm/spinlock.h xen/include/asm-arm/arm64/spinlock.h
-
-Skipped:
-  5686b06 arm64: lockref: add support for lockless lockrefs using cmpxchg
-  52ea2a5 arm64: locks: introduce ticket-based spinlock implementation
-
--
-
 mem*: last sync @ v3.16-rc6 (last commit: d875c9b37240)
 
 linux/arch/arm64/lib/memchr.S   xen/arch/arm/arm64/lib/memchr.S
@@ -103,24 +93,6 @@ linux/arch/arm/include/asm/atomic.h 
xen/include/asm-arm/arm32/atomic.h
 
 -
 
-spinlocks: last sync: 15e7e5c1ebf5
-
-linux/arch/arm/include/asm/spinlock.h   xen/include/asm-arm/arm32/spinlock.h
-
-*** Linux has switched to ticket locks but we still use bitlocks.
-
-resync to v3.14-rc7:
-
-  7c8746a ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
-  0cbad9c ARM: 7854/1: lockref: add support for lockless lockrefs using 
cmpxchg64
-  9bb17be ARM: locks: prefetch the destination word for write prior to strex
-  27a8479 ARM: smp_on_up: move inline asm ALT_SMP patching macro out of 
spinlock.
-  00efaa0 ARM: 7812/1: rwlocks: retry trylock operation if strex fails on free 
lo
-  afa31d8 ARM: 7811/1: locks: use early clobber in arch_spin_trylock
-  73a6fdc ARM: spinlock: use inner-shareable dsb variant prior to sev 
instruction
-
--
-
 mem*: last sync @ v3.16-rc6 (last commit: d98b90ea22b0)
 
 linux/arch/arm/lib/copy_template.S  xen/arch/arm/arm32/lib/copy_template.S
diff --git a/xen/include/asm-arm/arm32/spinlock.h 
b/xen/include/asm-arm/arm32/spinlock.h
deleted file mode 100644
index bc0343c..000
--- a/xen/include/asm-arm/arm32/spinlock.h
+++ /dev/null
@@ -1,66 +0,0 @@
-#ifndef __ASM_ARM32_SPINLOCK_H
-#define __ASM_ARM32_SPINLOCK_H
-
-static inline void dsb_sev(void)
-{
-__asm__ __volatile__ (
-dsb\n
-sev\n
-);
-}
-
-typedef struct {
-volatile unsigned int lock;
-} raw_spinlock_t;
-
-#define _RAW_SPIN_LOCK_UNLOCKED { 0 }
-
-#define _raw_spin_is_locked(x)  ((x)-lock != 0)
-
-static always_inline void _raw_spin_unlock(raw_spinlock_t *lock)
-{
-ASSERT(_raw_spin_is_locked(lock));
-
-smp_mb();
-
-__asm__ __volatile__(
-   str %1, [%0]\n
-:
-: r (lock-lock), r (0)
-: cc);
-
-dsb_sev();
-}
-
-static always_inline int _raw_spin_trylock(raw_spinlock_t *lock)
-{
-unsigned long contended, res;
-
-do {
-__asm__ __volatile__(
-   ldrex   %0, [%2]\n
-   teq %0, #0\n
-   strexeq %1, %3, [%2]\n
-   movne   %1, #0\n
-: =r (contended), =r (res)
-: r (lock-lock), r (1)
-: cc);
-} while (res);
-
-if (!contended) {
-smp_mb();
-return 1;
-} else {
-return 0;
-}
-}
-
-#endif /* __ASM_SPINLOCK_H */
-/*
- * Local variables:
- * mode: C
- * c-file-style: BSD
- * c-basic-offset: 4
- * indent-tabs-mode: nil
- * End:
- */
diff --git a/xen/include/asm-arm/arm64/spinlock.h 
b/xen/include/asm-arm/arm64/spinlock.h
deleted file mode 100644
index 5ae034d..000
--- a/xen/include/asm-arm/arm64/spinlock.h
+++ /dev/null
@@ -1,63 +0,0 @@
-/*
- * Derived from Linux arch64 spinlock.h which is:
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the 

[Xen-devel] [PATCHv3 2/4] arm: provide xadd()

2015-04-21 Thread David Vrabel
xadd() atomically adds a value and returns the previous value.  This
is needed to implement ticket locks.

This generic arm implementation uses the GCC __sync_fetch_and_add()
builtin.  This builtin resulted in suitable inlined asm for GCC 4.8.3
(arm64) and GCC 4.6.3 (arm32).

Signed-off-by: David Vrabel david.vra...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
---
 xen/include/asm-arm/system.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/system.h b/xen/include/asm-arm/system.h
index ce3d38a..367af1d 100644
--- a/xen/include/asm-arm/system.h
+++ b/xen/include/asm-arm/system.h
@@ -51,6 +51,8 @@
 # error unknown ARM variant
 #endif
 
+#define xadd(x, v) __sync_fetch_and_add(x, v)
+
 extern struct vcpu *__context_switch(struct vcpu *prev, struct vcpu *next);
 
 #endif
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCHv3 1/4] x86: provide xadd()

2015-04-21 Thread David Vrabel
xadd() atomically adds a value and returns the previous value.  This
is needed to implement ticket locks.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 xen/include/asm-x86/system.h |   47 ++
 1 file changed, 47 insertions(+)

diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 7111329..f244c8d 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -118,6 +118,53 @@ static always_inline unsigned long __cmpxchg(
 })
 
 /*
+ * Undefined symbol to cause link failure if a wrong size is used with
+ * xadd().
+ */
+extern unsigned long __bad_xadd_size(void);
+
+static always_inline unsigned long __xadd(
+volatile void *ptr, unsigned long v, int size)
+{
+switch ( size )
+{
+case 1:
+asm volatile ( lock; xaddb %b0,%1
+   : +r (v), +m (*__xg((volatile void *)ptr))
+   :: memory);
+return v;
+case 2:
+asm volatile ( lock; xaddw %w0,%1
+   : +r (v), +m (*__xg((volatile void *)ptr))
+   :: memory);
+return v;
+case 4:
+asm volatile ( lock; xaddl %k0,%1
+   : +r (v), +m (*__xg((volatile void *)ptr))
+   :: memory);
+return v;
+case 8:
+asm volatile ( lock; xaddq %q0,%1
+   : +r (v), +m (*__xg((volatile void *)ptr))
+   :: memory);
+
+return v;
+default:
+return __bad_xadd_size();
+}
+}
+
+/*
+ * Atomically add @v to the 1, 2, 4, or 8 byte value at @ptr.  Returns
+ * the previous value.
+ *
+ * This is a full memory barrier.
+ */
+#define xadd(ptr, v) ({ \
+__xadd((ptr), (unsigned long)(v), sizeof(*(ptr)));  \
+})
+
+/*
  * Both Intel and AMD agree that, from a programmer's viewpoint:
  *  Loads cannot be reordered relative to other loads.
  *  Stores cannot be reordered relative to other stores.
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT

2015-04-21 Thread Chao Peng
On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote:
 On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
  This is the xc/xl changes to support Intel Cache Allocation
  Technology(CAT). Two commands are introduced:
  - xl psr-cat-hwinfo
Show CAT hardware information.
 
  Examples:
  [root@vmm-psr vmm]# xl psr-cat-hwinfo
  Cache Allocation Technology (CAT):
  Socket ID   : 0
  L3 Cache: 12288KB
  Maximum COS : 15
  CBM length  : 12
  Default CBM : 0xfff
  
 Or, you can rename the psr-cmt-hwinfo command, added in the previous
 patch, to 'psr-hwinfo' and make it accept options, e.g.,
 
  -m, --cmt   show Cache Monitoring Technology (CMT) hardware info
  -c, --cat   show Cache Allocation Technology (CAT) hardware info
 
 By default (i.e., no options provided), it can just print all the hw
 info.
 
 Not a big deal, but I think that would make a better command line
 interface. Tools' maintainers' call, I guess.

Thanks for suggestion.
 
  --- a/tools/libxc/include/xenctrl.h
  +++ b/tools/libxc/include/xenctrl.h
 
  +
  +int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
  +   xc_psr_cat_type type, uint32_t target,
  +   uint64_t data);
  +int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
  +   xc_psr_cat_type type, uint32_t target,
  +   uint64_t *data);
 
 So, for this twos, 'target' is the socket you want to act on.
 
  +int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
  +   uint32_t *cos_max, uint32_t *cbm_len);
 
 While here you use 'socket', to mean the same thing.
 
 That looks rather inconsistent. Since it's a socket we are talking
 about, why not 'socket' everywhere?


The idea behind here is: All the places that appear as 'target' imply 
there are possible values other than just socket (e.g. considering L2
Cache Allocation in the future). So 'target' is always paired with a
'psr_cat_type' parameter. For routines that only work for L3 (e.g.
xc_psr_cat_get_l3_info) then 'socket' is used. I admit that it looks
inconsistent, perhaps rename all 'socket' to 'target'?

 
  --- a/tools/libxl/libxl.h
  +++ b/tools/libxl/libxl.h
  
  +#ifdef LIBXL_HAVE_PSR_CAT
  +
  +#define LIBXL_PSR_TARGET_ALL (~0U)
  +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
  +  libxl_psr_cbm_type type, uint32_t target,
  +  uint64_t cbm);
  +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
  +  libxl_psr_cbm_type type, uint32_t target,
  +  uint64_t *cbm_r);
  +
 The same applies here: I'd rename taget to socket.
 
 Then there's that LIBXL_PSR_TARGET_ALL. So, target (or socket, if you'll
 rename it) is an integer representing _which_one_ socket to act on.
 However, there is a special value to mean all sockets.
 
 Another possibility would be to offer an API that natively allows for
 operating on multiple sockets, by using libxl_bitmap-s, as it happens in
 many other places, in libxl itself (e.g., setting and getting vcpu
 affinity).
 
 That means target would become a libxl_bitmap, and, in the
 implementation, you'll apply the operation on all the sockets
 corresponding to a set bit in there. Only one bit set means just that
 socket, all bits means all sockets.
 
 This looks like a better interface to me (no need for special ~0U
 values), it'd make the implementation more linear, and is more
 consistent with how other similar situations are handled in libxl.
 However, I appreciate that one may find it overkill... I guess it
 depends whether we expect the prevalent usage pattern to be almost
 always about single sockets --and maybe on all sockets, from time to
 time-- or if we see value in being able to specify more than one and
 less than all sockets.
 
 For instance, now that we have vNUMA, if a domain has 4 vNUMA nodes,
 each one mapped to a physical NUMA node, it looks to me like it would
 make sense to set CAT to, say, 0x0F, on the sockets corresponding to the
 physical NODEs. With the interface in this patch, that would require
 calling libxl_psr_cat_set_cbm() 4 times, with the libxl_bitmap approach,
 just once, after setting up the bitmap properly.
 
 Thoughts?

I do like this suggestion and I have ever considered it actually. The
only thing prevents me is that we need an extra _get_socket_count in xl
for TARGET_ALL case. So libxl__count_physical_sockets is needed to be
public. If Ian/Wei have no concerns for this, then I'm glad to do this.

 
  --- a/tools/libxl/libxl_psr.c
  +++ b/tools/libxl/libxl_psr.c
 
  +int libxl_psr_cat_get_l3_info(libxl_ctx *ctx, libxl_psr_cat_info **info,
  +  uint32_t *nr)
  +{
  +GC_INIT(ctx);
  +int rc, r;
  +uint32_t i, nr_sockets;
  +libxl_psr_cat_info *ptr;
  +
  +rc = libxl__count_physical_sockets(gc, nr_sockets);
  

Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree

2015-04-21 Thread George Dunlap
On 04/21/2015 10:25 AM, Ian Campbell wrote:
 On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote:
 I think we need to disable the build on architectures other than x86,
 see grub for example
 
 Eventually we might want to build our own grub on ARM in order to pick
 up Fu Wei's multiboot for arm64 patches, until they enter distros?
 
 Or maybe Raisin on UEFI should be calling efibootmgr to register Xen
 directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it
 currently works even on x86.
 
 Do we?  There's no reason a blktap2 kernel module couldn't be built on
 ARM, is there?

 Maybe not, but I am pretty sure that it doesn't work at the moment. I
 don't think that the userspace stuff even compiles on ARM.
 Eventually we might have blktap on ARM, but I don't want to enable
 stuff in Raisin that we know it does not work.
 
 Especially if it is already to a greater or lesser extent deprecated (in
 favour of eventual blktap3) even on x86.

So from my discussions w/ the XenServer guys, it seems that:

1. The master branch of the blktap.git repo contains support for
*both* blktap3 and blktap2.5 (with a kernel module)

2. XenServer uses blktap3 for guest access, but still use the blktap2.5
w/ kernel module for dom0 access to guest disks, to avoid the
possibility of hitting a scalability limit due to grant references.

So from raisin's perspective, the only difference between blktap2.5 and
blktap3 is using the master branch rather than the blktap2 branch of
the repo.

Whether we maintain support for blktap2.5 in libxl is a matter for the
Xen maintainers; but if xapi is ever going to start using libxl, it will
certainly need to be able to do so.

(Dave / David, please correct me if I'm wrong.)

That said, there's no harm in disabling it on ARM to begin with, and
enabling it once blktap3 works.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 2/7] Changes to support '/boot' leading paths of kernel, xen, in grub

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote:
 Support situations of grub that have vmlinuz and other things starting
 with path of '/boot' rather than '/'.
 
 Signed-off-by: longtao.pang longtaox.p...@intel.com

Acked-by: Ian Campbell ian.campb...@citrix.com

 ---
  Osstest/Debian.pm |8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
 index 378af54..1a57cdd 100644
 --- a/Osstest/Debian.pm
 +++ b/Osstest/Debian.pm
 @@ -449,21 +449,21 @@ sub setupboot_grub2 () {
  if (m/^submenu\s+[\'\](.*)[\'\].*\{\s*$/) {
  $submenu={ StartLine =$.};
  }
 -if (m/^\s*multiboot\s*\/(xen\-[0-9][-+.0-9a-z]*\S+)/) {
 +if (m/^\s*multiboot\s*(?:\/boot)?\/(xen\S+)/) {
  die unless $entry;
  $entry-{Hv}= $1;
  }
 -if (m/^\s*multiboot\s*\/(vmlinu[xz]-(\S+))/) {
 +if (m/^\s*multiboot\s*(?:\/boot)?\/(vmlinu[xz]-(\S+))/) {
  die unless $entry;
  $entry-{KernOnly}= $1;
  $entry-{KernVer}= $2;
  }
 -if (m/^\s*module\s*\/(vmlinu[xz]-(\S+))/) {
 +if (m/^\s*module\s*(?:\/boot)?\/(vmlinu[xz]-(\S+))/) {
  die unless $entry;
  $entry-{KernDom0}= $1;
  $entry-{KernVer}= $2;
  }
 -if (m/^\s*module\s*\/(initrd\S+)/) {
 +if (m/^\s*module\s*(?:\/boot)?\/(initrd\S+)/) {
  $entry-{Initrd}= $1;
  }
   if (m/^\s*module\s*\/(xenpolicy\S+)/) {



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 1/7] parsing grub which has 'submenu' primitive

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote:
 From a hvm kernel build from Linux stable Kernel tree,
 the auto generated grub2 menu will have 'submenu' primitive, upon the
 'menuentry' items. Xen boot entries will be grouped into a submenu. This
 patch adds capability to support such grub formats.
 
 Signed-off-by: longtao.pang longtaox.p...@intel.com

I think this won't work with nested submenus, but I assume we don't see
them at least with Jessie so we can live with this.

I'm also not sure it would cope with a submenu declared within a
function, but again I expect we aren't seeing those in practice.

So given this handles the grub.cfg files we see in practice:
Acked-by: Ian Campbell ian.campb...@citrix.com

I'd also like to see some of our delta in
overlay/etc/grub.d/20_linux_xen which was added to remove the
possibility of submenus to be removed, see below.

Wei, per the comment below I think we should update the grub BR with a
fixed patch, like the attached.

Ian.

-


From e082a82a3036152797447ac4c809e3b67e68cd12 Mon Sep 17 00:00:00 2001
From: Ian Campbell ian.campb...@citrix.com
Date: Tue, 21 Apr 2015 11:06:06 +0100
Subject: [PATCH] grub: remove patch to disable submenu from 20_linux_xen
 overlay

setupboot_grub2 now supports submenus, so we can reduce our delta vs
upstream a bit.

I started by extracting 20_linux_xen from
http://snapshot.debian.org/archive/debian/20130703T094657Z/pool/main/g/grub2/grub-common_1.99-27%2Bdeb7u2_amd64.deb
and then applying the patch at
http://savannah.gnu.org/file/grub.patch?file_id=32276 (the patch from
grub bug #42420 at http://savannah.gnu.org/bugs/?43420) and
reinstating the comment at the top of the file (modified to drop the
reference to the Debian bug.

This left me with some spurious changes:

@@ -93,7 +93,7 @@ linux_entry ()
   if test ! -e ${xen_dirname}/${xenpolicy} ; then
  return
   fi
-  xen_args=`echo $xen_args flask=enforcing`
+  xen_args=`echo $xen_args flask_enabled=1 flask_enforcing=1`
   if ${recovery} ; then
  title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux 
%s (recovery mode))
   else
@@ -137,7 +137,6 @@ EOF
echo'$message'
module  ${rel_dirname}/${xenpolicy}
 EOF
-  fi
   cat  EOF
 }
 EOF

I think these are bugs in the patch in the grub BTS, which were fixed
while iterating over the XSM series in osstest but didn't make it into
the upstream version, the fixes to those bugs are reverted byu the
above. So I have manually reverted them.

Signed-off-by: Ian Campbell ian.campb...@citrix.com
---
Wei, if you agree wrt those changes I'll update the bug, or perhaps
you want to?
---
 overlay/etc/grub.d/20_linux_xen | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/overlay/etc/grub.d/20_linux_xen b/overlay/etc/grub.d/20_linux_xen
index 5315e2a..aaead1b 100755
--- a/overlay/etc/grub.d/20_linux_xen
+++ b/overlay/etc/grub.d/20_linux_xen
@@ -1,7 +1,7 @@
 #! /bin/sh
 
 # Copied from the identically named file in grub-common 1.99-27+deb7u2.
-# This version fixed Debian bug #690538 and GRUB bug #43420.
+# This version fixes GRUB bug #43420.
 
 set -e
 
@@ -173,6 +173,7 @@ while [ x${xen_list} != x ] ; do
 xen_dirname=`dirname ${current_xen}`
 rel_xen_dirname=`make_system_path_relative_to_its_root $xen_dirname`
 xen_version=`echo $xen_basename | sed -e s,.gz$,,g;s,^xen-,,g`
+echo submenu \Xen ${xen_version}\ {
 while [ x$list != x ] ; do
linux=`version_find_latest $list`
echo Found linux image: $linux 2
@@ -214,5 +215,6 @@ while [ x${xen_list} != x ] ; do
 
list=`echo $list | tr ' ' '\n' | grep -vx $linux | tr '\n' ' '`
 done
+echo }
 xen_list=`echo $xen_list | tr ' ' '\n' | grep -vx $current_xen | tr '\n' ' 
'`
 done
-- 
2.1.4


--- /home/ianc/tmp/x/etc/grub.d/20_linux_xen	2013-07-03 04:39:20.0 +0100
+++ overlay/etc/grub.d/20_linux_xen	2015-04-21 11:09:57.777812773 +0100
@@ -81,10 +85,27 @@
   recovery=$4
   args=$5
   xen_args=$6
-  if ${recovery} ; then
-title=$(gettext_quoted %s, with Xen %s and Linux %s (recovery mode))
+  xsm=$7
+  # If user wants to enable XSM support, make sure there's
+  # corresponding policy file.
+  if ${xsm} ; then
+  xenpolicy=`echo xenpolicy-$xen_version`
+  if test ! -e ${xen_dirname}/${xenpolicy} ; then
+	  return
+  fi
+  xen_args=`echo $xen_args flask=enforcing`
+  if ${recovery} ; then
+	  title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux %s (recovery mode))
+  else
+	  title=$(gettext_quoted %s, with Xen %s (XSM enabled) and Linux %s)
+  fi
   else
-title=$(gettext_quoted %s, with Xen %s and Linux %s)
+  xenpolicy=
+  if ${recovery} ; then
+	  title=$(gettext_quoted %s, with Xen %s and Linux %s (recovery mode))
+  else
+	  title=$(gettext_quoted %s, with Xen %s and Linux %s)
+  fi
   fi
   printf menuentry '${title}' ${CLASS} 

Re: [Xen-devel] Xen-unstable-staging: Xen BUG at iommu_map.c:455

2015-04-21 Thread Jan Beulich
 On 20.04.15 at 20:50, li...@eikelenboom.it wrote:
 Monday, April 20, 2015, 6:11:42 PM, you wrote:
 On 16.04.15 at 11:28, t...@xen.org wrote:
 At 22:35 +0100 on 11 Apr (1428791713), Andrew Cooper wrote:
 At least we now understand why it happens.  I will defer to others CC'd
 on this thread for their opinions in the matter.
 
 The patch semes like a pretty good check to me, though I'm not
 convinced it's race-free.  At the least I'd cache the m2p lookup so we
 use the same value for the checks and the map_page() call. 
 
 And IMO update_paging_mode() ought to log and reject bogus GFNs as
 well.
 
 could you give the patch below a try, namely also in the context
 of seeing again the issue originally fixed by Andrew's initial patch?
 Please make sure you try a debug build and you have
 iommu=debug on the Xen command line.
 
 I'm running with it now, have seen no issues so far !

Interesting - didn't you say that as a side effect of Andrew's patch
you saw massive log spam?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 13/19] xen: arm: Annotate registers trapped by MDCR_EL2.TDRA

2015-04-21 Thread Julien Grall

Hi Ian,

On 17/04/2015 16:51, Ian Campbell wrote:

Furthermore, this is the only registers not handled on AArch32 for this
bit. This is rather strange to list them while you didn't do it for the
trace registers.


My intention was that every register trapped by a bit which we set be
listed somewhere, to make it easier to cross reference with the docs and
check we haven't accidentally forgotten something (as opposed to
deliberately ignoring as indicated by these comments).

You seem to be saying I've missed some trace registers, which ones?


I meant that you didn't list the trace registers trapped but unhandled. 
Although I wasn't able to find a list, is it trace module specific? If 
so maybe a comment would be good?


Regards,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 03/13] x86: detect and initialize Intel CAT feature

2015-04-21 Thread Chao Peng
On Mon, Apr 20, 2015 at 06:13:12PM +0200, Dario Faggioli wrote:
 On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
 
  --- a/xen/arch/x86/psr.c
  +++ b/xen/arch/x86/psr.c
  @@ -19,14 +19,25 @@
   #include asm/psr.h
   
   #define PSR_CMT(10)
  +#define PSR_CAT(11)
  +
  +struct psr_cat_socket_info {
  +bool_t initialized;
  +bool_t enabled;
  +unsigned int cbm_len;
  +unsigned int cos_max;
  +};
 
 Can't we ditch 'initialized' from within the struct and have a (global)
 bitmap, with one bit for each socket, expressing the same? And that also
 for 'enabled'.
 
 It's probably, at least up to a certain extent, a matter of taste (and I
 personally think it will look better), but it also should produce
 tighter code...

NP, thanks.
Chao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 11/13] tools/libxl: add command to show CMT hardware info

2015-04-21 Thread Chao Peng
On Tue, Apr 21, 2015 at 02:37:54AM +0200, Dario Faggioli wrote:
 On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
  Add dedicated one to show hardware information.
  
  [root@vmm-psr]xl psr-cmt-hwinfo
  Cache Monitoring Technology (CMT):
  Enabled : 1
  Total RMID  : 63
  Supported monitor types:
  cache-occupancy
  total-mem-bandwidth
  local-mem-bandwidth
  
 Nice.
 
  Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
 
  --- a/tools/libxl/xl_cmdimpl.c
  +++ b/tools/libxl/xl_cmdimpl.c
  @@ -8014,6 +8014,36 @@ out:
   }
   
   #ifdef LIBXL_HAVE_PSR_CMT
  +static int psr_cmt_hwinfo(void)
  +{
  +int rc;
  +int enabled;
  +uint32_t total_rmid;
  +
 
 I think you should still have something like:
 
 SWITCH_FOREACH_OPT(opt, , NULL, psr-cmt-hwinfo, 0) {
 /* No options */
 }
 
 Or `xl psr-cmt-hwinfo -h' wouldn't work, would it?
 
 With that done:
 
  Reviewed-by: Dario Faggioli dario.faggi...@citrix.com
 
Agreed, I will add it.

Chao

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 8/8] raisin: RFC Add blktap2 as an external tree

2015-04-21 Thread Stefano Stabellini
On Tue, 21 Apr 2015, George Dunlap wrote:
 On 04/21/2015 10:25 AM, Ian Campbell wrote:
  On Mon, 2015-04-20 at 18:05 +0100, Stefano Stabellini wrote:
  I think we need to disable the build on architectures other than x86,
  see grub for example
  
  Eventually we might want to build our own grub on ARM in order to pick
  up Fu Wei's multiboot for arm64 patches, until they enter distros?
  
  Or maybe Raisin on UEFI should be calling efibootmgr to register Xen
  directly with the BIOS, and creating a xen.cfg in /boot, i.e. the way it
  currently works even on x86.
  
  Do we?  There's no reason a blktap2 kernel module couldn't be built on
  ARM, is there?
 
  Maybe not, but I am pretty sure that it doesn't work at the moment. I
  don't think that the userspace stuff even compiles on ARM.
  Eventually we might have blktap on ARM, but I don't want to enable
  stuff in Raisin that we know it does not work.
  
  Especially if it is already to a greater or lesser extent deprecated (in
  favour of eventual blktap3) even on x86.
 
 So from my discussions w/ the XenServer guys, it seems that:
 
 1. The master branch of the blktap.git repo contains support for
 *both* blktap3 and blktap2.5 (with a kernel module)
 
 2. XenServer uses blktap3 for guest access, but still use the blktap2.5
 w/ kernel module for dom0 access to guest disks, to avoid the
 possibility of hitting a scalability limit due to grant references.
 
 So from raisin's perspective, the only difference between blktap2.5 and
 blktap3 is using the master branch rather than the blktap2 branch of
 the repo.

That is not entirely true: compiling and installing a kernel module is
quite different from userspace stuff, at least in terms of dependencies
and installation paths.


 Whether we maintain support for blktap2.5 in libxl is a matter for the
 Xen maintainers; but if xapi is ever going to start using libxl, it will
 certainly need to be able to do so.
 
 (Dave / David, please correct me if I'm wrong.)
 
 That said, there's no harm in disabling it on ARM to begin with, and
 enabling it once blktap3 works.

Yes, I would the code in Raisin to actually work :-)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-13 at 17:19 -0400, longtao.pang wrote:
 1. Designate vif model by make-flight.
 2. In L2 installation context, its host (L1) IP address is not queried
 from DNS, but after running ts-nested-setup + host + nested, L1 IP is
 stored in runvar.
 
 Signed-off-by: longtao.pang longtaox.p...@intel.com
 ---
 Changes in v8:
 Remove the unnecessary symbol of ';' in 'TestSupport.pm'.
 ---
  Osstest/TestSupport.pm |8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)
 
 diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
 index 1bde67d..4cdacfc 100644
 --- a/Osstest/TestSupport.pm
 +++ b/Osstest/TestSupport.pm
 @@ -818,6 +818,10 @@ sub selecthost ($) {
   logm(Host $name is in HostGroup $ho-{Properties}{HostGroup});
  }
  
 +if ( $r{${name}_ip} ) {
 +$setprop-(IpAddr, $r{${name}_ip});
 +}

This is one for Ian I think, but I suspect this should use ${ident}
rather than ${name}. 

${ident} is e.g. 'host' or 'srchost' or 'nestedl1' it is the prefer used
on the runvar names. ${name} is the specific value assigned, i.e. an
actual host name.

For such properties we usually prefer ${ident}_foo. Ian, correct me if
I'm wrong please.

 +
  # Next, we use the config file's general properites as defaults
  foreach my $k (keys %c) {
   next unless $k =~ m/^HostProp_([A-Z].*)$/;
 @@ -1548,11 +1552,13 @@ sub prepareguest_part_xencfg ($) {
  my $oncrash= $xopts-{OnCrash} || 'preserve';
  my $vcpus= guest_var($gho, 'vcpus', $xopts-{DefVcpus} || 2);
  my $xoptcfg= $xopts-{ExtraConfig};
 +my $vif= guest_var($gho, 'vifmodel','');
 +my $vifmodel= $vif ? model=$vif : '';
  $xoptcfg='' unless defined $xoptcfg;
  my $xencfg= END;
  name= '$gho-{Name}'
  memory = ${ram_mb}
 -vif = [ 'type=ioemu,mac=$gho-{Ether}' ]
 +vif = [ 'type=ioemu,${vifmodel},mac=$gho-{Ether}' ]

If $vifmodel=='' then this will end up with a literal ,, in the
configuration. I've not checked if this is allowed but for forms sake I
would suggest to add the necessary , to $vifmodel when it is != ''. e.g.

+my $vifmodel= $vif ? ,model=$vif : '';
[...]
+vif = [ 'type=ioemu,mac=$gho-{Ether}${vifmodel}' ]

Ian


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Tamas K Lengyel
On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote:

  On 21.04.15 at 16:24, ian.campb...@citrix.com wrote:
  On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
   On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
   On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
   On 20/04/15 16:06, Tamas K Lengyel wrote:
The current implementation of three memops,
 XENMEM_current_reservation,
XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values
 as an
int. However, in ARM64 we could potentially have 36-bit pfn's, thus
in preparation for the ARM patch, in this patch we update the
 existing
memop routines to use a struct, xen_get_gpfn, to exchange the gpfn
 info
as a uin64_t.
   
This patch also adds error checking on the toolside in case the
 memop
fails.
   
Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
  
   XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
  
   You cannot make adjustments like this, but you can add a brand new op
   with appropriate parameters and list the old ops as deprecated.
  
   Right. For the benefit of callers using the old API it seems what we
   usually do is rename the old op XENMEM_foo_compat and use the name
 with
   a new number for the new functionality, then use a
   __XEN_INTERFACE_VERSION__ to #define back to the old name.
  
   The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
   reasonable example, I couldn't find one specifically for the memory
 ops.
 
  And there's no need to afaict: This complication isn't needed in the
  first place. The patch's context already makes this clear:
 
  --- a/xen/common/memory.c
  +++ b/xen/common/memory.c
  @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd,
  XEN_GUEST_HANDLE_PARAM(void) arg)
 
  Note the long return type. Yet the patch description, for
  whatever reason, claims the hypercall to only return an int.
  Maybe because (as pointed out before) the respective Linux
  hypercall stub wrongly an int return type?
 
  Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64
  bit host (maximum pfn more than 2^32)?

 It is, but do we really want to introduce other than just compat
 mode helper interfaces (i.e. leaving the current ones alone, and
 perhaps even making the new ones tools only) if we really care
 about such setups in the first place?

 Jan


At the moment I just followed Andrew's advice and will introduce a new
XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as xen_pfn_t.
The old memops I'll leave untouched if that's OK.

Tamas
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/5] raisin: introduce ovmf

2015-04-21 Thread George Dunlap
On 04/21/2015 03:55 PM, Stefano Stabellini wrote:
 diff --git a/components/series b/components/series
 index f0f3cfa..fe9092a 100644
 --- a/components/series
 +++ b/components/series
 @@ -1,4 +1,5 @@
  seabios
 +ovmf
  xen
  qemu
  qemu_traditional
 diff --git a/components/xen b/components/xen
 index b3426f0..b3a0c96 100644
 --- a/components/xen
 +++ b/components/xen
 @@ -29,7 +29,8 @@ function xen_build() {
  cd xen-dir
  ./configure --prefix=$PREFIX 
 --with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \
  --disable-qemu-traditional --enable-rombios \
 ---with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin
 +--with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin \
 +--with-system-ovmf=$BASEDIR/ovmf-dir/ovmf.bin

Does this still work if you don't build ovmf?

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 16:00 +0100, Andrew Cooper wrote:
 On 21/04/15 15:12, Ian Campbell wrote:
  On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote:
  Nothing expensive depends on these results.
  But are those uses a problem?
 
 Less fluff in the build logs, and less load moving files around the
 filesystem.

OK, then: Acked-by: Ian Campbell ian.campb...@citrix.com

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] tg3 NIC driver bug in 3.14.x under Xen [and 3 more messages]

2015-04-21 Thread Ian Jackson
Prashant Sreedharan writes (Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 
more messages]):
 On Fri, 2015-04-17 at 15:12 -0400, David Miller wrote:
  From: Konrad Rzeszutek Wilk konrad.w...@oracle.com
  Date: Fri, 17 Apr 2015 15:04:48 -0400
   A huge amount of NIC drivers use the DMA API, however if
   compiled under 32-bit an very important part of the DMA API can
   be ommitted leading to the drivers not working at all
   (especially if used with 'swiotlb=force iommu=soft').
...
   As such enable this even when using 32-bit kernels.
   
   Reported-by: Ian Jackson ian.jack...@eu.citrix.com
   Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com

Thanks.

Tested-by: Ian Jackson ian.jack...@eu.citrix.com

I'd appreciate it if this patch could make its way into the 3.14.y
stable branch, as well as all the other places it's applicable of
course.  I've included another copy below for convenience, with
acks etc. from this email thread folded in.

I have tested 3.14.34 plus just this patch, with my usual kernel
configuration input and the affected machine, and this fixes the
problem completely AFAICT.

I have tested both baremetal 32-bit with `iommu=soft swiotlb=force'
(which previously corrupted all received network packets) and 32-bit
Linux under 64-bit Xen without any special options (which previously
gave 25-30% packet loss).

Thanks,
Ian.

From 9e417af099e3cee2b219ab28ffc1e96b0564b213 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Date: Fri, 17 Apr 2015 14:55:47 -0400
Subject: [PATCH] config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected

A huge amount of NIC drivers use the DMA API, however if compiled
under 32-bit an very important part of the DMA API can be ommitted leading
to the drivers not working at all (especially if used with
'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of the dma
mapping and dma_unmap_addr() to get the mapping value. On most of
the platforms this is a no-op, but ... with iommu=soft and
swiotlb=force this house keeping is required, ... otherwise
we pass 0 while calling pci_unmap_/pci_dma_sync_ instead of the
DMA address.

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson ian.jack...@eu.citrix.com
Signed-off-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com
Acked-by: David S. Miller da...@davemloft.net
Acked-by: Prashant Sreedharan prash...@broadcom.com
Tested-by: Ian Jackson ian.jack...@eu.citrix.com
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b7d31ca..570c71d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -177,7 +177,7 @@ config SBUS
 
 config NEED_DMA_MAP_STATE
def_bool y
-   depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG
+   depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB
 
 config NEED_SG_DMA_LENGTH
def_bool y
-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT

2015-04-21 Thread Dario Faggioli
On Tue, 2015-04-21 at 17:49 +0800, Chao Peng wrote:
 On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote:
  On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
   This is the xc/xl changes to support Intel Cache Allocation
   Technology(CAT). Two commands are introduced:
   - xl psr-cat-hwinfo
 Show CAT hardware information.
  
   Examples:
   [root@vmm-psr vmm]# xl psr-cat-hwinfo
   Cache Allocation Technology (CAT):
   Socket ID   : 0
   L3 Cache: 12288KB
   Maximum COS : 15
   CBM length  : 12
   Default CBM : 0xfff
   
  Or, you can rename the psr-cmt-hwinfo command, added in the previous
  patch, to 'psr-hwinfo' and make it accept options, e.g.,
  
   -m, --cmt   show Cache Monitoring Technology (CMT) hardware info
   -c, --cat   show Cache Allocation Technology (CAT) hardware info
  
  By default (i.e., no options provided), it can just print all the hw
  info.
  
  Not a big deal, but I think that would make a better command line
  interface. Tools' maintainers' call, I guess.
 
 Thanks for suggestion.
  
   --- a/tools/libxc/include/xenctrl.h
   +++ b/tools/libxc/include/xenctrl.h
  
   +
   +int xc_psr_cat_set_domain_data(xc_interface *xch, uint32_t domid,
   +   xc_psr_cat_type type, uint32_t target,
   +   uint64_t data);
   +int xc_psr_cat_get_domain_data(xc_interface *xch, uint32_t domid,
   +   xc_psr_cat_type type, uint32_t target,
   +   uint64_t *data);
  
  So, for this twos, 'target' is the socket you want to act on.
  
   +int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
   +   uint32_t *cos_max, uint32_t *cbm_len);
  
  While here you use 'socket', to mean the same thing.
  
  That looks rather inconsistent. Since it's a socket we are talking
  about, why not 'socket' everywhere?
 
 
 The idea behind here is: All the places that appear as 'target' imply 
 there are possible values other than just socket (e.g. considering L2
 Cache Allocation in the future). So 'target' is always paired with a
 'psr_cat_type' parameter. 

Mmm... I understand your concerns. So, sticking to the future L2 CAT
support example, what would 'target' mean in that case, a pCPU? I'll
have to be something that makes it possible to 'identify' an L2, as much
as socket identifies an L3... Is this the case?

I'm not sure. My own concerns are that, if I look at the prototypes of
this functions, it's not that evident what values should I use for the
type and target parameters, whether there are constraints/relationships
among them, etc. That applies to both the xc_* functions above and the
libxl_* functions below, IMO.

Libxc is not a stable interface, so we could even just forget about
this, design this very interface _only_ for what we have now and deal
with different CBM types when we'll be introducing them. However, libxl
*does* have a stable API, so we still need to solve the issue at that
level.

 For routines that only work for L3 (e.g.
 xc_psr_cat_get_l3_info) then 'socket' is used. I admit that it looks
 inconsistent, perhaps rename all 'socket' to 'target'?
 
No, IMO, that is one good inconsistency, as it allows, at least for
that function, to easily figure out what one should pass to the function
by means of that parameter! :-)

I really am not sure, and probably would have to know in what way(s)
'target' would change its meaning, depending on the value of 'type' (as
asked above)... Probably what I'd do is leave parameters names as they
are, but write a few doc-comments to explain how to use them, especially
at the libxl level (libxc is a lot less critical, from this respect, I
think).

Regards,
Dario


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST] ts-xen-*: Avoid use of uninitialised $r{enable_xsm}

2015-04-21 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST] ts-xen-*: Avoid use of uninitialised 
$r{enable_xsm}):
 Use of uninitialized value $r{enable_xsm} in pattern match (m//) at 
 ./ts-xen-install line 49.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Tamas K Lengyel
On Tue, Apr 21, 2015 at 5:13 PM, Jan Beulich jbeul...@suse.com wrote:

  On 21.04.15 at 16:42, tkleng...@sec.in.tum.de wrote:
  On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote:
 
   On 21.04.15 at 16:24, ian.campb...@citrix.com wrote:
   On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
On 20/04/15 16:06, Tamas K Lengyel wrote:
 The current implementation of three memops,
  XENMEM_current_reservation,
 XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values
  as an
 int. However, in ARM64 we could potentially have 36-bit pfn's,
 thus
 in preparation for the ARM patch, in this patch we update the
  existing
 memop routines to use a struct, xen_get_gpfn, to exchange the
 gpfn
  info
 as a uin64_t.

 This patch also adds error checking on the toolside in case the
  memop
 fails.

 Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
   
XENMEM, unlikely domctls/sysctls is a guest-visible stable
 ABI/API.
   
You cannot make adjustments like this, but you can add a brand
 new op
with appropriate parameters and list the old ops as deprecated.
   
Right. For the benefit of callers using the old API it seems what
 we
usually do is rename the old op XENMEM_foo_compat and use the name
  with
a new number for the new functionality, then use a
__XEN_INTERFACE_VERSION__ to #define back to the old name.
   
The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
reasonable example, I couldn't find one specifically for the memory
  ops.
  
   And there's no need to afaict: This complication isn't needed in the
   first place. The patch's context already makes this clear:
  
   --- a/xen/common/memory.c
   +++ b/xen/common/memory.c
   @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd,
   XEN_GUEST_HANDLE_PARAM(void) arg)
  
   Note the long return type. Yet the patch description, for
   whatever reason, claims the hypercall to only return an int.
   Maybe because (as pointed out before) the respective Linux
   hypercall stub wrongly an int return type?
  
   Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a
 64
   bit host (maximum pfn more than 2^32)?
 
  It is, but do we really want to introduce other than just compat
  mode helper interfaces (i.e. leaving the current ones alone, and
  perhaps even making the new ones tools only) if we really care
  about such setups in the first place?
 
  At the moment I just followed Andrew's advice and will introduce a new
  XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as
 xen_pfn_t.
  The old memops I'll leave untouched if that's OK.

 For this specific one - is there a reasonable use case? Other than
 for host PFN, we have control over guest ones, and I'm not sure
 managing a guest with GPFNs extending past 4 billion can be
 expected to work if only this one hypercall got fixed. IOW I'm
 expecting to NAK any such addition without proper rationale.

 Jan


AFAIK everything works for me as it is already without this patch. I'm not
sure if the cornercase of pfn's  32-bit wide on ARM64 is actually
something that would work/is supported at the moment.

Tamas
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST] ts-host-install: Arrange for ssh logins to have no corefile size limit

2015-04-21 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST] ts-host-install: Arrange for ssh logins 
to have no corefile size limit):
 Collect the output of cat /proc/self/limits so we get some clue if
 this isn't working for some reason.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST] ts-libvirt-build: set coredump ulimit to unlimited for libvirtd

2015-04-21 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST] ts-libvirt-build: set coredump ulimit to 
unlimited for libvirtd):
 Signed-off-by: Ian Campbell ian.campb...@citrix.com

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 15:29 +0100, Jan Beulich wrote:
  On 21.04.15 at 16:24, ian.campb...@citrix.com wrote:
  On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
   On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
   On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
   On 20/04/15 16:06, Tamas K Lengyel wrote:
The current implementation of three memops, 
XENMEM_current_reservation,
XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
int. However, in ARM64 we could potentially have 36-bit pfn's, thus
in preparation for the ARM patch, in this patch we update the existing
memop routines to use a struct, xen_get_gpfn, to exchange the gpfn 
info
as a uin64_t.
   
This patch also adds error checking on the toolside in case the memop
fails.
   
Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
   
   XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
   
   You cannot make adjustments like this, but you can add a brand new op
   with appropriate parameters and list the old ops as deprecated.
   
   Right. For the benefit of callers using the old API it seems what we
   usually do is rename the old op XENMEM_foo_compat and use the name with
   a new number for the new functionality, then use a
   __XEN_INTERFACE_VERSION__ to #define back to the old name.
   
   The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
   reasonable example, I couldn't find one specifically for the memory ops.
  
  And there's no need to afaict: This complication isn't needed in the
  first place. The patch's context already makes this clear:
  
  --- a/xen/common/memory.c
  +++ b/xen/common/memory.c
  @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, 
  XEN_GUEST_HANDLE_PARAM(void) arg)
  
  Note the long return type. Yet the patch description, for
  whatever reason, claims the hypercall to only return an int.
  Maybe because (as pointed out before) the respective Linux
  hypercall stub wrongly an int return type?
  
  Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64
  bit host (maximum pfn more than 2^32)?
 
 It is, but do we really want to introduce other than just compat
 mode helper interfaces (i.e. leaving the current ones alone, and
 perhaps even making the new ones tools only) if we really care
 about such setups in the first place?

IIRC the original impetus for at least one part of this interface was
for in guest use. I don't recall what the use was, I think it was the
max pfn one which was used though.

Perhaps in that case we can assume that a signed long is sufficient for
any pfn they might see. But is that true? A 32-bit guest could see
PFN2^31 I think.

Perhaps we should add these fixed version as a tools interface and
consider only doing a fixed guest visible version of the max pfn one?
Modulo confirming that I'm not misremembering...

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 4/5] raisin: introduce ovmf

2015-04-21 Thread Stefano Stabellini
Add a component to build ovmf and pass the output binary to the xen
build.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 components/ovmf   |   60 +
 components/series |1 +
 components/xen|3 ++-
 defconfig |6 --
 4 files changed, 67 insertions(+), 3 deletions(-)
 create mode 100644 components/ovmf

diff --git a/components/ovmf b/components/ovmf
new file mode 100644
index 000..a59a771
--- /dev/null
+++ b/components/ovmf
@@ -0,0 +1,60 @@
+#!/usr/bin/env bash
+
+function ovmf_check_package() {
+local DEP_Debian_common=build-essential nasm uuid-dev python iasl
+local DEP_Debian_x86_32=$DEP_Debian_common
+local DEP_Debian_x86_64=$DEP_Debian_common
+local DEP_Debian_arm32=$DEP_Debian_common
+local DEP_Debian_arm64=$DEP_Debian_common
+
+local DEP_Fedora_common=make gcc gcc-c++ nasm libuuid-devel python 
acpica-tools
+local DEP_Fedora_x86_32=$DEP_Fedora_common
+local DEP_Fedora_x86_64=$DEP_Fedora_common
+
+
+if [[ $RAISIN_ARCH != x86_64 ]]
+then
+echo ovmf is only supported on x86_64
+return
+fi
+echo Checking OVMF dependencies
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
+}
+
+
+function ovmf_build() {
+if [[ $RAISIN_ARCH != x86_64 ]]
+then
+echo ovmf is only supported on x86_64
+return
+fi
+
+cd $BASEDIR
+git-checkout $OVMF_URL $OVMF_REVISION ovmf-dir
+cd ovmf-dir
+
+make -C BaseTools/Source/C
+OvmfPkg/build.sh -a X64 -b RELEASE -n 4
+cp Build/OvmfX64/RELEASE_GCC*/FV/OVMF.fd ovmf.bin
+
+cd $BASEDIR
+}
+
+function ovmf_clean() {
+cd $BASEDIR
+if [[ -d ovmf-dir ]]
+then
+cd ovmf-dir
+$GIT clean -fdx
+cd ..
+rm -rf ovmf-dir
+fi
+}
+
+function ovmf_configure() {
+:
+}
+
+function ovmf_unconfigure() {
+:
+}
diff --git a/components/series b/components/series
index f0f3cfa..fe9092a 100644
--- a/components/series
+++ b/components/series
@@ -1,4 +1,5 @@
 seabios
+ovmf
 xen
 qemu
 qemu_traditional
diff --git a/components/xen b/components/xen
index b3426f0..b3a0c96 100644
--- a/components/xen
+++ b/components/xen
@@ -29,7 +29,8 @@ function xen_build() {
 cd xen-dir
 ./configure --prefix=$PREFIX 
--with-system-qemu=$PREFIX/lib/xen/bin/qemu-system-i386 \
 --disable-qemu-traditional --enable-rombios \
---with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin
+--with-system-seabios=$BASEDIR/seabios-dir/out/bios.bin \
+--with-system-ovmf=$BASEDIR/ovmf-dir/ovmf.bin
 $RAISIN_MAKE
 $RAISIN_MAKE install DESTDIR=$INST_DIR
 cd $BASEDIR
diff --git a/defconfig b/defconfig
index d3ef283..7d2a3f7 100644
--- a/defconfig
+++ b/defconfig
@@ -1,12 +1,12 @@
 # Config variables for raisin
 
 # Components
-## All components: seabios xen qemu qemu_traditional grub libvirt
+## All components: seabios ovmf xen qemu qemu_traditional grub libvirt
 ## Core xen functionality: xen
 ## Remove a component from the list below, if you want to disable it
 ## You can manually overwrite this list using the COMPONENTS
 ## environmental variable.
-ENABLED_COMPONENTS=seabios xen qemu qemu_traditional grub libvirt
+ENABLED_COMPONENTS=seabios ovmf xen qemu qemu_traditional grub libvirt
 
 # Build config
 ## Make command to run
@@ -27,6 +27,7 @@ 
QEMU_TRADITIONAL_URL=git://xenbits.xen.org/qemu-xen-unstable.git
 SEABIOS_URL=git://xenbits.xen.org/seabios.git
 GRUB_URL=git://git.savannah.gnu.org/grub.git
 LIBVIRT_URL=git://libvirt.org/libvirt.git
+OVMF_URL=git://xenbits.xen.org/ovmf.git
 
 # Software versions.
 XEN_REVISION=master
@@ -35,3 +36,4 @@ QEMU_TRADITIONAL_REVISION=master
 SEABIOS_REVISION=master
 GRUB_REVISION=master
 LIBVIRT_REVISION=master
+OVMF_REVISION=master
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/7] docs/build: Move install checks into individual build targets

2015-04-21 Thread Andrew Cooper
On 21/04/15 15:23, Ian Campbell wrote:
 On Tue, 2015-04-21 at 15:18 +0100, Jan Beulich wrote:
 On 21.04.15 at 16:15, ian.campb...@citrix.com wrote:
 On Mon, 2015-04-20 at 13:01 +0100, Andrew Cooper wrote:
 On 20/04/15 12:57, Jan Beulich wrote:
 On 20.04.15 at 12:49, andrew.coop...@citrix.com wrote:
 @@ -103,12 +88,20 @@ install: install-man-pages install-html
  
  # Individual file build targets
  man1/%.1: man/%.pod.1 Makefile
 +ifdef POD2MAN
 Perhaps better to use ifneq($(POD2MAN),) in such cases?
 I was following the prevailing style, but can update all instances. 
 FWIW, it is not currently an issue.
 I don't really mind, ifneq seems more prevalent in other makefiles, I'm
 not sure why this one uses ifdef nor really what the implications are.
 The difference becomes visible when the variable is defined but
 expands to nothing. And I view it as more useful if as empty
 variable means none than if that results in presumably very
 strange build errors.
 True. Andy, will you fix these up then?

Yes - will fix in v2.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 3/5] raisin: rename ARCH to RAISIN_ARCH

2015-04-21 Thread Stefano Stabellini
Rename exported environmental variable ARCH to RAISIN_ARCH not to
conflict with any component Makefile variables.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 README  |2 +-
 components/grub |6 +++---
 components/libvirt  |2 +-
 components/qemu |2 +-
 components/qemu_traditional |2 +-
 components/seabios  |6 +++---
 components/xen  |2 +-
 lib/common-functions.sh |6 +++---
 scripts/mkdeb   |8 
 scripts/mkrpm   |2 +-
 10 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/README b/README
index ab8d3e7..b7832da 100644
--- a/README
+++ b/README
@@ -85,7 +85,7 @@ RAISIN_MAKE: make command to run
 SUDO: sudo command to run
 DISTRO: which Linux distribution we are running on, Debian, Fedora, etc
 PKGTYPE: which package format is used by the distro, rpm, deb, etc
-ARCH: which architecture we are running on, x86_64, arm32, etc.
+RAISIN_ARCH: which architecture we are running on, x86_64, arm32, etc.
 BASEDIR: absolute path of the raisin directory
 PREFIX: installation PREFIX
 INST_DIR: absolute path of the installation directory
diff --git a/components/grub b/components/grub
index 7c14a62..fa72b99 100644
--- a/components/grub
+++ b/components/grub
@@ -17,13 +17,13 @@ function grub_check_package() {
 local DEP_CentOS_x86_64=$DEP_Fedora_x86_64
 
 
-if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
+if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
 then
 echo grub is only supported on x86_32 and x86_64
 return
 fi
 echo Checking Grub dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 
@@ -41,7 +41,7 @@ function grub_build() {
   -m ../memdisk.tar -o grub-i386-xen grub-core/*mod
 cp grub-i386-xen $INST_DIR/$PREFIX/lib/xen/boot
 ## GRUB64
-if [[ $ARCH = x86_64 ]]
+if [[ $RAISIN_ARCH = x86_64 ]]
 then
 $RAISIN_MAKE clean
 ./configure --target=amd64 --with-platform=xen
diff --git a/components/libvirt b/components/libvirt
index 5fb1a41..a554643 100644
--- a/components/libvirt
+++ b/components/libvirt
@@ -23,7 +23,7 @@ function libvirt_check_package() {
 local DEP_CentOS_x86_64=$DEP_Fedora_x86_64
 
 echo Checking Libvirt dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 function libvirt_build() {
diff --git a/components/qemu b/components/qemu
index 00aeeb6..72cfec1 100644
--- a/components/qemu
+++ b/components/qemu
@@ -12,7 +12,7 @@ function qemu_check_package() {
 local DEP_Fedora_x86_64=$DEP_Fedora_common
 
 echo Checking QEMU dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 function qemu_build() {
diff --git a/components/qemu_traditional b/components/qemu_traditional
index 500cbed..b338007 100644
--- a/components/qemu_traditional
+++ b/components/qemu_traditional
@@ -13,7 +13,7 @@ function qemu_traditional_check_package() {
 local DEP_Fedora_x86_64=$DEP_Fedora_common
 
 echo Checking QEMU dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 function qemu_traditional_build() {
diff --git a/components/seabios b/components/seabios
index 960a538..ed2c7d2 100644
--- a/components/seabios
+++ b/components/seabios
@@ -12,18 +12,18 @@ function seabios_check_package() {
 local DEP_Fedora_x86_64=$DEP_Fedora_common
 
 
-if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
+if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
 then
 echo seabios is only supported on x86_32 and x86_64
 return
 fi
 echo Checking SeaBIOS dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 
 function seabios_build() {
-if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
+if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
 then
 echo seabios is only supported on x86_32 and x86_64
 return
diff --git a/components/xen b/components/xen
index b7258b0..b3426f0 100644
--- a/components/xen
+++ b/components/xen
@@ -20,7 +20,7 @@ function xen_check_package() {
 local DEP_CentOS_x86_64=$DEP_CentOS_x86_32 glibc-devel.i686
 
 echo Checking Xen dependencies
-eval check-package \$DEP_$DISTRO_$ARCH
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
 }
 
 function xen_build() {
diff --git a/lib/common-functions.sh b/lib/common-functions.sh
index c19046b..186a53b 100644
--- a/lib/common-functions.sh
+++ b/lib/common-functions.sh
@@ -41,7 +41,7 @@ function common_init() {
 get_components
 
 _verbose_echo Distro: $DISTRO
-_verbose_echo Arch: $ARCH
+_verbose_echo Arch: $RAISIN_ARCH
 _verbose_echo Components: $COMPONENTS
 
 for f in $COMPONENTS
@@ -161,7 +161,7 @@ function get_distro() {
 }
 
 

[Xen-devel] [PATCH 2/5] raisin: remove duplicate source config in raise

2015-04-21 Thread Stefano Stabellini
Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 raise |   15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/raise b/raise
index bce6908..68dbfd8 100755
--- a/raise
+++ b/raise
@@ -17,9 +17,12 @@ _help() {
 }
 
 # Include your defaults
-if [[ -e ./config ]] ; then
-. ./config
+if [[ ! -e ./config ]]
+then
+   echo No config file found, copying default config
+   cp defconfig config
 fi
+source ./config
 
 # To use this as a library, set RAISIN_PATH appropriately
 [[ -z $RAISIN_PATH ]]  RAISIN_PATH=$PWD/lib
@@ -29,14 +32,6 @@ source ${RAISIN_PATH}/common-functions.sh
 source ${RAISIN_PATH}/git-checkout.sh
 source ${RAISIN_PATH}/commands.sh
 
-# Include your defaults
-if [[ ! -e ./config ]] ; then
-   echo No config file found, copying default config
-   cp defconfig config
-fi
-
-source ./config
-
 # Set up basic functionality
 common_init
 
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 5/5] raisin: build linux

2015-04-21 Thread Stefano Stabellini
Add a component, disabled by default, to build a linux kernel with the
Xen kconfig options enabled.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 components/linux  |  120 +
 components/series |1 +
 defconfig |4 +-
 3 files changed, 124 insertions(+), 1 deletion(-)
 create mode 100644 components/linux

diff --git a/components/linux b/components/linux
new file mode 100644
index 000..f90a894
--- /dev/null
+++ b/components/linux
@@ -0,0 +1,120 @@
+#!/usr/bin/env bash
+
+function linux_check_package() {
+local DEP_Debian_common=build-essential bc openssl
+local DEP_Debian_x86_32=$DEP_Debian_common
+local DEP_Debian_x86_64=$DEP_Debian_common
+local DEP_Debian_arm32=$DEP_Debian_common
+local DEP_Debian_arm64=$DEP_Debian_common
+
+local DEP_Fedora_common=make gcc bc openssl
+local DEP_Fedora_x86_32=$DEP_Fedora_common
+local DEP_Fedora_x86_64=$DEP_Fedora_common
+
+local DEP_CentOS_common=$DEP_Fedora_common
+local DEP_CentOS_x86_32=$DEP_Fedora_x86_32
+local DEP_CentOS_x86_64=$DEP_Fedora_x86_64
+
+echo Checking Linux dependencies
+eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
+}
+
+function _xenify_config() {
+echo CONFIG_HYPERVISOR_GUEST=y  $1
+echo CONFIG_PARAVIRT=y  $1
+echo CONFIG_PARAVIRT_SPINLOCKS=y  $1
+echo CONFIG_XEN=y  $1
+echo CONFIG_XEN_DOM0=y  $1
+echo CONFIG_XEN_PVHVM=y  $1
+echo CONFIG_XEN_SAVE_RESTORE=y  $1
+echo CONFIG_XEN_DEBUG_FS=y  $1
+echo CONFIG_XEN_PVH=y  $1
+echo CONFIG_PARAVIRT_CLOCK=y  $1
+echo CONFIG_BALLOON_COMPACTION=y  $1
+echo CONFIG_XEN_PCIDEV_FRONTEND=y  $1
+echo CONFIG_XEN_BLKDEV_FRONTEND=y  $1
+echo CONFIG_XEN_BLKDEV_BACKEND=y  $1
+echo CONFIG_XEN_SCSI_FRONTEND=y  $1
+echo CONFIG_XEN_NETDEV_FRONTEND=y  $1
+echo CONFIG_XEN_NETDEV_BACKEND=y  $1
+echo CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y  $1
+echo CONFIG_XEN_FBDEV_FRONTEND=y  $1
+echo CONFIG_HVC_XEN=y  $1
+echo CONFIG_HVC_XEN_FRONTEND=y  $1
+echo CONFIG_XEN_BALLOON=y  $1
+echo CONFIG_XEN_SCRUB_PAGES=y  $1
+echo CONFIG_XEN_DEV_EVTCHN=y  $1
+echo CONFIG_XEN_BACKEND=y  $1
+echo CONFIG_XENFS=y  $1
+echo CONFIG_XEN_COMPAT_XENFS=y  $1
+echo CONFIG_XEN_SYS_HYPERVISOR=y  $1
+echo CONFIG_XEN_XENBUS_FRONTEND=y  $1
+echo CONFIG_XEN_GNTDEV=y  $1
+echo CONFIG_XEN_GRANT_DEV_ALLOC=y  $1
+echo CONFIG_SWIOTLB_XEN=y  $1
+echo CONFIG_XEN_PCIDEV_BACKEND=y  $1
+echo CONFIG_XEN_PRIVCMD=y  $1
+echo CONFIG_XEN_HAVE_PVMMU=y  $1
+echo CONFIG_XEN_ACPI_PROCESSOR=y  $1
+echo CONFIG_XEN_EFI=y  $1
+echo CONFIG_XEN_AUTO_XLATE=y  $1
+echo CONFIG_BRIDGE=y  $1
+}
+
+function linux_build() {
+local vmlinuz
+
+cd $BASEDIR
+git-checkout $LINUX_URL $LINUX_REVISION linux-dir
+cd linux-dir
+
+if [[ ! -e .config ]]
+then
+if [[ -e /boot/config-`uname -r` ]]
+then
+cp /boot/config-`uname -r` .config
+else
+$RAISIN_MAKE defconfig
+fi
+_xenify_config .config
+$RAISIN_MAKE olddefconfig
+fi
+
+$RAISIN_MAKE
+$RAISIN_MAKE modules_install INSTALL_MOD_PATH=$INST_DIR
+
+mkdir -p $INST_DIR/boot/xen
+vmlinuz=$INST_DIR/boot/xen/vmlinuz-$RAISIN_ARCH-$LINUX_REVISION-`date 
+%Y%m%d.%H%M%S`
+
+if [[ $RAISIN_ARCH = x86_64 || $RAISIN_ARCH = x86_32 ]]
+then
+cp arch/x86/boot/bzImage $vmlinuz
+elif [[ $RAISIN_ARCH = arm32 ]]
+then
+cp arch/arm/boot/zImage $vmlinuz
+elif [[ $RAISIN_ARCH = arm64 ]]
+then
+cp arch/x86/boot/Image.gz $vmlinuz
+fi
+
+cd ..
+}
+
+function linux_clean() {
+cd $BASEDIR
+if [[ -d linux-dir ]]
+then
+cd linux-dir
+$RAISIN_MAKE distclean
+cd ..
+rm -rf linux-dir
+fi
+}
+
+function linux_configure() {
+:
+}
+
+function linux_unconfigure() {
+:
+}
diff --git a/components/series b/components/series
index fe9092a..928e78e 100644
--- a/components/series
+++ b/components/series
@@ -5,3 +5,4 @@ qemu
 qemu_traditional
 grub
 libvirt
+linux
diff --git a/defconfig b/defconfig
index 7d2a3f7..b4ed94d 100644
--- a/defconfig
+++ b/defconfig
@@ -1,7 +1,7 @@
 # Config variables for raisin
 
 # Components
-## All components: seabios ovmf xen qemu qemu_traditional grub libvirt
+## All components: seabios ovmf xen qemu qemu_traditional grub libvirt linux
 ## Core xen functionality: xen
 ## Remove a component from the list below, if you want to disable it
 ## You can manually overwrite this list using the COMPONENTS
@@ -28,6 +28,7 @@ SEABIOS_URL=git://xenbits.xen.org/seabios.git
 GRUB_URL=git://git.savannah.gnu.org/grub.git
 LIBVIRT_URL=git://libvirt.org/libvirt.git
 OVMF_URL=git://xenbits.xen.org/ovmf.git
+LINUX_URL=git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
 
 # Software versions.
 XEN_REVISION=master
@@ -37,3 +38,4 @@ SEABIOS_REVISION=master
 

[Xen-devel] [PATCH 1/5] raisin: introduce _verbose_echo

2015-04-21 Thread Stefano Stabellini
A new utility function to make the code more readable and compact:
prints a message if VERBOSE = 1.

Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 lib/common-functions.sh |   46 --
 1 file changed, 16 insertions(+), 30 deletions(-)

diff --git a/lib/common-functions.sh b/lib/common-functions.sh
index dc3a2bb..c19046b 100644
--- a/lib/common-functions.sh
+++ b/lib/common-functions.sh
@@ -1,5 +1,12 @@
 #!/usr/bin/env bash
 
+function _verbose_echo() {
+if [[ $VERBOSE -eq 1 ]]
+then
+echo $*
+fi
+}
+
 # Executed once at the beginning of the script
 function common_init() {
 export BASEDIR=`pwd`
@@ -33,12 +40,9 @@ function common_init() {
 get_arch
 get_components
 
-if [[ $VERBOSE -eq 1 ]]
-then
-echo Distro: $DISTRO
-echo Arch: $ARCH
-echo Components: $COMPONENTS
-fi
+_verbose_echo Distro: $DISTRO
+_verbose_echo Arch: $ARCH
+_verbose_echo Components: $COMPONENTS
 
 for f in $COMPONENTS
 do
@@ -62,10 +66,7 @@ function get_components() {
 if eval [[ ! -z \$$capital_REVISION ]]
 then
 COMPONENTS=$COMPONENTS $component
-if [[ $VERBOSE -eq 1 ]]
-then
-echo Found component $component
-fi
+_verbose_echo Found component $component
 fi
 done
 fi
@@ -166,10 +167,7 @@ function get_arch() {
 }
 
 function _check-package-deb() {
-if [[ $VERBOSE -eq 1 ]]
-then
-echo Checking for package ${args[0]}
-fi
+_verbose_echo Checking for package ${args[0]}
 
 if dpkg -s $1 2/dev/null | grep -q Status:.*installed
 then
@@ -184,10 +182,7 @@ function _install-package-deb() {
 }
 
 function _check-package-rpm() {
-if [[ $VERBOSE -eq 1 ]]
-then
-echo Checking for package $1
-fi
+_verbose_echo Checking for package $1
 
 if rpm -q $1 21 /dev/null
 then
@@ -273,22 +268,13 @@ function for_each_component () {
 done
 if [[ $found -eq 0 ]]
 then
-if [[ $VERBOSE -eq 1 ]]
-then
-echo $component is disabled
-fi
+_verbose_echo $component is disabled
 continue
 fi
 
-if [[ $VERBOSE -eq 1 ]]
-then
-echo calling $component_$1
-fi
+_verbose_echo calling $component_$1
 $component_$1
-if [[ $VERBOSE -eq 1 ]]
-then
-echo $component_$1 done
-fi
+_verbose_echo $component_$1 done
 done
 }
 
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed

2015-04-21 Thread Andrew Cooper
On 21/04/15 15:12, Ian Campbell wrote:
 On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote:
 Nothing expensive depends on these results.
 But are those uses a problem?

Less fluff in the build logs, and less load moving files around the
filesystem.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/5] raisin: introduce _verbose_echo

2015-04-21 Thread George Dunlap
On 04/21/2015 03:54 PM, Stefano Stabellini wrote:
 A new utility function to make the code more readable and compact:
 prints a message if VERBOSE = 1.
 
 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 ---
  lib/common-functions.sh |   46 --
  1 file changed, 16 insertions(+), 30 deletions(-)
 
 diff --git a/lib/common-functions.sh b/lib/common-functions.sh
 index dc3a2bb..c19046b 100644
 --- a/lib/common-functions.sh
 +++ b/lib/common-functions.sh
 @@ -1,5 +1,12 @@
  #!/usr/bin/env bash
  
 +function _verbose_echo() {
 +if [[ $VERBOSE -eq 1 ]]
 +then
 +echo $*
 +fi
 +}

Why _?  Don't we have VERBOSE in other parts of the code that could call
this?

 -George

 +
  # Executed once at the beginning of the script
  function common_init() {
  export BASEDIR=`pwd`
 @@ -33,12 +40,9 @@ function common_init() {
  get_arch
  get_components
  
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo Distro: $DISTRO
 -echo Arch: $ARCH
 -echo Components: $COMPONENTS
 -fi
 +_verbose_echo Distro: $DISTRO
 +_verbose_echo Arch: $ARCH
 +_verbose_echo Components: $COMPONENTS
  
  for f in $COMPONENTS
  do
 @@ -62,10 +66,7 @@ function get_components() {
  if eval [[ ! -z \$$capital_REVISION ]]
  then
  COMPONENTS=$COMPONENTS $component
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo Found component $component
 -fi
 +_verbose_echo Found component $component
  fi
  done
  fi
 @@ -166,10 +167,7 @@ function get_arch() {
  }
  
  function _check-package-deb() {
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo Checking for package ${args[0]}
 -fi
 +_verbose_echo Checking for package ${args[0]}
  
  if dpkg -s $1 2/dev/null | grep -q Status:.*installed
  then
 @@ -184,10 +182,7 @@ function _install-package-deb() {
  }
  
  function _check-package-rpm() {
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo Checking for package $1
 -fi
 +_verbose_echo Checking for package $1
  
  if rpm -q $1 21 /dev/null
  then
 @@ -273,22 +268,13 @@ function for_each_component () {
  done
  if [[ $found -eq 0 ]]
  then
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo $component is disabled
 -fi
 +_verbose_echo $component is disabled
  continue
  fi
  
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo calling $component_$1
 -fi
 +_verbose_echo calling $component_$1
  $component_$1
 -if [[ $VERBOSE -eq 1 ]]
 -then
 -echo $component_$1 done
 -fi
 +_verbose_echo $component_$1 done
  done
  }
  
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 5/5] raisin: build linux

2015-04-21 Thread George Dunlap
On 04/21/2015 03:55 PM, Stefano Stabellini wrote:
 Add a component, disabled by default, to build a linux kernel with the
 Xen kconfig options enabled.
 
 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

Acked-by: George Dunlap george.dun...@eu.citrix.com

I'm wondering if at some point we might want to give people the option
to pass in a specific Linux config; but that can be a feature for
another patch.

 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 2/5] raisin: remove duplicate source config in raise

2015-04-21 Thread George Dunlap
On 04/21/2015 03:55 PM, Stefano Stabellini wrote:
 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

I take it this got here by a bad merge or somethign?

Acked-by: George Dunlap george.dun...@eu.citrix.com

 ---
  raise |   15 +--
  1 file changed, 5 insertions(+), 10 deletions(-)
 
 diff --git a/raise b/raise
 index bce6908..68dbfd8 100755
 --- a/raise
 +++ b/raise
 @@ -17,9 +17,12 @@ _help() {
  }
  
  # Include your defaults
 -if [[ -e ./config ]] ; then
 -. ./config
 +if [[ ! -e ./config ]]
 +then
 +   echo No config file found, copying default config
 +   cp defconfig config
  fi
 +source ./config
  
  # To use this as a library, set RAISIN_PATH appropriately
  [[ -z $RAISIN_PATH ]]  RAISIN_PATH=$PWD/lib
 @@ -29,14 +32,6 @@ source ${RAISIN_PATH}/common-functions.sh
  source ${RAISIN_PATH}/git-checkout.sh
  source ${RAISIN_PATH}/commands.sh
  
 -# Include your defaults
 -if [[ ! -e ./config ]] ; then
 -   echo No config file found, copying default config
 -   cp defconfig config
 -fi
 -
 -source ./config
 -
  # Set up basic functionality
  common_init
  
 


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] x86/asm/irq: Don't use POPF but STI

2015-04-21 Thread Ingo Molnar

* Borislav Petkov b...@alien8.de wrote:

 On Tue, Apr 21, 2015 at 02:45:58PM +0200, Ingo Molnar wrote:
  From 6f01f6381e8293c360b7a89f516b8605e357d563 Mon Sep 17 00:00:00 2001
  From: Ingo Molnar mi...@kernel.org
  Date: Tue, 21 Apr 2015 13:32:13 +0200
  Subject: [PATCH] x86/asm/irq: Don't use POPF but STI
  
  So because the POPF instruction is slow and STI is faster on 
  essentially all x86 CPUs that matter, instead of:
  
81891848:   9d  popfq
  
  we can do:
  
81661a2e:   41 f7 c4 00 02 00 00test   $0x200,%r12d
81661a35:   74 01   je 81661a38 
  snd_pcm_stream_unlock_irqrestore+0x28
81661a37:   fb  sti
81661a38:
  
  This bloats the kernel a bit, by about 1K on the 64-bit defconfig:
  
 textdata bss dec hex filename
 122586341812120 1085440 15156194 e743e2 vmlinux.before
 122595821812120 1085440 15157142 e74796 vmlinux.after
  
  the other cost is the extra branching, adding extra pressure to the
  branch prediction hardware and also potential branch misses.
 
 Do we care? [...]

Only if it makes stuff faster.

 [...] After we enable interrupts, we'll most likely go somewhere 
 cache cold anyway, so the branch misses will happen anyway.
 
 The question is, would the cost drop from POPF - STI cover the 
 increase in branch misses overhead?
 
 Hmm, interesting.

So there's a few places where the POPF is a STI in 100% of the cases. 
It's probably a win there.

But my main worry would be sites that are 'multi use', such as locking 
APIs - for example spin_unlock_irqrestore(): those tend to be called 
from different code paths, and each one has a different IRQ flags 
state.

For example scheduler wakeups done from irqs-off codepaths (it's very 
common), or from irqs-on codepaths (that's very common as well). In 
the former case we won't have a STI, in the latter case we will - and 
both would hit a POPF at the end of the critical section. The 
probability of a branch prediction miss is high in this case.

So the question is, is the POPF/STI performance difference higher than 
the average cost of branch misses. If yes, then the change is probably 
a win. If not, then it's probably a loss.

My gut feeling is that we should let the hardware do it, i.e. we 
should continue to use POPF - but I can be convinced ...

Thanks,

Ingo

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3] xen: sched_rt: print useful affinity info when dumping

2015-04-21 Thread Dario Faggioli
In fact, printing the cpupool's CPU online mask
for each vCPU is just redundant, as that is the
same for all the vCPUs of all the domains in the
same cpupool, while hard affinity is already part
of the output of dumping domains info.

Instead, print the intersection between hard
affinity and online CPUs, which is --in case of this
scheduler-- the effective affinity always used for
the vCPUs.

This change also takes the chance to add a scratch
cpumask area, to avoid having to either put one
(more) cpumask_t on the stack, or dynamically
allocate it within the dumping routine. (The former
being bad because hypervisor stack size is limited,
the latter because dynamic allocations can fail, if
the hypervisor was built for a large enough number
of CPUs.) We allocate such scratch area, for all pCPUs,
when the first instance of the RTDS scheduler is
activated and, in order not to loose track/leak it
if other instances are activated in new cpupools,
and when the last instance is deactivated, we (sort
of) refcount it.

Such scratch area can be used to kill most of the
cpumasks{_var}_t local variables in other functions
in the file, but that is *NOT* done in this chage.

Finally, convert the file to use keyhandler scratch,
instead of open coded string buffers.

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: Meng Xu xumengpa...@gmail.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Keir Fraser k...@xen.org
---
Changes from v2:
 * added refcounting, to avoid leaking the scratch cpumasks
 * added some ASSERT()s to validate the refcounting

Changes from v1:
 * improved changelog;
 * made a local variable to point to the correct
   scratch mask, as suggested during review.
---
 xen/common/sched_rt.c |   64 ++---
 1 file changed, 55 insertions(+), 9 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 7c39a9e..79cbb67 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -124,6 +124,24 @@
 #define TRC_RTDS_BUDGET_REPLENISH TRC_SCHED_CLASS_EVT(RTDS, 4)
 #define TRC_RTDS_SCHED_TASKLETTRC_SCHED_CLASS_EVT(RTDS, 5)
 
+ /*
+  * Useful to avoid too many cpumask_var_t on the stack.
+  */
+static cpumask_t **_cpumask_scratch;
+#define cpumask_scratch _cpumask_scratch[smp_processor_id()]
+
+/*
+ * We want to only allocate the _cpumask_scratch array the first time an
+ * instance of this scheduler is used, and avoid reallocating and leaking
+ * the old one when more instance are activated inside new cpupools. We
+ * also want to get rid of it when the last instance is de-inited.
+ *
+ * So we (sort of) reference count the number of initialized instances. This
+ * does not need to happen via atomic_t refcounters, as it only happens either
+ * during boot, or under the protection of the cpupool_lock spinlock.
+ */
+static unsigned int nr_rt_ops;
+
 /*
  * Systme-wide private data, include global RunQueue/DepletedQ
  * Global lock is referenced by schedule_data.schedule_lock from all
@@ -218,8 +236,7 @@ __q_elem(struct list_head *elem)
 static void
 rt_dump_vcpu(const struct scheduler *ops, const struct rt_vcpu *svc)
 {
-char cpustr[1024];
-cpumask_t *cpupool_mask;
+cpumask_t *cpupool_mask, *mask;
 
 ASSERT(svc != NULL);
 /* idle vcpu */
@@ -229,10 +246,22 @@ rt_dump_vcpu(const struct scheduler *ops, const struct 
rt_vcpu *svc)
 return;
 }
 
-cpumask_scnprintf(cpustr, sizeof(cpustr), svc-vcpu-cpu_hard_affinity);
+/*
+ * We can't just use 'cpumask_scratch' because the dumping can
+ * happen from a pCPU outside of this scheduler's cpupool, and
+ * hence it's not right to use the pCPU's scratch mask (which
+ * may even not exist!). On the other hand, it is safe to use
+ * svc-vcpu-processor's own scratch space, since we hold the
+ * runqueue lock.
+ */
+mask = _cpumask_scratch[svc-vcpu-processor];
+
+cpupool_mask = cpupool_scheduler_cpumask(svc-vcpu-domain-cpupool);
+cpumask_and(mask, cpupool_mask, svc-vcpu-cpu_hard_affinity);
+cpulist_scnprintf(keyhandler_scratch, sizeof(keyhandler_scratch), mask);
 printk([%5d.%-2u] cpu %u, (%PRI_stime, %PRI_stime),
 cur_b=%PRI_stime cur_d=%PRI_stime last_start=%PRI_stime\n
-\t\t onQ=%d runnable=%d cpu_hard_affinity=%s ,
+\t\t onQ=%d runnable=%d flags=%x effective hard_affinity=%s\n,
 svc-vcpu-domain-domain_id,
 svc-vcpu-vcpu_id,
 svc-vcpu-processor,
@@ -243,11 +272,8 @@ rt_dump_vcpu(const struct scheduler *ops, const struct 
rt_vcpu *svc)
 svc-last_start,
 __vcpu_on_q(svc),
 vcpu_runnable(svc-vcpu),
-cpustr);
-memset(cpustr, 0, sizeof(cpustr));
-cpupool_mask = cpupool_scheduler_cpumask(svc-vcpu-domain-cpupool);
-cpumask_scnprintf(cpustr, sizeof(cpustr), cpupool_mask);
-printk(cpupool=%s\n, cpustr);
+svc-flags,
+

Re: [Xen-devel] [PATCH v10 3/6] Support for BIOS interrupt handler

2015-04-21 Thread Kevin O'Connor
On Mon, Mar 30, 2015 at 05:09:47AM +, Xu, Quan wrote:
  From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com]
  On 03/27/2015 03:58 AM, Xu, Quan wrote:
   From: Xu, Quan
   From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com]
   On 03/26/2015 07:01 AM, Xu, Quan wrote:
   From: Stefan Berger [mailto:stef...@linux.vnet.ibm.com]
   On 03/25/2015 06:42 PM, Kevin O'Connor wrote:
   On Tue, Mar 24, 2015 at 11:10:03AM -0400, Stefan Berger wrote:
   On 03/23/2015 08:13 PM, Kevin O'Connor wrote:
   Because of the mixed 16bit/32bit code in SeaBIOS, all assembler
   must use size suffixes - so the above should be roll instead of 
   rol.
   Ok, fixed.
  
   As before - both issues are minor and can be addressed after
   merge (as long as there is agreement that the sha1.c file can
   be licensed as LGPLv3).
   It can have that license. I can post v11 or you can modify it,
   either way is fine.
   Thanks.  I pushed the first three patches into a test branch at:
  
https://github.com/KevinOConnor/seabios/tree/tcg-testing
  
   I'd like to get confirmation that this works for the Xen
   requirements before merging.
   I don't use Xen. I hope that Quan will provide feedback.
  
 Stefan
   Sure, I am glad to help you test it :):) Try to
   https://github.com/KevinOConnor/seabios/tree/tcg-testing ??
   Yes.
[...]
 I will go through all of these seabios patch, and try to make it compatible 
 for Xen vTPM.
 

What's the status of this?  Is it safe to push forward parts of
Stefan's patches, or does that represent a regression for Xen?

-Kevin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 3/5] libxl: add support for vscsi

2015-04-21 Thread Konrad Rzeszutek Wilk
On Fri, Apr 17, 2015 at 08:30:58AM +, Olaf Hering wrote:
 Port pvscsi support from xend to libxl:
 
  vscsi=['pdev,vdev{,options}']
  xl scsi-attach
  xl scsi-detach
  xl scsi-list
 
 Signed-off-by: Olaf Hering o...@aepfle.de
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Wei Liu wei.l...@citrix.com
 ---
  docs/man/xl.cfg.pod.5|  55 +++
  docs/man/xl.pod.1|  18 +
  tools/libxl/Makefile |   2 +
  tools/libxl/libxl.c  | 441 
  tools/libxl/libxl.h  |  27 ++
  tools/libxl/libxl_create.c   |   1 +
  tools/libxl/libxl_device.c   |   2 +
  tools/libxl/libxl_internal.h |  16 +
  tools/libxl/libxl_types.idl  |  56 +++
  tools/libxl/libxl_types_internal.idl |   1 +
  tools/libxl/libxl_vscsi.c| 271 +
  tools/libxl/libxlu_vscsi.c   | 750 
 +++
  tools/libxl/libxlutil.h  |  21 +
  tools/libxl/xl.h |   3 +
  tools/libxl/xl_cmdimpl.c | 184 -
  tools/libxl/xl_cmdtable.c|  15 +
  16 files changed, 1862 insertions(+), 1 deletion(-)
 
 diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
 index f936dfc..d395e56 100644
 --- a/docs/man/xl.cfg.pod.5
 +++ b/docs/man/xl.cfg.pod.5
 @@ -510,6 +510,61 @@ value is optional if this is a guest domain.
  
  =back
  
 +=item Bvscsi=[ VSCSI_SPEC_STRING, VSCSI_SPEC_STRING, ...]
 +
 +Specifies the PVSCSI devices to be provided to the guest. PVSCSI passes
 +dom0 SCSI devices as-is to the guest.

s/dom0/backend/

As in theory you could run an HVM guest with an SCSI device passed in - and
use said backed to pass the SCSI device to another guest.

'as-is' ? I thought there were some filtering done in the backend?

 +
 +Each VSCSI_SPEC_STRING consists of pdev,vdev[,options].
 +'pdev' describes the physical device, preferable in a persistant format.

What is 'persistent' format?

 +'vdev' is the domU device in vHOST:CHANNEL:TARGET:LUN notation, all integers.
 +'option' lists additional flags which a backend may recognize.
 +
 +The supported values for pdev and option depends on the used backend 
 driver:
 +
 +=over 4
 +
 +=item BLinux pvops

s/pvops//

 +
 +=over 4
 +
 +=item Cpdev
 +
 +The backend driver in the pvops kernel is part of the Linux-IO Target 
 framework
 +(LIO). As such the SCSI devices have to be configured first with the tools
 +provided by this framework, such as a xen-scsiback aware targetcli. The 
 pdev
 +in domU.cfg has to refer to a config item in that framework instead of the 
 raw
 +device. Ususally this is a WWN in the form of na.WWN:LUN.

Usually.

 +
 +=item Coption
 +
 +No options recognized.
 +
 +=back
 +
 +=item BLinux xenlinux

xenlinux? Classic Xen?

 +
 +=over 4
 +
 +=item Cpdev
 +
 +The dom0 device in either /dev/scsidev or pHOST:CHANNEL:TARGET:LUN notation.
 +
 +Its recommended to use persistant names /dev/disk/by-*/* to refer to a 
 pdev.
 +The toolstack will translate this internally to h:c:t:l notation, which is 
 how
 +the backend driver will access the device. Using the h:c:t:l notation for
 +pdev in domU.cfg is discouraged because this value will change across 
 reboots,
 +depending on the detection order in the OS.
 +
 +=item Coption
 +
 +Currently only the option value feature-host is recognized. SCSI command
 +emulation in backend driver is bypassed when feature-host is specified.
 +
 +=back
 +
 +=back
 +
  =item Bvfb=[ VFB_SPEC_STRING, VFB_SPEC_STRING, ...]
  
  Specifies the paravirtual framebuffer devices which should be supplied
 diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
 index 16783c8..19bdbfa 100644
 --- a/docs/man/xl.pod.1
 +++ b/docs/man/xl.pod.1
 @@ -1328,6 +1328,24 @@ List virtual trusted platform modules for a domain.
  
  =back
  
 +=head2 PVSCSI DEVICES
 +
 +=over 4
 +
 +=item Bscsi-attach Idomain-id Ipdev Ivdev,I[feature-host]
 +
 +Creates a new vscsi device in the domain specified by Idomain-id.
 +
 +=item Bscsi-detach Idomain-id Ivdev
 +
 +Removes the vscsi device from domain specified by Idomain-id.
 +
 +=item Bscsi-list Idomain-id I[domain-id] ...
 +
 +List vscsi devices for the domain specified by Idomain-id.
 +
 +=back
 +
  =head1 PCI PASS-THROUGH
  
  =over 4
 diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
 index 1b16598..79b3867 100644
 --- a/tools/libxl/Makefile
 +++ b/tools/libxl/Makefile
 @@ -91,6 +91,7 @@ endif
  LIBXL_LIBS += -lyajl
  
  LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 + libxl_vscsi.o \
   libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
   libxl_internal.o libxl_utils.o libxl_uuid.o \
   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
 \
 @@ -122,6 +123,7 @@ AUTOINCS= libxlu_cfg_y.h libxlu_cfg_l.h 

Re: [Xen-devel] [PATCH v5 13/13] docs: add xl-psr.markdown

2015-04-21 Thread Ian Campbell
On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
 @@ -1534,7 +1540,8 @@ Show CAT hardware information.
  
  =item Bpsr-cat-cbm-set [IOPTIONS] Idomain-id Icbm
  
 -Set cache capacity bitmasks(CBM) for a domain.
 +Set cache capacity bitmasks(CBM) for a domain. For how to specify Icbm
 +please refer to the link above.

I think the link above is to far. I think say something more explicit
like Lxl-psr.txt ?

  
  BOPTIONS
  
 @@ -1575,6 +1582,7 @@ And the following documents on the xen.org website:
  Lhttp://xenbits.xen.org/docs/unstable/misc/xl-network-configuration.html
  Lhttp://xenbits.xen.org/docs/unstable/misc/xl-disk-configuration.txt
  Lhttp://xenbits.xen.org/docs/unstable/misc/xsm-flask.txt
 +Lhttp://xenbits.xen.org/docs/unstable/misc/xl-psr.html
  
  For systems that don't automatically bring CPU online:
  
 diff --git a/docs/misc/xl-psr.markdown b/docs/misc/xl-psr.markdown
 new file mode 100644
 index 000..d167b84
 --- /dev/null
 +++ b/docs/misc/xl-psr.markdown
 @@ -0,0 +1,134 @@
 +# Intel Platform Shared Resource Monitoring/Control in xl
 +
 +This document introduces Intel Platform Shared Resource Monitoring/Control
 +technologies, their basic concepts and the xl interfaces.
 +
 +## Cache Monitoring Technology (CMT)
 +
 +Cache Monitoring Technology (CMT) is a new feature available on Intel Haswell
 +and later server platforms that allows an OS or Hypervisor/VMM to determine
 +the usage of cache(currently only L3 cache supported) by applications running
 ^space before ( please.

 +For example, assuming a system with 8 portions and 3 domains:
 +
 +A CBM of 0xff for every domain means each domain can access the
 +whole cache. This is the default.
 +
 +Giving one domain a CBM of 0x0f and the other two domain's 0xf0
 +means that the first domain gets exclusive access to half of the
 +cache (half of the portions) and the other two will share the
 +other half.
 +
 +Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0
 +would give the first domain exclusive access to half the cache,
 +and the other two exclusive access to one quarter each.

How does markdown render this? I think you might want to start each para
with a * to make it a bullet list.

Other than those minor things all looks good, thanks.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/5] vscsiif.h: fix WWN notation for p-dev property

2015-04-21 Thread Olaf Hering
On Tue, Apr 21, Konrad Rzeszutek Wilk wrote:

 On Fri, Apr 17, 2015 at 08:30:56AM +, Olaf Hering wrote:
  The pvops kernel expects either naa.WWN:LUN or h:c:t:l in the p-dev
  property. Add the missing :LUN part to the comment.
 
 What about the older kernels that had said driver?

Which older kernels? There is just 3.18+ of pvops.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 16:33, tkleng...@sec.in.tum.de wrote:
 On Tue, Apr 21, 2015 at 4:14 PM, Jan Beulich jbeul...@suse.com wrote:
 
  On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
  On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
  On 20/04/15 16:06, Tamas K Lengyel wrote:
   The current implementation of three memops,
 XENMEM_current_reservation,
   XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
   int. However, in ARM64 we could potentially have 36-bit pfn's, thus
   in preparation for the ARM patch, in this patch we update the existing
   memop routines to use a struct, xen_get_gpfn, to exchange the gpfn
 info
   as a uin64_t.
  
   This patch also adds error checking on the toolside in case the memop
   fails.
  
   Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
 
  XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
 
  You cannot make adjustments like this, but you can add a brand new op
  with appropriate parameters and list the old ops as deprecated.
 
  Right. For the benefit of callers using the old API it seems what we
  usually do is rename the old op XENMEM_foo_compat and use the name with
  a new number for the new functionality, then use a
  __XEN_INTERFACE_VERSION__ to #define back to the old name.
 
  The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
  reasonable example, I couldn't find one specifically for the memory ops.

 And there's no need to afaict: This complication isn't needed in the
 first place. The patch's context already makes this clear:

 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd,
 XEN_GUEST_HANDLE_PARAM(void) arg)

 Note the long return type. Yet the patch description, for
 whatever reason, claims the hypercall to only return an int.
 Maybe because (as pointed out before) the respective Linux
 hypercall stub wrongly an int return type?
 
 The privcmd driver on Linux certainly does return an int via the ioctl.

That's clearly a bug in Linux:

static long privcmd_ioctl(struct file *file,
  unsigned int cmd, unsigned long data)
{
int ret = -ENOSYS;
void __user *udata = (void __user *) data;

switch (cmd) {
case IOCTL_PRIVCMD_HYPERCALL:
ret = privcmd_ioctl_hypercall(udata);
break;
[...]
return ret;
}

Why in the world is ret an int? That's certainly not something
inherited from XenoLinux, where all involved types are long.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 15:30 +0100, Ian Jackson wrote:
 Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in 
 TestSupport.pm for nested test):
  On Tue, 2015-04-21 at 14:28 +0100, Ian Jackson wrote:
   I'm only objecting to the involvement of the host property machinery.
  
  I'm afraid I'm not 100% sure I understand, but do you just mean that
  longtao should assign to $ho-{Ip} directly within selecthost instead of
  using $setprop to set the IpAddr hostprop and therefore indirectly
  -{Ip}?
 
 Yes.
 
  IOW the existing code in selecthost which looks up the IpAddr host prop
  into $ho-{IpStatic} and then into $ho-{Ip} should prefer the runvar to
  the host db if it is set?
 
 Yes.

Good. Longtao, do you understand what is needed here?

  BTW, what is the difference between -{Ip} and -{IpStatic} and should
  the runvar be reflected in both or just in -{Ip}?
 
 AFAICT the only difference is that only IpStatic is used for expanding
 `ipaddr' and `ipaddrhex' in the patterns for dhcp pxe configurations.
 
 There is an incipient problem here: our existing setup does not
 require that osstest has complete control of the pxe config file for
 every host on the network.  And indeed the Citrix (Cambridge) instance
 lacks such control.
 
 Instead there are some symlinks etc.  The mg-hosts mkpxedir subcommand
 arranges to delegate the DHCP from root to the osstest user (and
 group), by making an appropriate subdirectory with the right
 permissions, and a suitable symlink.
 
 It's not quite clear to me how this should work for nested HVM hosts.

I think this is OK with the current design because the nested HVM host
is not PXE booted, it is installed as a guest but preseeded to look a
lot like a host (quite a lot of sharing in the preseed creation) and
then tailored into a more host-like thing by things like this IP addr
runvar.

So I think from what you are saying that the -{Ip} and -{IpStatic}
distinction here is correct and a host which is really a guest should
not have nor use -{IpStatic} (we want e.g. attempts to create a pxe dir
to fail).

Longtao, the conclusion is that the runvar should only appear in -{Ip}.

-{IpStatic} should be left untouched (i.e. not initialised).

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 0/5] raisin: introduce ovmf and linux

2015-04-21 Thread Stefano Stabellini
Hi all,

this patch series includes a few clean-ups and introduces a component to
build ovmf and a component to build a linux kernel. The latter is not
enabled by default, and could probably benefit from a bit more work on
the kconfig options to really be useful, but it is a start.


Stefano Stabellini (5):
  raisin: introduce _verbose_echo
  raisin: remove duplicate source config in raise
  raisin: rename ARCH to RAISIN_ARCH
  raisin: introduce ovmf
  raisin: build linux

 README  |2 +-
 components/grub |6 +--
 components/libvirt  |2 +-
 components/linux|  120 +++
 components/ovmf |   60 ++
 components/qemu |2 +-
 components/qemu_traditional |2 +-
 components/seabios  |6 +--
 components/series   |2 +
 components/xen  |5 +-
 defconfig   |8 ++-
 lib/common-functions.sh |   50 +++---
 raise   |   15 ++
 scripts/mkdeb   |8 +--
 scripts/mkrpm   |2 +-
 15 files changed, 229 insertions(+), 61 deletions(-)
 create mode 100644 components/linux
 create mode 100644 components/ovmf


Cheers,

Stefano

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/5] raisin: rename ARCH to RAISIN_ARCH

2015-04-21 Thread George Dunlap
On 04/21/2015 03:55 PM, Stefano Stabellini wrote:
 Rename exported environmental variable ARCH to RAISIN_ARCH not to
 conflict with any component Makefile variables.
 
 Signed-off-by: Stefano Stabellini stefano.stabell...@eu.citrix.com

Acked-by: George Dunlap george.dun...@eu.citrix.com

 ---
  README  |2 +-
  components/grub |6 +++---
  components/libvirt  |2 +-
  components/qemu |2 +-
  components/qemu_traditional |2 +-
  components/seabios  |6 +++---
  components/xen  |2 +-
  lib/common-functions.sh |6 +++---
  scripts/mkdeb   |8 
  scripts/mkrpm   |2 +-
  10 files changed, 19 insertions(+), 19 deletions(-)
 
 diff --git a/README b/README
 index ab8d3e7..b7832da 100644
 --- a/README
 +++ b/README
 @@ -85,7 +85,7 @@ RAISIN_MAKE: make command to run
  SUDO: sudo command to run
  DISTRO: which Linux distribution we are running on, Debian, Fedora, etc
  PKGTYPE: which package format is used by the distro, rpm, deb, etc
 -ARCH: which architecture we are running on, x86_64, arm32, etc.
 +RAISIN_ARCH: which architecture we are running on, x86_64, arm32, etc.
  BASEDIR: absolute path of the raisin directory
  PREFIX: installation PREFIX
  INST_DIR: absolute path of the installation directory
 diff --git a/components/grub b/components/grub
 index 7c14a62..fa72b99 100644
 --- a/components/grub
 +++ b/components/grub
 @@ -17,13 +17,13 @@ function grub_check_package() {
  local DEP_CentOS_x86_64=$DEP_Fedora_x86_64
  
  
 -if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
 +if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
  then
  echo grub is only supported on x86_32 and x86_64
  return
  fi
  echo Checking Grub dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  
 @@ -41,7 +41,7 @@ function grub_build() {
-m ../memdisk.tar -o grub-i386-xen grub-core/*mod
  cp grub-i386-xen $INST_DIR/$PREFIX/lib/xen/boot
  ## GRUB64
 -if [[ $ARCH = x86_64 ]]
 +if [[ $RAISIN_ARCH = x86_64 ]]
  then
  $RAISIN_MAKE clean
  ./configure --target=amd64 --with-platform=xen
 diff --git a/components/libvirt b/components/libvirt
 index 5fb1a41..a554643 100644
 --- a/components/libvirt
 +++ b/components/libvirt
 @@ -23,7 +23,7 @@ function libvirt_check_package() {
  local DEP_CentOS_x86_64=$DEP_Fedora_x86_64
  
  echo Checking Libvirt dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  function libvirt_build() {
 diff --git a/components/qemu b/components/qemu
 index 00aeeb6..72cfec1 100644
 --- a/components/qemu
 +++ b/components/qemu
 @@ -12,7 +12,7 @@ function qemu_check_package() {
  local DEP_Fedora_x86_64=$DEP_Fedora_common
  
  echo Checking QEMU dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  function qemu_build() {
 diff --git a/components/qemu_traditional b/components/qemu_traditional
 index 500cbed..b338007 100644
 --- a/components/qemu_traditional
 +++ b/components/qemu_traditional
 @@ -13,7 +13,7 @@ function qemu_traditional_check_package() {
  local DEP_Fedora_x86_64=$DEP_Fedora_common
  
  echo Checking QEMU dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  function qemu_traditional_build() {
 diff --git a/components/seabios b/components/seabios
 index 960a538..ed2c7d2 100644
 --- a/components/seabios
 +++ b/components/seabios
 @@ -12,18 +12,18 @@ function seabios_check_package() {
  local DEP_Fedora_x86_64=$DEP_Fedora_common
  
  
 -if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
 +if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
  then
  echo seabios is only supported on x86_32 and x86_64
  return
  fi
  echo Checking SeaBIOS dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  
  function seabios_build() {
 -if [[ $ARCH != x86_64  $ARCH != x86_32 ]]
 +if [[ $RAISIN_ARCH != x86_64  $RAISIN_ARCH != x86_32 ]]
  then
  echo seabios is only supported on x86_32 and x86_64
  return
 diff --git a/components/xen b/components/xen
 index b7258b0..b3426f0 100644
 --- a/components/xen
 +++ b/components/xen
 @@ -20,7 +20,7 @@ function xen_check_package() {
  local DEP_CentOS_x86_64=$DEP_CentOS_x86_32 glibc-devel.i686
  
  echo Checking Xen dependencies
 -eval check-package \$DEP_$DISTRO_$ARCH
 +eval check-package \$DEP_$DISTRO_$RAISIN_ARCH
  }
  
  function xen_build() {
 diff --git a/lib/common-functions.sh b/lib/common-functions.sh
 index c19046b..186a53b 100644
 --- a/lib/common-functions.sh
 +++ b/lib/common-functions.sh
 @@ -41,7 +41,7 @@ function common_init() {
   

Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 16:42, tkleng...@sec.in.tum.de wrote:
 On Tue, Apr 21, 2015 at 4:29 PM, Jan Beulich jbeul...@suse.com wrote:
 
  On 21.04.15 at 16:24, ian.campb...@citrix.com wrote:
  On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
   On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
   On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
   On 20/04/15 16:06, Tamas K Lengyel wrote:
The current implementation of three memops,
 XENMEM_current_reservation,
XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values
 as an
int. However, in ARM64 we could potentially have 36-bit pfn's, thus
in preparation for the ARM patch, in this patch we update the
 existing
memop routines to use a struct, xen_get_gpfn, to exchange the gpfn
 info
as a uin64_t.
   
This patch also adds error checking on the toolside in case the
 memop
fails.
   
Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
  
   XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
  
   You cannot make adjustments like this, but you can add a brand new op
   with appropriate parameters and list the old ops as deprecated.
  
   Right. For the benefit of callers using the old API it seems what we
   usually do is rename the old op XENMEM_foo_compat and use the name
 with
   a new number for the new functionality, then use a
   __XEN_INTERFACE_VERSION__ to #define back to the old name.
  
   The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
   reasonable example, I couldn't find one specifically for the memory
 ops.
 
  And there's no need to afaict: This complication isn't needed in the
  first place. The patch's context already makes this clear:
 
  --- a/xen/common/memory.c
  +++ b/xen/common/memory.c
  @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd,
  XEN_GUEST_HANDLE_PARAM(void) arg)
 
  Note the long return type. Yet the patch description, for
  whatever reason, claims the hypercall to only return an int.
  Maybe because (as pointed out before) the respective Linux
  hypercall stub wrongly an int return type?
 
  Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64
  bit host (maximum pfn more than 2^32)?

 It is, but do we really want to introduce other than just compat
 mode helper interfaces (i.e. leaving the current ones alone, and
 perhaps even making the new ones tools only) if we really care
 about such setups in the first place?
 
 At the moment I just followed Andrew's advice and will introduce a new
 XENMEM_maximum_gpfn2 memop that returns the gpfn in a struct as xen_pfn_t.
 The old memops I'll leave untouched if that's OK.

For this specific one - is there a reasonable use case? Other than
for host PFN, we have control over guest ones, and I'm not sure
managing a guest with GPFNs extending past 4 billion can be
expected to work if only this one hypercall got fixed. IOW I'm
expecting to NAK any such addition without proper rationale.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST v2] Arrange for core dumps to be placed in /var/core and collect them

2015-04-21 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST v2] Arrange for core dumps to be placed in 
/var/core and collect them):
 Refactor the $kvp_replace helper in ts-xen-install into a generic
 helper (which requires using ::EO and ::EI for namespacing) for use
 with target_editfile and use it to edit /etc/sysctl.conf to set
 kernel.core_pattern on boot.
 
 Tested in standalone mode by installing and running a C program
 containing *(int *)0 = 1; which, after running ulimit -c unlimited
 produces the expected core file. ts-logs-capture when run in
 standalone mode then picks them up.
 
 I've not yet figured out how to make the desired rlimit take affect
 for all processes (including e.g. daemons spawned on boot). Likely
 this will involve some combination of pam_limits.so PAM module and
 adding explicit ulimit calls to the initscripts which we care about
 (primarily xencommons and libvirt initscripts).
 
 Signed-off-by: Ian Campbell ian.campb...@citrix.com
 Acked-by: Ian Jackson ian.jack...@eu.citrix.com
 ---
 v2: Add /var/core to ts-leak-check

I'm going to push this and the two related patches, and the
$r{enable_xsm} one, RSN.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 11/13] tools/libxl: add command to show CMT hardware info

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 02:37 +0200, Dario Faggioli wrote:
 On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
  Add dedicated one to show hardware information.
  
  [root@vmm-psr]xl psr-cmt-hwinfo
  Cache Monitoring Technology (CMT):
  Enabled : 1
  Total RMID  : 63
  Supported monitor types:
  cache-occupancy
  total-mem-bandwidth
  local-mem-bandwidth
  
 Nice.
 
  Signed-off-by: Chao Peng chao.p.p...@linux.intel.com
 
  --- a/tools/libxl/xl_cmdimpl.c
  +++ b/tools/libxl/xl_cmdimpl.c
  @@ -8014,6 +8014,36 @@ out:
   }
   
   #ifdef LIBXL_HAVE_PSR_CMT
  +static int psr_cmt_hwinfo(void)
  +{
  +int rc;
  +int enabled;
  +uint32_t total_rmid;
  +
 
 I think you should still have something like:
 
 SWITCH_FOREACH_OPT(opt, , NULL, psr-cmt-hwinfo, 0) {
 /* No options */
 }
 
 Or `xl psr-cmt-hwinfo -h' wouldn't work, would it?

Yes, that is neded.

 With that done:
 
  Reviewed-by: Dario Faggioli dario.faggi...@citrix.com

Likewise:
Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
 On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
 On 20/04/15 16:06, Tamas K Lengyel wrote:
  The current implementation of three memops, XENMEM_current_reservation,
  XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
  int. However, in ARM64 we could potentially have 36-bit pfn's, thus
  in preparation for the ARM patch, in this patch we update the existing
  memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info
  as a uin64_t.
 
  This patch also adds error checking on the toolside in case the memop
  fails.
 
  Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
 
 XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
 
 You cannot make adjustments like this, but you can add a brand new op
 with appropriate parameters and list the old ops as deprecated.
 
 Right. For the benefit of callers using the old API it seems what we
 usually do is rename the old op XENMEM_foo_compat and use the name with
 a new number for the new functionality, then use a
 __XEN_INTERFACE_VERSION__ to #define back to the old name.
 
 The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
 reasonable example, I couldn't find one specifically for the memory ops.

And there's no need to afaict: This complication isn't needed in the
first place. The patch's context already makes this clear:

--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)

Note the long return type. Yet the patch description, for
whatever reason, claims the hypercall to only return an int.
Maybe because (as pointed out before) the respective Linux
hypercall stub wrongly an int return type?

(Tamas, while I had asked you to avoid Cc-ing me on ARM only
patches, I would have been able to spot this earlier if you had
Cc-ed me on this non-ARM change. Again, please Cc
maintainers, but especially on larger series make sure you
don't Cc everyone on every patch. I'm sure avoiding the extra
mail load is being appreciated not just by me.)

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 6/7] docs/build: Move install checks into individual build targets

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 13:01 +0100, Andrew Cooper wrote:
 On 20/04/15 12:57, Jan Beulich wrote:
  On 20.04.15 at 12:49, andrew.coop...@citrix.com wrote:
  @@ -103,12 +88,20 @@ install: install-man-pages install-html
   
   # Individual file build targets
   man1/%.1: man/%.pod.1 Makefile
  +ifdef POD2MAN
  Perhaps better to use ifneq($(POD2MAN),) in such cases?
 
 I was following the prevailing style, but can update all instances. 
 FWIW, it is not currently an issue.

I don't really mind, ifneq seems more prevalent in other makefiles, I'm
not sure why this one uses ifdef nor really what the implications are.

 @$(INSTALL_DIR) $(@D)
 $(POD2MAN) --release=$(VERSION) --name=$* -s 1 -c Xen $ $@
  +else
  +  @echo pod2man not installed; skipping $(@F)
  Why do you strip off the directory part? Leaving it in place would
  make the output even less ambiguous (and hence more helpful).
 
 Can do.

Thanks.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 16:10 +0200, Tamas K Lengyel wrote:
 
 
 On Tue, Apr 21, 2015 at 3:14 PM, Ian Campbell
 ian.campb...@citrix.com wrote:
 On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote:
  -paddr_t p2m_lookup(struct domain *d, paddr_t paddr,
 p2m_type_t *t)
  +static paddr_t __p2m_lookup(struct domain *d, paddr_t
 paddr, p2m_type_t *t)
 
 I'd have been inclined to make this take a p2m_domain* rather
 than a
 domain*, on the basis that the caller must have one in hand to
 have
 locked it. I don't think __p2m_lookup uses d other than to
 find the p2m.
 
 Not worth changing unless there is some other reason to resend
 though,
 so with or without that:
 
 Acked-by: Ian Campbell ian.campb...@citrix.com
 
 Ian.
 
 
 Thanks, I made the change as I'll have another round anyway to
 introduce XENMEM_maximum_gpfn2 in the series.

Great, thanks.

FWIW there was another similar wrapper in another patch later on, but
that one did use d so I didn't suggest the same change there for that
reason.

Ian.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
  On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
  On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
  On 20/04/15 16:06, Tamas K Lengyel wrote:
   The current implementation of three memops, XENMEM_current_reservation,
   XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
   int. However, in ARM64 we could potentially have 36-bit pfn's, thus
   in preparation for the ARM patch, in this patch we update the existing
   memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info
   as a uin64_t.
  
   This patch also adds error checking on the toolside in case the memop
   fails.
  
   Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
  
  XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
  
  You cannot make adjustments like this, but you can add a brand new op
  with appropriate parameters and list the old ops as deprecated.
  
  Right. For the benefit of callers using the old API it seems what we
  usually do is rename the old op XENMEM_foo_compat and use the name with
  a new number for the new functionality, then use a
  __XEN_INTERFACE_VERSION__ to #define back to the old name.
  
  The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
  reasonable example, I couldn't find one specifically for the memory ops.
 
 And there's no need to afaict: This complication isn't needed in the
 first place. The patch's context already makes this clear:
 
 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, 
 XEN_GUEST_HANDLE_PARAM(void) arg)
 
 Note the long return type. Yet the patch description, for
 whatever reason, claims the hypercall to only return an int.
 Maybe because (as pointed out before) the respective Linux
 hypercall stub wrongly an int return type?

Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64
bit host (maximum pfn more than 2^32)?



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Jan Beulich
 On 21.04.15 at 16:24, ian.campb...@citrix.com wrote:
 On Tue, 2015-04-21 at 15:14 +0100, Jan Beulich wrote:
  On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
  On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
  On 20/04/15 16:06, Tamas K Lengyel wrote:
   The current implementation of three memops, XENMEM_current_reservation,
   XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
   int. However, in ARM64 we could potentially have 36-bit pfn's, thus
   in preparation for the ARM patch, in this patch we update the existing
   memop routines to use a struct, xen_get_gpfn, to exchange the gpfn info
   as a uin64_t.
  
   This patch also adds error checking on the toolside in case the memop
   fails.
  
   Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
  
  XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
  
  You cannot make adjustments like this, but you can add a brand new op
  with appropriate parameters and list the old ops as deprecated.
  
  Right. For the benefit of callers using the old API it seems what we
  usually do is rename the old op XENMEM_foo_compat and use the name with
  a new number for the new functionality, then use a
  __XEN_INTERFACE_VERSION__ to #define back to the old name.
  
  The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
  reasonable example, I couldn't find one specifically for the memory ops.
 
 And there's no need to afaict: This complication isn't needed in the
 first place. The patch's context already makes this clear:
 
 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd, 
 XEN_GUEST_HANDLE_PARAM(void) arg)
 
 Note the long return type. Yet the patch description, for
 whatever reason, claims the hypercall to only return an int.
 Maybe because (as pointed out before) the respective Linux
 hypercall stub wrongly an int return type?
 
 Isn't this still an issue for 32-bit toolstack (long == 4 bytes) on a 64
 bit host (maximum pfn more than 2^32)?

It is, but do we really want to introduce other than just compat
mode helper interfaces (i.e. leaving the current ones alone, and
perhaps even making the new ones tools only) if we really care
about such setups in the first place?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v7 2/5] sysctl: Add sysctl interface for querying PCI topology

2015-04-21 Thread Andrew Cooper
On 21/04/15 14:13, Boris Ostrovsky wrote:

 On 04/21/2015 09:14 AM, Andrew Cooper wrote:
 On 21/04/15 13:56, Boris Ostrovsky wrote:
 On 04/21/2015 03:01 AM, Jan Beulich wrote:
 --- a/docs/misc/xsm-flask.txt
 +++ b/docs/misc/xsm-flask.txt
 @@ -121,6 +121,7 @@ __HYPERVISOR_sysctl
 (xen/include/public/sysctl.h)
 * XEN_SYSCTL_cpupool_op
 * XEN_SYSCTL_scheduler_op
 * XEN_SYSCTL_coverage_op
 + * XEN_SYSCTL_pcitopoinfo
 No additions to this list are permitted. Either the new sub-op is
 disaggregation safe (which it looks to be), or it can't be accepted.
 True, it *is* safe, but why then cputopoinfo and numainfo are in this
 list? They look to be safe as well.
 This list includes not yet audited.  It is quite likely that some
 entries in the list are safe.

 fwiw, neither cputopo nor numainfo are currently safe for very large
 systems.

 Why would safety of an operation depend on system size? And how are
 these two unsafe? (Because maybe then PCI topology query is unsafe in
 the same manner)

PCI topology has continuations in it.

cpu/numa info is bounded by host values, but does attempt to cover all
cpus/nodes in a single pass, which will get progressively longer as the
server gets larger.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Wg-test-framework] osstest outage over the weekend

2015-04-21 Thread Ian Jackson
Don Koch writes (Re: [Wg-test-framework] osstest outage over the weekend):
 Sorry. I had gotten the impression that you had some other tool that was
 handling the DHCP addresses for the test machines. I guess what you meant
 (and I misinterpreted) was that it was handling figuring out which addresses
 were handed out to which machines. As a result, I thought that the dhcp
 server wasn't really in the production loop.

What I have is a system for automatically producing dhcp server
configuration so that test boxes can be automatically provisioned.

You can see a hook for this in dhcpd.conf, which says:
include /etc/dhcp/testhosts.dhcp.conf;

There is an analogous hook in the dns zonefile.

 I will refrain from modifying it from here on.

Thanks.  In general this applies to everything in the colo, both
hardware and software:

Unless the resource in question is specific to one of the test boxes
that have not yet been handed over to me, it should not be touched
without consulting me.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] Config.mk: Fix (and, effectively, update) QEMU_TAG

2015-04-21 Thread Ian Jackson
Jan Beulich writes (Re: [PATCH] Config.mk: Fix (and, effectively, update) 
QEMU_TAG):
 On 21.04.15 at 12:33, ian.jack...@eu.citrix.com wrote:
  In 952944f7 QEMU_TAG update my tag update script mangled the
  machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
  assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
  which ought to have been replaced.
  Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com
  Reported-by: Jan Beulich jbeul...@suse.com
 
 Acked-by: Jan Beulich jbeul...@suse.com

Thanks.  Sorry for the breakage.

 (and please also for 4.5)

Indeed.  See below, which I have just pushed (without waiting for your
further ack).

Ian.

From d41906197d9a89355f43d4359d795b1d0257a53a Mon Sep 17 00:00:00 2001
From: Ian Jackson ian.jack...@eu.citrix.com
Date: Tue, 21 Apr 2015 14:47:38 +0100
Subject: [PATCH] Config.mk: Fix QEMU_TAG and QEMU_TRADITIONAL_REVISION
 handling

In 2417e243 QEMU_TAG update my tag update script mangled the
machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
which ought to have been replaced.

The result was that from that commit on, QEMU_TAG was no longer
honoured although QEMU_TRADITIONAL_REVISION still was.

Fix this by restoring the transfer from QEMU_TAG.

This fix is analogous to 5d4c0952 Config.mk: Fix (and, effectively,
update) QEMU_TAG from xen.git#staging, but in 4.5 there has not been
a subsequent attempt to update the qemu revision, so both settings of
QEMU_TRADITIONAL_REVISION are the same, and restoring the proper
behaviour will not have the side effect of making a previous qemu tag
update effective.

Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: George Dunlap george.dun...@eu.citrix.com
CC: Jan Beulich jbeul...@suse.com
Reported-by: Jan Beulich jbeul...@suse.com
---
 Config.mk |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/Config.mk b/Config.mk
index cb1d23b..08df07d 100644
--- a/Config.mk
+++ b/Config.mk
@@ -237,9 +237,7 @@ ifneq (,$(CONFIG_QEMU))
 QEMU_TRADITIONAL_LOC ?= $(CONFIG_QEMU)
 endif
 ifneq (,$(QEMU_TAG))
-QEMU_TRADITIONAL_REVISION ?= 62e41581f69c3fd4a8f829a773015eb4c17f1f3e
-# Tue Mar 31 16:27:45 2015 +0100
-# xen: limit guest control of PCI command register
+QEMU_TRADITIONAL_REVISION ?= $(QEMU_TAG)
 endif
 
 ifeq ($(GIT_HTTP),y)
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 12/13] tools: add tools support for Intel CAT

2015-04-21 Thread Ian Campbell
On Tue, 2015-04-21 at 17:49 +0800, Chao Peng wrote:
 On Tue, Apr 21, 2015 at 03:24:37AM +0200, Dario Faggioli wrote:
  On Fri, 2015-04-17 at 22:33 +0800, Chao Peng wrote:
   This is the xc/xl changes to support Intel Cache Allocation
   Technology(CAT). Two commands are introduced:
   - xl psr-cat-hwinfo
 Show CAT hardware information.
  
   Examples:
   [root@vmm-psr vmm]# xl psr-cat-hwinfo
   Cache Allocation Technology (CAT):
   Socket ID   : 0
   L3 Cache: 12288KB
   Maximum COS : 15
   CBM length  : 12
   Default CBM : 0xfff
   
  Or, you can rename the psr-cmt-hwinfo command, added in the previous
  patch, to 'psr-hwinfo' and make it accept options, e.g.,
  
   -m, --cmt   show Cache Monitoring Technology (CMT) hardware info
   -c, --cat   show Cache Allocation Technology (CAT) hardware info
  
  By default (i.e., no options provided), it can just print all the hw
  info.
  
  Not a big deal, but I think that would make a better command line
  interface. Tools' maintainers' call, I guess.
 
 Thanks for suggestion.

FWIW I think this is a good idea.


   --- a/tools/libxl/libxl.h
   +++ b/tools/libxl/libxl.h
   
   +#ifdef LIBXL_HAVE_PSR_CAT
   +
   +#define LIBXL_PSR_TARGET_ALL (~0U)
   +int libxl_psr_cat_set_cbm(libxl_ctx *ctx, uint32_t domid,
   +  libxl_psr_cbm_type type, uint32_t target,
   +  uint64_t cbm);
   +int libxl_psr_cat_get_cbm(libxl_ctx *ctx, uint32_t domid,
   +  libxl_psr_cbm_type type, uint32_t target,
   +  uint64_t *cbm_r);
   +
  The same applies here: I'd rename taget to socket.
  
  Then there's that LIBXL_PSR_TARGET_ALL. So, target (or socket, if you'll
  rename it) is an integer representing _which_one_ socket to act on.
  However, there is a special value to mean all sockets.
  
  Another possibility would be to offer an API that natively allows for
  operating on multiple sockets, by using libxl_bitmap-s, as it happens in
  many other places, in libxl itself (e.g., setting and getting vcpu
  affinity).
  
  That means target would become a libxl_bitmap, and, in the
  implementation, you'll apply the operation on all the sockets
  corresponding to a set bit in there. Only one bit set means just that
  socket, all bits means all sockets.
  
  This looks like a better interface to me (no need for special ~0U
  values), it'd make the implementation more linear, and is more
  consistent with how other similar situations are handled in libxl.
  However, I appreciate that one may find it overkill... I guess it
  depends whether we expect the prevalent usage pattern to be almost
  always about single sockets --and maybe on all sockets, from time to
  time-- or if we see value in being able to specify more than one and
  less than all sockets.
  
  For instance, now that we have vNUMA, if a domain has 4 vNUMA nodes,
  each one mapped to a physical NUMA node, it looks to me like it would
  make sense to set CAT to, say, 0x0F, on the sockets corresponding to the
  physical NODEs. With the interface in this patch, that would require
  calling libxl_psr_cat_set_cbm() 4 times, with the libxl_bitmap approach,
  just once, after setting up the bitmap properly.
  
  Thoughts?
 
 I do like this suggestion and I have ever considered it actually. The
 only thing prevents me is that we need an extra _get_socket_count in xl
 for TARGET_ALL case. So libxl__count_physical_sockets is needed to be
 public. If Ian/Wei have no concerns for this, then I'm glad to do this.

I don't think you need libxl__count_physical_sockets for this, the
existing xl code for similar things just uses libxl_bitmap_set_any to
handle the all case.

With that in mind Dario's suggestion does seem like a good improvement
to the interface.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/7] docs/build: Move two rules for consistency, and comment sections

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote:
 No functional change.
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 4/7] docs/build: Do not use move-if-changed

2015-04-21 Thread Ian Campbell
On Mon, 2015-04-20 at 11:49 +0100, Andrew Cooper wrote:
 Nothing expensive depends on these results.

But are those uses a problem?

 
 Also prefer $(INSTALL_DATA) over cp to get correct file attributes (see
 fb33b2b docs: make .txt files over-writable when building from r/o sources)
 
 Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Wei Liu wei.l...@citrix.com
 ---
  docs/Makefile |   21 +++--
  1 file changed, 7 insertions(+), 14 deletions(-)
 
 diff --git a/docs/Makefile b/docs/Makefile
 index 0b458f1..d31b36f 100644
 --- a/docs/Makefile
 +++ b/docs/Makefile
 @@ -116,8 +116,7 @@ html/index.html: $(DOC_HTML) $(CURDIR)/gen-html-index 
 INDEX
  html/%.html: %.markdown
   $(INSTALL_DIR) $(@D)
  ifdef MARKDOWN
 - $(MARKDOWN) $  $@.tmp ; \
 - $(call move-if-changed,$@.tmp,$@)
 + $(MARKDOWN) $  $@
  else
   @echo markdown not installed; skipping $*.html.
  endif
 @@ -129,8 +128,7 @@ html/%.txt: %.txt
  html/man/%.1.html: man/%.pod.1 Makefile
   $(INSTALL_DIR) $(@D)
  ifdef POD2HTML
 - $(POD2HTML) --infile=$ --outfile=$@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(POD2HTML) --infile=$ --outfile=$@
  else
   @echo pod2html not installed; skipping $.
  endif
 @@ -138,8 +136,7 @@ endif
  html/man/%.5.html: man/%.pod.5 Makefile
   $(INSTALL_DIR) $(@D)
  ifdef POD2HTML
 - $(POD2HTML) --infile=$ --outfile=$@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(POD2HTML) --infile=$ --outfile=$@
  else
   @echo pod2html not installed; skipping $.
  endif
 @@ -161,19 +158,16 @@ html/hypercall/%/index.html: $(CURDIR)/xen-headers 
 Makefile
  
  txt/%.txt: %.txt
   $(INSTALL_DIR) $(@D)
 - cp $ $@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(INSTALL_DATA) $ $@
  
  txt/%.txt: %.markdown
   $(INSTALL_DIR) $(@D)
 - cp $ $@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(INSTALL_DATA) $ $@
  
  txt/man/%.1.txt: man/%.pod.1 Makefile
   $(INSTALL_DIR) $(@D)
  ifdef POD2TEXT
 - $(POD2TEXT) $ $@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(POD2TEXT) $ $@
  else
   @echo pod2text not installed; skipping $.
  endif
 @@ -181,8 +175,7 @@ endif
  txt/man/%.5.txt: man/%.pod.5 Makefile
   $(INSTALL_DIR) $(@D)
  ifdef POD2TEXT
 - $(POD2TEXT) $ $@.tmp
 - $(call move-if-changed,$@.tmp,$@)
 + $(POD2TEXT) $ $@
  else
   @echo pod2text not installed; skipping $.
  endif



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 5/9] xen: Make gpfn related memops compatible with wider return values

2015-04-21 Thread Tamas K Lengyel
On Tue, Apr 21, 2015 at 4:14 PM, Jan Beulich jbeul...@suse.com wrote:

  On 21.04.15 at 15:23, ian.campb...@citrix.com wrote:
  On Mon, 2015-04-20 at 16:22 +0100, Andrew Cooper wrote:
  On 20/04/15 16:06, Tamas K Lengyel wrote:
   The current implementation of three memops,
 XENMEM_current_reservation,
   XENMEM_maximum_reservation and XENMEM_maximum_gpfn return values as an
   int. However, in ARM64 we could potentially have 36-bit pfn's, thus
   in preparation for the ARM patch, in this patch we update the existing
   memop routines to use a struct, xen_get_gpfn, to exchange the gpfn
 info
   as a uin64_t.
  
   This patch also adds error checking on the toolside in case the memop
   fails.
  
   Signed-off-by: Tamas K Lengyel tkleng...@sec.in.tum.de
 
  XENMEM, unlikely domctls/sysctls is a guest-visible stable ABI/API.
 
  You cannot make adjustments like this, but you can add a brand new op
  with appropriate parameters and list the old ops as deprecated.
 
  Right. For the benefit of callers using the old API it seems what we
  usually do is rename the old op XENMEM_foo_compat and use the name with
  a new number for the new functionality, then use a
  __XEN_INTERFACE_VERSION__ to #define back to the old name.
 
  The handling of __HYPERVISOR_sched_op in public/xen.h seems like a
  reasonable example, I couldn't find one specifically for the memory ops.

 And there's no need to afaict: This complication isn't needed in the
 first place. The patch's context already makes this clear:

 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -838,12 +838,16 @@ long do_memory_op(unsigned long cmd,
 XEN_GUEST_HANDLE_PARAM(void) arg)

 Note the long return type. Yet the patch description, for
 whatever reason, claims the hypercall to only return an int.
 Maybe because (as pointed out before) the respective Linux
 hypercall stub wrongly an int return type?


The privcmd driver on Linux certainly does return an int via the ioctl.



 (Tamas, while I had asked you to avoid Cc-ing me on ARM only
 patches, I would have been able to spot this earlier if you had
 Cc-ed me on this non-ARM change. Again, please Cc
 maintainers, but especially on larger series make sure you
 don't Cc everyone on every patch. I'm sure avoiding the extra
 mail load is being appreciated not just by me.)

 Jan


Ack, I took you off only after you repeatedly asked me to exclude you from
this series. I agree that patches should be only cc-d to the respective
maintainers and not all maintainers who are involved in the series. I'll
look into how that could be automated because doing that by hand is not
very pleasant.

Thanks,
Tamas
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [OSSTEST Nested PATCH v8 3/7] Edit some APIs in TestSupport.pm for nested test

2015-04-21 Thread Ian Jackson
Ian Campbell writes (Re: [OSSTEST Nested PATCH v8 3/7] Edit some APIs in 
TestSupport.pm for nested test):
 On Tue, 2015-04-21 at 13:33 +0100, Ian Jackson wrote:
  It seems to be copying information from runvars into the in-memory
  data structure for host properties.  But host properties are (by
  definition) matters of (fixed) configuration, not runtime definition.
 
 Remember that here the host is actual an L1 virtual machine, whose IP
 is not fixed (since it mac address isn't).

Indeed.

 I don't know if that affects your opinion at all?

No, it doesn't.  I agree that the IP address should be dynamically
figured out, and passing it through a runvar is fine.

I'm only objecting to the involvement of the host property machinery.

Thanks,
Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Wg-test-framework] osstest outage over the weekend

2015-04-21 Thread Don Koch
On Mon, 20 Apr 2015 06:40:18 -0400
Ian Jackson ian.jack...@eu.citrix.com wrote:

 osstest service user writes ([qemu-mainline test] 50651: trouble: 
 blocked/broken):
  flight 50651 qemu-mainline real [real]
  http://logs.test-lab.xenproject.org/osstest/logs/50651/
  
  Failures and problems with tests :-(
  
  Tests which did not succeed and are blocking,
  including tests which could not be run:
   build-armhf-xsm   3 host-install(3)broken REGR. vs. 50391
   build-armhf   3 host-install(3)broken REGR. vs. 50391
   build-i386-pvops  3 host-install(3)broken REGR. vs. 50391
   build-amd64-xsm   3 host-install(3)broken REGR. vs. 50391
   build-i386-xsm3 host-install(3)broken REGR. vs. 50391
   build-amd64-pvops 3 host-install(3)broken REGR. vs. 50391
   build-amd64   3 host-install(3)broken REGR. vs. 50391
   build-i3863 host-install(3)broken REGR. vs. 50391
   build-armhf-pvops 3 host-install(3)broken REGR. vs. 50391
 
 
 This was due to our contractors / hardware suppliers making an
 ill-advised change to the DHCP server configuration, which I have now
 reverted.
 
 All jobs started between some time around 2100 UTC on Friday and
 around 1030 UTC today failed in this way.  The effect on individual
 flights can be mixed and depends on the period during which the flight
 was running.
 
 
 Don: This is a live, production system.  Please do not make ANY
 changes with global effect, no matter how minor, without prior
 consultation.

Sorry. I had gotten the impression that you had some other tool that was
handling the DHCP addresses for the test machines. I guess what you meant
(and I misinterpreted) was that it was handling figuring out which addresses
were handed out to which machines. As a result, I thought that the dhcp
server wasn't really in the production loop.

I will refrain from modifying it from here on.

-d

 Also if you do change anything you should let me know.  I assume it
 was you rather than Paul.  I only know exactly what you changed from
 looking at the git logs (thanks to etckeeper!)
 
 Thanks,
 Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 1/5] vscsiif.h: fix WWN notation for p-dev property

2015-04-21 Thread Konrad Rzeszutek Wilk
On Fri, Apr 17, 2015 at 08:30:56AM +, Olaf Hering wrote:
 The pvops kernel expects either naa.WWN:LUN or h:c:t:l in the p-dev
 property. Add the missing :LUN part to the comment.

What about the older kernels that had said driver?

 
 Signed-off-by: Olaf Hering o...@aepfle.de
 Cc: Ian Campbell ian.campb...@citrix.com
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Keir Fraser k...@xen.org
 Cc: Tim Deegan t...@xen.org
 ---
  xen/include/public/io/vscsiif.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/xen/include/public/io/vscsiif.h b/xen/include/public/io/vscsiif.h
 index 7a1db05..e8e38a9 100644
 --- a/xen/include/public/io/vscsiif.h
 +++ b/xen/include/public/io/vscsiif.h
 @@ -60,7 +60,7 @@
   *
   *  A string specifying the backend device: either a 4-tuple h:c:t:l
   *  (host, controller, target, lun, all integers), or a WWN (e.g.
 - *  naa.60014054ac780582).
 + *  naa.60014054ac780582:0).
   *
   * v-dev
   *  Values: string
 
 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V15 2/9] xen/arm: Allow hypervisor access to mem_access protected pages

2015-04-21 Thread Tamas K Lengyel
On Tue, Apr 21, 2015 at 3:14 PM, Ian Campbell ian.campb...@citrix.com
wrote:

 On Mon, 2015-04-20 at 17:06 +0200, Tamas K Lengyel wrote:
  -paddr_t p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t *t)
  +static paddr_t __p2m_lookup(struct domain *d, paddr_t paddr, p2m_type_t
 *t)

 I'd have been inclined to make this take a p2m_domain* rather than a
 domain*, on the basis that the caller must have one in hand to have
 locked it. I don't think __p2m_lookup uses d other than to find the p2m.

 Not worth changing unless there is some other reason to resend though,
 so with or without that:

 Acked-by: Ian Campbell ian.campb...@citrix.com

 Ian.


Thanks, I made the change as I'll have another round anyway to introduce
XENMEM_maximum_gpfn2 in the series.

Tamas
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


  1   2   >