Re: [Xen-devel] CPU emulation on Xen

2015-03-10 Thread Bunny Mintoo
Thanks for your mail.

My architecture is x86 compatible. I have just added new instructions on
top of the existing x86 ISA.
The CPU now has to be taught these new instructions. In QEMU, I was able
to mimic the new CPU with software changes (to both the kernel and the
QEMU source tree). What I am really looking for is a way to have Xen
emulate this new architecture so that I can carry out testing (and probably
compare performance results). The high-level idea is to tweak the Xen source
code to emulate my CPU, and to modify Dom0 appropriately.

Why do you say QEMU might be a better fit for you? Asking this out of
curiosity.
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] x86: synchronize PCI config space access decoding

2015-03-10 Thread Jan Beulich
>>> On 09.03.15 at 19:49,  wrote:
> On 09/03/15 16:08, Jan Beulich wrote:
>> Both PV and HVM logic have similar but not similar enough code here.
>> Synchronize the two so that
>> - in the HVM case we don't unconditionally try to access extended
>>   config space
>> - in the PV case we pass a correct range to the XSM hook
>> - in the PV case we don't needlessly deny access when the operation
>>   isn't really on PCI config space
>> All this along with sharing the macros HVM already had here.
>>
>> Signed-off-by: Jan Beulich 
>>
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2383,11 +2383,6 @@ void hvm_vcpu_down(struct vcpu *v)
>>  static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>  ioreq_t *p)
>>  {
>> -#define CF8_BDF(cf8) (((cf8) & 0x00ffff00) >> 8)
>> -#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fc)
>> -#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000) >> 16)
>> -#define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
>> -
>>  struct hvm_ioreq_server *s;
>>  uint32_t cf8;
>>  uint8_t type;
>> @@ -2416,9 +2411,19 @@ static struct hvm_ioreq_server *hvm_sele
>>  
>>  type = IOREQ_TYPE_PCI_CONFIG;
>>  addr = ((uint64_t)sbdf << 32) |
>> -   CF8_ADDR_HI(cf8) |
>> CF8_ADDR_LO(cf8) |
>> (p->addr & 3);
>> +/* AMD extended configuration space access? */
>> +if ( CF8_ADDR_HI(cf8) &&
>> + boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
>> + boot_cpu_data.x86 >= 0x10 && boot_cpu_data.x86 <= 0x17 )
>> +{
>> +uint64_t msr_val;
>> +
>> +if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
>> + (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
>> +addr |= CF8_ADDR_HI(cf8);
> 
> This is another example of host state which leaks into guests across
> migrate, but in this case is also problematic at the host level.

Yes, but cross-vendor migration has (iirc) many more issues like this
(and considering the wide family range the risk of this breaking for
migration between AMD systems seems marginal).

> As far as the host goes, MSR_AMD64_NB_CFG is a per-node msr and Xen
> should verify that the AMD64_NB_CFG_CF8_EXT_ENABLE_BIT is consistent
> across the system, or bits of emulate_privileged_op() are liable to
> execute differently depending on which pcpu a vcpu happens to be scheduled.

I think this goes too far in mistrusting Dom0.

> Beyond that, for now there should be a __read_mostly bool_t based on the
> system verification, which is used in preference to reading the MSR each
> time a guest does a cf8 access.

But it is part of the change to _not_ do the MSR access on each
CF8 one: We first check whether this at all looks like an extended
config space access. I.e. I considered eliminating the rdmsr, but
didn't consider it worthwhile for the change here.

Jan
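
For readers following the thread, the decoding under discussion can be
sketched in isolation. This is only an illustration under stated
assumptions, not the patch itself: the masks follow the conventional 0xCF8
layout (enable in bit 31, bus/device/function in bits 23:8, register in
bits 7:2, AMD extended register bits in 27:24), and `cf8_to_reg()` and its
`ext_enabled` parameter are hypothetical stand-ins for the vendor/family
check plus the NB_CFG bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Conventional 0xCF8 layout (assumed here for illustration). */
#define CF8_BDF(cf8)     (((cf8) & 0x00ffff00u) >> 8)
#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fcu)
#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000u) >> 16)
#define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000u))

/*
 * Hypothetical helper: compute the config space register offset for an
 * access to port 0xCFC + (port_off & 3).  The extended bits are honoured
 * only when the caller says extended access is enabled.
 */
static uint32_t cf8_to_reg(uint32_t cf8, unsigned int port_off,
                           bool ext_enabled)
{
    uint32_t reg;

    assert(CF8_ENABLED(cf8));

    reg = CF8_ADDR_LO(cf8) | (port_off & 3);
    if ( ext_enabled )
        reg |= CF8_ADDR_HI(cf8);

    return reg;
}
```

With extended access disabled, the bits in 27:24 are simply ignored, which
is the "don't unconditionally try to access extended config space" part of
the change.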




Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from

2015-03-10 Thread Jan Beulich
>>> On 09.03.15 at 20:02,  wrote:
> I agree that having the IO page tables on the NUMA node that is closest 
> to the IOMMU would be beneficial.

And I already withdrew this patch and the corresponding VT-d one.

> However, I am not sure at the moment 
> that this information could be easily determined. I think ACPI _PXM for 
> devices should be able to provide this information, but this is optional 
> and often not available.

And even if it was available, it would be too late at least for Dom0's
allocations (as it requires Dom0's interpreter to dig out this detail).
The best we could do in that case would be to try to replace the
existing tables. Or assume Dom0 is being placed suitably by the
dom0_nodes= option. Or add yet another option.

Jan




Re: [Xen-devel] [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 03:27,  wrote:
> Changes in v4:
> * Split cputopology and NUMA info changes into separate patches
> * Added patch#1 (partly because patch#4 needs to know when distance is
>   invalid, i.e. NUMA_NO_DISTANCE)
> * Split sysctl version update into a separate patch

Why?

Jan




Re: [Xen-devel] Failed to launch xen on J6 evm

2015-03-10 Thread M A Young
On Mon, 9 Mar 2015, Korupol, Naveen (EXT) wrote:

> Hi Ian
> 
> I am (awesome) glad to see your response.
> 
> 
> [IC] J6 EVM?
> [NK] is a TI (& Spectrum Digital) evaluation board built on OMAP 5 
> architecture which has 2 ARM Cortex-A15 core(s), 2 Cortex-M4 cores and SGX544 
> 3D graphics core(s).
> 
> [IC] enable early_printk
> [NK] now I have enabled early_printk, I see that processor is not in hyp mode 
> as you correctly envisioned.
> 
> Starting kernel ...
> 
> - UART enabled -
> - CPU  booting -
> - Xen must be entered in NS Hyp mode -
> - Boot failed -
> 
> I took my current U-Boot from:
>  Xen needs to be started in non-secure HYP mode. Use this U-Boot Git 
>  repository:
>  git clone git://github.com/jwrdegoede/u-boot-sunxi.git
> And compiled it for dra7xx_evm_config.
> 
> Is there any specific U-Boot for ARM with this HYP mode already enabled 
> correctly?
> If not, I would appreciate any links to help me update my current u-boot to 
> have this setting enabled.

You can get U-Boot from the U-Boot website at
http://www.denx.de/wiki/U-Boot
which might have HYP enabled for your board, but you are probably
going to have to edit the U-Boot code to enable it, for example by
comparing the settings with those for a board that does have HYP enabled.

Michael Young



Re: [Xen-devel] [PATCH 2/2] iommu: add rmrr Xen command line option for misc rmrrs

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 03:47,  wrote:
>>  From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com]
>> Format for rmrr Xen command line option:
>> rmrr=[sbdf]start<:end>,[sbdf]start:
> 
> how about sticking to rmrr structure, i.e. 
> 
> rmrr=start<:end>[sbdf1, sbdf2, ...], ...

+1

>> +if ( *s != ']' )
>> +return;
> 
> Better to print a warning message for malformed input.

Warning messages from command line argument parsing functions
(which this is to be moved into by making it a custom_param, as
requested by Andrew [which I support])  are at best marginally
useful, as they get issued before any console was set up.

Jan
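
As a rough illustration of the point about warnings: a parser of this kind
can report malformed input through its return value, so a later stage (once
a console exists) can decide what to say. This is a sketch under
assumptions, not the proposed code: `parse_rmrr_range()` and
`struct rmrr_range` are hypothetical names, only the `start<:end>` part is
handled, and the bracketed sbdf list is left to the caller because its
syntax was still being discussed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical container for one parsed range (hex values). */
struct rmrr_range {
    unsigned long long start, end;
};

/*
 * Parse "start[:end]" in hex.  Returns 0 on success and -EINVAL on
 * malformed input; no message is printed here.  On success *endp points
 * at the first character that was not consumed.
 */
static int parse_rmrr_range(const char *s, struct rmrr_range *r,
                            const char **endp)
{
    char *e;

    assert(s != NULL && r != NULL && endp != NULL);

    r->start = strtoull(s, &e, 16);
    if ( e == s )
        return -EINVAL;

    r->end = r->start;
    if ( *e == ':' )
    {
        const char *t = e + 1;

        r->end = strtoull(t, &e, 16);
        if ( e == t || r->end < r->start )
            return -EINVAL;
    }

    *endp = e;
    return 0;
}
```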




Re: [Xen-devel] [PATCH] EFI: Fix getting EFI variable list on some systems

2015-03-10 Thread Jan Beulich
>>> On 09.03.15 at 17:47,  wrote:
> Copy the entire output buffer to the guest because it may contain data beyond
> the output size that the firmware requires on a subsequent
> GetNextVariableName() call (e.g. a NULL character).
> 
> The spec requires that on each call, "the previous results" be passed in.
> 
> Without this change, the following (simplified) sequence would occur:
> GetNextVariableName: in \0, out AdminPw\0, size 7
> GetNextVariableName: in AdminPw\0, out UserPw\0, size 6
> GetNextVariableName: in UserPww\0, NOT FOUND

As such behavior is outside the specification, please name the system
needing this workaround: The runtime services function is documented
to update *VariableNameSize only upon EFI_BUFFER_TOO_SMALL. A
code comment would also seem to be in order, as otherwise people
like me might be tempted to undo this again, as it's sub-optimal code
for spec conforming firmware.

And to save me from having to do an incremental patch on top, you
may want to consider switching to __copy_to_user() at once.

Jan
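
The quoted failure sequence can be reproduced with a small standalone
model. This is a simplification for illustration only: it uses 8-bit
characters like the simplified sequence in the patch description (the real
interface uses 16-bit characters and byte counts), and `copy_back()` is a
hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of copying the firmware's result back into the
 * caller's name buffer, either only `reported_size` bytes or the whole
 * buffer.
 */
static void copy_back(char *guest, const char *fw_out, size_t reported_size,
                      size_t buf_size, int copy_whole_buffer)
{
    assert(reported_size <= buf_size);
    memcpy(guest, fw_out, copy_whole_buffer ? buf_size : reported_size);
}
```

If the caller's buffer still holds "AdminPw" and the firmware returns
"UserPw" with a reported size of 6, the partial copy leaves the stale
trailing "w" and terminator in place, giving "UserPww"; copying the whole
buffer carries the terminator across as well.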




[Xen-devel] [PATCH WIP v1 09/10] HACK: xen: arm: stop recursing with dom0 mappings once we've hit a ranges.

2015-03-10 Thread Ian Campbell
Probably better done with improvements to the DT/PCI parsing code, and it
doesn't seem to do any harm not to have this patch anyway.

Signed-off-by: Ian Campbell 
---
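Note for readers: the traversal idea can be sketched on a toy tree. This is
a minimal model under assumptions, not the code in this patch: `struct node`
and `walk_node()` are hypothetical, and unlike the hunk in this patch (which
starts `map_children` at true in every call) the sketch seeds it from `map`,
on the assumption that the intent is to skip the whole subtree below a
non-empty "ranges".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down device tree node. */
struct node {
    const char *name;
    size_t ranges_len;            /* 0: no "ranges", or an empty one */
    struct node *child, *sibling;
    bool mapped;                  /* set when the node would be mapped */
};

static void walk_node(struct node *n, bool map)
{
    bool map_children = map;      /* assumption: inherit the decision */
    struct node *c;

    assert(n != NULL);

    if ( map )
        n->mapped = true;

    /* Everything below a non-empty "ranges" is already covered. */
    if ( n->ranges_len )
        map_children = false;

    for ( c = n->child; c != NULL; c = c->sibling )
        walk_node(c, map_children);
}
```
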
 xen/arch/arm/domain_build.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index e754d37..ee27930 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1022,7 +1022,8 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
 }
 
 static int handle_node(struct domain *d, struct kernel_info *kinfo,
-   struct dt_device_node *node)
+   struct dt_device_node *node,
+   bool_t map)
 {
 static const struct dt_device_match skip_matches[] __initconst =
 {
@@ -1051,6 +1052,8 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
 int res;
 const char *name;
 const char *path;
+bool_t map_children = true;
+u32 ranges_len;
 
 path = dt_node_full_name(node);
 
@@ -1099,7 +1102,8 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
  *  property. Therefore these device doesn't need to be mapped. This
  *  solution can be use later for pass through.
  */
-if ( !dt_device_type_is_equal(node, "memory") &&
+if ( map &&
+ !dt_device_type_is_equal(node, "memory") &&
  dt_device_is_available(node) )
 {
 res = map_device(d, node);
@@ -1124,9 +1128,15 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
 if ( res )
 return res;
 
+/* Don't need to map anything below a node with a non-empty ranges property
+ * -- it's already covered and we may not know how to translate
+ * anyway. */
+if ( dt_get_property(node, "ranges", &ranges_len) != NULL && ranges_len )
+map_children = false;
+
 for ( child = node->child; child != NULL; child = child->sibling )
 {
-res = handle_node(d, kinfo, child);
+res = handle_node(d, kinfo, child, map_children);
 if ( res )
 return res;
 }
@@ -1177,7 +1187,7 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
 
 fdt_finish_reservemap(kinfo->fdt);
 
-ret = handle_node(d, kinfo, dt_host);
+ret = handle_node(d, kinfo, dt_host, true);
 if ( ret )
 goto err;
 
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 02/10] xen: arm: earlyprintk support for Nvidia Jetson

2015-03-10 Thread Ian Campbell
Signed-off-by: Ian Campbell 
---
 docs/misc/arm/early-printk.txt | 1 +
 xen/arch/arm/Rules.mk  | 5 +
 2 files changed, 6 insertions(+)

diff --git a/docs/misc/arm/early-printk.txt b/docs/misc/arm/early-printk.txt
index 1ca2a55..4354e2d 100644
--- a/docs/misc/arm/early-printk.txt
+++ b/docs/misc/arm/early-printk.txt
@@ -13,6 +13,7 @@ where mach is the name of the machine:
   - exynos5250: printk with the second UART
   - midway: printk with the pl011 on Calxeda Midway processors
   - fastmodel: printk on ARM Fastmodel software emulators
+  - jetson: printk on Nvidia Jetson TK1
   - omap5432: printk with UART3 on TI OMAP5432 processors
   - sun6i: printk with 8250 on Allwinner A31 processors
   - sun7i: printk with 8250 on Allwinner A20 processors
diff --git a/xen/arch/arm/Rules.mk b/xen/arch/arm/Rules.mk
index c7bd227..3fc8065 100644
--- a/xen/arch/arm/Rules.mk
+++ b/xen/arch/arm/Rules.mk
@@ -105,6 +105,11 @@ EARLY_PRINTK_INC := 8250
 EARLY_UART_BASE_ADDRESS := 0xE4007000
 EARLY_UART_REG_SHIFT := 2
 endif
+ifeq ($(CONFIG_EARLY_PRINTK), jetson)
+EARLY_PRINTK_INC := 8250
+EARLY_UART_BASE_ADDRESS := 0x70006300
+EARLY_UART_REG_SHIFT := 2
+endif
 ifeq ($(CONFIG_EARLY_PRINTK), seattle)
 EARLY_PRINTK_INC := pl011
 EARLY_UART_BASE_ADDRESS := 0xe101
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 08/10] HACK: xen: arm: Add mask, unmask and eoi platform hooks.

2015-03-10 Thread Ian Campbell
Not to be applied until better understood.

Linux has these hooks and uses them on Tegra.

However, I don't think they are strictly needed (only for power gating stuff,
perhaps?). I implemented them while investigating some other issues. They are
used by the later "Tegra hacking" patch, which isn't actually needed for a
working board, AFAICT.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/gic-v2.c  |  3 +++
 xen/arch/arm/platform.c| 16 
 xen/include/asm-arm/platform.h |  8 
 3 files changed, 27 insertions(+)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 2f5d33b..033a94a 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -537,6 +537,7 @@ static void gicv2_irq_enable(struct irq_desc *desc)
 clear_bit(_IRQ_DISABLED, &desc->status);
 dsb(sy);
 /* Enable routing */
+platform_irq_unmask(desc);
 writel_gicd((1u << (irq % 32)), GICD_ISENABLER + (irq / 32) * 4);
 spin_unlock_irqrestore(&gicv2.lock, flags);
 }
@@ -552,6 +553,7 @@ static void gicv2_irq_disable(struct irq_desc *desc)
 /* Disable routing */
 writel_gicd(1u << (irq % 32), GICD_ICENABLER + (irq / 32) * 4);
 set_bit(_IRQ_DISABLED, &desc->status);
+platform_irq_mask(desc);
 spin_unlock_irqrestore(&gicv2.lock, flags);
 }
 
@@ -574,6 +576,7 @@ static void gicv2_irq_ack(struct irq_desc *desc)
 
 static void gicv2_host_irq_end(struct irq_desc *desc)
 {
+platform_irq_eoi(desc);
 /* Lower the priority */
 gicv2_eoi_irq(desc);
 /* Deactivate */
diff --git a/xen/arch/arm/platform.c b/xen/arch/arm/platform.c
index c58e251..3255c6a 100644
--- a/xen/arch/arm/platform.c
+++ b/xen/arch/arm/platform.c
@@ -160,6 +160,22 @@ bool_t platform_device_is_blacklisted(const struct dt_device_node *node)
 return (dt_match_node(blacklist, node) != NULL);
 }
 
+void platform_irq_eoi(struct irq_desc *desc)
+{
+if ( platform && platform->irq_eoi )
+platform->irq_eoi(desc);
+}
+void platform_irq_mask(struct irq_desc *desc)
+{
+if ( platform && platform->irq_mask )
+platform->irq_mask(desc);
+}
+void platform_irq_unmask(struct irq_desc *desc)
+{
+if ( platform && platform->irq_unmask )
+platform->irq_unmask(desc);
+}
+
 void platform_route_irq_to_guest(struct domain *d, struct irq_desc *desc)
 {
 if ( platform && platform->route_irq_to_guest )
diff --git a/xen/include/asm-arm/platform.h b/xen/include/asm-arm/platform.h
index 22d1f8b..8b4c807 100644
--- a/xen/include/asm-arm/platform.h
+++ b/xen/include/asm-arm/platform.h
@@ -27,6 +27,11 @@ struct platform_desc {
 /* Platform power-off */
 void (*poweroff)(void);
 
+/* GIC hooks */
+void (*irq_eoi)(struct irq_desc *);
+void (*irq_mask)(struct irq_desc *);
+void (*irq_unmask)(struct irq_desc *);
+
 void (*route_irq_to_guest)(struct domain *d, struct irq_desc *);
 
 /*
@@ -72,6 +77,9 @@ bool_t platform_has_quirk(uint32_t quirk);
 bool_t platform_device_is_blacklisted(const struct dt_device_node *node);
 unsigned int platform_dom0_evtchn_ppi(void);
 void platform_dom0_gnttab(paddr_t *start, paddr_t *size);
+void platform_irq_eoi(struct irq_desc *);
+void platform_irq_mask(struct irq_desc *);
+void platform_irq_unmask(struct irq_desc *);
 
 void platform_route_irq_to_guest(struct domain *, struct irq_desc *);
 
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 06/10] HACK: xen: arm: Map other regions to dom0 after one fails

2015-03-10 Thread Ian Campbell
We can't seem to handle /pcie-controller@0,01003000/pci@1,0 and
/pcie-controller@0,01003000/pci@2,0. Perhaps better solved by DT/PCI series.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/domain_build.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9f1f59f..e754d37 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -985,8 +985,9 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
 res = dt_device_get_address(dev, i, &addr, &size);
 if ( res )
 {
-printk(XENLOG_ERR "Unable to retrieve address %u for %s\n",
-   i, dt_node_full_name(dev));
+printk(XENLOG_ERR "Unable to retrieve address %u for %s: %d\n",
+   i, dt_node_full_name(dev), res);
+continue;
 return res;
 }
 
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 05/10] WIP: xen: arm: initial platform support for Nvidia TK1

2015-03-10 Thread Ian Campbell
As used on the Jetson board.

This platform has a bunch of specific mappings and, more importantly, an
additional interrupt controller (which is used alongside the main GIC and
covers the same interrupts etc, so it is not a secondary or chained interrupt
controller) which dom0 really wants to poke at, I think for power gating
reasons. This is implemented as a whitelist derived from the set of interrupts
routed to dom0 (discovered by the new route_irq_to_guest platform hook).

Signed-off-by: Ian Campbell 
---
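Note for readers: the whitelist idea described above can be sketched
independently of the hardware. This is a minimal model under assumptions,
not the code in this patch: `whitelist_irq()` and `filter_read()` are
hypothetical names, the value 32 for NR_LOCAL_IRQS and the five banks of 32
interrupts are assumed, and the routing hook in the real patch may record
its state differently.

```c
#include <assert.h>
#include <stdint.h>

#define NR_LOCAL_IRQS 32   /* assumed: SGIs + PPIs, SPIs start here */
#define NR_ICTLR      5    /* assumed: five banks of 32 interrupts */

static uint32_t allow_dom0[NR_ICTLR];

/*
 * Record that SPI `irq` has been routed to dom0.
 * Returns 0 on success, -1 if the IRQ is outside the covered range.
 */
static int whitelist_irq(unsigned int irq)
{
    unsigned int spi;

    if ( irq < NR_LOCAL_IRQS || irq >= NR_LOCAL_IRQS + NR_ICTLR * 32 )
        return -1;

    spi = irq - NR_LOCAL_IRQS;
    allow_dom0[spi / 32] |= UINT32_C(1) << (spi % 32);
    return 0;
}

/* Filter a raw register value so dom0 only sees its own interrupts. */
static uint32_t filter_read(unsigned int ctlr, uint32_t raw)
{
    assert(ctlr < NR_ICTLR);
    return raw & allow_dom0[ctlr];
}
```
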
 xen/arch/arm/platforms/Makefile |   1 +
 xen/arch/arm/platforms/tegra.c  | 395 
 2 files changed, 396 insertions(+)
 create mode 100644 xen/arch/arm/platforms/tegra.c

diff --git a/xen/arch/arm/platforms/Makefile b/xen/arch/arm/platforms/Makefile
index e173fec..eb512ed 100644
--- a/xen/arch/arm/platforms/Makefile
+++ b/xen/arch/arm/platforms/Makefile
@@ -3,6 +3,7 @@ obj-$(CONFIG_ARM_32) += brcm.o
 obj-$(CONFIG_ARM_32) += exynos5.o
 obj-$(CONFIG_ARM_32) += midway.o
 obj-$(CONFIG_ARM_32) += omap5.o
+obj-$(CONFIG_ARM_32) += tegra.o
 obj-$(CONFIG_ARM_32) += sunxi.o
 obj-$(CONFIG_ARM_32) += rcar2.o
 obj-$(CONFIG_ARM_64) += seattle.o
diff --git a/xen/arch/arm/platforms/tegra.c b/xen/arch/arm/platforms/tegra.c
new file mode 100644
index 0000000..189ef44
--- /dev/null
+++ b/xen/arch/arm/platforms/tegra.c
@@ -0,0 +1,395 @@
+/*
+ * xen/arch/arm/platforms/tegra.c
+ *
+ * Nvidia Tegra specific settings
+ *
+ * Ian Campbell
+ * Copyright (c) 2014 Citrix Systems
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define ICTLR_BASE 0x60004000
+#define ICTLR_SIZE 0x1000
+
+#define ICTLR_CPU_IEP_VFIQ 0x08
+#define ICTLR_CPU_IEP_FIR  0x14
+#define ICTLR_CPU_IEP_FIR_SET  0x18
+#define ICTLR_CPU_IEP_FIR_CLR  0x1c
+
+#define ICTLR_CPU_IER  0x20
+#define ICTLR_CPU_IER_SET  0x24
+#define ICTLR_CPU_IER_CLR  0x28
+#define ICTLR_CPU_IEP_CLASS0x2C
+
+#define ICTLR_COP_IER  0x30
+#define ICTLR_COP_IER_SET  0x34
+#define ICTLR_COP_IER_CLR  0x38
+#define ICTLR_COP_IEP_CLASS0x3c
+
+static void __iomem *ictlr;
+
+struct {
+uint32_t allow_dom0;
+} ictlr_info[5] = {
+[0] = { 0x0 },
+[1] = { 0x0 },
+[2] = { 0x0 },
+[3] = { 0x0 },
+[4] = { 0x0 },
+};
+
+static int ictlr_read(struct vcpu *v, mmio_info_t *info)
+{
+struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+register_t *r = select_user_reg(regs, dabt.reg);
+uint32_t offs = info->gpa - ICTLR_BASE;
+int ctlrnr = offs >> 8;
+int reg = offs & 0xff;
+
+uint32_t val;
+
+if ( offs > 0x4ff )
+{
+printk("UNHANDLED READ FROM %"PRIpaddr"\n", info->gpa);
+domain_crash_synchronous();
+}
+if ( offs & 0x3 )
+{
+printk("MISALIGNED READ FROM %"PRIpaddr"\n", info->gpa);
+domain_crash_synchronous();
+}
+if ( dabt.size != DABT_WORD )
+{
+printk("NON-WORD READ FROM %"PRIpaddr"\n", info->gpa);
+domain_crash_synchronous();
+}
+
+switch ( reg ) {
+/* Read only */
+case 0x00 ... 0x14:
+case 0x20:
+case 0x30:
+case 0x60 ... 0x68:
+case 0x78 ... 0x80:
+case 0x90 ... 0x98:
+/* Read/write */
+case 0x2C:
+case 0x3C:
+case 0x74:
+case 0x8C:
+case 0xA4:
+val = readl(ictlr + offs);
+*r = val & ictlr_info[ctlrnr].allow_dom0;
+if ( val != *r )
+printk("TEGRA: ICTLR%d READ %x INTO r%d=%08"PRIregister" (%08"PRIregister")\n",
+   ctlrnr+1, reg, dabt.reg, *r, val);
+return 1;
+/* Write only */
+case 0x18 ... 0x1c:
+case 0x24 ... 0x28:
+case 0x34 ... 0x38:
+case 0x6C ... 0x70:
+case 0x84 ... 0x88:
+case 0x9C ... 0xA0:
+printk("READ FROM WO %"PRIpaddr"\n", info->gpa);
+domain_crash_synchronous();
+break;
+case 0xa8 ... 0xff:
+printk("READ FROM NON-EXISTENT %"PRIpaddr"\n", info->gpa);
+domain_crash_synchronous();
+break;
+default:
+BUG();
+}
+}
+
+static int ictlr_write(struct vcpu *v, mmio_info_t *info)
+{
+struct hsr_dabt dabt = info->dabt;
+struct cpu_user_regs *regs = guest_cpu_user_regs();
+register_t *r = select_user_reg(regs, dabt.reg);
+uint32_t offs = info->gpa - ICTLR_BASE;
+int ctlrnr = offs >> 8;
+int reg = offs & 0xff;
+
+uint32_t val = *r;
+
+if ( offs > 0x4ff )
+{
+p

[Xen-devel] [PATCH WIP v1 04/10] WIP: xen: ns16550: Add nvidia, tegra20-uart to DT compatible list

2015-03-10 Thread Ian Campbell
DO NOT APPLY. Doesn't work without sync_console *and* the changes from "HACK:
xen: arm: trying to figure out ns16550 vs. Tegra issue".

I suspect an IRQ or FIFO depth issue. Linux has a separate serial-tegra driver
so perhaps the device is just not as compatible as I think.

Signed-off-by: Ian Campbell 
---
 xen/drivers/char/ns16550.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index d443880..44045d7 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -1189,6 +1189,7 @@ static const struct dt_device_match ns16550_dt_match[] __initconst =
 {
 DT_MATCH_COMPATIBLE("ns16550"),
 DT_MATCH_COMPATIBLE("ns16550a"),
+DT_MATCH_COMPATIBLE("nvidia,tegra20-uart"),
 DT_MATCH_COMPATIBLE("snps,dw-apb-uart"),
 { /* sentinel */ },
 };
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 01/10] xen: arm: Add debug keyhandler to dump the physical GIC state.

2015-03-10 Thread Ian Campbell
Rename the existing gic_dump_info to gic_dump_info_guest to reduce confusion.

Signed-off-by: Ian Campbell 
---
v2: s/gic_dump_info/gic_dump_info_guest/
---
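Note for readers: the index arithmetic used by the dump routine can be
sketched on its own. This is an illustration under assumptions, not part of
the patch: `classify_irq()`, `bank_word_offset()` and `bank_bit()` are
hypothetical helper names for the SGI/PPI/SPI split and for locating an
IRQ's bit in a one-bit-per-IRQ register bank such as the enable, pending or
active registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum irq_type { IRQ_TYPE_SGI, IRQ_TYPE_PPI, IRQ_TYPE_SPI };

/* Classify an IRQ number and return its index within that class. */
static enum irq_type classify_irq(unsigned int irq, unsigned int *type_nr)
{
    assert(type_nr != NULL);

    if ( irq < 16 )
    {
        *type_nr = irq;
        return IRQ_TYPE_SGI;
    }
    if ( irq < 32 )
    {
        *type_nr = irq - 16;
        return IRQ_TYPE_PPI;
    }
    *type_nr = irq - 32;
    return IRQ_TYPE_SPI;
}

/* Byte offset of the 32-bit word holding `irq` within a register bank. */
static unsigned int bank_word_offset(unsigned int irq)
{
    return (irq / 32) * 4;
}

/* Bit for `irq` within that word. */
static uint32_t bank_bit(unsigned int irq)
{
    return UINT32_C(1) << (irq % 32);
}
```
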
 xen/arch/arm/domain.c |  2 +-
 xen/arch/arm/gic-v2.c | 66 ++-
 xen/arch/arm/gic-v3.c |  5 ++--
 xen/arch/arm/gic.c| 25 --
 xen/include/asm-arm/gic.h |  6 +++--
 5 files changed, 96 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index fdba081..bec1082 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -803,7 +803,7 @@ long arch_do_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
 
 void arch_dump_vcpu_info(struct vcpu *v)
 {
-gic_dump_info(v);
+gic_dump_info_guest(v);
 }
 
 void vcpu_mark_events_pending(struct vcpu *v)
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index 20cdbc9..2f5d33b 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -94,6 +94,11 @@ static inline void writel_gicd(uint32_t val, unsigned int offset)
 writel_relaxed(val, gicv2.map_dbase + offset);
 }
 
+static inline uint8_t readb_gicd(unsigned int offset)
+{
+return readb_relaxed(gicv2.map_dbase + offset);
+}
+
 static inline uint32_t readl_gicd(unsigned int offset)
 {
 return readl_relaxed(gicv2.map_dbase + offset);
@@ -168,7 +173,7 @@ static void gicv2_restore_state(const struct vcpu *v)
 writel_gich(GICH_HCR_EN, GICH_HCR);
 }
 
-static void gicv2_dump_state(const struct vcpu *v)
+static void gicv2_dump_state_guest(const struct vcpu *v)
 {
 int i;
 
@@ -651,6 +656,64 @@ static int gicv2_make_dt_node(const struct domain *d,
 return res;
 }
 
+static void gicv2_dump_state(void)
+{
+int irq;
+
+for ( irq = 0; irq < gicv2_info.nr_lines; irq++ )
+{
+const char *type;
+int type_nr, enable, pend, active, priority, target;
+struct irq_desc *desc = irq_to_desc(irq);
+uint32_t wordreg;
+
+target = readb_gicd(GICD_ITARGETSR + irq);
+priority = readb_gicd(GICD_IPRIORITYR + irq);
+
+switch ( irq )
+{
+case 0 ... 15:
+type = "SGI";
+type_nr = irq;
+target = 0x00; /* these are per-CPU */
+break;
+case 16 ... 31:
+type = "PPI";
+type_nr = irq - 16;
+break;
+default:
+type = "SPI";
+type_nr = irq - 32;
+break;
+}
+
+wordreg = readl_gicd(GICD_ISENABLER + (irq / 32) * 4);
+enable = !!(wordreg & (1u << (irq % 32)));
+wordreg = readl_gicd(GICD_ISPENDR + (irq / 32) * 4);
+pend = !!(wordreg & (1u << (irq % 32)));
+wordreg = readl_gicd(GICD_ISACTIVER + (irq / 32) * 4);
+active = !!(wordreg & (1u << (irq % 32)));
+
+printk("IRQ%03d %s%03d: %c%c%c pri:%02x tgt:%02x ",
+   irq, type, type_nr,
+   enable ? 'e' : '-',
+   pend   ? 'p' : '-',
+   active ? 'a' : '-',
+   priority, target);
+
+if ( desc->status & IRQ_GUEST )
+{
+struct domain *d = desc->action->dev_id;
+printk("dom%d %s", d->domain_id, desc->action->name);
+}
+else
+{
+printk("Xen");
+}
+printk("\n");
+}
+}
+
 /* XXX different for level vs edge */
 static hw_irq_controller gicv2_host_irq_type = {
 .typename = "gic-v2",
@@ -680,6 +743,7 @@ const static struct gic_hw_operations gicv2_ops = {
 .save_state  = gicv2_save_state,
 .restore_state   = gicv2_restore_state,
 .dump_state  = gicv2_dump_state,
+.dump_state_guest= gicv2_dump_state_guest,
 .gicv_setup  = gicv2v_setup,
 .gic_host_irq_type   = &gicv2_host_irq_type,
 .gic_guest_irq_type  = &gicv2_guest_irq_type,
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index ab80670..4516304 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -394,7 +394,7 @@ static void gicv3_restore_state(const struct vcpu *v)
 dsb(sy);
 }
 
-static void gicv3_dump_state(const struct vcpu *v)
+static void gicv3_dump_state_guest(const struct vcpu *v)
 {
 int i;
 
@@ -1177,7 +1177,8 @@ static const struct gic_hw_operations gicv3_ops = {
 .info= &gicv3_info,
 .save_state  = gicv3_save_state,
 .restore_state   = gicv3_restore_state,
-.dump_state  = gicv3_dump_state,
+.dump_state  = NULL,
+.dump_state_guest= gicv3_dump_state_guest,
 .gicv_setup  = gicv_v3_init,
 .gic_host_irq_type   = &gicv3_host_irq_type,
 .gic_guest_irq_type  = &gicv3_guest_irq_type,
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 390c8b0..6c5581b 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -163,6 +164,24 @@ 

[Xen-devel] [PATCH WIP v1 10/10] Tegra hacking.

2015-03-10 Thread Ian Campbell
Works without all this stuff.
---
 xen/arch/arm/platforms/tegra.c | 68 +-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/platforms/tegra.c b/xen/arch/arm/platforms/tegra.c
index 189ef44..5ec9dda 100644
--- a/xen/arch/arm/platforms/tegra.c
+++ b/xen/arch/arm/platforms/tegra.c
@@ -191,6 +191,45 @@ static struct mmio_handler_ops tegra_mmio_ictlr = {
 .write_handler = ictlr_write,
 };
 
+static inline void tegra_irq_write_mask(unsigned int irq, unsigned long reg)
+{
+void __iomem *base;
+u32 mask;
+
+BUG_ON(irq < NR_LOCAL_IRQS ||
+   irq >= NR_LOCAL_IRQS + ARRAY_SIZE(ictlr_info) * 32);
+
+irq -= NR_LOCAL_IRQS;
+base = ictlr + 0x100 * (irq / 32);
+mask = BIT(irq % 32);
+
+writel(mask, base + reg);
+}
+
+static void tegra_irq_eoi(struct irq_desc *desc)
+{
+int irq = desc->irq;
+if ( irq < NR_LOCAL_IRQS )
+return;
+//tegra_irq_write_mask(irq, ICTLR_CPU_IEP_FIR_CLR);
+}
+
+static void tegra_irq_mask(struct irq_desc *desc)
+{
+int irq = desc->irq;
+if ( irq < NR_LOCAL_IRQS )
+return;
+tegra_irq_write_mask(irq, ICTLR_CPU_IER_CLR);
+}
+
+static void tegra_irq_unmask(struct irq_desc *desc)
+{
+int irq = desc->irq;
+if ( irq < NR_LOCAL_IRQS )
+return;
+tegra_irq_write_mask(irq, ICTLR_CPU_IER_SET);
+}
+
 static void tegra_route_irq_to_guest(struct domain *d, struct irq_desc *desc)
 {
 int irq = desc->irq;
@@ -256,7 +295,7 @@ static int map_one_spi(struct domain *d, const char *what,
  */
 static int tegra_specific_mapping(struct domain *d)
 {
-int ret;
+int ret/*, i*/;
 
 ret = map_one_mmio(d, "IRAM", paddr_to_pfn(0x4000),
   paddr_to_pfn(0x4004));
@@ -312,6 +351,29 @@ static int tegra_specific_mapping(struct domain *d)
 if ( ret )
 goto err;
 
+#if 0
+ret = map_one_spi(d, "PCI INTx", 98, DT_IRQ_TYPE_LEVEL_HIGH);
+if ( ret )
+goto err;
+
+ret = map_one_spi(d, "PCI MSI", 99, DT_IRQ_TYPE_LEVEL_HIGH);
+if ( ret )
+goto err;
+
+for ( i = 104 ; i < 119 ; i++ )
+{
+ret = map_one_spi(d, "AHB DMA", i, DT_IRQ_TYPE_LEVEL_HIGH);
+if ( ret )
+goto err;
+}
+for ( i = 128 ; i < 143 ; i++ )
+{
+ret = map_one_spi(d, "AHB DMA", i, DT_IRQ_TYPE_LEVEL_HIGH);
+if ( ret )
+goto err;
+}
+#endif
+
 register_mmio_handler(d, &tegra_mmio_ictlr, ICTLR_BASE, ICTLR_SIZE);
 
 ret = 0;
@@ -379,6 +441,10 @@ PLATFORM_START(tegra, "TEGRA124")
 .reset = tegra_reset,
 .specific_mapping = tegra_specific_mapping,
 
+.irq_eoi = tegra_irq_eoi,
+.irq_mask = tegra_irq_mask,
+.irq_unmask = tegra_irq_unmask,
+
 .route_irq_to_guest = tegra_route_irq_to_guest,
 
 .dom0_gnttab_start = 0x6800,
-- 
2.1.4




[Xen-devel] [PATCH WIP v1 03/10] xen: arm: add platform hook for routing IRQ to guests

2015-03-10 Thread Ian Campbell
Tegra contains a secondary set of IRQ registers which dom0 wants to poke at;
the new hook will be used for that.

Signed-off-by: Ian Campbell 
---
 xen/arch/arm/irq.c | 4 
 xen/arch/arm/platform.c| 6 ++
 xen/include/asm-arm/platform.h | 5 +
 3 files changed, 15 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index cb9c99b..c574b92 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -27,6 +27,7 @@
 
 #include 
 #include 
+#include 
 
 static unsigned int local_irqs_type[NR_LOCAL_IRQS];
 static DEFINE_SPINLOCK(local_irqs_type_lock);
@@ -423,6 +424,9 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
 
 gic_route_irq_to_guest(d, desc, cpumask_of(smp_processor_id()),
GIC_PRI_IRQ);
+
+platform_route_irq_to_guest(d, desc);
+
 spin_unlock_irqrestore(&desc->lock, flags);
 return 0;
 
diff --git a/xen/arch/arm/platform.c b/xen/arch/arm/platform.c
index 86daf2b..c58e251 100644
--- a/xen/arch/arm/platform.c
+++ b/xen/arch/arm/platform.c
@@ -160,6 +160,12 @@ bool_t platform_device_is_blacklisted(const struct dt_device_node *node)
 return (dt_match_node(blacklist, node) != NULL);
 }
 
+void platform_route_irq_to_guest(struct domain *d, struct irq_desc *desc)
+{
+if ( platform && platform->route_irq_to_guest )
+platform->route_irq_to_guest(d, desc);
+}
+
 void platform_dom0_gnttab(paddr_t *start, paddr_t *size)
 {
 if ( platform && platform->dom0_gnttab_size )
diff --git a/xen/include/asm-arm/platform.h b/xen/include/asm-arm/platform.h
index 4eba37b..22d1f8b 100644
--- a/xen/include/asm-arm/platform.h
+++ b/xen/include/asm-arm/platform.h
@@ -26,6 +26,9 @@ struct platform_desc {
 void (*reset)(void);
 /* Platform power-off */
 void (*poweroff)(void);
+
+void (*route_irq_to_guest)(struct domain *d, struct irq_desc *);
+
 /*
  * Platform quirks
  * Defined has a function because a platform can support multiple
@@ -70,6 +73,8 @@ bool_t platform_device_is_blacklisted(const struct dt_device_node *node);
 unsigned int platform_dom0_evtchn_ppi(void);
 void platform_dom0_gnttab(paddr_t *start, paddr_t *size);
 
+void platform_route_irq_to_guest(struct domain *, struct irq_desc *);
+
 #define PLATFORM_START(_name, _namestr) \
 static const struct platform_desc  __plat_desc_##_name __used   \
 __attribute__((__section__(".arch.info"))) = {  \
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH WIP v1 07/10] HACK: xen: arm: trying to figure out ns16550 vs. Tegra issue

2015-03-10 Thread Ian Campbell
Messing with the FIFO depths and trigger levels seems to help, as does messing
with the interrupt enable state at various points. The latter might be better
handled via the start/stop_tx hooks which were added a while ago.

Signed-off-by: Ian Campbell 
---
 xen/drivers/char/ns16550.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c
index 44045d7..3b32f40 100644
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -401,7 +401,14 @@ static void ns16550_interrupt(
 {
 char lsr = ns_read_reg(uart, UART_LSR);
 if ( (lsr & uart->lsr_mask) == uart->lsr_mask )
+{
 serial_tx_interrupt(port, regs);
+if ( port->txbufc == port->txbufp ) {
+u8 reg;
+reg = ns_read_reg(uart, UART_IER);
+ns_write_reg(uart, UART_IER, reg & (~UART_IER_ETHREI));
+}
+}
 if ( lsr & UART_LSR_DR )
 serial_rx_interrupt(port, regs);
 }
@@ -450,6 +457,13 @@ static int ns16550_tx_ready(struct serial_port *port)
 if ( ns16550_ioport_invalid(uart) )
 return -EIO;
 
+if ( 1 )
+{
+u8 reg;
+reg = ns_read_reg(uart, UART_IER);
+ns_write_reg(uart, UART_IER, reg | UART_IER_ETHREI);
+}
+
 return ( (ns_read_reg(uart, UART_LSR) &
   uart->lsr_mask ) == uart->lsr_mask ) ? uart->fifo_size : 0;
 }
@@ -539,7 +553,7 @@ static void ns16550_setup_preirq(struct ns16550 *uart)
 
 /* Enable and clear the FIFOs. Set a large trigger threshold. */
 ns_write_reg(uart, UART_FCR,
- UART_FCR_ENABLE | UART_FCR_CLRX | UART_FCR_CLTX | UART_FCR_TRG14);
+ UART_FCR_ENABLE | UART_FCR_CLRX | UART_FCR_CLTX | UART_FCR_TRG1/*UART_FCR_TRG14*/);
 }
 
 static void __init ns16550_init_preirq(struct serial_port *port)
-- 
2.1.4


[Xen-devel] [PATCH WIP 0/10] *hacky* basic support for Nvidia Tegra K1/Jetson board (probably ignore)

2015-03-10 Thread Ian Campbell
There is a lot of hacky and WIP stuff here, including a serial driver
issue (FIFO depth? Interrupts?) which I haven't gotten to the bottom of
yet. But with these hacks and sync_console I can boot on the platform
and run guests etc. 

Actually only up to and including patch #7 "HACK: xen: arm: trying to
figure out ns16550 vs. Tegra issue" are needed for a functioning system.

I'm not sure when I'll be able to dig properly into the issues, so I'm
sending it out now. Nothing here is to be applied, except possibly "xen:
arm: Add debug keyhandler to dump the physical GIC state." but even that
needs more cleanup I think (in particular it only handles GICv2).

I've deliberately not CC-d anyone on the patches themselves so as not to
spam you with half complete rubbish.

The following changes since commit f0ffd6032f679ec4b9a39d526cdbcdaf692e2f03:

  netif.h: describe request/response structures in terms of binary layout (2015-03-03 11:26:24 +)

are available in the git repository at:

  git://xenbits.xen.org/people/ianc/xen.git tegra-tk1-jetson-v1

for you to fetch changes up to c78d51660446d33dac4bb07c3c17e1d14d62ebc2:

  Tegra hacking. (2015-03-10 08:55:00 +)


Ian Campbell (10):
  xen: arm: Add debug keyhandler to dump the physical GIC state.
  xen: arm: earlyprintk support for Nvidia Jetson
  xen: arm: add platform hook for routing IRQ to guests
  WIP: xen: ns16550: Add nvidia,tegra20-uart to DT compatible list
  WIP: xen: arm: intial platform support for Nvidia TK1
  HACK: xen: arm: Map other regions to dom0 after one fails
  HACK: xen: arm: trying to figure out ns16550 vs. Tegra issue
  HACK: xen: arm: Add mask, unmask and eoi platform hooks.
  HACK: xen: arm: stop recursing with dom0 mappings once we've hit a ranges.
  Tegra hacking.

 docs/misc/arm/early-printk.txt  |   1 +
 xen/arch/arm/Rules.mk   |   5 +++
 xen/arch/arm/domain.c   |   2 +-
 xen/arch/arm/domain_build.c |  23 +++---
 xen/arch/arm/gic-v2.c   |  69 +-
 xen/arch/arm/gic-v3.c   |   5 ++-
 xen/arch/arm/gic.c  |  25 ++-
 xen/arch/arm/irq.c  |   4 ++
 xen/arch/arm/platform.c |  22 ++
 xen/arch/arm/platforms/Makefile |   1 +
 xen/arch/arm/platforms/tegra.c  | 461 +++
 xen/drivers/char/ns16550.c  |  17 +++-
 xen/include/asm-arm/gic.h   |   6 ++-
 xen/include/asm-arm/platform.h  |  13 ++
 14 files changed, 639 insertions(+), 15 deletions(-)
 create mode 100644 xen/arch/arm/platforms/tegra.c



Ian.


Re: [Xen-devel] [PATCH] tools/libxl: cleanup one libxl__calloc() usage

2015-03-10 Thread Wei Liu
On Tue, Mar 10, 2015 at 02:28:16PM +0800, Tiejun Chen wrote:
> It's pointless because internally, libxl__calloc() would always
> terminate program execution if it failed,
> 
> libxl__calloc()
> |
> + void *ptr = calloc(nmemb, size);
> + if (!ptr) libxl__alloc_failed(CTX, __func__, nmemb, size);
> |
> + _exit(-1);
> 
> Signed-off-by: Tiejun Chen 

Acked-by: Wei Liu 

> ---
>  tools/libxl/libxl_dm.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 8599a6a..cb006df 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -1175,10 +1175,6 @@ static void spawn_stub_launch_dm(libxl__egc *egc,
>  num_console++;
>  
>  console = libxl__calloc(gc, num_console, sizeof(libxl__device_console));
> -if (!console) {
> -ret = ERROR_NOMEM;
> -goto out;
> -}
>  
>  for (i = 0; i < num_console; i++) {
>  libxl__device device;
> -- 
> 1.9.1
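
The reasoning in the commit message can be sketched in a few lines. Note that `xcalloc_or_die` and `make_counters` below are hypothetical names, not libxl functions; the sketch only mirrors the behaviour described above (the allocator terminates the process on failure), which is why a NULL check at the call site is dead code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for an allocator that never returns NULL:
 * on failure it reports the error and terminates the process.
 * Assumes a nonzero request, since calloc(0, ...) may return NULL. */
static void *xcalloc_or_die(size_t nmemb, size_t size)
{
    void *ptr;

    assert(nmemb > 0 && size > 0);
    ptr = calloc(nmemb, size);
    if (!ptr) {
        fprintf(stderr, "allocation of %zu x %zu bytes failed\n",
                nmemb, size);
        _exit(-1);
    }
    return ptr;
}

/* Caller: no NULL check needed, the pointer is always valid here. */
static int *make_counters(size_t n)
{
    return xcalloc_or_die(n, sizeof(int));
}
```

With an allocator of this shape, any `if (!ptr)` branch after the call can never be taken, so removing it does not change behaviour.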

Re: [Xen-devel] [PATCH OSSTEST] Arrange for core dumps to be placed in /var/core and collect them

2015-03-10 Thread Ian Campbell
On Mon, 2015-03-09 at 17:54 -0400, Don Slutz wrote:
> 
> On 03/09/15 11:40, Ian Campbell wrote:
> > Refactor the $kvp_replace helper in ts-xen-install into a generic
> > helper (which requires using ::EO and ::EI for namespacing) for use
> > with target_editfile and use it to edit /etc/sysctl.conf to set
> > kernel.core_pattern on boot.
> > 
> > Tested in standalone mode by installing and running a C program
> > containing "*(int *)0 = 1;" which, after running "ulimit -c unlimited"
> > produces the expected core file. ts-logs-capture when run in
> > standalone mode then picks them up.
> > 
> > I've not yet figured out how to make the desired rlimit take effect
> > for all processes (including e.g. daemons spawned on boot). Likely
> > this will involve some combination of pam_limits.so PAM module and
> > adding explicit ulimit calls to the initscripts which we care about
> > (primarily xencommons and libvirt initscripts).
> 
> I am not sure about debian, but for fedora the places are:

Thanks, these look broadly similar to Debian.

> /etc/security/limits.conf:
> *  soft  core  unlimited

FYI "*" explicitly excludes root, at least on Debian, so a separate line
would be needed for root logins.

> /etc/profile:
> ulimit -S -c unlimited > /dev/null 2>&1
> 
> /etc/sysctl.conf
> fs.suid_dumpable = 1

Not 100% sure if we would need this in the context of osstest.

> /etc/sysconfig/init:
> DAEMON_COREFILE_LIMIT='unlimited'
> 
> Note: The last depends on:
>   /etc/init.d/functions:
>   ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0} >/dev/null 2>&1

I think this one is a Red Hat-ism.

> Hope this helps.

It did, thanks.
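
For daemons that cannot easily be reached via limits.conf or an initscript, one option is for the process to raise its own limit. The following is a minimal sketch under that assumption; `raise_core_limit` is a hypothetical helper, not part of osstest, and it relies only on the POSIX getrlimit()/setrlimit() calls.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for the
 * calling process (children inherit it). This is the in-process
 * counterpart of running "ulimit -S -c" with the hard limit value.
 * Returns 0 on success, -1 on error. */
static int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

This only helps when the hard limit is already large enough; raising the hard limit itself still needs privileges or one of the configuration files listed above.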

>-Don Slutz
> 
> > 
> > I did debate making the presence of cores in /var/core fail the test
> > (somehow), but I decided that would be annoying for standalone mode or
> > shared host scenarios where the core files might be stale or related
> > to another job.
> > 
> > Signed-off-by: Ian Campbell 
> > ---
> >  Osstest/TestSupport.pm | 22 ++
> >  ts-host-install|  9 +
> >  ts-logs-capture|  2 ++
> >  ts-xen-install | 19 ++-
> >  4 files changed, 35 insertions(+), 17 deletions(-)
> > 
> > diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
> > index 8754e22..ece2282 100644
> > --- a/Osstest/TestSupport.pm
> > +++ b/Osstest/TestSupport.pm
> > @@ -57,6 +57,7 @@ BEGIN {
> >target_put_guest_image target_editfile
> >target_editfile_cancel
> >target_editfile_root target_file_exists
> > +  target_editfile_kvp_replace
> >target_run_apt
> >target_install_packages target_install_packages_norec
> >target_jobdir target_extract_jobdistpath_subdir
> > @@ -542,6 +543,27 @@ sub teditfileex {
> > if $install;
> >  }
> >  
> > +# Replace a Key=Value style line in a config file.
> > +#
> > +# To be used as 3rd argument to target_editfile(_root) as:
> > +#target_editfile_root($ho, "/path/to/a/file",
> > +#   sub { target_editfile_kvp_replace($key, $value) });
> > +sub target_editfile_kvp_replace ($$)
> > +{
> > +my ($key,$value) = @_;
> > +my $prnow;
> > +$prnow= sub {
> > +   print ::EO "$key=$value\n" or die $!;
> > +   $prnow= sub { };
> > +};
> > +while (<::EI>) {
> > +   print ::EO or die $! unless m/^$key\b/;
> > +   $prnow->() if m/^#$key/;
> > +}
> > +print ::EO "\n" or die $!;
> > +$prnow->();
> > +};
> > +
> >  sub target_editfile_root ($$$;$$) { teditfileex('root',@_); }
> >  sub target_editfile  ($$$;$$) { teditfileex('osstest',@_); }
> >  # my $code= pop @_;
> > diff --git a/ts-host-install b/ts-host-install
> > index 9656079..b60abae 100755
> > --- a/ts-host-install
> > +++ b/ts-host-install
> > @@ -139,6 +139,15 @@ END
> > });
> >  }
> >  
> > +target_cmd_root($ho, 'mkdir -p /var/core');
> > +
> > +target_editfile_root($ho, '/etc/sysctl.conf',
> > +   sub { target_editfile_kvp_replace(
> > + "kernel.core_pattern",
> > + # %p==pid,%e==executable name,%t==timestamp
> > + "/var/core/%t.%p.%e.core") });
> > +target_cmd_root($ho, "sysctl --load /etc/sysctl.conf");
> > +
> >  target_cmd_root($ho, "update-rc.d osstest-confirm-booted start 99 2 .");
> >  
> >  logm('OK: install completed');
> > diff --git a/ts-logs-capture b/ts-logs-capture
> > index 453b03d..45b0a38 100755
> > --- a/ts-logs-capture
> > +++ b/ts-logs-capture
> > @@ -136,6 +136,8 @@ sub fetch_logs_host_guests () {
> >  
> >/home/osstest/osstest-confirm-booted.log
> >  
> > +  /var/core/*.core
> > +
> >)];
> >  if (!try_fetch_logs($ho, $logs)) {
> >  logm("log fetching failed, trying hard host reboot...");
> > diff --git a/ts-xen-install b/ts-xen-install
> > index 5282f0a..da64a90 100755
> > --- a/ts-xen-install
> > ++

[Xen-devel] [v2][PATCH 0/2] libxl: try to support IGD passthrough for qemu upstream

2015-03-10 Thread Tiejun Chen
v2:

* Refine patch #2's head description
* Improve code quality in patch #1 based on Wei's comments
* Rework the summary in patch #0 based on Konrad's and Wei's suggestions

When we're working to support IGD GFX passthrough with qemu
upstream, instead of "-gfx_passthru" we'd like to make that
a machine option, "-machine xxx,igd-passthru=on".

https://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg02050.html

This requires a change on the tools side.

After a discussion with Campbell, we'd like to construct a table recording
all IGD devices we can support. If the device is found in that table, we
pass that option. We also introduce a new field, 'gfx_passthru_kind', which
works together with 'gfx_passthru' to cover all scenarios as follows:

gfx_passthru = 0     => sets build_info.u.gfx_passthru to false
gfx_passthru = 1     => sets build_info.u.gfx_passthru to true and
                        build_info.u.gfx_passthru_kind to DEFAULT
gfx_passthru = "igd" => sets build_info.u.gfx_passthru to false
                        and build_info.u.gfx_passthru_kind to IGD

Note also that the "-gfx_passthru" option was only introduced for
qemu-xen-traditional, so we should remove it from
libxl__build_device_model_args_new() in the qemu upstream case.


Tiejun Chen (2):
  libxl: introduce libxl__is_igd_vga_passthru
  libxl: introduce gfx_passthru_kind

 tools/libxl/libxl_dm.c   |  15 -
 tools/libxl/libxl_internal.h |   2 +
 tools/libxl/libxl_pci.c  | 124 +++
 tools/libxl/libxl_types.idl  |   6 ++
 tools/libxl/xl_cmdimpl.c |  22 ++-
 5 files changed, 164 insertions(+), 5 deletions(-)

Thanks
Tiejun

[Xen-devel] [v2][PATCH 2/2] libxl: introduce gfx_passthru_kind

2015-03-10 Thread Tiejun Chen
Although we already have 'gfx_passthru' in b_info, it doesn't suffice
once we want to handle IGD specifically. We now define a new field,
gfx_passthru_kind, to indicate that we're trying to pass through IGD. This
also means other specific devices can be supported later just by extending
gfx_passthru_kind. Together with gfx_passthru, it addresses the IGD cases
as follows:

gfx_passthru = 0     => sets build_info.u.gfx_passthru to false
gfx_passthru = 1     => sets build_info.u.gfx_passthru to true and
                        build_info.u.gfx_passthru_kind to DEFAULT
gfx_passthru = "igd" => sets build_info.u.gfx_passthru to false
                        and build_info.u.gfx_passthru_kind to IGD

Here, if gfx_passthru_kind = DEFAULT, we call libxl__is_igd_vga_passthru()
to check whether the device is in that table, and so whether the
"igd-passthru=on" machine option needs to be passed to qemu. But if
gfx_passthru_kind = "igd", we always pass it.

"-gfx_passthru" was only introduced for qemu-xen-traditional, so we should
remove it from libxl__build_device_model_args_new() in the qemu upstream
case.
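
The decision described above can be sketched as a small standalone function. This is an illustration only, not the libxl implementation: the enum, struct and function names are hypothetical, and the table holds just two of the device IDs listed in patch 1/2.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the proposed gfx_passthru_kind values. */
enum gfx_passthru_kind { GFX_KIND_DEFAULT = 0, GFX_KIND_IGD = 1 };

struct pci_id { uint16_t vendor, device; };

/* Two sample entries taken from the IGD table in patch 1/2. */
static const struct pci_id known_igd[] = {
    { 0x8086, 0x0402 },
    { 0x8086, 0x1606 },
};

static bool is_known_igd(uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; i < sizeof(known_igd) / sizeof(known_igd[0]); i++)
        if (known_igd[i].vendor == vendor && known_igd[i].device == device)
            return true;
    return false;
}

/* Returns true when ",igd-passthru=on" should be appended to -machine. */
static bool want_igd_machine_opt(enum gfx_passthru_kind kind,
                                 uint16_t vendor, uint16_t device)
{
    switch (kind) {
    case GFX_KIND_IGD:      /* forced by the user */
        return true;
    case GFX_KIND_DEFAULT:  /* decided by the table lookup */
        return is_known_igd(vendor, device);
    }
    return false;           /* unknown kind: ignored */
}
```

Keeping the forced case separate from the lookup means a device missing from the table can still be passed through by asking for "igd" explicitly.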

Signed-off-by: Tiejun Chen 
---
 tools/libxl/libxl_dm.c  | 15 ---
 tools/libxl/libxl_pci.c |  4 ++--
 tools/libxl/libxl_types.idl |  6 ++
 tools/libxl/xl_cmdimpl.c| 22 --
 4 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..2d06038 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -710,9 +710,6 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
 flexarray_append(dm_args, "-net");
 flexarray_append(dm_args, "none");
 }
-if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
-}
 } else {
 if (!sdl && !vnc) {
 flexarray_append(dm_args, "-nographic");
@@ -757,6 +754,18 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
 machinearg, max_ram_below_4g);
 }
 }
+
+if (b_info->u.hvm.gfx_passthru_kind ==
+LIBXL_GFX_PASSTHRU_KIND_DEFAULT) {
+if (libxl__is_igd_vga_passthru(gc, guest_config))
+machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
+} else if (b_info->u.hvm.gfx_passthru_kind ==
+LIBXL_GFX_PASSTHRU_KIND_IGD) {
+machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
+} else {
+LOG(WARN, "gfx_passthru_kind is invalid so ignored.\n");
+}
+
 flexarray_append(dm_args, machinearg);
 for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
 flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index fc060c6..9a534cc 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -608,11 +608,11 @@ bool libxl__is_igd_vga_passthru(libxl__gc *gc,
 device = fixup_ids[j].device;
 
 if (pt_vendor == vendor &&  pt_device == device)
-return 1;
+return true;
 }
 }
 
-return 0;
+return false;
 }
 
 /*
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 02be466..d64ad10 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -140,6 +140,11 @@ libxl_tsc_mode = Enumeration("tsc_mode", [
 (3, "native_paravirt"),
 ])
 
+libxl_gfx_passthru_kind = Enumeration("gfx_passthru_kind", [
+(0, "default"),
+(1, "igd"),
+])
+
 # Consistent with the values defined for HVM_PARAM_TIMER_MODE.
 libxl_timer_mode = Enumeration("timer_mode", [
 (-1, "unknown"),
@@ -430,6 +435,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("spice",libxl_spice_info),

("gfx_passthru", libxl_defbool),
+   ("gfx_passthru_kind", libxl_gfx_passthru_kind),

("serial",   string),
("boot", string),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 440db78..d0d6ce3 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1953,8 +1953,26 @@ skip_vfb:
 xlu_cfg_replace_string (config, "spice_streaming_video",
 &b_info->u.hvm.spice.streaming_video, 0);
 xlu_cfg_get_defbool(config, "nographic", &b_info->u.hvm.nographic, 0);
-xlu_cfg_get_defbool(config, "gfx_passthru",
-&b_info->u.hvm.gfx_passthru, 0);
+if (!xlu_cfg_get_long(config, "gfx_passthru", &l, 1)) {
+if (!l) {

[Xen-devel] [v2][PATCH 1/2] libxl: introduce libxl__is_igd_vga_passthru

2015-03-10 Thread Tiejun Chen
When working with qemu, IGD is a special device for passthrough, so we need
to identify it in order to handle it specially later. Here we define a table
recording all IGD types we can currently support. We also introduce two
helper functions that read the vendor and device IDs used to look up that
table.

Signed-off-by: Tiejun Chen 
---
 tools/libxl/libxl_internal.h |   2 +
 tools/libxl/libxl_pci.c  | 124 +++
 2 files changed, 126 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..c97c62d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1176,6 +1176,8 @@ _hidden int libxl__device_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pc
 _hidden int libxl__create_pci_backend(libxl__gc *gc, uint32_t domid,
   libxl_device_pci *pcidev, int num);
 _hidden int libxl__device_pci_destroy_all(libxl__gc *gc, uint32_t domid);
+_hidden bool libxl__is_igd_vga_passthru(libxl__gc *gc,
+const libxl_domain_config *d_config);
 
 /*- xswait: wait for a xenstore node to be suitable -*/
 
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index f3ae132..fc060c6 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -491,6 +491,130 @@ static int sysfs_dev_unbind(libxl__gc *gc, libxl_device_pci *pcidev,
 return 0;
 }
 
+static uint16_t sysfs_dev_get_vendor(libxl__gc *gc, libxl_device_pci *pcidev)
+{
+char *pci_device_vendor_path =
+GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/vendor",
+  pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+uint16_t read_items;
+uint16_t pci_device_vendor;
+
+FILE *f = fopen(pci_device_vendor_path, "r");
+if (!f) {
+LOGE(ERROR,
+ "pci device "PCI_BDF" does not have vendor attribute",
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0xffff;
+}
+read_items = fscanf(f, "0x%hx\n", &pci_device_vendor);
+fclose(f);
+if (read_items != 1) {
+LOGE(ERROR,
+ "cannot read vendor of pci device "PCI_BDF,
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0xffff;
+}
+
+return pci_device_vendor;
+}
+
+static uint16_t sysfs_dev_get_device(libxl__gc *gc, libxl_device_pci *pcidev)
+{
+char *pci_device_device_path =
+GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/device",
+  pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+uint16_t read_items;
+uint16_t pci_device_device;
+
+FILE *f = fopen(pci_device_device_path, "r");
+if (!f) {
+LOGE(ERROR,
+ "pci device "PCI_BDF" does not have device attribute",
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0xffff;
+}
+read_items = fscanf(f, "0x%hx\n", &pci_device_device);
+fclose(f);
+if (read_items != 1) {
+LOGE(ERROR,
+ "cannot read device of pci device "PCI_BDF,
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0xffff;
+}
+
+return pci_device_device;
+}
+
+typedef struct {
+uint16_t vendor;
+uint16_t device;
+} pci_info;
+
+static const pci_info fixup_ids[] = {
+/* Intel HSW Classic */
+{0x8086, 0x0402}, /* HSWGT1D, HSWD_w7 */
+{0x8086, 0x0406}, /* HSWGT1M, HSWM_w7 */
+{0x8086, 0x0412}, /* HSWGT2D, HSWD_w7 */
+{0x8086, 0x0416}, /* HSWGT2M, HSWM_w7 */
+{0x8086, 0x041E}, /* HSWGT15D, HSWD_w7 */
+/* Intel HSW ULT */
+{0x8086, 0x0A06}, /* HSWGT1UT, HSWM_w7 */
+{0x8086, 0x0A16}, /* HSWGT2UT, HSWM_w7 */
+{0x8086, 0x0A26}, /* HSWGT3UT, HSWM_w7 */
+{0x8086, 0x0A2E}, /* HSWGT3UT28W, HSWM_w7 */
+{0x8086, 0x0A1E}, /* HSWGT2UX, HSWM_w7 */
+{0x8086, 0x0A0E}, /* HSWGT1ULX, HSWM_w7 */
+/* Intel HSW CRW */
+{0x8086, 0x0D26}, /* HSWGT3CW, HSWM_w7 */
+{0x8086, 0x0D22}, /* HSWGT3CWDT, HSWD_w7 */
+/* Intel HSW Server */
+{0x8086, 0x041A}, /* HSWSVGT2, HSWD_w7 */
+/* Intel HSW SRVR */
+{0x8086, 0x040A}, /* HSWSVGT1, HSWD_w7 */
+/* Intel BSW */
+{0x8086, 0x1606}, /* BDWULTGT1, BDWM_w7 */
+{0x8086, 0x1616}, /* BDWULTGT2, BDWM_w7 */
+{0x8086, 0x1626}, /* BDWULTGT3, BDWM_w7 */
+{0x8086, 0x160E}, /* BDWULXGT1, BDWM_w7 */
+{0x8086, 0x161E}, /* BDWULXGT2, BDWM_w7 */
+{0x8086, 0x1602}, /* BDWHALOGT1, BDWM_w7 */
+{0x8086, 0x1612}, /* BDWHALOGT2, BDWM_w7 */
+{0x8086, 0x1622}, /* BDWHALOGT3, BDWM_w7 */
+{0x8086, 0x162B}, /* BDWHALO28W, BDWM_w7 */
+{0x8086, 0x162A}, /* BDWGT3WRKS, BDWM_w7 */
+{0x8086, 0x162D}, /* BDWGT3SRVR, BDWM_w7 */
+};
+
+/*
+ * Some devices may need some ways to work well. Here like IGD,
+ * we have to pass a specific option to qemu.
+ */
+bool libxl__is_igd_vga_passthru(libxl__gc *gc,
+const libxl_domain_config *d_conf

Re: [Xen-devel] [PATCH WIP 0/10] *hacky* basic support for Nvidia Tegra K1/Jetson board (probably ignore)

2015-03-10 Thread Ian Campbell
On Tue, 2015-03-10 at 09:01 +, Ian Campbell wrote:

Forgot to say -- this needs some u-boot patches to enable booting in
hypmode. Jan Kiszka has picked up the dev work of those, I tested using
v4 from http://lists.denx.de/pipermail/u-boot/2015-February/206557.html
applied on top of u-boot-tegra.git#master

v5 was posted recently
http://lists.denx.de/pipermail/u-boot/2015-March/207532.html

Ian.


Re: [Xen-devel] CPU emulation on Xen

2015-03-10 Thread Stefano Stabellini
On Tue, 10 Mar 2015, Bunny Mintoo wrote:
> Thanks for your mail.
> My architecture is x86 compatible. I have just added new instructions over
> existing x86 ISA. These new instructions are now to be taught to the CPU.
> In QEMU, I was able to mimic the new CPU working with software changes
> (changes involved in both kernel and QEMU source tree). So what I am really
> looking is Xen to emulate this new architecture so that I can carry out
> testing (and probably compare performance results). The high level idea is
> to tweak Xen source code to emulate my CPU working and Dom0 can be modified
> appropriately.

In that case you might want to take a look at 
xen/arch/x86/x86_emulate/x86_emulate.c


> Why do you say QEMU might be a better fit for you? Asking this out of 
> curiosity.

QEMU is very good at emulation -- it is able to emulate entire platforms
and cpus. If this is what you need, QEMU is the right way to go. Xen is
about virtualizing your hardware mostly with para-virtualization and a
little emulation as needed.


[Xen-devel] [PATCH v2] EFI: Fix getting EFI variable list on some systems

2015-03-10 Thread Ross Lagerwall
Copy the entire output buffer to the guest because some firmwares update
size on successful calls (contrary to the spec) and the buffer may
contain data beyond the output size that the firmware requires on a
subsequent GetNextVariableName() call (e.g. a NULL character).

Note that this shouldn't change the amount of data copied because on success, a
compliant firmware does not change size and so the entire buffer is copied
anyway.  If size is changed, Xen does not copy the buffer.

Without this change, the following (simplified) sequence would occur:
GetNextVariableName: in \0, size 1024 || out AdminPw\0, size 7
GetNextVariableName: in AdminPw\0, size 1024 || out UserPw\0, size 6
GetNextVariableName: in UserPww\0, size 1024 || NOT FOUND

This was seen on an Intel S1200RP_SE with firmware
S1200RP.86B.02.02.0005.102320140911, version 4.6, date 2014-10-23.
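
The failure sequence above can be reproduced with a small sketch. It is a simplification under stated assumptions: names are plain bytes rather than UCS-2, and `copy_reported`, `copy_whole` and `NAME_BUF_LEN` are hypothetical names that only model the two copy policies, not the Xen code itself.

```c
#include <stddef.h>
#include <string.h>

#define NAME_BUF_LEN 16

/* Old behaviour: copy only the size reported back by the firmware.
 * If that size excludes the terminator, stale bytes from the previous
 * name survive in the guest buffer. */
static void copy_reported(char *guest, const char *fw, size_t reported)
{
    memcpy(guest, fw, reported);
}

/* Patched behaviour: copy the whole buffer the caller supplied, so the
 * terminator written by the firmware always reaches the guest. */
static void copy_whole(char *guest, const char *fw, size_t buf_len)
{
    memcpy(guest, fw, buf_len);
}
```

Starting from a guest buffer holding "AdminPw" and a firmware buffer holding "UserPw" with a reported size of 6, the first policy leaves "UserPww" in the guest buffer, while the second leaves "UserPw".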

Signed-off-by: Ross Lagerwall 
Reviewed-by: Andrew Cooper 
---
 xen/common/efi/runtime.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index 7ed5bfa..dc80c6f9 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -516,9 +516,13 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op)
 cast_guid(&op->u.get_next_variable_name.vendor_guid));
 efi_rs_leave(cr3);
 
+/*
+ * Copy the variable name if necessary. The entire buffer is copied
+ * because some firmwares update size when they shouldn't.
+ * */
 if ( !EFI_ERROR(status) &&
- copy_to_guest(op->u.get_next_variable_name.name,
-   name.raw, size) )
+ __copy_to_guest(op->u.get_next_variable_name.name,
+ name.raw, op->u.get_next_variable_name.size) )
 rc = -EFAULT;
 op->u.get_next_variable_name.size = size;
 }
-- 
2.1.0


Re: [Xen-devel] [PATCH] x86: synchronize PCI config space access decoding

2015-03-10 Thread Andrew Cooper
On 10/03/15 07:30, Jan Beulich wrote:
>>>> On 09.03.15 at 19:49,  wrote:
>> On 09/03/15 16:08, Jan Beulich wrote:
>>> Both PV and HVM logic have similar but not similar enough code here.
>>> Synchronize the two so that
>>> - in the HVM case we don't unconditionally try to access extended
>>>   config space
>>> - in the PV case we pass a correct range to the XSM hook
>>> - in the PV case we don't needlessly deny access when the operation
>>>   isn't really on PCI config space
>>> All this along with sharing the macros HVM already had here.
>>>
>>> Signed-off-by: Jan Beulich 
>>>
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -2383,11 +2383,6 @@ void hvm_vcpu_down(struct vcpu *v)
>>>  static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
>>>  ioreq_t *p)
>>>  {
>>> -#define CF8_BDF(cf8) (((cf8) & 0x00ffff00) >> 8)
>>> -#define CF8_ADDR_LO(cf8) ((cf8) & 0x000000fc)
>>> -#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f000000) >> 16)
>>> -#define CF8_ENABLED(cf8) (!!((cf8) & 0x80000000))
>>> -
>>>  struct hvm_ioreq_server *s;
>>>  uint32_t cf8;
>>>  uint8_t type;
>>> @@ -2416,9 +2411,19 @@ static struct hvm_ioreq_server *hvm_sele
>>>  
>>>  type = IOREQ_TYPE_PCI_CONFIG;
>>>  addr = ((uint64_t)sbdf << 32) |
>>> -   CF8_ADDR_HI(cf8) |
>>> CF8_ADDR_LO(cf8) |
>>> (p->addr & 3);
>>> +/* AMD extended configuration space access? */
>>> +if ( CF8_ADDR_HI(cf8) &&
>>> + boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
>>> + boot_cpu_data.x86 >= 0x10 && boot_cpu_data.x86 <= 0x17 )
>>> +{
>>> +uint64_t msr_val;
>>> +
>>> +if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
>>> + (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
>>> +addr |= CF8_ADDR_HI(cf8);
>> This is another example of host state which leaks into guests across
>> migrate, but in this case is also problematic at the host level.
> Yes, but cross-vendor migration has (iirc) many more issues like this
> (and considering the wide family range the risk of this breaking for
> migration between AMD systems seems marginal).

I wasn't even considering cross-vendor migration, but that is another
concern.  I was more concerned with leaking bios-configured state into
the guest.

>
>> As far as the host goes, MSR_AMD64_NB_CFG is a per-node msr and Xen
>> should verify that the AMD64_NB_CFG_CF8_EXT_ENABLE_BIT is consistent
>> across the system, or bits of emulate_privileged_op() are liable to
>> execute differently depending on which pcpu a vcpu happens to be scheduled.
> I think this goes too far in mistrusting Dom0.

The only case where dom0 could plausibly set this up consistently, even
if it wanted to, is when it has a vcpu for each pcpu and is using
dom0_vcpu_pin.  Either of these conditions is rare in practice.

I still think it is Xen which needs to set this up consistently on boot,
at which point removing all the rdmsr_safe() calls from cf8 accesses is
trivial.

~Andrew
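
For readers following the decoding discussion, here is a minimal sketch of how a 0xCF8 value splits into its fields. The bit positions follow the conventional PCI configuration mechanism #1 layout plus the AMD extended-register bits; they are stated here as assumptions, and the function names are hypothetical rather than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 31: configuration cycle enable. */
static bool cf8_enabled(uint32_t cf8)
{
    return (cf8 >> 31) & 1;
}

/* Bits 23:8: bus (8 bits), device (5 bits), function (3 bits). */
static uint16_t cf8_bdf(uint32_t cf8)
{
    return (cf8 >> 8) & 0xffff;
}

/* Register offset: bits 7:2 of CF8 plus the low two bits of the
 * 0xCFC..0xCFF port offset. With the AMD extension enabled, bits
 * 27:24 of CF8 supply register bits 11:8. */
static uint16_t cf8_reg(uint32_t cf8, unsigned int port_off, bool ext)
{
    uint16_t reg = (cf8 & 0xfc) | (port_off & 3);

    if (ext)
        reg |= (cf8 >> 16) & 0xf00;
    return reg;
}
```

The `ext` flag is where the per-node MSR state enters: the same CF8 value decodes to a different register depending on whether the extension bit is set on the CPU doing the decode, which is the inconsistency being discussed.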


Re: [Xen-devel] [PATCH v3 0/2] Add support for Xilinx ZynqMP SoC

2015-03-10 Thread Julien Grall
Hello Edgar,

Thank you for adding support of the ZynqMP.

On 10/03/15 02:49, Edgar E. Iglesias wrote:
> From: "Edgar E. Iglesias" 
> 
> Adds support for the Cadence UART in Xilinx ZynqMP. The
> rest of the ZynqMP platform is discovered via device-tree.

Did you make sure that the default grant table range (0xb0000000 -
0xb0020000) doesn't overlap with a hardware region?

Regards,

-- 
Julien Grall

Re: [Xen-devel] [PATCH v3 1/2] xen/arm: Add Cadence UART driver

2015-03-10 Thread Julien Grall
Hello Edgar,

On 10/03/15 02:49, Edgar E. Iglesias wrote:
> Signed-off-by: Edgar E. Iglesias 

Reviewed-by: Julien Grall 

Regards,

-- 
Julien Grall

Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-10 Thread David Vrabel
On 09/03/15 14:10, David Vrabel wrote:
> Memory hotplug doesn't work with PV guests because:
> 
>   a) The p2m cannot be expanded to cover the new sections.

Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
linear virtual mapped sparse p2m list).

This one would be non-trivial to fix.  We'd need a sparse set of
vm_area's for the p2m or similar.

>   b) add_memory() builds page tables for the new sections which means
>  the new pages must have valid p2m entries (or a BUG occurs).

After some more testing this appears to be broken by:

25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above the
end of RAM as 1:1) included 3.16.

This one can be trivially fixed by setting the new sections in the p2m
to INVALID_P2M_ENTRY before calling add_memory().

David

Re: [Xen-devel] [PATCH v3] xen/arm: Do not allocate pte entries for MAP_SMALL_PAGES

2015-03-10 Thread Julien Grall
On 09/03/15 16:08, Vijay Kilari wrote:
> On Mon, Mar 9, 2015 at 5:46 PM, Julien Grall  wrote:
>> Hi Vijay,
>>
>> Given the introduction of the new helper, the title looks wrong to me.
>>
>>
>> On 09/03/2015 08:59, vijay.kil...@gmail.com wrote:
>>>
>>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
>>> index 7d4ba0c..e0be36b 100644
>>> --- a/xen/arch/arm/mm.c
>>> +++ b/xen/arch/arm/mm.c
>>> @@ -827,14 +827,15 @@ static int create_xen_table(lpae_t *entry)
>>>
>>>   enum xenmap_operation {
>>>   INSERT,
>>> -REMOVE
>>> +REMOVE,
>>> +RESERVE
>>>   };
>>>
>>>   static int create_xen_entries(enum xenmap_operation op,
>>> unsigned long virt,
>>> unsigned long mfn,
>>> unsigned long nr_mfns,
>>> -  unsigned int ai)
>>> +  unsigned int flags)
>>>   {
>>>   int rc;
>>>   unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
>>> @@ -859,13 +860,17 @@ static int create_xen_entries(enum xenmap_operation
>>> op,
>>>
>>>   switch ( op ) {
>>>   case INSERT:
>>> +case RESERVE:
>>>   if ( third[third_table_offset(addr)].pt.valid )
>>>   {
>>>   printk("create_xen_entries: trying to replace an
>>> existing mapping addr=%lx mfn=%lx\n",
>>>  addr, mfn);
>>>   return -EINVAL;
>>>   }
>>> -pte = mfn_to_xen_entry(mfn, ai);
>>> +if ( op == RESERVE || !is_pte_present(flags) )
>>
>>
>> As you have a new operation (only used by populate_pt_range), why do you
>> need to check is_pte_present?
> 
> map_pages_to_xen() can still take MAP_SMALL_PAGES as flags.
> In future if any common code requires  MAP_SMALL_PAGES then,
> this can be used.

The only usage was in vmap, which you removed in this patch...

Furthermore, we decided to introduce populate_pt_range in order
to avoid using MAP_SMALL_PAGES on ARM...

It's pointless to keep two different ways to populate page tables...

[..]

>>
>> And, therefore, MAP_SMALL_PAGES could be dropped.
> 
> MAP_SMALL_PAGES is still used in common code esp. EFI code.
> We can remove this provided if we clean up this. But I still think
> MAP_SMALL_PAGES is required to keep equivalent functionality of x86.

If you had looked at the code, you would have noticed that the code is only
compiled for x86 and would never work for ARM (_PAGE_PAT, _PAGE_PWT...
don't exist).

Regards,

-- 
Julien Grall

Re: [Xen-devel] [PATCH] xen/arm: gic: GICv2 & GICv3 only supports 1020 physical interrupts

2015-03-10 Thread Julien Grall
Hi Ian,

On 09/03/15 16:06, Ian Campbell wrote:
> On Mon, 2015-03-09 at 14:00 +0200, Julien Grall wrote:
>> Hi Ian,
>>
>> On 05/03/2015 19:00, Ian Campbell wrote:
>>> On Tue, 2015-03-03 at 16:35 +, Julien Grall wrote:
>>>> +gicv3_info.nr_lines = min((unsigned)1020, nr_lines);
>>>
>>> "1020U" is the correct way to write (unsigned)1020 I think (in both
>>> places).
>>
>> I had a look at several uses of min in arch/arm, and (unsigned) was used.
> 
> I don't see any with a literal number, which is the main point.
> (unsigned)PAGE_SIZE is fine, because you can't write PAGE_SIZEU very
> easily. (Leaving aside whether PAGE_SIZE should be unsigned in its
> definition.)

Hmmm ... right. I will update the patch.

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH v3] xen/arm: Do not allocate pte entries for MAP_SMALL_PAGES

2015-03-10 Thread Ian Campbell
On Tue, 2015-03-10 at 11:45 +, Julien Grall wrote:
> On 09/03/15 16:08, Vijay Kilari wrote:
> > On Mon, Mar 9, 2015 at 5:46 PM, Julien Grall  
> > wrote:
> >> Hi Vijay,
> >>
> >> Given the introduction of the new helper, the title looks wrong to me.
> >>
> >>
> >> On 09/03/2015 08:59, vijay.kil...@gmail.com wrote:
> >>>
> >>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> >>> index 7d4ba0c..e0be36b 100644
> >>> --- a/xen/arch/arm/mm.c
> >>> +++ b/xen/arch/arm/mm.c
> >>> @@ -827,14 +827,15 @@ static int create_xen_table(lpae_t *entry)
> >>>
> >>>   enum xenmap_operation {
> >>>   INSERT,
> >>> -REMOVE
> >>> +REMOVE,
> >>> +RESERVE
> >>>   };
> >>>
> >>>   static int create_xen_entries(enum xenmap_operation op,
> >>> unsigned long virt,
> >>> unsigned long mfn,
> >>> unsigned long nr_mfns,
> >>> -  unsigned int ai)
> >>> +  unsigned int flags)
> >>>   {
> >>>   int rc;
> >>>   unsigned long addr = virt, addr_end = addr + nr_mfns * PAGE_SIZE;
> >>> @@ -859,13 +860,17 @@ static int create_xen_entries(enum xenmap_operation
> >>> op,
> >>>
> >>>   switch ( op ) {
> >>>   case INSERT:
> >>> +case RESERVE:
> >>>   if ( third[third_table_offset(addr)].pt.valid )
> >>>   {
> >>>   printk("create_xen_entries: trying to replace an
> >>> existing mapping addr=%lx mfn=%lx\n",
> >>>  addr, mfn);
> >>>   return -EINVAL;
> >>>   }
> >>> -pte = mfn_to_xen_entry(mfn, ai);
> >>> +if ( op == RESERVE || !is_pte_present(flags) )
> >>
> >>
> >> As you have a new operation (only used by populate_pt_range), why do you
> >> need to check is_pte_present?
> > 
> > map_pages_to_xen() can still take MAP_SMALL_PAGES as flags.
> > In future if any common code requires  MAP_SMALL_PAGES then,
> > this can be used.
> 
> The only usage was in vmap that you removed in this patch...
> 
> Furthermore, we decided to introduce populate_pt_range in order
> to avoid using MAP_SMALL_PAGES on ARM...
> 
> It's pointless to keep two different ways to populate the page tables...
> 
> [..]
> 
> >>
> >> And, therefore, MAP_SMALL_PAGES could be dropped.
> > 
> > MAP_SMALL_PAGES is still used in common code esp. EFI code.
> > We can remove this provided we clean this up. But I still think
> > MAP_SMALL_PAGES is required to keep equivalent functionality of x86.
> 
> If you looked at the code you would have noticed that the code is only
> compiled for x86 and would never work for ARM (_PAGE_PAT, _PAGE_PWT...
> don't exist).

Right, I think we should just remove MAP_SMALL_PAGES on ARM and if/when
it turns out we need it we can add it back and implement it in the PT
creation code.

In any case the fix to make vmap_init use the new function should
certainly be in a separate patch to anything which is fixing
MAP_SMALL_PAGES.

Ian.





Re: [Xen-devel] [PATCH v4 1/9] numa: __node_distance() should return u8

2015-03-10 Thread Andrew Cooper
On 10/03/15 02:27, Boris Ostrovsky wrote:
> SLIT values are byte-sized and some of them (0-9 and 255) have
> special meaning. Adjust __node_distance() to reflect this and
> modify scrub_heap_pages() to deal with __node_distance() returning
> an invalid SLIT entry.
>
> Signed-off-by: Boris Ostrovsky 

You also need to teach XEN_SYSCTL_numainfo about the new NUMA_NO_DISTANCE.

> ---
>  xen/arch/x86/srat.c|   15 +++
>  xen/common/page_alloc.c|4 ++--
>  xen/include/asm-x86/numa.h |2 +-
>  xen/include/xen/numa.h |3 ++-
>  4 files changed, 16 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
> index dfabba3..aa2eda3 100644
> --- a/xen/arch/x86/srat.c
> +++ b/xen/arch/x86/srat.c
> @@ -496,14 +496,21 @@ static unsigned node_to_pxm(nodeid_t n)
>   return 0;
>  }
>  
> -int __node_distance(nodeid_t a, nodeid_t b)
> +u8 __node_distance(nodeid_t a, nodeid_t b)
>  {
> - int index;
> + u8 slit_val;
>  
>   if (!acpi_slit)
>   return a == b ? 10 : 20;
> - index = acpi_slit->locality_count * node_to_pxm(a);
> - return acpi_slit->entry[index + node_to_pxm(b)];
> +
> + slit_val = acpi_slit->entry[acpi_slit->locality_count * node_to_pxm(a) +
> + node_to_pxm(b)];

This would be easier to read if you kept the old index temporary
(although making it a u64).

~Andrew

> +
> + /* ACPI defines 0xff as an unreachable node and 0-9 are undefined */
> + if ((slit_val == 0xff) || (slit_val <= 9))
> + return NUMA_NO_DISTANCE;
> + else
> + return slit_val;
>  }
>  
>  EXPORT_SYMBOL(__node_distance);
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index d96d25b..0f0ca56 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1435,13 +1435,13 @@ void __init scrub_heap_pages(void)
>  /* Figure out which NODE CPUs are close. */
>  for_each_online_node ( j )
>  {
> -int distance;
> +u8 distance;
>  
>  if ( cpumask_empty(&node_to_cpumask(j)) )
>  continue;
>  
>  distance = __node_distance(i, j);
> -if ( distance < last_distance )
> +if ( (distance < last_distance) && (distance != 
> NUMA_NO_DISTANCE) )
>  {
>  last_distance = distance;
>  best_node = j;
> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
> index cc5b5d1..7a489d3 100644
> --- a/xen/include/asm-x86/numa.h
> +++ b/xen/include/asm-x86/numa.h
> @@ -85,6 +85,6 @@ extern int valid_numa_range(u64 start, u64 end, nodeid_t 
> node);
>  #endif
>  
>  void srat_parse_regions(u64 addr);
> -extern int __node_distance(nodeid_t a, nodeid_t b);
> +extern u8 __node_distance(nodeid_t a, nodeid_t b);
>  
>  #endif
> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
> index ac4b391..7aef1a8 100644
> --- a/xen/include/xen/numa.h
> +++ b/xen/include/xen/numa.h
> @@ -7,7 +7,8 @@
>  #define NODES_SHIFT 0
>  #endif
>  
> -#define NUMA_NO_NODE0xFF
> +#define NUMA_NO_NODE 0xFF
> +#define NUMA_NO_DISTANCE 0xFF
>  
>  #define MAX_NUMNODES(1 << NODES_SHIFT)
>  
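
The readability suggestion above (keep an index temporary, widened to 64 bits) might look roughly like this. It is a self-contained sketch, not the patch itself: struct slit is a simplified stand-in for the ACPI SLIT, and slit_distance() is a hypothetical helper that takes proximity domains directly instead of calling node_to_pxm().

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_DISTANCE 0xFF

/* Simplified stand-in for the ACPI SLIT: a locality_count x locality_count
 * matrix of byte-sized distances, stored row by row. */
struct slit {
    uint64_t locality_count;
    const uint8_t *entry;
};

/* Hypothetical helper: distance between proximity domains pxm_a and pxm_b,
 * or NUMA_NO_DISTANCE if the table entry is unusable. */
static uint8_t slit_distance(const struct slit *slit,
                             unsigned int pxm_a, unsigned int pxm_b)
{
    uint64_t index;
    uint8_t slit_val;

    if (slit == NULL)
        return pxm_a == pxm_b ? 10 : 20;   /* defaults when no SLIT exists */

    /* Row offset kept in a temporary, as suggested in the review. */
    index = slit->locality_count * pxm_a;
    slit_val = slit->entry[index + pxm_b];

    /* 0xff marks an unreachable node; 0-9 are rejected as in the patch. */
    if (slit_val == 0xff || slit_val <= 9)
        return NUMA_NO_DISTANCE;

    return slit_val;
}
```

A caller such as the node-selection loop in the quoted patch would then skip any result equal to NUMA_NO_DISTANCE before comparing distances.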





[Xen-devel] Xen Security Advisory 124 - Non-standard PCI device functionality may render pass-through insecure

2015-03-10 Thread Xen . org security team
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Xen Security Advisory XSA-124
  version 2

  Non-standard PCI device functionality may render pass-through insecure

UPDATES IN VERSION 2
====================

Clarify scope.  PCI config space backdoors are just one example.
Provide more examples of potential problems.  Provide some additional
mitigation options.

Public release.

ISSUE DESCRIPTION
=================

Devices with capabilities or defects that are undocumented or that
virtualization software is unaware of may allow guests to control
parts of the host that they shouldn't be in control of.  Here are some
examples of the kind of problem:

* While XSA-120 deals with standard PCI config space accesses to the
  PCI control word, various devices have alternative methods to read
  and modify config space values.  A guest which has been given such a
  device can definitely cause a host DoS; worse attacks cannot be
  ruled out.

* Devices which are physically integrated into the system chipset
  might have undocumented direct access to memory or other resources
  (as well as the documented access via the IOMMU).  A guest with such
  a device is likely to be able to gain control of the host.

* Many devices permit (or require) the loading or updating of the
  firmware on the device.  Bad firmware is likely to be able to
  violate the PCI protocols (depending on the physical circuitry on
  the device).  The impact of such violations is difficult to assess
  in the abstract.

  Malicious firmware might also be able to cause electrical problems
  for the PCI bus, system power supply, and other circuitry.  This
  could be used to mount fault-injection attacks, or even to cause
  damage to hardware.

  Again, this will depend on the details of the device, but in general
  defending against bad firmware would require additional electronics.
  Therefore the Xen Project Security Team expects that devices which
  support firmware loading are unlikely to be robust against malicious
  firmware unless that robustness has been specifically engineered.

Since the details are device specific, special workarounds would need
to be developed for any such device for which secure pass-through is
desired.  Developing such workarounds is a task presenting multiple
challenges, particularly since the hardware details are often not
officially documented, and is beyond the scope of normal security
fixes.

The Xen Project Security Team is therefore adopting an exceptional
process for this kind of problem.  See below for details of that
exceptional process, and for the scope of the exception.

IMPACT
======

Passing through a device providing such mechanisms, which bypass or
subvert the software layers that ensure security and correctness, may
expose the host to guest induced information leaks, host crashes, and
privilege escalation.

VULNERABLE SYSTEMS
==================

Only systems where physical PCI devices are passed through to
untrusted guests are affected.

All hypervisors supporting PCI passthrough are exposed to this kind of
problem; this includes all versions of Xen which support PCI
passthrough.

Only x86 Xen systems are currently affected.  ARM systems are not
currently affected when running Xen due to not supporting
pass-through.  However once this feature is implemented ARM systems
will become vulnerable to this class of bugs and subject to the
exceptional handling described in this advisory.

Devices specifically designed and advertised for secure PCI
passthrough (for example, SR-IOV virtual functions) are outside the
scope of this advisory, and outside the process exception.  We are not
aware of problems with any such devices at the present time, and any
vulnerabilities which we become aware of will be handled in the normal
way.

Any other PCI devices might cause vulnerabilities, and are subject to
the exception.  Whether a specific system is actually vulnerable
depends on the characteristics of the PCI device being passed through:

* The device behaviour will usually depend on the specific firmware
  loaded onto the device itself; if such firmware is (or can be)
  loaded by guests, the device is probably vulnerable (unless its
  manufacturer has specifically advertised to the contrary).

* Other devices should be assumed to be vulnerable unless the complete
  functionality is known, and has been reviewed in the context of PCI
  passthrough security.

MITIGATION
==========

Not passing through any physical devices to guests will avoid this
vulnerability.

This vulnerability can also be avoided by only passing through devices
the entire scope of whose functionality is known and has been reviewed
for PCI passthrough security and correctness, or only devices
specifically and correctly designed to be passed through in a secure
manner (for example, SR-IOV virtual functions).

If the functionality of a PCI device needs to be exposed to an
untrusted guest, PCI passthrough rel

[Xen-devel] Xen Security Advisory 120 (CVE-2015-2150) - Non-maskable interrupts triggerable by guests

2015-03-10 Thread Xen . org security team
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Xen Security Advisory CVE-2015-2150 / XSA-120
  version 4

  Non-maskable interrupts triggerable by guests

UPDATES IN VERSION 4
====================

Public release.

ISSUE DESCRIPTION
=================

Guests are currently permitted to modify all of the (writable) bits in
the PCI command register of devices passed through to them. In
particular, this allows them to disable memory and I/O decoding on the
device, unless the device is an SR-IOV virtual function. Once decoding
is disabled, subsequent accesses to the respective MMIO or I/O port
ranges would, on PCI Express devices, lead to Unsupported Request (UR)
responses. The treatment of such errors is platform specific.
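
To make the mechanism concrete, here is a minimal sketch of filtering a guest's write to the PCI command register so that the decode-enable bits stay under host control. It is an illustration under stated assumptions and not the attached patch: the two bit values are the standard PCI command register bits, while HOST_OWNED_BITS and filter_command_write() are hypothetical names.

```c
#include <stdint.h>

/* Standard PCI command register bits. */
#define PCI_COMMAND_IO      0x0001  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY  0x0002  /* respond to memory space accesses */

/* Hypothetical policy: bits the guest is not allowed to change. */
#define HOST_OWNED_BITS (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)

/* Hypothetical helper: merge a guest-requested value with the current
 * register contents, keeping the host-owned bits as they are. */
static uint16_t filter_command_write(uint16_t current, uint16_t guest_val)
{
    return (uint16_t)((current & HOST_OWNED_BITS) |
                      (guest_val & ~HOST_OWNED_BITS));
}
```

With such a filter, a guest write that tries to clear the memory or I/O decode bits leaves them set, so later MMIO or port accesses still reach the device.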

IMPACT
======

In the event that the platform surfaces aforementioned UR responses as
Non-Maskable Interrupts, and either the OS is configured to treat NMIs
as fatal or (e.g. via ACPI's APEI) the platform tells the OS to treat
these errors as fatal, the host would crash, leading to a Denial of
Service.

VULNERABLE SYSTEMS
==================

Xen versions 3.3 and onwards are vulnerable due to supporting PCI
pass-through. Upstream Linux versions 3.1 and onwards are vulnerable
due to supporting PCI backend functionality. Other Linux versions as
well as other OS versions may be vulnerable too.

Any domain which is given access to a PCI Express device that is not an
SR-IOV virtual function can take advantage of this vulnerability.

MITIGATION
==========

This issue can be avoided by not assigning PCI Express devices other
than SR-IOV virtual functions to untrusted guests.

CREDITS
=======

This issue was discovered by Jan Beulich of SUSE.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue for the
indicated versions of Linux, but only for ordinary PCI config space
accesses by the guest. See XSA-124 for all other cases.

xsa120.patch            Linux 3.19
xsa120-classic.patch    linux-2.6.18-xen.hg

$ sha256sum xsa120*.patch
ecd4568d418d6e275f1eebdba4867e7cfdc6a487292db0e9eff0e9e7e2c91826  xsa120-classic.patch
32441fd3930848f7533f74376648fbeb5e35870661e1259860fe10f9a1f67f88  xsa120.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJU/tzUAAoJEIP+FMlX6CvZcDcIALHGaamMEPKtOANKkWW7cxJz
zWrgU+6cg/slx6wlgTnHB0/9N/zb9VPUZO3j7TS4VNL6z5zu3S1aTelo5w0F5j2N
rbQrmnJ56P7iTGU0UwerueGPUzRAOqw5JNJK/i7Y2nZo/r7Y8IkwZub8nxpeBaPF
YN3gqd7iTmq5IkM0mQNUuSmneLlMVX32dITSatKjaUNaBI54aH8byM+lUjdFyUYv
tKjb6HJD0upo7e5MPmchC1+/1B+Jm7YfAIMJ6Mn168pHMSy9Zn0p0zFeVGCA41u7
L28yDiIVfu1XWcOLWryAQQ4e/rMv1Bpy7Q259SUUj4bUiQDmRdqOdZmaXHlO/Po=
=H+jB
-----END PGP SIGNATURE-----


xsa120-classic.patch
Description: Binary data


xsa120.patch
Description: Binary data


[Xen-devel] Xen Security Advisory 123 (CVE-2015-2151) - Hypervisor memory corruption due to x86 emulator flaw

2015-03-10 Thread Xen . org security team
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Xen Security Advisory CVE-2015-2151 / XSA-123
  version 4

 Hypervisor memory corruption due to x86 emulator flaw

UPDATES IN VERSION 4
====================

Public release.

ISSUE DESCRIPTION
=================

Instructions with register operands ignore any segment overrides
encoded for them. Due to an insufficiently conditional assignment such
a bogus segment override can, however, corrupt a pointer used
subsequently to store the result of the instruction.
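
One way such a bug can arise is sketched below. This is a simplified model and not the emulator's real code: struct operand, its fields and apply_segment_override() are hypothetical. It assumes the decoded operand keeps its register pointer and its memory address in shared storage, so an unconditional write to the segment field of a register operand would clobber the pointer later used to store the result.

```c
/* Hypothetical operand kinds and segment identifiers. */
enum op_type { OP_REG, OP_MEM };
enum seg { SEG_NONE = -1, SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS };

/* A decoded operand. The register pointer and the memory address share
 * storage (anonymous union, C11), so writing mem.seg on a register
 * operand would overwrite part of reg. */
struct operand {
    enum op_type type;
    union {
        unsigned long *reg;      /* valid when type == OP_REG */
        struct {
            enum seg seg;
            unsigned long off;
        } mem;                   /* valid when type == OP_MEM */
    };
};

/* Apply a segment override prefix only to memory operands; register
 * operands ignore it and keep their pointer intact. */
static void apply_segment_override(struct operand *ea, enum seg override_seg)
{
    if (override_seg != SEG_NONE && ea->type == OP_MEM)
        ea->mem.seg = override_seg;
}
```

Dropping the `ea->type == OP_MEM` test reproduces the kind of insufficiently conditional assignment the description refers to.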

IMPACT
======

A malicious guest might be able to read sensitive data relating to
other guests, or to cause denial of service on the host. Arbitrary code
execution, and therefore privilege escalation, cannot be excluded.

VULNERABLE SYSTEMS
==================

Xen 3.2.x and later are vulnerable.
Xen 3.1.x and earlier have not been inspected.

Only x86 systems are vulnerable.  ARM systems are not vulnerable.

MITIGATION
==========

There is no mitigation available for this issue.

CREDITS
=======

This issue was discovered by Felix Wilhelm of ERNW GmbH.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

xsa123.patch xen-unstable, Xen 4.5.x, Xen 4.4.x
xsa123-4.3-4.2.patch Xen 4.3.x, Xen 4.2.x

$ sha256sum xsa123*.patch
e6da3a2c35b50e163b15100ef28a48dca429160104f346fc82be4711fe60f64f  xsa123-4.3-4.2.patch
994cf1487ec5c455fce4877168901e03283f0002062dcff8895a17ca30e010df  xsa123.patch
$

DEPLOYMENT DURING EMBARGO
=========================

Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.

But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).

Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.

(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable.  This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)

For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
  http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJU/tzZAAoJEIP+FMlX6CvZV64IAJOsaNqXoLZQ0sAdfJpE6lnv
KtYzXixzTTrP87cWmkYfkLTcuQdMJKUNe00xRoEP2ES1I2XUC4dy9MrlaTpHOJ27
hZ1OpDkiOOk6B8Scf1PI6pvXZXzpnoQITPRhxUgPawIBrtPW/OP8pdUbTeGsw3MJ
hUjixTBT+Ok2Geq1U/Ki+aNe+lnLOjkuivH2nkZGsWYrRAm7Uypmtn9obQzZ4piB
OGDAsuHSXtOPGgmtztj+NW8PJ+6oURkBi0ITtc12lUwJodQV9OIOsvqD3d+HW6OC
4K1gkSor+coTS6jmoU2YU1UnPBMy4irgmg1XojwWZb+FC7lHQDD24wMSs1LVJ7c=
=E2Oh
-----END PGP SIGNATURE-----


xsa123-4.3-4.2.patch
Description: Binary data


xsa123.patch
Description: Binary data


[Xen-devel] OpenStack - Libvirt+Xen CI overview

2015-03-10 Thread Bob Ball
For the last few weeks Anthony and I have been working on creating a CI 
environment to run against all OpenStack jobs.  We're now in a position where 
we can share the current status, overview of how it works and next steps.  We 
actively want to support involvement in this effort from others with an 
interest in libvirt+Xen's OpenStack integration.

The CI we have set up follows the recommendations made by the OpenStack 
official infrastructure maintainers, and reproduces a notable portion of the 
official OpenStack CI environment to run these tests.  Namely this setup is 
using:
- Puppet to deploy the master node
- Zuul to watch for code changes uploaded to review.openstack.org
- Jenkins job builder to create Jenkins job definitions from a YAML file
- Nodepool to automatically create single-use virtual machines in the Rackspace 
public cloud 
- Devstack-gate to run Tempest tests in serial

More information on Zuul, JJB, Nodepool and devstack-gate is available through 
http://ci.openstack.org

The current status is that we have a zuul instance monitoring for jobs and 
adding them to the queue of jobs to be run at 
http://zuul.openstack.xenproject.org/

In the background Nodepool provisions virtual machines into a pool of nodes 
ready to be used.  All ready nodes are automatically added to Jenkins 
(https://jenkins.openstack.xenproject.org/), and then Zuul+Jenkins will trigger 
a particular job on a node when one is available.

Logs are then uploaded to Rackspace's Cloud Files with sample logs for a 
passing job at 
http://logs.openstack.xenproject.org/52/162352/3/silent/dsvm-tempest-xen/da3ff30/index.html

I'd like to organise a meeting to walk through the various components of the CI 
with those who are interested, so this is an initial call to find out who is 
interested in finding out more!

Thanks,

Bob



[Xen-devel] [xen-4.4-testing test] 35991: regressions - trouble: blocked/broken/fail/pass

2015-03-10 Thread xen . org
flight 35991 xen-4.4-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/35991/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemuu-rhel6hvm-intel       3 host-install(3)  broken REGR. vs. 35919
 build-amd64-rumpuserxen                    4 host-build-prep  fail   REGR. vs. 35919
 test-amd64-i386-xl-qemuu-win7-amd64        3 host-install(3)  broken REGR. vs. 35919
 test-amd64-amd64-xl-qemut-debianhvm-amd64  3 host-install(3)  broken REGR. vs. 35919
 test-amd64-amd64-xl-qemuu-win7-amd64       7 windows-install  fail   REGR. vs. 35919
 test-amd64-i386-xl-qemut-win7-amd64        3 host-install(3)  broken REGR. vs. 35919

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt   3 host-install(3)                  broken REGR. vs. 35919
 test-amd64-i386-pair     17 guest-migrate/src_host/dst_host  fail   like 35919

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64         1 build-check(1)         blocked n/a
 test-amd64-i386-rumpuserxen-i386           1 build-check(1)         blocked n/a
 test-armhf-armhf-libvirt                   9 guest-start            fail    never pass
 test-armhf-armhf-xl-multivcpu             10 migrate-support-check  fail    never pass
 test-armhf-armhf-xl-sedf-pin              10 migrate-support-check  fail    never pass
 test-armhf-armhf-xl                       10 migrate-support-check  fail    never pass
 test-armhf-armhf-xl-sedf                  10 migrate-support-check  fail    never pass
 test-amd64-amd64-xl-pcipt-intel            9 guest-start            fail    never pass
 test-amd64-amd64-libvirt                  10 migrate-support-check  fail    never pass
 test-armhf-armhf-xl-credit2                5 xen-boot               fail    never pass
 build-i386-rumpuserxen                     6 xen-build              fail    never pass
 test-amd64-i386-xend-qemut-winxpsp3       17 leak-check/check       fail    never pass
 test-amd64-amd64-xl-qemuu-winxpsp3        14 guest-stop             fail    never pass
 test-amd64-amd64-xl-win7-amd64            14 guest-stop             fail    never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop             fail    never pass
 test-amd64-i386-xl-win7-amd64             14 guest-stop             fail    never pass
 test-armhf-armhf-xl-midway                10 migrate-support-check  fail    never pass
 test-amd64-i386-xend-winxpsp3             17 leak-check/check       fail    never pass
 test-amd64-amd64-xl-qemut-win7-amd64      14 guest-stop             fail    never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14 guest-stop             fail    never pass
 test-amd64-i386-xl-winxpsp3-vcpus1        14 guest-stop             fail    never pass
 test-amd64-amd64-xl-qemut-winxpsp3        14 guest-stop             fail    never pass
 test-amd64-amd64-xl-winxpsp3              14 guest-stop             fail    never pass

version targeted for testing:
 xen  40ab3d6b78a9f5a8a22bb333fdca0309e4a2fb4b
baseline version:
 xen  24ecb0be82825e366edd559af29562bca0e07d95


People who touched revisions under test:
  Aaron Adams 
  Ian Campbell 
  Jan Beulich 


jobs:
 build-amd64-xend                             pass
 build-i386-xend                              pass
 build-amd64                                  pass
 build-armhf                                  pass
 build-i386                                   pass
 build-amd64-libvirt                          pass
 build-armhf-libvirt                          pass
 build-i386-libvirt                           pass
 build-amd64-pvops                            pass
 build-armhf-pvops                            pass
 build-i386-pvops                             pass
 build-amd64-rumpuserxen                      broken
 build-i386-rumpuserxen                       fail
 test-amd64-amd64-xl                          pass
 test-armhf-armhf-xl                          pass
 test-amd64-i386-xl                           pass
 test-amd64-i386-rhel6hvm-amd                 pass
 test-amd64-i386-qemut-rhel6hvm-amd           pass
 test-amd64-i386-qemuu-rhel6hvm-amd           pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64    broken
 test-amd64-i386-xl-qemut-debianhvm-amd64     pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-freebsd10-amd64              pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64         pass
 test-amd64-i386-xl-qemuu-ovmf-amd

Re: [Xen-devel] [PATCH v2] stubdom: fix make clean and distclean on a freshly cloned tree

2015-03-10 Thread Wei Liu
On Tue, Mar 10, 2015 at 06:42:30AM +, Liu, SongtaoX wrote:
> > -----Original Message-----
> > From: xen-devel-boun...@lists.xen.org
> > [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Ian Campbell
> > Sent: Tuesday, March 03, 2015 1:00 AM
> > To: Wei Liu
> > Cc: Samuel Thibault; Stefano Stabellini; Ian Jackson; 
> > xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] [PATCH v2] stubdom: fix make clean and distclean 
> > on a
> > freshly cloned tree
> > 
> > On Mon, 2015-03-02 at 15:05 +, Wei Liu wrote:
> > > Clean and distclean targets need not depend on existence of the
> > > mini-os tree. Don't check for mini-os and don't try to blindly include
> > > mini-os's Config.mk when doing clean and distclean.
> > >
> > > Note that one subtle issue is that $(XEN_ROOT)/Config.mk tries to
> > > include $(XEN_ROOT)/config/$(XEN_OS).mk. In stubdom's case XEN_OS is
> > > "MiniOS". Then $(XEN_ROOT)/config/MiniOS.mk tries to include mini-os's
> > > Config.mk.
> > >
> > > Since clean and distclean don't enforce existence of mini-os tree,
> > > don't include $(XEN_ROOT)/Config.mk to avoid getting error due to the
> > > aforementioned issue.
> > >
> > > Reported-by: Sander Eikelenboom 
> > > Signed-off-by: Wei Liu 
> > > Cc: Ian Campbell 
> > > Cc: Ian Jackson 
> > > Cc: Stefano Stabellini 
> > > Cc: Samuel Thibault 
> > 
> > Acked + applied, thanks.
> > 
> When building pv-grub, the following errors occurred:
> "ld: warning: app.lds contains output sections; did you forget -T?
> ld: cannot find -lxenguest"
> 
> Steps to build pv-grub (after building xen and tools):
> "cd xen
> make mini-os-dir
> cd stubdom
> ./configure
> make pv-grub"
> 
> Commit 55f7cd74 introduced some errors in mini-os.git/Makefile:
> 1. The commit deleted the XEN_ROOT variable from the Makefile, but the
> Makefile still uses it.
> 2. XEN_TARGET_ARCH is blank for the above steps, which leads to "ld:
> cannot find -lxenguest".
> 3. The "-T" option is missing, which leads to the warning "ld: warning:
> app.lds contains output sections; did you forget -T?".
> 
> The following patch may work around this issue.
> 

Thanks for reporting. The patch below is not correct.

I will submit a patch series to fix these problems and CC you.

Wei.



[Xen-devel] [PATCH v4] xen-scsiback: define a pr_fmt macro with xen-pvscsi

2015-03-10 Thread Tao Chen
Define pr_fmt with the "xen-pvscsi: " prefix, remove the DPRINTK macro,
and replace all uses of DPRINTK with pr_debug.

Also fix up some comments, eliminate redundant whitespace, and
reformat the code.

These changes make the code easier to read.
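
As a rough userspace illustration of why the prefix can then be dropped from each message: pr_fmt() relies on adjacent string literals being concatenated at compile time. The pr_err() stand-in and the format_err() helper below are assumptions made for this sketch; in the kernel, the pr_*() macros apply pr_fmt() themselves.

```c
#include <stddef.h>
#include <stdio.h>

/* Must be defined before the logging macros are used, as in the kernel. */
#define pr_fmt(fmt) "xen-pvscsi: " fmt

/* Userspace stand-in for the kernel's pr_err(): pr_fmt() prepends the
 * prefix to the format string by literal concatenation. */
#define pr_err(fmt, ...) fprintf(stderr, pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical helper that formats into a buffer so the result can be
 * inspected; it uses the same pr_fmt() expansion as pr_err(). */
static int format_err(char *buf, size_t len, int nr_seg)
{
    return snprintf(buf, len, pr_fmt("invalid nr_seg = %d\n"), nr_seg);
}
```

Because the prefix lives in one macro, every message in the file gets it consistently, and individual call sites only carry the message text.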

Signed-off-by: Tao Chen 
---
 drivers/xen/xen-scsiback.c | 75 ++
 1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/drivers/xen/xen-scsiback.c b/drivers/xen/xen-scsiback.c
index 9faca6a..d0b0bc5 100644
--- a/drivers/xen/xen-scsiback.c
+++ b/drivers/xen/xen-scsiback.c
@@ -31,6 +31,8 @@
  * IN THE SOFTWARE.
  */
 
+#define pr_fmt(fmt) "xen-pvscsi: " fmt
+
 #include 
 
 #include 
@@ -69,9 +71,6 @@
 #include 
 #include 
 
-#define DPRINTK(_f, _a...) \
-   pr_debug("(file=%s, line=%d) " _f, __FILE__ , __LINE__ , ## _a)
-
 #define VSCSI_VERSION  "v0.1"
 #define VSCSI_NAMELEN  32
 
@@ -271,7 +270,7 @@ static void scsiback_print_status(char *sense_buffer, int 
errors,
 {
struct scsiback_tpg *tpg = pending_req->v2p->tpg;
 
-   pr_err("xen-pvscsi[%s:%d] cmnd[0]=%02x -> st=%02x msg=%02x host=%02x 
drv=%02x\n",
+   pr_err("[%s:%d] cmnd[0]=%02x -> st=%02x msg=%02x host=%02x drv=%02x\n",
   tpg->tport->tport_name, pending_req->v2p->lun,
   pending_req->cmnd[0], status_byte(errors), msg_byte(errors),
   host_byte(errors), driver_byte(errors));
@@ -427,7 +426,7 @@ static int scsiback_gnttab_data_map_batch(struct 
gnttab_map_grant_ref *map,
BUG_ON(err);
for (i = 0; i < cnt; i++) {
if (unlikely(map[i].status != GNTST_okay)) {
-   pr_err("xen-pvscsi: invalid buffer -- could not remap 
it\n");
+   pr_err("invalid buffer -- could not remap it\n");
map[i].handle = SCSIBACK_INVALID_HANDLE;
err = -ENOMEM;
} else {
@@ -449,7 +448,7 @@ static int scsiback_gnttab_data_map_list(struct 
vscsibk_pend *pending_req,
for (i = 0; i < cnt; i++) {
if (get_free_page(pg + mapcount)) {
put_free_pages(pg, mapcount);
-   pr_err("xen-pvscsi: no grant page\n");
+   pr_err("no grant page\n");
return -ENOMEM;
}
gnttab_set_map_op(&map[mapcount], vaddr_page(pg[mapcount]),
@@ -492,7 +491,7 @@ static int scsiback_gnttab_data_map(struct vscsiif_request 
*ring_req,
return 0;
 
if (nr_segments > VSCSIIF_SG_TABLESIZE) {
-   DPRINTK("xen-pvscsi: invalid parameter nr_seg = %d\n",
+   pr_debug("invalid parameter nr_seg = %d\n",
ring_req->nr_segments);
return -EINVAL;
}
@@ -516,13 +515,12 @@ static int scsiback_gnttab_data_map(struct 
vscsiif_request *ring_req,
nr_segments += n_segs;
}
if (nr_segments > SG_ALL) {
-   DPRINTK("xen-pvscsi: invalid nr_seg = %d\n",
-   nr_segments);
+   pr_debug("invalid nr_seg = %d\n", nr_segments);
return -EINVAL;
}
}
 
-   /* free of (sgl) in fast_flush_area()*/
+   /* free of (sgl) in fast_flush_area() */
pending_req->sgl = kmalloc_array(nr_segments,
sizeof(struct scatterlist), GFP_KERNEL);
if (!pending_req->sgl)
@@ -679,7 +677,8 @@ static int prepare_pending_reqs(struct vscsibk_info *info,
v2p = scsiback_do_translation(info, &vir);
if (!v2p) {
pending_req->v2p = NULL;
-   DPRINTK("xen-pvscsi: doesn't exist.\n");
+   pr_debug("the v2p of (chn:%d, tgt:%d, lun:%d) doesn't exist.\n",
+   vir.chn, vir.tgt, vir.lun);
return -ENODEV;
}
pending_req->v2p = v2p;
@@ -690,14 +689,14 @@ static int prepare_pending_reqs(struct vscsibk_info *info,
(pending_req->sc_data_direction != DMA_TO_DEVICE) &&
(pending_req->sc_data_direction != DMA_FROM_DEVICE) &&
(pending_req->sc_data_direction != DMA_NONE)) {
-   DPRINTK("xen-pvscsi: invalid parameter data_dir = %d\n",
+   pr_debug("invalid parameter data_dir = %d\n",
pending_req->sc_data_direction);
return -EINVAL;
}
 
pending_req->cmd_len = ring_req->cmd_len;
if (pending_req->cmd_len > VSCSIIF_MAX_COMMAND_SIZE) {
-   DPRINTK("xen-pvscsi: invalid parameter cmd_len = %d\n",
+   pr_debug("invalid parameter cmd_len = %d\n",
pending_req->cmd_len);
return -EINVAL;
}
@@ -721,7 +720,7 @@ static int scsiback_do_cmd_fn(struct vscsibk_info *info)
 
if (RING_REQUEST_PROD_OVERFLOW(ring, rp)) {
rc = ring->rsp_prod_pvt;
- 

Re: [Xen-devel] [PATCH v2] xen/iommu: fix usage of shared EPT/IOMMU page tables on PVH guests

2015-03-10 Thread Julien Grall
Hi,

On 27/02/15 11:33, Roger Pau Monne wrote:
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index cc12735..7fcbbb1 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -332,7 +332,8 @@ void iommu_share_p2m_table(struct domain* d)
>  {
>  const struct iommu_ops *ops = iommu_get_ops();
>  
> -if ( iommu_enabled && is_hvm_domain(d) )
> +ASSERT( hap_enabled(d) );

This line is breaking compilation on ARM.

Shouldn't it be replaced by iommu_use_hap_pt?

> +if ( iommu_enabled )


Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH v2] xen/iommu: fix usage of shared EPT/IOMMU page tables on PVH guests

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 13:51,  wrote:
> Hi,
> 
> On 27/02/15 11:33, Roger Pau Monne wrote:
>> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
>> index cc12735..7fcbbb1 100644
>> --- a/xen/drivers/passthrough/iommu.c
>> +++ b/xen/drivers/passthrough/iommu.c
>> @@ -332,7 +332,8 @@ void iommu_share_p2m_table(struct domain* d)
>>  {
>>  const struct iommu_ops *ops = iommu_get_ops();
>>  
>> -if ( iommu_enabled && is_hvm_domain(d) )
>> +ASSERT( hap_enabled(d) );
> 
> This line is breaking compilation on ARM.
> 
> Shouldn't it be replaced by iommu_use_hap_pt?

No, that's a different thing. But shouldn't ARM have a stub
hap_enabled() evaluating to constant true?
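
For illustration only, a constant-true stub of the kind suggested above might
look like the sketch below. Nothing here is taken from the Xen tree:
`struct domain` is a stand-in, `CONFIG_HAS_HAP_TOGGLE` is an invented switch
that models an architecture where HAP is optional, and `share_p2m_allowed()`
is a hypothetical caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for Xen's struct domain; only what the sketch needs. */
struct domain {
    bool hap;   /* meaningful only where HAP is optional */
};

#ifdef CONFIG_HAS_HAP_TOGGLE            /* hypothetical: x86-like case */
#define hap_enabled(d) ((d)->hap)
#else                                   /* ARM-like case: always true */
/* (void)(d) still evaluates the argument, so callers get no unused warnings. */
#define hap_enabled(d) ((void)(d), true)
#endif

/* Common code can then assert on every architecture without #ifdefs. */
static bool share_p2m_allowed(const struct domain *d)
{
    assert(hap_enabled(d));
    return true;
}
```

With the switch left undefined, the assertion compiles away to a constant
check, which is the behaviour the question above is asking about.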

Jan




[Xen-devel] Stubdom build fix

2015-03-10 Thread Wei Liu
Hi all

Songtao Liu reported some issues with stubdom.

This patch series ties up the loose ends that caused the problem he
saw [0].

I've tested this series myself, but it would be good if Songtao can also
give it a try and confirm it fixes those issues.

Note that the patch marked with mini-os should be applied to mini-os
tree, while the patch marked with stubdom should be applied to Xen tree.

Wei.

[0] <582fb90ab890394081254b69739046fc0ebac...@shsmsx101.ccr.corp.intel.com>



[Xen-devel] [PATCH] mini-os: replace XEN_TARGET_ARCH with MINIOS_TARGET_ARCH

2015-03-10 Thread Wei Liu
One place was missed when I did the replacement in 55f7cd7427 ("Mini-OS:
standalone build").

Signed-off-by: Wei Liu 
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f16520e..862f766 100644
--- a/Makefile
+++ b/Makefile
@@ -165,7 +165,7 @@ OBJS := $(filter-out $(OBJ_DIR)/lwip%.o $(LWO), $(OBJS))
 
 ifeq ($(libc),y)
 ifeq ($(CONFIG_XC),y)
-APP_LDLIBS += -L$(XEN_ROOT)/stubdom/libxc-$(XEN_TARGET_ARCH) -whole-archive -lxenguest -lxenctrl -no-whole-archive
+APP_LDLIBS += -L$(XEN_ROOT)/stubdom/libxc-$(MINIOS_TARGET_ARCH) -whole-archive -lxenguest -lxenctrl -no-whole-archive
 endif
 APP_LDLIBS += -lpci
 APP_LDLIBS += -lz
-- 
1.9.1




[Xen-devel] [PATCH] stubdom: export XEN_ROOT in makefile

2015-03-10 Thread Wei Liu
... because XEN_ROOT is used in mini-os's Config.mk.

Signed-off-by: Wei Liu 
---
 stubdom/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/stubdom/Makefile b/stubdom/Makefile
index f339b20..d9e7e40 100644
--- a/stubdom/Makefile
+++ b/stubdom/Makefile
@@ -1,6 +1,7 @@
 XEN_ROOT = $(CURDIR)/..
 MINI_OS = $(XEN_ROOT)/extras/mini-os
 
+export XEN_ROOT
 export XEN_OS=MiniOS
 
 export stubdom=y
-- 
1.9.1




Re: [Xen-devel] [PATCH] mini-os: replace XEN_TARGET_ARCH with MINIOS_TARGET_ARCH

2015-03-10 Thread Samuel Thibault
Wei Liu, on Tue 10 Mar 2015 13:14:38 +, wrote:
> One place was missed when I did the replacement in 55f7cd7427 ("Mini-OS:
> standalone build").
> 
> Signed-off-by: Wei Liu 

Acked-by: Samuel Thibault 

> ---
>  Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index f16520e..862f766 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -165,7 +165,7 @@ OBJS := $(filter-out $(OBJ_DIR)/lwip%.o $(LWO), $(OBJS))
>  
>  ifeq ($(libc),y)
>  ifeq ($(CONFIG_XC),y)
> -APP_LDLIBS += -L$(XEN_ROOT)/stubdom/libxc-$(XEN_TARGET_ARCH) -whole-archive -lxenguest -lxenctrl -no-whole-archive
> +APP_LDLIBS += -L$(XEN_ROOT)/stubdom/libxc-$(MINIOS_TARGET_ARCH) -whole-archive -lxenguest -lxenctrl -no-whole-archive
>  endif
>  APP_LDLIBS += -lpci
>  APP_LDLIBS += -lz
> -- 
> 1.9.1
> 

-- 
Samuel
 Taking this opportunity, would it be possible for you to also reboot
 Modérator and his little buddy who manages the download
 resources?
 -+- OB in NPC: Learning to flash one's staff -+-



Re: [Xen-devel] [PATCH] stubdom: export XEN_ROOT in makefile

2015-03-10 Thread Samuel Thibault
Wei Liu, on Tue 10 Mar 2015 13:14:39 +, wrote:
> ... because XEN_ROOT is used in mini-os's Config.mk.
> 
> Signed-off-by: Wei Liu 

Acked-by: Samuel Thibault 

> ---
>  stubdom/Makefile | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/stubdom/Makefile b/stubdom/Makefile
> index f339b20..d9e7e40 100644
> --- a/stubdom/Makefile
> +++ b/stubdom/Makefile
> @@ -1,6 +1,7 @@
>  XEN_ROOT = $(CURDIR)/..
>  MINI_OS = $(XEN_ROOT)/extras/mini-os
>  
> +export XEN_ROOT
>  export XEN_OS=MiniOS
>  
>  export stubdom=y
> -- 
> 1.9.1
> 

-- 
Samuel
/*
 * Oops. The kernel tried to access some bad page. We'll have to
 * terminate things with extreme prejudice.
*/
die_if_kernel("Oops", regs, error_code);
(From linux/arch/i386/mm/fault.c)  



Re: [Xen-devel] [PATCH v2] xen/iommu: fix usage of shared EPT/IOMMU page tables on PVH guests

2015-03-10 Thread Julien Grall
On 10/03/15 13:06, Jan Beulich wrote:
>>>> On 10.03.15 at 13:51,  wrote:
>> Hi,
>>
>> On 27/02/15 11:33, Roger Pau Monne wrote:
>>> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
>>> index cc12735..7fcbbb1 100644
>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -332,7 +332,8 @@ void iommu_share_p2m_table(struct domain* d)
>>>  {
>>>  const struct iommu_ops *ops = iommu_get_ops();
>>>  
>>> -if ( iommu_enabled && is_hvm_domain(d) )
>>> +ASSERT( hap_enabled(d) );
>>
>> This line is breaking compilation on ARM.
>>
>> Shouldn't it be replaced by iommu_use_hap_pt?
> 
> No, that's a different thing. But shouldn't ARM have a stub
> hap_enabled() evaluating to constant true?

I'm not sure if we should introduce hap_enabled. It's not something that
we should use in general.

What are we trying to catch with this ASSERT? I guess wrong caller?

If so, every share_p2m callback has a check "if iommu_use_hap_pt()"
which contains a check to hap_enabled on x86.

So I don't think this check is worthwhile in the common iommu code.

Regards,

-- 
Julien Grall



Re: [Xen-devel] vTPM Deep Quote validation

2015-03-10 Thread Emil Condrea
I think it is fair to read the PCRs before and after performing the Deep Quote
and to retry if something changed.
It is an interesting suggestion to extend the tpm character device driver in
order to obtain atomicity. I will think about it.
Thanks for clarifying.
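
The read / quote / read-again idea can be sketched as a small retry loop. This
is only a sketch under assumptions: the callback types, `PCR_SNAPSHOT_LEN`,
and the `fake_tpm` used to exercise the loop are all hypothetical and do not
correspond to any real vTPM or trousers interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PCR_SNAPSHOT_LEN 20   /* one SHA-1 sized PCR value, for illustration */

/* Hypothetical callbacks: how PCRs are read and how the quote is requested
 * depends on the TPM stack in use, so the caller supplies both. */
typedef bool (*read_pcrs_fn)(void *ctx, unsigned char out[PCR_SNAPSHOT_LEN]);
typedef bool (*deep_quote_fn)(void *ctx);

/* Returns true once a quote was taken between two identical PCR reads. */
static bool quote_with_stable_pcrs(read_pcrs_fn read_pcrs, deep_quote_fn quote,
                                   void *ctx, unsigned max_tries,
                                   unsigned char pcrs[PCR_SNAPSHOT_LEN])
{
    unsigned char after[PCR_SNAPSHOT_LEN];

    assert(read_pcrs && quote);
    for (unsigned i = 0; i < max_tries; i++) {
        if (!read_pcrs(ctx, pcrs) || !quote(ctx) || !read_pcrs(ctx, after))
            return false;                   /* hard failure, give up */
        if (memcmp(pcrs, after, PCR_SNAPSHOT_LEN) == 0)
            return true;                    /* values did not move */
        /* The PCRs were extended around the quote: try again. */
    }
    return false;
}

/* Toy TPM used only to exercise the loop: one extend races with a quote. */
struct fake_tpm { unsigned char pcr; unsigned extends_left; unsigned quotes; };

static bool fake_read(void *ctx, unsigned char out[PCR_SNAPSHOT_LEN])
{
    struct fake_tpm *t = ctx;
    memset(out, t->pcr, PCR_SNAPSHOT_LEN);
    return true;
}

static bool fake_quote(void *ctx)
{
    struct fake_tpm *t = ctx;
    t->quotes++;
    if (t->extends_left) { t->extends_left--; t->pcr++; }  /* racing extend */
    return true;
}
```

Bounding the loop with `max_tries` is a design choice of the sketch: each
retry costs another round trip to the (slow) hardware TPM.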

On Mon, Mar 9, 2015 at 7:26 PM, Daniel De Graaf 
wrote:

> On 03/09/2015 11:58 AM, Emil Condrea wrote:
>
>> On Mon, Mar 9, 2015 at 4:40 PM, Daniel De Graaf 
>> wrote:
>>
>>> On 03/08/2015 07:41 AM, Emil Condrea wrote:
>>>
>>>> I am trying to validate a Deep Quote request made by domU but I feel that
>>>> something is missing. Right now when a domU requests TPM_ORD_DeepQuote:
>>>> 1. vTPM:
>>>> - unpacks the params: nonce, vTPM PCR selection and physical PCR selection
>>>> - packs PCR_INFO_SHORT structure into buf that contains the selected vTPM
>>>>   PCRs
>>>> - computes nonce as a SHA1 of: dquot_hdr, nonce, and previous packed buf
>>>> - packs: nonce, physical PCR selection
>>>> - receives physical pcr data and signature from manager and returns them to
>>>>   DomU
>>>> 2. vTPM Manager
>>>> - unpacks the params: nonce, PCR selection
>>>> - execute TPM_Quote with: externalData = nonce
>>>> - returns pcr data and signature to vTPM
>>>>
>>>> If domU user wants to validate the signature it has to do the exact process
>>>> that the vtpm and manager did but the virtual PCR values are not included
>>>> in response, just physical ones.
>>>>
>>>
>>> The virtual machine can use TPM_PCRRead to get the value of the vTPM
>>> PCRs.
>>> This is the same method that is used by the TPM_Quote2 command.
>>>
>>
>>
>> I thought of using TPM_PCRRead from virtual machine but it was not clear
>> for me if it is safe.
>> Is it possible for the selected vTPM PCRs values to be different when
>> performing
>> composite hash on vTPM from the values read with TPM_PCRRead after
>> executing DeepQuote?
>>
>
> One way to detect this is by reading the PCRs before and after asking
> for a quote.  If the values match, then the quote used those values; if
> not, try the quote operation again.  In either case, you should have a
> log or other information on what values have been extended into the PCRs
> so that a verifier can make sense of them: there is little reason to
> include the PCRs in a quote if you can't reconstruct them.
> As an alternative to retrying, you could try to reconstruct the PCRs
> used in the quote by hashing the various possibilities drawn from the
> logs.  If the number of extend operations between the pre- and post-read
> operations is reasonable, this could end up being faster than asking for
> another quote from the (rather slow) hardware TPM.
>
>> Does the TPM have context management for each application? (e.g. when one
>> application extends something into a PCR and another application extends
>> something else into the same PCR at the same moment, are they hashed
>> together?)
>>
>
> This depends on the TPM multiplexing daemon (usually trousers in Linux).
> I believe it just processes the requests in the order it receives them,
> so without external synchronization they would be in an arbitrary order.
>
> I am unsure if this is implemented (and would guess it is currently
> not), but it would be possible for trousers to queue up several commands
> (such as PCR reads and quote requests) from a single source and
> guarantee that they are executed without intervening commands.  In order
> to avoid interactions with IMA, this would need an extension to the
> Linux TPM character device interface to submit multiple commands for
> processing without unlocking the TPM device.
>
>
>> When I read the standard I understood that the PCRs can never be
>> overwritten, just reset and extended.
>>
>> Thanks.
>>
>>>
>>>> We can include the vTPM PCRS in response or the manager must perform
>>>> TPM_Quote using the nonce received from domU in order to be able to have a
>>>> successful validation on the client side.
>>>>
>>> If you want a quote without any vTPM PCRs, you can specify an empty PCR mask
>>> to get something fairly close to this behavior - the nonce will be combined
>>> with an empty deep quote structure instead of passed directly.
>>>
>>>> What do you think? Is there something that I am missing ?
>>>>
>>> It is useful to be able to ask for the current value of both physical and
>>> virtual PCRs in a single atomic operation.  Including the value of all PCRs
>>> in the response could make the reply packet too large (which is part of the
>>> reason why TPM_Quote2 removed them).
>>>
>>> --
>>> Daniel De Graaf
>>> National Security Agency
>>>
>>>
>>
>
> --
> Daniel De Graaf
> National Security Agency
>


Re: [Xen-devel] [PATCH v2] sched: credit2: respect per-vcpu hard affinity

2015-03-10 Thread Dario Faggioli
On Sun, 2015-03-08 at 21:11 -1000, Justin Weaver wrote:
> Thanks to all for the comments! I've implemented most of the changes
> recommended here in the v2 review. I should have a new patch set ready
> this week (with an updated soft affinity patch as well). 
>
Great! :-)

> > Oh, and this is what was causing you troubles, in case source and
> > destination runqueue were the same... Help me understand, which call to
> > sched_move_irqs() in schedule.c were we missing? I'd say it is the one
> > in vcpu_migrate(), but that does not seem to care about vc->processor
> > (much rather about new_cpu)... what am I missing?
> >
> > However, if they are not the same, the call to migrate() will override
> > this right away, won't it? What I mean to say is, if this is required
> > only in case trqd and svc->rqd are equal, can we tweak this part of
> > csched2_vcpu_migrate() as follows?
> >
> > if ( trqd != svc->rqd )
> > migrate(ops, svc, trqd, NOW());
> > else
> > vc->processor = new_cpu;
> 
> You are right; it does not have anything to do with sched_move_irqs()
> not being called (like you said it doesn't care about vc->processor).
>
Ah, ok. :-)

> You are never going to believe any of my explanations now! :) 
>
EhEh... If I did that to people who fail to understand how things work
at the first or second attempt, I would stop believing myself! :-D :-D

> Without the processor assignment
> here the vcpu might go on being assigned to a processor it no longer
> is allowed to run on. 
>
Ok.

> In that case, function runq_candidate may only
> get called for the vcpu's old processor, and runq_candidate will no
> longer let a vcpu run on a processor that it's not allowed to run on
> (because of the hard affinity check first introduced in v1 of this
> patch). 
>
It mostly makes sense. Off the top of my head, it still looks like
there should be a pCPU that, when scheduling, would pick it up... I need
to think more about this...

> So in that condition the vcpu never gets to run. That's still
> somewhat of a vague explanation, but I have observed that that is what
> happens. 
>
Do you mean you _actually_ saw this, with some debugging printk-s, or
tracing, or something like this?

> Anyway I think everyone agrees that the processor assignment
> needs to be here, and I did move it to an else block for v3 like you
> recommended above.
>
Yes, that's the point, the assignment above is correct, IMO, so it
should be there, whether or not it is the cause of the issue :-)
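
To make the shape of that if/else concrete, here is a toy model. It is not the
credit2 code: `struct runqueue`, `struct vcpu_model`, `toy_migrate()` and
`toy_vcpu_migrate()` are invented stand-ins, and the toy migrate simply places
the vcpu on the requested pCPU.

```c
#include <assert.h>
#include <stddef.h>

struct runqueue { int id; };

struct vcpu_model {
    struct runqueue *rqd;   /* runqueue the vcpu currently belongs to */
    int processor;          /* pCPU the scheduler believes it is on */
};

/* Stand-in for migrate(): switch runqueue and, in this toy, pCPU as well. */
static void toy_migrate(struct vcpu_model *v, struct runqueue *trqd, int new_cpu)
{
    v->rqd = trqd;
    v->processor = new_cpu;
}

/* Mirrors the quoted snippet: only call migrate when the runqueue changes,
 * otherwise just record the new pCPU so the vcpu is not left on a processor
 * it may no longer run on. */
static void toy_vcpu_migrate(struct vcpu_model *v, struct runqueue *trqd,
                             int new_cpu)
{
    assert(v && trqd);
    if (trqd != v->rqd)
        toy_migrate(v, trqd, new_cpu);
    else
        v->processor = new_cpu;
}
```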

> > I don't like the "_safe_" part, but that is not a big deal, I certainly
> > can live with it.
> 
> I changed it to _choose_valid_pcpu ... discuss! 
>
I'm fine with what George proposes in his email.

> (Also, I can send out
> a pre-patch to change the double underscores in the whole file)
> 
For static symbols, yes. As Jan says, it's George's call. If you're up
for it, I think it would be an improvement.

> >> > VCPU2ONLINE(svc->vcpu) would make the line shorter.
> 
> I agree. VCPU2ONLINE is defined in schedule.c ... do you want me to
> move it to a common header along with the other parts we discussed
> (__vcpu_has_soft_affinity, etc.)? 
>
Either that or, if you only need it once, just open code it.

> Okay to move them to sched-if.h, or
> should I put them in a new header file?
> 
sched-if.h is ok for the step-wise load balancing macros, and it would
be the proper place to move this to as well, if we go for moving it.

Thanks and Regards,
Dario




Re: [Xen-devel] [PATCHv1] xen/balloon: disable memory hotplug in PV guests

2015-03-10 Thread Boris Ostrovsky

On 03/10/2015 07:40 AM, David Vrabel wrote:
> On 09/03/15 14:10, David Vrabel wrote:
>> Memory hotplug doesn't work with PV guests because:
>>
>>   a) The p2m cannot be expanded to cover the new sections.
>
> Broken by 054954eb051f35e74b75a566a96fe756015352c8 (xen: switch to
> linear virtual mapped sparse p2m list).
>
> This one would be non-trivial to fix.  We'd need a sparse set of
> vm_area's for the p2m or similar.
>
>>   b) add_memory() builds page tables for the new sections which means
>>      the new pages must have valid p2m entries (or a BUG occurs).
>
> After some more testing this appears to be broken by:
>
> 25b884a83d487fd62c3de7ac1ab5549979188482 (x86/xen: set regions above the
> end of RAM as 1:1) included in 3.16.
>
> This one can be trivially fixed by setting the new sections in the p2m
> to INVALID_P2M_ENTRY before calling add_memory().

Have you tried 3.17? As I said yesterday, it worked for me (with 4.4 Xen).

-boris




Re: [Xen-devel] [PATCH v2] xen/iommu: fix usage of shared EPT/IOMMU page tables on PVH guests

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 14:18,  wrote:
> On 10/03/15 13:06, Jan Beulich wrote:
>>>>> On 10.03.15 at 13:51,  wrote:
>>> Hi,
>>>
>>> On 27/02/15 11:33, Roger Pau Monne wrote:
>>>> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
>>>> index cc12735..7fcbbb1 100644
>>>> --- a/xen/drivers/passthrough/iommu.c
>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>> @@ -332,7 +332,8 @@ void iommu_share_p2m_table(struct domain* d)
>>>>  {
>>>>  const struct iommu_ops *ops = iommu_get_ops();
>>>>  
>>>> -if ( iommu_enabled && is_hvm_domain(d) )
>>>> +ASSERT( hap_enabled(d) );
>>>
>>> This line is breaking compilation on ARM.
>>>
>>> Shouldn't it be replaced by iommu_use_hap_pt?
>> 
>> No, that's a different thing. But shouldn't ARM have a stub
>> hap_enabled() evaluating to constant true?
> 
> I'm not sure if we should introduce hap_enabled. It's not something that
> we should use in general.
> 
> What are we trying to catch with this ASSERT? I guess wrong caller?
> 
> If so, every share_p2m callback has a check "if iommu_use_hap_pt()"
> which contains a check to hap_enabled on x86.

Ah, right, I mixed this up with iommu_hap_pt_share. Roger - looks
like this could indeed be replaced by

 if ( iommu_enabled && iommu_use_hap_pt(d) )
 ops->share_p2m(d);

and the corresponding check in VT-d and AMD Vi code could then
also be dropped.
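
A self-contained model of that suggestion, for illustration only. The types
and the `toy_ops` / `count_share()` helpers are invented for the sketch, and
`iommu_use_hap_pt()` is reduced to a per-domain flag; in Xen it is a macro
whose x86 definition, per this thread, already includes the hap_enabled()
check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in domain: one flag for "page tables are shared", one call counter. */
struct domain { bool hap_pt_shared; int share_calls; };

struct iommu_ops { void (*share_p2m)(struct domain *d); };

static bool iommu_enabled = true;       /* stand-in for the global flag */

/* Stand-in for iommu_use_hap_pt(). */
static bool iommu_use_hap_pt(const struct domain *d)
{
    return d->hap_pt_shared;
}

/* Hypothetical vendor callback: it no longer needs its own check. */
static void count_share(struct domain *d) { d->share_calls++; }

static const struct iommu_ops toy_ops = { .share_p2m = count_share };

static void iommu_share_p2m_table(struct domain *d)
{
    const struct iommu_ops *ops = &toy_ops;

    assert(d != NULL);
    if ( iommu_enabled && iommu_use_hap_pt(d) )
        ops->share_p2m(d);
}
```

Doing the test once in common code is what would let the per-vendor copies of
the check go away.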

Jan




Re: [Xen-devel] [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)

2015-03-10 Thread Boris Ostrovsky

On 03/10/2015 04:20 AM, Jan Beulich wrote:
>>>> On 10.03.15 at 03:27,  wrote:
>> Changes in v4:
>> * Split cputopology and NUMA info changes into separate patches
>> * Added patch#1 (partly because patch#4 needs to know when distance is
>>   invalid, i.e. NUMA_NO_DISTANCE)
>> * Split sysctl version update into a separate patch
> Why?

Which patch should it go to then? The first that changed the interfaces
(patch#3) or the second (#4)?


-boris



Re: [Xen-devel] [PATCH v4 7/9] libxl/libxc: Move libxl_get_cpu_topology()'s hypercall buffer management to libxc

2015-03-10 Thread Dario Faggioli
On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> xc_cputopoinfo() is not expected to be used on a hot path and therefore
> hypercall buffer management can be pushed into libxc. This will simplify
> life for callers.
> 
> Also update error reporting macros.
> 
> Signed-off-by: Boris Ostrovsky 
>
Reviewed-by: Dario Faggioli 

Regards,
Dario




Re: [Xen-devel] [PATCH v4 8/9] libxl/libxc: Move libxl_get_numainfo()'s hypercall buffer management to libxc

2015-03-10 Thread Dario Faggioli
On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> xc_numainfo() is not expected to be used on a hot path and therefore
> hypercall buffer management can be pushed into libxc. This will simplify
> life for callers.
> 
> Also update error logging macros.
> 
> Signed-off-by: Boris Ostrovsky 
>
Reviewed-by: Dario Faggioli 

Regards,
Dario




Re: [Xen-devel] [PATCH v4 0/9] Display IO topology when PXM data is available (plus some cleanup)

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 14:39,  wrote:
> On 03/10/2015 04:20 AM, Jan Beulich wrote:
>>>>> On 10.03.15 at 03:27,  wrote:
>>> Changes in v4:
>>> * Split cputopology and NUMA info changes into separate patches
>>> * Added patch#1 (partly because patch#4 needs to know when distance is
>>>   invalid, i.e. NUMA_NO_DISTANCE)
>>> * Split sysctl version update into a separate patch
>> Why?
> 
> Which patch should it go to then? The first that changed the interfaces 
> (patch#3) or the second (#4)?

Whichever first changes the interface in an incompatible way.

Jan




Re: [Xen-devel] [PATCH v4 9/9] libxl: Add interface for querying hypervisor about PCI topology

2015-03-10 Thread Dario Faggioli
On Mon, 2015-03-09 at 22:27 -0400, Boris Ostrovsky wrote:
> .. and use this new interface to display it along with CPU topology
> and NUMA information when 'xl info -n' command is issued
> 
> The output will look like
> ...
> cpu_topology   :
> cpu:    core    socket     node
>   0:       0        0        0
> ...
> device topology:
> device   node
> :00:00.0  0
> :00:01.0  0
> ...
> 
> Signed-off-by: Boris Ostrovsky 
>
Reviewed-by: Dario Faggioli 

Regards,
Dario




Re: [Xen-devel] [PATCH v4 1/9] numa: __node_distance() should return u8

2015-03-10 Thread Boris Ostrovsky

On 03/10/2015 07:53 AM, Andrew Cooper wrote:
> On 10/03/15 02:27, Boris Ostrovsky wrote:
>> SLIT values are byte-sized and some of them (0-9 and 255) have
>> special meaning. Adjust __node_distance() to reflect this and
>> modify scrub_heap_pages() to deal with __node_distance() returning
>> an invalid SLIT entry.
>>
>> Signed-off-by: Boris Ostrovsky 
>
> You also need to teach XEN_SYSCTL_numainfo about the new NUMA_NO_DISTANCE.

The sysctl is updated later in the series but this may indeed break
bisection.

>> ---
>>   xen/arch/x86/srat.c|   15 +++
>>   xen/common/page_alloc.c|4 ++--
>>   xen/include/asm-x86/numa.h |2 +-
>>   xen/include/xen/numa.h |3 ++-
>>   4 files changed, 16 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index dfabba3..aa2eda3 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -496,14 +496,21 @@ static unsigned node_to_pxm(nodeid_t n)
>>  	return 0;
>>  }
>>  
>> -int __node_distance(nodeid_t a, nodeid_t b)
>> +u8 __node_distance(nodeid_t a, nodeid_t b)
>>  {
>> -	int index;
>> +	u8 slit_val;
>>  
>>  	if (!acpi_slit)
>>  		return a == b ? 10 : 20;
>> -	index = acpi_slit->locality_count * node_to_pxm(a);
>> -	return acpi_slit->entry[index + node_to_pxm(b)];
>> +
>> +	slit_val = acpi_slit->entry[acpi_slit->locality_count * node_to_pxm(a) +
>> +				    node_to_pxm(b)];
>
> This would be easier to read if you kept the old index temporary
> (although making it a u64).

Yes.

-boris
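
A rough sketch of what keeping the `index` temporary could look like. It is a
self-contained model, not the posted patch: `struct toy_slit` and
`toy_node_distance()` are invented, the value 0xFF for `NUMA_NO_DISTANCE` is
an assumption (the thread only names the macro and says 255 is special), and
mapping the reserved values 0-9 to the same "invalid" result is likewise
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_DISTANCE 0xFF   /* assumed value: SLIT "unreachable" */

/* Simplified SLIT: a locality_count x locality_count matrix of bytes. */
struct toy_slit {
    uint64_t locality_count;
    const uint8_t *entry;
};

static uint8_t toy_node_distance(const struct toy_slit *slit,
                                 unsigned pxm_a, unsigned pxm_b)
{
    uint64_t index;
    uint8_t slit_val;

    if (!slit)
        return pxm_a == pxm_b ? 10 : 20;   /* defaults used in the patch */

    assert(pxm_a < slit->locality_count && pxm_b < slit->locality_count);

    /* Row offset kept in a 64-bit temporary, as suggested in the review. */
    index = slit->locality_count * pxm_a;
    slit_val = slit->entry[index + pxm_b];

    /* 0-9 and 255 have special meaning: report them as "no distance". */
    if (slit_val < 10 || slit_val == NUMA_NO_DISTANCE)
        return NUMA_NO_DISTANCE;

    return slit_val;
}
```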





Re: [Xen-devel] [PATCH 5/5] AMD IOMMU: widen NUMA nodes to be allocated from

2015-03-10 Thread Boris Ostrovsky

On 03/10/2015 03:35 AM, Jan Beulich wrote:
>>>> On 09.03.15 at 20:02,  wrote:
>> I agree that having the IO page tables on the NUMA node that is closest
>> to the IOMMU would be beneficial.
>
> And I already withdrew this patch and the corresponding VT-d one.
>
>> However, I am not sure at the moment
>> that this information could be easily determined. I think ACPI _PXM for
>> devices should be able to provide this information, but this is optional
>> and often not available.
>
> And even if it was available, it would be too late at least for Dom0's
> allocations (as it requires Dom0's interpreter to dig out this detail).
> The best we could do in that case would be to try to replace the
> existing tables. Or assume Dom0 is being placed suitably by the
> dom0_nodes= option. Or add yet another option.

There is a nodeID register on each northbridge (D1F0x60). You would
have to figure out how to map it to _PXMs though.


-boris




Re: [Xen-devel] [PATCH v3 15/24] xen/dts: Provide an helper to get a DT node from a path provided by a guest

2015-03-10 Thread Julien Grall
On 23/02/15 16:27, Ian Campbell wrote:
> On Mon, 2015-02-23 at 16:01 +, Julien Grall wrote:
>> Hi Ian,
>>
>> On 23/02/15 15:30, Ian Campbell wrote:
>>> On Tue, 2015-01-13 at 14:25 +, Julien Grall wrote:
>>>
>>>> +/* This limit is used by the hypercalls to restrict the size of the path */
>>>> +#define DEVICE_TREE_MAX_PATHLEN 1024
>>>
>>> Is this something you've made up or derived from the DT spec/ePAPR etc?
>>
>> I didn't find such a requirement in the spec.
> 
> I vaguely recall a limit on the length of a single node name, and a
> limit on the depth which they may nest, which can be multiplied. It's
> probably an unhelpfully large number though, so...
> 
>> I chose this number based on the linux pathlen because the path is also
>> used in the sysfs (/sys/firmware/devicetree).
> 
> ...that's as good as anything I suppose!

Hmmm... I'm not so sure about the 1024 limit anymore. Linux defines
PATH_MAX as 4096 but I don't see much usage of it in the sysfs code.

And this value may change from one OS to another. Although, 1024 sounds
like a very long path to write in the configuration file... Maybe we should
support aliases too?

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-03-10 Thread Andrew Cooper
On 10/03/15 02:27, Boris Ostrovsky wrote:
> Instead of copying data for each field in xen_sysctl_topologyinfo separately
> put cpu/socket/node into a single structure and do a single copy for each
> processor.
>
> Do not use max_cpu_index, which is almost always used for calculating the
> number of CPUs (thus requiring adding or subtracting one); replace it with
> num_cpus.
>
> There is no need to copy whole op in sysctl to user at the end, we only need
> num_cpus.
>
> Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
> that these are used for CPU topology. Subsequent patch will add support for
> PCI topology sysctl.
>
> Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
> (core, socket, node).
>
> Signed-off-by: Boris Ostrovsky 

In principle, a good improvement, but I have some specific issues.

> ---
>
> diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
> index 2aa0dc7..2fd93e0 100644
> --- a/tools/python/xen/lowlevel/xc/xc.c
> +++ b/tools/python/xen/lowlevel/xc/xc.c
> @@ -1375,7 +1365,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
>  for ( j = 0; j <= max_node_index; j++ )
>  {
>  uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
> -if ( dist == INVALID_TOPOLOGY_ID )
> +if ( dist == ~0u )
>  {
>  PyList_Append(node_to_node_dist_obj, Py_None);
>  }

This looks like a spurious hunk (perhaps from patch 9?)

> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 0cb6ee1..fe48ee8 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -320,39 +320,61 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>  }
>  break;
>  
> -case XEN_SYSCTL_topologyinfo:
> +case XEN_SYSCTL_cputopoinfo:
>  {
> -uint32_t i, max_cpu_index, last_online_cpu;
> -xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
> +uint32_t i, num_cpus;
> +xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
>  
> -last_online_cpu = cpumask_last(&cpu_online_map);
> -max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
> -ti->max_cpu_index = last_online_cpu;
> +if ( guest_handle_is_null(ti->cputopo) )
> +{
> +ret = -EINVAL;
> +break;
> +}

The prevailing hypervisor style is to use a NULL guest handle as an
explicit request for size.  i.e. write back num_cpus in ti and return
success.
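
As a sketch of that convention (not the sysctl code itself): a NULL buffer
stands in for a NULL guest handle, and every name below
(`toy_cputopoinfo_query()`, `struct toy_cputopoinfo`, `struct toy_cputopo`) is
made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_cputopo { uint32_t core, socket, node; };

/* buf == NULL plays the role of a NULL guest handle. */
struct toy_cputopoinfo {
    uint32_t num_cpus;          /* IN: array size, OUT: CPUs known to the host */
    struct toy_cputopo *buf;
};

static int toy_cputopoinfo_query(struct toy_cputopoinfo *ti, uint32_t host_cpus)
{
    uint32_t i, n;

    assert(ti != NULL);

    if (ti->buf == NULL) {      /* explicit size query: report and succeed */
        ti->num_cpus = host_cpus;
        return 0;
    }

    /* Fill at most as many entries as the caller has room for. */
    n = ti->num_cpus < host_cpus ? ti->num_cpus : host_cpus;
    for (i = 0; i < n; i++)
        ti->buf[i] = (struct toy_cputopo){ .core = i, .socket = 0, .node = 0 };

    ti->num_cpus = host_cpus;   /* always tell the caller the real count */
    return 0;
}
```

A caller would typically query once with a NULL buffer, allocate `num_cpus`
entries, and then call again with the buffer set.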

>  
> -for ( i = 0; i <= max_cpu_index; i++ )
> +num_cpus = cpumask_last(&cpu_online_map) + 1;
> +if ( ti->num_cpus != num_cpus )
>  {
> -if ( !guest_handle_is_null(ti->cpu_to_core) )
> +uint32_t array_sz = ti->num_cpus;
> +
> +ti->num_cpus = num_cpus;
> +if ( __copy_field_to_guest(u_sysctl, op,
> +   u.cputopoinfo.num_cpus) )
>  {
> -uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
> -if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
> -break;
> +ret = -EFAULT;
> +break;
> +}
> +num_cpus = min_t(uint32_t, array_sz, num_cpus);
> +}
> +
> +for ( i = 0; i < num_cpus; i++ )
> +{
> +xen_sysctl_cputopo_t cputopo;
> +
> +if ( cpu_present(i) )
> +{
> +cputopo.core = cpu_to_core(i);
> +if ( cputopo.core == BAD_APICID )
> +cputopo.core = XEN_INVALID_CORE_ID;
> +cputopo.socket = cpu_to_socket(i);
> +if ( cputopo.socket == BAD_APICID )
> +cputopo.socket = XEN_INVALID_SOCKET_ID;
> +cputopo.node = cpu_to_node(i);
> +if ( cputopo.node == NUMA_NO_NODE )
> +cputopo.node = XEN_INVALID_NODE_ID;
>  }
> -if ( !guest_handle_is_null(ti->cpu_to_socket) )
> +else
>  {
> -uint32_t socket = cpu_online(i) ? cpu_to_socket(i) : ~0u;
> -if ( copy_to_guest_offset(ti->cpu_to_socket, i, &socket, 1) )
> -break;
> +cputopo.core = XEN_INVALID_CORE_ID;
> +cputopo.socket = XEN_INVALID_SOCKET_ID;
> +cputopo.node = XEN_INVALID_NODE_ID;
>  }
> -if ( !guest_handle_is_null(ti->cpu_to_node) )
> +
> +if ( copy_to_guest_offset(ti->cputopo, i, &cputopo, 1) )
>  {
> -uint32_t node = cpu_online(i) ? cpu_to_node(i) : ~0u;
> -if ( copy_to_guest_offset(ti->cpu_to_node, i, &node, 1) )
> -break;
> +ret = -EFAULT;
> +break;
>  }
>  }
> -
> -ret = ((i <= max_cpu_index) || copy_to_guest(u_sysctl, op, 1))
> -

Re: [Xen-devel] [PATCH v4 4/9] sysctl: Make XEN_SYSCTL_numainfo a little more efficient

2015-03-10 Thread Andrew Cooper
On 10/03/15 02:27, Boris Ostrovsky wrote:
> Make sysctl NUMA topology query use fewer copies by combining some
> fields into a single structure and copying distances for each node
> in a single copy.
>
> Instead of using max_node_index for passing number of nodes keep this
> value in num_nodes: almost all uses of max_node_index required adding
> or subtracting one to eventually get to number of nodes anyway.
>
> Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
> XEN_INVALID_NODE_DIST.
>
> Signed-off-by: Boris Ostrovsky 

I have the same concerns with the interface as patch 3, but other comments.

~Andrew



Re: [Xen-devel] [PATCH v4 3/9] sysctl: Make XEN_SYSCTL_topologyinfo sysctl a little more efficient

2015-03-10 Thread Boris Ostrovsky

On 03/10/2015 10:29 AM, Andrew Cooper wrote:

On 10/03/15 02:27, Boris Ostrovsky wrote:

Instead of copying data for each field in xen_sysctl_topologyinfo separately
put cpu/socket/node into a single structure and do a single copy for each
processor.

Do not use max_cpu_index, which is almost always used for calculating the
number of CPUs (thus requiring adding or subtracting one); replace it with
num_cpus.

There is no need to copy whole op in sysctl to user at the end, we only need
num_cpus.

Rename xen_sysctl_topologyinfo and XEN_SYSCTL_topologyinfo to reflect the fact
that these are used for CPU topology. Subsequent patch will add support for
PCI topology sysctl.

Replace INVALID_TOPOLOGY_ID with "XEN_"-prefixed macros for each invalid type
(core, socket, node).

Signed-off-by: Boris Ostrovsky 

In principle, a good improvement, but I have some specific issues.


---

diff --git a/tools/python/xen/lowlevel/xc/xc.c 
b/tools/python/xen/lowlevel/xc/xc.c
index 2aa0dc7..2fd93e0 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1375,7 +1365,7 @@ static PyObject *pyxc_numainfo(XcObject *self)
  for ( j = 0; j <= max_node_index; j++ )
  {
  uint32_t dist = nodes_dist[i*(max_node_index+1) + j];
-if ( dist == INVALID_TOPOLOGY_ID )
+if ( dist == ~0u )
  {
  PyList_Append(node_to_node_dist_obj, Py_None);
  }

This looks like a spurious hunk (perhaps from patch 9?)


No, this is sort of an intermediate step: INVALID_TOPOLOGY_ID is no 
longer defined and a new macro for distance will show up in the next patch.


And, in fact, it's wrong anyway since the sysctl never sets distance to 
INVALID_TOPOLOGY_ID, it sets it explicitly to ~0u. It just so happens 
that INVALID_TOPOLOGY_ID is also ~0U.


I thought about moving macro definition into this patch but then 
logically it wouldn't belong here so I figured I'd do what the 
hypervisor does, which is use a constant. Until the next patch.






diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index 0cb6ee1..fe48ee8 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -320,39 +320,61 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
u_sysctl)
  }
  break;
  
-case XEN_SYSCTL_topologyinfo:

+case XEN_SYSCTL_cputopoinfo:
  {
-uint32_t i, max_cpu_index, last_online_cpu;
-xen_sysctl_topologyinfo_t *ti = &op->u.topologyinfo;
+uint32_t i, num_cpus;
+xen_sysctl_cputopoinfo_t *ti = &op->u.cputopoinfo;
  
-last_online_cpu = cpumask_last(&cpu_online_map);

-max_cpu_index = min_t(uint32_t, ti->max_cpu_index, last_online_cpu);
-ti->max_cpu_index = last_online_cpu;
+if ( guest_handle_is_null(ti->cputopo) )
+{
+ret = -EINVAL;
+break;
+}

The prevailing hypervisor style is to use a NULL guest handle as an
explicit request for size.  i.e. write back num_cpus in ti and return
success.


Yes, this would look better from the interface POV.

-boris
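
For illustration, the NULL-handle convention discussed here could look
roughly like this in plain C. This is a standalone userspace sketch, not Xen
code: topo_entry and topo_query are hypothetical names, a plain pointer stands
in for the guest handle, and memcpy stands in for the guest copy helpers. It
also shows the "one structure, one copy per processor" idea from the commit
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one per-CPU topology record. */
struct topo_entry {
    uint32_t core, socket, node;
};

/*
 * Model of a query call: src/nsrc is what the callee knows, buf/num is
 * what the caller supplied.  A NULL buf is an explicit request for the
 * size: write the count back through num and report success.
 */
static int topo_query(const struct topo_entry *src, uint32_t nsrc,
                      struct topo_entry *buf, uint32_t *num)
{
    uint32_t n;

    assert(src != NULL && num != NULL);

    if ( buf == NULL )
    {
        *num = nsrc;                    /* size query only, nothing copied */
        return 0;
    }

    n = (*num < nsrc) ? *num : nsrc;    /* never overrun the caller's array */
    memcpy(buf, src, n * sizeof(*src)); /* whole records, not field by field */
    *num = nsrc;                        /* always report the real count */

    return 0;
}
```

A caller would typically call once with a NULL buffer to learn the count,
allocate, and then call again with the real array.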



  
-for ( i = 0; i <= max_cpu_index; i++ )

+num_cpus = cpumask_last(&cpu_online_map) + 1;
+if ( ti->num_cpus != num_cpus )
  {
-if ( !guest_handle_is_null(ti->cpu_to_core) )
+uint32_t array_sz = ti->num_cpus;
+
+ti->num_cpus = num_cpus;
+if ( __copy_field_to_guest(u_sysctl, op,
+   u.cputopoinfo.num_cpus) )
  {
-uint32_t core = cpu_online(i) ? cpu_to_core(i) : ~0u;
-if ( copy_to_guest_offset(ti->cpu_to_core, i, &core, 1) )
-break;
+ret = -EFAULT;
+break;
+}
+num_cpus = min_t(uint32_t, array_sz, num_cpus);
+}
+
+for ( i = 0; i < num_cpus; i++ )
+{
+xen_sysctl_cputopo_t cputopo;
+
+if ( cpu_present(i) )
+{
+cputopo.core = cpu_to_core(i);
+if ( cputopo.core == BAD_APICID )
+cputopo.core = XEN_INVALID_CORE_ID;
+cputopo.socket = cpu_to_socket(i);
+if ( cputopo.socket == BAD_APICID )
+cputopo.socket = XEN_INVALID_SOCKET_ID;
+cputopo.node = cpu_to_node(i);
+if ( cputopo.node == NUMA_NO_NODE )
+cputopo.node = XEN_INVALID_NODE_ID;
  }
-if ( !guest_handle_is_null(ti->cpu_to_socket) )
+else
  {
-uint32_t socket = cpu_online(i) ? cpu_to_socket(i) : ~0u;
-if ( copy_to_guest_offset(ti->cpu_to_socket, i, &socket, 1) )
-break;
+cputopo.core = XEN_INVALID_CORE_ID;
+cputopo.socket = XEN_INVALID_SOCKET_ID;
+cputopo.node = XEN_INVALID_NODE_ID;
  }
-if ( !guest_handle_is_null(ti->cpu_to

Re: [Xen-devel] [PATCH v4 5/9] sysctl: Add sysctl interface for querying PCI topology

2015-03-10 Thread Andrew Cooper
On 10/03/15 02:27, Boris Ostrovsky wrote:
> Signed-off-by: Boris Ostrovsky 
> ---
>
> Changes in v4:
> * No buffer allocation in sysctl, copy data device-by-device
> * Replace INVALID_TOPOLOGY_ID with XEN_INVALID_NODE_ID
>
>  xen/common/sysctl.c |   59 
> +++
>  xen/include/public/sysctl.h |   29 +
>  2 files changed, 88 insertions(+), 0 deletions(-)
>
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index 2d11a76..9cd6321 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -387,7 +387,66 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
> u_sysctl)
>  }
>  }
>  break;

Blank line here.

> +#ifdef HAS_PCI
> +case XEN_SYSCTL_pcitopoinfo:

Please try and keep the SYSCTL implementations in numerical order, which
should put this new block below the pcr and coverage blocks.

> +{
> +xen_sysctl_pcitopoinfo_t *ti = &op->u.pcitopoinfo;
> +
> +if ( guest_handle_is_null(ti->devs) ||
> + guest_handle_is_null(ti->nodes) ||
> + (ti->first_dev > ti->num_devs) )
> +{
> +ret = -EINVAL;
> +break;
> +}
> +
> +for ( ; ti->first_dev < ti->num_devs; ti->first_dev++ )
> +{
> +physdev_pci_device_t dev;
> +uint8_t node;
> +struct pci_dev *pdev;
> +
> +if ( copy_from_guest_offset(&dev, ti->devs, ti->first_dev, 1) )
> +{
> +ret = -EFAULT;
> +break;
> +}
> +
> +spin_lock(&pcidevs_lock);
> +pdev = pci_get_pdev(dev.seg, dev.bus, dev.devfn);
> +if ( !pdev || (pdev->node == NUMA_NO_NODE) )
> +node = XEN_INVALID_NODE_ID;
> +else
> +node = pdev->node;
> +spin_unlock(&pcidevs_lock);
> +
> +if ( copy_to_guest_offset(ti->nodes, ti->first_dev, &node, 1) )
> +{
> +ret = -EFAULT;
> +break;
> +}
>  
> +if ( hypercall_preempt_check() )
> +break;
> +}
> +
> +if ( !ret )
> +{
> +ti->first_dev++;
> +
> +if ( __copy_field_to_guest(u_sysctl, op, 
> u.pcitopoinfo.first_dev) )
> +{
> +ret = -EFAULT;
> +break;
> +}
> +
> +if ( ti->first_dev < ti->num_devs )
> +ret = hypercall_create_continuation(__HYPERVISOR_sysctl,
> +"h", u_sysctl);
> +}
> +}
> +break;
> +#endif

And here.


>  #ifdef TEST_COVERAGE
>  case XEN_SYSCTL_coverage_op:
>  ret = sysctl_coverage_op(&op->u.coverage_op);
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index c544c76..a224951 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -33,6 +33,7 @@
>  
>  #include "xen.h"
>  #include "domctl.h"
> +#include "physdev.h"
>  
>  #define XEN_SYSCTL_INTERFACE_VERSION 0x000B
>  
> @@ -494,6 +495,32 @@ struct xen_sysctl_cputopoinfo {
>  typedef struct xen_sysctl_cputopoinfo xen_sysctl_cputopoinfo_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cputopoinfo_t);
>  
> +/* XEN_SYSCTL_pcitopoinfo */
> +struct xen_sysctl_pcitopoinfo {
> +/* IN: Size of pcitopo array */

Number of elements (as the two arrays are different sized structures) of
both the devs and nodes array.  It is not sensible for a caller to ever
pass arrays of different lengths.

> +uint32_t num_devs;
> +
> +/*
> + * IN/OUT: First element of pcitopo array that needs to be processed by
> + * hypervisor.
> + * This is used primarily by hypercall continuations and callers will
> + * typically set it to zero.

I would word this more strongly and state that callers must set this to
0 and that it is an internal implementation detail of Xen.

Otherwise, Reviewed-by: Andrew Cooper 

> + */
> +uint32_t first_dev;
> +
> +/* IN: list of devices */
> +XEN_GUEST_HANDLE_64(physdev_pci_device_t) devs;
> +
> +/*
> + * OUT: node identifier for each device.
> + * If information for a particular device is not avalable then set
> + * to XEN_INVALID_NODE_ID.
> + */
> +XEN_GUEST_HANDLE_64(uint8) nodes;
> +};
> +typedef struct xen_sysctl_pcitopoinfo xen_sysctl_pcitopoinfo_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_pcitopoinfo_t);
> +
>  /* XEN_SYSCTL_numainfo */
>  #define XEN_INVALID_MEM_SZ (~0U)
>  #define XEN_INVALID_NODE_DIST  ((uint8_t)~0)
> @@ -694,12 +721,14 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_scheduler_op  19
>  #define XEN_SYSCTL_coverage_op   20
>  #define XEN_SYSCTL_psr_cmt_op21
> +#define XEN_SYSCTL_pcitopoinfo   22
>  uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>  union {
>  struct xen_sysctl_readcons
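
As a rough illustration of the first_dev cursor described in the structure
comment above, here is a standalone sketch of a resumable batch loop. It is
not the patch's code: fill_nodes is a hypothetical name, the budget argument
stands in for hypercall_preempt_check(), and a plain array lookup stands in
for the pci_get_pdev() query.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Process entries [*first, num) but stop after 'budget' entries, which
 * stands in for a preemption check.  *first is advanced past every entry
 * that was completed, so calling again with the same arguments resumes
 * where the previous call stopped.  Returns 1 if work remains, else 0.
 */
static int fill_nodes(const uint8_t *known, uint8_t *nodes,
                      uint32_t num, uint32_t *first, uint32_t budget)
{
    uint32_t done = 0;

    assert(*first <= num);

    while ( *first < num && done < budget )
    {
        nodes[*first] = known[*first];  /* per-device lookup in the real code */
        ++*first;
        ++done;
    }

    return *first < num;
}
```

Because the cursor only ever counts completed entries, a caller that starts
with it at zero and repeats the call until it returns 0 sees every entry
filled exactly once.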

Re: [Xen-devel] [PATCH v3 18/24] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody

2015-03-10 Thread Julien Grall
Hi,

On 23/02/15 15:34, Ian Campbell wrote:
> On Mon, 2015-02-23 at 10:10 +, Julien Grall wrote:
>>
>> On 20/02/2015 17:04, Ian Campbell wrote:
>>> On Tue, 2015-01-13 at 14:25 +, Julien Grall wrote:
 Currently, when the device is deassigned from a domain, we directly 
 reassign
 to DOM0.

 As the device may not have been correctly reset, this may lead to 
 corruption or
 expose some part of DOM0 memory. Also, we may have no way to reset some
 platform devices.

 If Xen reassigns the device to "nobody", it may receive some global/context
 fault because the transaction has failed (indeed the context has been
 marked invalid). Unfortunately there is no simple way to quiesce a buggy
 hardware. I think we could live with that for a first version of platform
 device passthrough.

 DOM0 will have to issue an hypercall to assign the device to itself if it
 wants to use it.
>>>
>>> Does this behaviour differ from x86?
> 
> I realise now that x86 is a red-herring, what I really meant was differ
> from other types of device (specifically PCI ones).
> 
>>  If so then it is worth calling that
>>> out explicitly (even if not, good to know I think!)
>>
>> What do you mean by "calling that out explicitly"?
> 
> Stating in the commit log or a suitably placed comment (at least under
> xen/include/public hopefully) that deassignment of dt devices behaves
> differently to deassignment of other types of devices.

I tried to search for any documentation explaining the behavior of those
DOMCTL for PCI (mostly the deassign one) but I didn't find any.

By any chance, do you know if there is one? If so where?

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH] x86: synchronize PCI config space access decoding

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 12:24,  wrote:
> On 10/03/15 07:30, Jan Beulich wrote:
> On 09.03.15 at 19:49,  wrote:
>>> On 09/03/15 16:08, Jan Beulich wrote:
 Both PV and HVM logic have similar but not similar enough code here.
 Synchronize the two so that
 - in the HVM case we don't unconditionally try to access extended
   config space
 - in the PV case we pass a correct range to the XSM hook
 - in the PV case we don't needlessly deny access when the operation
   isn't really on PCI config space
 All this along with sharing the macros HVM already had here.

 Signed-off-by: Jan Beulich 

 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2383,11 +2383,6 @@ void hvm_vcpu_down(struct vcpu *v)
  static struct hvm_ioreq_server *hvm_select_ioreq_server(struct domain *d,
  ioreq_t *p)
  {
 -#define CF8_BDF(cf8) (((cf8) & 0x0000) >> 8)
 -#define CF8_ADDR_LO(cf8) ((cf8) & 0x00fc)
 -#define CF8_ADDR_HI(cf8) (((cf8) & 0x0f00) >> 16)
 -#define CF8_ENABLED(cf8) (!!((cf8) & 0x8000))
 -
  struct hvm_ioreq_server *s;
  uint32_t cf8;
  uint8_t type;
 @@ -2416,9 +2411,19 @@ static struct hvm_ioreq_server *hvm_sele
  
  type = IOREQ_TYPE_PCI_CONFIG;
  addr = ((uint64_t)sbdf << 32) |
 -   CF8_ADDR_HI(cf8) |
 CF8_ADDR_LO(cf8) |
 (p->addr & 3);
 +/* AMD extended configuration space access? */
 +if ( CF8_ADDR_HI(cf8) &&
 + boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
 + boot_cpu_data.x86 >= 0x10 && boot_cpu_data.x86 <= 0x17 )
 +{
 +uint64_t msr_val;
 +
 +if ( !rdmsr_safe(MSR_AMD64_NB_CFG, msr_val) &&
 + (msr_val & (1ULL << AMD64_NB_CFG_CF8_EXT_ENABLE_BIT)) )
 +addr |= CF8_ADDR_HI(cf8);
>>> This is another example of host state which leaks into guests across
>>> migrate, but in this case is also problematic at the host level.
>> Yes, but cross-vendor migration has (iirc) many more issues like this
>> (and considering the wide family range the risk of this breaking for
>> migration between AMD systems seems marginal).
> 
> I wasn't even considering cross-vendor migration, but that is another
> concern.  I was more concerned with leaking bios-configured state into
> the guest.
> 
>>
>>> As far as the host goes, MSR_AMD64_NB_CFG is a per-node msr and Xen
>>> should verify that the AMD64_NB_CFG_CF8_EXT_ENABLE_BIT is consistent
>>> across the system, or bits of emulate_privileged_op() are liable to
>>> execute differently depending on which pcpu a vcpu happens to be scheduled.
>> I think this goes too far in mistrusting Dom0.
> 
> The only case where dom0 could plausibly set this up consistently even
> if it wanted to, is when it has a vcpu for each pcpu and is using
> dom0_vcpu_pin.  Either of these conditions is rare in practice.

Did you look at Linux? In order to avoid these preconditions, I
specifically made it try via PCI config space accesses first a couple
of years ago.

> I still think it is Xen which needs to set this up consistently on boot,
> at which point removing all the rdmsr_safe() from cf8 accesses is
> trivial.

Since Xen is not itself using this mechanism (maybe it should), it
seems wrong for it to configure it any specific way (i.e. Dom0 may
also rely on it being off). I btw also don't think the BIOS should
enable this, at least not without being told so. And if it does, then
please consistently for the whole system.

Jan
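
For reference, the decoding under discussion can be sketched as follows. The
bit layout is the standard PCI configuration mechanism #1 format of the 0xCF8
address register, plus the AMD extension that carries register bits 11:8 in
address bits 27:24. The names cf8_decoded and cf8_decode are hypothetical, the
masks are written out from that layout rather than copied from the macros in
the quoted patch, and whether the extension is active is passed in as a flag
instead of being read from MSR_AMD64_NB_CFG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cf8_decoded {
    int      enabled;   /* bit 31: config access enabled */
    uint8_t  bus;       /* bits 23:16 */
    uint8_t  dev;       /* bits 15:11 */
    uint8_t  func;      /* bits 10:8 */
    uint16_t reg;       /* dword-aligned register offset */
};

/* Split a 0xCF8 value into its fields; ext_enabled models the AMD ECS bit. */
static void cf8_decode(uint32_t cf8, int ext_enabled, struct cf8_decoded *d)
{
    assert(d != NULL);

    d->enabled = (cf8 >> 31) & 1;
    d->bus     = (cf8 >> 16) & 0xff;
    d->dev     = (cf8 >> 11) & 0x1f;
    d->func    = (cf8 >> 8) & 0x07;
    d->reg     = cf8 & 0xfc;

    /* Only honour bits 27:24 when extended config space access is enabled. */
    if ( ext_enabled )
        d->reg |= (cf8 >> 16) & 0xf00;
}
```

The point of contention in the thread is exactly the ext_enabled input: the
same cf8 value names a different register depending on that per-node setting.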




[Xen-devel] [xen-4.5-testing test] 36006: tolerable trouble: broken/fail/pass - PUSHED

2015-03-10 Thread xen . org
flight 36006 xen-4.5-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36006/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-pvh-intel  3 host-install(3)  broken pass in 35937
 test-amd64-amd64-xl-multivcpu  3 host-install(3)  broken pass in 35937
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 3 host-install(3) broken pass in 35937
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 3 host-install(3) broken pass in 35937
 test-armhf-armhf-xl-multivcpu  3 host-install(3) broken in 35937 pass in 36006

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl                      10 migrate-support-check  fail never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check  fail never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf                 10 migrate-support-check  fail never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check  fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start            fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start            fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check  fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2               5 xen-boot               fail never pass
 test-amd64-amd64-rumpuserxen-amd64       13 rumpuserxen-demo-xenstorels/xenstorels.repeat fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop             fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop             fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop             fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop             fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop             fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop             fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop             fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop             fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop             fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop             fail never pass
 test-amd64-i386-pair                     17 guest-migrate/src_host/dst_host fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop             fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop             fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop             fail never pass
 test-amd64-amd64-xl-pvh-intel             9 guest-start            fail in 35937 never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop             fail in 35937 never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop             fail in 35937 never pass

version targeted for testing:
 xen  25c6ee85a88b42ab6e63a418008448f1935d3312
baseline version:
 xen  3665563ac10c5476484dc1c13776fd997c1592e5


People who touched revisions under test:
  Aaron Adams 
  Ian Campbell 
  Jan Beulich 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64

Re: [Xen-devel] [PATCH 2/2] iommu: add rmrr Xen command line option for misc rmrrs

2015-03-10 Thread Elena Ufimtseva
On Tue, Mar 10, 2015 at 02:47:24AM +, Tian, Kevin wrote:
> > From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com]
> > Sent: Monday, March 09, 2015 10:43 PM
> > 
> > From: Elena Ufimtseva 
> > 
> > On some platforms RMRR regions may be not specified
> > in ACPI and thus will not be mapped 1:1 in dom0. This
> > causes IO Page Faults and prevents dom0 from booting
> > in PVH mode.
> > New Xen command line option rmrr allows to specify
> > such devices and memory regions. These regions are added
> > to the list of RMRR defined in ACPI if the device
> > is present in system. As a result, additional RMRRs will
> > be mapped 1:1 in dom0 with correct permissions.
> > 
> > Mentioned above problems were discovered during PVH work with
> > ThinkCentre M and Dell 5600T. No official documentation
> > was found so far in regards to what devices and why cause this.
> > Experiments show that ThinkCentre M USB devices with enabled
> > debug port generate DMA read transactions to the regions of
> > memory marked reserved in host e820 map.
> > For Dell 5600T the device and faulting addresses are not found yet.
> > 
> > For detailed history of the discussion please check following threads:
> > http://lists.Xen.org/archives/html/xen-devel/2015-02/msg01724.html
> > http://lists.Xen.org/archives/html/xen-devel/2015-01/msg02513.html
> > 
> > Format for rmrr Xen command line option:
> > rmrr=[sbdf]start<:end>,[sbdf]start:
> 
> how about sticking to rmrr structure, i.e. 
> 
> rmrr=start<:end>[sbdf1, sbdf2, ...], ...
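
To make the start<:end> part of either proposed format concrete, here is a
small standalone sketch of parsing one range, with an omitted end defaulting
to the start as the patch description says. It is not the patch's parser:
parse_range is a hypothetical name, the standard strtoull() stands in for
Xen's simple_strtoull(), and the end < start rejection is an extra check that
the quoted code does not necessarily perform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse "start" or "start:end" at s.  On success store the range, point
 * *rest at the first unconsumed character and return 0.  Return -1 if no
 * number could be read or if end < start.
 */
static int parse_range(const char *s, uint64_t *start, uint64_t *end,
                       const char **rest)
{
    char *p;

    assert(s && start && end && rest);

    *start = strtoull(s, &p, 0);
    if ( p == s )
        return -1;                  /* no digits at all */

    if ( *p == ':' )
    {
        const char *q = p + 1;

        *end = strtoull(q, &p, 0);
        if ( p == q )
            return -1;              /* ':' with nothing after it */
    }
    else
        *end = *start;              /* end omitted: a single page */

    if ( *end < *start )
        return -1;

    *rest = p;
    return 0;
}
```

Returning the rest pointer lets the caller decide what may follow the range
(a comma, a device list in brackets, or the end of the string), so the same
helper would fit both orderings discussed above.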
> 
> > 
> > Signed-off-by: Elena Ufimtseva 
> > ---
> >  docs/misc/xen-command-line.markdown |7 +++
> >  xen/drivers/passthrough/iommu.c |   86
> > +++
> >  xen/drivers/passthrough/vtd/iommu.c |   33 ++
> >  xen/drivers/passthrough/vtd/iommu.h |1 +
> >  xen/include/xen/iommu.h |9 
> >  5 files changed, 136 insertions(+)
> > 
> > diff --git a/docs/misc/xen-command-line.markdown
> > b/docs/misc/xen-command-line.markdown
> > index bc316be..2e1210f 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1392,3 +1392,10 @@ mode.
> >  > Default: `true`
> > 
> >  Permit use of the `xsave/xrstor` instructions.
> > +
> > +### rmrr
> > +> '= [sbdf]start<:end>,[sbdf]start<:end>
> > +
> > +Define RMRRs units that are missing from ACPI table along with device
> > +they belong to and use them for 1:1 mapping. End addresses can be omitted
> > +and one page will be mapped.
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c
> > index cc12735..b64916e 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -55,6 +55,9 @@ bool_t __read_mostly iommu_hap_pt_share = 1;
> >  bool_t __read_mostly iommu_debug;
> >  bool_t __read_mostly amd_iommu_perdev_intremap = 1;
> > 
> > +static char __initdata misc_rmrr[100];
> 
> define a macro.
> 
> > +string_param("rmrr", misc_rmrr);
> > +
> >  DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
> > 
> >  DEFINE_SPINLOCK(iommu_pt_cleanup_lock);
> > @@ -67,6 +70,87 @@ static struct keyhandler iommu_p2m_table = {
> >  .desc = "dump iommu p2m table"
> >  };
> > 
> > +/*
> > + * List of command line defined rmrr units.
> > + */
> > +__initdata LIST_HEAD(misc_rmrr_units);
> > +
> > +/*
> > + * Parse rmrr Xen command line options and add parsed
> > + * device and region into apci_rmrr_unit list to mapped
> > + * as RMRRs parsed from ACPI.
> > + * Format: rmrr=[sbdf]start<:end>,[sbdf]start:
> > + * end address can be ommited and one page will be used
> > + * for mapping with start pfn.
> > + */
> > +static void __init parse_iommu_extra_rmrr(const char *s)
> > +{
> > +unsigned int idx = 0, found = 0;
> > +struct misc_rmrr_unit *rmrru;
> > +unsigned int seg, bus, dev, func;
> > +const char *str, *cur;
> > +u64 start, end;
> > +
> > +do {
> > +if ( idx >= 10 )
> > +break;
> 
> As Konrad pointed out, the bare 10 here and the earlier 100 are hard
> to read and error-prone.
> 
> > +
> > +if ( *s != '[' )
> > +break;
> > +
> > +str = s;
> > +seg = bus = dev = func = 0;
> > +str = parse_pci(str + 1, &seg, &bus, &dev, &func);
> > +if ( !str )
> > +{
> > +str = strchr(s, ']');
> > +if ( !str )
> > +return;
> > +}
> > +else
> > +found = 1;
> > +
> > +s = str;
> > +if ( *s != ']' )
> > +return;
> 
> Better to print a warning message for malformed input.
> 
> > +
> > +start = simple_strtoull(cur = s + 1, &s, 0);
> > +if ( cur == s )
> > +break;
> > +
> > +if ( *s == ':' )
> > +{
> > +end = simple_strtoull(cur = s + 1, &s, 0);
> > +if ( cur == s )
> > +break;
> > +}
> > +else
> > +end = start;
> > +
> > +if 

[Xen-devel] [PATCH 0/4] x86/MSI-X: XSA-120 follow-up

2015-03-10 Thread Jan Beulich
The problem requiring the first patch here is actually what led to
XSA-120.

1: be more careful during teardown
2: access MSI-X table only after having enabled MSI-X
3: reduce fiddling with control register during restore
4: cleanup

Signed-off-by: Jan Beulich 
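
The condition that patches 1 and 2 converge on can be summarised in a small
standalone sketch. The bit values are the ones defined by the PCI
specification for the command register and the MSI-X message control word;
the helper names (msix_table_accessible, msix_control_for_access,
msix_example) are hypothetical and the functions operate on plain register
values rather than on real config space.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined by the PCI specification. */
#define PCI_COMMAND_MEMORY      0x0002  /* command register: memory decode */
#define PCI_MSIX_FLAGS_ENABLE   0x8000  /* message control: MSI-X enable */
#define PCI_MSIX_FLAGS_MASKALL  0x4000  /* message control: function mask */

/* The MSI-X table may be touched only with memory decoding and MSI-X on. */
static int msix_table_accessible(uint16_t command, uint16_t control)
{
    return (command & PCI_COMMAND_MEMORY) && (control & PCI_MSIX_FLAGS_ENABLE);
}

/*
 * Control value to write before touching the table: if MSI-X is not yet
 * enabled, enable it with all vectors masked so no interrupt can fire.
 */
static uint16_t msix_control_for_access(uint16_t control)
{
    if ( !(control & PCI_MSIX_FLAGS_ENABLE) )
        control |= PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL;

    return control;
}

/* Example: memory decoding is on, but MSI-X is still disabled. */
void msix_example(void)
{
    uint16_t control = 0x0003;

    assert(!msix_table_accessible(PCI_COMMAND_MEMORY, control));
    control = msix_control_for_access(control);
    assert(msix_table_accessible(PCI_COMMAND_MEMORY, control));
    assert(control & PCI_MSIX_FLAGS_MASKALL);
}
```

When memory decoding is off there is no safe table access at all, which is
why the series falls back to config-space mechanisms such as the mask-all
flag in that case.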




[Xen-devel] [PATCH 1/4] x86/MSI-X: be more careful during teardown

2015-03-10 Thread Jan Beulich
When a device gets detached from a guest, pciback will clear its
command register, thus disabling both memory and I/O decoding. The
disabled memory decoding, however, has an effect on the MSI-X table
accesses the hypervisor does: These won't have the intended effect
anymore. Even worse, for PCIe devices (but not SR-IOV virtual
functions) such accesses may (will?) be treated as Unsupported
Requests, causing respective errors to be surfaced, potentially in the
form of NMIs that may be fatal to the hypervisor or Dom0 in different
ways. Hence rather than carrying out these accesses, we should avoid
them where we can, and use alternative (e.g. PCI config space based)
mechanisms to achieve at least the same effect.

Signed-off-by: Jan Beulich 
---
Backporting note (largely to myself):
   Depends on (not yet backported) commit 061eebe0e "x86/MSI: drop
   workaround for insecure Dom0 kernels" (due to re-use of struct
   arch_msix's warned field).

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -121,6 +121,27 @@ static void msix_put_fixmap(struct arch_
 spin_unlock(&msix->table_lock);
 }
 
+static bool_t memory_decoded(const struct pci_dev *dev)
+{
+u8 bus, slot, func;
+
+if ( !dev->info.is_virtfn )
+{
+bus = dev->bus;
+slot = PCI_SLOT(dev->devfn);
+func = PCI_FUNC(dev->devfn);
+}
+else
+{
+bus = dev->info.physfn.bus;
+slot = PCI_SLOT(dev->info.physfn.devfn);
+func = PCI_FUNC(dev->info.physfn.devfn);
+}
+
+return !!(pci_conf_read16(dev->seg, bus, slot, func, PCI_COMMAND) &
+  PCI_COMMAND_MEMORY);
+}
+
 /*
  * MSI message composition
  */
@@ -162,7 +183,7 @@ void msi_compose_msg(unsigned vector, co
 }
 }
 
-static void read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
+static bool_t read_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 {
 switch ( entry->msi_attrib.type )
 {
@@ -198,6 +219,8 @@ static void read_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
+if ( unlikely(!memory_decoded(entry->dev)) )
+return 0;
 msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
 msg->data = readl(base + PCI_MSIX_ENTRY_DATA_OFFSET);
@@ -209,6 +232,8 @@ static void read_msi_msg(struct msi_desc
 
 if ( iommu_intremap )
 iommu_read_msi_from_ire(entry, msg);
+
+return 1;
 }
 
 static int write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
@@ -260,6 +285,8 @@ static int write_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
+if ( unlikely(!memory_decoded(entry->dev)) )
+return -ENXIO;
 writel(msg->address_lo,
base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 writel(msg->address_hi,
@@ -287,7 +314,8 @@ void set_msi_affinity(struct irq_desc *d
 ASSERT(spin_is_locked(&desc->lock));
 
 memset(&msg, 0, sizeof(msg));
-read_msi_msg(msi_desc, &msg);
+if ( !read_msi_msg(msi_desc, &msg) )
+return;
 
 msg.data &= ~MSI_DATA_VECTOR_MASK;
 msg.data |= MSI_DATA_VECTOR(desc->arch.vector);
@@ -347,20 +375,24 @@ int msi_maskable_irq(const struct msi_de
|| entry->msi_attrib.maskbit;
 }
 
-static void msi_set_mask_bit(struct irq_desc *desc, int flag)
+static bool_t msi_set_mask_bit(struct irq_desc *desc, int flag)
 {
 struct msi_desc *entry = desc->msi_desc;
+struct pci_dev *pdev;
+u16 seg;
+u8 bus, slot, func;
 
 ASSERT(spin_is_locked(&desc->lock));
 BUG_ON(!entry || !entry->dev);
+pdev = entry->dev;
+seg = pdev->seg;
+bus = pdev->bus;
+slot = PCI_SLOT(pdev->devfn);
+func = PCI_FUNC(pdev->devfn);
 switch (entry->msi_attrib.type) {
 case PCI_CAP_ID_MSI:
 if (entry->msi_attrib.maskbit) {
 u32 mask_bits;
-u16 seg = entry->dev->seg;
-u8 bus = entry->dev->bus;
-u8 slot = PCI_SLOT(entry->dev->devfn);
-u8 func = PCI_FUNC(entry->dev->devfn);
 
 mask_bits = pci_conf_read32(seg, bus, slot, func, entry->msi.mpos);
 mask_bits &= ~((u32)1 << entry->msi_attrib.entry_nr);
@@ -369,24 +401,52 @@ static void msi_set_mask_bit(struct irq_
 }
 break;
 case PCI_CAP_ID_MSIX:
-{
-int offset = PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET;
-writel(flag, entry->mask_base + offset);
-readl(entry->mask_base + offset);
-break;
-}
+if ( likely(memory_decoded(pdev)) )
+{
+writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
+readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
+break;
+}
+if ( flag )
+{
+u16 control;
+domid_t domid = pdev->domain->domain_id;
+
+control = pci_conf_read16(seg, bus, slot, func,
+

[Xen-devel] [PATCH 2/4] x86/MSI-X: access MSI-X table only after having enabled MSI-X

2015-03-10 Thread Jan Beulich
As done in Linux by f598282f51 ("PCI: Fix the NIU MSI-X problem in a
better way") and its broken predecessor, make sure we don't access the
MSI-X table without having enabled MSI-X first, using the mask-all flag
instead to prevent interrupts from occurring.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -142,6 +142,19 @@ static bool_t memory_decoded(const struc
   PCI_COMMAND_MEMORY);
 }
 
+static bool_t msix_memory_decoded(const struct pci_dev *dev, unsigned int pos)
+{
+u16 control = pci_conf_read16(dev->seg, dev->bus,
+  PCI_SLOT(dev->devfn),
+  PCI_FUNC(dev->devfn),
+  msix_control_reg(pos));
+
+if ( !(control & PCI_MSIX_FLAGS_ENABLE) )
+return 0;
+
+return memory_decoded(dev);
+}
+
 /*
  * MSI message composition
  */
@@ -219,7 +236,8 @@ static bool_t read_msi_msg(struct msi_de
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return 0;
 msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
 msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
@@ -285,7 +303,8 @@ static int write_msi_msg(struct msi_desc
 void __iomem *base;
 base = entry->mask_base;
 
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 return -ENXIO;
 writel(msg->address_lo,
base + PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
@@ -379,7 +398,7 @@ static bool_t msi_set_mask_bit(struct ir
 {
 struct msi_desc *entry = desc->msi_desc;
 struct pci_dev *pdev;
-u16 seg;
+u16 seg, control;
 u8 bus, slot, func;
 
 ASSERT(spin_is_locked(&desc->lock));
@@ -401,35 +420,38 @@ static bool_t msi_set_mask_bit(struct ir
 }
 break;
 case PCI_CAP_ID_MSIX:
+control = pci_conf_read16(seg, bus, slot, func,
+  msix_control_reg(entry->msi_attrib.pos));
+if ( unlikely(!(control & PCI_MSIX_FLAGS_ENABLE)) )
+pci_conf_write16(seg, bus, slot, func,
+ msix_control_reg(entry->msi_attrib.pos),
+ control | (PCI_MSIX_FLAGS_ENABLE |
+PCI_MSIX_FLAGS_MASKALL));
 if ( likely(memory_decoded(pdev)) )
 {
 writel(flag, entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
 readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
-break;
+if ( likely(control & PCI_MSIX_FLAGS_ENABLE) )
+break;
+flag = 1;
 }
-if ( flag )
+else if ( flag && !(control & PCI_MSIX_FLAGS_MASKALL) )
 {
-u16 control;
 domid_t domid = pdev->domain->domain_id;
 
-control = pci_conf_read16(seg, bus, slot, func,
-  msix_control_reg(entry->msi_attrib.pos));
-if ( control & PCI_MSIX_FLAGS_MASKALL )
-break;
-pci_conf_write16(seg, bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
- control | PCI_MSIX_FLAGS_MASKALL);
+control |= PCI_MSIX_FLAGS_MASKALL;
 if ( pdev->msix->warned != domid )
 {
 pdev->msix->warned = domid;
 printk(XENLOG_G_WARNING
-   "cannot mask IRQ %d: masked MSI-X on Dom%d's 
%04x:%02x:%02x.%u\n",
+   "cannot mask IRQ %d: masking MSI-X on Dom%d's 
%04x:%02x:%02x.%u\n",
desc->irq, domid, pdev->seg, pdev->bus,
PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 }
-break;
 }
-/* fall through */
+pci_conf_write16(seg, bus, slot, func,
+ msix_control_reg(entry->msi_attrib.pos), control);
+return flag;
 default:
 return 0;
 }
@@ -454,7 +476,8 @@ static int msi_get_mask_bit(const struct
 entry->msi.mpos) >>
 entry->msi_attrib.entry_nr) & 1;
 case PCI_CAP_ID_MSIX:
-if ( unlikely(!memory_decoded(entry->dev)) )
+if ( unlikely(!msix_memory_decoded(entry->dev,
+   entry->msi_attrib.pos)) )
 break;
 return readl(entry->mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET) & 1;
 }
@@ -775,16 +798,32 @@ static int msix_capability_init(struct p
 
 pos = pci_find_cap_offset(seg, bus, slot, func, PCI_CAP_ID_MSIX);
 control = pci_conf_read16(seg, bus, slot, func, msix_co

[Xen-devel] [PATCH 3/4] x86/MSI-X: reduce fiddling with control register during restore

2015-03-10 Thread Jan Beulich
Rather than disabling and enabling MSI-X once per vector, do it just
once per device.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1218,6 +1218,9 @@ int pci_restore_msi_state(struct pci_dev
 struct msi_desc *entry, *tmp;
 struct irq_desc *desc;
 struct msi_msg msg;
+u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
+unsigned int type = 0, pos = 0;
+u16 control = 0;
 
 ASSERT(spin_is_locked(&pcidevs_lock));
 
@@ -1234,8 +1237,6 @@ int pci_restore_msi_state(struct pci_dev
 list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list )
 {
 unsigned int i = 0, nr = 1;
-u16 control = 0;
-u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
 
 irq = entry->irq;
 desc = &irq_desc[irq];
@@ -1252,31 +1253,38 @@ int pci_restore_msi_state(struct pci_dev
 pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
 PCI_FUNC(pdev->devfn), i);
 spin_unlock_irqrestore(&desc->lock, flags);
+if ( type == PCI_CAP_ID_MSIX )
+pci_conf_write16(pdev->seg, pdev->bus, slot, func,
+ msix_control_reg(pos),
+ control & ~PCI_MSIX_FLAGS_ENABLE);
 return -EINVAL;
 }
 
+ASSERT(!type || type == entry->msi_attrib.type);
+pos = entry->msi_attrib.pos;
 if ( entry->msi_attrib.type == PCI_CAP_ID_MSI )
 {
 msi_set_enable(pdev, 0);
 nr = entry->msi.nvec;
 }
-else if ( entry->msi_attrib.type == PCI_CAP_ID_MSIX )
+else if ( !type && entry->msi_attrib.type == PCI_CAP_ID_MSIX )
 {
 control = pci_conf_read16(pdev->seg, pdev->bus, slot, func,
-  msix_control_reg(entry->msi_attrib.pos));
+  msix_control_reg(pos));
 pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
+ msix_control_reg(pos),
  control | (PCI_MSIX_FLAGS_ENABLE |
 PCI_MSIX_FLAGS_MASKALL));
 if ( unlikely(!memory_decoded(pdev)) )
 {
 spin_unlock_irqrestore(&desc->lock, flags);
 pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
+ msix_control_reg(pos),
  control & ~PCI_MSIX_FLAGS_ENABLE);
 return -ENXIO;
 }
 }
+type = entry->msi_attrib.type;
 
 msg = entry->msg;
 write_msi_msg(entry, &msg);
@@ -1299,9 +1307,9 @@ int pci_restore_msi_state(struct pci_dev
 
 spin_unlock_irqrestore(&desc->lock, flags);
 
-if ( entry->msi_attrib.type == PCI_CAP_ID_MSI )
+if ( type == PCI_CAP_ID_MSI )
 {
-unsigned int cpos = msi_control_reg(entry->msi_attrib.pos);
+unsigned int cpos = msi_control_reg(pos);
 
 control = pci_conf_read16(pdev->seg, pdev->bus, slot, func, cpos) &
   ~PCI_MSI_FLAGS_QSIZE;
@@ -1311,12 +1319,13 @@ int pci_restore_msi_state(struct pci_dev
 
 msi_set_enable(pdev, 1);
 }
-else if ( entry->msi_attrib.type == PCI_CAP_ID_MSIX )
-pci_conf_write16(pdev->seg, pdev->bus, slot, func,
- msix_control_reg(entry->msi_attrib.pos),
- control | PCI_MSIX_FLAGS_ENABLE);
 }
 
+if ( type == PCI_CAP_ID_MSIX )
+pci_conf_write16(pdev->seg, pdev->bus, slot, func,
+ msix_control_reg(pos),
+ control | PCI_MSIX_FLAGS_ENABLE);
+
 return 0;
 }
 



x86/MSI-X: reduce fiddling with control register during restore

Rather than disabling and enabling MSI-X once per vector, do it just
once per device.

Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1218,6 +1218,9 @@ int pci_restore_msi_state(struct pci_dev
 struct msi_desc *entry, *tmp;
 struct irq_desc *desc;
 struct msi_msg msg;
+u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
+unsigned int type = 0, pos = 0;
+u16 control = 0;
 
 ASSERT(spin_is_locked(&pcidevs_lock));
 
@@ -1234,8 +1237,6 @@ int pci_restore_msi_state(struct pci_dev
 list_for_each_entry_safe( entry, tmp, &pdev->msi_list, list )
 {
 unsigned int i = 0, nr = 1;
-u16 control = 0;
-u8 slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
 
 irq = entry->irq;
 desc = &irq_desc[irq];
@@ -1252,31 +1253,38 @@ int pci_restore_msi_state(struct pci_dev
 pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
 PCI_FUNC(pdev->devfn), i

Re: [Xen-devel] [PATCH 2/2] iommu: add rmrr Xen command line option for misc rmrrs

2015-03-10 Thread Jan Beulich
>>> On 10.03.15 at 17:16,  wrote:
> On Tue, Mar 10, 2015 at 02:47:24AM +, Tian, Kevin wrote:
>> > From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com]
>> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> > @@ -1232,6 +1232,38 @@ static int intel_iommu_domain_init(struct domain
>> > *d)
>> >  return 0;
>> >  }
>> > 
>> > +static void add_misc_rmrr(void)
>> > +{
>> > +struct acpi_rmrr_unit *rmrrn;
>> > +struct misc_rmrr_unit *rmrru, *r;
>> > +
>> > +list_for_each_entry_safe( rmrru, r, &misc_rmrr_units, list )
>> > +{
>> > +rmrrn = xzalloc(struct acpi_rmrr_unit);
>> > +if ( !rmrrn )
>> > +goto free;
>> > +
>> > +rmrrn->scope.devices = xzalloc(typeof(*rmrrn->scope.devices));
>> > +if ( !rmrrn->scope.devices )
>> > +{
>> > +xfree(rmrrn);
>> > +goto free;
>> > +}
>> > +rmrrn->scope.devices_cnt = 1;
>> > +rmrrn->segment = rmrru->segment;
>> > +rmrrn->scope.devices[0] = rmrru->device;
>> 
>> This needs to handle one RMRR covering multiple devices. Even if you
>> don't want to support that, you need to at least catch user attempts.
> 
> Kevin, on second thought, I think that to support multiple devices
> per RMRR one needs to put the same address/range on the command line
> and specify a unique device each time.

Why? Iirc it was you who already proposed a way to properly
express this on the command line without having to repeat the
memory addresses.

Jan




[Xen-devel] [PATCH 0/2] x86emul: XSA-123 follow-up

2015-03-10 Thread Jan Beulich
1: drop unused "bigval" fields from struct operand
2: simplify asm() constraints

Signed-off-by: Jan Beulich 




Re: [Xen-devel] [Xen-users] Grant reference batch transmission

2015-03-10 Thread Ian Campbell
Hi Gareth,

I think this counts as a -devel question, so I've added -devel and moved
-users to bcc (-users is more for end users). I've done less quote
trimming than usual for the other folks on -devel.

On Tue, 2015-03-10 at 14:15 +, Gareth Stockwell wrote:
> What is the recommended way to transmit batches of grant references
> between Linux domains?

> I need to share large regions of memory between domains.  I understand
> that the grant table API can be used to share memory, with one grant
> reference being created per page (frame) to be shared:

> gref = gnttab_grant_foreign_access(otherend_id, frame, readonly);

> or

> err = gnttab_alloc_grant_references(num_grefs, &gref_head);
> gref = gnttab_claim_reference(&gref_head);
> err = gnttab_grant_foreign_access_ref(gref, otherend_id, frame,
> readonly);
> 
>  
> 
> In order to share a large number of pages, it is desirable to minimise
> both the number of hypercalls required in each domain, and the number
> of messages (e.g. xenstore writes) required to transmit grant
> reference(s) from the donor to the recipient.
> 
>  
> 
> I see that gnttab_grant_foreign_access just updates grant table fields
> in a page which is mapped into the donor, and does not require a
> hypercall.  In the recipient domain, multiple grant references can be
> mapped by gnttab_map_refs using a single GNTTABOP_map_grant_ref
> hypercall (assuming that the target memory is not paged out).

Correct. Granting access to a page is just a case of writing to a local
page and the mapping interface is batched.

> What is the recommended way for the donor to transmit a batch of grant
> references?  I assume that this requires the donor to pack references
> into an index page, grant foreign access to the index and transmit the
> index grant reference.  Does Linux provide any way to do this, or are
> xenbus drivers expected to implement their own batch transmission?

A bit of each. You would indeed want to set up a shared page and push the
references into it, and Linux (/the Xen interface headers) provides some
helpers for this sort of thing, but each driver largely sets things up
itself using a specific ring request format etc.

The actual ring structure helpers are in
xen.git/xen/include/public/io/ring.h. You would define a request and
response pair (e.g. containing one or more grefs per request) and then
use the macros from ring.h to set up and use the shared ring data
structures.

(NB: xen.git/xen/include/public corresponds to
linux.git/include/xen/interface)

The ring macros include provision for batching and deferred
notifications, so you can balance the number of grefs per request vs.
multiple requests based on your needs.
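
A minimal sketch of the batching idea above, assuming a hypothetical
request format that carries several grefs per request. The names
demo_request, GREFS_PER_REQ and pack_grefs are made up for illustration
and are not part of ring.h; a real driver would declare its own
request/response types and build the ring around them with the ring.h
macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: grant references are 32-bit integers, as in the Xen headers. */
typedef uint32_t grant_ref_t;

/* Hypothetical batching factor: grefs carried by a single request. */
#define GREFS_PER_REQ 8

/* Hypothetical request layout; not taken from any existing driver. */
struct demo_request {
    uint16_t id;                       /* echoed back in the response */
    uint16_t nr_grefs;                 /* valid entries in gref[] */
    grant_ref_t gref[GREFS_PER_REQ];
};

/*
 * Split `count` grefs into requests of up to GREFS_PER_REQ each.
 * Returns the number of requests written, or 0 if `reqs` is too small
 * (or if there was nothing to pack).
 */
size_t pack_grefs(const grant_ref_t *grefs, size_t count,
                  struct demo_request *reqs, size_t max_reqs)
{
    size_t needed = (count + GREFS_PER_REQ - 1) / GREFS_PER_REQ;
    size_t r, i;

    if (needed > max_reqs)
        return 0;

    for (r = 0; r < needed; r++) {
        size_t base = r * GREFS_PER_REQ;
        size_t n = count - base;

        if (n > GREFS_PER_REQ)
            n = GREFS_PER_REQ;

        reqs[r].id = (uint16_t)r;
        reqs[r].nr_grefs = (uint16_t)n;
        for (i = 0; i < n; i++)
            reqs[r].gref[i] = grefs[base + i];
    }
    return needed;
}
```

A larger GREFS_PER_REQ means fewer requests (and notifications) per
transfer at the cost of bigger ring slots, which is the trade-off
described above.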

As far as setup of the ring itself goes, typically the frontend would
allocate one of its pages, grant it to the backend and communicate that
to the backend via xenstore. Most drivers use a little start of day
synchronisation protocol based around the "state" keys in the front and
backend xenstore dirs, working through the states in enum xenbus_state
(XenbusState*) from xen/include/public/io/xenbus.h. It's assumed that this
setup is infrequent (i.e. corresponds to plugging in a new disk etc).

xen/include/public/io/blkif.h has an example of how that works in the
case of the blk driver.

In Linux (for most drivers at least, yours may not fit this
infrastructure) that state machine can be driven from
the .otherend_changed callback in the struct xenbus_driver ops struct.
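
A simplified sketch of the kind of decision an .otherend_changed handler
makes on the frontend side. The enum mirrors the values in xenbus.h;
demo_frontend_next is a hypothetical helper, and the handshake shown is
an assumption loosely modelled on typical frontends, not the code of any
particular driver.

```c
/* Mirrors enum xenbus_state from xen/include/public/io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};

/*
 * Given the frontend's own state and the state the backend just moved
 * to, return the state the frontend should switch to. Returning `self`
 * means "nothing to do for this notification".
 */
enum xenbus_state demo_frontend_next(enum xenbus_state self,
                                     enum xenbus_state backend)
{
    switch (backend) {
    case XenbusStateInitWait:
        /* Backend is waiting: publish ring-ref/event-channel, then advance. */
        return self == XenbusStateInitialising ? XenbusStateInitialised : self;
    case XenbusStateConnected:
        /* Backend has mapped the ring: the device is usable. */
        return self == XenbusStateInitialised ? XenbusStateConnected : self;
    case XenbusStateClosing:
        return XenbusStateClosing;
    case XenbusStateClosed:
        return XenbusStateClosed;
    default:
        return self;
    }
}
```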

http://wiki.xen.org/wiki/XenBus covers some of this in the first third,
but TBH it's not as helpful as it could be. I thought we had something
better somewhere (a whitepaper or something), but I can't find any sign
of it; perhaps someone on the list has a reference.

Other than that there is the code in Linux. I think both net and blkback
put most of the initial setup xenbus stuff in their respective
drivers/{block,net}/xen-{blk,net}back/xenbus.c.

For the frontend (drivers/{block,net}/xen-{blk,net}front.c) it's in the
single file.

In both cases the .otherend_changed hook is probably the place to start.

I hope that helps.

Ian.





[Xen-devel] [PATCH 1/2] x86emul: drop unused "bigval" fields from struct operand

2015-03-10 Thread Jan Beulich
Signed-off-by: Jan Beulich 

--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -313,17 +313,11 @@ struct operand {
 enum { OP_REG, OP_MEM, OP_IMM, OP_NONE } type;
 unsigned int bytes;
 
-/* Up to 128-byte operand value, addressable as ulong or uint32_t[]. */
-union {
-unsigned long val;
-uint32_t bigval[4];
-};
+/* Operand value. */
+unsigned long val;
 
-/* Up to 128-byte operand value, addressable as ulong or uint32_t[]. */
-union {
-unsigned long orig_val;
-uint32_t orig_bigval[4];
-};
+/* Original operand value. */
+unsigned long orig_val;
 
 union {
 /* OP_REG: Pointer to register field. */





Re: [Xen-devel] [PATCH v3 20/24] xen/passthrough: Extend XEN_DOMCTL_assign_device to support DT device

2015-03-10 Thread Julien Grall
Hi Daniel,

On 23/02/15 16:25, Daniel De Graaf wrote:
> On 02/20/2015 12:17 PM, Ian Campbell wrote:
>> On Tue, 2015-01-13 at 14:25 +, Julien Grall wrote:
>>> TODO: Update the commit message
>>>
>>> A device node is described by a path. It will be used to retrieve the
>>> node in the device tree and assign the related device to the domain.
>>>
>>> Only devices protected by an IOMMU can be assigned to a guest.
>>>
>>> Signed-off-by: Julien Grall 
>>> Cc: Ian Jackson 
>>> Cc: Wei Liu 
>>> Cc: Jan Beulich 
>>>
>>> ---
>>>  Changes in v2:
>>>  - Use a different number for XEN_DOMCTL_assign_dt_device
>>> ---
>>>   tools/libxc/include/xenctrl.h | 10 
>>>   tools/libxc/xc_domain.c   | 95
>>> --
>>
>> These bits all look fine.
>>
>>> +int iommu_do_dt_domctl(struct xen_domctl *domctl, struct domain *d,
>>> +   XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>> +{
>>> +int ret;
>>> +struct dt_device_node *dev;
>>> +
>>> +/* TODO: How to deal with XSM? */
>>
>> Adding Daniel.
>>
>> It seems the PCI ones are protected by
>>  xsm_test_assign_device(XSM_HOOK,
>> domctl->u.assign_device.machine_sbdf);
>>
>> So it seems that either this needs to become "test_assign_pci_device" and
>> a similar "test_assign_dt_device" needs to be added and plumbed through
>> or it needs to grow a type parameter and take the union for the
>> identifier.
> 
> Either would work, but a distinct hook seems simpler to me, especially as
> the call sites are distinct and the hook would process them differently.

Sounds good.

>> The code to apply an XSM context to a DT node would need consideration
>> too I suppose?
> 
> This may require a bit more thought.  At first glance, the dt_phandle
> field seems to be an identifier that could be used by FLASK to identify a
> device using an ocontext lookup.  Labeling would then be done in the same
> way as PCI devices and x86 legacy I/O ports.

We don't always have a dt_phandle in hand. They are mostly used for
referencing a node within another (such as IOMMU, interrupt
controller...). Also, the value is controlled by the compiler.

AFAICT, the only unique value we have in hand is the path of the device.

BTW, do you have any pointers on how to write a policy for device/IRQ
passthrough?

Regards,

-- 
Julien Grall



[Xen-devel] [PATCH v3 6/7] vTPM: Parse event string from QEMU frontend

2015-03-10 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 extras/mini-os/tpmback.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/extras/mini-os/tpmback.c b/extras/mini-os/tpmback.c
index 8a0a983..b8f4c8f 100644
--- a/extras/mini-os/tpmback.c
+++ b/extras/mini-os/tpmback.c
@@ -732,11 +732,22 @@ static int parse_eventstr(const char* evstr, domid_t* 
domid, unsigned int* handl
 return EV_NONE;
   }
   return EV_NEWFE;
-   } else if((ret = sscanf(evstr, "/local/domain/%u/device/vtpm/%u/%40s", 
&udomid, handle, cmd)) == 3) {
-  *domid = udomid;
-  if (!strcmp(cmd, "state"))
-return EV_STCHNG;
-   }
+
+/* vtpm and PV virtual machines */
+} else if((ret = sscanf(evstr, "/local/domain/%u/device/vtpm/%u/%40s",
+&udomid, handle, cmd)) == 3) {
+*domid = udomid;
+if (!strcmp(cmd, "state"))
+return EV_STCHNG;
+
+/* HVM virtual machines */
+} else if((ret = sscanf(evstr, "/local/domain/0/frontend/vtpm/%u/%u/%40s",
+&udomid, handle, cmd)) == 3) {
+*domid = 0;
+if (!strcmp(cmd, "state"))
+return EV_STCHNG;
+}
+
return EV_NONE;
 }
 
-- 
1.8.3.2
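
The two path layouts matched by the patch above can be exercised in
isolation. The following is a self-contained sketch, not the mini-os
code: demo_parse_state_event and the DEMO_EV_* values are hypothetical
stand-ins for parse_eventstr and its EV_* return codes, and only the
"state" key handling is reproduced.

```c
#include <stdio.h>
#include <string.h>

enum demo_event { DEMO_EV_NONE, DEMO_EV_STCHNG };

/*
 * Classify a xenstore watch path. On a "state" change, store the
 * frontend domain id and device handle and return DEMO_EV_STCHNG.
 */
enum demo_event demo_parse_state_event(const char *evstr,
                                       unsigned int *domid,
                                       unsigned int *handle)
{
    unsigned int udomid;
    char cmd[41];   /* %40s stores up to 40 characters plus the NUL */

    /* PV layout: /local/domain/<frontend domid>/device/vtpm/<handle>/<key> */
    if (sscanf(evstr, "/local/domain/%u/device/vtpm/%u/%40s",
               &udomid, handle, cmd) == 3) {
        *domid = udomid;
        return strcmp(cmd, "state") ? DEMO_EV_NONE : DEMO_EV_STCHNG;
    }

    /* HVM layout: /local/domain/0/frontend/vtpm/<backend domid>/<handle>/<key> */
    if (sscanf(evstr, "/local/domain/0/frontend/vtpm/%u/%u/%40s",
               &udomid, handle, cmd) == 3) {
        *domid = 0;   /* the frontend (QEMU) runs in Domain-0 */
        return strcmp(cmd, "state") ? DEMO_EV_NONE : DEMO_EV_STCHNG;
    }

    return DEMO_EV_NONE;
}
```

Note that in the HVM layout the first number in the path is the backend
domain, so the frontend domid is fixed at 0 rather than taken from the
path, as in the patch.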




[Xen-devel] [PATCH v3 5/7] vTPM: Delete the xenstore directory of frontend device

2015-03-10 Thread Quan Xu
Delete the frontend device's xenstore directory when the virtual machine
is destroyed.

Signed-off-by: Quan Xu 
---
 tools/libxl/libxl_device.c | 61 +++---
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index b1a71fe..668bf71 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -660,10 +660,11 @@ void libxl__devices_destroy(libxl__egc *egc, 
libxl__devices_remove_state *drs)
 {
 STATE_AO_GC(drs->ao);
 uint32_t domid = drs->domid;
-char *path;
-unsigned int num_kinds, num_dev_xsentries;
-char **kinds = NULL, **devs = NULL;
-int i, j, rc = 0;
+char *path, *dom_name, *name;
+unsigned int num_kinds, num_fkinds, num_dev_xsentries, num_dev;
+char **kinds = NULL, **fkinds = NULL, **devs = NULL, **sdevs = NULL,
+**be_doms = NULL;
+int i, j, k, rc = 0;
 libxl__device *dev;
 libxl__multidev *multidev = &drs->multidev;
 libxl__ao_device *aodev;
@@ -731,6 +732,58 @@ void libxl__devices_destroy(libxl__egc *egc, 
libxl__devices_remove_state *drs)
 libxl__device_destroy(gc, dev);
 }
 
+/*
+ * Frontend device, such as vTPM, is under:
+ * /local/domain/0/frontend/{type}/{backend_dom_id}/{dev}
+ */
+path = GCSPRINTF("/local/domain/%d/frontend", 0);
+fkinds = libxl__xs_directory(gc, XBT_NULL, path, &num_fkinds);
+if (!fkinds) {
+if (errno != ENOENT) {
+LOGE(ERROR, "unable to get xenstore device listing %s", path);
+goto out;
+}
+num_fkinds = 0;
+}
+
+name = libxl_domid_to_name(CTX, domid);
+
+/* /local/domain/0/frontend/{type} */
+for (i = 0; i < num_fkinds; i++) {
+if (libxl__device_kind_from_string(fkinds[i], &kind))
+continue;
+
+path = GCSPRINTF("/local/domain/0/frontend/%s", fkinds[i]);
+be_doms = libxl__xs_directory(gc, XBT_NULL, path, &num_dev_xsentries);
+if (!be_doms)
+continue;
+
+/* /local/domain/0/frontend/{type}/{backend_dom_id} */
+for (j = 0; j < num_dev_xsentries; j++) {
+path = GCSPRINTF("/local/domain/0/frontend/%s/%d",
+  fkinds[i], atoi(be_doms[j]));
+sdevs = libxl__xs_directory(gc, XBT_NULL, path, &num_dev);
+
+/* /local/domain/0/frontend/{type}/{backend_dom_id}/{dev} */
+for (k = 0; k < num_dev; k++) {
+path = GCSPRINTF("/local/domain/0/frontend/%s/%d/%d/domain",
+ fkinds[i], atoi(be_doms[j]), atoi(sdevs[k]));
+dom_name = libxl__xs_read(gc, XBT_NULL, path);
+if (strcmp(name, dom_name)) {
+continue;
+}
+
+path = GCSPRINTF("/local/domain/0/frontend/%s/%d/%d/backend",
+ fkinds[i], atoi(be_doms[j]), atoi(sdevs[k]));
+path = libxl__xs_read(gc, XBT_NULL, path);
+GCNEW(dev);
+if (path && strcmp(path, "") &&
+libxl__parse_backend_path(gc, path, dev) == 0)
+libxl__device_destroy(gc, dev);
+}
+}
+}
+
 out:
 libxl__multidev_prepared(egc, multidev, rc);
 }
-- 
1.8.3.2




[Xen-devel] [PATCH v3 7/7] vTPM: add QEMU_STUBDOM_VTPM compile option

2015-03-10 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 Config.mk  | 4 
 tools/Makefile | 7 +++
 2 files changed, 11 insertions(+)

diff --git a/Config.mk b/Config.mk
index a5b6c41..5a5f413 100644
--- a/Config.mk
+++ b/Config.mk
@@ -254,6 +254,10 @@ endif
 OVMF_UPSTREAM_REVISION ?= 447d264115c476142f884af0be287622cd244423
 QEMU_UPSTREAM_REVISION ?= qemu-xen-4.5.0-rc1
 SEABIOS_UPSTREAM_REVISION ?= rel-1.7.5
+
+# Qemu stubdom vtpm frontend.
+QEMU_STUBDOM_VTPM ?= n
+
 # Thu May 22 16:59:16 2014 -0400
 # python3 fixes for vgabios and csm builds.
 
diff --git a/tools/Makefile b/tools/Makefile
index af9798a..1044149 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -197,6 +197,12 @@ else
 QEMU_XEN_ENABLE_DEBUG :=
 endif
 
+ifeq ($(QEMU_STUBDOM_VTPM),y)
+QEMU_TPM_ARGS="--enable-tpm"
+else
+QEMU_TPM_ARGS="--disable-tpm"
+endif
+
 subdir-all-qemu-xen-dir: qemu-xen-dir-find
if test -d $(QEMU_UPSTREAM_LOC) ; then \
source=$(QEMU_UPSTREAM_LOC); \
@@ -222,6 +228,7 @@ subdir-all-qemu-xen-dir: qemu-xen-dir-find
--datadir=$(SHAREDIR)/qemu-xen \
--localstatedir=$(localstatedir) \
--disable-kvm \
+$(QEMU_TPM_ARGS) \
--disable-docs \
--disable-guest-agent \
--python=$(PYTHON) \
-- 
1.8.3.2




[Xen-devel] [PATCH v3 4/7] vTPM: add vTPM device for HVM virtual machine

2015-03-10 Thread Quan Xu
Refactor libxl__device_vtpm_add to call the right helper,
libxl__device_vtpm_add_{pv,hvm}. HVM virtual machines do not
support hot-plug and hot-unplug, as SeaBios has to initialize
ACPI and the virtual MMIO space for the TPM TIS when the virtual
machine starts.

Signed-off-by: Quan Xu 
---
 tools/libxl/libxl.c  | 176 +--
 tools/libxl/libxl.h  |   7 +-
 tools/libxl/libxl_create.c   |  32 +---
 tools/libxl/libxl_device.c   |  38 +-
 tools/libxl/libxl_dm.c   |  12 +++
 tools/libxl/libxl_internal.h |   6 +-
 tools/libxl/xl_cmdimpl.c |   4 +-
 7 files changed, 253 insertions(+), 22 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 18561fb..c2d4baa 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -1904,6 +1904,25 @@ out:
 return;
 }
 
+static int libxl__frontend_device_nextid(libxl__gc *gc, uint32_t domid, char 
*device)
+{
+char *dompath, **l;
+unsigned int nb;
+int nextid = -1;
+
+if (!(dompath = libxl__xs_get_dompath(gc, domid)))
+return nextid;
+
+l = libxl__xs_directory(gc, XBT_NULL,
+GCSPRINTF("/local/domain/0/frontend/%s/%u", device, domid), &nb);
+if (l == NULL || nb == 0)
+nextid = 0;
+else
+nextid = strtoul(l[nb - 1], NULL, 10) + 1;
+
+return nextid;
+}
+
 /* common function to get next device id */
 static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
 {
@@ -1957,9 +1976,9 @@ static int libxl__device_from_vtpm(libxl__gc *gc, 
uint32_t domid,
return 0;
 }
 
-void libxl__device_vtpm_add(libxl__egc *egc, uint32_t domid,
-   libxl_device_vtpm *vtpm,
-   libxl__ao_device *aodev)
+void libxl__device_vtpm_add_pv(libxl__egc *egc, uint32_t domid,
+   libxl_device_vtpm *vtpm,
+   libxl__ao_device *aodev)
 {
 STATE_AO_GC(aodev->ao);
 flexarray_t *front;
@@ -2073,6 +2092,134 @@ out:
 return;
 }
 
+void libxl__device_vtpm_add_hvm(libxl__egc *egc, uint32_t domid,
+libxl_device_vtpm *vtpm,
+libxl__ao_device *aodev)
+{
+STATE_AO_GC(aodev->ao);
+flexarray_t *front;
+flexarray_t *back;
+libxl__device *device;
+unsigned int rc;
+xs_transaction_t t = XBT_NULL;
+libxl_domain_config d_config;
+libxl_device_vtpm vtpm_saved;
+libxl__domain_userdata_lock *lock = NULL;
+
+libxl_domain_config_init(&d_config);
+libxl_device_vtpm_init(&vtpm_saved);
+libxl_device_vtpm_copy(CTX, &vtpm_saved, vtpm);
+
+rc = libxl__device_vtpm_setdefault(gc, vtpm);
+if (rc)
+goto out;
+
+front = flexarray_make(gc, 16, 1);
+back = flexarray_make(gc, 16, 1);
+
+if ((vtpm->devid = libxl__frontend_device_nextid(gc,
+   vtpm->backend_domid, "vtpm")) < 0) {
+rc =  ERROR_FAIL;
+goto out;
+}
+
+GCNEW(device);
+rc = libxl__device_from_vtpm(gc, 0, vtpm, device);
+if (rc != 0)
+goto out;
+flexarray_append(back, "frontend-id");
+flexarray_append(back, "0");
+flexarray_append(back, "online");
+flexarray_append(back, "1");
+flexarray_append(back, "state");
+flexarray_append(back, "1");
+flexarray_append(back, "handle");
+flexarray_append(back, GCSPRINTF("%d", vtpm->devid));
+
+flexarray_append(back, "uuid");
+flexarray_append(back, GCSPRINTF(LIBXL_UUID_FMT,
+ LIBXL_UUID_BYTES(vtpm->uuid)));
+flexarray_append(back, "resume");
+flexarray_append(back, "False");
+
+flexarray_append(front, "backend-id");
+flexarray_append(front, GCSPRINTF("%d", vtpm->backend_domid));
+flexarray_append(front, "state");
+flexarray_append(front, "1");
+flexarray_append(front, "handle");
+flexarray_append(front, GCSPRINTF("%d", vtpm->devid));
+
+flexarray_append(front, "domain");
+flexarray_append(front, GCSPRINTF("%s", libxl__domid_to_name(gc, domid)));
+
+if (aodev->update_json) {
+lock = libxl__lock_domain_userdata(gc, domid);
+if (!lock) {
+rc = ERROR_LOCK_FAIL;
+goto out;
+}
+
+rc = libxl__get_domain_configuration(gc, domid, &d_config);
+if (rc)
+goto out;
+
+DEVICE_ADD(vtpm, vtpms, domid, &vtpm_saved, COMPARE_DEVID, &d_config);
+}
+
+for (;;) {
+rc = libxl__xs_transaction_start(gc, &t);
+if (rc)
+goto out;
+
+rc = libxl__device_exists(gc, t, device);
+if (rc < 0)
+goto out;
+if (rc == 1) {
+
+/* already exists in xenstore */
+LOG(ERROR, "device already exists in xenstore");
+aodev->action = LIBXL__DEVICE_ACTION_ADD; /* for error message */
+rc = ERROR_DEVICE_EXISTS;
+goto out;
+}
+
+if (aodev->update_json) {
+rc = 
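
The libxl__frontend_device_nextid helper in the patch above derives the
next id from the last directory entry. Below is a sketch of a variant
that takes the maximum over all entries instead, under the assumption
that the listing is not guaranteed to be in numeric order (the patch does
not say either way). demo_next_devid is a hypothetical name and works on
a plain string array rather than on xenstore.

```c
#include <stdlib.h>

/*
 * Return the next free device id given the names of the existing
 * entries (decimal strings, as a directory listing would return them).
 * An empty or missing listing yields 0.
 */
int demo_next_devid(char *const *entries, unsigned int nb)
{
    unsigned long max = 0;
    unsigned int i;

    if (entries == NULL || nb == 0)
        return 0;

    for (i = 0; i < nb; i++) {
        unsigned long id = strtoul(entries[i], NULL, 10);

        if (id > max)
            max = id;
    }
    return (int)(max + 1);
}
```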

[Xen-devel] [PATCH v4 0/5] QEMU:Xen stubdom vTPM for HVM virtual machine

2015-03-10 Thread Quan Xu
*INTRODUCTION*
The goal of the virtual Trusted Platform Module (vTPM) is to provide TPM
functionality to virtual machines (Fedora, Ubuntu, Redhat, Windows, etc.). This
allows programs to interact with a TPM in a virtual machine the same way they
interact with a TPM on the physical system. Each virtual machine gets its own
unique, emulated, software TPM. Each major component of vTPM is implemented as
a stubdom, providing secure separation guaranteed by the hypervisor.

The vTPM stubdom is a Xen mini-OS domain that emulates a TPM for the virtual
machine to use. It is a small wrapper around the Berlios TPM emulator. TPM
commands are passed in from the mini-os TPM backend driver.

*ARCHITECTURE*
The architecture of stubdom vTPM for HVM virtual machine:

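+--------------------+
| Windows/Linux DomU | ...
|        |  ^        |
|        v  |        |
|  Qemu tpm1.2 Tis   |
|        |  ^        |
|        v  |        |
| XenStubdoms backend|
+--------------------+
         |  ^
         v  |
+--------------------+
|      XenDevOps     |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|    vtpm-stubdom    | ...
|        |  ^        |
|        v  |        |
|  mini-os/tpmfront  |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|  vtpmmgr-stubdom   |
|        |  ^        |
|        v  |        |
|  mini-os/tpm_tis   |
+--------------------+
         |  ^
         v  |
+--------------------+
|    Hardware TPM    |
+--------------------+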



 * Windows/Linux DomU:
The HVM based guest that wants to use a vTPM. There may be
more than one of these.

 * Qemu tpm1.2 Tis:
Implementation of the tpm1.2 Tis interface for HVM virtual
machines. It is a Qemu emulated device.

 * vTPM xenstubdoms driver:
Qemu vTPM driver. This driver provides vtpm initialization
and sends data and commands to a para-virtualized vtpm
stubdom.

 * XenDevOps:
Registers the Xen stubdom vTPM frontend driver, and transfers any
request/response between the TPM xenstubdoms driver and the Xen vTPM
stubdom. Facilitates communications between the Xen vTPM stubdom
and the vTPM xenstubdoms driver.

 * mini-os/tpmback:
Mini-os TPM backend driver. The Linux frontend driver connects
to this backend driver to facilitate communications between the
Linux DomU and its vTPM. This driver is also used by vtpmmgr
stubdom to communicate with vtpm-stubdom.

 * vtpm-stubdom:
A mini-os stub domain that implements a vTPM. There is a
one to one mapping between running vtpm-stubdom instances and
logical vtpms on the system. The vTPM Platform Configuration
Registers (PCRs) are all initialized to zero.

 * mini-os/tpmfront:
Mini-os TPM frontend driver. The vTPM mini-os domain vtpm
stubdom uses this driver to communicate with vtpmmgr-stubdom.
This driver could also be used separately to implement a mini-os
domain that wishes to use a vTPM of its own.

 * vtpmmgr-stubdom:
A mini-os domain that implements the vTPM manager. There is only
one vTPM manager and it should be running during the entire lifetime
of the machine. The vtpmmgr domain securely stores encryption keys for
each of the vtpms and accesses the hardware TPM to get the root of
trust for the entire system.

 * mini-os/tpm_tis:
Mini-os TPM version 1.2 TPM Interface Specification (TIS) driver.
This driver is used by vtpmmgr-stubdom to talk directly to the hardware
TPM. Communication is facilitated by mapping hardware memory pages
into the vtpmmgr stubdom.

 * Hardware TPM: The physical TPM 1.2 that is soldered onto the motherboard.

--Changes in v4:
-Fix the comment style
-Redesign vTPM xenstore architecture for HVM virtual machine.
-Remove unnecessary busy loop.
-Call xen_fe_register(vtpm ...) directly and move some initialization
 chunks into the xen_vtpmdev_ops .init function.
-New xen_pvdev.c file
-Move xendevs queue to xen_pvdev.c
-Move xenstore functions to xen_pvdev.c
-Check status before setting the frontend to connect
-qapi schema enhancement.
-Remove unneeded code.


--Changes in v3:
-New xen_frontend.c file
-Adjust the format of command line options
-Move xenbus_switch_state() to xen_frontend.c
-Move xen_stubdom_be() to xenstore_fe_read_be_str()
-Move *_stubdom_*() to *_fe_*()
-Move xen_stubd

[Xen-devel] [PATCH v4 3/5] Qemu-Xen-vTPM: Register Xen stubdom vTPM frontend driver

2015-03-10 Thread Quan Xu
This driver transfers any request/response between the TPM xenstubdoms
driver and the Xen vTPM stubdom, and facilitates communications between
the Xen vTPM stubdom domain and the vTPM xenstubdoms driver. It is glue
between the TPM xenstubdoms driver and the Xen stubdom vTPM domain that
provides the actual TPM functionality.

(Xen) The Xen backend driver should run before this frontend, and
initialize XenStore as follows for communication.

[XenStore]

for example:

Domain 0: runs QEMU for guest A
Domain 1: vtpmmgr
Domain 2: vTPM for guest A
Domain 3: HVM guest A

[...]
 local = ""
  domain = ""
   0 = ""
    frontend = ""
     vtpm = ""
      2 = ""
       0 = ""
        backend = "/local/domain/2/backend/vtpm/0/0"
        backend-id = "2"
        state = "*"
        handle = "0"
        domain = "Domain3's name"
        ring-ref = "*"
        event-channel = "*"
        feature-protocol-v2 = "1"
    backend = ""
     qdisk = ""
      [...]
     console = ""
     vif = ""
      [...]
   2 = ""
    [...]
    backend = ""
     vtpm = ""
      0 = ""
       0 = ""
        frontend = "/local/domain/0/frontend/vtpm/2/0"
        frontend-id = "0" ('0', frontend is running in Domain-0)
        [...]
   3 = ""
    [...]
    device = "" (frontend device, the backend is running in QEMU/.etc)
     vkbd = ""
      [...]
     vif = ""
      [...]

 ..
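
The frontend and backend nodes in the layout above point at each other.
A small sketch of how the two paths relate, assuming the frontend always
lives in Domain-0 as in this example; demo_vtpm_frontend_path and
demo_vtpm_backend_path are hypothetical helpers, not functions from QEMU
or libxl.

```c
#include <stdio.h>

/* /local/domain/0/frontend/vtpm/<backend domid>/<devid> */
int demo_vtpm_frontend_path(char *buf, size_t len,
                            unsigned int be_domid, unsigned int devid)
{
    return snprintf(buf, len, "/local/domain/0/frontend/vtpm/%u/%u",
                    be_domid, devid);
}

/* /local/domain/<backend domid>/backend/vtpm/<frontend domid>/<devid> */
int demo_vtpm_backend_path(char *buf, size_t len, unsigned int be_domid,
                           unsigned int fe_domid, unsigned int devid)
{
    return snprintf(buf, len, "/local/domain/%u/backend/vtpm/%u/%u",
                    be_domid, fe_domid, devid);
}
```

With the vTPM stubdom as Domain 2 and device 0, these produce the
"frontend" and "backend" values shown in the tree.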

(QEMU) xen_vtpmdev_ops is initialized with the following process:
  xen_hvm_init()
[...]
-->xen_fe_register("vtpm", ...)
  -->xenstore_fe_scan()
-->xen_fe_try_init(ops)
  --> XenDevOps.init()
-->xen_fe_get_xendev()
  --> XenDevOps.alloc()
-->xen_fe_check()
  -->xen_fe_try_initialise()
--> XenDevOps.initialise()
  -->xen_fe_try_connected()
--> XenDevOps.connected()
-->xs_watch()
[...]

--Changes in v3:
-Move xen_stubdom_vtpm.c to xen_vtpm_frontend.c
-Read Xen vTPM status via XenStore

--Changes in v4:
-Redesign vTPM xenstore architecture for HVM virtual machine.
-Remove unnecessary busy loop.
-Call xen_fe_register(vtpm ...) directly and move some initialization
 chunks into the xen_vtpmdev_ops .init function.

Signed-off-by: Quan Xu 
---
 hw/tpm/Makefile.objs |   1 +
 hw/tpm/xen_vtpm_frontend.c   | 278 +++
 hw/xen/xen_frontend.c|  20 
 include/hw/xen/xen_backend.h |   5 +
 include/hw/xen/xen_common.h  |   6 +
 xen-hvm.c|   5 +
 6 files changed, 315 insertions(+)
 create mode 100644 hw/tpm/xen_vtpm_frontend.c

diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
index 99f5983..57919fa 100644
--- a/hw/tpm/Makefile.objs
+++ b/hw/tpm/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
 common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o
+common-obj-$(CONFIG_TPM_XENSTUBDOMS) += xen_vtpm_frontend.o
diff --git a/hw/tpm/xen_vtpm_frontend.c b/hw/tpm/xen_vtpm_frontend.c
new file mode 100644
index 000..4ef0a26
--- /dev/null
+++ b/hw/tpm/xen_vtpm_frontend.c
@@ -0,0 +1,278 @@
+/*
+ * Connect to Xen vTPM stubdom domain
+ *
+ *  Copyright (c) 2015 Intel Corporation
+ *  Authors:
+ *Quan Xu 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "hw/hw.h"
+#include "block/aio.h"
+#include "hw/xen/xen_backend.h"
+
+#define XS_STUBDOM_VTPM_ENABLE"1"
+
+enum tpmif_state {
+TPMIF_STATE_IDLE,/* no contents / vTPM idle / cancel complete */
+TPMIF_STATE_SUBMIT,  /* request ready / vTPM working */
+TPMIF_STATE_FINISH,  /* response ready / vTPM idle */
+TPMIF_STATE_CANCEL,  /* cancel requested / vTPM working */
+};
+
+static AioContext *vtpm_aio_ctx;
+
+enum status_bits {
+VTPM_STATUS_RUNNING  = 0x1,
+VTPM_STATUS_IDLE = 0x2,
+VTPM_STATUS_RESULT   = 0x4,
+VTPM_STATUS_CANCELED = 0x8,
+};
+
+struct tpmif_shared_page {
+uint32_t length; /* request/response length in bytes */
+
+uint8_t  state;   /* enum tpmif_state */
+uint8_t  locality;/* for the current request */
+uint8_t  pad; /* should be zero */
+
+uint8_t  nr_extra_pages;  /* extra pages for long pa
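
The comments on enum tpmif_state above suggest which side moves the
shared page into which state. A sketch of that reading, assuming the
frontend only ever writes SUBMIT or CANCEL and the backend only FINISH
or IDLE; demo_writer and demo_transition_ok are hypothetical, and the
exact set of allowed transitions is an assumption drawn from the enum
comments rather than from this patch.

```c
enum tpmif_state {
    TPMIF_STATE_IDLE,    /* no contents / vTPM idle / cancel complete */
    TPMIF_STATE_SUBMIT,  /* request ready / vTPM working */
    TPMIF_STATE_FINISH,  /* response ready / vTPM idle */
    TPMIF_STATE_CANCEL   /* cancel requested / vTPM working */
};

enum demo_writer { DEMO_FRONTEND, DEMO_BACKEND };

/* Return 1 if `who` may move the shared page from `from` to `to`. */
int demo_transition_ok(enum demo_writer who,
                       enum tpmif_state from, enum tpmif_state to)
{
    if (who == DEMO_FRONTEND) {
        /* Submit a new request once the previous one is done. */
        if (to == TPMIF_STATE_SUBMIT)
            return from == TPMIF_STATE_IDLE || from == TPMIF_STATE_FINISH;
        /* Ask for cancellation of a request that is in flight. */
        if (to == TPMIF_STATE_CANCEL)
            return from == TPMIF_STATE_SUBMIT;
        return 0;
    }

    /* Backend: publish a response, or acknowledge a cancel. */
    if (to == TPMIF_STATE_FINISH)
        return from == TPMIF_STATE_SUBMIT;
    if (to == TPMIF_STATE_IDLE)
        return from == TPMIF_STATE_CANCEL;
    return 0;
}
```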

[Xen-devel] [PATCH v3 0/7] vTPM: Xen stubdom vTPM for HVM virtual machine

2015-03-10 Thread Quan Xu
This patch series is only the Xen part needed to enable stubdom vTPM for HVM
virtual machines. It will work with the Qemu patch series and the SeaBios patch
series. Change the QEMU_STUBDOM_VTPM compile option from 'n' to 'y' once the
Qemu/SeaBios patch series are merged.


*INTRODUCTION*

The goal of the virtual Trusted Platform Module (vTPM) is to provide TPM
functionality to virtual machines (Fedora, Ubuntu, Redhat, Windows, etc.). This
allows programs to interact with a TPM in a virtual machine the same way they
interact with a TPM on the physical system. Each virtual machine gets its own
unique, emulated, software TPM. Each major component of vTPM is implemented as
a stubdom, providing secure separation guaranteed by the hypervisor.

The vTPM stubdom is a Xen mini-OS domain that emulates a TPM for the virtual
machine to use. It is a small wrapper around the Berlios TPM emulator. TPM
commands are passed in from the mini-os TPM backend driver.


 *ARCHITECTURE*

The architecture of stubdom vTPM for HVM virtual machine:

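+--------------------+
| Windows/Linux DomU | ...
|        |  ^        |
|        v  |        |
|  Qemu tpm1.2 Tis   |
|        |  ^        |
|        v  |        |
| XenStubdoms backend|
+--------------------+
         |  ^
         v  |
+--------------------+
|      XenDevOps     |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|    vtpm-stubdom    | ...
|        |  ^        |
|        v  |        |
|  mini-os/tpmfront  |
+--------------------+
         |  ^
         v  |
+--------------------+
|  mini-os/tpmback   |
|        |  ^        |
|        v  |        |
|  vtpmmgr-stubdom   |
|        |  ^        |
|        v  |        |
|  mini-os/tpm_tis   |
+--------------------+
         |  ^
         v  |
+--------------------+
|    Hardware TPM    |
+--------------------+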

 * Windows/Linux DomU:
The HVM based guest that wants to use a vTPM. There may be
more than one of these.

 * Qemu tpm1.2 Tis:
Implementation of the tpm1.2 Tis interface for HVM virtual
machines. It is a Qemu emulated device.

 * vTPM xenstubdoms driver:
Qemu vTPM driver. This driver provides vtpm initialization
and sends data and commands to a para-virtualized vtpm
stubdom.

 * XenDevOps:
Registers the Xen stubdom vTPM frontend driver, and transfers any
request/response between the TPM xenstubdoms driver and the Xen vTPM
stubdom. Facilitates communications between the Xen vTPM stubdom
and the vTPM xenstubdoms driver.

 * mini-os/tpmback:
Mini-os TPM backend driver. The Linux frontend driver connects
to this backend driver to facilitate communications between the
Linux DomU and its vTPM. This driver is also used by vtpmmgr
stubdom to communicate with vtpm-stubdom.

 * vtpm-stubdom:
A mini-os stub domain that implements a vTPM. There is a
one to one mapping between running vtpm-stubdom instances and
logical vtpms on the system. The vTPM Platform Configuration
Registers (PCRs) are all initialized to zero.

 * mini-os/tpmfront:
Mini-os TPM frontend driver. The vTPM mini-os domain vtpm
stubdom uses this driver to communicate with vtpmmgr-stubdom.
This driver could also be used separately to implement a mini-os
domain that wishes to use a vTPM of its own.

 * vtpmmgr-stubdom:
A mini-os domain that implements the vTPM manager. There is only
one vTPM manager and it should be running during the entire lifetime
of the machine. The vtpmmgr domain securely stores encryption keys for
each of the vtpms and accesses the hardware TPM to get the root of
trust for the entire system.

 * mini-os/tpm_tis:
Mini-os TPM version 1.2 TPM Interface Specification (TIS) driver.
This driver is used by vtpmmgr-stubdom to talk directly to the hardware
TPM. Communication is facilitated by mapping hardware memory pages
into vtpmmgr stubdom.

 * Hardware TPM: The physical TPM 1.2 that is soldered onto the motherboard.


*BUILD & TEST*

The following steps describe how to build and test it:

1. My SeaBios patch against upstream seabios has not been submitted yet; I
will submit it later. For now the patch is archived on Github:
https://github.com/virt2x/seabios2 , which can be built for testing.

Configure it with Xen,
---  Con

[Xen-devel] [PATCH v4 5/5] Qemu-Xen-vTPM: QEMU machine class is initialized before tpm_init()

2015-03-10 Thread Quan Xu
Make sure the QEMU machine class is initialized and QEMU has registered
the Xen stubdom vTPM driver before tpm_init() is called.

Signed-off-by: Quan Xu 
---
 vl.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/vl.c b/vl.c
index f6b3546..0bbdaa1 100644
--- a/vl.c
+++ b/vl.c
@@ -4114,12 +4114,6 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
-#ifdef CONFIG_TPM
-if (tpm_init() < 0) {
-exit(1);
-}
-#endif
-
 /* init the bluetooth world */
 if (foreach_device_config(DEV_BT, bt_parse))
 exit(1);
@@ -4225,6 +4219,17 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+/*
+ * For compatibility with the Xen stubdom vTPM driver, make
+ * sure the QEMU machine class is initialized and QEMU has
+ * registered the Xen stubdom vTPM driver.
+ */
+#ifdef CONFIG_TPM
+if (tpm_init() < 0) {
+exit(1);
+}
+#endif
+
 /* init generic devices */
 if (qemu_opts_foreach(qemu_find_opts("device"), device_init_func, NULL, 1) 
!= 0)
 exit(1);
-- 
1.8.3.2




[Xen-devel] [PATCH v4 1/5] Qemu-Xen-vTPM: Support for Xen stubdom vTPM command line options

2015-03-10 Thread Quan Xu
--Changes in v4:
 -qapi schema enhancement.
 -remove unneeded code.

Signed-off-by: Quan Xu 
---
 configure| 14 ++
 hmp.c|  2 ++
 qapi-schema.json | 18 --
 qemu-options.hx  | 13 +++--
 tpm.c|  7 ++-
 5 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index a9e4d49..d63b8a1 100755
--- a/configure
+++ b/configure
@@ -2942,6 +2942,16 @@ else
 fi
 
 ##
+# TPM xenstubdoms is only on x86 Linux
+
+if test "$targetos" = Linux && test "$cpu" = i386 -o "$cpu" = x86_64 && \
+   test "$xen" = "yes"; then
+  tpm_xenstubdoms=$tpm
+else
+  tpm_xenstubdoms=no
+fi
+
+##
 # attr probe
 
 if test "$attr" != "no" ; then
@@ -4333,6 +4343,7 @@ echo "gcov  $gcov_tool"
 echo "gcov enabled  $gcov"
 echo "TPM support   $tpm"
 echo "libssh2 support   $libssh2"
+echo "TPM xenstubdoms   $tpm_xenstubdoms"
 echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging $qom_cast_debug"
 echo "vhdx  $vhdx"
@@ -4810,6 +4821,9 @@ if test "$tpm" = "yes"; then
   if test "$tpm_passthrough" = "yes"; then
 echo "CONFIG_TPM_PASSTHROUGH=y" >> $config_host_mak
   fi
+  if test "$tpm_xenstubdoms" = "yes"; then
+echo "CONFIG_TPM_XENSTUBDOMS=y" >> $config_host_mak
+  fi
 fi
 
 echo "TRACE_BACKENDS=$trace_backends" >> $config_host_mak
diff --git a/hmp.c b/hmp.c
index 63d7686..5662cb6 100644
--- a/hmp.c
+++ b/hmp.c
@@ -718,6 +718,8 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
tpo->has_cancel_path ? ",cancel-path=" : "",
tpo->has_cancel_path ? tpo->cancel_path : "");
 break;
+case TPM_TYPE_OPTIONS_KIND_XENSTUBDOMS:
+break;
 case TPM_TYPE_OPTIONS_KIND_MAX:
 break;
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 24379ab..3f5c212 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -2854,9 +2854,11 @@
 #
 # @passthrough: TPM passthrough type
 #
+# @xenstubdoms: TPM xenstubdoms type (since 2.3)
+#
 # Since: 1.5
 ##
-{ 'enum': 'TpmType', 'data': [ 'passthrough' ] }
+{ 'enum': 'TpmType', 'data': [ 'passthrough', 'xenstubdoms' ] }
 
 ##
 # @query-tpm-types:
@@ -2884,6 +2886,16 @@
 { 'type': 'TPMPassthroughOptions', 'data': { '*path' : 'str',
  '*cancel-path' : 'str'} }
 
+# @TPMXenstubdomsOptions:
+#
+# Information about the TPM xenstubdoms type
+#
+# Since: 2.3
+##
+{ 'type': 'TPMXenstubdomsOptions', 'data': {  } }
+#
+##
+
 ##
 # @TpmTypeOptions:
 #
@@ -2894,7 +2906,9 @@
 # Since: 1.5
 ##
 { 'union': 'TpmTypeOptions',
-   'data': { 'passthrough' : 'TPMPassthroughOptions' } }
+  'data': { 'passthrough' : 'TPMPassthroughOptions',
+'xenstubdoms' : 'TPMXenstubdomsOptions' } }
+##
 
 ##
 # @TpmInfo:
diff --git a/qemu-options.hx b/qemu-options.hx
index 1e7d5b8..fd73f57 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2485,7 +2485,8 @@ DEF("tpmdev", HAS_ARG, QEMU_OPTION_tpmdev, \
 "-tpmdev passthrough,id=id[,path=path][,cancel-path=path]\n"
 "use path to provide path to a character device; default 
is /dev/tpm0\n"
 "use cancel-path to provide path to TPM's cancel sysfs 
entry; if\n"
-"not provided it will be searched for in 
/sys/class/misc/tpm?/device\n",
+"not provided it will be searched for in 
/sys/class/misc/tpm?/device\n"
+"-tpmdev xenstubdoms,id=id\n",
 QEMU_ARCH_ALL)
 STEXI
 
@@ -2495,7 +2496,8 @@ The general form of a TPM device option is:
 @item -tpmdev @var{backend} ,id=@var{id} [,@var{options}]
 @findex -tpmdev
 Backend type must be:
-@option{passthrough}.
+@option{passthrough}, or
+@option{xenstubdoms}.
 
 The specific backend type will determine the applicable options.
 The @code{-tpmdev} option creates the TPM backend and requires a
@@ -2545,6 +2547,13 @@ To create a passthrough TPM use the following two 
options:
 Note that the @code{-tpmdev} id is @code{tpm0} and is referenced by
 @code{tpmdev=tpm0} in the device option.
 
+To create a xenstubdoms TPM use the following two options:
+@example
+-tpmdev xenstubdoms,id=tpm0 -device tpm-tis,tpmdev=tpm0
+@end example
+Note that the @code{-tpmdev} id is @code{tpm0} and is referenced by
+@code{tpmdev=tpm0} in the device option.
+
 @end table
 
 ETEXI
diff --git a/tpm.c b/tpm.c
index c371023..ee9acb8 100644
--- a/tpm.c
+++ b/tpm.c
@@ -25,7 +25,7 @@ static QLIST_HEAD(, TPMBackend) tpm_backends =
 
 
 #define TPM_MAX_MODELS  1
-#define TPM_MAX_DRIVERS 1
+#define TPM_MAX_DRIVERS 2
 
 static TPMDriverOps const *be_drivers[TPM_MAX_DRIVERS] = {
 NULL,
@@ -256,6 +256,7 @@ static TPMInfo *qmp_query_tpm_inst(TPMBackend *drv)
 {
 TPMInfo *res = g_new0(TPMInfo, 1);
 TPMPassthroughOptions *tpo;
+TPMXenstubdomsOptions *txo;
 
 res->id = g_strdup(drv->id);
 res->model = 

[Xen-devel] [PATCH v3 2/7] vTPM: limit libxl__add_vtpms() function to para virtual machine

2015-03-10 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 tools/libxl/libxl_create.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index b1ff5ae..66877b3 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1358,8 +1358,15 @@ static void domcreate_attach_vtpms(libxl__egc *egc,
goto error_out;
}
 
-/* Plug vtpm devices */
-   if (d_config->num_vtpms > 0) {
+/*
+ * Plug vtpm devices only for PV guests. The xenstore directory layout is
+ * very different for PV and HVM guests, yet this path is still called when
+ * creating an HVM guest, and xl should create the xenstore directory before
+ * spawning QEMU. So restrict it to PV guests.
+ */
+if (d_config->num_vtpms > 0 &&
+d_config->b_info.type == LIBXL_DOMAIN_TYPE_PV) {
+
/* Attach vtpms */
libxl__multidev_begin(ao, &dcs->multidev);
dcs->multidev.callback = domcreate_attach_pci;
-- 
1.8.3.2




[Xen-devel] [PATCH v3 3/7] vTPM: add TPM TCPA and SSDT for HVM virtual machine when vTPM is added

2015-03-10 Thread Quan Xu
Signed-off-by: Quan Xu 
---
 tools/firmware/hvmloader/acpi/build.c | 7 ---
 tools/libxl/libxl_create.c| 5 -
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/firmware/hvmloader/acpi/build.c 
b/tools/firmware/hvmloader/acpi/build.c
index 1431296..49f6772 100644
--- a/tools/firmware/hvmloader/acpi/build.c
+++ b/tools/firmware/hvmloader/acpi/build.c
@@ -313,9 +313,10 @@ static int construct_secondary_tables(unsigned long 
*table_ptrs,
 
 /* TPM TCPA and SSDT. */
 tis_hdr = (uint16_t *)0xFED40F00;
-if ( (tis_hdr[0] == tis_signature[0]) &&
- (tis_hdr[1] == tis_signature[1]) &&
- (tis_hdr[2] == tis_signature[2]) )
+if (((tis_hdr[0] == tis_signature[0]) &&
+(tis_hdr[1] == tis_signature[1]) &&
+(tis_hdr[2] == tis_signature[2])) ||
+!strncmp(xenstore_read("platform/acpi_stubdom_vtpm", "1"), "1", 1))
 {
 ssdt = mem_alloc(sizeof(ssdt_tpm), 16);
 if (!ssdt) return -1;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 66877b3..ffb124a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -432,7 +432,7 @@ int libxl__domain_build(libxl__gc *gc,
 vments[4] = "start_time";
 vments[5] = libxl__sprintf(gc, "%lu.%02d", 
start_time.tv_sec,(int)start_time.tv_usec/1);
 
-localents = libxl__calloc(gc, 9, sizeof(char *));
+localents = libxl__calloc(gc, 11, sizeof(char *));
 i = 0;
 localents[i++] = "platform/acpi";
 localents[i++] = libxl_defbool_val(info->u.hvm.acpi) ? "1" : "0";
@@ -440,6 +440,9 @@ int libxl__domain_build(libxl__gc *gc,
 localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s3) ? "1" : "0";
 localents[i++] = "platform/acpi_s4";
 localents[i++] = libxl_defbool_val(info->u.hvm.acpi_s4) ? "1" : "0";
+localents[i++] = "platform/acpi_stubdom_vtpm";
+localents[i++] = (d_config->num_vtpms > 0) ? "1" : "0";
+
 if (info->u.hvm.mmio_hole_memkb) {
 uint64_t max_ram_below_4g =
 (1ULL << 32) - (info->u.hvm.mmio_hole_memkb << 10);
-- 
1.8.3.2




[Xen-devel] [PATCH] SeaBios/vTPM: Enable Xen stubdom vTPM for HVM virtual machine

2015-03-10 Thread Quan Xu
Signed-off-by: Quan Xu 
Signed-off-by: Stefan Berger 
---
 Makefile   |   2 +-
 src/post.c |   3 +
 src/tpm.c  | 309 +
 src/tpm.h  | 141 
 4 files changed, 454 insertions(+), 1 deletion(-)
 create mode 100644 src/tpm.c
 create mode 100644 src/tpm.h

diff --git a/Makefile b/Makefile
index eecb8a1..945e997 100644
--- a/Makefile
+++ b/Makefile
@@ -36,7 +36,7 @@ SRCBOTH=misc.c stacks.c output.c string.c block.c cdrom.c 
disk.c mouse.c kbd.c \
 hw/virtio-ring.c hw/virtio-pci.c hw/virtio-blk.c hw/virtio-scsi.c \
 hw/lsi-scsi.c hw/esp-scsi.c hw/megasas.c
 SRC16=$(SRCBOTH)
-SRC32FLAT=$(SRCBOTH) post.c memmap.c malloc.c romfile.c x86.c optionroms.c \
+SRC32FLAT=$(SRCBOTH) post.c memmap.c malloc.c romfile.c tpm.c x86.c 
optionroms.c \
 pmm.c font.c boot.c bootsplash.c jpeg.c bmp.c \
 hw/ahci.c hw/pvscsi.c hw/usb-xhci.c hw/usb-hub.c \
 fw/coreboot.c fw/lzmadecode.c fw/csm.c fw/biostables.c \
diff --git a/src/post.c b/src/post.c
index 0fdd28e..8cb1abd 100644
--- a/src/post.c
+++ b/src/post.c
@@ -28,6 +28,7 @@
 #include "output.h" // dprintf
 #include "string.h" // memset
 #include "util.h" // kbd_init
+#include "tpm.h" //vtpm4hvm_setup
 
 
 /
@@ -151,6 +152,8 @@ device_hardware_setup(void)
 esp_scsi_setup();
 megasas_setup();
 pvscsi_setup();
+if (runningOnXen())
+vtpm4hvm_setup();
 }
 
 static void
diff --git a/src/tpm.c b/src/tpm.c
new file mode 100644
index 000..a834d30
--- /dev/null
+++ b/src/tpm.c
@@ -0,0 +1,309 @@
+/*
+ * Implementation of a TPM driver for the TPM TIS interface
+ *
+ * Copyright (C) 2006-2013 IBM Corporation
+ * Copyright (C) 2015 Intel Corporation
+ *
+ * Authors:
+ * Stefan Berger 
+ * Quan Xu 
+ *
+ * This file may be distributed under the terms of the GNU
+ * LGPLv3 license.
+ */
+
+#include "config.h"
+#include "util.h"
+#include "tpm.h"
+
+static u32 tis_default_timeouts[4] = {
+TIS_DEFAULT_TIMEOUT_A,
+TIS_DEFAULT_TIMEOUT_B,
+TIS_DEFAULT_TIMEOUT_C,
+TIS_DEFAULT_TIMEOUT_D,
+};
+
+static u32 tpm_default_durations[3] = {
+TPM_DEFAULT_DURATION_SHORT,
+TPM_DEFAULT_DURATION_MEDIUM,
+TPM_DEFAULT_DURATION_LONG,
+};
+
+
+/* if device is not there, return '0', '1' otherwise */
+static u32 tis_probe(void)
+{
+u32 rc = 0;
+u32 didvid = readl(TIS_REG(0, TIS_REG_DID_VID));
+
+if ((didvid != 0) && (didvid != 0x))
+rc = 1;
+
+return rc;
+}
+
+static u32 tis_init(void)
+{
+writeb(TIS_REG(0, TIS_REG_INT_ENABLE), 0);
+
+if (tpm_drivers[TIS_DRIVER_IDX].durations == NULL) {
+u32 *durations = malloc_low(sizeof(tpm_default_durations));
+if (durations)
+memcpy(durations, tpm_default_durations,
+   sizeof(tpm_default_durations));
+else
+durations = tpm_default_durations;
+tpm_drivers[TIS_DRIVER_IDX].durations = durations;
+}
+
+if (tpm_drivers[TIS_DRIVER_IDX].timeouts == NULL) {
+u32 *timeouts = malloc_low(sizeof(tis_default_timeouts));
+if (timeouts)
+memcpy(timeouts, tis_default_timeouts,
+   sizeof(tis_default_timeouts));
+else
+timeouts = tis_default_timeouts;
+tpm_drivers[TIS_DRIVER_IDX].timeouts = timeouts;
+}
+
+return 1;
+}
+
+
+static void set_timeouts(u32 timeouts[4], u32 durations[3])
+{
+u32 *tos = tpm_drivers[TIS_DRIVER_IDX].timeouts;
+u32 *dus = tpm_drivers[TIS_DRIVER_IDX].durations;
+
+if (tos && tos != tis_default_timeouts && timeouts)
+memcpy(tos, timeouts, 4 * sizeof(u32));
+if (dus && dus != tpm_default_durations && durations)
+memcpy(dus, durations, 3 * sizeof(u32));
+}
+
+
+static u32 tis_wait_sts(u8 locty, u32 time, u8 mask, u8 expect)
+{
+u32 rc = 1;
+
+while (time > 0) {
+u8 sts = readb(TIS_REG(locty, TIS_REG_STS));
+if ((sts & mask) == expect) {
+rc = 0;
+break;
+}
+msleep(1);
+time--;
+}
+return rc;
+}
+
+static u32 tis_activate(u8 locty)
+{
+u32 rc = 0;
+u8 acc;
+int l;
+u32 timeout_a = tpm_drivers[TIS_DRIVER_IDX].timeouts[TIS_TIMEOUT_TYPE_A];
+
+if (!(readb(TIS_REG(locty, TIS_REG_ACCESS)) &
+  TIS_ACCESS_ACTIVE_LOCALITY)) {
+/* release locality in use top-downwards */
+for (l = 4; l >= 0; l--)
+writeb(TIS_REG(l, TIS_REG_ACCESS),
+   TIS_ACCESS_ACTIVE_LOCALITY);
+}
+
+/* request access to locality */
+writeb(TIS_REG(locty, TIS_REG_ACCESS), TIS_ACCESS_REQUEST_USE);
+
+acc = readb(TIS_REG(locty, TIS_REG_ACCESS));
+if ((acc & TIS_ACCESS_ACTIVE_LOCALITY)) {
+writeb(TIS_REG(locty, TIS_REG_STS), TIS_STS_COMMAND_READY);
+rc = tis_wait_sts(locty, timeout_a,
+  TIS_STS_COMMAND_READY, TIS_STS_COMMAND_READY);
+}
+
+return r

[Xen-devel] [PATCH] SeaBios/vTPM: Enable Xen stubdom vTPM for HVM virtual machine

2015-03-10 Thread Quan Xu
This patch series is only the SeaBios part needed to enable stubdom vTPM for
HVM virtual machines. It works together with the Qemu patch series and the Xen
patch series.


*INTRODUCTION*

The goal of the virtual Trusted Platform Module (vTPM) is to provide TPM
functionality to virtual machines (Fedora, Ubuntu, Redhat, Windows, etc.).
This allows programs to interact with a TPM in a virtual machine the same
way they interact with a TPM on the physical system. Each virtual machine
gets its own unique, emulated, software TPM. Each major component of vTPM is
implemented as a stubdom, providing secure separation guaranteed by the
hypervisor.

The vTPM stubdom is a Xen mini-OS domain that emulates a TPM for the virtual
machine to use. It is a small wrapper around the Berlios TPM emulator. TPM
commands are passed from the mini-os TPM backend driver.


Signed-off-by: Quan Xu 
Signed-off-by: Stefan Berger 

Quan Xu (1):
  SeaBios/vTPM: Enable Xen stubdom vTPM for HVM virtual machine

 Makefile   |   2 +-
 src/post.c |   3 +
 src/tpm.c  | 309 +
 src/tpm.h  | 141 
 4 files changed, 454 insertions(+), 1 deletion(-)
 create mode 100644 src/tpm.c
 create mode 100644 src/tpm.h

-- 
1.8.1.2




[Xen-devel] [PATCH v4 2/5] Qemu-Xen-vTPM: Xen frontend driver infrastructure

2015-03-10 Thread Quan Xu
This patch adds infrastructure for xen front drivers living in qemu,
so drivers don't need to implement common stuff on their own.  It's
mostly xenbus management stuff: some functions to access XenStore,
setting up XenStore watches, callbacks on device discovery and state
changes, and event channel handling between the virtual machines.

Call xen_fe_register() to register a XenDevOps, and make sure its flags
include DEVOPS_FLAG_FE, the flag bit that marks the XenDevOps as a Xen
frontend.
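
The registration rule described above can be sketched roughly as follows.
This is only a minimal, self-contained model and not the code from this
patch: struct demo_dev_ops, demo_fe_register(), demo_fe_lookup() and the
numeric value of DEMO_DEVOPS_FLAG_FE are hypothetical stand-ins for
XenDevOps, xen_fe_register() and DEVOPS_FLAG_FE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bit; the real DEVOPS_FLAG_FE value is not shown here. */
#define DEMO_DEVOPS_FLAG_FE 0x4u
#define DEMO_MAX_FRONTENDS  4

/* Simplified stand-in for struct XenDevOps. */
struct demo_dev_ops {
    unsigned int flags;
    int (*init)(void);
};

struct demo_fe_entry {
    const char *type;
    const struct demo_dev_ops *ops;
};

static struct demo_fe_entry demo_frontends[DEMO_MAX_FRONTENDS];
static size_t demo_nr_frontends;

/* Register a frontend; refuse ops that are not flagged as a frontend. */
int demo_fe_register(const char *type, const struct demo_dev_ops *ops)
{
    assert(type != NULL && ops != NULL);

    if (!(ops->flags & DEMO_DEVOPS_FLAG_FE))
        return -1;                      /* not flagged as a frontend */
    if (demo_nr_frontends == DEMO_MAX_FRONTENDS)
        return -1;                      /* table is full */

    demo_frontends[demo_nr_frontends].type = type;
    demo_frontends[demo_nr_frontends].ops = ops;
    demo_nr_frontends++;
    return 0;
}

/* Look up previously registered ops by type name; NULL if absent. */
const struct demo_dev_ops *demo_fe_lookup(const char *type)
{
    size_t i;

    for (i = 0; i < demo_nr_frontends; i++)
        if (strcmp(demo_frontends[i].type, type) == 0)
            return demo_frontends[i].ops;
    return NULL;
}
```

In this model a lookup issued before registration returns NULL, which is
the kind of ordering problem that makes it matter when a driver gets
registered relative to the code that looks it up.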

Create a new file xen_pvdev.c for some common part of xen frontend
and backend, such as xendevs queue and xenstore update functions.

--Changes in v3:
-New xen_frontend.c file
-Move xenbus_switch_state() to xen_frontend.c
-Move xen_stubdom_be() to xenstore_fe_read_be_str()
-Move *_stubdom_*() to *_fe_*()

--Changes in v4:
-New xen_pvdev.c file
-Move xendevs queue to xen_pvdev.c
-Move xenstore functions to xen_pvdev.c
-Check status before setting the frontend to connect

Signed-off-by: Quan Xu 
---
 hw/display/xenfb.c   |   4 +-
 hw/xen/Makefile.objs |   2 +-
 hw/xen/xen_backend.c | 353 ---
 hw/xen/xen_frontend.c| 345 +++
 hw/xen/xen_pvdev.c   | 481 +++
 include/hw/xen/xen_backend.h |  22 +-
 6 files changed, 850 insertions(+), 357 deletions(-)
 create mode 100644 hw/xen/xen_frontend.c
 create mode 100644 hw/xen/xen_pvdev.c

diff --git a/hw/display/xenfb.c b/hw/display/xenfb.c
index 8a61e95..470b5d0 100644
--- a/hw/display/xenfb.c
+++ b/hw/display/xenfb.c
@@ -983,8 +983,8 @@ void xen_init_display(int domid)
 wait_more:
 i++;
 main_loop_wait(true);
-xfb = xen_be_find_xendev("vfb", domid, 0);
-xin = xen_be_find_xendev("vkbd", domid, 0);
+xfb = xen_find_xendev("vfb", domid, 0);
+xin = xen_find_xendev("vkbd", domid, 0);
 if (!xfb || !xin) {
 if (i < 256) {
 usleep(1);
diff --git a/hw/xen/Makefile.objs b/hw/xen/Makefile.objs
index a0ca0aa..95eb9d0 100644
--- a/hw/xen/Makefile.objs
+++ b/hw/xen/Makefile.objs
@@ -1,5 +1,5 @@
 # xen backend driver support
-common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o
+common-obj-$(CONFIG_XEN_BACKEND) += xen_backend.o xen_devconfig.o 
xen_frontend.o xen_pvdev.o
 
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o
 obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o xen_pt_msi.o
diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index b2cb22b..844f918 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -44,86 +44,11 @@
 /* - */
 
 /* public */
-XenXC xen_xc = XC_HANDLER_INITIAL_VALUE;
-struct xs_handle *xenstore = NULL;
 const char *xen_protocol;
 
 /* private */
-static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = 
QTAILQ_HEAD_INITIALIZER(xendevs);
 static int debug = 0;
 
-/* - */
-
-int xenstore_write_str(const char *base, const char *node, const char *val)
-{
-char abspath[XEN_BUFSIZE];
-
-snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
-if (!xs_write(xenstore, 0, abspath, val, strlen(val))) {
-return -1;
-}
-return 0;
-}
-
-char *xenstore_read_str(const char *base, const char *node)
-{
-char abspath[XEN_BUFSIZE];
-unsigned int len;
-char *str, *ret = NULL;
-
-snprintf(abspath, sizeof(abspath), "%s/%s", base, node);
-str = xs_read(xenstore, 0, abspath, &len);
-if (str != NULL) {
-/* move to qemu-allocated memory to make sure
- * callers can savely g_free() stuff. */
-ret = g_strdup(str);
-free(str);
-}
-return ret;
-}
-
-int xenstore_write_int(const char *base, const char *node, int ival)
-{
-char val[12];
-
-snprintf(val, sizeof(val), "%d", ival);
-return xenstore_write_str(base, node, val);
-}
-
-int xenstore_write_int64(const char *base, const char *node, int64_t ival)
-{
-char val[21];
-
-snprintf(val, sizeof(val), "%"PRId64, ival);
-return xenstore_write_str(base, node, val);
-}
-
-int xenstore_read_int(const char *base, const char *node, int *ival)
-{
-char *val;
-int rc = -1;
-
-val = xenstore_read_str(base, node);
-if (val && 1 == sscanf(val, "%d", ival)) {
-rc = 0;
-}
-g_free(val);
-return rc;
-}
-
-int xenstore_read_uint64(const char *base, const char *node, uint64_t *uval)
-{
-char *val;
-int rc = -1;
-
-val = xenstore_read_str(base, node);
-if (val && 1 == sscanf(val, "%"SCNu64, uval)) {
-rc = 0;
-}
-g_free(val);
-return rc;
-}
-
 int xenstore_write_be_str(struct XenDevice *xendev, const char *node, const 
char *val)
 {
 return xenstore_write_str(xendev->be, node, val);
@@ -195,183 +120,6 @@ int xen_be_set_state(struct XenDevice *xendev, enum 
xenbus_state state)
 }
 
 /* --

[Xen-devel] [rumpuserxen test] 36177: regressions - FAIL

2015-03-10 Thread xen . org
flight 36177 rumpuserxen real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36177/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen   5 rumpuserxen-build fail REGR. vs. 33866
 build-i386-rumpuserxen5 rumpuserxen-build fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a

version targeted for testing:
 rumpuserxen  7b103921add3cc1b30204d416ba246bfc8bdc05f
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Antti Kantee 
  Ian Jackson 
  Martin Lucina 
  Wei Liu 


jobs:
 build-amd64                                 pass
 build-i386                                  pass
 build-amd64-pvops                           pass
 build-i386-pvops                            pass
 build-amd64-rumpuserxen                     fail
 build-i386-rumpuserxen                      fail
 test-amd64-amd64-rumpuserxen-amd64          blocked
 test-amd64-i386-rumpuserxen-i386            blocked



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 503 lines long.)



Re: [Xen-devel] [PATCH 2/2] iommu: add rmrr Xen command line option for misc rmrrs

2015-03-10 Thread Elena Ufimtseva
On Tue, Mar 10, 2015 at 04:27:15PM +, Jan Beulich wrote:
> >>> On 10.03.15 at 17:16,  wrote:
> > On Tue, Mar 10, 2015 at 02:47:24AM +, Tian, Kevin wrote:
> >> > From: elena.ufimts...@oracle.com [mailto:elena.ufimts...@oracle.com]
> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> > @@ -1232,6 +1232,38 @@ static int intel_iommu_domain_init(struct domain
> >> > *d)
> >> >  return 0;
> >> >  }
> >> > 
> >> > +static void add_misc_rmrr(void)
> >> > +{
> >> > +struct acpi_rmrr_unit *rmrrn;
> >> > +struct misc_rmrr_unit *rmrru, *r;
> >> > +
> >> > +list_for_each_entry_safe( rmrru, r, &misc_rmrr_units, list )
> >> > +{
> >> > +rmrrn = xzalloc(struct acpi_rmrr_unit);
> >> > +if ( !rmrrn )
> >> > +goto free;
> >> > +
> >> > +rmrrn->scope.devices = xzalloc(typeof(*rmrrn->scope.devices));
> >> > +if ( !rmrrn->scope.devices )
> >> > +{
> >> > +xfree(rmrrn);
> >> > +goto free;
> >> > +}
> >> > +rmrrn->scope.devices_cnt = 1;
> >> > +rmrrn->segment = rmrru->segment;
> >> > +rmrrn->scope.devices[0] = rmrru->device;
> >> 
> >> need to handle one-rmrr-multiple-devices. even if you don't want
> >> to support it, need to capture user attempts at least.
> > 
> > Kevin, on second thought, I think that to support multiple devices
> > per one rmrr, one needs to put the same address/range on the command
> > line and specify a unique device each time.
> 
> Why? Iirc it was you who already proposed a way to properly
> express this on the command line without having to repeat the
> memory addresses.

That was one more thought while exploring the options, as I don't have a
strong inclination towards either of them.

> 
> Jan
> 



Re: [Xen-devel] [PATCH 1/2] x86emul: drop unused "bigval" fields from struct operand

2015-03-10 Thread Andrew Cooper
On 10/03/15 16:35, Jan Beulich wrote:
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [PATCH 2/2] x86emul: simplify asm() constraints

2015-03-10 Thread Andrew Cooper
On 10/03/15 16:36, Jan Beulich wrote:
> Use + on outputs instead of = and a matching input. Allow not just
> memory for the _eflags operand (it turns out that recent gcc produces
> worse code when also doing this for _dst.val, so the latter is being
> avoided).
>
> Signed-off-by: Jan Beulich 
>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -428,7 +428,7 @@ typedef union {
>  /* Before executing instruction: restore necessary bits in EFLAGS. */
>  #define _PRE_EFLAGS(_sav, _msk, _tmp)   \
>  /* EFLAGS = (_sav & _msk) | (EFLAGS & ~_msk); _sav &= ~_msk; */ \
> -"movl %"_sav",%"_LO32 _tmp"; "  \
> +"movl %"_LO32 _sav",%"_LO32 _tmp"; "\
>  "push %"_tmp"; "\
>  "push %"_tmp"; "\
>  "movl %"_msk",%"_LO32 _tmp"; "  \
> @@ -448,7 +448,7 @@ typedef union {
>  "pushf; "   \
>  "pop  %"_tmp"; "\
>  "andl %"_msk",%"_LO32 _tmp"; "  \
> -"orl  %"_LO32 _tmp",%"_sav"; "
> +"orl  %"_LO32 _tmp",%"_LO32 _sav"; "
>  
>  /* Raw emulation: instruction has two explicit operands. */
>  #define __emulate_2op_nobyte(_op,_src,_dst,_eflags,_wx,_wy,_lx,_ly,_qx,_qy)\
> @@ -460,18 +460,16 @@ do{ unsigned long _tmp; 
>  _PRE_EFLAGS("0","4","2")   \
>  _op"w %"_wx"3,%1; "\
>  _POST_EFLAGS("0","4","2")  \
> -: "=m" (_eflags), "=m" ((_dst).val), "=&r" (_tmp)  \
> -: _wy ((_src).val), "i" (EFLAGS_MASK), \
> -  "m" (_eflags), "m" ((_dst).val) );   \
> +: "+g" (_eflags), "+m" ((_dst).val), "=&r" (_tmp)  \
> +: _wy ((_src).val), "i" (EFLAGS_MASK) );   \

I believe the old ASM was buggy, not just inefficient.

Having read the Extended ASM documentation quite carefully, the
following statement is relevant

"Only input operands may use numbers in constraints, and they must each
refer to an output operand. Only a number (or the symbolic assembler
name) in the constraint can guarantee that one operand is in the same
place as another. The mere fact that foo is the value of both
operands is not enough to guarantee that they are in the same place in
the generated assembler code."

Because the input operands do not use numbers, the asm must read from %5
and write to %0 to guarantee that the _eflags temporary is used properly.

I believe that this transformation does now make the asm correct, as the
output and input sides are now guaranteed to match the %0 used to
reference the _eflags temporary.

Did you observe any code changes simply from changing = constraints to
+, or did we get very lucky with the generated code?

I think it might be a very wise idea to switch to using symbolic names. 
This code is very complicated and has many ways to go subtly wrong.
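
To make the constraint forms discussed above concrete, here is a small
sketch. It is not the x86_emulate code: demo_add_matching(),
demo_add_readwrite() and demo_selfcheck() are made-up names, the operands
are plain registers ("r") rather than the "g"/"m" operands of the patch,
and the asm is only used when building for x86 (other targets take a
plain C fallback). The first function ties an input to an output by
number, the second uses a single read-write "+" operand with symbolic
names.

```c
#include <assert.h>

/* Old style: "=" output plus an input tied to it by number ("0"). */
unsigned int demo_add_matching(unsigned int dst, unsigned int src)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "addl %2, %0"
              : "=r" (dst)
              : "0" (dst), "r" (src)
              : "cc" );
#else
    dst += src;                 /* portable fallback, no asm involved */
#endif
    return dst;
}

/* New style: one read-write "+" operand, referenced by symbolic name. */
unsigned int demo_add_readwrite(unsigned int dst, unsigned int src)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ( "addl %[src], %[dst]"
              : [dst] "+r" (dst)
              : [src] "r" (src)
              : "cc" );
#else
    dst += src;                 /* portable fallback, no asm involved */
#endif
    return dst;
}

/* Both forms must compute the same sum. */
void demo_selfcheck(void)
{
    assert(demo_add_matching(2, 3) == demo_add_readwrite(2, 3));
}
```

With symbolic names the template no longer depends on operand positions,
so adding or removing an operand cannot silently shift what a number
refers to.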

~Andrew


[Xen-devel] [xen-unstable test] 36062: trouble: broken/fail/pass

2015-03-10 Thread xen . org
flight 36062 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/36062/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-intel      3 host-install(3) broken REGR. vs. 35887
 test-amd64-i386-rhel6hvm-intel            3 host-install(3) broken REGR. vs. 35887
 test-amd64-i386-xl-qemuu-debianhvm-amd64  3 host-install(3) broken REGR. vs. 35887
 test-amd64-i386-xl-qemuu-win7-amd64       3 host-install(3) broken REGR. vs. 35887
 test-amd64-i386-xl-qemut-win7-amd64       3 host-install(3) broken REGR. vs. 35887

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-pcipt-intel   3 host-install(3)                 broken REGR. vs. 35887
 test-amd64-i386-libvirt           3 host-install(3)                 broken REGR. vs. 35887
 test-amd64-amd64-xl-sedf-pin     15 guest-localmigrate/x10          fail REGR. vs. 35887
 test-amd64-i386-pair             17 guest-migrate/src_host/dst_host fail like 35887

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64       13 rumpuserxen-demo-xenstorels/xenstorels.repeat fail never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check fail never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-armhf-armhf-xl-sedf                 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2              10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass

version targeted for testing:
 xen  4e59089ed90a11b9e30e67191789293bb07af686
baseline version:
 xen  f0ffd6032f679ec4b9a39d526cdbcdaf692e2f03


People who touched revisions under test:
  "Wei, Gang" 
  Aaron Adams 
  Chao Peng 
  Daniel De Graaf 
  Frediano Ziglio 
  Ian Campbell 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Julien Grall 
  M. Gregory 
  Vijaya Kumar K 
  Wei Liu 
  Wei Liu 
  Zoltan Kiss 


jobs:
 build-amd64                                 pass
 build-armhf                                 pass
 build-i386                                  pass
 build-amd64-libvirt                         pass
 build-armhf-libvirt                         pass
 build-i386-libvirt                          pass
 build-amd64-oldkern                         pass
 build-i386-oldkern                          pass
 build-amd64-pvops                           pass
 build-armhf-pvops                           pass
 build-i386-pvops                            pass
 build-amd64-rumpuserxen                     pass
 build-i386-rumpuserxen                      pass
 test-amd64-amd64-xl                         pass
 test-armhf-armhf-xl                         pass
 test-amd64-i386-xl                          pass
 test-amd64-amd64-xl-pvh-amd                 fail
 test-amd64-i386-rhel6hvm-amd                pass
 test-amd64-i386-qemut-rhel6hvm-amd          pass
 test-amd64-i386-qemuu-rhel6hvm-amd          pass
 test-a

Re: [Xen-devel] [PATCH 1/4] x86/MSI-X: be more careful during teardown

2015-03-10 Thread Andrew Cooper
On 10/03/15 16:27, Jan Beulich wrote:
> When a device gets detached from a guest, pciback will clear its
> command register, thus disabling both memory and I/O decoding. The
> disabled memory decoding, however, has an effect on the MSI-X table
> accesses the hypervisor does: These won't have the intended effect
> anymore. Even worse, for PCIe devices (but not SR-IOV virtual
> functions) such accesses may (will?) be treated as Unsupported
> Requests, causing respective errors to be surfaced, potentially in the
> form of NMIs that may be fatal to the hypervisor or Dom0 in different
> ways. Hence rather than carrying out these accesses, we should avoid
> them where we can, and use alternative (e.g. PCI config space based)
> mechanisms to achieve at least the same effect.
>
> Signed-off-by: Jan Beulich 
> ---
> Backporting note (largely to myself):
>Depends on (not yet backported) commit 061eebe0e "x86/MSI: drop
>workaround for insecure Dom0 kernels" (due to re-use of struct
>arch_msix's warned field).
>
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -121,6 +121,27 @@ static void msix_put_fixmap(struct arch_
>      spin_unlock(&msix->table_lock);
>  }
>  
> +static bool_t memory_decoded(const struct pci_dev *dev)
> +{
> +    u8 bus, slot, func;
> +
> +    if ( !dev->info.is_virtfn )
> +    {
> +        bus = dev->bus;
> +        slot = PCI_SLOT(dev->devfn);
> +        func = PCI_FUNC(dev->devfn);
> +    }
> +    else
> +    {
> +        bus = dev->info.physfn.bus;
> +        slot = PCI_SLOT(dev->info.physfn.devfn);
> +        func = PCI_FUNC(dev->info.physfn.devfn);
> +    }
> +
> +    return !!(pci_conf_read16(dev->seg, bus, slot, func, PCI_COMMAND) &
> +              PCI_COMMAND_MEMORY);
> +}
> +

This check is racy against anyone who can write to the command register,
which includes dom0 and other pcpus in Xen.  There does not appear to be
any exclusion between Xen emulating a control register write on one cpu
and changing irq affinities on another.

As a result, this check does not actually protect against accessing the
MSI-X bar while memory decoding is disabled.  As a downside, it puts an
expensive config space access on moderately frequent codepaths.
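
To make the race concrete, here is a minimal single-threaded sketch of the check-then-access pattern. It is not Xen code: struct fake_dev, msix_table_access(), other_cpu_clears_command() and checked_access() are hypothetical stand-ins, and the "other CPU" is modelled as a call placed between the check and the access rather than as a real concurrent writer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND_MEMORY 0x2 /* memory space enable bit of the command register */

/* Hypothetical stand-in for a device; not a Xen structure. */
struct fake_dev {
    uint16_t command;          /* models the PCI command register */
    unsigned int bad_accesses; /* table accesses made while decoding was off */
};

/* Same idea as the check in the patch: is memory decoding enabled right now? */
static bool memory_decoded(const struct fake_dev *dev)
{
    return (dev->command & PCI_COMMAND_MEMORY) != 0;
}

/* Models an MSI-X table access; records it as bad if decoding is off. */
static void msix_table_access(struct fake_dev *dev)
{
    if ( !(dev->command & PCI_COMMAND_MEMORY) )
        dev->bad_accesses++;
}

/* Models dom0 or another pcpu clearing the command register. */
static void other_cpu_clears_command(struct fake_dev *dev)
{
    dev->command = 0;
}

/*
 * Check, then access. When 'interleave' is set, the writer runs in the
 * window between the two steps. Returns the number of bad accesses so far.
 */
static unsigned int checked_access(struct fake_dev *dev, bool interleave)
{
    assert(dev);

    if ( memory_decoded(dev) )
    {
        if ( interleave )
            other_cpu_clears_command(dev);
        msix_table_access(dev);
    }

    return dev->bad_accesses;
}
```

With command initialised to PCI_COMMAND_MEMORY, checked_access(&dev, false) reports no bad access, while checked_access(&dev, true) reports one even though the check passed. Closing that window needs some form of exclusion between the check and the access, which is the part that appears to be missing.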

One issue we have just identified pertains to dom0 resetting a device
and Xen falling over a UR which has been escalated to fatal, most likely
because of an in-progress MSI-X interrupt migration.  There does not appear
to be sufficient synchronisation available in the interface for a dom0
to even cooperatively perform a device reset with Xen.

The more I consider this and related problems, the more I am thinking
that the only long-term solution is to have a full PCI implementation in
Xen, and to prevent any unmediated access, including from dom0.  Xen
need not gain much (any?) more device-specific knowledge, but needs to
gain the ability to properly mediate all config updates, and synchronise
resets against other users of the device.  I do not suggest this
lightly; I realise that it is a huge quantity of work.

~Andrew




Re: [Xen-devel] [PATCH v3 20/24] xen/passthrough: Extend XEN_DOMCTL_assign_device to support DT device

2015-03-10 Thread Daniel De Graaf

On 03/10/2015 12:52 PM, Julien Grall wrote:

Hi Daniel,

On 23/02/15 16:25, Daniel De Graaf wrote:

On 02/20/2015 12:17 PM, Ian Campbell wrote:

On Tue, 2015-01-13 at 14:25 +, Julien Grall wrote:

TODO: Update the commit message

A device node is described by a path. It will be used to retrieve the
node in the device tree and assign the related device to the domain.

Only devices protected by an IOMMU can be assigned to a guest.

Signed-off-by: Julien Grall 
Cc: Ian Jackson 
Cc: Wei Liu 
Cc: Jan Beulich 

---
  Changes in v2:
  - Use a different number for XEN_DOMCTL_assign_dt_device
---
   tools/libxc/include/xenctrl.h | 10 
   tools/libxc/xc_domain.c   | 95
--


These bits all look fine.


+int iommu_do_dt_domctl(struct xen_domctl *domctl, struct domain *d,
+                       XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    int ret;
+    struct dt_device_node *dev;
+
+    /* TODO: How to deal with XSM? */


Adding Daniel.

It seems the PCI ones are protected by
  xsm_test_assign_device(XSM_HOOK,
domctl->u.assign_device.machine_sbdf);

So it seems that either this needs to become "test_assign_pci_device" and
a similar "test_assign_dt_device" needs to be added and plumbed through
or it needs to grow a type parameter and take the union for the
identifier.


Either would work, but a distinct hook seems simpler to me, especially as
the call sites are distinct and the hook would process them differently.
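
The two shapes being compared can be sketched as follows. This is only an illustration under stated assumptions: the signatures, the assign_kind / assign_id types and the toy allow/deny rules are all hypothetical and are not the real XSM hook prototypes; only the two hook names come from the discussion above.

```c
#include <assert.h>
#include <stdint.h>

/* Option A: one hook per identifier kind (signatures are hypothetical). */
static int test_assign_pci_device(uint32_t machine_sbdf)
{
    /* Toy rule for illustration only: allow devices in segment 0. */
    return (machine_sbdf >> 16) == 0 ? 0 : -1;
}

static int test_assign_dt_device(const char *dt_path)
{
    assert(dt_path);
    /* Toy rule for illustration only: allow absolute device tree paths. */
    return dt_path[0] == '/' ? 0 : -1;
}

/* Option B: a single hook taking a type tag plus a union of identifiers. */
enum assign_kind { ASSIGN_PCI, ASSIGN_DT };

struct assign_id {
    enum assign_kind kind;
    union {
        uint32_t machine_sbdf; /* valid when kind == ASSIGN_PCI */
        const char *dt_path;   /* valid when kind == ASSIGN_DT */
    } u;
};

static int test_assign_device(const struct assign_id *id)
{
    assert(id);

    switch ( id->kind )
    {
    case ASSIGN_PCI:
        return test_assign_pci_device(id->u.machine_sbdf);
    case ASSIGN_DT:
        return test_assign_dt_device(id->u.dt_path);
    }

    return -1; /* unknown kind: deny */
}
```

In this sketch the single hook ends up dispatching to per-kind logic anyway, which matches the point made above: when the call sites already know which kind of device they hold, the tag and union add a layer without removing any code.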


Sounds good.


The code to apply an XSM context to a DT node would need consideration
too I suppose?


This may require a bit more thought.  At first glance, the dt_phandle
field seems to be an identifier that could be used by FLASK to identify a
device using an ocontext lookup.  Labeling would then be done in the same
way as PCI devices and x86 legacy I/O ports.


We don't always have a dt_phandle in hand. They are mostly used for
referencing a node within another (such as IOMMU, interrupt
controller...). Also, the value is controlled by the compiler.

AFAICT, the only unique value we have in hand is the path of the device.


OK. I was hoping that there would be a unique numeric identifier.  If
there is not, it may be necessary to either create one or to add a new
field to device nodes (like the one for event channels) so that they
can be labeled.


BTW, do you have any pointer on how to write a policy for device/IRQ
passthrough?


There is a bit of documentation in xsm-flask.txt about device labeling,
which is the hard part of making passthrough work.  Labels can be set
either statically in the security policy (as documented in the section
"Device Labeling") or dynamically using a tool like flask-label-pci
as documented in "Resource Policy".  Once that is done, then rules to
allow the passthrough operation can be added, similar to the example
resource nic_dev_t in xen.te.

In order to do static labeling for device passthrough, the nodes in a
device tree need a 32-bit numeric identifier.  IO memory uses the MFN,
PCI devices use SBDF, and IRQs and x86 legacy IOs just use the number.

If device tree nodes can be labeled in this way, they could be added
as another resource type in the policy.  If not, then the label of a
device node will need to be set at boot using the XSM hypercalls;
this label would be stored in a security field added to device tree
nodes.
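
For comparison, this is roughly what "PCI devices use SBDF" means in terms of a 32-bit identifier. It is a minimal sketch using the conventional segment:bus:device:function layout (16, 8, 5 and 3 bits); pack_sbdf() is a hypothetical helper written for this example, not a Xen function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack segment (16 bits), bus (8 bits), device (5 bits) and function
 * (3 bits) into one 32-bit value. Hypothetical helper for illustration.
 */
static uint32_t pack_sbdf(uint16_t seg, uint8_t bus, uint8_t dev, uint8_t fn)
{
    assert(dev < 32 && fn < 8);

    return ((uint32_t)seg << 16) |
           ((uint32_t)bus << 8) |
           ((uint32_t)dev << 3) |
           (uint32_t)fn;
}
```

For example, device 0000:03:10.1 packs to 0x00000381. A device tree path has no fixed-width encoding of this kind, which is why a node would need either an assigned number or a separate security field.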

--
Daniel De Graaf
National Security Agency



Re: [Xen-devel] [PATCH v3 20/24] xen/passthrough: Extend XEN_DOMCTL_assign_device to support DT device

2015-03-10 Thread Julien Grall

Hi Daniel,

On 10/03/2015 22:45, Daniel De Graaf wrote:

BTW, do you have any pointer on how to write a policy for device/IRQ
passthrough?


There is a bit of documentation in xsm-flask.txt about device labeling,
which is the hard part of making passthrough work.  Labels can be set
either statically in the security policy (as documented in the section
"Device Labeling") or dynamically using a tool like flask-label-pci
as documented in "Resource Policy".  Once that is done, then rules to
allow the passthrough operation can be added, similar to the example
resource nic_dev_t in xen.te.


I tried to follow xsm-flask.txt and uncommented one of the pirqcon lines in
the XSM policy.


But I got the following error:

policy/modules/xen/xen.te:199:ERROR 'syntax error' at token 'pirqcon' on line 1986:

pirqcon 33 system_u:object_r:nic_dev_t

Did I miss anything?


In order to do static labeling for device passthrough, the nodes in a
device tree need a 32-bit numeric identifier.  IO memory uses the MFN,
PCI devices use SBDF, and IRQs and x86 legacy IOs just use the number.


Why is it restricted to an integer? Would it be possible to use a string,
as is done for the sid?


Regards,


--
Julien Grall


