Re: [Xen-devel] [PATCH] VT-d: add iommu=igfx_off option to workaround graphics issues

2015-07-20 Thread Jan Beulich
>>> On 21.07.15 at 02:57,  wrote:
>>  From: Andrew Cooper [mailto:am...@hermes.cam.ac.uk] On Behalf Of Andrew Cooper
>> Sent: Monday, July 20, 2015 4:21 PM
>> 
>> On 20/07/2015 02:28, Tian, Kevin wrote:
>> >> From: Ting-Wei Lan [mailto:lant...@gmail.com]
>> >> Sent: Saturday, July 18, 2015 3:06 AM
>> >>
>> >> When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
>> >> devices, it is possible to encounter graphics issues that make the screen
>> >> unreadable or crash the system. It was reported in freedesktop bugzilla:
>> >>
>> >> https://bugs.freedesktop.org/show_bug.cgi?id=90037 
>> >>
>> >> As we still cannot find a proper fix for this problem, this patch adds
>> >> an iommu=igfx_off option that is similar to Linux intel_iommu=igfx_off for
>> >> users to manually work around the problem.
>> >>
>> >> Signed-off-by: Ting-Wei Lan 
>> > Since igfx worked before, I'd think a more proper fix should be on the
>> > bisected Linux commit or i915 to have the two working correctly together.
>> > Otherwise this patch is just hiding the problem.
>> 
>> The linux commit is the one which actually fixes PAT support for Linux
>> under Xen.
>> 
>> It will cause the i915 driver to actually get WC mappings when it asks
>> for them.
> 
> This is the part which I don't quite understand. WC is essentially a UC
> attribute with a write buffer to improve write efficiency. There
> should be no correctness problem in using either WC or UC if the i915 driver
> wants WC.

"Should" is too weak a term here: Using WC on the wrong piece of
memory or without the necessary fencing can imo very well cause
correctness problems (which would be hidden by WC -> UC
conversion behind the driver's back).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] unmodified-drivers: tolerate IRQF_DISABLED being undefined

2015-07-20 Thread Olaf Hering
On Mon, Jun 01, Jan Beulich wrote:

> >>> On 01.06.15 at 15:56,  wrote:
> > On Mon, 2015-06-01 at 14:50 +0100, Jan Beulich wrote:
> >> It's being removed in Linux 4.1.
> >> 
> >> Signed-off-by: Jan Beulich 
> > 
> > Not sure who should, but:
> 
> I guess no-one really needs to for that old code. But thanks!

Any chance to backport this to all 4.x branches?

Olaf



Re: [Xen-devel] [PATCH] blktap2: update connection handling to fix build with gcc5

2015-07-20 Thread Jan Beulich
>>> On 21.07.15 at 08:32,  wrote:
> On Mon, Jul 20, Wei Liu wrote:
> 
>> On Mon, Jul 20, 2015 at 11:16:34AM +0200, Olaf Hering wrote:
>> > On Mon, Jul 20, Jan Beulich wrote:
>> > 
>> > > >>> On 19.07.15 at 11:33,  wrote:
>> > > > [  198s] block-log.c:549:23: error: array subscript is above array bounds [-Werror=array-bounds]
>> > > > [  198s]  if (s->connections[i].id == id)
>> > > > [  198s]^
>> > > 
>> > > So what makes the compiler right with that complaint? I.e. how does
>> > > it know i > 0 here? After all - afaict - s->connected can only be 0 or
>> > 
>> > It has to assume that ->connected can get any value because the input
>> > comes from outside the unit.
>> > 
>> > To reduce the patch size "&& i < MAX_CONNECTIONS" could be added.
>> > 
>> 
>> A smaller patch would be preferable at this stage.
> 
> I disagree with that.

Please consider that we're in code freeze right now.

> What is the longterm goal of that binary?

It being mostly unmaintained, it's probably a candidate for removal
not too far in the future. Which would be another reason against
extensive changes.

But anyway, the primary question remains - isn't what you're seeing
a compiler bug? If yes, that's imo _yet another_ reason for doing a
minimal workaround (if any at all).

Jan
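
The smaller workaround mentioned in this thread, bounding the loop index by
the array size as well as by the connected count, might look roughly like the
sketch below. The struct layout, the MAX_CONNECTIONS value and the
find_connection() helper are assumptions made for illustration; this is not
the actual blktap2 code.

```c
#include <stddef.h>

#define MAX_CONNECTIONS 1   /* assumed value, for illustration only */

struct connection {
    int id;
    int fd;
};

struct log_state {
    struct connection connections[MAX_CONNECTIONS];
    unsigned int connected;     /* number of live entries */
};

/* Return the connection with the given id, or NULL if none matches. */
static struct connection *find_connection(struct log_state *s, int id)
{
    unsigned int i;

    /* The second condition keeps i inside the array even if
     * ->connected holds an unexpected value. */
    for (i = 0; i < s->connected && i < MAX_CONNECTIONS; i++) {
        if (s->connections[i].id == id)
            return &s->connections[i];
    }
    return NULL;
}
```

With the extra bound the index provably stays below MAX_CONNECTIONS, which
should be enough for the compiler to drop the array-bounds diagnostic, whether
or not the original warning was justified.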




Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Chen, Tiejun

But d_config is a libxl_domain_config which is supplied by libxl's
caller.  It might contain some rdms.


I guess this line confused you or others, so let's delete it
directly.


I don't think I am very confused.


And if you still worry about something, I can add assert() at the
beginning of this function like this,

assert(!d_config->num_rdms && !d_config->rdms).


If you are sure that this assertion is correct, then that would be
proper.

But as I say above, I don't think it is.



Based on Campbell's explanation I think you guys are raising a reasonable
concern. We shouldn't clear that over there arbitrarily.


Thanks
Tiejun




Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Chen, Tiejun

 > I think the confusion here is that the d_config->rdms array (which
num_rdms is the length of) is in the public API (because it is in
libxl_types.idl) but is apparently only being used in this series as an
internal state for the domain build process (i.e. xl doesn't ever add
anything to the array rdms).

Tiejun, is that an accurate summary?


Yes.



If the field is in the public API then the possibility of something
being passed in there must be considered now, even if this particular
series adds no such calls, since we cannot prevent 3rd party users of
libxl adding such configuration.

Is the possibility of the toolstack (i.e. the caller of libxl) supplying
an array of rdm regions being left aside for future work, or is it not
intended to ever be supported?


It's very possible, so you're right.
Thanks
Tiejun



Ian.




And if you still worry about something, I can add assert() at the
beginning of this function like this,

assert(!d_config->num_rdms && !d_config->rdms).


If you are sure that this assertion is correct, then that would be
proper.

But as I say above, I don't think it is.

Ian.








Re: [Xen-devel] [PATCH v2 2/6] relocator: Do not use memory region if its starta is smaller than size

2015-07-20 Thread Andrei Borzenkov
On Mon, Jul 20, 2015 at 5:35 PM, Daniel Kiper  wrote:
> malloc_in_range() should not use memory region if its starta is smaller
> than size. Otherwise target wraps around and points to region which is
> usually not a RAM, e.g.:
>
> loader/multiboot.c:93: segment 0: paddr=0x80, memsz=0x3f80, vaddr=0x80
> lib/relocator.c:1241: min_addr = 0x0, max_addr = 0x, target = 0x80
> lib/relocator.c:434: trying to allocate in 0x80-0x aligned 0x1 size 0x3f80
> lib/relocator.c:434: trying to allocate in 0x0-0x80 aligned 0x1 size 0x3f80
> lib/relocator.c:434: trying to allocate in 0x0-0x aligned 0x1 size 0x3f80
> lib/relocator.c:1188: allocated: 0xc07f+0x3f80
> lib/relocator.c:1277: allocated 0xc07f/0x80
>
> Signed-off-by: Daniel Kiper 
> ---
>  grub-core/lib/relocator.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/grub-core/lib/relocator.c b/grub-core/lib/relocator.c
> index f759c7f..4eee0c5 100644
> --- a/grub-core/lib/relocator.c
> +++ b/grub-core/lib/relocator.c
> @@ -748,7 +748,7 @@ malloc_in_range (struct grub_relocator *rel,
>   /* Found an usable address.  */
>   goto found;
>   }
> -   if (isinsidebefore && !isinsideafter && !from_low_priv)
> +   if (isinsidebefore && !isinsideafter && !from_low_priv && starta >= size)

That's too late, we need to check end of region on previous iteration.
Consider region of 128 bytes, requested size 129 and alignment 256.
Then starta still ends up high in memory.

>   {
> target = starta - size;
> if (target > end - size)
> --
> 1.7.10.4
>
>
> ___
> Grub-devel mailing list
> grub-de...@gnu.org
> https://lists.gnu.org/mailman/listinfo/grub-devel
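
The wrap-around that the patch description refers to can be reproduced in
isolation. The sketch below assumes a 32-bit address type and uses a
hypothetical helper, target_below(); it is not the GRUB code, and the
addresses are made up. It only illustrates why "starta - size" needs a guard
when starta is smaller than size. As the reply above notes, in the real
function the check may also have to account for alignment, which this sketch
does not cover.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t addr_t;    /* assumed 32-bit physical address type */

/*
 * Compute starta - size without wrapping. Returns false, and leaves
 * *target untouched, when the subtraction would underflow.
 */
static bool target_below(addr_t starta, addr_t size, addr_t *target)
{
    if (starta < size)
        return false;
    *target = starta - size;
    return true;
}
```

Without the guard, unsigned arithmetic silently wraps: with the illustrative
values starta = 0x00800000 and size = 0x3f800000, the 32-bit result of
starta - size is 0xc1000000, far above the region that was being examined.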



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Chen, Tiejun

I hope the following can address all comments below:


diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 2f8e590..a4bd2a1 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -407,7 +407,7 @@ int libxl__domain_build(libxl__gc *gc,

 switch (info->type) {
 case LIBXL_DOMAIN_TYPE_HVM:
-ret = libxl__build_hvm(gc, domid, info, state);
+ret = libxl__build_hvm(gc, domid, d_config, state);
 if (ret)
 goto out;

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 634b8d2..ba852fe 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -92,6 +92,276 @@ const char *libxl__domain_device_model(libxl__gc *gc,
 return dm;
 }

+static int
+libxl__xc_device_get_rdm(libxl__gc *gc,
+ uint32_t flag,
+ uint16_t seg,
+ uint8_t bus,
+ uint8_t devfn,
+ unsigned int *nr_entries,
+ struct xen_reserved_device_memory **xrdm)
+{
+int rc = 0, r;
+
+/*
+ * We really can't presume how many entries we can get in advance.
+ */
+*nr_entries = 0;
+r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+  NULL, nr_entries);
+assert(r <= 0);
+/* "0" means we have no any rdm entry. */
+if (!r) goto out;
+
+if (errno != ENOBUFS) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+GCNEW_ARRAY(*xrdm, *nr_entries);
+r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+  *xrdm, nr_entries);
+if (r)
+rc = ERROR_FAIL;
+
+ out:
+if (rc) {
+*nr_entries = 0;
+*xrdm = NULL;
+LOG(ERROR, "Could not get reserved device memory maps.\n");
+}
+return rc;
+}
+
+/*
+ * Check whether there exists rdm hole in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+static bool overlaps_rdm(uint64_t start, uint64_t memsize,
+ uint64_t rdm_start, uint64_t rdm_size)
+{
+return (start + memsize > rdm_start) && (start < rdm_start + rdm_size);
+}
+
+static void
+add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
+  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
+{
+assert(d_config->num_rdms);
+
+d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
+d_config->num_rdms * sizeof(libxl_device_rdm));
+
+d_config->rdms[d_config->num_rdms - 1].start = rdm_start;
+d_config->rdms[d_config->num_rdms - 1].size = rdm_size;
+d_config->rdms[d_config->num_rdms - 1].policy = rdm_policy;
+}
+
+/*
+ * Check reported RDM regions and handle potential gfn conflicts according
+ * to user preferred policy.
+ *
+ * RDM can reside in address space beyond 4G theoretically, but we never
+ * see this in real world. So in order to avoid breaking highmem layout
+ * we don't solve highmem conflict. Note this means highmem rmrr could
+ * still be supported if no conflict.
+ *
+ * But in the case of lowmem, RDM probably scatter the whole RAM space.
+ * Especially multiple RDM entries would worsen this to lead a complicated
+ * memory layout. And then its hard to extend hvm_info_table{} to work
+ * hvmloader out. So here we're trying to figure out a simple solution to
+ * avoid breaking existing layout. So when a conflict occurs,
+ *
+ * #1. Above a predefined boundary (default 2G)
+ * - Move lowmem_end below reserved region to solve conflict;
+ *
+ * #2. Below a predefined boundary (default 2G)
+ * - Check strict/relaxed policy.
+ * "strict" policy leads to fail libxl.
+ * "relaxed" policy issue a warning message and also mask this entry
+ * INVALID to indicate we shouldn't expose this entry to hvmloader.
+ * Note when both policies are specified on a given region, the per-device
+ * policy should override the global policy.
+ */
+int libxl__domain_device_construct_rdm(libxl__gc *gc,
+   libxl_domain_config *d_config,
+   uint64_t rdm_mem_boundary,
+   struct xc_hvm_build_args *args)
+{
+int i, j, conflict, rc;
+struct xen_reserved_device_memory *xrdm = NULL;
+uint32_t strategy = d_config->b_info.u.hvm.rdm.strategy;
+uint16_t seg;
+uint8_t bus, devfn;
+uint64_t rdm_start, rdm_size;
+uint64_t highmem_end = args->highmem_end ? args->highmem_end : (1ull<<32);

+
+/* Might not expose rdm. */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_IGNORE &&
+!d_config->num_pcidevs)
+return 0;
+
+/* Query all RDM entries in this platform */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {
+unsigned int nr_entries;
+
+/* Collect all rdm info if exist. */
+rc = libxl__xc_device_get_rdm(gc, PCI_DEV_RDM_ALL,
+   
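
The overlaps_rdm() helper in the patch above is a standard half-open interval
overlap test. A standalone sketch, with example addresses chosen purely for
illustration, makes the boundary behaviour easier to see. The expression is
the one from the patch; the remark about overflow is an observation added
here, not something stated in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [start, start + memsize) intersects
 * [rdm_start, rdm_start + rdm_size). This assumes that neither sum
 * overflows uint64_t.
 */
static bool overlaps_rdm(uint64_t start, uint64_t memsize,
                         uint64_t rdm_start, uint64_t rdm_size)
{
    return (start + memsize > rdm_start) && (start < rdm_start + rdm_size);
}
```

Ranges that merely touch do not count as overlapping: memory ending exactly
at 0x80000000 does not conflict with a reserved region starting at
0x80000000.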

Re: [Xen-devel] [PATCH v5 10/15] x86/altp2m: add remaining support routines.

2015-07-20 Thread Jan Beulich
>>> On 21.07.15 at 07:46,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>Sent: Sunday, July 19, 2015 11:53 PM
> On 18.07.15 at 00:32,  wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 16, 2015 2:34 AM

>>> On 16.07.15 at 11:16,  wrote:
>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>Sent: Tuesday, July 14, 2015 7:32 AM
> On 14.07.15 at 02:14,  wrote:
>>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>>  if ( npfec.write_access )
>>>  {
>>>  paging_mark_dirty(currd, mfn_x(mfn));
>>> +/* If p2m is really an altp2m, unlock here to avoid lock ordering
>>> + * violation when the change below is propagated from host p2m */
>>> +if ( ap2m_active )
>>> +__put_gfn(p2m, gfn);
>>>  p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
>>
>>And this won't result in any races?
>
> No

To be honest I expected a little more than just "no" here. Now I have
to ask - why?

>>>
>>> Yes, I should have described it more than that :-)  so this part of
>>> the code is handling the log dirty transition of the page, and this
>>> page permission transition happens always on the hostp2m. Given the
>>> way the locking order is setup (hostp2m->altp2m-list-lock->altp2m and
>>> there was a separate writeup and discussion with George on that), at
>>> this point in this sequence there is a p2m lock (whether it's a
>>> hostp2m or altp2m lock depends on the mode of the
>>> domain) - the reason we have to drop the lock here first is due to
>>> what happens next; the permission changes in hostp2m will be serially
>>> propagated to altp2ms and not dropping the lock here would cause a
>>> locking order violation. Hope that clarifies.
>>
>>Sadly it doesn't at all: You re-explain why you need to drop the lock, while you
>>fail to say anything on why this won't cause a race.
>>
> 
> It only drops the lock on the altp2m, which is no longer required in this 
> function anyway. The important aspect is that there is still a lock held on 
> the host p2m, and that is dropped after the log-dirty updates, as it would be 
> in the non-altp2m case (maybe that was the part I should have explained 
> clearly in the para above). Does that clarify or do you see a particular race 
> condition here? (We don't ).

Sounds okay then.

>>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>>> +long rc = -EINVAL;
>>
>>Why long (for both variable and function return type)? (More of
>>these in functions below.)
>
> Because the error variable in the code that calls these (in hvm.c)
> is a long, and you had given feedback earlier to propagate the
> returns from these functions through that calling code.

I don't see the connection. The function only returns zero or -E...
values, so why would its return type be "long"?

>>>
>>> do_hvm_op declares a rc that is of type "long" and hence this returns
>>> a "long"
>>
>>What type your caller(s) return is of no interest at all here: What would you do
>>if you had multiple callers with differing return types?
>>A function's return type should be chosen based on the range of values it may
>>return, and the result possibly widened to not yield inefficient code (like in
>>some of the uint16_t cases elsewhere in the series would be necessary).
>>
> 
> What do you suggest the return type be?

For the case here - int (quite obviously I would say).

For the uint16_t ones - unsigned int.

>>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>>> + mfn_t mfn, unsigned int page_order,
>>> + p2m_type_t p2mt, p2m_access_t p2ma) {
>>> +struct p2m_domain *p2m;
>>> +p2m_access_t a;
>>> +p2m_type_t t;
>>> +mfn_t m;
>>> +uint16_t i;
>>> +bool_t reset_p2m;
>>> +unsigned int reset_count = 0;
>>> +uint16_t last_reset_idx = ~0;
>>> +
>>> +if ( !altp2m_active(d) )
>>> +return;
>>> +
>>> +altp2m_list_lock(d);
>>> +
>>> +for ( i = 0; i < MAX_ALTP2M; i++ )
>>> +{
>>> +if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>>> +continue;
>>> +
>>> +p2m = d->arch.altp2m_p2m[i];
>>> +m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0, NULL);
>>> +
>>> +reset_p2m = 0;
>>> +
>>> +/* Check for a dropped page that may impact this altp2m */
>>> +if ( mfn_x(mfn) == INVALID_MFN &&
>>> + gfn_x(gfn) >= p2m->min_remapped_gfn &&
>>> + gfn_x(gfn) <= p2m->max_remapped_gfn )
>>> +reset_p2m = 1;
>>
>>Conside

Re: [Xen-devel] [PATCH] blktap2: update connection handling to fix build with gcc5

2015-07-20 Thread Olaf Hering
On Mon, Jul 20, Wei Liu wrote:

> On Mon, Jul 20, 2015 at 11:16:34AM +0200, Olaf Hering wrote:
> > On Mon, Jul 20, Jan Beulich wrote:
> > 
> > > >>> On 19.07.15 at 11:33,  wrote:
> > > > > [  198s] block-log.c:549:23: error: array subscript is above array bounds [-Werror=array-bounds]
> > > > [  198s]  if (s->connections[i].id == id)
> > > > [  198s]^
> > > 
> > > So what makes the compiler right with that complaint? I.e. how does
> > > it know i > 0 here? After all - afaict - s->connected can only be 0 or
> > 
> > It has to assume that ->connected can get any value because the input
> > comes from outside the unit.
> > 
> > To reduce the patch size "&& i < MAX_CONNECTIONS" could be added.
> > 
> 
> A smaller patch would be preferable at this stage.

I disagree with that.

What is the longterm goal of that binary?
Will it ever be able to handle more than one connection? If so the loops
have to handle holes in ->connections[], which increases patch size.

If it will ever handle only a single connection, please review and spot
possible errors in my patch.

Olaf



Re: [Xen-devel] [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.

2015-07-20 Thread Jan Beulich
>>> On 21.07.15 at 07:04,  wrote:
> Are you ok if this mechanical change doesn't go into our 4.6 series? 

Reluctantly - yes.

Jan




Re: [Xen-devel] [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-20 Thread Jan Beulich
>>> On 21.07.15 at 02:53,  wrote:
>> > Okay, I regenerate this patch online. And I just hope it's good to be
>>> acked here:
>>
>> Provided it also works,
>> Reviewed-by: Jan Beulich 
> 
> Why is this marked as Acked-by this time? The same question is raised to 
> another hvmloader patch as well.
> 
> This really makes me confused since you're the key maintainer associated 
> to this, and I remember you also gave me Acked-by to the first hvmloader 
> patch. I know this solution is always argued, so does this mean you 
> still don't think this is good to go in the tree in your perspective, so 
> you want to leave this Acked-by to other maintainers, right?

Keeping in mind that others (and other projects) may consider this
differently, at least on the hypervisor side we've been mostly
treating (or trying to treat) Reviewed-by as a superset of Acked-by.

Jan




[Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-20 Thread Brandon Perez

Hello All,

You can find the relevant thread that outlines the issue at [1].

In short, the issue is as follows. On the TI DRA72 chip, there is a 
piece of hardware called the IRQ crossbar. Due to the large number of 
peripheral devices, not all devices can fit onto the SPI lines on the 
GIC. The crossbar maps peripheral devices to interrupt lines on the GIC.


The Linux kernel handles this by performing an on-demand mapping. 
Whenever a driver requires an interrupt, it requests it from the kernel. 
The Kernel keeps an internal map of the SPI lines, and checks for the 
next free line. It then sets up the appropriate crossbar register to map 
the given device to the SPI line, and marks that line as no longer free.
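
The on-demand mapping described above can be sketched roughly as follows.
This is not the Linux crossbar driver: the line count, the crossbar_regs
array standing in for the hardware registers, and the crossbar_map() helper
are all hypothetical names used for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SPI_LINES 8     /* hypothetical; real hardware has more */

static bool spi_in_use[NUM_SPI_LINES];          /* internal map of SPI lines */
static uint16_t crossbar_regs[NUM_SPI_LINES];   /* stand-in for crossbar registers */

/*
 * Route crossbar input xbar_input to the next free SPI line.
 * Returns the SPI line chosen, or -1 if every line is taken.
 */
static int crossbar_map(uint16_t xbar_input)
{
    for (int line = 0; line < NUM_SPI_LINES; line++) {
        if (!spi_in_use[line]) {
            crossbar_regs[line] = xbar_input;   /* program the mapping */
            spi_in_use[line] = true;            /* line is no longer free */
            return line;
        }
    }
    return -1;
}
```

The point of the sketch is that the SPI line a device ends up on is only
known once the driver requests the interrupt, which is why a fixed IRQ number
cannot be read from the device tree.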


In the device tree, the "interrupts" property no longer contains 
the IRQ line corresponding to the device, but rather the crossbar input 
line corresponding to the device. This causes issues in Xen since it is 
expecting an IRQ line.


Ideally (in my opinion), the solution would be to mark these device 
tree entries (possibly with a different "interrupt-parent" property), so 
that Xen only maps the memory-mapped registers, and passes the interrupt 
on to Dom0. Dom0 would then handle the on-demand mapping of the devices. 
That way, Xen wouldn't need to repeat some fairly hardware-specific code 
that's already in the Kernel.


Below I've attached the patches of the changes I have made to Xen 
and the Linux kernel so far. For reference, here is my setup:

- Hardware: TI DRA72 Chip, Arm Cortex A15
- Xen:
- Version: 4.6-unstable
- Compiled from source, with some local changes
- Branch: master
- Commit: ecdae1cfaa7f6123decaa1b9d7205c3ff726b941
- Repo URL: git://xenbits.xen.org/xen.git
- Linux Kernel:
- Version: 3.14
- Compiled From source, with some local changes
- Branch: android-3.14-6AL.1.0
- Commit: 7b2f1133857414b96927c06f08ed6c440f5472e7
- Repo URL: git://git.omapzoom.org/kernel/omap.git

I have additional patches for U-Boot which I can provide if needed, 
but they aren't directly relevant to this issue. I also have patches for 
a static crossbar mapping with Xen, which is the temporary work-around 
I'm currently working with.


[1] http://www.gossamer-threads.com/lists/xen/users/389509

Brandon Perez

From f2bf190255c8f872d15063d7f8a6382c279e312d Mon Sep 17 00:00:00 2001
From: Brandon Perez 
Date: Mon, 20 Jul 2015 17:56:49 -0400
Subject: [PATCH 1/3] This patch adds in IO mappings for several devices which
 are not explicitly laid out in the device tree, and
 mappings for DRA72x specific regions that are not
 devices.

In the ARM virtualization setup, the virtual memory uses a 2-stage
translation. The guest OS VM performs a virtual address translation to
an intermediate physical address, which is still not a true physical
memory address. The Xen hypervisor VM then translates this IPA to an
actual physical address.

Thus, in order for a guest OS (even Dom0) to access a physical address,
Xen must explicitly setup a mapping in its VM for the guest OS to access
this location. Several devices were missing from the device tree, so Xen
was unaware of the need to map their MMIO registers. Thus, the platform
has a specific_mapping() function, which patches up these holes. The OMAP5
specific mapping function is missing a few items that DRA72x chips need,
so these were added in.
---
 xen/arch/arm/platforms/omap5.c|   27 +++
 xen/include/asm-arm/platforms/omap5.h |3 +++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c
index e7bf30d..3c6495a 100644
--- a/xen/arch/arm/platforms/omap5.c
+++ b/xen/arch/arm/platforms/omap5.c
@@ -120,6 +120,32 @@ static int omap5_specific_mapping(struct domain *d)
 return 0;
 }
 
+/* Additional mappings for dom0 (not in the DTS) */
+static int dra7_specific_mapping(struct domain *d)
+{
+/* Map the PRM module */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRM_BASE), 2,
+ paddr_to_pfn(OMAP5_PRM_BASE));
+
+/* Map the PRM_MPU */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRCM_MPU_BASE), 1,
+ paddr_to_pfn(OMAP5_PRCM_MPU_BASE));
+
+/* Map the Wakeup Gen */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_WKUPGEN_BASE), 1,
+ paddr_to_pfn(OMAP5_WKUPGEN_BASE));
+
+/* Map the on-chip SRAM */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_SRAM_PA), 32,
+ paddr_to_pfn(OMAP5_SRAM_PA));
+
+/* Map GPMC address space for NAND flash. */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_GPMC_PA), 65536,
+ paddr_to_pfn(OMAP5_GPMC_PA));
+
+return 0;
+}
+
 static int __init omap5_smp_init(void)
 {
 void __iomem *wugen_base;
@@ -171,6 +197,7 @@ PLATFORM_START(dra7, "TI DRA7")
 .init_time = omap5_init_time,
 .cpu_up = cpu_up_send_sgi,
 .smp_init = omap5_smp_

Re: [Xen-devel] Preface working plan for altp2m during freeze exception

2015-07-20 Thread Sahita, Ravi
>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Monday, July 20, 2015 12:12 AM
>
 On 17.07.15 at 21:43,  wrote:
>> We are working on addressing review comments in this order (and you
>> will see this pattern in our review responses):
>>
>> * Category 1 - Address review comments that affect ABI - these are of
>> course required and will be addressed first.
>>
>> * Category 2 - Address review comments that do not affect ABI - we
>> will try to address ones that we think we can realistically meet
>> within the time bounds - we ask you for some flexibility on these. If
>> these cannot be addressed within the allotted exception time-frame,
>> hopefully these won't be blockers for 4.6 since they can be addressed by
>> follow-on patches.
>
>Not sure - we've had bad experience with allowing code to go in with the
>promise for later adjustments (which then never happened)...
>

For some of the remaining items (not addressed by our v6) we have some 
tentative patches that we could share with you post 4.6, we just think we don't 
have time to get to Category 2 things (and probably not all of Category 4) to 
do a good job on them - please tell us based on your read of the v6 series if 
that is close to an acceptable "final - 1" series - with any minor i's to be 
dotted and t's to be crossed in the final - since realistically, we need to get 
a final patch series to you guys by Wednesday evening PDT (Thursday AM for you) 
- correct?

>> * Category 3 - Address review comments that are really design
>> questions - These we will try to address by short descriptions in
>> review replies that attempt to give a gist of the design we followed,
>> but of course design changes obviously cannot be done at this late
>> stage - hopefully that is expected.
>
>If you really just mean questions on the design (rather than questions possibly
>resulting in the requirement to change the design), then that'd be fine of
>course. I think you understand that we shouldn't be deferring issues that
>require design adjustments. Otoh I don't even recall what design questions
>there were.
>
>> * Category 4 - Address trivial changes as we naturally update patches,
>> however if we run out of time, some may remain un-addressed (to be
>> taken care of post 4.6).
>
>See above (point 2).
>
>> Can we please get a "yes - makes sense" sort of acknowledgement of
>> this plan from the Maintainers?
>
>Considering the limitations above, this is only a "maybe" from me.
>

Could you please review the v6 patches and see if your "maybe" can change to a 
"good for 4.6, pending changes post 4.6" - thanks.
Also please see my responses to your other open comments - I've tried to 
address all of them.

>> YN   [PATCH v3 05/15] x86/altp2m: basic data structures and support routines
>> Status if not acked: Category 3: we will write a short description of some
>> design questions in review replies
>>  Category 2: moving altp2m struct to be dynamically allocated - this
>> has minor benefit and big downside so will be lower priority, also
>> some error handling fixes
>Big downside? You're not referring to the mechanical adjustments this
>implies, are you?
>

Just time to turn around and test the changes with our tests - like I said, we 
have a patch for this now - but time is short, so would request for this to not 
be the blocker.


>> YN   [PATCH v3 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator
>> Status if not acked: Category 2/3 - changes staged after Jan's feedback on v5 -
>> ack with those addressed in v6?
>
>Let's see what v6 looks like.

Thanks.

>
>> YN   [PATCH v3 08/15] x86/altp2m: add control of suppress_ve
>> Status if not acked: Now has r-b both Jan and George -  need maintainers ack on
>> this one please
>
>Who do you refer to by "maintainer" here? I think the trivial adjustment to
>xen/arch/x86/mm/mem_sharing.c can in the worst case go in without Andres'
>ack. And everything else is covered by George's authorship and review.

Ok thanks,

Ravi

>
>Jan




[Xen-devel] [linux-linus test] 59770: regressions - FAIL

2015-07-20 Thread osstest service owner
flight 59770 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59770/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-xsm   13 guest-saverestore fail REGR. vs. 59254
 test-amd64-i386-xl   13 guest-saverestore fail REGR. vs. 59254
 test-amd64-i386-pair   21 guest-migrate/src_host/dst_host fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2   6 xen-boot  fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu  9 debian-install   fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 59254
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 59254
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 59423-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestore fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass

version targeted for testing:
 linux 52721d9d3334c1cb1f76219a161084094ec634dc
baseline version:
 linux 45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis  59254  2015-07-09 04:20:48 Z   12 days
Failing since       59348  2015-07-10 04:24:05 Z   11 days   11 attempts
Testing same since  59770  2015-07-20 12:35:55 Z    0 days    1 attempts


406 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   fail
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   fail
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd

Re: [Xen-devel] [PATCH v5 06/15] VMX/altp2m: add code to support EPTP switching and #VE.

2015-07-20 Thread Sahita, Ravi
>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Sunday, July 19, 2015 11:22 PM
>
 On 17.07.15 at 23:08,  wrote:
>>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>>Sent: Thursday, July 16, 2015 2:39 AM
>>>
>> On 16.07.15 at 11:20,  wrote:
> From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Tuesday, July 14, 2015 6:57 AM
 On 14.07.15 at 02:14,  wrote:
>> +static bool_t vmx_vcpu_emulate_ve(struct vcpu *v) {
>> +bool_t rc = 0;
>> +ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) !=
>> +INVALID_GFN
>?
>> +hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn),
>0) :
>> +NULL;
>> +
>> +if ( !veinfo )
>> +return 0;
>> +
>> +if ( veinfo->semaphore != 0 )
>> +goto out;
>> +
>> +rc = 1;
>> +
>> +veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
>> +veinfo->semaphore = ~0l;
>
>Isn't semaphore a 32-bit quantity?

 Yes.
>>>
>>>I.e. the l suffix can and should be dropped.
>>>
>>
>> Ok.
>>
>> +{
>> +unsigned long idx;
>> +
>> +if ( v->arch.hvm_vmx.secondary_exec_control &
>> +SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
>> +__vmread(EPTP_INDEX, &idx);
>> +else
>> +{
>> +unsigned long eptp;
>> +
>> +__vmread(EPT_POINTER, &eptp);
>> +
>> +if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
>> + INVALID_ALTP2M )
>> +{
>> +gdprintk(XENLOG_ERR, "EPTP not found in alternate
>> + p2m
>>>list\n");
>> +domain_crash(v->domain);
>> +}
>> +}
>> +
>> +if ( (uint16_t)idx != vcpu_altp2m(v).p2midx )
>
>Is this cast really necessary?

 Yes - The index is 16-bits, this reflects how the field is specified
 in the vmcs also.
>>>
>>>While "yes" answers the question, the explanation you give suggests
>>>that the answer may be wrong: Can idx indeed have bits set beyond bit
>>>15? Because if it can't, the cast is pointless.
>>>
>>
>> We were just trying to ensure we matched the hardware behavior (I
>> think there was a message George had posted earlier for SVE that asked for
>that).
>> Since hardware considers only a 16 bit field we were doing the same.
>>
>> +{
>> +BUG_ON(idx >= MAX_ALTP2M);
>> +atomic_dec(&p2m_get_altp2m(v)->active_vcpus);
>> +vcpu_altp2m(v).p2midx = (uint16_t)idx;
>
>This one surely isn't (or else the field type is wrong).

 Again required. idx can't be uint16_t because __vmread() requires
 unsigned long*, but the index is 16 bits.
>>>
>>>But it's a 16-bit VMCS field that you read it from, and hence the
>>>upper 48
>> bits
>>>are necessarily zero.
>>>
>>>Just to re-iterate: Casts are necessary in certain places, yes, but I
>>>see them used pointlessly or even wrongly more often than not.
>>
>> Same approach as above - emulating hardware exactly.
>> Should we add a comment?
>
>No, drop the casts.
>
>Jan

Ok

Ravi
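
A note on the cast discussion in this thread: the review point is that a value read from a 16-bit field cannot have bits set above bit 15, so casting it to uint16_t before comparing or storing it changes nothing, and a 32-bit field does not need an `l`-suffixed constant. The following is a minimal standalone sketch of that argument, not Xen code. All names in it (read_field16(), struct vcpu_state, update_index(), set_semaphore()) are hypothetical and exist only for illustration; read_field16() merely imitates a read that zero-extends a 16-bit quantity into an unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for reading a 16-bit hardware field through an
 * unsigned long out-parameter: the upper bits are always zero. */
static void read_field16(uint16_t raw, unsigned long *out)
{
    *out = raw;                 /* zero-extended */
}

/* Hypothetical per-vCPU state with a 16-bit index and a 32-bit flag. */
struct vcpu_state {
    uint16_t p2midx;
    uint32_t semaphore;
};

/* Returns 1 if the stored index was updated, 0 if it already matched. */
static int update_index(struct vcpu_state *s, uint16_t hw_value)
{
    unsigned long idx;

    read_field16(hw_value, &idx);

    /* No cast needed: idx is known to be <= 0xffff, and p2midx is
     * promoted for the comparison. */
    if ( idx != s->p2midx )
    {
        s->p2midx = idx;        /* implicit conversion; the value fits */
        return 1;
    }

    return 0;
}

static void set_semaphore(struct vcpu_state *s)
{
    s->semaphore = ~0u;         /* 32-bit field: no 'l' suffix required */
}
```

If the source value could carry bits above bit 15, the comparison would behave differently with and without the cast; the sketch only covers the case argued in the review, where it cannot.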



Re: [Xen-devel] [PATCH v5 10/15] x86/altp2m: add remaining support routines.

2015-07-20 Thread Sahita, Ravi
>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Sunday, July 19, 2015 11:53 PM
>
 On 18.07.15 at 00:32,  wrote:
>>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>>Sent: Thursday, July 16, 2015 2:34 AM
>>>
>> On 16.07.15 at 11:16,  wrote:
> From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Tuesday, July 14, 2015 7:32 AM
 On 14.07.15 at 02:14,  wrote:
>> @@ -2965,9 +3003,15 @@ int hvm_hap_nested_page_fault(paddr_t
>gpa,
>unsigned long gla,
>>  if ( npfec.write_access )
>>  {
>>  paging_mark_dirty(currd, mfn_x(mfn));
>> +/* If p2m is really an altp2m, unlock here to avoid
>> + lock
 ordering
>> + * violation when the change below is propagated from
>> + host p2m
 */
>> +if ( ap2m_active )
>> +__put_gfn(p2m, gfn);
>>  p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>> p2m_ram_rw);
>
>And this won't result in any races?

 No
>>>
>>>To be honest I expected a little more than just "no" here. Now I have
>>>to ask - why?
>>>
>>
>> Yes, I should have described it more than that :-)  so this part of
>> the code is handling the log dirty transition of the page, and this
>> page permission transition happens always on the hostp2m. Given the
>> way the locking order is setup (hostp2m->altp2m-list-lock->altp2m and
>> there was a separate writeup and discussion with George on that), at
>> this point in this sequence there is a p2m lock (whether it's a
>> hostp2m or altp2m lock depends on the mode of the
>> domain) - the reason we have to drop the lock here first is due to
>> what happens next; the permission changes in hostp2m will be serially
>> propagated to altp2ms and not dropping the lock here would cause a
>> locking order violation. Hope that clarifies.
>
>Sadly it doesn't at all: You re-explain why you need to drop the lock, while
>you fail to say anything on why this won't cause a race.
>

It only drops the lock on the altp2m, which is no longer required in this
function anyway. The important aspect is that a lock is still held on the
host p2m, and that lock is dropped after the log-dirty updates, as it would be
in the non-altp2m case (maybe that was the part I should have explained more
clearly in the paragraph above). Does that clarify, or do you see a particular
race condition here? (We don't.)
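
The ordering argument above (hostp2m, then altp2m list lock, then altp2m) can be made concrete with a small standalone model. This is a sketch under stated assumptions, not the hypervisor's locking code: the level numbers, struct lock_tracker, tracker_acquire() and tracker_release() are all invented here. The model only records which levels are held and refuses to take a lock whose level is not strictly higher than every level already held.

```c
#include <assert.h>

/* Hypothetical lock levels, in the order described in the thread. */
enum lock_level {
    LVL_NONE        = 0,
    LVL_HOSTP2M     = 1,
    LVL_ALTP2M_LIST = 2,
    LVL_ALTP2M      = 3
};

#define MAX_HELD 8

struct lock_tracker {
    enum lock_level held[MAX_HELD];
    int depth;
};

/* Highest level currently held, or LVL_NONE if nothing is held. */
static enum lock_level top_level(const struct lock_tracker *t)
{
    enum lock_level top = LVL_NONE;

    for ( int i = 0; i < t->depth; i++ )
        if ( t->held[i] > top )
            top = t->held[i];

    return top;
}

/* Returns 0 on success, -1 if taking lvl would violate the ordering. */
static int tracker_acquire(struct lock_tracker *t, enum lock_level lvl)
{
    if ( t->depth >= MAX_HELD || lvl <= top_level(t) )
        return -1;

    t->held[t->depth++] = lvl;
    return 0;
}

/* Returns 0 on success, -1 if lvl was not held. */
static int tracker_release(struct lock_tracker *t, enum lock_level lvl)
{
    for ( int i = 0; i < t->depth; i++ )
    {
        if ( t->held[i] == lvl )
        {
            t->held[i] = t->held[--t->depth];   /* fill the gap */
            return 0;
        }
    }

    return -1;
}
```

In this model, holding LVL_HOSTP2M and LVL_ALTP2M and then asking for LVL_ALTP2M_LIST fails, while releasing LVL_ALTP2M first (and keeping LVL_HOSTP2M) lets the list lock be taken. That mirrors why the altp2m lock is dropped before the change is propagated; it says nothing either way about the race question raised in the review.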

>> +long p2m_init_altp2m_by_id(struct domain *d, uint16_t idx) {
>> +long rc = -EINVAL;
>
>Why long (for both variable and function return type)? (More of
>these in functions below.)

 Because the error variable in the code that calls these (in hvm.c)
 is a long, and you had given feedback earlier to propagate the
 returns from these functions through that calling code.
>>>
>>>I don't see the connection. The function only returns zero or -E...
>>>values, so why would its return type be "long"?
>>>
>>
>> do_hvm_op declares a rc that is of type "long" and hence this returns
>> a "long"
>
>What type your caller(s) return is of no interest at all here: What would you
>do if you had multiple callers with differing return types?
>A function's return type should be chosen based on the range of values it may
>return, and the result possibly widened to not yield inefficient code (like in
>some of the uint16_t cases elsewhere in the series would be necessary).
>

What do you suggest the return type be?
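
The thread does not spell out an answer at this point. One reading of the review comment is that a function returning only zero or a negative errno value fits in a plain int, and that a caller with a wider rc variable simply widens the result. The sketch below illustrates that reading only; init_view_by_id(), do_op() and MAX_VIEWS are hypothetical names and values, not taken from the series.

```c
#include <assert.h>
#include <errno.h>

#define MAX_VIEWS 10   /* arbitrary value, assumed for illustration */

/* Returns 0 or a negative errno value, so int covers the whole range. */
static int init_view_by_id(unsigned int idx)
{
    if ( idx >= MAX_VIEWS )
        return -EINVAL;

    return 0;
}

/* A caller that happens to use a wider type for its own result. */
static long do_op(unsigned int idx)
{
    long rc = init_view_by_id(idx);   /* int widens to long, sign kept */

    return rc;
}
```

A second caller returning int could use the same helper unchanged, which is the situation the "multiple callers with differing return types" question points at.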

>> +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
>> + mfn_t mfn, unsigned int page_order,
>> + p2m_type_t p2mt, p2m_access_t
>> +p2ma) {
>> +struct p2m_domain *p2m;
>> +p2m_access_t a;
>> +p2m_type_t t;
>> +mfn_t m;
>> +uint16_t i;
>> +bool_t reset_p2m;
>> +unsigned int reset_count = 0;
>> +uint16_t last_reset_idx = ~0;
>> +
>> +if ( !altp2m_active(d) )
>> +return;
>> +
>> +altp2m_list_lock(d);
>> +
>> +for ( i = 0; i < MAX_ALTP2M; i++ )
>> +{
>> +if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
>> +continue;
>> +
>> +p2m = d->arch.altp2m_p2m[i];
>> +m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0,
>> + NULL);
>> +
>> +reset_p2m = 0;
>> +
>> +/* Check for a dropped page that may impact this altp2m */
>> +if ( mfn_x(mfn) == INVALID_MFN &&
>> + gfn_x(gfn) >= p2m->min_remapped_gfn &&
>> + gfn_x(gfn) <= p2m->max_remapped_gfn )
>> +reset_p2m = 1;
>
>Considering that this looks like an optimization, what's the
>downside of possibly having min=0 and max=?
>I.e.
>can there a long latency operation result that's this way a guest can
>effect?
>

 ... A p2m is a gfn->mfn map, amongst other things. There is a
 r

Re: [Xen-devel] [PATCH v6 15/15] tools/xen-access: altp2m testcases

2015-07-20 Thread Razvan Cojocaru
On 07/21/2015 02:58 AM, Ed White wrote:
> From: Tamas K Lengyel 
> 
> Working altp2m test-case. Extended the test tool to support singlestepping
> to better highlight the core feature of altp2m view switching.
> 
> Signed-off-by: Tamas K Lengyel 
> Signed-off-by: Ed White 
> 
> Reviewed-by: Razvan Cojocaru 
> Acked-by: Wei Liu 
> ---
>  tools/tests/xen-access/xen-access.c | 173 
> ++--
>  1 file changed, 148 insertions(+), 25 deletions(-)

As previously discussed (privately) with Tamas, if an Acked-by is more
appropriate here (since in the meantime MAINTAINERS has been updated),
then please turn the Reviewed-by into an:

Acked-by: Razvan Cojocaru 

I'm not sure this is required, since a recent xen-devel discussion has
stated that Reviewed-by is stronger than, and implies, Acked-by, but
just in case I've read that wrong I don't want to hold up the patch.


Thanks,
Razvan



Re: [Xen-devel] [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.

2015-07-20 Thread Sahita, Ravi
>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Sunday, July 19, 2015 11:20 PM
>
 On 18.07.15 at 00:36,  wrote:
>>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>>Sent: Thursday, July 16, 2015 2:08 AM
>>>
>> On 16.07.15 at 10:57,  wrote:
> From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Tuesday, July 14, 2015 6:13 AM
 On 14.07.15 at 02:14,  wrote:
>> @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct
>p2m_domain *p2m, unsigned long gfn,
>>  l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
>>
>>  /*
>> + * Alternate p2m: shadow p2m tables used for alternate memory
>> + views */
>> +
>> +/* get current alternate p2m table */ static inline struct
>> +p2m_domain *p2m_get_altp2m(struct vcpu *v) {
>> +struct domain *d = v->domain;
>> +uint16_t index = vcpu_altp2m(v).p2midx;
>> +
>> +ASSERT(index < MAX_ALTP2M);
>> +
>> +return (index == INVALID_ALTP2M) ? NULL :
>> +d->arch.altp2m_p2m[index]; }
>
>Looking at this again, I'm afraid I'd still prefer index <
>MAX_ALTP2M in the return statement (and the ASSERT() dropped): The
>ASSERT() does nothing in a debug=n build, and hence wouldn't shield
>us from possibly having to issue an XSA if somehow an access outside
>the array's bounds
>>>turned out possible.
>

 the assert was breaking v5 anyway. BUG_ON (with the right check) is
 probably the right thing to do, as we do in the exit handling that
 checks for a VMFUNC having changed the index.
 So will make that change.
>>>
>>>But why use a BUG_ON() when you can deal with this more gracefully?
>>>Please try to avoid crashing the hypervisor when there are other ways to
>recover.
>>>
>>
>> So in this case there isn't a graceful fallback; this case can happen
>> only if there is a bug in the hypervisor - which should be reported via the
>BUG_ON.
>
>Generally (as an example), if a hypervisor bug can be confined to a guest,
>killing just the guest instead of the hypervisor would still be preferred
>(allowing the admin to attempt to gracefully shut down other guests before
>updating/restarting).
>
>Jan

I agree with that principle, and that's what I was looking to do in the last
iteration, but in this sort of error condition there is no telling what else
could have gone wrong on the hypervisor side to cause this, so the crash
treatment seems suitable.

Ravi
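
For reference, the alternative requested earlier in this thread (testing the index against the array bound in the return statement rather than relying on ASSERT(), which does nothing in a debug=n build) could look roughly like the standalone sketch below. It is not the code from the series: struct view, struct domain_views, get_view(), MAX_VIEWS and INVALID_VIEW are hypothetical stand-ins, and the constant values are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VIEWS    10        /* assumed array size */
#define INVALID_VIEW 0xffff    /* assumed "no view" marker */

struct view {
    int id;
};

struct domain_views {
    struct view *views[MAX_VIEWS];
};

/* Returns the view for index, or NULL for the invalid marker and for any
 * other out-of-range index. The check is active in every build. */
static struct view *get_view(const struct domain_views *d, uint16_t index)
{
    return (index < MAX_VIEWS) ? d->views[index] : NULL;
}
```

Callers then have to handle a NULL result, which is where the choice between crashing only the guest and crashing the hypervisor, as debated above, would be made.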





Re: [Xen-devel] [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.

2015-07-20 Thread Sahita, Ravi
>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Sunday, July 19, 2015 11:18 PM
>
 On 18.07.15 at 00:39,  wrote:
>>> From: Jan Beulich [mailto:jbeul...@suse.com]
>>>Sent: Thursday, July 16, 2015 2:02 AM
>>>
>> On 16.07.15 at 10:48,  wrote:
> From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Tuesday, July 14, 2015 1:53 AM
 On 14.07.15 at 02:01,  wrote:
>>>From: Jan Beulich [mailto:jbeul...@suse.com]
>>>Sent: Monday, July 13, 2015 1:01 AM
>> On 10.07.15 at 23:48,  wrote:
> From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Thursday, July 09, 2015 6:30 AM
 On 01.07.15 at 20:09,  wrote:
>> @@ -294,6 +298,12 @@ struct arch_domain
>>  struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
>>  mm_lock_t nested_p2m_lock;
>>
>> +/* altp2m: allow multiple copies of host p2m */
>> +bool_t altp2m_active;
>> +struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
>> +mm_lock_t altp2m_lock;
>> +uint64_t *altp2m_eptp;
>
>This is a non-insignificant increase of the structure size -
>perhaps all of these should hang off of struct arch_domain via a
>single, separately allocated pointer?

 Is this a nice-to-have - again we modelled after the nestedhvm
 extensions to the struct.
 This will affect a lot of our patch without really changing how
 much memory is allocated.
>>>
>>>I understand that. To a certain point I can agree to limit changes
>>>to what is there at this stage. But you wanting to avoid
>>>addressing concerns basically everywhere it's not a bug
>>>overextends this. Just because the series was submitted late
>>>doesn't mean you should now expect us to give in on any
>>>controversy regarding aspects we would normally expect to be
>>>changed. This would basically encourage others to submit their
>>>stuff late too in the
>> future,
>>>hoping for relaxed review.
>>>
>>
>> Couple things - first, we have absorbed a lot of (good) feedback -
>> thanks for that.
>> Second, I don't think the series can be characterized as late
>> (feedback from others welcome).
>> V1 had almost the same structure and was submitted in January.
>
>Still we're at v3 only here, not v10 or beyond.
>
>> On this change - this will be a lot of effects on the code and we
>> would like to avoid this one.
>
>While this may be a lot of mechanical change, I don't see this presenting any
>major risk of breaking the code.

 On this one specific advice on how and where to implement such a
 change would be great just so that we don't thrash on this change.
>>>
>>>I don't follow - what to do here was said quite explicitly (still
>>>visible in
>> the
>>>context above). I.e. I have no idea what additional advice you seek.
>>
>> Ok that's fine - sorry if this was unclear - I was seeking if you had
>> some specific feedback on how to allocate and manage the dynamic
>> altp2m struct etc (if you had an opinion would be good to hear that).
>
>xmalloc() / xzalloc(). What other alternatives would you see?
>
>Jan

Ok, understood. As you must have seen, this change is not in our v6; that is
per the work plan in our preface, and we may not be able to get this change
into the series proposed for 4.6.
I do want to assure you that the change can be made subsequently. To address
your previous point, I prepared a delta patch for this change tonight, but we
still need to test it, and that takes a decent chunk of time.
Are you ok if this mechanical change doesn't go into our 4.6 series?

Thanks,
Ravi
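
The restructuring discussed in this thread (moving the altp2m fields behind a single, separately allocated pointer) can be sketched in userspace C as follows. This is an illustration under assumptions, not the delta patch mentioned above: struct altviews, struct arch_dom, altviews_init() and altviews_teardown() are hypothetical names, MAX_VIEWS is an arbitrary value, and calloc() plays the role a zeroing allocator such as xzalloc() would play in the hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_VIEWS 10   /* arbitrary value, assumed for illustration */

/* Fields that would otherwise sit directly in the per-domain struct. */
struct altviews {
    bool      active;
    void     *tables[MAX_VIEWS];
    uint64_t *eptp;
};

/* The per-domain struct grows by one pointer only. */
struct arch_dom {
    struct altviews *alt;   /* NULL until the feature is set up */
};

/* Returns 0 on success, -1 if the allocation failed. */
static int altviews_init(struct arch_dom *a)
{
    a->alt = calloc(1, sizeof(*a->alt));   /* zero-initialised */

    return a->alt ? 0 : -1;
}

static void altviews_teardown(struct arch_dom *a)
{
    free(a->alt);
    a->alt = NULL;
}
```

Every access then goes through the pointer (a->alt->active rather than a->active), which is the mechanical part of the change referred to in the discussion.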




[Xen-devel] [qemu-mainline test] 59769: regressions - FAIL

2015-07-20 Thread osstest service owner
flight 59769 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59769/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 12 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-ovmf-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-ovmf-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-winxpsp3 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-freebsd10-amd64 12 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-win7-amd64 11 guest-saverestore fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-win7-amd64 11 guest-saverestore fail REGR. vs. 59059

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59059
 test-amd64-amd64-libvirt 11 guest-start  fail   like 59059
 test-amd64-i386-libvirt  11 guest-start  fail   like 59059
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 59059

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  11 guest-start            fail  never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm   11 guest-start            fail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-pvh-amd    11 guest-start            fail  never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt       12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-xsm        12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl            12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check  fail  never pass

version targeted for testing:
 qemuu                71358470eec668f5dc53def25e585ce250cea9bf
baseline version:
 qemuu                35360642d043c2a5366e8a04a10e5545e7353bd5

Last test of basis    59059  2015-07-05 10:39:20 Z   15 days
Failing since         59109  2015-07-06 14:58:21 Z   14 days   21 attempts
Testing same since    59769  2015-07-20 12:37:30 Z    0 days    1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Williamson 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alvise Rigo 
  Amit Shah 
  Andreas Färber 
  Andrew Bennett 
  Andrew Jones 
  Artyom Tarasenko 
  Aurelien Jarno 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Bharata B Rao 
  Brian Kress 
  Christian Borntraeger 
  Christoph Hellwig 
  Claudio Fontana 
  Cormac O'Brien 
  Cornelia Huck 
  Daniel P. Berrange 
  David Gibson 
  Denis V. Lunev 
  Dmitry Osipenko 
  Dr. David Alan Gilbert 
  Eduardo Habkost 
  Eric Auger 
  Fam Zheng 
  Frediano Ziglio 
  Gabriel Laupre 
  Gavin Shan 
  Gerd Hoffmann 
  Gonglei 
  Greg Kurz 
  Hannes Reinecke 
  Hervé Poussineau 
  Igor Mammedov 
  James Hogan 
  Jan Kiszka 
  Jason Wang 
  Jeff Cody 
  Johannes Schlatow 
  John Snow 
  Josh Durgin 
  Juan Quintela 
  Justin Ossevoort 
  Keith Busch 
  Kevin Wolf 
  Kirk Allan 
  Laszlo Ersek 
  Laurent Vivier 
  Laurent Vivier 
  Leon Alrae 
  Li Zhijian 
  Liang Li 
  Lin Ma 
  Marc-André Lureau 
  Markus Armbruster 
  Max Filippov 
  Michael Roth 
  Michael S. Tsirkin 
  Nikunj A Dadhania 
  Olga Krishtal 
  Pankaj Gupta 
  Paolo Bonzini 
  Paul Durrant 
  Paulo Alcantara 
  Paulo Alcantara 
  Peter Crosthwaite 
  Peter Crosthwaite 
  Peter Maydell 
  Radim Krčmář 
  Richard W.M. Jones 
  Scott Feldman 
  Sergey Fedorov 
  Stefan Hajnoczi 
  Stefan Weil 
  Ting Wang 
  Vikram Sethi 
  Wen Congyang 
  Wenshuang Ma 
  Wolfgang Bumiller 
  Xu Wang 
  Yongbok Kim 
  马文霜 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt 

Re: [Xen-devel] [Patch V6 12/16] mm: provide early_memremap_ro to establish read-only mapping

2015-07-20 Thread Juergen Gross

Hi MM maintainers,

this patch is the last requiring an ack for the series to go in.
Could you please comment?


Juergen

On 07/17/2015 06:51 AM, Juergen Gross wrote:

During early boot as Xen pv domain the kernel needs to map some page
tables supplied by the hypervisor read only. This is needed to be
able to relocate some data structures conflicting with the physical
memory map especially on systems with huge RAM (above 512GB).

Provide the function early_memremap_ro() to provide this read only
mapping.

Signed-off-by: Juergen Gross 
Acked-by: Konrad Rzeszutek Wilk 
Cc: Arnd Bergmann 
Cc: linux...@kvack.org
Cc: linux-a...@vger.kernel.org
---
  include/asm-generic/early_ioremap.h |  2 ++
  include/asm-generic/fixmap.h|  3 +++
  mm/early_ioremap.c  | 12 
  3 files changed, 17 insertions(+)

diff --git a/include/asm-generic/early_ioremap.h b/include/asm-generic/early_ioremap.h
index a5de55c..316bd04 100644
--- a/include/asm-generic/early_ioremap.h
+++ b/include/asm-generic/early_ioremap.h
@@ -11,6 +11,8 @@ extern void __iomem *early_ioremap(resource_size_t phys_addr,
   unsigned long size);
  extern void *early_memremap(resource_size_t phys_addr,
unsigned long size);
+extern void *early_memremap_ro(resource_size_t phys_addr,
+  unsigned long size);
  extern void early_iounmap(void __iomem *addr, unsigned long size);
  extern void early_memunmap(void *addr, unsigned long size);

diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h
index f23174f..1cbb833 100644
--- a/include/asm-generic/fixmap.h
+++ b/include/asm-generic/fixmap.h
@@ -46,6 +46,9 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr)
  #ifndef FIXMAP_PAGE_NORMAL
  #define FIXMAP_PAGE_NORMAL PAGE_KERNEL
  #endif
+#if !defined(FIXMAP_PAGE_RO) && defined(PAGE_KERNEL_RO)
+#define FIXMAP_PAGE_RO PAGE_KERNEL_RO
+#endif
  #ifndef FIXMAP_PAGE_NOCACHE
  #define FIXMAP_PAGE_NOCACHE PAGE_KERNEL_NOCACHE
  #endif
diff --git a/mm/early_ioremap.c b/mm/early_ioremap.c
index e10ccd2..0cfadaf 100644
--- a/mm/early_ioremap.c
+++ b/mm/early_ioremap.c
@@ -217,6 +217,13 @@ early_memremap(resource_size_t phys_addr, unsigned long size)
return (__force void *)__early_ioremap(phys_addr, size,
   FIXMAP_PAGE_NORMAL);
  }
+#ifdef FIXMAP_PAGE_RO
+void __init *
+early_memremap_ro(resource_size_t phys_addr, unsigned long size)
+{
+   return (__force void *)__early_ioremap(phys_addr, size, FIXMAP_PAGE_RO);
+}
+#endif
  #else /* CONFIG_MMU */

  void __init __iomem *
@@ -231,6 +238,11 @@ early_memremap(resource_size_t phys_addr, unsigned long size)
  {
return (void *)phys_addr;
  }
+void __init *
+early_memremap_ro(resource_size_t phys_addr, unsigned long size)
+{
+   return (void *)phys_addr;
+}

  void __init early_iounmap(void __iomem *addr, unsigned long size)
  {






[Xen-devel] [PATCH 3/3] xen-blkback: rm BUG_ON() in purge_persistent_gnt()

2015-07-20 Thread Bob Liu
This BUG_ON() is triggered when the previous purge work hasn't finished.
That can legitimately happen under pretty extreme load and should not panic
the system.

Signed-off-by: Bob Liu 
---
 drivers/block/xen-blkback/blkback.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index ced9677..b90ac8e 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -394,7 +394,9 @@ static void purge_persistent_gnt(struct xen_blkif *blkif)
 
 	pr_debug("Going to purge %u persistent grants\n", num_clean);
 
-	BUG_ON(!list_empty(&blkif->persistent_purge_list));
+	if (!list_empty(&blkif->persistent_purge_list))
+		return;
+
 	root = &blkif->persistent_gnts;
 purge_list:
 	foreach_grant_safe(persistent_gnt, n, root, node) {
-- 
1.7.10.4
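
The pattern in this patch (skipping a new purge while the previous batch is still pending, instead of asserting that it cannot happen) can be shown in isolation. The following is a minimal standalone sketch, not the blkback code: struct purge_state, try_purge() and purge_work_done() are hypothetical, and simple counters stand in for the grant lists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: 'pending' counts entries queued by an earlier purge
 * that the worker has not processed yet; 'cached' counts entries that are
 * still candidates for purging. */
struct purge_state {
    unsigned int pending;
    unsigned int cached;
};

/* Queues up to num_clean entries for purging. Returns false, and changes
 * nothing, if the previous batch has not been processed yet. */
static bool try_purge(struct purge_state *s, unsigned int num_clean)
{
    if (s->pending != 0)
        return false;           /* previous purge still in flight: skip */

    if (num_clean > s->cached)
        num_clean = s->cached;

    s->cached -= num_clean;
    s->pending = num_clean;
    return true;
}

/* Called when the worker has finished with the queued batch. */
static void purge_work_done(struct purge_state *s)
{
    s->pending = 0;
}
```

The skipped purge is simply retried on a later call, so the cost of the early return in this model is a delay rather than a lost cleanup.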




[Xen-devel] [PATCH 2/3] xen-blkfront: rm BUG_ON(info->feature_persistent) in blkif_free

2015-07-20 Thread Bob Liu
This BUG_ON() in blkif_free() is incorrect, because indirect pages can be added
to the list info->indirect_pages in blkif_completion() regardless of whether
feature_persistent is true or false.

Signed-off-by: Bob Liu 
---
 drivers/block/xen-blkfront.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index e266d17..c98fcd0 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -986,7 +986,6 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 	if (!list_empty(&info->indirect_pages)) {
 		struct page *indirect_page, *n;
 
-		BUG_ON(info->feature_persistent);
 		list_for_each_entry_safe(indirect_page, n, &info->indirect_pages, lru) {
 			list_del(&indirect_page->lru);
 			__free_page(indirect_page);
-- 
1.7.10.4




[Xen-devel] [PATCH 1/3] xen-blkfront: introduce blkfront_gather_backend_features()

2015-07-20 Thread Bob Liu
There is a bug when migrating from a !feature-persistent host to a
feature-persistent host, because domU still thinks the new host/backend doesn't
support persistent grants.
Dmesg looks like:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839

The fix is to recheck feature-persistent of new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469

As Roger suggested, we can split the part of blkfront_connect that checks for
optional features, like persistent grants, indirect descriptors and
flush/barrier features, into a separate function and call it from both
blkfront_connect and blkif_recover.

Signed-off-by: Bob Liu 
---
 drivers/block/xen-blkfront.c |  118 +++---
 1 file changed, 66 insertions(+), 52 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5b45ee5..e266d17 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -181,6 +181,7 @@ static DEFINE_SPINLOCK(minor_lock);
((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
+static void blkfront_gather_backend_features(struct blkfront_info *info);
 
 static int get_id_from_freelist(struct blkfront_info *info)
 {
@@ -1514,6 +1515,7 @@ static int blkif_recover(struct blkfront_info *info)
info->shadow_free = info->ring.req_prod_pvt;
info->shadow[BLK_RING_SIZE(info)-1].req.u.rw.id = 0x0fff;
 
+   blkfront_gather_backend_features(info);
rc = blkfront_setup_indirect(info);
if (rc) {
kfree(copy);
@@ -1694,20 +1696,13 @@ static void blkfront_setup_discard(struct blkfront_info *info)
 
 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-   unsigned int indirect_segments, segs;
+   unsigned int segs;
int err, i;
 
-   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-   "feature-max-indirect-segments", "%u", &indirect_segments,
-   NULL);
-   if (err) {
-   info->max_indirect_segments = 0;
+   if (info->max_indirect_segments == 0)
segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
-   } else {
-   info->max_indirect_segments = min(indirect_segments,
- xen_blkif_max_segments);
+   else
segs = info->max_indirect_segments;
-   }
 
err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE(info));
if (err)
@@ -1771,6 +1766,66 @@ out_of_memory:
 }
 
 /*
+ * Gather all backend feature-*
+ */
+static void blkfront_gather_backend_features(struct blkfront_info *info)
+{
+   int err;
+   int barrier, flush, discard, persistent;
+   unsigned int indirect_segments;
+
+   info->feature_flush = 0;
+
+   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+   "feature-barrier", "%d", &barrier,
+   NULL);
+
+   /*
+* If there's no "feature-barrier" defined, then it means
+* we're dealing with a very old backend which writes
+* synchronously; nothing to do.
+*
+* If there are barriers, then we use flush.
+*/
+   if (!err && barrier)
+   info->feature_flush = REQ_FLUSH | REQ_FUA;
+   /*
+* And if there is "feature-flush-cache" use that above
+* barriers.
+*/
+   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+   "feature-flush-cache", "%d", &flush,
+   NULL);
+
+   if (!err && flush)
+   info->feature_flush = REQ_FLUSH;
+
+   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+   "feature-discard", "%d", &discard,
+   NULL);
+
+   if (!err && discard)
+   blkfront_setup_discard(info);
+
+   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+   "feature-persistent", "%u", &persistent,
+   NULL);
+   if (err)
+   info->feature_persistent = 0;
+   else
+   info->feature_persistent = persistent;
+
+   err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+   "feature-max-indirect-segments", "%u", &indirect_segments,
+   NULL);
+   if (err)
+   info->max_indirect_segments = 0;
+   else
+   info->max_indirect_segments = min(indirect_segments,
+ xen_blkif_max_segments);
+}
+
+/*
  * Invoked when the backend is finally 'ready' (and has told produced
  * the details about the physical device - #sectors, size, etc).
  */
@@ -1781,7 +1836,6 @@ static void blkfront_connect(struct blkfront_info *info)
unsigned int physical_sect

[Xen-devel] [linux-3.18 test] 59766: regressions - FAIL

2015-07-20 Thread osstest service owner
flight 59766 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59766/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 58581

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-pvh-intel  3 host-install(3) broken in 59718 pass in 59766
 test-armhf-armhf-xl-xsm   3 host-install(3)  broken in 59718 pass in 59766
 test-amd64-i386-xl-qemut-winxpsp3 3 host-install(3) broken in 59718 pass in 59766
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat fail pass in 59718

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2 fail baseline untested
 test-armhf-armhf-xl-rtds 14 guest-start.2 fail baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail in 59718 baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail in 59718 baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start fail in 59718 baseline untested
 test-amd64-i386-libvirt-xsm  11 guest-start  fail   like 58558
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58558
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail   like 58558
 test-amd64-i386-libvirt  11 guest-start  fail   like 58581
 test-armhf-armhf-xl-credit2   6 xen-boot fail   like 58581
 test-armhf-armhf-xl   6 xen-boot fail   like 58581
 test-armhf-armhf-xl-multivcpu  6 xen-boot fail  like 58581
 test-armhf-armhf-xl-xsm   6 xen-boot fail   like 58581
 test-armhf-armhf-libvirt-xsm  6 xen-boot fail   like 58581
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 58581
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58581

Tests which did not succeed, but are not blocking:
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail never pass
 test-amd64-i386-freebsd10-amd64  9 freebsd-install fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot fail never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check  fail   never pass

version targeted for testing:
 linux                866cebe251f4fb2b435f4ecfe6d3bb4025938533
baseline version:
 linux                d048c068d00da7d4cfa5ea7651933b99026958cf

Last test of basis    58581  2015-06-15 09:42:22 Z   35 days
Failing since         58976  2015-06-29 19:43:23 Z   21 days   30 attempts
Testing same since    59412  2015-07-11 00:18:42 Z   10 days   18 attempts


308 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm                                         pass
 build-armhf-xsm                                         pass
 build-i386-xsm                                          pass
 build-amd64                                             pass
 build-armhf                                             pass
 build-i386                                              pass
 build-amd64-libvirt                                     pass
 build-armhf-libvirt                                     pass
 build-i386-libvirt                                      pass
 build-amd64-pvops                                       pass
 build-armhf-pvops                                       pass
 build-i386-pvops                                        pass
 build-amd64-rumpuserxen                                 pass
 build-i386-rumpuserxen                                  pass
 test-amd64-amd64-xl                                     pass
 test-armhf-armhf-xl                                     fail
 test-amd64-i386-xl                                      pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm           pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm            pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm           pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm            pass
 test-amd64-amd64-xl-qemut-stub

Re: [Xen-devel] [PATCH] VT-d: add iommu=igfx_off option to workaround graphics issues

2015-07-20 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Monday, July 20, 2015 8:24 PM
> 
> >>> On 20.07.15 at 14:12,  wrote:
> > On 17/07/15 20:05, Ting-Wei Lan wrote:
> >> When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
> >> devices, It is possible to encounter graphics issues that make screen
> >> unreadable or crash the system. It was reported in freedesktop bugzilla:
> >>
> >> https://bugs.freedesktop.org/show_bug.cgi?id=90037
> >>
> >> As we still cannot find a proper fix for this problem, this patch adds
> >> iommu=igfx_off option that is similar to Linux intel_iommu=igfx_off for
> >> users to manually workaround the problem.
> >>
> >> Signed-off-by: Ting-Wei Lan 
> >
> > Having looked into this issue, the i915 driver has several workarounds
> > in it for systems when the IOMMU is in use.  In some cases there are
> > plain errata, while in other cases there are specific hardware features
> > which don't function if the IOMMU is enabled.
> >
> > In all cases this is gated on Linux's idea of whether the IOMMU is
> > enabled.  When used under Xen, Linux has no clue that the IOMMU exists,
> > or that Xen has turned it on.
> 
> Perhaps it should just assume an IOMMU is in use when running under
> Xen. Having inspected all those code places quite some time ago, I
> came to the conclusion that making this assumption is better than
> the current one of there not being an enabled IOMMU (and I adjusted
> our kernels accordingly).

Kind of agree here. IIRC, to have i915 work correctly in Dom0, the user
needs to manually turn on CONFIG_DMAR and CONFIG_INTEL_IOMMU even
though no IOMMU is exposed to Dom0. Otherwise the i915 driver uses plain
virt_to_phys when programming the GTT, which causes trouble.
There have been some recent improvements in this area, but some
IOMMU-specific tricks still remain. So having the options default to on
under Xen looks better.

Thanks
Kevin



Re: [Xen-devel] [PATCH] VT-d: add iommu=igfx_off option to workaround graphics issues

2015-07-20 Thread Tian, Kevin
> From: Andrew Cooper [mailto:am...@hermes.cam.ac.uk] On Behalf Of Andrew Cooper
> Sent: Monday, July 20, 2015 4:21 PM
> 
> On 20/07/2015 02:28, Tian, Kevin wrote:
> >> From: Ting-Wei Lan [mailto:lant...@gmail.com]
> >> Sent: Saturday, July 18, 2015 3:06 AM
> >>
> >> When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
> >> devices, It is possible to encounter graphics issues that make screen
> >> unreadable or crash the system. It was reported in freedesktop bugzilla:
> >>
> >> https://bugs.freedesktop.org/show_bug.cgi?id=90037
> >>
> >> As we still cannot find a proper fix for this problem, this patch adds
> >> iommu=igfx_off option that is similar to Linux intel_iommu=igfx_off for
> >> users to manually workaround the problem.
> >>
> >> Signed-off-by: Ting-Wei Lan 
> > Since igfx works before, I'd think a more proper fix should be on the
> > bisected Linux commit or i915 to have two working correctly together.
> > Otherwise this patch is just hiding problem.
> 
> The linux commit is the one which actually fixes PAT support for Linux
> under Xen.
> 
> It will cause the i915 driver to actually get WC mappings when it asks
> for them.

This is the part which I don't quite understand. WC is essentially a UC
attribute with a write buffer to improve write efficiency. There
should be no correctness problem in using either WC or UC when the i915
driver wants WC.

> 
> > There is one possible usage to do selective IOMMU disable other than
> > global "iommu=off" switch. Then making this option general would
> > be better than igfx_off, e.g. based on BDF. But I'm not sure how it
> > is useful in reality.
> 
> It is curious that just disabling the IOMMU appears to fix the problem.
> Are there any errata you are aware of on this class of system?
> 

No such errata that I'm aware of. It's not caused by the IOMMU HW, since
dom0-passthrough can't solve the problem either. It's possible that our
pvMMU code explicitly checks for the presence of an IOMMU and behaves
differently on PAT settings, but I'm not familiar with that part.

Thanks
Kevin



Re: [Xen-devel] [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-20 Thread Chen, Tiejun

Okay, I regenerated this patch online. And I just hope it's good to be
acked here:


Provided it also works,
Reviewed-by: Jan Beulich 



Why is this marked as Acked-by this time? The same question is raised to 
another hvmloader patch as well.


This really confuses me, since you're the key maintainer for this code,
and I remember you also gave an Acked-by to the first hvmloader
patch. I know this solution has been argued over a lot; does this mean
that, from your perspective, it is still not good to go into the tree, so
you want to leave the Acked-by to other maintainers?


And what about patch #7, hvmloader/e820: construct guest e820 table? Is
that one also not fine with you?


Thanks
Tiejun



Re: [Xen-devel] [PATCH v6 03/15] VMX: implement suppress #VE.

2015-07-20 Thread Nakajima, Jun
On Mon, Jul 20, 2015 at 4:57 PM, Ed White  wrote:
> In preparation for selectively enabling #VE in a later patch, set
> suppress #VE on all EPTE's.
>
> Suppress #VE should always be the default condition for two reasons:
> it is generally not safe to deliver #VE into a guest unless that guest
> has been modified to receive it; and even then for most EPT violations only
> the hypervisor is able to handle the violation.
>
> Signed-off-by: Ed White 
>
> Acked-by: George Dunlap 

Acked-by: Jun Nakajima 

-- 
Jun
Intel Open Source Technology Center
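
For readers following the thread, here is a minimal sketch of what "set
suppress #VE on all EPTEs" amounts to at the bit level. It assumes only
that bit 63 of an EPT entry is the suppress-#VE bit, as the Intel SDM
defines it. The macro and helper names are hypothetical and are not the
ones Xen uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit 63 of an EPT paging-structure entry: suppress #VE. */
#define EPTE_SUPPRESS_VE (UINT64_C(1) << 63)

/*
 * Hypothetical helper (not Xen's): mark every entry of one EPT table
 * so that EPT violations on it are never converted into #VE.
 */
static inline void epte_table_suppress_ve(uint64_t *table, size_t nr_entries)
{
    for ( size_t i = 0; i < nr_entries; i++ )
        table[i] |= EPTE_SUPPRESS_VE;
}

/*
 * Hypothetical helper: an entry is only eligible for #VE delivery
 * when its suppress bit is clear.
 */
static inline int epte_allows_ve(uint64_t epte)
{
    return !(epte & EPTE_SUPPRESS_VE);
}
```

With this default in place, a later patch would have to clear the bit
on selected entries before a guest could receive #VE for them.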



[Xen-devel] [PATCH v6 13/15] x86/altp2m: XSM hooks for altp2m HVM ops

2015-07-20 Thread Ed White
From: Ravi Sahita 

Signed-off-by: Ravi Sahita 

Acked-by: Daniel De Graaf 
---
 tools/flask/policy/policy/modules/xen/xen.if |  4 ++--
 xen/arch/x86/hvm/hvm.c   |  6 ++
 xen/include/xsm/dummy.h  | 12 
 xen/include/xsm/xsm.h| 12 
 xen/xsm/dummy.c  |  2 ++
 xen/xsm/flask/hooks.c| 12 
 xen/xsm/flask/policy/access_vectors  |  7 +++
 7 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index da4c95b..a2f25e1 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -8,7 +8,7 @@
 define(`declare_domain_common', `
allow $1 $2:grant { query setup };
allow $1 $2:mmu { adjust physmap map_read map_write stat pinpage updatemp mmuext_op };
-   allow $1 $2:hvm { getparam setparam };
+   allow $1 $2:hvm { getparam setparam altp2mhvm_op };
allow $1 $2:domain2 get_vnumainfo;
 ')
 
@@ -58,7 +58,7 @@ define(`create_domain_common', `
allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
allow $1 $2:grant setup;
allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
-   setparam pcilevel trackdirtyvram nested };
+   setparam pcilevel trackdirtyvram nested altp2mhvm altp2mhvm_op };
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index b94e07e..4913adb 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6002,6 +6002,9 @@ static int hvmop_set_param(
 nestedhvm_vcpu_destroy(v);
 break;
 case HVM_PARAM_ALTP2M:
+rc = xsm_hvm_param_altp2mhvm(XSM_PRIV, d);
+if ( rc )
+break;
 if ( a.value > 1 )
 rc = -EINVAL;
 if ( a.value &&
@@ -6186,6 +6189,9 @@ static int do_altp2m_op(
 goto out;
 }
 
+if ( (rc = xsm_hvm_altp2mhvm_op(XSM_TARGET, d ? d : current->domain)) )
+goto out;
+
 switch ( a.cmd )
 {
 case HVMOP_altp2m_get_domain_state:
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index adb02bc..bbbfce7 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -548,6 +548,18 @@ static XSM_INLINE int xsm_hvm_param_nested(XSM_DEFAULT_ARG struct domain *d)
 return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_hvm_param_altp2mhvm(XSM_DEFAULT_ARG struct domain *d)
+{
+XSM_ASSERT_ACTION(XSM_PRIV);
+return xsm_default_action(action, current->domain, d);
+}
+
+static XSM_INLINE int xsm_hvm_altp2mhvm_op(XSM_DEFAULT_ARG struct domain *d)
+{
+XSM_ASSERT_ACTION(XSM_TARGET);
+return xsm_default_action(action, current->domain, d);
+}
+
 static XSM_INLINE int xsm_vm_event_control(XSM_DEFAULT_ARG struct domain *d, int mode, int op)
 {
 XSM_ASSERT_ACTION(XSM_PRIV);
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7886574..3678a93 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -147,6 +147,8 @@ struct xsm_operations {
 int (*hvm_param) (struct domain *d, unsigned long op);
 int (*hvm_control) (struct domain *d, unsigned long op);
 int (*hvm_param_nested) (struct domain *d);
+int (*hvm_param_altp2mhvm) (struct domain *d);
+int (*hvm_altp2mhvm_op) (struct domain *d);
 int (*get_vnumainfo) (struct domain *d);
 
 int (*vm_event_control) (struct domain *d, int mode, int op);
@@ -587,6 +589,16 @@ static inline int xsm_hvm_param_nested (xsm_default_t def, struct domain *d)
 return xsm_ops->hvm_param_nested(d);
 }
 
+static inline int xsm_hvm_param_altp2mhvm (xsm_default_t def, struct domain *d)
+{
+return xsm_ops->hvm_param_altp2mhvm(d);
+}
+
+static inline int xsm_hvm_altp2mhvm_op (xsm_default_t def, struct domain *d)
+{
+return xsm_ops->hvm_altp2mhvm_op(d);
+}
+
 static inline int xsm_get_vnumainfo (xsm_default_t def, struct domain *d)
 {
 return xsm_ops->get_vnumainfo(d);
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 06ac911..21b1bf8 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -116,6 +116,8 @@ void xsm_fixup_ops (struct xsm_operations *ops)
 set_to_dummy_if_null(ops, hvm_param);
 set_to_dummy_if_null(ops, hvm_control);
 set_to_dummy_if_null(ops, hvm_param_nested);
+set_to_dummy_if_null(ops, hvm_param_altp2mhvm);
+set_to_dummy_if_null(ops, hvm_altp2mhvm_op);
 
 set_to_dummy_if_null(ops, do_xsm_op);
 #ifdef CONFIG_COMPAT
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 882681f..7a4522e 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1176,6 +1176,16 @@ static int flask_hvm_param_nested(struct domain *d)
 return current_has_perm(d, SECCLASS_HVM, HVM__NESTED);
 }
 
+stati

[Xen-devel] [PATCH v6 14/15] tools/libxc: add support to altp2m hvmops

2015-07-20 Thread Ed White
From: Tamas K Lengyel 

Wrappers to issue altp2m hvmops.

Signed-off-by: Tamas K Lengyel 
Signed-off-by: Ravi Sahita 

Acked-by: Ian Campbell 
---
 tools/libxc/Makefile  |   1 +
 tools/libxc/include/xenctrl.h |  22 
 tools/libxc/xc_altp2m.c   | 248 ++
 3 files changed, 271 insertions(+)
 create mode 100644 tools/libxc/xc_altp2m.c

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 1aec848..b0a3e05 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -10,6 +10,7 @@ override CONFIG_MIGRATE := n
 endif
 
 CTRL_SRCS-y   :=
+CTRL_SRCS-y   += xc_altp2m.c
 CTRL_SRCS-y   += xc_core.c
 CTRL_SRCS-$(CONFIG_X86) += xc_core_x86.c
 CTRL_SRCS-$(CONFIG_ARM) += xc_core_arm.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ce9029c..f869390 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2304,6 +2304,28 @@ void xc_tmem_save_done(xc_interface *xch, int dom);
 int xc_tmem_restore(xc_interface *xch, int dom, int fd);
 int xc_tmem_restore_extra(xc_interface *xch, int dom, int fd);
 
+/**
+ * altp2m operations
+ */
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state);
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state);
+int xc_altp2m_set_vcpu_enable_notify(xc_interface *handle, domid_t domid,
+ uint32_t vcpuid, xen_pfn_t gfn);
+int xc_altp2m_create_view(xc_interface *handle, domid_t domid,
+  xenmem_access_t default_access, uint16_t *view_id);
+int xc_altp2m_destroy_view(xc_interface *handle, domid_t domid,
+   uint16_t view_id);
+/* Switch all vCPUs of the domain to the specified altp2m view */
+int xc_altp2m_switch_to_view(xc_interface *handle, domid_t domid,
+ uint16_t view_id);
+int xc_altp2m_set_mem_access(xc_interface *handle, domid_t domid,
+ uint16_t view_id, xen_pfn_t gfn,
+ xenmem_access_t access);
+int xc_altp2m_change_gfn(xc_interface *handle, domid_t domid,
+ uint16_t view_id, xen_pfn_t old_gfn,
+ xen_pfn_t new_gfn);
+
 /** 
  * Mem paging operations.
  * Paging is supported only on the x86 architecture in 64 bit mode, with
diff --git a/tools/libxc/xc_altp2m.c b/tools/libxc/xc_altp2m.c
new file mode 100644
index 000..0f3c5ed
--- /dev/null
+++ b/tools/libxc/xc_altp2m.c
@@ -0,0 +1,248 @@
+/**
+ *
+ * xc_altp2m.c
+ *
+ * Interface to altp2m related HVMOPs
+ *
+ * Copyright (c) 2015 Tamas K Lengyel (ta...@tklengyel.com)
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include "xc_private.h"
+#include 
+#include 
+
+int xc_altp2m_get_domain_state(xc_interface *handle, domid_t dom, bool *state)
+{
+int rc;
+DECLARE_HYPERCALL;
+DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+if ( arg == NULL )
+return -1;
+
+hypercall.op = __HYPERVISOR_hvm_op;
+hypercall.arg[0] = HVMOP_altp2m;
+hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+arg->cmd = HVMOP_altp2m_get_domain_state;
+arg->domain = dom;
+
+rc = do_xen_hypercall(handle, &hypercall);
+
+if ( !rc )
+*state = arg->u.domain_state.state;
+
+xc_hypercall_buffer_free(handle, arg);
+return rc;
+}
+
+int xc_altp2m_set_domain_state(xc_interface *handle, domid_t dom, bool state)
+{
+int rc;
+DECLARE_HYPERCALL;
+DECLARE_HYPERCALL_BUFFER(xen_hvm_altp2m_op_t, arg);
+
+arg = xc_hypercall_buffer_alloc(handle, arg, sizeof(*arg));
+if ( arg == NULL )
+return -1;
+
+hypercall.op = __HYPERVISOR_hvm_op;
+hypercall.arg[0] = HVMOP_altp2m;
+hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
+
+arg->version = HVMOP_ALTP2M_INTERFACE_VERSION;
+arg->cmd = HVMOP_altp2m_set_domain_state;
+arg->domain = dom;
+arg->u.domain_state.state = state;
+
+rc = do_xen_hypercall(handle, &hypercall);
+
+xc_hypercall_buffer_free(handle, arg);
+ret

[Xen-devel] [PATCH v6 12/15] x86/altp2m: Add altp2mhvm HVM domain parameter.

2015-07-20 Thread Ed White
The altp2mhvm and nestedhvm parameters are mutually
exclusive and cannot be set together.

Signed-off-by: Ed White 

Reviewed-by: Andrew Cooper 
Acked-by: Wei Liu 
---
 docs/man/xl.cfg.pod.5   | 12 
 tools/libxl/libxl.h |  6 ++
 tools/libxl/libxl_create.c  |  1 +
 tools/libxl/libxl_dom.c |  2 ++
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/xl_cmdimpl.c| 10 ++
 xen/arch/x86/hvm/hvm.c  | 21 -
 xen/include/public/hvm/params.h |  5 -
 8 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e53fd45 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1027,6 +1027,18 @@ enabled by default and you should usually omit it. It may be necessary
 to disable the HPET in order to improve compatibility with guest
 Operating Systems (X86 only)
 
+=item B
+
+Enables or disables hvm guest access to alternate-p2m capability.
+Alternate-p2m allows a guest to manage multiple p2m guest physical
+"memory views" (as opposed to a single p2m). This option is
+disabled by default and is available only to hvm domains.
+You may want this option if you want to access-control/isolate
+access to specific guest physical memory pages accessed by
+the guest, e.g. for HVM domain memory introspection or
+for isolation/access-control of memory between components within
+a single guest hvm domain.
+
 =item B
 
 Enable or disables guest access to hardware virtualisation features,
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..6f86b21 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -758,6 +758,12 @@ typedef struct libxl__ctx libxl_ctx;
 #define LIBXL_HAVE_BUILDINFO_SERIAL_LIST 1
 
 /*
+ * LIBXL_HAVE_ALTP2M
+ * If this is defined, then libxl supports alternate p2m functionality.
+ */
+#define LIBXL_HAVE_ALTP2M 1
+
+/*
  * LIBXL_HAVE_REMUS
  * If this is defined, then libxl supports remus.
  */
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 5b4d333..e551e86 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -277,6 +277,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
 libxl_defbool_setdefault(&b_info->u.hvm.hpet,   true);
 libxl_defbool_setdefault(&b_info->u.hvm.vpt_align,  true);
 libxl_defbool_setdefault(&b_info->u.hvm.nested_hvm, false);
+libxl_defbool_setdefault(&b_info->u.hvm.altp2m, false);
 libxl_defbool_setdefault(&b_info->u.hvm.usb,false);
 libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 4cb247a..3d669be 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -300,6 +300,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
 libxl_defbool_val(info->u.hvm.vpt_align));
 xc_hvm_param_set(handle, domid, HVM_PARAM_NESTEDHVM,
 libxl_defbool_val(info->u.hvm.nested_hvm));
+xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
+libxl_defbool_val(info->u.hvm.altp2m));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index bc0c4ef..b9dab54 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -458,6 +458,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("mmio_hole_memkb",  MemKB),
("timer_mode",   libxl_timer_mode),
("nested_hvm",   libxl_defbool),
+   ("altp2m",   libxl_defbool),
("smbios_firmware",  string),
("acpi_firmware",string),
("hdtype",   libxl_hdtype),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 8cbf30e..12b819f 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1562,6 +1562,16 @@ static void parse_config_data(const char *config_source,
 
 xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
 
+xlu_cfg_get_defbool(config, "altp2mhvm", &b_info->u.hvm.altp2m, 0);
+
+if (!libxl_defbool_is_default(b_info->u.hvm.nested_hvm) &&
+libxl_defbool_val(b_info->u.hvm.nested_hvm) &&
+!libxl_defbool_is_default(b_info->u.hvm.altp2m) &&
+libxl_defbool_val(b_info->u.hvm.altp2m)) {
+fprintf(stderr, "ERROR: nestedhvm and altp2mhvm cannot be used together\n");
+exit(1);
+}
+
 xlu_cfg_replace_string(config, "smbios_firmware",
&b_info->u.hvm.smbios_firmware, 0);
 xlu_cfg_rep

[Xen-devel] [PATCH v6 10/15] x86/altp2m: add remaining support routines.

2015-07-20 Thread Ed White
Add the remaining routines required to support enabling the alternate
p2m functionality.

Signed-off-by: Ed White 

Reviewed-by: Andrew Cooper 
---
 xen/arch/x86/hvm/hvm.c   |  58 +-
 xen/arch/x86/mm/hap/Makefile |   1 +
 xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++
 xen/arch/x86/mm/p2m-ept.c|   3 +
 xen/arch/x86/mm/p2m.c| 385 +++
 xen/include/asm-x86/hvm/altp2m.h |   4 +
 xen/include/asm-x86/p2m.h|  33 
 7 files changed, 576 insertions(+), 6 deletions(-)
 create mode 100644 xen/arch/x86/mm/hap/altp2m_hap.c

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index f0ab4d4..38cf0c6 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2856,10 +2856,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 mfn_t mfn;
 struct vcpu *curr = current;
 struct domain *currd = curr->domain;
-struct p2m_domain *p2m;
+struct p2m_domain *p2m, *hostp2m;
 int rc, fall_through = 0, paged = 0;
 int sharing_enomem = 0;
 vm_event_request_t *req_ptr = NULL;
+bool_t ap2m_active = 0;
 
 /* On Nested Virtualization, walk the guest page table.
  * If this succeeds, all is fine.
@@ -2919,11 +2920,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 goto out;
 }
 
-p2m = p2m_get_hostp2m(currd);
-mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 
+ap2m_active = altp2m_active(currd);
+
+/* Take a lock on the host p2m speculatively, to avoid potential
+ * locking order problems later and to handle unshare etc.
+ */
+hostp2m = p2m_get_hostp2m(currd);
+mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
   P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE : 
0),
   NULL);
 
+if ( ap2m_active )
+{
+if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 )
+{
+/* entry was lazily copied from host -- retry */
+__put_gfn(hostp2m, gfn);
+rc = 1;
+goto out;
+}
+
+mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
+}
+else
+p2m = hostp2m;
+
 /* Check access permissions first, then handle faults */
 if ( mfn_x(mfn) != INVALID_MFN )
 {
@@ -2963,6 +2984,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 
 if ( violation )
 {
+/* Should #VE be emulated for this fault? */
+if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
+{
+bool_t sve;
+
+p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL, &sve);
+
+if ( !sve && altp2m_vcpu_emulate_ve(curr) )
+{
+rc = 1;
+goto out_put_gfn;
+}
+}
+
 if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
 {
 fall_through = 1;
@@ -2982,7 +3017,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
  (npfec.write_access &&
   (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
 {
-put_gfn(currd, gfn);
+__put_gfn(p2m, gfn);
+if ( ap2m_active )
+__put_gfn(hostp2m, gfn);
 
 rc = 0;
 if ( unlikely(is_pvh_domain(currd)) )
@@ -3011,6 +3048,7 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 /* Spurious fault? PoD and log-dirty also take this path. */
 if ( p2m_is_ram(p2mt) )
 {
+rc = 1;
 /*
  * Page log dirty is always done with order 0. If this mfn resides in
  * a large page, we do not change other pages type within that large
@@ -3019,9 +3057,15 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 if ( npfec.write_access )
 {
 paging_mark_dirty(currd, mfn_x(mfn));
+/* If p2m is really an altp2m, unlock here to avoid lock ordering
+ * violation when the change below is propagated from host p2m */
+if ( ap2m_active )
+__put_gfn(p2m, gfn);
 p2m_change_type_one(currd, gfn, p2m_ram_logdirty, p2m_ram_rw);
+__put_gfn(ap2m_active ? hostp2m : p2m, gfn);
+
+goto out;
 }
-rc = 1;
 goto out_put_gfn;
 }
 
@@ -3031,7 +3075,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
 rc = fall_through;
 
 out_put_gfn:
-put_gfn(currd, gfn);
+__put_gfn(p2m, gfn);
+if ( ap2m_active )
+__put_gfn(hostp2m, gfn);
 out:
 /* All of these are delayed until we exit, since we might 
  * sleep on event ring wait queues, and we must not hold
diff --git a/xen/arch/x86/mm/hap/Makefile b/xen/arch/x86/mm/hap/Makefile
index 68f2bb5..216cd90 100644
--- a/xen/arch/x86/mm/hap/Makefile
+++ b/xen/arch/x86/mm/hap/Makefile
@@ -4,6 +4,7 @@ obj-y += guest_w

[Xen-devel] [PATCH v6 15/15] tools/xen-access: altp2m testcases

2015-07-20 Thread Ed White
From: Tamas K Lengyel 

Working altp2m test-case. Extended the test tool to support singlestepping
to better highlight the core feature of altp2m view switching.

Signed-off-by: Tamas K Lengyel 
Signed-off-by: Ed White 

Reviewed-by: Razvan Cojocaru 
Acked-by: Wei Liu 
---
 tools/tests/xen-access/xen-access.c | 173 ++--
 1 file changed, 148 insertions(+), 25 deletions(-)

diff --git a/tools/tests/xen-access/xen-access.c b/tools/tests/xen-access/xen-access.c
index e6ca9ba..cdb8f4e 100644
--- a/tools/tests/xen-access/xen-access.c
+++ b/tools/tests/xen-access/xen-access.c
@@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, domid_t domain_id)
 return NULL;
 }
 
+static inline
+int control_singlestep(
+xc_interface *xch,
+domid_t domain_id,
+unsigned long vcpu,
+bool enable)
+{
+uint32_t op = enable ?
+XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
+
+return xc_domain_debug_control(xch, domain_id, op, vcpu);
+}
+
 /*
  * Note that this function is not thread safe.
  */
@@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, vm_event_response_t *rsp)
 
 void usage(char* progname)
 {
-fprintf(stderr,
-"Usage: %s [-m]  write|exec|breakpoint\n"
+fprintf(stderr, "Usage: %s [-m]  write|exec", progname);
+#if defined(__i386__) || defined(__x86_64__)
+fprintf(stderr, "|breakpoint|altp2m_write|altp2m_exec");
+#endif
+fprintf(stderr,
 "\n"
 "Logs first page writes, execs, or breakpoint traps that occur on the domain.\n"
 "\n"
-"-m requires this program to run, or else the domain may pause\n",
-progname);
+"-m requires this program to run, or else the domain may pause\n");
 }
 
 int main(int argc, char *argv[])
@@ -341,6 +356,8 @@ int main(int argc, char *argv[])
 int required = 0;
 int breakpoint = 0;
 int shutting_down = 0;
+int altp2m = 0;
+uint16_t altp2m_view_id = 0;
 
 char* progname = argv[0];
 argv++;
@@ -379,10 +396,22 @@ int main(int argc, char *argv[])
 default_access = XENMEM_access_rw;
 after_first_access = XENMEM_access_rwx;
 }
+#if defined(__i386__) || defined(__x86_64__)
 else if ( !strcmp(argv[0], "breakpoint") )
 {
 breakpoint = 1;
 }
+else if ( !strcmp(argv[0], "altp2m_write") )
+{
+default_access = XENMEM_access_rx;
+altp2m = 1;
+}
+else if ( !strcmp(argv[0], "altp2m_exec") )
+{
+default_access = XENMEM_access_rw;
+altp2m = 1;
+}
+#endif
 else
 {
 usage(argv[0]);
@@ -415,22 +444,73 @@ int main(int argc, char *argv[])
 goto exit;
 }
 
-/* Set the default access type and convert all pages to it */
-rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
-if ( rc < 0 )
+/* With altp2m we just create a new, restricted view of the memory */
+if ( altp2m )
 {
-ERROR("Error %d setting default mem access type\n", rc);
-goto exit;
-}
+xen_pfn_t gfn = 0;
+unsigned long perm_set = 0;
+
+rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );
+if ( rc < 0 )
+{
+ERROR("Error %d enabling altp2m on domain!\n", rc);
+goto exit;
+}
+
+rc = xc_altp2m_create_view( xch, domain_id, default_access, &altp2m_view_id );
+if ( rc < 0 )
+{
+ERROR("Error %d creating altp2m view!\n", rc);
+goto exit;
+}
 
-rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
-   (xenaccess->max_gpfn - START_PFN) );
+DPRINTF("altp2m view created with id %u\n", altp2m_view_id);
+DPRINTF("Setting altp2m mem_access permissions.. ");
 
-if ( rc < 0 )
+for(; gfn < xenaccess->max_gpfn; ++gfn)
+{
+rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, gfn,
+   default_access);
+if ( !rc )
+perm_set++;
+}
+
+DPRINTF("done! Permissions set on %lu pages.\n", perm_set);
+
+rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );
+if ( rc < 0 )
+{
+ERROR("Error %d switching to altp2m view!\n", rc);
+goto exit;
+}
+
+rc = xc_monitor_singlestep( xch, domain_id, 1 );
+if ( rc < 0 )
+{
+ERROR("Error %d failed to enable singlestep monitoring!\n", rc);
+goto exit;
+}
+}
+
+if ( !altp2m )
 {
-ERROR("Error %d setting all memory to access type %d\n", rc,
-  default_access);
-goto exit;
+/* Set the default access type and convert all pages to it */
+rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
+if ( rc < 0 )
+{
+ 

[Xen-devel] [PATCH v6 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-07-20 Thread Ed White
Signed-off-by: Ed White 
---
 xen/arch/x86/hvm/hvm.c  | 139 
 xen/include/public/hvm/hvm_op.h |  89 +
 2 files changed, 228 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 38cf0c6..15973b4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6135,6 +6135,141 @@ static int hvmop_get_param(
 return rc;
 }
 
+static int do_altp2m_op(
+XEN_GUEST_HANDLE_PARAM(void) arg)
+{
+struct xen_hvm_altp2m_op a;
+struct domain *d = NULL;
+int rc = 0;
+
+if ( !hvm_altp2m_supported() )
+return -EOPNOTSUPP;
+
+if ( copy_from_guest(&a, arg, 1) )
+return -EFAULT;
+
+if ( a.pad1 || a.pad2 ||
+ (a.version != HVMOP_ALTP2M_INTERFACE_VERSION) ||
+ (a.cmd > HVMOP_altp2m_change_gfn) )
+return -EINVAL;
+
+if ( a.cmd != HVMOP_altp2m_vcpu_enable_notify )
+{
+d = rcu_lock_domain_by_any_id(a.domain);
+if ( d == NULL )
+return -ESRCH;
+}
+
+if ( !is_hvm_domain(d ? d : current->domain) )
+{
+rc = -EOPNOTSUPP;
+goto out;
+}
+
+if ( (a.cmd != HVMOP_altp2m_get_domain_state) &&
+ (a.cmd != HVMOP_altp2m_set_domain_state) &&
+ !(d ? d : current->domain)->arch.altp2m_active )
+{
+rc = -EOPNOTSUPP;
+goto out;
+}
+
+switch ( a.cmd )
+{
+case HVMOP_altp2m_get_domain_state:
+a.u.domain_state.state = altp2m_active(d);
+rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+break;
+
+case HVMOP_altp2m_set_domain_state:
+{
+struct vcpu *v;
+bool_t ostate;
+
+if ( nestedhvm_enabled(d) )
+{
+rc = -EINVAL;
+break;
+}
+
+ostate = d->arch.altp2m_active;
+d->arch.altp2m_active = !!a.u.domain_state.state;
+
+/* If the alternate p2m state has changed, handle appropriately */
+if ( d->arch.altp2m_active != ostate &&
+ (ostate || !(rc = p2m_init_altp2m_by_id(d, 0))) )
+{
+for_each_vcpu( d, v )
+{
+if ( !ostate )
+altp2m_vcpu_initialise(v);
+else
+altp2m_vcpu_destroy(v);
+}
+
+if ( ostate )
+p2m_flush_altp2m(d);
+}
+break;
+}
+
+case HVMOP_altp2m_vcpu_enable_notify:
+{
+struct vcpu *curr = current;
+p2m_type_t p2mt;
+
+if ( a.u.enable_notify.pad || a.domain != DOMID_SELF ||
+ a.u.enable_notify.vcpu_id != curr->vcpu_id )
+rc = -EINVAL;
+
+if ( (gfn_x(vcpu_altp2m(curr).veinfo_gfn) != INVALID_GFN) ||
+ (mfn_x(get_gfn_query_unlocked(curr->domain,
+a.u.enable_notify.gfn, &p2mt)) == INVALID_MFN) )
+return -EINVAL;
+
+vcpu_altp2m(curr).veinfo_gfn = _gfn(a.u.enable_notify.gfn);
+altp2m_vcpu_update_vmfunc_ve(curr);
+break;
+}
+
+case HVMOP_altp2m_create_p2m:
+if ( !(rc = p2m_init_next_altp2m(d, &a.u.view.view)) )
+rc = __copy_to_guest(arg, &a, 1) ? -EFAULT : 0;
+break;
+
+case HVMOP_altp2m_destroy_p2m:
+rc = p2m_destroy_altp2m_by_id(d, a.u.view.view);
+break;
+
+case HVMOP_altp2m_switch_p2m:
+rc = p2m_switch_domain_altp2m_by_id(d, a.u.view.view);
+break;
+
+case HVMOP_altp2m_set_mem_access:
+if ( a.u.set_mem_access.pad )
+rc = -EINVAL;
+else
+rc = p2m_set_altp2m_mem_access(d, a.u.set_mem_access.view,
+_gfn(a.u.set_mem_access.gfn),
+a.u.set_mem_access.hvmmem_access);
+break;
+
+case HVMOP_altp2m_change_gfn:
+if ( a.u.change_gfn.pad1 || a.u.change_gfn.pad2 )
+rc = -EINVAL;
+else
+rc = p2m_change_altp2m_gfn(d, a.u.change_gfn.view,
+_gfn(a.u.change_gfn.old_gfn),
+_gfn(a.u.change_gfn.new_gfn));
+}
+
+ out:
+if ( d )
+rcu_unlock_domain(d);
+
+return rc;
+}
+
 /*
  * Note that this value is effectively part of the ABI, even if we don't need
  * to make it a formal part of it: A guest suspended for migration in the
@@ -6564,6 +6699,10 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 rc = -EINVAL;
 break;
 
+case HVMOP_altp2m:
+rc = do_altp2m_op(arg);
+break;
+
 default:
 {
 gdprintk(XENLOG_DEBUG, "Bad HVM op %ld.\n", op);
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index d053db9..014546a 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -398,6 +398,95 @@ DEFINE_XEN_GUEST_HANDLE(xen_hvm_evtchn_upcall_vector_t);
 
 #define HVMOP_guest_request_vm_event 24
 
+/* HVMOP_altp2m: perform altp2m state operations */
+#define HVMOP_altp2m 25
+
+

[Xen-devel] [PATCH v6 08/15] x86/altp2m: add control of suppress_ve.

2015-07-20 Thread Ed White
From: George Dunlap 

The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve.  Passing -1 sets suppress_ve on new
p2m entries and retains the existing suppress_ve value on entries that
are already valid.
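
The -1 convention described above can be sketched in isolation as follows.
This is only an illustration under stated assumptions, not the code from this
patch: struct demo_epte and demo_resolve_sve() are hypothetical names, and the
real EPT entry layout is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an EPT entry; not Xen's ept_entry_t. */
struct demo_epte {
    bool valid;        /* entry currently maps something */
    bool suppress_ve;  /* #VE is suppressed for this entry */
};

/*
 * Resolve the suppress_ve value for a new entry.
 *   sve == 0 or 1 : the caller chooses the value explicitly
 *   sve == -1     : keep the old value if the old entry was valid,
 *                   otherwise default to suppressing #VE
 */
static bool demo_resolve_sve(const struct demo_epte *old, int sve)
{
    assert(sve >= -1);

    if ( sve != -1 )
        return !!sve;

    return old->valid ? old->suppress_ve : true;
}
```

Callers that do not care about #VE can pass -1 everywhere and keep whatever
state an entry already had.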

Signed-off-by: George Dunlap 
Signed-off-by: Ravi Sahita 

Reviewed-by: Jan Beulich 
Reviewed-by: George Dunlap 
---
 xen/arch/x86/mm/mem_sharing.c |  4 ++--
 xen/arch/x86/mm/p2m-ept.c | 18 
 xen/arch/x86/mm/p2m-pod.c | 12 +--
 xen/arch/x86/mm/p2m-pt.c  | 10 +++--
 xen/arch/x86/mm/p2m.c | 48 +--
 xen/include/asm-x86/p2m.h | 24 --
 6 files changed, 67 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 1a01e45..d2e3786 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1260,7 +1260,7 @@ int relinquish_shared_pages(struct domain *d)
 
 if ( atomic_read(&d->shr_pages) == 0 )
 break;
-mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL);
+mfn = p2m->get_entry(p2m, gfn, &t, &a, 0, NULL, NULL);
 if ( mfn_valid(mfn) && (t == p2m_ram_shared) )
 {
 /* Does not fail with ENOMEM given the DESTROY flag */
@@ -1270,7 +1270,7 @@ int relinquish_shared_pages(struct domain *d)
  * unshare.  Must succeed: we just read the old entry and
  * we hold the p2m lock. */
 set_rc = p2m->set_entry(p2m, gfn, _mfn(0), PAGE_ORDER_4K,
-p2m_invalid, p2m_access_rwx);
+p2m_invalid, p2m_access_rwx, -1);
 ASSERT(set_rc == 0);
 count += 0x10;
 }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b532811..3652996 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -658,7 +658,8 @@ bool_t ept_handle_misconfig(uint64_t gpa)
  */
 static int
 ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn, 
-  unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma)
+  unsigned int order, p2m_type_t p2mt, p2m_access_t p2ma,
+  int sve)
 {
 ept_entry_t *table, *ept_entry = NULL;
 unsigned long gfn_remainder = gfn;
@@ -804,7 +805,11 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
 }
 
-new_entry.suppress_ve = 1;
+if ( sve != -1 )
+new_entry.suppress_ve = !!sve;
+else
+new_entry.suppress_ve = is_epte_valid(&old_entry) ?
+old_entry.suppress_ve : 1;
 
 rc = atomic_write_ept_entry(ept_entry, new_entry, target);
 if ( unlikely(rc) )
@@ -851,8 +856,9 @@ out:
 
 /* Read ept p2m entries */
 static mfn_t ept_get_entry(struct p2m_domain *p2m,
-   unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
-   p2m_query_t q, unsigned int *page_order)
+unsigned long gfn, p2m_type_t *t, p2m_access_t* a,
+p2m_query_t q, unsigned int *page_order,
+bool_t *sve)
 {
 ept_entry_t *table = map_domain_page(_mfn(pagetable_get_pfn(p2m_get_pagetable(p2m))));
 unsigned long gfn_remainder = gfn;
@@ -866,6 +872,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 
 *t = p2m_mmio_dm;
 *a = p2m_access_n;
+if ( sve )
+*sve = 1;
 
 /* This pfn is higher than the highest the p2m map currently holds */
 if ( gfn > p2m->max_mapped_pfn )
@@ -931,6 +939,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
 else
 *t = ept_entry->sa_p2mt;
 *a = ept_entry->access;
+if ( sve )
+*sve = ept_entry->suppress_ve;
 
 mfn = _mfn(ept_entry->mfn);
 if ( i )
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 6e27bcd..6aee85a 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -536,7 +536,7 @@ recount:
 p2m_access_t a;
 p2m_type_t t;
 
-(void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+(void)p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 
 if ( t == p2m_populate_on_demand )
 pod++;
@@ -587,7 +587,7 @@ recount:
 p2m_type_t t;
 p2m_access_t a;
 
-mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL);
+mfn = p2m->get_entry(p2m, gpfn + i, &t, &a, 0, NULL, NULL);
 if ( t == p2m_populate_on_demand )
 {
 p2m_set_entry(p2m, gpfn + i, _mfn(INVALID_MFN), 0, p2m_invalid,
@@ -676,7 +676,7 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
 for ( i=0; i<[...]
-mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL);
+mfn = p2m->get_entry(p2m, gfn + i, &type, &a, 0, NULL, NULL);
 
 if ( i == 0 )
 {
@@ -808,7 +808,7 @@ p2

[Xen-devel] [PATCH v6 09/15] x86/altp2m: alternate p2m memory events.

2015-07-20 Thread Ed White
Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.

Modify p2m_mem_access_check() to handle alternate p2m's.
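
How a monitoring agent might use the flag and field in both directions can be
sketched as below. The flag value (1 << 7) matches the vm_event.h hunk in this
patch; struct demo_vm_event and demo_switch_view() are hypothetical and only
mirror the two members that matter here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit as VM_EVENT_FLAG_ALTERNATE_P2M in this patch. */
#define DEMO_FLAG_ALTERNATE_P2M (1u << 7)

/* Hypothetical, cut-down view of a vm_event request/response. */
struct demo_vm_event {
    uint32_t flags;
    uint16_t altp2m_idx;
};

/*
 * Fill in a response asking for the vcpu to resume in view 'target'.
 * Returns the view the event was reported in, or -1 if the request
 * did not carry the alternate p2m flag.
 */
static int demo_switch_view(const struct demo_vm_event *req,
                            struct demo_vm_event *rsp, uint16_t target)
{
    int origin = -1;

    assert(req != NULL && rsp != NULL);

    if ( req->flags & DEMO_FLAG_ALTERNATE_P2M )
        origin = req->altp2m_idx;

    rsp->flags |= DEMO_FLAG_ALTERNATE_P2M;
    rsp->altp2m_idx = target;

    return origin;
}
```

On the hypervisor side the switch is still only attempted while altp2m is
active for the domain, so a response like this is a request, not a guarantee.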

Signed-off-by: Ed White 

Acked-by: Andrew Cooper  for the x86 bits.
Acked-by: George Dunlap 
Acked-by: Tamas K Lengyel 
---
 xen/arch/x86/mm/p2m.c | 19 ++-
 xen/common/vm_event.c |  4 
 xen/include/asm-arm/p2m.h |  6 ++
 xen/include/asm-x86/p2m.h |  3 +++
 xen/include/public/vm_event.h | 12 
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 93ecd87..ed3e65f 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1522,6 +1522,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
 }
 }
 
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+if ( altp2m_active(v->domain) )
+p2m_switch_vcpu_altp2m_by_id(v, idx);
+}
+
 bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
 struct npfec npfec,
 vm_event_request_t **req_ptr)
@@ -1529,7 +1535,7 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
 struct vcpu *v = current;
 unsigned long gfn = gpa >> PAGE_SHIFT;
 struct domain *d = v->domain;
-struct p2m_domain* p2m = p2m_get_hostp2m(d);
+struct p2m_domain *p2m = NULL;
 mfn_t mfn;
 p2m_type_t p2mt;
 p2m_access_t p2ma;
@@ -1537,6 +1543,11 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
 int rc;
 unsigned long eip = guest_cpu_user_regs()->eip;
 
+if ( altp2m_active(d) )
+p2m = p2m_get_altp2m(v);
+if ( !p2m )
+p2m = p2m_get_hostp2m(d);
+
 /* First, handle rx2rw conversion automatically.
  * These calls to p2m->set_entry() must succeed: we have the gfn
  * locked and just did a successful get_entry(). */
@@ -1651,6 +1662,12 @@ bool_t p2m_mem_access_check(paddr_t gpa, unsigned long gla,
 req->vcpu_id = v->vcpu_id;
 
 p2m_vm_event_fill_regs(req);
+
+if ( altp2m_active(v->domain) )
+{
+req->flags |= VM_EVENT_FLAG_ALTERNATE_P2M;
+req->altp2m_idx = vcpu_altp2m(v).p2midx;
+}
 }
 
 /* Pause the current VCPU */
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 4c6bf98..f3a8736 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -412,6 +412,10 @@ void vm_event_resume(struct domain *d, struct vm_event_domain *ved)
 
 };
 
+/* Check for altp2m switch */
+if ( rsp.flags & VM_EVENT_FLAG_ALTERNATE_P2M )
+p2m_altp2m_check(v, rsp.altp2m_idx);
+
 if ( rsp.flags & VM_EVENT_FLAG_VCPU_PAUSED )
 {
 if ( rsp.flags & VM_EVENT_FLAG_TOGGLE_SINGLESTEP )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 63748ef..08bdce3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -109,6 +109,12 @@ void p2m_mem_access_emulate_check(struct vcpu *v,
 /* Not supported on ARM. */
 }
 
+static inline
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
+{
+/* Not supported on ARM. */
+}
+
 #define p2m_is_foreign(_t)  ((_t) == p2m_map_foreign)
 #define p2m_is_ram(_t)  ((_t) == p2m_ram_rw || (_t) == p2m_ram_ro)
 
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 0a172e0..722e54c 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -751,6 +751,9 @@ uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp);
 /* Switch alternate p2m for a single vcpu */
 bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx);
 
+/* Check to see if vcpu should be switched to a different p2m. */
+void p2m_altp2m_check(struct vcpu *v, uint16_t idx);
+
 /*
  * p2m type to IOMMU flags
  */
diff --git a/xen/include/public/vm_event.h b/xen/include/public/vm_event.h
index fbc76b2..ff2f217 100644
--- a/xen/include/public/vm_event.h
+++ b/xen/include/public/vm_event.h
@@ -79,6 +79,16 @@
   * Currently only useful for MSR, CR0, CR3 and CR4 write events.
   */
 #define VM_EVENT_FLAG_DENY   (1 << 6)
+/*
+ * This flag can be set in a request or a response
+ *
+ * On a request, indicates that the event occurred in the alternate p2m specified by
+ * the altp2m_idx request field.
+ *
+ * On a response, indicates that the VCPU should resume in the alternate p2m specified
+ * by the altp2m_idx response field if possible.
+ */
+#define VM_EVENT_FLAG_ALTERNATE_P2M  (1 << 7)
 
 /*
  * Reasons for the vm event request
@@ -221,6 +231,8 @@ typedef struct vm_event_st {
 uint32_t flags; /* VM_EVENT_FLAG_* */
 uint32_t reason;/* VM_EVENT_REASON_* */
 uint32_t vcpu_id;
+uint16_t altp2m_idx; /* may be used during request and response */
+uint16_t _pad[3];
 
 union {
 struct vm_event_paging 

[Xen-devel] [PATCH v6 07/15] VMX: add VMFUNC leaf 0 (EPTP switching) to emulator.

2015-07-20 Thread Ed White
From: Ravi Sahita 

Signed-off-by: Ravi Sahita 
---
 xen/arch/x86/hvm/emulate.c | 18 +++--
 xen/arch/x86/hvm/vmx/vmx.c | 36 ++
 xen/arch/x86/x86_emulate/x86_emulate.c | 19 --
 xen/arch/x86/x86_emulate/x86_emulate.h |  4 
 xen/include/asm-x86/hvm/hvm.h  |  2 ++
 5 files changed, 71 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 15c2496..30acb78 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -1593,6 +1593,18 @@ static int hvmemul_invlpg(
 return rc;
 }
 
+static int hvmemul_vmfunc(
+struct x86_emulate_ctxt *ctxt)
+{
+int rc;
+
+rc = hvm_funcs.altp2m_vcpu_emulate_vmfunc(ctxt->regs);
+if ( rc != X86EMUL_OKAY )
+hvmemul_inject_hw_exception(TRAP_invalid_op, 0, ctxt);
+
+return rc;
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
 .read  = hvmemul_read,
 .insn_fetch= hvmemul_insn_fetch,
@@ -1616,7 +1628,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
 .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
 .get_fpu   = hvmemul_get_fpu,
 .put_fpu   = hvmemul_put_fpu,
-.invlpg= hvmemul_invlpg
+.invlpg= hvmemul_invlpg,
+.vmfunc= hvmemul_vmfunc,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1642,7 +1655,8 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
 .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
 .get_fpu   = hvmemul_get_fpu,
 .put_fpu   = hvmemul_put_fpu,
-.invlpg= hvmemul_invlpg
+.invlpg= hvmemul_invlpg,
+.vmfunc= hvmemul_vmfunc,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 5ee3b2a..95c7d25 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -82,6 +82,7 @@ static void vmx_fpu_dirty_intercept(void);
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content);
 static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content);
 static void vmx_invlpg_intercept(unsigned long vaddr);
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 
 uint8_t __read_mostly posted_intr_vector;
 
@@ -1838,6 +1839,19 @@ static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
 vmx_vmcs_exit(v);
 }
 
+static int vmx_vcpu_emulate_vmfunc(struct cpu_user_regs *regs)
+{
+int rc = X86EMUL_EXCEPTION;
+struct vcpu *curr = current;
+
+if ( !cpu_has_vmx_vmfunc && altp2m_active(curr->domain) &&
+ regs->_eax == 0 &&
+ p2m_switch_vcpu_altp2m_by_id(curr, (uint16_t)regs->_ecx) )
+rc = X86EMUL_OKAY;
+
+return rc;
+}
+
 static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
 {
 bool_t rc = 0;
@@ -1906,6 +1920,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
 .msr_read_intercept   = vmx_msr_read_intercept,
 .msr_write_intercept  = vmx_msr_write_intercept,
 .invlpg_intercept = vmx_invlpg_intercept,
+.vmfunc_intercept = vmx_vmfunc_intercept,
 .handle_cd= vmx_handle_cd,
 .set_info_guest   = vmx_set_info_guest,
 .set_rdtsc_exiting= vmx_set_rdtsc_exiting,
@@ -1931,6 +1946,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
 .altp2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
 .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
 .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
+.altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2102,6 +2118,19 @@ static void vmx_invlpg_intercept(unsigned long vaddr)
 vpid_sync_vcpu_gva(curr, vaddr);
 }
 
+static int vmx_vmfunc_intercept(struct cpu_user_regs *regs)
+{
+/*
+ * This handler is a placeholder for a future where Xen may
+ * want to handle VMFUNC exits and resume a domain normally without
+ * injecting a #UD to the guest - for example, in a VT-nested
+ * scenario where Xen may want to lazily shadow the alternate
+ * EPTP list.
+ */
+gdprintk(XENLOG_ERR, "Failed guest VMFUNC execution\n");
+return X86EMUL_EXCEPTION;
+}
+
 static int vmx_cr_access(unsigned long exit_qualification)
 {
 struct vcpu *curr = current;
@@ -3260,6 +3289,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 update_guest_eip();
 break;
 
+case EXIT_REASON_VMFUNC:
+if ( vmx_vmfunc_intercept(regs) != X86EMUL_OKAY )
+hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
+else
+update_guest_eip();
+break;
+
 case EXIT_REASON_MWAIT_INSTRUCTION:
 case EXIT_REASON_MONITOR_INSTRUCTION:
 case EXIT_REASON_GETSEC:
diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c 
b/xen/arch/x86/x86_emulate/x86_

[Xen-devel] [PATCH v6 06/15] VMX/altp2m: add code to support EPTP switching and #VE.

2015-07-20 Thread Ed White
Implement and hook up the code to enable VMX support of VMFUNC and #VE.

VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.

Signed-off-by: Ed White 

Reviewed-by: Andrew Cooper 
Acked-by: Jun Nakajima 
---
 xen/arch/x86/hvm/vmx/vmx.c | 139 +
 1 file changed, 139 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 4f8b0e0..5ee3b2a 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1770,6 +1771,105 @@ static bool_t vmx_is_singlestep_supported(void)
 return cpu_has_monitor_trap_flag;
 }
 
+static void vmx_vcpu_update_eptp(struct vcpu *v)
+{
+struct domain *d = v->domain;
+struct p2m_domain *p2m = NULL;
+struct ept_data *ept;
+
+if ( altp2m_active(d) )
+p2m = p2m_get_altp2m(v);
+if ( !p2m )
+p2m = p2m_get_hostp2m(d);
+
+ept = &p2m->ept;
+ept->asr = pagetable_get_pfn(p2m_get_pagetable(p2m));
+
+vmx_vmcs_enter(v);
+
+__vmwrite(EPT_POINTER, ept_get_eptp(ept));
+
+if ( v->arch.hvm_vmx.secondary_exec_control &
+SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+__vmwrite(EPTP_INDEX, vcpu_altp2m(v).p2midx);
+
+vmx_vmcs_exit(v);
+}
+
+static void vmx_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+struct domain *d = v->domain;
+u32 mask = SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+
+if ( !cpu_has_vmx_vmfunc )
+return;
+
+if ( cpu_has_vmx_virt_exceptions )
+mask |= SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
+vmx_vmcs_enter(v);
+
+if ( !d->is_dying && altp2m_active(d) )
+{
+v->arch.hvm_vmx.secondary_exec_control |= mask;
+__vmwrite(VM_FUNCTION_CONTROL, VMX_VMFUNC_EPTP_SWITCHING);
+__vmwrite(EPTP_LIST_ADDR, virt_to_maddr(d->arch.altp2m_eptp));
+
+if ( cpu_has_vmx_virt_exceptions )
+{
+p2m_type_t t;
+mfn_t mfn;
+
+mfn = get_gfn_query_unlocked(d, gfn_x(vcpu_altp2m(v).veinfo_gfn), &t);
+
+if ( mfn_x(mfn) != INVALID_MFN )
+__vmwrite(VIRT_EXCEPTION_INFO, mfn_x(mfn) << PAGE_SHIFT);
+else
+v->arch.hvm_vmx.secondary_exec_control &=
+~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+}
+}
+else
+v->arch.hvm_vmx.secondary_exec_control &= ~mask;
+
+__vmwrite(SECONDARY_VM_EXEC_CONTROL,
+v->arch.hvm_vmx.secondary_exec_control);
+
+vmx_vmcs_exit(v);
+}
+
+static bool_t vmx_vcpu_emulate_ve(struct vcpu *v)
+{
+bool_t rc = 0;
+ve_info_t *veinfo = gfn_x(vcpu_altp2m(v).veinfo_gfn) != INVALID_GFN ?
+hvm_map_guest_frame_rw(gfn_x(vcpu_altp2m(v).veinfo_gfn), 0) : NULL;
+
+if ( !veinfo )
+return 0;
+
+if ( veinfo->semaphore != 0 )
+goto out;
+
+rc = 1;
+
+veinfo->exit_reason = EXIT_REASON_EPT_VIOLATION;
+veinfo->semaphore = ~0l;
+veinfo->eptp_index = vcpu_altp2m(v).p2midx;
+
+vmx_vmcs_enter(v);
+__vmread(EXIT_QUALIFICATION, &veinfo->exit_qualification);
+__vmread(GUEST_LINEAR_ADDRESS, &veinfo->gla);
+__vmread(GUEST_PHYSICAL_ADDRESS, &veinfo->gpa);
+vmx_vmcs_exit(v);
+
+hvm_inject_hw_exception(TRAP_virtualisation,
+HVM_DELIVER_NO_ERROR_CODE);
+
+out:
+hvm_unmap_guest_frame(veinfo, 0);
+return rc;
+}
+
 static struct hvm_function_table __initdata vmx_function_table = {
 .name = "VMX",
 .cpu_up_prepare   = vmx_cpu_up_prepare,
@@ -1828,6 +1928,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
 .hypervisor_cpuid_leaf = vmx_hypervisor_cpuid_leaf,
 .enable_msr_exit_interception = vmx_enable_msr_exit_interception,
 .is_singlestep_supported = vmx_is_singlestep_supported,
+.altp2m_vcpu_update_eptp = vmx_vcpu_update_eptp,
+.altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
+.altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2769,6 +2872,42 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
 /* Now enable interrupts so it's safe to take locks. */
 local_irq_enable();
 
+/*
+ * If the guest has the ability to switch EPTP without an exit,
+ * figure out whether it has done so and update the altp2m data.
+ */
+if ( altp2m_active(v->domain) &&
+(v->arch.hvm_vmx.secondary_exec_control &
+SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+{
+unsigned long idx;
+
+if ( v->arch.hvm_vmx.secondary_exec_control &
+SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS )
+__vmread(EPTP_INDEX, &idx);
+else
+{
+unsigned long eptp;
+
+__vmread(EPT_POINTER, &eptp);
+
+if ( (idx = p2m_find_altp2m_by_eptp(v->domain, eptp)) ==
+ INVALID_ALTP2M )
+{
+gdprintk(XENLOG_

[Xen-devel] [PATCH v6 03/15] VMX: implement suppress #VE.

2015-07-20 Thread Ed White
In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.

Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then for most EPT violations only
the hypervisor is able to handle the violation.
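
One consequence of setting suppress #VE everywhere is that an otherwise empty
entry is no longer all zeroes. A minimal sketch of the resulting emptiness
test on a raw 64-bit entry is shown below. The demo_* names are hypothetical,
and the p2m type check that is_epte_valid() also performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of an EPT entry is the suppress #VE bit. */
#define DEMO_EPTE_SUPPRESS_VE (UINT64_C(1) << 63)

/* True if any bit other than suppress #VE is set in the raw entry. */
static bool demo_epte_has_content(uint64_t epte)
{
    return (epte & ~DEMO_EPTE_SUPPRESS_VE) != 0;
}

/* Setting suppress #VE must never turn an empty entry into a used one. */
static uint64_t demo_epte_set_suppress_ve(uint64_t epte)
{
    uint64_t out = epte | DEMO_EPTE_SUPPRESS_VE;

    assert(demo_epte_has_content(out) == demo_epte_has_content(epte));

    return out;
}
```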

Signed-off-by: Ed White 

Acked-by: George Dunlap 
---
 xen/arch/x86/mm/p2m-ept.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9a3b65a..b532811 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -42,7 +42,8 @@
 #define is_epte_superpage(ept_entry)((ept_entry)->sp)
 static inline bool_t is_epte_valid(ept_entry_t *e)
 {
-return (e->epte != 0 && e->sa_p2mt != p2m_invalid);
+/* suppress_ve alone is not considered valid, so mask it off */
+return ((e->epte & ~(1ul << 63)) != 0 && e->sa_p2mt != p2m_invalid);
 }
 
 /* returns : 0 for success, -errno otherwise */
@@ -220,6 +221,8 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
 static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 {
 struct page_info *pg;
+ept_entry_t *table;
+unsigned int i;
 
 pg = p2m_alloc_ptp(p2m, 0);
 if ( pg == NULL )
@@ -233,6 +236,15 @@ static int ept_set_middle_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry)
 /* Manually set A bit to avoid overhead of MMU having to write it later. */
 ept_entry->a = 1;
 
+ept_entry->suppress_ve = 1;
+
+table = __map_domain_page(pg);
+
+for ( i = 0; i < EPT_PAGETABLE_ENTRIES; i++ )
+table[i].suppress_ve = 1;
+
+unmap_domain_page(table);
+
 return 1;
 }
 
@@ -282,6 +294,7 @@ static int ept_split_super_page(struct p2m_domain *p2m, ept_entry_t *ept_entry,
 epte->sp = (level > 1);
 epte->mfn += i * trunk;
 epte->snp = (iommu_enabled && iommu_snoop);
+epte->suppress_ve = 1;
 
 ept_p2m_type_to_flags(p2m, epte, epte->sa_p2mt, epte->access);
 
@@ -791,6 +804,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
 }
 
+new_entry.suppress_ve = 1;
+
 rc = atomic_write_ept_entry(ept_entry, new_entry, target);
 if ( unlikely(rc) )
 old_entry.epte = 0;
-- 
1.9.1




[Xen-devel] [PATCH v6 02/15] VMX: VMFUNC and #VE definitions and detection.

2015-07-20 Thread Ed White
Currently, neither is enabled globally but may be enabled on a per-VCPU
basis by the altp2m code.

Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.
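
The dependency between the two features at detection time (VMFUNC is only
kept if leaf 0, EPTP switching, is available, and #VE is only kept if VMFUNC
is) can be sketched on its own as follows. The bit positions and demo_* names
are hypothetical and chosen for the sketch; they are not the VMCS encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_CTL_VM_FUNCTIONS      (1u << 0)
#define DEMO_CTL_VIRT_EXCEPTIONS   (1u << 1)
#define DEMO_VMFUNC_EPTP_SWITCHING (UINT64_C(1) << 0)

/* Drop features whose prerequisites are missing. */
static uint32_t demo_filter_controls(uint32_t ctl, uint64_t vmfunc_cap)
{
    /* VMFUNC is only useful here if leaf 0 (EPTP switching) exists. */
    if ( (ctl & DEMO_CTL_VM_FUNCTIONS) &&
         !(vmfunc_cap & DEMO_VMFUNC_EPTP_SWITCHING) )
        ctl &= ~DEMO_CTL_VM_FUNCTIONS;

    /* #VE is only kept when VMFUNC survived the check above. */
    if ( !(ctl & DEMO_CTL_VM_FUNCTIONS) )
        ctl &= ~DEMO_CTL_VIRT_EXCEPTIONS;

    /* Invariant: #VE never remains enabled without VMFUNC. */
    assert(!(ctl & DEMO_CTL_VIRT_EXCEPTIONS) ||
           (ctl & DEMO_CTL_VM_FUNCTIONS));

    return ctl;
}
```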

Signed-off-by: Ed White 

Reviewed-by: Andrew Cooper 
Acked-by: George Dunlap 
Acked-by: Jun Nakajima 
---
 xen/arch/x86/hvm/vmx/vmcs.c| 42 +++---
 xen/arch/x86/mm/p2m-ept.c  |  1 -
 xen/include/asm-x86/hvm/vmx/vmcs.h | 14 +++--
 xen/include/asm-x86/hvm/vmx/vmx.h  | 13 +++-
 xen/include/asm-x86/msr-index.h|  1 +
 5 files changed, 64 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 4c5ceb5..bc1cabd 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -101,6 +101,8 @@ u32 vmx_secondary_exec_control __read_mostly;
 u32 vmx_vmexit_control __read_mostly;
 u32 vmx_vmentry_control __read_mostly;
 u64 vmx_ept_vpid_cap __read_mostly;
+u64 vmx_vmfunc __read_mostly;
+bool_t vmx_virt_exception __read_mostly;
 
 const u32 vmx_introspection_force_enabled_msrs[] = {
 MSR_IA32_SYSENTER_EIP,
@@ -140,6 +142,8 @@ static void __init vmx_display_features(void)
 P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
 P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
 P(cpu_has_vmx_vmcs_shadowing, "VMCS shadowing");
+P(cpu_has_vmx_vmfunc, "VM Functions");
+P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
 P(cpu_has_vmx_pml, "Page Modification Logging");
 #undef P
 
@@ -185,6 +189,7 @@ static int vmx_init_vmcs_config(void)
 u64 _vmx_misc_cap = 0;
 u32 _vmx_vmexit_control;
 u32 _vmx_vmentry_control;
+u64 _vmx_vmfunc = 0;
 bool_t mismatch = 0;
 
 rdmsr(MSR_IA32_VMX_BASIC, vmx_basic_msr_low, vmx_basic_msr_high);
@@ -230,7 +235,9 @@ static int vmx_init_vmcs_config(void)
SECONDARY_EXEC_ENABLE_EPT |
SECONDARY_EXEC_ENABLE_RDTSCP |
SECONDARY_EXEC_PAUSE_LOOP_EXITING |
-   SECONDARY_EXEC_ENABLE_INVPCID);
+   SECONDARY_EXEC_ENABLE_INVPCID |
+   SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+   SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
 rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
 if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
 opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +348,24 @@ static int vmx_init_vmcs_config(void)
   || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
 _vmx_pin_based_exec_control  &= ~ PIN_BASED_POSTED_INTERRUPT;
 
+/* The IA32_VMX_VMFUNC MSR exists only when VMFUNC is available */
+if ( _vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS )
+{
+rdmsrl(MSR_IA32_VMX_VMFUNC, _vmx_vmfunc);
+
+/*
+ * VMFUNC leaf 0 (EPTP switching) must be supported.
+ *
+ * Or we just don't use VMFUNC.
+ */
+if ( !(_vmx_vmfunc & VMX_VMFUNC_EPTP_SWITCHING) )
+_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VM_FUNCTIONS;
+}
+
+/* Virtualization exceptions are only enabled if VMFUNC is enabled */
+if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VM_FUNCTIONS) )
+_vmx_secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS;
+
 min = 0;
 opt = VM_ENTRY_LOAD_GUEST_PAT | VM_ENTRY_LOAD_BNDCFGS;
 _vmx_vmentry_control = adjust_vmx_controls(
@@ -361,6 +386,9 @@ static int vmx_init_vmcs_config(void)
 vmx_vmentry_control= _vmx_vmentry_control;
 vmx_basic_msr  = ((u64)vmx_basic_msr_high << 32) |
  vmx_basic_msr_low;
+vmx_vmfunc = _vmx_vmfunc;
+vmx_virt_exception = !!(_vmx_secondary_exec_control &
+   SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
 vmx_display_features();
 
 /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -397,6 +425,9 @@ static int vmx_init_vmcs_config(void)
 mismatch |= cap_check(
 "EPT and VPID Capability",
 vmx_ept_vpid_cap, _vmx_ept_vpid_cap);
+mismatch |= cap_check(
+"VMFUNC Capability",
+vmx_vmfunc, _vmx_vmfunc);
 if ( cpu_has_vmx_ins_outs_instr_info !=
  !!(vmx_basic_msr_high & (VMX_BASIC_INS_OUT_INFO >> 32)) )
 {
@@ -967,6 +998,11 @@ static int construct_vmcs(struct vcpu *v)
 /* Do not enable Monitor Trap Flag unless start single step debug */
 v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
 
+/* Disable VMFUNC and #VE for now: they may be enabled later by altp2m. */
+v->arch.hvm_vmx.secondary_exec_control &=
+~(SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
+  SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+
 if ( is_pvh_domain(d) )
 {
 /* Disable virtual apics, TPR */
@@ -1790,9 +1826,9 @@ void v

[Xen-devel] [PATCH v6 05/15] x86/altp2m: basic data structures and support routines.

2015-07-20 Thread Ed White
Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.

Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.

The iterator in hap_enable() does need to handle 512, so that is now
uint16_t.

This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.
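
The intended ordering can be pictured as three levels that may only be taken
in increasing order. The sketch below is a simplification under that
assumption; the enum values and demo_* helpers are hypothetical and are not
the macros used in mm-locks.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering levels: a lower value must be taken first. */
enum demo_lock_level {
    DEMO_LOCK_NONE = 0,
    DEMO_LOCK_HOST_P2M,     /* all non-alternate p2m's */
    DEMO_LOCK_ALTP2M_LIST,  /* the per-domain altp2m list */
    DEMO_LOCK_ALTP2M,       /* an individual alternate p2m */
};

/* A lock may be taken only if it ranks above the highest one held. */
static bool demo_may_acquire(enum demo_lock_level held,
                             enum demo_lock_level wanted)
{
    return wanted > held;
}

/* Take a lock, asserting that the ordering rule is respected. */
static enum demo_lock_level demo_acquire(enum demo_lock_level held,
                                         enum demo_lock_level wanted)
{
    assert(demo_may_acquire(held, wanted));

    return wanted;
}
```

With a single p2m lock type there would be no level at which the list lock
could sit both above the host p2m and below the alternate p2m's.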

Signed-off-by: Ed White 
---
 xen/arch/x86/hvm/Makefile|   1 +
 xen/arch/x86/hvm/altp2m.c|  77 +
 xen/arch/x86/hvm/hvm.c   |  21 
 xen/arch/x86/mm/hap/hap.c|  38 ++-
 xen/arch/x86/mm/mm-locks.h   |  46 +-
 xen/arch/x86/mm/p2m.c| 102 +++
 xen/include/asm-x86/domain.h |  10 
 xen/include/asm-x86/hvm/altp2m.h |  38 +++
 xen/include/asm-x86/hvm/hvm.h|  14 ++
 xen/include/asm-x86/hvm/vcpu.h   |   9 
 xen/include/asm-x86/p2m.h|  30 +++-
 11 files changed, 382 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/x86/hvm/altp2m.c
 create mode 100644 xen/include/asm-x86/hvm/altp2m.h

diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 794e793..4d489cc 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,6 +1,7 @@
 subdir-y += svm
 subdir-y += vmx
 
+obj-y += altp2m.o
 obj-y += asid.o
 obj-y += emulate.o
 obj-y += event.o
diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c
new file mode 100644
index 000..a10f347
--- /dev/null
+++ b/xen/arch/x86/hvm/altp2m.c
@@ -0,0 +1,77 @@
+/*
+ * Alternate p2m HVM
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 59 Temple
+ * Place - Suite 330, Boston, MA 02111-1307 USA.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+void
+altp2m_vcpu_reset(struct vcpu *v)
+{
+struct altp2mvcpu *av = &vcpu_altp2m(v);
+
+av->p2midx = INVALID_ALTP2M;
+av->veinfo_gfn = _gfn(INVALID_GFN);
+}
+
+void
+altp2m_vcpu_initialise(struct vcpu *v)
+{
+if ( v != current )
+vcpu_pause(v);
+
+altp2m_vcpu_reset(v);
+vcpu_altp2m(v).p2midx = 0;
+atomic_inc(&p2m_get_altp2m(v)->active_vcpus);
+
+altp2m_vcpu_update_eptp(v);
+
+if ( v != current )
+vcpu_unpause(v);
+}
+
+void
+altp2m_vcpu_destroy(struct vcpu *v)
+{
+struct p2m_domain *p2m;
+
+if ( v != current )
+vcpu_pause(v);
+
+if ( (p2m = p2m_get_altp2m(v)) )
+atomic_dec(&p2m->active_vcpus);
+
+altp2m_vcpu_reset(v);
+
+altp2m_vcpu_update_eptp(v);
+altp2m_vcpu_update_vmfunc_ve(v);
+
+if ( v != current )
+vcpu_unpause(v);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index eafaf9d..f0ab4d4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2462,6 +2463,7 @@ void hvm_vcpu_destroy(struct vcpu *v)
 {
 hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
 
+altp2m_vcpu_destroy(v);
 nestedhvm_vcpu_destroy(v);
 
 free_compat_arg_xlat(v);
@@ -6569,6 +6571,25 @@ void hvm_toggle_singlestep(struct vcpu *v)
 v->arch.hvm_vcpu.single_step = !v->arch.hvm_vcpu.single_step;
 }
 
+void altp2m_vcpu_update_eptp(struct vcpu *v)
+{
+if ( hvm_funcs.altp2m_vcpu_update_eptp )
+hvm_funcs.altp2m_vcpu_update_eptp(v);
+}
+
+void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
+{
+if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
+hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
+}
+
+bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
+{
+if ( hvm_funcs.altp2m_vcpu_emulate_ve )
+return hvm_funcs.altp2m_vcpu_emulate_ve(v);
+return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 63980af..a9a1667 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -459,7 +459,7 @@ void hap_domain_init(struct domain *d)
 int hap_enable(struct domain *d, u32 mode)

[Xen-devel] [PATCH v6 04/15] x86/HVM: Hardware alternate p2m support detection.

2015-07-20 Thread Ed White
As implemented here, only supported on platforms with VMX HAP.

By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.

Signed-off-by: Ed White 

Reviewed-by: Andrew Cooper 
---
 docs/misc/xen-command-line.markdown | 7 +++
 xen/arch/x86/hvm/hvm.c  | 7 +++
 xen/arch/x86/hvm/vmx/vmx.c  | 1 +
 xen/include/asm-x86/hvm/hvm.h   | 9 +
 4 files changed, 24 insertions(+)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 70d7ab8..7bdcfff 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -140,6 +140,13 @@ mode during S3 resume.
 
 Permit Xen to use superpages when performing memory management.
 
+### altp2m (Intel)
+> `= <boolean>`
+
+> Default: `false`
+
+Permit multiple copies of host p2m.
+
 ### apic
 > `= bigsmp | default`
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..eafaf9d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -96,6 +96,10 @@ bool_t opt_hvm_fep;
 boolean_param("hvm_fep", opt_hvm_fep);
 #endif
 
+/* Xen command-line option to enable altp2m */
+static bool_t __initdata opt_altp2m_enabled = 0;
+boolean_param("altp2m", opt_altp2m_enabled);
+
 static int cpu_callback(
 struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -162,6 +166,9 @@ static int __init hvm_enable(void)
 if ( !fns->pvh_supported )
 printk(XENLOG_INFO "HVM: PVH mode not supported on this platform\n");
 
+if ( !opt_altp2m_enabled )
+hvm_funcs.altp2m_supported = 0;
+
 /*
  * Allow direct access to the PC debug ports 0x80 and 0xed (they are
  * often used for I/O delays, but the vmexits simply slow things down).
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index d3183a8..4f8b0e0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1847,6 +1847,7 @@ const struct hvm_function_table * __init start_vmx(void)
     if ( cpu_has_vmx_ept && (cpu_has_vmx_pat || opt_force_ept) )
     {
         vmx_function_table.hap_supported = 1;
+        vmx_function_table.altp2m_supported = 1;
 
         vmx_function_table.hap_capabilities = 0;
 
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 82f1b32..3a94f8c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -94,6 +94,9 @@ struct hvm_function_table {
     /* Necessary hardware support for PVH mode? */
     int pvh_supported;
 
+    /* Necessary hardware support for alternate p2m's? */
+    bool_t altp2m_supported;
+
     /* Indicate HAP capabilities. */
     int hap_capabilities;
 
@@ -530,6 +533,12 @@ static inline bool_t hvm_is_singlestep_supported(void)
 hvm_funcs.is_singlestep_supported());
 }
 
+/* returns true if hardware supports alternate p2m's */
+static inline bool_t hvm_altp2m_supported(void)
+{
+    return hvm_funcs.altp2m_supported;
+}
+
 #ifndef NDEBUG
 /* Permit use of the Forced Emulation Prefix in HVM guests */
 extern bool_t opt_hvm_fep;
-- 
1.9.1
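
The two-stage gating in this patch (vendor code advertises the capability, common code withdraws it unless the option was given) can be sketched outside Xen as follows. This is a minimal standalone illustration, not Xen code: struct fn_table, detect_hw(), enable() and altp2m_supported() are hypothetical stand-ins for hvm_function_table, start_vmx(), hvm_enable() and hvm_altp2m_supported(), and the command-line parser is replaced by a plain parameter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct hvm_function_table. */
struct fn_table {
    bool hap_supported;
    bool altp2m_supported;
};

/* Vendor-specific step: advertise altp2m whenever HAP (EPT) is usable. */
static struct fn_table detect_hw(bool has_ept)
{
    struct fn_table t = { false, false };

    if ( has_ept )
    {
        t.hap_supported = true;
        t.altp2m_supported = true;
    }

    return t;
}

/* Common step: clear the flag again unless the administrator opted in. */
static struct fn_table enable(struct fn_table t, bool opt_altp2m)
{
    if ( !opt_altp2m )
        t.altp2m_supported = false;

    return t;
}

/* What the rest of the code would query. */
static bool altp2m_supported(bool has_ept, bool opt_altp2m)
{
    return enable(detect_hw(has_ept), opt_altp2m).altp2m_supported;
}
```

With this shape, the feature is only reported as available when both the hardware check and the command-line opt-in succeed, which matches the "force-disabled by default" wording of the commit message.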




[Xen-devel] [PATCH v6 00/15] Alternate p2m: support multiple copies of host p2m

2015-07-20 Thread Ed White
This set of patches adds support to hvm domains for EPTP switching by creating
multiple copies of the host p2m (currently limited to 10 copies).

The primary use of this capability is expected to be in scenarios where access
to memory needs to be monitored and/or restricted below the level at which the
guest OS page tables operate. Two examples that were discussed at the 2014 Xen
developer summit are:

VM introspection:
http://www.slideshare.net/xen_com_mgr/zero-footprint-guest-memory-introspection-from-xen

Secure inter-VM communication:
http://www.slideshare.net/xen_com_mgr/nakajima-nvf

A more detailed design specification can be found at:
http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg01319.html

Each p2m copy is populated lazily on EPT violations.
Permissions for pages in alternate p2m's can be changed in a similar
way to the existing memory access interface, and gfn->mfn mappings can be 
changed.

All this is done through extra HVMOP types.

The cross-domain HVMOP code has been compile-tested only. Also, the cross-domain
code is hypervisor-only; the toolstack has not been modified.

The intra-domain code has been tested. Violation notifications can only be
received for pages that have been modified (access permissions and/or gfn->mfn
mapping) intra-domain, and only on VCPUs that have enabled notification.

VMFUNC and #VE will both be emulated on hardware without native support.

This code is not compatible with nested hvm functionality and will refuse to
work with nested hvm active. It is also not compatible with migration. It
should be considered experimental.

Changes since v5:

Rebased on staging.

We believe v6 addresses all ABI issues and actual bugs; it does
not address all outstanding maintainer issues.

Patch 1:
no changes

Patch 2:
no changes

Patch 3:
no changes
removed ack's etc

Patch 4:
fixed a markdown formatting error

Patch 5:
removed a buggy assert
removed Andrew's R-b

Patch 6:
fixed a bug when disabling #VE due to bad veinfo gfn

Patch 7:
addressed Jan's most recent comments

Patch 8:
no changes

Patch 9:
Added padding to vm_event_t header (per Andrew)

Patch 10:
No changes

Patch 11:
Reworked structure padding
Added altp2m_op interface version
Reworked altp2m_op handling again

Patch 12:
Mechanical changes due to patch 11 changes

Patch 13:
Mechanical changes due to patch 11 changes

Patch 14:
Mechanical changes due to patch 11 changes

Patch 15:
Mechanical changes due to an upstream change


Changes since v4:

Patch 3:  don't set bit 63 of top-level entries.

Patch 5:  extra locking order description in mm-locks.h
  don't initialise altp2m data unless altp2m is enabled globally
   and hardware supports it
  removed some hardware-specific wrappers that were not being used
  renamed ap2m... interfaces to altp2m...
  fixed error path in p2m_init_altp2m

Patch 7:  addressed remaining feedback

Patch 8:  made suppress_ve preservation consistent

Patch 9:  changed flag bit to avoid collision with recently applied series

Patch 10: check pad fields for zero
  minor formatting changes

Patch 11: renamed HVM parameter

Patch 15: removed v3 workaround


Changes since v3:

Major changes are:

Replaced patch 8.

Refactored patch 11 to use a single HVMOP with subcodes.

Addressed feedback in patch 7, and some other patches.

Added two tools/test patches from Tamas. Both are optional.

Added various ack's and reviewed-by's.

Rebased.

Ravi Sahita will now be the point of contact for this series.


Changes since v2:

Addressed all v2 feedback *except*:

In patch 5, the per-domain EPTP list page is still allocated from the
Xen heap. If allocated from the domain heap Xen panics - IIRC on Haswell
hardware when walking the EPTP list during exit processing in patch 6.

HVM_ops are not merged. Tamas suggested merging the memory access ops,
but in practice they are not as similar as they appear on the surface.
Razvan suggested merging the implementation code in p2m.c, but that is
also not as common as it appears on the surface.
Andrew suggested merging all altp2m ops into one with a subop code in
the input structure. His point that only 255 ops can be defined is well
taken, but altp2m uses only 2 more ops than the recently introduced
ioreq ops, and <15% of the available ops have been defined. Since we
don't know how to implement XSM hooks and policy with the subop model,
we have not adopted this suggestion.

The p2m set/get interface is not modified. The altp2m code needs to
write suppress_ve in 2 places and read it in 1 place. The original

[Xen-devel] [PATCH v6 01/15] common/domain: Helpers to pause a domain while in context

2015-07-20 Thread Ed White
From: Andrew Cooper 

For use on codepaths which would need to use domain_pause() but might be in
the target domain's context.  In the case that the target domain is in
context, all other vcpus are paused.

Signed-off-by: Andrew Cooper 
---
 xen/common/domain.c | 28 
 xen/include/xen/sched.h |  5 +
 2 files changed, 33 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 8efef5c..49ee655 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1010,6 +1010,34 @@ int domain_unpause_by_systemcontroller(struct domain *d)
     return 0;
 }
 
+void domain_pause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_pause(v);
+    }
+    else
+        domain_pause(d);
+}
+
+void domain_unpause_except_self(struct domain *d)
+{
+    struct vcpu *v, *curr = current;
+
+    if ( curr->domain == d )
+    {
+        for_each_vcpu( d, v )
+            if ( likely(v != curr) )
+                vcpu_unpause(v);
+    }
+    else
+        domain_unpause(d);
+}
+
 int vcpu_reset(struct vcpu *v)
 {
     struct domain *d = v->domain;
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index b29d9e7..73d3bc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -804,6 +804,11 @@ static inline int domain_pause_by_systemcontroller_nosync(struct domain *d)
 {
     return __domain_pause_by_systemcontroller(d, domain_pause_nosync);
 }
+
+/* domain_pause() but safe against trying to pause current. */
+void domain_pause_except_self(struct domain *d);
+void domain_unpause_except_self(struct domain *d);
+
 void cpu_init(void);
 
 struct scheduler;
-- 
1.9.1
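
The behaviour described in the commit message can be modelled with a toy pause counter per vcpu. This is only a sketch under simplifying assumptions: struct toy_domain, NR_VCPUS and the toy_* functions are hypothetical, the caller's identity is passed explicitly instead of being read from "current", and the real domain_pause() also does bookkeeping that is not modelled here.

```c
#define NR_VCPUS 4

/* Hypothetical, simplified model: one pause counter per vcpu. */
struct toy_domain {
    int pause_count[NR_VCPUS];
};

/*
 * Pause every vcpu of d except 'self'.  Pass self = -1 when the caller
 * is not running in d's context; every vcpu is then paused, which models
 * the plain domain_pause() path.
 */
static void toy_pause_except_self(struct toy_domain *d, int self)
{
    int i;

    for ( i = 0; i < NR_VCPUS; i++ )
        if ( i != self )
            d->pause_count[i]++;
}

/* Exact inverse of toy_pause_except_self() for the same 'self'. */
static void toy_unpause_except_self(struct toy_domain *d, int self)
{
    int i;

    for ( i = 0; i < NR_VCPUS; i++ )
        if ( i != self )
            d->pause_count[i]--;
}
```

The point of skipping the calling vcpu is that a vcpu which paused itself synchronously could never run again to issue the matching unpause.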




[Xen-devel] [libvirt test] 59768: regressions - FAIL

2015-07-20 Thread osstest service owner
flight 59768 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59768/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-libvirt-xsm  11 guest-start   fail REGR. vs. 58842

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt  11 guest-start  fail   like 58842
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58842

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass

version targeted for testing:
 libvirt  e46791e003444ce825feaf5bb2a16f778ee951e5
baseline version:
 libvirt  d10a5f58c75e7eb5943b44cc36a1e768adb2cdb0

Last test of basis   58842  2015-06-23 04:23:54 Z   27 days
Failing since        58870  2015-06-24 04:20:11 Z   26 days   24 attempts
Testing same since   59768  2015-07-20 12:35:21 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Boris Fiuczynski 
  Christophe Fergeau 
  Cédric Bosdonnat 
  Daniel P. Berrange 
  Daniel Veillard 
  Dmitry Guryanov 
  Eric Blake 
  Erik Skultety 
  Frediano Ziglio 
  Guido Günther 
  Jim Fehlig 
  Jiri Denemark 
  John Ferlan 
  Ján Tomko 
  Kothapally Madhu Pavan 
  Laine Stump 
  Luyao Huang 
  Martin Kletzander 
  Maxim Nestratov 
  Michal Dubiel 
  Michal Privoznik 
  Mikhail Feoktistov 
  Nikolay Shirokovskiy 
  Nikolay Shirokovskiy 
  Pavel Fedin 
  Pavel Hrdina 
  Peter Krempa 
  Prerna Saxena 
  Roman Bogorodskiy 
  Serge Hallyn 
  Wido den Hollander 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-libvirt fail
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 3165 lines long.)



[Xen-devel] [linux-next test] 59767: tolerable FAIL

2015-07-20 Thread osstest service owner
flight 59767 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59767/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail baseline untested
 test-amd64-i386-libvirt  11 guest-start fail baseline untested
 test-amd64-amd64-libvirt 11 guest-start fail baseline untested
 test-amd64-i386-libvirt-xsm  11 guest-start fail baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start fail baseline untested
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop   fail baseline untested
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail baseline untested
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop   fail baseline untested

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestore fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass

version targeted for testing:
 linux                708e764f2083fde82ce2b4fefa774ac908ed665d
baseline version:
 linux                9d37e6679dfddbb5fa605fb2d7ff448f7cd6d038

Last test of basis   (not found)
Failing since            0  1970-01-01 00:00:00 Z 16636 days
Testing same since   59767  2015-07-20 12:33:09 Z    0 days    1 attempts

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  fail
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  

Re: [Xen-devel] ARM - why does setup_frametable_size() round frametable_size to 32MB ?

2015-07-20 Thread Chris (Christopher) Brand
Thanks, Ian. I tried that, and it does seem to work (everything boots, I can 
still bring up VMs, and I see an extra 16MB of free memory). The patch I came 
up with follows (it would be nice to share code between create_32mb_mappings() 
and create_2mb_mappings(), but the setting of the contig bit is right in the 
middle, and the functions are pretty short).

Chris

From: Chris Brand 
Date: Mon, 20 Jul 2015 13:38:15 -0700
Subject: [PATCH] xen: arm: Support <32MB frametables

setup_frametable_mappings() rounds frametable_size up to a multiple
of 32MB. This is wasteful on systems with less than 4GB of RAM,
although it does allow the "contig" bit to be set in the PTEs.

Where the frametable is less than 32MB in size, instead round up
to a multiple of 2MB, not setting the "contig" bit in the PTEs.

Signed-off-by: Chris Brand 
---
 xen/arch/arm/mm.c | 39 ---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index a91ea774f1f9..a7f4864f8d8f 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -656,6 +656,29 @@ static void __init create_32mb_mappings(lpae_t *second,
 }
 
 #ifdef CONFIG_ARM_32
+static void __init create_2mb_mappings(lpae_t *second,
+                                       unsigned long virt_offset,
+                                       unsigned long base_mfn,
+                                       unsigned long nr_mfns)
+{
+    unsigned long i, count;
+    lpae_t pte, *p;
+
+    ASSERT(!((virt_offset >> PAGE_SHIFT) % LPAE_ENTRIES));
+    ASSERT(!(base_mfn % LPAE_ENTRIES));
+    ASSERT(!(nr_mfns % LPAE_ENTRIES));
+
+    count = nr_mfns / LPAE_ENTRIES;
+    p = second + second_linear_offset(virt_offset);
+    pte = mfn_to_xen_entry(base_mfn, WRITEALLOC);
+    for ( i = 0; i < count; i++ )
+    {
+        write_pte(p + i, pte);
+        pte.pt.base += 1 << LPAE_SHIFT;
+    }
+    flush_xen_data_tlb_local();
+}
+
 /* Set up the xenheap: up to 1GB of contiguous, always-mapped memory. */
 void __init setup_xenheap_mappings(unsigned long base_mfn,
unsigned long nr_mfns)
@@ -749,6 +772,7 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     unsigned long nr_pdxs = pfn_to_pdx(nr_pages);
     unsigned long frametable_size = nr_pdxs * sizeof(struct page_info);
     unsigned long base_mfn;
+    unsigned long mask;
 #ifdef CONFIG_ARM_64
     lpae_t *second, pte;
     unsigned long nr_second, second_base;
@@ -757,8 +781,12 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
 
     frametable_base_pdx = pfn_to_pdx(ps >> PAGE_SHIFT);
 
-    /* Round up to 32M boundary */
-    frametable_size = (frametable_size + 0x1ffffff) & ~0x1ffffff;
+    /* Round up to 2M or 32M boundary, as appropriate */
+    if (frametable_size < MB(32))
+        mask = MB(2) - 1;
+    else
+        mask = MB(32) - 1;
+    frametable_size = (frametable_size + mask) & ~mask;
     base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT, 32<<(20-12));
 
 #ifdef CONFIG_ARM_64
@@ -773,7 +801,12 @@ void __init setup_frametable_mappings(paddr_t ps, paddr_t pe)
     }
     create_32mb_mappings(second, 0, base_mfn, frametable_size >> PAGE_SHIFT);
 #else
-    create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START, base_mfn, frametable_size >> PAGE_SHIFT);
+    if (frametable_size < MB(32))
+        create_2mb_mappings(xen_second, FRAMETABLE_VIRT_START,
+                            base_mfn, frametable_size >> PAGE_SHIFT);
+    else
+        create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START,
+                             base_mfn, frametable_size >> PAGE_SHIFT);
 #endif
 
     memset(&frame_table[0], 0, nr_pdxs * sizeof(struct page_info));
-- 
1.9.1
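
The size selection in this patch can be checked in isolation with a small standalone sketch. The helper name round_frametable_size() is hypothetical and the MB() macro is redefined locally; only the rounding arithmetic from the patch is modelled, not the page-table setup.

```c
#include <stdint.h>

/* Local stand-in for Xen's MB() macro. */
#define MB(x) ((uint64_t)(x) << 20)

/*
 * Round a frametable size up to a 2MB multiple when it is smaller than
 * 32MB, and to a 32MB multiple otherwise, as the patch does.
 */
static uint64_t round_frametable_size(uint64_t size)
{
    uint64_t mask = (size < MB(32)) ? MB(2) - 1 : MB(32) - 1;

    return (size + mask) & ~mask;
}
```

For example, a 16MB frametable stays at 16MB instead of growing to 32MB, which would be consistent with the extra 16MB of free memory mentioned above, while a 33MB frametable still rounds up to 64MB so that the contiguous 32MB mappings remain usable.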





Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Jan Beulich
>>> On 20.07.15 at 16:35,  wrote:
> It looks like only a small change is needed, so I am also pasting the new
> version inline to try to win your Ack here,

Just like the other one, provided it also works,
Reviewed-by: Jan Beulich 




Re: [Xen-devel] [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-20 Thread Jan Beulich
>>> On 20.07.15 at 16:32,  wrote:
> On 2015/7/20 22:16, Jan Beulich wrote:
> On 20.07.15 at 16:10,  wrote:
>>> Hmm... although I suppose that doesn't catch the possibility of a memory
>>> range crossing the 4G boundary.
>>
>> I think we can safely ignore that - both real and virtual hardware have
>> special regions right below 4Gb, so neither RAM nor RMRRs can be
>> reasonably placed there.
>>
> 
> Okay, I regenerated this patch inline, and I hope it is good enough to be
> acked here:

Provided it also works,
Reviewed-by: Jan Beulich 




[Xen-devel] [PATCH RFC] libxl: fix build with glibc < 2.9

2015-07-20 Thread Jan Beulich
htobe*() and be*toh() don't exist there. While replacing the 32-bit
ones with hton() and ntoh() would be possible, there wouldn't be an
obvious replacement for the 64-bit ones. Hence just take what current
glibc (2.21) has (assuming __bswap_*() exists, which it does back to
at least 2.4 according to my checking).

Signed-off-by: Jan Beulich 
---
Not sure whether I picked an appropriate header to place this in, or
an appropriate #ifdef to hook this onto. Hence the RFC.

--- a/tools/libxl/libxl_osdeps.h
+++ b/tools/libxl/libxl_osdeps.h
@@ -59,6 +59,42 @@ int asprintf(char **buffer, char *fmt, .
 int vasprintf(char **buffer, const char *fmt, va_list ap);
 #endif /*NEED_OWN_ASPRINTF*/
 
+#ifndef htobe32 /* glibc < 2.9 */
+# include 
+
+# if __BYTE_ORDER == __LITTLE_ENDIAN
+#  define htobe16(x) __bswap_16(x)
+#  define htole16(x) (x)
+#  define be16toh(x) __bswap_16(x)
+#  define le16toh(x) (x)
+
+#  define htobe32(x) __bswap_32(x)
+#  define htole32(x) (x)
+#  define be32toh(x) __bswap_32(x)
+#  define le32toh(x) (x)
+
+#  define htobe64(x) __bswap_64(x)
+#  define htole64(x) (x)
+#  define be64toh(x) __bswap_64(x)
+#  define le64toh(x) (x)
+# else
+#  define htobe16(x) (x)
+#  define htole16(x) __bswap_16(x)
+#  define be16toh(x) (x)
+#  define le16toh(x) __bswap_16(x)
+
+#  define htobe32(x) (x)
+#  define htole32(x) __bswap_32(x)
+#  define be32toh(x) (x)
+#  define le32toh(x) __bswap_32(x)
+
+#  define htobe64(x) (x)
+#  define htole64(x) __bswap_64(x)
+#  define be64toh(x) (x)
+#  define le64toh(x) __bswap_64(x)
+# endif
+#endif
+
 #endif
 
 /*
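
What these fallback macros compute can be illustrated without relying on any glibc internals. The sketch below is an assumption-labelled stand-in, not the patch's implementation: toy_bswap32(), host_is_little_endian(), toy_htobe32() and toy_be32toh() are hypothetical names, and the byte order is detected at run time rather than through __BYTE_ORDER.

```c
#include <stdint.h>
#include <string.h>

/* Portable 32-bit byte swap, standing in for __bswap_32(). */
static uint32_t toy_bswap32(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* Run-time endianness probe: look at the first byte of the value 1. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Host to big-endian: swap on little-endian hosts, identity otherwise. */
static uint32_t toy_htobe32(uint32_t x)
{
    return host_is_little_endian() ? toy_bswap32(x) : x;
}

/* Big-endian to host is the same operation, since a swap is its own inverse. */
static uint32_t toy_be32toh(uint32_t x)
{
    return host_is_little_endian() ? toy_bswap32(x) : x;
}
```

Whatever the host byte order, the value returned by toy_htobe32() has its most significant byte first in memory, which is the property the htobe*() family guarantees.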





Re: [Xen-devel] [PATCH] VT-d: add iommu=igfx_off option to workaround graphics issues

2015-07-20 Thread Jan Beulich
>>> On 20.07.15 at 16:12,  wrote:
> On 20/07/15 14:55, Jan Beulich wrote:
> On 20.07.15 at 14:34,  wrote:
>>> On 20/07/15 13:24, Jan Beulich wrote:
>>> On 20.07.15 at 14:12,  wrote:
> On 17/07/15 20:05, Ting-Wei Lan wrote:
>> When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
>> devices, It is possible to encounter graphics issues that make screen
>> unreadable or crash the system. It was reported in freedesktop bugzilla:
>>
>> https://bugs.freedesktop.org/show_bug.cgi?id=90037 
>>
>> As we still cannot find a proper fix for this problem, this patch adds
>> iommu=igfx_off option that is similar to Linux intel_iommu=igfx_off for
>> users to manually workaround the problem.
>>
>> Signed-off-by: Ting-Wei Lan 
> Having looked into this issue, the i915 driver has several workarounds
> in it for systems when the IOMMU is in use.  In some cases there are
> plain errata, while in other cases there are specific hardware features
> which don't function if the IOMMU is enabled.
>
> In all cases this is gated on Linux's idea of whether the IOMMU is
> enabled.  When used under Xen, Linux has no clue that the IOMMU exists,
> or that Xen has turned it on.
 Perhaps it should just assume an IOMMU is in use when running under
 Xen. Having inspected all those code places quite some time ago, I
 came to the conclusion that making this assumption is better than
 the current one of there not being an enabled IOMMU (and I adjusted
 our kernels accordingly).
>>> In at least one case, an errata workaround involves issuing extra IOMMU
>>> commands.  We cannot safely let even dom0 perform this.
>> Mind pointing out that one case? In our Xen kernels, IOMMU code
>> gets compiled out, so it is impossible for the driver to issue extra
>> IOMMU commands...
> 
> But a distro is going to want to ship a single kernel, especially with
> PVops these days.

Right, but you see that there continue to be examples of what isn't
being taken care of in pv-ops.

Jan




[Xen-devel] [PATCH v5 7/7] x86/PCI: intercept all PV Dom0 MMCFG writes

2015-07-20 Thread Jan Beulich
... to hook up pci_conf_write_intercept() even for Dom0 not using
method 1 accesses for the base part of PCI device config space.

Signed-off-by: Jan Beulich 
---
Not entirely sure whether the complicated logging logic in x86/mm.c is
actually worth it.

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -734,6 +734,46 @@ static int update_xen_mappings(unsigned 
 return err;
 }
 
+#ifndef NDEBUG
+struct mmio_emul_range_ctxt {
+    const struct domain *d;
+    unsigned long mfn;
+};
+
+static int print_mmio_emul_range(unsigned long s, unsigned long e, void *arg)
+{
+    const struct mmio_emul_range_ctxt *ctxt = arg;
+
+    if ( ctxt->mfn > e )
+        return 0;
+
+    if ( ctxt->mfn >= s )
+    {
+        static DEFINE_SPINLOCK(last_lock);
+        static const struct domain *last_d;
+        static unsigned long last_s = ~0UL, last_e;
+        bool_t print = 0;
+
+        spin_lock(&last_lock);
+        if ( last_d != ctxt->d || last_s != s || last_e != e )
+        {
+            last_d = ctxt->d;
+            last_s = s;
+            last_e = e;
+            print = 1;
+        }
+        spin_unlock(&last_lock);
+
+        if ( print )
+            printk(XENLOG_G_INFO
+                   "d%d: Forcing write emulation on MFNs %lx-%lx\n",
+                   ctxt->d->domain_id, s, e);
+    }
+
+    return 1;
+}
+#endif
+
 int
 get_page_from_l1e(
 l1_pgentry_t l1e, struct domain *l1e_owner, struct domain *pg_owner)
@@ -757,6 +797,11 @@ get_page_from_l1e(
     if ( !mfn_valid(mfn) ||
          (real_pg_owner = page_get_owner_and_reference(page)) == dom_io )
     {
+#ifndef NDEBUG
+        const unsigned long *ro_map;
+        unsigned int seg, bdf;
+#endif
+
         /* Only needed the reference to confirm dom_io ownership. */
         if ( mfn_valid(mfn) )
             put_page(page);
@@ -792,9 +837,20 @@ get_page_from_l1e(
         if ( !(l1f & _PAGE_RW) ||
              !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
             return 0;
-        dprintk(XENLOG_G_WARNING,
-                "d%d: Forcing read-only access to MFN %lx\n",
-                l1e_owner->domain_id, mfn);
+#ifndef NDEBUG
+        if ( !pci_mmcfg_decode(mfn, &seg, &bdf) ||
+             ((ro_map = pci_get_ro_map(seg)) != NULL &&
+              test_bit(bdf, ro_map)) )
+            printk(XENLOG_G_WARNING
+                   "d%d: Forcing read-only access to MFN %lx\n",
+                   l1e_owner->domain_id, mfn);
+        else
+            rangeset_report_ranges(mmio_ro_ranges, 0, ~0UL,
+                                   print_mmio_emul_range,
+                                   &(struct mmio_emul_range_ctxt){
+                                      .d = l1e_owner,
+                                      .mfn = mfn });
+#endif
         return 1;
     }
 
@@ -5145,6 +5201,7 @@ int ptwr_do_page_fault(struct vcpu *v, u
 
 /* We are looking only for read-only mappings of p.t. pages. */
 if ( ((l1e_get_flags(pte) & (_PAGE_PRESENT|_PAGE_RW)) != _PAGE_PRESENT) ||
+ rangeset_contains_singleton(mmio_ro_ranges, l1e_get_pfn(pte)) ||
  !get_page_from_pagenr(l1e_get_pfn(pte), d) )
 goto bail;
 
@@ -5192,6 +5249,7 @@ int ptwr_do_page_fault(struct vcpu *v, u
 struct mmio_ro_emulate_ctxt {
 struct x86_emulate_ctxt ctxt;
 unsigned long cr2;
+unsigned int seg, bdf;
 };
 
 static int mmio_ro_emulated_read(
@@ -5231,6 +5289,44 @@ static const struct x86_emulate_ops mmio
 .write  = mmio_ro_emulated_write,
 };
 
+static int mmio_intercept_write(
+    enum x86_segment seg,
+    unsigned long offset,
+    void *p_data,
+    unsigned int bytes,
+    struct x86_emulate_ctxt *ctxt)
+{
+    struct mmio_ro_emulate_ctxt *mmio_ctxt =
+        container_of(ctxt, struct mmio_ro_emulate_ctxt, ctxt);
+
+    /*
+     * Only allow naturally-aligned stores no wider than 4 bytes to the
+     * original %cr2 address.
+     */
+    if ( ((bytes | offset) & (bytes - 1)) || bytes > 4 ||
+         offset != mmio_ctxt->cr2 )
+    {
+        MEM_LOG("mmio_intercept: bad write (cr2=%lx, addr=%lx, bytes=%u)",
+                mmio_ctxt->cr2, offset, bytes);
+        return X86EMUL_UNHANDLEABLE;
+    }
+
+    offset &= 0xfff;
+    pci_conf_write_intercept(mmio_ctxt->seg, mmio_ctxt->bdf, offset, bytes,
+                             p_data);
+    pci_mmcfg_write(mmio_ctxt->seg, PCI_BUS(mmio_ctxt->bdf),
+                    PCI_DEVFN2(mmio_ctxt->bdf), offset, bytes,
+                    *(uint32_t *)p_data);
+
+    return X86EMUL_OKAY;
+}
+
+static const struct x86_emulate_ops mmio_intercept_ops = {
+    .read       = mmio_ro_emulated_read,
+    .insn_fetch = ptwr_emulated_read,
+    .write      = mmio_intercept_write,
+};
+
 /* Check if guest is trying to modify a r/o MMIO page. */
 int mmio_ro_do_page_fault(struct vcpu *v, unsigned long addr,
   struct cpu_user_regs *regs)
@@ -5245,6 +5341,7 @@ int mmio_ro_do_page_fault(struct vcpu *v
 .ctxt.swint_emulate = x86_swint_emulate_none,
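
The store filter used by mmio_intercept_write() in this patch can be exercised on its own. The sketch below copies only the condition and the 4K masking; mmcfg_write_ok() and mmcfg_reg() are hypothetical helper names, and the addresses in the examples are made up.

```c
#include <stdbool.h>

/*
 * Accept only naturally-aligned stores of at most 4 bytes that target the
 * faulting address.  (bytes | offset) & (bytes - 1) is zero exactly when
 * bytes is a power of two and offset is a multiple of it.
 */
static bool mmcfg_write_ok(unsigned long offset, unsigned int bytes,
                           unsigned long cr2)
{
    if ( ((bytes | offset) & (bytes - 1)) || bytes > 4 || offset != cr2 )
        return false;

    return true;
}

/* Register offset within the device's 4K config space page. */
static unsigned int mmcfg_reg(unsigned long offset)
{
    return offset & 0xfff;
}
```

A 4-byte store to 0xe0001004 passes and maps to register 0x004, whereas a 2-byte store to an odd address, a 3-byte store, or an 8-byte store is rejected before any config-space access is attempted.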
 

Re: [Xen-devel] [PATCH v2 00/20] xen/arm64: Add support for 64KB page

2015-07-20 Thread Julien Grall
On 09/07/15 21:42, Julien Grall wrote:
> Average of 10 iperf runs:
> 
> DOM0     Guest    Result
> 
> 4KB-mod  64KB     3.176 Gbits/sec
> 4KB-mod  4KB-mod  3.245 Gbits/sec
> 4KB-mod  4KB      3.258 Gbits/sec
> 4KB      4KB      3.292 Gbits/sec
> 4KB      4KB-mod  3.265 Gbits/sec
> 4KB      64KB     3.189 Gbits/sec
> 
> 4KB-mod: Linux with the 64KB patch series
> 4KB: linux/master
> 
> The network performance is slightly worse with this series (-0.15%). I
> suspect this is because of using an indirection to set up the grant. This is
> necessary in order to ensure that the grant will be correctly sized regardless
> of the Linux page granularity. This could be used later in order to support
> bigger grants.

I didn't compute the result correctly. It's -1.5% and not -0.15%, sorry.

-- 
Julien Grall



Re: [Xen-devel] [PATCH v2 17/20] net/xen-netfront: Make it running on 64KB page granularity

2015-07-20 Thread Julien Grall
Hi,

On 09/07/15 21:42, Julien Grall wrote:
> +static void xennet_make_one_txreq(unsigned long mfn, unsigned int offset,
> +   unsigned int *len, void *data)
> +{
> + struct xennet_gnttab_make_txreq *info = data;
> +
> + info->tx->flags |= XEN_NETTXF_more_data;
> + skb_get(info->skb);
> + xennet_make_one_txreq(mfn, offset, len, data);

This should be xennet_tx_setup_grant rather than calling itself. I made
this mistake while cleaning up the code, sorry.

Regards,

-- 
Julien Grall



Re: [Xen-devel] Request a freeze exception for COLO in v4.6

2015-07-20 Thread Lars Kurth
Forgot to attach the graph referred to (or it is not showing up in the archives)
Lars


> On 20 Jul 2015, at 07:02, Lars Kurth  wrote:
> 
> Hi all,
> 
> I have been travelling on Friday and wanted to appeal for calm on this 
> particular issue. Let's try and focus on making as much progress as we can on 
> the patch series which have freeze exceptions (or partial freeze exception) 
> this week. Continuing a debate on what may have gone wrong with Remus/COLO or 
> other series at this stage is going to be distracting and will affect 
> everyone's chances to get the remaining code with freeze exceptions into Xen 
> 4.6. So please, let's focus on making as much progress as we can this week. 
> Having said that, it is absolutely true that the first Remus/COLO and COLO 
> RFC patches have received very little or no review time when they were posted 
> first by Wen Congyang in April 2013. And that situation had not changed until 
> a year later, when the issue was first raised with me (if I recall 
> correctly). 
> 
> I do sincerely apologise for this. Personally I would like to see COLO in Xen 
> 4.6, but this is not my decision.
> 
> I do also believe that there has been tremendous progress on Remus and COLO 
> in the last 12 months. There has also been great collaboration between 
> maintainers and contributors in the last 12 months, in particular around 
> Remus and COLOPre. Let's build on this and move forward.
> 
> = After July 24th =
> 
> We should have a discussion *after* July 24th, to see what is going wrong and 
> what we can improve as a community. We can also cover some of the accusations 
> that were made then. It is clear that as a community we do have some issues 
> and challenges that we have to address. I have to personally take 
> responsibility for underestimating some of the issues we face: until about 4 
> weeks ago, it looked as if many of the issues that I knew of are well on the 
> way of being resolved. I honestly believe that many of the changes we made 
> recently, such as focus on designs, had a very positive effect. I do not 
> believe, that what we are seeing is a sign of a dying community. There were 
> also some specific issues related to Remus/COLO: some have been addressed; 
> others have not yet been addressed.
> 
> However, it is not clear yet from the data that I can mine, exactly what is 
> going on in the general case. We have good data mining capability when it 
> comes to git, but *no* data mining capability when it comes to the review 
> process. I am meeting Bitergia tomorrow to see whether they (or I) can 
> implement some functionality that allows us to get metrics related to the 
> review process.
> 
> = Data we have =
> 
> What is interesting is that since 2012, we have seen an average annual 
> increase of 9% of patches that made it into the Xen Hypervisor. We also have 
> seen a slightly higher increase of Reviewed-by and ACKED-by tags during the 
> same time period (around 10% a year). However, this does not tell us much, as 
> the review period leading up to commits is not covered by git data.
> 
> The following graph shows the number of e-mails related to patches on 
> xen-devel@ - both patches, comments on patches, submissions of new versions 
> of patches, etc.
> 
> 
> 
> What is striking is that the ratio of discussions related to patches 
> (including posted patches) on xen-devel divided by the patches that made it 
> into Xen has grown almost exponentially recently: 5.85 (2012), 7.89 
> (2013), 8.63 (2014) to 11.65 (2015). This clearly shows that we have some 
> issues with code reviews that are getting worse and that there is an 
> underlying issue which we have to address: there are a number of possible 
> reasons. But let's not speculate now.
> 
> Best Regards
> Lars

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH OSSTEST 2/3] ts-debian-hvm-install: di_installcmdline_core

2015-07-20 Thread Ian Campbell
On Mon, 2015-07-20 at 16:39 +0100, Wei Liu wrote:
> The subject of this mail is very terse. I guess you meant
> 
>  ts-debian-hvm-install: *use* di_installcmdline_core
> 
> ?

I did, I would even have sworn I typed that (or something very like it).
I may have driven vi wrongly at some point during the edit, sorry!

Ian.




Re: [Xen-devel] [PATCH OSSTEST 2/3] ts-debian-hvm-install: di_installcmdline_core

2015-07-20 Thread Wei Liu
The subject of this mail is very terse. I guess you meant

 ts-debian-hvm-install: *use* di_installcmdline_core

?

Wei.



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Ian Campbell
On Mon, 2015-07-20 at 16:24 +0100, Ian Jackson wrote:
> Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and
> avoid conflicts with RDM"):
> > [Ian Jackson:]
> > > The domain configuration specified to libxl might contain some rdms.
> > > Then num_rdms in the incoming config would be nonzero.
> > 
> > We never set d_config->num_rdms/d_config->rdms before we goes inside 
> > libxl__domain_device_construct_rdm(). And actually 
> > libxl__domain_device_construct_rdm is only one place to set 
> > d_config->num_rdms/d_config->rdms.
> 
> But d_config is a libxl_domain_config which is supplied by libxl's
> caller.  It might contain some rdms.
> 
> > I guess this line make you or other guys confused so lets delete this 
> > line directly.
> 
> I don't think I am very confused.

I think the confusion here is that the d_config->rdms array (which
num_rdms is the length of) is in the public API (because it is in
libxl_types.idl) but is apparently only being used in this series as an
internal state for the domain build process (i.e. xl doesn't ever add
anything to the array rdms).

Tiejun, is that an accurate summary?

If the field is in the public API then the possibility of something
being passed in there must be considered now, even if this particular
series adds no such calls, since we cannot prevent 3rd party users of
libxl adding such configuration.

Is the possibility of the toolstack (i.e. the caller of libxl) supplying
an array of rdm regions being left aside for future work, or is it
not intended to ever support that?

Ian.

> 
> > And if you still worry about something, I can add assert() at the 
> > beginning of this function like this,
> > 
> > assert(!d_config->num_rdms && !d_config->rdms).
> 
> If you are sure that this assertion is correct, then that would be
> proper.
> 
> But as I say above, I don't think it is.
> 
> Ian.





Re: [Xen-devel] [PATCH OSSTEST 0/4] Have OpenStack tested on top of xen's master and libvirt's master.

2015-07-20 Thread Ian Campbell
On Mon, 2015-07-20 at 16:07 +0100, Anthony PERARD wrote:
> > >   I have not done it, but we could have some smoke test before
> > > Tempest where osstest tried to start a guest.
> > 
> > A test can have a dependency on a build job, but I'm not sure about
> > another test, but that would seem generally useful and would allow your
> > new test to depend on test-ARCH-ARCH-libvirt.
> 
> I was thinking of an extra step within the test which would just start a
> guest via OpenStack/Nova before running the full test suite from Tempest.

ISWYM.

That might be nice, if it were very little osstest-development effort to
make it happen, but I think it isn't going to be a small effort and
running tempest and seeing what happens seems like it would be good
enough anyway.

> > > Then later, there will be the question of which tree to track, devstack?
> > > nova? Or don't track any and just test with the master branch from time to
> > > time.
> > 
> > This is a complicated one, especially for things which don't fit into
> > the "one tree one branch" model of things...
> > 
> > In terms of gating (which matters to us for regression tracking even if
> > we don't care about the output of the gate) is it the case that if you
> > clone $rootthing (whatever that is) and select a given revision of that
> > you will always get the same thing?
> 
> > Or is it like raisin where you clone $rootthing and select a revision of
> > that and it will clone the latest version of everything at that time?
> > 
> > I'm expecting the second one?
> 
> Yes, I think your description of raisin would match the description of
> devstack, the $rootthing.
> So, devstack will clone master of every other tree by default. But we
> can select a specific revision for everything that is going to be cloned.
> 
> Also, if one clones a stable/* branch of devstack, then we'll get the same
> stable branch of every other tree.

But in the presence of stable backports we may not get the exact same
set of commits from one day to the next, despite cloning the exact same
devstack revision, I think?

> > I think the important thing is we would want to be testing stuff which
> > has already gone through openstack testing?
> 
> Yes. Everything in the OpenStack trees has been tested and has gone through
> the gate anyway.
> 
> > Maybe we want to track particular OStack releases?
> 
> I think tracking master at first would be fine.

Right.






Re: [Xen-devel] [PATCH 1/3] libxl: fix ref counting of libxlMigrationDstArgs

2015-07-20 Thread Olaf Hering
On Thu, Jul 16, Jim Fehlig wrote:

> @@ -448,6 +438,8 @@ libxlDomainMigrationPrepare(virConnectPtr dconn,
>  virObjectUnref(socks[i]);
>  }
>  VIR_FREE(socks);
> +virObjectUnref(args);

This is now below the 'error' label, so args has to be initialized.

[  149s] libxl/libxl_migration.c: In function 'libxlDomainMigrationPrepare':
[  149s] libxl/libxl_migration.c:463:19: warning: 'args' may be used uninitialized in this function [-Wmaybe-uninitialized]
[  149s]  virObjectUnref(args);
[  149s]^

Olaf
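
The pattern behind this warning can be sketched as follows. This is a
minimal illustration using hypothetical stand-ins (demo_obj, demo_new,
demo_unref, demo_prepare), not the libvirt API: once the unref sits below
the error label, every early goto reaches it, so the pointer has to start
out as NULL and the unref has to tolerate NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not a libvirt type. */
struct demo_obj {
    int refs;
};

int demo_live_objects; /* counts objects that have not been freed yet */

static struct demo_obj *demo_new(void)
{
    struct demo_obj *obj = calloc(1, sizeof(*obj));

    if (obj) {
        obj->refs = 1;
        demo_live_objects++;
    }
    return obj;
}

/* Drop one reference; a NULL pointer is accepted and ignored. */
static void demo_unref(struct demo_obj *obj)
{
    if (!obj)
        return;
    assert(obj->refs > 0);
    if (--obj->refs == 0) {
        free(obj);
        demo_live_objects--;
    }
}

/*
 * fail_early simulates an error that happens before args is allocated.
 * Returns 0 on success, -1 on failure.
 */
int demo_prepare(int fail_early)
{
    struct demo_obj *args = NULL; /* initialized: the cleanup always runs */
    int ret = -1;

    if (fail_early)
        goto error;

    args = demo_new();
    if (!args)
        goto error;

    ret = 0;

 error:
    /* Reached on both paths, so args is either NULL or valid here. */
    demo_unref(args);
    return ret;
}
```

Without the "= NULL" initializer the early goto would hand an
indeterminate pointer to demo_unref(), which is the situation the compiler
is pointing at.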



Re: [Xen-devel] [PATCH OSSTEST 0/4] Have OpenStack tested on top of xen's master and libvirt's master.

2015-07-20 Thread Ian Jackson
Anthony PERARD writes ("Re: [PATCH OSSTEST 0/4] Have OpenStack tested on top of 
xen's master and libvirt's master."):
> Yes, I think your description of raisin would match the description of
> devstack, the $rootthing.
> So, devstack will clone master of every other tree by default. But we
> can select a specific revision for everything that is going to be cloned.

Are we trying to make a push gate out of this ?

In any case at the very least I think we need to be able to have
osstest's runvars control the versions of the subtrees.  That way the
bisector can work properly.

Ian.



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Ian Jackson
Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and
avoid conflicts with RDM"):
> [Ian Jackson:]
> > The domain configuration specified to libxl might contain some rdms.
> > Then num_rdms in the incoming config would be nonzero.
> 
> We never set d_config->num_rdms/d_config->rdms before we goes inside 
> libxl__domain_device_construct_rdm(). And actually 
> libxl__domain_device_construct_rdm is only one place to set 
> d_config->num_rdms/d_config->rdms.

But d_config is a libxl_domain_config which is supplied by libxl's
caller.  It might contain some rdms.

> I guess this line make you or other guys confused so lets delete this 
> line directly.

I don't think I am very confused.

> And if you still worry about something, I can add assert() at the 
> beginning of this function like this,
> 
> assert(!d_config->num_rdms && !d_config->rdms).

If you are sure that this assertion is correct, then that would be
proper.

But as I say above, I don't think it is.

Ian.



[Xen-devel] [PATCH OSSTEST 2/3] ts-debian-hvm-install: di_installcmdline_core

2015-07-20 Thread Ian Campbell
This is primarily to get DEBIAN_FRONTEND=text, for easier to read
logging.

Previously the command line consisted of the console and
preseed/file=/preseed.cfg. After this it is more complex.

The preseed file uses file= which is an alias for preseed/file. Extra
options are given including DEBIAN_FRONTEND and DEBCONF_DEBUG and the
following are preseeded via the command line:

Previously implied were "auto=true preseed", which are now explicit.

In addition the following harmless (in this context) options are
added:
hw-detect/load_firmware=
hostname=
netcfg/dhcp_timeout=
netcfg/choose_interface=

The caller could also cause debconf/priority to be set, but doesn't
here.

ts-debian-di-install in the distro test series also uses
di_installcmdline_core for guest uses.

Signed-off-by: Ian Campbell 
---
 Osstest/Debian.pm |  4 +++-
 ts-debian-hvm-install | 27 +--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/Osstest/Debian.pm b/Osstest/Debian.pm
index 718a7e2..8282918 100644
--- a/Osstest/Debian.pm
+++ b/Osstest/Debian.pm
@@ -627,6 +627,8 @@ our %preseed_cmds;
 sub di_installcmdline_core ($$;@) {
 my ($tho, $ps_url, %xopts) = @_;
 
+$xopts{PreseedScheme} //= 'url';
+
 $ps_url =~ s,^http://,,;
 
 my $netcfg_interface= get_host_property($tho,'interface force','auto');
@@ -640,7 +642,7 @@ sub di_installcmdline_core ($$;@) {
 push @cl, (
"DEBIAN_FRONTEND=$difront",
"hostname=$tho->{Name}",
-   "url=$ps_url",
+   "$xopts{PreseedScheme}=$ps_url",
"netcfg/dhcp_timeout=150",
"netcfg/choose_interface=$netcfg_interface"
);
diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index 0c94c7e..69f1217 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -98,22 +98,45 @@ END
 }
 
 sub grub_cfg () {
+my @dicmdline = ();
+my $gconsole = "console=ttyS0,115200n8";
+
+push @dicmdline, $gconsole;
+push @dicmdline, di_installcmdline_core($gho, '/preseed.cfg',
+   PreseedScheme => 'file');
+push @dicmdline, "--";
 # See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762007 for
 # why console= is repeated.
+push @dicmdline, $gconsole;
+
+my $cmdline = join(" ", @dicmdline);
+
 return <<"END";
 set default="0"
 set timeout=5
 
 menuentry 'debian guest auto Install' {
-linux /install.amd/vmlinuz preseed/file=/preseed.cfg console=ttyS0,115200n8 -- console=ttyS0,115200n8
+linux /install.amd/vmlinuz $cmdline
 initrd /install.amd/initrd.gz
 }
 END
 }
 
 sub isolinux_cfg () {
+my @dicmdline = ();
+my $gconsole = "console=ttyS0,115200n8";
+
+push @dicmdline, $gconsole;
+push @dicmdline, di_installcmdline_core($gho, '/preseed.cfg',
+   PreseedScheme => 'file');
+push @dicmdline, "initrd=/install.amd/initrd.gz";
+push @dicmdline, "--";
 # See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762007 for
 # why console= is repeated.
+push @dicmdline, $gconsole;
+
+my $cmdline = join(" ", @dicmdline);
+
 return <<"END";
 default autoinstall
 prompt 0
@@ -121,7 +144,7 @@ sub isolinux_cfg () {
 
 label autoinstall
 kernel /install.amd/vmlinuz
-append preseed/file=/preseed.cfg initrd=/install.amd/initrd.gz console=ttyS0,115200n8 -- console=ttyS0,115200n8
+append $cmdline
 END
 }
 
-- 
2.1.4




[Xen-devel] [PATCH OSSTEST 0/3] fixes to ts-debian-hvm-install

2015-07-20 Thread Ian Campbell
The main one is the middle one which would have made
http://logs.test-lab.xenproject.org/osstest/logs/59681/test-amd64-i386-xl-qemuu-debianhvm-amd64/info.html
 a lot easier to read due to the DEBIAN_FRONTEND=text.





[Xen-devel] [PATCH OSSTEST 1/3] ts-debian-hvm-install: Remove VGA console runes.

2015-07-20 Thread Ian Campbell
I don't think there is any point in these since 60b6d20b0fd2
"ts-debian-hvm-install: Arrange for installed guest to use a serial
console" and they represent an unexplained difference between the
isolinux and grub cases.

Signed-off-by: Ian Campbell 
---
 ts-debian-hvm-install | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index f05b1a7..0c94c7e 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -105,7 +105,7 @@ set default="0"
 set timeout=5
 
 menuentry 'debian guest auto Install' {
-linux /install.amd/vmlinuz console=vga preseed/file=/preseed.cfg console=ttyS0,115200n8 -- console=ttyS0,115200n8
+linux /install.amd/vmlinuz preseed/file=/preseed.cfg console=ttyS0,115200n8 -- console=ttyS0,115200n8
 initrd /install.amd/initrd.gz
 }
 END
@@ -121,7 +121,7 @@ sub isolinux_cfg () {
 
 label autoinstall
 kernel /install.amd/vmlinuz
-append video=vesa:ywrap,mtrr vga=788 preseed/file=/preseed.cfg initrd=/install.amd/initrd.gz console=ttyS0,115200n8 -- console=ttyS0,115200n8
+append preseed/file=/preseed.cfg initrd=/install.amd/initrd.gz console=ttyS0,115200n8 -- console=ttyS0,115200n8
 END
 }
 
-- 
2.1.4




[Xen-devel] [PATCH OSSTEST 3/3] ts-debian-hvm-install: Use xargs -0 to avoid massive filelist in logs.

2015-07-20 Thread Ian Campbell
The current arrangement is a bit odd, I'm not sure why it would be
that way and it results in a huge list of files in the middle of the
log which is rather boring to scroll through.

Signed-off-by: Ian Campbell 
---
 ts-debian-hvm-install | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
index 69f1217..8434f8f 100755
--- a/ts-debian-hvm-install
+++ b/ts-debian-hvm-install
@@ -160,7 +160,7 @@ sub prepare_initrd ($$$) {
   cd -
   rm -rf $initrddir
   cd $newiso
-  md5sum `find -L -type f -print0 | xargs -0` > md5sum.txt
+  find -L -type f -print0 | xargs -0 md5sum > md5sum.txt
   cd -
 END
 }
-- 
2.1.4




Re: [Xen-devel] [PATCH OSSTEST 0/4] Have OpenStack tested on top of xen's master and libvirt's master.

2015-07-20 Thread Anthony PERARD
On Fri, Jul 17, 2015 at 05:22:10PM +0100, Ian Campbell wrote:
> On Thu, 2015-07-16 at 12:18 +0100, Anthony PERARD wrote:
> 
> > I've introduced an extra Osstest::Toolstack which helps to install extra
> > packages, 
> 
> I've commented on this.
> 
> > and use ballooning for Dom0, 500MB for Dom0 is definitely not
> > enough.
> 
> This is for overriding Dom0MemFixed I think?

Yes.

> I think going to ballooning is a bit too far, I think instead the actual
> size of dom0 ought to be something which can be overridden via a runvar,
> so that the openstack tests can ask for a bigger one.

I will look into that.

> > The ts-devstack script does prepare a bit more the host, clone devstack,
> > then run ./stack.sh, which is a bit like raisin. Once the machine ready,
> > the integration test suite from OpenStack, Tempest, is started. Do you
> > think those two step should be in separate test, one for devstack, and one
> > for Tempest?
> 
> I don't really understand the difference between them well enough, but I
> would say if they are separate things then different steps would be good
> -- since if nothing else it gives a clearer indication in the test
> report what had failed.

I'll separate them into different steps.

> >   I have not done it, but we could have some smoke test before
> > Tempest where osstest tried to start a guest.
> 
> A test can have a dependency on a build job, but I'm not sure about
> another test, but that would seem generally useful and would allow your
> new test to depend on test-ARCH-ARCH-libvirt.

I was thinking of an extra step within the test which would just start a
guest via OpenStack/Nova before running the full test suite from Tempest.

> However I think we don't actually want to do that:
> 
> For the openstack branch we will be using known good versions of all the
> other branches, so that test-ARCH-ARCH-libvirt will have happened
> already (when the libvirt branch got pushed).
> 
> For the other branches then we already have loads of tests which might
> all fail for the same reason which you could serialise to prevent this,
> but it would make the whole flight take much longer and it isn't really
> needed.
> 
> > Then later, there will be the question of which tree to track, devstack?
> > nova? Or don't track any and just test with the master branch from time to
> > time.
> 
> This is a complicated one, especially for things which don't fit into
> the "one tree one branch" model of things...
> 
> In terms of gating (which matters to us for regression tracking even if
> we don't care about the output of the gate) is it the case that if you
> clone $rootthing (whatever that is) and select a given revision of that
> you will always get the same thing?

> Or is it like raisin where you clone $rootthing and select a revision of
> that and it will clone the latest version of everything at that time?
> 
> I'm expecting the second one?

Yes, I think your description of raisin would match the description of
devstack, the $rootthing.
So, devstack will clone master of every other tree by default. But we
can select a specific revision for everything that is going to be cloned.

Also, if one clones a stable/* branch of devstack, then we'll get the same
stable branch of every other tree.

> I think the important thing is we would want to be testing stuff which
> has already gone through openstack testing?

Yes. Everything in the OpenStack trees has been tested and has gone through
the gate anyway.

> Maybe we want to track particular OStack releases?

I think tracking master at first would be fine.

-- 
Anthony PERARD



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Chen, Tiejun

+int libxl__domain_device_construct_rdm(libxl__gc *gc,
+   libxl_domain_config *d_config,
+   uint64_t rdm_mem_boundary,
+   struct xc_hvm_build_args *args)
+{

...

+/* Query all RDM entries in this platform */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {

...

+} else {
+d_config->num_rdms = 0;
+}


Does this not override the domain configuration's num_rdms ?  I don't


We don't have the specific "num_rdms" parameter in .cfg so I don't
understand what you mean here.


The domain configuration specified to libxl might contain some rdms.
Then num_rdms in the incoming config would be nonzero.


We never set d_config->num_rdms/d_config->rdms before we goes inside 
libxl__domain_device_construct_rdm(). And actually 
libxl__domain_device_construct_rdm is only one place to set 
d_config->num_rdms/d_config->rdms.


I guess this line make you or other guys confused so lets delete this 
line directly.


And if you still worry about something, I can add assert() at the 
beginning of this function like this,


assert(!d_config->num_rdms && !d_config->rdms).

Thanks
Tiejun



So I think there are two problems here:

1. If that were the case you would leak the application's rdms array.

2. Anyway, if the caller specifies such an array you should use it.
(Fixing this would avoid (1) in any case.)

Ian.





[Xen-devel] [PATCH for-4.6 0/3] FreeBSD fixes

2015-07-20 Thread Roger Pau Monne
Misc fixes for FreeBSD that affect libxl and the recently added 
xendriverdomain rc.d script.



[Xen-devel] [PATCH for-4.6 3/3] hotplug/FreeBSD: fix xendriverdomain rc.d script

2015-07-20 Thread Roger Pau Monne
hotplugpath.sh by default is located in /usr/local/etc/xen/scripts on
FreeBSD. Instead of hardcoding its location use the XEN_SCRIPT_DIR variable,
as is done in the xencommons rc.d script.

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
 tools/hotplug/FreeBSD/rc.d/xendriverdomain.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in b/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
index 25e3edd..4063c06 100644
--- a/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
+++ b/tools/hotplug/FreeBSD/rc.d/xendriverdomain.in
@@ -7,7 +7,7 @@
 
 . /etc/rc.subr
 
-. /etc/xen/scripts/hotplugpath.sh
+. @XEN_SCRIPT_DIR@/hotplugpath.sh
 
 LD_LIBRARY_PATH="${libdir}"
 export LD_LIBRARY_PATH
-- 
1.9.5 (Apple Git-50.3)




[Xen-devel] [PATCH for-4.6 1/3] libxl: include sys/endian.h for FreeBSD

2015-07-20 Thread Roger Pau Monne
be64toh and friends are declared in sys/endian.h on FreeBSD, so include it
as part of libxl_osdeps.h.

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
 tools/libxl/libxl_osdeps.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxl/libxl_osdeps.h b/tools/libxl/libxl_osdeps.h
index 08eaf0c..b265df8 100644
--- a/tools/libxl/libxl_osdeps.h
+++ b/tools/libxl/libxl_osdeps.h
@@ -42,6 +42,7 @@
 #define SYSFS_PCIBACK_DRIVER   "/dev/null"
 #define NETBACK_NIC_NAME   "xnb%u.%d"
 #include 
+#include <sys/endian.h>
 #endif
 
 #ifndef SYSFS_PCIBACK_DRIVER
-- 
1.9.5 (Apple Git-50.3)




[Xen-devel] [PATCH for-4.6 2/3] libxl/psr: use Xen error codes when checking hypercall return values

2015-07-20 Thread Roger Pau Monne
We cannot use the system's errno values when checking return values from Xen,
because some OSes don't have the same set of errno definitions. Instead
use the definitions present in the Xen public errno.h header.

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
---
I have not checked if there are other places in libxl that need similar
treatment, I just came across this because FreeBSD doesn't have EBADSLT
defined.
---
 tools/libxl/libxl_internal.h |  1 +
 tools/libxl/libxl_psr.c  | 26 +-
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2b6b2a0..cf5db8a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -55,6 +55,7 @@
 #include "xentoollog.h"
 
 #include 
+#include 
 
 #ifdef LIBXL_H
 # error libxl.h should be included via libxl_internal.h, not separately
diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 2a0..5711c38 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -24,17 +24,17 @@ static void libxl__psr_log_err_msg(libxl__gc *gc, int err)
 char *msg;
 
 switch (err) {
-case ENOSYS:
-case EOPNOTSUPP:
+case XEN_ENOSYS:
+case XEN_EOPNOTSUPP:
 msg = "unsupported operation";
 break;
-case ESRCH:
+case XEN_ESRCH:
 msg = "invalid domain ID";
 break;
-case EBADSLT:
+case XEN_EBADSLT:
 msg = "socket is not supported";
 break;
-case EFAULT:
+case XEN_EFAULT:
 msg = "failed to exchange data with Xen";
 break;
 default:
@@ -50,16 +50,16 @@ static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
 char *msg;
 
 switch (err) {
-case ENODEV:
+case XEN_ENODEV:
 msg = "CMT is not supported in this system";
 break;
-case EEXIST:
+case XEN_EEXIST:
 msg = "CMT is already attached to this domain";
 break;
-case ENOENT:
+case XEN_ENOENT:
 msg = "CMT is not attached to this domain";
 break;
-case EUSERS:
+case XEN_EUSERS:
 msg = "no free RMID available";
 break;
 default:
@@ -75,16 +75,16 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
 char *msg;
 
 switch (err) {
-case ENODEV:
+case XEN_ENODEV:
 msg = "CAT is not supported in this system";
 break;
-case ENOENT:
+case XEN_ENOENT:
 msg = "CAT is not enabled on the socket";
 break;
-case EUSERS:
+case XEN_EUSERS:
 msg = "no free COS available";
 break;
-case EEXIST:
+case XEN_EEXIST:
 msg = "The same CBM is already set to this domain";
 break;
 
-- 
1.9.5 (Apple Git-50.3)




Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Ian Jackson
Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> Note I need more time to address others.

Right.  But thanks for coming back quickly with this question:

> >> +int libxl__domain_device_construct_rdm(libxl__gc *gc,
> >> +   libxl_domain_config *d_config,
> >> +   uint64_t rdm_mem_boundary,
> >> +   struct xc_hvm_build_args *args)
> >> +{
> > ...
> >> +/* Query all RDM entries in this platform */
> >> +if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {
> > ...
> >> +} else {
> >> +d_config->num_rdms = 0;
> >> +}
> >
> > Does this not override the domain configuration's num_rdms ?  I don't
> 
> We don't have the specific "num_rdms" parameter in .cfg so I don't 
> understand what you mean here.

The domain configuration specified to libxl might contain some rdms.
Then num_rdms in the incoming config would be nonzero.

So I think there are two problems here:

1. If that were the case you would leak the application's rdms array.

2. Anyway, if the caller specifies such an array you should use it.
   (Fixing this would avoid (1) in any case.)

Ian.
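
The two points above can be sketched as follows. This is only an
illustration under stated assumptions: sketch_config, sketch_rdm and
sketch_construct_rdm are hypothetical stand-ins, not the libxl types or
the function from the series, and the host query is replaced by an array
passed in by the test caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the libxl types. */
struct sketch_rdm {
    unsigned long long start, size;
};

struct sketch_config {
    int num_rdms;
    struct sketch_rdm *rdms;
};

/*
 * Fill in cfg->rdms only if the caller did not supply any entries.
 * host_rdms/num_host stand in for whatever a host query would return.
 * Returns 0 on success, -1 on allocation failure.
 */
int sketch_construct_rdm(struct sketch_config *cfg,
                         const struct sketch_rdm *host_rdms, int num_host)
{
    if (cfg->num_rdms > 0) {
        /* Caller-supplied array: use it as is, never overwrite or leak. */
        assert(cfg->rdms != NULL);
        return 0;
    }

    if (num_host <= 0)
        return 0;

    cfg->rdms = malloc((size_t)num_host * sizeof(*cfg->rdms));
    if (!cfg->rdms)
        return -1;

    for (int i = 0; i < num_host; i++)
        cfg->rdms[i] = host_rdms[i];
    cfg->num_rdms = num_host;
    return 0;
}
```

With this shape, calling the function a second time on the same
configuration leaves the array from the first call in place instead of
resetting num_rdms and dropping the pointer.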



Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-20 Thread Boris Ostrovsky

On 07/20/2015 10:09 AM, Dario Faggioli wrote:

On Fri, 2015-07-17 at 14:17 -0400, Boris Ostrovsky wrote:

On 07/17/2015 03:27 AM, Dario Faggioli wrote:

In the meanwhile, what should we do? Document this? How? "don't use
vNUMA with PV guest in SMT enabled systems" seems a bit harsh... Is
there a workaround we can put in place/suggest?

I haven't been able to reproduce this on my Intel box because I think I
have different core enumeration.


Yes, most likely, that's highly topology dependant. :-(


Can you try adding
cpuid=['0x1:ebx=0001']
to your config file?


Done (sorry for the delay, the testbox was busy doing other stuff).

Still no joy (.101 is the IP address of the guest, domain id 3):

root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# xl vcpu-list 3
Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
test                                 3     0    4   r--      23.6  all / 0-7
test                                 3     1    9   r--      19.8  all / 0-7
test                                 3     2    8   -b-       0.4  all / 8-15
test                                 3     3    4   -b-       0.2  all / 8-15

*HOWEVER* it seems to have an effect. In fact, now, topology as it is
shown in /sys/... is different:

root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0
(it was 0-1)

This, OTOH, is still the same:
root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list
0-3

Also, I now see this:

[0.150560] [ cut here ]
[0.150560] WARNING: CPU: 2 PID: 0 at ../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
[0.150560] sched: CPU #2's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[0.150560] Modules linked in:
[0.150560] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
[0.150560]  0009 88001ee2fdd0 81657c7b 
810bbd2c
[0.150560]  88001ee2fe20 88001ee2fe10 81081510 
88001ee2fea0
[0.150560]  8103aa02 88003ea0a001  
88001f20a040
[0.150560] Call Trace:
[0.150560]  [] dump_stack+0x4f/0x7b
[0.150560]  [] ? up+0x39/0x3e
[0.150560]  [] warn_slowpath_common+0xa1/0xbb
[0.150560]  [] ? topology_sane.isra.2+0x74/0x88
[0.150560]  [] warn_slowpath_fmt+0x46/0x48
[0.150560]  [] ? __cpuid.constprop.0+0x15/0x19
[0.150560]  [] topology_sane.isra.2+0x74/0x88
[0.150560]  [] set_cpu_sibling_map+0x27a/0x444
[0.150560]  [] ? numa_add_cpu+0x98/0x9f
[0.150560]  [] cpu_bringup+0x63/0xa8
[0.150560]  [] cpu_bringup_and_idle+0xe/0x1a
[0.150560] ---[ end trace 63d204896cce9f68 ]---

Notice that it now says 'llc-sibling', while, before, it was saying
'smt-sibling'.


Exactly. You are now passing the first topology test which was to see 
that threads are on the same node. And since each processor has only one 
thread (as evidenced by thread_siblings_list) we are good.


The second test checks that cores (i.e. things that share last level 
cache) are on the same node. And they are not.
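
The two checks can be sketched roughly like this. It is a simplified
illustration, not the kernel's code: the sketch_cpu structure and its
field names are assumptions, and the real checks in smpboot.c look at
more than these three values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU description; the field names are assumptions. */
struct sketch_cpu {
    int node;    /* NUMA node the CPU is reported on */
    int core_id; /* CPUs with equal core_id are SMT siblings */
    int llc_id;  /* CPUs with equal llc_id share a last-level cache */
};

/* Siblings of either kind are only sane if they sit on the same node. */
static bool sketch_same_node(const struct sketch_cpu *a,
                             const struct sketch_cpu *b)
{
    return a->node == b->node;
}

/* First test: SMT siblings must be on the same node. */
bool sketch_smt_ok(const struct sketch_cpu *a, const struct sketch_cpu *b)
{
    return a->core_id != b->core_id || sketch_same_node(a, b);
}

/* Second test: CPUs sharing a last-level cache must be on the same node. */
bool sketch_llc_ok(const struct sketch_cpu *a, const struct sketch_cpu *b)
{
    return a->llc_id != b->llc_id || sketch_same_node(a, b);
}
```

For a pair shaped like CPU #0 and CPU #2 in the trace (different cores,
same LLC ID, nodes 0 and 1) the first test passes and the second one
fails.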






On AMD, BTW, we fail a different test so some other bits probably need
to be tweaked. You may fail it too (the LLC sanity check).


Yep, that's the one I guess. Should I try something more/else?



I'll need to see how LLC IDs are calculated, probably also from some 
CPUID bits. The question though will be --- what do we do with how cache 
sizes (and TLB sizes for that matter) are presented to the guests. Do we 
scale them down per thread?


-boris



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-20 Thread Chen, Tiejun


Note I need more time to address others.


+int libxl__domain_device_construct_rdm(libxl__gc *gc,
+   libxl_domain_config *d_config,
+   uint64_t rdm_mem_boundary,
+   struct xc_hvm_build_args *args)
+{

...

+/* Query all RDM entries in this platform */
+if (strategy == LIBXL_RDM_RESERVE_STRATEGY_HOST) {

...

+} else {
+d_config->num_rdms = 0;
+}


Does this not override the domain configuration's num_rdms ?  I don't


We don't have the specific "num_rdms" parameter in .cfg so I don't 
understand what you mean here.


Thanks
Tiejun


think that is correct.

If the domain configuration has rdms and num_rdms already set, then
the strategy should presumably be ignored.  (Passing the same domain
configuration struct to libxl_domain_create_new, after destroying the
domain, ought to work, even after the first call has modified it.)







[Xen-devel] [PATCH v2 23/23] x86: add multiboot2 protocol support for relocatable images

2015-07-20 Thread Daniel Kiper
Add multiboot2 protocol support for relocatable images. Only GRUB2
with the relevant patches understands this feature. Older multiboot
protocol (regardless of version) compatible loaders ignore it
and everything works as usual.

Signed-off-by: Daniel Kiper 
---
 xen/arch/x86/boot/head.S  |   46 +
 xen/arch/x86/x86_64/asm-offsets.c |1 +
 xen/include/xen/multiboot2.h  |   13 +++
 3 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index d484f68..2520e48 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -81,6 +81,13 @@ multiboot1_header_end:
 /* Align modules at page boundry. */
 mb2ht_init MB2_HT(MODULE_ALIGN), MB2_HT(REQUIRED)
 
+/* Load address preference. */
+mb2ht_init MB2_HT(RELOCATABLE), MB2_HT(OPTIONAL), \
+   sym_phys(start), /* Min load address. */ \
+   0xffffffff, /* Max load address (4 GiB - 1). */ \
+   0x200000, /* Load address alignment (2 MiB). */ \
+   MULTIBOOT2_LOAD_PREFERENCE_HIGH
+
 /* Console flags tag. */
 mb2ht_init MB2_HT(CONSOLE_FLAGS), MB2_HT(OPTIONAL), \
MULTIBOOT2_CONSOLE_FLAGS_EGA_TEXT_SUPPORTED
@@ -176,30 +183,39 @@ efi_multiboot2_proto:
 lea MB2_fixed_sizeof(%rbx),%rcx
 
 0:
+/* Get Xen image base address from Multiboot2 information. */
+cmpl$MULTIBOOT2_TAG_TYPE_BASE_ADDR,MB2_tag_type(%rcx)
+jne 1f
+
+mov MB2_base_addr(%rcx),%ebp
+sub $XEN_IMG_OFFSET,%rbp
+jmp 4f
+
+1:
 /* Get EFI SystemTable address from Multiboot2 information. */
 cmpl$MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%rcx)
-jne 1f
+jne 2f
 
 mov MB2_efi64_st(%rcx),%rsi
 
 /* Do not go into real mode on EFI platform. */
 movb$1,skip_realmode(%rip)
-jmp 3f
+jmp 4f
 
-1:
+2:
 /* Get EFI ImageHandle address from Multiboot2 information. */
 cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,MB2_tag_type(%rcx)
-jne 2f
+jne 3f
 
 mov MB2_efi64_ih(%rcx),%rdi
-jmp 3f
+jmp 4f
 
-2:
+3:
 /* Is it the end of Multiboot2 information? */
 cmpl$MULTIBOOT2_TAG_TYPE_END,MB2_tag_type(%rcx)
 je  run_bs
 
-3:
+4:
 /* Go to next Multiboot2 information tag. */
 add MB2_tag_size(%rcx),%ecx
 add $(MULTIBOOT2_TAG_ALIGN-1),%rcx
@@ -297,14 +313,23 @@ multiboot2_proto:
 lea MB2_fixed_sizeof(%ebx),%ecx
 
 0:
+/* Get Xen image base address from Multiboot2 information. */
+cmpl$MULTIBOOT2_TAG_TYPE_BASE_ADDR,MB2_tag_type(%ecx)
+jne 1f
+
+mov MB2_base_addr(%ecx),%ebp
+sub $XEN_IMG_OFFSET,%ebp
+jmp 3f
+
+1:
 /* Get mem_lower from Multiboot2 information. */
 cmpl$MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO,MB2_tag_type(%ecx)
-jne 1f
+jne 2f
 
 mov MB2_mem_lower(%ecx),%edx
-jmp trampoline_bios_setup
+jmp 3f
 
-1:
+2:
 /* EFI mode is not supported via legacy BIOS path. */
 cmpl$MULTIBOOT2_TAG_TYPE_EFI32,MB2_tag_type(%ecx)
 je  mb2_too_old
@@ -316,6 +341,7 @@ multiboot2_proto:
 cmpl$MULTIBOOT2_TAG_TYPE_END,MB2_tag_type(%ecx)
 je  trampoline_bios_setup
 
+3:
 /* Go to next Multiboot2 information tag. */
 add MB2_tag_size(%ecx),%ecx
 add $(MULTIBOOT2_TAG_ALIGN-1),%ecx
diff --git a/xen/arch/x86/x86_64/asm-offsets.c 
b/xen/arch/x86/x86_64/asm-offsets.c
index b7aed49..5345a9e 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -172,6 +172,7 @@ void __dummy__(void)
 DEFINE(MB2_fixed_sizeof, sizeof(multiboot2_fixed_t));
 OFFSET(MB2_tag_type, multiboot2_tag_t, type);
 OFFSET(MB2_tag_size, multiboot2_tag_t, size);
+OFFSET(MB2_base_addr, multiboot2_tag_base_addr_t, base_addr);
 OFFSET(MB2_mem_lower, multiboot2_tag_basic_meminfo_t, mem_lower);
 OFFSET(MB2_efi64_st, multiboot2_tag_efi64_t, pointer);
 OFFSET(MB2_efi64_ih, multiboot2_tag_efi64_ih_t, pointer);
diff --git a/xen/include/xen/multiboot2.h b/xen/include/xen/multiboot2.h
index 09ee64e..a63c4d6 100644
--- a/xen/include/xen/multiboot2.h
+++ b/xen/include/xen/multiboot2.h
@@ -59,11 +59,17 @@
 #define MULTIBOOT2_HEADER_TAG_EFI_BS   7
 #define MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI32  8
 #define MULTIBOOT2_HEADER_TAG_ENTRY_ADDRESS_EFI64  9
+#define MULTIBOOT2_HEADER_TAG_RELOCATABLE  10
 
 /* Header tag flags. */
 #define MULTIBOOT2_HEADER_TAG_REQUIRED 0
 #define MULTIBOOT2_HEADER_TAG_OPTIONAL 1
 
+/* Where image should be loaded (suggestion not requirement). */
+#define MULTIBOOT2_LOAD_PREFERENCE_NONE0

[Xen-devel] [PATCH v2 6/6] multiboot2: Do not pass memory maps to image if EFI boot services are enabled

2015-07-20 Thread Daniel Kiper
Do not pass memory maps to the image if it asked for EFI boot services. The maps
are usually invalid in that case and they can confuse a potential user. The image
should get the memory map itself just before the ExitBootServices() call.

Signed-off-by: Daniel Kiper 
---
 grub-core/loader/multiboot_mbi2.c |   71 ++---
 1 file changed, 35 insertions(+), 36 deletions(-)

diff --git a/grub-core/loader/multiboot_mbi2.c 
b/grub-core/loader/multiboot_mbi2.c
index 7ac64ec..26e955c 100644
--- a/grub-core/loader/multiboot_mbi2.c
+++ b/grub-core/loader/multiboot_mbi2.c
@@ -431,7 +431,7 @@ static grub_size_t
 grub_multiboot_get_mbi_size (void)
 {
 #ifdef GRUB_MACHINE_EFI
-  if (!efi_mmap_size)
+  if (!keep_bs && !efi_mmap_size)
 find_efi_mmap_size ();
 #endif
   return 2 * sizeof (grub_uint32_t) + sizeof (struct multiboot_tag)
@@ -805,12 +805,13 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
   }
   }
 
-  {
-struct multiboot_tag_mmap *tag = (struct multiboot_tag_mmap *) ptrorig;
-grub_fill_multiboot_mmap (tag);
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-  / sizeof (grub_properly_aligned_t);
-  }
+  if (!keep_bs)
+{
+  struct multiboot_tag_mmap *tag = (struct multiboot_tag_mmap *) ptrorig;
+  grub_fill_multiboot_mmap (tag);
+  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+   / sizeof (grub_properly_aligned_t);
+}
 
   {
 struct multiboot_tag_elf_sections *tag
@@ -826,18 +827,19 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
   / sizeof (grub_properly_aligned_t);
   }
 
-  {
-struct multiboot_tag_basic_meminfo *tag
-  = (struct multiboot_tag_basic_meminfo *) ptrorig;
-tag->type = MULTIBOOT_TAG_TYPE_BASIC_MEMINFO;
-tag->size = sizeof (struct multiboot_tag_basic_meminfo); 
+  if (!keep_bs)
+{
+  struct multiboot_tag_basic_meminfo *tag
+   = (struct multiboot_tag_basic_meminfo *) ptrorig;
+  tag->type = MULTIBOOT_TAG_TYPE_BASIC_MEMINFO;
+  tag->size = sizeof (struct multiboot_tag_basic_meminfo);
 
-/* Convert from bytes to kilobytes.  */
-tag->mem_lower = grub_mmap_get_lower () / 1024;
-tag->mem_upper = grub_mmap_get_upper () / 1024;
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-   / sizeof (grub_properly_aligned_t);
-  }
+  /* Convert from bytes to kilobytes.  */
+  tag->mem_lower = grub_mmap_get_lower () / 1024;
+  tag->mem_upper = grub_mmap_get_upper () / 1024;
+  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+   / sizeof (grub_properly_aligned_t);
+}
 
   {
 struct grub_net_network_level_interface *net;
@@ -936,27 +938,24 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
 grub_efi_uintn_t efi_desc_size;
 grub_efi_uint32_t efi_desc_version;
 
-tag->type = MULTIBOOT_TAG_TYPE_EFI_MMAP;
-tag->size = sizeof (*tag) + efi_mmap_size;
-
 if (!keep_bs)
-  err = grub_efi_finish_boot_services (&efi_mmap_size, tag->efi_mmap, NULL,
-  &efi_desc_size, &efi_desc_version);
-else
   {
-   if (grub_efi_get_memory_map (&efi_mmap_size, (void *) tag->efi_mmap,
-NULL,
-&efi_desc_size, &efi_desc_version) <= 0)
- err = grub_error (GRUB_ERR_IO, "couldn't retrieve memory map");
+   tag->type = MULTIBOOT_TAG_TYPE_EFI_MMAP;
+   tag->size = sizeof (*tag) + efi_mmap_size;
+
+   err = grub_efi_finish_boot_services (&efi_mmap_size, tag->efi_mmap, 
NULL,
+&efi_desc_size, &efi_desc_version);
+
+   if (err)
+ return err;
+
+   tag->descr_size = efi_desc_size;
+   tag->descr_vers = efi_desc_version;
+   tag->size = sizeof (*tag) + efi_mmap_size;
+
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
   }
-if (err)
-  return err;
-tag->descr_size = efi_desc_size;
-tag->descr_vers = efi_desc_version;
-tag->size = sizeof (*tag) + efi_mmap_size;
-
-ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-  / sizeof (grub_properly_aligned_t);
   }
 
   if (keep_bs)
-- 
1.7.10.4




[Xen-devel] [PATCH v2 4/6] multiboot2: Add tags used to pass ImageHandle to loaded image

2015-07-20 Thread Daniel Kiper
Add tags used to pass the ImageHandle to the loaded image. It is needed
by at least the ExitBootServices() function.

Signed-off-by: Daniel Kiper 
---
 grub-core/loader/multiboot_mbi2.c |   46 +
 include/multiboot2.h  |   16 +
 2 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/grub-core/loader/multiboot_mbi2.c 
b/grub-core/loader/multiboot_mbi2.c
index 8d66e3f..dc9c709 100644
--- a/grub-core/loader/multiboot_mbi2.c
+++ b/grub-core/loader/multiboot_mbi2.c
@@ -172,6 +172,8 @@ grub_multiboot_load (grub_file_t file, const char *filename)
  case MULTIBOOT_TAG_TYPE_NETWORK:
  case MULTIBOOT_TAG_TYPE_EFI_MMAP:
  case MULTIBOOT_TAG_TYPE_EFI_BS:
+ case MULTIBOOT_TAG_TYPE_EFI32_IH:
+ case MULTIBOOT_TAG_TYPE_EFI64_IH:
break;
 
  default:
@@ -407,16 +409,18 @@ grub_multiboot_get_mbi_size (void)
 + grub_get_multiboot_mmap_count ()
 * sizeof (struct multiboot_mmap_entry)), MULTIBOOT_TAG_ALIGN)
 + ALIGN_UP (sizeof (struct multiboot_tag_framebuffer), MULTIBOOT_TAG_ALIGN)
+#ifdef GRUB_MACHINE_EFI
 + ALIGN_UP (sizeof (struct multiboot_tag_efi32), MULTIBOOT_TAG_ALIGN)
 + ALIGN_UP (sizeof (struct multiboot_tag_efi64), MULTIBOOT_TAG_ALIGN)
++ ALIGN_UP (sizeof (struct multiboot_tag_efi32_ih), MULTIBOOT_TAG_ALIGN)
++ ALIGN_UP (sizeof (struct multiboot_tag_efi64_ih), MULTIBOOT_TAG_ALIGN)
++ ALIGN_UP (sizeof (struct multiboot_tag_efi_mmap)
+   + efi_mmap_size, MULTIBOOT_TAG_ALIGN)
+#endif
 + ALIGN_UP (sizeof (struct multiboot_tag_old_acpi)
+ sizeof (struct grub_acpi_rsdp_v10), MULTIBOOT_TAG_ALIGN)
 + acpiv2_size ()
 + net_size ()
-#ifdef GRUB_MACHINE_EFI
-+ ALIGN_UP (sizeof (struct multiboot_tag_efi_mmap)
-   + efi_mmap_size, MULTIBOOT_TAG_ALIGN)
-#endif
 + sizeof (struct multiboot_tag_vbe) + MULTIBOOT_TAG_ALIGN - 1
 + sizeof (struct multiboot_tag_apm) + MULTIBOOT_TAG_ALIGN - 1;
 }
@@ -906,11 +910,35 @@ grub_multiboot_make_mbi (grub_uint32_t *target)
 
   if (keep_bs)
 {
-  struct multiboot_tag *tag = (struct multiboot_tag *) ptrorig;
-  tag->type = MULTIBOOT_TAG_TYPE_EFI_BS;
-  tag->size = sizeof (struct multiboot_tag);
-  ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
-   / sizeof (grub_properly_aligned_t);
+  {
+   struct multiboot_tag *tag = (struct multiboot_tag *) ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI_BS;
+   tag->size = sizeof (struct multiboot_tag);
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+
+#ifdef __x86_64__
+  {
+   struct multiboot_tag_efi64_ih *tag = (struct multiboot_tag_efi64_ih *) 
ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI64_IH;
+   tag->size = sizeof (*tag);
+   tag->pointer = (grub_addr_t) grub_efi_image_handle;
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+#endif
+
+#ifdef __i386__
+  {
+   struct multiboot_tag_efi32_ih *tag = (struct multiboot_tag_efi32_ih *) 
ptrorig;
+   tag->type = MULTIBOOT_TAG_TYPE_EFI32_IH;
+   tag->size = sizeof (*tag);
+   tag->pointer = (grub_addr_t) grub_efi_image_handle;
+   ptrorig += ALIGN_UP (tag->size, MULTIBOOT_TAG_ALIGN)
+ / sizeof (grub_properly_aligned_t);
+  }
+#endif
 }
 #endif
 
diff --git a/include/multiboot2.h b/include/multiboot2.h
index b3977e3..9f97ddc 100644
--- a/include/multiboot2.h
+++ b/include/multiboot2.h
@@ -60,6 +60,8 @@
 #define MULTIBOOT_TAG_TYPE_NETWORK   16
 #define MULTIBOOT_TAG_TYPE_EFI_MMAP  17
 #define MULTIBOOT_TAG_TYPE_EFI_BS18
+#define MULTIBOOT_TAG_TYPE_EFI32_IH  19
+#define MULTIBOOT_TAG_TYPE_EFI64_IH  20
 
 #define MULTIBOOT_HEADER_TAG_END  0
 #define MULTIBOOT_HEADER_TAG_INFORMATION_REQUEST  1
@@ -379,6 +381,20 @@ struct multiboot_tag_efi_mmap
   multiboot_uint8_t efi_mmap[0];
 }; 
 
+struct multiboot_tag_efi32_ih
+{
+  multiboot_uint32_t type;
+  multiboot_uint32_t size;
+  multiboot_uint32_t pointer;
+};
+
+struct multiboot_tag_efi64_ih
+{
+  multiboot_uint32_t type;
+  multiboot_uint32_t size;
+  multiboot_uint64_t pointer;
+};
+
 #endif /* ! ASM_FILE */
 
 #endif /* ! MULTIBOOT_HEADER */
-- 
1.7.10.4




[Xen-devel] [PATCH v2 0/6] multiboot2: Add two extensions and fix some issues

2015-07-20 Thread Daniel Kiper
Hi,

This patch series:
  - enables EFI boot services usage in images loaded
    via the multiboot2 protocol,
  - adds support for multiboot2 protocol compatible
    relocatable images,
  - fixes two minor issues.

Daniel

 .gitignore|3 ++
 grub-core/Makefile.core.def   |1 +
 grub-core/lib/i386/relocator.c|   53 +
 grub-core/lib/i386/relocator64_efi.S  |   77 ++
 grub-core/lib/relocator.c |2 +-
 grub-core/loader/i386/multiboot_mbi.c |6 ++-
 grub-core/loader/multiboot.c  |   41 +---
 grub-core/loader/multiboot_elfxx.c|   28 ---
 grub-core/loader/multiboot_mbi2.c |  199 
--
 include/grub/i386/multiboot.h |   11 +
 include/grub/i386/relocator.h |   21 +
 include/grub/multiboot.h  |4 +-
 include/multiboot2.h  |   49 +++
 13 files changed, 423 insertions(+), 72 deletions(-)

Daniel Kiper (6):
  gitignore: Ignore *.orig, *.rej and *.swp files
  relocator: Do not use memory region if its starta is smaller than size
  i386/relocator: Add grub_relocator64_efi relocator
  multiboot2: Add tags used to pass ImageHandle to loaded image
  multiboot2: Add support for relocatable images
  multiboot2: Do not pass memory maps to image if EFI boot services are 
enabled




[Xen-devel] [PATCH v2 3/6] i386/relocator: Add grub_relocator64_efi relocator

2015-07-20 Thread Daniel Kiper
Add grub_relocator64_efi relocator. It will be used on EFI 64-bit platforms
when a multiboot2 compatible image requests MULTIBOOT_TAG_TYPE_EFI_BS. The
relocator will set the lower parts of %rax and %rbx according to the multiboot2
specification. The processor mode, just before jumping into the loaded image,
will be set according to the Unified Extensible Firmware Interface Specification,
Version 2.4 Errata B, section 2.3.4, x64 Platforms, boot services. This way
the loaded image will be able to use EFI boot services without any issues.

If the idea is accepted I will prepare a grub_relocator32_efi relocator too.

Signed-off-by: Daniel Kiper 
---
 grub-core/Makefile.core.def  |1 +
 grub-core/lib/i386/relocator.c   |   53 +++
 grub-core/lib/i386/relocator64_efi.S |   77 ++
 grub-core/loader/multiboot.c |   29 +++--
 grub-core/loader/multiboot_mbi2.c|   19 +++--
 include/grub/i386/multiboot.h|   11 +
 include/grub/i386/relocator.h|   21 ++
 include/multiboot2.h |9 
 8 files changed, 213 insertions(+), 7 deletions(-)
 create mode 100644 grub-core/lib/i386/relocator64_efi.S

diff --git a/grub-core/Makefile.core.def b/grub-core/Makefile.core.def
index a6101de..d583549 100644
--- a/grub-core/Makefile.core.def
+++ b/grub-core/Makefile.core.def
@@ -1519,6 +1519,7 @@ module = {
   x86 = lib/i386/relocator_common_c.c;
   ieee1275 = lib/ieee1275/relocator.c;
   efi = lib/efi/relocator.c;
+  x86_64_efi = lib/i386/relocator64_efi.S;
   mips = lib/mips/relocator_asm.S;
   mips = lib/mips/relocator.c;
   powerpc = lib/powerpc/relocator_asm.S;
diff --git a/grub-core/lib/i386/relocator.c b/grub-core/lib/i386/relocator.c
index 71dd4f0..459027e 100644
--- a/grub-core/lib/i386/relocator.c
+++ b/grub-core/lib/i386/relocator.c
@@ -69,6 +69,19 @@ extern grub_uint64_t grub_relocator64_rsi;
 extern grub_addr_t grub_relocator64_cr3;
 extern struct grub_i386_idt grub_relocator16_idt;
 
+#ifdef GRUB_MACHINE_EFI
+#ifdef __x86_64__
+extern grub_uint8_t grub_relocator64_efi_start;
+extern grub_uint8_t grub_relocator64_efi_end;
+extern grub_uint64_t grub_relocator64_efi_rax;
+extern grub_uint64_t grub_relocator64_efi_rbx;
+extern grub_uint64_t grub_relocator64_efi_rcx;
+extern grub_uint64_t grub_relocator64_efi_rdx;
+extern grub_uint64_t grub_relocator64_efi_rip;
+extern grub_uint64_t grub_relocator64_efi_rsi;
+#endif
+#endif
+
 #define RELOCATOR_SIZEOF(x)(&grub_relocator##x##_end - 
&grub_relocator##x##_start)
 
 grub_err_t
@@ -214,3 +227,43 @@ grub_relocator64_boot (struct grub_relocator *rel,
   /* Not reached.  */
   return GRUB_ERR_NONE;
 }
+
+#ifdef GRUB_MACHINE_EFI
+#ifdef __x86_64__
+grub_err_t
+grub_relocator64_efi_boot (struct grub_relocator *rel,
+  struct grub_relocator64_efi_state state)
+{
+  grub_err_t err;
+  void *relst;
+  grub_relocator_chunk_t ch;
+
+  err = grub_relocator_alloc_chunk_align (rel, &ch, 0,
+ 0x4000 - RELOCATOR_SIZEOF 
(64_efi),
+ RELOCATOR_SIZEOF (64_efi), 16,
+ GRUB_RELOCATOR_PREFERENCE_NONE, 1);
+  if (err)
+return err;
+
+  grub_relocator64_efi_rax = state.rax;
+  grub_relocator64_efi_rbx = state.rbx;
+  grub_relocator64_efi_rcx = state.rcx;
+  grub_relocator64_efi_rdx = state.rdx;
+  grub_relocator64_efi_rip = state.rip;
+  grub_relocator64_efi_rsi = state.rsi;
+
+  grub_memmove (get_virtual_current_address (ch), &grub_relocator64_efi_start,
+   RELOCATOR_SIZEOF (64_efi));
+
+  err = grub_relocator_prepare_relocs (rel, get_physical_target_address (ch),
+  &relst, NULL);
+  if (err)
+return err;
+
+  ((void (*) (void)) relst) ();
+
+  /* Not reached.  */
+  return GRUB_ERR_NONE;
+}
+#endif
+#endif
diff --git a/grub-core/lib/i386/relocator64_efi.S 
b/grub-core/lib/i386/relocator64_efi.S
new file mode 100644
index 000..fcd1964
--- /dev/null
+++ b/grub-core/lib/i386/relocator64_efi.S
@@ -0,0 +1,77 @@
+/*
+ *  GRUB  --  GRand Unified Bootloader
+ *  Copyright (C) 2009,2010  Free Software Foundation, Inc.
+ *  Copyright (C) 2014,2015  Oracle Co.
+ *  Author: Daniel Kiper
+ *
+ *  GRUB is free software: you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation, either version 3 of the License, or
+ *  (at your option) any later version.
+ *
+ *  GRUB is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with GRUB.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "relocator_common.S"
+
+   .p2align 

[Xen-devel] [PATCH v2 5/6] multiboot2: Add support for relocatable images

2015-07-20 Thread Daniel Kiper
Signed-off-by: Daniel Kiper 
---
 grub-core/loader/i386/multiboot_mbi.c |6 ++--
 grub-core/loader/multiboot.c  |   12 +--
 grub-core/loader/multiboot_elfxx.c|   28 +++
 grub-core/loader/multiboot_mbi2.c |   63 +
 include/grub/multiboot.h  |4 ++-
 include/multiboot2.h  |   24 +
 6 files changed, 118 insertions(+), 19 deletions(-)

diff --git a/grub-core/loader/i386/multiboot_mbi.c 
b/grub-core/loader/i386/multiboot_mbi.c
index 956d0e3..abdb98b 100644
--- a/grub-core/loader/i386/multiboot_mbi.c
+++ b/grub-core/loader/i386/multiboot_mbi.c
@@ -72,7 +72,8 @@ load_kernel (grub_file_t file, const char *filename,
   grub_err_t err;
   if (grub_multiboot_quirks & GRUB_MULTIBOOT_QUIRK_BAD_KLUDGE)
 {
-  err = grub_multiboot_load_elf (file, filename, buffer);
+  err = grub_multiboot_load_elf (file, filename, buffer, 0, 0, 0, 0,
+GRUB_RELOCATOR_PREFERENCE_NONE, NULL, 0);
   if (err == GRUB_ERR_UNKNOWN_OS && (header->flags & 
MULTIBOOT_AOUT_KLUDGE))
grub_errno = err = GRUB_ERR_NONE;
 }
@@ -118,7 +119,8 @@ load_kernel (grub_file_t file, const char *filename,
   return GRUB_ERR_NONE;
 }
 
-  return grub_multiboot_load_elf (file, filename, buffer);
+  return grub_multiboot_load_elf (file, filename, buffer, 0, 0, 0, 0,
+ GRUB_RELOCATOR_PREFERENCE_NONE, NULL, 0);
 }
 
 static struct multiboot_header *
diff --git a/grub-core/loader/multiboot.c b/grub-core/loader/multiboot.c
index ca7154f..1b1f7a9 100644
--- a/grub-core/loader/multiboot.c
+++ b/grub-core/loader/multiboot.c
@@ -190,12 +190,18 @@ static grub_uint64_t highest_load;
 /* Load ELF32 or ELF64.  */
 grub_err_t
 grub_multiboot_load_elf (grub_file_t file, const char *filename,
-void *buffer)
+void *buffer, int relocatable, grub_uint32_t min_addr,
+grub_uint32_t max_addr, grub_size_t align, 
grub_uint32_t preference,
+grub_uint32_t *base_addr, int avoid_efi_boot_services)
 {
   if (grub_multiboot_is_elf32 (buffer))
-return grub_multiboot_load_elf32 (file, filename, buffer);
+return grub_multiboot_load_elf32 (file, filename, buffer, relocatable,
+ min_addr, max_addr, align, preference,
+ base_addr, avoid_efi_boot_services);
   else if (grub_multiboot_is_elf64 (buffer))
-return grub_multiboot_load_elf64 (file, filename, buffer);
+return grub_multiboot_load_elf64 (file, filename, buffer, relocatable,
+ min_addr, max_addr, align, preference,
+ base_addr, avoid_efi_boot_services);
 
   return grub_error (GRUB_ERR_UNKNOWN_OS, N_("invalid arch-dependent ELF 
magic"));
 }
diff --git a/grub-core/loader/multiboot_elfxx.c 
b/grub-core/loader/multiboot_elfxx.c
index 6a220bd..4fce685 100644
--- a/grub-core/loader/multiboot_elfxx.c
+++ b/grub-core/loader/multiboot_elfxx.c
@@ -51,7 +51,10 @@ CONCAT(grub_multiboot_is_elf, XX) (void *buffer)
 }
 
 static grub_err_t
-CONCAT(grub_multiboot_load_elf, XX) (grub_file_t file, const char *filename, 
void *buffer)
+CONCAT(grub_multiboot_load_elf, XX) (grub_file_t file, const char *filename,
+void *buffer, int relocatable, 
grub_uint32_t min_addr,
+grub_uint32_t max_addr, grub_size_t align, 
grub_uint32_t preference,
+grub_uint32_t *base_addr, int 
avoid_efi_boot_services)
 {
   Elf_Ehdr *ehdr = (Elf_Ehdr *) buffer;
   char *phdr_base;
@@ -89,19 +92,30 @@ CONCAT(grub_multiboot_load_elf, XX) (grub_file_t file, 
const char *filename, voi
  if (phdr(i)->p_paddr + phdr(i)->p_memsz > highest_load)
highest_load = phdr(i)->p_paddr + phdr(i)->p_memsz;
 
- grub_dprintf ("multiboot_loader", "segment %d: paddr=0x%lx, 
memsz=0x%lx, vaddr=0x%lx\n",
-   i, (long) phdr(i)->p_paddr, (long) phdr(i)->p_memsz, 
(long) phdr(i)->p_vaddr);
+ grub_dprintf ("multiboot_loader", "segment %d: paddr=0x%lx, 
memsz=0x%lx, vaddr=0x%lx,"
+   "align=0x%lx, relocatable=%d, 
avoid_efi_boot_services=%d\n", i,
+   (long) phdr(i)->p_paddr, (long) phdr(i)->p_memsz, 
(long) phdr(i)->p_vaddr,
+   (long) align, relocatable, avoid_efi_boot_services);
 
  {
grub_relocator_chunk_t ch;
-   err = grub_relocator_alloc_chunk_addr (grub_multiboot_relocator, 
-  &ch, phdr(i)->p_paddr,
-  phdr(i)->p_memsz);
+
+   if (relocatable)
+ err = grub_relocator_alloc_chunk_align (grub_multiboot_relocator, 
&ch,
+ min_addr, max_addr

[Xen-devel] [PATCH v2 2/6] relocator: Do not use memory region if its starta is smaller than size

2015-07-20 Thread Daniel Kiper
malloc_in_range() should not use a memory region if its starta is smaller
than size. Otherwise target wraps around and points to a region which is
usually not RAM, e.g.:

loader/multiboot.c:93: segment 0: paddr=0x80, memsz=0x3f80, 
vaddr=0x80
lib/relocator.c:1241: min_addr = 0x0, max_addr = 0x, target = 
0x80
lib/relocator.c:434: trying to allocate in 0x80-0x aligned 
0x1 size 0x3f80
lib/relocator.c:434: trying to allocate in 0x0-0x80 aligned 0x1 size 
0x3f80
lib/relocator.c:434: trying to allocate in 0x0-0x aligned 0x1 
size 0x3f80
lib/relocator.c:1188: allocated: 0xc07f+0x3f80
lib/relocator.c:1277: allocated 0xc07f/0x80
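
For illustration only, the wrap-around described above can be reproduced
with plain unsigned arithmetic. This is a minimal sketch, not GRUB code:
the helper names are hypothetical and a 32-bit address type is assumed.

```c
#include <stdint.h>

/* Hypothetical helper: the value "starta - size" evaluates to when no
 * check is made. With unsigned types this silently wraps modulo 2^32. */
static uint32_t unchecked_target(uint32_t starta, uint32_t size)
{
    return starta - size;
}

/* Hypothetical helper mirroring the condition added by the patch:
 * only place a block below starta if starta >= size. */
static int can_place_below(uint32_t starta, uint32_t size)
{
    return starta >= size;
}
```

With starta = 0x1000 and size = 0x2000 the unchecked subtraction yields
0xfffff000, which is far above the intended range; the guard rejects that
case before the subtraction is performed.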

Signed-off-by: Daniel Kiper 
---
 grub-core/lib/relocator.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/grub-core/lib/relocator.c b/grub-core/lib/relocator.c
index f759c7f..4eee0c5 100644
--- a/grub-core/lib/relocator.c
+++ b/grub-core/lib/relocator.c
@@ -748,7 +748,7 @@ malloc_in_range (struct grub_relocator *rel,
  /* Found an usable address.  */
  goto found;
  }
-   if (isinsidebefore && !isinsideafter && !from_low_priv)
+   if (isinsidebefore && !isinsideafter && !from_low_priv && starta >= 
size)
  {
target = starta - size;
if (target > end - size)
-- 
1.7.10.4




[Xen-devel] [PATCH v2 1/6] gitignore: Ignore *.orig, *.rej and *.swp files

2015-07-20 Thread Daniel Kiper
Signed-off-by: Daniel Kiper 
---
 .gitignore |3 +++
 1 file changed, 3 insertions(+)

diff --git a/.gitignore b/.gitignore
index 18ab8e8..6d25d39 100644
--- a/.gitignore
+++ b/.gitignore
@@ -147,6 +147,7 @@ mod-*.c
 missing
 netboot_test
 *.o
+*.orig
 *.a
 ohci_test
 partmap_test
@@ -160,9 +161,11 @@ po/stamp-po
 printf_test
 priority_queue_unit_test
 pseries_test
+*.rej
 stamp-h
 stamp-h1
 stamp-h.in
+*.swp
 symlist.c
 symlist.h
 trigtables.c
-- 
1.7.10.4




Re: [Xen-devel] [v10][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-20 Thread Chen, Tiejun
It looks like only a small change is needed, so I am also pasting the new 
version inline here to try to get your Ack:



hvmloader/e820: construct guest e820 table

Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.
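
As a rough illustration of the lowmem/highmem adjustment step listed above,
here is a minimal standalone sketch. It is not the hvmloader code: the
struct layout, MAX_ENTRIES and adjust_lowmem() are assumptions made for the
example, and only the trimming logic follows the description.

```c
#include <assert.h>
#include <stdint.h>

#define E820_RAM    1
#define MAX_ENTRIES 8                       /* assumed capacity */
#define FOUR_GB     ((uint64_t)1 << 32)

struct entry { uint64_t addr, size; uint32_t type; };
struct map   { unsigned int nr; struct entry e[MAX_ENTRIES]; };

/* Trim the RAM entry that contains low_mem_end and account for the
 * cut-off part above 4 GiB. Returns the number of bytes moved. */
static uint64_t adjust_lowmem(struct map *m, uint64_t low_mem_end)
{
    uint64_t moved = 0;
    unsigned int i;

    for (i = 0; i < m->nr; i++) {
        uint64_t start = m->e[i].addr;
        uint64_t end = start + m->e[i].size;

        if (m->e[i].type == E820_RAM &&
            low_mem_end > start && low_mem_end < end) {
            moved = end - low_mem_end;
            m->e[i].size = low_mem_end - start;
            break;
        }
    }
    if (!moved)
        return 0;

    /* Grow an existing highmem entry if there is one. */
    for (i = 0; i < m->nr; i++) {
        if (m->e[i].type == E820_RAM && m->e[i].addr == FOUR_GB) {
            m->e[i].size += moved;
            return moved;
        }
    }

    /* Otherwise create a highmem entry starting at 4 GiB. */
    assert(m->nr < MAX_ENTRIES);
    m->e[m->nr].addr = FOUR_GB;
    m->e[m->nr].size = moved;
    m->e[m->nr].type = E820_RAM;
    m->nr++;
    return moved;
}
```

For example, a single RAM entry covering 1 MiB to 256 MiB with low_mem_end
at 224 MiB is trimmed to end at 224 MiB, and a new 32 MiB RAM entry appears
at 4 GiB.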

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Reviewed-by: George Dunlap 
Signed-off-by: Tiejun Chen 
---
 tools/firmware/hvmloader/e820.c | 109 
+++-

 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c 
b/tools/firmware/hvmloader/e820.c

index 7a414ab..a6cacdf 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
  unsigned int lowmem_reserved_base,
  unsigned int bios_image_base)
 {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+uint32_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT;
+uint32_t add_high_mem = 0;
+uint64_t high_mem_end = (uint64_t)hvm_info->high_mem_pgend << 
PAGE_SHIFT;

+uint64_t map_start, map_size, map_end;

 if ( !lowmem_reserved_base )
 lowmem_reserved_base = 0xA;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
 e820[nr].type = E820_RESERVED;
 nr++;

-/* Low RAM goes here. Reserve space for special pages. */
-BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
-e820[nr].addr = 0x10;
-e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - 
e820[nr].addr;

-e820[nr].type = E820_RAM;
-nr++;
-
 /*
  * Explicitly reserve space for special pages.
  * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820,
 nr++;
 }

+/* Low RAM goes here. Reserve space for special pages. */
+BUG_ON(low_mem_end < (2u << 20));

-if ( hvm_info->high_mem_pgend )
+/*
+ * Construct E820 table according to recorded memory map.
+ *
+ * The memory map created by toolstack may include,
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ *
+ * #2. Reserved regions if they exist
+ *
+ * #3. High memory region if it exists
+ *
+ * Note we just have one low memory entry and one high memory entry if
+ * exists.
+ *
+ * But we may have relocated RAM to allocate sufficient MMIO previously
+ * so low_mem_pgend would be changed over there. And here memory_map[]
+ * records the original low/high memory, so if low_mem_end is less than
+ * the original we need to revise low/high memory range firstly.
+ */
+for ( i = 0; i < memory_map.nr_map; i++ )
 {
-e820[nr].addr = ((uint64_t)1 << 32);
-e820[nr].size =
-((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - 
e820[nr].addr;

-e820[nr].type = E820_RAM;
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+/* If we need to adjust lowmem. */
+if ( memory_map.map[i].type == E820_RAM &&
+ low_mem_end > map_start && low_mem_end < map_end )
+{
+add_high_mem = map_end - low_mem_end;
+memory_map.map[i].size = low_mem_end - map_start;
+break;
+}
+}
+
+/* If we need to adjust highmem. */
+if ( add_high_mem )
+{
+/* Modify the existing highmem region if it exists. */
+for ( i = 0; i < memory_map.nr_map; i++ )
+{
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+if ( memory_map.map[i].type == E820_RAM &&
+ map_start == ((uint64_t)1 << 32))
+{
+memory_map.map[i].size += add_high_mem;
+break;
+}
+}
+
+/* If there was no highmem region, just create one. */
+if ( i == memory_map.nr_map )
+{
+memory_map.map[i].addr = ((uint64_t)1 << 32);
+memory_map.map[i].size = add_high_mem;
+memory_map.map[i].type = E820_RAM;
+memory_map.nr_map++;
+}
+
+/* A sanity check if high memory is broken. */
+BUG_ON( high_mem_end !=
+memory_map.map[i].addr + memory_map.map[i].size);
+}
+
+/* Now fill e820.

Re: [Xen-devel] [v10][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-20 Thread Chen, Tiejun

On 2015/7/20 22:16, Jan Beulich wrote:

On 20.07.15 at 16:10,  wrote:

Hmm... although I suppose that doesn't catch the possibility of a memory
range crossing the 4G boundary.


I think we can safely ignore that - both real and virtual hardware have
special regions right below 4Gb, so neither RAM nor RMRRs can be
reasonably placed there.



Okay, I have regenerated this patch inline, and I hope it is good to be 
acked here:


hvmloader/pci: Try to avoid placing BARs in RMRRs

Try to avoid placing PCI BARs over RMRRs:

- If mmio_hole_size is not specified, and the existing MMIO range has
  RMRRs in it, and there is space to expand the hole in lowmem without
  moving more memory, then make the MMIO hole as large as possible.

- When placing RMRRs, find the next RMRR higher than the current base
  in the lowmem mmio hole.  If it overlaps, skip ahead of it and find
  the next one.

This certainly won't work in all cases, but it should work in a
significant number of cases.  Additionally, users should be able to
work around problems by setting mmio_hole_size larger in the guest
config.
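
The skip-ahead idea can be sketched in isolation as follows. This is only
an illustration under stated assumptions, not the patch itself: struct
range, overlaps() and place_bar() are hypothetical names, bar_sz is assumed
to be a power of two, and the sketch rescans all reserved ranges instead of
tracking a "next RMRR" index as the patch does.

```c
#include <stdint.h>

struct range { uint64_t addr, size; };

/* Two half-open ranges overlap if each starts before the other ends. */
static int overlaps(uint64_t s1, uint64_t sz1, uint64_t s2, uint64_t sz2)
{
    return s1 < s2 + sz2 && s2 < s1 + sz1;
}

/* Return the lowest bar_sz-aligned base >= start at which a BAR of
 * bar_sz bytes overlaps none of the reserved ranges. */
static uint64_t place_bar(uint64_t start, uint64_t bar_sz,
                          const struct range *rsv, unsigned int nr)
{
    uint64_t base = (start + bar_sz - 1) & ~(bar_sz - 1);
    unsigned int i = 0;

    while (i < nr) {
        if (overlaps(base, bar_sz, rsv[i].addr, rsv[i].size)) {
            /* Skip past the conflicting range, realign, and rescan. */
            base = (rsv[i].addr + rsv[i].size + bar_sz - 1) & ~(bar_sz - 1);
            i = 0;
        } else {
            i++;
        }
    }
    return base;
}
```

Each conflict moves base strictly upwards, so the loop ends once base has
passed every reserved range. A real implementation would additionally
check that the result still fits inside the MMIO hole.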

Signed-off-by: George Dunlap 
Signed-off-by: Tiejun Chen 
---
 tools/firmware/hvmloader/pci.c | 65 
++

 1 file changed, 65 insertions(+)

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 5ff87a7..74fc080 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -38,6 +38,46 @@ uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;

+/* Check if the specified range conflicts with any reserved device memory. */
+static bool check_overlap_all(uint64_t start, uint64_t size)
+{
+unsigned int i;
+
+for ( i = 0; i < memory_map.nr_map; i++ )
+{
+if ( memory_map.map[i].type == E820_RESERVED &&
+ check_overlap(start, size,
+   memory_map.map[i].addr,
+   memory_map.map[i].size) )
+return true;
+}
+
+return false;
+}
+
+/* Find the lowest RMRR higher than base. */
+static int find_next_rmrr(uint32_t base)
+{
+unsigned int i;
+int next_rmrr = -1;
+uint64_t end, min_end = (1ull << 32);
+
+for ( i = 0; i < memory_map.nr_map ; i++ )
+{
+end = memory_map.map[i].addr + memory_map.map[i].size;
+
+if ( memory_map.map[i].type == E820_RESERVED &&
+ end > base &&
+ end <= min_end )
+{
+next_rmrr = i;
+min_end = end;
+}
+}
+
+return next_rmrr;
+}
+
 void pci_setup(void)
 {
 uint8_t is_64bar, using_64bar, bar64_relocate = 0;
@@ -46,6 +86,7 @@ void pci_setup(void)
 uint32_t vga_devfn = 256;
 uint16_t class, vendor_id, device_id;
 unsigned int bar, pin, link, isa_irq;
+int next_rmrr;

 /* Resources assignable to PCI devices via BARs. */
 struct resource {
@@ -299,6 +340,15 @@ void pci_setup(void)
 || (((pci_mem_start << 1) >> PAGE_SHIFT)
 >= hvm_info->low_mem_pgend)) )
 pci_mem_start <<= 1;
+
+/*
+ * Try to accommodate RMRRs in our MMIO region on a best-effort basis.
+ * If we have RMRRs in the range, then make pci_mem_start just after
+ * hvm_info->low_mem_pgend.
+ */
+if ( pci_mem_start > (hvm_info->low_mem_pgend << PAGE_SHIFT) &&
+ check_overlap_all(pci_mem_start, pci_mem_end-pci_mem_start) )
+pci_mem_start = hvm_info->low_mem_pgend << PAGE_SHIFT;
 }

 if ( mmio_total > (pci_mem_end - pci_mem_start) )
@@ -352,6 +402,8 @@ void pci_setup(void)
 io_resource.base = 0xc000;
 io_resource.max = 0x1;

+next_rmrr = find_next_rmrr(pci_mem_start);
+
 /* Assign iomem and ioport resources in descending order of size. */
 for ( i = 0; i < nr_bars; i++ )
 {
@@ -407,6 +459,19 @@ void pci_setup(void)
 }

 base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
+
+/* If we're using mem_resource, check for RMRR conflicts. */
+while ( resource == &mem_resource &&
+next_rmrr >= 0 &&
+check_overlap(base, bar_sz,
+  memory_map.map[next_rmrr].addr,
+  memory_map.map[next_rmrr].size) )
+{
+base = memory_map.map[next_rmrr].addr + memory_map.map[next_rmrr].size;
+base = (base + bar_sz - 1) & ~(bar_sz - 1);
+next_rmrr = find_next_rmrr(base);
+}
+
 bar_data |= (uint32_t)base;
 bar_data_upper = (uint32_t)(base >> 32);
 base += bar_sz;
--
1.9.1

Thanks
Tiejun
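
The best-effort placement described in the patch above (align the candidate base, check it against the next reserved range, skip past the range on a conflict and look again) can be sketched as a standalone C fragment. This is a minimal sketch under stated assumptions, not the hvmloader code: struct reserved_range, ranges_overlap(), next_reserved() and place_bar() are hypothetical names standing in for the memory_map entries and helpers the patch uses, and the 4 GiB upper bound mirrors the lowmem limit in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved (RMRR) entry of the memory map. */
struct reserved_range {
    uint64_t addr;
    uint64_t size;
};

/* True if [start, start + size) intersects [r_start, r_start + r_size). */
static bool ranges_overlap(uint64_t start, uint64_t size,
                           uint64_t r_start, uint64_t r_size)
{
    return start < r_start + r_size && r_start < start + size;
}

/* Index of the reserved range with the lowest end above base, or -1. */
static int next_reserved(const struct reserved_range *map, size_t nr,
                         uint64_t base)
{
    int next = -1;
    uint64_t min_end = 1ull << 32;   /* only ranges below 4 GiB matter */

    for ( size_t i = 0; i < nr; i++ )
    {
        uint64_t end = map[i].addr + map[i].size;

        if ( end > base && end < min_end )
        {
            next = (int)i;
            min_end = end;
        }
    }

    return next;
}

/* Place a BAR of bar_sz bytes (a power of two) at or above base. */
static uint64_t place_bar(const struct reserved_range *map, size_t nr,
                          uint64_t base, uint64_t bar_sz)
{
    int next;

    assert(bar_sz != 0 && (bar_sz & (bar_sz - 1)) == 0);

    /* Natural alignment: round base up to a multiple of bar_sz. */
    base = (base + bar_sz - 1) & ~(bar_sz - 1);
    next = next_reserved(map, nr, base);

    /* On a conflict, skip past the reserved range, realign and look again. */
    while ( next >= 0 &&
            ranges_overlap(base, bar_sz, map[next].addr, map[next].size) )
    {
        base = map[next].addr + map[next].size;
        base = (base + bar_sz - 1) & ~(bar_sz - 1);
        next = next_reserved(map, nr, base);
    }

    return base;
}
```

As in the patch, only the single "next" reserved range is checked per iteration, so this remains a best-effort scheme: it does not bound the result against the end of the MMIO hole, which the caller would still have to do.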



[Xen-devel] [PATCH v2 22/23] x86: make Xen early boot code relocatable

2015-07-20 Thread Daniel Kiper
Every multiboot protocol (regardless of version) compatible image must
specify its load address (in the ELF or multiboot header). A multiboot protocol
compatible loader has to load the image at the specified address. However, there
is no guarantee that the requested memory region (in the case of Xen it starts
at 1 MiB and ends at 17 MiB) where the image should be loaded initially is RAM
and is free (legacy BIOS platforms are merciful to Xen, but I found at
least one EFI platform on which the Xen load address conflicts with EFI boot
services; it is a Dell PowerEdge R820 with the latest firmware). To cope with
that problem we must make the Xen early boot code relocatable. This patch does
that. However, it does not add the multiboot2 protocol interface, which is done
in the next patch.

This patch changes the following things:
  - the default load address is changed from 1 MiB to 2 MiB; I did that because
    the initial page tables use 2 MiB huge pages and this way the required
    updates for them are quite easy; it means that e.g. we avoid special
    cases for the beginning and end of the required memory region if it lives
    at an address not aligned to 2 MiB,
  - the %ebp register is used as storage for the Xen image base address; this
    way we can get this value very quickly if it is needed; however, the %ebp
    register is not used directly to access a given memory region,
  - the %fs register is filled with a segment descriptor which describes the
    memory region with the Xen image (it could be relocated or not); it is used
    to access some of the Xen data in early boot code; potentially we can use
    the above mentioned segment descriptor to access data using %ds:%esi and/or
    %es:%esi (e.g. movs*); however, I think that it could unnecessarily
    obfuscate the code (e.g. we need at least two operations to reload a given
    segment descriptor) and the current solution looks quite optimal.

Signed-off-by: Daniel Kiper 
---
 xen/arch/x86/Makefile  |6 +-
 xen/arch/x86/Rules.mk  |4 +
 xen/arch/x86/boot/head.S   |  165 ++--
 xen/arch/x86/boot/trampoline.S |   11 ++-
 xen/arch/x86/boot/wakeup.S |6 +-
 xen/arch/x86/boot/x86_64.S |   34 -
 xen/arch/x86/setup.c   |   33 
 xen/arch/x86/x86_64/mm.c   |2 +-
 xen/arch/x86/xen.lds.S |2 +-
 xen/include/asm-x86/config.h   |3 +
 xen/include/asm-x86/page.h |2 +-
 11 files changed, 182 insertions(+), 86 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 82c5a93..93069a8 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -72,8 +72,10 @@ efi-$(x86_64) := $(shell if [ ! -r $(BASEDIR)/include/xen/compile.h -o \
  echo '$(TARGET).efi'; fi)
 
 $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
-   ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x10 \
-   `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
+#  THIS IS UGLY HACK! PLEASE DO NOT COMPLAIN. I WILL FIX IT IN NEXT RELEASE.
+   ./boot/mkelf32 $(TARGET)-syms $(TARGET) $(XEN_IMG_PHYS_START) 0x82d08100
+#  ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x10 \
+#  `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
 
 
 ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in.o $(ALL_OBJS)
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 4a04a8a..7ccb8a0 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -15,6 +15,10 @@ HAS_GDBSX := y
 HAS_PDX := y
 xenoprof := y
 
+XEN_IMG_PHYS_START = 0x20
+
+CFLAGS += -DXEN_IMG_PHYS_START=$(XEN_IMG_PHYS_START)
+
 CFLAGS += -I$(BASEDIR)/include 
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 3f1054d..d484f68 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -12,13 +12,15 @@
 .text
 .code32
 
-#define sym_phys(sym) ((sym) - __XEN_VIRT_START)
+#define sym_phys(sym) ((sym) - __XEN_VIRT_START + XEN_IMG_PHYS_START - XEN_IMG_OFFSET)
+#define sym_offset(sym)   ((sym) - __XEN_VIRT_START)
 
 #define BOOT_CS320x0008
 #define BOOT_CS640x0010
 #define BOOT_DS  0x0018
 #define BOOT_PSEUDORM_CS 0x0020
 #define BOOT_PSEUDORM_DS 0x0028
+#define BOOT_FS  0x0030
 
 #define MB2_HT(name)  (MULTIBOOT2_HEADER_TAG_##name)
 #define MB2_TT(name)  (MULTIBOOT2_TAG_TYPE_##name)
@@ -105,12 +107,13 @@ multiboot1_header_end:
 
 .word   0
 gdt_boot_descr:
-.word   6*8-1
-.long   sym_phys(trampoline_gdt)
+.word   7*8-1
+gdt_boot_descr_addr:
+.long   sym_offset(trampoline_gdt)
 .long   0 /* Needed for 64-bit lgdt */
 
 cs32_switch_addr:
-.long   sym_phys(cs32_switch)
+.long   sym_offset(cs32_switch)
 .word   BOOT_CS32
 
 .Lbad_cpu_msg: .asciz "ERR: Not a 64-bit CPU!"
@@ -120,13 +123,13 @@ cs32_switch_addr

[Xen-devel] [PATCH v2 21/23] x86/boot: implement early command line parser in C

2015-07-20 Thread Daniel Kiper
The current early command line parser implementation in assembler
is very difficult to convert to relocatable code using segment
registers. Doing so requires a lot of changes in very weird and
fragile code. So, reimplement this functionality in C. This
way the code will be relocatable out of the box and much easier
to maintain.
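
To illustrate why C is the easier home for this logic, the prefix rules of the old .Latoi routine (decimal by default, a leading '0' selects octal, '0x' selects hex, and the caller's string pointer is advanced past the parsed digits) could look roughly like this. It is a minimal sketch, not the code from cmdline.c; digit_value() and parse_uint() are hypothetical names, and the sketch does not try to reproduce every corner case of the assembler version.

```c
#include <assert.h>
#include <stddef.h>

/* Value of an ASCII digit in bases up to 16, or -1 if it is not a digit. */
static int digit_value(char c)
{
    if ( c >= '0' && c <= '9' )
        return c - '0';
    if ( c >= 'A' && c <= 'F' )
        return c - 'A' + 10;
    if ( c >= 'a' && c <= 'f' )
        return c - 'a' + 10;
    return -1;
}

/* Parse an unsigned number and advance *s past the digits consumed. */
static unsigned int parse_uint(const char **s)
{
    const char *p;
    unsigned int base = 10, res = 0;
    int d;

    assert(s != NULL && *s != NULL);
    p = *s;

    if ( *p == '0' )
    {
        base = 8;               /* prefix '0' => octal */
        ++p;
        if ( *p == 'x' )
        {
            base = 16;          /* prefix '0x' => hex */
            ++p;
        }
    }

    /* Accumulate digits until a character is invalid for this base. */
    while ( (d = digit_value(*p)) >= 0 && (unsigned int)d < base )
    {
        res = res * base + (unsigned int)d;
        ++p;
    }

    *s = p;
    return res;
}
```

Because the function only touches its argument and locals, it has no absolute address references that would need fixing up when the image is relocated.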

Suggested-by: Andrew Cooper 
Signed-off-by: Daniel Kiper 
---
 .gitignore |5 +-
 xen/arch/x86/Makefile  |2 +-
 xen/arch/x86/boot/Makefile |7 +-
 xen/arch/x86/boot/build32.mk   |2 +
 xen/arch/x86/boot/cmdline.S|  367 -
 xen/arch/x86/boot/cmdline.c|  396 
 xen/arch/x86/boot/edd.S|3 -
 xen/arch/x86/boot/head.S   |   17 ++
 xen/arch/x86/boot/trampoline.S |   14 ++
 xen/arch/x86/boot/video.S  |6 -
 10 files changed, 439 insertions(+), 380 deletions(-)
 delete mode 100644 xen/arch/x86/boot/cmdline.S
 create mode 100644 xen/arch/x86/boot/cmdline.c

diff --git a/.gitignore b/.gitignore
index f6ddb00..e0618b9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -223,9 +223,10 @@ xen/arch/arm/xen.lds
 xen/arch/x86/asm-offsets.s
 xen/arch/x86/boot/mkelf32
 xen/arch/x86/xen.lds
+xen/arch/x86/boot/cmdline.S
 xen/arch/x86/boot/reloc.S
-xen/arch/x86/boot/reloc.bin
-xen/arch/x86/boot/reloc.lnk
+xen/arch/x86/boot/*.bin
+xen/arch/x86/boot/*.lnk
 xen/arch/x86/efi.lds
 xen/arch/x86/efi/check.efi
 xen/arch/x86/efi/disabled
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 0335445..82c5a93 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -170,4 +170,4 @@ clean::
rm -f asm-offsets.s *.lds boot/*.o boot/*~ boot/core boot/mkelf32
rm -f $(BASEDIR)/.xen-syms.[0-9]* boot/.*.d
	rm -f $(BASEDIR)/.xen.efi.[0-9]* efi/*.o efi/.*.d efi/*.efi efi/disabled efi/mkreloc
-   rm -f boot/reloc.S boot/reloc.lnk boot/reloc.bin
+   rm -f boot/cmdline.S boot/reloc.S boot/*.lnk boot/*.bin
diff --git a/xen/arch/x86/boot/Makefile b/xen/arch/x86/boot/Makefile
index 06893d8..d73cc76 100644
--- a/xen/arch/x86/boot/Makefile
+++ b/xen/arch/x86/boot/Makefile
@@ -1,9 +1,14 @@
 obj-bin-y += head.o
 
+CMDLINE_DEPS = video.h
+
 RELOC_DEPS = $(BASEDIR)/include/asm-x86/config.h $(BASEDIR)/include/xen/multiboot.h \
 $(BASEDIR)/include/xen/multiboot2.h
 
-head.o: reloc.S
+head.o: cmdline.S reloc.S
+
+cmdline.S: cmdline.c $(CMDLINE_DEPS)
+   $(MAKE) -f build32.mk $@ CMDLINE_DEPS="$(CMDLINE_DEPS)"
 
 reloc.S: reloc.c $(RELOC_DEPS)
$(MAKE) -f build32.mk $@ RELOC_DEPS="$(RELOC_DEPS)"
diff --git a/xen/arch/x86/boot/build32.mk b/xen/arch/x86/boot/build32.mk
index c83effe..d681643 100644
--- a/xen/arch/x86/boot/build32.mk
+++ b/xen/arch/x86/boot/build32.mk
@@ -30,6 +30,8 @@ CFLAGS := $(filter-out -flto,$(CFLAGS))
esac; \
done
 
+cmdline.o: cmdline.c $(CMDLINE_DEPS)
+
 reloc.o: reloc.c $(RELOC_DEPS)
 
 .PRECIOUS: %.bin %.lnk
diff --git a/xen/arch/x86/boot/cmdline.S b/xen/arch/x86/boot/cmdline.S
deleted file mode 100644
index 00687eb..000
--- a/xen/arch/x86/boot/cmdline.S
+++ /dev/null
@@ -1,367 +0,0 @@
-/**
- * cmdline.S
- *
- * Early command-line parsing.
- */
-
-.code32
-
-#include "video.h"
-
-# NB. String pointer on stack is modified to point past parsed digits.
-.Latoi:
-push%ebx
-push%ecx
-push%edx
-push%esi
-xor %ebx,%ebx   /* %ebx = accumulator */
-mov $10,%ecx/* %ecx = base (default base 10) */
-mov 16+4(%esp),%esi /* %esi = pointer into ascii string. */
-lodsb
-cmpb$'0',%al
-jne 2f
-mov $8,%ecx /* Prefix '0' => octal (base 8) */
-lodsb
-cmpb$'x',%al
-jne 2f
-mov $16,%ecx/* Prefix '0x' => hex (base 16) */
-1:  lodsb
-2:  sub $'0',%al
-jb  4f
-cmp $9,%al
-jbe 3f
-sub $'A'-'0'-10,%al
-jb  4f
-cmp $15,%al
-jbe 3f
-sub $'a'-'A',%al
-jb  4f
-3:  cmp %cl,%al
-jae 4f
-movzbl  %al,%eax
-xchg%eax,%ebx
-mul %ecx
-xchg%eax,%ebx
-add %eax,%ebx
-jmp 1b
-4:  mov %ebx,%eax
-dec %esi
-mov %esi,16+4(%esp)
-pop %esi
-pop %edx
-pop %ecx
-pop %ebx
-ret
-
-.Lstrstr:
-push%ecx
-push%edx
-push%esi
-push%edi
-xor %eax,%eax
-xor %ecx,%ecx
-not %ecx
-mov 16+4(%esp),%esi
-mov 16+8(%esp),%edi
-repne   scasb
-not %ecx
-dec %ecx
-mov %ecx,%edx
-1:  mov 16+8(%esp),%edi
-mov  

[Xen-devel] [PATCH v2 14/23] efi: split out efi_find_gop_mode()

2015-07-20 Thread Daniel Kiper
..which finds suitable GOP mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
---
 xen/common/efi/boot.c |   94 -
 1 file changed, 54 insertions(+), 40 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 6fad230..8d16470 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -665,6 +665,58 @@ static EFI_GRAPHICS_OUTPUT_PROTOCOL __init *efi_get_gop(void)
 return gop;
 }
 
+static UINTN __init efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop,
+  UINTN cols, UINTN rows, UINTN depth)
+{
+EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *mode_info;
+EFI_STATUS status;
+UINTN gop_mode = ~0, info_size, size;
+unsigned int i;
+
+if ( !gop )
+return gop_mode;
+
+for ( i = size = 0; i < gop->Mode->MaxMode; ++i )
+{
+unsigned int bpp = 0;
+
+status = gop->QueryMode(gop, i, &info_size, &mode_info);
+if ( EFI_ERROR(status) )
+continue;
+switch ( mode_info->PixelFormat )
+{
+case PixelBitMask:
+bpp = hweight32(mode_info->PixelInformation.RedMask |
+mode_info->PixelInformation.GreenMask |
+mode_info->PixelInformation.BlueMask);
+break;
+case PixelRedGreenBlueReserved8BitPerColor:
+case PixelBlueGreenRedReserved8BitPerColor:
+bpp = 24;
+break;
+default:
+continue;
+}
+if ( cols == mode_info->HorizontalResolution &&
+ rows == mode_info->VerticalResolution &&
+ (!depth || bpp == depth) )
+{
+gop_mode = i;
+break;
+}
+if ( !cols && !rows &&
+ mode_info->HorizontalResolution *
+ mode_info->VerticalResolution > size )
+{
+size = mode_info->HorizontalResolution *
+   mode_info->VerticalResolution;
+gop_mode = i;
+}
+}
+
+return gop_mode;
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -978,46 +1030,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
 dir_handle->Close(dir_handle);
 
-if ( gop && !base_video )
-{
-for ( i = size = 0; i < gop->Mode->MaxMode; ++i )
-{
-unsigned int bpp = 0;
-
-status = gop->QueryMode(gop, i, &info_size, &mode_info);
-if ( EFI_ERROR(status) )
-continue;
-switch ( mode_info->PixelFormat )
-{
-case PixelBitMask:
-bpp = hweight32(mode_info->PixelInformation.RedMask |
-mode_info->PixelInformation.GreenMask |
-mode_info->PixelInformation.BlueMask);
-break;
-case PixelRedGreenBlueReserved8BitPerColor:
-case PixelBlueGreenRedReserved8BitPerColor:
-bpp = 24;
-break;
-default:
-continue;
-}
-if ( cols == mode_info->HorizontalResolution &&
- rows == mode_info->VerticalResolution &&
- (!depth || bpp == depth) )
-{
-gop_mode = i;
-break;
-}
-if ( !cols && !rows &&
- mode_info->HorizontalResolution *
- mode_info->VerticalResolution > size )
-{
-size = mode_info->HorizontalResolution *
-   mode_info->VerticalResolution;
-gop_mode = i;
-}
-}
-}
+if ( !base_video )
+gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
 }
 
 efi_arch_edd();
-- 
1.7.10.4




[Xen-devel] [PATCH v2 18/23] efi: split out efi_exit_boot()

2015-07-20 Thread Daniel Kiper
..which gets memory map and calls ExitBootServices(). We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
---
 xen/common/efi/boot.c |   92 +++--
 1 file changed, 50 insertions(+), 42 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 04b9c7e..bf2f198 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -879,6 +879,53 @@ static void __init efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop
 efi_arch_video_init(gop, info_size, mode_info);
 }
 
+static void __init efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
+{
+EFI_STATUS status;
+UINTN info_size = 0, map_key;
+bool_t retry;
+
+efi_bs->GetMemoryMap(&info_size, NULL, &map_key,
+ &efi_mdesc_size, &mdesc_ver);
+info_size += 8 * efi_mdesc_size;
+efi_memmap = efi_arch_allocate_mmap_buffer(info_size);
+if ( !efi_memmap )
+blexit(L"Unable to allocate memory for EFI memory map");
+
+for ( retry = 0; ; retry = 1 )
+{
+efi_memmap_size = info_size;
+status = SystemTable->BootServices->GetMemoryMap(&efi_memmap_size,
+ efi_memmap, &map_key,
+ &efi_mdesc_size,
+ &mdesc_ver);
+if ( EFI_ERROR(status) )
+PrintErrMesg(L"Cannot obtain memory map", status);
+
+efi_arch_process_memory_map(SystemTable, efi_memmap, efi_memmap_size,
+efi_mdesc_size, mdesc_ver);
+
+efi_arch_pre_exit_boot();
+
+status = SystemTable->BootServices->ExitBootServices(ImageHandle,
+ map_key);
+efi_bs = NULL;
+if ( status != EFI_INVALID_PARAMETER || retry )
+break;
+}
+
+if ( EFI_ERROR(status) )
+PrintErrMesg(L"Cannot exit boot services", status);
+
+/* Adjust pointers into EFI. */
+efi_ct = (void *)efi_ct + DIRECTMAP_VIRT_START;
+#ifdef USE_SET_VIRTUAL_ADDRESS_MAP
+efi_rs = (void *)efi_rs + DIRECTMAP_VIRT_START;
+#endif
+efi_memmap = (void *)efi_memmap + DIRECTMAP_VIRT_START;
+efi_fw_vendor = (void *)efi_fw_vendor + DIRECTMAP_VIRT_START;
+}
+
 static int __init __maybe_unused set_color(u32 mask, int bpp, u8 *pos, u8 *sz)
 {
if ( bpp < 0 )
@@ -903,11 +950,11 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 EFI_STATUS status;
 unsigned int i, argc;
 CHAR16 **argv, *file_name, *cfg_file_name = NULL, *options = NULL;
-UINTN map_key, info_size, gop_mode = ~0;
+UINTN gop_mode = ~0;
 EFI_SHIM_LOCK_PROTOCOL *shim_lock;
 EFI_GRAPHICS_OUTPUT_PROTOCOL *gop = NULL;
 union string section = { NULL }, name;
-bool_t base_video = 0, retry;
+bool_t base_video = 0;
 char *option_str;
 bool_t use_cfg_file;
 
@@ -1125,46 +1172,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
 efi_set_gop_mode(gop, gop_mode);
 
-info_size = 0;
-efi_bs->GetMemoryMap(&info_size, NULL, &map_key,
- &efi_mdesc_size, &mdesc_ver);
-info_size += 8 * efi_mdesc_size;
-efi_memmap = efi_arch_allocate_mmap_buffer(info_size);
-if ( !efi_memmap )
-blexit(L"Unable to allocate memory for EFI memory map");
-
-for ( retry = 0; ; retry = 1 )
-{
-efi_memmap_size = info_size;
-status = SystemTable->BootServices->GetMemoryMap(&efi_memmap_size,
- efi_memmap, &map_key,
- &efi_mdesc_size,
- &mdesc_ver);
-if ( EFI_ERROR(status) )
-PrintErrMesg(L"Cannot obtain memory map", status);
-
-efi_arch_process_memory_map(SystemTable, efi_memmap, efi_memmap_size,
-efi_mdesc_size, mdesc_ver);
-
-efi_arch_pre_exit_boot();
-
-status = SystemTable->BootServices->ExitBootServices(ImageHandle,
- map_key);
-efi_bs = NULL;
-if ( status != EFI_INVALID_PARAMETER || retry )
-break;
-}
-
-if ( EFI_ERROR(status) )
-PrintErrMesg(L"Cannot exit boot services", status);
-
-/* Adjust pointers into EFI. */
-efi_ct = (void *)efi_ct + DIRECTMAP_VIRT_START;
-#ifdef USE_SET_VIRTUAL_ADDRESS_MAP
-efi_rs = (void *)efi_rs + DIRECTMAP_VIRT_START;
-#endif
-efi_memmap = (void *)efi_memmap + DIRECTMAP_VIRT_START;
-efi_fw_vendor = (void *)efi_fw_vendor + DIRECTMAP_VIRT_START;
+efi_exit_boot(ImageHandle, SystemTable);
 
 efi_arch_post_exit_boot();
  

[Xen-devel] [PATCH v2 17/23] efi: split out efi_set_gop_mode()

2015-07-20 Thread Daniel Kiper
..which sets chosen GOP mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
---
 xen/common/efi/boot.c |   33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 177697a..04b9c7e 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -860,6 +860,25 @@ static void __init efi_variables(void)
 }
 }
 
+static void __init efi_set_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop, UINTN gop_mode)
+{
+EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *mode_info;
+EFI_STATUS status;
+UINTN info_size;
+
+if ( !gop )
+return;
+
+/* Set graphics mode. */
+if ( gop_mode < gop->Mode->MaxMode && gop_mode != gop->Mode->Mode )
+gop->SetMode(gop, gop_mode);
+
+/* Get graphics and frame buffer info. */
+status = gop->QueryMode(gop, gop->Mode->Mode, &info_size, &mode_info);
+if ( !EFI_ERROR(status) )
+efi_arch_video_init(gop, info_size, mode_info);
+}
+
 static int __init __maybe_unused set_color(u32 mask, int bpp, u8 *pos, u8 *sz)
 {
if ( bpp < 0 )
@@ -887,7 +906,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 UINTN map_key, info_size, gop_mode = ~0;
 EFI_SHIM_LOCK_PROTOCOL *shim_lock;
 EFI_GRAPHICS_OUTPUT_PROTOCOL *gop = NULL;
-EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *mode_info;
 union string section = { NULL }, name;
 bool_t base_video = 0, retry;
 char *option_str;
@@ -1105,18 +1123,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
 efi_arch_memory_setup();
 
-if ( gop )
-{
-
-/* Set graphics mode. */
-if ( gop_mode < gop->Mode->MaxMode && gop_mode != gop->Mode->Mode )
-gop->SetMode(gop, gop_mode);
-
-/* Get graphics and frame buffer info. */
-status = gop->QueryMode(gop, gop->Mode->Mode, &info_size, &mode_info);
-if ( !EFI_ERROR(status) )
-efi_arch_video_init(gop, info_size, mode_info);
-}
+efi_set_gop_mode(gop, gop_mode);
 
 info_size = 0;
 efi_bs->GetMemoryMap(&info_size, NULL, &map_key,
-- 
1.7.10.4




Re: [Xen-devel] [PATCH OSSTEST 3/4] ts-devstack: Deploy OpenStack then test it with Tempest

2015-07-20 Thread Ian Campbell
On Mon, 2015-07-20 at 15:12 +0100, Anthony PERARD wrote:
> On Fri, Jul 17, 2015 at 05:04:03PM +0100, Ian Campbell wrote:
> > On Thu, 2015-07-16 at 12:18 +0100, Anthony PERARD wrote:
> > > +cd $builddir/devstack
> > > +>local.conf
> > > +echo >>local.conf '[[local|localrc]]'
> > > +echo >>local.conf ADMIN_PASSWORD=`pwgen 20 1`
> > > +echo >>local.conf DATABASE_PASSWORD=`pwgen 20 1`
> > > +echo >>local.conf RABBIT_PASSWORD=`pwgen 20 1`
> > > +echo >>local.conf SERVICE_PASSWORD=`pwgen 20 1`
> > > +echo >>local.conf SERVICE_TOKEN=`pwgen 20 1`
> > > +echo >>local.conf \\\# make it small because there is no way to not
> > > +echo >>local.conf \\\# have this lvm volume created
> > > +echo >>local.conf VOLUME_BACKING_FILE_SIZE=500M
> > > +echo >>local.conf DEST=/opt/stack
> > > +echo >>local.conf LOGFILE=\\\$DEST/logs/stack.sh.log
> > > +echo >>local.conf LOG_COLOR=False
> > > +echo >>local.conf LIBVIRT_TYPE=xen
> > > +echo >>local.conf GIT_BASE="$openstack_git_base"
> > > +echo >>local.conf disable_service horizon
> > > +echo >>local.conf disable_service n-novnc
> > > +echo >>local.conf enable_service n-obj
> > > +echo >>local.conf '[[post-config|\$CINDER_CONF]]'
> > > +echo >>local.conf '[lvmdriver-1]'
> > > +echo >>local.conf volume_group = $vg
> > 
> > target_putfilecontents_root_stash with a Perl here doc would be  better
> > I think?
> 
> Will use the function. I guest I just need to replace pwgen by something in
> perl. I think services are accessible from the network, which is why I'm
> using pwgen.

Only internally within the test COLO. Given that all the hosts have the
same root password I wouldn't worry too much :-)

> 
> > > +[...]+
> > > +  # OpenStack needs access to libvirt from a user.
> > > +  target_cmd_root($ho, < > > +echo >>/etc/libvirt/libvirtd.conf 'unix_sock_group = "libvirt"'
> > > +echo >>/etc/libvirt/libvirtd.conf 'unix_sock_ro_perms = "0777"'
> > > +echo >>/etc/libvirt/libvirtd.conf 'unix_sock_rw_perms = "0770"'
> > 
> > This one should be a bash heredoc, I think (can't be a Perl one because
> > this is an append?).
> 
> This is append to the existing libvirtd.conf, yes. I'll use bash heredoc.
> Or is target_editfile_root() can be used to add lines at the end?

I'd expect so, but it'll probably involve more Perl-fu than I can
summon. I think a HEREDOC is the right answer anyway.

> 
> > > +sub cleanup() {
> > > +  # Try to have less leaked stuff.
> > 
> > Leaked as in "discovered by ts-leak-check" or just a general tidy up?
> > 
> > If the latter I wouldn't bother.
> > 
> > If the former then won't this hide real issues?
> 
> I've added this when I've seen many "leaked" process from OpenStack by
> ts-leak-check. I can try to teach the ts- script to not considered as
> leaked, process and other files from openstack.

Be sure to distinguish between "expected" leaks (like a daemon which
should be running after the test) and unexpected rubbish. You'll
probably want to justify anything in the former category in the commit
log, and not to include the latter of course.

Ian.




[Xen-devel] [PATCH v2 10/23] efi: build xen.gz with EFI code

2015-07-20 Thread Daniel Kiper
Build xen.gz with EFI code. We need this to support multiboot2
protocol on EFI platforms.

If we wish to load a non-ELF file using multiboot (v1) or multiboot2 then
it must contain a "linear" (or "flat") representation of code and data.
Currently, the PE file contains many sections which are not "linear" (one
after another without any holes) or do not even have a representation
in the file (e.g. BSS). In theory there is a chance that we could build
a proper PE file using the current build system. However, that would mean
xen.efi diverging further from the xen ELF file (in terms of contents and
build method). ELF has all the needed properties, so it
is a good starting point for further development. Additionally, I think
that this is also a good starting point for further xen.efi code and
build optimizations. It looks like there is a chance that eventually we
can generate xen.efi directly from the xen ELF file using just a simple objcopy.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - build EFI code only if it is supported in a given build environment
 (suggested by Jan Beulich).
---
 xen/arch/x86/Makefile |   13 +
 xen/arch/x86/efi/Makefile |   16 +---
 xen/arch/x86/mm.c |3 ++-
 xen/common/efi/runtime.c  |6 ++
 4 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 5f24951..0335445 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -80,7 +80,7 @@ ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in
 
 ifeq ($(lto),y)
 # Gather all LTO objects together
-prelink_lto.o: $(ALL_OBJS)
+prelink_lto.o: $(ALL_OBJS) efi/relocs-dummy.o
$(LD_LTO) -r -o $@ $^
 
 prelink-efi_lto.o: $(ALL_OBJS) efi/runtime.o efi/compat.o
@@ -90,14 +90,14 @@ prelink-efi_lto.o: $(ALL_OBJS) efi/runtime.o efi/compat.o
 prelink.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink_lto.o
$(LD) $(LDFLAGS) -r -o $@ $^
 
-prelink-efi.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink-efi_lto.o efi/boot.init.o
+prelink-efi.o: $(patsubst %/built_in.o,%/built_in_bin.o,$(ALL_OBJS)) prelink-efi_lto.o
$(guard) $(LD) $(LDFLAGS) -r -o $@ $^
 else
-prelink.o: $(ALL_OBJS)
+prelink.o: $(ALL_OBJS) efi/relocs-dummy.o
$(LD) $(LDFLAGS) -r -o $@ $^
 
-prelink-efi.o: $(ALL_OBJS) efi/boot.init.o efi/runtime.o efi/compat.o
-   $(guard) $(LD) $(LDFLAGS) -r -o $@ $(filter-out %/efi/built_in.o,$^)
+prelink-efi.o: $(ALL_OBJS)
+   $(guard) $(LD) $(LDFLAGS) -r -o $@ $^
 endif
 
 $(BASEDIR)/common/symbols-dummy.o:
@@ -146,9 +146,6 @@ $(TARGET).efi: prelink-efi.o efi.lds efi/relocs-dummy.o $(BASEDIR)/common/symbol
if $(guard) false; then rm -f $@; echo 'EFI support disabled'; fi
rm -f $(@D)/.$(@F).[0-9]*
 
-efi/boot.init.o efi/runtime.o efi/compat.o: $(BASEDIR)/arch/x86/efi/built_in.o
-efi/boot.init.o efi/runtime.o efi/compat.o: ;
-
 asm-offsets.s: $(TARGET_SUBARCH)/asm-offsets.c
$(CC) $(filter-out -flto,$(CFLAGS)) -S -o $@ $<
 
diff --git a/xen/arch/x86/efi/Makefile b/xen/arch/x86/efi/Makefile
index 1daa7ac..b1e8883 100644
--- a/xen/arch/x86/efi/Makefile
+++ b/xen/arch/x86/efi/Makefile
@@ -1,14 +1,16 @@
 CFLAGS += -fshort-wchar
 
-obj-y += stub.o
-
-create = test -e $(1) || touch -t 19990101 $(1)
-
 efi := $(filter y,$(x86_64)$(shell rm -f disabled))
 efi := $(if $(efi),$(shell $(CC) $(filter-out $(CFLAGS-y) .%.d,$(CFLAGS)) -c check.c 2>disabled && echo y))
 efi := $(if $(efi),$(shell $(LD) -mi386pep --subsystem=10 -o check.efi check.o 2>disabled && echo y))
-efi := $(if $(efi),$(shell rm disabled)y,$(shell $(call create,boot.init.o); $(call create,runtime.o)))
+efi := $(if $(efi),$(shell rm disabled)y)
 
-extra-$(efi) += boot.init.o relocs-dummy.o runtime.o compat.o
+extra-y += relocs-dummy.o
 
-stub.o: $(extra-y)
+ifeq ($(efi),y)
+obj-y += boot.init.o
+obj-y += compat.o
+obj-y += runtime.o
+else
+obj-y += stub.o
+endif
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 342414f..cef2eb6 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -344,7 +344,8 @@ void __init arch_init_memory(void)
 
 subarch_init_memory();
 
-efi_init_memory();
+if ( efi_enabled(EFI_PLATFORM) )
+efi_init_memory();
 
 mem_sharing_init();
 
diff --git a/xen/common/efi/runtime.c b/xen/common/efi/runtime.c
index aa064e7..3eb21c1 100644
--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -167,6 +167,9 @@ int efi_get_info(uint32_t idx, union xenpf_efi_info *info)
 {
 unsigned int i, n;
 
+if ( !efi_enabled(EFI_PLATFORM) )
+return -EOPNOTSUPP;
+
 switch ( idx )
 {
 case XEN_FW_EFI_VERSION:
@@ -301,6 +304,9 @@ int efi_runtime_call(struct xenpf_efi_runtime_call *op)
 EFI_STATUS status = EFI_NOT_STARTED;
 int rc = 0;
 
+if ( !efi_enabled(EFI_PLATFORM) )
+return -EOPNOTSUPP;
+
 switch ( op->function )
 {
 case XEN_EFI_get_time:
-- 
1.7.10.4



[Xen-devel] [PATCH v2 15/23] efi: split out efi_tables()

2015-07-20 Thread Daniel Kiper
..which collects system tables data. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
---
 xen/common/efi/boot.c |   61 +++--
 1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 8d16470..fd62125 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -717,6 +717,39 @@ static UINTN __init efi_find_gop_mode(EFI_GRAPHICS_OUTPUT_PROTOCOL *gop,
 return gop_mode;
 }
 
+static void __init efi_tables(void)
+{
+unsigned int i;
+
+/* Obtain basic table pointers. */
+for ( i = 0; i < efi_num_ct; ++i )
+{
+static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID;
+static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID;
+static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID;
+static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID;
+static EFI_GUID __initdata smbios3_guid = SMBIOS3_TABLE_GUID;
+
+if ( match_guid(&acpi2_guid, &efi_ct[i].VendorGuid) )
+  efi.acpi20 = (long)efi_ct[i].VendorTable;
+if ( match_guid(&acpi_guid, &efi_ct[i].VendorGuid) )
+  efi.acpi = (long)efi_ct[i].VendorTable;
+if ( match_guid(&mps_guid, &efi_ct[i].VendorGuid) )
+  efi.mps = (long)efi_ct[i].VendorTable;
+if ( match_guid(&smbios_guid, &efi_ct[i].VendorGuid) )
+  efi.smbios = (long)efi_ct[i].VendorTable;
+if ( match_guid(&smbios3_guid, &efi_ct[i].VendorGuid) )
+  efi.smbios3 = (long)efi_ct[i].VendorTable;
+}
+
+#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
+dmi_efi_get_table(efi.smbios != EFI_INVALID_TABLE_ADDR
+  ? (void *)(long)efi.smbios : NULL,
+  efi.smbios3 != EFI_INVALID_TABLE_ADDR
+  ? (void *)(long)efi.smbios3 : NULL);
+#endif
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -1039,33 +1072,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 /* XXX Collect EDID info. */
 efi_arch_cpu();
 
-/* Obtain basic table pointers. */
-for ( i = 0; i < efi_num_ct; ++i )
-{
-static EFI_GUID __initdata acpi2_guid = ACPI_20_TABLE_GUID;
-static EFI_GUID __initdata acpi_guid = ACPI_TABLE_GUID;
-static EFI_GUID __initdata mps_guid = MPS_TABLE_GUID;
-static EFI_GUID __initdata smbios_guid = SMBIOS_TABLE_GUID;
-static EFI_GUID __initdata smbios3_guid = SMBIOS3_TABLE_GUID;
-
-if ( match_guid(&acpi2_guid, &efi_ct[i].VendorGuid) )
-  efi.acpi20 = (long)efi_ct[i].VendorTable;
-if ( match_guid(&acpi_guid, &efi_ct[i].VendorGuid) )
-  efi.acpi = (long)efi_ct[i].VendorTable;
-if ( match_guid(&mps_guid, &efi_ct[i].VendorGuid) )
-  efi.mps = (long)efi_ct[i].VendorTable;
-if ( match_guid(&smbios_guid, &efi_ct[i].VendorGuid) )
-  efi.smbios = (long)efi_ct[i].VendorTable;
-if ( match_guid(&smbios3_guid, &efi_ct[i].VendorGuid) )
-  efi.smbios3 = (long)efi_ct[i].VendorTable;
-}
-
-#ifndef CONFIG_ARM /* TODO - disabled until implemented on ARM */
-dmi_efi_get_table(efi.smbios != EFI_INVALID_TABLE_ADDR
-  ? (void *)(long)efi.smbios : NULL,
-  efi.smbios3 != EFI_INVALID_TABLE_ADDR
-  ? (void *)(long)efi.smbios3 : NULL);
-#endif
+efi_tables();
 
 /* Collect PCI ROM contents. */
 setup_efi_pci();
-- 
1.7.10.4




[Xen-devel] [PATCH v2 09/23] efi: create efi_enabled()

2015-07-20 Thread Daniel Kiper
We need more fine-grained knowledge about the EFI environment, and need to
check for the EFI platform and the EFI loader separately, to properly support
the multiboot2 protocol. In general, Xen loaded by this protocol uses
memory mappings and loaded modules in a similar way to Xen loaded
by the multiboot (v1) protocol. Hence, create efi_enabled(), which
checks the available features in efi.flags. This patch only defines the
EFI_PLATFORM feature, which is equal to the old efi_enabled == 1.
A following patch will define the EFI_LOADER feature accordingly.
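
The idea of replacing a single boolean with per-feature bits can be sketched as below. This is only an assumption-laden illustration: the real definition lives in xen/include/xen/efi.h and xen/common/efi/runtime.c and is not fully visible in this excerpt, so the bit numbers, struct efi_state, efi_sketch, efi_set_feature() and efi_enabled_sketch() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Feature bit numbers; the names come from the commit message,
 * the numeric values are assumed for this sketch. */
#define EFI_PLATFORM 0
#define EFI_LOADER   1

/* Hypothetical stand-in for the flags word kept in the efi structure. */
struct efi_state {
    unsigned long flags;
};

static struct efi_state efi_sketch = { .flags = 0 };

/* Record that a feature is available. */
static void efi_set_feature(unsigned int feature)
{
    assert(feature < 8 * sizeof(efi_sketch.flags));
    efi_sketch.flags |= 1ul << feature;
}

/* Query a single feature bit, in the spirit of efi_enabled(feature). */
static bool efi_enabled_sketch(unsigned int feature)
{
    assert(feature < 8 * sizeof(efi_sketch.flags));
    return (efi_sketch.flags >> feature) & 1;
}
```

With separate bits, a multiboot2 boot on EFI firmware could report the platform feature without claiming that Xen was started by the native EFI loader, which a single boolean cannot express.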

Suggested-by: Jan Beulich 
Signed-off-by: Daniel Kiper 
---
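The xen/include/xen/efi.h hunk is not visible in this excerpt, so the following is only a minimal sketch of the idea behind efi_enabled(), assuming efi.flags is a plain bitmask and EFI_PLATFORM is bit 0. The names efi_sketch, efi_set_feature() and efi_enabled_sketch() are illustrative and are not the definitions from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit number; the real value is defined in xen/include/xen/efi.h. */
#define EFI_PLATFORM 0

#define FLAG_BITS (sizeof(unsigned long) * 8)

/* Hypothetical stand-in for the flags member of struct efi. */
struct efi_sketch {
    unsigned long flags;
};

static struct efi_sketch efi_sketch_state = { .flags = 0 };

/* Mark a feature as available (the patch uses set_bit() for this). */
static void efi_set_feature(unsigned int feature)
{
    assert(feature < FLAG_BITS);
    efi_sketch_state.flags |= 1UL << feature;
}

/* Return true if the given feature bit is set in flags. */
static bool efi_enabled_sketch(unsigned int feature)
{
    assert(feature < FLAG_BITS);
    return (efi_sketch_state.flags >> feature) & 1UL;
}
```

With this shape, callers change from testing a single boolean to asking for a specific feature, which is what the mechanical efi_enabled -> efi_enabled(EFI_PLATFORM) replacements in the diff below do.
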
 xen/arch/x86/dmi_scan.c|4 ++--
 xen/arch/x86/domain_page.c |2 +-
 xen/arch/x86/efi/stub.c|   11 ---
 xen/arch/x86/mpparse.c |4 ++--
 xen/arch/x86/setup.c   |   10 +-
 xen/arch/x86/shutdown.c|2 +-
 xen/arch/x86/time.c|2 +-
 xen/arch/x86/xen.lds.S |2 --
 xen/common/efi/boot.c  |4 
 xen/common/efi/runtime.c   |   17 +++--
 xen/drivers/acpi/osl.c |2 +-
 xen/include/xen/efi.h  |   16 ++--
 12 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/dmi_scan.c b/xen/arch/x86/dmi_scan.c
index 269168c..95c5a77 100644
--- a/xen/arch/x86/dmi_scan.c
+++ b/xen/arch/x86/dmi_scan.c
@@ -229,7 +229,7 @@ const char *__init dmi_get_table(paddr_t *base, u32 *len)
 {
static unsigned int __initdata instance;
 
-   if (efi_enabled) {
+   if (efi_enabled(EFI_PLATFORM)) {
if (efi_smbios3_size && !(instance & 1)) {
*base = efi_smbios3_address;
*len = efi_smbios3_size;
@@ -693,7 +693,7 @@ static void __init dmi_decode(struct dmi_header *dm)
 
 void __init dmi_scan_machine(void)
 {
-   if ((!efi_enabled ? dmi_iterate(dmi_decode) :
+   if ((!efi_enabled(EFI_PLATFORM) ? dmi_iterate(dmi_decode) :
dmi_efi_iterate(dmi_decode)) == 0)
dmi_check_system(dmi_blacklist);
else
diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index d86f8fe..fdf0d8a 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -36,7 +36,7 @@ static inline struct vcpu *mapcache_current_vcpu(void)
  * domain's page tables but current may point at another domain's VCPU.
  * Return NULL as though current is not properly set up yet.
  */
-if ( efi_enabled && efi_rs_using_pgtables() )
+if ( efi_enabled(EFI_PLATFORM) && efi_rs_using_pgtables() )
 return NULL;
 
 /*
diff --git a/xen/arch/x86/efi/stub.c b/xen/arch/x86/efi/stub.c
index 07c2bd0..c5ae369 100644
--- a/xen/arch/x86/efi/stub.c
+++ b/xen/arch/x86/efi/stub.c
@@ -4,9 +4,14 @@
 #include 
 #include 
 
-#ifndef efi_enabled
-const bool_t efi_enabled = 0;
-#endif
+struct efi __read_mostly efi = {
+   .flags   = 0, /* Initialized later. */
+   .acpi= EFI_INVALID_TABLE_ADDR,
+   .acpi20  = EFI_INVALID_TABLE_ADDR,
+   .mps = EFI_INVALID_TABLE_ADDR,
+   .smbios  = EFI_INVALID_TABLE_ADDR,
+   .smbios3 = EFI_INVALID_TABLE_ADDR
+};
 
 void __init efi_init_memory(void) { }
 
diff --git a/xen/arch/x86/mpparse.c b/xen/arch/x86/mpparse.c
index 8609f4a..5223579 100644
--- a/xen/arch/x86/mpparse.c
+++ b/xen/arch/x86/mpparse.c
@@ -557,7 +557,7 @@ static inline void __init construct_default_ISA_mptable(int mpc_default_type)
 
 static __init void efi_unmap_mpf(void)
 {
-   if (efi_enabled)
+   if (efi_enabled(EFI_PLATFORM))
clear_fixmap(FIX_EFI_MPF);
 }
 
@@ -715,7 +715,7 @@ void __init find_smp_config (void)
 {
unsigned int address;
 
-   if (efi_enabled) {
+   if (efi_enabled(EFI_PLATFORM)) {
efi_check_config();
return;
}
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index ff34670..bce708c 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -444,8 +444,8 @@ static void __init parse_video_info(void)
 {
 struct boot_video_info *bvi = &bootsym(boot_vid_info);
 
-/* The EFI loader fills vga_console_info directly. */
-if ( efi_enabled )
+/* vga_console_info is filled directly on EFI platform. */
+if ( efi_enabled(EFI_PLATFORM) )
 return;
 
 if ( (bvi->orig_video_isVGA == 1) && (bvi->orig_video_mode == 3) )
@@ -695,7 +695,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 if ( !(mbi->flags & MBI_MODULES) || (mbi->mods_count == 0) )
 panic("dom0 kernel not specified. Check bootloader configuration.");
 
-if ( efi_enabled )
+if ( efi_enabled(EFI_PLATFORM) )
 {
 set_pdx_range(xen_phys_start >> PAGE_SHIFT,
   (xen_phys_start + BOOTSTRAP_MAP_BASE) >> PAGE_SHIFT);
@@ -806,7 +806,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
  * we can relocate the dom0 kernel and other multiboot modules. Also, on
  * x86/64, we relocate Xen to higher memory.
  */
-for ( i = 0; !efi_enabled && i < mbi->mods_count

[Xen-devel] [PATCH v2 19/23] x86/efi: create new early memory allocator

2015-07-20 Thread Daniel Kiper
There is a problem with place_string(), which is used as an early memory
allocator. It hands out memory chunks starting at the start symbol and
going down. Sadly, this does not work when Xen is loaded using the
multiboot2 protocol, because start then lives at the 1 MiB address. I tried
to use the mem_lower address calculated by GRUB2 instead. However, that
works only on some machines. There are machines in the wild (e.g. Dell
PowerEdge R820) which use the first ~640 KiB for boot services code or
data... :-(((

In the multiboot2 case we only need place_string() to allocate a memory
chunk for the EFI memory map. However, I think that place_string() should
be fixed instead of adding another function used in just one case. I
considered two solutions.

1) We could use native EFI allocation functions (e.g. AllocatePool()
   or AllocatePages()) to get a memory chunk. However, later (somewhere
   in __start_xen()) we must copy its contents to a safe place, or reserve
   it in the e820 memory map and map it into the Xen virtual address space.
   In the latter case we must also take care of conflicts with e.g. crash
   kernel regions, which could be quite difficult.

2) We may statically allocate a memory area somewhere in the Xen code which
   could be used as a memory pool for early dynamic allocations. This looks
   quite simple. Additionally, it would not depend on EFI at all and
   could be used on legacy BIOS platforms if we need it. However, we
   must choose the size of this pool carefully. We do not want to increase
   the Xen binary size too much or waste too much memory, but the pool must
   fit at least the memory map on x86 EFI platforms. As I saw on a small
   machine, e.g. IBM System x3550 M2 with 8 GiB RAM, the memory map may
   contain more than 200 entries. Every entry on the x86-64 platform is
   40 bytes in size, so we need about 8 KiB for the EFI memory map alone.
   Additionally, if we want to use this memory pool for Xen and module
   command line storage (it would be used when xen.efi is executed as an EFI
   application), then we should add, I think, about 1 KiB. To be on the
   safe side, we should assume at least a 64 KiB pool for early
   memory allocations, which is several times the estimate above.
   However, during the discussion on Xen-devel Jan Beulich suggested that,
   just in case, we should use a 1 MiB memory pool as in the original
   place_string() implementation. So, let's use 1 MiB as proposed.
   If we think that we should not waste the unallocated memory in the pool
   on a running system, then we can mark this region as __initdata and move
   all required data to dynamically allocated places somewhere in
   __start_xen().

Now solution #2 is implemented but maybe we should consider #1 one day.
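
The sizing argument in solution #2 can be checked with a few lines of arithmetic. The sketch below only restates the figures quoted above (200 entries, 40 bytes per entry, about 1 KiB for command lines); the macro and function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define KIB(n) ((size_t)(n) << 10)
#define MIB(n) ((size_t)(n) << 20)

/* Figures quoted in the commit message. */
#define MMAP_ENTRIES   200  /* lower bound seen on one machine */
#define MMAP_DESC_SIZE 40   /* bytes per memory map entry on x86-64 */

/* Bytes needed to store a memory map with the given number of entries. */
static size_t mmap_bytes(size_t entries, size_t desc_size)
{
    assert(desc_size != 0 && entries <= (size_t)-1 / desc_size);
    return entries * desc_size;
}

/* Memory map plus roughly 1 KiB for Xen and module command lines. */
static size_t early_pool_estimate(void)
{
    return mmap_bytes(MMAP_ENTRIES, MMAP_DESC_SIZE) + KIB(1);
}
```

By this arithmetic 200 entries take 8000 bytes, the estimate including command lines is 9024 bytes, a 64 KiB pool is roughly seven times that, and the 1 MiB pool finally chosen is 16 times larger again.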

Signed-off-by: Daniel Kiper 
---
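For illustration, the static pool approach boils down to a bump allocator. The sketch below is not the ebmalloc() from this patch: it is a standalone variant that tracks an offset instead of a pointer (which sidesteps the runtime initialization of ebmalloc_free), checks the bound before advancing, and returns NULL where the patch calls blexit(). The names pool_mem, pool_used and pool_alloc() and the 4 KiB pool size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4096  /* small illustrative pool; the patch uses MB(1) */

static char pool_mem[POOL_SIZE];
static size_t pool_used;  /* bytes handed out so far */

/*
 * Bump allocator: return the next 'size' bytes of the pool, or NULL if
 * the request does not fit. Nothing is ever freed and no alignment is
 * applied.
 */
static void *pool_alloc(size_t size)
{
    void *ptr;

    assert(pool_used <= POOL_SIZE);

    /* Check before advancing, so the offset never leaves the pool. */
    if ( size > POOL_SIZE - pool_used )
        return NULL;

    ptr = pool_mem + pool_used;
    pool_used += size;

    return ptr;
}
```

Like ebmalloc() in the diff below, the sketch does not align the returned pointer, whereas the removed efi_arch_allocate_mmap_buffer() code aligned to __alignof__(EFI_MEMORY_DESCRIPTOR); whether that matters for the memory map buffer is worth checking during review.
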
 xen/arch/x86/efi/efi-boot.h |   38 ++
 xen/arch/x86/setup.c|3 +--
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 2dd69f6..3d25c48 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -103,9 +103,36 @@ static void __init relocate_trampoline(unsigned long phys)
 *(u16 *)(*trampoline_ptr + (long)trampoline_ptr) = phys >> 4;
 }
 
+#define EBMALLOC_SIZE  MB(1)
+
+static char __initdata ebmalloc_mem[EBMALLOC_SIZE];
+static char __initdata *ebmalloc_free = NULL;
+
+/* EFI boot allocator. */
+static void __init *ebmalloc(size_t size)
+{
+void *ptr;
+
+/*
+ * Init ebmalloc_free on runtime. Static initialization
+ * will not work because it puts virtual address there.
+ */
+if ( ebmalloc_free == NULL )
+ebmalloc_free = ebmalloc_mem;
+
+ptr = ebmalloc_free;
+
+ebmalloc_free += size;
+
+if ( ebmalloc_free - ebmalloc_mem > sizeof(ebmalloc_mem) )
+blexit(L"Out of static memory\r\n");
+
+return ptr;
+}
+
 static void __init place_string(u32 *addr, const char *s)
 {
-static char *__initdata alloc = start;
+char *alloc = NULL;
 
 if ( s && *s )
 {
@@ -113,7 +140,7 @@ static void __init place_string(u32 *addr, const char *s)
 const char *old = (char *)(long)*addr;
 size_t len2 = *addr ? strlen(old) + 1 : 0;
 
-alloc -= len1 + len2;
+alloc = ebmalloc(len1 + len2);
 /*
  * Insert new string before already existing one. This is needed
  * for options passed on the command line to override options from
@@ -196,12 +223,7 @@ static void __init efi_arch_process_memory_map(EFI_SYSTEM_TABLE *SystemTable,
 
 static void *__init efi_arch_allocate_mmap_buffer(UINTN map_size)
 {
-place_string(&mbi.mem_upper, NULL);
-mbi.mem_upper -= map_size;
-mbi.mem_upper &= -__alignof__(EFI_MEMORY_DESCRIPTOR);
-if ( mbi.mem_upper < xen_phys_start )
-return NULL;
-return (void *)(long)mbi.mem_upper;
+return ebmalloc(map_size);
 }
 
 static void __init efi_arch_pre_exit_boot(void)
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index bce708c..a59f

[Xen-devel] [PATCH v2 20/23] x86: add multiboot2 protocol support for EFI platforms

2015-07-20 Thread Daniel Kiper
Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - generate multiboot2 header using macros
 (suggested by Jan Beulich),
   - switch CPU to x86_32 mode before
 jumping to 32-bit code
 (suggested by Andrew Cooper),
   - reduce code changes to increase patch readability
 (suggested by Jan Beulich),
   - improve comments
 (suggested by Jan Beulich),
   - ignore MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO tag on EFI platform
 and find on my own multiboot2.mem_lower value,
   - stop execution if EFI platform is detected
 in legacy BIOS path.
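
The tag walk that __efi64_start performs in assembly (see the head.S hunk below) may be easier to follow in C. The sketch assumes the Multiboot2 information layout as I understand it from the specification: an 8-byte fixed part followed by tags that start with a 32-bit type and a 32-bit size and are padded to 8 bytes, with types 0 (end), 12 (EFI64 system table) and 20 (EFI64 image handle). The struct and function names are hypothetical, and unlike the assembly this version bounds-checks the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MB2_FIXED_SIZE        8u   /* total_size + reserved */
#define MB2_TAG_ALIGN         8u
#define MB2_TAG_TYPE_END      0u
#define MB2_TAG_TYPE_EFI64    12u  /* EFI 64-bit system table pointer */
#define MB2_TAG_TYPE_EFI64_IH 20u  /* EFI 64-bit image handle pointer */

struct mb2_efi64_ptrs {
    uint64_t system_table;
    uint64_t image_handle;
};

/* Unaligned-safe loads from the information buffer. */
static uint32_t rd32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

static uint64_t rd64(const unsigned char *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/*
 * Walk the tags and collect the EFI64 pointers. Returns 0 once the end
 * tag is reached, -1 if the buffer is malformed or has no end tag.
 */
static int mb2_find_efi64(const unsigned char *mbi, size_t len,
                          struct mb2_efi64_ptrs *out)
{
    size_t off = MB2_FIXED_SIZE;

    assert(mbi != NULL && out != NULL);
    out->system_table = 0;
    out->image_handle = 0;

    while ( off + 8 <= len )
    {
        uint32_t type = rd32(mbi + off);
        uint32_t size = rd32(mbi + off + 4);

        if ( type == MB2_TAG_TYPE_END )
            return 0;
        if ( size < 8 || size > len - off )
            return -1;

        if ( type == MB2_TAG_TYPE_EFI64 && size >= 16 )
            out->system_table = rd64(mbi + off + 8);
        else if ( type == MB2_TAG_TYPE_EFI64_IH && size >= 16 )
            out->image_handle = rd64(mbi + off + 8);

        /* Go to the next tag: sizes are rounded up to the tag alignment. */
        off += ((size_t)size + MB2_TAG_ALIGN - 1) &
               ~(size_t)(MB2_TAG_ALIGN - 1);
    }

    return -1;
}
```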
---
 xen/arch/x86/boot/head.S  |  157 +++--
 xen/arch/x86/efi/efi-boot.h   |   30 +++
 xen/arch/x86/efi/stub.c   |5 ++
 xen/arch/x86/setup.c  |   10 ++-
 xen/arch/x86/x86_64/asm-offsets.c |2 +
 xen/arch/x86/xen.lds.S|4 +-
 xen/common/efi/boot.c |   12 +++
 xen/include/xen/efi.h |1 +
 8 files changed, 210 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index 57197db..056047f 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -89,6 +89,13 @@ multiboot1_header_end:
0, /* Number of the lines - no preference. */ \
0  /* Number of bits per pixel - no preference. */
 
+/* Do not disable EFI boot services. */
+mb2ht_init MB2_HT(EFI_BS), MB2_HT(OPTIONAL)
+
+/* EFI64 entry point. */
+mb2ht_init MB2_HT(ENTRY_ADDRESS_EFI64), MB2_HT(OPTIONAL), \
+   sym_phys(__efi64_start)
+
 /* Multiboot2 header end tag. */
 mb2ht_init MB2_HT(END), MB2_HT(REQUIRED)
 .Lmultiboot2_header_end:
@@ -100,9 +107,15 @@ multiboot1_header_end:
 gdt_boot_descr:
 .word   6*8-1
 .long   sym_phys(trampoline_gdt)
+.long   0 /* Needed for 64-bit lgdt */
+
+cs32_switch_addr:
+.long   sym_phys(cs32_switch)
+.word   BOOT_CS32
 
 .Lbad_cpu_msg: .asciz "ERR: Not a 64-bit CPU!"
 .Lbad_ldr_msg: .asciz "ERR: Not a Multiboot bootloader!"
+.Lbad_mb2_ldr: .asciz "ERR: Use latest Multiboot2 compatible bootloader!"
 
 .section .init.text, "ax", @progbits
 
@@ -111,6 +124,9 @@ bad_cpu:
 jmp print_err
 not_multiboot:
 mov $(sym_phys(.Lbad_ldr_msg)),%esi # Error message
+jmp print_err
+mb2_too_old:
+mov $(sym_phys(.Lbad_mb2_ldr)),%esi # Error message
 print_err:
 mov $0xB8000,%edi  # VGA framebuffer
 1:  mov (%esi),%bl
@@ -130,6 +146,119 @@ print_err:
 .Lhalt: hlt
 jmp .Lhalt
 
+.code64
+
+__efi64_start:
+cld
+
+/* Check for Multiboot2 bootloader. */
+cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
+je  efi_multiboot2_proto
+
+/* Jump to not_multiboot after switching CPU to x86_32 mode. */
+lea not_multiboot(%rip),%rdi
+jmp x86_32_switch
+
+efi_multiboot2_proto:
+/*
+ * Multiboot2 information address is 32-bit,
+ * so, zero higher half of %rbx.
+ */
+mov %ebx,%ebx
+
+/* Skip Multiboot2 information fixed part. */
+lea MB2_fixed_sizeof(%rbx),%rcx
+
+0:
+/* Get EFI SystemTable address from Multiboot2 information. */
+cmpl$MULTIBOOT2_TAG_TYPE_EFI64,MB2_tag_type(%rcx)
+jne 1f
+
+mov MB2_efi64_st(%rcx),%rsi
+
+/* Do not go into real mode on EFI platform. */
+movb$1,skip_realmode(%rip)
+jmp 3f
+
+1:
+/* Get EFI ImageHandle address from Multiboot2 information. */
+cmpl$MULTIBOOT2_TAG_TYPE_EFI64_IH,MB2_tag_type(%rcx)
+jne 2f
+
+mov MB2_efi64_ih(%rcx),%rdi
+jmp 3f
+
+2:
+/* Is it the end of Multiboot2 information? */
+cmpl$MULTIBOOT2_TAG_TYPE_END,MB2_tag_type(%rcx)
+je  run_bs
+
+3:
+/* Go to next Multiboot2 information tag. */
+add MB2_tag_size(%rcx),%ecx
+add $(MULTIBOOT2_TAG_ALIGN-1),%rcx
+and $~(MULTIBOOT2_TAG_ALIGN-1),%rcx
+jmp 0b
+
+run_bs:
+push%rax
+push%rdi
+
+/* Initialize BSS (no nasty surprises!). */
+lea __bss_start(%rip),%rdi
+lea __bss_end(%rip),%rcx
+sub %rdi,%rcx
+shr $3,%rcx
+xor %eax,%eax
+rep stosq
+
+pop %rdi
+
+/*
+ * IN: %rdi - EFI ImageHandle, %rsi - EFI SystemTable.
+ * OUT: %rax - multiboot2.mem_lower. Do not get this value from
+ * MULTIBOOT2_TAG_TYPE_BASIC_MEMINFO tag. It could be bogus on
+ * EFI platforms.
+ */
+callefi_multiboot2
+
+/* Convert multiboot2.mem_lower to bytes/16. */
+mov %rax,%rcx
+shr $4,%rcx
+
+pop %rax
+
+/* Jump to trampoline_setup after switching CPU to x86_32 mode. */
+lea 

[Xen-devel] [PATCH v2 12/23] efi: split out efi_console_set_mode()

2015-07-20 Thread Daniel Kiper
..which sets console mode. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
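
For reference, the selection rule that efi_console_set_mode() applies (pick the text mode with the largest cols * rows, keep the current mode if nothing is larger) can be sketched without any EFI dependencies. struct text_mode and pick_largest_mode() are hypothetical; the valid flag stands in for QueryMode() returning EFI_SUCCESS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one console text mode. */
struct text_mode {
    size_t cols, rows;
    int valid;  /* stands in for QueryMode() == EFI_SUCCESS */
};

/*
 * Return the index of the valid mode with the largest area, or 'current'
 * if no valid mode has a non-zero area. On ties the first mode wins,
 * because the comparison is strict.
 */
static size_t pick_largest_mode(const struct text_mode *modes, size_t n,
                                size_t current)
{
    size_t best = current, size = 0, i;

    assert(modes != NULL || n == 0);

    for ( i = 0; i < n; ++i )
    {
        if ( modes[i].valid && modes[i].cols * modes[i].rows > size )
        {
            size = modes[i].cols * modes[i].rows;
            best = i;
        }
    }

    return best;
}
```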
---
 xen/common/efi/boot.c |   37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 6f327cd..4614146 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -611,6 +611,25 @@ static void __init efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTabl
 StdErr = SystemTable->StdErr ?: StdOut;
 }
 
+static void __init efi_console_set_mode(void)
+{
+UINTN cols, rows, size;
+unsigned int best, i;
+
+for ( i = 0, size = 0, best = StdOut->Mode->Mode;
+  i < StdOut->Mode->MaxMode; ++i )
+{
+if ( StdOut->QueryMode(StdOut, i, &cols, &rows) == EFI_SUCCESS &&
+ cols * rows > size )
+{
+size = cols * rows;
+best = i;
+}
+}
+if ( best != StdOut->Mode->Mode )
+StdOut->SetMode(StdOut, best);
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -799,23 +818,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 }
 
 if ( !base_video )
-{
-unsigned int best;
-UINTN cols, rows, size;
-
-for ( i = 0, size = 0, best = StdOut->Mode->Mode;
-  i < StdOut->Mode->MaxMode; ++i )
-{
-if ( StdOut->QueryMode(StdOut, i, &cols, &rows) == EFI_SUCCESS &&
- cols * rows > size )
-{
-size = cols * rows;
-best = i;
-}
-}
-if ( best != StdOut->Mode->Mode )
-StdOut->SetMode(StdOut, best);
-}
+efi_console_set_mode();
 }
 
 PrintStr(L"Xen " __stringify(XEN_VERSION) "." __stringify(XEN_SUBVERSION)
-- 
1.7.10.4




[Xen-devel] [PATCH v2 11/23] efi: split out efi_init()

2015-07-20 Thread Daniel Kiper
..which initializes basic EFI variables. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
---
 xen/common/efi/boot.c |   28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 1f188fe..6f327cd 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -595,6 +595,22 @@ static char *__init get_value(const struct file *cfg, const char *section,
 return NULL;
 }
 
+static void __init efi_init(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
+{
+efi_ih = ImageHandle;
+efi_bs = SystemTable->BootServices;
+efi_bs_revision = efi_bs->Hdr.Revision;
+efi_rs = SystemTable->RuntimeServices;
+efi_ct = SystemTable->ConfigurationTable;
+efi_num_ct = SystemTable->NumberOfTableEntries;
+efi_version = SystemTable->Hdr.Revision;
+efi_fw_vendor = SystemTable->FirmwareVendor;
+efi_fw_revision = SystemTable->FirmwareRevision;
+
+StdOut = SystemTable->ConOut;
+StdErr = SystemTable->StdErr ?: StdOut;
+}
+
 static void __init setup_efi_pci(void)
 {
 EFI_STATUS status;
@@ -721,18 +737,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 set_bit(EFI_PLATFORM, &efi.flags);
 #endif
 
-efi_ih = ImageHandle;
-efi_bs = SystemTable->BootServices;
-efi_bs_revision = efi_bs->Hdr.Revision;
-efi_rs = SystemTable->RuntimeServices;
-efi_ct = SystemTable->ConfigurationTable;
-efi_num_ct = SystemTable->NumberOfTableEntries;
-efi_version = SystemTable->Hdr.Revision;
-efi_fw_vendor = SystemTable->FirmwareVendor;
-efi_fw_revision = SystemTable->FirmwareRevision;
+efi_init(ImageHandle, SystemTable);
 
-StdOut = SystemTable->ConOut;
-StdErr = SystemTable->StdErr ?: StdOut;
 use_cfg_file = efi_arch_use_config_file(SystemTable);
 
 status = efi_bs->HandleProtocol(ImageHandle, &loaded_image_guid,
-- 
1.7.10.4




[Xen-devel] [PATCH v2 16/23] efi: split out efi_variables()

2015-07-20 Thread Daniel Kiper
..which collects variable store parameters. We want to re-use this
code to support multiboot2 protocol on EFI platforms.

Signed-off-by: Daniel Kiper 
---
v2 - suggestions/fixes:
   - improve commit message
 (suggested by Jan Beulich).
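
A note on the (efi_rs->Hdr.Revision >> 16) >= 2 test in efi_variables(): as far as I understand the UEFI table header, the revision keeps the major version in the upper 16 bits and the minor version in the lower 16 bits, and QueryVariableInfo() is only available from runtime services revision 2.0 on. A small sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Major version: upper 16 bits of the table revision. */
static uint16_t efi_rev_major(uint32_t rev)
{
    return (uint16_t)(rev >> 16);
}

/* Minor version: lower 16 bits of the table revision. */
static uint16_t efi_rev_minor(uint32_t rev)
{
    return (uint16_t)(rev & 0xffff);
}

/* Mirrors the check in the patch: call QueryVariableInfo() only on 2.x+. */
static int has_query_variable_info(uint32_t rs_revision)
{
    return efi_rev_major(rs_revision) >= 2;
}
```

On older firmware the patch does not make the call at all and substitutes EFI_INCOMPATIBLE_VERSION as the status, so the warning path is shared with a genuine failure of the call.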
---
 xen/common/efi/boot.c |   41 -
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index fd62125..177697a 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -837,6 +837,29 @@ static void __init setup_efi_pci(void)
 efi_bs->FreePool(handles);
 }
 
+static void __init efi_variables(void)
+{
+EFI_STATUS status;
+
+status = (efi_rs->Hdr.Revision >> 16) >= 2 ?
+ efi_rs->QueryVariableInfo(EFI_VARIABLE_NON_VOLATILE |
+   EFI_VARIABLE_BOOTSERVICE_ACCESS |
+   EFI_VARIABLE_RUNTIME_ACCESS,
+   &efi_boot_max_var_store_size,
+   &efi_boot_remain_var_store_size,
+   &efi_boot_max_var_size) :
+ EFI_INCOMPATIBLE_VERSION;
+if ( EFI_ERROR(status) )
+{
+efi_boot_max_var_store_size = 0;
+efi_boot_remain_var_store_size = 0;
+efi_boot_max_var_size = status;
+PrintStr(L"Warning: Could not query variable store: ");
+DisplayUint(status, 0);
+PrintStr(newline);
+}
+}
+
 static int __init __maybe_unused set_color(u32 mask, int bpp, u8 *pos, u8 *sz)
 {
if ( bpp < 0 )
@@ -1078,23 +1101,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 setup_efi_pci();
 
 /* Get snapshot of variable store parameters. */
-status = (efi_rs->Hdr.Revision >> 16) >= 2 ?
- efi_rs->QueryVariableInfo(EFI_VARIABLE_NON_VOLATILE |
-   EFI_VARIABLE_BOOTSERVICE_ACCESS |
-   EFI_VARIABLE_RUNTIME_ACCESS,
-   &efi_boot_max_var_store_size,
-   &efi_boot_remain_var_store_size,
-   &efi_boot_max_var_size) :
- EFI_INCOMPATIBLE_VERSION;
-if ( EFI_ERROR(status) )
-{
-efi_boot_max_var_store_size = 0;
-efi_boot_remain_var_store_size = 0;
-efi_boot_max_var_size = status;
-PrintStr(L"Warning: Could not query variable store: ");
-DisplayUint(status, 0);
-PrintStr(newline);
-}
+efi_variables();
 
 efi_arch_memory_setup();
 
-- 
1.7.10.4



