Re: [Xen-devel] [PATCH OSSTEST 07/12] For hvm guest configuration, config console to 'hvc0'

2015-02-12 Thread Hu, Robert


Best Regards,
Robert Ho

> -----Original Message-----
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Thursday, February 12, 2015 1:04 AM
> To: Hu, Robert
> Cc: xen-devel@lists.xen.org; jfeh...@suse.com; wei.l...@citrix.com;
> ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 07/12] For hvm guest configuration, config
> console to 'hvc0'
> 
> Robert Ho writes ("[PATCH OSSTEST 07/12] For hvm guest configuration, config
> console to 'hvc0'"):
> > ---
> >  Osstest/TestSupport.pm | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
> > index c23bbc7..864805e 100644
> > --- a/Osstest/TestSupport.pm
> > +++ b/Osstest/TestSupport.pm
> > @@ -1753,7 +1753,11 @@ sub target_kernkind_check ($) {
> >  if ($kernkind eq 'pvops') {
> >  store_runvar($pfx."rootdev", 'xvda') if $isguest;
> >  store_runvar($pfx."console", 'hvc0');
> > -} elsif ($kernkind !~ m/2618/) {
> > +}
> > +elsif ($kernkind eq 'hvm'){
> > +store_runvar($pfx."console", 'hvc0');   #nested hvm guest shall not append console=xvc0; I guess this applies to all hvm guests.
> > +}
> > +elsif ($kernkind !~ m/2618/) {
> 
> I don't understand why this is necessary.  Surely all the kernels here
> are pvops so the kernkind should be 'pvops' in all cases and the
> console will be set to hvc0 anyway ?
Oh, this is my mistake. Will revert.
> 
> Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-12 Thread Julien Grall



On 13/02/2015 15:12, Ard Biesheuvel wrote:
> On 13 February 2015 at 15:03, Julien Grall wrote:
>> Hi Ard,
>>
>> On 12/02/2015 19:29, Ard Biesheuvel wrote:
>>> This patch registers hvc0 as the preferred console if no console
>>> has been specified explicitly on the kernel command line.
>>>
>>> The purpose is to allow platform agnostic kernels and boot images
>>> (such as distro installers) to boot in a Xen/ARM domU without the
>>> need to modify the command line by hand.
>>>
>>> Signed-off-by: Ard Biesheuvel
>>> ---
>>>   arch/arm/xen/enlighten.c | 4
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>>> index 0abeefa7dbf8..927be1d1bad7 100644
>>> --- a/arch/arm/xen/enlighten.c
>>> +++ b/arch/arm/xen/enlighten.c
>>> @@ -24,6 +24,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>>
>>>   #include 
>>>
>>> @@ -255,6 +256,9 @@ void __init xen_early_init(void)
>>>  xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
>>>  else
>>>  xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
>>> +
>>> +   if (!console_set_on_cmdline && !xen_initial_domain())
>>> +   add_preferred_console("hvc", 0, NULL);
>>
>> Unfortunately, this won't work as expected.
>
> Did you try it?

No, I just looked at the code.

>> console_set_on_cmdline is set when Linux parses the early params. The
>> parsing is done after setup_arch (the function which calls xen_early_init).
>>
>> So we will end up adding the HVC console even if a console has been passed
>> on the command line.
>
> parse_early_param() is also called by setup_arch(), before xen_early_init()
>
> The call to parse_early_param() in generic code is only there for
> architectures that don't call it in their setup_arch()

Oh, right. Sorry for the noise.

So:

Reviewed-by: Julien Grall 

Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH OSSTEST 10/12] Compose the main body of test-nested test job.

2015-02-12 Thread Hu, Robert
> -----Original Message-----
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Thursday, February 12, 2015 1:07 AM
> To: Hu, Robert
> Cc: xen-devel@lists.xen.org; jfeh...@suse.com; wei.l...@citrix.com;
> ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 10/12] Compose the main body of test-nested
> test job.
> 
> Robert Ho writes ("[PATCH OSSTEST 10/12] Compose the main body of
> test-nested test job."):
> >  Compose the main body of test-nested test job.
> 
> Ah, this is what I was missing earlier.  You really need to order this
> so that things come after things which depend on them.
> 
> Typically:
>  * cleanups
>  * define new TestSupport facilities
>  * define new ts-* scripts if any
>  * define new recipes
>  * updates to make-flight to define new jobs.
> 
OK, will follow this order.
> > +proc need-hosts/test-nested {} {return host}
> > +proc run-job/test-nested {} {
> > +run-ts . = ts-debian-hvm-install + host + nested + nested_L1
> 
> ts-debian-hvm-install takes only two arguments.  You are passing 3.
> I guess this is in further patches...
sure:)
> 
> Ian.



Re: [Xen-devel] [PATCH v2] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-12 Thread Ard Biesheuvel
On 13 February 2015 at 15:03, Julien Grall  wrote:
> Hi Ard,
>
>
> On 12/02/2015 19:29, Ard Biesheuvel wrote:
>>
>> This patch registers hvc0 as the preferred console if no console
>> has been specified explicitly on the kernel command line.
>>
>> The purpose is to allow platform agnostic kernels and boot images
>> (such as distro installers) to boot in a Xen/ARM domU without the
>> need to modify the command line by hand.
>>
>> Signed-off-by: Ard Biesheuvel 
>> ---
>>   arch/arm/xen/enlighten.c | 4 
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>> index 0abeefa7dbf8..927be1d1bad7 100644
>> --- a/arch/arm/xen/enlighten.c
>> +++ b/arch/arm/xen/enlighten.c
>> @@ -24,6 +24,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>
>>   #include 
>>
>> @@ -255,6 +256,9 @@ void __init xen_early_init(void)
>> xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
>> else
>> xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
>> +
>> +   if (!console_set_on_cmdline && !xen_initial_domain())
>> +   add_preferred_console("hvc", 0, NULL);
>
>
> Unfortunately, this won't work as expected.
>

Did you try it?

> console_set_on_cmdline is set when Linux parses the early params. The
> parsing is done after setup_arch (the function which calls xen_early_init).
>
> So we will end up adding the HVC console even if a console has been passed
> on the command line.
>

parse_early_param() is also called by setup_arch(), before xen_early_init()

The call to parse_early_param() in generic code is only there for
architectures that don't call it in their setup_arch()

-- 
Ard.

>>   }
>>
>>   static int __init xen_guest_init(void)
>>
>
> Regards,
>
> --
> Julien Grall
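
The disagreement in this thread comes down to ordering: the check in xen_early_init() only does what the patch intends if the command line has already been parsed when it runs. The following is a toy model in Python, not kernel code. The function boot() and its parse_before_xen_init switch are hypothetical names invented for the illustration, and the model makes no claim about which kernel code path actually handles console=; it only shows how the outcome depends on the order.

```python
def boot(cmdline, parse_before_xen_init):
    """Toy model of the boot ordering; returns the list of preferred consoles."""
    state = {"console_set_on_cmdline": False, "preferred": []}

    def parse_params():
        # Stand-in for command-line parsing: record every console= option.
        for arg in cmdline.split():
            if arg.startswith("console="):
                state["console_set_on_cmdline"] = True
                state["preferred"].append(arg[len("console="):])

    def xen_early_init(initial_domain=False):
        # Mirrors the condition added by the patch.
        if not state["console_set_on_cmdline"] and not initial_domain:
            state["preferred"].append("hvc0")

    if parse_before_xen_init:
        parse_params()
        xen_early_init()
    else:
        xen_early_init()
        parse_params()
    return state["preferred"]


print(boot("root=/dev/xvda1", True))    # ['hvc0']
print(boot("console=ttyAMA0", True))    # ['ttyAMA0']
print(boot("console=ttyAMA0", False))   # ['hvc0', 'ttyAMA0']
```

With parsing first, hvc0 is only added when no console was given. With parsing second, hvc0 is always added, which is the behaviour Julien was worried about.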



Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-12 Thread Hu, Robert

> -----Original Message-----
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Friday, February 13, 2015 2:32 AM
> To: Wei Liu
> Cc: Hu, Robert; xen-devel@lists.xen.org; jfeh...@suse.com;
> ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 01/12] Add support of parsing grub which has
> 'submenu' primitive
> 
> Wei Liu writes ("Re: [PATCH OSSTEST 01/12] Add support of parsing grub which
> has 'submenu' primitive"):
> > On Thu, Feb 12, 2015 at 02:01:59AM +, Hu, Robert wrote:
> > > Yes, this minor change just gives the 'parsemenu' subroutine the
> > > capability of recognizing 'submenu'.
> > > The outer layer logic isn't affected.
> > > Actually, the Xen boot menuentry we choose is inside a submenu. It works
> > > and /etc/default/grub is assigned properly.
> 
> Great.
> 
> > In any case this is a very useful improvement.
> 
> Yes, indeed!
> 
> > Out of interest what Linux are you running?  If you're running Debian
> > and the overlay 20_linux_xen (inside $OSSTEST/overlay/etc/etc/grub.d) is
> > copied to your test host, there shouldn't be any submenu entries in your
> > grub.cfg, I think.
> 
> I consider that a workaround (and I think so do you).
> 
> So I think subject to the (rather daft) argument we are having over
> whitespace this is a really useful patch.
> 
> > > > Can you please not adjust the whitespace ?  osstest in general doesn't
> > > > have a requirement for any particular whitespace use, and certainly if
> > > > there are to be any whitespace changes they ought to be in a separate
> > > > patch.
> > >
> > > I adjusted those because someone, in a reply to the last version, told us
> > > that osstest prefers spaces to tabs,
> 
> I'm sorry that we seem to be having a disagreement over this.  That's
> not very helpful for you, I realise!
> 
> I hope that whoever made those comments would agree that whitespace
> cleanups should at least be in a separate patch.  So please when you
> resubmit can you split them out ?
Sure, will separate white space change and indentation adjustments out.
> 
> I can't seem to find the email you refer to.  Do you happen to be able
> to give me a reference ?
> 
> > > and traditionally 4 spaces per tab. (This aligns with my
> > > previous coding experience as well.)
> 
> 4-character tabs are quite unusual in the Free Software world.  8 is
> usual.
> 
> > > And I do find that this hunk of code doesn't look good in my editor.
> > > Its misalignment makes it harder to read.
> 
> Since evidently this is annoying to you I won't stand in the way of
> your effort to clean this up, even though I don't much care about it.
> So if you submit this as a separate patch I won't block it.
Thanks for your understanding.
> 
> But maybe simply configuring your editor to use 8-character tabs will
> fix the problem for you ?  That would be less work than preparing
> whitespace adjustment patches.
OK, will have a try first. :)
> 
> Thanks,
> Ian.
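
Why the editor's tab width matters here can be shown with two made-up lines, one indented with a tab and one with eight spaces. This is only an illustration in Python using str.expandtabs; the sample lines are hypothetical and not taken from osstest.

```python
tabbed = "\tif ($x) {"            # indented with one tab
spaced = " " * 8 + "if ($x) {"    # indented with eight spaces


def same_column(tabsize):
    """True if both lines start at the same column for this tab width."""
    return tabbed.expandtabs(tabsize) == spaced.expandtabs(tabsize)


print(same_column(8))  # True
print(same_column(4))  # False
```

Code that mixes the two styles lines up at a tab width of 8 and drifts apart at 4, which matches the misalignment described above.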



Re: [Xen-devel] [PATCH v2] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-12 Thread Julien Grall

Hi Ard,

On 12/02/2015 19:29, Ard Biesheuvel wrote:
> This patch registers hvc0 as the preferred console if no console
> has been specified explicitly on the kernel command line.
>
> The purpose is to allow platform agnostic kernels and boot images
> (such as distro installers) to boot in a Xen/ARM domU without the
> need to modify the command line by hand.
>
> Signed-off-by: Ard Biesheuvel
> ---
>   arch/arm/xen/enlighten.c | 4
>   1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 0abeefa7dbf8..927be1d1bad7 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -24,6 +24,7 @@
>   #include 
>   #include 
>   #include 
> +#include 
>
>   #include 
>
> @@ -255,6 +256,9 @@ void __init xen_early_init(void)
> xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
> else
> xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
> +
> +   if (!console_set_on_cmdline && !xen_initial_domain())
> +   add_preferred_console("hvc", 0, NULL);

Unfortunately, this won't work as expected.

console_set_on_cmdline is set when Linux parses the early params. The
parsing is done after setup_arch (the function which calls xen_early_init).

So we will end up adding the HVC console even if a console has been
passed on the command line.

>   }
>
>   static int __init xen_guest_init(void)



Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH OSSTEST 12/12] Changes to test step of xen install

2015-02-12 Thread Hu, Robert
> -----Original Message-----
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Friday, February 13, 2015 2:21 AM
> To: Hu, Robert
> Cc: xen-devel@lists.xen.org; ian.jack...@eu.citrix.com; jfeh...@suse.com;
> wei.l...@citrix.com; ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 12/12] Changes to test step of xen install
> 
> Robert Ho writes ("[PATCH OSSTEST 12/12] Changes to test step of xen
> install"):
> >  This patch accommodates ts-xen-install to nested L1 xen installation
> >  usage. Its change is relatively simpler than
> >  ts-debian-hvm-install. We simply alter '$ho' usage to 'w_ho', which
> >  is assigned to '$ho' in original L0 installation context, while
> >  assigned to '$gho' in L1 Xen installation context.
> 
> Certainly I think we should use ts-xen-install for installing Xen on
> the L1.  But I think that ts-xen-install should probably think almost
> entirely about the L1 and $ho should be the L1.
> 
> I think if you followed the suggestion for the structure that I made
> in my previous patch, very little of the changes here would be
> necessary.
> 
> It's not clear to me that _anything_ would need to change in
> ts-xen-install, in fact.  (Although I may be wrong.)
Reading selecthost() and its related callers more deeply, I think you
are right.
What needs to change is selecthost(), not the xen install step.
> 
> Thanks,
> Ian.



Re: [Xen-devel] [PATCH 00/17] blktap2 related bugfix patches

2015-02-12 Thread Hongyang Yang

Hi George,

On 11/03/2014 05:58 PM, George Dunlap wrote:

> On 10/29/2014 05:49 AM, Wen Congyang wrote:
>> On 10/20/2014 10:25 PM, George Dunlap wrote:
>>> On Wed, Oct 15, 2014 at 2:05 AM, Wen Congyang wrote:
>>>> On 10/14/2014 11:48 PM, Ian Jackson wrote:
>>>>> Wen Congyang writes ("[PATCH 00/17] blktap2 related bugfix patches"):
>>>>>> These bugs are found when we implement COLO, or rebase
>>>>>> COLO to upstream xen. They are independent patches, so
>>>>>> post them in separate series.
>>>>>
>>>>> blktap2 is unmaintained AFAICT.
>>>>>
>>>>> In the last year there has been only one commit which shows evidence
>>>>> of someone caring even slightly about tools/blktap2/.  The last
>>>>> substantial attention was in March 2013.
>>>>>
>>>>> (I'm disregarding commits which touch tools/blktap2/ to fix up compile
>>>>> problems with new compilers, sort out build system and file
>>>>> rearrangements, etc.)
>>>>>
>>>>> The file you are touching in your 01/17 was last edited (by anyone, at
>>>>> all) in January 2010.
>>>>>
>>>>> Under the circumstances, we should probably take all these changes
>>>>> without looking for anyone to ack them.
>>>>>
>>>>> Perhaps you would like to become the maintainers of blktap2 ? :-)
>>>>
>>>> Hmm, I don't have any knowledge about disk format, but blktap2 have
>>>> such codes(For example: block-vhd.c, block-qcow.c...). I think I can
>>>> maintain the rest codes.
>>>
>>> Congyang, were you aware that XenServer has a fork of blktap that is
>>> actually still under active development and maintainership outside of
>>> the main Xen tree?
>>>
>>> git://github.com/xen-org/blktap.git
>>>
>>> Both CentOS and Fedora are actually using snapshots of the "blktap2"
>>> branch in that tree for their Xen RPMs.  (I'm sure CentOS is, I
>>> believe Fedora is.)  It's not unlikely that the bugs you're fixing
>>> here have already been fixed in the XenServer fork.
>>
>> I have another question:
>> Why we don't merge the "blktap2" branch into xen upstream periodically?
>
> I take it you've found blktap "2.5" useful? :-)
>
> I've been meaning to write an e-mail about this.
>
> The basic reason is that it's normally up to the people doing the development to
> submit changes upstream.  Some years ago XenServer forked the blktap2 codebase
> but got behind in upstreaming things; at this point there are far too many
> changes to simply push them upstream. Furthermore, even XenServer isn't 100%
> sure what they're going to do in the future; as of a year ago they were planning
> to get rid of blktap entirely in favor of another solution.
>
> One of the ideas I'm going to discuss in my e-mail is the idea of treating
> blktap2.5 (and/or blktap3) as an external upstream project, similar to the way
> that we treat qemu, seabios, ipxe, and ovmf. That would have a similar effect to
> what you describe.

How is this going?

>   -George
> .



--
Thanks,
Yang.



Re: [Xen-devel] [PATCH OSSTEST 11/12] Changes on test step of debain hvm guest install

2015-02-12 Thread Hu, Robert
> -----Original Message-----
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Friday, February 13, 2015 2:17 AM
> To: Hu, Robert
> Cc: xen-devel@lists.xen.org; jfeh...@suse.com; wei.l...@citrix.com;
> ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 11/12] Changes on test step of debain hvm guest
> install
> 
> Robert Ho writes ("[PATCH OSSTEST 11/12] Changes on test step of debain hvm
> guest install"):
> >  This patch is to make ts-debian-hvm-install accommodate
> 
> Ah yes here is the meat.
> 
> Firstly, can you please reformat your commit message so that the
> individual points are separated out into paragraphs.  But I think
> actually that probably some of this wants to go into different commits
> (and perhaps different places).
You mean dividing this patch into more pieces/commits?
> 
> > diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
> > index 37eade2..e905698 100755
> > --- a/ts-debian-hvm-install
> > +++ b/ts-debian-hvm-install
> > @@ -28,22 +28,30 @@ if (@ARGV && $ARGV[0] =~ m/^--stage(\d+)$/) { $stage=$1; shift @ARGV; }
> ...
> > +if ($nested eq 'nested_L1') {
> > +$gn ||= 'nested';
> > +$guesthost ||= "$gn.l1.osstest";
> > +} elsif ($nested eq 'nested_L2') {
> > +$whhost = 'L1_host';
> > +$gn ||= 'nested2';
> > +$guesthost ||= "$gn.l2.osstest";
> > +} else {
> > +$gn ||= 'debianhvm';
> > +$guesthost= "$gn.guest.osstest";
> > +}
> 
> I don't think this is the right way to control the nestedness.
> Also your test recipe seems wrong.  You write:
> 
> +run-ts . = ts-debian-hvm-install + host + nested + nested_L1
> +run-ts . = ts-xen-install + host + nested + nested_build
> +run-ts . = ts-debian-hvm-install + host + nested2 + nested_L2
> +run-ts . = ts-guest-destroy + host + nested
> 
> I think this should look more like:
> 
> +run-ts . = ts-debian-hvm-install + host + nested
> +run-ts . = ts-nested-setup + host + nested
> +run-ts . = ts-xen-install nested
> +run-ts . = ts-host-reboot nested
> +run-ts . = ts-debian-hvm-install nested nested2
> 
OK. Since we could only learn osstest's design and hierarchy (terms such
as test job, test step, recipe, etc.) by reading the code, we inevitably
misunderstood or missed some things.
Fortunately we are getting closer and closer to what you have in mind now.
Will follow the recipe you composed above.
> ts-nested-setup would turn on nested HVM support in the domain config,
> figures out the hostname etc. and makes some appropriate runvars which
> selecthost would recognise.
> 
Thanks for this help.
> I don't know why you need to use a dedicated VG for your nested
> guests's L2 guests - please explain - but if you do, probably
> ts-nested-setup could set it up.
The existing ts-debian-hvm-install code presumes the host has a VG set up
for guest installation. To keep the code change minimal, we'd like
to mirror that presumption for the L2 guest's host.
> 
> > @@ -63,7 +71,7 @@ d-i partman-auto/expert_recipe string \\
> >  use_filesystem{ } filesystem{ vfat } \\
> >  mountpoint{ /boot/efi } \\
> >  . \\
> > -5000 50 5000 ext4 \\
> > +1 50 1 ext4 \\
> 
> I think this needs an explanation.  You mentioned it in your commit
> message but didn't give reasons.  I think this should perhaps be done
> in a different way.
You mean we should not increase the size uniformly, but only conditionally
for L1?
> 
> > +if ($nested eq 'nested_L2') {
> > +my $L2_disk_mb = 2;
> > +my $L0= selecthost($r{'L0_Ident'});
> 
> As a style matter, runvars and perl local variable generally have
> all-lowercase names.
Sure, will follow the convention.
> 
> > +if ($nested eq 'nested_L2') {
> > +target_cmd_root($gho, "init 0");
> > +target_await_down($gho,60);
> > +target_ping_check_down($gho);
> > +}
> > +if ($nested eq 'nested_L1') {
> > +store_runvar("L1_host", $gn);
> > +store_runvar("L1_IP", $gho->{Ip});
> > +store_runvar("L0_Ident", $whhost);
> > +target_cmd_root($gho, "mkdir -p /home/osstest/.ssh && cp /root/.ssh/authorized_keys /home/osstest/.ssh/");
> > +}
> 
> I don't understand the purpose behind these special cases.
The first block shuts down the L2 guest after proving that it boots up.
The second block runs in the L1 context and stores runvars to pass
information down to L2.
To follow your recipe, these parts shall be moved to other ts-* scripts.
> 
> Thanks,
> Ian.



[Xen-devel] [PATCH V4] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
Paravirt spinlock clears slowpath flag after doing unlock.
As explained by Linus currently it does:
prev = *lock;
add_smp(&lock->tickets.head, TICKET_LOCK_INC);

/* add_smp() is a full mb() */

if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
__ticket_unlock_slowpath(lock, prev);

which is *exactly* the kind of things you cannot do with spinlocks,
because after you've done the "add_smp()" and released the spinlock
for the fast-path, you can't access the spinlock any more.  Exactly
because a fast-path lock might come in, and release the whole data
structure.

Linus suggested that we should not do any writes to lock after unlock(),
and we can move slowpath clearing to fastpath lock.

So this patch implements the fix with:
1. Moving slowpath flag to head (Oleg):
Unlocked locks don't care about the slowpath flag; therefore we can keep
it set after the last unlock, and clear it again on the first (try)lock.
-- this removes the write after unlock. note that keeping slowpath flag would
result in unnecessary kicks.
By moving the slowpath flag from the tail to the head ticket we also avoid
the need to access both the head and tail tickets on unlock.

2. use xadd to avoid read/write after unlock that checks the need for
unlock_kick (Linus):
We further avoid the need for a read-after-release by using xadd;
the prev head value will include the slowpath flag and indicate if we
need to do PV kicking of suspended spinners -- on modern chips xadd
isn't (much) more expensive than an add + load.
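
The point of item 2 can be illustrated with a small single-threaded model in Python. It is a sketch of the idea only, not the patch's C code: xadd_head() stands in for the atomic fetch-and-add, the constants are assumed values (bit 0 of head as the slowpath flag, tickets advancing by 2), and kick() is simplified to take just the ticket number.

```python
TICKET_SLOWPATH_FLAG = 1   # assumed: bit 0 of head marks "waiters need a kick"
TICKET_LOCK_INC = 2        # assumed: tickets advance by 2 so bit 0 stays free


class TicketLock:
    def __init__(self, head=0, tail=0):
        self.head = head
        self.tail = tail


def xadd_head(lock, inc):
    """Stand-in for an atomic fetch-and-add: bump head, return the old value."""
    old = lock.head
    lock.head = old + inc
    return old


def unlock(lock, kick):
    # The xadd is the release. Everything after it uses only the returned
    # value, so the lock memory is neither read nor written again.
    prev_head = xadd_head(lock, TICKET_LOCK_INC)
    if prev_head & TICKET_SLOWPATH_FLAG:
        next_ticket = (prev_head & ~TICKET_SLOWPATH_FLAG) + TICKET_LOCK_INC
        kick(next_ticket)


kicked = []
lock = TicketLock(head=4 | TICKET_SLOWPATH_FLAG, tail=8)
unlock(lock, kicked.append)
print(kicked)      # [6]
print(lock.head)   # 7: the flag stays set until a later (try)lock clears it
```

Because the decision to kick is taken from the value returned by the fetch-and-add, nothing needs to look at the lock after it has been released, which is the property the changelog is after.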

Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
 benchmark  overcommit  %improve
 kernbench  1x          -0.13
 kernbench  2x           0.02
 dbench     1x          -1.77
 dbench     2x          -0.63

[Jeremy: hinted missing TICKET_LOCK_INC for kick]
[Oleg: Moving slowpath flag to head, ticket_equals idea]
[PeterZ: Detailed changelog]

Reported-by: Sasha Levin 
Suggested-by: Linus Torvalds 
Signed-off-by: Raghavendra K T 
---
 arch/x86/include/asm/spinlock.h | 91 -
 arch/x86/kernel/kvm.c   | 10 +++--
 arch/x86/xen/spinlock.c | 10 +++--
 3 files changed, 56 insertions(+), 55 deletions(-)

potential TODO:
 * Should the whole patch be split into: 1. move slowpath flag,
 2. fix memory corruption in completion problem ??

Changes since V3:
  - Detailed changelog (PeterZ)
  - Replace ACCESS_ONCE with READ_ONCE (oleg)
  - Add xen changes (Oleg)
  - Correct break logic in unlock_wait() (Oleg)

Changes since V2:
  - Move the slowpath flag to head, this enables xadd usage in unlock code
    and in turn we can get rid of read/write after unlock (Oleg)
  - usage of ticket_equals (Oleg)

Changes since V1:
  - Add missing TICKET_LOCK_INC before unlock kick (fixes hang in overcommit:
    Jeremy).
  - Remove SLOWPATH_FLAG clearing in fast lock. (Jeremy)
  - clear SLOWPATH_FLAG in arch_spin_value_unlocked during comparison.
 Note: The current implementation is still based on avoiding writes after unlock.
  We could still have a potential invalid memory read. (Sasha)

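 Result:
 setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest)
base = 3_19_rc7

3_19_rc7_spinfix_v3
+----+------------+----------+------------+----------+----------+
 kernbench (Time taken in sec, lower is better)
+----+------------+----------+------------+----------+----------+
      base         %stdev     patched      %stdev     %improve
+----+------------+----------+------------+----------+----------+
1x      54.2300     3.0652      54.3008     4.0366    -0.13056
2x      90.1883     5.5509      90.1650     6.4336     0.02583
+----+------------+----------+------------+----------+----------+
+----+------------+----------+------------+----------+----------+
 dbench (Throughput, higher is better)
+----+------------+----------+------------+----------+----------+
      base         %stdev     patched      %stdev     %improve
+----+------------+----------+------------+----------+----------+
1x    7029.9188     2.5952    6905.0712     4.4737    -1.77595
2x    3254.2075    14.8291    3233.7137    26.8784    -0.62976
+----+------------+----------+------------+----------+----------+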

 (Here are the results I got with the patches. I believe there may
 be some small overhead from xadd etc., but overall it looks fine;
 a more thorough test may be needed.)
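
The %improve column can be recomputed from the base and patched figures. The formula in this Python sketch is inferred from the numbers rather than stated in the mail: the change relative to base, with the sign flipped for kernbench because lower is better there. The helper name improve_pct is made up for the illustration.

```python
def improve_pct(base, patched, higher_is_better):
    """Percentage improvement of patched over base (assumed formula)."""
    delta = patched - base if higher_is_better else base - patched
    return 100.0 * delta / base


rows = [
    ("kernbench 1x", 54.2300, 54.3008, False),
    ("kernbench 2x", 90.1883, 90.1650, False),
    ("dbench 1x", 7029.9188, 6905.0712, True),
    ("dbench 2x", 3254.2075, 3233.7137, True),
]

for name, base, patched, higher_is_better in rows:
    print(f"{name}: {improve_pct(base, patched, higher_is_better):.5f}")
```

Under that assumption the results agree with the table to five decimal places, so the small negative entries are regressions of well under 2% and sit inside the reported standard deviations.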

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 625660f..646a1a3 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -46,7 +46,7 @@ static __always_inline bool static_key_false(struct static_key *key);
 
 static inline void __ticket_enter_slowpath(arch_spinlock_t *lock)
 {
-   set_bit(0, (volatile unsigned long *)&lock->tickets.tail);
+   set_bit(0, (volatile unsigned long *)&lock->tickets.head);
 }
 
 #else  /* !CONFIG_PARAVIRT_SPINLOCKS */
@@ -60,10 +60,30 @@ static inline void __ticket_unlock_kick(arch_spinlock_t *lock,
 }
 
 #endif /* CONFIG_P

Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-12 Thread Hu, Robert
> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: Thursday, February 12, 2015 6:13 PM
> To: Hu, Robert
> Cc: Ian Jackson; xen-devel@lists.xen.org; jfeh...@suse.com;
> wei.l...@citrix.com; ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 01/12] Add support of parsing grub which has
> 'submenu' primitive
> 
> On Thu, Feb 12, 2015 at 02:01:59AM +, Hu, Robert wrote:
> >
> > > -----Original Message-----
> > > From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> > > Sent: Wednesday, February 11, 2015 10:44 PM
> > > To: Hu, Robert
> > > Cc: xen-devel@lists.xen.org; jfeh...@suse.com; wei.l...@citrix.com;
> > > ian.campb...@citrix.com; Pang, LongtaoX
> > > Subject: Re: [PATCH OSSTEST 01/12] Add support of parsing grub which has
> > > 'submenu' primitive
> > >
> > > Robert Ho writes ("[PATCH OSSTEST 01/12] Add support of parsing grub which
> > > has 'submenu' primitive"):
> > > >  From a hvm kernel build from Linux stable Kernel tree,
> > > >  the auto generated grub2 menu will have 'submenu' primitive, upon the
> > > >  'menuentry' items. Xen boot entries will be grouped into a submenu.
> > > >  This patch adds capability to support such grub formats. Also, this
> > > >  patch adjusts some indent alignments.
> > >
> > > Thanks for this submission.  Dealing with submenus is definitely
> > > something we want to do.
> > >
> > > I haven't looked at the code in detail yet but I have a general
> > > question: we currently count menu entries and eventually write
> > > GRUB_DEFAULT=  into /etc/default/grub.
> > >
> > > Does this work properly if the entry is in a submenu ?  I guess you
> > > have probably tested this but I thought I should ask...
> > >
> > Yes, this minor change just gives the 'parsemenu' subroutine the
> > capability of recognizing 'submenu'.
> > The outer layer logic isn't affected.
> > Actually, the Xen boot menuentry we choose is inside a submenu. It works
> > and /etc/default/grub is assigned properly.
> 
> In any case this is a very useful improvement.
> 
> Out of interest what Linux are you running?  If you're running Debian
> and the overlay 20_linux_xen (inside $OSSTEST/overlay/etc/etc/grub.d) is
> copied to your test host, there shouldn't be any submenu entries in your
> grub.cfg, I think.
> 
> Wei.
We use Debian + linux-stable kernel in the test.
Didn't look into details of the grub generating procedure, but my observation
is that it does have the submenu.
> 
> > > Can you please not adjust the whitespace ?  osstest in general doesn't
> > > have a requirement for any particular whitespace use, and certainly if
> > > there are to be any whitespace changes they ought to be in a separate
> > > patch.
> > I adjusted those because someone, in a reply to the last version, told us
> > that osstest prefers spaces to tabs, traditionally 4 spaces per tab.
> > (This aligns with my previous coding experience as well.)
> > And I do find that this hunk of code doesn't look good in my editor.
> > Its misalignment makes it harder to read.
> > I would suggest adjusting this hunk's indentation and using spaces
> > instead of tabs, so that it displays well across different editors.
> > >
> > > Thanks,
> > > Ian.
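
The question raised in this thread, whether counting menu entries still yields a usable GRUB_DEFAULT once entries sit inside a submenu, can be sketched as follows. This is a Python illustration, not osstest's Perl 'parsemenu' code. The function parse_menu, the sample menu and the "N>M" identifiers (the form GRUB 2 is understood to accept for entries nested in a submenu) are all assumptions made for the example.

```python
import re

ENTRY_RE = re.compile(r"(menuentry|submenu)\s+(['\"])(.*?)\2")


def parse_menu(text):
    """Return (grub_default, title) pairs for every menuentry in text."""
    entries = []
    open_blocks = []   # kinds of the blocks we are currently inside
    counters = [0]     # next entry index at each nesting level
    prefix = []        # indices of the enclosing submenus
    for line in text.splitlines():
        stripped = line.strip()
        match = ENTRY_RE.match(stripped)
        if match and stripped.endswith("{"):
            kind, title = match.group(1), match.group(3)
            index = counters[-1]
            counters[-1] += 1
            if kind == "submenu":
                prefix.append(index)
                counters.append(0)
            else:
                path = ">".join(str(i) for i in prefix + [index])
                entries.append((path, title))
            open_blocks.append(kind)
        elif stripped == "}" and open_blocks:
            if open_blocks.pop() == "submenu":
                prefix.pop()
                counters.pop()
    return entries


SAMPLE = """\
menuentry 'Debian GNU/Linux' {
    linux /vmlinuz
}
submenu 'Advanced options for Debian GNU/Linux' {
    menuentry 'Debian GNU/Linux, with Linux 3.19' {
        linux /vmlinuz-3.19
    }
    menuentry 'Debian GNU/Linux, with Xen hypervisor' {
        multiboot /xen.gz
    }
}
"""

for path, title in parse_menu(SAMPLE):
    print(path, title)
```

In this model a submenu consumes one index at its own level and restarts counting inside, so the Xen entry in the sample is identified as 1>1 rather than by a flat count.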



[Xen-devel] [xen-unstable test] 34484: regressions - FAIL

2015-02-12 Thread xen . org
flight 34484 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34484/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-libvirt 13 guest-destroy fail REGR. vs. 34341

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 34341

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-sedf                 10 migrate-support-check fail  never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check fail  never pass
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail  never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check fail  never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check fail  never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail  never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail  never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail  never pass
 test-armhf-armhf-xl-credit2              10 migrate-support-check fail  never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail  never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail  never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail  never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail  never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail  never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail  never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail  never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail  never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail  never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail  never pass

version targeted for testing:
 xen  d40cbb98a3eb447b8055ec4e70e93a6f22850ac5
baseline version:
 xen  001324547356af86875fad5003f679571a6b8f1c


People who touched revisions under test:
  Ian Jackson 
  Jan Beulich 
  Paul Durrant 
  Wei Liu 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64 

Re: [Xen-devel] [RESEND Patch V2 1/4] xen: build infrastructure for generating hypercall depending symbols

2015-02-12 Thread Juergen Gross


David still wants a comment from the x86 maintainers...

Juergen

On 01/21/2015 08:49 AM, Juergen Gross wrote:

Today there are several places in the kernel which build tables
containing one entry for each possible Xen hypercall. Create an
infrastructure to be able to generate these tables at build time.

Based-on-patch-by: Jan Beulich 
Signed-off-by: Juergen Gross 
Reviewed-by: David Vrabel 
---
  arch/x86/syscalls/Makefile |  9 +
  scripts/xen-hypercalls.sh  | 12 
  2 files changed, 21 insertions(+)
  create mode 100644 scripts/xen-hypercalls.sh

diff --git a/arch/x86/syscalls/Makefile b/arch/x86/syscalls/Makefile
index 3323c27..a55abb9 100644
--- a/arch/x86/syscalls/Makefile
+++ b/arch/x86/syscalls/Makefile
@@ -19,6 +19,9 @@ quiet_cmd_syshdr = SYSHDR  $@
  quiet_cmd_systbl = SYSTBL  $@
cmd_systbl = $(CONFIG_SHELL) '$(systbl)' $< $@

+quiet_cmd_hypercalls = HYPERCALLS $@
+  cmd_hypercalls = $(CONFIG_SHELL) '$<' $@ $(filter-out $<,$^)
+
  syshdr_abi_unistd_32 := i386
  $(uapi)/unistd_32.h: $(syscall32) $(syshdr)
$(call if_changed,syshdr)
@@ -47,10 +50,16 @@ $(out)/syscalls_32.h: $(syscall32) $(systbl)
  $(out)/syscalls_64.h: $(syscall64) $(systbl)
$(call if_changed,systbl)

+$(out)/xen-hypercalls.h: $(srctree)/scripts/xen-hypercalls.sh
+   $(call if_changed,hypercalls)
+
+$(out)/xen-hypercalls.h: $(srctree)/include/xen/interface/xen*.h
+
  uapisyshdr-y  += unistd_32.h unistd_64.h unistd_x32.h
  syshdr-y  += syscalls_32.h
  syshdr-$(CONFIG_X86_64)   += unistd_32_ia32.h unistd_64_x32.h
  syshdr-$(CONFIG_X86_64)   += syscalls_64.h
+syshdr-$(CONFIG_XEN)   += xen-hypercalls.h

  targets   += $(uapisyshdr-y) $(syshdr-y)

diff --git a/scripts/xen-hypercalls.sh b/scripts/xen-hypercalls.sh
new file mode 100644
index 000..676d922
--- /dev/null
+++ b/scripts/xen-hypercalls.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+out="$1"
+shift
+in="$@"
+
+for i in $in; do
+   eval $CPP $LINUXINCLUDE -dD -imacros "$i" -x c /dev/null
+done | \
+awk '$1 == "#define" && $2 ~ /__HYPERVISOR_[a-z][a-z_0-9]*/ { v[$3] = $2 }
+   END {   print "/* auto-generated by scripts/xen-hypercall.sh */"
+   for (i in v) if (!(v[i] in v))
+   print "HYPERCALL("substr(v[i], 14)")"}' | sort -u >$out
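For readers less familiar with awk, the filter at the end of the script can be sketched in Python as below. This is only an illustration of the logic, not part of the patch: the function name hypercall_lines and the sample input are made up here, and unlike awk the sketch skips #define lines that have no value field. The point it shows is that a name is dropped when it is itself the value of another #define, so an alias is emitted instead of the placeholder it points to.

```python
import re

PREFIX = "__HYPERVISOR_"
NAME_RE = re.compile(r"__HYPERVISOR_[a-z][a-z_0-9]*")


def hypercall_lines(cpp_output):
    """Mimic the awk filter on the text produced by 'cpp -dD'."""
    value_to_name = {}
    for line in cpp_output.splitlines():
        fields = line.split()
        if (len(fields) >= 3 and fields[0] == "#define"
                and NAME_RE.search(fields[1])):
            value_to_name[fields[2]] = fields[1]      # awk: v[$3] = $2
    out = {"/* auto-generated by scripts/xen-hypercall.sh */"}
    for name in value_to_name.values():
        if name not in value_to_name:                 # awk: !(v[i] in v)
            # awk: substr(v[i], 14) strips the 13-character prefix
            out.add("HYPERCALL(" + name[len(PREFIX):] + ")")
    return sorted(out)                                # stands in for sort -u


# Illustrative input only; the real input comes from the Xen interface headers.
SAMPLE = """\
#define __HYPERVISOR_set_trap_table 0
#define __HYPERVISOR_mmu_update 1
#define __HYPERVISOR_arch_0 48
#define __HYPERVISOR_mca __HYPERVISOR_arch_0
#define __XEN_INTERFACE_VERSION__ 0x00040400
"""

for line in hypercall_lines(SAMPLE):
    print(line)
```

With this sample, arch_0 is suppressed because mca is defined in terms of it. Python's sorted() orders by code point, which matches sort -u only in the C locale.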






Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-12 Thread Kai Huang


On 02/12/2015 08:34 PM, Tim Deegan wrote:

Hi,

Thanks for posting this design!

At 16:28 +0800 on 11 Feb (1423668493), Kai Huang wrote:

Design
==

- PML feature is used globally

A new Xen boot parameter, say 'opt_enable_pml', will be introduced to control 
PML feature detection, and the PML feature will only be detected if 
opt_enable_pml = 1. Once PML is detected, it will be used for dirty logging 
for all domains globally. Currently we don't support using PML on a per-domain 
basis, as that would require additional control from the XL tool.

Sounds good.  I agree that there's no point in making this a per-VM
feature.


- PML enable/disable for particular Domain

PML needs to be enabled (allocate the PML buffer, initialize the PML index and 
PML base address, turn PML on in the VMCS, etc.) for all vcpus of the domain, 
as the PML buffer and PML index are per-vcpu, but the EPT table may be shared 
by vcpus. Enabling PML on only some vcpus of the domain won't work. Also, PML 
will only be enabled for the domain when it is switched to dirty logging mode, 
and it will be disabled when the domain is switched back to normal mode. It 
looks like the number of vcpus won't change dynamically while the guest is 
running (correct me if I am wrong here), so we don't have to consider enabling 
PML for newly created vcpus when the guest is in dirty logging mode.


No - you really ought to handle enabling this for new VCPUs.  There
have been cases in the past where VMs are put into log-dirty mode
before their VCPUs are assigned, and there might be again.

"Assigned" here means created?



It ought to be easy to handle, though - just one more check and
function call on the vcpu setup path.
I think "check and function call" means a check plus a function call to 
enable PML on this vcpu? Then what if enabling PML for the vcpu fails 
(possible, as it needs to allocate a 4K PML buffer)? It would be better to 
fall back to write protection instead of failing the creation of the 
vcpu. But in this case there will be a problem if the domain is already 
in log-dirty mode, as we might already have the EPT table set up with the 
D-bit clear for the logdirty range, which means we need to re-check the 
logdirty ranges and reset the EPT entries to read-only. Does this sound 
reasonable?



After PML is enabled for the domain, we only need to clear the EPT entry's 
D-bit for guest memory in dirty logging mode. We achieve this by checking if 
PML is enabled for the domain when p2m_ram_rx is changed to p2m_ram_logdirty, 
and updating the EPT entry accordingly. However, for super pages, we still 
write protect them even with PML, as we still need to split super pages into 
4K pages in dirty logging mode.


IIUC, you are suggesting leaving superpages handled as they are now,
with read-only EPTEs, and only using PML for single-page mappings.
That seems good. :)


- PML buffer flush

There are two places where we need to flush the PML buffer. The first is the 
PML-buffer-full VMEXIT handler (obviously), and the second is 
paging_log_dirty_op (either peek or clean): vcpus run asynchronously while 
paging_log_dirty_op is called from userspace via hypercall, so it's possible 
that dirty GPAs are logged in vcpus' PML buffers that are not yet full. 
Therefore we had better flush all vcpus' PML buffers before reporting dirty 
GPAs to userspace.

We handle the above two cases by flushing the PML buffer at the beginning of 
all VMEXITs. This solves the first case, and it also solves the second case, 
as prior to paging_log_dirty_op, domain_pause is called, which kicks vcpus 
that are in guest mode out of guest mode by sending them an IPI, which causes 
a VMEXIT.


I would prefer to flush only on buffer-full VMEXITs and handle the
peek/clear path by explicitly reading all VCPUs' buffers.  That avoids
putting more code on the fast paths for other VMEXIT types.
OK. But it looks like this requires a new interface like 
paging_flush_log_dirty, called at the beginning of paging_log_dirty_op? This 
is actually what I wanted to avoid originally.
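To make the alternative concrete, here is a rough Python model of flushing only on buffer-full plus an explicit flush of every vcpu before peek/clean. It is a sketch under assumptions, not Xen code: the names Vcpu, log_write and peek_dirty are invented here, paging_flush_log_dirty is only the name floated above, the 512-entry size assumes 8-byte entries in the 4K buffer, and D bits are not modelled (every write is logged).

```python
PML_ENTRIES = 512  # assumption: 4K buffer holding 8-byte GPA entries


class Vcpu:
    """Toy vcpu that owns one PML buffer."""

    def __init__(self):
        self.pml_buffer = []  # GPAs logged since the last flush

    def log_write(self, gpa, dirty_bitmap):
        """Hardware side: log the 4K-aligned GPA, flush when the buffer fills."""
        self.pml_buffer.append(gpa & ~0xFFF)
        if len(self.pml_buffer) == PML_ENTRIES:
            self.flush(dirty_bitmap)  # stands in for the buffer-full VMEXIT

    def flush(self, dirty_bitmap):
        """Move logged GPAs into the log-dirty bitmap as frame numbers."""
        for gpa in self.pml_buffer:
            dirty_bitmap.add(gpa >> 12)
        self.pml_buffer.clear()


def paging_flush_log_dirty(vcpus, dirty_bitmap):
    """Hypothetical helper: drain every vcpu's partially filled buffer."""
    for vcpu in vcpus:
        vcpu.flush(dirty_bitmap)


def peek_dirty(vcpus, dirty_bitmap):
    """Model of the peek path: flush first, then report dirty frames."""
    paging_flush_log_dirty(vcpus, dirty_bitmap)
    return sorted(dirty_bitmap)


bitmap = set()
vcpus = [Vcpu(), Vcpu()]
vcpus[0].log_write(0x1234, bitmap)
vcpus[1].log_write(0x5000, bitmap)
print(sorted(bitmap))             # still empty: nothing has been flushed yet
print(peek_dirty(vcpus, bitmap))  # frames 1 and 5 appear after the flush
```

The trade-off is the one discussed above: the common VMEXIT path stays untouched, at the cost of one extra call at the start of paging_log_dirty_op.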





This also keeps the log-dirty radix tree more up to date, as the PML buffer is 
flushed on every VMEXIT and not only on the PML-buffer-full VMEXIT.

- Video RAM tracking (and partial dirty logging for guest memory range)

Video RAM is in dirty logging mode unconditionally during the guest's 
run-time, and it is only a partial memory range of the guest. However, PML 
operates on the whole guest memory (the whole valid EPT table, more 
precisely), so we need to decide whether to use PML when only partial guest 
memory ranges are in dirty logging mode.

Currently, PML will be used as long as there's guest memory in dirty logging 
mode, whether globally or partially. In the case of partial dirty logging, we 
need to check whether the GPA logged in the PML buffer is in a dirty logging 
range.


I think, as other people have said, that you can just use PML for this
case without any other restrictions.  After all, mappings for non-VRAM
areas ought not to have their D-bits clear anyway.

Agreed.

Thanks,
-Kai


Cheers,

Tim.

Re: [Xen-devel] [RFC v1 5/8] xen: x86: add XEN_PV

2015-02-12 Thread Luis R. Rodriguez
On Thu, Feb 12, 2015 at 11:03:26AM +, David Vrabel wrote:
> On 12/02/15 06:03, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > This lets us rip out under the general XEN config the
> > XEN_HAVE_PVMMU dependency. This only exists on the x86
> > universe. This is as per the agreed upon Xen Kconfig
> > changes [0].
> [...]
> > @@ -52,3 +51,9 @@ config XEN_PVH
> > depends on X86_64 && XEN
> > select XEN_PVHVM
> > def_bool n
> > +
> > +config XEN_PV
> > +   bool "Support for running as a PV guest"
> > +   depends on XEN && X86
> > +   select XEN_HAVE_PVMMU
> > +   def_bool n
> 
> These options should be in this order: XEN_PV, XEN_PVHVM, XEN_PVH.

OK, I'll move them in order and also add a description to
both XEN_PV and XEN_PVH. I cannot describe XEN_PVHVM as
it's a def_bool and the user does not get to select it or
review it.

 Luis



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-12 Thread Tian, Kevin
> From: Tim Deegan [mailto:t...@xen.org]
> Sent: Thursday, February 12, 2015 8:42 PM
> 
> At 07:08 + on 12 Feb (1423721283), Tian, Kevin wrote:
> > for general log dirty, ept_invalidate_emt is required because there is
> > access permission change (dirtied page becomes rw after 1st fault,
> > so need to change them back to ro again for the new dirty tracking
> > round). But for PML, there's no permission change at all (always rw),
> > so such behavior should be noted by general logdirty layer for better
> > optimization.
> 
> AIUI the reason for calling ept_invalidate_emt() is to avoid having to
> update a large number of EPTEs at once.  If you still need to update a
> large number of EPTEs (to clear the Dirty bits), that has to be
> preemptable, or else use ept_invalidate_emt().
> 
> Or have I misunderstood?
> 

Preemptable is fine, and we can judge whether the dirty set is large or not. 
My feeling is that replacing a simple D-bit cleanup with an EPT misconfig exit
is not optimal. Jan explained that it is not strictly one misconfig exit for 
every D bit, since the whole L1 will be handled in a batch, but we need to have 
some understanding of the actual impact based on various workload patterns.

Thanks
Kevin



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-12 Thread Kai Huang


On 02/12/2015 08:42 PM, Tim Deegan wrote:

At 07:08 + on 12 Feb (1423721283), Tian, Kevin wrote:

for general log dirty, ept_invalidate_emt is required because there is
access permission change (dirtied page becomes rw after 1st fault,
so need to change them back to ro again for the new dirty tracking
round). But for PML, there's no permission change at all (always rw),
so such behavior should be noted by general logdirty layer for better
optimization.

AIUI the reason for calling ept_invalidate_emt() is to avoid having to
update a large number of EPTEs at once.  If you still need to update a
large number of EPTEs (to clear the Dirty bits), that has to be
preemptable, or else use ept_invalidate_emt().

Or have I misunderstood?
I think you are correct. We still need to use ept_invalidate_emt for 
clearing the D-bit, unless we invent a new paging layer interface, say 
paging_enable_log_dirty_gfn, which explicitly enables log-dirty for a 
single GFN, either by write protection or, in the case of PML, by clearing 
the D-bit.


Thanks,
-Kai


Tim.





Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-12 Thread Kai Huang


On 02/12/2015 10:10 PM, Andrew Cooper wrote:

On 12/02/15 06:54, Tian, Kevin wrote:

which presumably
means that the PML buffer flush needs to be aware of which gfns are
mapped by superpages to be able to correctly set a block of bits in the
logdirty bitmap.


Unfortunately PML itself can't tell us whether the logged GPA comes from a
superpage or not, but even with PML we still need to split superpages into
4K pages, just like the traditional write protection approach does. I think
this is because live migration should be based on 4K page granularity.
Marking all 512 bits of a 2M page dirty because of a single write doesn't
make sense in either the write protection or the PML case.


Agreed. Extending one write to the whole superpage enlarges the dirty set
unnecessarily. Since the spec doesn't say superpage logging is not supported,
I'd expect a 4K-aligned entry to be logged if the write is within a superpage.

The spec states that a gfn is appended to the log strictly on the
transition of the D bit from 0 to 1.

In the case of a 2M superpage, there is a single D bit for the entire 2M
range.


The plausible (working) scenarios I can see are:

1) superpages are not supported (not indicated by the whitepaper).
A better description would be -- PML doesn't check if it's superpage, it 
just operates with D-bit, no matter what page size.

2) a single entry will be written which must be taken to cover the
entire 2M range.
3) an individual entry is written for every access.
Below is the reply from our hardware guy regarding PML on superpages. It 
should answer the question accurately.


"As noted in Section 1.3, logging occurs whenever the CPU would set an 
EPT D bit.


It does not matter whether the D bit is in an EPT PTE (4KB page), EPT 
PDE (2MB page), or EPT PDPTE (1GB page).


In all cases, the GPA written to the PML log will be the address of the 
write that causes the D bit in question to be updated, with bits 11:0 
cleared.


This means that, in the case in which the D bit is in an EPT PDE or an 
EPT PDPTE, the log entry will communicate which 4KB region within the 
larger page was being written.


Once the D bit is set in one of these entries, a subsequent write to the 
larger page will not generate a log entry, even if that write is to a 
different 4KB region within the larger page.  This is because log 
entries are created only when a D bit is being set and a write will not 
cause a D bit to be set if the page's D bit is already set.


The log entries do not communicate the level of the EPT paging-structure 
entry in which the D bit was set (i.e., it does not communicate the page 
size). "
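The behaviour described in the quoted reply can be modelled in a few lines of Python. This is a toy model, not hardware or Xen code: EptEntry and guest_write are names invented for the illustration, and only the rules stated in the reply are encoded (log on a 0 -> 1 transition of the D bit, bits 11:0 of the GPA cleared, no page-size information in the log).

```python
PAGE_MASK = ~0xFFF  # clears bits 11:0 of a guest-physical address


class EptEntry:
    """One leaf EPT entry (4KB, 2MB or 1GB) with a single D bit."""

    def __init__(self, base_gpa, size):
        self.base_gpa = base_gpa
        self.size = size
        self.dirty = False


def guest_write(entry, gpa, pml_log):
    """Model a guest write; log only when the D bit goes from 0 to 1."""
    if not entry.base_gpa <= gpa < entry.base_gpa + entry.size:
        raise ValueError("GPA is not mapped by this entry")
    if not entry.dirty:
        entry.dirty = True
        pml_log.append(gpa & PAGE_MASK)  # the entry carries no page size


superpage = EptEntry(base_gpa=0x200000, size=2 * 1024 * 1024)
log = []
guest_write(superpage, 0x201234, log)  # D bit 0 -> 1: logs 0x201000
guest_write(superpage, 0x3FF000, log)  # D bit already set: nothing logged
print([hex(gpa) for gpa in log])
```

In this model the second write, to a different 4KB region of the same 2MB page, leaves no trace in the log, which is why the thread concludes that superpages still have to be split (or write protected) for dirty logging at 4K granularity.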


Thanks,
-Kai




Have I missed anything?

~Andrew






Re: [Xen-devel] stubdom vtpm build failure in staging

2015-02-12 Thread Xu, Quan


> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Friday, February 13, 2015 1:40 AM
> To: Xu, Quan; Olaf Hering
> Cc: Daniel De Graaf; xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> 
> On 12/02/15 17:24, Xu, Quan wrote:
> >
> >
> >> -Original Message-
> >> From: xen-devel-boun...@lists.xen.org
> >> [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Xu, Quan
> >> Sent: Friday, February 13, 2015 12:57 AM
> >> To: Olaf Hering
> >> Cc: xen-devel@lists.xen.org
> >> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> >>
> >> Sorry for that. Reading the other email thread, it looks like some
> >> maintainers are working on this issue.
> >> I am working on the 'Xen stubdom vTPM for HVM virtual machine' v4
> >> patches. There are a lot of modifications.
> >>
> >> I will be out of office from Feb. 16th to Feb. 26th for Chinese New
> >> Year. I plan to submit the v4 patches before Feb. 16th, and fix this
> >> issue after Feb. 26th.
> >>
> >> --Quan
> >>
> >>
> >>> -Original Message-
> >>> From: Olaf Hering [mailto:o...@aepfle.de]
> >>> Sent: Wednesday, February 11, 2015 11:21 PM
> >>> To: Xu, Quan
> >>> Cc: xen-devel@lists.xen.org
> >>> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> >>>
> >>> On Wed, Jan 28, Xu, Quan wrote:
> >>>
>  Thanks, I will check and fix it tomorrow. It is 23:12 PM Pacific time 
>  now.
> >>> Any progress?
> >>> These typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
> >>> compilers do not cope with current staging:
> >>>
> >>> # for i in `grep -w typedef stubdom/vtpmmgr/tcg.h | sed -n '/;/{s@^.* @@;s@;@@p}'`
> >>> # do
> >>> # if test -n "`git grep -wn $i|grep -w typedef|grep -v stubdom/vtpmmgr/tcg.h`"
> >>> # then
> >>> # echo $i
> >>> # fi
> >>> # done
> >>>
> >>> BYTE
> >>> BOOL
> >>> UINT16
> >>> UINT32
> >>> UINT64
> >>> TPM_HANDLE
> >>> TPM_ALGORITHM_ID
> >>>
> >>> TPMI_RH_HIERARCHY_AUTH and TPM_ALG_ID are defined twice in the same
> >>> file.
> >>>
> >>> This change works for me:
> >>>
> >>> ---
> >>>  stubdom/vtpmmgr/odd_types.h  | 11 +++
> >>>  stubdom/vtpmmgr/tcg.h|  9 +
> >>>  stubdom/vtpmmgr/tpm2_types.h | 11 +--
> >>>  3 files changed, 13 insertions(+), 18 deletions(-)
> >>>  create mode 100644 stubdom/vtpmmgr/odd_types.h
> >>>
> >>> diff --git a/stubdom/vtpmmgr/odd_types.h b/stubdom/vtpmmgr/odd_types.h
> >>> new file mode 100644
> >>> index 000..d72da9b
> >>> --- /dev/null
> >>> +++ b/stubdom/vtpmmgr/odd_types.h
> >>> @@ -0,0 +1,11 @@
> >>> +#ifndef VTPM_ODD_TYPES
> >>> +#define VTPM_ODD_TYPES 1
> >>> +typedef unsigned char BYTE;
> >>> +typedef unsigned char BOOL;
> >>> +typedef uint16_t UINT16;
> >>> +typedef uint32_t UINT32;
> >>> +typedef uint64_t UINT64;
> >>> +typedef UINT32 TPM_HANDLE;
> >>> +typedef UINT32 TPM_ALGORITHM_ID;
> >>> +#endif
> >>> +
> >>> diff --git a/stubdom/vtpmmgr/tcg.h b/stubdom/vtpmmgr/tcg.h
> >>> index 7321ec6..cac1bbc 100644
> >>> --- a/stubdom/vtpmmgr/tcg.h
> >>> +++ b/stubdom/vtpmmgr/tcg.h
> >>> @@ -401,16 +401,10 @@
> >>>
> >>>
> >>>  // *** TYPEDEFS *
> >>> -typedef unsigned char BYTE;
> >>> -typedef unsigned char BOOL;
> >>> -typedef uint16_t UINT16;
> >>> -typedef uint32_t UINT32;
> >>> -typedef uint64_t UINT64;
> >>> -
> >>> +#include "odd_types.h"
> > I think it is just for gcc backward compatibility. IMHO, that does seem
> > pretty strange.
> > cc Daniel who is the maintainer of vTPM / XSM.
> >
> > -Quan
> 
> Redefining an identifier in the same scope is violation of the C spec.
> 
> Newer GCC tolerates bad code which redundantly declares identifiers, but even
> newer GCC will still emit a diagnostic in -pedantic mode.
> 
> This build breakage needs fixing, and not just in the name of backwards
> compatibility.
> 
> ~Andrew

Thanks Andrew.
I will fix this build breakage after Feb. 26th. 
I will try to rework tpm2_types.h, deleting 'UINT16, UINT32 ..' and changing 
'TPM_HANDLE ...' to 'TPM2_HANDLE ..'. 
Or could you give some advice? Thanks.

-Quan



Re: [Xen-devel] [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in PciEnumeratorLight

2015-02-12 Thread Ni, Ruiyu
Jordan,
You are correct that subsequent installations of this protocol will fail.

But I don't think locating this protocol before installing it is a good 
implementation choice. This would make the code confusing.

Thanks,
Ray

-Original Message-
From: Justen, Jordan L 
Sent: Friday, February 13, 2015 5:03 AM
To: edk2-de...@lists.sourceforge.net; Ni, Ruiyu; wei.l...@citrix.com
Cc: xen-devel@lists.xen.org
Subject: Re: [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in 
PciEnumeratorLight

On 2015-02-11 17:23:26, Ni, Ruiyu wrote:
> Wei,
> No you cannot install gEfiPciEnumerationCompleteProtocolGuid in
> PciEnumeratorLight().
> For a real platform, PCI BUS is fully enumerated in PciEnumerator()
> and later if reconnect happens, it's light enumerated in
> PciEnumeratorLight(). The protocol should only be installed once in
> PciEnumerator(). Your fix will cause this protocol to be installed every
> time a reconnect happens.

I don't think it will, since the protocol is already installed on the
Host Bridge Handle, I think EFI_INVALID_PARAMETER will be returned on
subsequent calls to install the protocol.

But, getting that error back will lead to PciEnumeratorLight returning
an error, and this could cause other issues.

> The protocol's meaning is that the PCI BUS is fully enumerated. If
> the PCI BUS is fully enumerated before starting PciBus driver, light
> PCI enumeration is used.
> For your OVMF/QEMU case, an alternative fix is to install this
> protocol in a platform driver when it detects that the PCI BUS is
> fully enumerated.

I think the PciBusDxe driver is still the right place to install the
protocol, but I agree we need to be careful to prevent trying to
install the protocol multiple times.

I guess we could try to locate the protocol on the Host Bridge Handle
before installing it to prevent multiple installation attempts.
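As a rough illustration of that idea, the toy Python model below treats the handle database as a dictionary. None of this is EDK2 code: HandleDatabase, has_protocol and mark_enumeration_complete are invented names, and the only behaviour borrowed from UEFI is that installing a protocol that is already present on a handle fails with EFI_INVALID_PARAMETER.

```python
EFI_SUCCESS = "EFI_SUCCESS"
EFI_INVALID_PARAMETER = "EFI_INVALID_PARAMETER"


class HandleDatabase:
    """Toy stand-in for the firmware's handle/protocol database."""

    def __init__(self):
        self._protocols = {}  # handle -> set of installed protocol GUIDs

    def has_protocol(self, handle, guid):
        return guid in self._protocols.get(handle, set())

    def install_protocol(self, handle, guid):
        installed = self._protocols.setdefault(handle, set())
        if guid in installed:
            return EFI_INVALID_PARAMETER  # already installed on this handle
        installed.add(guid)
        return EFI_SUCCESS


def mark_enumeration_complete(db, host_bridge, guid):
    """Locate first, install only if missing, so repeated calls succeed."""
    if db.has_protocol(host_bridge, guid):
        return EFI_SUCCESS
    return db.install_protocol(host_bridge, guid)


db = HandleDatabase()
GUID = "gEfiPciEnumerationCompleteProtocolGuid"
print(mark_enumeration_complete(db, "HostBridge0", GUID))  # first connect
print(mark_enumeration_complete(db, "HostBridge0", GUID))  # reconnect
print(db.install_protocol("HostBridge0", GUID))            # unguarded install
```

The guarded path returns success on every reconnect, while the unguarded second install reports an error, which is the failure mode described above for PciEnumeratorLight.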

-Jordan

> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com] 
> Sent: Thursday, February 12, 2015 4:24 AM
> To: edk2-de...@lists.sourceforge.net
> Cc: xen-devel@lists.xen.org; Laszlo
> Subject: [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in 
> PciEnumeratorLight
> 
> I had an issue when trying to boot Xen HVM guest with latest OVMF
> master. Guest crashed with memory violation, and the bisection pointed
> to 66b280df2 ("OvmfPkg: AcpiPlatformDxe: make dependency on PCI
> enumeration explicit"). That commit made AcpiPlatformDxe depend on PCI
> enumeration using gEfiPciEnumerationCompleteProtocolGuid, which is a
> very reasonable change.
> 
> The real culprit is that Xen HVM is using PciEnumeratorLight which
> doesn't install gEfiPciEnumerationCompleteProtocolGuid. This, in
> combination with 66b280df2, makes AcpiPlatformDxe not able to be loaded,
> resulting in guest crash.
> 
> The fix is to install gEfiPciEnumerationCompleteProtocolGuid in
> PciEnumeratorLight.
> 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Wei Liu 
> Cc: Feng Tian 
> Cc: Anthony Perard 
> Cc: Laszlo Ersek 
> Cc: Jordan Justen 
> ---
>  MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c 
> b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> index 9e7ac74..7659585 100644
> --- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> +++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> @@ -2256,6 +2256,7 @@ PciEnumeratorLight (
>  {
>  
>EFI_STATUSStatus;
> +  EFI_HANDLEHostBridgeHandle;
>EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL   *PciRootBridgeIo;
>PCI_IO_DEVICE *RootBridgeDev;
>UINT16MinBus;
> @@ -2288,6 +2289,11 @@ PciEnumeratorLight (
>  return Status;
>}
>  
> +  //
> +  // Get the host bridge handle
> +  //
> +  HostBridgeHandle = PciRootBridgeIo->ParentHandle;
> +
>Status = PciRootBridgeIo->Configuration (PciRootBridgeIo, (VOID **) 
> &Descriptors);
>  
>if (EFI_ERROR (Status)) {
> @@ -2348,7 +2354,14 @@ PciEnumeratorLight (
>  Descriptors++;
>}
>  
> -  return EFI_SUCCESS;
> +  Status = gBS->InstallProtocolInterface (
> +  &HostBridgeHandle,
> +  &gEfiPciEnumerationCompleteProtocolGuid,
> +  EFI_NATIVE_INTERFACE,
> +  NULL
> +  );
> +
> +  return Status;
>  }
>  
>  /**
> -- 
> 1.9.1
> 
> 

Re: [Xen-devel] [Qemu-devel] [v2][PATCH] libxl: add one machine property to support IGD GFX passthrough

2015-02-12 Thread Chen, Tiejun

Ian,

Just ping this, or do you think I should send this as a patch?

Thanks
Tiejun

On 2015/2/11 10:45, Chen, Tiejun wrote:

On 2015/2/9 19:05, Ian Campbell wrote:

On Mon, 2015-02-09 at 14:28 +0800, Chen, Tiejun wrote:


What about this?


I've not read the code in detail, since I'm travelling, but from a quick
glance it looks to be implementing the sort of thing I meant, thanks.


Thanks for your time.



A couple of higher level comments:

I'd suggest to put the code for reading the vid/did into a helper
function so it can be reused.


Looks good.



You might like to optionally consider add a forcing option somehow so
that people with new devices not in the list can control things without
the need to recompile (e.g. gfx_passthru_kind_override?). Perhaps that
isn't needed for a first cut though and it would be a libxl API so
thought required.


What about 'gfx_passthru_force'? What we want is to make sure that, if we
have an IGD that needs the workaround, we pass a parameter to qemu. So for
non-listed devices we just need to provide a bool to force this regardless
of the real device.



I think it should probably log something at a lowish level when it has
autodetected IGD.



So I tried to rebase that according to all your comments,

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 98687bd..398d9da 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -361,6 +361,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
  libxl_defbool_setdefault(&b_info->u.hvm.nographic, false);

  libxl_defbool_setdefault(&b_info->u.hvm.gfx_passthru, false);
+libxl_defbool_setdefault(&b_info->u.hvm.gfx_passthru_force,
false);

  break;
  case LIBXL_DOMAIN_TYPE_PV:
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..507034f 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -710,9 +710,6 @@ static char **
libxl__build_device_model_args_new(libxl__gc *gc,
  flexarray_append(dm_args, "-net");
  flexarray_append(dm_args, "none");
  }
-if (libxl_defbool_val(b_info->u.hvm.gfx_passthru)) {
-flexarray_append(dm_args, "-gfx_passthru");
-}
  } else {
  if (!sdl && !vnc) {
  flexarray_append(dm_args, "-nographic");
@@ -757,6 +754,11 @@ static char **
libxl__build_device_model_args_new(libxl__gc *gc,
  machinearg,
max_ram_below_4g);
  }
  }
+
+if (libxl__is_igd_vga_passthru(gc, guest_config)) {
+machinearg = GCSPRINTF("%s,igd-passthru=on", machinearg);
+}
+
  flexarray_append(dm_args, machinearg);
  for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL;
i++)
  flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..35ec5fc 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1177,6 +1177,9 @@ _hidden int libxl__create_pci_backend(libxl__gc
*gc, uint32_t domid,
libxl_device_pci *pcidev, int num);
  _hidden int libxl__device_pci_destroy_all(libxl__gc *gc, uint32_t domid);

+_hidden int libxl__is_igd_vga_passthru(libxl__gc *gc,
+   const libxl_domain_config
*d_config);
+
  /*- xswait: wait for a xenstore node to be suitable -*/

  typedef struct libxl__xswait_state libxl__xswait_state;
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index f3ae132..07b9e22 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -491,6 +491,130 @@ static int sysfs_dev_unbind(libxl__gc *gc,
libxl_device_pci *pcidev,
  return 0;
  }

+static unsigned long sysfs_dev_get_vendor(libxl__gc *gc,
+  libxl_device_pci *pcidev)
+{
+char *pci_device_vendor_path =
+libxl__sprintf(gc, SYSFS_PCI_DEV"/"PCI_BDF"/vendor",
+   pcidev->domain, pcidev->bus, pcidev->dev,
+   pcidev->func);
+int read_items;
+unsigned long pci_device_vendor;
+
+FILE *f = fopen(pci_device_vendor_path, "r");
+if (!f) {
+LOGE(ERROR,
+ "pci device "PCI_BDF" does not have vendor attribute",
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0x;
+}
+read_items = fscanf(f, "0x%lx\n", &pci_device_vendor);
+fclose(f);
+if (read_items != 1) {
+LOGE(ERROR,
+ "cannot read vendor of pci device "PCI_BDF,
+ pcidev->domain, pcidev->bus, pcidev->dev, pcidev->func);
+return 0x;
+}
+
+return pci_device_vendor;
+}
+
+static unsigned long sysfs_dev_get_device(libxl__gc *gc,
+  libxl_device_pci *pcidev)
+{
+char *pci_device_device_path =
+   

Re: [Xen-devel] [PATCH v4 00/29] Xen/ARM guest support

2015-02-12 Thread Ard Biesheuvel
On 13 February 2015 at 05:18, Jordan Justen  wrote:
> Do you have this in a public branch based on this tree?
> https://github.com/tianocore/edk2
>

The patches are available here
https://git.linaro.org/people/ard.biesheuvel/uefi-next.git/shortlog/refs/heads/linaro-topic-xen

I also have a version rebased onto the latest upstream, and verified
that it builds ok
https://git.linaro.org/people/ard.biesheuvel/uefi-next.git/shortlog/refs/heads/linaro-topic-xen-v4-rebase

Regards,
Ard.


> On 2015-02-12 03:18:52, Ard Biesheuvel wrote:
>> This series implements support for executing Tianocore inside a Xen
>> guest domain on 64-bit ARM systems (AArch64)
>>
>> The first part addresses ARM platform specifics, primarily to allow a
>> Tianocore binary image to be runtime relocatable, and execute from DRAM.
>>
>> The second part refactors the XenBus support, and adds some missing device
>> drivers that are needed to execute on ARM: a Xen PV console and a real time
>> clock driver.
>>
>> Finally, patch #29 wraps it all together and implements the .dsc and .fdf
>> platform descriptions that can be used to build the binary image.
>>
>> NOTES:
> >> - the Xen RTC driver is a dummy implementation, as it is a Runtime driver
> >>   which is callable through Runtime Services from the OS, and this is
> >>   currently not supportable under Xen, due to the need to share the shared
> >>   info page between the OS and the firmware
> >> - UEFI maps the entire physical memory space as cached, and relies on Xen to
> >>   use the correct stage2 mappings for regions that are backed by devices,
> >>   such as the GIC or device passthrough. The reason is that the I/O console
> >>   ring and grant table are backed by RAM that Xen maps as cached, which means
> >>   that UEFI *must* map those as cached as well. Instead of discovering those
> >>   regions early on (i.e., before enabling the MMU) it is much easier to rely
> >>   on the architecturally mandated behavior that stage2 device mappings
> >>   supersede stage1 cached mappings for the same region.
>> - this code is not yet tested on x86 (still only build tested for v4)
>>
>> Changes since v3:
>> - rebased onto Olivier's pending GICv3 patches
>> - moved InterlockedCompareExchange16 () to BaseSynchronizationLib
>> - reimplemented XenBusDxe's TestAndClearBit () using
>>   InterlockedCompareExchange16 () so that XenBusDxe itself is now completely
>>   architecture agnostic
>> - various minor style and comment changes based on review feedback from
>>   Laszlo and Olivier
>> - added acks and R-b's
>>
>> Changes since v2:
>> - rebased onto latest upstream containing Laszlo's ARM generic timer changes,
>>   with Olivier's pending GICv3 patches applied on top;
>> - moved the relocatable PrePi to a completely separate module, and dropped
>>   patches changing the original ARM PrePi code: all required changes have 
>> been
>>   incorporated directly into the split off version
>> - dropped the ARM BDS entirely, only Intel BDS supported as of now
> >> - added a constructor to XenConsoleSerialPortLib, otherwise there is no
> >>   output from the release build;
>> - implemented all review comments regarding style and correctness, including
>>   cleaning up the DSC in the final patch
>> - added acks and R-b's
>>
>> Changes since v1:
> >> - move to PatchableInModule PCDs for the runtime self-relocating PrePi: this
> >>   is semantically more correct, and will make the build system help us spot
> >>   if there are remaining instances of FixedPcdGetXX() which need attention
> >> - split some prepi and xen patches to make it easier on the reviewers
> >> - split off the PCI support from XenBusDxe instead of the frankenstein DXE
> >>   from v1
> >> - implemented review comments regarding moving of files, splitting of
> >>   libraries and some EDK2 optimizations suggested by Laszlo (casting, use of
> >>   specific types etc)
>> - added some acks and R-b's
>>
>>
>>
>> Ard Biesheuvel (29):
>>   ArmPkg: allow HYP timer interrupt to be omitted
>>   ArmPkg: allow patchable PCDs for memory, FD and FV addresses
>>   ArmPlatformPkg: allow patchable PCD for FD base address
>>   ArmVirtualizationPkg: add GICv3 detection to VirtFdtDxe
>>   ArmVirtualizationPkg: allow patchable PCD for device tree base address
>>   ArmVirtualizationPkg: move early UART discovery to PlatformPeim
>>   ArmVirtualizationPkg: use a HOB to store device tree blob
>>   ArmVirtualizationPkg: add padding to FDT allocation
>>   ArmVirtualizationPkg: add a relocatable version of PrePi
>>   ArmVirtualizationPkg: implement custom MemoryInitPeiLib
>>   ArmVirtualizationPkg: allow patchable PCD for FV and DT base addresses
>>   ArmVirtualizationPkg: Xen/PV relocatable platformlib instance
>>   MdePkg/BaseSynchronizationLib: Added proper support for ARM architecture
>>   MdePkg/BaseSynchronizationLib: implement 16-bit compare-exchange
>>   Ovmf/Xen: move Xen interface version to 
>>   Ovmf/Xen: fix pointer to int cast in XenBusDxe
>>   Ovmf/Xen: refact

[Xen-devel] [qemu-upstream-unstable test] 34477: regressions - FAIL

2015-02-12 Thread xen . org
flight 34477 qemu-upstream-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34477/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 11 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-freebsd10-amd64 11 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-win7-amd64 10 guest-localmigrate   fail REGR. vs. 33488
 test-amd64-amd64-xl-win7-amd64 10 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-xl-winxpsp3-vcpus1 10 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-xl-winxpsp3  10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-rhel6hvm-amd 6 leak-check/basis(6) running in 34247 [st=running!]
 test-amd64-amd64-xl-winxpsp3 10 guest-localmigrate fail in 34247 REGR. vs. 33488

Tests which are failing intermittently (not blocking):
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail pass in 34247
 test-amd64-amd64-xl-winxpsp3  7 windows-install fail pass in 34319

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt   9 guest-start  fail   like 33488
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-win7-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-winxpsp3 10 guest-localmigrate   fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-win7-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-winxpsp3 10 guest-localmigrate  fail REGR. vs. 33488
 test-armhf-armhf-xl-multivcpu 14 leak-check/check fail in 34247 blocked in 33488
 test-armhf-armhf-libvirt 13 guest-destroy   fail in 34247 blocked in 33488
 test-armhf-armhf-xl-credit2   5 xen-boot fail in 34247 blocked in 33488

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass

version targeted for testing:
 qemuu be11dc1e9172f91e798a8f831b30c14b479e08e8
baseline version:
 qemuu 0d37748342e29854db7c9f6c47d7f58c6cfba6b2


People who touched revisions under test:
  Don Slutz 
  Paul Durrant 
  Stefano Stabellini 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd

Re: [Xen-devel] [PATCH v4 00/29] Xen/ARM guest support

2015-02-12 Thread Jordan Justen
Do you have this in a public branch based on this tree?
https://github.com/tianocore/edk2

On 2015-02-12 03:18:52, Ard Biesheuvel wrote:
> This series implements support for executing Tianocore inside a Xen
> guest domain on 64-bit ARM systems (AArch64)
> 
> The first part addresses ARM platform specifics, primarily to allow a
> Tianocore binary image to be runtime relocatable, and execute from DRAM.
> 
> The second part refactors the XenBus support, and adds some missing device
> drivers that are needed to execute on ARM: a Xen PV console and a real time
> clock driver.
> 
> Finally, patch #29 wraps it all together and implements the .dsc and .fdf
> platform descriptions that can be used to build the binary image.
> 
> NOTES:
> - the Xen RTC driver is a dummy implementation, as it is a Runtime driver which
>   is callable through Runtime Services from the OS, and this is currently not
>   supportable under Xen, due to the need to share the shared info page between
>   the OS and the firmware
> - UEFI maps the entire physical memory space as cached, and relies on Xen to
>   use the correct stage2 mappings for regions that are backed by devices, such
>   as the GIC or device passthrough. The reason is that the I/O console ring and
>   grant table are backed by RAM that Xen maps as cached, which means that UEFI
>   *must* map those as cached as well. Instead of discovering those regions
>   early on (i.e., before enabling the MMU) it is much easier to rely on the
>   architecturally mandated behavior that stage2 device mappings supersede stage1
>   cached mappings for the same region.
> - this code is not yet tested on x86 (still only build tested for v4)
> 
> Changes since v3:
> - rebased onto Olivier's pending GICv3 patches
> - moved InterlockedCompareExchange16 () to BaseSynchronizationLib
> - reimplemented XenBusDxe's TestAndClearBit () using
>   InterlockedCompareExchange16 () so that XenBusDxe itself is now completely
>   architecture agnostic
> - various minor style and comment changes based on review feedback from
>   Laszlo and Olivier
> - added acks and R-b's
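
For readers unfamiliar with the pattern named in the v3 changes, here is a
minimal sketch of a test-and-clear-bit built on a 16-bit compare-exchange.
It is not the XenBusDxe code: C11 atomics stand in for
InterlockedCompareExchange16 (), and the helper name test_and_clear_bit16
is made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically clear bit `bit` (0..15) of *word and
 * report whether it was set beforehand. C11 atomics are used here in
 * place of EDK2's InterlockedCompareExchange16 (). */
static bool test_and_clear_bit16(unsigned bit, _Atomic uint16_t *word)
{
    const uint16_t mask = (uint16_t)(1u << bit);
    uint16_t old = atomic_load(word);

    do {
        if ((old & mask) == 0) {
            return false;   /* bit already clear, nothing to do */
        }
        /* On failure the compare-exchange reloads `old` with the current
         * value, so the loop re-checks the bit before retrying. */
    } while (!atomic_compare_exchange_weak(word, &old,
                                           (uint16_t)(old & ~mask)));
    return true;
}
```

Because the only primitive needed is a compare-exchange, nothing in the loop
depends on the CPU architecture, which seems to be the point of the change.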
> 
> Changes since v2:
> - rebased onto latest upstream containing Laszlo's ARM generic timer changes,
>   with Olivier's pending GICv3 patches applied on top;
> - moved the relocatable PrePi to a completely separate module, and dropped
>   patches changing the original ARM PrePi code: all required changes have been
>   incorporated directly into the split off version
> - dropped the ARM BDS entirely, only Intel BDS supported as of now
> - added a constructor to XenConsoleSerialPortLib, otherwise there is no output
>   from the release build;
> - implemented all review comments regarding style and correctness, including
>   cleaning up the DSC in the final patch
> - added acks and R-b's
> 
> Changes since v1:
> - move to PatchableInModule PCDs for the runtime self-relocating PrePi: this is
>   semantically more correct, and will make the build system help us spot if
>   there are remaining instances of FixedPcdGetXX() which need attention
> - split some prepi and xen patches to make it easier on the reviewers
> - split off the PCI support from XenBusDxe instead of the frankenstein DXE from
>   v1
> - implemented review comments regarding moving of files, splitting of libraries
>   and some EDK2 optimizations suggested by Laszlo (casting, use of specific
>   types etc)
> - added some acks and R-b's
> 
> 
> 
> Ard Biesheuvel (29):
>   ArmPkg: allow HYP timer interrupt to be omitted
>   ArmPkg: allow patchable PCDs for memory, FD and FV addresses
>   ArmPlatformPkg: allow patchable PCD for FD base address
>   ArmVirtualizationPkg: add GICv3 detection to VirtFdtDxe
>   ArmVirtualizationPkg: allow patchable PCD for device tree base address
>   ArmVirtualizationPkg: move early UART discovery to PlatformPeim
>   ArmVirtualizationPkg: use a HOB to store device tree blob
>   ArmVirtualizationPkg: add padding to FDT allocation
>   ArmVirtualizationPkg: add a relocatable version of PrePi
>   ArmVirtualizationPkg: implement custom MemoryInitPeiLib
>   ArmVirtualizationPkg: allow patchable PCD for FV and DT base addresses
>   ArmVirtualizationPkg: Xen/PV relocatable platformlib instance
>   MdePkg/BaseSynchronizationLib: Added proper support for ARM architecture
>   MdePkg/BaseSynchronizationLib: implement 16-bit compare-exchange
>   Ovmf/Xen: move Xen interface version to 
>   Ovmf/Xen: fix pointer to int cast in XenBusDxe
>   Ovmf/Xen: refactor XenBusDxe hypercall implementation
>   Ovmf/Xen: move XenBusDxe hypercall code to separate library
>   Ovmf/Xen: introduce XENIO_PROTOCOL
>   Ovmf/Xen: add separate driver for Xen PCI device
>   Ovmf/Xen: move XenBusDxe to abstract XENIO_PROTOCOL
>   Ovmf/Xen: implement XenHypercallLib for ARM
>   Ovmf/Xen: port XenBusDxe to other architectures
>   Ovmf/Xen: add Xen PV console SerialPortLib driver
>   ArmVirtualizationPkg: implement dummy RealTimeClockLib for Xen
>   Ovfm/Xen: add a Vendor 

Re: [Xen-devel] [PATCH] xen/Coverity: Audit of MISSING_BREAK defects

2015-02-12 Thread Don Koch
On Thu, 12 Feb 2015 20:08:46 +
Andrew Cooper  wrote:

> Coverity uses several heuristics to identify when one case statement
> legitimately falls through into the next, and a comment as the final item in a
> case statement is one heuristic (the assumption being that it is a
> justification for the fallthrough).
> 
> Use this to perform an audit of defects and hide the legitimate fallthroughs.
> 
> No functional change.  All identified fallthroughs are legitimate.
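
As an illustration of the annotation style described above, here is a small
self-contained sketch. The function scale_to_kb is hypothetical (it is not
taken from the Xen tree); it only shows a cascade where each case
deliberately falls into the next and says so in a trailing comment.

```c
#include <stdint.h>

/* Hypothetical helper: scale `value` to KiB according to a unit suffix.
 * Returns -1 for an unknown unit. Each larger unit falls through into the
 * next smaller one, adding one more factor of 1024 on the way. */
static int64_t scale_to_kb(int64_t value, char unit)
{
    switch (unit) {
    case 't':
        value <<= 10;
        /* fallthrough */
    case 'g':
        value <<= 10;
        /* fallthrough */
    case 'm':
        value <<= 10;
        /* fallthrough */
    case 'k':
        break;
    default:
        return -1;
    }
    return value;
}
```

The comment as the last item of each case is what marks the fallthrough as
intended, for human readers as well as for the analyser.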
> 
> Signed-off-by: Andrew Cooper 
> Coverity-IDs: 1055483, 1055484, 1055486 - 1055488, 1055490 - 1055496,
>   1055498 - 1055500, 1055501, 1220091
> CC: Keir Fraser 
> CC: Jan Beulich 
> CC: Tim Deegan 
> CC: Xen Coverity Team 
> ---
>  xen/arch/x86/hvm/emulate.c  |1 +
>  xen/arch/x86/hvm/svm/svm.c  |1 +
>  xen/arch/x86/hvm/vlapic.c   |1 +
>  xen/arch/x86/mm.c   |2 ++
>  xen/arch/x86/traps.c|3 +++
>  xen/arch/x86/x86_64/compat/mm.c |1 +
>  xen/common/lib.c|4 
>  xen/common/schedule.c   |1 +
>  8 files changed, 14 insertions(+)
> 
> diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
> index 636c909..c657bc6 100644
> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -161,6 +161,7 @@ static int hvmemul_do_io(
>  put_page(ram_page);
>  return X86EMUL_RETRY;
>  }
> +/* fallthrough */
>  default:
>  if ( ram_page )
>  put_page(ram_page);
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index a7655bd..018dd70 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -2378,6 +2378,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
>  case NESTEDHVM_VMEXIT_ERROR:
>  break;
>  }
> +/* fallthrough */
>  case NESTEDHVM_VMEXIT_ERROR:
>  gdprintk(XENLOG_ERR,
>  "nestedsvm_check_intercepts() returned NESTEDHVM_VMEXIT_ERROR\n");
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 5da6d8f..cee8699 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -762,6 +762,7 @@ static int vlapic_reg_write(struct vcpu *v,
>  vlapic->hw.tdt_msr = 0;
>  }
>  vlapic->pt.irq = val & APIC_VECTOR_MASK;
> +/* fallthrough */
>  case APIC_LVTTHMR:  /* LVT Thermal Monitor */
>  case APIC_LVTPC:/* LVT Performance Counter */
>  case APIC_LVT0: /* LVT LINT0 Reg */
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index d4965da..12e5006 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2771,6 +2771,7 @@ int new_guest_cr3(unsigned long mfn)
>  {
>  case -EINTR:
>  rc = -ERESTART;
> +/* fallthrough */
>  case -ERESTART:
>  curr->arch.old_guest_table = page;
>  break;
> @@ -3126,6 +3127,7 @@ long do_mmuext_op(
>  {
>  case -EINTR:
>  rc = -ERESTART;
> +/* fallthrough */
>  case -ERESTART:
>  curr->arch.old_guest_table = page;
>  okay = 0;
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index f5516dc..057a7af 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1739,7 +1739,9 @@ static int guest_io_okay(
>port>>3, 2) )
>  {
>  default: x.bytes[0] = ~0;
> +/* fallthrough */
>  case 1:  x.bytes[1] = ~0;
> +/* fallthrough */
>  case 0:  break;
>  }
>  TOGGLE_MODE();
> @@ -3320,6 +3322,7 @@ static void pci_serr_error(const struct cpu_user_regs *regs)
>  {
>  case 'd': /* 'dom0' */
>  nmi_hwdom_report(_XEN_NMIREASON_pci_serr);
> +/* fallthrough */
>  case 'i': /* 'ignore' */
>  /* Would like to print a diagnostic here but can't call printk()
> from NMI context -- raise a softirq instead. */
> diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
> index f90f611..1491ce3 100644
> --- a/xen/arch/x86/x86_64/compat/mm.c
> +++ b/xen/arch/x86/x86_64/compat/mm.c
> @@ -292,6 +292,7 @@ int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(mmuext_op_compat_t) cmp_uops,
>  break;
>  case MMUEXT_NEW_USER_BASEPTR:
>  rc = -EINVAL;
> +/* fallthrough */
>  case MMUEXT_TLB_FLUSH_LOCAL:
>  case MMUEXT_TLB_FLUSH_MULTI:
>  case MMUEXT_TLB_FLUSH_ALL:
> diff --git a/xen/common/lib.c b/xen/common/lib.c
> index 89c74ad..ae0bbb3 100644
> --- a/xen/common/lib.c
> +++ b/xen/common/lib.c
> @@ -461,12 +461,16 @@ unsigned long long parse_size_and_unit(const char *s, 
> const

Re: [Xen-devel] [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in PciEnumeratorLight

2015-02-12 Thread Jordan Justen
On 2015-02-11 17:23:26, Ni, Ruiyu wrote:
> Wei,
> No you cannot install gEfiPciEnumerationCompleteProtocolGuid in
> PciEnumeratorLight().
> For a real platform, the PCI bus is fully enumerated in PciEnumerator()
> and later, if a reconnect happens, it is light-enumerated in
> PciEnumeratorLight(). The protocol should only be installed once in
> PciEnumerator(). Your fix will cause this protocol to be installed every
> time a reconnect happens.

I don't think it will. Since the protocol is already installed on the
Host Bridge Handle, I think EFI_INVALID_PARAMETER will be returned on
subsequent calls to install the protocol.

But, getting that error back will lead to PciEnumeratorLight returning
an error, and this could cause other issues.

> The protocol's meaning is that the PCI BUS is fully enumerated. If
> the PCI BUS is fully enumerated before starting PciBus driver, light
> PCI enumeration is used.
> For your OVMF/QEMU case, an alternative fix is to install this
> protocol in a platform driver when it detects that the PCI BUS is
> fully enumerated.

I think the PciBusDxe driver is still the right place to install the
protocol, but I agree we need to be careful to prevent trying to
install the protocol multiple times.

I guess we could try to locate the protocol on the Host Bridge Handle
before installing it to prevent multiple installation attempts.
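
To make the "locate before install" idea concrete, here is a toy model in
plain C. None of the toy_* names are EDK2 APIs; they only mimic the behaviour
I expect (a second install of the same protocol on a handle is rejected), so
treat this as a sketch of the control flow rather than a patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a UEFI handle database; not EDK2 APIs. */
#define TOY_MAX_PROTOCOLS 8

struct toy_handle {
    int protocols[TOY_MAX_PROTOCOLS];   /* protocol IDs installed on the handle */
    size_t count;
};

enum toy_status {
    TOY_SUCCESS = 0,
    TOY_INVALID_PARAMETER,
    TOY_OUT_OF_RESOURCES
};

static bool toy_has_protocol(const struct toy_handle *h, int id)
{
    for (size_t i = 0; i < h->count; i++) {
        if (h->protocols[i] == id) {
            return true;
        }
    }
    return false;
}

/* Assumed behaviour: installing the same protocol twice is an error. */
static enum toy_status toy_install(struct toy_handle *h, int id)
{
    if (toy_has_protocol(h, id)) {
        return TOY_INVALID_PARAMETER;
    }
    if (h->count == TOY_MAX_PROTOCOLS) {
        return TOY_OUT_OF_RESOURCES;
    }
    h->protocols[h->count++] = id;
    return TOY_SUCCESS;
}

/* Guarded install: look first, so repeated enumeration passes still succeed. */
static enum toy_status toy_install_once(struct toy_handle *h, int id)
{
    if (toy_has_protocol(h, id)) {
        return TOY_SUCCESS;
    }
    return toy_install(h, id);
}
```

With a guard like this, the light enumerator could be called again on
reconnect without its return status turning into an error.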

-Jordan

> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com] 
> Sent: Thursday, February 12, 2015 4:24 AM
> To: edk2-de...@lists.sourceforge.net
> Cc: xen-devel@lists.xen.org; Laszlo
> Subject: [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in PciEnumeratorLight
> 
> I had an issue when trying to boot Xen HVM guest with latest OVMF
> master. Guest crashed with memory violation, and the bisection pointed
> to 66b280df2 ("OvmfPkg: AcpiPlatformDxe: make dependency on PCI
> enumeration explicit"). That commit made AcpiPlatformDxe depend on PCI
> enumeration using gEfiPciEnumerationCompleteProtocolGuid, which is a
> very reasonable change.
> 
> The real culprit is that Xen HVM is using PciEnumeratorLight which
> doesn't install gEfiPciEnumerationCompleteProtocolGuid. This, in
> combination with 66b280df2, makes AcpiPlatformDxe not able to be loaded,
> resulting in guest crash.
> 
> The fix is to install gEfiPciEnumerationCompleteProtocolGuid in
> PciEnumeratorLight.
> 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Wei Liu 
> Cc: Feng Tian 
> Cc: Anthony Perard 
> Cc: Laszlo Ersek 
> Cc: Jordan Justen 
> ---
>  MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> index 9e7ac74..7659585 100644
> --- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> +++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
> @@ -2256,6 +2256,7 @@ PciEnumeratorLight (
>  {
>  
>EFI_STATUSStatus;
> +  EFI_HANDLEHostBridgeHandle;
>EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL   *PciRootBridgeIo;
>PCI_IO_DEVICE *RootBridgeDev;
>UINT16MinBus;
> @@ -2288,6 +2289,11 @@ PciEnumeratorLight (
>  return Status;
>}
>  
> +  //
> +  // Get the host bridge handle
> +  //
> +  HostBridgeHandle = PciRootBridgeIo->ParentHandle;
> +
>Status = PciRootBridgeIo->Configuration (PciRootBridgeIo, (VOID **) &Descriptors);
>  
>if (EFI_ERROR (Status)) {
> @@ -2348,7 +2354,14 @@ PciEnumeratorLight (
>  Descriptors++;
>}
>  
> -  return EFI_SUCCESS;
> +  Status = gBS->InstallProtocolInterface (
> +  &HostBridgeHandle,
> +  &gEfiPciEnumerationCompleteProtocolGuid,
> +  EFI_NATIVE_INTERFACE,
> +  NULL
> +  );
> +
> +  return Status;
>  }
>  
>  /**
> -- 
> 1.9.1
> 
> 

Re: [Xen-devel] [RFC v1 7/8] xen: unwrap XEN_BACKEND from XEN_DOM0

2015-02-12 Thread Luis R. Rodriguez
On Thu, Feb 12, 2015 at 11:05:14AM +, David Vrabel wrote:
> On 12/02/15 06:03, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > This unwraps XEN_BACKEND from depending on XEN_DOM0, it
> > instead makes it depend on the possible x86 backends and
> > under what scenarios it's allowed under ARM. This is as per
> > the agreed upon Xen Kconfig changes [0].
> [...]
> > --- a/drivers/xen/Kconfig
> > +++ b/drivers/xen/Kconfig
> > @@ -77,7 +77,8 @@ config XEN_DEV_EVTCHN
> >  
> >  config XEN_BACKEND
> > bool "Backend driver support"
> > -   depends on XEN_DOM0
> > +   depends on ARM || ARM64 || (X86 && (XEN_PV || XEN_PVH || XEN_PVHVM))
> 
> You don't need X86 here. Just XEN_PV etc. is fine.

Fixed.

 Luis



Re: [Xen-devel] [edk2] [PATCH] OvmfPkg: AcpiPlatformDxe: PCI enumeration may be disabled

2015-02-12 Thread Jordan Justen
I think gEfiPciEnumerationCompleteProtocolGuid should be installed by
MdeModulePkg/Bus/Pci/PciBusDxe, even when PcdPciDisableBusEnumeration
is set.

Ray's main feedback seemed to be that we need to make sure PciBusDxe
only installs the protocol once. (I'll also reply to the other related
patch thread.)

-Jordan

On 2015-02-12 04:16:07, Laszlo Ersek wrote:
> SVN r16411 delayed ACPI table installation until PCI enumeration was
> complete, because on QEMU the ACPI-related fw_cfg files should only be
> downloaded after PCI enumeration.
> 
> However, InitializeXen() in "OvmfPkg/PlatformPei/Xen.c" sets
> PcdPciDisableBusEnumeration to TRUE. This causes
> PciBusDriverBindingStart() in "MdeModulePkg/Bus/Pci/PciBusDxe/PciBus.c" to
> set gFullEnumeration to FALSE, which in turn makes PciEnumerator() in
> "MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumerator.c" branch to
> PciEnumeratorLight(). The installation of
> EFI_PCI_ENUMERATION_COMPLETE_PROTOCOL at the end of PciEnumerator() is not
> reached.
> 
> Which means that starting with SVN r16411, AcpiPlatformDxe is never
> dispatched on Xen.
> 
> This patch replaces the EFI_PCI_ENUMERATION_COMPLETE_PROTOCOL depex with a
> matching protocol registration callback for the PCI enumeration enabled
> (ie. QEMU) case. When PCI enumeration is disabled (ie. when running on
> Xen), AcpiPlatformDxe doesn't wait for
> EFI_PCI_ENUMERATION_COMPLETE_PROTOCOL.
> 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Laszlo Ersek 
> ---
>  OvmfPkg/AcpiPlatformDxe/AcpiPlatformDxe.inf |  4 +-
>  OvmfPkg/AcpiPlatformDxe/AcpiPlatform.c  | 84 +++--
>  2 files changed, 72 insertions(+), 16 deletions(-)
> 
> diff --git a/OvmfPkg/AcpiPlatformDxe/AcpiPlatformDxe.inf b/OvmfPkg/AcpiPlatformDxe/AcpiPlatformDxe.inf
> index 53292bf..6b2c9d2 100644
> --- a/OvmfPkg/AcpiPlatformDxe/AcpiPlatformDxe.inf
> +++ b/OvmfPkg/AcpiPlatformDxe/AcpiPlatformDxe.inf
> @@ -56,16 +56,18 @@
>  
>  [Protocols]
>gEfiAcpiTableProtocolGuid # PROTOCOL ALWAYS_CONSUMED
> +  gEfiPciEnumerationCompleteProtocolGuid# PROTOCOL SOMETIMES_CONSUMED
>  
>  [Guids]
>gEfiXenInfoGuid
>  
>  [Pcd]
>gEfiMdeModulePkgTokenSpaceGuid.PcdAcpiTableStorageFile
> +  gEfiMdeModulePkgTokenSpaceGuid.PcdPciDisableBusEnumeration
>gUefiCpuPkgTokenSpaceGuid.PcdCpuLocalApicBaseAddress
>gPcAtChipsetPkgTokenSpaceGuid.Pcd8259LegacyModeEdgeLevel
>gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFdBaseAddress
>  
>  [Depex]
> -  gEfiAcpiTableProtocolGuid AND gEfiPciEnumerationCompleteProtocolGuid
> +  gEfiAcpiTableProtocolGuid
>  
> diff --git a/OvmfPkg/AcpiPlatformDxe/AcpiPlatform.c b/OvmfPkg/AcpiPlatformDxe/AcpiPlatform.c
> index 11f0ca8..9823eba 100644
> --- a/OvmfPkg/AcpiPlatformDxe/AcpiPlatform.c
> +++ b/OvmfPkg/AcpiPlatformDxe/AcpiPlatform.c
> @@ -12,6 +12,7 @@
>  
>  **/
>  
> +#include 
>  #include "AcpiPlatform.h"
>  
>  EFI_STATUS
> @@ -221,6 +222,47 @@ FindAcpiTablesInFv (
>return EFI_SUCCESS;
>  }
>  
> +STATIC
> +EFI_STATUS
> +EFIAPI
> +InstallTables (
> +  VOID
> +  )
> +{
> +  EFI_STATUS  Status;
> +  EFI_ACPI_TABLE_PROTOCOL *AcpiTable;
> +
> +  Status = gBS->LocateProtocol (&gEfiAcpiTableProtocolGuid,
> +  NULL /* Registration */, (VOID **)&AcpiTable);
> +  if (EFI_ERROR (Status)) {
> +return EFI_ABORTED;
> +  }
> +
> +  if (XenDetected ()) {
> +Status = InstallXenTables (AcpiTable);
> +  } else {
> +Status = InstallAllQemuLinkedTables (AcpiTable);
> +  }
> +
> +  if (EFI_ERROR (Status)) {
> +Status = FindAcpiTablesInFv (AcpiTable);
> +  }
> +
> +  return Status;
> +}
> +
> +STATIC
> +VOID
> +EFIAPI
> +OnPciEnumerated (
> +  IN EFI_EVENT Event,
> +  IN VOID  *Context
> +  )
> +{
> +  InstallTables ();
> +  gBS->CloseEvent (Event);
> +}
> +
>  /**
>Entrypoint of Acpi Platform driver.
>  
> @@ -239,31 +281,43 @@ AcpiPlatformEntryPoint (
>IN EFI_SYSTEM_TABLE   *SystemTable
>)
>  {
> -  EFI_STATUS Status;
> -  EFI_ACPI_TABLE_PROTOCOL*AcpiTable;
> +  EFI_STATUS Status;
> +  VOID   *Interface;
> +  EFI_EVENT  PciEnumerated;
> +  VOID   *Registration;
>  
>//
> -  // Find the AcpiTable protocol
> +  // If PCI enumeration has been disabled, or it has already completed, install
> +  // the tables at once, and let the entry point's return code reflect the full
> +  // functionality.
>//
> -  Status = gBS->LocateProtocol (
> -  &gEfiAcpiTableProtocolGuid,
> -  NULL,
> -  (VOID**)&AcpiTable
> -  );
> -  if (EFI_ERROR (Status)) {
> -return EFI_ABORTED;
> +  if (PcdGetBool (PcdPciDisableBusEnumeration)) {
> +return InstallTables ();
>}
>  
> -  if (XenDetected ()) {
> -Status = InstallXenTables (AcpiTable);
> -  } else {
> -Status = InstallAllQemuLinkedTables (AcpiTable);
> +  Status = gBS->LocateProtocol (&gEfiPciEnumerationCompleteProtocolGuid,
> +  

Re: [Xen-devel] [PATCH] tools/Coverity: Audit of MISSING_BREAK defects

2015-02-12 Thread Don Koch
On Thu, 12 Feb 2015 20:08:33 +
Andrew Cooper  wrote:

> Coverity uses several heuristics to identify when one case statement
> legitimately falls through into the next, and a comment as the final item in a
> case statement is one heuristic (the assumption being that it is a
> justification for the fallthrough).
> 
> Use this to perform an audit of defects and hide the legitimate fallthroughs.
> 
> There are two bugfixes identified in the audit, both minor:
>  * 'n' command line handling for gtracestat
>  * BKSPC handling in xentop
> 
> All other identified defects are legitimate fallthroughs.
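
For contrast with the legitimate fallthroughs, here is a minimal sketch of
the kind of bug the 'n' option fix addresses. The struct and function names
are hypothetical and the default value is a placeholder, not the one
gtracestat uses; only the shape of the switch follows the hunk below.

```c
#include <stdlib.h>

/* Hypothetical option state, for illustration only. */
struct opts {
    long long tsc2phase;
    int is_digest;
};

/* Apply one parsed option. Without the `break` at the end of case 'n',
 * control would run on into case 'd' and set is_digest as a side effect. */
static void apply_opt(struct opts *o, int opt, const char *arg)
{
    switch (opt) {
    case 'n':
        o->tsc2phase = atoll(arg);
        if (o->tsc2phase <= 0)
            o->tsc2phase = 1;   /* placeholder default, not the tool's value */
        break;                  /* the fix: stop here */
    case 'd':
        o->is_digest = 1;
        break;
    }
}
```

Here a fallthrough comment would have hidden a real defect, which is why each
flagged site needs to be read rather than annotated mechanically.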
> 
> Signed-off-by: Andrew Cooper 
> Coverity-IDs: 1055464, 1055465, 1055467, 1055468, 1055481, 1055482
> CC: Ian Campbell 
> CC: Ian Jackson 
> CC: Wei Liu 
> CC: Xen Coverity Team 
> ---
>  tools/libxl/xl_cmdimpl.c |3 +++
>  tools/misc/gtracestat.c  |1 +
>  tools/misc/gtraceview.c  |1 +
>  tools/xenstat/xentop/xentop.c|1 +
>  tools/xenstore/xenstore_client.c |1 +
>  5 files changed, 7 insertions(+)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 440db78..53c16eb 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2752,11 +2752,14 @@ static int64_t parse_mem_size_kb(const char *mem)
>  switch (tolower((uint8_t)*endptr)) {
>  case 't':
>  kbytes <<= 10;
> +/* fallthrough */
>  case 'g':
>  kbytes <<= 10;
> +/* fallthrough */
>  case '\0':
>  case 'm':
>  kbytes <<= 10;
> +/* fallthrough */
>  case 'k':
>  break;
>  case 'b':
> diff --git a/tools/misc/gtracestat.c b/tools/misc/gtracestat.c
> index 874a043..a59e536 100644
> --- a/tools/misc/gtracestat.c
> +++ b/tools/misc/gtracestat.c
> @@ -167,6 +167,7 @@ int main(int argc, char *argv[])
>  tsc2phase = atoll(optarg);
>  if (tsc2phase <= 0)
>  tsc2phase = 5580UL;
> +break;
>  case 'd':
>  is_digest = 1;
>  break;
> diff --git a/tools/misc/gtraceview.c b/tools/misc/gtraceview.c
> index cf9287c..501f86a 100644
> --- a/tools/misc/gtraceview.c
> +++ b/tools/misc/gtraceview.c
> @@ -1097,6 +1097,7 @@ void choose_cpus(void)
>  this->init();
>  return;
>  }
> +/* fallthrough */
>  case KEY_F(4):
>  exit(EXIT_SUCCESS);
>  }
> diff --git a/tools/xenstat/xentop/xentop.c b/tools/xenstat/xentop/xentop.c
> index 3062cb5..23b57f1 100644
> --- a/tools/xenstat/xentop/xentop.c
> +++ b/tools/xenstat/xentop/xentop.c
> @@ -407,6 +407,7 @@ static int handle_key(int ch)
>   case KEY_BACKSPACE:
>   if(prompt_val_len > 0)
>   prompt_val[--prompt_val_len] = '\0';
> +break;

Whitespace? (Yeah, inconsistent tools dir coding style. :-P )

Otherwise...
Reviewed-by: Don Koch 

>   default:
>   if((prompt_val_len+1) < PROMPT_VAL_LEN
>  && isprint(ch)) {
> diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c
> index 1054f18..3d14d37 100644
> --- a/tools/xenstore/xenstore_client.c
> +++ b/tools/xenstore/xenstore_client.c
> @@ -87,6 +87,7 @@ usage(enum mode mode, int incl_mode, const char *progname)
>   errx(1, "Usage: %s %s[-h] [-s] [-t] key [...]", progname, mstr);
>  case MODE_exists:
>   mstr = incl_mode ? "exists " : "";
> + /* fallthrough */
>  case MODE_list:
>   mstr = mstr ? : incl_mode ? "list " : "";
>   errx(1, "Usage: %s %s[-h] [-p] [-s] key [...]", progname, mstr);
> -- 
> 1.7.10.4
> 
> 


Re: [Xen-devel] [RFC v1 3/8] xen: drivers: add XEN_FRONTEND and fold front end drivers under them

2015-02-12 Thread Luis R. Rodriguez
On Thu, Feb 12, 2015 at 11:01:52AM +, David Vrabel wrote:
> On 12/02/15 06:03, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > Fold Xen front end drivers under their own Kconfig entry.
> > You may want to for example only enable domU guests with
> > pv-drivers.
> > 
> > While at it make HVC_XEN_FRONTEND select HVC_XEN.
> [...]
> > --- a/drivers/xen/Kconfig
> > +++ b/drivers/xen/Kconfig
> > @@ -83,6 +83,16 @@ config XEN_BACKEND
> >   Support for backend device drivers that provide I/O services
> >   to other virtual machines.
> >  
> > +config XEN_FRONTEND
> > +   bool "Frontend driver support"
> > +   select XEN
> > +   select XEN_XENBUS_FRONTEND
> > +   default y
> > +   help
> > + Support for frontend device drivers for Xen. You want to enable
> > + this if you want to run Xen guests. You can for instance enable
> > + domU guests with pv-drivers.
> 
> XEN_FRONTEND should appear before XEN_BACKEND.

Fixed.

 Luis



Re: [Xen-devel] [RFC v1 2/8] xen: x86: make XEN_MAX_DOMAIN_MEMORY depend on XEN_HAVE_PVMMU

2015-02-12 Thread Luis R. Rodriguez
On Thu, Feb 12, 2015 at 09:56:30AM +, David Vrabel wrote:
> On 12/02/15 06:03, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > Although XEN currently selects XEN_HAVE_PVMMU that will not
> > be the case in the near future so select this requirement
> > explicitly as per the agreed upon Kconfig changes [0].
> [...]
> > --- a/arch/x86/xen/Kconfig
> > +++ b/arch/x86/xen/Kconfig
> > @@ -28,7 +28,7 @@ config XEN_MAX_DOMAIN_MEMORY
> > int
> > default 500 if X86_64
> > default 64 if X86_32
> > -   depends on XEN
> > +   depends on XEN && XEN_HAVE_PVMMU
> 
> This can be just
> 
>   depends on XEN_HAVE_PVMMU

Fixed.

 Luis



Re: [Xen-devel] [RFC v1 1/8] xen: make dom0 specific changes depend on XEN_DOM0

2015-02-12 Thread Luis R. Rodriguez
On Thu, Feb 12, 2015 at 09:55:18AM +, David Vrabel wrote:
> On 12/02/15 06:03, Luis R. Rodriguez wrote:
> > From: "Luis R. Rodriguez" 
> > 
> > These are Kconfig options which are known to only make
> > sense with Xen dom0 support. This is as per the agreed
> > upon changes to Xen's Kconfig [0].
> [...]
> > --- a/arch/x86/xen/Kconfig
> > +++ b/arch/x86/xen/Kconfig
> > @@ -35,13 +35,13 @@ config XEN_MAX_DOMAIN_MEMORY
> >  
> >  config XEN_SAVE_RESTORE
> > bool
> > -   depends on XEN
> > +   depends on XEN_DOM0
> > select HIBERNATE_CALLBACKS
> > default y
> 
> This breaks save/restore of domUs.
> 
> >  config XEN_DEBUG_FS
> > bool "Enable Xen debug and tuning parameters in debugfs"
> > -   depends on XEN && DEBUG_FS
> > +   depends on XEN_DOM0 && DEBUG_FS
> > default n
> > help
> >   Enable statistics output and various tuning options in debugfs.
> 
> This is useful for domUs.
> 
> > diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> > index 08f41ad..34af197 100644
> > --- a/drivers/watchdog/Kconfig
> > +++ b/drivers/watchdog/Kconfig
> > @@ -1409,7 +1409,7 @@ config WATCHDOG_RIO
> >  
> >  config XEN_WDT
> > tristate "Xen Watchdog support"
> > -   depends on XEN
> > +   depends on XEN_DOM0
> > help
> >   Say Y here to support the hypervisor watchdog capability provided
> >   by Xen 4.0 and newer.  The watchdog timeout period is normally one
> 
> Again, useful for domUs.

OK that leaves only XEN_ACPI_PROCESSOR for the patch. Fixed.

 Luis



Re: [Xen-devel] [PATCH v5 23/24] xl: introduce xcalloc

2015-02-12 Thread Andrew Cooper
On 12/02/15 19:44, Wei Liu wrote:
> Signed-off-by: Wei Liu 
> Cc: Ian Campbell 
> Cc: Ian Jackson 
> ---
>  tools/libxl/xl_cmdimpl.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 440db78..ec7fb2d 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -289,6 +289,18 @@ static void *xmalloc(size_t sz) {
>  return r;
>  }
>  
> +static void *xcalloc(size_t n, size_t sz) __attribute__((unused));
> +static void *xcalloc(size_t n, size_t sz) {
> +void *r;
> +r = calloc(n, sz);

These two lines can be joined, especially in a small wrapper like this.

> +if (!r) {
> +fprintf(stderr,"xl: Unable to calloc %lu bytes.\n",
> +(unsigned long)sz * (unsigned long)n);

%zu is the correct format identifier for a size_t, and it will allow you
to drop the casts.

~Andrew

> +exit(-ERROR_FAIL);
> +}
> +return r;
> +}
> +
>  static void *xrealloc(void *ptr, size_t sz) {
>  void *r;
>  if (!sz) { free(ptr); return 0; }
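
For reference, a minimal sketch of what the two review comments above could look like when applied together. This is not the code that was committed; the name xcalloc_sketch is hypothetical, and printing the element count and element size separately (rather than their product) is my own assumption, chosen so the message cannot overflow.

```c
#include <assert.h>   /* only needed for checks like the one in the note below */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical standalone variant of the wrapper under review. */
void *xcalloc_sketch(size_t n, size_t sz)
{
    void *r = calloc(n, sz);    /* declaration and call joined into one line */

    if (!r) {
        /* %zu is the conversion for size_t, so no casts are required. */
        fprintf(stderr, "xl: Unable to calloc %zu elements of %zu bytes.\n",
                n, sz);
        exit(EXIT_FAILURE);
    }
    return r;
}
```

A caller would use it exactly like calloc, for example `int *p = xcalloc_sketch(4, sizeof(*p));`, and can rely on the returned memory being zeroed and non-NULL.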




[Xen-devel] [PATCH] tools/Coverity: Audit of MISSING_BREAK defects

2015-02-12 Thread Andrew Cooper
Coverity uses several heuristics to identify when one case statement
legitimately falls through into the next, and a comment as the final item in a
case statement is one heuristic (the assumption being that it is a
justification for the fallthrough).

Use this to perform an audit of defects and hide the legitimate fallthroughs.

There are two bugfixes identified in the audit, both minor:
 * 'n' command line handling for gtracestat
 * BKSPC handling in xentop

All other identified defects are legitimate fallthroughs.

Signed-off-by: Andrew Cooper 
Coverity-IDs: 1055464, 1055465, 1055467, 1055468, 1055481, 1055482
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Xen Coverity Team 
---
 tools/libxl/xl_cmdimpl.c |3 +++
 tools/misc/gtracestat.c  |1 +
 tools/misc/gtraceview.c  |1 +
 tools/xenstat/xentop/xentop.c|1 +
 tools/xenstore/xenstore_client.c |1 +
 5 files changed, 7 insertions(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 440db78..53c16eb 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2752,11 +2752,14 @@ static int64_t parse_mem_size_kb(const char *mem)
 switch (tolower((uint8_t)*endptr)) {
 case 't':
 kbytes <<= 10;
+/* fallthrough */
 case 'g':
 kbytes <<= 10;
+/* fallthrough */
 case '\0':
 case 'm':
 kbytes <<= 10;
+/* fallthrough */
 case 'k':
 break;
 case 'b':
diff --git a/tools/misc/gtracestat.c b/tools/misc/gtracestat.c
index 874a043..a59e536 100644
--- a/tools/misc/gtracestat.c
+++ b/tools/misc/gtracestat.c
@@ -167,6 +167,7 @@ int main(int argc, char *argv[])
 tsc2phase = atoll(optarg);
 if (tsc2phase <= 0)
 tsc2phase = 5580UL;
+break;
 case 'd':
 is_digest = 1;
 break;
diff --git a/tools/misc/gtraceview.c b/tools/misc/gtraceview.c
index cf9287c..501f86a 100644
--- a/tools/misc/gtraceview.c
+++ b/tools/misc/gtraceview.c
@@ -1097,6 +1097,7 @@ void choose_cpus(void)
 this->init();
 return;
 }
+/* fallthrough */
 case KEY_F(4):
 exit(EXIT_SUCCESS);
 }
diff --git a/tools/xenstat/xentop/xentop.c b/tools/xenstat/xentop/xentop.c
index 3062cb5..23b57f1 100644
--- a/tools/xenstat/xentop/xentop.c
+++ b/tools/xenstat/xentop/xentop.c
@@ -407,6 +407,7 @@ static int handle_key(int ch)
case KEY_BACKSPACE:
if(prompt_val_len > 0)
prompt_val[--prompt_val_len] = '\0';
+break;
default:
if((prompt_val_len+1) < PROMPT_VAL_LEN
   && isprint(ch)) {
diff --git a/tools/xenstore/xenstore_client.c b/tools/xenstore/xenstore_client.c
index 1054f18..3d14d37 100644
--- a/tools/xenstore/xenstore_client.c
+++ b/tools/xenstore/xenstore_client.c
@@ -87,6 +87,7 @@ usage(enum mode mode, int incl_mode, const char *progname)
errx(1, "Usage: %s %s[-h] [-s] [-t] key [...]", progname, mstr);
 case MODE_exists:
mstr = incl_mode ? "exists " : "";
+   /* fallthrough */
 case MODE_list:
mstr = mstr ? : incl_mode ? "list " : "";
errx(1, "Usage: %s %s[-h] [-p] [-s] key [...]", progname, mstr);
-- 
1.7.10.4
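
As an illustration of the cascading-fallthrough idiom that the audit annotates, here is a self-contained sketch modelled on the parse_mem_size_kb hunk above. The function name mem_size_kb_sketch is hypothetical, and the bodies of the 'b' and default cases are assumptions, since the hunk does not show them.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert "<number>[t|g|m|k|b]" to kilobytes; no suffix means megabytes.
 * Returns -1 on an unknown suffix (assumed behaviour). */
int64_t mem_size_kb_sketch(const char *mem)
{
    char *endptr;
    int64_t kbytes = strtoll(mem, &endptr, 10);

    switch (tolower((unsigned char)*endptr)) {
    case 't':
        kbytes <<= 10;
        /* fallthrough */
    case 'g':
        kbytes <<= 10;
        /* fallthrough */
    case '\0':
    case 'm':
        kbytes <<= 10;
        /* fallthrough */
    case 'k':
        break;
    case 'b':
        kbytes >>= 10;          /* bytes to kilobytes (assumed) */
        break;
    default:
        return -1;
    }
    return kbytes;
}
```

Each larger unit shifts once and then deliberately falls into the next smaller one, so "1g" is shifted twice and yields 1048576 KB, while "2k" is returned unchanged.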




[Xen-devel] [PATCH] xen/Coverity: Audit of MISSING_BREAK defects

2015-02-12 Thread Andrew Cooper
Coverity uses several heuristics to identify when one case statement
legitimately falls through into the next, and a comment as the final item in a
case statement is one heuristic (the assumption being that it is a
justification for the fallthrough).

Use this to perform an audit of defects and hide the legitimate fallthroughs.

No functional change.  All identified fallthroughs are legitimate.

Signed-off-by: Andrew Cooper 
Coverity-IDs: 1055483, 1055484, 1055486 - 1055488, 1055490 - 1055496,
  1055498 - 1055500, 1055501, 1220091
CC: Keir Fraser 
CC: Jan Beulich 
CC: Tim Deegan 
CC: Xen Coverity Team 
---
 xen/arch/x86/hvm/emulate.c  |1 +
 xen/arch/x86/hvm/svm/svm.c  |1 +
 xen/arch/x86/hvm/vlapic.c   |1 +
 xen/arch/x86/mm.c   |2 ++
 xen/arch/x86/traps.c|3 +++
 xen/arch/x86/x86_64/compat/mm.c |1 +
 xen/common/lib.c|4 
 xen/common/schedule.c   |1 +
 8 files changed, 14 insertions(+)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 636c909..c657bc6 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -161,6 +161,7 @@ static int hvmemul_do_io(
 put_page(ram_page);
 return X86EMUL_RETRY;
 }
+/* fallthrough */
 default:
 if ( ram_page )
 put_page(ram_page);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index a7655bd..018dd70 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2378,6 +2378,7 @@ void svm_vmexit_handler(struct cpu_user_regs *regs)
 case NESTEDHVM_VMEXIT_ERROR:
 break;
 }
+/* fallthrough */
 case NESTEDHVM_VMEXIT_ERROR:
 gdprintk(XENLOG_ERR,
 "nestedsvm_check_intercepts() returned NESTEDHVM_VMEXIT_ERROR\n");
diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 5da6d8f..cee8699 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -762,6 +762,7 @@ static int vlapic_reg_write(struct vcpu *v,
 vlapic->hw.tdt_msr = 0;
 }
 vlapic->pt.irq = val & APIC_VECTOR_MASK;
+/* fallthrough */
 case APIC_LVTTHMR:  /* LVT Thermal Monitor */
 case APIC_LVTPC:/* LVT Performance Counter */
 case APIC_LVT0: /* LVT LINT0 Reg */
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d4965da..12e5006 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2771,6 +2771,7 @@ int new_guest_cr3(unsigned long mfn)
 {
 case -EINTR:
 rc = -ERESTART;
+/* fallthrough */
 case -ERESTART:
 curr->arch.old_guest_table = page;
 break;
@@ -3126,6 +3127,7 @@ long do_mmuext_op(
 {
 case -EINTR:
 rc = -ERESTART;
+/* fallthrough */
 case -ERESTART:
 curr->arch.old_guest_table = page;
 okay = 0;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f5516dc..057a7af 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1739,7 +1739,9 @@ static int guest_io_okay(
   port>>3, 2) )
 {
 default: x.bytes[0] = ~0;
+/* fallthrough */
 case 1:  x.bytes[1] = ~0;
+/* fallthrough */
 case 0:  break;
 }
 TOGGLE_MODE();
@@ -3320,6 +3322,7 @@ static void pci_serr_error(const struct cpu_user_regs *regs)
 {
 case 'd': /* 'dom0' */
 nmi_hwdom_report(_XEN_NMIREASON_pci_serr);
+/* fallthrough */
 case 'i': /* 'ignore' */
 /* Would like to print a diagnostic here but can't call printk()
from NMI context -- raise a softirq instead. */
diff --git a/xen/arch/x86/x86_64/compat/mm.c b/xen/arch/x86/x86_64/compat/mm.c
index f90f611..1491ce3 100644
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -292,6 +292,7 @@ int compat_mmuext_op(XEN_GUEST_HANDLE_PARAM(mmuext_op_compat_t) cmp_uops,
 break;
 case MMUEXT_NEW_USER_BASEPTR:
 rc = -EINVAL;
+/* fallthrough */
 case MMUEXT_TLB_FLUSH_LOCAL:
 case MMUEXT_TLB_FLUSH_MULTI:
 case MMUEXT_TLB_FLUSH_ALL:
diff --git a/xen/common/lib.c b/xen/common/lib.c
index 89c74ad..ae0bbb3 100644
--- a/xen/common/lib.c
+++ b/xen/common/lib.c
@@ -461,12 +461,16 @@ unsigned long long parse_size_and_unit(const char *s, const char **ps)
 {
 case 'T': case 't':
 ret <<= 10;
+/* fallthrough */
 case 'G': case 'g':
 ret <<= 10;
+/* fallthrough */
 case 'M': case 'm':
 ret <<= 10;
+/* fallthrough */
 case 'K': case 'k':
 ret <<= 10;
+/* fallthrough */

[Xen-devel] [PATCH v5 21/24] libxlu: nested list support

2015-02-12 Thread Wei Liu
1. Extend grammar of parser.
2. Adjust internal functions to accept XLU_ConfigValue instead of
   char *.

Signed-off-by: Wei Liu 
Cc: Ian Jackson 
Cc: Ian Campbell 
Acked-by: Ian Jackson 
---
 tools/libxl/libxlu_cfg.c   | 30 +++---
 tools/libxl/libxlu_cfg_i.h |  5 +++--
 tools/libxl/libxlu_cfg_y.c | 26 +-
 tools/libxl/libxlu_cfg_y.y |  4 ++--
 4 files changed, 25 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index f000eed..611f5ec 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -332,19 +332,14 @@ XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx, char *atom)
 return NULL;
 }
 
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+  XLU_ConfigValue *val)
 {
 XLU_ConfigValue *value = NULL;
 XLU_ConfigValue **values = NULL;
-XLU_ConfigValue *val = NULL;
 
 if (ctx->err) goto x;
 
-val = malloc(sizeof(*val));
-if (!val) goto xe;
-val->type = XLU_STRING;
-val->u.string = atom;
-
 values = malloc(sizeof(*values));
 if (!values) goto xe;
 values[0] = val;
@@ -363,19 +358,17 @@ XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom)
  x:
 free(value);
 free(values);
-free(val);
-free(atom);
+xlu__cfg_value_free(val);
 return NULL;
 }
 
 void xlu__cfg_list_append(CfgParseContext *ctx,
   XLU_ConfigValue *list,
-  char *atom)
+  XLU_ConfigValue *val)
 {
-XLU_ConfigValue *val = NULL;
 if (ctx->err) return;
 
-assert(atom);
+assert(val);
 assert(list->type == XLU_LIST);
 
 if (list->u.list.nvalues >= list->u.list.avalues) {
@@ -384,7 +377,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
 
 if (list->u.list.avalues > INT_MAX / 100) {
 ctx->err = ERANGE;
-free(atom);
+xlu__cfg_value_free(val);
 return;
 }
 
@@ -393,7 +386,7 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
   sizeof(*new_values) * new_avalues);
 if (!new_values) {
 ctx->err = errno;
-free(atom);
+xlu__cfg_value_free(val);
 return;
 }
 
@@ -401,15 +394,6 @@ void xlu__cfg_list_append(CfgParseContext *ctx,
 list->u.list.values  = new_values;
 }
 
-val = malloc(sizeof(*val));
-if (!val) {
-ctx->err = errno;
-free(atom);
-return;
-}
-
-val->type = XLU_STRING;
-val->u.string = atom;
 list->u.list.values[list->u.list.nvalues] = val;
 list->u.list.nvalues++;
 }
diff --git a/tools/libxl/libxlu_cfg_i.h b/tools/libxl/libxlu_cfg_i.h
index b71e9fd..11dc33f 100644
--- a/tools/libxl/libxlu_cfg_i.h
+++ b/tools/libxl/libxlu_cfg_i.h
@@ -27,10 +27,11 @@ void xlu__cfg_set_store(CfgParseContext*, char *name,
 XLU_ConfigValue *val, int lineno);
 XLU_ConfigValue *xlu__cfg_string_mk(CfgParseContext *ctx,
 char *atom);
-XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx, char *atom);
+XLU_ConfigValue *xlu__cfg_list_mk(CfgParseContext *ctx,
+  XLU_ConfigValue *val);
 void xlu__cfg_list_append(CfgParseContext *ctx,
   XLU_ConfigValue *list,
-  char *atom);
+  XLU_ConfigValue *val);
 void xlu__cfg_value_free(XLU_ConfigValue *value);
 char *xlu__cfgl_strdup(CfgParseContext*, const char *src);
 char *xlu__cfgl_dequote(CfgParseContext*, const char *src);
diff --git a/tools/libxl/libxlu_cfg_y.c b/tools/libxl/libxlu_cfg_y.c
index eb3884f..b05e48b 100644
--- a/tools/libxl/libxlu_cfg_y.c
+++ b/tools/libxl/libxlu_cfg_y.c
@@ -377,7 +377,7 @@ union yyalloc
 /* YYFINAL -- State number of the termination state.  */
 #define YYFINAL  3
 /* YYLAST -- Last index in YYTABLE.  */
-#define YYLAST   24
+#define YYLAST   25
 
 /* YYNTOKENS -- Number of terminals.  */
 #define YYNTOKENS  12
@@ -444,8 +444,8 @@ static const yytype_int8 yyrhs[] =
   15,-1,16,17,-1,17,-1, 1, 6,-1,
3, 7,18,-1, 6,-1, 8,-1,19,-1,
9,22,20,10,-1, 4,-1, 5,-1,-1,
-  21,-1,21,11,22,-1,19,22,-1,21,
-  11,22,19,22,-1,-1,22, 6,-1
+  21,-1,21,11,22,-1,18,22,-1,21,
+  11,22,18,22,-1,-1,22, 6,-1
 };
 
 /* YYRLINE[YYN] -- source line where rule number YYN was defined.  */
@@ -517,14 +517,14 @@ static const yytype_int8 yydefgoto[] =
 static const yytype_int8 yypact[] =
 {
  -18, 4, 0,   -18,-1, 6,   -18,   -18,   -18, 3,
- -18,   -18,1

[Xen-devel] [PATCH v5 24/24] xl: vNUMA support

2015-02-12 Thread Wei Liu
This patch includes the configuration option parser and documentation.

See the hunk for xl.cfg.pod.5 for more information.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
Changes in v5:
1. New syntax for vNUMA configuration.
---
 docs/man/xl.cfg.pod.5|  54 ++
 tools/libxl/xl_cmdimpl.c | 139 ++-
 2 files changed, 192 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 408653f..2a27b1c 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -266,6 +266,60 @@ it will crash.
 
 =back
 
+=head3 Guest Virtual NUMA Configuration
+
+=over 4
+
+=item B in the list specifies the configuration of nth
+virtual node.
+
+Each B is a list, which has a form of
+"[VNODE_CONFIG_OPTION,VNODE_CONFIG_OPTION, ... ]"  (without quotes).
+
+For example vnuma = [ ["pnode=0","size=512","vcpus=0-4","vdistances=10,20"] ]
+means vnode 0 is mapped to pnode 0, has 512MB ram, has vcpus 0 to 4, the
+distance to itself is 10 and the distance to vnode 1 is 20.
+
+Each B is a quoted string. Supported
+Bs are:
+
+=over 4
+
+=item B
+
+Specify which physical node this virtual node maps to.
+
+=item B
+
+Specify the size of this virtual node. The sum of memory size of all
+vnodes must match B (or B if B is not
+specified).
+
+=item B
+
+Specify which vcpus belong to this node. B is a string
+separated by comma. You can specify range and single cpu. An example
+is "vcpus=0-5,8", which means you specify vcpu 0 to vcpu 5, and vcpu
+8.
+
+=item B
+
+Specify virtual distance from this node to all nodes (including
+itself) with positional arguments. For example, "vdistance=10,20"
+for vnode 0 means the distance from vnode 0 to vnode 0 is 10, from
+vnode 0 to vnode 1 is 20. The number of arguments supplied must match
+the total number of vnodes.
+
+Normally you can use the values from "xl info -n" or "numactl
+--hardware" to fill in vdistance list.
+
+=back
+
+=back
+
 =head3 Event Actions
 
 =over 4
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index ec7fb2d..f52daf9 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -158,7 +158,6 @@ struct domain_create {
 };
 
 
-static uint32_t find_domain(const char *p) __attribute__((warn_unused_result));
 static uint32_t find_domain(const char *p)
 {
 uint32_t domid;
@@ -989,6 +988,142 @@ static int parse_nic_config(libxl_device_nic *nic, XLU_Config **config, char *to
 return 0;
 }
 
+static void parse_vnuma_config(const XLU_Config *config,
+   libxl_domain_build_info *b_info)
+{
+libxl_physinfo physinfo;
+uint32_t nr_nodes;
+XLU_ConfigList *vnuma;
+int i, j, len, num_vnuma;
+
+
+libxl_physinfo_init(&physinfo);
+if (libxl_get_physinfo(ctx, &physinfo) != 0) {
+libxl_physinfo_dispose(&physinfo);
+fprintf(stderr, "libxl_get_physinfo failed\n");
+exit(1);
+}
+
+nr_nodes = physinfo.nr_nodes;
+libxl_physinfo_dispose(&physinfo);
+
+if (xlu_cfg_get_list(config, "vnuma", &vnuma, &num_vnuma, 1))
+return;
+
+b_info->num_vnuma_nodes = num_vnuma;
+b_info->vnuma_nodes = xcalloc(num_vnuma, sizeof(libxl_vnode_info));
+
+for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+libxl_vnode_info_init(p);
+libxl_cpu_bitmap_alloc(ctx, &p->vcpus, b_info->max_vcpus);
+libxl_bitmap_set_none(&p->vcpus);
+p->distances = xcalloc(b_info->num_vnuma_nodes,
+   sizeof(*p->distances));
+p->num_distances = b_info->num_vnuma_nodes;
+}
+
+for (i = 0; i < num_vnuma; i++) {
+XLU_ConfigValue *vnode_spec, *conf_option;
+XLU_ConfigList *vnode_config_list;
+int conf_count;
+libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+vnode_spec = xlu_cfg_get_listitem2(vnuma, i);
+assert(vnode_spec);
+
+xlu_cfg_value_get_list(config, vnode_spec, &vnode_config_list, 0);
+if (!vnode_config_list) {
+fprintf(stderr, "xl: cannot get vnode config option list\n");
+exit(1);
+}
+
+for (conf_count = 0;
+ (conf_option =
+  xlu_cfg_get_listitem2(vnode_config_list, conf_count));
+ conf_count++) {
+
+if (xlu_cfg_value_type(conf_option) == XLU_STRING) {
+char *buf, *option_untrimmed, *value_untrimmed;
+char *option, *value;
+char *endptr;
+unsigned long val;
+
+xlu_cfg_value_get_string(config, conf_option, &buf, 0);
+
+if (!buf) continue;
+
+if (split_string_into_pair(buf, "=",
+   &option_untrimmed,
+   &value_untrimmed)) {
+fprintf(stderr, "xl: failed to split \"%s\" into pair\n",
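
To make the "vcpus=0-5,8" syntax from the documentation hunk concrete, here is a minimal sketch of how the value part (after the "=") could be turned into a bitmap. It is not the parser from this patch: the name parse_vcpu_list_sketch, the 64-vcpu limit and the plain uint64_t mask (instead of libxl_bitmap) are all assumptions made to keep the example self-contained.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <stdint.h>
#include <stdlib.h>

/* Parse a list such as "0-5,8" into a bitmask of up to 64 vcpus.
 * Returns 0 on success, -1 on malformed input. */
int parse_vcpu_list_sketch(const char *s, uint64_t *mask)
{
    *mask = 0;
    while (*s) {
        char *end;
        unsigned long lo, hi, v;

        if (*s < '0' || *s > '9')
            return -1;
        lo = hi = strtoul(s, &end, 10);
        if (*end == '-') {                  /* range "lo-hi" */
            s = end + 1;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
        }
        if (lo > hi || hi >= 64)
            return -1;
        for (v = lo; v <= hi; v++)
            *mask |= UINT64_C(1) << v;
        if (*end == ',') {
            end++;
            if (*end == '\0')               /* reject a trailing comma */
                return -1;
        } else if (*end != '\0') {
            return -1;
        }
        s = end;
    }
    return 0;
}
```

With this sketch, "0-5,8" sets bits 0 to 5 and bit 8 (mask 0x13f), matching the "vcpu 0 to vcpu 5, and vcpu 8" wording in the documentation.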
+

[Xen-devel] [PATCH v5 12/24] hvmloader: retrieve vNUMA information from hypervisor

2015-02-12 Thread Wei Liu
hvmloader issues the XENMEM_get_vnumainfo hypercall and stores the
retrieved information in scratch space for later use.

Signed-off-by: Wei Liu 
Cc: Jan Beulich 
---
Changes in v5:
1. Group scratch_alloc together.
2. Use memset.
3. Drop unnecessary "return".
4. Rebase onto Jan's errno ABI change.

Changes in v4:
1. Use *vnode_to_pnode to calculate size.
2. Remove loop.

Changes in v3:
1. Move init_vnuma_info before ACPI stuff.
2. Fix errno.h inclusion.
3. Remove upper limits and use loop.
---
 tools/firmware/hvmloader/Makefile|  2 +-
 tools/firmware/hvmloader/hvmloader.c |  3 ++
 tools/firmware/hvmloader/vnuma.c | 84 
 tools/firmware/hvmloader/vnuma.h | 52 ++
 4 files changed, 140 insertions(+), 1 deletion(-)
 create mode 100644 tools/firmware/hvmloader/vnuma.c
 create mode 100644 tools/firmware/hvmloader/vnuma.h

diff --git a/tools/firmware/hvmloader/Makefile b/tools/firmware/hvmloader/Makefile
index b759e81..cf967fd 100644
--- a/tools/firmware/hvmloader/Makefile
+++ b/tools/firmware/hvmloader/Makefile
@@ -29,7 +29,7 @@ LOADADDR = 0x10
 CFLAGS += $(CFLAGS_xeninclude)
 
 OBJS  = hvmloader.o mp_tables.o util.o smbios.o 
-OBJS += smp.o cacheattr.o xenbus.o
+OBJS += smp.o cacheattr.o xenbus.o vnuma.o
 OBJS += e820.o pci.o pir.o ctype.o
 OBJS += hvm_param.o
 ifeq ($(debug),y)
diff --git a/tools/firmware/hvmloader/hvmloader.c b/tools/firmware/hvmloader/hvmloader.c
index 7b0da38..25b7f08 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -26,6 +26,7 @@
 #include "pci_regs.h"
 #include "apic_regs.h"
 #include "acpi/acpi2_0.h"
+#include "vnuma.h"
 #include 
 #include 
 
@@ -310,6 +311,8 @@ int main(void)
 
 if ( acpi_enabled )
 {
+init_vnuma_info();
+
 if ( bios->acpi_build_tables )
 {
 printf("Loading ACPI ...\n");
diff --git a/tools/firmware/hvmloader/vnuma.c b/tools/firmware/hvmloader/vnuma.c
new file mode 100644
index 000..a71d31a
--- /dev/null
+++ b/tools/firmware/hvmloader/vnuma.c
@@ -0,0 +1,84 @@
+/*
+ * vnuma.c: obtain vNUMA information from hypervisor
+ *
+ * Copyright (c) 2014 Wei Liu, Citrix Systems (R&D) Ltd.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include "util.h"
+#include "hypercall.h"
+#include "vnuma.h"
+#include 
+
+unsigned int nr_vnodes, nr_vmemranges;
+unsigned int *vcpu_to_vnode, *vdistance;
+xen_vmemrange_t *vmemrange;
+
+void init_vnuma_info(void)
+{
+int rc;
+struct xen_vnuma_topology_info vnuma_topo;
+
+memset(&vnuma_topo, 0, sizeof(vnuma_topo));
+vnuma_topo.domid = DOMID_SELF;
+
+rc = hypercall_memory_op(XENMEM_get_vnumainfo, &vnuma_topo);
+
+if ( rc != -XEN_ENOBUFS )
+return;
+
+ASSERT(vnuma_topo.nr_vcpus == hvm_info->nr_vcpus);
+
+vcpu_to_vnode =
+scratch_alloc(sizeof(*vcpu_to_vnode) * hvm_info->nr_vcpus, 0);
+vdistance = scratch_alloc(sizeof(uint32_t) * vnuma_topo.nr_vnodes *
+  vnuma_topo.nr_vnodes, 0);
+vmemrange = scratch_alloc(sizeof(xen_vmemrange_t) *
+  vnuma_topo.nr_vmemranges, 0);
+
+set_xen_guest_handle(vnuma_topo.vdistance.h, vdistance);
+set_xen_guest_handle(vnuma_topo.vcpu_to_vnode.h, vcpu_to_vnode);
+set_xen_guest_handle(vnuma_topo.vmemrange.h, vmemrange);
+
+rc = hypercall_memory_op(XENMEM_get_vnumainfo, &vnuma_topo);
+
+if ( rc < 0 )
+{
+printf("Failed to retrieve vNUMA information, rc = %d\n", rc);
+return;
+}
+
+nr_vnodes = vnuma_topo.nr_vnodes;
+nr_vmemranges = vnuma_topo.nr_vmemranges;
+}
+
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent
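
The init_vnuma_info() routine above follows a probe-then-fetch pattern: call once with no buffers to learn the sizes, allocate, then call again for the data. The sketch below shows that pattern in isolation. Everything in it is a stand-in: struct topo_query, mock_get_topology() and the fixed two-node distance table are hypothetical and only imitate the "-ENOBUFS plus required size" behaviour that the patch relies on.

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the topology structure. */
struct topo_query {
    unsigned int nr_vnodes;     /* in: capacity, out: actual count */
    unsigned int *vdistance;    /* nr_vnodes * nr_vnodes entries, or NULL */
};

/* Mock "hypercall": with a missing or too-small buffer it reports the
 * required size and returns -ENOBUFS; otherwise it fills the buffer. */
int mock_get_topology(struct topo_query *q)
{
    static const unsigned int dist[4] = { 10, 20, 20, 10 };
    const unsigned int real_nodes = 2;

    if (!q->vdistance || q->nr_vnodes < real_nodes) {
        q->nr_vnodes = real_nodes;
        return -ENOBUFS;
    }
    memcpy(q->vdistance, dist, sizeof(dist));
    q->nr_vnodes = real_nodes;
    return 0;
}

/* Probe, allocate, fetch. Returns 0 and fills *out on success. */
int fetch_topology_sketch(struct topo_query *out)
{
    struct topo_query q;
    int rc;

    memset(&q, 0, sizeof(q));
    rc = mock_get_topology(&q);             /* first call: sizes only */
    if (rc != -ENOBUFS)
        return -1;                          /* nothing to fetch */

    q.vdistance = calloc((size_t)q.nr_vnodes * q.nr_vnodes,
                         sizeof(*q.vdistance));
    if (!q.vdistance)
        return -ENOMEM;

    rc = mock_get_topology(&q);             /* second call: real data */
    if (rc < 0) {
        free(q.vdistance);
        return rc;
    }
    *out = q;
    return 0;
}
```

The caller owns out->vdistance after a successful return and is expected to free it; hvmloader itself uses scratch_alloc instead, which has no matching free.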

[Xen-devel] [PATCH v5 19/24] libxl: define LIBXL_HAVE_VNUMA

2015-02-12 Thread Wei Liu
Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
 tools/libxl/libxl.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index c219f59..f33178c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -67,6 +67,12 @@
  * the same $(XEN_VERSION) (e.g. throughout a major release).
  */
 
+/* LIBXL_HAVE_VNUMA
+ *
+ * If it is defined, libxl supports vNUMA configuration
+ */
+#define LIBXL_HAVE_VNUMA 1
+
 /* LIBXL_HAVE_USERDATA_UNLINK
  *
  * If it is defined, libxl has a library function called
-- 
1.9.1




[Xen-devel] [PATCH v5 13/24] hvmloader: construct SRAT

2015-02-12 Thread Wei Liu
Signed-off-by: Wei Liu 
Acked-by: Jan Beulich 
---
Changes in v3:
1. Remove redundant variable.
2. Coding style fix.
3. Add assertion.

Changes in v2:
1. Remove explicit zero initializers.
2. Adapt to new vNUMA retrieval routine.
3. Move SRAT very late in secondary table build.
---
 tools/firmware/hvmloader/acpi/acpi2_0.h | 53 
 tools/firmware/hvmloader/acpi/build.c   | 72 +
 2 files changed, 125 insertions(+)

diff --git a/tools/firmware/hvmloader/acpi/acpi2_0.h b/tools/firmware/hvmloader/acpi/acpi2_0.h
index 7b22d80..6169213 100644
--- a/tools/firmware/hvmloader/acpi/acpi2_0.h
+++ b/tools/firmware/hvmloader/acpi/acpi2_0.h
@@ -364,6 +364,57 @@ struct acpi_20_madt_intsrcovr {
 };
 
 /*
+ * System Resource Affinity Table header definition (SRAT)
+ */
+struct acpi_20_srat {
+struct acpi_header header;
+uint32_t table_revision;
+uint32_t reserved2[2];
+};
+
+#define ACPI_SRAT_TABLE_REVISION 1
+
+/*
+ * System Resource Affinity Table structure types.
+ */
+#define ACPI_PROCESSOR_AFFINITY 0x0
+#define ACPI_MEMORY_AFFINITY0x1
+struct acpi_20_srat_processor {
+uint8_t type;
+uint8_t length;
+uint8_t domain;
+uint8_t apic_id;
+uint32_t flags;
+uint8_t sapic_id;
+uint8_t domain_hi[3];
+uint32_t reserved;
+};
+
+/*
+ * Local APIC Affinity Flags.  All other bits are reserved and must be 0.
+ */
+#define ACPI_LOCAL_APIC_AFFIN_ENABLED (1 << 0)
+
+struct acpi_20_srat_memory {
+uint8_t type;
+uint8_t length;
+uint32_t domain;
+uint16_t reserved;
+uint64_t base_address;
+uint64_t mem_length;
+uint32_t reserved2;
+uint32_t flags;
+uint64_t reserved3;
+};
+
+/*
+ * Memory Affinity Flags.  All other bits are reserved and must be 0.
+ */
+#define ACPI_MEM_AFFIN_ENABLED (1 << 0)
+#define ACPI_MEM_AFFIN_HOTPLUGGABLE (1 << 1)
+#define ACPI_MEM_AFFIN_NONVOLATILE (1 << 2)
+
+/*
  * Table Signatures.
  */
 #define ACPI_2_0_RSDP_SIGNATURE ASCII64('R','S','D',' ','P','T','R',' ')
@@ -375,6 +426,7 @@ struct acpi_20_madt_intsrcovr {
 #define ACPI_2_0_TCPA_SIGNATURE ASCII32('T','C','P','A')
 #define ACPI_2_0_HPET_SIGNATURE ASCII32('H','P','E','T')
 #define ACPI_2_0_WAET_SIGNATURE ASCII32('W','A','E','T')
+#define ACPI_2_0_SRAT_SIGNATURE ASCII32('S','R','A','T')
 
 /*
  * Table revision numbers.
@@ -388,6 +440,7 @@ struct acpi_20_madt_intsrcovr {
 #define ACPI_2_0_HPET_REVISION 0x01
 #define ACPI_2_0_WAET_REVISION 0x01
 #define ACPI_1_0_FADT_REVISION 0x01
+#define ACPI_2_0_SRAT_REVISION 0x01
 
 #pragma pack ()
 
diff --git a/tools/firmware/hvmloader/acpi/build.c b/tools/firmware/hvmloader/acpi/build.c
index 1431296..3e96c23 100644
--- a/tools/firmware/hvmloader/acpi/build.c
+++ b/tools/firmware/hvmloader/acpi/build.c
@@ -23,6 +23,7 @@
 #include "ssdt_pm.h"
 #include "../config.h"
 #include "../util.h"
+#include "../vnuma.h"
 #include 
 #include 
 
@@ -203,6 +204,66 @@ static struct acpi_20_waet *construct_waet(void)
 return waet;
 }
 
+static struct acpi_20_srat *construct_srat(void)
+{
+struct acpi_20_srat *srat;
+struct acpi_20_srat_processor *processor;
+struct acpi_20_srat_memory *memory;
+unsigned int size;
+void *p;
+int i;
+
+size = sizeof(*srat) + sizeof(*processor) * hvm_info->nr_vcpus +
+sizeof(*memory) * nr_vmemranges;
+
+p = mem_alloc(size, 16);
+if ( !p )
+return NULL;
+
+srat = p;
+memset(srat, 0, sizeof(*srat));
+srat->header.signature= ACPI_2_0_SRAT_SIGNATURE;
+srat->header.revision = ACPI_2_0_SRAT_REVISION;
+fixed_strcpy(srat->header.oem_id, ACPI_OEM_ID);
+fixed_strcpy(srat->header.oem_table_id, ACPI_OEM_TABLE_ID);
+srat->header.oem_revision = ACPI_OEM_REVISION;
+srat->header.creator_id   = ACPI_CREATOR_ID;
+srat->header.creator_revision = ACPI_CREATOR_REVISION;
+srat->table_revision  = ACPI_SRAT_TABLE_REVISION;
+
+processor = (struct acpi_20_srat_processor *)(srat + 1);
+for ( i = 0; i < hvm_info->nr_vcpus; i++ )
+{
+memset(processor, 0, sizeof(*processor));
+processor->type = ACPI_PROCESSOR_AFFINITY;
+processor->length   = sizeof(*processor);
+processor->domain   = vcpu_to_vnode[i];
+processor->apic_id  = LAPIC_ID(i);
+processor->flags= ACPI_LOCAL_APIC_AFFIN_ENABLED;
+processor++;
+}
+
+memory = (struct acpi_20_srat_memory *)processor;
+for ( i = 0; i < nr_vmemranges; i++ )
+{
+memset(memory, 0, sizeof(*memory));
+memory->type  = ACPI_MEMORY_AFFINITY;
+memory->length= sizeof(*memory);
+memory->domain= vmemrange[i].nid;
+memory->flags = ACPI_MEM_AFFIN_ENABLED;
+memory->base_address  = vmemrange[i].start;
+memory->mem_length= vmemrange[i].end - vmemrange[i].start;
+memory++;
+}
+
+ASSERT(((unsigned long)memory) - ((unsigned long)p) == size);
+
+srat->header.length = size;
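
The size computation in construct_srat() depends on the affinity structures being byte-packed. The following sketch restates the two structures from the acpi2_0.h hunk with hypothetical _sketch names and shows the resulting table size. The 36-byte header layout and the use of #pragma pack(1) are assumptions on my part; the patch only shows the closing "#pragma pack ()".

```c
#include <assert.h>   /* only needed for checks like the ones in the note below */
#include <stddef.h>
#include <stdint.h>

#pragma pack(1)
/* Assumed layout of the common ACPI table header (36 bytes). */
struct acpi_header_sketch {
    uint32_t signature, length;
    uint8_t  revision, checksum;
    char     oem_id[6], oem_table_id[8];
    uint32_t oem_revision, creator_id, creator_revision;
};

struct srat_sketch {
    struct acpi_header_sketch header;
    uint32_t table_revision;
    uint32_t reserved2[2];
};

struct srat_processor_sketch {      /* 16 bytes when packed */
    uint8_t  type, length, domain, apic_id;
    uint32_t flags;
    uint8_t  sapic_id;
    uint8_t  domain_hi[3];
    uint32_t reserved;
};

struct srat_memory_sketch {         /* 40 bytes when packed */
    uint8_t  type, length;
    uint32_t domain;
    uint16_t reserved;
    uint64_t base_address, mem_length;
    uint32_t reserved2, flags;
    uint64_t reserved3;
};
#pragma pack()

/* Same arithmetic as the "size = ..." statement in construct_srat(). */
size_t srat_size_sketch(unsigned int nr_vcpus, unsigned int nr_vmemranges)
{
    return sizeof(struct srat_sketch)
         + sizeof(struct srat_processor_sketch) * nr_vcpus
         + sizeof(struct srat_memory_sketch) * nr_vmemranges;
}
```

Under these assumptions a guest with 4 vcpus and 2 memory ranges needs 48 + 4 * 16 + 2 * 40 = 192 bytes, which is the value the ASSERT at the end of construct_srat() would compare against.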
+  

[Xen-devel] [PATCH v5 20/24] libxlu: rework internal representation of setting

2015-02-12 Thread Wei Liu
This patch does the following things:

1. Properly define a XLU_ConfigList type. Originally it was defined to
   be XLU_ConfigSetting.
2. Define XLU_ConfigValue type, which can be either a string or a list
   of XLU_ConfigValue.
3. ConfigSetting now references XLU_ConfigValue. Originally it only
   worked with **string.
4. Properly construct list where necessary, see changes to .y file.

To achieve the above changes:

1. xlu__cfg_set_mk and xlu__cfg_set_add are deleted, because they
   are no longer needed in the new code.
2. Introduce xlu__cfg_string_mk to make a XLU_ConfigSetting that points
   to a XLU_ConfigValue that wraps a string.
3. Introduce xlu__cfg_list_mk to make a XLU_ConfigSetting that points
   to XLU_ConfigValue that is a list.
4. The parser now generates XLU_ConfigValue instead of XLU_ConfigSetting
   when constructing values, which enables us to recursively generate
   lists of lists.
5. XLU_ConfigSetting is generated in xlu__cfg_set_store.
6. Adapt other functions to use new types.

No change to public API. Xl compiles without problem and 'xl create -n
guest.cfg' is valgrind clean.

This patch is needed because we're going to implement nested list
support, which requires support for lists of lists.

Signed-off-by: Wei Liu 
Cc: Ian Jackson 
Cc: Ian Campbell 
---
Changes in v5:
1. Use standard expanding-array pattern.
---
 tools/libxl/libxlu_cfg.c  | 170 ++
 tools/libxl/libxlu_cfg_i.h|  12 ++-
 tools/libxl/libxlu_cfg_y.c|  24 +++---
 tools/libxl/libxlu_cfg_y.h|   2 +-
 tools/libxl/libxlu_cfg_y.y|  14 ++--
 tools/libxl/libxlu_internal.h |  30 ++--
 6 files changed, 173 insertions(+), 79 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 22adcb0..f000eed 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -131,14 +131,28 @@ int xlu_cfg_readdata(XLU_Config *cfg, const char *data, int length) {
 return ctx.err;
 }
 
-void xlu__cfg_set_free(XLU_ConfigSetting *set) {
+void xlu__cfg_value_free(XLU_ConfigValue *value)
+{
 int i;
 
+if (!value) return;
+
+switch (value->type) {
+case XLU_STRING:
+free(value->u.string);
+break;
+case XLU_LIST:
+for (i = 0; i < value->u.list.nvalues; i++)
+xlu__cfg_value_free(value->u.list.values[i]);
+free(value->u.list.values);
+}
+free(value);
+}
+
+void xlu__cfg_set_free(XLU_ConfigSetting *set) {
 if (!set) return;
 free(set->name);
-for (i=0; i<set->nvalues; i++)
-free(set->values[i]);
-free(set->values);
+xlu__cfg_value_free(set->value);
 free(set);
 }
 
@@ -173,7 +187,7 @@ static int find_atom(const XLU_Config *cfg, const char *n,
 set= find(cfg,n);
 if (!set) return ESRCH;
 
-if (set->avalues!=1) {
+if (set->value->type!=XLU_STRING) {
 if (!dont_warn)
 fprintf(cfg->report,
 "%s:%d: warning: parameter `%s' is"
@@ -191,7 +205,7 @@ int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
 int e;
 
 e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-*value_r= set->values[0];
+*value_r= set->value->u.string;
 return 0;
 }
 
@@ -202,7 +216,7 @@ int xlu_cfg_replace_string(const XLU_Config *cfg, const char *n,
 
 e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
 free(*value_r);
-*value_r= strdup(set->values[0]);
+*value_r= strdup(set->value->u.string);
 return 0;
 }
 
@@ -214,7 +228,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
 char *ep;
 
 e= find_atom(cfg,n,&set,dont_warn);  if (e) return e;
-errno= 0; l= strtol(set->values[0], &ep, 0);
+errno= 0; l= strtol(set->value->u.string, &ep, 0);
 e= errno;
 if (errno) {
 e= errno;
@@ -226,7 +240,7 @@ int xlu_cfg_get_long(const XLU_Config *cfg, const char *n,
 cfg->config_source, set->lineno, n, strerror(e));
 return e;
 }
-if (*ep || ep==set->values[0]) {
+if (*ep || ep==set->value->u.string) {
 if (!dont_warn)
 fprintf(cfg->report,
 "%s:%d: warning: parameter `%s' is not a valid number\n",
@@ -253,7 +267,7 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
  XLU_ConfigList **list_r, int *entries_r, int dont_warn) {
 XLU_ConfigSetting *set;
 set= find(cfg,n);  if (!set) return ESRCH;
-if (set->avalues==1) {
+if (set->value->type!=XLU_LIST) {
 if (!dont_warn) {
 fprintf(cfg->report,
 "%s:%d: warning: parameter `%s' is a single value"
@@ -262,8 +276,8 @@ int xlu_cfg_get_list(const XLU_Config *cfg, const char *n,
 }
 return EINVAL;
 }
-if (list_r) *list_r= set;
-if (entries_r) *entries_r= set->nvalues;
+if (list_r) *list_r= &set->value->u.list;
+if (entries_r) *entries_r= set->value->u.list.nvalues;
 return 0;
 }
 
@@ -290,72 +304,130 @@ int xlu_cfg_get_list_as_
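
The core idea of this rework, a value that is either a string or a list of further values, can be sketched independently of libxlu. The types and helpers below (struct value, string_mk, list_mk, list_append, value_free) are hypothetical names that only mirror the shape of XLU_ConfigValue, xlu__cfg_value_free and the expanding-array append; the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum value_type { VAL_STRING, VAL_LIST };

struct value {
    enum value_type type;
    union {
        char *string;
        struct { int nvalues, avalues; struct value **values; } list;
    } u;
};

/* Recursively free a value; lists free their children first. */
void value_free(struct value *v)
{
    int i;

    if (!v) return;
    switch (v->type) {
    case VAL_STRING:
        free(v->u.string);
        break;
    case VAL_LIST:
        for (i = 0; i < v->u.list.nvalues; i++)
            value_free(v->u.list.values[i]);
        free(v->u.list.values);
        break;
    }
    free(v);
}

struct value *string_mk(const char *s)
{
    struct value *v = malloc(sizeof(*v));

    if (!v) return NULL;
    v->type = VAL_STRING;
    v->u.string = malloc(strlen(s) + 1);
    if (!v->u.string) { free(v); return NULL; }
    strcpy(v->u.string, s);
    return v;
}

struct value *list_mk(void)
{
    struct value *v = malloc(sizeof(*v));

    if (!v) return NULL;
    v->type = VAL_LIST;
    v->u.list.nvalues = v->u.list.avalues = 0;
    v->u.list.values = NULL;
    return v;
}

/* Append item to list, taking ownership; item is freed on failure. */
int list_append(struct value *list, struct value *item)
{
    assert(item);
    assert(list->type == VAL_LIST);

    if (list->u.list.nvalues >= list->u.list.avalues) {
        int new_avalues = list->u.list.avalues ? list->u.list.avalues * 2 : 4;
        struct value **nv = realloc(list->u.list.values,
                                    sizeof(*nv) * new_avalues);
        if (!nv) { value_free(item); return -1; }
        list->u.list.values = nv;
        list->u.list.avalues = new_avalues;
    }
    list->u.list.values[list->u.list.nvalues++] = item;
    return 0;
}
```

Because a list element is itself a struct value, a configuration such as [ ["pnode=0","size=512"] ] is just an outer list whose single element is an inner list of two strings, and one value_free() call on the outer list releases everything.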

[Xen-devel] [PATCH v5 18/24] libxl: disallow memory relocation when vNUMA is enabled

2015-02-12 Thread Wei Liu
Disallow memory relocation when vNUMA is enabled, because relocated
memory ends up off node. Furthermore, even if we dynamically expand
node coverage in hvmloader, low memory and high memory may reside
in different physical nodes, so blindly relocating low memory to high
memory gives us a sub-optimal configuration.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Konrad Wilk 
---
 tools/libxl/libxl_dm.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 8599a6a..8edf276 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -1365,13 +1365,15 @@ void libxl__spawn_local_dm(libxl__egc *egc, 
libxl__dm_spawn_state *dmss)
 libxl__sprintf(gc, "%s/hvmloader/bios", path),
 "%s", libxl_bios_type_to_string(b_info->u.hvm.bios));
 /* Disable relocating memory to make the MMIO hole larger
- * unless we're running qemu-traditional */
+ * unless we're running qemu-traditional and vNUMA is not
+ * configured. */
 libxl__xs_write(gc, XBT_NULL,
 libxl__sprintf(gc,
"%s/hvmloader/allow-memory-relocate",
path),
 "%d",
-
b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL);
+
b_info->device_model_version==LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL &&
+!b_info->num_vnuma_nodes);
 free(path);
 }
 
-- 
1.9.1




[Xen-devel] [PATCH v5 14/24] hvmloader: construct SLIT

2015-02-12 Thread Wei Liu
Signed-off-by: Wei Liu 
Acked-by: Jan Beulich 
---
Changes in v3:
1. Coding style fix.
2. Fix an error code.
3. Use unsigned int for loop variable.

Changes in v2:
1. Adapt to new vNUMA retrieval routine.
2. Move SLIT very late in secondary table build.
---
 tools/firmware/hvmloader/acpi/acpi2_0.h |  8 +++
 tools/firmware/hvmloader/acpi/build.c   | 40 -
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/tools/firmware/hvmloader/acpi/acpi2_0.h 
b/tools/firmware/hvmloader/acpi/acpi2_0.h
index 6169213..d698095 100644
--- a/tools/firmware/hvmloader/acpi/acpi2_0.h
+++ b/tools/firmware/hvmloader/acpi/acpi2_0.h
@@ -414,6 +414,12 @@ struct acpi_20_srat_memory {
 #define ACPI_MEM_AFFIN_HOTPLUGGABLE (1 << 1)
 #define ACPI_MEM_AFFIN_NONVOLATILE (1 << 2)
 
+struct acpi_20_slit {
+struct acpi_header header;
+uint64_t localities;
+uint8_t entry[0];
+};
+
 /*
  * Table Signatures.
  */
@@ -427,6 +433,7 @@ struct acpi_20_srat_memory {
 #define ACPI_2_0_HPET_SIGNATURE ASCII32('H','P','E','T')
 #define ACPI_2_0_WAET_SIGNATURE ASCII32('W','A','E','T')
 #define ACPI_2_0_SRAT_SIGNATURE ASCII32('S','R','A','T')
+#define ACPI_2_0_SLIT_SIGNATURE ASCII32('S','L','I','T')
 
 /*
  * Table revision numbers.
@@ -441,6 +448,7 @@ struct acpi_20_srat_memory {
 #define ACPI_2_0_WAET_REVISION 0x01
 #define ACPI_1_0_FADT_REVISION 0x01
 #define ACPI_2_0_SRAT_REVISION 0x01
+#define ACPI_2_0_SLIT_REVISION 0x01
 
 #pragma pack ()
 
diff --git a/tools/firmware/hvmloader/acpi/build.c 
b/tools/firmware/hvmloader/acpi/build.c
index 3e96c23..7dac6a8 100644
--- a/tools/firmware/hvmloader/acpi/build.c
+++ b/tools/firmware/hvmloader/acpi/build.c
@@ -264,6 +264,38 @@ static struct acpi_20_srat *construct_srat(void)
 return srat;
 }
 
+static struct acpi_20_slit *construct_slit(void)
+{
+struct acpi_20_slit *slit;
+unsigned int i, num, size;
+
+num = nr_vnodes * nr_vnodes;
+size = sizeof(*slit) + num * sizeof(uint8_t);
+
+slit = mem_alloc(size, 16);
+if ( !slit )
+return NULL;
+
+memset(slit, 0, size);
+slit->header.signature= ACPI_2_0_SLIT_SIGNATURE;
+slit->header.revision = ACPI_2_0_SLIT_REVISION;
+fixed_strcpy(slit->header.oem_id, ACPI_OEM_ID);
+fixed_strcpy(slit->header.oem_table_id, ACPI_OEM_TABLE_ID);
+slit->header.oem_revision = ACPI_OEM_REVISION;
+slit->header.creator_id   = ACPI_CREATOR_ID;
+slit->header.creator_revision = ACPI_CREATOR_REVISION;
+
+for ( i = 0; i < num; i++ )
+slit->entry[i] = vdistance[i];
+
+slit->localities = nr_vnodes;
+
+slit->header.length = size;
+set_checksum(slit, offsetof(struct acpi_header, checksum), size);
+
+return slit;
+}
+
 static int construct_passthrough_tables(unsigned long *table_ptrs,
 int nr_tables)
 {
@@ -319,6 +351,7 @@ static int construct_secondary_tables(unsigned long 
*table_ptrs,
 struct acpi_20_waet *waet;
 struct acpi_20_tcpa *tcpa;
 struct acpi_20_srat *srat;
+struct acpi_20_slit *slit;
 unsigned char *ssdt;
 static const uint16_t tis_signature[] = {0x0001, 0x0001, 0x0001};
 uint16_t *tis_hdr;
@@ -408,7 +441,7 @@ static int construct_secondary_tables(unsigned long 
*table_ptrs,
 }
 }
 
-/* SRAT */
+/* SRAT and SLIT */
 if ( nr_vnodes > 0 )
 {
 srat = construct_srat();
@@ -416,6 +449,11 @@ static int construct_secondary_tables(unsigned long 
*table_ptrs,
 table_ptrs[nr_tables++] = (unsigned long)srat;
 else
 printf("Failed to build SRAT, skipping...\n");
+slit = construct_slit();
+if ( slit )
+table_ptrs[nr_tables++] = (unsigned long)slit;
+else
+printf("Failed to build SLIT, skipping...\n");
 }
 
 /* Load any additional tables passed through. */
-- 
1.9.1




[Xen-devel] [PATCH v5 10/24] libxl: functions to build vmemranges for PV guest

2015-02-12 Thread Wei Liu
Introduce an arch-independent routine to generate one vmemrange per
vnode. Also introduce arch-dependent routines for different
architectures because part of the process is arch-specific -- ARM does
not yet have NUMA support and E820 is x86 only.

For x86 guests that care about the machine E820 map (i.e. with
e820_host=1), each vnode is further split into several vmemranges to
accommodate memory holes.  A few stubs for libxl_arm.c are created.
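
The generic "one vmemrange per vnode" layout can be sketched as below.
This is a simplified stand-alone version, assuming a made-up struct
vmemrange and a plain array of per-node sizes in KiB instead of the
libxl types used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified range type, modelled on the fields the patch
 * fills in. */
struct vmemrange {
    uint64_t start, end;   /* byte addresses, end exclusive */
    unsigned int flags;
    unsigned int nid;      /* virtual node id */
};

/* Lay out one contiguous range per vnode, starting at address 0.
 * memkb[i] is the size of vnode i in KiB; out must hold n entries. */
void build_vmemranges(const uint64_t *memkb, size_t n, struct vmemrange *out)
{
    uint64_t next = 0;
    size_t i;

    assert(memkb != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        out[i].start = next;
        out[i].end   = next + (memkb[i] << 10);  /* KiB -> bytes */
        out[i].flags = 0;
        out[i].nid   = (unsigned int)i;
        next = out[i].end;                       /* next range starts here */
    }
}
```

The e820_host=1 case would additionally have to cut each range around
the holes in the host memory map, which this sketch does not attempt.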

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
---
Changes in v5:
1. Allocate array all in one go.
2. Reverse the logic of vmemranges generation.

Changes in v4:
1. Adapt to new interface.
2. Address Ian Jackson's comments.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxl/libxl_arch.h |  6 
 tools/libxl/libxl_arm.c  |  8 +
 tools/libxl/libxl_internal.h |  8 +
 tools/libxl/libxl_vnuma.c| 41 +
 tools/libxl/libxl_x86.c  | 73 
 5 files changed, 136 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index d3bc136..e249048 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -27,4 +27,10 @@ int libxl__arch_domain_init_hw_description(libxl__gc *gc,
 int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
   libxl_domain_build_info *info,
   struct xc_dom_image *dom);
+
+/* build vNUMA vmemrange with arch specific information */
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+  uint32_t domid,
+  libxl_domain_build_info *b_info,
+  libxl__domain_build_state *state);
 #endif
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 65a762b..7da254f 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -707,6 +707,14 @@ int libxl__arch_domain_finalise_hw_description(libxl__gc 
*gc,
 return 0;
 }
 
+int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
+  uint32_t domid,
+  libxl_domain_build_info *info,
+  libxl__domain_build_state *state)
+{
+return libxl__vnuma_build_vmemrange_pv_generic(gc, domid, info, state);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 258be0d..7d1e1cf 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3400,6 +3400,14 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
 int libxl__vnuma_config_check(libxl__gc *gc,
   const libxl_domain_build_info *b_info,
   const libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+uint32_t domid,
+libxl_domain_build_info *b_info,
+libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+uint32_t domid,
+libxl_domain_build_info *b_info,
+libxl__domain_build_state *state);
 
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
const libxl_ms_vm_genid *id);
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
index fa5aa8d..3d46239 100644
--- a/tools/libxl/libxl_vnuma.c
+++ b/tools/libxl/libxl_vnuma.c
@@ -14,6 +14,7 @@
  */
 #include "libxl_osdeps.h" /* must come before any other headers */
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 #include <stdlib.h>
 
 /* Sort vmemranges in ascending order with "start" */
@@ -122,6 +123,46 @@ out:
 return rc;
 }
 
+
+int libxl__vnuma_build_vmemrange_pv_generic(libxl__gc *gc,
+uint32_t domid,
+libxl_domain_build_info *b_info,
+libxl__domain_build_state *state)
+{
+int i;
+uint64_t next;
+xen_vmemrange_t *v = NULL;
+
+/* Generate one vmemrange for each virtual node. */
+GCREALLOC_ARRAY(v, b_info->num_vnuma_nodes);
+next = 0;
+for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+libxl_vnode_info *p = &b_info->vnuma_nodes[i];
+
+v[i].start = next;
+v[i].end = next + (p->memkb << 10);
+v[i].flags = 0;
+v[i].nid = i;
+
+next = v[i].end;
+}
+
+state->vmemranges = v;
+state->num_vmemranges = i;
+
+return 0;
+}
+
+/* Build vmemranges for PV guest */
+int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
+uint32_t domid,
+libxl_domain_build_info *b_inf

[Xen-devel] [PATCH v5 22/24] libxlu: introduce new APIs

2015-02-12 Thread Wei Liu
These APIs can be used to manipulate XLU_ConfigValue and XLU_ConfigList.

APIs introduced:
1. xlu_cfg_value_type
2. xlu_cfg_value_get_string
3. xlu_cfg_value_get_list
4. xlu_cfg_get_listitem2

Move some definitions from private header to public header as needed.
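
To show the calling convention in isolation, here is a miniature
tagged-value type with two accessors in the same style. All type and
function names below are hypothetical and only mimic the shape of the
new APIs; they are not the libxlu definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature of a tagged config value. */
enum value_type { VAL_STRING, VAL_LIST };

struct value;
struct value_list { int nvalues; struct value **values; };
struct value {
    enum value_type type;
    union { char *string; struct value_list list; } u;
};

/* Same convention as the older accessors: return 0 or an errno value and
 * hand the result back through an out parameter. */
int value_get_string(struct value *v, char **out)
{
    assert(v != NULL && out != NULL);
    if (v->type != VAL_STRING) {
        *out = NULL;
        return EINVAL;
    }
    *out = v->u.string;
    return 0;
}

/* Out-of-range indices yield NULL rather than an error code. */
struct value *list_item(const struct value_list *l, int entry)
{
    if (entry < 0 || entry >= l->nvalues) return NULL;
    return l->values[entry];
}
```

A caller walking a nested list would check the type of each item first
and then pick the matching accessor.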

Signed-off-by: Wei Liu 
Cc: Ian Jackson 
Cc: Ian Campbell 
---
Changes in v5:
1. Use calling convention like old APIs.
---
 tools/libxl/libxlu_cfg.c  | 41 +
 tools/libxl/libxlu_internal.h |  7 ---
 tools/libxl/libxlutil.h   | 13 +
 3 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 611f5ec..46b1d4f 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -199,6 +199,47 @@ static int find_atom(const XLU_Config *cfg, const char *n,
 return 0;
 }
 
+
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value)
+{
+return value->type;
+}
+
+int xlu_cfg_value_get_string(const XLU_Config *cfg, XLU_ConfigValue *value,
+ char **value_r, int dont_warn)
+{
+if (value->type != XLU_STRING) {
+if (!dont_warn)
+fprintf(cfg->report, "warning: value is not a string\n");
+*value_r = NULL;
+return EINVAL;
+}
+
+*value_r = value->u.string;
+return 0;
+}
+
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+   XLU_ConfigList **value_r, int dont_warn)
+{
+if (value->type != XLU_LIST) {
+if (!dont_warn)
+fprintf(cfg->report, "warning: value is not a list\n");
+*value_r = NULL;
+return EINVAL;
+}
+
+*value_r = &value->u.list;
+return 0;
+}
+
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+   int entry)
+{
+if (entry < 0 || entry >= list->nvalues) return NULL;
+return list->values[entry];
+}
+
 int xlu_cfg_get_string(const XLU_Config *cfg, const char *n,
const char **value_r, int dont_warn) {
 XLU_ConfigSetting *set;
diff --git a/tools/libxl/libxlu_internal.h b/tools/libxl/libxlu_internal.h
index 092a17a..24ed6d4 100644
--- a/tools/libxl/libxlu_internal.h
+++ b/tools/libxl/libxlu_internal.h
@@ -25,13 +25,6 @@
 
 #include "libxlutil.h"
 
-enum XLU_ConfigValueType {
-XLU_STRING,
-XLU_LIST,
-};
-
-typedef struct XLU_ConfigValue XLU_ConfigValue;
-
 typedef struct XLU_ConfigList {
 int avalues; /* available slots */
 int nvalues; /* actual occupied slots */
diff --git a/tools/libxl/libxlutil.h b/tools/libxl/libxlutil.h
index 0333e55..989605a 100644
--- a/tools/libxl/libxlutil.h
+++ b/tools/libxl/libxlutil.h
@@ -20,9 +20,15 @@
 
 #include "libxl.h"
 
+enum XLU_ConfigValueType {
+XLU_STRING,
+XLU_LIST,
+};
+
 /* Unless otherwise stated, all functions return an errno value. */
 typedef struct XLU_Config XLU_Config;
 typedef struct XLU_ConfigList XLU_ConfigList;
+typedef struct XLU_ConfigValue XLU_ConfigValue;
 
 XLU_Config *xlu_cfg_init(FILE *report, const char *report_filename);
   /* 0 means we got ENOMEM. */
@@ -66,6 +72,13 @@ const char *xlu_cfg_get_listitem(const XLU_ConfigList*, int 
entry);
   /* xlu_cfg_get_listitem cannot fail, except that if entry is
* out of range it returns 0 (not setting errno) */
 
+enum XLU_ConfigValueType xlu_cfg_value_type(const XLU_ConfigValue *value);
+int xlu_cfg_value_get_string(const XLU_Config *cfg,  XLU_ConfigValue *value,
+ char **value_r, int dont_warn);
+int xlu_cfg_value_get_list(const XLU_Config *cfg, XLU_ConfigValue *value,
+   XLU_ConfigList **value_r, int dont_warn);
+XLU_ConfigValue *xlu_cfg_get_listitem2(const XLU_ConfigList *list,
+   int entry);
 
 /*
  * Disk specification parsing.
-- 
1.9.1




[Xen-devel] [PATCH v5 11/24] libxl: build, check and pass vNUMA info to Xen for PV guest

2015-02-12 Thread Wei Liu
Transform the user-supplied vNUMA configuration into libxl internal
representations, and finally libxc representations. Check validity of
the configuration along the way.
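
The transformation amounts to flattening per-vnode data into three flat
arrays. A minimal sketch, assuming a made-up struct vnode that lists
vcpu ids in a plain array (libxl uses a bitmap instead):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-vnode description; libxl's real type differs. */
struct vnode {
    unsigned int pnode;
    const unsigned int *distances; /* n entries: distance to every vnode */
    const unsigned int *vcpus;     /* vcpu ids placed on this vnode */
    size_t nr_vcpus;
};

/* Flatten per-vnode data into the three arrays handed to the hypervisor:
 * vdistance is an n*n row-major matrix, the other two are simple maps. */
void flatten_vnuma(const struct vnode *nodes, size_t n,
                   unsigned int *vnode_to_pnode,
                   unsigned int *vcpu_to_vnode,
                   unsigned int *vdistance)
{
    size_t i, j;

    assert(nodes != NULL);
    for (i = 0; i < n; i++) {
        vnode_to_pnode[i] = nodes[i].pnode;
        for (j = 0; j < nodes[i].nr_vcpus; j++)
            vcpu_to_vnode[nodes[i].vcpus[j]] = (unsigned int)i;
        /* Row i of the matrix is vnode i's distance vector. */
        memcpy(vdistance + i * n, nodes[i].distances,
               n * sizeof(unsigned int));
    }
}
```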

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
Acked-by: Ian Campbell 
---
Changes in v5:
1. Adapt to change of interface (ditching xc_vnuma_info).

Changes in v4:
1. Adapt to new interfaces.

Changes in v3:
1. Add more commit log.
---
 tools/libxl/libxl_dom.c | 77 +
 1 file changed, 77 insertions(+)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 48d661a..1ff0704 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -515,6 +515,51 @@ retry_transaction:
 return 0;
 }
 
+static int set_vnuma_info(libxl__gc *gc, uint32_t domid,
+  const libxl_domain_build_info *info,
+  const libxl__domain_build_state *state)
+{
+int rc = 0;
+int i, nr_vdistance;
+unsigned int *vcpu_to_vnode, *vnode_to_pnode, *vdistance = NULL;
+
+vcpu_to_vnode = libxl__calloc(gc, info->max_vcpus,
+  sizeof(unsigned int));
+vnode_to_pnode = libxl__calloc(gc, info->num_vnuma_nodes,
+   sizeof(unsigned int));
+
+nr_vdistance = info->num_vnuma_nodes * info->num_vnuma_nodes;
+vdistance = libxl__calloc(gc, nr_vdistance, sizeof(unsigned int));
+
+for (i = 0; i < info->num_vnuma_nodes; i++) {
+libxl_vnode_info *v = &info->vnuma_nodes[i];
+int bit;
+
+/* vnode to pnode mapping */
+vnode_to_pnode[i] = v->pnode;
+
+/* vcpu to vnode mapping */
+libxl_for_each_set_bit(bit, v->vcpus)
+vcpu_to_vnode[bit] = i;
+
+/* node distances */
+assert(info->num_vnuma_nodes == v->num_distances);
+memcpy(vdistance + (i * info->num_vnuma_nodes),
+   v->distances,
+   v->num_distances * sizeof(unsigned int));
+}
+
+if (xc_domain_setvnuma(CTX->xch, domid, info->num_vnuma_nodes,
+   state->num_vmemranges, info->max_vcpus,
+   state->vmemranges, vdistance,
+   vcpu_to_vnode, vnode_to_pnode) < 0) {
+LOGE(ERROR, "xc_domain_setvnuma failed");
+rc = ERROR_FAIL;
+}
+
+return rc;
+}
+
 int libxl__build_pv(libxl__gc *gc, uint32_t domid,
  libxl_domain_build_info *info, libxl__domain_build_state *state)
 {
@@ -572,6 +617,38 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
 dom->xenstore_domid = state->store_domid;
 dom->claim_enabled = libxl_defbool_val(info->claim_mode);
 
+if (info->num_vnuma_nodes != 0) {
+int i;
+
+ret = libxl__vnuma_build_vmemrange_pv(gc, domid, info, state);
+if (ret) {
+LOGE(ERROR, "cannot build vmemranges");
+goto out;
+}
+ret = libxl__vnuma_config_check(gc, info, state);
+if (ret) goto out;
+
+ret = set_vnuma_info(gc, domid, info, state);
+if (ret) goto out;
+
+dom->nr_vmemranges = state->num_vmemranges;
+dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges) *
+dom->nr_vmemranges);
+
+for (i = 0; i < dom->nr_vmemranges; i++) {
+dom->vmemranges[i].start = state->vmemranges[i].start;
+dom->vmemranges[i].end   = state->vmemranges[i].end;
+dom->vmemranges[i].flags = state->vmemranges[i].flags;
+dom->vmemranges[i].nid   = state->vmemranges[i].nid;
+}
+
+dom->nr_vnodes = info->num_vnuma_nodes;
+dom->vnode_to_pnode = xc_dom_malloc(dom, sizeof(*dom->vnode_to_pnode) *
+dom->nr_vnodes);
+for (i = 0; i < info->num_vnuma_nodes; i++)
+dom->vnode_to_pnode[i] = info->vnuma_nodes[i].pnode;
+}
+
 if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
 LOGE(ERROR, "xc_dom_boot_xen_init failed");
 goto out;
-- 
1.9.1




[Xen-devel] [PATCH v5 16/24] libxc: allocate memory with vNUMA information for HVM guest

2015-02-12 Thread Wei Liu
The algorithm is more or less the same as the one used for PV guests.
Libxc gets hold of the mapping of vnode to pnode and the size of each
vnode, then allocates memory accordingly.

The function then returns the low memory end, high memory end and mmio
start to the caller. Libxl needs those values to construct vmemranges
for that guest.
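
For readers unfamiliar with those three values, the sketch below shows
one way they can be derived from the guest memory size and the size of
the MMIO hole. It assumes the hole ends at 4 GiB and that RAM which
would overlap it is moved above 4 GiB; the struct and function names
are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

struct mem_layout { uint64_t lowmem_end, highmem_end, mmio_start; };

/* Split mem_size bytes of guest RAM around an MMIO hole that ends at
 * 4 GiB. RAM that would overlap the hole is placed above 4 GiB. */
struct mem_layout split_memory(uint64_t mem_size, uint64_t mmio_size)
{
    const uint64_t four_gib = 1ull << 32;
    struct mem_layout l;

    assert(mmio_size < four_gib);
    l.mmio_start  = four_gib - mmio_size;
    l.lowmem_end  = mem_size;
    l.highmem_end = 0;                 /* 0 means "no high memory" */
    if (l.lowmem_end > l.mmio_start) {
        l.highmem_end = four_gib + (l.lowmem_end - l.mmio_start);
        l.lowmem_end  = l.mmio_start;
    }
    return l;
}
```

With a 256 MiB hole, a 5 GiB guest ends up with low memory ending at
0xF0000000 and high memory ending at 0x150000000.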

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
---
Changes in v5:
1. Use a better loop variable name vnid.

Changes in v4:
1. Adapt to new interface.
2. Shorten error message.
3. This patch includes only functional changes.

Changes in v3:
1. Rewrite commit log.
2. Add a few code comments.
---
 tools/libxc/include/xenguest.h |  11 +
 tools/libxc/xc_hvm_build_x86.c | 105 ++---
 2 files changed, 100 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 40bbac8..ff66cb1 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -230,6 +230,17 @@ struct xc_hvm_build_args {
 struct xc_hvm_firmware_module smbios_module;
 /* Whether to use claim hypercall (1 - enable, 0 - disable). */
 int claim_enabled;
+
+/* vNUMA information*/
+xen_vmemrange_t *vmemranges;
+unsigned int nr_vmemranges;
+unsigned int *vnode_to_pnode;
+unsigned int nr_vnodes;
+
+/* Out parameters  */
+uint64_t lowmem_end;
+uint64_t highmem_end;
+uint64_t mmio_start;
 };
 
 /**
diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index ecc3224..a2a3777 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -89,7 +89,8 @@ static int modules_init(struct xc_hvm_build_args *args,
 }
 
 static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
-   uint64_t mmio_start, uint64_t mmio_size)
+   uint64_t mmio_start, uint64_t mmio_size,
+   struct xc_hvm_build_args *args)
 {
 struct hvm_info_table *hvm_info = (struct hvm_info_table *)
 (((unsigned char *)hvm_info_page) + HVM_INFO_OFFSET);
@@ -119,6 +120,10 @@ static void build_hvm_info(void *hvm_info_page, uint64_t 
mem_size,
 hvm_info->high_mem_pgend = highmem_end >> PAGE_SHIFT;
 hvm_info->reserved_mem_pgstart = ioreq_server_pfn(0);
 
+args->lowmem_end = lowmem_end;
+args->highmem_end = highmem_end;
+args->mmio_start = mmio_start;
+
 /* Finish with the checksum. */
 for ( i = 0, sum = 0; i < hvm_info->length; i++ )
 sum += ((uint8_t *)hvm_info)[i];
@@ -244,7 +249,7 @@ static int setup_guest(xc_interface *xch,
char *image, unsigned long image_size)
 {
 xen_pfn_t *page_array = NULL;
-unsigned long i, nr_pages = args->mem_size >> PAGE_SHIFT;
+unsigned long i, vmemid, nr_pages = args->mem_size >> PAGE_SHIFT;
 unsigned long target_pages = args->mem_target >> PAGE_SHIFT;
 uint64_t mmio_start = (1ull << 32) - args->mmio_size;
 uint64_t mmio_size = args->mmio_size;
@@ -258,13 +263,13 @@ static int setup_guest(xc_interface *xch,
 xen_capabilities_info_t caps;
 unsigned long stat_normal_pages = 0, stat_2mb_pages = 0, 
 stat_1gb_pages = 0;
-int pod_mode = 0;
+unsigned int memflags = 0;
 int claim_enabled = args->claim_enabled;
 xen_pfn_t special_array[NR_SPECIAL_PAGES];
 xen_pfn_t ioreq_server_array[NR_IOREQ_SERVER_PAGES];
-
-if ( nr_pages > target_pages )
-pod_mode = XENMEMF_populate_on_demand;
+uint64_t total_pages;
+xen_vmemrange_t dummy_vmemrange;
+unsigned int dummy_vnode_to_pnode;
 
 memset(&elf, 0, sizeof(elf));
 if ( elf_init(&elf, image, image_size) != 0 )
@@ -276,6 +281,43 @@ static int setup_guest(xc_interface *xch,
 v_start = 0;
 v_end = args->mem_size;
 
+if ( nr_pages > target_pages )
+memflags |= XENMEMF_populate_on_demand;
+
+if ( args->nr_vmemranges == 0 )
+{
+/* Build dummy vnode information */
+dummy_vmemrange.start = 0;
+dummy_vmemrange.end   = args->mem_size;
+dummy_vmemrange.flags = 0;
+dummy_vmemrange.nid   = 0;
+args->nr_vmemranges = 1;
+args->vmemranges = &dummy_vmemrange;
+
+dummy_vnode_to_pnode = XC_VNUMA_NO_NODE;
+args->nr_vnodes = 1;
+args->vnode_to_pnode = &dummy_vnode_to_pnode;
+}
+else
+{
+if ( nr_pages > target_pages )
+{
+PERROR("Cannot enable vNUMA and PoD at the same time");
+goto error_out;
+}
+}
+
+total_pages = 0;
+for ( i = 0; i < args->nr_vmemranges; i++ )
+total_pages += ((args->vmemranges[i].end - args->vmemranges[i].start)
+>> PAGE_SHIFT);
+if ( total_pages != (args->mem_size >> PAGE_SHIFT) )
+{
+PERROR("vNUMA memory pages mismatch (0x%"PRIx64" != 0x%"PRIx64")",
+   total_pages, args->mem_size >> PAG

[Xen-devel] [PATCH v5 23/24] xl: introduce xcalloc

2015-02-12 Thread Wei Liu
Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
 tools/libxl/xl_cmdimpl.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 440db78..ec7fb2d 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -289,6 +289,18 @@ static void *xmalloc(size_t sz) {
 return r;
 }
 
+static void *xcalloc(size_t n, size_t sz) __attribute__((unused));
+static void *xcalloc(size_t n, size_t sz) {
+void *r;
+r = calloc(n, sz);
+if (!r) {
+fprintf(stderr,"xl: Unable to calloc %lu bytes.\n",
+(unsigned long)sz * (unsigned long)n);
+exit(-ERROR_FAIL);
+}
+return r;
+}
+
 static void *xrealloc(void *ptr, size_t sz) {
 void *r;
 if (!sz) { free(ptr); return 0; }
-- 
1.9.1




[Xen-devel] [PATCH v5 15/24] libxc: indentation change to xc_hvm_build_x86.c

2015-02-12 Thread Wei Liu
Move a while loop in xc_hvm_build_x86 one block to the right. No
functional change introduced.

Functional changes will be introduced in the next patch.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
Acked-by: Ian Campbell 
---
 tools/libxc/xc_hvm_build_x86.c | 153 ++---
 1 file changed, 81 insertions(+), 72 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index c81a25b..ecc3224 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -353,98 +353,107 @@ static int setup_guest(xc_interface *xch,
 cur_pages = 0xc0;
 stat_normal_pages = 0xc0;
 
-while ( (rc == 0) && (nr_pages > cur_pages) )
 {
-/* Clip count to maximum 1GB extent. */
-unsigned long count = nr_pages - cur_pages;
-unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
-
-if ( count > max_pages )
-count = max_pages;
-
-cur_pfn = page_array[cur_pages];
-
-/* Take care the corner cases of super page tails */
-if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
- (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
-count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
-else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
-  (count > SUPERPAGE_1GB_NR_PFNS) )
-count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
-
-/* Attemp to allocate 1GB super page. Because in each pass we only
- * allocate at most 1GB, we don't have to clip super page boundaries.
- */
-if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
- /* Check if there exists MMIO hole in the 1GB memory range */
- !check_mmio_hole(cur_pfn << PAGE_SHIFT,
-  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
-  mmio_start, mmio_size) )
+while ( (rc == 0) && (nr_pages > cur_pages) )
 {
-long done;
-unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
-xen_pfn_t sp_extents[nr_extents];
-
-for ( i = 0; i < nr_extents; i++ )
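-sp_extents[i] = page_array[cur_pages+(i<<SUPERPAGE_1GB_SHIFT)];
-
-done = xc_domain_populate_physmap(xch, dom, nr_extents, 
SUPERPAGE_1GB_SHIFT,
-  pod_mode, sp_extents);
-
-if ( done > 0 )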
-{
-stat_1gb_pages += done;
-done <<= SUPERPAGE_1GB_SHIFT;
-cur_pages += done;
-count -= done;
-}
-}
+/* Clip count to maximum 1GB extent. */
+unsigned long count = nr_pages - cur_pages;
+unsigned long max_pages = SUPERPAGE_1GB_NR_PFNS;
 
-if ( count != 0 )
-{
-/* Clip count to maximum 8MB extent. */
-max_pages = SUPERPAGE_2MB_NR_PFNS * 4;
 if ( count > max_pages )
 count = max_pages;
-
-/* Clip partial superpage extents to superpage boundaries. */
-if ( ((cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
- (count > (-cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1))) )
-count = -cur_pfn & (SUPERPAGE_2MB_NR_PFNS-1);
-else if ( ((count & (SUPERPAGE_2MB_NR_PFNS-1)) != 0) &&
-  (count > SUPERPAGE_2MB_NR_PFNS) )
-count &= ~(SUPERPAGE_2MB_NR_PFNS - 1); /* clip non-s.p. tail */
-
-/* Attempt to allocate superpage extents. */
-if ( ((count | cur_pfn) & (SUPERPAGE_2MB_NR_PFNS - 1)) == 0 )
+
+cur_pfn = page_array[cur_pages];
+
+/* Take care the corner cases of super page tails */
+if ( ((cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+ (count > (-cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1))) )
+count = -cur_pfn & (SUPERPAGE_1GB_NR_PFNS-1);
+else if ( ((count & (SUPERPAGE_1GB_NR_PFNS-1)) != 0) &&
+  (count > SUPERPAGE_1GB_NR_PFNS) )
+count &= ~(SUPERPAGE_1GB_NR_PFNS - 1);
+
+/* Attemp to allocate 1GB super page. Because in each pass
+ * we only allocate at most 1GB, we don't have to clip
+ * super page boundaries.
+ */
+if ( ((count | cur_pfn) & (SUPERPAGE_1GB_NR_PFNS - 1)) == 0 &&
+ /* Check if there exists MMIO hole in the 1GB memory
+  * range */
+ !check_mmio_hole(cur_pfn << PAGE_SHIFT,
+  SUPERPAGE_1GB_NR_PFNS << PAGE_SHIFT,
+  mmio_start, mmio_size) )
 {
 long done;
-unsigned long nr_extents = count >> SUPERPAGE_2MB_SHIFT;
+unsigned long nr_extents = count >> SUPERPAGE_1GB_SHIFT;
 xen_pfn_t sp_extents[nr_extents];
 
 for ( i = 0; i < nr_extents; i++ )
-sp_extents[i] = 
page_array[cur_pages+(i< 0 )
 {
-stat_2mb_pages += done;
-

[Xen-devel] [PATCH v5 17/24] libxl: build, check and pass vNUMA info to Xen for HVM guest

2015-02-12 Thread Wei Liu
Transform the user-supplied vNUMA configuration into libxl internal
representations, then libxc representations. Check validity along the
way.

Libxc is more involved in building vmemranges in the HVM case than in
the PV case. The building of vmemranges is placed after xc_hvm_build
returns, because it relies on memory hole information provided by
xc_hvm_build.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
---
Changes in v5:
1. Check vnode 0 is large enough to accommodate video ram.

Changes in v4:
1. Adapt to new interface.
2. Rename some variables.
3. Use GCREALLOC_ARRAY.

Changes in v3:
1. Rewrite commit log.
---
 tools/libxc/xc_hvm_build_x86.c |  2 +-
 tools/libxl/libxl_create.c |  9 +++
 tools/libxl/libxl_dom.c| 37 
 tools/libxl/libxl_internal.h   |  5 
 tools/libxl/libxl_vnuma.c  | 56 ++
 5 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index a2a3777..bd12e30 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -407,7 +407,7 @@ static int setup_guest(xc_interface *xch,
 new_memflags |= XENMEMF_exact_node_request;
 }
 
-end_pages = args->vmemranges[i].end >> PAGE_SHIFT;
+end_pages = args->vmemranges[vmemid].end >> PAGE_SHIFT;
 /*
  * Consider vga hole belongs to the vmemrange that covers
  * 0xA-0xC. Note that 0x0-0xA is populated just
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 98687bd..af04248 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -853,6 +853,15 @@ static void initiate_domain_create(libxl__egc *egc,
 goto error_out;
 }
 
+/* Disallow PoD and vNUMA to be enabled at the same time because PoD
+ * pool is not vNUMA-aware yet.
+ */
+if (pod_enabled && d_config->b_info.num_vnuma_nodes) {
+ret = ERROR_INVAL;
+LOG(ERROR, "Cannot enable PoD and vNUMA at the same time");
+goto error_out;
+}
+
 ret = libxl__domain_create_info_setdefault(gc, &d_config->c_info);
 if (ret) goto error_out;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 1ff0704..b2c9daf 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -893,12 +893,49 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 goto out;
 }
 
+if (info->num_vnuma_nodes != 0) {
+int i;
+
+args.nr_vmemranges = state->num_vmemranges;
+args.vmemranges = libxl__malloc(gc, sizeof(*args.vmemranges) *
+args.nr_vmemranges);
+
+for (i = 0; i < args.nr_vmemranges; i++) {
+args.vmemranges[i].start = state->vmemranges[i].start;
+args.vmemranges[i].end   = state->vmemranges[i].end;
+args.vmemranges[i].flags = state->vmemranges[i].flags;
+args.vmemranges[i].nid   = state->vmemranges[i].nid;
+}
+
+/* Consider video ram belongs to vmemrange 0 -- just shrink it
+ * by the size of video ram.
+ */
+if (((args.vmemranges[0].end - args.vmemranges[0].start) >> 10)
+< info->video_memkb) {
+LOG(ERROR, "vmemrange 0 too small to contain video ram");
+goto out;
+}
+
+args.vmemranges[0].end -= (info->video_memkb << 10);
+}
+
 ret = xc_hvm_build(ctx->xch, domid, &args);
 if (ret) {
 LOGEV(ERROR, ret, "hvm building failed");
 goto out;
 }
 
+if (info->num_vnuma_nodes != 0) {
+ret = libxl__vnuma_build_vmemrange_hvm(gc, domid, info, state, &args);
+if (ret) {
+LOGEV(ERROR, ret, "hvm build vmemranges failed");
+goto out;
+}
+ret = libxl__vnuma_config_check(gc, info, state);
+if (ret) goto out;
+ret = set_vnuma_info(gc, domid, info, state);
+if (ret) goto out;
+}
 ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
&state->store_mfn, state->console_port,
&state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7d1e1cf..e93089a 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3408,6 +3408,11 @@ int libxl__vnuma_build_vmemrange_pv(libxl__gc *gc,
 uint32_t domid,
 libxl_domain_build_info *b_info,
 libxl__domain_build_state *state);
+int libxl__vnuma_build_vmemrange_hvm(libxl__gc *gc,
+ uint32_t domid,
+ libxl_domain_build_info *b_info,
+ libxl__domain_build_state *state,
+  

[Xen-devel] BUG - xen-netback stats interface limited to 32-bit values on 64 bit systems

2015-02-12 Thread Atom2

Hi guys,
I am forwarding this message after initially having confirmed with Ian 
Campbell on the user list that there's really an issue - please see 
further below.


I am currently running xen-4.3.3 on gentoo (dom0 is based on kernel 
3.17.7) and I am happy to help out by applying and testing patches on my 
version.


Thanks Atom2

 Forwarded Message 
Subject: Re: [Xen-users] BUG? vif RX/TX byte counters limited to 32-bit 
values on 64 bit systems

Date: Mon, 9 Feb 2015 11:09:08 +
From: Ian Campbell 
To: Atom2 
Cc: xen-us...@lists.xen.org

On Mon, 2015-02-09 at 00:37 +0100, Atom2 wrote:

Hi guys,
I recently noticed that ifconfig, run within dom0, shows byte counters
wrapping around after reaching the 32-bit max value (2^32) for XEN vif
interfaces. Specifically I was able to observe this for a XEN vif
interface connected to a HVM domU running FreeBSD 10.0.


Looks like xen-netback was never converted to the 64 bit stats interface
like e.g. netfront was (see commit e00f85bec0a9 in ~v3.1).
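
For illustration only, the wrap-around can be reproduced with plain
fixed-width integers. This sketch does not use any kernel or netback
code; the function names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Add a byte count to a 32-bit counter; unsigned arithmetic wraps
 * modulo 2^32, which is the behaviour reported for the vif counters. */
uint32_t add_bytes32(uint32_t counter, uint32_t bytes)
{
    return counter + bytes;
}

/* The same update on a 64-bit counter keeps the full total. */
uint64_t add_bytes64(uint64_t counter, uint32_t bytes)
{
    assert(counter <= UINT64_MAX - bytes);  /* no overflow in practice */
    return counter + bytes;
}
```

Starting from 0xFFFFFFF0 and adding 0x20 bytes, the 32-bit counter
reads 0x10 while the 64-bit one reads 0x100000010.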

I could have sworn netback changed eons ago -- I was clearly mistaken.

Ian.






[Xen-devel] [PATCH v5 02/24] xen: make two memory hypercalls vNUMA-aware

2015-02-12 Thread Wei Liu
Make XENMEM_increase_reservation and XENMEM_populate_physmap
vNUMA-aware.

That is, if a guest asks Xen to allocate memory for a specific vnode,
Xen can translate the vnode to a pnode using that guest's vNUMA
information.

XENMEMF_vnode is introduced so the guest can mark that the node number
is in fact a virtual node number and should be translated by Xen.

XENFEAT_memory_op_vnode_supported is introduced to indicate that Xen is
able to translate virtual node to physical node.
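
The translation step can be sketched outside the hypervisor as a simple
table lookup. The names and the NO_NODE marker below are assumptions
made for the example; locking and the memflags handling done by the
real code are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NO_NODE 0xFFu   /* hypothetical "no physical node" marker */

/* Translate a virtual node to a physical node.
 * Returns 0 and stores the result in *pnode (NO_NODE means "no
 * preference"), or -EINVAL when vnode is out of range. */
int vnode_to_pnode(const unsigned int *map, unsigned int nr_vnodes,
                   unsigned int vnode, unsigned int *pnode)
{
    assert(pnode != NULL);
    *pnode = NO_NODE;
    if (map == NULL)          /* guest has no vNUMA topology: nothing to do */
        return 0;
    if (vnode >= nr_vnodes)
        return -EINVAL;
    *pnode = map[vnode];
    return 0;
}
```

A caller would only set a node preference on the allocation when the
returned pnode is not NO_NODE.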

Signed-off-by: Wei Liu 
Cc: Jan Beulich 
---
Changes in v5:
1. New logic in translation function.

Changes in v3:
1. Coding style fix.
2. Remove redundant assignment.

Changes in v2:
1. Return start_extent when vnode translation fails.
2. Expose new feature bit to guest.
3. Fix typo in comment.
---
 xen/common/kernel.c   |  2 +-
 xen/common/memory.c   | 51 +++
 xen/include/public/features.h |  3 +++
 xen/include/public/memory.h   |  2 ++
 4 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/xen/common/kernel.c b/xen/common/kernel.c
index 0d9e519..e5e0050 100644
--- a/xen/common/kernel.c
+++ b/xen/common/kernel.c
@@ -301,7 +301,7 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 switch ( fi.submap_idx )
 {
 case 0:
-fi.submap = 0;
+fi.submap = (1U << XENFEAT_memory_op_vnode_supported);
 if ( VM_ASSIST(d, VMASST_TYPE_pae_extended_cr3) )
 fi.submap |= (1U << XENFEAT_pae_pgdir_above_4gb);
 if ( paging_mode_translate(current->domain) )
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e84ace9..fa3729b 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -692,6 +692,43 @@ out:
 return rc;
 }
 
+static int translate_vnode_to_pnode(struct domain *d,
+struct xen_memory_reservation *r,
+struct memop_args *a)
+{
+int rc = 0;
+unsigned int vnode, pnode;
+
+if ( r->mem_flags & XENMEMF_vnode )
+{
+a->memflags &= ~MEMF_node(XENMEMF_get_node(r->mem_flags));
+a->memflags &= ~MEMF_exact_node;
+
+read_lock(&d->vnuma_rwlock);
+if ( d->vnuma )
+{
+vnode = XENMEMF_get_node(r->mem_flags);
+
+if ( vnode < d->vnuma->nr_vnodes )
+{
+pnode = d->vnuma->vnode_to_pnode[vnode];
+
+if ( pnode != NUMA_NO_NODE )
+{
+a->memflags |= MEMF_node(pnode);
+if ( r->mem_flags & XENMEMF_exact_node_request )
+a->memflags |= MEMF_exact_node;
+}
+}
+else
+rc = -EINVAL;
+}
+read_unlock(&d->vnuma_rwlock);
+}
+
+return rc;
+}
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
 struct domain *d;
@@ -734,10 +771,6 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 args.memflags = MEMF_bits(address_bits);
 }
 
-args.memflags |= MEMF_node(XENMEMF_get_node(reservation.mem_flags));
-if ( reservation.mem_flags & XENMEMF_exact_node_request )
-args.memflags |= MEMF_exact_node;
-
 if ( op == XENMEM_populate_physmap
  && (reservation.mem_flags & XENMEMF_populate_on_demand) )
 args.memflags |= MEMF_populate_on_demand;
@@ -747,6 +780,16 @@ long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 return start_extent;
 args.domain = d;
 
+args.memflags |= MEMF_node(XENMEMF_get_node(reservation.mem_flags));
+if ( reservation.mem_flags & XENMEMF_exact_node_request )
+args.memflags |= MEMF_exact_node;
+
+if ( translate_vnode_to_pnode(d, &reservation, &args) )
+{
+rcu_unlock_domain(d);
+return start_extent;
+}
+
 if ( xsm_memory_adjust_reservation(XSM_TARGET, current->domain, d) )
 {
 rcu_unlock_domain(d);
diff --git a/xen/include/public/features.h b/xen/include/public/features.h
index 16d92aa..2110b04 100644
--- a/xen/include/public/features.h
+++ b/xen/include/public/features.h
@@ -99,6 +99,9 @@
 #define XENFEAT_grant_map_identity 12
  */
 
+/* Guest can use XENMEMF_vnode to specify virtual node for memory op. */
+#define XENFEAT_memory_op_vnode_supported 13
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
diff --git a/xen/include/public/memory.h b/xen/include/public/memory.h
index 595f953..2b5206b 100644
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -55,6 +55,8 @@
 /* Flag to request allocation only from the node specified */
 #define XENMEMF_exact_node_request  (1<<17)
 #define XENMEMF_exact_node(n) (XENMEMF_node(n) | XENMEMF_exact_node_request)
+/* Flag to indicate the node specified is virtual node */
+#define XENMEMF_vnode  (1<<18)
 #endif
 

[Xen-devel] [PATCH v5 00/24] Virtual NUMA for PV and HVM

2015-02-12 Thread Wei Liu
Hi all

This is version 5 of this series rebased on top of master.

This patch series implements virtual NUMA support for both PV and HVM guests.
That is, the admin can configure via libxl what virtual NUMA topology the guest
sees.

This is stage 1 (basic vNUMA support) and part of stage 2 (vNUMA-aware
ballooning, hypervisor side) described in my previous email to xen-devel [0].

This series is broken into several parts:

1. xen patches: vNUMA debug output and vNUMA-aware memory hypercall support.
2. libxc/libxl support for PV vNUMA.
3. libxc/libxl/hypervisor support for HVM vNUMA.
4. xl vNUMA configuration documentation and parser.

One significant difference from Elena's work is that this patch series makes
use of multiple vmemranges should there be a memory hole, instead of shrinking
ram. This matches the behaviour of real hardware.

The vNUMA auto placement algorithm is missing at the moment and Dario is
working on it.

This series can be found at:
 git://xenbits.xen.org/people/liuw/xen.git wip.vnuma-v5

With this series, the following configuration can be used to enable virtual
NUMA support, and it works for both PV and HVM guests.

vnuma = [ [ "pnode=0","size=3000","vcpus=0-3","vdistances=10,20"  ],
  [ "pnode=0","size=3000","vcpus=4-7","vdistances=20,10"  ],
]

For example output of guest NUMA information, please look at [1].

In terms of libxl / libxc internals, things are broken into several
parts:

1. libxl interface

Users of libxl can only specify how many vnodes a guest can have, but
currently they have no control over the actual memory layout. Note that
it's fairly easy to export the interface to control memory layout in the
future.

2. libxl internal

It generates some internal vNUMA configurations when building a domain,
then transforms them into libxc representations. It also validates the
vNUMA configuration along the way.

3. libxc internal

Libxc does what it's told to do. It doesn't do anything smart (in fact,
I deliberately didn't put any smart logic inside it). Libxc will also
report back some information in the HVM case to libxl but that's it.

Wei.

[0] <2014173606.gc21...@zion.uk.xensource.com>
[1] <1416582421-10789-1-git-send-email-wei.l...@citrix.com>

Changes in v5:
1. Rewrite PV memory allocation functions, take vmemranges into account.
2. Address Ian J's comments with regard to libxlu.
3. Address Jan and Andrew's comments with regard to hypervisor patches.
4. New syntax for vNUMA xl configuration.

Changes in v4:
1. Address comments from many people.
2. Break down the libxlu patch into three.
3. Use dedicated patches for non-functional changes.

Changes in v3:
1. Address comments made by Jan.
2. More commit messages and comments.
3. Shorten some error messages.

Changes in v2:
1. Make vnuma_vdistances mandatory.
2. Use nested list to specify distances among nodes.
3. Hvmloader uses hypercall to retrieve vNUMA information.
4. Fix some problems spotted by Jan.


Wei Liu (24):
  xen: dump vNUMA information with debug key "u"
  xen: make two memory hypercalls vNUMA-aware
  libxc: duplicate snippet to allocate p2m_host array
  libxc: add p2m_size to xc_dom_image
  libxc: allocate memory with vNUMA information for PV guest
  libxl: introduce vNUMA types
  libxl: add vmemrange to libxl__domain_build_state
  libxl: introduce libxl__vnuma_config_check
  libxl: x86: factor out e820_host_sanitize
  libxl: functions to build vmemranges for PV guest
  libxl: build, check and pass vNUMA info to Xen for PV guest
  hvmloader: retrieve vNUMA information from hypervisor
  hvmloader: construct SRAT
  hvmloader: construct SLIT
  libxc: indentation change to xc_hvm_build_x86.c
  libxc: allocate memory with vNUMA information for HVM guest
  libxl: build, check and pass vNUMA info to Xen for HVM guest
  libxl: disallow memory relocation when vNUMA is enabled
  libxl: define LIBXL_HAVE_VNUMA
  libxlu: rework internal representation of setting
  libxlu: nested list support
  libxlu: introduce new APIs
  xl: introduce xcalloc
  xl: vNUMA support

 docs/man/xl.cfg.pod.5   |  54 +++
 tools/firmware/hvmloader/Makefile   |   2 +-
 tools/firmware/hvmloader/acpi/acpi2_0.h |  61 
 tools/firmware/hvmloader/acpi/build.c   | 110 +++
 tools/firmware/hvmloader/hvmloader.c|   3 +
 tools/firmware/hvmloader/vnuma.c|  84 +++
 tools/firmware/hvmloader/vnuma.h|  52 +++
 tools/libxc/include/xc_dom.h|   7 +
 tools/libxc/include/xenguest.h  |  11 ++
 tools/libxc/xc_dom_arm.c|   1 +
 tools/libxc/xc_dom_core.c   |   8 +-
 tools/libxc/xc_dom_x86.c| 132 ++
 tools/libxc/xc_hvm_build_x86.c  | 240 +---
 tools/libxc/xc_private.h|   2 +
 tools/libxl/Makefile|   2 +-
 tools/libxl/libxl.h |   6 +
 tools/libxl/libxl_arch.h|   6 +
 tools/libxl/libxl_arm.c   

[Xen-devel] [PATCH v5 03/24] libxc: duplicate snippet to allocate p2m_host array

2015-02-12 Thread Wei Liu
Currently no in-tree code sets the superpage flag, but Konrad wants it
retained for the moment.

As I'm going to change the p2m_host array allocation, duplicate the code
snippet that allocates the p2m_host array in this patch, so that we retain
the behaviour in the superpage case.

This patch introduces no functional change and it will make future patches
easier to review. Also removed one stray tab while I was there.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
CC: Konrad Wilk 
---
 tools/libxc/xc_dom_x86.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bf06fe4..9dbaedb 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -772,15 +772,16 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 return rc;
 }
 
-dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * dom->total_pages);
-if ( dom->p2m_host == NULL )
-return -EINVAL;
-
 if ( dom->superpages )
 {
 int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
 xen_pfn_t extents[count];
 
+dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+  dom->total_pages);
+if ( dom->p2m_host == NULL )
+return -EINVAL;
+
 DOMPRINTF("Populating memory with %d superpages", count);
 for ( pfn = 0; pfn < count; pfn++ )
 extents[pfn] = pfn << SUPERPAGE_PFN_SHIFT;
@@ -809,9 +810,13 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 return rc;
 }
 /* setup initial p2m */
+dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
+  dom->total_pages);
+if ( dom->p2m_host == NULL )
+return -EINVAL;
 for ( pfn = 0; pfn < dom->total_pages; pfn++ )
 dom->p2m_host[pfn] = pfn;
-
+
 /* allocate guest memory */
 for ( i = rc = allocsz = 0;
   (i < dom->total_pages) && !rc;
-- 
1.9.1




[Xen-devel] [PATCH v5 08/24] libxl: introduce libxl__vnuma_config_check

2015-02-12 Thread Wei Liu
This function is used to check whether a vNUMA configuration (be it
auto-generated or supplied by the user) is valid.

Define a new error code ERROR_VNUMA_CONFIG_INVALID.

The checks performed can be found in the comment of the function.

This vNUMA function (and future ones) is placed in a new file called
libxl_vnuma.c
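
One of the checks is that vmemranges are valid and do not overlap. A common way to do that is to sort the ranges by start address and compare neighbours. The following is only a sketch of that technique under assumptions: struct sketch_range and sketch_ranges_valid are hypothetical names, ranges are taken to be half-open [start, end), and the real function in this patch may differ in detail.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vmemrange: half-open [start, end). */
struct sketch_range {
    uint64_t start, end;
};

/* Order ranges by ascending start address. */
static int sketch_cmp(const void *a, const void *b)
{
    const struct sketch_range *x = a, *y = b;

    if (x->start < y->start)
        return -1;
    if (x->start > y->start)
        return 1;
    return 0;
}

/* Returns 1 if every range is non-empty and no two ranges overlap,
 * 0 otherwise. Sorts the array in place. */
static int sketch_ranges_valid(struct sketch_range *r, size_t n)
{
    size_t i;

    qsort(r, n, sizeof(*r), sketch_cmp);
    for (i = 0; i < n; i++) {
        if (r[i].start >= r[i].end)
            return 0;               /* empty or inverted range */
        if (i > 0 && r[i].start < r[i - 1].end)
            return 0;               /* overlaps its predecessor */
    }
    return 1;
}
```

After sorting, checking each range against its immediate predecessor is sufficient, so the whole check costs one sort plus one linear pass.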

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
---
Changes in v5:
1. Define and use new error code.
2. Use LOG macro.
3. Fix hard tabs.

Changes in v4:
1. Adapt to new interface.

Changes in v3:
1. Rewrite commit log.
2. Shorten two error messages.
---
 tools/libxl/Makefile |   2 +-
 tools/libxl/libxl_internal.h |   7 +++
 tools/libxl/libxl_types.idl  |   1 +
 tools/libxl/libxl_vnuma.c| 131 +++
 4 files changed, 140 insertions(+), 1 deletion(-)
 create mode 100644 tools/libxl/libxl_vnuma.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7329521..1b16598 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -93,7 +93,7 @@ LIBXL_LIBS += -lyajl
 LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
libxl_internal.o libxl_utils.o libxl_uuid.o \
-   libxl_json.o libxl_aoutils.o libxl_numa.o \
+   libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6d3ac58..258be0d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3394,6 +3394,13 @@ void libxl__numa_candidate_put_nodemap(libxl__gc *gc,
 libxl_bitmap_copy(CTX, &cndt->nodemap, nodemap);
 }
 
+/* Check if vNUMA config is valid. Returns 0 if valid,
+ * ERROR_VNUMA_CONFIG_INVALID otherwise.
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+  const libxl_domain_build_info *b_info,
+  const libxl__domain_build_state *state);
+
 _hidden int libxl__ms_vm_genid_set(libxl__gc *gc, uint32_t domid,
const libxl_ms_vm_genid *id);
 
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 14c7e7c..23951fc 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -63,6 +63,7 @@ libxl_error = Enumeration("error", [
 (-17, "DEVICE_EXISTS"),
 (-18, "REMUS_DEVOPS_DOES_NOT_MATCH"),
 (-19, "REMUS_DEVICE_NOT_SUPPORTED"),
+(-20, "VNUMA_CONFIG_INVALID"),
 ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
diff --git a/tools/libxl/libxl_vnuma.c b/tools/libxl/libxl_vnuma.c
new file mode 100644
index 000..fa5aa8d
--- /dev/null
+++ b/tools/libxl/libxl_vnuma.c
@@ -0,0 +1,131 @@
+/*
+ * Copyright (C) 2014  Citrix Ltd.
+ * Author Wei Liu 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+#include "libxl_osdeps.h" /* must come before any other headers */
+#include "libxl_internal.h"
+#include 
+
+/* Sort vmemranges in ascending order with "start" */
+static int compare_vmemrange(const void *a, const void *b)
+{
+const xen_vmemrange_t *x = a, *y = b;
+if (x->start < y->start)
+return -1;
+if (x->start > y->start)
+return 1;
+return 0;
+}
+
+/* Check if vNUMA configuration is valid:
+ *  1. all pnodes inside vnode_to_pnode array are valid
+ *  2. one vcpu belongs to and only belongs to one vnode
+ *  3. each vmemrange is valid and doesn't overlap with each other
+ */
+int libxl__vnuma_config_check(libxl__gc *gc,
+  const libxl_domain_build_info *b_info,
+  const libxl__domain_build_state *state)
+{
+int i, j, rc = ERROR_VNUMA_CONFIG_INVALID, nr_nodes;
+libxl_numainfo *ninfo = NULL;
+uint64_t total_memkb = 0;
+libxl_bitmap cpumap;
+libxl_vnode_info *p;
+
+libxl_bitmap_init(&cpumap);
+
+/* Check pnode specified is valid */
+ninfo = libxl_get_numainfo(CTX, &nr_nodes);
+if (!ninfo) {
+LOG(ERROR, "libxl_get_numainfo failed");
+goto out;
+}
+
+for (i = 0; i < b_info->num_vnuma_nodes; i++) {
+uint32_t pnode;
+
+p = &b_info->vnuma_nodes[i];
+pnode = p->pnode;
+
+   

[Xen-devel] [PATCH v5 05/24] libxc: allocate memory with vNUMA information for PV guest

2015-02-12 Thread Wei Liu
From libxc's point of view, it only needs to know the vnode to pnode
mapping and the size of each vnode to allocate memory accordingly. Add
these fields to the xc_dom structure.

The caller might not pass in vNUMA information. In that case, a dummy
layout is generated for the convenience of libxc's allocation code. The
upper layer (libxl etc) still sees the domain has no vNUMA
configuration.

Note that from this patch on, a PV x86 guest can have multiple regions of
RAM allocated.
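
The dummy layout mentioned above can be sketched as follows. This is a simplified illustration, not the libxc code: struct sketch_layout, SKETCH_NO_NODE and sketch_fill_dummy are hypothetical, the arrays are fixed-size to keep the example self-contained, and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12     /* assumes 4 KiB pages */
#define SKETCH_NO_NODE    (~0U)  /* hypothetical stand-in for XC_VNUMA_NO_NODE */

struct sketch_range {
    uint64_t start, end;         /* byte addresses, half-open */
    unsigned int nid;            /* virtual node this range belongs to */
};

struct sketch_layout {
    struct sketch_range ranges[4];
    unsigned int nr_ranges;
    unsigned int vnode_to_pnode[4];
    unsigned int nr_vnodes;
};

/* If the caller supplied no vNUMA information, install a single range
 * covering all pages in vnode 0, with no physical node preference.
 * A layout that already has ranges is left untouched. */
static void sketch_fill_dummy(struct sketch_layout *l, uint64_t total_pages)
{
    if (l->nr_ranges != 0)
        return;

    l->ranges[0].start = 0;
    l->ranges[0].end = total_pages << SKETCH_PAGE_SHIFT;
    l->ranges[0].nid = 0;
    l->nr_ranges = 1;

    l->vnode_to_pnode[0] = SKETCH_NO_NODE;
    l->nr_vnodes = 1;
}
```

With the dummy entry in place, the allocation loop can treat the "no vNUMA" case exactly like a one-node configuration instead of carrying a separate code path.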

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
---
Changes in v5:
1. Ditch xc_vnuma_info.

Changes in v4:
1. Pack fields into a struct.
2. Use "page" as unit.
3. __FUNCTION__ -> __func__.
4. Don't print total_pages.
5. Improve comment.

Changes in v3:
1. Rewrite commit log.
2. Shorten some error messages.
---
 tools/libxc/include/xc_dom.h |   6 +++
 tools/libxc/xc_dom_x86.c | 104 +--
 tools/libxc/xc_private.h |   2 +
 3 files changed, 98 insertions(+), 14 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index f57da42..52d9832 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -168,6 +168,12 @@ struct xc_dom_image {
 struct xc_dom_loader *kernel_loader;
 void *private_loader;
 
+/* vNUMA information */
+xen_vmemrange_t *vmemranges;
+unsigned int nr_vmemranges;
+unsigned int *vnode_to_pnode;
+unsigned int nr_vnodes;
+
 /* kernel loader */
 struct xc_dom_arch *arch_hooks;
 /* allocate up to virt_alloc_end */
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index bea54f2..3f8c5b8 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -760,7 +760,8 @@ static int x86_shadow(xc_interface *xch, domid_t domid)
 int arch_setup_meminit(struct xc_dom_image *dom)
 {
 int rc;
-xen_pfn_t pfn, allocsz, i, j, mfn;
+xen_pfn_t pfn, allocsz, mfn, total, pfn_base;
+int i, j;
 
 rc = x86_compat(dom->xch, dom->guest_domid, dom->guest_type);
 if ( rc )
@@ -811,26 +812,101 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 if ( rc )
 return rc;
 }
-/* setup initial p2m */
-dom->p2m_size = dom->total_pages;
+
+/* Setup dummy vNUMA information if it's not provided. Note
+ * that this is a valid state if libxl doesn't provide any
+ * vNUMA information.
+ *
+ * The dummy values make libxc allocate all pages from
+ * arbitrary physical. This is the expected behaviour if no
+ * vNUMA configuration is provided to libxc.
+ *
+ * Note that the following hunk is just for the convenience of
+ * allocation code. No defaulting happens in libxc.
+ */
+if ( dom->nr_vmemranges == 0 )
+{
+dom->nr_vmemranges = 1;
+dom->vmemranges = xc_dom_malloc(dom, sizeof(*dom->vmemranges));
+dom->vmemranges[0].start = 0;
+dom->vmemranges[0].end   = dom->total_pages << PAGE_SHIFT;
+dom->vmemranges[0].flags = 0;
+dom->vmemranges[0].nid   = 0;
+
+dom->nr_vnodes = 1;
+dom->vnode_to_pnode = xc_dom_malloc(dom,
+  sizeof(*dom->vnode_to_pnode));
+dom->vnode_to_pnode[0] = XC_VNUMA_NO_NODE;
+}
+
+total = dom->p2m_size = 0;
+for ( i = 0; i < dom->nr_vmemranges; i++ )
+{
+total += ((dom->vmemranges[i].end - dom->vmemranges[i].start)
+  >> PAGE_SHIFT);
+dom->p2m_size =
+dom->p2m_size > (dom->vmemranges[i].end >> PAGE_SHIFT) ?
+dom->p2m_size : (dom->vmemranges[i].end >> PAGE_SHIFT);
+}
+if ( total != dom->total_pages )
+{
+xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+ "%s: vNUMA page count mismatch (0x%"PRIpfn" != 0x%"PRIpfn")\n",
+ __func__, total, dom->total_pages);
+return -EINVAL;
+}
+
 dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
   dom->p2m_size);
 if ( dom->p2m_host == NULL )
 return -EINVAL;
-for ( pfn = 0; pfn < dom->total_pages; pfn++ )
-dom->p2m_host[pfn] = pfn;
+for ( pfn = 0; pfn < dom->p2m_size; pfn++ )
+dom->p2m_host[pfn] = INVALID_P2M_ENTRY;
 
 /* allocate guest memory */
-for ( i = rc = allocsz = 0;
-  (i < dom->total_pages) && !rc;
-  i += allocsz )
+for ( i = 0; i < dom->nr_vmemranges; i++ )
 {
-allocsz = dom->total_pages - i;
-if ( allocsz > 1024*1024 )
-allocsz = 1024*1024;
-rc = xc_domain_populate_physmap_exact(
-dom->xch, dom->guest_domid, allocsz,
-0, 0, &dom->p2m_host[i]);
+   

[Xen-devel] [PATCH v5 01/24] xen: dump vNUMA information with debug key "u"

2015-02-12 Thread Wei Liu
Signed-off-by: Elena Ufimsteva 
Signed-off-by: Wei Liu 
Cc: Jan Beulich 
---
Changes in v5:
1. Use read_trylock.
2. Use correct array size for strlcpy.
3. Coding style fix.

Changes in v4:
1. Acquire rwlock before accessing vnuma struct.
2. Improve output.

Changes in v3:
1. Constify struct vnuma_info.
2. Don't print amount of ram of a vmemrange.
3. Process softirqs when dumping information.
4. Fix format string.

Changes in v2:
1. Use unsigned int for loop vars.
2. Use strlcpy.
3. Properly align output.
---
 xen/arch/x86/numa.c | 71 -
 1 file changed, 70 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 628a40a..e500f33 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int numa_setup(char *s);
 custom_param("numa", numa_setup);
@@ -363,10 +364,12 @@ EXPORT_SYMBOL(node_data);
 static void dump_numa(unsigned char key)
 {
 s_time_t now = NOW();
-int i;
+unsigned int i, j;
+int err;
 struct domain *d;
 struct page_info *page;
 unsigned int page_num_node[MAX_NUMNODES];
+const struct vnuma_info *vnuma;
 
 printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
(u32)(now>>32), (u32)now);
@@ -393,6 +396,8 @@ static void dump_numa(unsigned char key)
 printk("Memory location of each domain:\n");
 for_each_domain ( d )
 {
+process_pending_softirqs();
+
 printk("Domain %u (total: %u):\n", d->domain_id, d->tot_pages);
 
 for_each_online_node ( i )
@@ -408,6 +413,70 @@ static void dump_numa(unsigned char key)
 
 for_each_online_node ( i )
 printk("Node %u: %u\n", i, page_num_node[i]);
+
+if ( !read_trylock(&d->vnuma_rwlock) )
+continue;
+
+if ( !d->vnuma )
+{
+read_unlock(&d->vnuma_rwlock);
+continue;
+}
+
+vnuma = d->vnuma;
+printk(" %u vnodes, %u vcpus, guest physical layout:\n",
+   vnuma->nr_vnodes, d->max_vcpus);
+for ( i = 0; i < vnuma->nr_vnodes; i++ )
+{
+unsigned int start_cpu = ~0U;
+
+err = snprintf(keyhandler_scratch, 12, "%3u",
+vnuma->vnode_to_pnode[i]);
+if ( err < 0 || vnuma->vnode_to_pnode[i] == NUMA_NO_NODE )
+strlcpy(keyhandler_scratch, "???", sizeof(keyhandler_scratch));
+
+printk("   %3u: pnode %s,", i, keyhandler_scratch);
+
+printk(" vcpus ");
+
+for ( j = 0; j < d->max_vcpus; j++ )
+{
+if ( !(j & 0x3f) )
+process_pending_softirqs();
+
+if ( vnuma->vcpu_to_vnode[j] == i )
+{
+if ( start_cpu == ~0U )
+{
+printk("%d", j);
+start_cpu = j;
+}
+}
+else if ( start_cpu != ~0U )
+{
+if ( j - 1 != start_cpu )
+printk("-%d ", j - 1);
+else
+printk(" ");
+start_cpu = ~0U;
+}
+}
+
+if ( start_cpu != ~0U  && start_cpu != j - 1 )
+printk("-%d", j - 1);
+
+printk("\n");
+
+for ( j = 0; j < vnuma->nr_vmemranges; j++ )
+{
+if ( vnuma->vmemrange[j].nid == i )
+printk("   %016"PRIx64" - %016"PRIx64"\n",
+   vnuma->vmemrange[j].start,
+   vnuma->vmemrange[j].end);
+}
+}
+
+read_unlock(&d->vnuma_rwlock);
 }
 
 rcu_read_unlock(&domlist_read_lock);
-- 
1.9.1




[Xen-devel] [PATCH v5 04/24] libxc: add p2m_size to xc_dom_image

2015-02-12 Thread Wei Liu
Add a new field p2m_size to keep track of the number of pages covered by
the p2m.  Change total_pages to p2m_size in functions which in fact need
the size of the p2m.

This is needed because we are going to ditch the assumption that PV x86
has only one contiguous RAM region. Originally the p2m size was always
equal to total_pages, but that will change in a later patch.
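
To see why the two values can differ once RAM is no longer contiguous, consider this small sketch. It is an illustration only: struct sketch_range and sketch_count_pages are hypothetical names and a 4 KiB page size is assumed. The total is the sum of the pages in every range, while the p2m has to cover everything up to the highest page frame number in use, holes included.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages */

/* Hypothetical memory range: byte addresses, half-open [start, end). */
struct sketch_range {
    uint64_t start, end;
};

/* Compute the number of populated pages (*total) and the number of
 * page frames the p2m must cover (*p2m_size) for a set of ranges. */
static void sketch_count_pages(const struct sketch_range *r, size_t n,
                               uint64_t *total, uint64_t *p2m_size)
{
    size_t i;

    *total = 0;
    *p2m_size = 0;
    for (i = 0; i < n; i++) {
        uint64_t end_pfn = r[i].end >> SKETCH_PAGE_SHIFT;

        *total += (r[i].end - r[i].start) >> SKETCH_PAGE_SHIFT;
        if (end_pfn > *p2m_size)
            *p2m_size = end_pfn;
    }
}
```

For two ranges [0, 0x4000) and [0x8000, 0xA000) this gives 6 populated pages but a p2m of 10 entries, because the 4-page hole still occupies p2m slots.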

This patch doesn't change the behaviour of libxc.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
 tools/libxc/include/xc_dom.h |  1 +
 tools/libxc/xc_dom_arm.c |  1 +
 tools/libxc/xc_dom_core.c|  8 
 tools/libxc/xc_dom_x86.c | 19 +++
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index 07d7224..f57da42 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -129,6 +129,7 @@ struct xc_dom_image {
  */
 xen_pfn_t rambase_pfn;
 xen_pfn_t total_pages;
+xen_pfn_t p2m_size; /* number of pfns covert by p2m */
 struct xc_dom_phys *phys_pages;
 int realmodearea_log;
 #if defined (__arm__) || defined(__aarch64__)
diff --git a/tools/libxc/xc_dom_arm.c b/tools/libxc/xc_dom_arm.c
index 9b31b1f..f278927 100644
--- a/tools/libxc/xc_dom_arm.c
+++ b/tools/libxc/xc_dom_arm.c
@@ -449,6 +449,7 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 assert(dom->rambank_size[0] != 0);
 assert(ramsize == 0); /* Too much RAM is rejected above */
 
+dom->p2m_size = p2m_size;
 dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) * p2m_size);
 if ( dom->p2m_host == NULL )
 return -EINVAL;
diff --git a/tools/libxc/xc_dom_core.c b/tools/libxc/xc_dom_core.c
index ecbf981..b100ce1 100644
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -931,9 +931,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
 {
 case 4:
 DOMPRINTF("%s: dst 32bit, pages 0x%" PRIpfn "",
-  __FUNCTION__, dom->total_pages);
+  __FUNCTION__, dom->p2m_size);
 p2m_32 = dom->p2m_guest;
-for ( i = 0; i < dom->total_pages; i++ )
+for ( i = 0; i < dom->p2m_size; i++ )
 if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
 p2m_32[i] = dom->p2m_host[i];
 else
@@ -941,9 +941,9 @@ int xc_dom_update_guest_p2m(struct xc_dom_image *dom)
 break;
 case 8:
 DOMPRINTF("%s: dst 64bit, pages 0x%" PRIpfn "",
-  __FUNCTION__, dom->total_pages);
+  __FUNCTION__, dom->p2m_size);
 p2m_64 = dom->p2m_guest;
-for ( i = 0; i < dom->total_pages; i++ )
+for ( i = 0; i < dom->p2m_size; i++ )
 if ( dom->p2m_host[i] != INVALID_P2M_ENTRY )
 p2m_64[i] = dom->p2m_host[i];
 else
diff --git a/tools/libxc/xc_dom_x86.c b/tools/libxc/xc_dom_x86.c
index 9dbaedb..bea54f2 100644
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -122,11 +122,11 @@ static int count_pgtables(struct xc_dom_image *dom, int pae,
 
 try_pfn_end = (try_virt_end - dom->parms.virt_base) >> PAGE_SHIFT_X86;
 
-if ( try_pfn_end > dom->total_pages )
+if ( try_pfn_end > dom->p2m_size )
 {
 xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
  "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
- __FUNCTION__, try_pfn_end, dom->total_pages);
+ __FUNCTION__, try_pfn_end, dom->p2m_size);
 return -ENOMEM;
 }
 
@@ -440,10 +440,11 @@ pfn_error:
 
 static int alloc_magic_pages(struct xc_dom_image *dom)
 {
-size_t p2m_size = dom->total_pages * dom->arch_hooks->sizeof_pfn;
+size_t p2m_alloc_size = dom->p2m_size * dom->arch_hooks->sizeof_pfn;
 
 /* allocate phys2mach table */
-if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach", 0, p2m_size) )
+if ( xc_dom_alloc_segment(dom, &dom->p2m_seg, "phys2mach",
+  0, p2m_alloc_size) )
 return -1;
 dom->p2m_guest = xc_dom_seg_to_ptr(dom, &dom->p2m_seg);
 if ( dom->p2m_guest == NULL )
@@ -777,8 +778,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 int count = dom->total_pages >> SUPERPAGE_PFN_SHIFT;
 xen_pfn_t extents[count];
 
+dom->p2m_size = dom->total_pages;
 dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-  dom->total_pages);
+  dom->p2m_size);
 if ( dom->p2m_host == NULL )
 return -EINVAL;
 
@@ -810,8 +812,9 @@ int arch_setup_meminit(struct xc_dom_image *dom)
 return rc;
 }
 /* setup initial p2m */
+dom->p2m_size = dom->total_pages;
 dom->p2m_host = xc_dom_malloc(dom, sizeof(xen_pfn_t) *
-  dom->total_pages);
+  dom

[Xen-devel] [PATCH v5 06/24] libxl: introduce vNUMA types

2015-02-12 Thread Wei Liu
A domain can contain several virtual NUMA nodes, hence we introduce an
array in libxl_domain_build_info.

libxl_vnode_info contains the size of memory in that node, the distances
from that node to every node, the underlying pnode and a bitmap of
vcpus.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
Acked-by: Ian Campbell 
---
Changes in v4:
1. Use MemKB.

Changes in v3:
1. Add commit message.
---
 tools/libxl/libxl_types.idl | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 02be466..14c7e7c 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -356,6 +356,13 @@ libxl_domain_sched_params = Struct("domain_sched_params",[
 ("budget",   integer, {'init_val': 'LIBXL_DOMAIN_SCHED_PARAM_BUDGET_DEFAULT'}),
 ])
 
+libxl_vnode_info = Struct("vnode_info", [
+("memkb", MemKB),
+("distances", Array(uint32, "num_distances")), # distances from this node to other nodes
+("pnode", uint32), # physical node of this node
+("vcpus", libxl_bitmap), # vcpus in this node
+])
+
 libxl_domain_build_info = Struct("domain_build_info",[
 ("max_vcpus",   integer),
 ("avail_vcpus", libxl_bitmap),
@@ -376,6 +383,8 @@ libxl_domain_build_info = Struct("domain_build_info",[
 ("disable_migrate", libxl_defbool),
 ("cpuid",   libxl_cpuid_policy_list),
 ("blkdev_start",string),
+
+("vnuma_nodes", Array(libxl_vnode_info, "num_vnuma_nodes")),
 
 ("device_model_version", libxl_device_model_version),
 ("device_model_stubdomain", libxl_defbool),
-- 
1.9.1




[Xen-devel] [PATCH v5 07/24] libxl: add vmemrange to libxl__domain_build_state

2015-02-12 Thread Wei Liu
A vnode consists of one or more vmemranges (virtual memory range).  One
example of multiple vmemranges is that there is a hole in one vnode.
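
As an illustration of how a hole leads to multiple vmemranges, here is a hedged sketch. It is not the libxl algorithm: struct sketch_range and sketch_split are hypothetical, and the example assumes a single hole and a node that needs "size" bytes starting at "start".

```c
#include <stdint.h>

/* Hypothetical memory range: byte addresses, half-open [start, end). */
struct sketch_range {
    uint64_t start, end;
};

/* Lay out `size` bytes of node memory beginning at `start`, skipping
 * the hole [hole_start, hole_end). Writes one or two ranges to out[]
 * and returns how many were written. */
static unsigned int sketch_split(uint64_t start, uint64_t size,
                                 uint64_t hole_start, uint64_t hole_end,
                                 struct sketch_range out[2])
{
    uint64_t end = start + size;

    /* Node would start inside the hole: place it just past the hole. */
    if (start >= hole_start && start < hole_end) {
        out[0].start = hole_end;
        out[0].end = hole_end + size;
        return 1;
    }

    /* Node lies entirely below or entirely above the hole. */
    if (end <= hole_start || start >= hole_end) {
        out[0].start = start;
        out[0].end = end;
        return 1;
    }

    /* Node straddles the hole: one part below it, the rest above it. */
    out[0].start = start;
    out[0].end = hole_start;
    out[1].start = hole_end;
    out[1].end = hole_end + (end - hole_start);
    return 2;
}
```

The node keeps its full size in every case; only the straddling case produces two ranges for one vnode.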

Currently we haven't exported the vmemrange interface to libxl users.
Vmemranges are generated during domain build, so we have the relevant
structures in the domain build state.

Later if we discover we need to export the interface, those structures
can be moved to libxl_domain_build_info as well.

These new fields (along with other fields in that struct) are set to 0
at start of day so we don't need to explicitly initialise them. A
following patch which introduces an independent checking function will
need to access these fields. I don't feel very comfortable squashing
this change into that one so I didn't use a single commit.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
Acked-by: Ian Campbell 
---
Changes in v5:
1. Fix commit message.

Changes in v4:
1. Improve commit message.

Changes in v3:
1. Rewrite commit message.
---
 tools/libxl/libxl_internal.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 934465a..6d3ac58 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -973,6 +973,9 @@ typedef struct {
 libxl__file_reference pv_ramdisk;
 const char * pv_cmdline;
 bool pvh_enabled;
+
+xen_vmemrange_t *vmemranges;
+uint32_t num_vmemranges;
 } libxl__domain_build_state;
 
 _hidden int libxl__build_pre(libxl__gc *gc, uint32_t domid,
-- 
1.9.1




[Xen-devel] [PATCH v5 09/24] libxl: x86: factor out e820_host_sanitize

2015-02-12 Thread Wei Liu
This function gets the machine E820 map and sanitizes it according to the
PV guest configuration.

This will be used in a later patch. No functional change introduced in
this patch.

Signed-off-by: Wei Liu 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Dario Faggioli 
Cc: Elena Ufimtseva 
Acked-by: Ian Campbell 
---
Changes in v4:
1. Use actual size of the map instead of using E820MAX.
---
 tools/libxl/libxl_x86.c | 32 +++-
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 9ceb373..d012b4d 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -207,6 +207,27 @@ static int e820_sanitize(libxl_ctx *ctx, struct e820entry src[],
 return 0;
 }
 
+static int e820_host_sanitize(libxl__gc *gc,
+  libxl_domain_build_info *b_info,
+  struct e820entry map[],
+  uint32_t *nr)
+{
+int rc;
+
+rc = xc_get_machine_memory_map(CTX->xch, map, *nr);
+if (rc < 0) {
+errno = rc;
+return ERROR_FAIL;
+}
+
+*nr = rc;
+
+rc = e820_sanitize(CTX, map, nr, b_info->target_memkb,
+   (b_info->max_memkb - b_info->target_memkb) +
+   b_info->u.pv.slack_memkb);
+return rc;
+}
+
 static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
 libxl_domain_config *d_config)
 {
@@ -223,15 +244,8 @@ static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
 if (!libxl_defbool_val(b_info->u.pv.e820_host))
 return ERROR_INVAL;
 
-rc = xc_get_machine_memory_map(ctx->xch, map, E820MAX);
-if (rc < 0) {
-errno = rc;
-return ERROR_FAIL;
-}
-nr = rc;
-rc = e820_sanitize(ctx, map, &nr, b_info->target_memkb,
-   (b_info->max_memkb - b_info->target_memkb) +
-   b_info->u.pv.slack_memkb);
+nr = E820MAX;
+rc = e820_host_sanitize(gc, b_info, map, &nr);
 if (rc)
 return ERROR_FAIL;
 
-- 
1.9.1




Re: [Xen-devel] [PATCH 3/3] libxl: libxl__device_from_disk should retrieve backend from xenstore

2015-02-12 Thread Ian Jackson
Jim Fehlig writes ("Re: [PATCH 3/3] libxl: libxl__device_from_disk should retrieve backend from xenstore"):
> Wei Liu wrote:
> > On Tue, Feb 10, 2015 at 11:01:46AM +, Ian Jackson wrote:
> >> What should happen if the caller specifies a different target in disk
> >> to the one the device is actually using ?  The documentation should
> >> specify which of the fields are important.
> >
> > I'm not sure because it's not documented.

I know :-).  I meant: what should we write in the documentation ?
(And, obviously, implement.)

> > We should take a step back to define the important fields first.

Right.

> >> Maybe libxl_device_disk_remove needs to call libxl_vdev_to_device_disk
> >> and check that the supplied disk struct is plausible somehow.  In that
> >> case it might be nice for the caller to be able to fill in only the
> >> vdev.
> >
> > If so we need to make clear in the documentation. I'm of course fine
> > with this behaviour.

Well, feel free to object if you think my (rather vague) suggestion is
wrong.

> > Jim, does libvirt (as an example of a libxl user) actually care about
> > specifying every field in that struct? The other user (xl) doesn't seem
> > to care that much.
> 
> At minimum, libvirt will populate the pdev_path, vdev, backend, and
> format fields. If backend and format (which, in libvirt-speak,
> correspond to the 'name' and 'type' attributes on the optional 
> element) are not specified, they are set to LIBXL_DISK_BACKEND_UNKNOWN
> and LIBXL_DISK_FORMAT_RAW respectively.

I think for fields libvirt has gone to the trouble of specifying,
libxl should check that they match.

Ian.



Re: [Xen-devel] How do I print all hypercalls as they come in?

2015-02-12 Thread Andrew Cooper
Please do not top post.

On 12/02/15 18:19, D'Mita Levy wrote:
> Andrew,
>
> My apologies if my logic is flawed or what I am describing is
> convoluted - I am a student doing research into Xen and trying my best
> to grasp what is going on, also my ASM is subpar. I have read a paper
> on system call interception (
> https://hal.inria.fr/inria-00431031/PDF/Technical_Report_Syscall_Interception.pdf
> ) - page 10 describes disabling fast system calls by commenting out
> some code in the do_set_trap_table() function and logging the calls
> along with other guest info. My concern is that this may be a dated
> methodology as the paper was written in 2009 but also that this will
> only work for x86 and not x86_64 systems; including possible loss of
> performance since fast calls tend to run better on x86 series
> processor systems. My goal is to identify when a guest makes a
> hypercall requesting HYPERVISOR_..grant_table_op(),  mmu_update(),
> set_trap_table(), essentially I would love to be able to say...if
> trapcode = xxx printk("Hypercall xxx\n") has occurred but I am unsure
> what would be a good route to do something like that.

For something written in 2009, that has aged surprisingly well, given
that it refers to exact snippets of code.  It will however fail to catch
any system call made using sysenter or syscall.

However, intercepting system calls in a PV guest is completely different
to intercepting hypercalls, and the described method will not help you
in this case.

My original point still stands.  You cannot put a printk in hypercall
handlers such as mmu_update and grant_table_op.  Xen will be completely
crippled under the spew of all the logging.

Have you considered using xentrace?

~Andrew


Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-12 Thread Ian Jackson
Wei Liu writes ("Re: [PATCH OSSTEST 01/12] Add support of parsing grub which 
has 'submenu' primitive"):
> On Thu, Feb 12, 2015 at 02:01:59AM +, Hu, Robert wrote:
> > Yes, this minor change just gives the 'parsemenu' subroutine the
> > capability of recognizing 'submenu'.
> > The outer layer logic isn't affected.
> > Actually, the Xen boot menuentry we choose is inside a submenu. It works
> > and /etc/default/grub is assigned properly.

Great.

> In any case this is a very useful improvement.

Yes, indeed!

> Out of interest what Linux are you running?  If you're running Debian
> and the overlay 20_linux_xen (inside $OSSTEST/overlay/etc/etc/grub.d) is
> copied to your test host, there shouldn't be any submenu entries in your
> grub.cfg, I think.

I consider that a workaround (and I think so do you).

So I think subject to the (rather daft) argument we are having over
whitespace this is a really useful patch.

> > > Can you please not adjust the whitespace ?  osstest in general doesn't
> > > have a requirement for any particular whitespace use, and certainly if
> > > there are to be any whitespace changes they ought to be in a separate
> > > patch.
> >
> > I adjust those because some one in last version's reply told us that
> > osstest prefers white space substitution to tab,

I'm sorry that we seem to be having a disagreement over this.  That's
not very helpful for you, I realise!

I hope that whoever made those comments would agree that whitespace
cleanups should at least be in a separate patch.  So please when you
resubmit can you split them out ?

I can't seem to find the email you refer to.  Do you happen to be able
to give me a reference ?

> > and traditionally 4 white space of 1 tab. (This align with my
> > previous coding experience as well)

4-character tabs are quite unusual in the Free Software world.  8 is
usual.

> > And I indeed find that this hunk of code doesn't looks well in my editor.
> > Its unalignment increases difficulty of reading.

Since evidently this is annoying to you I won't stand in the way of
your effort to clean this up, even though I don't much care about it.
So if you submit this as a separate patch I won't block it.

But maybe simply configuring your editor to use 8-character tabs will
fix the problem for you ?  That would be less work than preparing
whitespace adjustment patches.

Thanks,
Ian.



Re: [Xen-devel] How do I print all hypercalls as they come in?

2015-02-12 Thread D'Mita Levy
Andrew,

My apologies if my logic is flawed or what I am describing is convoluted -
I am a student doing research into Xen and trying my best to grasp what is
going on, also my ASM is subpar. I have read a paper on system call
interception (
https://hal.inria.fr/inria-00431031/PDF/Technical_Report_Syscall_Interception.pdf
) - page 10 describes disabling fast system calls by commenting out some
code in the do_set_trap_table() function and logging the calls along with
other guest info. My concern is that this may be a dated methodology as the
paper was written in 2009 but also that this will only work for x86 and not
x86_64 systems; including possible loss of performance since fast calls
tend to run better on x86 series processor systems. My goal is to identify
when a guest makes a hypercall requesting HYPERVISOR_..grant_table_op(),
 mmu_update(), set_trap_table(), essentially I would love to be able to
say...if trapcode = xxx printk("Hypercall xxx\n") has occurred but I am
unsure what would be a good route to do something like that.

Thanks,
D'Mita

On Fri, Feb 6, 2015 at 10:23 AM, Andrew Cooper 
wrote:

>  On 06/02/15 15:17, D'Mita Levy wrote:
>
> Andrew,
>
>  Thanks for your help. I am trying to log the following hypercalls to
> dmesg as they come in:
>
>   -HYPERVISOR_grant_table_op()
>
> - HYPERVISOR_mmu_update()
>
> - HYPERVISOR_set_trap_table()
>
> Are there single handlers for these as well?
>
>
> The hypercall_table in arch/x86/x86_64/entry.S is the function pointer
> dispatch table, and is indexed by hypercall number.
>
> ~Andrew
>



-- 
D'Mita Levy
Cyber Fellow, Applied Research Center
Florida International University


Re: [Xen-devel] [PATCH OSSTEST 12/12] Changes to test step of xen install

2015-02-12 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 12/12] Changes to test step of xen install"):
>  This patch accommodates ts-xen-install to nested L1 xen installation
>  usage. Its change is relatively simpler than
>  ts-debian-hvm-install. We simply alter '$ho' usage to 'w_ho', which
>  is assigned to '$ho' in original L0 installation context, while
>  assigned to '$gho' in L1 Xen installation context.

Certainly I think we should use ts-xen-install for installing Xen on
the L1.  But I think that ts-xen-install should probably think almost
entirely about the L1 and $ho should be the L1.

I think if you followed the suggestion for the structure that I made
in my previous patch, very little of the changes here would be
necessary.

It's not clear to me that _anything_ would need to change in
ts-xen-install, in fact.  (Although I may be wrong.)

Thanks,
Ian.



Re: [Xen-devel] [PATCH OSSTEST 11/12] Changes on test step of debain hvm guest install

2015-02-12 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 11/12] Changes on test step of debain hvm 
guest install"):
>  This patch is to make ts-debian-hvm-install accommodate

Ah yes here is the meat.

Firstly, can you please reformat your commit message so that the
individual points are separated out into paragraphs.  But I think
actually that probably some of this wants to go into different commits
(and perhaps different places).

> diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
> index 37eade2..e905698 100755
> --- a/ts-debian-hvm-install
> +++ b/ts-debian-hvm-install
> @@ -28,22 +28,30 @@ if (@ARGV && $ARGV[0] =~ m/^--stage(\d+)$/) { $stage=$1; shift @ARGV; }
...
> +if ($nested eq 'nested_L1') {
> +$gn ||= 'nested';
> +$guesthost ||= "$gn.l1.osstest";
> +} elsif ($nested eq 'nested_L2') {
> +$whhost = 'L1_host';
> +$gn ||= 'nested2';
> +$guesthost ||= "$gn.l2.osstest";
> +} else {
> +$gn ||= 'debianhvm';
> +$guesthost= "$gn.guest.osstest";
> +}

I don't think this is the right way to control the nestedness.
Also your test recipe seems wrong.  You write:

+run-ts . = ts-debian-hvm-install + host + nested + nested_L1
+run-ts . = ts-xen-install + host + nested + nested_build
+run-ts . = ts-debian-hvm-install + host + nested2 + nested_L2
+run-ts . = ts-guest-destroy + host + nested

I think this should look more like:

+run-ts . = ts-debian-hvm-install + host + nested
+run-ts . = ts-nested-setup + host + nested
+run-ts . = ts-xen-install nested
+run-ts . = ts-host-reboot nested
+run-ts . = ts-debian-hvm-install nested nested2

ts-nested-setup would turn on nested HVM support in the domain config,
figure out the hostname etc. and make some appropriate runvars which
selecthost would recognise.

I don't know why you need to use a dedicated VG for your nested
guest's L2 guests - please explain - but if you do, probably
ts-nested-setup could set it up.

> @@ -63,7 +71,7 @@ d-i partman-auto/expert_recipe string \\
>  use_filesystem{ } filesystem{ vfat } \\
>  mountpoint{ /boot/efi } \\
>  . \\
> -5000 50 5000 ext4 \\
> +1 50 1 ext4 \\

I think this needs an explanation.  You mentioned it in your commit
message but didn't give reasons.  I think this should perhaps be done
in a different way.

> +if ($nested eq 'nested_L2') {
> +my $L2_disk_mb = 2;
> +my $L0= selecthost($r{'L0_Ident'});

As a style matter, runvars and perl local variables generally have
all-lowercase names.

> +if ($nested eq 'nested_L2') {
> +target_cmd_root($gho, "init 0");
> +target_await_down($gho,60);
> +target_ping_check_down($gho);
> +}
> +if ($nested eq 'nested_L1') {
> +store_runvar("L1_host", $gn);
> +store_runvar("L1_IP", $gho->{Ip});
> +store_runvar("L0_Ident", $whhost);
> +target_cmd_root($gho, "mkdir -p /home/osstest/.ssh && cp /root/.ssh/authorized_keys /home/osstest/.ssh/");
> +}

I don't understand the purpose behind these special cases.

Thanks,
Ian.



[Xen-devel] [linux-linus test] 34471: regressions - trouble: broken/fail/pass

2015-02-12 Thread xen . org
flight 34471 linux-linus real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34471/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34227
 test-amd64-amd64-libvirt  9 guest-start   fail REGR. vs. 34227
 test-amd64-i386-qemuu-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 34227
 test-amd64-i386-qemut-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 34227
 test-amd64-amd64-xl-winxpsp3  5 xen-boot  fail REGR. vs. 34227
 test-amd64-i386-xl-qemuu-winxpsp3  5 xen-boot fail REGR. vs. 34227
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  5 xen-boot  fail REGR. vs. 34227
 test-amd64-amd64-xl-qemut-winxpsp3  3 host-install(3)   broken REGR. vs. 34227

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 13 guest-destroy   fail like 34394-bisect
 test-amd64-i386-libvirt   9 guest-start  fail   like 34227
 test-amd64-i386-freebsd10-i386  7 freebsd-install  fail like 34227
 test-amd64-i386-freebsd10-amd64  7 freebsd-install fail like 34227
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 34227
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install  fail like 34227

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass

version targeted for testing:
 linux                c5ce28df0e7c01a1de23c36ebdefcd803f2b6cbb
baseline version:
 linux                9d82f5eb3376cbae96ad36a063a9390de1694546


691 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-fre

[Xen-devel] Xen Security Advisory 117 (CVE-2015-0268) - arm: vgic-v2: GICD_SGIR is not properly emulated

2015-02-12 Thread Xen . org security team
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Xen Security Advisory CVE-2015-0268 / XSA-117
  version 2

   arm: vgic-v2: GICD_SGIR is not properly emulated

UPDATES IN VERSION 2
====================

CVE assigned.

Mention CVE and XSA numbers in patch commit message.

Public release.

ISSUE DESCRIPTION
=================

When decoding a guest write to a specific register in the virtual
interrupt controller Xen would treat an invalid value as a critical
error and crash the host.

IMPACT
======

By writing an invalid value to the GICD.SGIR register a guest can
crash the host, resulting in a Denial of Service attack.

VULNERABLE SYSTEMS
==================

Xen 4.5 and later systems running on ARM hardware with version 2 of
the generic interrupt controller are vulnerable.

Systems running on ARM hardware with version 3 of the generic
interrupt controller are not vulnerable.

x86 systems are not affected.

MITIGATION
==========

None.

CREDITS
=======

This issue was discovered by Julien Grall.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

xsa117.patch        Xen 4.5.x, xen-unstable

$ sha256sum xsa117*.patch
5d7c1ec3bd604ed4a56fefeebda1206f424b1b48c0e44899f13bc1e55cd0  xsa117.patch
$
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iQEcBAEBAgAGBQJU3OW0AAoJEIP+FMlX6CvZePcH/06WboLULU7JEfvzFqpnxpQV
XmNXCuvjcOt4d/w77a78kq8Bw8RUiDHR3f6qb+sJeNsJ1V55o0/KGgydEu+DqoF7
3bftmPDvuBcqoF3+7KupjRp0sBU+11Q/Jtb+P/0ZtVReFKGxmpg8kBura56rL3wf
iL1kMA4V0Kd4abmXXr6yUJMQuI19OZSQ43Zo7F9kOomyc7lcKB6vhnMtCiXw1F9Y
zfnyP1V1s5h77juSe01pQhEqjDlKv/NNkfJav6s7eVYVbJAwFgUP2vOZ14t2dR+o
5M8PPwF6EFBm421Z1D67caBh1ovGzeywZcrCl8nxuex+dqwomLymIMaL0P/fY6g=
=edQs
-----END PGP SIGNATURE-----


xsa117.patch
Description: Binary data


Re: [Xen-devel] stubdom vtpm build failure in staging

2015-02-12 Thread Andrew Cooper
On 12/02/15 17:24, Xu, Quan wrote:
>   
>
>> -Original Message-
>> From: xen-devel-boun...@lists.xen.org
>> [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Xu, Quan
>> Sent: Friday, February 13, 2015 12:57 AM
>> To: Olaf Hering
>> Cc: xen-devel@lists.xen.org
>> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
>>
>> Sorry for that. Reading the other email thread, it looks like some
>> maintainers are working on this issue.
>> And I am working on the 'Xen stubdom vTPM for HVM virtual machine' v4
>> patches. There are a lot of modifications.
>>
>> I will be out of office from Feb. 16th to Feb. 26th for Chinese New Year.
>> I plan to submit the v4 patches before Feb. 16th, and fix this issue after
>> Feb. 26th.
>>
>> --Quan
>>
>>
>>> -Original Message-
>>> From: Olaf Hering [mailto:o...@aepfle.de]
>>> Sent: Wednesday, February 11, 2015 11:21 PM
>>> To: Xu, Quan
>>> Cc: xen-devel@lists.xen.org
>>> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
>>>
>>> On Wed, Jan 28, Xu, Quan wrote:
>>>
>>>> Thanks, I will check and fix it tomorrow. It is 23:12 PM Pacific time now.
>>> Any progress?
>>> These typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
>>> compilers do not cope with current staging:
>>>
>>> # for i in `grep -w typedef stubdom/vtpmmgr/tcg.h | sed -n '/;/{s@^.* @@;s@;@@p}'`
>>> # do
>>> # if test -n "`git grep -wn $i|grep -w typedef|grep -v stubdom/vtpmmgr/tcg.h`"
>>> # then
>>> # echo $i
>>> # fi
>>> # done
>>>
>>> BYTE
>>> BOOL
>>> UINT16
>>> UINT32
>>> UINT64
>>> TPM_HANDLE
>>> TPM_ALGORITHM_ID
>>>
>>> TPMI_RH_HIERARCHY_AUTH and TPM_ALG_ID are defined twice in the same
>>> file.
>>>
>>> This change works for me:
>>>
>>> ---
>>>  stubdom/vtpmmgr/odd_types.h  | 11 +++
>>>  stubdom/vtpmmgr/tcg.h|  9 +
>>>  stubdom/vtpmmgr/tpm2_types.h | 11 +--
>>>  3 files changed, 13 insertions(+), 18 deletions(-)
>>>  create mode 100644 stubdom/vtpmmgr/odd_types.h
>>>
>>> diff --git a/stubdom/vtpmmgr/odd_types.h b/stubdom/vtpmmgr/odd_types.h
>>> new file mode 100644
>>> index 000..d72da9b
>>> --- /dev/null
>>> +++ b/stubdom/vtpmmgr/odd_types.h
>>> @@ -0,0 +1,11 @@
>>> +#ifndef VTPM_ODD_TYPES
>>> +#define VTPM_ODD_TYPES 1
>>> +typedef unsigned char BYTE;
>>> +typedef unsigned char BOOL;
>>> +typedef uint16_t UINT16;
>>> +typedef uint32_t UINT32;
>>> +typedef uint64_t UINT64;
>>> +typedef UINT32 TPM_HANDLE;
>>> +typedef UINT32 TPM_ALGORITHM_ID;
>>> +#endif
>>> +
>>> diff --git a/stubdom/vtpmmgr/tcg.h b/stubdom/vtpmmgr/tcg.h
>>> index 7321ec6..cac1bbc 100644
>>> --- a/stubdom/vtpmmgr/tcg.h
>>> +++ b/stubdom/vtpmmgr/tcg.h
>>> @@ -401,16 +401,10 @@
>>>
>>>
>>>  // *** TYPEDEFS *
>>> -typedef unsigned char BYTE;
>>> -typedef unsigned char BOOL;
>>> -typedef uint16_t UINT16;
>>> -typedef uint32_t UINT32;
>>> -typedef uint64_t UINT64;
>>> -
>>> +#include "odd_types.h"
> I think it is just for gcc backward compatibility. IMHO, that does seem
> pretty strange.
> cc Daniel who is the maintainer of vTPM / XSM.
>
> -Quan

Redefining an identifier in the same scope is a violation of the C spec.

Newer GCC tolerates bad code which redundantly declares identifiers, but
even newer GCC will still emit a diagnostic in -pedantic mode.

This build breakage needs fixing, and not just in the name of backwards
compatibility.

~Andrew




[Xen-devel] [PATCH 17/20] x86/shadow: Alter sh_remove_{all_}shadows{, _and_parents}() to take a domain

2015-02-12 Thread Andrew Cooper
This allows the removal of 3 improper uses of d->vcpu[0] from toolstack context

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/hvm/hvm.c  |2 +-
 xen/arch/x86/mm.c   |4 ++--
 xen/arch/x86/mm/shadow/common.c |   18 --
 xen/arch/x86/mm/shadow/multi.c  |   15 ---
 xen/include/asm-x86/shadow.h|8 
 5 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index a52c6e0..05d35c0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -6074,7 +6074,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 paging_mark_dirty(d, page_to_mfn(page));
 /* These are most probably not page tables any more */
 /* don't take a long time and don't die either */
-sh_remove_shadows(d->vcpu[0], _mfn(page_to_mfn(page)), 1, 0);
+sh_remove_shadows(d, _mfn(page_to_mfn(page)), 1, 0);
 put_page(page);
 }
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index d4965da..3cc6138 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2118,7 +2118,7 @@ int free_page_type(struct page_info *page, unsigned long type,
 ASSERT(VALID_M2P(gmfn));
 /* Page sharing not supported for shadowed domains */
 if(!SHARED_M2P(gmfn))
-shadow_remove_all_shadows(owner->vcpu[0], _mfn(gmfn));
+shadow_remove_all_shadows(owner, _mfn(gmfn));
 }
 
 if ( !(type & PGT_partial) )
@@ -2283,7 +2283,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
  && (page->count_info & PGC_page_table)
  && !((page->shadow_flags & (1u<<29))
   && type == PGT_writable_page) )
-   shadow_remove_all_shadows(d->vcpu[0], _mfn(page_to_mfn(page)));
+   shadow_remove_all_shadows(d, _mfn(page_to_mfn(page)));
 
 ASSERT(!(x & PGT_pae_xen_l2));
 if ( (x & PGT_type_mask) != type )
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index e10b578..30580ee 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -687,7 +687,7 @@ static int oos_remove_write_access(struct vcpu *v, mfn_t gmfn,
  * the page.  If that doesn't work either, the guest is granting
  * his pagetables and must be killed after all.
  * This will flush the tlb, so we can return with no worries. */
-sh_remove_shadows(v, gmfn, 0 /* Be thorough */, 1 /* Must succeed */);
+sh_remove_shadows(d, gmfn, 0 /* Be thorough */, 1 /* Must succeed */);
 return 1;
 }
 
@@ -1130,7 +1130,7 @@ sh_validate_guest_pt_write(struct vcpu *v, mfn_t gmfn,
  * Since the validate call above will have made a "safe" (i.e. zero)
  * shadow entry, we can let the domain live even if we can't fully
  * unshadow the page. */
-sh_remove_shadows(v, gmfn, 0, 0);
+sh_remove_shadows(d, gmfn, 0, 0);
 }
 }
 
@@ -2570,7 +2570,7 @@ static int sh_remove_shadow_via_pointer(struct domain *d, mfn_t smfn)
 return rc;
 }
 
-void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
+void sh_remove_shadows(struct domain *d, mfn_t gmfn, int fast, int all)
 /* Remove the shadows of this guest page.
  * If fast != 0, just try the quick heuristic, which will remove
  * at most one reference to each shadow of the page.  Otherwise, walk
@@ -2579,7 +2579,6 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
  * (all != 0 implies fast == 0)
  */
 {
-struct domain *d = v->domain;
 struct page_info *pg = mfn_to_page(gmfn);
 mfn_t smfn;
 unsigned char t;
@@ -2633,8 +2632,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
  * can be called via put_page_type when we clear a shadow l1e).*/
 paging_lock_recursive(d);
 
-SHADOW_PRINTK("d=%d, v=%d, gmfn=%05lx\n",
-   d->domain_id, v->vcpu_id, mfn_x(gmfn));
+SHADOW_PRINTK("d=%d: gmfn=%lx\n", d->domain_id, mfn_x(gmfn));
 
 /* Bail out now if the page is not shadowed */
 if ( (pg->count_info & PGC_page_table) == 0 )
@@ -2703,11 +2701,11 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
 }
 
 static void
-sh_remove_all_shadows_and_parents(struct vcpu *v, mfn_t gmfn)
+sh_remove_all_shadows_and_parents(struct domain *d, mfn_t gmfn)
 /* Even harsher: this is a HVM page that we thing is no longer a pagetable.
  * Unshadow it, and recursively unshadow pages that reference it. */
 {
-sh_remove_shadows(v, gmfn, 0, 1);
+sh_remove_shadows(d, gmfn, 0, 1);
 /* XXX TODO:
  * Rework this hashtable walker to return a linked-list of all
  * the shadows it modified, then do breadth-first recursion
@@ -3384,7 +3382,7 @@ static void sh_unshadow_for_p2m_change(struct domain *d,

[Xen-devel] [PATCH 14/20] x86/shadow: Alter sh_remove_l?_shadow() to take a domain

2015-02-12 Thread Andrew Cooper
This involves introducing the domain variant of hash_foreach()

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   52 +--
 xen/arch/x86/mm/shadow/multi.c  |9 +++
 xen/arch/x86/mm/shadow/multi.h  |6 ++---
 3 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index e522b60..3810b75 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1998,6 +1998,7 @@ void shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
 }
 
 typedef int (*hash_vcpu_callback_t)(struct vcpu *v, mfn_t smfn, mfn_t other_mfn);
+typedef int (*hash_domain_callback_t)(struct domain *d, mfn_t smfn, mfn_t other_mfn);
 
 static void hash_vcpu_foreach(struct vcpu *v, unsigned int callback_mask,
   const hash_vcpu_callback_t callbacks[],
@@ -2046,6 +2047,53 @@ static void hash_vcpu_foreach(struct vcpu *v, unsigned int callback_mask,
 d->arch.paging.shadow.hash_walking = 0;
 }
 
+static void hash_domain_foreach(struct domain *d,
+unsigned int callback_mask,
+const hash_domain_callback_t callbacks[],
+mfn_t callback_mfn)
+/* Walk the hash table looking at the types of the entries and
+ * calling the appropriate callback function for each entry.
+ * The mask determines which shadow types we call back for, and the array
+ * of callbacks tells us which function to call.
+ * Any callback may return non-zero to let us skip the rest of the scan.
+ *
+ * WARNING: Callbacks MUST NOT add or remove hash entries unless they
+ * then return non-zero to terminate the scan. */
+{
+int i, done = 0;
+struct page_info *x;
+
+ASSERT(paging_locked_by_me(d));
+
+/* Can be called via p2m code &c after shadow teardown. */
+if ( unlikely(!d->arch.paging.shadow.hash_table) )
+return;
+
+/* Say we're here, to stop hash-lookups reordering the chains */
+ASSERT(d->arch.paging.shadow.hash_walking == 0);
+d->arch.paging.shadow.hash_walking = 1;
+
+for ( i = 0; i < SHADOW_HASH_BUCKETS; i++ )
+{
+/* WARNING: This is not safe against changes to the hash table.
+ * The callback *must* return non-zero if it has inserted or
+ * deleted anything from the hash (lookups are OK, though). */
+for ( x = d->arch.paging.shadow.hash_table[i]; x; x = next_shadow(x) )
+{
+if ( callback_mask & (1 << x->u.sh.type) )
+{
+ASSERT(x->u.sh.type <= 15);
+ASSERT(callbacks[x->u.sh.type] != NULL);
+done = callbacks[x->u.sh.type](d, page_to_mfn(x),
+   callback_mfn);
+if ( done ) break;
+}
+}
+if ( done ) break;
+}
+d->arch.paging.shadow.hash_walking = 0;
+}
+
 
 /**/
 /* Destroy a shadow page: simple dispatcher to call the per-type destructor
@@ -2537,7 +2585,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
 
 /* Dispatch table for getting per-type functions: each level must
  * be called with the function to remove a lower-level shadow. */
-static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
+static const hash_domain_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 NULL, /* l1_32   */
 NULL, /* fl1_32  */
@@ -2621,7 +2669,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
 if( !fast   \
 && (pg->count_info & PGC_page_table)\
 && (pg->shadow_flags & (1 << t)) )  \
-hash_vcpu_foreach(v, masks[t], callbacks, smfn);\
+hash_domain_foreach(d, masks[t], callbacks, smfn);  \
 } while (0)
 
 DO_UNSHADOW(SH_type_l2_32_shadow);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 469ad25..ab6ebe2 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4373,10 +4373,9 @@ void sh_clear_shadow_entry(struct domain *d, void *ep, mfn_t smfn)
 }
 }
 
-int sh_remove_l1_shadow(struct vcpu *v, mfn_t sl2mfn, mfn_t sl1mfn)
+int sh_remove_l1_shadow(struct domain *d, mfn_t sl2mfn, mfn_t sl1mfn)
 /* Remove all mappings of this l1 shadow from this l2 shadow */
 {
-struct domain *d = v->domain;
 shadow_l2e_t *sl2e;
 int done = 0;
 int flags;
@@ -4397,10 +4396,9 @@ int sh_remove_l1_shadow(struct vcpu *v, mfn_t sl2mfn, mfn_t sl1mfn)
 }
 
 #if GUEST_PAGING_LEVELS >= 4
-int sh_remove_l2_shadow(struct vcpu *v, mfn_t sl3mfn, mfn_t sl2mfn)
+int sh_remove_l2_shadow(struct domain *d, mfn_t sl3

[Xen-devel] [PATCH 11/20] x86/shadow: Alter sh_get_ref() and sh_{, un}pin() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |6 +++---
 xen/arch/x86/mm/shadow/multi.c   |   16 
 xen/arch/x86/mm/shadow/private.h |   11 ---
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index e2ea6cb..046201a 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1305,7 +1305,7 @@ static void _shadow_prealloc(
 
 /* Unpin this top-level shadow */
 trace_shadow_prealloc_unpin(d, smfn);
-sh_unpin(v, smfn);
+sh_unpin(d, smfn);
 
 /* See if that freed up enough space */
 if ( d->arch.paging.shadow.free_pages >= pages ) return;
@@ -1370,7 +1370,7 @@ static void shadow_blow_tables(struct domain *d)
 foreach_pinned_shadow(d, sp, t)
 {
 smfn = page_to_mfn(sp);
-sh_unpin(v, smfn);
+sh_unpin(d, smfn);
 }
 
 /* Second pass: unhook entries of in-use shadows */
@@ -2616,7 +2616,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
 break;  \
 }   \
 if ( sh_type_is_pinnable(d, t) )\
-sh_unpin(v, smfn);  \
+sh_unpin(d, smfn);  \
 else if ( sh_type_has_up_pointer(d, t) )\
 sh_remove_shadow_via_pointer(v, smfn);  \
 if( !fast   \
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 7d82d90..ccb08d3 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -906,10 +906,10 @@ static int shadow_set_l4e(struct vcpu *v,
 {
 /* About to install a new reference */
 mfn_t sl3mfn = shadow_l4e_get_mfn(new_sl4e);
-ok = sh_get_ref(v, sl3mfn, paddr);
+ok = sh_get_ref(d, sl3mfn, paddr);
 /* Are we pinning l3 shadows to handle wierd linux behaviour? */
 if ( sh_type_is_pinnable(d, SH_type_l3_64_shadow) )
-ok |= sh_pin(v, sl3mfn);
+ok |= sh_pin(d, sl3mfn);
 if ( !ok )
 {
 domain_crash(d);
@@ -956,7 +956,7 @@ static int shadow_set_l3e(struct vcpu *v,
 if ( shadow_l3e_get_flags(new_sl3e) & _PAGE_PRESENT )
 {
 /* About to install a new reference */
-if ( !sh_get_ref(v, shadow_l3e_get_mfn(new_sl3e), paddr) )
+if ( !sh_get_ref(d, shadow_l3e_get_mfn(new_sl3e), paddr) )
 {
 domain_crash(d);
 return SHADOW_SET_ERROR;
@@ -1018,7 +1018,7 @@ static int shadow_set_l2e(struct vcpu *v,
 ASSERT(mfn_to_page(sl1mfn)->u.sh.head);
 
 /* About to install a new reference */
-if ( !sh_get_ref(v, sl1mfn, paddr) )
+if ( !sh_get_ref(d, sl1mfn, paddr) )
 {
 domain_crash(d);
 return SHADOW_SET_ERROR;
@@ -1537,7 +1537,7 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 shadow_type)
 page_list_for_each_safe(sp, t, &d->arch.paging.shadow.pinned_shadows)
 {
 if ( sp->u.sh.type == SH_type_l3_64_shadow )
-sh_unpin(v, page_to_mfn(sp));
+sh_unpin(d, page_to_mfn(sp));
 }
 d->arch.paging.shadow.opt_flags &= ~SHOPT_LINUX_L3_TOPLEVEL;
 sh_reset_l3_up_pointers(v);
@@ -3866,7 +3866,7 @@ sh_set_toplevel_shadow(struct vcpu *v,
 ASSERT(mfn_valid(smfn));
 
 /* Pin the shadow and put it (back) on the list of pinned shadows */
-if ( sh_pin(v, smfn) == 0 )
+if ( sh_pin(d, smfn) == 0 )
 {
 SHADOW_ERROR("can't pin %#lx as toplevel shadow\n", mfn_x(smfn));
 domain_crash(d);
@@ -3874,7 +3874,7 @@ sh_set_toplevel_shadow(struct vcpu *v,
 
 /* Take a ref to this page: it will be released in sh_detach_old_tables()
  * or the next call to set_toplevel_shadow() */
-if ( !sh_get_ref(v, smfn, 0) )
+if ( !sh_get_ref(d, smfn, 0) )
 {
 SHADOW_ERROR("can't install %#lx as toplevel shadow\n", mfn_x(smfn));
 domain_crash(d);
@@ -3895,7 +3895,7 @@ sh_set_toplevel_shadow(struct vcpu *v,
 /* Need to repin the old toplevel shadow if it's been unpinned
  * by shadow_prealloc(): in PV mode we're still running on this
  * shadow and it's not safe to free it yet. */
-if ( !mfn_to_page(old_smfn)->u.sh.pinned && !sh_pin(v, old_smfn) )
+if ( !mfn_to_page(old_smfn)->u.sh.pinned && !sh_pin(d, old_smfn) )
 {
 SHADOW_ERROR("can't re-pin %#lx\n", mfn_x(old_smfn));
 domain_crash(d);
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index a848c94..cd

[Xen-devel] [PATCH 10/20] x86/shadow: Alter sh_put_ref() and shadow destroy functions to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   19 +--
 xen/arch/x86/mm/shadow/multi.c   |   30 +-
 xen/arch/x86/mm/shadow/multi.h   |8 
 xen/arch/x86/mm/shadow/private.h |9 -
 4 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index c6b8e6f..e2ea6cb 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2052,9 +2052,8 @@ static void hash_vcpu_foreach(struct vcpu *v, unsigned int callback_mask,
  * which will decrement refcounts appropriately and return memory to the
  * free pool. */
 
-void sh_destroy_shadow(struct vcpu *v, mfn_t smfn)
+void sh_destroy_shadow(struct domain *d, mfn_t smfn)
 {
-struct domain *d = v->domain;
 struct page_info *sp = mfn_to_page(smfn);
 unsigned int t = sp->u.sh.type;
 
@@ -2076,36 +2075,36 @@ void sh_destroy_shadow(struct vcpu *v, mfn_t smfn)
 {
 case SH_type_l1_32_shadow:
 case SH_type_fl1_32_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 2)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 2)(d, smfn);
 break;
 case SH_type_l2_32_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 2)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 2)(d, smfn);
 break;
 
 case SH_type_l1_pae_shadow:
 case SH_type_fl1_pae_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 3)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 3)(d, smfn);
 break;
 case SH_type_l2_pae_shadow:
 case SH_type_l2h_pae_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 3)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 3)(d, smfn);
 break;
 
 case SH_type_l1_64_shadow:
 case SH_type_fl1_64_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 4)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 4)(d, smfn);
 break;
 case SH_type_l2h_64_shadow:
 ASSERT(is_pv_32on64_domain(d));
 /* Fall through... */
 case SH_type_l2_64_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 4)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 4)(d, smfn);
 break;
 case SH_type_l3_64_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l3_shadow, 4)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l3_shadow, 4)(d, smfn);
 break;
 case SH_type_l4_64_shadow:
-SHADOW_INTERNAL_NAME(sh_destroy_l4_shadow, 4)(v, smfn);
+SHADOW_INTERNAL_NAME(sh_destroy_l4_shadow, 4)(d, smfn);
 break;
 
 default:
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index f2dea16..7d82d90 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -931,7 +931,7 @@ static int shadow_set_l4e(struct vcpu *v,
 {
 flags |= SHADOW_SET_FLUSH;
 }
-sh_put_ref(v, osl3mfn, paddr);
+sh_put_ref(d, osl3mfn, paddr);
 }
 return flags;
 }
@@ -977,7 +977,7 @@ static int shadow_set_l3e(struct vcpu *v,
 {
 flags |= SHADOW_SET_FLUSH;
 }
-sh_put_ref(v, osl2mfn, paddr);
+sh_put_ref(d, osl2mfn, paddr);
 }
 return flags;
 }
@@ -1063,7 +1063,7 @@ static int shadow_set_l2e(struct vcpu *v,
 {
 flags |= SHADOW_SET_FLUSH;
 }
-sh_put_ref(v, osl1mfn, paddr);
+sh_put_ref(d, osl1mfn, paddr);
 }
 return flags;
 }
@@ -1882,9 +1882,8 @@ static shadow_l1e_t * shadow_get_and_create_l1e(struct vcpu *v,
  */
 
 #if GUEST_PAGING_LEVELS >= 4
-void sh_destroy_l4_shadow(struct vcpu *v, mfn_t smfn)
+void sh_destroy_l4_shadow(struct domain *d, mfn_t smfn)
 {
-struct domain *d = v->domain;
 shadow_l4e_t *sl4e;
 struct page_info *sp = mfn_to_page(smfn);
 u32 t = sp->u.sh.type;
@@ -1904,7 +1903,7 @@ void sh_destroy_l4_shadow(struct vcpu *v, mfn_t smfn)
 SHADOW_FOREACH_L4E(sl4mfn, sl4e, 0, 0, d, {
 if ( shadow_l4e_get_flags(*sl4e) & _PAGE_PRESENT )
 {
-sh_put_ref(v, shadow_l4e_get_mfn(*sl4e),
+sh_put_ref(d, shadow_l4e_get_mfn(*sl4e),
(((paddr_t)mfn_x(sl4mfn)) << PAGE_SHIFT)
| ((unsigned long)sl4e & ~PAGE_MASK));
 }
@@ -1914,9 +1913,8 @@ void sh_destroy_l4_shadow(struct vcpu *v, mfn_t smfn)
 shadow_free(d, smfn);
 }
 
-void sh_destroy_l3_shadow(struct vcpu *v, mfn_t smfn)
+void sh_destroy_l3_shadow(struct domain *d, mfn_t smfn)
 {
-struct domain *d = v->domain;
 shadow_l3e_t *sl3e;
 struct page_info *sp = mfn_to_page(smfn);
 u32 t = sp->u.sh.type;
@@ -1936,7 +1934,7 @@ void sh_destroy_l3_shadow(struct vcpu *v, mfn_t smfn)
 sl3mfn = smfn;
 SHADOW_FOREACH_L3E(sl3mfn, sl3e, 0, 0, {
 if ( shadow_l3e_get_flags(*sl3e) & _PAGE_PRESENT )
-

[Xen-devel] [PATCH 13/20] x86/shadow: Alter sh_{clear_shadow_entry, remove_shadow_via_pointer}() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   11 +--
 xen/arch/x86/mm/shadow/multi.c  |4 +---
 xen/arch/x86/mm/shadow/multi.h  |2 +-
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 046201a..e522b60 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2466,11 +2466,10 @@ static int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
 /**/
 /* Remove all shadows of a guest frame from the shadow tables */
 
-static int sh_remove_shadow_via_pointer(struct vcpu *v, mfn_t smfn)
+static int sh_remove_shadow_via_pointer(struct domain *d, mfn_t smfn)
 /* Follow this shadow's up-pointer, if it has one, and remove the reference
  * found there.  Returns 1 if that was the only reference to this shadow */
 {
-struct domain *d = v->domain;
 struct page_info *sp = mfn_to_page(smfn);
 mfn_t pmfn;
 void *vaddr;
@@ -2496,19 +2495,19 @@ static int sh_remove_shadow_via_pointer(struct vcpu *v, mfn_t smfn)
 {
 case SH_type_l1_32_shadow:
 case SH_type_l2_32_shadow:
-SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 2)(v, vaddr, pmfn);
+SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 2)(d, vaddr, pmfn);
 break;
 case SH_type_l1_pae_shadow:
 case SH_type_l2_pae_shadow:
 case SH_type_l2h_pae_shadow:
-SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 3)(v, vaddr, pmfn);
+SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 3)(d, vaddr, pmfn);
 break;
 case SH_type_l1_64_shadow:
 case SH_type_l2_64_shadow:
 case SH_type_l2h_64_shadow:
 case SH_type_l3_64_shadow:
 case SH_type_l4_64_shadow:
-SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 4)(v, vaddr, pmfn);
+SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, 4)(d, vaddr, pmfn);
 break;
 default: BUG(); /* Some wierd unknown shadow type */
 }
@@ -2618,7 +2617,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int fast, int all)
 if ( sh_type_is_pinnable(d, t) )\
 sh_unpin(d, smfn);  \
 else if ( sh_type_has_up_pointer(d, t) )\
-sh_remove_shadow_via_pointer(v, smfn);  \
+sh_remove_shadow_via_pointer(d, smfn);  \
 if( !fast   \
 && (pg->count_info & PGC_page_table)\
 && (pg->shadow_flags & (1 << t)) )  \
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 1db8161..469ad25 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4347,11 +4347,9 @@ int sh_rm_mappings_from_l1(struct vcpu *v, mfn_t sl1mfn, mfn_t target_mfn)
 /**/
 /* Functions to excise all pointers to shadows from higher-level shadows. */
 
-void sh_clear_shadow_entry(struct vcpu *v, void *ep, mfn_t smfn)
+void sh_clear_shadow_entry(struct domain *d, void *ep, mfn_t smfn)
 /* Blank out a single shadow entry */
 {
-struct domain *d = v->domain;
-
 switch ( mfn_to_page(smfn)->u.sh.type )
 {
 case SH_type_l1_shadow:
diff --git a/xen/arch/x86/mm/shadow/multi.h b/xen/arch/x86/mm/shadow/multi.h
index 614103d..e33948c 100644
--- a/xen/arch/x86/mm/shadow/multi.h
+++ b/xen/arch/x86/mm/shadow/multi.h
@@ -69,7 +69,7 @@ SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, GUEST_LEVELS)
 
 extern void
 SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, GUEST_LEVELS)
-(struct vcpu *v, void *ep, mfn_t smfn);
+(struct domain *d, void *ep, mfn_t smfn);
 
 extern int
 SHADOW_INTERNAL_NAME(sh_remove_l1_shadow, GUEST_LEVELS)
-- 
1.7.10.4
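
For readers skimming the series, the conversion applied in this patch can be sketched in isolation. The types and function names below (clear_entry_old, clear_entry_new, demo_clear_entries, the cleared_entries field) are hypothetical stand-ins rather than Xen code; the sketch only assumes, as the diff shows, that the callee used its vcpu argument solely to reach v->domain.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for Xen's types; the fields are hypothetical. */
struct domain {
    unsigned int cleared_entries;   /* per-domain bookkeeping */
};

struct vcpu {
    struct domain *domain;
};

/* Before: takes a vcpu, but only ever uses v->domain. */
static void clear_entry_old(struct vcpu *v, unsigned long *entry)
{
    struct domain *d = v->domain;

    *entry = 0;
    d->cleared_entries++;
}

/* After: takes the domain directly, so callers need no vcpu. */
static void clear_entry_new(struct domain *d, unsigned long *entry)
{
    *entry = 0;
    d->cleared_entries++;
}

/* Run both variants against one domain; returns the entries cleared. */
unsigned int demo_clear_entries(void)
{
    struct domain d = { 0 };
    struct vcpu v = { &d };
    unsigned long e1 = 0x1234, e2 = 0x5678;

    clear_entry_old(&v, &e1);   /* caller must have a vcpu in hand */
    clear_entry_new(&d, &e2);   /* caller only needs the domain */

    assert(e1 == 0 && e2 == 0);
    return d.cleared_entries;
}
```

Both variants do the same work; the second simply drops a parameter that was never needed, which is what lets domain-only callers stop borrowing a vcpu.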



[Xen-devel] [PATCH 19/20] x86/shadow: Alter sh_remove_write_access to take a domain

2015-02-12 Thread Andrew Cooper
This allows the removal of an improper use of d->vcpu[0] from toolstack context.

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |7 +++
 xen/arch/x86/mm/shadow/multi.c   |   16 ++--
 xen/arch/x86/mm/shadow/private.h |2 +-
 3 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index d24859e..4e6397a 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -671,7 +671,7 @@ static int oos_remove_write_access(struct vcpu *v, mfn_t gmfn,
 
 ftlb |= oos_fixup_flush_gmfn(v, gmfn, fixup);
 
-switch ( sh_remove_write_access(v, gmfn, 0, 0) )
+switch ( sh_remove_write_access(d, gmfn, 0, 0) )
 {
 default:
 case 0:
@@ -2180,7 +2180,7 @@ static inline void trace_shadow_wrmap_bf(mfn_t gmfn)
  * level==0 means we have some other reason for revoking write access.
  * If level==0 we are allowed to fail, returning -1. */
 
-int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
+int sh_remove_write_access(struct domain *d, mfn_t gmfn,
unsigned int level,
unsigned long fault_addr)
 {
@@ -2212,7 +2212,6 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 | SHF_L1_64
 | SHF_FL1_64
 ;
-struct domain *d = v->domain;
 struct page_info *pg = mfn_to_page(gmfn);
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
 struct vcpu *curr = current;
@@ -3689,7 +3688,7 @@ int shadow_track_dirty_vram(struct domain *d,
 for ( i = begin_pfn; i < end_pfn; i++ ) {
 mfn_t mfn = get_gfn_query_unlocked(d, i, &t);
 if (mfn_x(mfn) != INVALID_MFN)
-flush_tlb |= sh_remove_write_access(d->vcpu[0], mfn, 1, 0);
+flush_tlb |= sh_remove_write_access(d, mfn, 1, 0);
 }
 dirty_vram->last_dirty = -1;
 }
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 288c7d5..16cd60d 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -278,11 +278,7 @@ shadow_check_gl1e(struct vcpu *v, walk_t *gw)
 static inline uint32_t
 gw_remove_write_accesses(struct vcpu *v, unsigned long va, walk_t *gw)
 {
-#if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
-#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 struct domain *d = v->domain;
-#endif
-#endif
 uint32_t rc = 0;
 
 #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
@@ -295,7 +291,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, walk_t *gw)
 }
 else
 #endif /* OOS */
- if ( sh_remove_write_access(v, gw->l3mfn, 3, va) )
+ if ( sh_remove_write_access(d, gw->l3mfn, 3, va) )
  rc = GW_RMWR_FLUSHTLB;
 #endif /* GUEST_PAGING_LEVELS >= 4 */
 
@@ -307,7 +303,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, walk_t *gw)
 }
 else
 #endif /* OOS */
-if ( sh_remove_write_access(v, gw->l2mfn, 2, va) )
+if ( sh_remove_write_access(d, gw->l2mfn, 2, va) )
 rc |= GW_RMWR_FLUSHTLB;
 #endif /* GUEST_PAGING_LEVELS >= 3 */
 
@@ -316,7 +312,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, walk_t *gw)
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
  && !mfn_is_out_of_sync(gw->l1mfn)
 #endif /* OOS */
- && sh_remove_write_access(v, gw->l1mfn, 1, va) )
+ && sh_remove_write_access(d, gw->l1mfn, 1, va) )
 rc |= GW_RMWR_FLUSHTLB;
 
 return rc;
@@ -4028,7 +4024,7 @@ sh_update_cr3(struct vcpu *v, int do_locking)
  * replace the old shadow pagetable(s), so that we can safely use the
  * (old) shadow linear maps in the writeable mapping heuristics. */
 #if GUEST_PAGING_LEVELS == 2
-if ( sh_remove_write_access(v, gmfn, 2, 0) != 0 )
+if ( sh_remove_write_access(d, gmfn, 2, 0) != 0 )
 flush_tlb_mask(d->domain_dirty_cpumask);
 sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l2_shadow);
 #elif GUEST_PAGING_LEVELS == 3
@@ -4048,7 +4044,7 @@ sh_update_cr3(struct vcpu *v, int do_locking)
 gl2gfn = guest_l3e_get_gfn(gl3e[i]);
 gl2mfn = get_gfn_query_unlocked(d, gfn_x(gl2gfn), &p2mt);
 if ( p2m_is_ram(p2mt) )
-flush |= sh_remove_write_access(v, gl2mfn, 2, 0);
+flush |= sh_remove_write_access(d, gl2mfn, 2, 0);
 }
 }
 if ( flush )
@@ -4072,7 +4068,7 @@ sh_update_cr3(struct vcpu *v, int do_locking)
 }
 }
 #elif GUEST_PAGING_LEVELS == 4
-if ( sh_remove_write_access(v, gmfn, 4, 0) != 0 )
+if ( sh_remove_write_access(d, gmfn, 4, 0) != 0 )
 flush_tlb_mask(d->domain_dirty_cpumask);
 sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l4_shadow);
 #else
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index 96b53b9..1bf1deb 100644
--- a/xen/arch/x86/mm/

[Xen-devel] [PATCH 16/20] x86/shadow: Alter sh_rm_write_access_from_???() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   19 ++-
 xen/arch/x86/mm/shadow/multi.c   |6 ++
 xen/arch/x86/mm/shadow/multi.h   |4 ++--
 xen/arch/x86/mm/shadow/private.h |2 +-
 4 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 4a9b94b..e10b578 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -587,12 +587,13 @@ static inline void _sh_resync_l1(struct vcpu *v, mfn_t gmfn, mfn_t snpmfn)
 static inline int oos_fixup_flush_gmfn(struct vcpu *v, mfn_t gmfn,
struct oos_fixup *fixup)
 {
+struct domain *d = v->domain;
 int i;
 for ( i = 0; i < SHADOW_OOS_FIXUPS; i++ )
 {
 if ( mfn_x(fixup->smfn[i]) != INVALID_MFN )
 {
-sh_remove_write_access_from_sl1p(v, gmfn,
+sh_remove_write_access_from_sl1p(d, gmfn,
  fixup->smfn[i],
  fixup->off[i]);
 fixup->smfn[i] = _mfn(INVALID_MFN);
@@ -638,7 +639,7 @@ void oos_fixup_add(struct domain *d, mfn_t gmfn,
 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_OOS_FIXUP_EVICT);
 
 /* Reuse this slot and remove current writable mapping. */
-sh_remove_write_access_from_sl1p(v, gmfn,
+sh_remove_write_access_from_sl1p(d, gmfn,
  oos_fixup[idx].smfn[next],
  oos_fixup[idx].off[next]);
 perfc_incr(shadow_oos_fixup_evict);
@@ -2184,7 +2185,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
unsigned long fault_addr)
 {
 /* Dispatch table for getting per-type functions */
-static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
+static const hash_domain_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* l1_32   */
 SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* fl1_32  */
@@ -2367,7 +2368,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 int shtype = mfn_to_page(last_smfn)->u.sh.type;
 
 if ( callbacks[shtype] )
-callbacks[shtype](curr, last_smfn, gmfn);
+callbacks[shtype](d, last_smfn, gmfn);
 
 if ( (pg->u.inuse.type_info & PGT_count_mask) != old_count )
 perfc_incr(shadow_writeable_h_5);
@@ -2384,7 +2385,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 perfc_incr(shadow_writeable_bf_1);
 else
 perfc_incr(shadow_writeable_bf);
-hash_vcpu_foreach(v, callback_mask, callbacks, gmfn);
+hash_domain_foreach(d, callback_mask, callbacks, gmfn);
 
 /* If that didn't catch the mapping, then there's some non-pagetable
  * mapping -- ioreq page, grant mapping, &c. */
@@ -2404,7 +2405,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 }
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
-int sh_remove_write_access_from_sl1p(struct vcpu *v, mfn_t gmfn,
+int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
  mfn_t smfn, unsigned long off)
 {
 struct page_info *sp = mfn_to_page(smfn);
@@ -2416,16 +2417,16 @@ int sh_remove_write_access_from_sl1p(struct vcpu *v, mfn_t gmfn,
  || sp->u.sh.type == SH_type_fl1_32_shadow )
 {
 return SHADOW_INTERNAL_NAME(sh_rm_write_access_from_sl1p,2)
-(v, gmfn, smfn, off);
+(d, gmfn, smfn, off);
 }
 else if ( sp->u.sh.type == SH_type_l1_pae_shadow
   || sp->u.sh.type == SH_type_fl1_pae_shadow )
 return SHADOW_INTERNAL_NAME(sh_rm_write_access_from_sl1p,3)
-(v, gmfn, smfn, off);
+(d, gmfn, smfn, off);
 else if ( sp->u.sh.type == SH_type_l1_64_shadow
   || sp->u.sh.type == SH_type_fl1_64_shadow )
 return SHADOW_INTERNAL_NAME(sh_rm_write_access_from_sl1p,4)
-(v, gmfn, smfn, off);
+(d, gmfn, smfn, off);
 
 return 0;
 }
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 79d..0d1021b 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4177,10 +4177,9 @@ sh_update_cr3(struct vcpu *v, int do_locking)
 /* Functions to revoke guest rights */
 
 #if SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC
-int sh_rm_write_access_from_sl1p(struct vcpu *v, mfn_t gmfn,
+int sh_rm_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
  mfn_t smfn, unsigned long off)
 {
-struct domain *d = v->domain;
 struct vcpu *curr = current;
 int r;
 shadow_l1e_t *sl1p, sl1e;
@@ -4280,11 +4279,10 @@ static int sh_guess_wrmap(struct vcpu *v, unsigned long 
vaddr, mfn_t 

[Xen-devel] [PATCH 15/20] x86/shadow: Alter shadow_unhook{_???}_mappings() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   12 ++--
 xen/arch/x86/mm/shadow/multi.c   |   13 +
 xen/arch/x86/mm/shadow/multi.h   |6 +++---
 xen/arch/x86/mm/shadow/private.h |2 +-
 4 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 3810b75..4a9b94b 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1244,20 +1244,20 @@ static unsigned int shadow_min_acceptable_pages(struct domain *d)
 /* Dispatcher function: call the per-mode function that will unhook the
  * non-Xen mappings in this top-level shadow mfn.  With user_only == 1,
  * unhooks only the user-mode mappings. */
-void shadow_unhook_mappings(struct vcpu *v, mfn_t smfn, int user_only)
+void shadow_unhook_mappings(struct domain *d, mfn_t smfn, int user_only)
 {
 struct page_info *sp = mfn_to_page(smfn);
 switch ( sp->u.sh.type )
 {
 case SH_type_l2_32_shadow:
-SHADOW_INTERNAL_NAME(sh_unhook_32b_mappings, 2)(v, smfn, user_only);
+SHADOW_INTERNAL_NAME(sh_unhook_32b_mappings, 2)(d, smfn, user_only);
 break;
 case SH_type_l2_pae_shadow:
 case SH_type_l2h_pae_shadow:
-SHADOW_INTERNAL_NAME(sh_unhook_pae_mappings, 3)(v, smfn, user_only);
+SHADOW_INTERNAL_NAME(sh_unhook_pae_mappings, 3)(d, smfn, user_only);
 break;
 case SH_type_l4_64_shadow:
-SHADOW_INTERNAL_NAME(sh_unhook_64b_mappings, 4)(v, smfn, user_only);
+SHADOW_INTERNAL_NAME(sh_unhook_64b_mappings, 4)(d, smfn, user_only);
 break;
 default:
 SHADOW_ERROR("top-level shadow has bad type %08x\n", sp->u.sh.type);
@@ -1322,7 +1322,7 @@ static void _shadow_prealloc(
 if ( !pagetable_is_null(v2->arch.shadow_table[i]) )
 {
 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_PREALLOC_UNHOOK);
-shadow_unhook_mappings(v,
+shadow_unhook_mappings(d,
pagetable_get_mfn(v2->arch.shadow_table[i]), 0);
 
 /* See if that freed up enough space */
@@ -1377,7 +1377,7 @@ static void shadow_blow_tables(struct domain *d)
 for_each_vcpu(d, v)
 for ( i = 0 ; i < 4 ; i++ )
 if ( !pagetable_is_null(v->arch.shadow_table[i]) )
-shadow_unhook_mappings(v,
+shadow_unhook_mappings(d,
pagetable_get_mfn(v->arch.shadow_table[i]), 0);
 
 /* Make sure everyone sees the unshadowings */
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index ab6ebe2..79d 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -2074,9 +2074,8 @@ void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn)
 
 #if GUEST_PAGING_LEVELS == 2
 
-void sh_unhook_32b_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
+void sh_unhook_32b_mappings(struct domain *d, mfn_t sl2mfn, int user_only)
 {
-struct domain *d = v->domain;
 shadow_l2e_t *sl2e;
 SHADOW_FOREACH_L2E(sl2mfn, sl2e, 0, 0, d, {
 if ( !user_only || (sl2e->l2 & _PAGE_USER) )
@@ -2086,10 +2085,9 @@ void sh_unhook_32b_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
 
 #elif GUEST_PAGING_LEVELS == 3
 
-void sh_unhook_pae_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
+void sh_unhook_pae_mappings(struct domain *d, mfn_t sl2mfn, int user_only)
 /* Walk a PAE l2 shadow, unhooking entries from all the subshadows */
 {
-struct domain *d = v->domain;
 shadow_l2e_t *sl2e;
 SHADOW_FOREACH_L2E(sl2mfn, sl2e, 0, 0, d, {
 if ( !user_only || (sl2e->l2 & _PAGE_USER) )
@@ -2099,9 +2097,8 @@ void sh_unhook_pae_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
 
 #elif GUEST_PAGING_LEVELS == 4
 
-void sh_unhook_64b_mappings(struct vcpu *v, mfn_t sl4mfn, int user_only)
+void sh_unhook_64b_mappings(struct domain *d, mfn_t sl4mfn, int user_only)
 {
-struct domain *d = v->domain;
 shadow_l4e_t *sl4e;
 SHADOW_FOREACH_L4E(sl4mfn, sl4e, 0, 0, d, {
 if ( !user_only || (sl4e->l4 & _PAGE_USER) )
@@ -4506,7 +4503,7 @@ static void sh_pagetable_dying(struct vcpu *v, paddr_t gpa)
 {
 gmfn = _mfn(mfn_to_page(smfn)->v.sh.back);
 mfn_to_page(gmfn)->shadow_flags |= SHF_pagetable_dying;
-shadow_unhook_mappings(v, smfn, 1/* user pages only */);
+shadow_unhook_mappings(d, smfn, 1/* user pages only */);
 flush = 1;
 }
 }
@@ -4545,7 +4542,7 @@ static void sh_pagetable_dying(struct vcpu *v, paddr_t gpa)
 if ( mfn_valid(smfn) )
 {
 mfn_to_page(gmfn)->shadow_flags |= SHF_pagetable_dying;
-shadow_unhook_mappings(v, smfn, 1/* user pages only */);
+shadow_unhook_mappings(d, smfn, 1/* user pages only */);
 /* Now flush the TLB: we removed toplevel mappings. */
 flush_tlb_mask(d->domain_dir

[Xen-devel] [PATCH 12/20] x86/shadow: Alter shadow_set_l?e() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/multi.c |   69 
 1 file changed, 35 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index ccb08d3..1db8161 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -885,12 +885,11 @@ shadow_put_page_from_l1e(shadow_l1e_t sl1e, struct domain *d)
 }
 
 #if GUEST_PAGING_LEVELS >= 4
-static int shadow_set_l4e(struct vcpu *v,
+static int shadow_set_l4e(struct domain *d,
   shadow_l4e_t *sl4e,
   shadow_l4e_t new_sl4e,
   mfn_t sl4mfn)
 {
-struct domain *d = v->domain;
 int flags = 0, ok;
 shadow_l4e_t old_sl4e;
 paddr_t paddr;
@@ -936,12 +935,11 @@ static int shadow_set_l4e(struct vcpu *v,
 return flags;
 }
 
-static int shadow_set_l3e(struct vcpu *v,
+static int shadow_set_l3e(struct domain *d,
   shadow_l3e_t *sl3e,
   shadow_l3e_t new_sl3e,
   mfn_t sl3mfn)
 {
-struct domain *d = v->domain;
 int flags = 0;
 shadow_l3e_t old_sl3e;
 paddr_t paddr;
@@ -983,12 +981,11 @@ static int shadow_set_l3e(struct vcpu *v,
 }
 #endif /* GUEST_PAGING_LEVELS >= 4 */
 
-static int shadow_set_l2e(struct vcpu *v,
+static int shadow_set_l2e(struct domain *d,
   shadow_l2e_t *sl2e,
   shadow_l2e_t new_sl2e,
   mfn_t sl2mfn)
 {
-struct domain *d = v->domain;
 int flags = 0;
 shadow_l2e_t old_sl2e;
 paddr_t paddr;
@@ -1165,14 +1162,13 @@ static inline void shadow_vram_put_l1e(shadow_l1e_t old_sl1e,
 }
 }
 
-static int shadow_set_l1e(struct vcpu *v,
+static int shadow_set_l1e(struct domain *d,
   shadow_l1e_t *sl1e,
   shadow_l1e_t new_sl1e,
   p2m_type_t new_type,
   mfn_t sl1mfn)
 {
 int flags = 0;
-struct domain *d = v->domain;
 shadow_l1e_t old_sl1e;
 #if SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC
 mfn_t new_gmfn = shadow_l1e_get_mfn(new_sl1e);
@@ -1699,7 +1695,7 @@ static shadow_l3e_t * shadow_get_and_create_l3e(struct vcpu *v,
 }
 /* Install the new sl3 table in the sl4e */
 l4e_propagate_from_guest(v, gw->l4e, *sl3mfn, &new_sl4e, ft);
-r = shadow_set_l4e(v, sl4e, new_sl4e, sl4mfn);
+r = shadow_set_l4e(d, sl4e, new_sl4e, sl4mfn);
 ASSERT((r & SHADOW_SET_FLUSH) == 0);
 if ( r & SHADOW_SET_ERROR )
 return NULL;
@@ -1755,7 +1751,7 @@ static shadow_l2e_t * shadow_get_and_create_l2e(struct vcpu *v,
 }
 /* Install the new sl2 table in the sl3e */
 l3e_propagate_from_guest(v, gw->l3e, *sl2mfn, &new_sl3e, ft);
-r = shadow_set_l3e(v, sl3e, new_sl3e, sl3mfn);
+r = shadow_set_l3e(d, sl3e, new_sl3e, sl3mfn);
 ASSERT((r & SHADOW_SET_FLUSH) == 0);
 if ( r & SHADOW_SET_ERROR )
 return NULL;
@@ -1845,7 +1841,7 @@ static shadow_l1e_t * shadow_get_and_create_l1e(struct vcpu *v,
 }
 /* Install the new sl1 table in the sl2e */
 l2e_propagate_from_guest(v, gw->l2e, *sl1mfn, &new_sl2e, ft);
-r = shadow_set_l2e(v, sl2e, new_sl2e, sl2mfn);
+r = shadow_set_l2e(d, sl2e, new_sl2e, sl2mfn);
 ASSERT((r & SHADOW_SET_FLUSH) == 0);
 if ( r & SHADOW_SET_ERROR )
 return NULL;
@@ -2084,7 +2080,7 @@ void sh_unhook_32b_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
 shadow_l2e_t *sl2e;
 SHADOW_FOREACH_L2E(sl2mfn, sl2e, 0, 0, d, {
 if ( !user_only || (sl2e->l2 & _PAGE_USER) )
-(void) shadow_set_l2e(v, sl2e, shadow_l2e_empty(), sl2mfn);
+(void) shadow_set_l2e(d, sl2e, shadow_l2e_empty(), sl2mfn);
 });
 }
 
@@ -2097,7 +2093,7 @@ void sh_unhook_pae_mappings(struct vcpu *v, mfn_t sl2mfn, int user_only)
 shadow_l2e_t *sl2e;
 SHADOW_FOREACH_L2E(sl2mfn, sl2e, 0, 0, d, {
 if ( !user_only || (sl2e->l2 & _PAGE_USER) )
-(void) shadow_set_l2e(v, sl2e, shadow_l2e_empty(), sl2mfn);
+(void) shadow_set_l2e(d, sl2e, shadow_l2e_empty(), sl2mfn);
 });
 }
 
@@ -2109,7 +2105,7 @@ void sh_unhook_64b_mappings(struct vcpu *v, mfn_t sl4mfn, int user_only)
 shadow_l4e_t *sl4e;
 SHADOW_FOREACH_L4E(sl4mfn, sl4e, 0, 0, d, {
 if ( !user_only || (sl4e->l4 & _PAGE_USER) )
-(void) shadow_set_l4e(v, sl4e, shadow_l4e_empty(), sl4mfn);
+(void) shadow_set_l4e(d, sl4e, shadow_l4e_empty(), sl4mfn);
 });
 }
 
@@ -2180,7 +2176,7 @@ static int validate_gl4e(struct vcpu *v, void *new_ge, mfn_t sl4mfn, void *se)
 }
 }
 
-result |= shadow_set_l4e(v, sl4p, new_sl4e, sl4mfn);
+result |= shadow_set_l4e(d, sl4p, new_sl4e, sl4mfn);
 return result;
 }
 
@@

[Xen-devel] [PATCH 20/20] x86/shadow: Cleanup of vcpu handling

2015-02-12 Thread Andrew Cooper
There existed some codepaths which had a domain in their hand, but needed a
vcpu to drive the shadow interface.

Now that these interfaces have changed to be domain based, the hoop-jumping
can be removed.

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   30 ++
 1 file changed, 10 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 4e6397a..01bc750 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1280,23 +1280,17 @@ static inline void trace_shadow_prealloc_unpin(struct domain *d, mfn_t smfn)
 
 /* Make sure there are at least count order-sized pages
  * available in the shadow page pool. */
-static void _shadow_prealloc(
-struct domain *d,
-unsigned int pages)
+static void _shadow_prealloc(struct domain *d, unsigned int pages)
 {
-/* Need a vpcu for calling unpins; for now, since we don't have
- * per-vcpu shadows, any will do */
-struct vcpu *v, *v2;
+struct vcpu *v;
 struct page_info *sp, *t;
 mfn_t smfn;
 int i;
 
 if ( d->arch.paging.shadow.free_pages >= pages ) return;
 
-v = current;
-if ( v->domain != d )
-v = d->vcpu[0];
-ASSERT(v != NULL); /* Shouldn't have enabled shadows if we've no vcpus  */
+/* Shouldn't have enabled shadows if we've no vcpus. */
+ASSERT(d->vcpu && d->vcpu[0]);
 
 /* Stage one: walk the list of pinned pages, unpinning them */
 perfc_incr(shadow_prealloc_1);
@@ -1317,14 +1311,14 @@ static void _shadow_prealloc(
  * mappings. */
 perfc_incr(shadow_prealloc_2);
 
-for_each_vcpu(d, v2)
+for_each_vcpu(d, v)
 for ( i = 0 ; i < 4 ; i++ )
 {
-if ( !pagetable_is_null(v2->arch.shadow_table[i]) )
+if ( !pagetable_is_null(v->arch.shadow_table[i]) )
 {
 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_PREALLOC_UNHOOK);
 shadow_unhook_mappings(d,
-   pagetable_get_mfn(v2->arch.shadow_table[i]), 0);
+   pagetable_get_mfn(v->arch.shadow_table[i]), 0);
 
 /* See if that freed up enough space */
 if ( d->arch.paging.shadow.free_pages >= pages )
@@ -1361,11 +1355,12 @@ void shadow_prealloc(struct domain *d, u32 type, unsigned int count)
 static void shadow_blow_tables(struct domain *d)
 {
 struct page_info *sp, *t;
-struct vcpu *v = d->vcpu[0];
+struct vcpu *v;
 mfn_t smfn;
 int i;
 
-ASSERT(v != NULL);
+/* Shouldn't have enabled shadows if we've no vcpus. */
+ASSERT(d->vcpu && d->vcpu[0]);
 
 /* Pass one: unpin all pinned pages */
 foreach_pinned_shadow(d, sp, t)
@@ -3363,11 +3358,6 @@ static void sh_unshadow_for_p2m_change(struct domain *d, unsigned long gfn,
l1_pgentry_t *p, l1_pgentry_t new,
unsigned int level)
 {
-struct vcpu *v = current;
-
-if ( v->domain != d )
-v = d->vcpu ? d->vcpu[0] : NULL;
-
 /* The following assertion is to make sure we don't step on 1GB host
  * page support of HVM guest. */
 ASSERT(!(level > 2 && (l1e_get_flags(*p) & _PAGE_PRESENT) &&
-- 
1.7.10.4
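
As an illustration of the hoop-jumping this patch removes, here is a minimal self-contained sketch. It is not Xen code: current_vcpu stands in for Xen's current, and the unpin_old/unpin_new helpers, the unpinned counter, and demo_prealloc are hypothetical names introduced only for this example. The old path has to borrow a vcpu when the caller runs in another domain's context; the new path passes the domain straight through.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu;

/* Toy stand-ins for Xen's types; the fields are hypothetical. */
struct domain {
    struct vcpu **vcpu;      /* array of vcpu pointers, may be NULL */
    unsigned int unpinned;   /* counts unpin operations on this domain */
};

struct vcpu {
    struct domain *domain;
};

/* Stand-in for Xen's 'current' (the vcpu running on this CPU). */
static struct vcpu *current_vcpu;

/* Old-style callee: wants a vcpu, but only uses it to find the domain. */
static void unpin_old(struct vcpu *v) { v->domain->unpinned++; }

/* New-style callee: takes the domain directly. */
static void unpin_new(struct domain *d) { d->unpinned++; }

/* Before: find some vcpu of d just to drive the interface. */
static void prealloc_old(struct domain *d)
{
    struct vcpu *v = current_vcpu;

    if ( v->domain != d )
        v = d->vcpu[0];
    assert(v != NULL);
    unpin_old(v);
}

/* After: no vcpu is needed; only sanity-check that the domain has one. */
static void prealloc_new(struct domain *d)
{
    assert(d->vcpu && d->vcpu[0]);
    unpin_new(d);
}

/* Call both paths from "toolstack" context; returns guest unpin count. */
unsigned int demo_prealloc(void)
{
    struct domain toolstack = { NULL, 0 }, guest = { NULL, 0 };
    struct vcpu tv = { &toolstack }, gv = { &guest };
    struct vcpu *tvs[] = { &tv }, *gvs[] = { &gv };

    toolstack.vcpu = tvs;
    guest.vcpu = gvs;
    current_vcpu = &tv;          /* running on behalf of the toolstack */

    prealloc_old(&guest);        /* falls back to guest.vcpu[0] */
    prealloc_new(&guest);        /* uses the domain directly */

    assert(toolstack.unpinned == 0);
    return guest.unpinned;
}
```

In this toy model both paths leave the guest domain in the same state; the difference is only that the second never has to pick an arbitrary vcpu.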



[Xen-devel] [PATCH 18/20] x86/shadow: Alter sh_{remove_all_mappings, rm_mappings_from_l1}() to take a domain

2015-02-12 Thread Andrew Cooper
This allows the removal of an improper use of d->vcpu[0] from toolstack context.

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   13 ++---
 xen/arch/x86/mm/shadow/multi.c  |3 +--
 xen/arch/x86/mm/shadow/multi.h  |2 +-
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 30580ee..d24859e 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2436,13 +2436,12 @@ int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
 /* Remove all mappings of a guest frame from the shadow tables.
  * Returns non-zero if we need to flush TLBs. */
 
-static int sh_remove_all_mappings(struct vcpu *v, mfn_t gmfn)
+static int sh_remove_all_mappings(struct domain *d, mfn_t gmfn)
 {
-struct domain *d = v->domain;
 struct page_info *page = mfn_to_page(gmfn);
 
 /* Dispatch table for getting per-type functions */
-static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
+static const hash_domain_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, 2), /* l1_32   */
 SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, 2), /* fl1_32  */
@@ -2484,7 +2483,7 @@ static int sh_remove_all_mappings(struct vcpu *v, mfn_t 
gmfn)
 
 /* Brute-force search of all the shadows, by walking the hash */
 perfc_incr(shadow_mappings_bf);
-hash_vcpu_foreach(v, callback_mask, callbacks, gmfn);
+hash_domain_foreach(d, callback_mask, callbacks, gmfn);
 
 /* If that didn't catch the mapping, something is very wrong */
 if ( !sh_check_page_has_no_refs(page) )
@@ -3383,7 +3382,7 @@ static void sh_unshadow_for_p2m_change(struct domain *d, 
unsigned long gfn,
 if ( (p2m_is_valid(p2mt) || p2m_is_grant(p2mt)) && mfn_valid(mfn) )
 {
 sh_remove_all_shadows_and_parents(d, mfn);
-if ( sh_remove_all_mappings(v, mfn) )
+if ( sh_remove_all_mappings(d, mfn) )
 flush_tlb_mask(d->domain_dirty_cpumask);
 }
 }
@@ -3418,7 +3417,7 @@ static void sh_unshadow_for_p2m_change(struct domain *d, 
unsigned long gfn,
 {
 /* This GFN->MFN mapping has gone away */
 sh_remove_all_shadows_and_parents(d, omfn);
-if ( sh_remove_all_mappings(v, omfn) )
+if ( sh_remove_all_mappings(d, omfn) )
 cpumask_or(&flushmask, &flushmask,
d->domain_dirty_cpumask);
 }
@@ -3634,7 +3633,7 @@ int shadow_track_dirty_vram(struct domain *d,
 dirty = 1;
 /* TODO: Heuristics for finding the single mapping of
  * this gmfn */
-flush_tlb |= sh_remove_all_mappings(d->vcpu[0], mfn);
+flush_tlb |= sh_remove_all_mappings(d, mfn);
 }
 else
 {
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 7705674..288c7d5 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4315,10 +4315,9 @@ int sh_rm_write_access_from_l1(struct domain *d, mfn_t 
sl1mfn,
 }
 
 
-int sh_rm_mappings_from_l1(struct vcpu *v, mfn_t sl1mfn, mfn_t target_mfn)
+int sh_rm_mappings_from_l1(struct domain *d, mfn_t sl1mfn, mfn_t target_mfn)
 /* Excises all mappings to guest frame from this shadow l1 table */
 {
-struct domain *d = v->domain;
 shadow_l1e_t *sl1e;
 int done = 0;
 int flags;
diff --git a/xen/arch/x86/mm/shadow/multi.h b/xen/arch/x86/mm/shadow/multi.h
index 1af9225..935e12d 100644
--- a/xen/arch/x86/mm/shadow/multi.h
+++ b/xen/arch/x86/mm/shadow/multi.h
@@ -65,7 +65,7 @@ SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, GUEST_LEVELS)
 (struct domain *d, mfn_t sl1mfn, mfn_t readonly_mfn);
 extern int
 SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, GUEST_LEVELS)
-(struct vcpu *v, mfn_t sl1mfn, mfn_t target_mfn);
+(struct domain *d, mfn_t sl1mfn, mfn_t target_mfn);
 
 extern void
 SHADOW_INTERNAL_NAME(sh_clear_shadow_entry, GUEST_LEVELS)
-- 
1.7.10.4




Re: [Xen-devel] stubdom vtpm build failure in staging

2015-02-12 Thread Xu, Quan


> -Original Message-
> From: xen-devel-boun...@lists.xen.org
> [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Xu, Quan
> Sent: Friday, February 13, 2015 12:57 AM
> To: Olaf Hering
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> 
> Sorry about that. From the other email thread, it looks like some maintainers
> are working on this issue.
> I am also working on the 'Xen stubdom vTPM for HVM virtual machine' v4 patches.
> There are a lot of modifications.
> 
> I will be out of the office from Feb. 16th to Feb. 26th for Chinese New Year.
> I plan to submit the v4 patches before Feb. 16th, and fix this issue after Feb. 26th.
> 
> --Quan
> 
> 
> > -Original Message-
> > From: Olaf Hering [mailto:o...@aepfle.de]
> > Sent: Wednesday, February 11, 2015 11:21 PM
> > To: Xu, Quan
> > Cc: xen-devel@lists.xen.org
> > Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> >
> > On Wed, Jan 28, Xu, Quan wrote:
> >
> > > Thanks, I will check and fix it tomorrow. It is 23:12 PM Pacific time now.
> >
> > Any progress?
> > These typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
> > compilers do not cope with current staging:
> >
> > # for i in `grep -w typedef stubdom/vtpmmgr/tcg.h | sed -n '/;/{s@^.*
> > @@;s@;@@p}'` # do
> > # if test -n "`git grep -wn $i|grep -w typedef|grep -v
> > stubdom/vtpmmgr/tcg.h`"
> > # then
> > # echo $i
> > # fi
> > # done
> >
> > BYTE
> > BOOL
> > UINT16
> > UINT32
> > UINT64
> > TPM_HANDLE
> > TPM_ALGORITHM_ID
> >
> > TPMI_RH_HIERARCHY_AUTH and TPM_ALG_ID are defined twice in the same
> > file.
> >
> > This change works for me:
> >
> > ---
> >  stubdom/vtpmmgr/odd_types.h  | 11 +++
> >  stubdom/vtpmmgr/tcg.h|  9 +
> >  stubdom/vtpmmgr/tpm2_types.h | 11 +--
> >  3 files changed, 13 insertions(+), 18 deletions(-)  create mode
> > 100644 stubdom/vtpmmgr/odd_types.h
> >
> > diff --git a/stubdom/vtpmmgr/odd_types.h b/stubdom/vtpmmgr/odd_types.h
> > new file mode 100644 index 000..d72da9b
> > --- /dev/null
> > +++ b/stubdom/vtpmmgr/odd_types.h
> > @@ -0,0 +1,11 @@
> > +#ifndef VTPM_ODD_TYPES
> > +#define VTPM_ODD_TYPES 1
> > +typedef unsigned char BYTE;
> > +typedef unsigned char BOOL;
> > +typedef uint16_t UINT16;
> > +typedef uint32_t UINT32;
> > +typedef uint64_t UINT64;
> > +typedef UINT32 TPM_HANDLE;
> > +typedef UINT32 TPM_ALGORITHM_ID;
> > +#endif
> > +
> > diff --git a/stubdom/vtpmmgr/tcg.h b/stubdom/vtpmmgr/tcg.h index
> > 7321ec6..cac1bbc 100644
> > --- a/stubdom/vtpmmgr/tcg.h
> > +++ b/stubdom/vtpmmgr/tcg.h
> > @@ -401,16 +401,10 @@
> >
> >
> >  // *** TYPEDEFS
> > * -typedef unsigned char BYTE;
> > -typedef unsigned char BOOL; -typedef uint16_t UINT16; -typedef
> > uint32_t UINT32; -typedef uint64_t UINT64;
> > -
> > +#include "odd_types.h"

I think it is just for gcc backward compatibility. IMHO, that does seem pretty strange.
CC'ing Daniel, who is the maintainer of vTPM / XSM.

-Quan

> >  typedef UINT32 TPM_RESULT;
> >  typedef UINT32 TPM_PCRINDEX;
> >  typedef UINT32 TPM_DIRINDEX;
> > -typedef UINT32 TPM_HANDLE;
> >  typedef TPM_HANDLE TPM_AUTHHANDLE;
> >  typedef TPM_HANDLE TCPA_HASHHANDLE;
> >  typedef TPM_HANDLE TCPA_HMACHANDLE;
> > @@ -422,7 +416,6 @@ typedef UINT32 TPM_COMMAND_CODE;  typedef
> > UINT16 TPM_PROTOCOL_ID;  typedef BYTE TPM_AUTH_DATA_USAGE;
> typedef
> > UINT16 TPM_ENTITY_TYPE; -typedef UINT32 TPM_ALGORITHM_ID; typedef
> > UINT16 TPM_KEY_USAGE;  typedef UINT16 TPM_STARTUP_TYPE; typedef
> UINT32
> > TPM_CAPABILITY_AREA; diff --git a/stubdom/vtpmmgr/tpm2_types.h
> > b/stubdom/vtpmmgr/tpm2_types.h index ac2830d..63564cd 100644
> > --- a/stubdom/vtpmmgr/tpm2_types.h
> > +++ b/stubdom/vtpmmgr/tpm2_types.h
> > @@ -83,12 +83,8 @@
> >  #defineMAX_ECC_KEY_BYTES((MAX_ECC_KEY_BITS + 7) / 8)
> >
> >
> > -typedef unsigned char BYTE;
> > -typedef unsigned char BOOL;
> > +#include "odd_types.h"
> >  typedef uint8_t   UINT8;
> > -typedef uint16_t  UINT16;
> > -typedef uint32_t  UINT32;
> > -typedef uint64_t  UINT64;
> >
> >  // TPM2 command code
> >
> > @@ -216,7 +212,6 @@ typedef UINT16 TPM_ST;
> >
> >
> >  // TPM Handle types
> > -typedef UINT32 TPM_HANDLE;
> >  typedef UINT8 TPM_HT;
> >
> >
> > @@ -233,7 +228,6 @@ typedef UINT32 TPM_RH;
> >  #defineTPM_RH_LAST   (TPM_RH)(0x400C)
> >
> >  // Table 4 -- DocumentationClarity Types 
> > -typedef UINT32TPM_ALGORITHM_ID;
> >  typedef UINT32TPM_MODIFIER_INDICATOR;
> >  typedef UINT32TPM_SESSION_OFFSET;
> >  typedef UINT16TPM_KEY_SIZE;
> > @@ -261,8 +255,6 @@ typedef BYTE TPMA_LOCALITY;  // Table 37 --
> > TPMI_YES_NO Type   typedef BYTE TPMI_YES_NO;
> >
> > -typedef TPM_HANDLE TPMI_RH_HIERARCHY_AUTH;
> > -
> >  // Table 38 -- TPMI_DH_OBJECT Type   typedef TPM_HANDLE
> > TPMI_DH_OBJECT;
> >
> > @@ -304,7 +296,6 @@ typedef TPM_HANDLE TPMI_RH_LOC

[Xen-devel] [PATCH RFC 00/20] Change parts of the shadow interface to be domain based

2015-02-12 Thread Andrew Cooper
The purpose of this series is to prevent toolstack entry points into the
shadow code from passing d->vcpu[0] for actions which are inherently
domain-wide.  It also fixes the fact that shadow heuristics were being applied
to vcpu 0 for toolstack-initiated actions.
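
To make the motivation concrete, here is a minimal standalone sketch of the interface change, assuming simplified stand-ins for struct domain and struct vcpu. The field shadow_pages and the two count_shadows_* functions are hypothetical and exist only for illustration; they are not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's struct domain / struct vcpu (hypothetical). */
struct domain;

struct vcpu {
    int vcpu_id;
    struct domain *domain;
};

struct domain {
    int domain_id;
    unsigned int nr_vcpus;
    struct vcpu **vcpu;          /* may be NULL before any vcpu is created */
    unsigned int shadow_pages;   /* hypothetical domain-wide state */
};

/*
 * Old style: the caller must supply a vcpu, even though it is only used to
 * reach the domain.  A toolstack caller has no natural vcpu, so it ends up
 * passing d->vcpu[0], which may not exist.
 */
static unsigned int count_shadows_by_vcpu(const struct vcpu *v)
{
    assert(v != NULL);
    return v->domain->shadow_pages;
}

/* New style: a domain-wide action takes the domain directly. */
static unsigned int count_shadows_by_domain(const struct domain *d)
{
    assert(d != NULL);
    return d->shadow_pages;
}
```

The domain-based variant works for a domain with no vcpus at all, which is exactly the case where reaching for d->vcpu[0] would be unsafe.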

This series is composed mostly of mechanical changes.  The only patches which
make a practical difference to shadow execution are patches 4 and 20.

The entire series has been compile tested at each changeset, for both
shadow-paging=y and n.  It has also been tested internally in XenServer, but
appears to have gotten caught up in some collateral damage from an unrelated
merge.  I am rerunning the tests.

This series can be found in the shadow-dom-v1 branch of

  git://xenbits.xen.org/people/andrewcoop/xen.git


Andrew Cooper (20):
  x86/shadow: Whitespace cleanup
  x86/shadow: Rename hash_foreach() to hash_vcpu_foreach()
  x86/shadow: Introduce 'd' pointers and clean up use of 'v->domain'
  x86/shadow: Only apply shadow heuristics when in guest context
  x86/shadow: Alter shadow_hash_{lookup,insert,delete}() to take a domain
  x86/shadow: Alter *_shadow_status() and make_fl1_shadow() to take a domain
  x86/shadow: Alter sh_type_{is_pinnable,has_up_pointer}() to take a domain
  x86/shadow: Alter OOS functions to take a domain
  x86/shadow: Alter shadow_{pro,de}mote() to take a domain
  x86/shadow: Alter sh_put_ref() and shadow destroy functions to take a domain
  x86/shadow: Alter sh_get_ref() and sh_{,un}pin() to take a domain
  x86/shadow: Alter shadow_set_l?e() to take a domain
  x86/shadow: Alter sh_{clear_shadow_entry,remove_shadow_via_pointer}() to take a domain
  x86/shadow: Alter sh_remove_l?_shadow() to take a domain
  x86/shadow: Alter shadow_unhook{_???}_mappings() to take a domain
  x86/shadow: Alter sh_rm_write_access_from_???() to take a domain
  x86/shadow: Alter sh_remove_{all_}shadows{,_and_parents}() to take a domain
  x86/shadow: Alter sh_{remove_all_mappings,rm_mappings_from_l1}() to take a domain
  x86/shadow: Alter sh_remove_write_access to take a domain
  x86/shadow: Cleanup of vcpu handling

 xen/arch/x86/hvm/hvm.c   |2 +-
 xen/arch/x86/mm.c|4 +-
 xen/arch/x86/mm/shadow/common.c  |  769 +
 xen/arch/x86/mm/shadow/multi.c   | 1180 +++---
 xen/arch/x86/mm/shadow/multi.h   |   64 +--
 xen/arch/x86/mm/shadow/private.h |  154 ++---
 xen/arch/x86/mm/shadow/types.h   |   42 +-
 xen/include/asm-x86/shadow.h |8 +-
 8 files changed, 1140 insertions(+), 1083 deletions(-)



[Xen-devel] [PATCH 08/20] x86/shadow: Alter OOS functions to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   23 ---
 xen/arch/x86/mm/shadow/multi.c   |   19 ---
 xen/arch/x86/mm/shadow/private.h |6 +++---
 3 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index bdb19fb..6945dfe 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -603,13 +603,13 @@ static inline int oos_fixup_flush_gmfn(struct vcpu *v, 
mfn_t gmfn,
 return 1;
 }
 
-void oos_fixup_add(struct vcpu *v, mfn_t gmfn,
+void oos_fixup_add(struct domain *d, mfn_t gmfn,
mfn_t smfn,  unsigned long off)
 {
 int idx, next;
 mfn_t *oos;
 struct oos_fixup *oos_fixup;
-struct domain *d = v->domain;
+struct vcpu *v;
 
 perfc_incr(shadow_oos_fixup_add);
 
@@ -788,13 +788,13 @@ static void oos_hash_add(struct vcpu *v, mfn_t gmfn)
 }
 
 /* Remove an MFN from the list of out-of-sync guest pagetables */
-static void oos_hash_remove(struct vcpu *v, mfn_t gmfn)
+static void oos_hash_remove(struct domain *d, mfn_t gmfn)
 {
 int idx;
 mfn_t *oos;
-struct domain *d = v->domain;
+struct vcpu *v;
 
-SHADOW_PRINTK("%pv gmfn %lx\n", v, mfn_x(gmfn));
+SHADOW_PRINTK("d%d gmfn %lx\n", d->domain_id, mfn_x(gmfn));
 
 for_each_vcpu(d, v)
 {
@@ -813,12 +813,12 @@ static void oos_hash_remove(struct vcpu *v, mfn_t gmfn)
 BUG();
 }
 
-mfn_t oos_snapshot_lookup(struct vcpu *v, mfn_t gmfn)
+mfn_t oos_snapshot_lookup(struct domain *d, mfn_t gmfn)
 {
 int idx;
 mfn_t *oos;
 mfn_t *oos_snapshot;
-struct domain *d = v->domain;
+struct vcpu *v;
 
 for_each_vcpu(d, v)
 {
@@ -839,13 +839,13 @@ mfn_t oos_snapshot_lookup(struct vcpu *v, mfn_t gmfn)
 }
 
 /* Pull a single guest page back into sync */
-void sh_resync(struct vcpu *v, mfn_t gmfn)
+void sh_resync(struct domain *d, mfn_t gmfn)
 {
 int idx;
 mfn_t *oos;
 mfn_t *oos_snapshot;
 struct oos_fixup *oos_fixup;
-struct domain *d = v->domain;
+struct vcpu *v;
 
 for_each_vcpu(d, v)
 {
@@ -1000,7 +1000,7 @@ void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned 
int type)
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 /* Is the page already shadowed and out of sync? */
 if ( page_is_out_of_sync(page) )
-sh_resync(v, gmfn);
+sh_resync(d, gmfn);
 #endif
 
 /* We should never try to promote a gmfn that has writeable mappings */
@@ -1019,6 +1019,7 @@ void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned 
int type)
 
 void shadow_demote(struct vcpu *v, mfn_t gmfn, u32 type)
 {
+struct domain *d = v->domain;
 struct page_info *page = mfn_to_page(gmfn);
 
 ASSERT(test_bit(_PGC_page_table, &page->count_info));
@@ -1032,7 +1033,7 @@ void shadow_demote(struct vcpu *v, mfn_t gmfn, u32 type)
 /* Was the page out of sync? */
 if ( page_is_out_of_sync(page) )
 {
-oos_hash_remove(v, gmfn);
+oos_hash_remove(d, gmfn);
 }
 #endif
 clear_bit(_PGC_page_table, &page->count_info);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index ea3b520..82759a6 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -278,6 +278,11 @@ shadow_check_gl1e(struct vcpu *v, walk_t *gw)
 static inline uint32_t
 gw_remove_write_accesses(struct vcpu *v, unsigned long va, walk_t *gw)
 {
+#if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
+struct domain *d = v->domain;
+#endif
+#endif
 uint32_t rc = 0;
 
 #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
@@ -285,7 +290,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, 
walk_t *gw)
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 if ( mfn_is_out_of_sync(gw->l3mfn) )
 {
-sh_resync(v, gw->l3mfn);
+sh_resync(d, gw->l3mfn);
 rc = GW_RMWR_REWALK;
 }
 else
@@ -297,7 +302,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, 
walk_t *gw)
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 if ( mfn_is_out_of_sync(gw->l2mfn) )
 {
-sh_resync(v, gw->l2mfn);
+sh_resync(d, gw->l2mfn);
 rc |= GW_RMWR_REWALK;
 }
 else
@@ -1030,7 +1035,7 @@ static int shadow_set_l2e(struct vcpu *v,
OOS. */
 if ( (sp->u.sh.type != SH_type_fl1_shadow) && mfn_valid(gl1mfn)
  && mfn_is_out_of_sync(gl1mfn) )
-sh_resync(v, gl1mfn);
+sh_resync(d, gl1mfn);
 }
 #endif
 #if GUEST_PAGING_LEVELS == 2
@@ -1178,7 +1183,7 @@ static int shadow_set_l1e(struct vcpu *v,
 if ( mfn_valid(new_gmfn) && mfn_oos_may_write(new_gmfn)
  && ((shadow_l1e_get_flags(new_sl1e) & (_PAGE_RW|_PAGE_PRESENT))
  == (_PAGE_RW|_PAGE_PRESENT)) )
-oos_fixup_add(v, new_gmfn, sl1mfn, pgentry_ptr

[Xen-devel] [PATCH 05/20] x86/shadow: Alter shadow_hash_{lookup, insert, delete}() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   11 ---
 xen/arch/x86/mm/shadow/multi.c   |   22 +-
 xen/arch/x86/mm/shadow/private.h |6 +++---
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 26dab30..80174df 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1878,11 +1878,10 @@ static void shadow_hash_teardown(struct domain *d)
 }
 
 
-mfn_t shadow_hash_lookup(struct vcpu *v, unsigned long n, unsigned int t)
+mfn_t shadow_hash_lookup(struct domain *d, unsigned long n, unsigned int t)
 /* Find an entry in the hash table.  Returns the MFN of the shadow,
  * or INVALID_MFN if it doesn't exist */
 {
-struct domain *d = v->domain;
 struct page_info *sp, *prev;
 key_t key;
 
@@ -1932,11 +1931,10 @@ mfn_t shadow_hash_lookup(struct vcpu *v, unsigned long 
n, unsigned int t)
 return _mfn(INVALID_MFN);
 }
 
-void shadow_hash_insert(struct vcpu *v, unsigned long n, unsigned int t,
+void shadow_hash_insert(struct domain *d, unsigned long n, unsigned int t,
 mfn_t smfn)
 /* Put a mapping (n,t)->smfn into the hash table */
 {
-struct domain *d = v->domain;
 struct page_info *sp;
 key_t key;
 
@@ -1958,11 +1956,10 @@ void shadow_hash_insert(struct vcpu *v, unsigned long 
n, unsigned int t,
 sh_hash_audit_bucket(d, key);
 }
 
-void shadow_hash_delete(struct vcpu *v, unsigned long n, unsigned int t,
+void shadow_hash_delete(struct domain *d, unsigned long n, unsigned int t,
 mfn_t smfn)
 /* Excise the mapping (n,t)->smfn from the hash table */
 {
-struct domain *d = v->domain;
 struct page_info *sp, *x;
 key_t key;
 
@@ -2611,7 +2608,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int 
fast, int all)
 if( !(pg->count_info & PGC_page_table)  \
 || !(pg->shadow_flags & (1 << t)) ) \
 break;  \
-smfn = shadow_hash_lookup(v, mfn_x(gmfn), t);   \
+smfn = shadow_hash_lookup(d, mfn_x(gmfn), t);   \
 if ( unlikely(!mfn_valid(smfn)) )   \
 {   \
 SHADOW_ERROR(": gmfn %#lx has flags %#"PRIx32   \
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index f532bff..1e6bc33 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -94,7 +94,8 @@ static inline mfn_t
 get_fl1_shadow_status(struct vcpu *v, gfn_t gfn)
 /* Look for FL1 shadows in the hash table */
 {
-mfn_t smfn = shadow_hash_lookup(v, gfn_x(gfn), SH_type_fl1_shadow);
+struct domain *d = v->domain;
+mfn_t smfn = shadow_hash_lookup(d, gfn_x(gfn), SH_type_fl1_shadow);
 ASSERT(!mfn_valid(smfn) || mfn_to_page(smfn)->u.sh.head);
 return smfn;
 }
@@ -103,7 +104,8 @@ static inline mfn_t
 get_shadow_status(struct vcpu *v, mfn_t gmfn, u32 shadow_type)
 /* Look for shadows in the hash table */
 {
-mfn_t smfn = shadow_hash_lookup(v, mfn_x(gmfn), shadow_type);
+struct domain *d = v->domain;
+mfn_t smfn = shadow_hash_lookup(d, mfn_x(gmfn), shadow_type);
 ASSERT(!mfn_valid(smfn) || mfn_to_page(smfn)->u.sh.head);
 perfc_incr(shadow_get_shadow_status);
 return smfn;
@@ -113,11 +115,12 @@ static inline void
 set_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t smfn)
 /* Put an FL1 shadow into the hash table */
 {
+struct domain *d = v->domain;
 SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%05lx\n",
gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
 
 ASSERT(mfn_to_page(smfn)->u.sh.head);
-shadow_hash_insert(v, gfn_x(gfn), SH_type_fl1_shadow, smfn);
+shadow_hash_insert(d, gfn_x(gfn), SH_type_fl1_shadow, smfn);
 }
 
 static inline void
@@ -140,17 +143,18 @@ set_shadow_status(struct vcpu *v, mfn_t gmfn, u32 
shadow_type, mfn_t smfn)
 ASSERT(res == 1);
 }
 
-shadow_hash_insert(v, mfn_x(gmfn), shadow_type, smfn);
+shadow_hash_insert(d, mfn_x(gmfn), shadow_type, smfn);
 }
 
 static inline void
 delete_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t smfn)
 /* Remove a shadow from the hash table */
 {
+struct domain *d = v->domain;
 SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%05lx\n",
gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
 ASSERT(mfn_to_page(smfn)->u.sh.head);
-shadow_hash_delete(v, gfn_x(gfn), SH_type_fl1_shadow, smfn);
+shadow_hash_delete(d, gfn_x(gfn), SH_type_fl1_shadow, smfn);
 }
 
 static inline void
@@ -162,7 +166,7 @@ delete_shadow_status(struct vcpu *v, mfn_t gmfn, u32 
shadow_type, mfn_t smfn)
d->domain_id, v->vcpu_id,
mfn_x(gm

[Xen-devel] [PATCH 06/20] x86/shadow: Alter *_shadow_status() and make_fl1_shadow() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/multi.c |   99 +++-
 1 file changed, 48 insertions(+), 51 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 1e6bc33..154274f 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -91,20 +91,18 @@ static char *fetch_type_names[] = {
  */
 
 static inline mfn_t
-get_fl1_shadow_status(struct vcpu *v, gfn_t gfn)
+get_fl1_shadow_status(struct domain *d, gfn_t gfn)
 /* Look for FL1 shadows in the hash table */
 {
-struct domain *d = v->domain;
 mfn_t smfn = shadow_hash_lookup(d, gfn_x(gfn), SH_type_fl1_shadow);
 ASSERT(!mfn_valid(smfn) || mfn_to_page(smfn)->u.sh.head);
 return smfn;
 }
 
 static inline mfn_t
-get_shadow_status(struct vcpu *v, mfn_t gmfn, u32 shadow_type)
+get_shadow_status(struct domain *d, mfn_t gmfn, u32 shadow_type)
 /* Look for shadows in the hash table */
 {
-struct domain *d = v->domain;
 mfn_t smfn = shadow_hash_lookup(d, mfn_x(gmfn), shadow_type);
 ASSERT(!mfn_valid(smfn) || mfn_to_page(smfn)->u.sh.head);
 perfc_incr(shadow_get_shadow_status);
@@ -112,10 +110,9 @@ get_shadow_status(struct vcpu *v, mfn_t gmfn, u32 
shadow_type)
 }
 
 static inline void
-set_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t smfn)
+set_fl1_shadow_status(struct domain *d, gfn_t gfn, mfn_t smfn)
 /* Put an FL1 shadow into the hash table */
 {
-struct domain *d = v->domain;
 SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%05lx\n",
gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
 
@@ -124,15 +121,13 @@ set_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t 
smfn)
 }
 
 static inline void
-set_shadow_status(struct vcpu *v, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
+set_shadow_status(struct domain *d, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
 /* Put a shadow into the hash table */
 {
-struct domain *d = v->domain;
 int res;
 
-SHADOW_PRINTK("d=%d, v=%d, gmfn=%05lx, type=%08x, smfn=%05lx\n",
-   d->domain_id, v->vcpu_id, mfn_x(gmfn),
-   shadow_type, mfn_x(smfn));
+SHADOW_PRINTK("d=%d: gmfn=%lx, type=%08x, smfn=%lx\n",
+  d->domain_id, mfn_x(gmfn), shadow_type, mfn_x(smfn));
 
 ASSERT(mfn_to_page(smfn)->u.sh.head);
 
@@ -147,10 +142,9 @@ set_shadow_status(struct vcpu *v, mfn_t gmfn, u32 
shadow_type, mfn_t smfn)
 }
 
 static inline void
-delete_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t smfn)
+delete_fl1_shadow_status(struct domain *d, gfn_t gfn, mfn_t smfn)
 /* Remove a shadow from the hash table */
 {
-struct domain *d = v->domain;
 SHADOW_PRINTK("gfn=%"SH_PRI_gfn", type=%08x, smfn=%05lx\n",
gfn_x(gfn), SH_type_fl1_shadow, mfn_x(smfn));
 ASSERT(mfn_to_page(smfn)->u.sh.head);
@@ -158,13 +152,11 @@ delete_fl1_shadow_status(struct vcpu *v, gfn_t gfn, mfn_t 
smfn)
 }
 
 static inline void
-delete_shadow_status(struct vcpu *v, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
+delete_shadow_status(struct domain *d, mfn_t gmfn, u32 shadow_type, mfn_t smfn)
 /* Remove a shadow from the hash table */
 {
-struct domain *d = v->domain;
-SHADOW_PRINTK("d=%d, v=%d, gmfn=%05lx, type=%08x, smfn=%05lx\n",
-   d->domain_id, v->vcpu_id,
-   mfn_x(gmfn), shadow_type, mfn_x(smfn));
+SHADOW_PRINTK("d=%d: gmfn=%lx, type=%08x, smfn=%lx\n",
+  d->domain_id, mfn_x(gmfn), shadow_type, mfn_x(smfn));
 ASSERT(mfn_to_page(smfn)->u.sh.head);
 shadow_hash_delete(d, mfn_x(gmfn), shadow_type, smfn);
 /* 32-on-64 PV guests don't own their l4 pages; see set_shadow_status */
@@ -330,6 +322,7 @@ gw_remove_write_accesses(struct vcpu *v, unsigned long va, 
walk_t *gw)
  * through the audit mechanisms */
 static void sh_audit_gw(struct vcpu *v, walk_t *gw)
 {
+struct domain *d = v->domain;
 mfn_t smfn;
 
 if ( !(SHADOW_AUDIT_ENABLE) )
@@ -337,33 +330,33 @@ static void sh_audit_gw(struct vcpu *v, walk_t *gw)
 
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
 if ( mfn_valid(gw->l4mfn)
- && mfn_valid((smfn = get_shadow_status(v, gw->l4mfn,
+ && mfn_valid((smfn = get_shadow_status(d, gw->l4mfn,
 SH_type_l4_shadow))) )
 (void) sh_audit_l4_table(v, smfn, _mfn(INVALID_MFN));
 if ( mfn_valid(gw->l3mfn)
- && mfn_valid((smfn = get_shadow_status(v, gw->l3mfn,
+ && mfn_valid((smfn = get_shadow_status(d, gw->l3mfn,
 SH_type_l3_shadow))) )
 (void) sh_audit_l3_table(v, smfn, _mfn(INVALID_MFN));
 #endif /* PAE or 64... */
 if ( mfn_valid(gw->l2mfn) )
 {
-if ( mfn_valid((smfn = get_shadow_status(v, gw->l2mfn,
+if ( mfn_valid((smfn = get_shadow_status(d, gw->l2mfn,
  SH_type_l2_shadow))) )
 (voi

[Xen-devel] [PATCH 03/20] x86/shadow: Introduce 'd' pointers and clean up use of 'v->domain'

2015-02-12 Thread Andrew Cooper
All of the introduced domain pointers will eventually be removed, but doing
this mechanical cleanup here allows the subsequent patches which change
function prototypes to be smaller and clearer.

In addition, swap some use of is_pv_32on64_vcpu(v) for is_pv_32on64_domain(d).

No functional change.

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |   49 +++--
 xen/arch/x86/mm/shadow/multi.c   |  146 ++
 xen/arch/x86/mm/shadow/private.h |6 +-
 3 files changed, 116 insertions(+), 85 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 2e4954f..3b5ef19 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -665,6 +665,7 @@ void oos_fixup_add(struct vcpu *v, mfn_t gmfn,
 static int oos_remove_write_access(struct vcpu *v, mfn_t gmfn,
struct oos_fixup *fixup)
 {
+struct domain *d = v->domain;
 int ftlb = 0;
 
 ftlb |= oos_fixup_flush_gmfn(v, gmfn, fixup);
@@ -690,7 +691,7 @@ static int oos_remove_write_access(struct vcpu *v, mfn_t 
gmfn,
 }
 
 if ( ftlb )
-flush_tlb_mask(v->domain->domain_dirty_cpumask);
+flush_tlb_mask(d->domain_dirty_cpumask);
 
 return 0;
 }
@@ -991,6 +992,7 @@ int sh_unsync(struct vcpu *v, mfn_t gmfn)
  */
 void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned int type)
 {
+struct domain *d = v->domain;
 struct page_info *page = mfn_to_page(gmfn);
 
 ASSERT(mfn_valid(gmfn));
@@ -1004,7 +1006,7 @@ void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned 
int type)
 /* We should never try to promote a gmfn that has writeable mappings */
 ASSERT((page->u.inuse.type_info & PGT_type_mask) != PGT_writable_page
|| (page->u.inuse.type_info & PGT_count_mask) == 0
-   || v->domain->is_shutting_down);
+   || d->is_shutting_down);
 
 /* Is the page already shadowed? */
 if ( !test_and_set_bit(_PGC_page_table, &page->count_info) )
@@ -2056,6 +2058,7 @@ static void hash_vcpu_foreach(struct vcpu *v, unsigned 
int callback_mask,
 
 void sh_destroy_shadow(struct vcpu *v, mfn_t smfn)
 {
+struct domain *d = v->domain;
 struct page_info *sp = mfn_to_page(smfn);
 unsigned int t = sp->u.sh.type;
 
@@ -2068,9 +2071,8 @@ void sh_destroy_shadow(struct vcpu *v, mfn_t smfn)
t == SH_type_fl1_pae_shadow ||
t == SH_type_fl1_64_shadow  ||
t == SH_type_monitor_table  ||
-   (is_pv_32on64_vcpu(v) && t == SH_type_l4_64_shadow) ||
-   (page_get_owner(mfn_to_page(backpointer(sp)))
-== v->domain));
+   (is_pv_32on64_domain(d) && t == SH_type_l4_64_shadow) ||
+   (page_get_owner(mfn_to_page(backpointer(sp))) == d));
 
 /* The down-shifts here are so that the switch statement is on nice
  * small numbers that the compiler will enjoy */
@@ -2098,7 +2100,7 @@ void sh_destroy_shadow(struct vcpu *v, mfn_t smfn)
 SHADOW_INTERNAL_NAME(sh_destroy_l1_shadow, 4)(v, smfn);
 break;
 case SH_type_l2h_64_shadow:
-ASSERT(is_pv_32on64_vcpu(v));
+ASSERT(is_pv_32on64_domain(d));
 /* Fall through... */
 case SH_type_l2_64_shadow:
 SHADOW_INTERNAL_NAME(sh_destroy_l2_shadow, 4)(v, smfn);
@@ -2166,15 +2168,16 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 | SHF_L1_64
 | SHF_FL1_64
 ;
+struct domain *d = v->domain;
 struct page_info *pg = mfn_to_page(gmfn);
 
-ASSERT(paging_locked_by_me(v->domain));
+ASSERT(paging_locked_by_me(d));
 
 /* Only remove writable mappings if we are doing shadow refcounts.
  * In guest refcounting, we trust Xen to already be restricting
  * all the writes to the guest page tables, so we do not need to
  * do more. */
-if ( !shadow_mode_refcounts(v->domain) )
+if ( !shadow_mode_refcounts(d) )
 return 0;
 
 /* Early exit if it's already a pagetable, or otherwise not writeable */
@@ -2198,7 +2201,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 SHADOW_ERROR("can't remove write access to mfn %lx, type_info is %"
   PRtype_info "\n",
   mfn_x(gmfn), mfn_to_page(gmfn)->u.inuse.type_info);
-domain_crash(v->domain);
+domain_crash(d);
 }
 
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
@@ -2226,7 +2229,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 GUESS(0xC000UL + (fault_addr >> 10), 1);
 
 /* Linux lowmem: first 896MB is mapped 1-to-1 above 0xC000 */
-if ((gfn = mfn_to_gfn(v->domain, gmfn)) < 0x38000 )
+if ((gfn = mfn_to_gfn(d, gmfn)) < 0x38000 )
 GUESS(0xC000UL + (gfn << PAGE_SHIFT), 4);
 
 /* FreeBSD: Linear map at 0xBFC0 */
@@ -2244,7 +2247,7 @@ int sh_remove_write_access(struct vcpu *v, 

[Xen-devel] [PATCH 04/20] x86/shadow: Only apply shadow heuristics when in guest context

2015-02-12 Thread Andrew Cooper
It is incorrect to be applying these heuristics because of toolstack actions.

As the vcpu parameters are to be replaced with domain parameters, guest
context is identified by using current->domain.

Note that the majority of the heuristics in sh_remove_write_access() were
already restricted to guest context, but they are updated to use 'curr' for
clarity.
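
The guest-context test described above can be sketched in isolation as follows. This is only an illustration under simplifying assumptions: current_vcpu is a hypothetical global standing in for Xen's `current`, heuristic_vcpu is a hypothetical helper that does not exist in the patch, and the NULL check on the current vcpu exists only so the sketch is safe to run standalone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's struct domain / struct vcpu (hypothetical). */
struct domain { int domain_id; };
struct vcpu { int vcpu_id; struct domain *domain; };

/* Hypothetical stand-in for Xen's `current`: the vcpu running on this CPU. */
static struct vcpu *current_vcpu;

/*
 * Return the vcpu whose paging state may drive the heuristics for domain d,
 * or NULL when the caller is acting on d from outside (toolstack context).
 */
static struct vcpu *heuristic_vcpu(const struct domain *d)
{
    struct vcpu *curr = current_vcpu;

    assert(d != NULL);
    return (curr != NULL && curr->domain == d) ? curr : NULL;
}
```

When the running vcpu belongs to another domain, as it does for a toolstack-initiated action, the helper returns NULL and the caller would skip the heuristics and fall back to the brute-force search.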

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   21 +
 xen/arch/x86/mm/shadow/multi.c  |   13 +
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 3b5ef19..26dab30 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2170,6 +2170,9 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 ;
 struct domain *d = v->domain;
 struct page_info *pg = mfn_to_page(gmfn);
+#if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
+struct vcpu *curr = current;
+#endif
 
 ASSERT(paging_locked_by_me(d));
 
@@ -2205,7 +2208,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 }
 
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
-if ( v == current )
+if ( curr->domain == d )
 {
 unsigned long gfn;
 /* Heuristic: there is likely to be only one writeable mapping,
@@ -2213,7 +2216,8 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
  * in the guest's linear map (on non-HIGHPTE linux and windows)*/
 
 #define GUESS(_a, _h) do {  \
-if ( v->arch.paging.mode->shadow.guess_wrmap(v, (_a), gmfn) ) \
+if ( curr->arch.paging.mode->shadow.guess_wrmap(\
+ curr, (_a), gmfn) )\
 perfc_incr(shadow_writeable_h_ ## _h);  \
 if ( (pg->u.inuse.type_info & PGT_count_mask) == 0 )\
 {   \
@@ -2222,7 +2226,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 }   \
 } while (0)
 
-if ( v->arch.paging.mode->guest_levels == 2 )
+if ( curr->arch.paging.mode->guest_levels == 2 )
 {
 if ( level == 1 )
 /* 32bit non-PAE w2k3: linear map at 0xC000 */
@@ -2237,7 +2241,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 GUESS(0xBFC0UL
   + ((fault_addr & VADDR_MASK) >> 10), 6);
 }
-else if ( v->arch.paging.mode->guest_levels == 3 )
+else if ( curr->arch.paging.mode->guest_levels == 3 )
 {
 /* 32bit PAE w2k3: linear map at 0xC000 */
 switch ( level )
@@ -2259,7 +2263,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
   + ((fault_addr & VADDR_MASK) >> 18), 6); break;
 }
 }
-else if ( v->arch.paging.mode->guest_levels == 4 )
+else if ( curr->arch.paging.mode->guest_levels == 4 )
 {
 /* 64bit w2k3: linear map at 0xf680 */
 switch ( level )
@@ -2312,14 +2316,15 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
  * the writeable mapping by looking at the same MFN where the last
  * brute-force search succeeded. */
 
-if ( v->arch.paging.shadow.last_writeable_pte_smfn != 0 )
+if ( (curr->domain == d) &&
+ (curr->arch.paging.shadow.last_writeable_pte_smfn != 0) )
 {
 unsigned long old_count = (pg->u.inuse.type_info & PGT_count_mask);
-mfn_t last_smfn = _mfn(v->arch.paging.shadow.last_writeable_pte_smfn);
+mfn_t last_smfn = 
_mfn(curr->arch.paging.shadow.last_writeable_pte_smfn);
 int shtype = mfn_to_page(last_smfn)->u.sh.type;
 
 if ( callbacks[shtype] )
-callbacks[shtype](v, last_smfn, gmfn);
+callbacks[shtype](curr, last_smfn, gmfn);
 
 if ( (pg->u.inuse.type_info & PGT_count_mask) != old_count )
 perfc_incr(shadow_writeable_h_5);
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index b538997..f532bff 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4185,6 +4185,8 @@ sh_update_cr3(struct vcpu *v, int do_locking)
 int sh_rm_write_access_from_sl1p(struct vcpu *v, mfn_t gmfn,
  mfn_t smfn, unsigned long off)
 {
+struct domain *d = v->domain;
+struct vcpu *curr = current;
 int r;
 shadow_l1e_t *sl1p, sl1e;
 struct page_info *sp;
@@ -4193,9 +4195,9 @@ int sh_rm_write_access_from_sl1p(struct vcpu *v, mfn_t 
gmfn,
 ASSERT(mfn_valid(smfn));
 
 /* Remember if we've been told that this process is being torn down */
-v->arch.paging.shadow.pagetable_dying
-= !!(mfn_to_page(gmfn)->sh

[Xen-devel] [PATCH 07/20] x86/shadow: Alter sh_type_{is_pinnable, has_up_pointer}() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |7 ---
 xen/arch/x86/mm/shadow/multi.c   |   10 +-
 xen/arch/x86/mm/shadow/private.h |   18 ++
 3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 80174df..bdb19fb 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2472,6 +2472,7 @@ static int sh_remove_shadow_via_pointer(struct vcpu *v, 
mfn_t smfn)
 /* Follow this shadow's up-pointer, if it has one, and remove the reference
  * found there.  Returns 1 if that was the only reference to this shadow */
 {
+struct domain *d = v->domain;
 struct page_info *sp = mfn_to_page(smfn);
 mfn_t pmfn;
 void *vaddr;
@@ -2479,7 +2480,7 @@ static int sh_remove_shadow_via_pointer(struct vcpu *v, 
mfn_t smfn)
 
 ASSERT(sp->u.sh.type > 0);
 ASSERT(sp->u.sh.type < SH_type_max_shadow);
-ASSERT(sh_type_has_up_pointer(v, sp->u.sh.type));
+ASSERT(sh_type_has_up_pointer(d, sp->u.sh.type));
 
 if (sp->up == 0) return 0;
 pmfn = _mfn(sp->up >> PAGE_SHIFT);
@@ -2616,9 +2617,9 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int 
fast, int all)
  mfn_x(gmfn), (uint32_t)pg->shadow_flags, t);   \
 break;  \
 }   \
-if ( sh_type_is_pinnable(v, t) )\
+if ( sh_type_is_pinnable(d, t) )\
 sh_unpin(v, smfn);  \
-else if ( sh_type_has_up_pointer(v, t) )\
+else if ( sh_type_has_up_pointer(d, t) )\
 sh_remove_shadow_via_pointer(v, smfn);  \
 if( !fast   \
 && (pg->count_info & PGC_page_table)\
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 154274f..ea3b520 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -903,7 +903,7 @@ static int shadow_set_l4e(struct vcpu *v,
 mfn_t sl3mfn = shadow_l4e_get_mfn(new_sl4e);
 ok = sh_get_ref(v, sl3mfn, paddr);
 /* Are we pinning l3 shadows to handle wierd linux behaviour? */
-if ( sh_type_is_pinnable(v, SH_type_l3_64_shadow) )
+if ( sh_type_is_pinnable(d, SH_type_l3_64_shadow) )
 ok |= sh_pin(v, sl3mfn);
 if ( !ok )
 {
@@ -1501,7 +1501,7 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 
shadow_type)
 SHADOW_DEBUG(MAKE_SHADOW, "(%05lx, %u)=>%05lx\n",
   mfn_x(gmfn), shadow_type, mfn_x(smfn));
 
-if ( sh_type_has_up_pointer(v, shadow_type) )
+if ( sh_type_has_up_pointer(d, shadow_type) )
 /* Lower-level shadow, not yet linked form a higher level */
 mfn_to_page(smfn)->up = 0;
 
@@ -2367,7 +2367,7 @@ int sh_safe_not_to_sync(struct vcpu *v, mfn_t gl1mfn)
 struct page_info *sp;
 mfn_t smfn;
 
-if ( !sh_type_has_up_pointer(v, SH_type_l1_shadow) )
+if ( !sh_type_has_up_pointer(d, SH_type_l1_shadow) )
 return 0;
 
 smfn = get_shadow_status(d, gl1mfn, SH_type_l1_shadow);
@@ -2383,7 +2383,7 @@ int sh_safe_not_to_sync(struct vcpu *v, mfn_t gl1mfn)
 #if (SHADOW_PAGING_LEVELS == 4)
 /* up to l3 */
 sp = mfn_to_page(smfn);
-ASSERT(sh_type_has_up_pointer(v, SH_type_l2_shadow));
+ASSERT(sh_type_has_up_pointer(d, SH_type_l2_shadow));
 if ( sp->u.sh.count != 1 || !sp->up )
 return 0;
 smfn = _mfn(sp->up >> PAGE_SHIFT);
@@ -2392,7 +2392,7 @@ int sh_safe_not_to_sync(struct vcpu *v, mfn_t gl1mfn)
 /* up to l4 */
 sp = mfn_to_page(smfn);
 if ( sp->u.sh.count != 1
- || !sh_type_has_up_pointer(v, SH_type_l3_64_shadow) || !sp->up )
+ || !sh_type_has_up_pointer(d, SH_type_l3_64_shadow) || !sp->up )
 return 0;
 smfn = _mfn(sp->up >> PAGE_SHIFT);
 ASSERT(mfn_valid(smfn));
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index df1dd8c..8c06775 100644
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -195,7 +195,7 @@ extern void shadow_audit_tables(struct vcpu *v);
  * What counts as a pinnable shadow?
  */
 
-static inline int sh_type_is_pinnable(struct vcpu *v, unsigned int t)
+static inline int sh_type_is_pinnable(struct domain *d, unsigned int t)
 {
 /* Top-level shadow types in each mode can be pinned, so that they
  * persist even when not currently in use in a guest CR3 */
@@ -211,7 +211,7 @@ static inline int sh_type_is_pinnable(struct vcpu *v, 
unsigned int t)
  * page.  When we're shadowing those kernels, we have to pin l3
  * shadows so they don't just eva

[Xen-devel] [PATCH 09/20] x86/shadow: Alter shadow_{pro, de}mote() to take a domain

2015-02-12 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c  |6 ++
 xen/arch/x86/mm/shadow/multi.c   |   10 +-
 xen/arch/x86/mm/shadow/private.h |4 ++--
 3 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 6945dfe..c6b8e6f 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -990,9 +990,8 @@ int sh_unsync(struct vcpu *v, mfn_t gmfn)
  * involves making sure there are no writable mappings available to the guest
  * for this page.
  */
-void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned int type)
+void shadow_promote(struct domain *d, mfn_t gmfn, unsigned int type)
 {
-struct domain *d = v->domain;
 struct page_info *page = mfn_to_page(gmfn);
 
 ASSERT(mfn_valid(gmfn));
@@ -1017,9 +1016,8 @@ void shadow_promote(struct vcpu *v, mfn_t gmfn, unsigned 
int type)
 TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_PROMOTE);
 }
 
-void shadow_demote(struct vcpu *v, mfn_t gmfn, u32 type)
+void shadow_demote(struct domain *d, mfn_t gmfn, u32 type)
 {
-struct domain *d = v->domain;
 struct page_info *page = mfn_to_page(gmfn);
 
 ASSERT(test_bit(_PGC_page_table, &page->count_info));
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 82759a6..f2dea16 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1563,7 +1563,7 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 
shadow_type)
 }
 }
 
-shadow_promote(v, gmfn, shadow_type);
+shadow_promote(d, gmfn, shadow_type);
 set_shadow_status(d, gmfn, shadow_type, smfn);
 
 return smfn;
@@ -1898,7 +1898,7 @@ void sh_destroy_l4_shadow(struct vcpu *v, mfn_t smfn)
 /* Record that the guest page isn't shadowed any more (in this type) */
 gmfn = backpointer(sp);
 delete_shadow_status(d, gmfn, t, smfn);
-shadow_demote(v, gmfn, t);
+shadow_demote(d, gmfn, t);
 /* Decrement refcounts of all the old entries */
 sl4mfn = smfn;
 SHADOW_FOREACH_L4E(sl4mfn, sl4e, 0, 0, d, {
@@ -1930,7 +1930,7 @@ void sh_destroy_l3_shadow(struct vcpu *v, mfn_t smfn)
 /* Record that the guest page isn't shadowed any more (in this type) */
 gmfn = backpointer(sp);
 delete_shadow_status(d, gmfn, t, smfn);
-shadow_demote(v, gmfn, t);
+shadow_demote(d, gmfn, t);
 
 /* Decrement refcounts of all the old entries */
 sl3mfn = smfn;
@@ -1968,7 +1968,7 @@ void sh_destroy_l2_shadow(struct vcpu *v, mfn_t smfn)
 /* Record that the guest page isn't shadowed any more (in this type) */
 gmfn = backpointer(sp);
 delete_shadow_status(d, gmfn, t, smfn);
-shadow_demote(v, gmfn, t);
+shadow_demote(d, gmfn, t);
 
 /* Decrement refcounts of all the old entries */
 sl2mfn = smfn;
@@ -2005,7 +2005,7 @@ void sh_destroy_l1_shadow(struct vcpu *v, mfn_t smfn)
 {
 mfn_t gmfn = backpointer(sp);
 delete_shadow_status(d, gmfn, t, smfn);
-shadow_demote(v, gmfn, t);
+shadow_demote(d, gmfn, t);
 }
 
 if ( shadow_mode_refcounts(d) )
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index 7abb0e0..3820d9e 100644
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -350,8 +350,8 @@ void  shadow_hash_delete(struct domain *d,
  unsigned long n, unsigned int t, mfn_t smfn);
 
 /* shadow promotion */
-void shadow_promote(struct vcpu *v, mfn_t gmfn, u32 type);
-void shadow_demote(struct vcpu *v, mfn_t gmfn, u32 type);
+void shadow_promote(struct domain *d, mfn_t gmfn, u32 type);
+void shadow_demote(struct domain *d, mfn_t gmfn, u32 type);
 
 /* Shadow page allocation functions */
 void  shadow_prealloc(struct domain *d, u32 shadow_type, unsigned int count);
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 02/20] x86/shadow: Rename hash_foreach() to hash_vcpu_foreach()

2015-02-12 Thread Andrew Cooper
A later change requires the introduction of a domain variant.

Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 
---
 xen/arch/x86/mm/shadow/common.c |   31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 502e0d8..2e4954f 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1999,12 +1999,11 @@ void shadow_hash_delete(struct vcpu *v, unsigned long 
n, unsigned int t,
 sh_hash_audit_bucket(d, key);
 }
 
-typedef int (*hash_callback_t)(struct vcpu *v, mfn_t smfn, mfn_t other_mfn);
+typedef int (*hash_vcpu_callback_t)(struct vcpu *v, mfn_t smfn, mfn_t 
other_mfn);
 
-static void hash_foreach(struct vcpu *v,
- unsigned int callback_mask,
- const hash_callback_t callbacks[],
- mfn_t callback_mfn)
+static void hash_vcpu_foreach(struct vcpu *v, unsigned int callback_mask,
+  const hash_vcpu_callback_t callbacks[],
+  mfn_t callback_mfn)
 /* Walk the hash table looking at the types of the entries and
  * calling the appropriate callback function for each entry.
  * The mask determines which shadow types we call back for, and the array
@@ -2140,7 +2139,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
unsigned long fault_addr)
 {
 /* Dispatch table for getting per-type functions */
-static const hash_callback_t callbacks[SH_type_unused] = {
+static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* l1_32   */
 SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* fl1_32  */
@@ -2334,7 +2333,7 @@ int sh_remove_write_access(struct vcpu *v, mfn_t gmfn,
 perfc_incr(shadow_writeable_bf_1);
 else
 perfc_incr(shadow_writeable_bf);
-hash_foreach(v, callback_mask, callbacks, gmfn);
+hash_vcpu_foreach(v, callback_mask, callbacks, gmfn);
 
 /* If that didn't catch the mapping, then there's some non-pagetable
  * mapping -- ioreq page, grant mapping, &c. */
@@ -2390,7 +2389,7 @@ static int sh_remove_all_mappings(struct vcpu *v, mfn_t 
gmfn)
 struct page_info *page = mfn_to_page(gmfn);
 
 /* Dispatch table for getting per-type functions */
-static const hash_callback_t callbacks[SH_type_unused] = {
+static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, 2), /* l1_32   */
 SHADOW_INTERNAL_NAME(sh_rm_mappings_from_l1, 2), /* fl1_32  */
@@ -2432,7 +2431,7 @@ static int sh_remove_all_mappings(struct vcpu *v, mfn_t 
gmfn)
 
 /* Brute-force search of all the shadows, by walking the hash */
 perfc_incr(shadow_mappings_bf);
-hash_foreach(v, callback_mask, callbacks, gmfn);
+hash_vcpu_foreach(v, callback_mask, callbacks, gmfn);
 
 /* If that didn't catch the mapping, something is very wrong */
 if ( !sh_check_page_has_no_refs(page) )
@@ -2533,7 +2532,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int 
fast, int all)
 
 /* Dispatch table for getting per-type functions: each level must
  * be called with the function to remove a lower-level shadow. */
-static const hash_callback_t callbacks[SH_type_unused] = {
+static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
 NULL, /* none*/
 NULL, /* l1_32   */
 NULL, /* fl1_32  */
@@ -2594,7 +2593,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int 
fast, int all)
 perfc_incr(shadow_unshadow);
 
 /* Lower-level shadows need to be excised from upper-level shadows.
- * This call to hash_foreach() looks dangerous but is in fact OK: each
+ * This call to hash_vcpu_foreach() looks dangerous but is in fact OK: each
  * call will remove at most one shadow, and terminate immediately when
  * it does remove it, so we never walk the hash after doing a deletion.  */
 #define DO_UNSHADOW(_type) do { \
@@ -2617,7 +2616,7 @@ void sh_remove_shadows(struct vcpu *v, mfn_t gmfn, int 
fast, int all)
 if( !fast   \
 && (pg->count_info & PGC_page_table)\
 && (pg->shadow_flags & (1 << t)) )  \
-hash_foreach(v, masks[t], callbacks, smfn); \
+hash_vcpu_foreach(v, masks[t], callbacks, smfn);\
 } while (0)
 
 DO_UNSHADOW(SH_type_l2_32_shadow);
@@ -2678,7 +2677,7 @@ static int sh_clear_up_pointer(struct vcpu *v, mfn_t 
smfn, mfn_t unused)
 
 void sh_reset_l3_up_pointers(struct vcpu *v)
 {
-static const hash_callback_t callbacks[SH_type_unused] = {
+static const hash_vcpu_callb

Re: [Xen-devel] stubdom vtpm build failure in staging

2015-02-12 Thread Xu, Quan
Sorry about that. From the other email thread, it looks like some maintainers
are already working on this issue.
I am working on the 'Xen stubdom vTPM for HVM virtual machine' v4 patches,
which contain a lot of modifications.

I will be out of the office from Feb. 16th to Feb. 26th for Chinese New Year.
I plan to submit the v4 patches before Feb. 16th and fix this issue after
Feb. 26th.

--Quan


> -Original Message-
> From: Olaf Hering [mailto:o...@aepfle.de]
> Sent: Wednesday, February 11, 2015 11:21 PM
> To: Xu, Quan
> Cc: xen-devel@lists.xen.org
> Subject: Re: [Xen-devel] stubdom vtpm build failure in staging
> 
> On Wed, Jan 28, Xu, Quan wrote:
> 
> > Thanks, I will check and fix it tomorrow. It is 23:12 PM Pacific time now.
> 
> Any progress?
> These typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
> compilers do not cope with current staging:
> 
> # for i in `grep -w typedef stubdom/vtpmmgr/tcg.h | sed -n '/;/{s@^.*
> @@;s@;@@p}'` # do
> # if test -n "`git grep -wn $i|grep -w typedef|grep -v
> stubdom/vtpmmgr/tcg.h`"
> # then
> # echo $i
> # fi
> # done
> 
> BYTE
> BOOL
> UINT16
> UINT32
> UINT64
> TPM_HANDLE
> TPM_ALGORITHM_ID
> 
> TPMI_RH_HIERARCHY_AUTH and TPM_ALG_ID are defined twice in the same
> file.
> 
> This change works for me:
> 
> ---
>  stubdom/vtpmmgr/odd_types.h  | 11 +++
>  stubdom/vtpmmgr/tcg.h|  9 +
>  stubdom/vtpmmgr/tpm2_types.h | 11 +--
>  3 files changed, 13 insertions(+), 18 deletions(-)  create mode 100644
> stubdom/vtpmmgr/odd_types.h
> 
> diff --git a/stubdom/vtpmmgr/odd_types.h b/stubdom/vtpmmgr/odd_types.h
> new file mode 100644 index 000..d72da9b
> --- /dev/null
> +++ b/stubdom/vtpmmgr/odd_types.h
> @@ -0,0 +1,11 @@
> +#ifndef VTPM_ODD_TYPES
> +#define VTPM_ODD_TYPES 1
> +typedef unsigned char BYTE;
> +typedef unsigned char BOOL;
> +typedef uint16_t UINT16;
> +typedef uint32_t UINT32;
> +typedef uint64_t UINT64;
> +typedef UINT32 TPM_HANDLE;
> +typedef UINT32 TPM_ALGORITHM_ID;
> +#endif
> +
> diff --git a/stubdom/vtpmmgr/tcg.h b/stubdom/vtpmmgr/tcg.h index
> 7321ec6..cac1bbc 100644
> --- a/stubdom/vtpmmgr/tcg.h
> +++ b/stubdom/vtpmmgr/tcg.h
> @@ -401,16 +401,10 @@
> 
> 
>  // *** TYPEDEFS
> * -typedef unsigned char BYTE;
> -typedef unsigned char BOOL; -typedef uint16_t UINT16; -typedef uint32_t
> UINT32; -typedef uint64_t UINT64;
> -
> +#include "odd_types.h"
>  typedef UINT32 TPM_RESULT;
>  typedef UINT32 TPM_PCRINDEX;
>  typedef UINT32 TPM_DIRINDEX;
> -typedef UINT32 TPM_HANDLE;
>  typedef TPM_HANDLE TPM_AUTHHANDLE;
>  typedef TPM_HANDLE TCPA_HASHHANDLE;
>  typedef TPM_HANDLE TCPA_HMACHANDLE;
> @@ -422,7 +416,6 @@ typedef UINT32 TPM_COMMAND_CODE;  typedef
> UINT16 TPM_PROTOCOL_ID;  typedef BYTE TPM_AUTH_DATA_USAGE;
> typedef UINT16 TPM_ENTITY_TYPE; -typedef UINT32 TPM_ALGORITHM_ID;
> typedef UINT16 TPM_KEY_USAGE;  typedef UINT16 TPM_STARTUP_TYPE;
> typedef UINT32 TPM_CAPABILITY_AREA; diff --git
> a/stubdom/vtpmmgr/tpm2_types.h b/stubdom/vtpmmgr/tpm2_types.h index
> ac2830d..63564cd 100644
> --- a/stubdom/vtpmmgr/tpm2_types.h
> +++ b/stubdom/vtpmmgr/tpm2_types.h
> @@ -83,12 +83,8 @@
>  #defineMAX_ECC_KEY_BYTES((MAX_ECC_KEY_BITS + 7) / 8)
> 
> 
> -typedef unsigned char BYTE;
> -typedef unsigned char BOOL;
> +#include "odd_types.h"
>  typedef uint8_t   UINT8;
> -typedef uint16_t  UINT16;
> -typedef uint32_t  UINT32;
> -typedef uint64_t  UINT64;
> 
>  // TPM2 command code
> 
> @@ -216,7 +212,6 @@ typedef UINT16 TPM_ST;
> 
> 
>  // TPM Handle types
> -typedef UINT32 TPM_HANDLE;
>  typedef UINT8 TPM_HT;
> 
> 
> @@ -233,7 +228,6 @@ typedef UINT32 TPM_RH;
>  #defineTPM_RH_LAST   (TPM_RH)(0x400C)
> 
>  // Table 4 -- DocumentationClarity Types 
> -typedef UINT32TPM_ALGORITHM_ID;
>  typedef UINT32TPM_MODIFIER_INDICATOR;
>  typedef UINT32TPM_SESSION_OFFSET;
>  typedef UINT16TPM_KEY_SIZE;
> @@ -261,8 +255,6 @@ typedef BYTE TPMA_LOCALITY;  // Table 37 --
> TPMI_YES_NO Type   typedef BYTE TPMI_YES_NO;
> 
> -typedef TPM_HANDLE TPMI_RH_HIERARCHY_AUTH;
> -
>  // Table 38 -- TPMI_DH_OBJECT Type   typedef TPM_HANDLE
> TPMI_DH_OBJECT;
> 
> @@ -304,7 +296,6 @@ typedef TPM_HANDLE TPMI_RH_LOCKOUT;
> 
>  // Table 7 -- TPM_ALG_ID
>  typedef UINT16 TPM_ALG_ID;
> -typedef UINT16 TPM_ALG_ID;
> 
>  #defineTPM2_ALG_ERROR (TPM_ALG_ID)(0x) // a: ; D:
>  #defineTPM2_ALG_FIRST (TPM_ALG_ID)(0x0001) // a: ; D:
> 
> Olaf


Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Oleg Nesterov
On 02/11, Jeremy Fitzhardinge wrote:
>
> On 02/11/2015 09:24 AM, Oleg Nesterov wrote:
> > I agree, and I have to admit I am not sure I fully understand why
> > unlock uses the locked add. Except we need a barrier to avoid the race
> > with the enter_slowpath() users, of course. Perhaps this is the only
> > reason?
>
> Right now it needs to be a locked operation to prevent read-reordering.
> x86 memory ordering rules state that all writes are seen in a globally
> consistent order, and are globally ordered wrt reads *on the same
> addresses*, but reads to different addresses can be reordered wrt to writes.
>
> So, if the unlocking add were not a locked operation:
>
> __add(&lock->tickets.head, TICKET_LOCK_INC);  /* not locked */
>
> if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
> __ticket_unlock_slowpath(lock, prev);
>
> Then the read of lock->tickets.tail can be reordered before the unlock,
> which introduces a race:

Yes, yes, thanks, but this is what I meant. We need a barrier. Even if
"Every store is a release" as Linus mentioned.

> This *might* be OK, but I think it's on dubious ground:
>
> __add(&lock->tickets.head, TICKET_LOCK_INC);  /* not locked */
>
>   /* read overlaps write, and so is ordered */
> if (unlikely(lock->head_tail & (TICKET_SLOWPATH_FLAG << TICKET_SHIFT))
> __ticket_unlock_slowpath(lock, prev);
>
> because I think Intel and AMD differed in interpretation about how
> overlapping but different-sized reads & writes are ordered (or it simply
> isn't architecturally defined).

Can't comment, I simply do not know how the hardware works.

> If the slowpath flag is moved to head, then it would always have to be
> locked anyway, because it needs to be atomic against other CPU's RMW
> operations setting the flag.

Yes, this is true.

But again, if we want to avoid the read-after-unlock, we need to update
this lock and read SLOWPATH atomically, it seems that we can't avoid the
locked insn.

Oleg.
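
For readers following the ordering argument, here is a minimal user-space
sketch in C11 atomics, not the kernel code. The struct, the function name and
the two constants are hypothetical stand-ins for the ticket lock fields and for
TICKET_LOCK_INC / TICKET_SLOWPATH_FLAG. It only illustrates the point made
above: the increment of head is a sequentially consistent read-modify-write
(which, as far as I know, compilers emit as a LOCK-prefixed instruction on
x86), so the later read of tail cannot be satisfied before it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values, chosen only so that the flag bit and the
 * increment do not overlap. */
#define SKETCH_LOCK_INC      2u
#define SKETCH_SLOWPATH_FLAG 1u

/* Hypothetical, much reduced stand-in for the ticket lock. */
struct sketch_ticketlock {
    _Atomic uint16_t head;
    _Atomic uint16_t tail;
};

/*
 * Release the lock and report whether the slow path would be needed.
 * The fetch-add is a full barrier, so the load of tail below is
 * ordered after the store to head.
 */
static bool sketch_unlock(struct sketch_ticketlock *lock)
{
    uint16_t tail;

    assert(lock);

    atomic_fetch_add_explicit(&lock->head, SKETCH_LOCK_INC,
                              memory_order_seq_cst);

    tail = atomic_load_explicit(&lock->tail, memory_order_seq_cst);

    return (tail & SKETCH_SLOWPATH_FLAG) != 0;
}
```

Replacing the fetch-add with a plain store followed by a relaxed load would
model the "not locked" variant quoted above, where the read of tail may be
reordered before the unlock.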




Re: [Xen-devel] [PATCH v2] vsprintf: Make sure argument to %pX specifier is valid

2015-02-12 Thread Andrew Cooper
On 12/02/15 16:33, Boris Ostrovsky wrote:
> On 02/12/2015 10:48 AM, Andrew Cooper wrote:
>> On 12/02/15 15:38, Boris Ostrovsky wrote:
>>> On 02/12/2015 10:21 AM, Andrew Cooper wrote:
>>>> On 12/02/15 15:01, Boris Ostrovsky wrote:
>>>>> On 02/12/2015 06:04 AM, Andrew Cooper wrote:
>>>>>> On 11/02/15 20:58, Boris Ostrovsky wrote:
>>>>>>> If invalid pointer (i.e. something smaller than
>>>>>>> HYPERVISOR_VIRT_START)
>>>>>>> is passed for %*ph/%pv/%ps/%pS format specifiers then print
>>>>>>> "(NULL)"
>>>>>>>
>>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>>> ---
>>>>>>> xen/common/vsprintf.c |   23 ---
>>>>>>> 1 files changed, 16 insertions(+), 7 deletions(-)
>>>>>>>
>>>>>>> v2:
>>>>>>> * Print "(NULL)" instead of specifier-specific string
>>>>>>> * Consider all addresses under HYPERVISOR_VIRT_START as
>>>>>>> invalid.
>>>>>>> (I think
>>>>>>>   this is true for both x86 and ARM but I don't have ARM
>>>>>>> platform
>>>>>>> to test).
>>>>>>>
>>>>>>>
>>>>>>> diff --git a/xen/common/vsprintf.c b/xen/common/vsprintf.c
>>>>>>> index 065cc42..b9542b5 100644
>>>>>>> --- a/xen/common/vsprintf.c
>>>>>>> +++ b/xen/common/vsprintf.c
>>>>>>> @@ -270,6 +270,22 @@ static char *pointer(char *str, char *end,
>>>>>>> const char **fmt_ptr,
>>>>>>> const char *fmt = *fmt_ptr, *s;
>>>>>>>   /* Custom %p suffixes. See
>>>>>>> XEN_ROOT/docs/misc/printk-formats.txt */
>>>>>>> +
>>>>>>> +switch ( fmt[1] )
>>>>>>> +{
>>>>>>> +case 'h':
>>>>>>> +case 's':
>>>>>>> +case 'S':
>>>>>>> +case 'v':
>>>>>>> +++*fmt_ptr;
>>>>>>> +}
>>>>>>> +
>>>>>>> +if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>>>>>>> +{
>>>>>>> +char *s = "(NULL)";
>>>>>>> +return string(str, end, s, -1, -1, 0);
>>>>>>> +}
>>>>>>> +
>>>>>>> switch ( fmt[1] )
>>>>>> This wont function, as you have inverted the increment of
>>>>>> *fmt_ptr and
>>>>>> check of fmt[1].
>>>>> fmt value doesn't change, it is stashed at the top of the routine.
>>>> You are correct.  My apologies.  I however dislike the splitting of
>>>> the
>>>> switch into two.
>>>>
>>>>> (What *is* wrong in the above code is the fact that the arg test is
>>>>> done outside the switch. It should be part of the four case
>>>>> statements, otherwise we will print plain %p arguments as "NULL").
>>>>>
>>>>>
>>>>>> "(NULL)" is inappropriate for non-null pointers less than
>>>>>> VIRT_START.
>>>>> Yes, I thought about it after I sent it. "(invalid)"?
>>>> Better, but overriding the number with a string does hide information.
>>>> In the case that the pointer is invalid, it would be useful to see its
>>>> contents.
>>>
>>> How about "<0xXX>" (i.e. effectively replace "%pv" with "<%p>",
>>> with angle brackets indicating invalid pointer)?
>>>
>> It feels like change for change sake, especially as there is a perfectly
>> good hex decode for plain %p.
>>
>>>>>> Given the VIRT check, I would just put the entire switch statement
>>>>>> inside an "if ( (unsigned long)arg < HYPERVISOR_VIRT_START )" block
>>>>>> and
>>>>>> let it fall through to the plain number case for a bogus pointer.
>>>>> Not sure I understand what you are suggesting here, sorry.
>>>>>
>>>>> -boris
>>>> if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>>>> {
>>>>   switch ( fmt[1] )
>>>>   {
>>>>
>>>>   }
>>>> }
>>>>
>>>>
>>>> This makes the patch a whole 3 line addition and indenting the whole
>>>> switch block by 4 spaces.
>>> Still don't understand. This will never print anything unless it's a
>>> bad pointer, won't it?
>>>
>>> (And if you meant '>=' then we will simply print the invalid pointer
>>> in plain %p format. Which, btw, may be the solution but we will still
>>> need to bump fmt_ptr, so we again will need another switch or
>>> something to test for sub-specifier)
>> Oops - I did mean >=.  I.e. only do the custom %pX decoding in the case
>> that arg is a plausible pointer.
>>
>> There is no need I can see to alter the fmt_ptr handling.  The code
>> currently works, other than the issue at hand of falling over a bad
>> pointer.
>
> We do, otherwise we will be printing the sub-specifier. Here is
> example of what happens if we don't bump it:
>
> struct vcpu *badvcpu = NULL;
> printk("badvcpu: %pv current: %pv\n", badvcpu, current);
>
> console:
> badvcpu: v current: d0v0
>

Ah - I see what you mean now.

>
> Also, for %*ph format, if we just go with falling through to plain
> format and not marking somehow that we are printing a bad pointer:
>
> unsigned badval = 0xab;
> unsigned *badptr = &badval;
> printk("badptr = %*ph\n", 1, badptr);
>
> console:
> badptr = ab
>
> We don't know here whether badptr was pointing to 0xab or it itself
> was 0xab.

Ok.  As this is only for making debug code less fragile, it probably is
better to explicitly call out a bad pointer differently.

~Andrew
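
To make the direction of this exchange concrete, here is a minimal stand-alone
sketch in C, not the Xen pointer() routine. SKETCH_VIRT_START, struct
sketch_vcpu and sketch_pointer() are hypothetical names, and only the 'v'
sub-specifier is modelled. It shows the two points agreed above: the
sub-specifier is always consumed, and a pointer below the threshold is printed
as its own value in angle brackets instead of being dereferenced.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-in for HYPERVISOR_VIRT_START; the real value is
 * architecture specific. */
#define SKETCH_VIRT_START ((uintptr_t)0x1000)

/* Hypothetical, much reduced stand-in for struct vcpu. */
struct sketch_vcpu {
    int domain_id;
    int vcpu_id;
};

/*
 * *fmt_ptr points at the 'p' of a "%p" or "%pv" conversion.  On return
 * it points at the last character consumed.  Returns what snprintf()
 * returns.
 */
static int sketch_pointer(char *buf, size_t len,
                          const char **fmt_ptr, const void *arg)
{
    const char *fmt = *fmt_ptr;   /* stashed, so fmt[1] stays valid */
    uintptr_t val = (uintptr_t)arg;

    assert(buf && fmt && fmt[0] == 'p');

    if ( fmt[1] == 'v' )
    {
        /* Consume the sub-specifier even when the pointer is bad,
         * otherwise the 'v' would be echoed to the output. */
        ++*fmt_ptr;

        /* Bad pointer: show its value, marked with angle brackets. */
        if ( val < SKETCH_VIRT_START )
            return snprintf(buf, len, "<0x%" PRIxPTR ">", val);

        const struct sketch_vcpu *v = arg;
        return snprintf(buf, len, "d%dv%d", v->domain_id, v->vcpu_id);
    }

    /* Plain %p: hex value only. */
    return snprintf(buf, len, "0x%" PRIxPTR, val);
}
```

With this sketch a NULL argument to "pv" produces <0x0> and a valid
struct sketch_vcpu with both ids zero produces d0v0, assuming ordinary
object addresses lie above the hypothetical threshold.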



Re: [Xen-devel] pvSCSI test

2015-02-12 Thread Kristian Hagsted Rasmussen
On Monday, February 9, 2015 07:02, Juergen Gross  wrote:
> To: Kristian Hagsted Rasmussen; Olaf Hering; xen-de...@lists.xensource.com
> Subject: Re: [Xen-devel] pvSCSI test

snip

> 
> No, that's okay. The connection between p-dev and the drive is done
> via the target infrastructure.
> 
> Something seems to be wrong with your link in configfs: the target
> seems not to be active. Could you please check the link to be correct?
> Please check whether the pscsi (or iblock) entry is active. This can
> be done via the "ls" command in targetcli for example.
> 

In targetcli, ls returns:

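o- / ......................................... [...]
  o- backstores .............................. [...]
  | o- fileio ................................ [0 Storage Object]
  | o- iblock ................................ [0 Storage Object]
  | o- pscsi ................................. [1 Storage Object]
  | | o- 3:0:0:0 ............................. [/dev/sdb activated]
  | o- rd_dr ................................. [0 Storage Object]
  | o- rd_mcp ................................ [0 Storage Object]
  o- ib_srpt ................................. [0 Targets]
  o- iscsi ................................... [0 Targets]
  o- loopback ................................ [0 Targets]
  o- qla2xxx ................................. [0 Targets]
  o- tcm_fc .................................. [0 Targets]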

And my script for starting xen-pvscsi is this:

modprobe xen-scsiback
mkdir /sys/kernel/config/target/xen-pvscsi
mkdir -p /sys/kernel/config/target/xen-pvscsi/naa.600140512a981c66/tpgt_0
echo naa.6001405708ab297e > /sys/kernel/config/target/xen-pvscsi/naa.600140512a981c66/tpgt_0/nexus
# pvscsi Target Ports
mkdir -p /sys/kernel/config/target/xen-pvscsi/naa.600140512a981c66/tpgt_0/lun/lun_0
ln -s /sys/kernel/config/target/core/pscsi_0/3:0:0:0 /sys/kernel/config/target/xen-pvscsi/naa.600140512a981c66/tpgt_0/lun/lun_0/xen-pvscsi_port
# Attributes for pvscsi Target Portal Group
# Parameters for pvscsi Target Portal Group
echo "3:0:0:0" > /sys/kernel/config/target/xen-pvscsi/naa.600140512a981c66/tpgt_0/param/alias

I hope you can spot my error, as I am a little lost right now.

> When I tested the pscsi entry in configfs switched to "active" when I
> linked the xen-pvscsi entry to it.
> 
>> Do I have to manually add the device to xenstore?
> 
> I never did it. :-)
> 
> 
> Juergen
> 
>>
>> If you do not feel for answering more of my questions please feel free to 
>> say so, I am just interested in this work and really look forward to its 
>> inclusion in xen.
>>
>> /Kristian
>>
>>> You are not using pscsi, but iblock. Is that on purpose? I have tested
>>> pscsi and fileio only.
>>>
>>> What does lsscsi tell you after adding the device via targetcli? I
>>> suppose you see a new scsi target you should use instead of 3:0:0:0
>>> (that's what I did in the fileio case).
>>>

I do not see any more devices with lsscsi when I add an iblock device. I also
tested with a fileio device, and it does not show up in lsscsi either. I can
get it to show up by making a loopback entry in targetcli, but this does not
change the outcome of my domain creation.

Best regards Kristian



Re: [Xen-devel] [PATCH v2] vsprintf: Make sure argument to %pX specifier is valid

2015-02-12 Thread Boris Ostrovsky

On 02/12/2015 11:33 AM, Boris Ostrovsky wrote:



> Also, for %*ph format, if we just go with falling through to plain
> format and not marking somehow that we are printing a bad pointer:
>
> unsigned badval = 0xab;
> unsigned *badptr = &badval;
> printk("badptr = %*ph\n", 1, badptr);
>
> console:
> badptr = ab
>
> We don't know here whether badptr was pointing to 0xab or it itself
> was 0xab.




Ugh, bad example. Here is what I meant:

unsigned badval = 0xab;
unsigned *badptr = &badval;
unsigned *badvalptr = (void *)0xab;
printk("badvalptr = %*ph badptr = %*ph\n", 1, badvalptr, 1, badptr);

console:
badvalptr = ab badptr = ab


Sorry.

-boris
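
The ambiguity can be reproduced outside Xen with a small sketch in C.
sketch_hex_bytes() and SKETCH_VIRT_START are hypothetical names, and the
threshold is an arbitrary stand-in for HYPERVISOR_VIRT_START. The helper
formats bytes the way a %*ph-style dump would, but marks a bad pointer with
angle brackets so the two cases above no longer print the same text.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical threshold below which a pointer is treated as bad. */
#define SKETCH_VIRT_START ((uintptr_t)0x1000)

/*
 * Write up to n bytes at ptr into buf as space-separated hex.  A bad
 * pointer is never dereferenced; its own value is written instead,
 * wrapped in angle brackets.  Returns the length that was (or would
 * have been) written.
 */
static int sketch_hex_bytes(char *buf, size_t len, const void *ptr, int n)
{
    const unsigned char *p = ptr;
    int i, used = 0;

    assert(buf && len > 0);
    buf[0] = '\0';

    if ( (uintptr_t)ptr < SKETCH_VIRT_START )
        return snprintf(buf, len, "<0x%" PRIxPTR ">", (uintptr_t)ptr);

    /* Stop once the buffer is full; snprintf() keeps it terminated. */
    for ( i = 0; i < n && (size_t)used < len; i++ )
        used += snprintf(buf + used, len - used, "%s%02x",
                         i ? " " : "", (unsigned int)p[i]);

    return used;
}
```

Dumping one byte of a variable holding 0xab gives ab, while passing the value
0xab as the pointer gives <0xab>, assuming ordinary object addresses lie above
the hypothetical threshold.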




[Xen-devel] [linux-3.14 test] 34468: regressions - FAIL

2015-02-12 Thread xen . org
flight 34468 linux-3.14 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34468/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-intel 11 leak-check/check  fail REGR. vs. 34268

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-sedf 9 guest-start fail REGR. vs. 34268
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 34268

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-intel 9 guest-start fail never pass
 test-armhf-armhf-xl 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-midway 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd 9 guest-start fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass
 test-amd64-i386-libvirt 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2 10 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass

version targeted for testing:
 linux a74f1d1204a5c892466b52ac68ee6443c1e459d7
baseline version:
 linux 4ccf212fb84d79b75fc66a2e26ac6bdbab0aedbf


People who touched revisions under test:
  Aaro Koskinen 
  Andrew Morton 
  Andy Lutomirski 
  Bjorn Helgaas 
  Bo Shen 
  Catalin Marinas 
  Charlotte Richardson 
  David Daney 
  David S. Miller 
  Dmitry Monakhov 
  Eric Dumazet 
  Eric Nelson 
  Felix Fietkau 
  Greg Kroah-Hartman 
  Hemmo Nieminen 
  hujianyang 
  Jaroslav Kysela 
  Jeff Layton 
  Johan Hovold 
  karl beldan 
  Karl Beldan 
  Lai Jiangshan 
  Linus Torvalds 
  Linus Walleij 
  Mark Brown 
  Mark Rutland 
  Mathias Krause 
  Michal Marek 
  Naoya Horiguchi 
  Paolo Bonzini 
  Pavel Hofman 
  Peter Kümmel 
  Ralf Baechle 
  Raymond Ngun 
  Russell King 
  Ryusuke Konishi 
  Sachin Prabhu 
  Shiraz Hashim 
  Shirish Pargaonkar 
  Steve French 
  Takashi Iwai 
  Theodore Ts'o 
  Thomas Gleixner 
  Wang Kai 
  Will Deacon 


jobs:
 build-amd64 pass
 build-armhf pass
 build-i386 pass
 build-amd64-libvirt pass
 build-armhf-libvirt pass
 build-i386-libvirt pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 build-amd64-rumpuserxen pass
 build-i386-rumpuserxen pass
 test-amd64-amd64-xl pass
 test-armhf-armhf-xl pass
 test-amd64-i386-xl pass
 test-amd64-amd64-xl-pvh-amd fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd pass
 test-amd64-i386-qemuu-rhel6hvm-amd pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  

Re: [Xen-devel] [PATCH v2] vsprintf: Make sure argument to %pX specifier is valid

2015-02-12 Thread Boris Ostrovsky

On 02/12/2015 10:48 AM, Andrew Cooper wrote:

> On 12/02/15 15:38, Boris Ostrovsky wrote:
>> On 02/12/2015 10:21 AM, Andrew Cooper wrote:
>>> On 12/02/15 15:01, Boris Ostrovsky wrote:
>>>> On 02/12/2015 06:04 AM, Andrew Cooper wrote:
>>>>> On 11/02/15 20:58, Boris Ostrovsky wrote:
>>>>>> If invalid pointer (i.e. something smaller than
>>>>>> HYPERVISOR_VIRT_START)
>>>>>> is passed for %*ph/%pv/%ps/%pS format specifiers then print "(NULL)"
>>>>>>
>>>>>> Signed-off-by: Boris Ostrovsky
>>>>>> ---
>>>>>> xen/common/vsprintf.c |   23 ---
>>>>>> 1 files changed, 16 insertions(+), 7 deletions(-)
>>>>>>
>>>>>> v2:
>>>>>> * Print "(NULL)" instead of specifier-specific string
>>>>>> * Consider all addresses under HYPERVISOR_VIRT_START as invalid.
>>>>>> (I think
>>>>>>   this is true for both x86 and ARM but I don't have ARM platform
>>>>>> to test).
>>>>>>
>>>>>>
>>>>>> diff --git a/xen/common/vsprintf.c b/xen/common/vsprintf.c
>>>>>> index 065cc42..b9542b5 100644
>>>>>> --- a/xen/common/vsprintf.c
>>>>>> +++ b/xen/common/vsprintf.c
>>>>>> @@ -270,6 +270,22 @@ static char *pointer(char *str, char *end,
>>>>>> const char **fmt_ptr,
>>>>>> const char *fmt = *fmt_ptr, *s;
>>>>>>   /* Custom %p suffixes. See
>>>>>> XEN_ROOT/docs/misc/printk-formats.txt */
>>>>>> +
>>>>>> +switch ( fmt[1] )
>>>>>> +{
>>>>>> +case 'h':
>>>>>> +case 's':
>>>>>> +case 'S':
>>>>>> +case 'v':
>>>>>> +++*fmt_ptr;
>>>>>> +}
>>>>>> +
>>>>>> +if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>>>>>> +{
>>>>>> +char *s = "(NULL)";
>>>>>> +return string(str, end, s, -1, -1, 0);
>>>>>> +}
>>>>>> +
>>>>>> switch ( fmt[1] )
>>>>> This won't function, as you have inverted the increment of *fmt_ptr and
>>>>> check of fmt[1].
>>>>
>>>> fmt value doesn't change, it is stashed at the top of the routine.
>>>
>>> You are correct.  My apologies.  I however dislike the splitting of the
>>> switch into two.
>>>
>>>>
>>>> (What *is* wrong in the above code is the fact that the arg test is
>>>> done outside the switch. It should be part of the four case
>>>> statements, otherwise we will print plain %p arguments as "NULL").
>>>>
>>>>>
>>>>> "(NULL)" is inappropriate for non-null pointers less than VIRT_START.
>>>>
>>>> Yes, I thought about it after I sent it. "(invalid)"?
>>>
>>> Better, but overriding the number with a string does hide information.
>>> In the case that the pointer is invalid, it would be useful to see its
>>> contents.
>>
>> How about "<0xXX>" (i.e. effectively replace "%pv" with "<%p>",
>> with angle brackets indicating invalid pointer)?
>
> It feels like change for change's sake, especially as there is a perfectly
> good hex decode for plain %p.
>
>>>>> Given the VIRT check, I would just put the entire switch statement
>>>>> inside an "if ( (unsigned long)arg < HYPERVISOR_VIRT_START )" block
>>>>> and
>>>>> let it fall through to the plain number case for a bogus pointer.
>>>>
>>>> Not sure I understand what you are suggesting here, sorry.
>>>>
>>>> -boris
>>>
>>> if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>>> {
>>>   switch ( fmt[1] )
>>>   {
>>>
>>>   }
>>> }
>>>
>>> This makes the patch a whole 3 line addition and indenting the whole
>>> switch block by 4 spaces.
>>
>> Still don't understand. This will never print anything unless it's a
>> bad pointer, won't it?
>>
>> (And if you meant '>=' then we will simply print the invalid pointer
>> in plain %p format. Which, btw, may be the solution but we will still
>> need to bump fmt_ptr, so we again will need another switch or
>> something to test for sub-specifier)
>
> Oops - I did mean >=.  I.e. only do the custom %pX decoding in the case
> that arg is a plausible pointer.
>
> There is no need I can see to alter the fmt_ptr handling.  The code
> currently works, other than the issue at hand of falling over a bad pointer.


We do; otherwise we will be printing the sub-specifier. Here is an example
of what happens if we don't bump it:


struct vcpu *badvcpu = NULL;
printk("badvcpu: %pv current: %pv\n", badvcpu, current);

console:
badvcpu: v current: d0v0
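
The stray 'v' above can be reproduced outside Xen with a toy formatter.
This is only a minimal sketch under stated assumptions, not the code in
xen/common/vsprintf.c: toy_format(), TOY_VIRT_START and the
consume_suffix flag are hypothetical names made up for illustration, and
"d0v0" is a fixed stand-in for a decoded vcpu. The exact text printed for
the bad pointer will therefore differ from the real console output; the
point is only what happens to the sub-specifier character.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for HYPERVISOR_VIRT_START (arch specific in Xen). */
#define TOY_VIRT_START 0x1000UL

/*
 * Toy formatter that understands only "%pv".
 * - Plausible pointer (>= TOY_VIRT_START): print the stand-in "d0v0".
 * - Implausible pointer: fall back to a plain hex number.
 * consume_suffix selects whether the fallback path also skips the 'v'.
 */
static void toy_format(char *out, size_t len, const char *fmt,
                       unsigned long arg, int consume_suffix)
{
    size_t n = 0;

    assert(out != NULL && len > 0);

    while ( *fmt != '\0' && n + 1 < len )
    {
        int w;

        if ( strncmp(fmt, "%pv", 3) != 0 )
        {
            out[n++] = *fmt++;      /* ordinary character: copy through */
            continue;
        }

        if ( arg >= TOY_VIRT_START )
        {
            w = snprintf(out + n, len - n, "d0v0");
            fmt += 3;               /* skip '%', 'p' and 'v' */
        }
        else
        {
            w = snprintf(out + n, len - n, "0x%lx", arg);
            /* Not skipping the 'v' here leaves it in the output. */
            fmt += consume_suffix ? 3 : 2;
        }

        if ( w < 0 )
            break;                  /* encoding error: stop formatting */
        if ( (size_t)w >= len - n )
        {
            n = len - 1;            /* output was truncated */
            break;
        }
        n += (size_t)w;
    }
    out[n] = '\0';
}
```

Calling toy_format() with "badvcpu: %pv", a zero argument and
consume_suffix set to 0 leaves a trailing 'v' after the number; with
consume_suffix set to 1 the 'v' is gone. That is the reason the
sub-specifier has to be consumed even when the custom decode is skipped.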


Also, for %*ph format, if we just go with falling through to plain 
format and not marking somehow that we are printing a bad pointer:


unsigned badval = 0xab;
unsigned *badptr = &badval;
printk("badptr = %*ph\n", 1, badptr);

console:
badptr = ab

We don't know here whether badptr was pointing to 0xab or it itself was 
0xab.
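
One way to remove that ambiguity is the angle-bracket marker floated
earlier in this thread. The following is a hedged sketch of that idea
only, not what Xen does: toy_hex_byte() and TOY_VIRT_START are
hypothetical names, and the function handles just the one-byte case of
the example above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for HYPERVISOR_VIRT_START (arch specific in Xen). */
#define TOY_VIRT_START 0x1000UL

/*
 * Toy one-byte version of "%*ph".
 * - Plausible pointer: dump the byte it points to as two hex digits.
 * - Implausible pointer: print the pointer value itself inside <...>,
 *   so the reader can tell it was never dereferenced.
 */
static void toy_hex_byte(char *out, size_t len, const unsigned char *p)
{
    uintptr_t addr = (uintptr_t)p;

    assert(out != NULL && len > 0);

    if ( addr < TOY_VIRT_START )
        snprintf(out, len, "<0x%lx>", (unsigned long)addr);
    else
        snprintf(out, len, "%02x", (unsigned int)*p);
}
```

With this, a valid pointer to a byte holding 0xab and a bogus pointer
whose own value is 0xab no longer produce the same text: the second one
is wrapped in angle brackets.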


-boris





Re: [Xen-devel] [PATCH v2] vsprintf: Make sure argument to %pX specifier is valid

2015-02-12 Thread Andrew Cooper
On 12/02/15 15:38, Boris Ostrovsky wrote:
> On 02/12/2015 10:21 AM, Andrew Cooper wrote:
>> On 12/02/15 15:01, Boris Ostrovsky wrote:
>>> On 02/12/2015 06:04 AM, Andrew Cooper wrote:
>>>> On 11/02/15 20:58, Boris Ostrovsky wrote:
>>>>> If invalid pointer (i.e. something smaller than
>>>>> HYPERVISOR_VIRT_START)
>>>>> is passed for %*ph/%pv/%ps/%pS format specifiers then print "(NULL)"
>>>>>
>>>>> Signed-off-by: Boris Ostrovsky
>>>>> ---
>>>>> xen/common/vsprintf.c |   23 ---
>>>>> 1 files changed, 16 insertions(+), 7 deletions(-)
>>>>>
>>>>> v2:
>>>>> * Print "(NULL)" instead of specifier-specific string
>>>>> * Consider all addresses under HYPERVISOR_VIRT_START as invalid.
>>>>> (I think
>>>>>   this is true for both x86 and ARM but I don't have ARM platform
>>>>> to test).
>>>>>
>>>>>
>>>>> diff --git a/xen/common/vsprintf.c b/xen/common/vsprintf.c
>>>>> index 065cc42..b9542b5 100644
>>>>> --- a/xen/common/vsprintf.c
>>>>> +++ b/xen/common/vsprintf.c
>>>>> @@ -270,6 +270,22 @@ static char *pointer(char *str, char *end,
>>>>> const char **fmt_ptr,
>>>>> const char *fmt = *fmt_ptr, *s;
>>>>>   /* Custom %p suffixes. See
>>>>> XEN_ROOT/docs/misc/printk-formats.txt */
>>>>> +
>>>>> +switch ( fmt[1] )
>>>>> +{
>>>>> +case 'h':
>>>>> +case 's':
>>>>> +case 'S':
>>>>> +case 'v':
>>>>> +++*fmt_ptr;
>>>>> +}
>>>>> +
>>>>> +if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>>>>> +{
>>>>> +char *s = "(NULL)";
>>>>> +return string(str, end, s, -1, -1, 0);
>>>>> +}
>>>>> +
>>>>> switch ( fmt[1] )
>>>> This won't function, as you have inverted the increment of *fmt_ptr and
>>>> check of fmt[1].
>>>
>>> fmt value doesn't change, it is stashed at the top of the routine.
>>
>> You are correct.  My apologies.  I however dislike the splitting of the
>> switch into two.
>>
>>>
>>> (What *is* wrong in the above code is the fact that the arg test is
>>> done outside the switch. It should be part of the four case
>>> statements, otherwise we will print plain %p arguments as "NULL").
>>>
>>>
>>>>
>>>> "(NULL)" is inappropriate for non-null pointers less than VIRT_START.
>>>
>>> Yes, I thought about it after I sent it. "(invalid)"?
>>
>> Better, but overriding the number with a string does hide information.
>> In the case that the pointer is invalid, it would be useful to see its
>> contents.
>
>
> How about "<0xXX>" (i.e. effectively replace "%pv" with "<%p>",
> with angle brackets indicating invalid pointer)?
>

It feels like change for change's sake, especially as there is a perfectly
good hex decode for plain %p.

>
>>
>>>
>>>>
>>>> Given the VIRT check, I would just put the entire switch statement
>>>> inside an "if ( (unsigned long)arg < HYPERVISOR_VIRT_START )" block
>>>> and
>>>> let it fall through to the plain number case for a bogus pointer.
>>>
>>> Not sure I understand what you are suggesting here, sorry.
>>>
>>> -boris
>>
>> if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
>> {
>>  switch ( fmt[1] )
>>  {
>>  
>>  }
>> }
>>
>>
>> This makes the patch a whole 3 line addition and indenting the whole
>> switch block by 4 spaces.
>
> Still don't understand. This will never print anything unless it's a
> bad pointer, won't it?
>
> (And if you meant '>=' then we will simply print the invalid pointer
> in plain %p format. Which, btw, may be the solution but we will still
> need to bump fmt_ptr, so we again will need another switch or
> something to test for sub-specifier)

Oops - I did mean >=.  I.e. only do the custom %pX decoding in the case
that arg is a plausible pointer.

There is no need I can see to alter the fmt_ptr handling.  The code
currently works, other than the issue at hand of falling over a bad pointer.

~Andrew




[Xen-devel] [linux-next test] 34461: regressions - FAIL

2015-02-12 Thread xen . org
flight 34461 linux-next real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34461/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-amd64  5 xen-boot   fail REGR. vs. 34299
 test-amd64-i386-qemut-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 34299
 test-amd64-i386-xl-qemuu-win7-amd64  5 xen-boot   fail REGR. vs. 34299
 test-amd64-i386-xl-qemut-debianhvm-amd64  5 xen-boot  fail REGR. vs. 34299
 test-armhf-armhf-xl  12 guest-start.2 fail REGR. vs. 34299
 test-amd64-i386-qemuu-rhel6hvm-intel  5 xen-boot  fail REGR. vs. 34299
 test-amd64-i386-xl-qemut-winxpsp3  5 xen-boot fail REGR. vs. 34299
 test-amd64-i386-xl-qemut-win7-amd64  5 xen-boot   fail REGR. vs. 34299
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  5 xen-boot  fail REGR. vs. 34299
 test-amd64-amd64-xl   5 xen-boot  fail REGR. vs. 34299
 test-amd64-amd64-xl-qemuu-ovmf-amd64  5 xen-boot  fail REGR. vs. 34299
 test-amd64-amd64-pair 8 xen-boot/dst_host fail REGR. vs. 34299
 test-amd64-amd64-pair 7 xen-boot/src_host fail REGR. vs. 34299
 test-amd64-i386-xl-qemuu-debianhvm-amd64  5 xen-boot  fail REGR. vs. 34299
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 34299

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf  5 xen-boot  fail REGR. vs. 34299
 test-amd64-i386-libvirt   9 guest-start  fail   like 34299
 test-amd64-amd64-xl-sedf-pin  5 xen-boot  fail REGR. vs. 34299
 test-amd64-i386-freebsd10-i386  7 freebsd-install  fail like 34299
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 34299
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install  fail like 34299

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-armhf-armhf-xl  10 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt 10 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-check  fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check  fail   never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop   fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass

version targeted for testing:
 linux                f61b8d6d777dfb241ebe87478c5c82126ac8d687
baseline version:
 linux                26cdd1f76a889a21faa851bcb260782db2c7f0a9

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  fail
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemu
