Re: [Xen-devel] [v7][PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 13:44 +0800, Chen, Tiejun wrote:
 On 2015/7/13 18:15, Ian Campbell wrote:
  On Mon, 2015-07-13 at 17:47 +0800, Chen, Tiejun wrote:
  This approach looks like it should work, and I think given the point in
  the release it would be acceptable for 4.6.
 
  However long term I think it might make sense to try and reuse one of
  the existing libxl__arch hooks, i.e.
  libxl__arch_domain_init_hw_description or
  libxl__arch_domain_finalise_hw_description. On ARM these are to do with
  setting the Device Tree Blob, which included the memory map, so it is
  somewhat morally equivalent to configuring the e820 on x86, I think.
 
  Those hooks are only called from libxl__build_pv today, but calling them
  from libxl__build_hvm seems like it would be good too.
 
  But seems this is raising some potential risks, isn't this? Although
  libxl__arch_domain_init_hw_description() and
  libxl__arch_domain_finalise_hw_description() are NOP to x86, they're
  really working on ARM side. So if we call them inside
  libxl__build_hvm(), any affects to ARM? I'm not very sure at this point
  unless anyone can validate this change on ARM, or you really ensure my
  concerns is unnecessary.
 
  All ARM guests use the PV code path so there is no risk.
 
 Okay but please take a close look at this,
 
 libxl__build_pv(gc, domid, info, state)
  |
  + libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
libxl_domain_build_info *info,
struct xc_dom_image *dom)
 
 But in our case we need this parameter, struct xc_hvm_build_args *args, 
 so how can we handle this conflict? Its not easy to add this, and it 
 doesn't make sense as well in pv case.

This is an internal API, you can feel free to modify it as necessary.

Please note that I started this subthread with However long term I
think it might make sense ..., This was not a request to redo this
patch now.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3] libxl: Increase device model startup timeout to 1min.

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 07:17 +0100, Jan Beulich wrote:
  On 07.07.15 at 17:41, ian.campb...@citrix.com wrote:
  On Tue, 2015-07-07 at 16:14 +0100, Ian Jackson wrote:
  Anthony PERARD writes ([PATCH V3] libxl: Increase device model startup 
  timeout to 1min.):
   On a busy host, QEMU may take more than 10s to load and start.
   
   This is likely due to a bug in Linux where the I/O subsystem sometime
   produce high latency under load and result in QEMU taking a long time to
   load every single dynamic libraries.
  
  Acked-by: Ian Jackson ian.jack...@eu.citrix.com
  
  Applied.
 
 So is this the answer to Problems with merlot* AMD Opteron 6376
 systems?

It'll be hard to say until this change gets through the Xen push gate
and that version gets used for other branches (linux testing, libvirt,
ovmf, osstest's own gate etc).

At the moment it looks like it has helped with some but not all of the
issues.

These:

http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html

still don't look brilliant, even if you only pay attention to the
xen-unstable lines.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest

2015-07-14 Thread Chen, Tiejun

However long term I think it might make sense to try and reuse one of
the existing libxl__arch hooks, i.e.
libxl__arch_domain_init_hw_description or
libxl__arch_domain_finalise_hw_description. On ARM these are to do with
setting the Device Tree Blob, which included the memory map, so it is
somewhat morally equivalent to configuring the e820 on x86, I think.

Those hooks are only called from libxl__build_pv today, but calling them
from libxl__build_hvm seems like it would be good too.


But seems this is raising some potential risks, isn't this? Although
libxl__arch_domain_init_hw_description() and
libxl__arch_domain_finalise_hw_description() are NOP to x86, they're
really working on ARM side. So if we call them inside
libxl__build_hvm(), any affects to ARM? I'm not very sure at this point
unless anyone can validate this change on ARM, or you really ensure my
concerns is unnecessary.


All ARM guests use the PV code path so there is no risk.


Okay but please take a close look at this,

libxl__build_pv(gc, domid, info, state)
  |
  + libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
libxl_domain_build_info *info,
struct xc_dom_image *dom)

But in our case we need this parameter, struct xc_hvm_build_args *args,
so how can we handle this conflict? Its not easy to add this, and it
doesn't make sense as well in pv case.


This is an internal API, you can feel free to modify it as necessary.


I mean struct xc_hvm_build_args[] is a parameter specific to hvm so its 
wired to pass this in the case of hv. If we wrapper this again its not 
worth going this way.




Please note that I started this subthread with However long term I
think it might make sense ..., This was not a request to redo this
patch now.



Okay lets record this and now just keep moving forward with the original.

Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-4.4-testing test] 59510: regressions - FAIL

2015-07-14 Thread osstest service owner
flight 59510 xen-4.4-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59510/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail REGR. vs. 
59289

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-multivcpu 15 guest-start/debian.repeat fail blocked in 
59289
 test-amd64-i386-libvirt  11 guest-start  fail   like 59289
 test-amd64-amd64-libvirt 11 guest-start  fail   like 59289
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail like 59289

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 build-amd64-rumpuserxen   6 xen-buildfail   never pass
 build-i386-rumpuserxen6 xen-buildfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt 11 guest-start  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-i386-xend-qemut-winxpsp3 20 leak-check/checkfail never pass

version targeted for testing:
 xen  33eba764618669b9c394c7d9cd2e335b426862ab
baseline version:
 xen  36c53c22cd72f742cb42891354e82e9532436fcf

Last test of basis59289  2015-07-09 12:40:11 Z4 days
Testing same since59510  2015-07-13 12:41:25 Z0 days1 attempts


People who touched revisions under test:
  Alan Robinson alan.robin...@ts.fujitsu.com
  Andrew Cooper andrew.coop...@citrix.com
  Jan Beulich jbeul...@suse.com
  Ross Lagerwall ross.lagerw...@citrix.com
  Suravee Suthikulpanit suravee.suthikulpa...@amd.com
  Tim Deegan t...@xen.org

jobs:
 build-amd64-xend pass
 build-i386-xend  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 

[Xen-devel] how to locate the hypercall address in memory?

2015-07-14 Thread big strong
As syscalls can be located with the help of symbol files, is it possible to
do it to hypercalls too?
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/9] libxl: fix libxl__abs_path

2015-07-14 Thread Ian Campbell
On Mon, 2015-07-13 at 18:12 +0100, Ian Jackson wrote:
 Ian Campbell writes (Re: [PATCH 1/9] libxl: fix libxl__abs_path):
  I rather dislike subjects of the form fix $function, since it gives
  very little clue to someone reading the shortlog what is going on.
 
 Yes.
 
  In this case I think libxl: make libxl__abs_path correctly handle a
  NULL argument would be an accurate description.
 
 But: it is quite surprising that libxl__abs_path can be legally passed
 a NULL for any of its parameters.

True.
 
 There are no call sites in libxl which can pass a NULL.
 
 I think that if we are to retain this feature, it ought to be
 documented, at least.
 
 Maybe
 
   _hidden char *libxl__abs_path(libxl__gc *gc,
 const char *s /* NULL OK */,
 const char *path);

Or add an assert if we don't wish to support this?


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3] libxl: Increase device model startup timeout to 1min.

2015-07-14 Thread Jan Beulich
 On 07.07.15 at 17:41, ian.campb...@citrix.com wrote:
 On Tue, 2015-07-07 at 16:14 +0100, Ian Jackson wrote:
 Anthony PERARD writes ([PATCH V3] libxl: Increase device model startup 
 timeout to 1min.):
  On a busy host, QEMU may take more than 10s to load and start.
  
  This is likely due to a bug in Linux where the I/O subsystem sometime
  produce high latency under load and result in QEMU taking a long time to
  load every single dynamic libraries.
 
 Acked-by: Ian Jackson ian.jack...@eu.citrix.com
 
 Applied.

So is this the answer to Problems with merlot* AMD Opteron 6376
systems?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 59509: regressions - FAIL

2015-07-14 Thread osstest service owner
flight 59509 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59509/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2 
fail REGR. vs. 58958
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail 
REGR. vs. 58965
 test-amd64-i386-qemut-rhel6hvm-intel 12 guest-start/redhat.repeat fail REGR. 
vs. 58965
 test-amd64-amd64-xl-qemuu-ovmf-amd64 16 guest-stopfail REGR. vs. 58965
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 
58965
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 58965
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 58965
 test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-stopfail REGR. vs. 58965
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail REGR. vs. 58965
 test-amd64-i386-xl-qemut-win7-amd64  9 windows-installfail REGR. vs. 58965
 test-amd64-amd64-xl-qemut-win7-amd64  9 windows-install   fail REGR. vs. 58965
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 9 windows-install fail REGR. vs. 58965
 test-amd64-amd64-xl-qemuu-win7-amd64  9 windows-install   fail REGR. vs. 58965
 test-amd64-i386-xl-qemut-winxpsp3  9 windows-install  fail REGR. vs. 58965

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-rumpuserxen-i386 15 
rumpuserxen-demo-xenstorels/xenstorels.repeat fail like 58948
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install fail like 58958
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 58965

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  12 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-checkfail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-checkfail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-checkfail never pass

version targeted for testing:
 xen  42b79d4ad84da0928f11c4b265eb30cece5fe2fb
baseline version:
 xen  c40317f11b3f05e7c06a2213560c8471081f2662

Last test of basis58965  2015-06-29 02:08:30 Z   15 days
Failing since 58974  2015-06-29 15:11:59 Z   14 days   16 attempts
Testing same since59509  2015-07-13 12:41:20 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Anthony PERARD anthony.per...@citrix.com
  Ard Biesheuvel a...@linaro.org
  Ben Catterall ben.catter...@citrix.com
  Boris Ostrovsky boris.ostrov...@oracle.com
  Chao Peng chao.p.p...@linux.intel.com
  Chen Baozi baoz...@gmail.com
  Daniel De Graaf dgde...@tycho.nsa.gov
  Dario Faggioli dario.faggi...@citrix.com
  David Scott dave.sc...@citrix.com
  David Vrabel david.vra...@citrix.com
  Dietmar Hahn dietmar.h...@ts.fujitsu.com
  Euan Harris euan.har...@citrix.com
  Fabio Fantoni fabio.fant...@m2r.biz
  Feng Wu feng...@intel.com
  George Dunlap george.dun...@eu.citrix.com
  Ian Campbell ian,campb...@citrix.com
  Ian Campbell ian.campb...@citrix.com
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com
  Jennifer Herbert jennifer.herb...@citrix.com
  Juergen Gross jgr...@suse.com
  Julien Grall julien.gr...@citrix.com
  Julien Grall julien.gr...@linaro.org
  Kevin Tian kevin.t...@intel.com
  Liang Li liang.z...@intel.com
  Paul Durrant paul.durr...@citrix.com
  Razvan Cojocaru rcojoc...@bitdefender.com
  Rob Hoes rob.h...@citrix.com
  Roger Pau Monné roger@citrix.com
  Samuel Thibault samuel.thiba...@ens-lyon.org
  Sander Eikelenboom li...@eikelenboom.it
  Tamas K Lengyel tleng...@novetta.com
  Thomas Leonard tal...@gmail.com
  Tiejun Chen tiejun.c...@intel.com
  Tim Deegan t...@xen.org
  Vitaly Kuznetsov vkuzn...@redhat.com
  Wei Liu wei.l...@citrix.com
  Wen Congyang we...@cn.fujitsu.com
  Yang Zhang yang.z.zh...@intel.com

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm   

Re: [Xen-devel] [PATCH v4 for Xen 4.6 0/4] Enable per-VCPU parameter settings for RTDS scheduler

2015-07-14 Thread Dario Faggioli
On Mon, 2015-07-13 at 22:45 -0700, Meng Xu wrote:
 Hi Dario,

Hi,

  - not enough benchmarks/performance figures: I'd like to have
 the
latency numbers, e.g., from cyclictest, we've spoke many
 times with
Meng, give our official blessing at using it
 
 
 ​Considering t
 he improved version Dagaen is working on should be able to improve the
 efficiency of the scheduler a lot​
 ​,​ 
 ​I will evaluate the performance of the RTDS scheduler with cyclictest
 once Dagaen have the improved version done. ​

Sure thing! It actually was part of my own reasoning about why we want
to wait. :-)

Regards,
Dario

-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Ian Campbell
On Mon, 2015-07-13 at 18:24 +0100, Stefano Stabellini wrote:
 On Mon, 13 Jul 2015, Wei Liu wrote:
  On Fri, Jul 10, 2015 at 04:16:07PM +0530, Vijay Kilari wrote:
   Hi Wei,
   
   I would like to have freeze exception for ITS feature on ARM64.
   Design got freeze few weeks back and I have sent v4 version of patch 
   series
   today.
   
   This patches will not impact any generic code of other platforms and have 
   minor
   changes generic arm related code. Also these patches are only for
   ARM64 platform.
   
   These patches are pre-requisite for PCI support / Pass-through support
   on ARM64 platforms.
   
   The risk is minor and as of today only used by Cavium ThunderX platform.
   
  
  
  I'm not a ARM expert, but last time I checked most patches in v3 are not
  acked.
  
  I also got conflict statements from maintainers and core developer. I
  will wait a bit for them clarify the situation.
  
  But as Ian said, if you can't post v4 and get most if all of your
  patches acked / reviewed early this week, my answer to this request
  would be no.
 
 I pretty much agree with Ian: I went through the patches and the impact
 of the series on non-ITS platforms will be null after Vijay addresses:
 
 - the lpi irq_desc and irq_pending allocation issues
 - improve lpi_supported to check for ITS presence
 
 these two changes should be trivial and are certainly necessary for a
 freeze exception in my view.
 
 
 On this basis, if Vijay manages to resend a v5 version on time with
 those two issues covered, making sure that the new code is not enabled
 unless an its is present, then I think that a freeze exception would be
 OK as the risk would be zero.

I don't think we should be limiting ourselves to only fixing issues
which reduce the risk on non-ITS platforms. So the two issues which you
highlight above are necessary but not sufficient for a freeze exception
in my view.

For example I am firmly of the opinion that the VPLI injection code
needs to be corrected as discussed during review.

Likewise I said that care needs to be taken wrt when any of this code is
enabled, which includes not exposing it to domU even on platforms which
support ITS. I also view this as a requirement for a freeze exception.
In other words only dom0 and only on an ITS enabled system should be
exposed to any aspect of the ITS support.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 03:42, tiejun.c...@intel.com wrote:
  +{
 +printk(XENLOG_G_ERR VTDPREFIX
 +cannot assign %04x:%02x:%02x.%u
 +with shared RMRR for Dom%d.\n,
 +   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
 +   d-domain_id);
 +return -EPERM;
 +}
 +}

 Two if()-s like these should be folded into one.

 In your place I'd also consider also printing the RMRR base address
 for easier analysis of the issue.

 
 I agree but I think the whole range info should be better,
 
  with shared RMRR [%PRIx64,%PRIx64] for Dom%d.\n,

Perhaps, albeit due to there not being overlapping RMRRs the
base address is sufficient for uniquely identifying the one in
question.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Chen, Tiejun

Y   Y   [v7][PATCH 14/16] xen/vtd: enable USB device assignment
Y   Y   [v7][PATCH 15/16] xen/vtd: prevent from assign the 
device with
shared rmrr


And yet again for these two. Please avoid giving a false impression


But these two patches really won Kevin's Ack, and also I wrote this line

Acked-by: Kevin Tian kevin.t...@intel.com

both in these two patches.


But talk here is about their review status, not who ack-ed them (and
an ack by other than  a maintainer of the affected code is not very
meaningful anyway).


Isn't Kevin the key maintainer specific to IOMMU subsystem?

Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 08:39, tiejun.c...@intel.com wrote:
  -} *resource, mem_resource, high_mem_resource, io_resource;
 +} *resource, mem_resource, high_mem_resource, io_resource, 
 exp_mem_resource;

 Despite having gone through description and the rest of the patch I
 can't seem to be able to guess what exp_mem stands for.
 Meaningful variable names are quite helpful though, often avoiding
 the need for comments.
 
 exp_mem_resource() is the expanded mem_resource in the case of 
 populating RAM.
 
 Maybe I should use the whole word, expand_mem_resource.

And what does expand here mean then?

 @@ -309,29 +339,31 @@ void pci_setup(void)
   }

   /* Relocate RAM that overlaps PCI space (in 64k-page chunks). */
 +cur_pci_mem_start = pci_mem_start;
   while ( (pci_mem_start  PAGE_SHIFT)  hvm_info-low_mem_pgend )
 +relocate_ram_for_pci_memory(cur_pci_mem_start);

 Please be consistent which variable to want to use in the loop
 (pci_mem_start vs cur_pci_mem_start).
 
 Overall I just call relocate_ram_for_pci_memory() twice and each I 
 always pass cur_pci_mem_start. Any inconsistent place?

In the quoted code you use pci_mem_start in the while()
condition and cur_pci_mem_start in that same while()'s body.

 Also, this being the first substantial change to the function makes
 clear that you _still_ leave the sizing loop untouched, and instead
 make the allocation logic below more complicated. I said before a
 
 But this may be more reasonable than it used to do. In my point of view 
 we always need to first allocate 32bit mmio and then allocate 64bit mmio 
 since as you said we don't want to expand high memory if possible.
 
 number of times that I don't think this helps maintainability of this
 already convoluted code. Among other things this manifests itself
 in your second call to relocate_ram_for_pci_memory() in no way
 playing by the constraints explained a few lines up from here in an
 extensive comment.
 
 Can't all variables/comments express what I intend to do here? Except 
 for that exp_mem_resource.

I'm not talking about a lack of comments, but about your added
use of the function not being in line with what is being explained
in an earlier (pre-existing) comment.

 Therefore I'll not make any further comments on the rest of the
 patch, but instead outline an allocation model that I think would
 fit our needs: Subject to the constraints mentioned above, set up
 a bitmap (maximum size 64k [2Gb = 2^^19 pages needing 2^^19
 bits], i.e. reasonably small a memory block). Each bit represents a
 page usable for MMIO: First of all you remove the range from
 PCI_MEM_END upwards. Then remove all RDM pages. Now do a
 first pass over all devices, allocating (in the bitmap) space for only
 the 32-bit MMIO BARs, starting with the biggest one(s), by finding
 a best fit (i.e. preferably a range not usable by any bigger BAR)
 from top down. For example, if you have available

 [f000,f800)
 [f900,f9001000)
 [fa00,fa003000)
 [fa01,fa012000)

 and you're looking for a single page slot, you should end up
 picking fa002000.
 
 Why is this [f900,f9001000]? Just one page in this slot.

This was just a simple example I gave. Or maybe I don't understand
your question...

 After this pass you should be able to do RAM relocation in a
 single attempt just like we do today (you may still grow the MMIO
 window if you know you need to and can fit some of the 64-bit
 BARs in there, subject to said constraints; this is in an attempt
 to help OSes not comfortable with 64-bit resources).

 In a 2nd pass you'd then assign 64-bit resources: If you can fit
 them below 4G (you still have the bitmap left of what you've got
 available), put them there. Allocation strategy could be the same
 
 I think basically, your logic is similar to what I did as I described in 
 changelog,

The goal is the same, but the approaches look quite different to
me. In particular my approach avoids calculating mmio_total up
front, then basing RAM relocation on it, only to find subsequently
that more RAM may need to be relocated.

 I think bitmap mechanism is a good idea but honestly, its not easy to 
 cover all requirements here. And just like bootmem on Linux side, so its 
 a little complicated to implement this entirely. So I prefer not to 
 introduce this way in current phase.

I'm afraid it's going to be hard to convince me of any approaches
further complicating the current mechanism instead of overhauling
it.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Ping: [PATCH v6] dmar: device scope mem leak fix

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 20:10, elena.ufimts...@oracle.com wrote:
 On its way Jan!

This wasn't a ping to you, but to the VT-d maintainers to finally ack
this patch.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Vijay Kilari
On Tue, Jul 14, 2015 at 3:20 PM, Ian Campbell ian.campb...@citrix.com wrote:
 On Tue, 2015-07-14 at 14:54 +0530, Vijay Kilari wrote:
 I am trying to boot latest staging Xen branch on ThunderX with ITS patches.

 I face below issues with above [1] patch series

 1)  If pcie support only MSI, then INT mapping is not specified in DT. 
 However
   the below code returns error if INT mapping is not found and does not map.
 In ThunderX INT mapping is not specified for pcie nodes.

 static int map_device_children(struct domain *d,
 const struct dt_device_node *dev)
 {
  int ret;

  if ( dt_device_type_is_equal(dev, pci) )
  {
  DPRINT(Mapping children of %s to guest\n, dt_node_full_name(dev));

  ret = dt_for_each_irq_map(dev, map_dt_irq_to_domain, d);
  if ( ret  0 )
  return ret;  // Returns error here

 Hrm, I suppose dt_for_each_irq_map ought to return success if the device
 in question has no interrupt-map at all.

Yes, I do so.


 At first glance it seems like:
 if ( imap == NULL )
 {
 dt_dprintk( - no map, ignoring\n);
 goto fail;
 }
 Should become:
 if ( imap == NULL )
 {
 dt_dprintk( - no map, ignoring\n);
 return 0;
 }

 Can you test that and if it is correct submit it as a patch please.

 ...
 }

 2) Dom0 fails to boot with GICv3. It hangs just after GICv3 initialization.

 To know which is last distributor/re-distributor registers read, I
 have added debug prints
 in GICD  GICR mmio handlers in Xen. But with your patches Linux driver never
 traps to Xen to read/write GICD/GICR registers. If I revert back this
 patch series,
 I see the traps.

 Where this patch series is this:

 $ git log --oneline 
 d7f132c762d1359f03b2b5b89406daf39d8aefc0..467e5cbb2ffc5e0994c4cb584b7ace6a01a727af
 467e5cb xen: arm: consolidate mmio and irq mapping to dom0
 f65399f xen: arm: Import of_bus PCI entry from Linux (as a dt_bus entry)
 864f82a xen: arm: map child MMIO and IRQs to dom0 for PCI bus DT nodes.
 eed5e39 xen: arm: drop redundant extra call to vgic_reserve_virq
 f9d08f4 xen: dt: add dt_for_each_range helper
 5cefb30 xen: dt: add dt_for_each_irq_map helper
 $

 ?

 That's rather strange, nothing here should be interacting with the vgic
 trapping for GICD/GICR.

 About the only thing I can imagine is that something has incorrectly
 created a p2m mapping over the GICD/GICR addresses. If that is the case
 then it ought to be pretty apparent from the logs with DT_DEBUG enabled
 in domain_build.c.

 If it isn't apparent from the debug log then please could you bisect
 down to a specific patch.

I found the reason. There is some discrepancy with DT which is overlapping
with GIC address space. It took a day to figure out.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 0/5] iommu: add rmrr Xen command line option

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 20:17, elena.ufimts...@oracle.com wrote:
 From: Elena Ufimtseva elena.ufimts...@oracle.com
 
 Add Xen command line option rmrr to specify RMRR 

 regions for devices that are not defined in ACPI thus

 causing IO Page Fault while booting dom0 in PVH mode.

 These additional regions will be added to the list of

 RMRR regions parsed from ACPI.   

 
 Changes in v10:
  - incorporate patch 'dmar: device scope mem leak fix' as series requires it.
  - move patch 'pci: add PCI_SBDF and PCI_SEG macros' close to the last patch 
 which uses it;

Which isn't quite what

 Or, even better, add such macros when the first user appears. Iirc
 I said so before...

Yes, I realized this late. Will move over in the next version if needed. 

(on the 9th) and

The dependency exists on memory leak patch, so I will add it to this series 
and 
squash the first patch from v9.

(on the 13th) meant to me: I expected the patch to be folded into
the consuming one, not moved closer to it in the series. Anyway,
that on its own doesn't require a v11.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 02:01, ravi.sah...@intel.com wrote:
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Monday, July 13, 2015 1:01 AM
 On 10.07.15 at 23:48, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Thursday, July 09, 2015 6:30 AM
 On 01.07.15 at 20:09, edmund.h.wh...@intel.com wrote:
 +altp2m_vcpu_reset(v);
 +vcpu_altp2m(v).p2midx = 0;
 +atomic_inc(p2m_get_altp2m(v)-active_vcpus);
 +
 +ap2m_vcpu_update_eptp(v);

We're in vendor independent code here - either the function is
misnamed, or it shouldn't be called directly from here.

 Would it be reasonable to add if hap_enabled and cpu_has_vmx checks
 like other code in this file that invokes ept specific ops?
 Otherwise it seems ok that the function would be called from here for
 p2m_altp2m interactions such as switching altp2m by id etc.
 Open to any other suggestions from you, or we would like to leave it
 as it is.

Imo such should be abstracted out properly (if it's indeed EPT-specific), or 
 the
function be renamed.

 
 Renaming may make sense - checking first - Would a name like 
 altp2m_vcpu_update_p2m() make sense?

Sounds fine to me.

 @@ -294,6 +298,12 @@ struct arch_domain
  struct p2m_domain *nested_p2m[MAX_NESTEDP2M];
  mm_lock_t nested_p2m_lock;

 +/* altp2m: allow multiple copies of host p2m */
 +bool_t altp2m_active;
 +struct p2m_domain *altp2m_p2m[MAX_ALTP2M];
 +mm_lock_t altp2m_lock;
 +uint64_t *altp2m_eptp;

This is a non-insignificant increase of the structure size - perhaps
all of these should hang off of struct arch_domain via a single,
separately allocated pointer?

 Is this a nice-to-have - again we modelled after the nestedhvm
 extensions to the struct.
 This will affect a lot of our patch without really changing how much
 memory is allocated.

I understand that. To a certain point I can agree to limit changes to what is
there at this stage. But you wanting to avoid addressing concerns basically
everywhere it's not a bug overextends this. Just because the series was
submitted late doesn't mean you should now expect us to give in on any
controversy regarding aspects we would normally expect to be changed. This
would basically encourage others to submit their stuff late too in the 
 future,
hoping for relaxed review.

 
 Couple things - first, we have absorbed a lot of (good) feedback - thanks for 
 that.
 Second, I don't think the series can be characterized as late (feedback from 
 others welcome). 
 V1 had almost the same structure and was submitted in January.

Still we're at v3 only here, not v10 or beyond.

 On this change - this will be a lot of effects on the code and we would like 
 to avoid this one.

While this may be a lot of mechanical change, I don't this presenting
any major risk of breaking the code.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/3] arm64: Add multiboot support (via fdt) for Xen boot

2015-07-14 Thread Ian Campbell
On Mon, 2015-07-13 at 16:53 +0800, fu@linaro.org wrote:
 From: Fu Wei fu@linaro.org
 
   - This adds support for the Xen boot on ARM specification for arm64.

I have used this to PXE boot Xen on a mustang board. My (handcrafted)
grub.cfg was:

set default=0
set timeout=5
menuentry 'pxe multiboot xen' {
  echo xen_hypervisor /trap/xen.efi
  xen_hypervisor /trap/xen.efi conswitch=x watchdog console=dtuart 
dtuart=/soc/serial@1c02 dom0_mem=512M,max:512M
  echo xen_module /trap/vmlinuz
  xen_module /trap/vmlinuz root=/dev/mapper/trap--vg-root ro  
console=hvc0
  echo xen_module /trap/initrd.gz
  xen_module /trap/initrd.gz
  boot
}

Tested-by: Ian Campbell ian.campb...@citrix.com

I didn't yet try a local boot from hdd since I'm sure it would work
equivalently and there still seems to be some discussion around how the
update-grub side should fit together.

Thanks!

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Wei Liu
On Tue, Jul 14, 2015 at 09:27:17AM +0800, Chen, Tiejun wrote:
 Please work with maintainers to get those hvmloader patches acked or
 reviewed.
 
 I will do.
 
 
 
 Note Jackson and Campbell also raised some comments to improve current
 codes.
 
 2. explain why it needs to be in this release (benefits).
 
 RMRR mechanism was broken for a long time and this makes VM always face
 security issues. In addition, those associated devices can't be passed
 through to VM and even result in VM crashes.
 
 3. explain why it doesn't break things (risks).
 
 Our policy makes sure that system will work in the original way by default
 as without the RMRR patches. And especially, this series just impacts those
 platforms which have RMRR.
 
 
 Your patches touch crucial path in hvmloader to build memory map. There
 is risk that it may break hvmloader even if it is turned off.
 
 I would need at least some positive test results from you to confirm if
 this feature is turned off everything works as before.
 
 
 Could you show what sort of test you need here? I just did boot a VM without
 any RDM parameters. I just think maybe someone had this good experience to
 check this.
 

Yes, this is the basic test I need -- at least booting a VM with RDM
off.

Feel free to do more tests and report back if you have time.

Wei.

 Thanks
 Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for VT-d posted-interrupts

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 11:21, wei.l...@citrix.com wrote:
 On Tue, Jul 14, 2015 at 05:51:02AM +, Wu, Feng wrote:
 Is it possible to get to 4.6 if making this feature default off?
 
 Note that I'm not the only one who makes the decision and I can't speak
 for maintainers. The first thing you ought to do is to convince
 maintainers, not me.
 
 If you ask for my opinion -- I don't see a point in releasing feature
 with security flaw in design, even if it is off by default. 

It was actually me who suggested that by flagging this experimental
and defaulting it to off, chances would increase for this to be allowed
in without said issue fixed.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 26/28] tools/libxl: Handle checkpoint records in a libxl migration v2 stream

2015-07-14 Thread Andrew Cooper
On 14/07/15 11:33, Yang Hongyang wrote:
 On 07/13/2015 08:01 PM, Andrew Cooper wrote:
 [...]
   domcreate_rebuild_done(egc, dcs, rc);
 @@ -966,6 +989,7 @@ static void domcreate_bootloader_done(libxl__egc
 *egc,
   }

   /* Restore */
 +callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;

 This should be moved after read_stream_init()

 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index e42de47..86f7f13 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -989,8 +989,6 @@ static void domcreate_bootloader_done(libxl__egc
 *egc,
  }

  /* Restore */
 -callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
 -
  rc = libxl__build_pre(gc, domid, d_config, state);
  if (rc)
  goto out;
 @@ -1003,6 +1001,7 @@ static void domcreate_bootloader_done(libxl__egc
 *egc,
  dcs-srs.legacy = (dcs-restore_params.stream_version == 1);
  dcs-srs.completion_callback = domcreate_stream_done;
  dcs-srs.checkpoint_callback = remus_checkpoint_stream_done;
 +callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;

  libxl__stream_read_start(egc, dcs-srs);
  return;

Thanks.  I have already fixed this a different way, given IanJ's review
of the _init() methods.

The new init methods don't clobber the callbacks.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert map_domain_page_global() to using mfn_t

2015-07-14 Thread Ian Jackson
Jan Beulich writes (Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert 
map_domain_page_global() to using mfn_t):
 On 13.07.15 at 18:56, ian.jack...@eu.citrix.com wrote:
  Surely xfree() ought to have the same prototype as free() ?
 
 Why? If it were to be a full match, why wouldn't we call it free() in
 the first place?

Is that what is supposed to differ between free and xfree ?  That
would be a bit odd.

In the userland world, xfree is the companion to xmalloc:

  
http://manpages.debian.org/cgi-bin/man.cgi?query=xfreeapropos=0sektion=0manpath=Debian+8+jessieformat=htmllocale=en

(Although, logically speaking, xfree is unnecessary in that set.)

 Note also that Linux has
 
 void kfree(const void *);
 void kzfree(const void *);
 
 (with even the krealloc() flavors matching that model).

How odd.

  free is declared in C99 (7.20.3.2 in my copy of TC2) as:
  
  void free(void *ptr);
  
  That is, non-const.  AFAIAA this is not generally regarded as a bug.
 
 Perhaps, but certainly depending on who looks at it. I for my part
 dislike (as hinted at above) that I have to cast away const-ness in
 order to be able to call free() without causing compiler warnings,
 and I generally hide this in wrappers to limit such casting.

The flipside is that if free can take a const*, functions which take a
const struct foo* can free it.  That's not what I would expect such a
function to do.


Do we need to resolve this disagreement now ?

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges

2015-07-14 Thread Chen, Tiejun


Note here I don't address your comments above since I think we should 
achieve an agreement firstly.



I think bitmap mechanism is a good idea but honestly, its not easy to
cover all requirements here. And just like bootmem on Linux side, so its
a little complicated to implement this entirely. So I prefer not to
introduce this way in current phase.


I'm afraid it's going to be hard to convince me of any approaches
further complicating the current mechanism instead of overhauling
it.


I agree we'd better overhaul this since we already found something 
unreasonable here. But one or two weeks is really not enough to fix this 
with a bitmap framework, and although a bitmap can make mmio allocation 
better, but more complicated if we just want to allocate PCI mmio.


So could we do this next? I just feel if you'd like to pay more time 
help me refine our current solution, its relatively realistic to this 
case :) And then we can go into bitmap in details or work out a better 
solution in sufficient time slot.


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 11/15] x86/altp2m: define and implement alternate p2m HVMOP types.

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 01:39, ravi.sah...@intel.com wrote:

 
-Original Message-
From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Monday, July 13, 2015 12:26 AM
To: Sahita, Ravi
Cc: Andrew Cooper; Wei Liu; George Dunlap; Ian Jackson; White, Edmund H;
xen-devel@lists.xen.org; tleng...@novetta.com; Daniel De Graaf; Tim
Deegan
Subject: RE: [PATCH v4 11/15] x86/altp2m: define and implement alternate
p2m HVMOP types.

 On 11.07.15 at 00:03, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
Sent: Friday, July 10, 2015 3:01 AM
 On 10.07.15 at 02:52, edmund.h.wh...@intel.com wrote:
 +default:
 +return -ENOSYS;
 +
 +break;

Bogus (unreachable) break.

 Wanted to keep this so that if someone removes the error code then
 they don't cause an invalid fall through.
 But ok with removing it if you think so.

We don't (intentionally) do this anywhere else, so it should be removed.
 
 Done
 

 +if ( !(d ? d : current-domain)-arch.altp2m_active
 + )

This is bogus: d is NULL if and only if altp2m_vcpu_enable_notify,
i.e. I
 don't
see why you can't just use current-domain inside that case (and you
really do). That would then also eliminate the need for this redundant
and obfuscating switch() nesting you use.


 We need to check if the target domain is in altp2m mode for all the
 following sub-ops.
 If we removed this check, we would need to check for target domain
 being in altp2m for all the following cases.
 Andrew wanted to refactor and pull common code up, and that's what
 this is one case of for hvm_op.

I'd be fine with such refactoring if it didn't result in nested switch()-es 
 using
the same control expression.

 
 Agree that removing the current-domain test will allow us to remove the 
 switch() nesting, however that will also require replicating the common code 
 blocks - I'm ok with making this change just want an ok from Andrew on this 
 one before we make this change to avoid thrashing on this one (since he was 
 the one asking for the refactor).

I don't see why you shouldn't be able to factor this out without this
odd switch() nesting. Note that I didn't object to multiple sequential
switch() statements, only to their strange nesting...

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for VT-d posted-interrupts

2015-07-14 Thread Wei Liu
On Tue, Jul 14, 2015 at 05:51:02AM +, Wu, Feng wrote:
 
 
  -Original Message-
  From: Wei Liu [mailto:wei.l...@citrix.com]
  Sent: Monday, July 13, 2015 7:01 PM
  To: Wu, Feng
  Cc: wei.l...@citrix.com; xen-devel@lists.xen.org; Jan Beulich
  (jbeul...@suse.com); andrew.coop...@citrix.com; George Dunlap; Tian, Kevin;
  Zhang, Yang Z; Wang, Yong Y
  Subject: Re: Requesting for freeze exception for VT-d posted-interrupts
  
  On Mon, Jul 13, 2015 at 06:55:30AM +, Wu, Feng wrote:
   Hi maintainers,
  
   We would like to request an extension for freeze exception for VT-d
  posted-interrupts patch-set.
  
   1. clarify the state of patch series / feature.
   [v3 01/15] Vt-d Posted-interrupt (PI) design
   Reviewed-by: Kevin Tian kevin.t...@intel.com
  
   [v3 02/15] Add helper macro for X86_FEATURE_CX16 feature detection
   Reviewed-by: Kevin Tian kevin.t...@intel.com
   Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
  
   [v3 04/15] iommu: Add iommu_intpost to control VT-d Posted-Interrupts
  feature
   Reviewed-by: Kevin Tian kevin.t...@intel.com
  
   [v3 06/15] vmx: Extend struct pi_desc to support VT-d Posted-Interrupts
   Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 07/15] vmx: Initialize VT-d Posted-Interrupts Descriptor
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 09/15] vt-d: Extend struct iremap_entry to support VT-d
  Posted-Interrupts
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 10/15] vt-d: Add API to update IRTE when VT-d PI is used
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 13/15] vmx: Properly handle notification event when vCPU is running
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 14/15] Update Posted-Interrupts Descriptor during vCPU scheduling
   Acked-by: Kevin Tian kevin.t...@intel.com
  
   [v3 15/15] Add a command line parameter for VT-d posted-interrupts
   Reviewed-by: Kevin Tian kevin.t...@intel.com
  
   2. explain why it needs to be in this release (benefits).
   VT-d posted-interrupts is an important interrupt virtualization feature 
   for
   device pass-through, the running guest can handle external interrupts
   in non-root mode, hence it can eliminate the VM-Exits caused by external
   interrupts. Please refer to the design doc:
   http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg03691.html
  
   From our experimental environment, after using VT-d posted-interrupts, we
   measured 25% improvement in transaction rate netperf TCP_RR benchmark
   and 28% reduction in host CPU utilization when using assigned devices.
   (10G NIC in my test).
  
   3. explain why it doesn't break things (risks).
   This feature only exists in Broadwell Server platform, it has no effect 
   on the
   current hardware.
  
  
  You miss the part that how much common code it touches. There is still
  risk of breaking VMX and VT-D even if PI is disabled.
  
   4. CC relevant maintainers and release manager.
   Done
  
   There are two main outstanding issues so far:
   1. Jan's security concern. I have proposed some solutions but Jan still 
   has
   some problems with my proposals. It would be great if Jan can give a clear
   proposal so that we can discuss and keep making progress.
   2. Scheduler issue: there are conflicts among maintainers 
   Jan/George/Dario.
   I would agree with Jan's suggestion below:
  
Doing this in a central place is certainly the right approach, but
   adding an arch hook that needs to be called everywhere
   vcpu_runstate_change() wouldn't serve that purpose. Instead
   we'd need to replace all current vcpu_runstate_change() calls
   with calls to a new function calling both this and the to be added
   arch hook.
  
  
  Given the current time scale now, I think it would be very hard to get
  these two concerns addressed within a week. Xen has always taken
  security serious, I don't want to rush in a feature with possible flawed
  design.
  
  My answer to this request is no until these concerns are addressed.
 
 Is it possible to get to 4.6 if making this feature default off?
 

Note that I'm not the only one who makes the decision and I can't speak
for maintainers. The first thing you ought to do is to convince
maintainers, not me.

If you ask for my opinion -- I don't see a point in releasing feature
with security flaw in design, even if it is off by default. 

Wei.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-14 Thread Chen, Tiejun

On 2015/7/14 17:32, Jan Beulich wrote:

On 14.07.15 at 07:22, tiejun.c...@intel.com wrote:
+for ( i = 0; i  memory_map.nr_map; i++ )
+{
+uint64_t end = e820[i].addr + e820[i].size;


Either loop index/boundary or used array are wrong here: In the
earlier loop you copied memory_map[0...nr_map-1] to
e820[n...n+nr_map-1], but here you're looping looking at
e820[0...nr_map-1]


You're right. I should lookup all e820[] like this,

for ( i = 0; i  nr; i++ )


Hmm, I would have thought you only care about either part of
the just glued together map.


+if ( e820[i].type == E820_RAM 
+ low_mem_end  e820[i].addr  low_mem_end  end )


Assuming you mean to look at the RDM e820[] entries here, this
is not a correct check: You don't care about partly or fully
contained, all you care about is whether low_mem_end extends
beyond the start of the region.


Here I'm looking at the e820 entry indicating low memory. Because

low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;

and when we allocate MMIO in pci.c, its possible to populate RAM so
hvm_info-low_mem_pgend would be changed over there. So we need to
compensate this loss with high memory. Here memory_map[] also records
the original low/high memory, so if low_mem_end is less-than the
original we need this compensation.


And I'm not disputing your intentions - I'm merely pointing out that
afaics the code above doesn't match these intentions. In particular
(as said) I don't see why you need to check low_mem_end  end.



Before we probably relocate RAM,

low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT

and the e820 entry specific to low memory,

[e820[X].addr, end]

Here, end = e820[X].addr + e820[X].size;

Now low_mem_end = end.

After that, low_mem_end  end. so if

(low_mem_end  e820[X].addr  low_mem_end  end) is true, this means 
that associated RAM entry is hitting, right? Then we need to revise this 
entry as [e820[X].addr, low_mem_end], and compensate [end - low_mem_end] 
to high memory. Anything I'm still wrong here?


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v10 5/5] iommu: add rmrr Xen command line option for extra rmrrs

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 20:18, elena.ufimts...@oracle.com wrote:
 --- a/xen/drivers/passthrough/vtd/dmar.c
 +++ b/xen/drivers/passthrough/vtd/dmar.c
 @@ -867,6 +867,145 @@ out:
  return ret;
  }
  
 +#define MAX_EXTRA_RMRR_PAGES 16
 +#define MAX_EXTRA_RMRR 10
 +
 +/* RMRR units derived from command line rmrr option. */
 +#define MAX_EXTRA_RMRR_DEV 20
 +struct extra_rmrr_unit {
 +struct list_head list;
 +unsigned long base_pfn, end_pfn;
 +unsigned int dev_count;
 +u32sbdf[MAX_EXTRA_RMRR_DEV];

unless you want to align _all_ fields' names, there should be just a
single space between type and name.

 +};
 +static __initdata unsigned int nr_rmrr;
 +static struct __initdata extra_rmrr_unit extra_rmrr_units[MAX_EXTRA_RMRR];
 +
 +/* Macro for RMRR inclusive range formatting. */
 +#define PRI_RMRR(s,e) [%lx-%lx]

Just PRI_RMRR (i.e. no parens or parameters) please. And I'm still
missing a macro to pair the respective arguments - as said before,
as single format specifier should be accompanied by a single
argument (as visible to the reader at the use sites).

 +static void __init add_extra_rmrr(void)
 +{
 +struct acpi_rmrr_unit *acpi_rmrr;
 +struct acpi_rmrr_unit *rmrru;
 +unsigned int dev, seg, i, j;
 +unsigned long pfn;
 +bool_t overlap;
 +
 +for ( i = 0; i  nr_rmrr; i++ )
 +{
 +if ( extra_rmrr_units[i].base_pfn  extra_rmrr_units[i].end_pfn )
 +{
 +printk(XENLOG_ERR VTDPREFIX
 +   Invalid RMRR Range PRI_RMRR(s,e)\n,
 +   extra_rmrr_units[i].base_pfn, 
 extra_rmrr_units[i].end_pfn);
 +continue;
 +}
 +
 +if ( extra_rmrr_units[i].end_pfn - extra_rmrr_units[i].base_pfn =
 + MAX_EXTRA_RMRR_PAGES )
 +{
 +printk(XENLOG_ERR VTDPREFIX
 +   RMRR range PRI_RMRR(s,e) exceeds 
 __stringify(MAX_EXTRA_RMRR_PAGES) pages\n,
 +   extra_rmrr_units[i].base_pfn, 
 extra_rmrr_units[i].end_pfn);
 +continue;
 +}
 +
 +for ( j = 0; j  nr_rmrr; j++ )
 +{
 +if ( i != j 
 + extra_rmrr_units[i].base_pfn = extra_rmrr_units[j].end_pfn 
 
 + extra_rmrr_units[j].base_pfn = extra_rmrr_units[i].end_pfn 
 )
 +{
 +printk(XENLOG_ERR VTDPREFIX
 +  Overlapping RMRRs PRI_RMRR(s,e) and 
 PRI_RMRR(s,e)\n,

No extra here ...

 +  extra_rmrr_units[i].base_pfn, 
 extra_rmrr_units[i].end_pfn,
 +  extra_rmrr_units[j].base_pfn, 
 extra_rmrr_units[j].end_pfn);
 +break;
 +}
 +}
 +/* Broke out of the overlap loop check, continue with next rmrr. */
 +if ( j  nr_rmrr )
 +continue;
 +overlap = 0;
 +list_for_each_entry(rmrru, acpi_rmrr_units, list)
 +{
 +if ( pfn_to_paddr(extra_rmrr_units[i].base_pfn = 
 rmrru-end_address) 
 + rmrru-base_address = 
 pfn_to_paddr(extra_rmrr_units[i].end_pfn) )
 +{
 +printk(XENLOG_ERR VTDPREFIX
 +   Overlapping extra RMRRs PRI_RMRR(s,e) and ACPI 
 RMRRs PRI_RMRR(s,e)\n,

... but here? This may end up being confusing... Also s/RMRRs/RMRR/g
here please.

Also note the misplaced closing parenthesis on the first line of the
enclosing if().

 +   extra_rmrr_units[i].base_pfn,
 +   extra_rmrr_units[i].end_pfn,
 +   paddr_to_pfn(rmrru-base_address),
 +   paddr_to_pfn(rmrru-end_address));
 +overlap = 1;
 +break;
 +}
 +}
 +/* Continue to next RMRR is this one overlaps with one from ACPI. */
 +if ( overlap )
 +continue;
 +
 +pfn = extra_rmrr_units[i].base_pfn;
 +do
 +{
 +if ( !mfn_valid(pfn) || (pfn  (paddr_bits - PAGE_SHIFT)) )
 +{
 +printk(XENLOG_ERR VTDPREFIX
 +   Invalid pfn in RMRR range PRI_RMRR(s,e)\n,
 +   extra_rmrr_units[i].base_pfn, 
 extra_rmrr_units[i].end_pfn);
 +break;
 +}
 +} while ( pfn++ = extra_rmrr_units[i].end_pfn );

Off by one afaict.

 +static void __init parse_rmrr_param(const char *str)
 +{
 +const char *s = str, *cur, *stmp;
 +unsigned int seg, bus, dev, func;
 +unsigned long start, end;
 +
 +do {
 +start = simple_strtoul(cur = s, s, 0);
 +if ( cur == s )
 +break;
 +
 +if ( *s == '-' )
 +{
 +end = simple_strtoul(cur = s + 1, s, 0);
 +if ( cur == s )
 +break;
 +}
 +else
 +end = start;
 +
 +extra_rmrr_units[nr_rmrr].base_pfn = start;
 +extra_rmrr_units[nr_rmrr].end_pfn = end;
 +extra_rmrr_units[nr_rmrr].dev_count = 0;
 +
 +if ( *s != '=' )
 +

Re: [Xen-devel] [PATCH V3] libxl: Increase device model startup timeout to 1min.

2015-07-14 Thread Dario Faggioli
On Tue, 2015-07-14 at 10:37 +0100, Ian Campbell wrote:
 On Tue, 2015-07-14 at 11:25 +0200, Dario Faggioli wrote:
  On Tue, 2015-07-14 at 08:55 +0100, Ian Campbell wrote:

   It'll be hard to say until this change gets through the Xen push gate
   and that version gets used for other branches (linux testing, libvirt,
   ovmf, osstest's own gate etc).
   
  Indeed. My opinion is that no, it is not.
  
  My understanding of the data Anthony provided is that, under some
  (difficult to track/analyze/reproduce/etc) load conditions, the Linux IO
  and VM subsystem suffer from high latency, delaying QEMU startup.
  
  In the merlot* cases, the system is completely idle, apart from the
  failing creation/migration operation.
  
  So, no, I don't think that would not be the fix we need for that
  situation.
 
 Even if it is not the correct fix it seems like in some situations the
 increase in timeout has improved things, hence it is an answer as Jan
 asked (his quote marks).
 
Sure! And that's why I find this weird/interesting...

   At the moment it looks like it has helped with some but not all of the
   issues.
   
   These:
   
   http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
   http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html
   
  Can I ask why (I mean, e.g., comparing what with what) you're saying it
  seems to have helped?
 
 There seemed (unscientifically) to be fewer of the libvirt related
 guest-start failures.
 
And you mean by only looking at xen-unstable lines, don't you?

If yes, looking at merlot0, I've found the below.

Old timeout, failing:
http://logs.test-lab.xenproject.org/osstest/logs/59105/test-amd64-amd64-libvirt-xsm/info.html

New timeout, success:
http://logs.test-lab.xenproject.org/osstest/logs/59311/test-amd64-amd64-libvirt/info.html

And, looking at how long QEMU did take to start up that would be:

  13:44:32 - 13:43:42

i.e., just a bit less than 1min!

So, yes, it looks that this change is actually going to help in this
case.

What I'm missing is how it is possible that, on an idle system, DM
spawning takes that long. As said, in Anthony's OpenStack case, the
system was quite busy... not that it can't be a bug (somewhere, perhaps
in Linux) in that case too, but here, it looks even more weird to me.

May it be the NUMA misconfiguration? Well, if yes, I'm not sure how...

Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-14 Thread Chen, Tiejun

The way this sort of thing is defined in the rest of domctl.h is like this:

#define _XEN_DOMCTL_CDF_hvm_guest 0
#define XEN_DOMCTL_CDF_hvm_guest  (1U_XEN_DOMCTL_CDF_hvm_guest)

So the above should be

#define _XEN_DOMCTL_DEV_RDM_RELAXED 0
#define XEN_DOMCTL_DEV_RDM_RELAXED (1U_XEN_DOMCTL_DEV_RDM_RELAXED)

And then your check in iommu_do_pci_domctl() would look like

if (flag  ~XEN_DOMCTL_DEV_RDM_RELAXED)

And if we end up adding any extra flags, we just | them into the above
conditional, as is done in, for example, the XEN_DOMCTL_createdomain
case in xen/common/domctl.c:do_domctl().



Seems Jan didn't like this way IIRC, so I hope Jan also can have a look 
at this beforehand :)


Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Stefano Stabellini
On Tue, 14 Jul 2015, Ian Campbell wrote:
 On Mon, 2015-07-13 at 18:24 +0100, Stefano Stabellini wrote:
  On Mon, 13 Jul 2015, Wei Liu wrote:
   On Fri, Jul 10, 2015 at 04:16:07PM +0530, Vijay Kilari wrote:
Hi Wei,

I would like to have freeze exception for ITS feature on ARM64.
Design got freeze few weeks back and I have sent v4 version of patch 
series
today.

This patches will not impact any generic code of other platforms and 
have minor
changes generic arm related code. Also these patches are only for
ARM64 platform.

These patches are pre-requisite for PCI support / Pass-through support
on ARM64 platforms.

The risk is minor and as of today only used by Cavium ThunderX platform.

   
   
   I'm not a ARM expert, but last time I checked most patches in v3 are not
   acked.
   
   I also got conflict statements from maintainers and core developer. I
   will wait a bit for them clarify the situation.
   
   But as Ian said, if you can't post v4 and get most if all of your
   patches acked / reviewed early this week, my answer to this request
   would be no.
  
  I pretty much agree with Ian: I went through the patches and the impact
  of the series on non-ITS platforms will be null after Vijay addresses:
  
  - the lpi irq_desc and irq_pending allocation issues
  - improve lpi_supported to check for ITS presence
  
  these two changes should be trivial and are certainly necessary for a
  freeze exception in my view.
  
  
  On this basis, if Vijay manages to resend a v5 version on time with
  those two issues covered, making sure that the new code is not enabled
  unless an its is present, then I think that a freeze exception would be
  OK as the risk would be zero.
 
 I don't think we should be limiting ourselves to only fixing issues
 which reduce the risk on non-ITS platforms. So the two issues which you
 highlight above are necessary but not sufficient for a freeze exception
 in my view.
 
 For example I am firmly of the opinion that the VPLI injection code
 needs to be corrected as discussed during review.
 
 Likewise I said that care needs to be taken wrt when any of this code is
 enabled, which includes not exposing it to domU even on platforms which
 support ITS. I also view this as a requirement for a freeze exception.
 In other words only dom0 and only on an ITS enabled system should be
 exposed to any aspect of the ITS support.
 
I agree.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 08/29] tools/libxl: Add save_helper_state pointers to libxl__xc_domain_{save, restore}()

2015-07-14 Thread Andrew Cooper
Currently, libxl__xc_domain_{save,restore}() have specific knowledge
of where the libxl__save_helper_state lives inside a
libxl__domain_{create,save}_state object.

In later changes, the logical ownership of the
libxl__save_helper_state will change and will no longer be
d{c,s}s-shs.

No functional change.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
New in v3
---
 tools/libxl/libxl_create.c   |4 +--
 tools/libxl/libxl_dom.c  |   20 +++---
 tools/libxl/libxl_internal.h |5 +++-
 tools/libxl/libxl_save_callout.c |   54 --
 4 files changed, 44 insertions(+), 39 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index be13204..7f0ffc6 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -988,7 +988,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 rc = ERROR_INVAL;
 goto out;
 }
-libxl__xc_domain_restore(egc, dcs,
+libxl__xc_domain_restore(egc, dcs, dcs-shs,
  hvm, pae, superpages);
 return;
 
@@ -1000,7 +1000,7 @@ void libxl__srm_callout_callback_restore_results(unsigned 
long store_mfn,
   unsigned long console_mfn, void *user)
 {
 libxl__save_helper_state *shs = user;
-libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+libxl__domain_create_state *dcs = shs-caller_state;
 STATE_AO_GC(dcs-ao);
 libxl__domain_build_state *const state = dcs-build_state;
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 8642192..a685b77 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1121,7 +1121,7 @@ int libxl__toolstack_restore(uint32_t domid, const 
uint8_t *ptr,
  uint32_t size, void *user)
 {
 libxl__save_helper_state *shs = user;
-libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
+libxl__domain_create_state *dcs = shs-caller_state;
 STATE_AO_GC(dcs-ao);
 int ret;
 uint32_t version = 0, bufsize;
@@ -1188,7 +1188,7 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 libxl__save_helper_state *shs)
 {
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 libxl__logdirty_switch *lds = dss-logdirty;
 STATE_AO_GC(dss-ao);
 int rc;
@@ -1260,7 +1260,7 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 libxl__save_helper_state *shs)
 {
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 STATE_AO_GC(dss-ao);
 int rc;
 
@@ -1279,7 +1279,7 @@ static void logdirty_init(libxl__logdirty_switch *lds)
 {
 libxl__save_helper_state *shs = user;
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 STATE_AO_GC(dss-ao);
 
 switch (libxl__device_model_version_running(gc, domid)) {
@@ -1832,7 +1832,7 @@ static void libxl__domain_suspend_callback(void *data)
 {
 libxl__save_helper_state *shs = data;
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 
 dss-callback_common_done = domain_suspend_callback_common_done;
 domain_suspend_callback_common(egc, dss);
@@ -1859,7 +1859,7 @@ static void libxl__remus_domain_suspend_callback(void 
*data)
 {
 libxl__save_helper_state *shs = data;
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 
 dss-callback_common_done = remus_domain_suspend_callback_common_done;
 domain_suspend_callback_common(egc, dss);
@@ -1902,7 +1902,7 @@ static void libxl__remus_domain_resume_callback(void 
*data)
 {
 libxl__save_helper_state *shs = data;
 libxl__egc *egc = shs-egc;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
+libxl__domain_suspend_state *dss = shs-caller_state;
 STATE_AO_GC(dss-ao);
 
 libxl__remus_devices_state *const rds = dss-rds;
@@ -1947,8 +1947,8 @@ static void remus_next_checkpoint(libxl__egc *egc, 
libxl__ev_time *ev,
 static void libxl__remus_domain_checkpoint_callback(void *data)
 {
 libxl__save_helper_state *shs = data;
-libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
-libxl__egc *egc = dss-shs.egc;
+libxl__domain_suspend_state *dss = shs-caller_state;
+libxl__egc *egc = shs-egc;
 STATE_AO_GC(dss-ao);
 
 /* This would go into tailbuf. */
@@ -2117,7 +2117,7 @@ void 

[Xen-devel] [PATCH v4 02/29] tools/libxc: Always compile the compat qemu variables into xc_sr_context

2015-07-14 Thread Andrew Cooper
This is safe (as the variables will simply be unused), and is required
for correct compilation when midway through untangling the libxc/libxl
interaction.

The #define is left in place to highlight that the variables can be
removed once the untangling is complete.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxc/xc_sr_common.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 565c5da..08c66db 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -307,10 +307,10 @@ struct xc_sr_context
 void *context;
 size_t contextsz;
 
-#ifdef XG_LIBXL_HVM_COMPAT
+/* #ifdef XG_LIBXL_HVM_COMPAT */
 uint32_t qlen;
 void *qbuf;
-#endif
+/* #endif */
 } restore;
 };
 } x86_hvm;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 05/29] tools/libxl: Stash all restore parameters in domain_create_state

2015-07-14 Thread Andrew Cooper
Shortly more parameters will appear, and this saves unboxing each one.
libxl_domain_restore_params is mandatory for restore streams, and
ignored for plain creation.  The old 'checkpointed_stream' was
incorrectly identified as a private parameter when it was infact
public.

No functional change.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
Reviewed-by: Yang Hongyang yan...@cn.fujitsu.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
Since v1:
 * Gate validity on restore_fd being valid.
---
 tools/libxl/libxl_create.c   |   13 +++--
 tools/libxl/libxl_internal.h |2 +-
 tools/libxl/libxl_save_callout.c |2 +-
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index b785ddd..61515da 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1512,8 +1512,8 @@ static void domain_create_cb(libxl__egc *egc,
  int rc, uint32_t domid);
 
 static int do_domain_create(libxl_ctx *ctx, libxl_domain_config *d_config,
-uint32_t *domid,
-int restore_fd, int checkpointed_stream,
+uint32_t *domid, int restore_fd,
+const libxl_domain_restore_params *params,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
@@ -1526,8 +1526,9 @@ static int do_domain_create(libxl_ctx *ctx, 
libxl_domain_config *d_config,
 libxl_domain_config_init(cdcs-dcs.guest_config_saved);
 libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config);
 cdcs-dcs.restore_fd = restore_fd;
+if (restore_fd  -1)
+cdcs-dcs.restore_params = *params;
 cdcs-dcs.callback = domain_create_cb;
-cdcs-dcs.checkpointed_stream = checkpointed_stream;
 libxl__ao_progress_gethow(cdcs-dcs.aop_console_how, aop_console_how);
 cdcs-domid_out = domid;
 
@@ -1553,7 +1554,7 @@ int libxl_domain_create_new(libxl_ctx *ctx, 
libxl_domain_config *d_config,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
-return do_domain_create(ctx, d_config, domid, -1, 0,
+return do_domain_create(ctx, d_config, domid, -1, NULL,
 ao_how, aop_console_how);
 }
 
@@ -1563,8 +1564,8 @@ int libxl_domain_create_restore(libxl_ctx *ctx, 
libxl_domain_config *d_config,
 const libxl_asyncop_how *ao_how,
 const libxl_asyncprogress_how *aop_console_how)
 {
-return do_domain_create(ctx, d_config, domid, restore_fd,
-params-checkpointed_stream, ao_how, 
aop_console_how);
+return do_domain_create(ctx, d_config, domid, restore_fd, params,
+ao_how, aop_console_how);
 }
 
 /*
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7ccbf55..6428757 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3217,11 +3217,11 @@ struct libxl__domain_create_state {
 libxl_domain_config *guest_config;
 libxl_domain_config guest_config_saved; /* vanilla config */
 int restore_fd;
+libxl_domain_restore_params restore_params;
 libxl__domain_create_cb *callback;
 libxl_asyncprogress_how aop_console_how;
 /* private to domain_create */
 int guest_domid;
-int checkpointed_stream;
 libxl__domain_build_state build_state;
 libxl__bootloader_state bl;
 libxl__stub_dm_spawn_state dmss;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index b82a5c1..80aae1b 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -60,7 +60,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, 
libxl__domain_create_state *dcs,
 state-store_domid, state-console_port,
 state-console_domid,
 hvm, pae, superpages,
-cbflags, dcs-checkpointed_stream,
+cbflags, dcs-restore_params.checkpointed_stream,
 };
 
 dcs-shs.ao = ao;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 01/29] bsd-sys-queue-h-seddery: Massage `offsetof'

2015-07-14 Thread Andrew Cooper
From: Ian Jackson ian.jack...@eu.citrix.com

For some reason BSD's queue.h uses `__offsetof'.  It expects it to
work just like offsetof.  So use offsetof.

Reported-by: Andrew Cooper andrew.coop...@citrix.com
Signed-off-by: Ian Jackson ian.jack...@eu.citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
---
 tools/include/xen-external/bsd-sys-queue-h-seddery |2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/include/xen-external/bsd-sys-queue-h-seddery 
b/tools/include/xen-external/bsd-sys-queue-h-seddery
index 7a957e3..3f8716d 100755
--- a/tools/include/xen-external/bsd-sys-queue-h-seddery
+++ b/tools/include/xen-external/bsd-sys-queue-h-seddery
@@ -69,4 +69,6 @@ s/\b struct \s+ type \b/type/xg;
 
 s,^\#include.*sys/cdefs.*,/* $ */,xg;
 
+s,\b __offsetof \b ,offsetof,xg;
+
 s/\b( NULL )/0/xg;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [ovmf test] 59511: all pass - PUSHED

2015-07-14 Thread osstest service owner
flight 59511 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59511/

Perfect :-)
All tests in this flight passed
version targeted for testing:
 ovmf 2ad9cf37a492e69a4e1b7624d92d9a35fce083fc
baseline version:
 ovmf 6bc4e42f9d043bcda03f03a74e6dec0aa5c0ead1

Last test of basis59498  2015-07-13 10:48:01 Z0 days
Testing same since59511  2015-07-13 13:47:15 Z0 days1 attempts


People who touched revisions under test:
  Leif Lindholm leif.lindh...@linaro.org

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=ovmf
+ revision=2ad9cf37a492e69a4e1b7624d92d9a35fce083fc
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push ovmf 
2ad9cf37a492e69a4e1b7624d92d9a35fce083fc
+ branch=ovmf
+ revision=2ad9cf37a492e69a4e1b7624d92d9a35fce083fc
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case $branch in
+ tree=ovmf
+ xenbranch=xen-unstable
+ '[' xovmf = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{OsstestUpstream} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{GitCacheProxy} or die $!;
'
+++ local cache=git://cache:9419/
+++ '[' xgit://cache:9419/ '!=' x ']'
+++ echo 
'git://cache:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : 
'git://cache:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : git
++ : git://git.seabios.org/seabios.git
++ : 

Re: [Xen-devel] [GIT-PULL OSSTEST] ap-fetch-version: Arrange for osstest merges from upstream to be stable

2015-07-14 Thread Ian Campbell
On Mon, 2015-07-13 at 18:03 +0100, Ian Jackson wrote:
 Ian Campbell writes ([GIT-PULL OSSTEST] ap-fetch-version: Arrange for 
 osstest merges from upstream to be stable):
  ap-fetch-version: Arrange for osstest merges from upstream to be
  stable has now passed the Cambridge push gate and done a couple of
  merges from upstream as well as some failed attempts, which behaved as
  expected. I've dropped a stop file in Cambridge while this pull request
  is pending, so we don't end up racing.
 
 I have merged, and pushed this to pretest in the colo.

Thanks.

 Let us avoid pushing anything else to pretest for a bit, so that we
 can contain the history mess.

The Cambridge instance is currently stopped. I won't unstop it until
this merge passes mainline pretest, at which point further merges into
Cambridge should be fast forwarding again.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/1] libxl: set stub domain size based on VRAM size

2015-07-14 Thread George Dunlap
On Sat, Jul 11, 2015 at 10:30 PM, Eric Shelton eshel...@pobox.com wrote:
 Allocate additional memory to the stub domain for qemu-traditional if
 more than 4 MB is assigned to the video adapter to avoid out of memory
 condition for QEMU.

 Signed-off-by: Eric Shelton eshel...@pobox.com

This seems like a good fix for now, thanks.

But (just speaking to people in general) it does make it even harder
to predict exactly how much host memory a VM is going to end up
needing to start, which is something a lot of people complain about.

I'm beginning to think that one feature we should try to look at is a
way for the admin to specify, Max host memory used, from which
shadow/p2m/device memory/stubdomain can all be allocated.

 -George

 ---
  tools/libxl/libxl_dm.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
 index 317a8eb..9a5a937 100644
 --- a/tools/libxl/libxl_dm.c
 +++ b/tools/libxl/libxl_dm.c
 @@ -1087,6 +1087,7 @@ void libxl__spawn_stub_dm(libxl__egc *egc, 
 libxl__stub_dm_spawn_state *sdss)

  dm_config-b_info.max_vcpus = 1;
 -dm_config-b_info.max_memkb = 32 * 1024;
 +dm_config-b_info.max_memkb = 28 * 1024 +
 +guest_config-b_info.video_memkb;
  dm_config-b_info.target_memkb = dm_config-b_info.max_memkb;

  dm_config-b_info.u.pv.features = ;
 --
 2.1.0


 ___
 Xen-devel mailing list
 Xen-devel@lists.xen.org
 http://lists.xen.org/xen-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 07:22, tiejun.c...@intel.com wrote:
 +for ( i = 0; i  memory_map.nr_map; i++ )
 +{
 +uint64_t end = e820[i].addr + e820[i].size;

 Either loop index/boundary or used array are wrong here: In the
 earlier loop you copied memory_map[0...nr_map-1] to
 e820[n...n+nr_map-1], but here you're looping looking at
 e820[0...nr_map-1]
 
 You're right. I should lookup all e820[] like this,
 
 for ( i = 0; i  nr; i++ )

Hmm, I would have thought you only care about either part of
the just glued together map.

 +if ( e820[i].type == E820_RAM 
 + low_mem_end  e820[i].addr  low_mem_end  end )

 Assuming you mean to look at the RDM e820[] entries here, this
 is not a correct check: You don't care about partly or fully
 contained, all you care about is whether low_mem_end extends
 beyond the start of the region.
 
 Here I'm looking at the e820 entry indicating low memory. Because
 
 low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;
 
 and when we allocate MMIO in pci.c, its possible to populate RAM so 
 hvm_info-low_mem_pgend would be changed over there. So we need to 
 compensate this loss with high memory. Here memory_map[] also records 
 the original low/high memory, so if low_mem_end is less-than the 
 original we need this compensation.

And I'm not disputing your intentions - I'm merely pointing out that
afaics the code above doesn't match these intentions. In particular
(as said) I don't see why you need to check low_mem_end  end.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 27/29] tools/libxl: Handle checkpoint records in a libxl migration v2 stream

2015-07-14 Thread Andrew Cooper
This is the final bit of untangling for Remus.

When libxc issues a checkpoint callback, start reading and buffering
all libxl records from the stream.  Once a CHECKPOINT_END record is
encountered, start processing all records.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v3: Simplify, use named constants for API
---
 tools/libxl/libxl_create.c  |   25 
 tools/libxl/libxl_internal.h|8 +++
 tools/libxl/libxl_stream_read.c |  123 +++
 3 files changed, 156 insertions(+)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index c51d64f..5b4d333 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -672,6 +672,27 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
 libxl_device_model_version_to_string(b_info-device_model_version));
 }
 
+/*- remus asynchronous checkpoint callback -*/
+
+static void remus_checkpoint_stream_done(
+libxl__egc *egc, libxl__stream_read_state *srs, int rc);
+
+static void libxl__remus_domain_checkpoint_callback(void *data)
+{
+libxl__save_helper_state *shs = data;
+libxl__domain_create_state *dcs = shs-caller_state;
+libxl__egc *egc = shs-egc;
+STATE_AO_GC(dcs-ao);
+
+libxl__stream_read_start_checkpoint(egc, dcs-srs);
+}
+
+static void remus_checkpoint_stream_done(
+libxl__egc *egc, libxl__stream_read_state *stream, int rc)
+{
+libxl__xc_domain_saverestore_async_callback_done(egc, stream-shs, rc);
+}
+
 /*- main domain creation -*/
 
 /* We have a linear control flow; only one event callback is
@@ -939,6 +960,8 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 libxl_domain_config *const d_config = dcs-guest_config;
 const int restore_fd = dcs-restore_fd;
 libxl__domain_build_state *const state = dcs-build_state;
+libxl__srm_restore_autogen_callbacks *const callbacks =
+dcs-srs.shs.callbacks.restore.a;
 
 if (rc) {
 domcreate_rebuild_done(egc, dcs, rc);
@@ -966,6 +989,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 }
 
 /* Restore */
+callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
 
 rc = libxl__build_pre(gc, domid, d_config, state);
 if (rc)
@@ -976,6 +1000,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 dcs-srs.fd = restore_fd;
 dcs-srs.legacy = (dcs-restore_params.stream_version == 1);
 dcs-srs.completion_callback = domcreate_stream_done;
+dcs-srs.checkpoint_callback = remus_checkpoint_stream_done;
 
 libxl__stream_read_start(egc, dcs-srs);
 return;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 9961a53..7aa4f16 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3309,9 +3309,13 @@ struct libxl__stream_read_state {
 void (*completion_callback)(libxl__egc *egc,
 libxl__stream_read_state *srs,
 int rc);
+void (*checkpoint_callback)(libxl__egc *egc,
+libxl__stream_read_state *srs,
+int rc);
 /* Private */
 int rc;
 bool running;
+bool in_checkpoint;
 libxl__save_helper_state shs;
 libxl__conversion_helper_state chs;
 
@@ -3321,6 +3325,8 @@ struct libxl__stream_read_state {
 LIBXL_STAILQ_HEAD(, libxl__sr_record_buf) record_queue; /* NOGC */
 enum {
 SRS_PHASE_NORMAL,
+SRS_PHASE_BUFFERING,
+SRS_PHASE_UNBUFFERING,
 } phase;
 bool recursion_guard;
 
@@ -3335,6 +3341,8 @@ struct libxl__stream_read_state {
 _hidden void libxl__stream_read_init(libxl__stream_read_state *stream);
 _hidden void libxl__stream_read_start(libxl__egc *egc,
   libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_start_checkpoint(libxl__egc *egc,
+ libxl__stream_read_state 
*stream);
 _hidden void libxl__stream_read_abort(libxl__egc *egc,
   libxl__stream_read_state *stream, int 
rc);
 static inline bool
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 3051a2b..2d17403 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -35,6 +35,8 @@
  * Undefinedundef  undefundef   undefundef
  * Idle false  undeffalse   00
  * Active   true   NORMAL   false   0/1  0/partial
+ * Active   true   BUFFERINGtrueany  0/partial
+ * Active   true   UNBUFFERING  trueany  0
  *
  * While reading data from the stream, 'dc' is active and a callback
  * is expected.  Most actions in process_record() start a callback of
@@ -46,6 +48,15 @@
  *   Records are read 

[Xen-devel] [PATCH v4 25/29] tools/libxl: Write checkpoint records into the stream

2015-07-14 Thread Andrew Cooper
when signalled to do so by libxl__remus_domain_checkpoint_callback()

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v3: Corrections to comments

This patch has changed substantially in v2 as a result of changes earlier in
the series.  No behavioural difference from v1.
---
 tools/libxl/libxl_dom.c  |   18 -
 tools/libxl/libxl_internal.h |7 
 tools/libxl/libxl_stream_write.c |   81 --
 3 files changed, 93 insertions(+), 13 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index cda3a7a..927a10e 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1937,8 +1937,8 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
 
 /*- remus asynchronous checkpoint callback -*/
 
-static void remus_checkpoint_dm_saved(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int 
rc);
+static void remus_checkpoint_stream_written(
+libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
 libxl__remus_devices_state *rds,
 int rc);
@@ -1953,17 +1953,14 @@ static void 
libxl__remus_domain_checkpoint_callback(void *data)
 libxl__egc *egc = shs-egc;
 STATE_AO_GC(dss-ao);
 
-/* This would go into tailbuf. */
-if (dss-hvm) {
-libxl__domain_save_device_model(egc, dss, remus_checkpoint_dm_saved);
-} else {
-remus_checkpoint_dm_saved(egc, dss, 0);
-}
+libxl__stream_write_start_checkpoint(egc, dss-sws);
 }
 
-static void remus_checkpoint_dm_saved(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc)
+static void remus_checkpoint_stream_written(
+libxl__egc *egc, libxl__stream_write_state *sws, int rc)
 {
+libxl__domain_suspend_state *dss = CONTAINER_OF(sws, *dss, sws);
+
 /* Convenience aliases */
 libxl__remus_devices_state *const rds = dss-rds;
 
@@ -2113,6 +2110,7 @@ void libxl__domain_suspend(libxl__egc *egc, 
libxl__domain_suspend_state *dss)
 callbacks-suspend = libxl__remus_domain_suspend_callback;
 callbacks-postcopy = libxl__remus_domain_resume_callback;
 callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
+dss-sws.checkpoint_callback = remus_checkpoint_stream_written;
 } else
 callbacks-suspend = libxl__domain_suspend_callback;
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 0697ba6..9961a53 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2968,9 +2968,13 @@ struct libxl__stream_write_state {
 void (*completion_callback)(libxl__egc *egc,
 libxl__stream_write_state *sws,
 int rc);
+void (*checkpoint_callback)(libxl__egc *egc,
+libxl__stream_write_state *sws,
+int rc);
 /* Private */
 int rc;
 bool running;
+bool in_checkpoint;
 libxl__save_helper_state shs;
 
 /* Main stream-writing data. */
@@ -2987,6 +2991,9 @@ struct libxl__stream_write_state {
 _hidden void libxl__stream_write_init(libxl__stream_write_state *stream);
 _hidden void libxl__stream_write_start(libxl__egc *egc,
libxl__stream_write_state *stream);
+_hidden void
+libxl__stream_write_start_checkpoint(libxl__egc *egc,
+ libxl__stream_write_state *stream);
 _hidden void libxl__stream_write_abort(libxl__egc *egc,
libxl__stream_write_state *stream,
int rc);
diff --git a/tools/libxl/libxl_stream_write.c b/tools/libxl/libxl_stream_write.c
index d1343e0..944a87b 100644
--- a/tools/libxl/libxl_stream_write.c
+++ b/tools/libxl/libxl_stream_write.c
@@ -22,6 +22,8 @@
  * Entry points from outside:
  *  - libxl__stream_write_start()
  * - Start writing a stream from the start.
+ *  - libxl__stream_write_start_checkpoint()
+ * - Write the records which form a checkpoint into a stream.
  *
  * In normal operation, there are two tasks running at once; this
  * stream processing, and the libxl-save-helper.  check_all_finished()
@@ -39,6 +41,12 @@
  *  - Toolstack record
  *  - if (hvm), Qemu record
  *  - End record
+ *
+ * For checkpointed stream, there is a second loop which is triggered by a
+ * save-helper checkpoint callback.  It writes:
+ *  - Toolstack record
+ *  - if (hvm), Qemu record
+ *  - Checkpoint end record
  */
 
 /* Success/error/cleanup handling. */
@@ -48,6 +56,9 @@ static void stream_complete(libxl__egc *egc,
 libxl__stream_write_state *stream, int rc);
 static void 

[Xen-devel] [PATCH v4 24/29] docs/libxl: Introduce CHECKPOINT_END to support migration v2 remus streams

2015-07-14 Thread Andrew Cooper
In a remus scenario, libxc will write a CHECKPOINT record, then hand
ownership of the fd to libxl.  Libxl then writes any records required
and finishes with a CHECKPOINT_END record, then hands ownership of the
fd back to libxc.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 docs/specs/libxl-migration-stream.pandoc |   14 +-
 tools/libxl/libxl_sr_stream_format.h |1 +
 tools/python/xen/migration/libxl.py  |   11 +++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/specs/libxl-migration-stream.pandoc 
b/docs/specs/libxl-migration-stream.pandoc
index 4192950..c24a434 100644
--- a/docs/specs/libxl-migration-stream.pandoc
+++ b/docs/specs/libxl-migration-stream.pandoc
@@ -119,7 +119,9 @@ type 0x: END
 
  0x0003: EMULATOR_CONTEXT
 
- 0x0004 - 0x7FFF: Reserved for future _mandatory_
+ 0x0004: CHECKPOINT_END
+
+ 0x0005 - 0x7FFF: Reserved for future _mandatory_
  records.
 
  0x8000 - 0x: Reserved for future _optional_
@@ -203,3 +205,13 @@ indexIndex of this emulator for the domain, if 
multiple
 
 emulator_ctx Emulator context blob.
 
+
+CHECKPOINT\_END
+---
+
+A checkpoint end record marks the end of a checkpoint in the image.
+
+ 0 1 2 3 4 5 6 7 octet
++-+
+
+The end record contains no fields; its body_length is 0.
diff --git a/tools/libxl/libxl_sr_stream_format.h 
b/tools/libxl/libxl_sr_stream_format.h
index f4f790b..3f3c497 100644
--- a/tools/libxl/libxl_sr_stream_format.h
+++ b/tools/libxl/libxl_sr_stream_format.h
@@ -35,6 +35,7 @@
 #define REC_TYPE_LIBXC_CONTEXT   0x0001U
 #define REC_TYPE_XENSTORE_DATA   0x0002U
 #define REC_TYPE_EMULATOR_CONTEXT0x0003U
+#define REC_TYPE_CHECKPOINT_END  0x0004U
 
 typedef struct libxl__sr_emulator_hdr
 {
diff --git a/tools/python/xen/migration/libxl.py 
b/tools/python/xen/migration/libxl.py
index 4e1f4f8..415502e 100644
--- a/tools/python/xen/migration/libxl.py
+++ b/tools/python/xen/migration/libxl.py
@@ -36,12 +36,14 @@ REC_TYPE_end  = 0x
 REC_TYPE_libxc_context= 0x0001
 REC_TYPE_xenstore_data= 0x0002
 REC_TYPE_emulator_context = 0x0003
+REC_TYPE_checkpoint_end   = 0x0004
 
 rec_type_to_str = {
 REC_TYPE_end  : End,
 REC_TYPE_libxc_context: Libxc context,
 REC_TYPE_xenstore_data: Xenstore data,
 REC_TYPE_emulator_context : Emulator context,
+REC_TYPE_checkpoint_end   : Checkpoint end,
 }
 
 # emulator_context
@@ -176,6 +178,13 @@ class VerifyLibxl(VerifyBase):
 self.info(  Index %d, type %s % (emu_idx, 
emulator_id_to_str[emu_id]))
 
 
+def verify_record_checkpoint_end(self, content):
+ Checkpoint end record 
+
+if len(content) != 0:
+raise RecordError(Checkpoint end record with non-zero length)
+
+
 record_verifiers = {
 REC_TYPE_end:
 VerifyLibxl.verify_record_end,
@@ -185,4 +194,6 @@ record_verifiers = {
 VerifyLibxl.verify_record_xenstore_data,
 REC_TYPE_emulator_context:
 VerifyLibxl.verify_record_emulator_context,
+REC_TYPE_checkpoint_end:
+VerifyLibxl.verify_record_checkpoint_end,
 }
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Vijay Kilari
Hi Ian,

On Sat, Jul 11, 2015 at 12:06 PM, Julien Grall julien.gr...@citrix.com wrote:
 Hi,

 On 10/07/2015 17:52, Ian Campbell wrote:

 On Fri, 2015-07-10 at 12:01 +0100, Jan Beulich wrote:

 On 10.07.15 at 12:46, vijay.kil...@gmail.com wrote:

  I would like to have freeze exception for ITS feature on ARM64.
 Design got freeze few weeks back and I have sent v4 version of patch
 series
 today.

 This patches will not impact any generic code of other platforms and
 have
 minor
 changes generic arm related code. Also these patches are only for
 ARM64 platform.

 These patches are pre-requisite for PCI support / Pass-through support
 on ARM64 platforms.


 But what good will it do for the release when it's only a prereq for
 further work?


 I think this may have been true of v3 but I think it is underselling v4.
 AFAICT with the final patch in v4 Xen will actually be capable of
 booting on a ThunderX platform and starting guests. It will simply lack
 PCI passthrough capabilities (as do all ARM platforms today).


 I think you are overselling this patch series. PCI devices provide the
 support of both legacy interrupt and MSI.

 The former is handled by your PCI patch series [1] pushed upstreamed few
 days ago. This should be enough for booting ThunderX and creating guests.

 The ITS add the support of MSI which is useful for performance question and
 PCI passthrough.

 I think that's a reasonable thing to try and get into 4.6.


 If we ever decide to get this into 4.6 we need some guarantee from Cavium to
 finish the work afterwards.

 Regards,

 [1] http://lists.xen.org/archives/html/xen-devel/2015-07/msg00656.html

I am trying to boot latest staging Xen branch on ThunderX with ITS patches.

I face below issues with above [1] patch series

1)  If pcie support only MSI, then INT mapping is not specified in DT. However
  the below code returns error if INT mapping is not found and does not map.
In ThunderX INT mapping is not specified for pcie nodes.

static int map_device_children(struct domain *d,
const struct dt_device_node *dev)
{
 int ret;

 if ( dt_device_type_is_equal(dev, pci) )
 {
 DPRINT(Mapping children of %s to guest\n, dt_node_full_name(dev));

 ret = dt_for_each_irq_map(dev, map_dt_irq_to_domain, d);
 if ( ret  0 )
 return ret;  // Returns error here
...
}

2) Dom0 fails to boot with GICv3. It hangs just after GICv3 initialization.

To know which is last distributor/re-distributor registers read, I
have added debug prints
in GICD  GICR mmio handlers in Xen. But with your patches Linux driver never
traps to Xen to read/write GICD/GICR registers. If I revert back this
patch series,
I see the traps.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 10:18 +0100, Jan Beulich wrote:

  Y Y   [v7][PATCH 14/16] xen/vtd: enable USB device assignment

diffstat:
 xen/drivers/passthrough/vtd/dmar.h  |  1 -
 xen/drivers/passthrough/vtd/iommu.c | 11 ++-
 xen/drivers/passthrough/vtd/utils.c |  7 ---
 3 files changed, 2 insertions(+), 17 deletions(-)

  Y Y   [v7][PATCH 15/16] xen/vtd: prevent from assign the 
  device with
  shared rmrr

xen/drivers/passthrough/vtd/iommu.c | 32 +---
 1 file changed, 29 insertions(+), 3 deletions(-)

 
  And yet again for these two. Please avoid giving a false impression
  
  But these two patches really won Kevin's Ack, and also I wrote this line
  
  Acked-by: Kevin Tian kevin.t...@intel.com
  
  both in these two patches.
 
 But talk here is about their review status, not who ack-ed them (and
 an ack by other than  a maintainer of the affected code is not very
 meaningful anyway).

Kevin is maintainer of that code though, isn't he?

$ ./scripts/get_maintainer.pl -f xen/drivers/passthrough/vtd/dmar.h
Yang Zhang yang.z.zh...@intel.com
Kevin Tian kevin.t...@intel.com
xen-devel@lists.xen.org

(same for the other files)

From MAINTAINERS I suppose that is this entry:

INTEL(R) VT FOR DIRECTED I/O (VT-D)
M:  Yang Zhang yang.z.zh...@intel.com
M:  Kevin Tian kevin.t...@intel.com
S:  Supported
F:  xen/drivers/passthrough/vtd/

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH] sysctl: adjust XEN_SYSCTL_numainfo behavior

2015-07-14 Thread Jan Beulich
... to match XEN_SYSCTL_cputopoinfo, allowing the caller to get what it
needs (if e.g. it's after the data for just one specific node) with
just one hypercall, without caring about the total number of nodes in
the system.

Suggested-by: Boris Ostrovsky boris.ostrov...@oracle.com
Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -285,15 +285,9 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xe
 {
 xen_sysctl_meminfo_t meminfo = { 0 };
 
-if ( ni-num_nodes  num_nodes )
-{
-ret = -ENOBUFS;
-i = num_nodes;
-}
-else
-i = 0;
-
-for ( ; i  num_nodes; i++ )
+if ( num_nodes  ni-num_nodes )
+num_nodes = ni-num_nodes;
+for ( i = 0; i  num_nodes; ++i )
 {
 static uint32_t distance[MAX_NUMNODES];
 
@@ -335,7 +329,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xe
 else
 i = num_nodes;
 
-if ( (!ret || (ret == -ENOBUFS))  (ni-num_nodes != i) )
+if ( !ret  (ni-num_nodes != i) )
 {
 ni-num_nodes = i;
 if ( __copy_field_to_guest(u_sysctl, op,
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -515,10 +515,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_memin
  *non-null)
  *
  * OUT:
- *  - If 'num_nodes' is less than the number Xen needs to write, -ENOBUFS shall
- *be returned and 'num_nodes' updated to reflect the intended number.
- *  - On success, 'num_nodes' shall indicate the number of entries written, 
which
- *may be less than the maximum.
+ *  - If 'num_nodes' is less than the number Xen wants to write but either
+ *handle is not a NULL one, partial data gets returned and 'num_nodes'
+ *gets updated to reflect the intended number.
+ *  - Otherwise, 'num_nodes' shall indicate the number of entries written, 
which
+ *may be less than the input value.
  */
 
 struct xen_sysctl_numainfo {



sysctl: adjust XEN_SYSCTL_numainfo behavior

... to match XEN_SYSCTL_cputopoinfo, allowing the caller to get what it
needs (if e.g. it's after the data for just one specific node) with
just one hypercall, without caring about the total number of nodes in
the system.

Suggested-by: Boris Ostrovsky boris.ostrov...@oracle.com
Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -285,15 +285,9 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xe
 {
 xen_sysctl_meminfo_t meminfo = { 0 };
 
-if ( ni-num_nodes  num_nodes )
-{
-ret = -ENOBUFS;
-i = num_nodes;
-}
-else
-i = 0;
-
-for ( ; i  num_nodes; i++ )
+if ( num_nodes  ni-num_nodes )
+num_nodes = ni-num_nodes;
+for ( i = 0; i  num_nodes; ++i )
 {
 static uint32_t distance[MAX_NUMNODES];
 
@@ -335,7 +329,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xe
 else
 i = num_nodes;
 
-if ( (!ret || (ret == -ENOBUFS))  (ni-num_nodes != i) )
+if ( !ret  (ni-num_nodes != i) )
 {
 ni-num_nodes = i;
 if ( __copy_field_to_guest(u_sysctl, op,
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -515,10 +515,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_memin
  *non-null)
  *
  * OUT:
- *  - If 'num_nodes' is less than the number Xen needs to write, -ENOBUFS shall
- *be returned and 'num_nodes' updated to reflect the intended number.
- *  - On success, 'num_nodes' shall indicate the number of entries written, 
which
- *may be less than the maximum.
+ *  - If 'num_nodes' is less than the number Xen wants to write but either
+ *handle is not a NULL one, partial data gets returned and 'num_nodes'
+ *gets updated to reflect the intended number.
+ *  - Otherwise, 'num_nodes' shall indicate the number of entries written, 
which
+ *may be less than the input value.
  */
 
 struct xen_sysctl_numainfo {
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 15/15] tools/xen-access: altp2m testcases

2015-07-14 Thread Wei Liu
On Mon, Jul 13, 2015 at 05:15:03PM -0700, Ed White wrote:
 From: Tamas K Lengyel tleng...@novetta.com
 
 Working altp2m test-case. Extended the test tool to support singlestepping
 to better highlight the core feature of altp2m view switching.
 
 Signed-off-by: Tamas K Lengyel tleng...@novetta.com
 Signed-off-by: Ed White edmund.h.wh...@intel.com
 
 Reviewed-by: Razvan Cojocaru rcojoc...@bitdefender.com

Acked-by: Wei Liu wei.l...@citrix.com

Some nits below.

I do notice there is inconsistency in coding style, so I won't ask you
to resubmit just for fixing style. But a follow-up patch to fix up all
style problem is welcome.

 ---
  tools/tests/xen-access/xen-access.c | 173 
 ++--
  1 file changed, 148 insertions(+), 25 deletions(-)
 
 diff --git a/tools/tests/xen-access/xen-access.c 
 b/tools/tests/xen-access/xen-access.c
 index 12ab921..6b69c26 100644
 --- a/tools/tests/xen-access/xen-access.c
 +++ b/tools/tests/xen-access/xen-access.c
 @@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r, 
 domid_t domain_id)
  return NULL;
  }
  
 +static inline
 +int control_singlestep(
 +xc_interface *xch,
 +domid_t domain_id,
 +unsigned long vcpu,
 +bool enable)
 +{
 +uint32_t op = enable ?
 +XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON : 
 XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
 +
 +return xc_domain_debug_control(xch, domain_id, op, vcpu);
 +}
 +
  /*
   * Note that this function is not thread safe.
   */
 @@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event, 
 vm_event_response_t *rsp)
  
  void usage(char* progname)
  {
 -fprintf(stderr,
 -Usage: %s [-m] domain_id write|exec|breakpoint\n
 +fprintf(stderr, Usage: %s [-m] domain_id write|exec, progname);
 +#if defined(__i386__) || defined(__x86_64__)
 +fprintf(stderr, |breakpoint|altp2m_write|altp2m_exec);
 +#endif
 +fprintf(stderr,
  \n
  Logs first page writes, execs, or breakpoint traps that occur 
 on the domain.\n
  \n
 --m requires this program to run, or else the domain may 
 pause\n,
 -progname);
 +-m requires this program to run, or else the domain may 
 pause\n);

Indentation looks wrong, but this is not your fault so don't worry
about this.

  }
  
  int main(int argc, char *argv[])
 @@ -341,6 +356,8 @@ int main(int argc, char *argv[])
  int required = 0;
  int breakpoint = 0;
  int shutting_down = 0;
 +int altp2m = 0;
 +uint16_t altp2m_view_id = 0;
  
  char* progname = argv[0];
  argv++;
 @@ -379,10 +396,22 @@ int main(int argc, char *argv[])
  default_access = XENMEM_access_rw;
  after_first_access = XENMEM_access_rwx;
  }
 +#if defined(__i386__) || defined(__x86_64__)
  else if ( !strcmp(argv[0], breakpoint) )
  {
  breakpoint = 1;
  }
 +else if ( !strcmp(argv[0], altp2m_write) )
 +{
 +default_access = XENMEM_access_rx;
 +altp2m = 1;
 +}
 +else if ( !strcmp(argv[0], altp2m_exec) )
 +{
 +default_access = XENMEM_access_rw;
 +altp2m = 1;
 +}
 +#endif
  else
  {
  usage(argv[0]);
 @@ -415,22 +444,73 @@ int main(int argc, char *argv[])
  goto exit;
  }
  
 -/* Set the default access type and convert all pages to it */
 -rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
 -if ( rc  0 )
 +/* With altp2m we just create a new, restricted view of the memory */
 +if ( altp2m )
  {
 -ERROR(Error %d setting default mem access type\n, rc);
 -goto exit;
 -}
 +xen_pfn_t gfn = 0;
 +unsigned long perm_set = 0;
 +
 +rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );

Extraneous spaces in brackets.

 +if ( rc  0 )
 +{
 +ERROR(Error %d enabling altp2m on domain!\n, rc);
 +goto exit;
 +}
 +
 +rc = xc_altp2m_create_view( xch, domain_id, default_access, 
 altp2m_view_id );

Extraneous spaces and line too long.

 +if ( rc  0 )
 +{
 +ERROR(Error %d creating altp2m view!\n, rc);
 +goto exit;
 +}
  
 -rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
 -   (xenaccess-max_gpfn - START_PFN) );
 +DPRINTF(altp2m view created with id %u\n, altp2m_view_id);
 +DPRINTF(Setting altp2m mem_access permissions.. );
  
 -if ( rc  0 )
 +for(; gfn  xenaccess-max_gpfn; ++gfn)
 +{
 +rc = xc_altp2m_set_mem_access( xch, domain_id, altp2m_view_id, 
 gfn,
 +   default_access);

Space.

 +if ( !rc )
 +perm_set++;
 +}
 +
 +DPRINTF(done! Permissions set on %lu pages.\n, perm_set);
 +
 +rc = xc_altp2m_switch_to_view( xch, domain_id, altp2m_view_id );

Ditto.

The rest of code has 

Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert map_domain_page_global() to using mfn_t

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 18:56, ian.jack...@eu.citrix.com wrote:
 Jan Beulich writes (Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert 
 map_domain_page_global() to using mfn_t):
 On 07.07.15 at 12:07, andrew.coop...@citrix.com wrote:
  Just like free(), these functions are not performing a read-only
  operation on the destination pointer, therefore must not be performed on
  an actual const pointer.
 
 I disagree - from the caller's persepctive they don't modify the data
 being pointed to (that data is simply becoming undefined). An entity
 allocating memory, initializing it, and never modifying it again should
 be allowed to store the pointer in a variable pointing to a const
 modified type, and free/unmap it without needing any casts. I.e. in
 fact xfree() and free_xenheap_pages() should have their parameters
 constified.
 
 Surely xfree() ought to have the same prototype as free() ?

Why? If it were to be a full match, why wouldn't we call it free() in
the first place?

Note also that Linux has

void kfree(const void *);
void kzfree(const void *);

(with even the krealloc() flavors matching that model).

 free is declared in C99 (7.20.3.2 in my copy of TC2) as:
 
 void free(void *ptr);
 
 That is, non-const.  AFAIAA this is not generally regarded as a bug.

Perhaps, but certainly depending on who looks at it. I for my part
dislike (as hinted at above) that I have to cast away const-ness in
order to be able to call free() without causing compiler warnings,
and I generally hide this in wrappers to limit such casting.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Vijay Kilari
On Sat, Jul 11, 2015 at 12:48 PM, Julien Grall julien.gr...@citrix.com wrote:
 Hi Ian,

 On 10/07/2015 18:07, Ian Campbell wrote:

 On Fri, 2015-07-10 at 16:16 +0530, Vijay Kilari wrote:

  I would like to have freeze exception for ITS feature on ARM64.
 Design got freeze few weeks back and I have sent v4 version of patch
 series
 today.


 Thanks, I've been through v4 and it is certainly much improved over v3.


 I haven't had time to look closer to the code. I will do Wednesday when I'm
 back.

 Although I skimmed quickly the series and it looks ok.

 There are some smaller issues and one slightly more major one regarding
 the mechanisms used to inject a vlpi, which I hope I've explained fully
 enough in the review (in short if you do what the design draft says it
 should be fixed, the incorrectness and complexity is all down to trying
 to do things a different way which leads to you needing to lookup things
 which you shouldn't need to lookup in places where you don't need them)


 This patches will not impact any generic code of other platforms and have
 minor
 changes generic arm related code. Also these patches are only for
 ARM64 platform.


 There is some stuff which touches the non-ITS related ARM interrupt
 handling, however they are mostly adding checks in is_lpi and in some
 cases alternative code if it is true, which should be pretty safe.


 I will comment on this later in the patch series. But some usage of is_lpi
 are worrying for the IRQ passthrough in general.

 Some instance are route_irq_to_guest. The function is re-used in different
 way for LPIs. Which make the code more difficult to read and potentially
 buggy.

 I suggested an idea in v3
 (http://lists.xenproject.org/archives/html/xen-devel/2015-06/msg03756.html)
 but it seems that it has not been taking into account in v4.

 As I mentioned during review care needs to be taken to ensure that this
 code is not enabled for any guest on a platform which has no ITS and to
 only enable it for dom0 on platforms which do.

 Due to the structure of the patch series (which is not optimised for
 being able to follow what is going on) it wasn't clear to me if this was
 the case, but I think not. There are no checks for dom0 and only checks
 for the presence of LPI/ITS in the physical gic, not as a property of
 the individual domains.

 When I say not enabled I mean no MMIO handlers registered, no
 GITS_TRANSLATER MMIO mapping to the guest, no functional change to the
 existing GIC* registers.


 These patches are pre-requisite for PCI support / Pass-through support
 on ARM64 platforms.


 As I explained in my reply to Jan I think this is underselling it a
 little, since AIUI it should make it possible to boot Xen on ThunderX
 and do useful things (like run guests).


 Well, PCI are able to support both legacy interrupt and MSI. If there is no
 MSI, the PCI will use the former. The performance may be poor but it will
 at least boot Xen on ThunderX and creating guest.

 Furthermore, I think this is important to note that this feature is more a
 tech preview than a production ready one.

 The current implementation is working fine with the current version of
 Linux, it behave as we expect. But the emulation as some TODO (I have in
 mind GICR_PROPBASER) which need to be fixed in order to support any Linux.

 We had a similar bug in GICv3 for the re-distributor emulation which took us
 a month to fix it.

 I'm not saying that I'm against a freeze exception, but we need some promise
 from Cavium to finish/cleanup the implementation afterwards.


I can fix all the required issues required for 4.6 provided if we take
firm decision
that we will have freeze exception for ITS.

I can commit to finish/clean up any pending ITS TODO's


 I have in mind some code placement which may be ok for Xen 4.6 (ITS in
 domain_build vITS initialization directly called in the setup.c...) but it's
 definitely against the principle that we are trying to keep the common code
 as common as possible.

 Regards,

 --
 Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/1] libxl: set stub domain size based on VRAM size

2015-07-14 Thread Eric Shelton
On Jul 14, 2015 4:51 AM, George Dunlap george.dun...@eu.citrix.com
wrote:

 On Sat, Jul 11, 2015 at 10:30 PM, Eric Shelton eshel...@pobox.com wrote:
  Allocate additional memory to the stub domain for qemu-traditional if
  more than 4 MB is assigned to the video adapter to avoid out of memory
  condition for QEMU.
 
  Signed-off-by: Eric Shelton eshel...@pobox.com

 This seems like a good fix for now, thanks.

 But (just speaking to people in general) it does make it even harder
 to predict exactly how much host memory a VM is going to end up
 needing to start, which is something a lot of people complain about.

 I'm beginning to think that one feature we should try to look at is a
 way for the admin to specify, Max host memory used, from which
 shadow/p2m/device memory/stubdomain can all be allocated.

Just curious: what happens when all of that memory has been allocated,
yet there is still plenty of memory overall?  Would you not be able to spin
up a domain despite there actually being plenty of memory to do so?

Eric
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3 1/1] libxl: set stub domain size based on VRAM size

2015-07-14 Thread George Dunlap
On 07/14/2015 11:02 AM, Eric Shelton wrote:
 On Jul 14, 2015 4:51 AM, George Dunlap george.dun...@eu.citrix.com
 wrote:

 On Sat, Jul 11, 2015 at 10:30 PM, Eric Shelton eshel...@pobox.com wrote:
 Allocate additional memory to the stub domain for qemu-traditional if
 more than 4 MB is assigned to the video adapter to avoid out of memory
 condition for QEMU.

 Signed-off-by: Eric Shelton eshel...@pobox.com

 This seems like a good fix for now, thanks.

 But (just speaking to people in general) it does make it even harder
 to predict exactly how much host memory a VM is going to end up
 needing to start, which is something a lot of people complain about.

 I'm beginning to think that one feature we should try to look at is a
 way for the admin to specify, Max host memory used, from which
 shadow/p2m/device memory/stubdomain can all be allocated.
 
 Just curious: what happens when all of that memory has been allocated,
 yet there is still plenty of memory overall?  Would you not be able to spin
 up a domain despite there actually being plenty of memory to do so?

The idea would be that you would set this value to say, 2048MiB, and
then all the little bits and bobs would come out of that 2048MiB.  So if
you have a 48MiB stubdom, and 4MiB of device ROMs, and 4MiB of p2ms
(including alternate p2ms), and 1MiB of other miscellaneous stuff, then
the guest would only see 1991MiB of virtual RAM.  (If I've done the math
correctly.)

Which does mean, if you set this value to 50MiB, but run it with a
stubdom and 16MiB of video RAM and altp2ms and everything, you're going
to have problems. :-)

But in the normal use case, it should never cause a guest not to be created.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V5 0/3] Vm_event memory introspection helpers

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 19:14, rcojoc...@bitdefender.com wrote:
 I've also moved x86 logic in patch 3/3 to x86 source files, this
 seems to have gone unnoticed but would likely have not compiled
 on ARM.

Which leaves open whether this time you actually checked that
ARM continues to build.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 02:26, tiejun.c...@intel.com wrote:
  1. clarify the state of patch series / feature.

 ReviewedAcked   RMRR series v7
 Y   Y   [v7][PATCH 01/16] xen: introduce 
 XENMEM_reserved_device_memory_map
 Y   Y   [v7][PATCH 02/16] xen/vtd: create RMRR mapping
 Y   N   [v7][PATCH 03/16] xen/passthrough: extend hypercall to 
 support rdm
 reservation policy

 I can't seem to find any such Reviewed-by on the list (and the patch
 itself doesn't carry one either).

 Y   Y   [v7][PATCH 04/16] xen: enable XENMEM_memory_map in hvm
 Y   N   [v7][PATCH 05/16] hvmloader: get guest memory map into 
 memory_map[]
 Y   N   [v7][PATCH 06/16] hvmloader/pci: skip reserved ranges

 Same here.

 Y   N   [v7][PATCH 07/16] hvmloader/e820: construct guest e820 
 table

 And again.
 
 Sorry this is my fault to these hv patches.
 

 Y   Y   [v7][PATCH 08/16] tools/libxc: Expose new hypercall
 xc_reserved_device_memory_map
 Y   Y   [v7][PATCH 09/16] tools: extend xc_assign_device() to 
 support rdm
 reservation policy
 Y   Y   [v7][PATCH 10/16] tools: introduce some new parameters 
 to set rdm
 policy
 Y   Y   [v7][PATCH 11/16] tools/libxl: detect and avoid 
 conflicts with RDM
 Y   Y   [v7][PATCH 12/16] tools: introduce a new parameter to 
 set a
 predefined rdm boundary
 Y   Y   [v7][PATCH 13/16] libxl: construct e820 map with RDM 
 information
 for HVM guest
 Y   Y   [v7][PATCH 14/16] xen/vtd: enable USB device assignment
 Y   Y   [v7][PATCH 15/16] xen/vtd: prevent from assign the 
 device with
 shared rmrr

 And yet again for these two. Please avoid giving a false impression
 
 But these two patches really won Kevin's Ack, and also I wrote this line
 
 Acked-by: Kevin Tian kevin.t...@intel.com
 
 both in these two patches.

But talk here is about their review status, not who ack-ed them (and
an ack by other than  a maintainer of the affected code is not very
meaningful anyway).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] sysctl: adjust XEN_SYSCTL_cputopoinfo behavior

2015-07-14 Thread Jan Beulich
 On 13.07.15 at 19:43, boris.ostrov...@oracle.com wrote:
 On 07/13/2015 11:53 AM, Jan Beulich wrote:
 The new function's implementation, other than the original one of
 XEN_SYSCTL_topologyinfo, didn't allow the caller to get what it needs
 (if e.g. it's after the data for just one specific CPU) with just one
 hypercall, without caring about the total number of CPUs in the system.
 
 Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com
 
 Would we want a similar update for XEN_SYSCTL_numainfo?

I thought there was a second such case, but seeing
XEN_SYSCTL_pcitopoinfo not be it I didn't look further assuming
I mis-remembered. So yes, imo that one should be adjusted in the
same way.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/3] util/grub.d/20_linux_xen.in: Add arm64 support

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 06:53 +0300, Andrei Borzenkov wrote:
  +if [ x$machine != xaarch64 ]; then
  +   multiboot_cmd=multiboot
  +   module_cmd=module
  +else
  +   multiboot_cmd=xen_hypervisor
  +   module_cmd=xen_module
  +fi
  +
 
 Strictly speaking, this is boot-time decision. As mentioned by
 Vladimir, better would be to provide alias xen_hypervisor and
 xen_module in multiboot for platforms supporting Xen (is MIPS really
 supported?) and use it consistently.

I had been thinking of this the other way around, e.g. on platforms
which support Xen but not multiboot1 multiboot would be added as an
alias for xen_hypervisor.

However so long as grub-mkconfig (via 20_linux_xen) work for everyone
and that peoples existing hand-crafted x86/multiboot/Xen grub.cfg's
continue to work then I think having the alias go either way would be
fine.

BTW I had been going to suggest a function at the grub.cfg level which
dispatched to the correct command, but I suppose an actual alias is
better.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for ARM/ITS patches

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 14:54 +0530, Vijay Kilari wrote:
 I am trying to boot latest staging Xen branch on ThunderX with ITS patches.
 
 I face below issues with above [1] patch series
 
 1)  If pcie support only MSI, then INT mapping is not specified in DT. However
   the below code returns error if INT mapping is not found and does not map.
 In ThunderX INT mapping is not specified for pcie nodes.
 
 static int map_device_children(struct domain *d,
 const struct dt_device_node *dev)
 {
  int ret;
 
  if ( dt_device_type_is_equal(dev, pci) )
  {
  DPRINT(Mapping children of %s to guest\n, dt_node_full_name(dev));
 
  ret = dt_for_each_irq_map(dev, map_dt_irq_to_domain, d);
  if ( ret  0 )
  return ret;  // Returns error here

Hrm, I suppose dt_for_each_irq_map ought to return success if the device
in question has no interrupt-map at all.

At first glance it seems like:
if ( imap == NULL )
{
dt_dprintk( - no map, ignoring\n);
goto fail;
}
Should become:
if ( imap == NULL )
{
dt_dprintk( - no map, ignoring\n);
return 0;
}

Can you test that and if it is correct submit it as a patch please.

 ...
 }
 
 2) Dom0 fails to boot with GICv3. It hangs just after GICv3 initialization.
 
 To know which is last distributor/re-distributor registers read, I
 have added debug prints
 in GICD  GICR mmio handlers in Xen. But with your patches Linux driver never
 traps to Xen to read/write GICD/GICR registers. If I revert back this
 patch series,
 I see the traps.

Where this patch series is this:

$ git log --oneline 
d7f132c762d1359f03b2b5b89406daf39d8aefc0..467e5cbb2ffc5e0994c4cb584b7ace6a01a727af
 
467e5cb xen: arm: consolidate mmio and irq mapping to dom0
f65399f xen: arm: Import of_bus PCI entry from Linux (as a dt_bus entry)
864f82a xen: arm: map child MMIO and IRQs to dom0 for PCI bus DT nodes.
eed5e39 xen: arm: drop redundant extra call to vgic_reserve_virq
f9d08f4 xen: dt: add dt_for_each_range helper
5cefb30 xen: dt: add dt_for_each_irq_map helper
$

?

That's rather strange, nothing here should be interacting with the vgic
trapping for GICD/GICR.

About the only thing I can imagine is that something has incorrectly
created a p2m mapping over the GICD/GICR addresses. If that is the case
then it ought to be pretty apparent from the logs with DT_DEBUG enabled
in domain_build.c.

If it isn't apparent from the debug log then please could you bisect
down to a specific patch.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 1/9] libxl: fix libxl__abs_path

2015-07-14 Thread Ian Jackson
Ian Campbell writes (Re: [PATCH 1/9] libxl: fix libxl__abs_path):
 On Mon, 2015-07-13 at 18:12 +0100, Ian Jackson wrote:
  There are no call sites in libxl which can pass a NULL.
  
  I think that if we are to retain this feature, it ought to be
  documented, at least.
 
 Or add an assert if we don't wish to support this?

If we don't want to support it we can just remove the check and let it
crash.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Request a freeze exception for Libxl Migration v2 in 4.6

2015-07-14 Thread Ian Jackson
Andrew Cooper writes (Request a freeze exception for Libxl Migration v2 in 
4.6):
 I would like to request a freeze exception for libxl migration v2.
 
 v3 of the series was posted this morning, and review seems to indicate
 that it is mostly on track.  I hope to have v4 ready to post tomorrow,
 and hope to have no further adjustments required.

Wei asked me for input and I thought it best to reply by email.

Andrew Cooper's summary of the current state is accurate.

In my opinion, the version posted last week was nearly ready.  The
remaining issues with v3 are minor and I expect them to be resolved
quickly.  The overall quality is high.

 The biggest risk, feature wise, is the subject of Remus streams which I
 am unable to test.  An earlier version of the series had some testing
 and identified an issue, which has subsequently been fixed.

It would IMO be reasonable to treat any further problems discovered as
bugs to fix during the freeze.

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-14 Thread George Dunlap
On 07/13/2015 07:47 AM, Chen, Tiejun wrote:
 Thanks for this; a few more comments...

 
 Thanks for your time.
 
 @@ -1577,9 +1578,15 @@ int iommu_do_pci_domctl(
   seg = machine_sbdf  16;
   bus = PCI_BUS(machine_sbdf);
   devfn = PCI_DEVFN2(machine_sbdf);
 +flag = domctl-u.assign_device.flag;
 +if ( flag  XEN_DOMCTL_DEV_RDM_RELAXED )

 This is not a blocker, but a stylistic comment: I would have inverted
 the bitmask here, as that's conceptually what you're checking.  I
 won't make this a blocker for going in.
 
 What about this?
 
 diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
 index 6e23fc6..17a4206 100644
 --- a/xen/drivers/passthrough/pci.c
 +++ b/xen/drivers/passthrough/pci.c
 @@ -1579,7 +1579,7 @@ int iommu_do_pci_domctl(
  bus = PCI_BUS(machine_sbdf);
  devfn = PCI_DEVFN2(machine_sbdf);
  flag = domctl-u.assign_device.flag;
 -if ( flag  XEN_DOMCTL_DEV_RDM_RELAXED )
 +if ( flag  ~XEN_DOMCTL_DEV_RDM_MASK )
  {
  ret = -EINVAL;
  break;
 diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
 index bca25c9..07549a4 100644
 --- a/xen/include/public/domctl.h
 +++ b/xen/include/public/domctl.h
 @@ -480,6 +480,7 @@ struct xen_domctl_assign_device {
  } u;
  /* IN */
  #define XEN_DOMCTL_DEV_RDM_RELAXED  1
 +#define XEN_DOMCTL_DEV_RDM_MASK 0x1

The way this sort of thing is defined in the rest of domctl.h is like this:

#define _XEN_DOMCTL_CDF_hvm_guest 0
#define XEN_DOMCTL_CDF_hvm_guest  (1U_XEN_DOMCTL_CDF_hvm_guest)

So the above should be

#define _XEN_DOMCTL_DEV_RDM_RELAXED 0
#define XEN_DOMCTL_DEV_RDM_RELAXED (1U_XEN_DOMCTL_DEV_RDM_RELAXED)

And then your check in iommu_do_pci_domctl() would look like

if (flag  ~XEN_DOMCTL_DEV_RDM_RELAXED)

And if we end up adding any extra flags, we just | them into the above
conditional, as is done in, for example, the XEN_DOMCTL_createdomain
case in xen/common/domctl.c:do_domctl().

Thanks,
 -George


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 03/29] tools/libxl: Introduce ROUNDUP()

2015-07-14 Thread Andrew Cooper
This is the same as is used by libxc.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxl/libxl_internal.h |3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5235d25..19fc425 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -110,6 +110,9 @@
 
 #define ARRAY_SIZE(a) (sizeof(a) / sizeof(a[0]))
 
+#define ROUNDUP(_val, _order)   \
+(((unsigned long)(_val)+(1UL(_order))-1)  ~((1UL(_order))-1))
+
 #define min(X, Y) ({ \
 const typeof (X) _x = (X);   \
 const typeof (Y) _y = (Y);   \
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 09/29] tools/libxl: Fix libxl__carefd_opened() to be more useful with an invalid fd

2015-07-14 Thread Andrew Cooper
In the case that fd is -1, preserve errno and don't attempt to set
CLOEXEC.

Note that the implementation can still fail, as it ignores fcntl()
errors and may not set CLOEXEC properly.  Update the documentation
accordingly until it is fixed.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
Fixing the fnctl() error issue involves more TUITs than I currently
have.

New in v4
---
 tools/libxl/libxl_fork.c |5 -
 tools/libxl/libxl_internal.h |2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_fork.c b/tools/libxl/libxl_fork.c
index 4486687..024c1e2 100644
--- a/tools/libxl/libxl_fork.c
+++ b/tools/libxl/libxl_fork.c
@@ -112,9 +112,12 @@ libxl__carefd *libxl__carefd_record(libxl_ctx *ctx, int fd)
 libxl__carefd *libxl__carefd_opened(libxl_ctx *ctx, int fd)
 {
 libxl__carefd *cf = 0;
+int saved_errno = errno;
 
-cf = libxl__carefd_record(ctx, fd);
+if (fd = 0)
+cf = libxl__carefd_record(ctx, fd);
 libxl__carefd_unlock();
+errno = saved_errno;
 return cf;
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3f1fed8..5d3499e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2189,7 +2189,7 @@ _hidden void libxl__ao_progress_report(libxl__egc *egc, 
libxl__ao *ao,
 _hidden libxl__carefd *libxl__carefd_record(libxl_ctx *ctx, int fd);
 
 /* Combines _record and _unlock in a single call.  If fd==-1,
- * still does the unlock, but returns 0.  Cannot fail. */
+ * still does the unlock, but returns 0. */
 _hidden libxl__carefd *libxl__carefd_opened(libxl_ctx *ctx, int fd);
 
 /* Works just like close(2).  You may pass NULL, in which case it's
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 07/29] tools/libxl: Extra management APIs for the save helper

2015-07-14 Thread Andrew Cooper
With migration v2, there are several moving parts needing to be
juggled at once.  This requires the error handling logic to be able to
query the state of each moving part, possibly before they have been
started, and be able to cancel them.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v4: Don't force _init() to be mandatory
v3: Adjust helper_{stop,failed,done} to use libxl__save_helper_inuse()
v2: Add an _init() function which allows _inuse() to be safe to call even
before the save helper has started.
---
 tools/libxl/libxl_internal.h |9 +
 tools/libxl/libxl_save_callout.c |   23 +--
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index e5599a3..8ce3d49 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3272,6 +3272,15 @@ _hidden void libxl__xc_domain_restore(libxl__egc *egc,
 _hidden void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
int rc, int retval, int errnoval);
 
+_hidden void libxl__save_helper_init(libxl__save_helper_state *shs);
+_hidden void libxl__save_helper_abort(libxl__egc *egc,
+  libxl__save_helper_state *shs);
+
+static inline bool libxl__save_helper_inuse(const libxl__save_helper_state 
*shs)
+{
+return libxl__ev_child_inuse(shs-child);
+}
+
 /* Each time the dm needs to be saved, we must call suspend and then save */
 _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
libxl__domain_suspend_state *dss);
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 1136b79..c56b68c 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -146,6 +146,13 @@ void 
libxl__xc_domain_saverestore_async_callback_done(libxl__egc *egc,
 shs-egc = 0;
 }
 
+void libxl__save_helper_init(libxl__save_helper_state *shs)
+{
+libxl__ao_abortable_init(shs-abrt);
+libxl__ev_fd_init(shs-readable);
+libxl__ev_child_init(shs-child);
+}
+
 /*- helper execution -*/
 
 static void run_helper(libxl__egc *egc, libxl__save_helper_state *shs,
@@ -167,9 +174,7 @@ static void run_helper(libxl__egc *egc, 
libxl__save_helper_state *shs,
 shs-rc = 0;
 shs-completed = 0;
 shs-pipes[0] = shs-pipes[1] = 0;
-libxl__ao_abortable_init(shs-abrt);
-libxl__ev_fd_init(shs-readable);
-libxl__ev_child_init(shs-child);
+libxl__save_helper_init(shs);
 
 shs-abrt.ao = shs-ao;
 shs-abrt.callback = helper_stop;
@@ -255,7 +260,7 @@ static void helper_failed(libxl__egc *egc, 
libxl__save_helper_state *shs,
 
 libxl__ev_fd_deregister(gc, shs-readable);
 
-if (!libxl__ev_child_inuse(shs-child)) {
+if (!libxl__save_helper_inuse(shs)) {
 helper_done(egc, shs);
 return;
 }
@@ -268,7 +273,7 @@ static void helper_stop(libxl__egc *egc, 
libxl__ao_abortable *abrt, int rc)
 libxl__save_helper_state *shs = CONTAINER_OF(abrt, *shs, abrt);
 STATE_AO_GC(shs-ao);
 
-if (!libxl__ev_child_inuse(shs-child)) {
+if (!libxl__save_helper_inuse(shs)) {
 helper_failed(egc, shs, rc);
 return;
 }
@@ -279,6 +284,12 @@ static void helper_stop(libxl__egc *egc, 
libxl__ao_abortable *abrt, int rc)
 libxl__kill(gc, shs-child.pid, SIGTERM, save/restore helper);
 }
 
+void libxl__save_helper_abort(libxl__egc *egc,
+  libxl__save_helper_state *shs)
+{
+helper_stop(egc, shs-abrt, ERROR_FAIL);
+}
+
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
int fd, short events, short revents)
 {
@@ -356,7 +367,7 @@ static void helper_done(libxl__egc *egc, 
libxl__save_helper_state *shs)
 libxl__ev_fd_deregister(gc, shs-readable);
 libxl__carefd_close(shs-pipes[0]);  shs-pipes[0] = 0;
 libxl__carefd_close(shs-pipes[1]);  shs-pipes[1] = 0;
-assert(!libxl__ev_child_inuse(shs-child));
+assert(!libxl__save_helper_inuse(shs));
 if (shs-toolstack_data_file) fclose(shs-toolstack_data_file);
 
 shs-egc = egc;
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 00/27] Libxl migration v2

2015-07-14 Thread Andrew Cooper
This series adds support for the libxl migration v2 stream, and untangles the
existing layering violations of the toolstack and qemu records.

It can be found on the branch libxl-migv2-v4
  git://xenbits.xen.org/people/andrewcoop/xen.git
  http://xenbits.xen.org/git-http/people/andrewcoop/xen.git

No major changes over v3, but a lot of minor changes following review.

At the end of the series, legacy migration is no longer used.

The Remus code is untested by me.  All other combinations of
suspend/migrate/resume have been tested with PV and HVM guests (qemu-trad and
qemu-upstream), including automatic legacy - v2 inline conversion.

Anyway, thoughts/comments welcome.  Please test!

~Andrew

Summary of Acks/Modified/New from v2

A   bsd-sys-queue-h-seddery: Massage `offsetof'
A   tools/libxc: Always compile the compat qemu variables into xc_sr_context
A   tools/libxl: Introduce ROUNDUP()
A   tools/libxl: Introduce libxl__kill()
A   tools/libxl: Stash all restore parameters in domain_create_state
A   tools/libxl: Split libxl__domain_create_state.restore_fd in two
AM  tools/libxl: Extra management APIs for the save helper
A   tools/libxl: Add save_helper_state pointers to 
libxl__xc_domain_{save,restore}()
  N tools/libxl: Fix libxl__carefd_opened() to be more useful with an invalid fd
A   tools/xl: Mandatory flag indicating the format of the migration stream
A   docs: Libxl migration v2 stream specification
A   tools/python: Libxc migration v2 infrastructure
A   tools/python: Libxl migration v2 infrastructure
A   tools/python: Other migration infrastructure
A   tools/python: Verification utility for v2 stream spec compliance
A   tools/python: Conversion utility for legacy migration streams
A   tools/libxl: Migration v2 stream format
 M  tools/libxl: Infrastructure for reading a libxl migration v2 stream
 M  tools/libxl: Infrastructure to convert a legacy stream
 M  tools/libxl: Convert a legacy stream if needed
AM  tools/libxc+libxl+xl: Restore v2 streams
 M  tools/libxl: Infrastructure for writing a v2 stream
AM  tools/libxc+libxl+xl: Save v2 streams
A   docs/libxl: Introduce CHECKPOINT_END to support migration v2 remus streams
AM  tools/libxl: Write checkpoint records into the stream
A   tools/libx{c,l}: Introduce restore_callbacks.checkpoint()
AM  tools/libxl: Handle checkpoint records in a libxl migration v2 stream
A   tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc
A   tools/libxl: Drop all knowledge of toolstack callbacks

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 06/29] tools/libxl: Split libxl__domain_create_state.restore_fd in two

2015-07-14 Thread Andrew Cooper
In a future patch, we shall support automatically converting a legacy
stream to a v2 stream, in which case libxc needs to read from a
different fd.

Simply overwriting restore_fd does not work; the two fd's have
different circumstances.  The restore_fd needs to be returned to its
original state before libxl_domain_create_restore() returns, while in
the converted case, the fd needs allocating and deallocating
appropriately.

No functional change.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
New in v2
---
 tools/libxl/libxl_create.c   |2 +-
 tools/libxl/libxl_internal.h |2 +-
 tools/libxl/libxl_save_callout.c |2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 61515da..be13204 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1525,7 +1525,7 @@ static int do_domain_create(libxl_ctx *ctx, 
libxl_domain_config *d_config,
 cdcs-dcs.guest_config = d_config;
 libxl_domain_config_init(cdcs-dcs.guest_config_saved);
 libxl_domain_config_copy(ctx, cdcs-dcs.guest_config_saved, d_config);
-cdcs-dcs.restore_fd = restore_fd;
+cdcs-dcs.restore_fd = cdcs-dcs.libxc_fd = restore_fd;
 if (restore_fd  -1)
 cdcs-dcs.restore_params = *params;
 cdcs-dcs.callback = domain_create_cb;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 6428757..e5599a3 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3216,7 +3216,7 @@ struct libxl__domain_create_state {
 libxl__ao *ao;
 libxl_domain_config *guest_config;
 libxl_domain_config guest_config_saved; /* vanilla config */
-int restore_fd;
+int restore_fd, libxc_fd;
 libxl_domain_restore_params restore_params;
 libxl__domain_create_cb *callback;
 libxl_asyncprogress_how aop_console_how;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 80aae1b..1136b79 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -48,7 +48,7 @@ void libxl__xc_domain_restore(libxl__egc *egc, 
libxl__domain_create_state *dcs,
 
 /* Convenience aliases */
 const uint32_t domid = dcs-guest_domid;
-const int restore_fd = dcs-restore_fd;
+const int restore_fd = dcs-libxc_fd;
 libxl__domain_build_state *const state = dcs-build_state;
 
 unsigned cbflags = libxl__srm_callout_enumcallbacks_restore
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 04/29] tools/libxl: Introduce libxl__kill()

2015-07-14 Thread Andrew Cooper
as a wrapper to kill(2), and use it in preference to sendsig in
libxl_save_callout.c.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Wei Liu wei.l...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com

---
v3: Fix typo, add _hidden

Logically new in v2 - split out from a v1 change which was itself a
cherrypick-and-modify from the AO Abort series
---
 tools/libxl/libxl_aoutils.c  |   15 +++
 tools/libxl/libxl_internal.h |2 ++
 tools/libxl/libxl_save_callout.c |   10 ++
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 0931eee..274ef39 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -621,3 +621,18 @@ bool libxl__async_exec_inuse(const libxl__async_exec_state 
*aes)
 assert(time_inuse == child_inuse);
 return child_inuse;
 }
+
+void libxl__kill(libxl__gc *gc, pid_t pid, int sig, const char *what)
+{
+int r = kill(pid, sig);
+if (r) LOGE(WARN, failed to kill() %s [%lu] (signal %d),
+what, (unsigned long)pid, sig);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 19fc425..7ccbf55 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2244,6 +2244,8 @@ struct libxl__async_exec_state {
 int libxl__async_exec_start(libxl__async_exec_state *aes);
 bool libxl__async_exec_inuse(const libxl__async_exec_state *aes);
 
+_hidden void libxl__kill(libxl__gc *gc, pid_t pid, int sig, const char *what);
+
 /*- device addition/removal -*/
 
 typedef struct libxl__ao_device libxl__ao_device;
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 087c2d5..b82a5c1 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -244,12 +244,6 @@ static void run_helper(libxl__egc *egc, 
libxl__save_helper_state *shs,
 libxl__carefd_close(childs_pipes[1]);
 helper_failed(egc, shs, rc);;
 }
-static void sendsig(libxl__gc *gc, libxl__save_helper_state *shs, int sig)
-{
-int r = kill(shs-child.pid, sig);
-if (r) LOGE(WARN, failed to kill save/restore helper [%lu] (signal %d),
-(unsigned long)shs-child.pid, sig);
-}
 
 static void helper_failed(libxl__egc *egc, libxl__save_helper_state *shs,
   int rc)
@@ -266,7 +260,7 @@ static void helper_failed(libxl__egc *egc, 
libxl__save_helper_state *shs,
 return;
 }
 
-sendsig(gc, shs, SIGKILL);
+libxl__kill(gc, shs-child.pid, SIGKILL, save/restore helper);
 }
 
 static void helper_stop(libxl__egc *egc, libxl__ao_abortable *abrt, int rc)
@@ -282,7 +276,7 @@ static void helper_stop(libxl__egc *egc, 
libxl__ao_abortable *abrt, int rc)
 if (!shs-rc)
 shs-rc = rc;
 
-sendsig(gc, shs, SIGTERM);
+libxl__kill(gc, shs-child.pid, SIGTERM, save/restore helper);
 }
 
 static void helper_stdout_readable(libxl__egc *egc, libxl__ev_fd *ev,
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3] libxl: Increase device model startup timeout to 1min.

2015-07-14 Thread Dario Faggioli
On Tue, 2015-07-14 at 08:55 +0100, Ian Campbell wrote:
 On Tue, 2015-07-14 at 07:17 +0100, Jan Beulich wrote:
   On 07.07.15 at 17:41, ian.campb...@citrix.com wrote:
   On Tue, 2015-07-07 at 16:14 +0100, Ian Jackson wrote:
   Anthony PERARD writes ([PATCH V3] libxl: Increase device model startup 
   timeout to 1min.):
On a busy host, QEMU may take more than 10s to load and start.

This is likely due to a bug in Linux where the I/O subsystem sometime
produce high latency under load and result in QEMU taking a long time 
to
load every single dynamic libraries.
   
   Acked-by: Ian Jackson ian.jack...@eu.citrix.com
   
   Applied.
  
  So is this the answer to Problems with merlot* AMD Opteron 6376
  systems?
 
 It'll be hard to say until this change gets through the Xen push gate
 and that version gets used for other branches (linux testing, libvirt,
 ovmf, osstest's own gate etc).
 
Indeed. My opinion is that no, it is not.

My understanding of the data Anthony provided is that, under some
(difficult to track/analyze/reproduce/etc) load conditions, the Linux IO
and VM subsystem suffer from high latency, delaying QEMU startup.

In the merlot* cases, the system is completely idle, apart from the
failing creation/migration operation.

So, no, I don't think that would not be the fix we need for that
situation.

 At the moment it looks like it has helped with some but not all of the
 issues.
 
 These:
 
 http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
 http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html
 
Can I ask why (I mean, e.g., comparing what with what) you're saying it
seems to have helped?

Regards,
Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V3] libxl: Increase device model startup timeout to 1min.

2015-07-14 Thread Ian Campbell
On Tue, 2015-07-14 at 11:25 +0200, Dario Faggioli wrote:
 On Tue, 2015-07-14 at 08:55 +0100, Ian Campbell wrote:
  On Tue, 2015-07-14 at 07:17 +0100, Jan Beulich wrote:
On 07.07.15 at 17:41, ian.campb...@citrix.com wrote:
On Tue, 2015-07-07 at 16:14 +0100, Ian Jackson wrote:
Anthony PERARD writes ([PATCH V3] libxl: Increase device model 
startup 
timeout to 1min.):
 On a busy host, QEMU may take more than 10s to load and start.
 
 This is likely due to a bug in Linux where the I/O subsystem sometime
 produce high latency under load and result in QEMU taking a long 
 time to
 load every single dynamic libraries.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com

Applied.
   
   So is this the answer to Problems with merlot* AMD Opteron 6376
   systems?
  
  It'll be hard to say until this change gets through the Xen push gate
  and that version gets used for other branches (linux testing, libvirt,
  ovmf, osstest's own gate etc).
  
 Indeed. My opinion is that no, it is not.
 
 My understanding of the data Anthony provided is that, under some
 (difficult to track/analyze/reproduce/etc) load conditions, the Linux IO
 and VM subsystem suffer from high latency, delaying QEMU startup.
 
 In the merlot* cases, the system is completely idle, apart from the
 failing creation/migration operation.
 
 So, no, I don't think that would not be the fix we need for that
 situation.

Even if it is not the correct fix it seems like in some situations the
increase in timeout has improved things, hence it is an answer as Jan
asked (his quote marks).

  At the moment it looks like it has helped with some but not all of the
  issues.
  
  These:
  
  http://logs.test-lab.xenproject.org/osstest/results/host/merlot0.html
  http://logs.test-lab.xenproject.org/osstest/results/host/merlot1.html
  
 Can I ask why (I mean, e.g., comparing what with what) you're saying it
 seems to have helped?

There seemed (unscientifically) to be fewer of the libvirt related
guest-start failures.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 11:25, ian.campb...@citrix.com wrote:
 On Tue, 2015-07-14 at 10:18 +0100, Jan Beulich wrote:
 
  YY   [v7][PATCH 14/16] xen/vtd: enable USB device 
  assignment
 
 diffstat:
  xen/drivers/passthrough/vtd/dmar.h  |  1 -
  xen/drivers/passthrough/vtd/iommu.c | 11 ++-
  xen/drivers/passthrough/vtd/utils.c |  7 ---
  3 files changed, 2 insertions(+), 17 deletions(-)
 
  YY   [v7][PATCH 15/16] xen/vtd: prevent from assign 
  the device with
  shared rmrr
 
 xen/drivers/passthrough/vtd/iommu.c | 32 +---
  1 file changed, 29 insertions(+), 3 deletions(-)
 
 
  And yet again for these two. Please avoid giving a false impression
  
  But these two patches really won Kevin's Ack, and also I wrote this line
  
  Acked-by: Kevin Tian kevin.t...@intel.com
  
  both in these two patches.
 
 But talk here is about their review status, not who ack-ed them (and
 an ack by other than  a maintainer of the affected code is not very
 meaningful anyway).
 
 Kevin is maintainer of that code though, isn't he?

Oh, right, I got mislead by the hv in the middle of Tiejun's
earlier reply, assuming (base on remembering the ordering of the
series) what follows it are the hvmloader patches (without looking
at the actual titles again). Indeed, the part of my reply above in
parentheses was misguided. The patches not having been
reviewed by anyone still is a fact (as an ack is not a review iirc).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 10/10] libxl: fix caller of libxl_cpupool functions

2015-07-14 Thread Dario Faggioli
On Mon, 2015-07-13 at 17:22 +0100, Wei Liu wrote:
 Coverity complains cpupool_info leaks a string in failure path. Instead
 of fixing that path, we rely on the callers (two public APIs at the
 moment) of cpupool_info correctly call libxl_cpupoolinfo_dispose in
 their failure path to dispose of any resources.
 
 That involves:
 1. Call _init and _dispose in libxl_list_cpupool
 2. Call _init in numa_place_domain
 
 Fix numa_place_domain to use goto style to make things more clearer.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Reviewed-by: Dario Faggioli dario.faggi...@citrix.com

Dario
-- 
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)


signature.asc
Description: This is a digitally signed message part
___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Requesting for freeze exception for RMRR

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 11:27, tiejun.c...@intel.com wrote:
 Y Y   [v7][PATCH 14/16] xen/vtd: enable USB device assignment
 Y Y   [v7][PATCH 15/16] xen/vtd: prevent from assign the 
 device with
 shared rmrr

 And yet again for these two. Please avoid giving a false impression

 But these two patches really won Kevin's Ack, and also I wrote this line

 Acked-by: Kevin Tian kevin.t...@intel.com

 both in these two patches.

 But talk here is about their review status, not who ack-ed them (and
 an ack by other than  a maintainer of the affected code is not very
 meaningful anyway).
 
 Isn't Kevin the key maintainer specific to IOMMU subsystem?

Yes, he is (for VT-d to be precise). See my reply just sent to Ian's
similar response.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 26/28] tools/libxl: Handle checkpoint records in a libxl migration v2 stream

2015-07-14 Thread Yang Hongyang

On 07/13/2015 08:01 PM, Andrew Cooper wrote:
[...]

  domcreate_rebuild_done(egc, dcs, rc);
@@ -966,6 +989,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  }

  /* Restore */
+callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;


This should be moved after read_stream_init()

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e42de47..86f7f13 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -989,8 +989,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 }

 /* Restore */
-callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;
-
 rc = libxl__build_pre(gc, domid, d_config, state);
 if (rc)
 goto out;
@@ -1003,6 +1001,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 dcs-srs.legacy = (dcs-restore_params.stream_version == 1);
 dcs-srs.completion_callback = domcreate_stream_done;
 dcs-srs.checkpoint_callback = remus_checkpoint_stream_done;
+callbacks-checkpoint = libxl__remus_domain_checkpoint_callback;

 libxl__stream_read_start(egc, dcs-srs);
 return;



  rc = libxl__build_pre(gc, domid, d_config, state);
  if (rc)
@@ -978,6 +1002,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
  dcs-srs.fd = restore_fd;
  dcs-srs.legacy = (dcs-restore_params.stream_version == 1);
  dcs-srs.completion_callback = domcreate_stream_done;
+dcs-srs.checkpoint_callback = remus_checkpoint_stream_done;

  libxl__stream_read_start(egc, dcs-srs);
  return;

...

--
Thanks,
Yang.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 12:22, tiejun.c...@intel.com wrote:
 On 2015/7/14 17:32, Jan Beulich wrote:
 On 14.07.15 at 07:22, tiejun.c...@intel.com wrote:
 +for ( i = 0; i  memory_map.nr_map; i++ )
 +{
 +uint64_t end = e820[i].addr + e820[i].size;

 Either loop index/boundary or used array are wrong here: In the
 earlier loop you copied memory_map[0...nr_map-1] to
 e820[n...n+nr_map-1], but here you're looping looking at
 e820[0...nr_map-1]

 You're right. I should lookup all e820[] like this,

 for ( i = 0; i  nr; i++ )

 Hmm, I would have thought you only care about either part of
 the just glued together map.

 +if ( e820[i].type == E820_RAM 
 + low_mem_end  e820[i].addr  low_mem_end  end )

 Assuming you mean to look at the RDM e820[] entries here, this
 is not a correct check: You don't care about partly or fully
 contained, all you care about is whether low_mem_end extends
 beyond the start of the region.

 Here I'm looking at the e820 entry indicating low memory. Because

 low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT;

 and when we allocate MMIO in pci.c, its possible to populate RAM so
 hvm_info-low_mem_pgend would be changed over there. So we need to
 compensate this loss with high memory. Here memory_map[] also records
 the original low/high memory, so if low_mem_end is less-than the
 original we need this compensation.

 And I'm not disputing your intentions - I'm merely pointing out that
 afaics the code above doesn't match these intentions. In particular
 (as said) I don't see why you need to check low_mem_end  end.

 
 Before we probably relocate RAM,
 
 low_mem_end = hvm_info-low_mem_pgend  PAGE_SHIFT
 
 and the e820 entry specific to low memory,
 
 [e820[X].addr, end]
 
 Here, end = e820[X].addr + e820[X].size;
 
 Now low_mem_end = end.
 
 After that, low_mem_end  end. so if
 
 (low_mem_end  e820[X].addr  low_mem_end  end) is true, this means 
 that associated RAM entry is hitting, right? Then we need to revise this 
 entry as [e820[X].addr, low_mem_end], and compensate [end - low_mem_end] 
 to high memory. Anything I'm still wrong here?

Ah, I think I see now what I misunderstood.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 12/29] tools/python: Libxc migration v2 infrastructure

2015-07-14 Thread Andrew Cooper
Contains:
 * Python implementation of the libxc migration v2 records
 * Verification code for spec compliance
 * Unit tests

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/python/setup.py|1 +
 tools/python/xen/migration/libxc.py  |  446 ++
 tools/python/xen/migration/tests.py  |   41 
 tools/python/xen/migration/verify.py |   37 +++
 4 files changed, 525 insertions(+)
 create mode 100644 tools/python/xen/migration/__init__.py
 create mode 100644 tools/python/xen/migration/libxc.py
 create mode 100644 tools/python/xen/migration/tests.py
 create mode 100644 tools/python/xen/migration/verify.py

diff --git a/tools/python/setup.py b/tools/python/setup.py
index 439c429..5bf81be 100644
--- a/tools/python/setup.py
+++ b/tools/python/setup.py
@@ -43,6 +43,7 @@ setup(name= 'xen',
   version = '3.0',
   description = 'Xen',
   packages= ['xen',
+ 'xen.migration',
  'xen.lowlevel',
 ],
   ext_package = xen.lowlevel,
diff --git a/tools/python/xen/migration/__init__.py 
b/tools/python/xen/migration/__init__.py
new file mode 100644
index 000..e69de29
diff --git a/tools/python/xen/migration/libxc.py 
b/tools/python/xen/migration/libxc.py
new file mode 100644
index 000..b0255ac
--- /dev/null
+++ b/tools/python/xen/migration/libxc.py
@@ -0,0 +1,446 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+
+Libxc Migration v2 streams
+
+Record structures as per docs/specs/libxc-migration-stream.pandoc, and
+verification routines.
+
+
+import sys
+
+from struct import calcsize, unpack
+
+from xen.migration.verify import StreamError, RecordError, VerifyBase
+
+# Image Header
+IHDR_FORMAT = !QIIHHI
+
+IHDR_MARKER  = 0x
+IHDR_IDENT   = 0x58454E46 # XENF in ASCII
+IHDR_VERSION = 2
+
+IHDR_OPT_BIT_ENDIAN = 0
+IHDR_OPT_LE = (0  IHDR_OPT_BIT_ENDIAN)
+IHDR_OPT_BE = (1  IHDR_OPT_BIT_ENDIAN)
+
+IHDR_OPT_RESZ_MASK = 0xfffe
+
+# Domain Header
+DHDR_FORMAT = IHHII
+
+DHDR_TYPE_x86_pv  = 0x0001
+DHDR_TYPE_x86_hvm = 0x0002
+DHDR_TYPE_x86_pvh = 0x0003
+DHDR_TYPE_arm = 0x0004
+
+dhdr_type_to_str = {
+DHDR_TYPE_x86_pv  : x86 PV,
+DHDR_TYPE_x86_hvm : x86 HVM,
+DHDR_TYPE_x86_pvh : x86 PVH,
+DHDR_TYPE_arm : ARM,
+}
+
+# Records
+RH_FORMAT = II
+
+REC_TYPE_end  = 0x
+REC_TYPE_page_data= 0x0001
+REC_TYPE_x86_pv_info  = 0x0002
+REC_TYPE_x86_pv_p2m_frames= 0x0003
+REC_TYPE_x86_pv_vcpu_basic= 0x0004
+REC_TYPE_x86_pv_vcpu_extended = 0x0005
+REC_TYPE_x86_pv_vcpu_xsave= 0x0006
+REC_TYPE_shared_info  = 0x0007
+REC_TYPE_tsc_info = 0x0008
+REC_TYPE_hvm_context  = 0x0009
+REC_TYPE_hvm_params   = 0x000a
+REC_TYPE_toolstack= 0x000b
+REC_TYPE_x86_pv_vcpu_msrs = 0x000c
+REC_TYPE_verify   = 0x000d
+REC_TYPE_checkpoint   = 0x000e
+
+rec_type_to_str = {
+REC_TYPE_end  : End,
+REC_TYPE_page_data: Page data,
+REC_TYPE_x86_pv_info  : x86 PV info,
+REC_TYPE_x86_pv_p2m_frames: x86 PV P2M frames,
+REC_TYPE_x86_pv_vcpu_basic: x86 PV vcpu basic,
+REC_TYPE_x86_pv_vcpu_extended : x86 PV vcpu extended,
+REC_TYPE_x86_pv_vcpu_xsave: x86 PV vcpu xsave,
+REC_TYPE_shared_info  : Shared info,
+REC_TYPE_tsc_info : TSC info,
+REC_TYPE_hvm_context  : HVM context,
+REC_TYPE_hvm_params   : HVM params,
+REC_TYPE_toolstack: Toolstack,
+REC_TYPE_x86_pv_vcpu_msrs : x86 PV vcpu msrs,
+REC_TYPE_verify   : Verify,
+REC_TYPE_checkpoint   : Checkpoint,
+}
+
+# page_data
+PAGE_DATA_FORMAT = II
+PAGE_DATA_PFN_MASK   = (1L  52) - 1
+PAGE_DATA_PFN_RESZ_MASK  = ((1L  60) - 1)  ~((1L  52) - 1)
+
+# flags from xen/public/domctl.h: XEN_DOMCTL_PFINFO_* shifted by 32 bits
+PAGE_DATA_TYPE_SHIFT = 60
+PAGE_DATA_TYPE_LTABTYPE_MASK = (0x7L  PAGE_DATA_TYPE_SHIFT)
+PAGE_DATA_TYPE_LTAB_MASK = (0xfL  PAGE_DATA_TYPE_SHIFT)
+PAGE_DATA_TYPE_LPINTAB   = (0x8L  PAGE_DATA_TYPE_SHIFT) # Pinned 
pagetable
+
+PAGE_DATA_TYPE_NOTAB = (0x0L  PAGE_DATA_TYPE_SHIFT) # Regular page
+PAGE_DATA_TYPE_L1TAB = (0x1L  PAGE_DATA_TYPE_SHIFT) # L1 pagetable
+PAGE_DATA_TYPE_L2TAB = (0x2L  PAGE_DATA_TYPE_SHIFT) # L2 pagetable
+PAGE_DATA_TYPE_L3TAB = (0x3L  PAGE_DATA_TYPE_SHIFT) # L3 pagetable
+PAGE_DATA_TYPE_L4TAB = (0x4L  PAGE_DATA_TYPE_SHIFT) # L4 pagetable
+PAGE_DATA_TYPE_BROKEN= (0xdL  PAGE_DATA_TYPE_SHIFT) # Broken
+PAGE_DATA_TYPE_XALLOC= (0xeL  PAGE_DATA_TYPE_SHIFT) # Allocate-only
+PAGE_DATA_TYPE_XTAB  

[Xen-devel] [PATCH v4 14/29] tools/python: Other migration infrastructure

2015-07-14 Thread Andrew Cooper
Contains:
 * Reverse-engineered notes of the legacy format from xg_save_restore.h
 * Python implementation of the legacy format
 * Public HVM Params used in the legacy stream
 * XL header format

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
New in v2 - removes various many magic numbers from subsequent scripts
---
 tools/python/xen/migration/legacy.py |  279 ++
 tools/python/xen/migration/public.py |   21 +++
 tools/python/xen/migration/xl.py |   12 ++
 3 files changed, 312 insertions(+)
 create mode 100644 tools/python/xen/migration/legacy.py
 create mode 100644 tools/python/xen/migration/public.py
 create mode 100644 tools/python/xen/migration/xl.py

diff --git a/tools/python/xen/migration/legacy.py 
b/tools/python/xen/migration/legacy.py
new file mode 100644
index 000..2f2240a
--- /dev/null
+++ b/tools/python/xen/migration/legacy.py
@@ -0,0 +1,279 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+
+Libxc legacy migration streams
+
+Documentation and record structures for legacy migration
+
+
+
+SAVE/RESTORE/MIGRATE PROTOCOL
+=
+
+The general form of a stream of chunks is a header followed by a
+body consisting of a variable number of chunks (terminated by a
+chunk with type 0) followed by a trailer.
+
+For a rolling/checkpoint (e.g. remus) migration then the body and
+trailer phases can be repeated until an external event
+(e.g. failure) causes the process to terminate and commit to the
+most recent complete checkpoint.
+
+HEADER
+--
+
+unsigned long: p2m_size
+
+extended-info (PV-only, optional):
+
+  If first unsigned long == ~0UL then extended info is present,
+  otherwise unsigned long is part of p2m. Note that p2m_size above
+  does not include the length of the extended info.
+
+  extended-info:
+
+unsigned long: signature == ~0UL
+uint32_t   : number of bytes remaining in extended-info
+
+1 or more extended-info blocks of form:
+char[4]  : block identifier
+uint32_t : block data size
+bytes: block data
+
+defined extended-info blocks:
+vcpu : VCPU context info containing vcpu_guest_context_t.
+   The precise variant of the context structure
+   (e.g. 32 vs 64 bit) is distinguished by
+   the block size.
+extv   : Presence indicates use of extended VCPU context in
+   tail, data size is 0.
+
+p2m (PV-only):
+
+  consists of p2m_size bytes comprising an array of xen_pfn_t sized entries.
+
+BODY PHASE - Format A (for live migration or Remus without compression)
+--
+
+A series of chunks with a common header:
+  int  : chunk type
+
+If the chunk type is +ve then chunk contains guest memory data, and the
+type contains the number of pages in the batch:
+
+unsigned long[]  : PFN array, length == number of pages in batch
+   Each entry consists of XEN_DOMCTL_PFINFO_*
+   in bits 31-28 and the PFN number in bits 27-0.
+page data: PAGE_SIZE bytes for each page marked present in PFN
+   array
+
+If the chunk type is -ve then chunk consists of one of a number of
+metadata types.  See definitions of XC_SAVE_ID_* below.
+
+If chunk type is 0 then body phase is complete.
+
+
+BODY PHASE - Format B (for Remus with compression)
+--
+
+A series of chunks with a common header:
+  int  : chunk type
+
+If the chunk type is +ve then chunk contains array of PFNs corresponding
+to guest memory and type contains the number of PFNs in the batch:
+
+unsigned long[]  : PFN array, length == number of pages in batch
+   Each entry consists of XEN_DOMCTL_PFINFO_*
+   in bits 31-28 and the PFN number in bits 27-0.
+
+If the chunk type is -ve then chunk consists of one of a number of
+metadata types.  See definitions of XC_SAVE_ID_* below.
+
+If the chunk type is -ve and equals XC_SAVE_ID_COMPRESSED_DATA, then the
+chunk consists of compressed page data, in the following format:
+
+unsigned long: Size of the compressed chunk to follow
+compressed data :  variable length data of size indicated above.
+   This chunk consists of compressed page data.
+   The number of pages in one chunk depends on
+   the amount of space available in the sender's
+   output buffer.
+
+Format of compressed data:
+  compressed_data = deltas*
+  delta   = marker, run*
+  marker  = (RUNFLAG|SKIPFLAG) bitwise-or RUNLEN [1 byte marker]
+  RUNFLAG = 0
+  SKIPFLAG= 1  7
+  RUNLEN  = 7-bit unsigned value indicating number of WORDS in the run
+  run  

[Xen-devel] [PATCH v4 18/29] tools/libxl: Infrastructure for reading a libxl migration v2 stream

2015-07-14 Thread Andrew Cooper
From: Ross Lagerwall ross.lagerw...@citrix.com

This contains the event machinery and state machines to read an act on
a non-checkpointed migration v2 stream (with the exception of the
xc_domain_restore() handling which is spliced later in a bisectable
way).

It also contains some boilerplate to help support checkpointed
streams, which shall be introduced in a later patch.

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v4:
 * Don't make _init() mandatory
 * Don't unilaterally clobber rc from datacopier callback
 * Comment adjustments
 * More assertions

v3: More descriptions of the internal state, shuffle function order, add
an _init() call, condense error handling
---
 tools/libxl/Makefile|1 +
 tools/libxl/libxl_internal.h|   53 
 tools/libxl/libxl_stream_read.c |  564 +++
 3 files changed, 618 insertions(+)
 create mode 100644 tools/libxl/libxl_stream_read.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index cc9c152..c71c5fe 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -94,6 +94,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
libxl_pci.o \
libxl_dom.o libxl_exec.o libxl_xshelp.o libxl_device.o \
libxl_internal.o libxl_utils.o libxl_uuid.o \
libxl_json.o libxl_aoutils.o libxl_numa.o libxl_vnuma.o 
\
+   libxl_stream_read.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 5d3499e..9397d53 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -19,6 +19,8 @@
 
 #include libxl_osdeps.h /* must come before any other headers */
 
+#include libxl_sr_stream_format.h
+
 #include assert.h
 #include dirent.h
 #include errno.h
@@ -3211,6 +3213,57 @@ typedef void libxl__domain_create_cb(libxl__egc *egc,
  libxl__domain_create_state*,
  int rc, uint32_t domid);
 
+/* State for manipulating a libxl migration v2 stream */
+typedef struct libxl__stream_read_state libxl__stream_read_state;
+
+typedef struct libxl__sr_record_buf {
+/* private to stream read helper */
+LIBXL_STAILQ_ENTRY(struct libxl__sr_record_buf) entry;
+libxl__sr_rec_hdr hdr;
+void *body; /* iff hdr.length != 0 */
+} libxl__sr_record_buf;
+
+struct libxl__stream_read_state {
+/* filled by the user */
+libxl__ao *ao;
+libxl__domain_create_state *dcs;
+int fd;
+void (*completion_callback)(libxl__egc *egc,
+libxl__stream_read_state *srs,
+int rc);
+/* Private */
+int rc;
+bool running;
+
+/* Main stream-reading data. */
+libxl__datacopier_state dc; /* Only used when reading a record */
+libxl__sr_hdr hdr;
+LIBXL_STAILQ_HEAD(, libxl__sr_record_buf) record_queue; /* NOGC */
+enum {
+SRS_PHASE_NORMAL,
+} phase;
+bool recursion_guard;
+
+/* Only used while actively reading a record from the stream. */
+libxl__sr_record_buf *incoming_record; /* NOGC */
+
+/* Both only used when processing an EMULATOR record. */
+libxl__datacopier_state emu_dc;
+libxl__carefd *emu_carefd;
+};
+
+_hidden void libxl__stream_read_init(libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_start(libxl__egc *egc,
+  libxl__stream_read_state *stream);
+_hidden void libxl__stream_read_abort(libxl__egc *egc,
+  libxl__stream_read_state *stream, int 
rc);
+static inline bool
+libxl__stream_read_inuse(const libxl__stream_read_state *stream)
+{
+return stream-running;
+}
+
+
 struct libxl__domain_create_state {
 /* filled in by user */
 libxl__ao *ao;
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
new file mode 100644
index 000..2eb4ff5
--- /dev/null
+++ b/tools/libxl/libxl_stream_read.c
@@ -0,0 +1,564 @@
+/*
+ * Copyright (C) 2015  Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include 

[Xen-devel] [PATCH v4 20/29] tools/libxl: Convert a legacy stream if needed

2015-07-14 Thread Andrew Cooper
For backwards compatibility, a legacy stream needs converting before
it can be read by the v2 stream logic.

This causes the v2 stream logic to need to juggle two parallel tasks.
check_all_finished() is introduced for the purpose of joining the
tasks in both success and error cases.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v4: Alter logic layout in check_all_finished()
v3: Simplified greatly from v2.  No practical change.
---
 tools/libxl/libxl_internal.h|2 ++
 tools/libxl/libxl_stream_read.c |   66 +++
 2 files changed, 68 insertions(+)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 97755fe..b65019b 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3258,12 +3258,14 @@ struct libxl__stream_read_state {
 libxl__ao *ao;
 libxl__domain_create_state *dcs;
 int fd;
+bool legacy;
 void (*completion_callback)(libxl__egc *egc,
 libxl__stream_read_state *srs,
 int rc);
 /* Private */
 int rc;
 bool running;
+libxl__conversion_helper_state chs;
 
 /* Main stream-reading data. */
 libxl__datacopier_state dc; /* Only used when reading a record */
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 2eb4ff5..4b677f6 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -86,6 +86,10 @@
  *- stream_write_emulator()
  *- stream_write_emulator_done()
  *- stream_continue()
+ *
+ * Depending on the contents of the stream, there are likely to be several
+ * parallel tasks being managed.  check_all_finished() is used to join all
+ * tasks in both success and error cases.
  */
 
 /* Success/error/cleanup handling. */
@@ -93,6 +97,10 @@ static void stream_complete(libxl__egc *egc,
 libxl__stream_read_state *stream, int rc);
 static void stream_done(libxl__egc *egc,
 libxl__stream_read_state *stream);
+static void conversion_done(libxl__egc *egc,
+libxl__conversion_helper_state *chs, int rc);
+static void check_all_finished(libxl__egc *egc,
+   libxl__stream_read_state *stream, int rc);
 
 /* Event chain for first iteration, from _start(). */
 static void stream_header_done(libxl__egc *egc,
@@ -151,6 +159,7 @@ void libxl__stream_read_init(libxl__stream_read_state 
*stream)
 {
 stream-rc = 0;
 stream-running = false;
+libxl__conversion_helper_init(stream-chs);
 FILLZERO(stream-dc);
 FILLZERO(stream-hdr);
 LIBXL_STAILQ_INIT(stream-record_queue);
@@ -165,6 +174,7 @@ void libxl__stream_read_start(libxl__egc *egc,
   libxl__stream_read_state *stream)
 {
 libxl__datacopier_state *dc = stream-dc;
+STATE_AO_GC(stream-ao);
 int rc = 0;
 
 libxl__stream_read_init(stream);
@@ -172,6 +182,29 @@ void libxl__stream_read_start(libxl__egc *egc,
 stream-running = true;
 stream-phase   = SRS_PHASE_NORMAL;
 
+if (stream-legacy) {
+/* Convert the legacy stream. */
+libxl__conversion_helper_state *chs = stream-chs;
+
+chs-ao = stream-ao;
+chs-legacy_fd = stream-fd;
+chs-hvm =
+(stream-dcs-guest_config-b_info.type == LIBXL_DOMAIN_TYPE_HVM);
+chs-completion_callback = conversion_done;
+
+rc = libxl__convert_legacy_stream(egc, stream-chs);
+
+if (rc) {
+LOG(ERROR, Failed to start the legacy stream conversion helper);
+goto err;
+}
+
+assert(stream-chs.v2_carefd);
+stream-fd = libxl__carefd_fd(stream-chs.v2_carefd);
+stream-dcs-libxc_fd = stream-fd;
+}
+/* stream-fd is now a v2 stream. */
+
 dc-ao  = stream-ao;
 dc-readfd  = stream-fd;
 dc-writefd = -1;
@@ -545,6 +578,10 @@ static void stream_done(libxl__egc *egc,
 if (stream-emu_carefd)
 libxl__carefd_close(stream-emu_carefd);
 
+/* If we started a conversion helper, we took ownership of its carefd. */
+if (stream-chs.v2_carefd)
+libxl__carefd_close(stream-chs.v2_carefd);
+
 /* The record queue had better be empty if the stream believes
  * itself to have been successful. */
 assert(LIBXL_STAILQ_EMPTY(stream-record_queue) || stream-rc);
@@ -552,6 +589,35 @@ static void stream_done(libxl__egc *egc,
 LIBXL_STAILQ_FOREACH_SAFE(rec, stream-record_queue, entry, trec)
 free_record(rec);
 
+check_all_finished(egc, stream, stream-rc);
+}
+
+static void conversion_done(libxl__egc *egc,
+libxl__conversion_helper_state *chs, int rc)
+{
+libxl__stream_read_state *stream = CONTAINER_OF(chs, *stream, chs);
+
+check_all_finished(egc, stream, rc);
+}
+
+static void 

[Xen-devel] [PATCH v4 26/29] tools/libx{c, l}: Introduce restore_callbacks.checkpoint()

2015-07-14 Thread Andrew Cooper
And call it when a checkpoint record is found in the libxc stream.

Some parts of this patch have been based on patches from the COLO
series.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Yang Hongyang yan...@cn.fujitsu.com
Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v3: Named constants for the API
v2: Borrow sufficient fragments from several COLO patches to get
BROKEN_CHANNEL and checkpoint failover to function.
---
 tools/libxc/include/xenguest.h |7 +
 tools/libxc/xc_sr_common.h |7 +++--
 tools/libxc/xc_sr_restore.c|   53 ++--
 tools/libxl/libxl_save_msgs_gen.pl |2 +-
 4 files changed, 51 insertions(+), 18 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index 7581263..e95af54 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -102,6 +102,13 @@ struct restore_callbacks {
 int (*toolstack_restore)(uint32_t domid, const uint8_t *buf,
 uint32_t size, void* data);
 
+/* A checkpoint record has been found in the stream.
+ * returns: */
+#define XGR_CHECKPOINT_ERROR0 /* Terminate processing */
+#define XGR_CHECKPOINT_SUCCESS  1 /* Continue reading more data from the 
stream */
+#define XGR_CHECKPOINT_FAILOVER 2 /* Failover and resume VM */
+int (*checkpoint)(void* data);
+
 /* to be provided as the last argument to each callback function */
 void* data;
 };
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 08c66db..1f4d4e4 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -130,10 +130,13 @@ struct xc_sr_restore_ops
  * Process an individual record from the stream.  The caller shall take
  * care of processing common records (e.g. END, PAGE_DATA).
  *
- * @return 0 for success, -1 for failure, or the sentinel value
- * RECORD_NOT_PROCESSED.
+ * @return 0 for success, -1 for failure, or the following sentinels:
+ *  - RECORD_NOT_PROCESSED
+ *  - BROKEN_CHANNEL: under Remus/COLO, this means master may be dead, and
+ *a failover is needed.
  */
 #define RECORD_NOT_PROCESSED 1
+#define BROKEN_CHANNEL 2
 int (*process_record)(struct xc_sr_context *ctx, struct xc_sr_record *rec);
 
 /**
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9e27dba..18ba411 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -1,5 +1,7 @@
 #include arpa/inet.h
 
+#include assert.h
+
 #include xc_sr_common.h
 
 /*
@@ -472,7 +474,7 @@ static int handle_page_data(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 static int handle_checkpoint(struct xc_sr_context *ctx)
 {
 xc_interface *xch = ctx-xch;
-int rc = 0;
+int rc = 0, ret;
 unsigned i;
 
 if ( !ctx-restore.checkpointed )
@@ -482,6 +484,21 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
 goto err;
 }
 
+ret = ctx-restore.callbacks-checkpoint(ctx-restore.callbacks-data);
+switch ( ret )
+{
+case XGR_CHECKPOINT_SUCCESS:
+break;
+
+case XGR_CHECKPOINT_FAILOVER:
+rc = BROKEN_CHANNEL;
+goto err;
+
+default: /* Other fatal error */
+rc = -1;
+goto err;
+}
+
 if ( ctx-restore.buffer_all_records )
 {
 IPRINTF(All records buffered);
@@ -560,19 +577,6 @@ static int process_record(struct xc_sr_context *ctx, 
struct xc_sr_record *rec)
 free(rec-data);
 rec-data = NULL;
 
-if ( rc == RECORD_NOT_PROCESSED )
-{
-if ( rec-type  REC_TYPE_OPTIONAL )
-DPRINTF(Ignoring optional record %#x (%s),
-rec-type, rec_type_to_str(rec-type));
-else
-{
-ERROR(Mandatory record %#x (%s) not handled,
-  rec-type, rec_type_to_str(rec-type));
-rc = -1;
-}
-}
-
 return rc;
 }
 
@@ -678,7 +682,22 @@ static int restore(struct xc_sr_context *ctx)
 else
 {
 rc = process_record(ctx, rec);
-if ( rc )
+if ( rc == RECORD_NOT_PROCESSED )
+{
+if ( rec.type  REC_TYPE_OPTIONAL )
+DPRINTF(Ignoring optional record %#x (%s),
+rec.type, rec_type_to_str(rec.type));
+else
+{
+ERROR(Mandatory record %#x (%s) not handled,
+  rec.type, rec_type_to_str(rec.type));
+rc = -1;
+goto err;
+}
+}
+else if ( rc == BROKEN_CHANNEL )
+goto remus_failover;
+else if ( rc )
 goto err;
 }
 
@@ -735,6 +754,10 @@ int xc_domain_restore2(xc_interface *xch, int io_fd, 
uint32_t dom,
 

[Xen-devel] [PATCH v4 19/29] tools/libxl: Infrastructure to convert a legacy stream

2015-07-14 Thread Andrew Cooper
Provide a thin wrapper around exec()ing the python conversion utility,
and a stub implementation for cases where conversion is not wanted
(i.e. not x86).

One complication is that the caller of this interface needs to assume
ownership of the output fd, to prevent it being closed while still in
use in a datacopier.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v4: Formatting tweaks, description of the width field.
v3: Trimmed down massively from v2, provide libxl_no_convert_callout.c
---
 tools/libxl/Makefile   |6 ++
 tools/libxl/libxl_convert_callout.c|  171 
 tools/libxl/libxl_internal.h   |   30 ++
 tools/libxl/libxl_no_convert_callout.c |   35 +++
 4 files changed, 242 insertions(+)
 create mode 100644 tools/libxl/libxl_convert_callout.c
 create mode 100644 tools/libxl/libxl_no_convert_callout.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index c71c5fe..68fe1c1 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -56,6 +56,12 @@ else
 LIBXL_OBJS-y += libxl_nonetbuffer.o
 endif
 
+ifeq ($(CONFIG_X86),y)
+LIBXL_OBJS-y += libxl_convert_callout.o
+else
+LIBXL_OBJS-y += libxl_no_convert_callout.o
+endif
+
 LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
diff --git a/tools/libxl/libxl_convert_callout.c 
b/tools/libxl/libxl_convert_callout.c
new file mode 100644
index 000..65b4df9
--- /dev/null
+++ b/tools/libxl/libxl_convert_callout.c
@@ -0,0 +1,171 @@
+/*
+ * Copyright (C) 2014  Citrix Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include libxl_osdeps.h
+
+#include libxl_internal.h
+
+/*
+ * Infrastructure for converting a legacy migration stream into a
+ * libxl v2 stream.
+ *
+ * This is done by fork()ing the python conversion script, which takes
+ * in a legacy stream, and puts out a suitably-formatted v2 stream.
+ */
+
+static void helper_exited(libxl__egc *egc, libxl__ev_child *ch,
+  pid_t pid, int status);
+static void helper_stop(libxl__egc *egc, libxl__ao_abortable *abrt, int rc);
+static void helper_done(libxl__egc *egc,
+libxl__conversion_helper_state *chs);
+
+/*- Entrypoints -*/
+
+void libxl__conversion_helper_init(libxl__conversion_helper_state *chs)
+{
+chs-v2_carefd = NULL;
+chs-rc = 0;
+libxl__ao_abortable_init(chs-abrt);
+libxl__ev_child_init(chs-child);
+}
+
+int libxl__convert_legacy_stream(libxl__egc *egc,
+ libxl__conversion_helper_state *chs)
+{
+STATE_AO_GC(chs-ao);
+libxl__carefd *child_in = NULL, *child_out = NULL;
+int rc = 0;
+
+chs-abrt.ao = chs-ao;
+chs-abrt.callback = helper_stop;
+rc = libxl__ao_abortable_register(chs-abrt);
+if (rc) goto err;
+
+libxl__carefd_begin();
+int fds[2];
+if (libxl_pipe(CTX, fds)) {
+rc = ERROR_FAIL;
+libxl__carefd_unlock();
+goto err;
+}
+child_out = libxl__carefd_record(CTX, fds[0]);
+child_in  = libxl__carefd_record(CTX, fds[1]);
+libxl__carefd_unlock();
+
+pid_t pid = libxl__ev_child_fork(gc, chs-child, helper_exited);
+if (!pid) {
+char * const args[] =
+{
+getenv(LIBXL_CONVERT_HELPER) ?:
+LIBEXEC_BIN /convert-legacy-stream,
+--in, GCSPRINTF(%d, chs-legacy_fd),
+--out,GCSPRINTF(%d, fds[1]),
+/*
+ * The width calculation is an assumption for the common
+ * case.  The conversion script needs to know the width of
+ * the toolstack which saved the legacy stream.
+ *
+ * In the overwhelming majority of cases, the width of the
+ * saving toolstack will be the same as our current
+ * width.  To avoid extending the libxl API with a
+ * parameter intended to disappear shortly, this option
+ * has not been exposed to the caller.
+ *
+ * If more complicated conversion is required, the
+ * conversion script can be instantiated manually, which
+ * will bypass all of this conversion logic.
+ */
+--width,  sizeof(unsigned long) == 8 ? 64 : 32,
+
+--guest,  chs-hvm ? hvm : pv,
+  

[Xen-devel] [PATCH v4 15/29] tools/python: Verification utility for v2 stream spec compliance

2015-07-14 Thread Andrew Cooper
Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
This is exceedingly useful for development, but not of practical use being
installed into a production dom0.
---
 tools/python/scripts/verify-stream-v2 |  174 +
 1 file changed, 174 insertions(+)
 create mode 100755 tools/python/scripts/verify-stream-v2

diff --git a/tools/python/scripts/verify-stream-v2 
b/tools/python/scripts/verify-stream-v2
new file mode 100755
index 000..3daf257
--- /dev/null
+++ b/tools/python/scripts/verify-stream-v2
@@ -0,0 +1,174 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+ Verify a v2 format migration stream 
+
+import sys
+import struct
+import os, os.path
+import syslog
+import traceback
+
+from xen.migration.verify import StreamError, RecordError
+from xen.migration.libxc import VerifyLibxc
+from xen.migration.libxl import VerifyLibxl
+
+fin = None # Input file/fd
+log_to_syslog = False  # Boolean - Log to syslog instead of stdout/err?
+verbose = False# Boolean - Summarise stream contents
+quiet = False  # Boolean - Suppress error printing
+
+def info(msg):
+Info message, routed to appropriate destination
+if not quiet and verbose:
+if log_to_syslog:
+for line in msg.split(\n):
+syslog.syslog(syslog.LOG_INFO, line)
+else:
+print msg
+
+def err(msg):
+Error message, routed to appropriate destination
+if not quiet:
+if log_to_syslog:
+for line in msg.split(\n):
+syslog.syslog(syslog.LOG_ERR, line)
+print  sys.stderr, msg
+
+def stream_read(_ = None):
+Read from input
+return fin.read(_)
+
+def rdexact(nr_bytes):
+Read exactly nr_bytes from fin
+_ = stream_read(nr_bytes)
+if len(_) != nr_bytes:
+raise IOError(Stream truncated)
+return _
+
+def unpack_exact(fmt):
+Unpack a format from fin
+sz = struct.calcsize(fmt)
+return struct.unpack(fmt, rdexact(sz))
+
+
+def skip_xl_header():
+Skip over an xl header in the stream
+
+hdr = rdexact(32)
+if hdr != Xen saved domain, xl format\n \0 \r:
+raise StreamError(No xl header)
+
+_, mflags, _, optlen = unpack_exact(=)
+_ = rdexact(optlen)
+
+info(Processed xl header)
+
+if mflags  2: # XL_MANDATORY_FLAG_STREAMv2
+return libxl
+else:
+return libxc
+
+def read_stream(fmt):
+ Read an entire stream 
+
+try:
+if fmt == xl:
+fmt = skip_xl_header()
+
+if fmt == libxc:
+VerifyLibxc(info, stream_read).verify()
+else:
+VerifyLibxl(info, stream_read).verify()
+
+except (IOError, StreamError, RecordError):
+err(Stream Error:)
+err(traceback.format_exc())
+return 1
+
+except StandardError:
+err(Script Error:)
+err(traceback.format_exc())
+err(Please fix me)
+return 2
+
+return 0
+
+def open_file_or_fd(val, mode, buffering):
+
+If 'val' looks like a decimal integer, open it as an fd.  If not, try to
+open it as a regular file.
+
+
+fd = -1
+try:
+# Does it look like an integer?
+try:
+fd = int(val, 10)
+except ValueError:
+pass
+
+# Try to open it...
+if fd != -1:
+return os.fdopen(fd, mode, buffering)
+else:
+return open(val, mode, buffering)
+
+except StandardError, e:
+if fd != -1:
+err(Unable to open fd %d: %s: %s %
+(fd, e.__class__.__name__, e))
+else:
+err(Unable to open file '%s': %s: %s %
+(val, e.__class__.__name__, e))
+
+raise SystemExit(2)
+
+def main():
+ main 
+from optparse import OptionParser
+global fin, quiet, verbose
+
+# Change stdout to be line-buffered.
+sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 1)
+
+parser = OptionParser(usage = %prog [options],
+  description =
+  Verify a stream according to the v2 spec)
+
+# Optional options
+parser.add_option(-i, --in, dest = fin, metavar = FD or FILE,
+  default = 0,
+  help = Stream to verify (defaults to stdin))
+parser.add_option(-v, --verbose, action = store_true, default = 
False,
+  help = Summarise stream contents)
+parser.add_option(-q, --quiet, action = store_true, default = False,
+  help = Suppress all logging/errors)
+parser.add_option(-f, --format, dest = format,
+  metavar = libxc|libxl|xl, default = libxc,
+  choices = [libxc, libxl, xl],
+  help = Format of the incoming stream (defaults to 
libxc))
+parser.add_option(--syslog, action = 

[Xen-devel] [PATCH v4 11/29] docs: Libxl migration v2 stream specification

2015-07-14 Thread Andrew Cooper
Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v3: Spelling fixes
---
 docs/specs/libxl-migration-stream.pandoc |  205 ++
 1 file changed, 205 insertions(+)
 create mode 100644 docs/specs/libxl-migration-stream.pandoc

diff --git a/docs/specs/libxl-migration-stream.pandoc 
b/docs/specs/libxl-migration-stream.pandoc
new file mode 100644
index 000..4192950
--- /dev/null
+++ b/docs/specs/libxl-migration-stream.pandoc
@@ -0,0 +1,205 @@
+% LibXenLight Domain Image Format
+% Andrew Cooper andrew.coop...@citrix.com
+% Draft B
+
+Introduction
+
+
+For the purposes of this document, `xl` is used as a representation of any
+implementer of the `libxl` API.  `xl` should be considered completely
+interchangeable with alternates, such as `libvirt` or `xenopsd-xl`.
+
+Purpose
+---
+
+The _domain image format_ is the context of a running domain used for
+snapshots of a domain or for transferring domains between hosts during
+migration.
+
+There are a number of problems with the domain image format used in Xen 4.5
+and earlier (the _legacy format_)
+
+* There is no `libxl` context information.  `xl` is required to send certain
+  pieces of `libxl` context itself.
+
+* The contents of the stream is passed directly through `libxl` to `libxc`.
+  The legacy `libxc` format contained some information which belonged at the
+  `libxl` level, resulting in awkward layer violation to return the
+  information back to `libxl`.
+
+* The legacy `libxc` format was inextensible, causing inextensibility in the
+  legacy `libxl` handling.
+
+This design addresses the above points, allowing for a completely
+self-contained, extensible stream with each layer responsible for its own
+appropriate information.
+
+
+Not Yet Included
+
+
+The following features are not yet fully specified and will be
+included in a future draft.
+
+* Remus
+
+* ARM
+
+
+Overview
+
+
+The image format consists of a _Header_, followed by 1 or more _Records_.
+Each record consists of a type and length field, followed by any type-specific
+data.
+
+\clearpage
+
+Header
+==
+
+The header identifies the stream as a `libxl` stream, including the version of
+this specification that it complies with.
+
+All fields in this header shall be in _big-endian_ byte order, regardless of
+the setting of the endianness bit.
+
+ 0 1 2 3 4 5 6 7 octet
++-+
+| ident   |
++---+-+
+| version   | options |
++---+-+
+
+
+Field   Description
+--- 
+ident   0x4c6962786c466d74 (LibxlFmt in ASCII).
+
+version 0x0002.  The version of this specification.
+
+options bit 0: Endianness.0 = little-endian, 1 = big-endian.
+
+bit 1: Legacy Format. If set, this stream was created by
+  the legacy conversion tool.
+
+bits 2-31: Reserved.
+
+
+The endianness shall be 0 (little-endian) for images generated on an
+i386, x86_64, or arm host.
+
+\clearpage
+
+
+Records
+===
+
+A record has a record header, type specific data and a trailing footer.  If
+`length` is not a multiple of 8, the body is padded with zeroes to align the
+end of the record on an 8 octet boundary.
+
+ 0 1 2 3 4 5 6 7 octet
++---+-+
+| type  | body_length |
++---+---+-+
+| body... |
+...
+|   | padding (0 to 7 octets) |
++---+-+
+
+
+FieldDescription
+---  ---
+type 0x: END
+
+ 0x0001: LIBXC_CONTEXT
+
+ 0x0002: XENSTORE_DATA
+
+ 0x0003: EMULATOR_CONTEXT
+
+ 0x0004 - 0x7FFF: Reserved for future _mandatory_
+ records.
+
+ 0x8000 - 0x: Reserved for future _optional_
+ records.
+
+body_length  Length in octets of the record body.
+
+body Content of the record.
+
+padding  0 to 7 octets of zeros to pad the whole record to a multiple
+ of 8 octets.
+
+
+\clearpage
+
+END
+
+
+A 

[Xen-devel] [PATCH v4 29/29] tools/libxl: Drop all knowledge of toolstack callbacks

2015-07-14 Thread Andrew Cooper
Libxl has now been fully adjusted not to need them.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxl/libxl_dom.c|1 -
 tools/libxl/libxl_internal.h   |2 --
 tools/libxl/libxl_save_callout.c   |   39 +---
 tools/libxl/libxl_save_helper.c|   29 ---
 tools/libxl/libxl_save_msgs_gen.pl |7 ++-
 5 files changed, 3 insertions(+), 75 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 927a10e..81adb3d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -2115,7 +2115,6 @@ void libxl__domain_suspend(libxl__egc *egc, 
libxl__domain_suspend_state *dss)
 callbacks-suspend = libxl__domain_suspend_callback;
 
 callbacks-switch_qemu_logdirty = 
libxl__domain_suspend_common_switch_qemu_logdirty;
-dss-sws.shs.callbacks.save.toolstack_save = libxl__toolstack_save;
 
 dss-sws.ao  = dss-ao;
 dss-sws.dss = dss;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7aa4f16..d9deaad 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2699,8 +2699,6 @@ _hidden void libxl__datacopier_prefixdata(libxl__egc*, 
libxl__datacopier_state*,
 
 typedef struct libxl__srm_save_callbacks {
 libxl__srm_save_autogen_callbacks a;
-int (*toolstack_save)(uint32_t domid, uint8_t **buf,
-  uint32_t *len, void *data);
 } libxl__srm_save_callbacks;
 
 typedef struct libxl__srm_restore_callbacks {
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index a5e180c..f2ce868 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -80,41 +80,12 @@ void libxl__xc_domain_save(libxl__egc *egc, 
libxl__domain_suspend_state *dss,
libxl__save_helper_state *shs)
 {
 STATE_AO_GC(dss-ao);
-int r, rc, toolstack_data_fd = -1;
-uint32_t toolstack_data_len = 0;
-
-/* Resources we need to free */
-uint8_t *toolstack_data_buf = 0;
 
 unsigned cbflags =
 libxl__srm_callout_enumcallbacks_save(shs-callbacks.save.a);
 
-if (shs-callbacks.save.toolstack_save) {
-r = shs-callbacks.save.toolstack_save
-(dss-domid, toolstack_data_buf, toolstack_data_len, dss);
-if (r) { rc = ERROR_FAIL; goto out; }
-
-shs-toolstack_data_file = tmpfile();
-if (!shs-toolstack_data_file) {
-LOGE(ERROR, cannot create toolstack data tmpfile);
-rc = ERROR_FAIL;
-goto out;
-}
-toolstack_data_fd = fileno(shs-toolstack_data_file);
-
-r = libxl_write_exactly(CTX, toolstack_data_fd,
-toolstack_data_buf, toolstack_data_len,
-toolstack data tmpfile, 0);
-if (r) { rc = ERROR_FAIL; goto out; }
-
-/* file position must be reset before passing to libxl-save-helper. */
-r = lseek(toolstack_data_fd, 0, SEEK_SET);
-if (r) { rc = ERROR_FAIL; goto out; }
-}
-
 const unsigned long argnums[] = {
 dss-domid, 0, 0, dss-xcflags, dss-hvm,
-toolstack_data_fd, toolstack_data_len,
 cbflags,
 };
 
@@ -125,18 +96,10 @@ void libxl__xc_domain_save(libxl__egc *egc, 
libxl__domain_suspend_state *dss,
 shs-caller_state = dss;
 shs-need_results = 0;
 
-free(toolstack_data_buf);
-
 run_helper(egc, shs, --save-domain, dss-fd,
-   toolstack_data_fd, 1,
+   NULL, 0,
argnums, ARRAY_SIZE(argnums));
 return;
-
- out:
-free(toolstack_data_buf);
-if (shs-toolstack_data_file) fclose(shs-toolstack_data_file);
-
-libxl__xc_domain_save_done(egc, dss, rc, 0, 0);
 }
 
 
diff --git a/tools/libxl/libxl_save_helper.c b/tools/libxl/libxl_save_helper.c
index f196786..1622bb7 100644
--- a/tools/libxl/libxl_save_helper.c
+++ b/tools/libxl/libxl_save_helper.c
@@ -213,32 +213,8 @@ int helper_getreply(void *user)
 
 /*- other callbacks -*/
 
-static int toolstack_save_fd;
-static uint32_t toolstack_save_len;
 static struct save_callbacks helper_save_callbacks;
 
-static int toolstack_save_cb(uint32_t domid, uint8_t **buf,
- uint32_t *len, void *data)
-{
-int r;
-
-assert(toolstack_save_fd  0);
-
-/* This is a hack for remus */
-if (helper_save_callbacks.checkpoint) {
-r = lseek(toolstack_save_fd, 0, SEEK_SET);
-if (r) fail(errno,rewind toolstack data tmpfile);
-}
-
-*buf = xmalloc(toolstack_save_len);
-r = read_exactly(toolstack_save_fd, *buf, toolstack_save_len);
-if (r0) fail(errno,read toolstack data);
-if (r==0) fail(0,read toolstack data eof);
-
-*len = toolstack_save_len;
-return 0;
-}
-
 static void startup(const char *op) {
 

[Xen-devel] [PATCH v4 13/29] tools/python: Libxl migration v2 infrastructure

2015-07-14 Thread Andrew Cooper
Contains:
 * Python implementation of the libxl migration v2 records
 * Verification code for spec compliance
 * Unit tests

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/python/xen/migration/libxl.py |  188 +++
 tools/python/xen/migration/tests.py |   13 +++
 2 files changed, 201 insertions(+)
 create mode 100644 tools/python/xen/migration/libxl.py

diff --git a/tools/python/xen/migration/libxl.py 
b/tools/python/xen/migration/libxl.py
new file mode 100644
index 000..4e1f4f8
--- /dev/null
+++ b/tools/python/xen/migration/libxl.py
@@ -0,0 +1,188 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+
+Libxl Migration v2 streams
+
+Record structures as per docs/specs/libxl-migration-stream.pandoc, and
+verification routines.
+
+
+import sys
+
+from struct import calcsize, unpack
+from xen.migration.verify import StreamError, RecordError, VerifyBase
+from xen.migration.libxc import VerifyLibxc
+
+# Header
+HDR_FORMAT = !QII
+
+HDR_IDENT = 0x4c6962786c466d74 # LibxlFmt in ASCII
+HDR_VERSION = 2
+
+HDR_OPT_BIT_ENDIAN = 0
+HDR_OPT_BIT_LEGACY = 1
+
+HDR_OPT_LE = (0  HDR_OPT_BIT_ENDIAN)
+HDR_OPT_BE = (1  HDR_OPT_BIT_ENDIAN)
+HDR_OPT_LEGACY = (1  HDR_OPT_BIT_LEGACY)
+
+HDR_OPT_RESZ_MASK = 0xfffc
+
+# Records
+RH_FORMAT = II
+
+REC_TYPE_end  = 0x
+REC_TYPE_libxc_context= 0x0001
+REC_TYPE_xenstore_data= 0x0002
+REC_TYPE_emulator_context = 0x0003
+
+rec_type_to_str = {
+REC_TYPE_end  : End,
+REC_TYPE_libxc_context: Libxc context,
+REC_TYPE_xenstore_data: Xenstore data,
+REC_TYPE_emulator_context : Emulator context,
+}
+
+# emulator_context
+EMULATOR_CONTEXT_FORMAT = II
+
+EMULATOR_ID_unknown   = 0x
+EMULATOR_ID_qemu_trad = 0x0001
+EMULATOR_ID_qemu_upstream = 0x0002
+
+emulator_id_to_str = {
+EMULATOR_ID_unknown   : Unknown,
+EMULATOR_ID_qemu_trad : Qemu Traditional,
+EMULATOR_ID_qemu_upstream : Qemu Upstream,
+}
+
+
+#
+# libxl format
+#
+
+LIBXL_QEMU_SIGNATURE = DeviceModelRecord0002
+LIBXL_QEMU_RECORD_HDR = =%dsI % (len(LIBXL_QEMU_SIGNATURE), )
+
+class VerifyLibxl(VerifyBase):
+ Verify a Libxl v2 stream 
+
+def __init__(self, info, read):
+VerifyBase.__init__(self, info, read)
+
+
+def verify(self):
+ Verity a libxl stream 
+
+self.verify_hdr()
+
+while self.verify_record() != REC_TYPE_end:
+pass
+
+
+def verify_hdr(self):
+ Verify a Header 
+ident, version, options = self.unpack_exact(HDR_FORMAT)
+
+if ident != HDR_IDENT:
+raise StreamError(Bad image id: Expected 0x%x, got 0x%x
+  % (HDR_IDENT, ident))
+
+if version != HDR_VERSION:
+raise StreamError(Unknown image version: Expected %d, got %d
+  % (HDR_VERSION, version))
+
+if options  HDR_OPT_RESZ_MASK:
+raise StreamError(Reserved bits set in image options field: 0x%x
+  % (options  HDR_OPT_RESZ_MASK))
+
+if ( (sys.byteorder == little) and
+ ((options  HDR_OPT_BIT_ENDIAN) != HDR_OPT_LE) ):
+raise StreamError(
+Stream is not native endianess - unable to validate)
+
+endian = [little, big][options  HDR_OPT_LE]
+
+if options  HDR_OPT_LEGACY:
+self.info(Libxl Header: %s endian, legacy converted % (endian, ))
+else:
+self.info(Libxl Header: %s endian % (endian, ))
+
+
+def verify_record(self):
+ Verify an individual record 
+rtype, length = self.unpack_exact(RH_FORMAT)
+
+if rtype not in rec_type_to_str:
+raise StreamError(Unrecognised record type %x % (rtype, ))
+
+self.info(Libxl Record: %s, length %d
+  % (rec_type_to_str[rtype], length))
+
+contentsz = (length + 7)  ~7
+content = self.rdexact(contentsz)
+
+padding = content[length:]
+if padding != \x00 * len(padding):
+raise StreamError(Padding containing non0 bytes found)
+
+if rtype not in record_verifiers:
+raise RuntimeError(No verification function for libxl record '%s'
+   % rec_type_to_str[rtype])
+else:
+record_verifiers[rtype](self, content[:length])
+
+return rtype
+
+
+def verify_record_end(self, content):
+ End record 
+
+if len(content) != 0:
+raise RecordError(End record with non-zero length)
+
+
+def verify_record_libxc_context(self, content):
+ Libxc context record 
+
+if len(content) != 0:
+raise RecordError(Libxc context record with non-zero length)
+
+# Verify the libxc stream, as we can't seek forwards through it
+

Re: [Xen-devel] [v7][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-14 Thread George Dunlap
On 07/14/2015 11:53 AM, Chen, Tiejun wrote:
 The way this sort of thing is defined in the rest of domctl.h is like
 this:

 #define _XEN_DOMCTL_CDF_hvm_guest 0
 #define XEN_DOMCTL_CDF_hvm_guest  (1U_XEN_DOMCTL_CDF_hvm_guest)

 So the above should be

 #define _XEN_DOMCTL_DEV_RDM_RELAXED 0
 #define XEN_DOMCTL_DEV_RDM_RELAXED (1U_XEN_DOMCTL_DEV_RDM_RELAXED)

 And then your check in iommu_do_pci_domctl() would look like

 if (flag  ~XEN_DOMCTL_DEV_RDM_RELAXED)

 And if we end up adding any extra flags, we just | them into the above
 conditional, as is done in, for example, the XEN_DOMCTL_createdomain
 case in xen/common/domctl.c:do_domctl().

 
 Seems Jan didn't like this way IIRC, so I hope Jan also can have a look
 at this beforehand :)

I think Jan thought that the MASK value you defined wasn't meant to be a
single flag, but for all the flags; i.e., that if we added flags in bits
1 and 2, that MASK would become 0x7 rather than 0x1.  And I agree that
there's not much point to having such a mask defined in the public header.

But what I'm doing above is making explicit what you have already; i.e.,
you just set XEN_DOMCTL_DEV_RDM_RELAXED to '1'; the reader has to sort
of infer that the reason '1' is chosen is that it's setting bit 0.
Doing it the way I suggest makes it more clear that this is meant to be
a bitfield, and '0' has been allocated.

Please correct me if I'm wrong, Jan.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH V5 0/3] Vm_event memory introspection helpers

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 13:45, rcojoc...@bitdefender.com wrote:
 On 07/14/2015 01:50 PM, Jan Beulich wrote:
 On 13.07.15 at 19:14, rcojoc...@bitdefender.com wrote:
 I've also moved x86 logic in patch 3/3 to x86 source files, this
 seems to have gone unnoticed but would likely have not compiled
 on ARM.
 
 Which leaves open whether this time you actually checked that
 ARM continues to build.
 
 I did check, and again just now. My patches don't break the ARM build,
 but I just found that on my ARM system, current Xen staging doesn't
 build: http://pastebin.com/RnywiCX7 

That's the tool stack having issues, while we're talking about the
hypervisor build here.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 02:14, edmund.h.wh...@intel.com wrote:
 +void
 +altp2m_vcpu_initialise(struct vcpu *v)
 +{
 +if ( v != current )
 +vcpu_pause(v);
 +
 +altp2m_vcpu_reset(v);
 +vcpu_altp2m(v).p2midx = 0;
 +atomic_inc(p2m_get_altp2m(v)-active_vcpus);
 +
 +altp2m_vcpu_update_eptp(v);
 +
 +if ( v != current )
 +vcpu_unpause(v);
 +}
 +
 +void
 +altp2m_vcpu_destroy(struct vcpu *v)
 +{
 +struct p2m_domain *p2m;
 +
 +if ( v != current )
 +vcpu_pause(v);
 +
 +if ( (p2m = p2m_get_altp2m(v)) )
 +atomic_dec(p2m-active_vcpus);
 +
 +altp2m_vcpu_reset(v);
 +
 +altp2m_vcpu_update_eptp(v);
 +altp2m_vcpu_update_vmfunc_ve(v);
 +
 +if ( v != current )
 +vcpu_unpause(v);
 +}

There not being any caller of altp2m_vcpu_initialise() I can't judge
about its pausing requirements, but for the destroy case I can't
see what the pausing is good for. Considering its sole user it's also
not really clear why the two update operations need to be done
while destroying.

 @@ -6498,6 +6500,25 @@ enum hvm_intblk nhvm_interrupt_blocked(struct vcpu *v)
  return hvm_funcs.nhvm_intr_blocked(v);
  }
  
 +void altp2m_vcpu_update_eptp(struct vcpu *v)
 +{
 +if ( hvm_funcs.altp2m_vcpu_update_eptp )
 +hvm_funcs.altp2m_vcpu_update_eptp(v);
 +}
 +
 +void altp2m_vcpu_update_vmfunc_ve(struct vcpu *v)
 +{
 +if ( hvm_funcs.altp2m_vcpu_update_vmfunc_ve )
 +hvm_funcs.altp2m_vcpu_update_vmfunc_ve(v);
 +}
 +
 +bool_t altp2m_vcpu_emulate_ve(struct vcpu *v)
 +{
 +if ( hvm_funcs.altp2m_vcpu_emulate_ve )
 +return hvm_funcs.altp2m_vcpu_emulate_ve(v);
 +return 0;
 +}

The patch context here suggests that you're using pretty outdated
a tree - nhvm_interrupt_blocked() went away a week ago. In line
with the commit doing so, all of the above should become inline
functions.

 +uint16_t p2m_find_altp2m_by_eptp(struct domain *d, uint64_t eptp)
 +{
 +struct p2m_domain *p2m;
 +struct ept_data *ept;
 +uint16_t i;
 +
 +altp2m_list_lock(d);
 +
 +for ( i = 0; i  MAX_ALTP2M; i++ )
 +{
 +if ( d-arch.altp2m_eptp[i] == INVALID_MFN )
 +continue;
 +
 +p2m = d-arch.altp2m_p2m[i];
 +ept = p2m-ept;
 +
 +if ( eptp == ept_get_eptp(ept) )
 +goto out;
 +}
 +
 +i = INVALID_ALTP2M;
 +
 +out:

Labels should be indented by at least one space.

Pending the rename, the function may also live in the wrong file (the
use of ept_get_eptp() suggests so even if the function itself got
renamed).

 +bool_t p2m_switch_vcpu_altp2m_by_id(struct vcpu *v, uint16_t idx)
 +{
 +struct domain *d = v-domain;
 +bool_t rc = 0;
 +
 +if ( idx  MAX_ALTP2M )
 +return rc;
 +
 +altp2m_list_lock(d);
 +
 +if ( d-arch.altp2m_eptp[idx] != INVALID_MFN )
 +{
 +if ( idx != vcpu_altp2m(v).p2midx )
 +{
 +atomic_dec(p2m_get_altp2m(v)-active_vcpus);
 +vcpu_altp2m(v).p2midx = idx;
 +atomic_inc(p2m_get_altp2m(v)-active_vcpus);

Are the two results of p2m_get_altp2m(v) here guaranteed to be
distinct? If they aren't, is it safe to decrement first (potentially
dropping the count to zero)?

 @@ -722,6 +731,27 @@ void nestedp2m_write_p2m_entry(struct p2m_domain *p2m, 
 unsigned long gfn,
  l1_pgentry_t *p, l1_pgentry_t new, unsigned int level);
  
  /*
 + * Alternate p2m: shadow p2m tables used for alternate memory views
 + */
 +
 +/* get current alternate p2m table */
 +static inline struct p2m_domain *p2m_get_altp2m(struct vcpu *v)
 +{
 +struct domain *d = v-domain;
 +uint16_t index = vcpu_altp2m(v).p2midx;
 +
 +ASSERT(index  MAX_ALTP2M);
 +
 +return (index == INVALID_ALTP2M) ? NULL : d-arch.altp2m_p2m[index];
 +}

Looking at this again, I'm afraid I'd still prefer index  MAX_ALTP2M
in the return statement (and the ASSERT() dropped): The ASSERT()
does nothing in a debug=n build, and hence wouldn't shield us from
possibly having to issue an XSA if somehow an access outside the
array's bounds turned out possible.

I've also avoided to repeat any of the un-addressed points that I
raised against earlier versions.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 28/29] tools/libxc: Drop all XG_LIBXL_HVM_COMPAT code from libxc

2015-07-14 Thread Andrew Cooper
Libxl has now been fully adjusted not to need it.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com
---
 tools/libxc/xc_sr_common.h  |5 --
 tools/libxc/xc_sr_restore.c |   18 -
 tools/libxc/xc_sr_restore_x86_hvm.c |  124 ---
 tools/libxc/xc_sr_save_x86_hvm.c|   36 --
 4 files changed, 183 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 1f4d4e4..64f6082 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -309,11 +309,6 @@ struct xc_sr_context
 /* HVM context blob. */
 void *context;
 size_t contextsz;
-
-/* #ifdef XG_LIBXL_HVM_COMPAT */
-uint32_t qlen;
-void *qbuf;
-/* #endif */
 } restore;
 };
 } x86_hvm;
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 18ba411..bf1ee15 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -627,9 +627,6 @@ static void cleanup(struct xc_sr_context *ctx)
 PERROR(Failed to clean up);
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-extern int read_qemu(struct xc_sr_context *ctx);
-#endif
 /*
  * Restore a domain.
  */
@@ -656,21 +653,6 @@ static int restore(struct xc_sr_context *ctx)
 goto err;
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-if ( ctx-dominfo.hvm 
- (rec.type == REC_TYPE_END || rec.type == REC_TYPE_CHECKPOINT) )
-{
-rc = read_qemu(ctx);
-if ( rc )
-{
-if ( ctx-restore.buffer_all_records )
-goto remus_failover;
-else
-goto err;
-}
-}
-#endif
-
 if ( ctx-restore.buffer_all_records 
  rec.type != REC_TYPE_END 
  rec.type != REC_TYPE_CHECKPOINT )
diff --git a/tools/libxc/xc_sr_restore_x86_hvm.c 
b/tools/libxc/xc_sr_restore_x86_hvm.c
index 6f5af0e..49d22c7 100644
--- a/tools/libxc/xc_sr_restore_x86_hvm.c
+++ b/tools/libxc/xc_sr_restore_x86_hvm.c
@@ -3,24 +3,6 @@
 
 #include xc_sr_common_x86.h
 
-#ifdef XG_LIBXL_HVM_COMPAT
-static int handle_toolstack(struct xc_sr_context *ctx, struct xc_sr_record 
*rec)
-{
-xc_interface *xch = ctx-xch;
-int rc;
-
-if ( !ctx-restore.callbacks || !ctx-restore.callbacks-toolstack_restore 
)
-return 0;
-
-rc = ctx-restore.callbacks-toolstack_restore(
-ctx-domid, rec-data, rec-length, ctx-restore.callbacks-data);
-
-if ( rc  0 )
-PERROR(restoring toolstack);
-return rc;
-}
-#endif
-
 /*
  * Process an HVM_CONTEXT record from the stream.
  */
@@ -93,98 +75,6 @@ static int handle_hvm_params(struct xc_sr_context *ctx,
 return 0;
 }
 
-#ifdef XG_LIBXL_HVM_COMPAT
-int read_qemu(struct xc_sr_context *ctx);
-int read_qemu(struct xc_sr_context *ctx)
-{
-xc_interface *xch = ctx-xch;
-char qemusig[21];
-uint32_t qlen;
-void *qbuf = NULL;
-int rc = -1;
-
-if ( read_exact(ctx-fd, qemusig, sizeof(qemusig)) )
-{
-PERROR(Error reading QEMU signature);
-goto out;
-}
-
-if ( !memcmp(qemusig, DeviceModelRecord0002, sizeof(qemusig)) )
-{
-if ( read_exact(ctx-fd, qlen, sizeof(qlen)) )
-{
-PERROR(Error reading QEMU record length);
-goto out;
-}
-
-qbuf = malloc(qlen);
-if ( !qbuf )
-{
-PERROR(no memory for device model state);
-goto out;
-}
-
-if ( read_exact(ctx-fd, qbuf, qlen) )
-{
-PERROR(Error reading device model state);
-goto out;
-}
-}
-else
-{
-ERROR(Invalid device model state signature '%*.*s',
-  (int)sizeof(qemusig), (int)sizeof(qemusig), qemusig);
-goto out;
-}
-
-/* With Remus, this could be read many times */
-if ( ctx-x86_hvm.restore.qbuf )
-free(ctx-x86_hvm.restore.qbuf);
-ctx-x86_hvm.restore.qbuf = qbuf;
-ctx-x86_hvm.restore.qlen = qlen;
-rc = 0;
-
-out:
-if (rc)
-free(qbuf);
-return rc;
-}
-
-static int handle_qemu(struct xc_sr_context *ctx)
-{
-xc_interface *xch = ctx-xch;
-char path[256];
-uint32_t qlen = ctx-x86_hvm.restore.qlen;
-void *qbuf = ctx-x86_hvm.restore.qbuf;
-int rc = -1;
-FILE *fp = NULL;
-
-sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE.%u, ctx-domid);
-fp = fopen(path, wb);
-if ( !fp )
-{
-PERROR(Failed to open '%s' for writing, path);
-goto out;
-}
-
-DPRINTF(Writing %u bytes of QEMU data, qlen);
-if ( fwrite(qbuf, 1, qlen, fp) != qlen )
-{
-PERROR(Failed to write %u bytes of QEMU data, qlen);
-goto out;
-}
-
-rc = 0;
-
- out:
-if ( fp )
-fclose(fp);

[Xen-devel] [PATCH v4 17/29] tools/libxl: Migration v2 stream format

2015-07-14 Thread Andrew Cooper
From: Ross Lagerwall ross.lagerw...@citrix.com

C structures describing the Libxl migration v2 stream format

Signed-off-by: Ross Lagerwall ross.lagerw...@citrix.com
Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v2: Move into libxl__ namespace
---
 tools/libxl/libxl_sr_stream_format.h |   57 ++
 1 file changed, 57 insertions(+)
 create mode 100644 tools/libxl/libxl_sr_stream_format.h

diff --git a/tools/libxl/libxl_sr_stream_format.h 
b/tools/libxl/libxl_sr_stream_format.h
new file mode 100644
index 000..f4f790b
--- /dev/null
+++ b/tools/libxl/libxl_sr_stream_format.h
@@ -0,0 +1,57 @@
+#ifndef LIBXL__SR_STREAM_FORMAT_H
+#define LIBXL__SR_STREAM_FORMAT_H
+
+/*
+ * C structures for the Migration v2 stream format.
+ * See docs/specs/libxl-migration-stream.pandoc
+ */
+
+#include stdint.h
+
+typedef struct libxl__sr_hdr
+{
+uint64_t ident;
+uint32_t version;
+uint32_t options;
+} libxl__sr_hdr;
+
+#define RESTORE_STREAM_IDENT 0x4c6962786c466d74UL
+#define RESTORE_STREAM_VERSION   0x0002U
+
+#define RESTORE_OPT_BIG_ENDIAN   (1u  0)
+#define RESTORE_OPT_LEGACY   (1u  1)
+
+
+typedef struct libxl__sr_rec_hdr
+{
+uint32_t type;
+uint32_t length;
+} libxl__sr_rec_hdr;
+
+/* All records must be aligned up to an 8 octet boundary */
+#define REC_ALIGN_ORDER  3U
+
+#define REC_TYPE_END 0xU
+#define REC_TYPE_LIBXC_CONTEXT   0x0001U
+#define REC_TYPE_XENSTORE_DATA   0x0002U
+#define REC_TYPE_EMULATOR_CONTEXT0x0003U
+
+typedef struct libxl__sr_emulator_hdr
+{
+uint32_t id;
+uint32_t index;
+} libxl__sr_emulator_hdr;
+
+#define EMULATOR_UNKNOWN 0xU
+#define EMULATOR_QEMU_TRADITIONAL0x0001U
+#define EMULATOR_QEMU_UPSTREAM   0x0002U
+
+#endif /* LIBXL__SR_STREAM_FORMAT_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 10/29] tools/xl: Mandatory flag indicating the format of the migration stream

2015-07-14 Thread Andrew Cooper
Introduced at this point so the python stream conversion code has a
concrete ABI to use.  Later when libxl itself starts supporting a v2
stream, it will be added to XL_MANDATORY_FLAG_ALL.

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Campbell ian.campb...@citrix.com
CC: Ian Jackson ian.jack...@eu.citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v2: Expand commit message
---
 tools/libxl/xl_cmdimpl.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 971209c..26b1e7d 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -109,6 +109,7 @@
*/
 
 #define XL_MANDATORY_FLAG_JSON (1U  0) /* config data is in JSON format */
+#define XL_MANDATORY_FLAG_STREAMv2 (1U  1) /* stream is v2 */
 #define XL_MANDATORY_FLAG_ALL  (XL_MANDATORY_FLAG_JSON)
 struct save_file_header {
 char magic[32]; /* savefileheader_magic */
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v4 23/29] tools/libxc+libxl+xl: Save v2 streams

2015-07-14 Thread Andrew Cooper
This is a complicated set of changes which must be done together for
bisectability.

 * libxl-save-helper is updated to unconditionally use libxc migration v2.
 * libxl compatibility workarounds in libxc are disabled for save operations.
 * libxl__stream_write_start() is logically spliced into the event location
   where libxl__xc_domain_save() used to reside.
 * Ownership of the helper state moves into stream_write_state.
 * xl is updated to indicate that the stream is now v2

Signed-off-by: Andrew Cooper andrew.coop...@citrix.com
Acked-by: Ian Jackson ian.jack...@eu.citrix.com
CC: Ian Campbell ian.campb...@citrix.com
CC: Wei Liu wei.l...@citrix.com

---
v3: Many small changes, follwing similar review to the restore side
---
 tools/libxc/Makefile |2 --
 tools/libxl/libxl.h  |2 ++
 tools/libxl/libxl_dom.c  |   69 --
 tools/libxl/libxl_internal.h |3 +-
 tools/libxl/libxl_save_helper.c  |2 +-
 tools/libxl/libxl_stream_write.c |   40 ++
 tools/libxl/xl_cmdimpl.c |1 +
 7 files changed, 61 insertions(+), 58 deletions(-)

diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 2cd0b1a..1aec848 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -64,8 +64,6 @@ GUEST_SRCS-$(CONFIG_X86) += xc_sr_save_x86_hvm.c
 GUEST_SRCS-y += xc_sr_restore.c
 GUEST_SRCS-y += xc_sr_save.c
 GUEST_SRCS-y += xc_offline_page.c xc_compression.c
-xc_sr_save_x86_hvm.o: CFLAGS += -DXG_LIBXL_HVM_COMPAT
-xc_sr_save_x86_hvm.opic: CFLAGS += -DXG_LIBXL_HVM_COMPAT
 else
 GUEST_SRCS-y += xc_nomigrate.c
 endif
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 74b0829..5a7308d 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -812,6 +812,8 @@
  *
  * If this is defined, then the libxl_domain_create_restore() interface takes
  * a stream_version parameter and supports a value of 2.
+ *
+ * libxl_domain_suspend() will produce a v2 stream.
  */
 #define LIBXL_HAVE_SRM_V2 1
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index a685b77..cda3a7a 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1153,6 +1153,8 @@ int libxl__toolstack_restore(uint32_t domid, const 
uint8_t *ptr,
 
 /* Domain suspend (save) */
 
+static void stream_done(libxl__egc *egc,
+libxl__stream_write_state *sws, int rc);
 static void domain_suspend_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
@@ -1379,7 +1381,7 @@ static void switch_logdirty_done(libxl__egc *egc,
 } else {
 broke = 0;
 }
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, broke);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, 
broke);
 }
 
 /*- callbacks, called by xc_domain_save -*/
@@ -1842,7 +1844,7 @@ static void 
domain_suspend_callback_common_done(libxl__egc *egc,
 libxl__domain_suspend_state *dss, int rc)
 {
 dss-rc = rc;
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, !rc);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, !rc);
 }
 
 /*- remus callbacks -*/
@@ -1878,7 +1880,7 @@ static void 
remus_domain_suspend_callback_common_done(libxl__egc *egc,
 
 out:
 dss-rc = rc;
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, !rc);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, !rc);
 }
 
 static void remus_devices_postsuspend_cb(libxl__egc *egc,
@@ -1895,7 +1897,7 @@ static void remus_devices_postsuspend_cb(libxl__egc *egc,
 out:
 if (rc)
 dss-rc = rc;
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, !rc);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, !rc);
 }
 
 static void libxl__remus_domain_resume_callback(void *data)
@@ -1930,7 +1932,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
 out:
 if (rc)
 dss-rc = rc;
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, !rc);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, !rc);
 }
 
 /*- remus asynchronous checkpoint callback -*/
@@ -1978,7 +1980,7 @@ static void remus_checkpoint_dm_saved(libxl__egc *egc,
 return;
 
 out:
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, 0);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, 0);
 }
 
 static void remus_devices_commit_cb(libxl__egc *egc,
@@ -2013,7 +2015,7 @@ static void remus_devices_commit_cb(libxl__egc *egc,
 return;
 
 out:
-libxl__xc_domain_saverestore_async_callback_done(egc, dss-shs, 0);
+libxl__xc_domain_saverestore_async_callback_done(egc, dss-sws.shs, 0);
 }
 
 static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
@@ -2034,7 +2036,7 @@ static void 

Re: [Xen-devel] [PATCH v3 03/13] VMX: implement suppress #VE.

2015-07-14 Thread George Dunlap
On 07/13/2015 08:40 AM, Jan Beulich wrote:
 On 10.07.15 at 21:30, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Thursday, July 09, 2015 6:01 AM
 On 01.07.15 at 20:09, edmund.h.wh...@intel.com wrote:
 @@ -232,6 +235,15 @@ static int ept_set_middle_entry(struct p2m_domain
 @@ -1134,6 +1151,13 @@ int ept_p2m_init(struct p2m_domain *p2m)
  p2m-flush_hardware_cached_dirty = ept_flush_pml_buffers;
  }

 +table =
 + map_domain_page(pagetable_get_pfn(p2m_get_pagetable(p2m)));
 +
 +for ( i = 0; i  EPT_PAGETABLE_ENTRIES; i++ )
 +table[i].suppress_ve = 1;
 +
 +unmap_domain_page(table);

 ... why is this needed? Bit 63 is documented to be ignored in PML4Es (just 
 like
 in all other intermediate page tables).

 Valid point - this has no negative side-effects per se so we didn't change 
 this.
 
 Taking we didn't change this to refer to v3 - v4, I still think this
 should be dropped if it isn't needed. There can only be confusion
 arising from code having no purpose.

Just want to call out the general principle Jan refers to here: Nobody
can remember everything about all the details of how the code and the
hardware works; there's just too much to keep in your head all at one
time.  And in any case, people who are not maintainers need to be able
to understand the code to modify it.

The result is that we naturally use the code itself to remind us, or
give us a hint, what the rest of the code or what the hardware does; if
we see a check for NULL, we tend to assume that the pointer may actually
be NULL; if we see a flag being passed, we tend to assume that the flag
will have an effect.

The general principle for making the code easier to grasp, and for
reducing the cognitive load on people trying to understand and modify
it, is to enhance this.  Write code which implies the truth about other
bits of the codebase or the hardware; avoid writing code which will
mislead someone into thinking something false about the other bits of
the codebase or the hardware.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v3 05/13] x86/altp2m: basic data structures and support routines.

2015-07-14 Thread George Dunlap
On 07/14/2015 01:01 AM, Sahita, Ravi wrote:
 
 
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Monday, July 13, 2015 1:01 AM

 On 10.07.15 at 23:48, ravi.sah...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Thursday, July 09, 2015 6:30 AM

 On 01.07.15 at 20:09, edmund.h.wh...@intel.com wrote:
 ---
  xen/arch/x86/hvm/Makefile|  1 +
  xen/arch/x86/hvm/altp2m.c| 92
 +

 Wouldn't this better go into xen/arch/x86/mm/?

 In this case we followed the pattern of nestedhvm - hope that's ok.

 Not really imo: Nested HVM obviously belongs in hvm/; alt-P2m is more of a
 mm extension than a HVM one afaict, and hence would rather belong in mm/.

 
 Would like George's opinion also on this before we make this change (again 
 want to avoid thrashing on things like this).

This is a bit of a murky one, since the whole reason you're doing this
is that you actually do use hardware to do the VMFUNC and p2m switching.

Let me take a look at v5 and see what I think (as it sounds like some of
the hvm_function hooks have disappeared).

 -George



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 03/15] VMX: implement suppress #VE.

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 02:14, edmund.h.wh...@intel.com wrote:
 In preparation for selectively enabling #VE in a later patch, set
 suppress #VE on all EPTE's.
 
 Suppress #VE should always be the default condition for two reasons:
 it is generally not safe to deliver #VE into a guest unless that guest
 has been modified to receive it; and even then for most EPT violations only
 the hypervisor is able to handle the violation.
 
 Signed-off-by: Ed White edmund.h.wh...@intel.com
 
 Reviewed-by: Andrew Cooper andrew.coop...@citrix.com
 Reviewed-by: George Dunlap george.dun...@eu.citrix.com
 Acked-by: Jun Nakajima jun.nakaj...@intel.com
 ---

Missing revision log here. And I think the dropping of the adjustment
to ept_p2m_init() invalidates reviews as well as Jun's ack - just
consider if my request to do so was wrong for some reason, and
neither of them saw me asking for this.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] sysctl: adjust XEN_SYSCTL_numainfo behavior

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 14:43, andrew.coop...@citrix.com wrote:
 On 14/07/15 13:35, Dario Faggioli wrote:
 On Tue, 2015-07-14 at 10:52 +0100, Jan Beulich wrote:
 ... to match XEN_SYSCTL_cputopoinfo, allowing the caller to get what it
 needs (if e.g. it's after the data for just one specific node) with
 just one hypercall, without caring about the total number of nodes in
 the system.

 Suggested-by: Boris Ostrovsky boris.ostrov...@oracle.com
 Signed-off-by: Jan Beulich jbeul...@suse.com

 Reviewed-by: Dario Faggioli dario.faggi...@citrix.com

 One question:

 --- a/xen/common/sysctl.c
 +++ b/xen/common/sysctl.c
  
 @@ -335,7 +329,7 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xe
  else
  i = num_nodes;
  
 -if ( (!ret || (ret == -ENOBUFS))  (ni-num_nodes != i) )
 +if ( !ret  (ni-num_nodes != i) )
  {
 Can't we kill the parentheses around the second argument of the  ?
 
 No.  The coding style requires binary operators as part of larger
 statements to have brackets.

I don't see it requires this, and I can't even find it recommending
such - it just so happens that some people prefer using the extra
parentheses. I personally dislike them for all operators where the
precedence is natural, i.e. here I wouldn't have added them if
I had written the code anew. Since I modified existing code, I
decided to retain previous style.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v7][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-14 Thread George Dunlap
On 07/14/2015 12:45 PM, Jan Beulich wrote:
 On 14.07.15 at 13:30, george.dun...@eu.citrix.com wrote:
 On 07/14/2015 11:53 AM, Chen, Tiejun wrote:
 The way this sort of thing is defined in the rest of domctl.h is like
 this:

 #define _XEN_DOMCTL_CDF_hvm_guest 0
 #define XEN_DOMCTL_CDF_hvm_guest  (1U_XEN_DOMCTL_CDF_hvm_guest)

 So the above should be

 #define _XEN_DOMCTL_DEV_RDM_RELAXED 0
 #define XEN_DOMCTL_DEV_RDM_RELAXED (1U_XEN_DOMCTL_DEV_RDM_RELAXED)

 And then your check in iommu_do_pci_domctl() would look like

 if (flag  ~XEN_DOMCTL_DEV_RDM_RELAXED)

 And if we end up adding any extra flags, we just | them into the above
 conditional, as is done in, for example, the XEN_DOMCTL_createdomain
 case in xen/common/domctl.c:do_domctl().


 Seems Jan didn't like this way IIRC, so I hope Jan also can have a look
 at this beforehand :)

 I think Jan thought that the MASK value you defined wasn't meant to be a
 single flag, but for all the flags; i.e., that if we added flags in bits
 1 and 2, that MASK would become 0x7 rather than 0x1.  And I agree that
 there's not much point to having such a mask defined in the public header.

 But what I'm doing above is making explicit what you have already; i.e.,
 you just set XEN_DOMCTL_DEV_RDM_RELAXED to '1'; the reader has to sort
 of infer that the reason '1' is chosen is that it's setting bit 0.
 Doing it the way I suggest makes it more clear that this is meant to be
 a bitfield, and '0' has been allocated.

 Please correct me if I'm wrong, Jan.
 
 Indeed my primary objection was to what you describe. That said,
 I'm not too happy with what you propose now either. Not only do
 I view this (bit-pos,bit-mask) tuple as redundant unless one actually
 needs both in certain places (which doesn't seem to be the case
 here), but the naming also conflicts with the C standard (reserving
 identifiers starting with underscore and an upper case letter). Just
 like said in various other cases - we've got many examples of such
 violations already, but I'd prefer not to make the situation worse.
 
 IOW I'd prefer to go with just
 
 #define XEN_DOMCTL_DEV_RDM_RELAXED 1

Very well.

 -George

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [xen-unstable test] 59536: trouble: pass/preparing/running

2015-07-14 Thread osstest service owner
flight 59536 xen-unstable running [real]
http://logs.test-lab.xenproject.org/osstest/logs/59536/

Failures and problems with tests :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  2 hosts-allocate running
 test-amd64-amd64-rumpuserxen-amd64  4 host-ping-check-native   running
 test-amd64-amd64-xl-multivcpu  2 hosts-allocate   running
 test-amd64-i386-rumpuserxen-i386  2 hosts-allocate   running
 test-amd64-i386-libvirt   2 hosts-allocate   running
 test-amd64-amd64-xl-xsm   2 hosts-allocate   running
 test-amd64-amd64-libvirt-xsm  2 hosts-allocate   running
 test-amd64-i386-qemut-rhel6hvm-amd  2 hosts-allocate   running
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 2 hosts-allocate running
 test-amd64-amd64-xl-pvh-amd   2 hosts-allocate   running
 test-amd64-amd64-libvirt  2 hosts-allocate   running
 test-amd64-amd64-xl-pvh-intel  2 hosts-allocate   running
 test-amd64-i386-libvirt-xsm   2 hosts-allocate   running
 test-armhf-armhf-xl-credit2   2 hosts-allocate   running
 test-amd64-i386-xl2 hosts-allocate   running
 test-amd64-i386-qemut-rhel6hvm-intel  2 hosts-allocate   running
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  2 hosts-allocaterunning
 test-amd64-amd64-xl-qemuu-ovmf-amd64  2 hosts-allocate   running
 test-amd64-i386-qemuu-rhel6hvm-intel  2 hosts-allocate   running
 test-amd64-amd64-xl-credit2   2 hosts-allocate   running
 test-amd64-i386-freebsd10-amd64  2 hosts-allocate   running
 test-amd64-amd64-xl-rtds  2 hosts-allocate   running
 test-amd64-amd64-xl   2 hosts-allocate   running
 test-amd64-i386-freebsd10-i386  2 hosts-allocate   running
 test-armhf-armhf-xl-xsm   2 hosts-allocate   running
 test-amd64-i386-qemuu-rhel6hvm-amd  2 hosts-allocate   running
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  2 hosts-allocaterunning
 test-armhf-armhf-xl-rtds  2 hosts-allocate   running
 test-armhf-armhf-xl-arndale   9 debian-install   running
 test-armhf-armhf-xl   2 hosts-allocate   running
 test-amd64-i386-xl-xsm2 hosts-allocate   running
 test-amd64-amd64-xl-qemut-debianhvm-amd64  2 hosts-allocaterunning
 test-armhf-armhf-libvirt  3 host-install(3)  running
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  2 hosts-allocate running
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  2 hosts-allocaterunning
 test-amd64-i386-xl-qemut-debianhvm-amd64  2 hosts-allocate running
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  2 hosts-allocate running
 test-amd64-i386-xl-qemuu-debianhvm-amd64  2 hosts-allocate running
 test-armhf-armhf-xl-multivcpu  2 hosts-allocate   running
 test-armhf-armhf-libvirt-xsm  2 hosts-allocate   running
 test-armhf-armhf-xl-cubietruck  2 hosts-allocate   running
 test-amd64-i386-xl-qemuu-ovmf-amd64  2 hosts-allocate   running
 test-amd64-amd64-pair 2 hosts-allocate   running
 test-amd64-amd64-xl-qemuu-winxpsp3  2 hosts-allocate   running
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  2 hosts-allocate running
 test-amd64-i386-xl-qemut-win7-amd64  2 hosts-allocate   running
 test-amd64-amd64-xl-qemut-win7-amd64  2 hosts-allocate   running
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  2 hosts-allocate running
 test-amd64-amd64-xl-qemuu-win7-amd64  2 hosts-allocate   running
 test-amd64-amd64-xl-qemut-winxpsp3  2 hosts-allocate   running
 test-amd64-i386-xl-qemuu-win7-amd64  2 hosts-allocate   running
 test-amd64-i386-pair  2 hosts-allocate   running
 test-amd64-i386-xl-qemut-winxpsp3  2 hosts-allocate   running
 test-amd64-i386-xl-qemuu-winxpsp3  2 hosts-allocate   running

version targeted for testing:
 xen  42b79d4ad84da0928f11c4b265eb30cece5fe2fb
baseline version:
 xen  c40317f11b3f05e7c06a2213560c8471081f2662

Last test of basis58965  2015-06-29 02:08:30 Z   15 days
Failing since 58974  2015-06-29 15:11:59 Z   14 days   16 attempts
Testing same since59509  2015-07-13 12:41:20 Z1 days1 attempts


People who touched revisions under test:
  Andrew Cooper andrew.coop...@citrix.com
  Anthony PERARD anthony.per...@citrix.com
  Ard Biesheuvel a...@linaro.org
  Ben Catterall ben.catter...@citrix.com
  Boris Ostrovsky boris.ostrov...@oracle.com
  Chao Peng chao.p.p...@linux.intel.com
  Chen Baozi 

Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert map_domain_page_global() to using mfn_t

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 12:54, ian.jack...@eu.citrix.com wrote:
 Jan Beulich writes (Re: [Xen-devel] [PATCH v2 1/3] xen/domain_page: Convert 
 map_domain_page_global() to using mfn_t):
 On 13.07.15 at 18:56, ian.jack...@eu.citrix.com wrote:
  Surely xfree() ought to have the same prototype as free() ?
 
 Why? If it were to be a full match, why wouldn't we call it free() in
 the first place?
 
 Is that what is supposed to differ between free and xfree ?  That
 would be a bit odd.
 
 In the userland world, xfree is the companion to xmalloc:
 
   
 http://manpages.debian.org/cgi-bin/man.cgi?query=xfreeapropos=0sektion=0manp
  
 ath=Debian+8+jessieformat=htmllocale=en
 
 (Although, logically speaking, xfree is unnecessary in that set.)

And I don't view the hypervisor's xmalloc() as a counterpart to the
user mode one, but as one of Linux'es kmalloc().

  free is declared in C99 (7.20.3.2 in my copy of TC2) as:
  
  void free(void *ptr);
  
  That is, non-const.  AFAIAA this is not generally regarded as a bug.
 
 Perhaps, but certainly depending on who looks at it. I for my part
 dislike (as hinted at above) that I have to cast away const-ness in
 order to be able to call free() without causing compiler warnings,
 and I generally hide this in wrappers to limit such casting.
 
 The flipside is that if free can take a const*, functions which take a
 const struct foo* can free it.  That's not what I would expect such a
 function to do.

Depends.

 Do we need to resolve this disagreement now ?

As long as no patch is pending to adjust xfree(), I don't think so.
And of course unless you request what the patch here (already
have gone in) did to be adjusted.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] how to locate the hypercall address in memory?

2015-07-14 Thread big strong
could you explain that in detail? As syscall tracing, we usually locate the
kernel module first, then find the address of specific syscall function in
that module with the help of symbol files. How could this be applied to
hypercalls then?

2015-07-14 19:56 GMT+08:00 Jan Beulich jbeul...@suse.com:

  On 14.07.15 at 09:10, fangtu...@gmail.com wrote:
  As syscalls can be located with the help of symbol files, is it possible
 to
  do it to hypercalls too?

 Sure.

 Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] how to locate the hypercall address in memory?

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 13:59, fangtu...@gmail.com wrote:
 could you explain that in detail? As syscall tracing, we usually locate the
 kernel module first, then find the address of specific syscall function in
 that module with the help of symbol files. How could this be applied to
 hypercalls then?

You'd do whatever you do with the kernel binary with the hypervisor
one instead (plus you don't even need to care about modules there).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 2/3] util/grub.d/20_linux_xen.in: Add arm64 support

2015-07-14 Thread Fu Wei
Hi Andrei,

Great thanks for your review.

So Are you suggesting this:
(1) in util/grub.d/20_linux_xen.in, we only use xen_hypervisor/xen_module.
(2) in xen_boot.c, we only register command xen_hypervisor/xen_module.
(3) in grub-core/loader/i386/xen.c, we *add*
---
  cmd_xen_hypervisort = grub_register_command (xen_hypervisor, grub_cmd_xen,
0, N_(Load Linux.));
  cmd_xen_module = grub_register_command (xen_module, grub_cmd_module,
 0, N_(Load module.));
---
(4)in grub-core/loader/multiboot.c, we *add*
---
#if defined (__i386__) || defined (__aarch64__)
  cmd_xen_hypervisort =
grub_register_command (xen_hypervisor, grub_cmd_multiboot,
  0, N_(Load a multiboot kernel.));
  cmd_xen_module =
grub_register_command (xen_module, grub_cmd_module,
  0, N_(Load a multiboot module.));
#endif
---

BTW, from the source code, MIPS isn't supported by multiboot,  IS
supported by multiboot2.

Please correct me. If I misunderstand your suggestion. :-)

Thanks again!


On 14 July 2015 at 11:53, Andrei Borzenkov arvidj...@gmail.com wrote:
 В Mon, 13 Jul 2015 16:53:59 +0800
 fu@linaro.org пишет:

 From: Fu Wei fu@linaro.org

 This patch adds the support of boot command on arm64 for XEN:
 xen_hypervisor
 xen_module

 Signed-off-by: Fu Wei fu@linaro.org
 ---
  util/grub.d/20_linux_xen.in | 14 +++---
  1 file changed, 11 insertions(+), 3 deletions(-)

 diff --git a/util/grub.d/20_linux_xen.in b/util/grub.d/20_linux_xen.in
 index f532fb9..b52c50d 100644
 --- a/util/grub.d/20_linux_xen.in
 +++ b/util/grub.d/20_linux_xen.in
 @@ -120,16 +120,16 @@ linux_entry ()
  else
  xen_rm_opts=no-real-mode edd=off
  fi
 - multiboot   ${rel_xen_dirname}/${xen_basename} placeholder 
 ${xen_args} \${xen_rm_opts}
 + ${multiboot_cmd}${rel_xen_dirname}/${xen_basename} placeholder 
 ${xen_args} \${xen_rm_opts}
   echo'$(echo $lmessage | grub_quote)'
 - module  ${rel_dirname}/${basename} placeholder 
 root=${linux_root_device_thisversion} ro ${args}
 + ${module_cmd}   ${rel_dirname}/${basename} placeholder 
 root=${linux_root_device_thisversion} ro ${args}
  EOF
if test -n ${initrd} ; then
  # TRANSLATORS: ramdisk isn't identifier. Should be translated.
  message=$(gettext_printf Loading initial ramdisk ...)
  sed s/^/$submenu_indentation/  EOF
   echo'$(echo $message | grub_quote)'
 - module  --nounzip   ${rel_dirname}/${initrd}
 + ${module_cmd}   --nounzip   ${rel_dirname}/${initrd}
  EOF
fi
sed s/^/$submenu_indentation/  EOF
 @@ -185,6 +185,14 @@ case $machine in
  *) GENKERNEL_ARCH=$machine ;;
  esac

 +if [ x$machine != xaarch64 ]; then
 + multiboot_cmd=multiboot
 + module_cmd=module
 +else
 + multiboot_cmd=xen_hypervisor
 + module_cmd=xen_module
 +fi
 +

 Strictly speaking, this is boot-time decision. As mentioned by
 Vladimir, better would be to provide alias xen_hypervisor and
 xen_module in multiboot for platforms supporting Xen (is MIPS really
 supported?) and use it consistently.

  # Extra indentation to add to menu entries in a submenu. We're not in a 
 submenu
  # yet, so it's empty. In a submenu it will be equal to '\t' (one tab).
  submenu_indentation=




-- 
Best regards,

Fu Wei
Software Engineer
Red Hat Software (Beijing) Co.,Ltd.Shanghai Branch
Ph: +86 21 61221326(direct)
Ph: +86 186 2020 4684 (mobile)
Room 1512, Regus One Corporate Avenue,Level 15,
One Corporate Avenue,222 Hubin Road,Huangpu District,
Shanghai,China 200021

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH RFC 1 6/8] xen/pt: Make xen_pt_unregister_device idempotent

2015-07-14 Thread Stefano Stabellini
On Thu, 2 Jul 2015, Konrad Rzeszutek Wilk wrote:
   @@ -858,15 +863,20 @@ static void xen_pt_unregister_device(PCIDevice *d)
   machine_irq, errno);
}
}
   +s-machine_irq = 0;
}

/* delete all emulated config registers */
xen_pt_config_delete(s);

   -memory_listener_unregister(s-memory_listener);
   -memory_listener_unregister(s-io_listener);
   -
   -xen_host_pci_device_put(s-real_device);
   +if (s-listener_set) {
   +memory_listener_unregister(s-memory_listener);
   +memory_listener_unregister(s-io_listener);
   +s-listener_set = false;
  
  If you call QTAILQ_INIT on memory_listener and io_listener, then you
  simply check on QTAILQ_EMPTY and remove listener_set.
 
 No dice. For that to work they need to be QTAIL_HEAD while
 they are of QTAILQ_ENTRY.
 
 That is, the include/exec/memory.h:
 struct MemoryListener {
  ...
   AddressSpace *address_space_filter;
   QTAILQ_ENTRY(MemoryListener) link;
 }
 
 And the QTAILQ_EMPTY checks for 'tqh_first' while the
 QTAILQ_ENTRY only defines two pointers: tqe_next and tqe_prev.

Ah, that is true.  In that case maybe it is OK to introduce
listener_set, rather than messing up with QTAILQ definitions or having
checks like memory_listener-link-tqe_next ==
memory_listener-link-tqe_prev.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [xen-unstable test] 59509: regressions - FAIL

2015-07-14 Thread Jan Beulich
 On 14.07.15 at 08:23, osstest-ad...@xenproject.org wrote:
 flight 59509 xen-unstable real [real]
 http://logs.test-lab.xenproject.org/osstest/logs/59509/ 
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 
 guest-localmigrate.2 fail REGR. vs. 58958
  test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install 
 fail REGR. vs. 58965
  test-amd64-i386-qemut-rhel6hvm-intel 12 guest-start/redhat.repeat fail REGR. 
 vs. 58965
  test-amd64-amd64-xl-qemuu-ovmf-amd64 16 guest-stopfail REGR. vs. 
 58965
  test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 16 guest-stop fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemuu-debianhvm-amd64 16 guest-stopfail REGR. vs. 
 58965
  test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 9 windows-install fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemut-win7-amd64  9 windows-installfail REGR. vs. 
 58965
  test-amd64-amd64-xl-qemut-win7-amd64  9 windows-install   fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemut-winxpsp3-vcpus1 9 windows-install fail REGR. vs. 
 58965
  test-amd64-amd64-xl-qemuu-win7-amd64  9 windows-install   fail REGR. vs. 
 58965
  test-amd64-i386-xl-qemut-winxpsp3  9 windows-install  fail REGR. vs. 
 58965

Patch posted.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 15/15] tools/xen-access: altp2m testcases

2015-07-14 Thread Lengyel, Tamas
On Tue, Jul 14, 2015 at 5:56 AM, Wei Liu wei.l...@citrix.com wrote:

 On Mon, Jul 13, 2015 at 05:15:03PM -0700, Ed White wrote:
  From: Tamas K Lengyel tleng...@novetta.com
 
  Working altp2m test-case. Extended the test tool to support
 singlestepping
  to better highlight the core feature of altp2m view switching.
 
  Signed-off-by: Tamas K Lengyel tleng...@novetta.com
  Signed-off-by: Ed White edmund.h.wh...@intel.com
 
  Reviewed-by: Razvan Cojocaru rcojoc...@bitdefender.com

 Acked-by: Wei Liu wei.l...@citrix.com

 Some nits below.

 I do notice there is inconsistency in coding style, so I won't ask you
 to resubmit just for fixing style. But a follow-up patch to fix up all
 style problem is welcome.

  ---
   tools/tests/xen-access/xen-access.c | 173
 ++--
   1 file changed, 148 insertions(+), 25 deletions(-)
 
  diff --git a/tools/tests/xen-access/xen-access.c
 b/tools/tests/xen-access/xen-access.c
  index 12ab921..6b69c26 100644
  --- a/tools/tests/xen-access/xen-access.c
  +++ b/tools/tests/xen-access/xen-access.c
  @@ -275,6 +275,19 @@ xenaccess_t *xenaccess_init(xc_interface **xch_r,
 domid_t domain_id)
   return NULL;
   }
 
  +static inline
  +int control_singlestep(
  +xc_interface *xch,
  +domid_t domain_id,
  +unsigned long vcpu,
  +bool enable)
  +{
  +uint32_t op = enable ?
  +XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_ON :
 XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF;
  +
  +return xc_domain_debug_control(xch, domain_id, op, vcpu);
  +}
  +
   /*
* Note that this function is not thread safe.
*/
  @@ -317,13 +330,15 @@ static void put_response(vm_event_t *vm_event,
 vm_event_response_t *rsp)
 
   void usage(char* progname)
   {
  -fprintf(stderr,
  -Usage: %s [-m] domain_id write|exec|breakpoint\n
  +fprintf(stderr, Usage: %s [-m] domain_id write|exec, progname);
  +#if defined(__i386__) || defined(__x86_64__)
  +fprintf(stderr, |breakpoint|altp2m_write|altp2m_exec);
  +#endif
  +fprintf(stderr,
   \n
   Logs first page writes, execs, or breakpoint traps that
 occur on the domain.\n
   \n
  --m requires this program to run, or else the domain may
 pause\n,
  -progname);
  +-m requires this program to run, or else the domain may
 pause\n);

 Indentation looks wrong, but this is not your fault so don't worry
 about this.

   }
 
   int main(int argc, char *argv[])
  @@ -341,6 +356,8 @@ int main(int argc, char *argv[])
   int required = 0;
   int breakpoint = 0;
   int shutting_down = 0;
  +int altp2m = 0;
  +uint16_t altp2m_view_id = 0;
 
   char* progname = argv[0];
   argv++;
  @@ -379,10 +396,22 @@ int main(int argc, char *argv[])
   default_access = XENMEM_access_rw;
   after_first_access = XENMEM_access_rwx;
   }
  +#if defined(__i386__) || defined(__x86_64__)
   else if ( !strcmp(argv[0], breakpoint) )
   {
   breakpoint = 1;
   }
  +else if ( !strcmp(argv[0], altp2m_write) )
  +{
  +default_access = XENMEM_access_rx;
  +altp2m = 1;
  +}
  +else if ( !strcmp(argv[0], altp2m_exec) )
  +{
  +default_access = XENMEM_access_rw;
  +altp2m = 1;
  +}
  +#endif
   else
   {
   usage(argv[0]);
  @@ -415,22 +444,73 @@ int main(int argc, char *argv[])
   goto exit;
   }
 
  -/* Set the default access type and convert all pages to it */
  -rc = xc_set_mem_access(xch, domain_id, default_access, ~0ull, 0);
  -if ( rc  0 )
  +/* With altp2m we just create a new, restricted view of the memory
 */
  +if ( altp2m )
   {
  -ERROR(Error %d setting default mem access type\n, rc);
  -goto exit;
  -}
  +xen_pfn_t gfn = 0;
  +unsigned long perm_set = 0;
  +
  +rc = xc_altp2m_set_domain_state( xch, domain_id, 1 );

 Extraneous spaces in brackets.

  +if ( rc  0 )
  +{
  +ERROR(Error %d enabling altp2m on domain!\n, rc);
  +goto exit;
  +}
  +
  +rc = xc_altp2m_create_view( xch, domain_id, default_access,
 altp2m_view_id );

 Extraneous spaces and line too long.

  +if ( rc  0 )
  +{
  +ERROR(Error %d creating altp2m view!\n, rc);
  +goto exit;
  +}
 
  -rc = xc_set_mem_access(xch, domain_id, default_access, START_PFN,
  -   (xenaccess-max_gpfn - START_PFN) );
  +DPRINTF(altp2m view created with id %u\n, altp2m_view_id);
  +DPRINTF(Setting altp2m mem_access permissions.. );
 
  -if ( rc  0 )
  +for(; gfn  xenaccess-max_gpfn; ++gfn)
  +{
  +rc = xc_altp2m_set_mem_access( xch, domain_id,
 altp2m_view_id, gfn,
  +   default_access);

 Space.

  +if ( !rc )
  +perm_set++;
  +}

  1   2   3   >