Re: [Xen-devel] [RFC Design Doc] Add vNVDIMM support for Xen

2016-02-17 Thread Haozhong Zhang
On 02/17/16 02:08, Jan Beulich wrote:
> >>> On 17.02.16 at 10:01,  wrote:
> > On 02/15/16 04:07, Jan Beulich wrote:
> >> >>> On 15.02.16 at 09:43,  wrote:
> >> > On 02/03/16 03:15, Konrad Rzeszutek Wilk wrote:
> >> >> >  Similarly to that in KVM/QEMU, enabling vNVDIMM in Xen is composed of
> >> >> >  three parts:
> >> >> >  (1) Guest clwb/clflushopt/pcommit enabling,
> >> >> >  (2) Memory mapping, and
> >> >> >  (3) Guest ACPI emulation.
> >> >> 
> >> >> 
> >> >> .. MCE? and vMCE?
> >> >> 
> >> > 
> >> > NVDIMM can generate UCR errors like normal ram. Xen may handle them in a
> >> > way similar to what mc_memerr_dhandler() does, with some differences in
> >> > the data structure and the broken page offline parts:
> >> > 
> >> > Broken NVDIMM pages should be marked as "offlined" so that Xen
> >> > hypervisor can refuse further requests that map them to DomU.
> >> > 
> >> > The real problem here is what data structure will be used to record
> >> > information of NVDIMM pages. Because the size of NVDIMM is usually much
> >> > larger than normal ram, using struct page_info for NVDIMM pages would
> >> > occupy too much memory.
> >> 
> >> I don't see how your alternative below would be less memory
> >> hungry: Since guests have at least partial control of their GFN
> >> space, a malicious guest could punch holes into the contiguous
> >> GFN range that you appear to be thinking about, thus causing
> >> arbitrary splitting of the control structure.
> >>
> > 
> > QEMU would always use GFNs above guest normal RAM and I/O holes for
> > vNVDIMM. It would search that space for a contiguous range that is
> > large enough for the vNVDIMM devices. Is the guest able to punch
> > holes in such GFN space?
> 
> See XENMAPSPACE_* and their uses.
> 

I think we can add the following restrictions to keep uses of XENMAPSPACE_*
from punching holes in the GFNs of vNVDIMM:

(1) For XENMAPSPACE_shared_info and _grant_table, never map idx in them
to GFNs occupied by vNVDIMM.

(2) For XENMAPSPACE_gmfn, _gmfn_range and _gmfn_foreign,
   (a) never map idx in them to GFNs occupied by vNVDIMM, and
   (b) never map idx corresponding to GFNs occupied by vNVDIMM.
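
Both restrictions reduce to a range-overlap test on the destination GFNs and,
for the gmfn-style spaces, on the source GFNs as well. A minimal sketch in C,
assuming a hypothetical per-domain record `vnvdimm_range` and hypothetical
helper names; none of these exist in Xen, and address wrap-around is ignored
for brevity:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the contiguous GFN range backing one vNVDIMM. */
struct vnvdimm_range {
    uint64_t start_gfn;  /* first GFN occupied by the vNVDIMM */
    uint64_t nr_pages;   /* number of pages; 0 means "no vNVDIMM" */
};

/* True if [gfn, gfn + nr) intersects the vNVDIMM range. */
static bool overlaps_vnvdimm(const struct vnvdimm_range *r,
                             uint64_t gfn, uint64_t nr)
{
    if (r->nr_pages == 0 || nr == 0)
        return false;
    return gfn < r->start_gfn + r->nr_pages && r->start_gfn < gfn + nr;
}

/*
 * Reject a mapping request whose destination GFNs (rules 1 and 2a) or
 * whose source GFNs (rule 2b, only for spaces where idx is a GFN) touch
 * the vNVDIMM range.
 */
static bool physmap_request_allowed(const struct vnvdimm_range *r,
                                    bool src_is_gfn, uint64_t src_idx,
                                    uint64_t dst_gfn, uint64_t nr)
{
    if (overlaps_vnvdimm(r, dst_gfn, nr))
        return false;
    if (src_is_gfn && overlaps_vnvdimm(r, src_idx, nr))
        return false;
    return true;
}
```

With such a check in place the vNVDIMM GFN range would stay contiguous, so a
single (start, length) record per device would be enough to describe it.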


Haozhong

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.

2016-02-17 Thread Xu, Quan
> On February 17, 2016 10:41pm,  wrote:
> >>> On 05.02.16 at 11:18,  wrote:
> > --- a/xen/drivers/passthrough/vtd/qinval.c
> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> > +if ( pci_hide_device(bus, devfn) )
> 
> But now I'm _really_ puzzled: You acquire the exact lock that
> pci_hide_device() acquires. Hence, unless I've overlooked an earlier change, I
> can't see this as other than an unconditional deadlock. Did you test this
> code path at all?

Sorry, I didn't test this code path.
I did test the following:
   1) Create domain with ATS device.
   2) Attach / Detach ATS device.

I think I could add a variation of pci_hide_device(), without
"spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)"
or "__init".

But it is certain that different lock states are possible for different call
trees when flushing an ATS device.
I verified this as follows:
1. Print pcidevs_lock status in flush_iotlb_qi()

flush_iotlb_qi()
{
...
+printk("__ pcidevs_lock : %d *__\n", spin_is_locked(&pcidevs_lock));
...
}

2. attach ATS device
  $xl pci-attach TestDom :81:00.0
  #The print is "(XEN) __ pcidevs_lock : 1 *__"

3. reset memory of domain
  $ xl mem-set TestDom 2047m
  #the print is "(XEN) __ pcidevs_lock : 0 *__"
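
A lock-free variation of a function is commonly written as an unlocked helper
plus a wrapper that takes the lock. The sketch below models that pattern with
a toy non-recursive lock instead of Xen's spinlock; `toy_lock`,
`hide_device_locked()` and `hide_device()` are hypothetical names used only
for illustration, not Xen functions:

```c
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock: it only tracks the held state. */
struct toy_lock { bool held; };

static bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;   /* a real spin_lock() would spin forever here */
    l->held = true;
    return true;
}

static void toy_unlock(struct toy_lock *l) { l->held = false; }

static struct toy_lock devs_lock;
static int hidden_count;

/* Unlocked helper: the caller must already hold devs_lock. */
static int hide_device_locked(int bus, int devfn)
{
    (void)bus;
    (void)devfn;
    if (!devs_lock.held)
        return -1;      /* contract violation */
    hidden_count++;
    return 0;
}

/* Locking wrapper: returns -2 where a real lock would deadlock. */
static int hide_device(int bus, int devfn)
{
    int rc;

    if (!toy_trylock(&devs_lock))
        return -2;
    rc = hide_device_locked(bus, devfn);
    toy_unlock(&devs_lock);
    return rc;
}
```

Because the flush path is reached both with the lock held (pci-attach) and
without it (mem-set), a caller there would have to know which of the two
entry points is safe to use.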

Quan



Re: [Xen-devel] [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).

2016-02-17 Thread Xu, Quan
> On February 18, 2016 2:37pm,  wrote:
> > From: Xu, Quan
> > Sent: Thursday, February 18, 2016 2:34 PM


> Thanks for your work.
> Kevin

I appreciate your help!:)
Quan



Re: [Xen-devel] [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).

2016-02-17 Thread Tian, Kevin
> From: Xu, Quan
> Sent: Thursday, February 18, 2016 2:34 PM
> 
> > On February 18, 2016 2:16pm,  wrote:
> > > From: Xu, Quan
> > > Sent: Wednesday, February 17, 2016 9:38 PM
> > > >
> > > > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > > > large patch.
> > > >
> > > > Not having looked at patch 2 yet
> > > > (I'm about to),
> > >
> > > Good news. :):)
> > > I think there are maybe some minor issues for this patch set (
> > > _my_estimate_), and I will address these issues immediately.
> > >
> >
> > Hi, Quan, are you expecting people to continue looking at your v5, or
> > better to wait for v7? :-)
> >
> 
> V7?

sorry. I meant v6.

> I think other people had better wait for v6. I plan to send out v6
> next week.
> Jan gave some valuable comments on v5. I will address them ASAP.
> I apologize to everyone who has to go through the v5 patch set. I missed some
> v4 comments and didn't fully test the v5 patch set, as I tried my best to send
> out v5 on the last working day before the Chinese Spring Festival, and the v4
> discussion had taken a long time.
> 

Thanks for your work.
Kevin



Re: [Xen-devel] [PATCH v3 1/3] PMU: make {acquire, release}_pmu_ownership names consistent

2016-02-17 Thread Tian, Kevin
> From: Doug Goldstein [mailto:car...@cardoe.com]
> Sent: Wednesday, February 17, 2016 10:38 PM
> 
> The function names were inconsistent with acquire and release being
> called acquire_pmu_ownership() and release_pmu_ownship() respectively.
> Function prototypes were available for both spellings so this change
> makes them consistent and drops the dual function prototypes.
> Additionally change the internal variable names within those functions
> to ownership as well.
> 
> Signed-off-by: Doug Goldstein 

Acked-by: Kevin Tian 



Re: [Xen-devel] [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).

2016-02-17 Thread Xu, Quan
> On February 18, 2016 2:16pm,  wrote:
> > From: Xu, Quan
> > Sent: Wednesday, February 17, 2016 9:38 PM
> > >
> > > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > > large patch.
> > >
> > > Not having looked at patch 2 yet
> > > (I'm about to),
> >
> > Good news. :):)
> > I think there are maybe some minor issues for this patch set (
> > _my_estimate_), and I will address these issues immediately.
> >
> 
> Hi, Quan, are you expecting people to continue looking at your v5, or better
> to wait for v7? :-)
> 

V7?
I think other people had better wait for v6. I plan to send out v6 next week.
Jan gave some valuable comments on v5. I will address them ASAP.
I apologize to everyone who has to go through the v5 patch set. I missed some v4
comments and didn't fully test the v5 patch set, as I tried my best to send out
v5 on the last working day before the Chinese Spring Festival, and the v4
discussion had taken a long time.


Quan




Re: [Xen-devel] Restoring FPU exception state

2016-02-17 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, February 17, 2016 9:46 PM
> 
> >>> On 17.02.16 at 14:08,  wrote:
> > The FPU exception state includes 4 registers:
> >
> > - 64-bit FIP
> > - 16-bit FCS
> > - 64-bit FDP
> > - 16-bit FDS
> >
> > When a CPU takes an FPU exception in long mode, all 4 registers are
> > fully updated.  This state can be saved with a combination of REX.W
> > prefixed XSAVE and FNSTENV.  This state cannot be restored with any
> > combination of instructions as those that restore the 64-bit FIP/FDP
> > clear FCS and FDS; and those that restore FCS and FDS clear the upper
> > 32-bits of FIP and FDP [1].
> >
> > This causes problems when running Microsoft's Driver Verifier in a
> > 64-bit Windows guest (seen with Windows 7 SP1, but other versions may
> > also be affected).
> >
> > The Driver Verifier prior to calling a driver's interrupt handler will
> > save the FPU state, after the handler is called it will save the state
> > again and do a byte-by-byte compare to verify the state has not changed.
> >  A 0x3D (INTERRUPT_EXCEPTION_NOT_HANDLED) BugCheck is raised if the
> > state does not match.
> >
> > Windows uses XSAVE to save the FPU state, but it does not use a REX.W
> > prefixed XSAVE, and saves only the lower 32-bits of FIP/FDP and FCS/FDS.
> 
> Oh, you say that's the case even in 64-bit Windows? That's rather
> unexpected.
> 
> > If the VCPU is descheduled between these two checks, the contents of
> > FCS/FDS are lost; Windows will notice and BugCheck.
> >
> > When saving a VCPU's FPU state, Xen first uses a REX.W prefixed XSAVE,
> > notices that FIP/FDP[63:32] is non-zero and assumes a REX.W prefixed
> > XRSTOR is required to restore the full 64 bits of FIP/FDP.  This clears
> > FCS/FDS.
> >
> > Processors with FPCSDS[2] (bit 13) set in CPUID leaf 0x7, sub-leaf
> > 0x0, do not save FCS/FDS (they always write zeros), so this problem does
> > not occur there, because FCS/FDS never need to be restored.
> >
> > Does anyone have any thoughts on a solution for processors without the
> > FPCSDS feature?
> 
> One would assume they have a solution to this problem on Hyper-V,
> but then again their solution may simply be that they don't use REX
> prefixes anywhere (i.e. namely also not in context switch code). In
> which case, though, they'd corrupt state of their non-Windows guests.
> 
> But to answer your question: I have no idea. The handling of these
> FPU code/data pointers has been an unloved child from the very
> beginning of the x86-64 architecture. All I could see us doing would
> be to add a per-domain control to override the auto-detection in the
> XSAVE code path.
> 

Interesting. Let me also check internally whether there is any other
architectural alternative. BTW, not quite related: could Xen eventually
allow the user to specify the OS type as a general per-domain control,
so that Xen can freely apply optimizations underneath based on that type?
I don't expect we want to expose a raw FPU-related per-domain control,
since it's difficult for users to set it up correctly...
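
The trade-off described in this thread can be modelled in a few lines of C.
The sketch below is only a model of the behaviour as stated above (64-bit
restore zeroes FCS/FDS, 32-bit restore drops the upper halves of FIP/FDP,
FPCSDS is bit 13 of CPUID leaf 7 sub-leaf 0, assumed here to be reported in
EBX); the function and enum names are hypothetical and it does not execute
any XSAVE/XRSTOR instruction:

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID7_EBX_FPCSDS (1u << 13)  /* CPUID.(EAX=7,ECX=0):EBX bit 13 */

/* True if the CPU always stores zero for FCS/FDS. */
static bool cpu_deprecates_fcs_fds(uint32_t cpuid7_ebx)
{
    return (cpuid7_ebx & CPUID7_EBX_FPCSDS) != 0;
}

enum fpu_ptr_loss { LOSS_NONE, LOSS_SELECTORS, LOSS_UPPER_OFFSETS };

/* Which part of the FPU exception state a given restore form would lose. */
static enum fpu_ptr_loss restore_loss(uint64_t fip, uint64_t fdp,
                                      uint16_t fcs, uint16_t fds,
                                      bool use_rex_w, uint32_t cpuid7_ebx)
{
    if (use_rex_w) {
        /* 64-bit form restores the full FIP/FDP but zeroes FCS/FDS. */
        if (!cpu_deprecates_fcs_fds(cpuid7_ebx) && (fcs || fds))
            return LOSS_SELECTORS;
        return LOSS_NONE;
    }
    /* 32-bit form restores FCS/FDS but only the low 32 bits of FIP/FDP. */
    if ((fip >> 32) || (fdp >> 32))
        return LOSS_UPPER_OFFSETS;
    return LOSS_NONE;
}
```

A per-domain override, whether raw or derived from an OS type, would in effect
fix the `use_rex_w` argument instead of deriving it from the saved pointers.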

Thanks
Kevin



Re: [Xen-devel] [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).

2016-02-17 Thread Tian, Kevin
> From: Xu, Quan
> Sent: Wednesday, February 17, 2016 9:38 PM
> >
> > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > large patch.
> >
> > Not having looked at patch 2 yet
> > (I'm about to),
> 
> Good news. :):)
> I think there are maybe some minor issues for this patch set ( 
> _my_estimate_), and I will
> address these issues immediately.
> 

Hi, Quan, are you expecting people to continue looking at your v5,
or better to wait for v7? :-)

Thanks
Kevin



[Xen-devel] [linux-next test] 82983: regressions - FAIL

2016-02-17 Thread osstest service owner
flight 82983 linux-next real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82983/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-xsm               6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-libvirt-xsm          6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-xl-arndale           6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-xl-multivcpu         6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-xl-cubietruck        6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-xl                   6 xen-boot             fail REGR. vs. 82764
 test-amd64-amd64-xl-qemut-win7-amd64 12 guest-saverestore    fail REGR. vs. 82764
 test-armhf-armhf-xl-credit2           6 xen-boot             fail REGR. vs. 82764
 test-amd64-amd64-xl-qemuu-win7-amd64 12 guest-saverestore    fail REGR. vs. 82764
 test-amd64-amd64-amd64-pvgrub         9 debian-di-install    fail REGR. vs. 82764
 test-armhf-armhf-libvirt              6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-libvirt-qcow2        6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-xl-vhd               6 xen-boot             fail REGR. vs. 82764
 test-armhf-armhf-libvirt-raw          6 xen-boot             fail REGR. vs. 82764

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds              6 xen-boot             fail REGR. vs. 82764
 build-amd64-rumpuserxen               6 xen-build            fail like 82764
 build-i386-rumpuserxen                6 xen-build            fail like 82764
 test-amd64-i386-libvirt-xsm          15 guest-saverestore.2  fail like 82764
 test-amd64-amd64-xl                  15 guest-localmigrate   fail like 82764
 test-amd64-amd64-xl-xsm              15 guest-localmigrate   fail like 82764
 test-amd64-amd64-libvirt             15 guest-saverestore.2  fail like 82764
 test-amd64-i386-libvirt              15 guest-saverestore.2  fail like 82764
 test-amd64-amd64-xl-multivcpu        15 guest-localmigrate   fail like 82764
 test-amd64-i386-xl                   15 guest-localmigrate   fail like 82764
 test-amd64-amd64-libvirt-xsm         15 guest-saverestore.2  fail like 82764
 test-amd64-amd64-xl-credit2          15 guest-localmigrate   fail like 82764
 test-amd64-i386-xl-xsm               15 guest-localmigrate   fail like 82764
 test-amd64-amd64-libvirt-pair        22 guest-migrate/dst_host/src_host fail like 82764
 test-amd64-amd64-pair                22 guest-migrate/dst_host/src_host fail like 82764
 test-amd64-i386-pair                 22 guest-migrate/dst_host/src_host fail like 82764
 test-amd64-i386-libvirt-pair         22 guest-migrate/dst_host/src_host fail like 82764
 test-amd64-amd64-xl-rtds             15 guest-localmigrate   fail like 82764
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop           fail like 82764

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64    1 build-check(1)         blocked n/a
 test-amd64-i386-rumpuserxen-i386      1 build-check(1)         blocked n/a
 test-amd64-i386-libvirt-xsm          12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt             12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-pvh-intel        14 guest-saverestore      fail never pass
 test-amd64-i386-libvirt              12 migrate-support-check  fail never pass
 test-amd64-amd64-xl-pvh-amd          11 guest-start            fail never pass
 test-amd64-amd64-libvirt-xsm         12 migrate-support-check  fail never pass
 test-amd64-amd64-qemuu-nested-intel  13 xen-boot/l1            fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-qemuu-nested-amd    13 xen-boot/l1            fail never pass
 test-amd64-amd64-libvirt-vhd         11 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass

version targeted for testing:
 linux                b3ebd75cbc2a4833a52351c74e48f9b5e9368fd7
baseline version:
 linux                1926e54f115725a9248d0c4c65c22acaf94de4c4

Last test of basis   (not found)
Failing since            0  1970-01-01 00:00:00 Z  16849 days
Testing same since   82983  2016-02-17 09:30:49 Z      0 days    1 attempts

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt

Re: [Xen-devel] [PATCH 4/5] VMX: fold redundant code

2016-02-17 Thread Tian, Kevin
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Thursday, February 18, 2016 12:37 AM
> 
> No need to do this in two slightly different ways, possibly keeping the
> compiler from folding the code for us.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Kevin Tian 



Re: [Xen-devel] XSAVE flavors

2016-02-17 Thread Shuai Ruan
On Thu, Feb 04, 2016 at 01:51:34AM -0700, Jan Beulich wrote:
> >> >> And I'm afraid there's yet one more issue: If my reading of the
> >> >> SDM is right, then the offsets at which components get saved
> >> >> by XSAVEC / XSAVES aren't fixed, but depend on RFBM (as that's
> >> >> what gets stored into xcomp_bv[62:0]). xstate_comp_offsets[],
> >> >> otoh, gets computed based on all available features, irrespective
> >> >> of vcpu_xsave_mask() returning four different values depending
> >> >> on current guest state. I can't see how get_xsave_addr() can
> >> >> work correctly without honoring xcomp_bv. Nor can I convince
> >> >> myself that state can't get corrupted / lost, e.g. when a save
> >> >> with v->fpu_dirtied set is followed by one with v->fpu_dirtied
> >> >> clear.
> >> >> 
> >> >> Am I misunderstanding what the SDM writes?
> >> >> 
> >> > Yes, you are right. This is an issue. I will find a way to solve
> >> > this.
> >> 
> >> Thanks.
> > 
> > xstate_comp_offsets is only used in get_xsave_addr when performing
> > migration.
> > I intend to recalculate xstate_comp_offsets based on
> > vcpu->arch.xsavec_area.save_hdr.xcomp_bv
> > before get_xsave_addr is called.
> 
> I don't think that'll suffice, as it won't deal with the lazy XSAVE[SC]
> possibly overwriting data written by the non-lazy one. See the
> effectively three different values returned by vcpu_xsave_mask()
> (the fourth one is impossible since the function won't ever get
> called with both v->fpu_dirtied and v->arch.nonlazy_xstate_used
> clear).
Oh, yes, thanks.
The way I plan to solve the problem is:
if v->fpu_dirtied is clear and v->arch.nonlazy_xstate_used is set,
vcpu_xsave_mask will return XSTATE_ALL (only if XSAVE[SC] is supported).
But this will add some overhead to the save.
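
The rule proposed above can be written down as a small selection function.
This is a sketch only: the `sk_` names, the struct and the bit values are
illustrative stand-ins rather than Xen's definitions, and it encodes the rule
exactly as stated in this message without claiming that it resolves every
case raised earlier in the thread:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit assignments only; not Xen's definitions. */
#define SK_XSTATE_LAZY     UINT64_C(0x07)
#define SK_XSTATE_NONLAZY  UINT64_C(0x18)
#define SK_XSTATE_ALL      (SK_XSTATE_LAZY | SK_XSTATE_NONLAZY)

struct sk_vcpu {
    bool fpu_dirtied;
    bool nonlazy_xstate_used;
};

/* 'compacted' is true when XSAVES or XSAVEC is in use. */
static uint64_t sk_vcpu_xsave_mask(const struct sk_vcpu *v, bool compacted)
{
    if (v->fpu_dirtied)
        return v->nonlazy_xstate_used ? SK_XSTATE_ALL : SK_XSTATE_LAZY;

    /* Not expected to be reached with both flags clear. */
    if (!v->nonlazy_xstate_used)
        return 0;

    /* Proposed change: save everything so the compacted layout is stable. */
    return compacted ? SK_XSTATE_ALL : SK_XSTATE_NONLAZY;
}
```

The extra cost mentioned above comes from the last branch: lazy components
are written out even though the vCPU has not touched them since the last save.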


[Xen-devel] [PATCH 1/2] xen: add randconfig target to Makefile

2016-02-17 Thread Doug Goldstein
This allows us to generate a random config which can be used for build
testing random configurations.

Signed-off-by: Doug Goldstein 
---
CC: Ian Campbell 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Keir Fraser 
CC: Tim Deegan 
---
 xen/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/Makefile b/xen/Makefile
index 5d98bcb..349f63a 100644
--- a/xen/Makefile
+++ b/xen/Makefile
@@ -238,7 +238,8 @@ FORCE:
$(MAKE) -f $(BASEDIR)/Rules.mk -C $* built_in.o built_in_bin.o
 
 kconfig := silentoldconfig oldconfig config menuconfig defconfig \
-   nconfig xconfig gconfig savedefconfig listnewconfig olddefconfig
+   nconfig xconfig gconfig savedefconfig listnewconfig olddefconfig \
+   randconfig
 .PHONY: $(kconfig)
 $(kconfig):
 	$(MAKE) -f $(BASEDIR)/tools/kconfig/Makefile.kconfig ARCH=$(ARCH) SRCARCH=$(SRCARCH) $@
-- 
2.5.4 (Apple Git-61)




[Xen-devel] [PATCH 2/2] travis: add randconfig test target

2016-02-17 Thread Doug Goldstein
Add another build target which uses randconfig to randomize the config
file so that we build test more than the default config.

Signed-off-by: Doug Goldstein 
---
CC: Ian Campbell 
CC: Ian Jackson 
CC: Jan Beulich 
CC: Keir Fraser 
CC: Tim Deegan 
---
 .travis.yml | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/.travis.yml b/.travis.yml
index c7227ba..7dbd82a 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -8,6 +8,8 @@ matrix:
 - compiler: gcc
   env: XEN_TARGET_ARCH=x86_64
 - compiler: gcc
+  env: XEN_TARGET_ARCH=x86_64 XEN_CONFIG_EXPERT=y RANDCONFIG=y
+- compiler: gcc
   env: XEN_TARGET_ARCH=x86_64 COMPILER=gcc-5
 - compiler: gcc
   env: XEN_TARGET_ARCH=x86_64 debug=y
@@ -24,10 +26,14 @@ matrix:
 - compiler: gcc
   env: XEN_TARGET_ARCH=arm32 CROSS_COMPILE=arm-linux-gnueabihf-
 - compiler: gcc
+  env: XEN_TARGET_ARCH=arm32 CROSS_COMPILE=arm-linux-gnueabihf- XEN_CONFIG_EXPERT=y RANDCONFIG=y
+- compiler: gcc
   env: XEN_TARGET_ARCH=arm32 CROSS_COMPILE=arm-linux-gnueabihf- debug=y
 - compiler: gcc
   env: XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
 - compiler: gcc
+  env: XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- XEN_CONFIG_EXPERT=y RANDCONFIG=y
+- compiler: gcc
   env: XEN_TARGET_ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- debug=y
 addons:
 apt:
@@ -69,5 +75,9 @@ before_script:
 - export CC=${COMPILER:-${CC}}
 - ${CC} --version
 script:
+- ( [ "x${RANDCONFIG}" = "xy" ] && ( make -C xen randconfig )
+  || exit 0 )
 - ( ./configure --disable-tools --disable-stubdom --enable-docs &&
   make CC="${CROSS_COMPILE}${CC}" HOSTCC="${CC}" dist )
+after_script:
+- cat xen/.config
-- 
2.5.4 (Apple Git-61)




[Xen-devel] [PATCH v8 06/13] tools/libxl: introduce enum type libxl_checkpointed_stream

2016-02-17 Thread Wen Congyang
Introduce enum type libxl_checkpointed_stream in IDL.
Rename the last argument of migrate_receive from "remus" to
"checkpointed" since the semantics of this parameter have
changed.

NOTE:
 libxl_domain_restore_params and domain_create aren't changed here;
 checkpointed_stream is still an int, because we will pass the
 value from libxl to libxc.
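
As a standalone illustration of why an enum helps here, the sketch below
dispatches on a stream type with a deliberate fall-through, so that further
stream types can later be added as extra cases. The `sk_`/`SK_` names are
hypothetical stand-ins, not the identifiers generated from the IDL:

```c
/* Stand-in for the generated enum; values mirror the IDL (0 = NONE, 1 = REMUS). */
typedef enum {
    SK_CHECKPOINTED_STREAM_NONE = 0,
    SK_CHECKPOINTED_STREAM_REMUS = 1,
} sk_checkpointed_stream;

/* Steps performed on restore, recorded as a bitmask for easy inspection. */
#define SK_STEP_REMUS_SETUP 0x1
#define SK_STEP_STREAM_READ 0x2

static int sk_restore_steps(sk_checkpointed_stream s)
{
    int steps = 0;

    switch (s) {
    case SK_CHECKPOINTED_STREAM_REMUS:
        steps |= SK_STEP_REMUS_SETUP;
        /* fall through */
    case SK_CHECKPOINTED_STREAM_NONE:
        steps |= SK_STEP_STREAM_READ;
        break;
    }
    return steps;
}
```

Keeping the numeric values at 0 and 1 is what lets callers that still carry
the value as a plain int keep working unchanged.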

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl.h |  7 +++
 tools/libxl/libxl_create.c  |  8 ++--
 tools/libxl/libxl_stream_read.c |  7 +--
 tools/libxl/libxl_types.idl |  5 +
 tools/libxl/xl_cmdimpl.c| 18 --
 5 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index fa87f53..6225db1 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -876,6 +876,13 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src);
  */
 #define LIBXL_HAVE_DEVICE_MODEL_VERSION_NONE 1
 
+/*
+ * LIBXL_HAVE_CHECKPOINTED_STREAM
+ *
+ * If this is defined, then libxl_checkpointed_stream exists.
+ */
+#define LIBXL_HAVE_CHECKPOINTED_STREAM 1
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index ad1d50c..f1028bc 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1033,9 +1033,13 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 dcs->srs.completion_callback = domcreate_stream_done;
 
 if (restore_fd >= 0) {
-if (checkpointed_stream)
+switch (checkpointed_stream) {
+case LIBXL_CHECKPOINTED_STREAM_REMUS:
 libxl__remus_restore_setup(egc, dcs);
-libxl__stream_read_start(egc, &dcs->srs);
+/* fall through */
+case LIBXL_CHECKPOINTED_STREAM_NONE:
+libxl__stream_read_start(egc, &dcs->srs);
+}
 return;
 }
 
diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index dac134e..f4781eb 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -794,19 +794,22 @@ void libxl__xc_domain_restore_done(libxl__egc *egc, void *dcs_void,
  * If the stream is not still alive, we must not continue any work.
  */
 if (libxl__stream_read_inuse(stream)) {
-if (checkpointed_stream) {
+switch (checkpointed_stream) {
+case LIBXL_CHECKPOINTED_STREAM_REMUS:
 /*
  * Failover from primary. Domain state is currently at a
  * consistent checkpoint, complete the stream, and call
  * stream->completion_callback() to resume the guest.
  */
 stream_complete(egc, stream, 0);
-} else {
+break;
+case LIBXL_CHECKPOINTED_STREAM_NONE:
 /*
  * Libxc has indicated that it is done with the stream.
  * Resume reading libxl records from it.
  */
 stream_continue(egc, stream);
+break;
 }
 }
 }
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9ad7eba..b8fb22f 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -228,6 +228,11 @@ libxl_hdtype = Enumeration("hdtype", [
 (2, "AHCI"),
 ], init_val = "LIBXL_HDTYPE_IDE")
 
+libxl_checkpointed_stream = Enumeration("checkpointed_stream", [
+(0, "NONE"),
+(1, "REMUS"),
+])
+
 #
 # Complex libxl types
 #
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index d07ccb2..6597ebd 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -4426,7 +4426,8 @@ static void migrate_domain(uint32_t domid, const char *rune, int debug,
 }
 
 static void migrate_receive(int debug, int daemonize, int monitor,
-int send_fd, int recv_fd, int remus)
+int send_fd, int recv_fd,
+libxl_checkpointed_stream checkpointed)
 {
 uint32_t domid;
 int rc, rc2;
@@ -4451,7 +4452,7 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 dom_info.paused = 1;
 dom_info.migrate_fd = recv_fd;
 dom_info.migration_domname_r = &migration_domname;
-dom_info.checkpointed_stream = remus;
+dom_info.checkpointed_stream = checkpointed;
 
 rc = create_domain(&dom_info);
 if (rc < 0) {
@@ -4462,7 +4463,8 @@ static void migrate_receive(int debug, int daemonize, int monitor,
 
 domid = rc;
 
-if (remus) {
+switch (checkpointed) {
+case LIBXL_CHECKPOINTED_STREAM_REMUS:
 /* If we are here, it means that the sender (primary) has crashed.
  * TODO: Split-Brain Check.
  */
@@ -4495,6 +4497,9 @@ static void 

[Xen-devel] [PATCH v8 04/13] libxl/save: Refactor libxl__domain_suspend_state

2016-02-17 Thread Wen Congyang
Currently struct libxl__domain_suspend_state contains two types of state:
save state and suspend state. This patch separates the two.
The motivation is that COLO will need to suspend/resume continuously, so
we need a more generic suspend state.

After this change, dss stands for libxl__domain_save_state, and
dsps stands for libxl__domain_suspend_state.

Also introduce libxl__domain_suspend_init to initialise the
libxl__domain_suspend_state.
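
The shape of the split can be shown with two small structs, where the save
state embeds the suspend state and suspend-only helpers take just the inner
part. This is a simplified sketch: the `sk_` types, their fields and the
helper bodies are hypothetical and much smaller than the real libxl
structures:

```c
#include <stddef.h>
#include <stdint.h>

/* Suspend-only state ("dsps"): reusable by anything that suspends a domain. */
struct sk_suspend_state {
    uint32_t domid;
    const char *dm_savefile;
    int guest_responded;
};

/* Save state ("dss"): owns a suspend state plus save-specific fields. */
struct sk_save_state {
    uint32_t domid;
    int fd;
    struct sk_suspend_state dsps;
};

/* Counterpart of an init function for the suspend state. */
static void sk_suspend_init(struct sk_suspend_state *dsps, uint32_t domid,
                            const char *savefile)
{
    dsps->domid = domid;
    dsps->dm_savefile = savefile;
    dsps->guest_responded = 0;
}

/* Suspend-only helper: needs no knowledge of the enclosing save state. */
static int sk_suspend_device_model(const struct sk_suspend_state *dsps)
{
    return dsps->dm_savefile != NULL ? 0 : -1;
}
```

A caller that only suspends and resumes can then allocate the inner struct on
its own, which is the property the COLO motivation above relies on.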

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl.c  |  10 +-
 tools/libxl/libxl_create.c   |  10 +-
 tools/libxl/libxl_dom_save.c |  61 
 tools/libxl/libxl_dom_suspend.c  | 207 ---
 tools/libxl/libxl_internal.h |  61 +++-
 tools/libxl/libxl_netbuffer.c|   2 +-
 tools/libxl/libxl_remus.c|  37 +++
 tools/libxl/libxl_save_callout.c |   2 +-
 tools/libxl/libxl_stream_write.c |  16 +--
 9 files changed, 227 insertions(+), 179 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index d6ce7da..db5732c 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -832,7 +832,7 @@ out:
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc);
+  libxl__domain_save_state *dss, int rc);
 
 /* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
 int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
@@ -840,7 +840,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
  const libxl_asyncop_how *ao_how)
 {
 AO_CREATE(ctx, domid, ao_how);
-libxl__domain_suspend_state *dss;
+libxl__domain_save_state *dss;
 int rc;
 
 libxl_domain_type type = libxl__domain_type(gc, domid);
@@ -888,7 +888,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 }
 
 static void remus_failover_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc)
+  libxl__domain_save_state *dss, int rc)
 {
 STATE_AO_GC(dss->ao);
 /*
@@ -900,7 +900,7 @@ static void remus_failover_cb(libxl__egc *egc,
 }
 
 static void domain_suspend_cb(libxl__egc *egc,
-  libxl__domain_suspend_state *dss, int rc)
+  libxl__domain_save_state *dss, int rc)
 {
 STATE_AO_GC(dss->ao);
 int flrc;
@@ -925,7 +925,7 @@ int libxl_domain_suspend(libxl_ctx *ctx, uint32_t domid, int fd, int flags,
 goto out_err;
 }
 
-libxl__domain_suspend_state *dss;
+libxl__domain_save_state *dss;
 GCNEW(dss);
 
 dss->ao = ao;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e421d36..ad1d50c 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1558,7 +1558,7 @@ typedef struct {
 typedef struct {
 libxl__app_domain_create_state cdcs;
 libxl__domain_destroy_state dds;
-libxl__domain_suspend_state dss;
+libxl__domain_save_state dss;
 char *toolstack_buf;
 uint32_t toolstack_len;
 } libxl__domain_soft_reset_state;
@@ -1653,7 +1653,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
 libxl__app_domain_create_state *cdcs;
 libxl__domain_create_state *dcs;
 libxl__domain_build_state *state;
-libxl__domain_suspend_state *dss;
+libxl__domain_save_state *dss;
 char *dom_path, *xs_store_mfn, *xs_console_mfn;
 uint32_t domid_out;
 int rc;
@@ -1697,8 +1697,8 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
 
 dss->ao = ao;
 dss->domid = domid_soft_reset;
-dss->dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
- domid_soft_reset);
+dss->dsps.dm_savefile = GCSPRINTF(LIBXL_DEVICE_MODEL_SAVE_FILE".%d",
+  domid_soft_reset);
 
 rc = libxl__save_emulator_xenstore_data(dss, >toolstack_buf,
 >toolstack_len);
@@ -1707,7 +1707,7 @@ static int do_domain_soft_reset(libxl_ctx *ctx,
 goto out;
 }
 
-rc = libxl__domain_suspend_device_model(gc, dss);
+rc = libxl__domain_suspend_device_model(gc, &dss->dsps);
 if (rc) {
 LOG(ERROR, "failed to suspend device model.");
 goto out;
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index cca3404..aead042 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -24,7 +24,7 @@
 static void stream_done(libxl__egc *egc,

[Xen-devel] [PATCH v8 12/13] tools/libxl: move remus state into a seperate structure

2016-02-17 Thread Wen Congyang
Add a new structure, libxl__remus_state, and move the concrete layer's
private members into it.
This is pure refactoring with no functional changes.
Initialise interval in libxl__remus_setup(). It is safe to move this
initialisation, because the value is only used for Remus, and Remus only
uses it after libxl__remus_setup().

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl.c |  2 +-
 tools/libxl/libxl_dom_save.c|  3 +--
 tools/libxl/libxl_internal.h| 35 +++---
 tools/libxl/libxl_netbuffer.c   | 49 +
 tools/libxl/libxl_remus.c   | 24 --
 tools/libxl/libxl_remus_disk_drbd.c |  8 +++---
 6 files changed, 72 insertions(+), 49 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 58b4574..4cdc169 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -881,7 +881,7 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
 assert(info);
 
 /* Point of no return */
-libxl__remus_setup(egc, dss);
+libxl__remus_setup(egc, &dss->rs);
 return AO_INPROGRESS;
 
  out:
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index 28e2a41..4eb7960 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -383,7 +383,6 @@ void libxl__domain_save(libxl__egc *egc, libxl__domain_save_state *dss)
 }
 
 if (dss->checkpointed_stream == LIBXL_CHECKPOINTED_STREAM_REMUS) {
-dss->interval = r_info->interval;
 if (libxl_defbool_val(r_info->compression))
 dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
 }
@@ -433,7 +432,7 @@ static void domain_save_done(libxl__egc *egc,
  * from sending checkpoints. Teardown the network buffers and
  * release netlink resources.  This is an async op.
  */
-libxl__remus_teardown(egc, dss, rc);
+libxl__remus_teardown(egc, &dss->rs, rc);
 return;
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 2847d13..a1aae97 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2894,6 +2894,7 @@ struct libxl__checkpoint_devices_state {
 libxl__ao *ao;
 uint32_t domid;
 libxl__checkpoint_callback *callback;
+void *concrete_data;
 int device_kind_flags;
 /* The ops must be pointer array, and the last ops must be NULL. */
 const libxl__checkpoint_device_instance_ops **ops;
@@ -2917,16 +2918,6 @@ struct libxl__checkpoint_devices_state {
 int num_disks;
 
 libxl__multidev multidev;
-
-/*- private for concrete (device-specific) layer only -*/
-
-/* private for nic device subkind ops */
-char *netbufscript;
-struct nl_sock *nlsock;
-struct nl_cache *qdisc_cache;
-
-/* private for drbd disk subkind ops */
-char *drbd_probe_script;
 };
 
 /*
@@ -2974,6 +2965,23 @@ _hidden void 
libxl__checkpoint_devices_preresume(libxl__egc *egc,
 libxl__checkpoint_devices_state *cds);
 _hidden void libxl__checkpoint_devices_commit(libxl__egc *egc,
 libxl__checkpoint_devices_state *cds);
+
+/*- Remus related state structure -*/
+typedef struct libxl__remus_state libxl__remus_state;
+struct libxl__remus_state {
+/* private */
+libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
+int interval; /* checkpoint interval */
+
+/*- private for concrete (device-specific) layer only -*/
+/* private for nic device subkind ops */
+char *netbufscript;
+struct nl_sock *nlsock;
+struct nl_cache *qdisc_cache;
+
+/* private for drbd disk subkind ops */
+char *drbd_probe_script;
+};
 _hidden int libxl__netbuffer_enabled(libxl__gc *gc);
 
 /*- Legacy conversion helper -*/
@@ -3127,9 +3135,8 @@ struct libxl__domain_save_state {
 int hvm;
 int xcflags;
 libxl__domain_suspend_state dsps;
+libxl__remus_state rs;
 libxl__checkpoint_devices_state cds;
-libxl__ev_time checkpoint_timeout; /* used for Remus checkpoint */
-int interval; /* checkpoint interval (for Remus) */
 libxl__stream_write_state sws;
 libxl__logdirty_switch logdirty;
 };
@@ -3535,9 +3542,9 @@ _hidden void libxl__domain_suspend_callback(void *data);
 
 /* Remus setup and teardown */
 _hidden void libxl__remus_setup(libxl__egc *egc,
-libxl__domain_save_state *dss);
+libxl__remus_state *rs);
 _hidden void libxl__remus_teardown(libxl__egc *egc,
-   libxl__domain_save_state *dss,
+   libxl__remus_state *rs,
int rc);
 _hidden void 

[Xen-devel] [PATCH v8 08/13] tools/libxl: export logdirty_init

2016-02-17 Thread Wen Congyang
We need to enable logdirty on the secondary, so export logdirty_init
for internal use and rename it to libxl__logdirty_init.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl_dom_save.c | 4 ++--
 tools/libxl/libxl_internal.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
index a385500..28e2a41 100644
--- a/tools/libxl/libxl_dom_save.c
+++ b/tools/libxl/libxl_dom_save.c
@@ -44,7 +44,7 @@ static void switch_logdirty_xswatch(libxl__egc *egc, 
libxl__ev_xswatch*,
 static void switch_logdirty_done(libxl__egc *egc,
  libxl__domain_save_state *dss, int rc);
 
-static void logdirty_init(libxl__logdirty_switch *lds)
+void libxl__logdirty_init(libxl__logdirty_switch *lds)
 {
 lds->cmd_path = 0;
 libxl__ev_xswatch_init(&lds->watch);
@@ -345,7 +345,7 @@ void libxl__domain_save(libxl__egc *egc, 
libxl__domain_save_state *dss)
 }
 
 dss->rc = 0;
-logdirty_init(&dss->logdirty);
+libxl__logdirty_init(&dss->logdirty);
 dsps->ao = ao;
 dsps->domid = domid;
 rc = libxl__domain_suspend_init(egc, dsps, type);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ac6457f..656bccd 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3078,6 +3078,8 @@ typedef struct libxl__logdirty_switch {
 libxl__ev_time timeout;
 } libxl__logdirty_switch;
 
+_hidden void libxl__logdirty_init(libxl__logdirty_switch *lds);
+
 struct libxl__domain_suspend_state {
 /* set by caller of libxl__domain_suspend_init */
 libxl__ao *ao;
-- 
2.5.0




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v8 03/13] tools/libxl: move save/restore code into libxl_dom_save.c

2016-02-17 Thread Wen Congyang
This is purely code motion.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Ian Jackson 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
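Among the code moved here is next_string(), which walks a buffer of
NUL-terminated key/value strings. A minimal sketch of that walk, assuming a
hypothetical count_pairs() helper and using memchr() in place of the strnlen()
call in the original:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the character following the NUL terminator of
 * start, or NULL if start is not terminated before end.
 */
static const char *next_string(const char *start, const char *end)
{
    if (start >= end) return NULL;

    const char *nul = memchr(start, '\0', (size_t)(end - start));

    return nul ? nul + 1 : NULL;
}

/*
 * Hypothetical helper: count well-formed key/value pairs in buf,
 * or return -1 if the data is malformed.
 */
static int count_pairs(const char *buf, size_t size)
{
    const char *next = buf, *end = buf + size;
    int pairs = 0;

    assert(buf != NULL);

    while (next < end) {
        const char *key = next;

        next = next_string(next, end);
        /* Key must be terminated, non-empty and relative. */
        if (!next || key[0] == '\0' || key[0] == '/') return -1;

        next = next_string(next, end);
        /* Value must be terminated. */
        if (!next) return -1;

        pairs++;
    }

    return pairs;
}
```

The checks are the same three that the moved
libxl__restore_emulator_xenstore_data() applies to each key, plus the
termination check on each value.
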
 tools/libxl/Makefile |   2 +-
 tools/libxl/libxl_dom.c  | 509 
 tools/libxl/libxl_dom_save.c | 538 +++
 3 files changed, 539 insertions(+), 510 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7d64ecc..263ea0e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -105,7 +105,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o 
libxl_pci.o \
libxl_stream_read.o libxl_stream_write.o \
libxl_save_callout.o _libxl_save_msgs_callout.o \
libxl_qmp.o libxl_event.o libxl_fork.o \
-   libxl_dom_suspend.o $(LIBXL_OBJS-y)
+   libxl_dom_suspend.o libxl_dom_save.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index d74f1a4..664adad 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -24,7 +24,6 @@
 #include 
 #include 
 #include 
-#include 
 
 libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
 {
@@ -1107,514 +1106,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t 
domid,
 return libxl__xs_printf(gc, XBT_NULL, path, "%s", cmd);
 }
 
-/*
- * Inspect the buffer between start and end, and return a pointer to the
- * character following the NUL terminator of start, or NULL if start is not
- * terminated before end.
- */
-static const char *next_string(const char *start, const char *end)
-{
-if (start >= end) return NULL;
-
-size_t total_len = end - start;
-size_t len = strnlen(start, total_len);
-
-if (len == total_len)
-return NULL;
-else
-return start + len + 1;
-}
-
-int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
-  const char *ptr, uint32_t size)
-{
-STATE_AO_GC(dcs->ao);
-const char *next = ptr, *end = ptr + size, *key, *val;
-int rc;
-
-const uint32_t domid = dcs->guest_domid;
-const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-while (next < end) {
-key = next;
-next = next_string(next, end);
-
-/* Sanitise 'key'. */
-if (!next) {
-rc = ERROR_FAIL;
-LOG(ERROR, "Key in xenstore data not NUL terminated");
-goto out;
-}
-if (key[0] == '\0') {
-rc = ERROR_FAIL;
-LOG(ERROR, "empty key found in xenstore data");
-goto out;
-}
-if (key[0] == '/') {
-rc = ERROR_FAIL;
-LOG(ERROR, "Key in xenstore data not relative");
-goto out;
-}
-
-val = next;
-next = next_string(next, end);
-
-/* Sanitise 'val'. */
-if (!next) {
-rc = ERROR_FAIL;
-LOG(ERROR, "Val in xenstore data not NUL terminated");
-goto out;
-}
-
-libxl__xs_printf(gc, XBT_NULL,
- GCSPRINTF("%s/%s", xs_root, key),
- "%s", val);
-}
-
-rc = 0;
-
- out:
-return rc;
-}
-
-/* Domain suspend (save) */
-
-static void stream_done(libxl__egc *egc,
-libxl__stream_write_state *sws, int rc);
-static void domain_save_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
-
-/*- complicated callback, called by xc_domain_save -*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty.  There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-const struct timeval *requested_abs,
-int rc);
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
-const char *watch_path, const char *event_path);
-static void switch_logdirty_done(libxl__egc *egc,
- libxl__domain_suspend_state *dss, int rc);
-
-static void logdirty_init(libxl__logdirty_switch *lds)
-{
-lds->cmd_path = 0;
-libxl__ev_xswatch_init(&lds->watch);
-libxl__ev_time_init(&lds->timeout);
-}
-

[Xen-devel] [PATCH v8 02/13] tools/libxl: move remus code into libxl_remus.c

2016-02-17 Thread Wen Congyang
After previous refactoring, we are now able to move all remus code
into a separate file libxl_remus.c.

Export the following functions for internal use:
- setup/teardown Remus:
  * libxl__remus_setup
  * libxl__remus_teardown
  * libxl__remus_restore_setup

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Ian Campbell 
CC: Ian Jackson 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/Makefile |   2 +-
 tools/libxl/libxl.c  |  75 -
 tools/libxl/libxl_create.c   |  32 
 tools/libxl/libxl_dom.c  | 223 --
 tools/libxl/libxl_internal.h |  14 +-
 tools/libxl/libxl_remus.c| 362 +++
 6 files changed, 371 insertions(+), 337 deletions(-)
 create mode 100644 tools/libxl/libxl_remus.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 620720e..7d64ecc 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -64,7 +64,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 38029cd..d6ce7da 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -831,12 +831,6 @@ out:
 return ptr;
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-   libxl__domain_suspend_state *dss);
-static void remus_setup_done(libxl__egc *egc,
- libxl__remus_devices_state *rds, int rc);
-static void remus_setup_failed(libxl__egc *egc,
-   libxl__remus_devices_state *rds, int rc);
 static void remus_failover_cb(libxl__egc *egc,
   libxl__domain_suspend_state *dss, int rc);
 
@@ -893,75 +887,6 @@ int libxl_domain_remus_start(libxl_ctx *ctx, 
libxl_domain_remus_info *info,
 return AO_CREATE_FAIL(rc);
 }
 
-static void libxl__remus_setup(libxl__egc *egc,
-   libxl__domain_suspend_state *dss)
-{
-/* Convenience aliases */
-libxl__remus_devices_state *const rds = &dss->rds;
-const libxl_domain_remus_info *const info = dss->remus;
-libxl__srm_save_autogen_callbacks *const callbacks =
-&dss->sws.shs.callbacks.save.a;
-
-STATE_AO_GC(dss->ao);
-
-if (libxl_defbool_val(info->netbuf)) {
-if (!libxl__netbuffer_enabled(gc)) {
-LOG(ERROR, "Remus: No support for network buffering");
-goto out;
-}
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VIF);
-}
-
-if (libxl_defbool_val(info->diskbuf))
-rds->device_kind_flags |= (1 << LIBXL__DEVICE_KIND_VBD);
-
-rds->ao = ao;
-rds->domid = dss->domid;
-rds->callback = remus_setup_done;
-
-dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
-
-callbacks->suspend = libxl__remus_domain_suspend_callback;
-callbacks->postcopy = libxl__remus_domain_resume_callback;
-callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-
-libxl__remus_devices_setup(egc, rds);
-return;
-
-out:
-dss->callback(egc, dss, ERROR_FAIL);
-}
-
-static void remus_setup_done(libxl__egc *egc,
- libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (!rc) {
-libxl__domain_save(egc, dss);
-return;
-}
-
-LOG(ERROR, "Remus: failed to setup device for guest with domid %u, rc %d",
-dss->domid, rc);
-rds->callback = remus_setup_failed;
-libxl__remus_devices_teardown(egc, rds);
-}
-
-static void remus_setup_failed(libxl__egc *egc,
-   libxl__remus_devices_state *rds, int rc)
-{
-libxl__domain_suspend_state *dss = CONTAINER_OF(rds, *dss, rds);
-STATE_AO_GC(dss->ao);
-
-if (rc)
-LOG(ERROR, "Remus: failed to teardown device after setup failed"
-" for guest with domid %u, rc %d", dss->domid, rc);
-
-dss->callback(egc, dss, rc);
-}
-
 static void remus_failover_cb(libxl__egc *egc,
   libxl__domain_suspend_state *dss, int rc)
 {
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7293d0b..e421d36 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -709,38 +709,6 @@ static int store_libxl_entry(libxl__gc *gc, uint32_t domid,
 
libxl_device_model_version_to_string(b_info->device_model_version));
 }
 
-/*- remus asynchronous checkpoint callback 

[Xen-devel] [PATCH v8 10/13] tools/libxl: adjust the indentation

2016-02-17 Thread Wen Congyang
This is just tidying up after the "tools/libxl: rename remus device
to checkpoint device" patch automatic renaming.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl_checkpoint_device.c | 21 +++--
 tools/libxl/libxl_internal.h  | 19 +++
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c 
b/tools/libxl/libxl_checkpoint_device.c
index 109cd23..226f159 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -73,9 +73,9 @@ static void devices_teardown_cb(libxl__egc *egc,
 /* checkpoint device setup and teardown */
 
 static libxl__checkpoint_device* checkpoint_device_init(libxl__egc *egc,
-  libxl__checkpoint_devices_state 
*cds,
-  libxl__device_kind kind,
-  void *libxl_dev)
+libxl__checkpoint_devices_state *cds,
+libxl__device_kind kind,
+void *libxl_dev)
 {
 libxl__checkpoint_device *dev = NULL;
 
@@ -89,9 +89,10 @@ static libxl__checkpoint_device* 
checkpoint_device_init(libxl__egc *egc,
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-libxl__checkpoint_devices_state *cds);
+ libxl__checkpoint_devices_state *cds);
 
-void libxl__checkpoint_devices_setup(libxl__egc *egc, 
libxl__checkpoint_devices_state *cds)
+void libxl__checkpoint_devices_setup(libxl__egc *egc,
+ libxl__checkpoint_devices_state *cds)
 {
 int i, rc;
 
@@ -137,7 +138,7 @@ out:
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
-libxl__checkpoint_devices_state *cds)
+ libxl__checkpoint_devices_state *cds)
 {
 int i, rc;
 
@@ -285,12 +286,12 @@ static void devices_checkpoint_cb(libxl__egc *egc,
 
 /* API implementations */
 
-#define define_checkpoint_api(api)\
-void libxl__checkpoint_devices_##api(libxl__egc *egc,\
-libxl__checkpoint_devices_state *cds)\
+#define define_checkpoint_api(api)  \
+void libxl__checkpoint_devices_##api(libxl__egc *egc,   \
+libxl__checkpoint_devices_state *cds)   \
 {   \
 int i;  \
-libxl__checkpoint_device *dev;   \
+libxl__checkpoint_device *dev;  \
 \
 STATE_AO_GC(cds->ao);   \
 \
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 630f048..bde7a15 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2818,7 +2818,8 @@ typedef struct libxl__save_helper_state {
  * Each device type needs to implement the interfaces specified in
  * the libxl__checkpoint_device_instance_ops if it wishes to support Remus.
  *
- * The high-level control flow through the checkpoint device layer is shown 
below:
+ * The high-level control flow through the checkpoint device layer is shown
+ * below:
  *
  * xl remus
  *  |->  libxl_domain_remus_start
@@ -2879,7 +2880,8 @@ int 
init_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 void cleanup_subkind_drbd_disk(libxl__checkpoint_devices_state *cds);
 
 typedef void libxl__checkpoint_callback(libxl__egc *,
-   libxl__checkpoint_devices_state *, int rc);
+libxl__checkpoint_devices_state *,
+int rc);
 
 /*
  * State associated with a checkpoint invocation, including parameters
@@ -2887,7 +2889,7 @@ typedef void libxl__checkpoint_callback(libxl__egc *,
  * save/restore machinery.
  */
 struct libxl__checkpoint_devices_state {
-/* must be set by caller of libxl__checkpoint_device_(setup|teardown) 
*/
+/*-- must be set by caller of libxl__checkpoint_device_(setup|teardown) 
--*/
 
 libxl__ao *ao;
 uint32_t domid;
@@ -2900,7 +2902,8 @@ struct libxl__checkpoint_devices_state {
 /*
  * this array is allocated before setup the checkpoint devices by the
  * checkpoint abstract layer.
- * devs may 

[Xen-devel] [PATCH v8 13/13] tools/libxl: seperate device init/cleanup from checkpoint device layer

2016-02-17 Thread Wen Congyang
Currently (init|cleanup)_subkind_nic and (init|cleanup)_subkind_drbd_disk
are called directly from the checkpoint device layer. Move the calls to
libxl_remus.c, so that the init calls run before
libxl__checkpoint_devices_setup() and the cleanup calls run after
libxl__checkpoint_devices_teardown().
This is pure refactoring with no functional change.

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
Acked-by: Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
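A minimal sketch of the resulting call order, assuming a synchronous,
simplified model (the real libxl code is asynchronous and callback-driven, and
the struct and its fields here are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the checkpoint device state. */
struct cds_sketch {
    bool subkind_ready;   /* set by init, cleared by cleanup */
    bool devices_up;      /* set by setup, cleared by teardown */
    bool fail_init;       /* test knob: make init fail */
};

static int init_device_subkind(struct cds_sketch *cds)
{
    if (cds->fail_init) return -1;
    cds->subkind_ready = true;
    return 0;
}

static void cleanup_device_subkind(struct cds_sketch *cds)
{
    cds->subkind_ready = false;
}

/* Generic layer: no longer knows about device subkinds. */
static void checkpoint_devices_setup(struct cds_sketch *cds)
{
    cds->devices_up = true;
}

static void checkpoint_devices_teardown(struct cds_sketch *cds)
{
    cds->devices_up = false;
}

/* Remus layer: init before the generic setup ... */
static int remus_setup(struct cds_sketch *cds)
{
    if (init_device_subkind(cds)) return -1;
    checkpoint_devices_setup(cds);
    return 0;
}

/* ... and cleanup after the generic teardown. */
static void remus_teardown(struct cds_sketch *cds)
{
    checkpoint_devices_teardown(cds);
    cleanup_device_subkind(cds);
}
```
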
 tools/libxl/libxl_checkpoint_device.c | 42 ++-
 tools/libxl/libxl_remus.c | 42 +++
 2 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c 
b/tools/libxl/libxl_checkpoint_device.c
index bbc6dc4..0a16dbb 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,38 +17,6 @@
 
 #include "libxl_internal.h"
 
-/*- helper functions -*/
-
-static int init_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-/* init device subkind-specific state in the libxl ctx */
-int rc;
-STATE_AO_GC(cds->ao);
-
-if (libxl__netbuffer_enabled(gc)) {
-rc = init_subkind_nic(cds);
-if (rc) goto out;
-}
-
-rc = init_subkind_drbd_disk(cds);
-if (rc) goto out;
-
-rc = 0;
-out:
-return rc;
-}
-
-static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
-{
-/* cleanup device subkind-specific state in the libxl ctx */
-STATE_AO_GC(cds->ao);
-
-if (libxl__netbuffer_enabled(gc))
-cleanup_subkind_nic(cds);
-
-cleanup_subkind_drbd_disk(cds);
-}
-
 /*- setup() and teardown() -*/
 
 /* callbacks */
@@ -86,14 +54,10 @@ static void checkpoint_devices_setup(libxl__egc *egc,
 void libxl__checkpoint_devices_setup(libxl__egc *egc,
  libxl__checkpoint_devices_state *cds)
 {
-int i, rc;
+int i;
 
 STATE_AO_GC(cds->ao);
 
-rc = init_device_subkind(cds);
-if (rc)
-goto out;
-
 cds->num_devices = 0;
 cds->num_nics = 0;
 cds->num_disks = 0;
@@ -126,7 +90,7 @@ void libxl__checkpoint_devices_setup(libxl__egc *egc,
 return;
 
 out:
-cds->callback(egc, cds, rc);
+cds->callback(egc, cds, 0);
 }
 
 static void checkpoint_devices_setup(libxl__egc *egc,
@@ -263,8 +227,6 @@ static void devices_teardown_cb(libxl__egc *egc,
 cds->disks = NULL;
 cds->num_disks = 0;
 
-cleanup_device_subkind(cds);
-
 cds->callback(egc, cds, rc);
 }
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index e83cdc9..54ec7de 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -26,6 +26,38 @@ static const libxl__checkpoint_device_instance_ops 
*remus_ops[] = {
 NULL,
 };
 
+/*- helper functions -*/
+
+static int init_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+/* init device subkind-specific state in the libxl ctx */
+int rc;
+STATE_AO_GC(cds->ao);
+
+if (libxl__netbuffer_enabled(gc)) {
+rc = init_subkind_nic(cds);
+if (rc) goto out;
+}
+
+rc = init_subkind_drbd_disk(cds);
+if (rc) goto out;
+
+rc = 0;
+out:
+return rc;
+}
+
+static void cleanup_device_subkind(libxl__checkpoint_devices_state *cds)
+{
+/* cleanup device subkind-specific state in the libxl ctx */
+STATE_AO_GC(cds->ao);
+
+if (libxl__netbuffer_enabled(gc))
+cleanup_subkind_nic(cds);
+
+cleanup_subkind_drbd_disk(cds);
+}
+
 /* Remus setup and teardown -*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -68,6 +100,12 @@ void libxl__remus_setup(libxl__egc *egc, libxl__remus_state 
*rs)
 cds->concrete_data = rs;
 rs->interval = info->interval;
 
+if (init_device_subkind(cds)) {
+LOG(ERROR, "Remus: failed to init device subkind for guest %u",
+dss->domid);
+goto out;
+}
+
 dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
 callbacks->suspend = libxl__remus_domain_suspend_callback;
@@ -108,6 +146,8 @@ static void remus_setup_failed(libxl__egc *egc,
 LOG(ERROR, "Remus: failed to teardown device after setup failed"
 " for guest with domid %u, rc %d", dss->domid, rc);
 
+cleanup_device_subkind(cds);
+
 dss->callback(egc, dss, rc);
 }
 
@@ -142,6 +182,8 @@ static void remus_teardown_done(libxl__egc *egc,
 LOG(ERROR, "Remus: failed to teardown device for guest with domid %u,"
 " rc %d", dss->domid, rc);
 
+cleanup_device_subkind(cds);
+
 dss->callback(egc, dss, rc);
 }
 
-- 
2.5.0






[Xen-devel] [PATCH v8 00/13] Prerequisite patches for COLO

2016-02-17 Thread Wen Congyang
This patchset is a prerequisite for the COLO feature. Refer to:
http://wiki.xen.org/wiki/COLO_-_Coarse_Grain_Lock_Stepping

Patch status:
1. Acked patches: patch 2-4, 6-13
2. Reviewed patches: all
3. New patches: none
Note:
1. Patches 1 and 7 are updated according to Wei Liu's comments
2. Patches 2-3 are updated because patch 1 is updated
3. Patches 8, 9, 11 and 12 in v7 are moved to another series
4. Patches 13 and 14 in v7 are folded into one patch (patch 9)
5. The commit message for patch 5 is not updated (waiting for a reply
   from Ian C and Ian J)

You can get the code from here:
https://github.com/wencongyang/xen/tree/colo_pre_v8
You can get the whole set of COLO-related patches from here:
https://github.com/wencongyang/xen/tree/colo_v10

v6->v7:
 - Addressed comments from Konrad Rzeszutek Wilk

v5->v6:
 - Fix some bugs found in the test

v4->v5:
 - Rebased to the latest xen
 - Addressed comments from last round

v3->v4:
 - Rebased to the latest migration v2 branch
 - Addressed comments from last round

v2->v3:
 - Merge '[PATCH v2 0/6] Misc cleanups for libxl' into this patchset
   for easy review
 - Addressed review comments
 - Add back channel to libxc
 - Introduce should_checkpoint callback
 - Introduce DIRTY_BITMAP record on libxc side
 - Introduce COLO_CONTEXT record on libxl side
 - Ported to Libxl migration v2

v1->v2:
 - Rebased to [PATCH v2 0/6] Misc cleanups for libxl
 - Add a bugfix for the error handling of process_record

Wen Congyang (13):
  libxl/remus: init checkpoint callback in Remus setup callback
  tools/libxl: move remus code into libxl_remus.c
  tools/libxl: move save/restore code into libxl_dom_save.c
  libxl/save: Refactor libxl__domain_suspend_state
  tools/libxc: support to resume uncooperative HVM guests
  tools/libxl: introduce enum type libxl_checkpointed_stream
  migration/save: pass checkpointed_stream from libxl to libxc
  tools/libxl: export logdirty_init
  tools/libxl: rename remus device to checkpoint device
  tools/libxl: adjust the indentation
  tools/libxl: store remus_ops in checkpoint device state
  tools/libxl: move remus state into a seperate structure
  tools/libxl: seperate device init/cleanup from checkpoint device layer

 tools/libxc/include/xenguest.h|   6 +-
 tools/libxc/xc_nomigrate.c|   3 +-
 tools/libxc/xc_resume.c   |  25 +-
 tools/libxc/xc_sr_common.h|  12 +-
 tools/libxc/xc_sr_save.c  |  17 +-
 tools/libxl/Makefile  |   4 +-
 tools/libxl/libxl.c   |  81 +---
 tools/libxl/libxl.h   |  19 +
 tools/libxl/libxl_checkpoint_device.c | 282 +
 tools/libxl/libxl_create.c|  44 +-
 tools/libxl/libxl_dom.c   | 740 --
 tools/libxl/libxl_dom_save.c  | 521 
 tools/libxl/libxl_dom_suspend.c   | 207 ++
 tools/libxl/libxl_internal.h  | 217 ++
 tools/libxl/libxl_netbuffer.c | 117 +++---
 tools/libxl/libxl_nonetbuffer.c   |  10 +-
 tools/libxl/libxl_remus.c | 424 +++
 tools/libxl/libxl_remus_device.c  | 327 ---
 tools/libxl/libxl_remus_disk_drbd.c   |  56 +--
 tools/libxl/libxl_save_callout.c  |   4 +-
 tools/libxl/libxl_save_helper.c   |   3 +-
 tools/libxl/libxl_stream_read.c   |   7 +-
 tools/libxl/libxl_stream_write.c  |  18 +-
 tools/libxl/libxl_types.idl   |  10 +-
 tools/libxl/xl_cmdimpl.c  |  18 +-
 25 files changed, 1709 insertions(+), 1463 deletions(-)
 create mode 100644 tools/libxl/libxl_checkpoint_device.c
 create mode 100644 tools/libxl/libxl_dom_save.c
 create mode 100644 tools/libxl/libxl_remus.c
 delete mode 100644 tools/libxl/libxl_remus_device.c

-- 
2.5.0






[Xen-devel] [PATCH v8 01/13] libxl/remus: init checkpoint callback in Remus setup callback

2016-02-17 Thread Wen Congyang
Initialise the stream read/write state's checkpoint_callback and the
suspend/resume/checkpoint callbacks in the Remus setup callback.
There is no functional change; this is just refactoring so that we can move
all Remus code into one file.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
Reviewed-by: Konrad Rzeszutek Wilk 
---
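A minimal sketch of the idea, assuming a simplified callback table: the
generic save path leaves the callbacks unset, and only the Remus setup step
installs them. The type and function names below are hypothetical stand-ins,
not the libxl ones.

```c
#include <assert.h>
#include <stddef.h>

typedef void save_cb(void *data);

/* Hypothetical stand-in for the autogenerated save callback table. */
struct save_callbacks_sketch {
    save_cb *suspend;
    save_cb *postcopy;
    save_cb *checkpoint;
};

/* Dummy handlers that record that they ran by bumping a counter. */
static void remus_suspend(void *data)    { *(int *)data += 1; }
static void remus_resume(void *data)     { *(int *)data += 10; }
static void remus_checkpoint(void *data) { *(int *)data += 100; }

/* All Remus-specific wiring lives in one setup function. */
static void remus_setup_callbacks(struct save_callbacks_sketch *cb)
{
    assert(cb != NULL);
    cb->suspend = remus_suspend;
    cb->postcopy = remus_resume;
    cb->checkpoint = remus_checkpoint;
}
```

Keeping the assignments in one place is what lets a later patch move the
handlers into a separate file without touching the generic save code.
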
 tools/libxl/libxl.c  |  8 
 tools/libxl/libxl_create.c   | 18 ++
 tools/libxl/libxl_dom.c  | 18 +-
 tools/libxl/libxl_internal.h |  7 +++
 4 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 2d18b8d..38029cd 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -899,6 +899,8 @@ static void libxl__remus_setup(libxl__egc *egc,
 /* Convenience aliases */
 libxl__remus_devices_state *const rds = &dss->rds;
 const libxl_domain_remus_info *const info = dss->remus;
+libxl__srm_save_autogen_callbacks *const callbacks =
+&dss->sws.shs.callbacks.save.a;
 
 STATE_AO_GC(dss->ao);
 
@@ -917,6 +919,12 @@ static void libxl__remus_setup(libxl__egc *egc,
 rds->domid = dss->domid;
 rds->callback = remus_setup_done;
 
+dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
+
+callbacks->suspend = libxl__remus_domain_suspend_callback;
+callbacks->postcopy = libxl__remus_domain_resume_callback;
+callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+
 libxl__remus_devices_setup(egc, rds);
 return;
 
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index de5d27f..7293d0b 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -730,6 +730,17 @@ static void remus_checkpoint_stream_done(
 libxl__xc_domain_saverestore_async_callback_done(egc, >shs, rc);
 }
 
+static void libxl__remus_restore_setup(libxl__egc *egc,
+   libxl__domain_create_state *dcs)
+{
+/* Convenience aliases */
+libxl__srm_restore_autogen_callbacks *const callbacks =
+&dcs->srs.shs.callbacks.restore.a;
+
+callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
+dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
+}
+
 /*- main domain creation -*/
 
 /* We have a linear control flow; only one event callback is
@@ -1014,8 +1025,7 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 libxl_domain_config *const d_config = dcs->guest_config;
 const int restore_fd = dcs->restore_fd;
 libxl__domain_build_state *const state = &dcs->build_state;
-libxl__srm_restore_autogen_callbacks *const callbacks =
-&dcs->srs.shs.callbacks.restore.a;
+const int checkpointed_stream = dcs->restore_params.checkpointed_stream;
 
 if (rc) {
 domcreate_rebuild_done(egc, dcs, rc);
@@ -1043,7 +1053,6 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 }
 
 /* Restore */
-callbacks->checkpoint = libxl__remus_domain_restore_checkpoint_callback;
 
 rc = libxl__build_pre(gc, domid, d_config, state);
 if (rc)
@@ -1054,9 +1063,10 @@ static void domcreate_bootloader_done(libxl__egc *egc,
 dcs->srs.fd = restore_fd;
 dcs->srs.legacy = (dcs->restore_params.stream_version == 1);
 dcs->srs.completion_callback = domcreate_stream_done;
-dcs->srs.checkpoint_callback = remus_checkpoint_stream_done;
 
 if (restore_fd >= 0) {
+if (checkpointed_stream)
+libxl__remus_restore_setup(egc, dcs);
 libxl__stream_read_start(egc, &dcs->srs);
 return;
 }
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 2269998..7835d4d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1489,7 +1489,7 @@ static void remus_devices_preresume_cb(libxl__egc *egc,
libxl__remus_devices_state *rds,
int rc);
 
-static void libxl__remus_domain_suspend_callback(void *data)
+void libxl__remus_domain_suspend_callback(void *data)
 {
 libxl__save_helper_state *shs = data;
 libxl__egc *egc = shs->egc;
@@ -1532,7 +1532,7 @@ out:
 libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, !rc);
 }
 
-static void libxl__remus_domain_resume_callback(void *data)
+void libxl__remus_domain_resume_callback(void *data)
 {
 libxl__save_helper_state *shs = data;
 libxl__egc *egc = shs->egc;
@@ -1569,8 +1569,6 @@ out:
 
 /*- remus asynchronous checkpoint callback -*/
 
-static void remus_checkpoint_stream_written(
-libxl__egc *egc, libxl__stream_write_state *sws, int rc);
 static void remus_devices_commit_cb(libxl__egc *egc,
 libxl__remus_devices_state *rds,

[Xen-devel] [PATCH v8 07/13] migration/save: pass checkpointed_stream from libxl to libxc

2016-02-17 Thread Wen Congyang
Pass checkpointed_stream from libxl to libxc.
This does not affect legacy migration, because legacy migration
does not use this parameter.

Signed-off-by: Yang Hongyang 
Signed-off-by: Wen Congyang 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
CC: Andrew Cooper 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
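A minimal sketch of how an int stream type can replace the old boolean when
choosing the memory-sending path, mirroring the if/else chain in save() in the
diff below. The enum values follow the patch; the send_path enum and both
helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    MIG_STREAM_NONE,    /* plain stream */
    MIG_STREAM_REMUS,
};

/* Hypothetical labels for the three send_domain_memory_*() paths. */
enum send_path { SEND_LIVE, SEND_CHECKPOINTED, SEND_NONLIVE };

/* Same decision order as save(): live first, then checkpointed. */
static enum send_path pick_send_path(bool live, int checkpointed)
{
    if (live)
        return SEND_LIVE;
    else if (checkpointed != MIG_STREAM_NONE)
        return SEND_CHECKPOINTED;
    else
        return SEND_NONLIVE;
}

/* A plain stream is one that uses no checkpointing at all. */
static bool is_plain_stream(int checkpointed)
{
    return checkpointed == MIG_STREAM_NONE;
}
```

Because MIG_STREAM_NONE is 0, an old test of the form `!checkpointed`
corresponds to `checkpointed == MIG_STREAM_NONE`, and a bare `checkpointed`
corresponds to `checkpointed != MIG_STREAM_NONE`.
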
 tools/libxc/include/xenguest.h   |  6 --
 tools/libxc/xc_nomigrate.c   |  3 ++-
 tools/libxc/xc_sr_common.h   | 12 +++-
 tools/libxc/xc_sr_save.c | 17 +++--
 tools/libxl/libxl.c  |  2 ++
 tools/libxl/libxl_dom_save.c | 11 ---
 tools/libxl/libxl_internal.h |  1 +
 tools/libxl/libxl_save_callout.c |  2 +-
 tools/libxl/libxl_save_helper.c  |  3 ++-
 tools/libxl/libxl_stream_write.c |  2 +-
 tools/libxl/libxl_types.idl  |  1 +
 11 files changed, 44 insertions(+), 16 deletions(-)

diff --git a/tools/libxc/include/xenguest.h b/tools/libxc/include/xenguest.h
index d48b3ff..affc42b 100644
--- a/tools/libxc/include/xenguest.h
+++ b/tools/libxc/include/xenguest.h
@@ -29,7 +29,6 @@
 #define XCFLAGS_HVM   (1 << 2)
 #define XCFLAGS_STDVGA(1 << 3)
 #define XCFLAGS_CHECKPOINT_COMPRESS(1 << 4)
-#define XCFLAGS_CHECKPOINTED(1 << 5)
 
 #define X86_64_B_SIZE   64 
 #define X86_32_B_SIZE   32
@@ -82,11 +81,14 @@ struct save_callbacks {
  * @parm xch a handle to an open hypervisor interface
  * @parm fd the file descriptor to save a domain to
  * @parm dom the id of the domain
+ * @param checkpointed_stream MIG_STREAM_NONE if the far end of the stream
+ *doesn't use checkpointing
  * @return 0 on success, -1 on failure
  */
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
uint32_t max_factor, uint32_t flags /* XCFLAGS_xxx */,
-   struct save_callbacks* callbacks, int hvm);
+   struct save_callbacks* callbacks, int hvm,
+   int checkpointed_stream);
 
 /* callbacks provided by xc_domain_restore */
 struct restore_callbacks {
diff --git a/tools/libxc/xc_nomigrate.c b/tools/libxc/xc_nomigrate.c
index 902429e..c9124df 100644
--- a/tools/libxc/xc_nomigrate.c
+++ b/tools/libxc/xc_nomigrate.c
@@ -22,7 +22,8 @@
 
 int xc_domain_save(xc_interface *xch, int io_fd, uint32_t dom, uint32_t 
max_iters,
uint32_t max_factor, uint32_t flags,
-   struct save_callbacks* callbacks, int hvm)
+   struct save_callbacks* callbacks, int hvm,
+   int checkpointed_stream)
 {
 errno = ENOSYS;
 return -1;
diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 60b43e8..66f595f 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -180,6 +180,16 @@ struct xc_sr_context
 
 xc_dominfo_t dominfo;
 
+/*
+ * migration stream
+ * 0: Plain VM
+ * 1: Remus
+ */
+enum {
+MIG_STREAM_NONE, /* plain stream */
+MIG_STREAM_REMUS,
+} migration_stream;
+
 union /* Common save or restore data. */
 {
 struct /* Save data. */
@@ -191,7 +201,7 @@ struct xc_sr_context
 bool live;
 
 /* Plain VM, or checkpoints over time. */
-bool checkpointed;
+int checkpointed;
 
 /* Further debugging information in the stream. */
 bool debug;
diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index ccb000e..e258b7c 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -629,7 +629,7 @@ static int send_domain_memory_live(struct xc_sr_context 
*ctx)
 if ( rc )
 goto out;
 
-if ( ctx->save.debug && !ctx->save.checkpointed )
+if ( ctx->save.debug && ctx->save.checkpointed != MIG_STREAM_NONE )
 {
 rc = verify_frames(ctx);
 if ( rc )
@@ -758,7 +758,7 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
 
 if ( ctx->save.live )
 rc = send_domain_memory_live(ctx);
-else if ( ctx->save.checkpointed )
+else if ( ctx->save.checkpointed != MIG_STREAM_NONE )
 rc = send_domain_memory_checkpointed(ctx);
 else
 rc = send_domain_memory_nonlive(ctx);
@@ -778,7 +778,7 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
 if ( rc )
 goto err;
 
-if ( ctx->save.checkpointed )
+if ( ctx->save.checkpointed != MIG_STREAM_NONE )
 {
 /*
  * We have now completed the initial live portion of the checkpoint
@@ -799,7 +799,7 @@ static int save(struct xc_sr_context *ctx, uint16_t 
guest_type)
 if ( rc <= 0 )
 goto err;
 }
-} while ( ctx->save.checkpointed );
+} while ( ctx->save.checkpointed != 

[Xen-devel] [PATCH v8 09/13] tools/libxl: rename remus device to checkpoint device

2016-02-17 Thread Wen Congyang
This patch is auto generated by the following commands:
 1. git mv tools/libxl/libxl_remus_device.c tools/libxl/libxl_checkpoint_device.c
 2. perl -pi -e 's/libxl_remus_device/libxl_checkpoint_device/g' tools/libxl/Makefile
 3. perl -pi -e 's/\blibxl__remus_devices/libxl__checkpoint_devices/g' tools/libxl/*.[ch]
 4. perl -pi -e 's/\blibxl__remus_device\b/libxl__checkpoint_device/g' tools/libxl/*.[ch]
 5. perl -pi -e 's/\blibxl__remus_device_instance_ops\b/libxl__checkpoint_device_instance_ops/g' tools/libxl/*.[ch]
 6. perl -pi -e 's/\blibxl__remus_callback\b/libxl__checkpoint_callback/g' tools/libxl/*.[ch]
 7. perl -pi -e 's/\bremus_device_init\b/checkpoint_device_init/g' tools/libxl/*.[ch]
 8. perl -pi -e 's/\bremus_devices_setup\b/checkpoint_devices_setup/g' tools/libxl/*.[ch]
 9. perl -pi -e 's/\bdefine_remus_checkpoint_api\b/define_checkpoint_api/g' tools/libxl/*.[ch]
10. perl -pi -e 's/\brds\b/cds/g' tools/libxl/*.[ch]
11. perl -pi -e 's/REMUS_DEVICE/CHECKPOINT_DEVICE/g' tools/libxl/*.[ch] tools/libxl/*.idl
12. perl -pi -e 's/REMUS_DEVOPS/CHECKPOINT_DEVOPS/g' tools/libxl/*.[ch] tools/libxl/*.idl
13. perl -pi -e 's/\bremus\b/checkpoint/g' tools/libxl/libxl_checkpoint_device.[ch]
14. perl -pi -e 's/\bremus device/checkpoint device/g' tools/libxl/libxl_internal.h
15. perl -pi -e 's/\bRemus device/checkpoint device/g' tools/libxl/libxl_internal.h
16. perl -pi -e 's/\bremus abstract/checkpoint abstract/g' tools/libxl/libxl_internal.h
17. perl -pi -e 's/\bremus invocation/checkpoint invocation/g' tools/libxl/libxl_internal.h
18. perl -pi -e 's/\blibxl__remus_device_\(/libxl__checkpoint_device_(/g' tools/libxl/libxl_internal.h

The patch also fixes the following backward-compatibility issue:
  The ERROR_REMUS_XXX error codes were introduced in Xen 4.5 and are
  renamed to ERROR_CHECKPOINT_XXX by the renaming above. The old names
  are kept as aliases for callers that request the 4.5 or 4.6 API version.
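
The aliasing can be sketched as below. This is a standalone illustration,
not the libxl header: every DEMO_ name and both numeric error values are
stand-ins assumed here so that the fragment compiles without libxl. Only the
version range (>= 0x040500 and < 0x040700) is taken from the libxl.h hunk in
this patch.

```c
#include <assert.h>

/* Stand-in for the API version an application built against Xen 4.5
 * would request before including libxl.h (hypothetical name). */
#define DEMO_API_VERSION 0x040500

/* Stand-ins for the renamed error codes; the values are made up. */
enum {
    DEMO_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH = -23,
    DEMO_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED  = -24,
};

/* The old names are only provided to callers asking for the 4.5/4.6 API. */
#if defined(DEMO_API_VERSION) && DEMO_API_VERSION >= 0x040500 \
    && DEMO_API_VERSION < 0x040700
#define DEMO_ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
    DEMO_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
#define DEMO_ERROR_REMUS_DEVICE_NOT_SUPPORTED \
    DEMO_ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
#endif

/* Code written against the old name keeps compiling unchanged. */
int demo_is_unsupported(int rc)
{
    return rc == DEMO_ERROR_REMUS_DEVICE_NOT_SUPPORTED;
}

/* Both spellings must name the same value, or old callers would break. */
void demo_check_alias(void)
{
    assert(DEMO_ERROR_REMUS_DEVOPS_DOES_NOT_MATCH ==
           DEMO_ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH);
}
```

An application that requests a newer API version would not see the old
names and would have to switch to the ERROR_CHECKPOINT_XXX spelling.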

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
Reviewed-Lightly-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/Makefile   |   2 +-
 tools/libxl/libxl.h|  12 ++
 ...xl_remus_device.c => libxl_checkpoint_device.c} | 198 ++---
 tools/libxl/libxl_internal.h   | 112 ++--
 tools/libxl/libxl_netbuffer.c  | 108 +--
 tools/libxl/libxl_nonetbuffer.c|  10 +-
 tools/libxl/libxl_remus.c  |  76 
 tools/libxl/libxl_remus_disk_drbd.c|  52 +++---
 tools/libxl/libxl_types.idl|   4 +-
 9 files changed, 293 insertions(+), 281 deletions(-)
 rename tools/libxl/{libxl_remus_device.c => libxl_checkpoint_device.c} (52%)

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 263ea0e..789a12e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -64,7 +64,7 @@ else
 LIBXL_OBJS-y += libxl_no_convert_callout.o
 endif
 
-LIBXL_OBJS-y += libxl_remus.o libxl_remus_device.o libxl_remus_disk_drbd.o
+LIBXL_OBJS-y += libxl_remus.o libxl_checkpoint_device.o libxl_remus_disk_drbd.o
 
 LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o libxl_psr.o
 LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o libxl_libfdt_compat.o
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 6225db1..f9e3ef5 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -883,6 +883,18 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, 
libxl_mac *src);
  */
 #define LIBXL_HAVE_CHECKPOINTED_STREAM 1
 
+/*
+ * ERROR_REMUS_XXX error code only exists from Xen 4.5, Xen 4.6 and it
+ * is changed to ERROR_CHECKPOINT_XXX in Xen 4.7
+ */
+#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION >= 0x040500 \
+   && LIBXL_API_VERSION < 0x040700
+#define ERROR_REMUS_DEVOPS_DOES_NOT_MATCH \
+ERROR_CHECKPOINT_DEVOPS_DOES_NOT_MATCH
+#define ERROR_REMUS_DEVICE_NOT_SUPPORTED \
+ERROR_CHECKPOINT_DEVICE_NOT_SUPPORTED
+#endif
+
 typedef char **libxl_string_list;
 void libxl_string_list_dispose(libxl_string_list *sl);
 int libxl_string_list_length(const libxl_string_list *sl);
diff --git a/tools/libxl/libxl_remus_device.c 
b/tools/libxl/libxl_checkpoint_device.c
similarity index 52%
rename from tools/libxl/libxl_remus_device.c
rename to tools/libxl/libxl_checkpoint_device.c
index a6cb7f6..109cd23 100644
--- a/tools/libxl/libxl_remus_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,9 +17,9 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__remus_device_instance_ops remus_device_nic;
-extern const libxl__remus_device_instance_ops remus_device_drbd_disk;
-static const libxl__remus_device_instance_ops *remus_ops[] = {
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
 

[Xen-devel] [PATCH v8 05/13] tools/libxc: support to resume uncooperative HVM guests

2016-02-17 Thread Wen Congyang
Before this patch:
1. suspend
a. PVHVM and PV: we use the same way to suspend the guest (send the suspend
   request to the guest). If the guest doesn't support evtchn, the xenstore
   variant will be used, suspending the guest via XenBus control node.
b. pure HVM: we call xc_domain_shutdown(..., SHUTDOWN_suspend) to suspend
   the guest

2. Resume:
a. fast path (fast=1)
   Do not change the guest state. We call libxl__domain_resume(.., 1) which
   calls xc_domain_resume(..., 1 /* fast=1*/) to resume the guest.
   PV:       modify the return code to 1, and then call the domctl:
             XEN_DOMCTL_resumedomain
   PVHVM:    same as PV
   pure HVM: do nothing in modify_returncode, and then call the domctl:
             XEN_DOMCTL_resumedomain
b. slow path
   Used when the guest's state has been changed. Will call
   libxl__domain_resume(..., 0) to resume the guest.
   PV:       update start info, and reset all secondary CPU states. Then call
             the domctl: XEN_DOMCTL_resumedomain
   PVHVM:    cannot be resumed. You will get the following error message:
             "Cannot resume uncooperative HVM guests"
   pure HVM: same as PVHVM

After this patch:
1. suspend
   unchanged

2. Resume
a. fast path:
   unchanged
b. slow
   PV:   unchanged
   PVHVM:    call XEN_DOMCTL_resumedomain to resume the guest. Because we
             don't modify the return code, the PV driver will disconnect
             and reconnect.
             The guest ends up doing the XENMAPSPACE_shared_info
             XENMEM_add_to_physmap hypercall and resetting all of its CPU
             states to point to the shared_info (except the ones past 32).
             The Linux kernel does this regardless of whether
             SCHEDOP_shutdown:SHUTDOWN_suspend returns 1 or not.
   Pure HVM: call XEN_DOMCTL_resumedomain to resume the guest.

Under COLO, we will update the guest's state (modify memory, CPU registers,
device status...). In this case, we cannot use the fast path to resume it.
Keep the return code 0, and use the slow path to resume the guest. Resuming
an HVM guest via the slow path is not supported currently, so this patch
makes the resume call not fail.
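
The resulting behaviour after this patch can be summarised as a small
decision table. The sketch below is a model only, assuming hypothetical
demo_ names; it does not call libxc and only mirrors the cases listed in
this commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Guest flavours as named in the commit message. */
enum demo_guest { DEMO_PV, DEMO_PVHVM, DEMO_PURE_HVM };

/* What happens before XEN_DOMCTL_resumedomain is issued. */
enum demo_action {
    DEMO_MODIFY_RETURN_CODE, /* fast path, PV and PVHVM: return code set to 1 */
    DEMO_RESUME_ONLY,        /* nothing extra, just issue the domctl */
    DEMO_REBUILD_PV_STATE,   /* slow path, PV: update start info, reset CPUs */
};

enum demo_action demo_resume_action(enum demo_guest g, bool fast)
{
    assert(g == DEMO_PV || g == DEMO_PVHVM || g == DEMO_PURE_HVM);

    if (fast)
        return g == DEMO_PURE_HVM ? DEMO_RESUME_ONLY
                                  : DEMO_MODIFY_RETURN_CODE;

    /* Slow path: with this patch, HVM guests no longer fail here. */
    return g == DEMO_PV ? DEMO_REBUILD_PV_STATE : DEMO_RESUME_ONLY;
}
```

Before the patch, the two HVM rows of the slow path returned an error
instead of DEMO_RESUME_ONLY.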

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
Reviewed-by: Konrad Rzeszutek Wilk 
---
 tools/libxc/xc_resume.c | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_resume.c b/tools/libxc/xc_resume.c
index e692b81..4eedf87 100644
--- a/tools/libxc/xc_resume.c
+++ b/tools/libxc/xc_resume.c
@@ -108,6 +108,26 @@ static int xc_domain_resume_cooperative(xc_interface *xch, 
uint32_t domid)
 return do_domctl(xch, &domctl);
 }
 
+static int xc_domain_resume_hvm(xc_interface *xch, uint32_t domid)
+{
+DECLARE_DOMCTL;
+
+/*
+ * The domctl XEN_DOMCTL_resumedomain unpause each vcpu. After
+ * the domctl, the guest will run.
+ *
+ * If it is PVHVM, the guest called the hypercall
+ *SCHEDOP_shutdown:SHUTDOWN_suspend
+ * to suspend itself. We don't modify the return code, so the PV driver
+ * will disconnect and reconnect.
+ *
+ * If it is a HVM, the guest will continue running.
+ */
+domctl.cmd = XEN_DOMCTL_resumedomain;
+domctl.domain = domid;
+return do_domctl(xch, &domctl);
+}
+
 static int xc_domain_resume_any(xc_interface *xch, uint32_t domid)
 {
 DECLARE_DOMCTL;
@@ -137,10 +157,7 @@ static int xc_domain_resume_any(xc_interface *xch, 
uint32_t domid)
  */
 #if defined(__i386__) || defined(__x86_64__)
 if ( info.hvm )
-{
-ERROR("Cannot resume uncooperative HVM guests");
-return rc;
-}
+return xc_domain_resume_hvm(xch, domid);
 
 if ( xc_domain_get_guest_width(xch, domid, >guest_width) != 0 )
 {
-- 
2.5.0




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v8 11/13] tools/libxl: store remus_ops in checkpoint device state

2016-02-17 Thread Wen Congyang
The checkpoint device is an abstraction layer for doing checkpoints.
COLO can also use it to do checkpoints. But there is
still some code in the checkpoint device layer which touches remus.

This patch and:
 tools/libxl: move remus state into a seperate structure
 tools/libxl: seperate device init/cleanup from checkpoint device layer
will separate remus from the checkpoint device layer.

We use the remus ops directly in the checkpoint device layer. Store them
in the checkpoint device state so that the checkpoint device layer is not
aware of remus_ops.

It is pure refactoring with no functional changes.
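
The pattern is a NULL-terminated array of ops pointers that the caller
stores in the state, so the generic layer only walks whatever table it was
given. A minimal sketch follows; the demo_ types and the match() callback
are trimmed-down stand-ins assumed for illustration, not the libxl
structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the libxl types. */
struct demo_ops {
    const char *name;
    int (*match)(int kind);
};

struct demo_cds {
    const struct demo_ops **ops; /* pointer array, last entry must be NULL */
};

static int match_nic(int kind)  { return kind == 1; }
static int match_disk(int kind) { return kind == 2; }

const struct demo_ops demo_nic  = { "nic",  match_nic };
const struct demo_ops demo_disk = { "disk", match_disk };

/* Owned by the Remus-specific file; the generic layer never names it. */
const struct demo_ops *demo_remus_ops[] = { &demo_nic, &demo_disk, NULL };

/* Generic layer: walk the table the caller stored in the state. */
const struct demo_ops *demo_find_ops(const struct demo_cds *cds, int kind)
{
    assert(cds != NULL && cds->ops != NULL);

    for (size_t i = 0; cds->ops[i] != NULL; i++)
        if (cds->ops[i]->match(kind))
            return cds->ops[i];

    return NULL; /* no implementation handles this device */
}
```

Another user of the layer, such as COLO, would store its own table in the
state without the generic code changing.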

Signed-off-by: Wen Congyang 
Signed-off-by: Yang Hongyang 
Acked-by:Ian Campbell 
Reviewed-by: Konrad Rzeszutek Wilk 
Acked-by: Wei Liu 
---
 tools/libxl/libxl_checkpoint_device.c | 10 +-
 tools/libxl/libxl_internal.h  |  2 ++
 tools/libxl/libxl_remus.c |  9 +
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_checkpoint_device.c 
b/tools/libxl/libxl_checkpoint_device.c
index 226f159..bbc6dc4 100644
--- a/tools/libxl/libxl_checkpoint_device.c
+++ b/tools/libxl/libxl_checkpoint_device.c
@@ -17,14 +17,6 @@
 
 #include "libxl_internal.h"
 
-extern const libxl__checkpoint_device_instance_ops remus_device_nic;
-extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
-static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
-&remus_device_nic,
-&remus_device_drbd_disk,
-NULL,
-};
-
 /*- helper functions -*/
 
 static int init_device_subkind(libxl__checkpoint_devices_state *cds)
@@ -172,7 +164,7 @@ static void device_setup_iterate(libxl__egc *egc, 
libxl__ao_device *aodev)
 goto out;
 
 do {
-dev->ops = remus_ops[++dev->ops_index];
+dev->ops = dev->cds->ops[++dev->ops_index];
 if (!dev->ops) {
 libxl_device_nic * nic = NULL;
 libxl_device_disk * disk = NULL;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bde7a15..2847d13 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2895,6 +2895,8 @@ struct libxl__checkpoint_devices_state {
 uint32_t domid;
 libxl__checkpoint_callback *callback;
 int device_kind_flags;
+/* The ops must be pointer array, and the last ops must be NULL. */
+const libxl__checkpoint_device_instance_ops **ops;
 
 /*- private for abstract layer only -*/
 
diff --git a/tools/libxl/libxl_remus.c b/tools/libxl/libxl_remus.c
index d41a439..86f81c3 100644
--- a/tools/libxl/libxl_remus.c
+++ b/tools/libxl/libxl_remus.c
@@ -18,6 +18,14 @@
 
 #include "libxl_internal.h"
 
+extern const libxl__checkpoint_device_instance_ops remus_device_nic;
+extern const libxl__checkpoint_device_instance_ops remus_device_drbd_disk;
+static const libxl__checkpoint_device_instance_ops *remus_ops[] = {
+&remus_device_nic,
+&remus_device_drbd_disk,
+NULL,
+};
+
 /* Remus setup and teardown -*/
 
 static void remus_setup_done(libxl__egc *egc,
@@ -55,6 +63,7 @@ void libxl__remus_setup(libxl__egc *egc,
 cds->ao = ao;
 cds->domid = dss->domid;
 cds->callback = remus_setup_done;
+cds->ops = remus_ops;
 
 dss->sws.checkpoint_callback = remus_checkpoint_stream_written;
 
-- 
2.5.0






Re: [Xen-devel] [PATCH] xen: Avoid left shifting into a sign bit

2016-02-17 Thread Wu, Feng


> -Original Message-
> From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
> Sent: Thursday, February 18, 2016 3:52 AM
> To: Xen-devel 
> Cc: Andrew Cooper; Jan Beulich; Tim Deegan; Ian Campbell;
> Tian, Kevin; Wu, Feng
> Subject: [PATCH] xen: Avoid left shifting into a sign bit
> 
> Clang 3.8 notices, and objects because it is undefined behaviour.
> 
> "error: shifting a negative signed value is undefined
> [-Werror,-Wshift-negative-value]"
> 
> Use unsigned constants rather than signed ones.
> 
> Signed-off-by: Andrew Cooper 
> ---
> CC: Jan Beulich 
> CC: Tim Deegan 
> CC: Ian Campbell 
> CC: Kevin Tian 
> CC: Feng Wu 
> ---
>  xen/common/page_alloc.c   | 2 +-
>  xen/drivers/passthrough/vtd/x86/ats.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)

Acked-by: Feng Wu   for the VT-d part.

Thanks,
Feng
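
For reference, the class of fix being acked can be sketched as follows.
This is an illustration only: the expressions actually changed in
page_alloc.c and ats.c are not quoted in this message, and
demo_align_mask() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask with the low `order` bits cleared, for 0 <= order < 32.
 * Writing this as (-1 << order) left-shifts a negative signed value,
 * which is undefined behaviour in C. Starting from an unsigned constant
 * makes the shift well defined.
 */
uint32_t demo_align_mask(unsigned int order)
{
    assert(order < 32);
    return (uint32_t)(~UINT32_C(0) << order);
}
```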



Re: [Xen-devel] [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model

2016-02-17 Thread Tianyang Chen



On 2/15/2016 10:55 PM, Meng Xu wrote:

Hi Tianyang,

Thanks for the patch! Great work and really quick action! :-)
I will just comment on a few things I noticed quickly, and I look
forward to Dario's comments.

On Mon, Feb 8, 2016 at 11:33 PM, Tianyang Chen wrote:
 > Changes since v4:
 > removed unnecessary replenishment queue checks in vcpu_wake()
 > extended replq_remove() to all cases in vcpu_sleep()
 > used _deadline_queue_insert() helper function for both queues
 > _replq_insert() and _replq_remove() program timer internally
 >
 > Changes since v3:
 > removed running queue.
 > added repl queue to keep track of repl events.
 > timer is now per scheduler.
 > timer is init on a valid cpu in a cpupool.
 >
 > Signed-off-by: Tianyang Chen
 > Signed-off-by: Meng Xu
 > Signed-off-by: Dagaen Golomb
 > ---
 >  xen/common/sched_rt.c |  337
 >  1 file changed, 251 insertions(+), 86 deletions(-)
 >
 > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
 > index 2e5430f..1f0bb7b 100644
 > --- a/xen/common/sched_rt.c
 > +++ b/xen/common/sched_rt.c
 > @@ -16,6 +16,7 @@
 >  #include 
 >  #include 
 >  #include 
 > +#include 
 >  #include 
 >  #include 
 >  #include 
 > @@ -87,7 +88,7 @@
 >  #define RTDS_DEFAULT_BUDGET (MICROSECS(4000))
 >
 >  #define UPDATE_LIMIT_SHIFT  10
 > -#define MAX_SCHEDULE(MILLISECS(1))
 > +
 >  /*
 >   * Flags
 >   */
 > @@ -142,6 +143,12 @@ static cpumask_var_t *_cpumask_scratch;
 >   */
 >  static unsigned int nr_rt_ops;
 >
 > +/* handler for the replenishment timer */
 > +static void repl_handler(void *data);
 > +
 > +/* checks if a timer is active or not */
 > +bool_t active_timer(struct timer* t);
 > +
 >  /*
 >   * Systme-wide private data, include global RunQueue/DepletedQ
 >   * Global lock is referenced by schedule_data.schedule_lock from all
 > @@ -152,7 +159,9 @@ struct rt_private {
 >  struct list_head sdom;  /* list of availalbe domains, used for dump */
 >  struct list_head runq;  /* ordered list of runnable vcpus */
 >  struct list_head depletedq; /* unordered list of depleted vcpus */
 > +struct list_head replq; /* ordered list of vcpus that need replenishment */
 >  cpumask_t tickled;  /* cpus been tickled */
 > +struct timer *repl_timer;   /* replenishment timer */
 >  };
 >
 >  /*
 > @@ -160,6 +169,7 @@ struct rt_private {
 >   */
 >  struct rt_vcpu {
 >  struct list_head q_elem;/* on the runq/depletedq list */
 > +struct list_head replq_elem;/* on the repl event list */
 >
 >  /* Up-pointers */
 >  struct rt_dom *sdom;
 > @@ -213,8 +223,14 @@ static inline struct list_head *rt_depletedq(const struct scheduler *ops)
 >  return &rt_priv(ops)->depletedq;
 >  }
 >
 > +static inline struct list_head *rt_replq(const struct scheduler *ops)
 > +{
 > +return &rt_priv(ops)->replq;
 > +}
 > +
 >  /*
 > - * Queue helper functions for runq and depletedq
 > + * Queue helper functions for runq, depletedq
 > + * and replenishment event queue
 >   */
 >  static int
 >  __vcpu_on_q(const struct rt_vcpu *svc)
 > @@ -228,6 +244,18 @@ __q_elem(struct list_head *elem)
 >  return list_entry(elem, struct rt_vcpu, q_elem);
 >  }
 >
 > +static struct rt_vcpu *
 > +__replq_elem(struct list_head *elem)
 > +{
 > +return list_entry(elem, struct rt_vcpu, replq_elem);
 > +}
 > +
 > +static int
 > +__vcpu_on_replq(const struct rt_vcpu *svc)
 > +{
 > +   return !list_empty(&svc->replq_elem);
 > +}
 > +
 >  /*
 >   * Debug related code, dump vcpu/cpu information
 >   */
 > @@ -288,7 +316,7 @@ rt_dump_pcpu(const struct scheduler *ops, int cpu)
 >  static void
 >  rt_dump(const struct scheduler *ops)
 >  {
 > -struct list_head *runq, *depletedq, *iter;
 > +struct list_head *runq, *depletedq, *replq, *iter;
 >  struct rt_private *prv = rt_priv(ops);
 >  struct rt_vcpu *svc;
 >  struct rt_dom *sdom;
 > @@ -301,6 +329,7 @@ rt_dump(const struct scheduler *ops)
 >
 >  runq = rt_runq(ops);
 >  depletedq = rt_depletedq(ops);
 > +replq = rt_replq(ops);
 >
 >  printk("Global RunQueue info:\n");
 >  list_for_each( iter, runq )
 > @@ -316,6 +345,13 @@ rt_dump(const struct scheduler *ops)
 >  rt_dump_vcpu(ops, svc);
 >  }
 >
 > +printk("Global Replenishment Event info:\n");
 > +list_for_each( iter, replq )
 > +{
 > +svc = __replq_elem(iter);
 > +rt_dump_vcpu(ops, svc);
 > +}
 > +
 >  printk("Domain info:\n");
 >  list_for_each( iter, &prv->sdom )
 >  {
 > @@ -388,6 +424,66 @@ __q_remove(struct rt_vcpu *svc)
 >  }
 >
 >  /*
 > + * Removing a vcpu from the replenishment queue could
 > + * re-program the timer for the next replenishment 

[Xen-devel] [xen-unstable test] 82962: regressions - FAIL

2016-02-17 Thread osstest service owner
flight 82962 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82962/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemut-debianhvm-amd64 15 guest-localmigrate/x10 fail REGR. vs. 82825

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 82825
 build-amd64-rumpuserxen   6 xen-build fail   like 82825
 build-i386-rumpuserxen    6 xen-build fail   like 82825
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 82825
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 82825
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 82825

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-libvirt 14 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm 14 guest-saverestore fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-i386-libvirt  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-raw 13 guest-saverestore fail   never pass
 test-armhf-armhf-libvirt-raw 11 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-amd64-qemuu-nested-amd 16 debian-hvm-install/l1/l2  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore fail never pass
 test-armhf-armhf-xl  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 13 saverestore-support-check fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale  13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 13 saverestore-support-check fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-vhd  11 migrate-support-check fail   never pass
 test-armhf-armhf-xl-vhd  12 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd 11 migrate-support-check fail   never pass

version targeted for testing:
 xen  e33b2921d28c8a3aff2c25fd3d046c7432b3a606
baseline version:
 xen  3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0

Last test of basis    82825  2016-02-16 10:58:18 Z    1 days
Testing same since    82962  2016-02-17 06:23:57 Z    0 days    1 attempts


People who touched revisions under test:
  Ian Campbell 
  Samuel Thibault 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-prev  

Re: [Xen-devel] [PATCH 0/4] libxl: support qemu's network-based block backends

2016-02-17 Thread Jim Fehlig
On 02/17/2016 03:24 AM, Ian Campbell wrote:
> On Tue, 2016-02-16 at 14:45 -0700, Jim Fehlig wrote:
>> xl/libxl already supports qemu's network-based block backends
>> such as nbd and rbd. libvirt has supported configuring network
>> disks for long time too. This series marries the two in the
>> libxl driver and in the xl<->xml converter. Only rbd supported
>> is added in this series. Support for other backends such as nbd
>> and iscsi can be added as a follow-up improvement.
> This all looks sensible to me, FWIW.

Thanks for taking a look!

>
> One question, in patch 3's commit log should the example be double escaping
> the \\ or not? Based on your updates to $xen/docs/misc/xl-disk-
> configuration.txt (posted separately on xen-devel) I had expected they
> would.

Yes, you are correct. The test and conversion code in patch 3 is wrong in that
regard too. I've fixed it in V2.

Regards,
Jim




[Xen-devel] [PATCH V2 2/4] xenconfig: produce key=value disk config syntax in xl formatter

2016-02-17 Thread Jim Fehlig
The most formal form of xl disk configuration uses key=value
syntax to define each configuration item, e.g.

format=raw, vdev=xvda, access=rw, backendtype=phy, target=disksrc

Change the xl disk formatter to produce this syntax, which allows
target= to contain meta info needed to set up a network-based
disksrc (e.g. rbd, nbd, iscsi). For details on xl disk config
format, see $xen-src/docs/misc/xl-disk-configuration.txt

Update the disk config in the tests to use the formal syntax.
But add tests to ensure disks specified with the positional
parameter syntax are correctly converted to XML.
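
As a rough illustration of the output shape, the sketch below formats one
disk in key=value form with target= emitted last, as the patched formatter
does. It is a simplified stand-in, not the libvirt code: struct demo_disk
and demo_format_disk() are hypothetical, and the shared-access and devtype
cases handled by the real formatter are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified description of one disk. */
struct demo_disk {
    const char *format;  /* e.g. "raw", "qcow2" */
    const char *vdev;    /* e.g. "xvda" */
    bool readonly;       /* selects access=ro instead of access=rw */
    const char *backend; /* e.g. "phy", "qdisk"; NULL to omit the key */
    const char *target;  /* path or network disk spec; emitted last */
};

/* Writes the key=value form into out; returns the snprintf-style length. */
int demo_format_disk(const struct demo_disk *d, char *out, size_t len)
{
    assert(d != NULL && out != NULL);
    assert(d->target != NULL && strlen(d->target) > 0);

    return snprintf(out, len,
                    "format=%s,vdev=%s,access=%s,%s%s%starget=%s",
                    d->format, d->vdev,
                    d->readonly ? "ro" : "rw",
                    d->backend ? "backendtype=" : "",
                    d->backend ? d->backend : "",
                    d->backend ? "," : "",
                    d->target);
}
```

Unlike the example line at the top of this message, the generated string
has no spaces after the commas, which matches the strings built in the
diff below.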

Signed-off-by: Jim Fehlig 
---
 src/xenconfig/xen_xl.c | 27 ++-
 .../test-disk-positional-parms-full.cfg| 26 +++
 .../test-disk-positional-parms-full.xml| 54 ++
 .../test-disk-positional-parms-partial.cfg | 26 +++
 .../test-disk-positional-parms-partial.xml | 54 ++
 .../test-fullvirt-direct-kernel-boot.cfg   |  2 +-
 tests/xlconfigdata/test-fullvirt-multiusb.cfg  |  2 +-
 tests/xlconfigdata/test-new-disk.cfg   |  2 +-
 tests/xlconfigdata/test-paravirt-cmdline.cfg   |  2 +-
 tests/xlconfigdata/test-paravirt-maxvcpus.cfg  |  2 +-
 tests/xlconfigdata/test-spice-features.cfg |  2 +-
 tests/xlconfigdata/test-spice.cfg  |  2 +-
 tests/xlconfigdata/test-vif-rate.cfg   |  2 +-
 tests/xlconfigtest.c   |  2 +
 14 files changed, 186 insertions(+), 19 deletions(-)

diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
index be194e3..f3e8b55 100644
--- a/src/xenconfig/xen_xl.c
+++ b/src/xenconfig/xen_xl.c
@@ -587,9 +587,8 @@ xenFormatXLDisk(virConfValuePtr list, virDomainDiskDefPtr 
disk)
 int format = virDomainDiskGetFormat(disk);
 const char *driver = virDomainDiskGetDriver(disk);
 
-/* target */
-virBufferAsprintf(, "%s,", src);
 /* format */
+virBufferAddLit(, "format=");
 switch (format) {
 case VIR_STORAGE_FILE_RAW:
 virBufferAddLit(, "raw,");
@@ -609,31 +608,37 @@ xenFormatXLDisk(virConfValuePtr list, virDomainDiskDefPtr 
disk)
 }
 
 /* device */
-virBufferAdd(, disk->dst, -1);
-
-virBufferAddLit(, ",");
+virBufferAsprintf(, "vdev=%s,", disk->dst);
 
+/* access */
+virBufferAddLit(, "access=");
 if (disk->src->readonly)
-virBufferAddLit(, "r,");
+virBufferAddLit(, "ro,");
 else if (disk->src->shared)
 virBufferAddLit(, "!,");
 else
-virBufferAddLit(, "w,");
+virBufferAddLit(, "rw,");
 if (disk->transient) {
 virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
_("transient disks not supported yet"));
 goto cleanup;
 }
 
+/* backendtype */
+virBufferAddLit(, "backendtype=");
 if (STREQ_NULLABLE(driver, "qemu"))
-virBufferAddLit(, "backendtype=qdisk");
+virBufferAddLit(, "qdisk,");
 else if (STREQ_NULLABLE(driver, "tap"))
-virBufferAddLit(, "backendtype=tap");
+virBufferAddLit(, "tap,");
 else if (STREQ_NULLABLE(driver, "phy"))
-virBufferAddLit(, "backendtype=phy");
+virBufferAddLit(, "phy,");
 
+/* devtype */
 if (disk->device == VIR_DOMAIN_DISK_DEVICE_CDROM)
-virBufferAddLit(, ",devtype=cdrom");
+virBufferAddLit(, "devtype=cdrom,");
+
+/* target */
+virBufferAsprintf(, "target=%s", src);
 
 if (virBufferCheckError() < 0)
 goto cleanup;
diff --git a/tests/xlconfigdata/test-disk-positional-parms-full.cfg 
b/tests/xlconfigdata/test-disk-positional-parms-full.cfg
new file mode 100644
index 000..026e451
--- /dev/null
+++ b/tests/xlconfigdata/test-disk-positional-parms-full.cfg
@@ -0,0 +1,26 @@
+name = "XenGuest2"
+uuid = "c7a5fdb2-cdaf-9455-926a-d65c16db1809"
+maxmem = 579
+memory = 394
+vcpus = 1
+pae = 1
+acpi = 1
+apic = 1
+hap = 0
+viridian = 0
+rtc_timeoffset = 0
+localtime = 0
+on_poweroff = "destroy"
+on_reboot = "restart"
+on_crash = "restart"
+device_model = "/usr/lib/xen/bin/qemu-dm"
+sdl = 0
+vnc = 1
+vncunused = 1
+vnclisten = "127.0.0.1"
+vif = [ "mac=00:16:3e:66:92:9c,bridge=xenbr1,script=vif-bridge,model=e1000" ]
+parallel = "none"
+serial = "none"
+builder = "hvm"
+boot = "d"
+disk = [ "/dev/HostVG/XenGuest2,raw,hda,rw,backendtype=phy", 
"/var/lib/libvirt/images/XenGuest2-home,qcow2,hdb,rw", 
"/root/boot.iso,raw,hdc,ro,devtype=cdrom" ]
diff --git a/tests/xlconfigdata/test-disk-positional-parms-full.xml 
b/tests/xlconfigdata/test-disk-positional-parms-full.xml
new file mode 100644
index 000..49f6dbe
--- /dev/null
+++ b/tests/xlconfigdata/test-disk-positional-parms-full.xml
@@ -0,0 +1,54 @@
+
+  XenGuest2
+  c7a5fdb2-cdaf-9455-926a-d65c16db1809
+  592896
+  403456
+  1
+  
+hvm
+/usr/lib/xen/boot/hvmloader
+
+  
+  
+
+
+
+  
+  
+  destroy
+  restart
+  

[Xen-devel] [PATCH V2 0/4] libxl: support qemu's network-based block backends

2016-02-17 Thread Jim Fehlig

xl/libxl already supports qemu's network-based block backends
such as nbd and rbd. libvirt has supported configuring network
disks for a long time too. This series marries the two in the
libxl driver and in the xl<->xml converter. Only rbd support
is added in this series. Support for other backends such as nbd
and iscsi can be added as a follow-up improvement.

Patch 1 is super trivial and contains no functional changes.

Patch 2 changes the xl disk configuration produced by the
xml->xl converter to use the formal key=value syntax described
in xl-disk-configuration.txt.

Patch 3 adds support for converting rbd info between xl and xml
config formats.

Patch 4 adds support for rbd disks in the libxl driver.

In V2:
Change commit msg, test, and code in patch3 to escape literal
backslashes with a backslash.

Jim Fehlig (4):
  xenconfig: replace text 'xm' with 'xl' in xlconfigtest
  xenconfig: produce key=value disk config syntax in xl formatter
  xenconfig: support xl<->xml conversion of rbd disk devices
  libxl: add support for rbd qdisk

 src/libxl/libxl_conf.c | 192 -
 src/xenconfig/xen_xl.c | 176 +--
 .../test-disk-positional-parms-full.cfg|  26 +++
 .../test-disk-positional-parms-full.xml|  54 ++
 .../test-disk-positional-parms-partial.cfg |  26 +++
 .../test-disk-positional-parms-partial.xml |  54 ++
 .../test-fullvirt-direct-kernel-boot.cfg   |   2 +-
 tests/xlconfigdata/test-fullvirt-multiusb.cfg  |   2 +-
 tests/xlconfigdata/test-new-disk.cfg   |   2 +-
 tests/xlconfigdata/test-paravirt-cmdline.cfg   |   2 +-
 tests/xlconfigdata/test-paravirt-maxvcpus.cfg  |   2 +-
 tests/xlconfigdata/test-rbd-multihost-noauth.cfg   |  26 +++
 tests/xlconfigdata/test-rbd-multihost-noauth.xml   |  51 ++
 tests/xlconfigdata/test-spice-features.cfg |   2 +-
 tests/xlconfigdata/test-spice.cfg  |   2 +-
 tests/xlconfigdata/test-vif-rate.cfg   |   2 +-
 tests/xlconfigtest.c   |  37 ++--
 17 files changed, 618 insertions(+), 40 deletions(-)
 create mode 100644 tests/xlconfigdata/test-disk-positional-parms-full.cfg
 create mode 100644 tests/xlconfigdata/test-disk-positional-parms-full.xml
 create mode 100644 tests/xlconfigdata/test-disk-positional-parms-partial.cfg
 create mode 100644 tests/xlconfigdata/test-disk-positional-parms-partial.xml
 create mode 100644 tests/xlconfigdata/test-rbd-multihost-noauth.cfg
 create mode 100644 tests/xlconfigdata/test-rbd-multihost-noauth.xml

-- 
2.1.4




[Xen-devel] [PATCH V2 1/4] xenconfig: replace text 'xm' with 'xl' in xlconfigtest

2016-02-17 Thread Jim Fehlig
While at it, improve a few comments. No functional change.

Signed-off-by: Jim Fehlig 
---
 tests/xlconfigtest.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/tests/xlconfigtest.c b/tests/xlconfigtest.c
index 4b2f28f..aa53ed8 100644
--- a/tests/xlconfigtest.c
+++ b/tests/xlconfigtest.c
@@ -1,5 +1,5 @@
 /*
- * xlconfigtest.c: Test backend for xl_internal config file handling
+ * xlconfigtest.c: Test xl.cfg(5) <-> domXML config conversions
  *
  * Copyright (C) 2007, 2010-2011, 2014 Red Hat, Inc.
  * Copyright (c) 2015 SUSE LINUX Products GmbH, Nuernberg, Germany.
@@ -42,20 +42,22 @@
 
 static virCapsPtr caps;
 static virDomainXMLOptionPtr xmlopt;
+
 /*
- * parses the xml, creates a domain def and compare with equivalent xm config
+ * Parses domXML to virDomainDef object, which is then converted to xl.cfg(5)
+ * config and compared with expected config.
  */
 static int
-testCompareParseXML(const char *xmcfg, const char *xml)
+testCompareParseXML(const char *xlcfg, const char *xml)
 {
-char *gotxmcfgData = NULL;
+char *gotxlcfgData = NULL;
 virConfPtr conf = NULL;
 virConnectPtr conn = NULL;
 int wrote = 4096;
 int ret = -1;
 virDomainDefPtr def = NULL;
 
-if (VIR_ALLOC_N(gotxmcfgData, wrote) < 0)
+if (VIR_ALLOC_N(gotxlcfgData, wrote) < 0)
 goto fail;
 
 conn = virGetConnect();
@@ -73,17 +75,17 @@ testCompareParseXML(const char *xmcfg, const char *xml)
 if (!(conf = xenFormatXL(def, conn)))
 goto fail;
 
-if (virConfWriteMem(gotxmcfgData, &wrote, conf) < 0)
+if (virConfWriteMem(gotxlcfgData, &wrote, conf) < 0)
 goto fail;
-gotxmcfgData[wrote] = '\0';
+gotxlcfgData[wrote] = '\0';
 
-if (virtTestCompareToFile(gotxmcfgData, xmcfg) < 0)
+if (virtTestCompareToFile(gotxlcfgData, xlcfg) < 0)
 goto fail;
 
 ret = 0;
 
  fail:
-VIR_FREE(gotxmcfgData);
+VIR_FREE(gotxlcfgData);
 if (conf)
 virConfFree(conf);
 virDomainDefFree(def);
@@ -91,13 +93,15 @@ testCompareParseXML(const char *xmcfg, const char *xml)
 
 return ret;
 }
+
 /*
- * parses the xl config, develops domain def and compares with equivalent xm config
+ * Parses xl.cfg(5) config to virDomainDef object, which is then converted to
+ * domXML and compared to expected XML.
  */
 static int
-testCompareFormatXML(const char *xmcfg, const char *xml)
+testCompareFormatXML(const char *xlcfg, const char *xml)
 {
-char *xmcfgData = NULL;
+char *xlcfgData = NULL;
 char *gotxml = NULL;
 virConfPtr conf = NULL;
 int ret = -1;
@@ -107,10 +111,10 @@ testCompareFormatXML(const char *xmcfg, const char *xml)
 conn = virGetConnect();
 if (!conn) goto fail;
 
-if (virtTestLoadFile(xmcfg, &xmcfgData) < 0)
+if (virtTestLoadFile(xlcfg, &xlcfgData) < 0)
 goto fail;
 
-if (!(conf = virConfReadMem(xmcfgData, strlen(xmcfgData), 0)))
+if (!(conf = virConfReadMem(xlcfgData, strlen(xlcfgData), 0)))
 goto fail;
 
 if (!(def = xenParseXL(conf, caps, xmlopt)))
@@ -128,7 +132,7 @@ testCompareFormatXML(const char *xmcfg, const char *xml)
  fail:
 if (conf)
 virConfFree(conf);
-VIR_FREE(xmcfgData);
+VIR_FREE(xlcfgData);
 VIR_FREE(gotxml);
 virDomainDefFree(def);
 virObjectUnref(conn);
-- 
2.1.4




[Xen-devel] [PATCH V2 4/4] libxl: add support for rbd qdisk

2016-02-17 Thread Jim Fehlig
xl/libxl already supports qemu's network-based block backends
such as nbd and rbd. libvirt has supported configuring such
disks for a long time too. This patch adds support for rbd
disks in the libxl driver by generating an rbd device URL from
the virDomainDiskDef object. The URL is passed to libxl via the
pdev_path field of libxl_device_disk struct. libxl then passes
the URL to qemu for consumption by the rbd backend.

Signed-off-by: Jim Fehlig 
---
 src/libxl/libxl_conf.c | 192 -
 1 file changed, 191 insertions(+), 1 deletion(-)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 48b77d2..5133299 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -46,6 +46,7 @@
 #include "libxl_conf.h"
 #include "libxl_utils.h"
 #include "virstoragefile.h"
+#include "base64.h"
 
 
 #define VIR_FROM_THIS VIR_FROM_LIBXL
@@ -920,17 +921,206 @@ libxlDomainGetEmulatorType(const virDomainDef *def)
 return ret;
 }
 
+static char *
+libxlGetSecretString(virConnectPtr conn,
+ const char *scheme,
+ bool encoded,
+ virStorageAuthDefPtr authdef,
+ virSecretUsageType secretUsageType)
+{
+size_t secret_size;
+virSecretPtr sec = NULL;
+char *secret = NULL;
+char uuidStr[VIR_UUID_STRING_BUFLEN];
+
+/* look up secret */
+switch (authdef->secretType) {
+case VIR_STORAGE_SECRET_TYPE_UUID:
+sec = virSecretLookupByUUID(conn, authdef->secret.uuid);
+virUUIDFormat(authdef->secret.uuid, uuidStr);
+break;
+case VIR_STORAGE_SECRET_TYPE_USAGE:
+sec = virSecretLookupByUsage(conn, secretUsageType,
+ authdef->secret.usage);
+break;
+}
+
+if (!sec) {
+if (authdef->secretType == VIR_STORAGE_SECRET_TYPE_UUID) {
+virReportError(VIR_ERR_NO_SECRET,
+   _("%s no secret matches uuid '%s'"),
+   scheme, uuidStr);
+} else {
+virReportError(VIR_ERR_NO_SECRET,
+   _("%s no secret matches usage value '%s'"),
+   scheme, authdef->secret.usage);
+}
+goto cleanup;
+}
+
+secret = (char *)conn->secretDriver->secretGetValue(sec, &secret_size, 0,
+    VIR_SECRET_GET_VALUE_INTERNAL_CALL);
+if (!secret) {
+if (authdef->secretType == VIR_STORAGE_SECRET_TYPE_UUID) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("could not get value of the secret for "
+ "username '%s' using uuid '%s'"),
+   authdef->username, uuidStr);
+} else {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("could not get value of the secret for "
+ "username '%s' using usage value '%s'"),
+   authdef->username, authdef->secret.usage);
+}
+goto cleanup;
+}
+
+if (encoded) {
+char *base64 = NULL;
+
+base64_encode_alloc(secret, secret_size, &base64);
+VIR_FREE(secret);
+if (!base64) {
+virReportOOMError();
+goto cleanup;
+}
+secret = base64;
+}
+
+ cleanup:
+virObjectUnref(sec);
+return secret;
+}
+
+static char *
+libxlMakeNetworkDiskSrcStr(virStorageSourcePtr src,
+   const char *username,
+   const char *secret)
+{
+char *ret = NULL;
+virBuffer buf = VIR_BUFFER_INITIALIZER;
+size_t i;
+
+switch ((virStorageNetProtocol) src->protocol) {
+case VIR_STORAGE_NET_PROTOCOL_NBD:
+case VIR_STORAGE_NET_PROTOCOL_HTTP:
+case VIR_STORAGE_NET_PROTOCOL_HTTPS:
+case VIR_STORAGE_NET_PROTOCOL_FTP:
+case VIR_STORAGE_NET_PROTOCOL_FTPS:
+case VIR_STORAGE_NET_PROTOCOL_TFTP:
+case VIR_STORAGE_NET_PROTOCOL_ISCSI:
+case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
+case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
+case VIR_STORAGE_NET_PROTOCOL_LAST:
+case VIR_STORAGE_NET_PROTOCOL_NONE:
+virReportError(VIR_ERR_NO_SUPPORT,
+   _("Unsupported network block protocol '%s'"),
+   virStorageNetProtocolTypeToString(src->protocol));
+goto cleanup;
+
+case VIR_STORAGE_NET_PROTOCOL_RBD:
+if (strchr(src->path, ':')) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("':' not allowed in RBD source volume name '%s'"),
+   src->path);
+goto cleanup;
+}
+
+virBufferStrcat(&buf, "rbd:", src->path, NULL);
+
+if (username) {
+virBufferEscape(&buf, '\\', ":", ":id=%s", username);
+virBufferEscape(&buf, '\\', ":",
+":key=%s:auth_supported=cephx\\;none",
+

[Xen-devel] [PATCH V2 3/4] xenconfig: support xl<->xml conversion of rbd disk devices

2016-02-17 Thread Jim Fehlig
The target= setting in xl disk configuration can be used to encode
meta info that is meaningful to a backend. Leverage this fact to
support qdisk network disk types such as rbd. E.g. <disk> config
such as

   
 
 
   
   
   
 
 
 
   

can be converted to the following xl config (and vice versa)

  disk = [ "format=raw,vdev=hdb,access=rw,backendtype=qdisk,target=rbd:pool/image:auth_supported=none:mon_host=mon1.example.org\\:6321\\;mon2.example.org\\:6322\\;mon3.example.org\\:6322" ]

Note that in xl disk config, a literal backslash in target= must
be escaped with a backslash. Conversion of <auth> config is not
handled in this patch, but can be done in a follow-up patch.
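
To illustrate the escaping rule above, here is a minimal standalone sketch of the unescape step, assuming plain C strings. The helper name unescape_backslashes() is hypothetical and not part of libvirt; the patch itself uses virStringReplace() for this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a libvirt function): collapse each pair of
 * backslashes in an xl target= string into a single backslash.
 * Returns a malloc'd string the caller must free, or NULL on
 * allocation failure.
 */
static char *
unescape_backslashes(const char *in)
{
    size_t len, i, j = 0;
    char *out;

    assert(in != NULL);
    len = strlen(in);
    out = malloc(len + 1);      /* the result is never longer than the input */
    if (!out)
        return NULL;

    for (i = 0; i < len; i++) {
        /* Skip the first backslash of an escaped pair. */
        if (in[i] == '\\' && in[i + 1] == '\\')
            i++;
        out[j++] = in[i];
    }
    out[j] = '\0';
    return out;
}
```

With the xl value mon1.example.org\\:6321 as input, the helper returns mon1.example.org\:6321, which is the form the rbd colon-string parser is expected to receive.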

Also add a test for the conversions.

Signed-off-by: Jim Fehlig 
---

v2:
Change commit msg, test, and code to escape literal backslash
with a backslash.

 src/xenconfig/xen_xl.c   | 153 +--
 tests/xlconfigdata/test-rbd-multihost-noauth.cfg |  26 
 tests/xlconfigdata/test-rbd-multihost-noauth.xml |  51 
 tests/xlconfigtest.c |   1 +
 4 files changed, 224 insertions(+), 7 deletions(-)

diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
index f3e8b55..585ef9b 100644
--- a/src/xenconfig/xen_xl.c
+++ b/src/xenconfig/xen_xl.c
@@ -246,6 +246,32 @@ xenParseXLSpice(virConfPtr conf, virDomainDefPtr def)
 return -1;
 }
 
+
+static int
+xenParseXLDiskSrc(virDomainDiskDefPtr disk, char *srcstr)
+{
+char *tmpstr = NULL;
+int ret = -1;
+
+if (STRPREFIX(srcstr, "rbd:")) {
+tmpstr = virStringReplace(srcstr, "\\\\", "\\");
+
+virDomainDiskSetType(disk, VIR_STORAGE_TYPE_NETWORK);
+disk->src->protocol = VIR_STORAGE_NET_PROTOCOL_RBD;
+ret = virStorageSourceParseRBDColonString(tmpstr, disk->src);
+} else {
+if (virDomainDiskSetSource(disk, srcstr) < 0)
+goto cleanup;
+
+ret = 0;
+}
+
+ cleanup:
+VIR_FREE(tmpstr);
+return ret;
+}
+
+
 /*
  * For details on xl disk config syntax, see
  * docs/misc/xl-disk-configuration.txt in the Xen sources.  The important
@@ -311,12 +337,12 @@ xenParseXLDisk(virConfPtr conf, virDomainDefPtr def)
 if (!(disk = virDomainDiskDefNew(NULL)))
 goto fail;
 
+if (xenParseXLDiskSrc(disk, libxldisk->pdev_path) < 0)
+goto fail;
+
 if (VIR_STRDUP(disk->dst, libxldisk->vdev) < 0)
 goto fail;
 
-if (virDomainDiskSetSource(disk, libxldisk->pdev_path) < 0)
-goto fail;
-
 disk->src->readonly = !libxldisk->readwrite;
 disk->removable = libxldisk->removable;
 
@@ -358,7 +384,8 @@ xenParseXLDisk(virConfPtr conf, virDomainDefPtr def)
 case LIBXL_DISK_BACKEND_UNKNOWN:
 if (virDomainDiskSetDriver(disk, "qemu") < 0)
 goto fail;
-virDomainDiskSetType(disk, VIR_STORAGE_TYPE_FILE);
+if (virDomainDiskGetType(disk) == VIR_STORAGE_TYPE_NONE)
+virDomainDiskSetType(disk, VIR_STORAGE_TYPE_FILE);
 break;
 
 case LIBXL_DISK_BACKEND_TAP:
@@ -578,14 +605,115 @@ xenFormatXLOS(virConfPtr conf, virDomainDefPtr def)
 }
 
 
+static char *
+xenFormatXLDiskSrcNet(virStorageSourcePtr src)
+{
+char *ret = NULL;
+virBuffer buf = VIR_BUFFER_INITIALIZER;
+size_t i;
+
+switch ((virStorageNetProtocol) src->protocol) {
+case VIR_STORAGE_NET_PROTOCOL_NBD:
+case VIR_STORAGE_NET_PROTOCOL_HTTP:
+case VIR_STORAGE_NET_PROTOCOL_HTTPS:
+case VIR_STORAGE_NET_PROTOCOL_FTP:
+case VIR_STORAGE_NET_PROTOCOL_FTPS:
+case VIR_STORAGE_NET_PROTOCOL_TFTP:
+case VIR_STORAGE_NET_PROTOCOL_ISCSI:
+case VIR_STORAGE_NET_PROTOCOL_GLUSTER:
+case VIR_STORAGE_NET_PROTOCOL_SHEEPDOG:
+case VIR_STORAGE_NET_PROTOCOL_LAST:
+case VIR_STORAGE_NET_PROTOCOL_NONE:
+virReportError(VIR_ERR_NO_SUPPORT,
+   _("Unsupported network block protocol '%s'"),
+   virStorageNetProtocolTypeToString(src->protocol));
+goto cleanup;
+
+case VIR_STORAGE_NET_PROTOCOL_RBD:
+if (strchr(src->path, ':')) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("':' not allowed in RBD source volume name '%s'"),
+   src->path);
+goto cleanup;
+}
+
+virBufferStrcat(&buf, "rbd:", src->path, NULL);
+
+virBufferAddLit(&buf, ":auth_supported=none");
+
+if (src->nhosts > 0) {
+virBufferAddLit(&buf, ":mon_host=");
+for (i = 0; i < src->nhosts; i++) {
+if (i)
+virBufferAddLit(&buf, ";");
+
+/* assume host containing : is ipv6 */
+if (strchr(src->hosts[i].name, ':'))
+virBufferEscape(&buf, '\\', 

[Xen-devel] [xen-unstable-coverity test] 82984: all pass - PUSHED

2016-02-17 Thread osstest service owner
flight 82984 xen-unstable-coverity real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82984/

Perfect :-)
All tests in this flight passed
version targeted for testing:
 xen  3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0
baseline version:
 xen  483ad4439f7fc71e12d46dae516f2b9ab2b977ad

Last test of basis    82477  2016-02-14 09:19:06 Z    3 days
Testing same since    82984  2016-02-17 09:27:56 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Corneliu ZUZU 
  Doug Goldstein 
  George Dunlap 
  Harmandeep Kaur 
  Ian Campbell 
  Ian Campbell 
  Ian Jackson 
  Jan Beulich 
  Kevin Tian 
  Konrad Rzeszutek Wilk 
  Olaf Hering 
  Paul Durrant 
  Razvan Cojocaru 
  Roger Pau Monné 
  Stefano Stabellini 
  Tamas K Lengyel 
  Tamas K Lengyel 
  Tim Deegan 
  Wei Liu 

jobs:
 coverity-amd64   pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=xen-unstable-coverity
+ revision=3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push xen-unstable-coverity 3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0
+ branch=xen-unstable-coverity
+ revision=3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . ./cri-common
++ . ./cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=xen
+ xenbranch=xen-unstable-coverity
+ qemuubranch=qemu-upstream-unstable-coverity
+ qemuubranch=qemu-upstream-unstable
+ '[' xxen = xlinux ']'
+ linuxbranch=
+ '[' xqemu-upstream-unstable = x ']'
+ select_prevxenbranch
++ ./cri-getprevxenbranch xen-unstable-coverity
+ prevxenbranch=xen-unstable
+ '[' x3fba5f5ec6bd2c9375735ae09d9615ccb1d7c0d0 = x ']'
+ : tested/2.6.39.x
+ . ./ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"OsstestUpstream"} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/qemu-xen-traditional.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src '[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig 

Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Borislav Petkov
On Wed, Feb 17, 2016 at 05:39:23PM -0500, Boris Ostrovsky wrote:
> Hmm. I think you are right --- I was following wrong branch of the 'if'
> statement. We are always going straight to note_page().

Yap. The is_hypervisor_range() - without the "!" - does note_page().

> Then yes, we should be able to remove paravirt_enabled(). Sorry for
> the noise.

I'll do a formal patch tomorrow.

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.



Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Boris Ostrovsky

On 02/17/2016 05:18 PM, Andy Lutomirski wrote:
> On Wed, Feb 17, 2016 at 2:03 PM, Borislav Petkov  wrote:
>> On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
>>> That's exactly the point: if something is mapped it's an error for a
>>> non-PV kernel.
>>
>> How would something be mapped there? __PAGE_OFFSET is
>> 0xffff880000000000.
>>
>> Or are you thinking about some insanely b0rked kernel code mapping stuff
>> in there?
>>
>>> By removing paravirt_enabled() we may miss those errors. Worse, I think we
>>> may even crash while doing pagetable walk (although it's probably better to
>>> crash here than to use an unexpected translation in real code somewhere)
>>
>> Well, if this is the only site which keeps paravirt_enabled() from being
>> buried, we need to think about a better way to detect a hypervisor.
>> Maybe we should look at x86_hyper, use CPUID(0x4...) or something else.
>>
>> What's your preference?
>
> I'm confused.  Isn't it the other way around?  That is, checking for
> the hypervisor range on all systems should be safer, right?  Or am I
> missing something?

Hmm. I think you are right --- I was following wrong branch of the 'if'
statement. We are always going straight to note_page().

Then yes, we should be able to remove paravirt_enabled(). Sorry for the
noise.


-boris



Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Luis R. Rodriguez
On Wed, Feb 17, 2016 at 11:03:04PM +0100, Borislav Petkov wrote:
> On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
> > That's exactly the point: if something is mapped it's an error for a
> > non-PV kernel.
> 
> How would something be mapped there? __PAGE_OFFSET is
> 0xffff880000000000.
> 
> Or are you thinking about some insanely b0rked kernel code mapping stuff
> in there?
> 
> > By removing paravirt_enabled() we may miss those errors. Worse, I think we
> > may even crash while doing pagetable walk (although it's probably better to
> > crash here than to use an unexpected translation in real code somewhere)
> 
> Well, if this is the only site which keeps paravirt_enabled() from being
> buried, we need to think about a better way to detect a hypervisor.
> Maybe we should look at x86_hyper, use CPUID(0x4...) or something else.
> 
> What's your preference?

I can think of two possibilities but let's also address _why_ we can't
replace it: the semantic gap.

1) Can't we just set_memory_np() to cause a page fault on that range
   to catch invalid access?

2) Add hypervisor type

I've been lobbying for a new boot protocol hypervisor type and hypervisor data
pointer extensions, much in the same way subarch and subarch_data was added for
x86 boot protocol 2.07.  This would be an extension to help fill in the gaps,
it would also make it accessible to super early code and could help those of
us who care about super clean inits to work towards these goals. Right now we
can rely on the subarch for most Xen concerns but with Xen HVM and future Xen
HVMLite things get more complex and as noted by hpa the subarch was not
designed as a 'hypervisor type', such a thing should be considered separately.

So we'd knock about 2-3 birds with 1 stone here:

  a) there is a semantic gap between early access to hypervisor type of
     code between asm boot and when setup_arch() is called, only after
     init_hypervisor_platform() is called (in setup_arch()) can things
     such as cpu_has_hypervisor() or derivatives such as kvm_para_available()
     be correctly used, as only then will that information be correct across
     the board. Part of this issue is what gave rise to paravirt_enabled()
     hackeries in the first place. We have subarch but that is not to be used
     to set things such as hypervisor type, I'll soon post a patch to clarify
     the exact use case for the subarch.
     Since we have no uniform way to detect hypervisor types we are starting
     to see custom hacks. For an instance of a hack propagated to drivers now,
     see sound/pci/intel8x0.c use of kvm_para_available().
  b) help put an end to paravirt_enabled() for cases we can't replace
  c) provide an early access mechanism to hypervisor type. This should
     help towards unifying inits by enabling further stubs on early and
     post routine calls. This is future long term possible work:

     With a hypervisor type and hypervisor custom data pointer, we'd strive
     to work to make xen use the same startup_32() and startup_64() entries,
     with stubs possible using the hypervisor type at the start / end as
     follows:

     startup_32()                   startup_64()
          |                              |
          |                              |
          V                              V
  pre_hypervisor_stub_32()       pre_hypervisor_stub_64()
          |                              |
          |                              |
          V                              V
  [existing startup_32()]        [existing startup_64()]
          |                              |
          |                              |
          V                              V
  post_hypervisor_stub_32()      post_hypervisor_stub_64()

The pre_hypervisor_stub_32() would have much of the code of
the newly proposed hvmlite_start_xen() but for 32-bit,
pre_hypervisor_stub_64() would have the 64-bits.

I realize Andrew has not been a fan of the idea of Xen setting on the
zero page *any* parameter but he's also noted an alternative is to just
lobby for Grub to boot Xen kernels and then we can rely on Grub to
set it for Linux. The only issue with this is it seems this doesn't
address stubdomains which don't use grub?

  Luis



Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Boris Ostrovsky

On 02/17/2016 05:03 PM, Borislav Petkov wrote:
> On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
>> That's exactly the point: if something is mapped it's an error for a
>> non-PV kernel.
>
> How would something be mapped there? __PAGE_OFFSET is
> 0xffff880000000000.
>
> Or are you thinking about some insanely b0rked kernel code mapping stuff
> in there?

The latter. This is to detect things that clearly shouldn't be happening.

>> By removing paravirt_enabled() we may miss those errors. Worse, I think we
>> may even crash while doing pagetable walk (although it's probably better to
>> crash here than to use an unexpected translation in real code somewhere)
>
> Well, if this is the only site which keeps paravirt_enabled() from being
> buried, we need to think about a better way to detect a hypervisor.
> Maybe we should look at x86_hyper, use CPUID(0x4...)

Can't use CPUID 0x40000000 because it will return hypervisor (Xen or
otherwise) for non-PV guests as well. In Xen's case, you can't determine
guest type from hypervisor leaves.

> or something else.

We could say xen_pv_domain(). But this means using Xen-specific code in
x86-generic file to detect things specified by ABI. I don't know if I'd
like that.

-boris

> What's your preference?
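
On the CPUID point above, a minimal sketch of what the hypervisor leaf can and cannot tell you may help. It assumes the usual convention that leaf 0x40000000 returns a 12-byte vendor signature in EBX, ECX and EDX; the helper names are hypothetical, and the register values are passed in as plain integers so the sketch does not depend on executing CPUID.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: rebuild the 12-byte vendor signature from the
 * EBX, ECX and EDX values of the hypervisor CPUID leaf (little-endian
 * byte order within each register). 'sig' needs room for 13 bytes.
 */
static void
hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx, char sig[13])
{
    const uint32_t regs[3] = { ebx, ecx, edx };
    int r, b;

    assert(sig != NULL);
    for (r = 0; r < 3; r++)
        for (b = 0; b < 4; b++)
            sig[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
    sig[12] = '\0';
}

/*
 * Returns 1 if the signature names Xen. Note that this only identifies
 * the hypervisor; it carries no information about PV vs. HVM guests.
 */
static int
hv_is_xen(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[13];

    hv_signature(ebx, ecx, edx, sig);
    return strcmp(sig, "XenVMMXenVMM") == 0;
}
```

The same signature is seen by PV and HVM guests alike, which is why a check built on it cannot stand in for paravirt_enabled() here.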






Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Andy Lutomirski
On Wed, Feb 17, 2016 at 2:03 PM, Borislav Petkov  wrote:
> On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
>> That's exactly the point: if something is mapped it's an error for a
>> non-PV kernel.
>
> How would something be mapped there? __PAGE_OFFSET is
> 0xffff880000000000.
>
> Or are you thinking about some insanely b0rked kernel code mapping stuff
> in there?
>
>> By removing paravirt_enabled() we may miss those errors. Worse, I think we
>> may even crash while doing pagetable walk (although it's probably better to
>> crash here than to use an unexpected translation in real code somewhere)
>
> Well, if this is the only site which keeps paravirt_enabled() from being
> buried, we need to think about a better way to detect a hypervisor.
> Maybe we should look at x86_hyper, use CPUID(0x4...) or something else.
>
> What's your preference?

I'm confused.  Isn't it the other way around?  That is, checking for
the hypervisor range on all systems should be safer, right?  Or am I
missing something?

--Andy



Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Borislav Petkov
On Wed, Feb 17, 2016 at 04:21:56PM -0500, Boris Ostrovsky wrote:
> That's exactly the point: if something is mapped it's an error for a
> non-PV kernel.

How would something be mapped there? __PAGE_OFFSET is
0xffff880000000000.

Or are you thinking about some insanely b0rked kernel code mapping stuff
in there?

> By removing paravirt_enabled() we may miss those errors. Worse, I think we
> may even crash while doing pagetable walk (although it's probably better to
> crash here than to use an unexpected translation in real code somewhere)

Well, if this is the only site which keeps paravirt_enabled() from being
buried, we need to think about a better way to detect a hypervisor.
Maybe we should look at x86_hyper, use CPUID(0x4...) or something else.

What's your preference?

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.



Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Boris Ostrovsky

On 02/17/2016 03:49 PM, Borislav Petkov wrote:
> On Wed, Feb 17, 2016 at 12:07:13PM -0800, Luis R. Rodriguez wrote:
>> OK so here's a wiki to keep track of progress of the difference uses:
>>
>> http://kernelnewbies.org/KernelProjects/remove-paravirt-enabled
>>
>> It seems we have a resolution one way or another for all except for
>> the use on arch/x86/mm/dump_pagetables.c, is that right?
>
> Why not?
>
> I think we should simply check the range as ffff800000000000 -
> ffff87ffffffffff is practically an ABI and nothing should be mapped

^

That's exactly the point: if something is mapped it's an error for a
non-PV kernel.

By removing paravirt_enabled() we may miss those errors. Worse, I think
we may even crash while doing pagetable walk (although it's probably
better to crash here than to use an unexpected translation in real code
somewhere)

-boris

> there anyway. No need for paravirt_enabled() there either.






Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Luis R. Rodriguez
On Wed, Feb 17, 2016 at 12:49 PM, Borislav Petkov  wrote:
> On Wed, Feb 17, 2016 at 12:07:13PM -0800, Luis R. Rodriguez wrote:
>> OK so here's a wiki to keep track of progress of the difference uses:
>>
>> http://kernelnewbies.org/KernelProjects/remove-paravirt-enabled
>>
>> It seems we have a resolution one way or another for all except for
>> the use on arch/x86/mm/dump_pagetables.c, is that right?
>
> Why not?
>
> I think we should simply check the range as ffff800000000000 -
> ffff87ffffffffff is practically an ABI and nothing should be mapped
> there anyway. No need for paravirt_enabled() there either.

Provided someone on the xen side acks, then great! We'd have full
coverage to remove all uses soon and kill paravirt_enabled() for good.
It may take some time to run tests of this to get a full sense of
correctness but perhaps in the future it may be easier if 0-day gets
some basic Xen tests (or embraces the Xen test suite) as was discussed
as possible a while ago. lguest may need some basic tests too, but I'm
not even sure what type of tests we'd run against lguest.

 Luis



Re: [Xen-devel] [RFC PATCH 6/7] common/pv-iommu: Add foreign ops to PV-IOMMU interface

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:10:34AM +, Malcolm Crossley wrote:
> Currently the ops are X86 only due to a dependency on the
> X86 only B2M implementation.
> 
> Foreign ops conform to draft D of PV-IOMMU design.
> 
> XSM control has been implemented to allow full security control of
> these privileged operations.

But you missed the declaration of them in access_vectors, the
default policy in xen.te and also the implementation in the hooks.c

And that should all be part of the code that adds the new hypercall.

> 
> Page references and page locking are taken before using B2M
> interface which is mandated by the B2M interface itself.
> 
> Signed-off-by: Malcolm Crossley 
> --
> Cc: dgde...@tycho.nsa.gov
> Cc: jbeul...@suse.com
> Cc: ian.campb...@citrix.com
> Cc: k...@xen.org
> Cc: t...@xen.org
> Cc: andrew.coop...@citrix.com
> Cc: xen-devel@lists.xen.org
> ---
>  xen/common/pv_iommu.c | 269 ++
>  xen/include/public/pv-iommu.h |  22 
>  xen/include/xsm/dummy.h   |   6 +
>  xen/include/xsm/xsm.h |   6 +
>  xen/xsm/dummy.c   |   1 +
>  5 files changed, 304 insertions(+)
> 
> diff --git a/xen/common/pv_iommu.c b/xen/common/pv_iommu.c
> index 91485f3..bbf1a21 100644
> --- a/xen/common/pv_iommu.c
> +++ b/xen/common/pv_iommu.c
> @@ -21,7 +21,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
> +#ifdef CONFIG_X86
> +#include 
> +#include 
> +#endif
>  #define ret_t long
>  
>  static int get_paged_frame(unsigned long gfn, unsigned long *frame,
> @@ -79,6 +84,7 @@ void do_iommu_sub_op(struct pv_iommu_op *op)
>  {
>  struct domain *d = current->domain;
>  struct domain *rd = NULL;
> +int ret;
>  
>  /* Only order 0 pages supported */
>  if ( IOMMU_get_page_order(op->flags) != 0 )
> @@ -183,7 +189,270 @@ void do_iommu_sub_op(struct pv_iommu_op *op)
>  op->status = 0;
>  break;
>  }
> +#ifdef CONFIG_X86

Why not move all of this in arch/x86/pv_iommu.c code that would
implement per-arch code?

> +case IOMMUOP_map_foreign_page:

Didn't look at the rest of the code - will when you repost.



Re: [Xen-devel] [RFC PATCH 4/7] common/pv-iommu: Add query, map and unmap ops

2016-02-17 Thread Konrad Rzeszutek Wilk
> >  ret_t do_iommu_op(XEN_GUEST_HANDLE_PARAM(void) arg, unsigned int count)
> 
> Shouldn't this be changed to be pv_iommu_op_t? instead of void?
> 
> 
> >  {
> > -return -ENOSYS;
> > +ret_t ret = 0;
> > +int i;
> 
> unsigned int ?
> > +struct pv_iommu_op op;
> > +struct domain *d = current->domain;
> > +
> > +if ( !is_hardware_domain(d) )
> > +return -ENOSYS;
> 
> -EPERM ?
> 
> > +


Also you are missing the XSM checks which should be here.



Re: [Xen-devel] [RFC PATCH 4/7] common/pv-iommu: Add query, map and unmap ops

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:10:32AM +, Malcolm Crossley wrote:
> Implement above ops according to PV-IOMMU design draft D.

.. which would be great if they were part of this patch series
and you could just: in docs/misc/blah.

> 
> Currently restricted to hardware domains only due to RFC status.

Um, .. that is not reassuring. Perhaps the design document (which
would be in the patchset) would iterate the rest of TODOs?

> 
> Signed-off-by: Malcolm Crossley 
> --
> Cc: jbeul...@suse.com
> Cc: k...@xen.org
> Cc: t...@xen.org
> Cc: andrew.coop...@citrix.com
> Cc: xen-devel@lists.xen.org
> ---
>  xen/common/pv_iommu.c | 228 +-
>  xen/include/public/pv-iommu.h |  69 +
>  2 files changed, 296 insertions(+), 1 deletion(-)
>  create mode 100644 xen/include/public/pv-iommu.h
> 
> diff --git a/xen/common/pv_iommu.c b/xen/common/pv_iommu.c
> index 304fccf..91485f3 100644
> --- a/xen/common/pv_iommu.c
> +++ b/xen/common/pv_iommu.c
> @@ -17,13 +17,239 @@
>   * along with this program; If not, see <http://www.gnu.org/licenses/>.
>   */
>  
> +#include 
> +#include 

Can those be sorted?

>  #include 
> +#include 
>  
>  #define ret_t long
>  
> +static int get_paged_frame(unsigned long gfn, unsigned long *frame,
> +   struct page_info **page, int readonly,
> +   struct domain *rd)
> +{
> +int rc = 0;
> +#if defined(P2M_PAGED_TYPES) || defined(P2M_SHARED_TYPES)
> +p2m_type_t p2mt;
> +
> +*page = get_page_from_gfn(rd, gfn, &p2mt,
> + (readonly) ? P2M_ALLOC : P2M_UNSHARE);
> +if ( !(*page) )
> +{
> +*frame = INVALID_MFN;
> +if ( p2m_is_shared(p2mt) )
> +return -EIO;
> +if ( p2m_is_paging(p2mt) )
> +{
> +p2m_mem_paging_populate(rd, gfn);
> +return -EIO;
> +}
> +return -EIO;
> +}
> +*frame = page_to_mfn(*page);
> +#else
> +*frame = gmfn_to_mfn(rd, gfn);
> +*page = mfn_valid(*frame) ? mfn_to_page(*frame) : NULL;
> +if ( (!(*page)) || (!get_page(*page, rd)) )
> +{
> +*frame = INVALID_MFN;
> +*page = NULL;
> +rc = -EIO;
> +}
> +#endif
> +
> +return rc;
> +}
> +
> +int can_use_iommu_check(struct domain *d)
> +{
> +if ( !iommu_enabled || (!is_hardware_domain(d) && !need_iommu(d)) )
> +return 0;
> +
> +if ( is_hardware_domain(d) && iommu_passthrough )
> +return 0;
> +

What if a PV guest calls this hypercall? Won't it crash on the platform_ops
dereference?
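
A minimal sketch of the kind of guard this question hints at, using stand-in types rather than the real Xen structures. The struct names and has_lookup_page() are hypothetical, and whether platform_ops can actually be NULL on this path is for the patch author to confirm.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Xen structures named in the patch. */
struct iommu_ops_sketch {
    int (*lookup_page)(unsigned long bfn, unsigned long *mfn);
};

struct hvm_iommu_sketch {
    const struct iommu_ops_sketch *platform_ops;   /* may be NULL */
};

/*
 * Returns 1 only when every pointer on the path to lookup_page is set,
 * so a caller never dereferences a missing ops table.
 */
static int
has_lookup_page(const struct hvm_iommu_sketch *hd)
{
    if (hd == NULL || hd->platform_ops == NULL)
        return 0;

    return hd->platform_ops->lookup_page != NULL;
}
```

The idea is simply to check each link before following it, so that the function answers "no" instead of faulting when the ops table was never installed.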

> +if ( !domain_hvm_iommu(d)->platform_ops->lookup_page )
> +return 0;
> +
> +return 1;
> +}
> +
> +void do_iommu_sub_op(struct pv_iommu_op *op)
> +{
> +struct domain *d = current->domain;
> +struct domain *rd = NULL;
> +
> +/* Only order 0 pages supported */

Missing full stop.

> +if ( IOMMU_get_page_order(op->flags) != 0 )
> +{
> +op->status = -ENOSPC;

Not ENOSYS? or EINVAL?

> +goto finish;
> +}
> +
> +switch ( op->subop_id )
> +{
> +case IOMMUOP_query_caps:
> +{
> +op->flags = 0;
> +op->status = 0;
> +if ( can_use_iommu_check(d) )
> +op->flags |= IOMMU_QUERY_map_cap;

s/|=/=/ ?

> +
> +break;
> +}
> +case IOMMUOP_map_page:
> +{
> +unsigned long mfn, tmp;

mfn_t pls.

> +unsigned int flags = 0;

Not uint16_t ?
[edit: ah, this is for  iommu_map_page. You may want to mention that]
> +struct page_info *page = NULL;
> +
> +/* Check if calling domain can create IOMMU mappings */

Missing full stop.

> +if ( !can_use_iommu_check(d) )
> +{
> +op->status = -EPERM;
> +goto finish;
> +}
> +
> +/* Check we are the owner of the page */

Full stop missing.

> +if ( !is_hardware_domain(d) &&
> + ( maddr_get_owner(op->u.map_page.gfn) != d ) )
> +{
> +op->status = -EPERM;
> +goto finish;
> +}
> +
> +/* Lookup page struct backing gfn */

Stop full missing.

> +if ( is_hardware_domain(d) &&
> +(op->flags & IOMMU_MAP_OP_no_ref_cnt) )
> +{
> +mfn = op->u.map_page.gfn;
> +page = mfn_to_page(mfn);
> +if (!page)

Wrong syntax. if ( !page )

> +{
> +op->status = -EPERM;
> +goto finish;
> +}
> +} else if ( get_paged_frame(op->u.map_page.gfn, &mfn, &page, 0, d) )

I think the syntax wants the 'else if' on its own line.

You may want to put a comment around the 0 value.

> +{
> +op->status = -EPERM;
> +goto finish;
> +}
> +
> +/* Check for 

Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Borislav Petkov
On Wed, Feb 17, 2016 at 12:07:13PM -0800, Luis R. Rodriguez wrote:
> OK so here's a wiki to keep track of progress of the different uses:
> 
> http://kernelnewbies.org/KernelProjects/remove-paravirt-enabled
> 
> It seems we have a resolution one way or another for all except for
> the use on arch/x86/mm/dump_pagetables.c, is that right?

Why not?

I think we should simply check the range as ffff800000000000 -
ffff87ffffffffff is practically an ABI and nothing should be mapped
there anyway. No need for paravirt_enabled() there either.

---
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 4a6f1d9b5106..de1ee3a40250 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -358,20 +358,20 @@ static void walk_pud_level(struct seq_file *m, struct 
pg_state *st, pgd_t addr,
 #define pgd_none(a)  pud_none(__pud(pgd_val(a)))
 #endif
 
-#ifdef CONFIG_X86_64
 static inline bool is_hypervisor_range(int idx)
 {
+#ifdef CONFIG_X86_64
+
/*
 * ffff800000000000 - ffff87ffffffffff is reserved for
 * the hypervisor.
 */
-   return paravirt_enabled() &&
-   (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
+   return  (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
(idx < pgd_index(__PAGE_OFFSET));
-}
 #else
-static inline bool is_hypervisor_range(int idx) { return false; }
+   return false;
 #endif
+}
 
 static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
   bool checkwx)
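
For anyone who wants to check the index arithmetic behind that range test outside the kernel, here is a minimal user-space sketch. It assumes 4-level paging (a PGDIR_SHIFT of 39 and 512 PGD entries) and a __PAGE_OFFSET of 0xffff880000000000; the sketch_* names are invented for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for x86-64 with 4-level paging; not taken from the patch. */
#define SKETCH_PGDIR_SHIFT   39
#define SKETCH_PTRS_PER_PGD  512
#define SKETCH_PAGE_OFFSET   UINT64_C(0xffff880000000000)

/* Which top-level page-table slot covers this virtual address? */
int sketch_pgd_index(uint64_t addr)
{
    return (int)((addr >> SKETCH_PGDIR_SHIFT) & (SKETCH_PTRS_PER_PGD - 1));
}

/* Same test as the patched helper: the 16 slots just below the direct map. */
bool sketch_is_hypervisor_range(int idx)
{
    int end = sketch_pgd_index(SKETCH_PAGE_OFFSET);

    return (idx >= end - 16) && (idx < end);
}

/* Small self-check of the boundaries under the assumptions above. */
void sketch_hypervisor_range_self_check(void)
{
    assert(sketch_is_hypervisor_range(sketch_pgd_index(UINT64_C(0xffff800000000000))));
    assert(!sketch_is_hypervisor_range(sketch_pgd_index(SKETCH_PAGE_OFFSET)));
}
```

Under those assumptions each PGD slot covers 512 GiB, so the 16 slots form an 8 TiB hole directly below the direct mapping, and the check needs no paravirt state at all.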


-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [libvirt test] 82949: tolerable FAIL - PUSHED

2016-02-17 Thread osstest service owner
flight 82949 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82949/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt       14 guest-saverestore      fail   never pass
 test-armhf-armhf-libvirt       12 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt       12 migrate-support-check  fail   never pass
 test-armhf-armhf-libvirt-qcow2 11 migrate-support-check  fail   never pass
 test-armhf-armhf-libvirt-qcow2 13 guest-saverestore      fail   never pass
 test-amd64-amd64-libvirt-xsm   12 migrate-support-check  fail   never pass
 test-amd64-i386-libvirt        12 migrate-support-check  fail   never pass
 test-amd64-i386-libvirt-xsm    12 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt-vhd   11 migrate-support-check  fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check  fail   never pass
 test-armhf-armhf-libvirt-xsm   14 guest-saverestore      fail   never pass
 test-armhf-armhf-libvirt-raw   13 guest-saverestore      fail   never pass
 test-armhf-armhf-libvirt-raw   11 migrate-support-check  fail   never pass

version targeted for testing:
 libvirt  cda1cc170f07b45911b3dad03e42c8ebfc210fa1
baseline version:
 libvirt  d6165440779f3ffda20bedae387a990077f08dcf

Last test of basis    82783  2016-02-16 04:22:14 Z    1 days
Testing same since    82949  2016-02-17 04:25:44 Z    0 days    1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  John Ferlan 
  Ludovic Beliveau 
  Michal Privoznik 

jobs:
 build-amd64-xsm                                      pass
 build-armhf-xsm                                      pass
 build-i386-xsm                                       pass
 build-amd64                                          pass
 build-armhf                                          pass
 build-i386                                           pass
 build-amd64-libvirt                                  pass
 build-armhf-libvirt                                  pass
 build-i386-libvirt                                   pass
 build-amd64-pvops                                    pass
 build-armhf-pvops                                    pass
 build-i386-pvops                                     pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm    pass
 test-amd64-amd64-libvirt-xsm                         pass
 test-armhf-armhf-libvirt-xsm                         fail
 test-amd64-i386-libvirt-xsm                          pass
 test-amd64-amd64-libvirt                             pass
 test-armhf-armhf-libvirt                             fail
 test-amd64-i386-libvirt                              pass
 test-amd64-amd64-libvirt-pair                        pass
 test-amd64-i386-libvirt-pair                         pass
 test-armhf-armhf-libvirt-qcow2                       fail
 test-armhf-armhf-libvirt-raw                         fail
 test-amd64-amd64-libvirt-vhd                         pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=cda1cc170f07b45911b3dad03e42c8ebfc210fa1
+ . ./cri-lock-repos
++ . ./cri-common
+++ . ./cri-getconfig
+++ umask 002
+++ getrepos
 getconfig Repos
 perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
+++ local repos=/home/osstest/repos
+++ '[' -z /home/osstest/repos ']'
+++ '[' '!' -d /home/osstest/repos ']'
+++ echo /home/osstest/repos
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec 

Re: [Xen-devel] Adding Xen to the kbuild bot?

2016-02-17 Thread Andy Lutomirski
On Fri, Feb 5, 2016 at 5:17 PM, Fengguang Wu  wrote:
> On Fri, Feb 05, 2016 at 12:10:56PM -0800, Andy Lutomirski wrote:
>> On Feb 4, 2016 7:11 PM, "Fengguang Wu"  wrote:
>> >
>> > Hi Andy,
>> >
>> > CC more people on Xen testing -- in case OSStest already (or plans to)
>> > cover such test case.
>> >
>> > On Tue, Feb 02, 2016 at 07:31:30PM -0800, Andy Lutomirski wrote:
>> > > Hi all-
>> > >
>> > > Would it make sense to add some basic Xen PV testing to the kbuild bot?
>> >
>> > Do you mean to run basic Xen testing on the various kernel trees that
>> > 0day robot covers? That is, to catch kernel regressions when running
>> > under Xen.
>> >
>>
>> Yes, exactly.  I've personally broken Linux as a Xen guest at least twice.
>>
>> > If the intention is to catch Xen regressions, the OSStest
>> > infrastructure may be a better option:
>> >
>> > git://xenbits.xen.org/osstest.git
>>
>> No, I think that 0day should pick one Xen version and stick with it
>> for a while rather than trying to track the latest version.
>
> OK, got it. So it's suitable to run in 0day.
>
>> > > qemu can boot Xen like this:
>> > >
>> > > qemu-system-x86_64 -kernel path/to/xen-4.5.2 -initrd 'path/to/bzImage
>> > > kernelarg otherkernelarg=value' -append 'xenarg other_xen_arg'
>> > >
>> > > This should work with any kernel image for x86 or x86_64 with 
>> > > CONFIG_XEN=y.
>> >
>> > Got it. If you have simple working test scripts to illustrate test
>> > details, it'd be a great help for integrating into OSStest or 0day.
>>
>> I have a script that will boot to a command prompt, but I don't know
>> if that's the right way to do it.  I'm not really sure how 0day works
>> under the hood, but treating Xen as a different configuration or arch
>> instead of treating it as a different test case might make more sense.
>
> We can check the script first, then determine the most suitable way to
> integrate it into 0day. My guess is, it might be suitable to run as a
> new kind of VM host, like this
>
> https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/tree/hosts/vm-kbuild-1G
>
> model: qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap
> nr_vm: 12
> nr_cpu: 2
> memory: 1G
> disk_type: virtio-scsi
> rootfs: debian-x86_64.cgz
> hdd_partitions: /dev/sda /dev/sdb /dev/sdc /dev/sdd
> swap_partitions: /dev/sde

This makes sense to me, but I think it would need an extension to the
configuration language.

The guest virtio code should be in the next -next release.

--Andy



[Xen-devel] [PATCH] [VERY RFC] Clang: Issues with .data.rel.ro relocations

2016-02-17 Thread Andrew Cooper
Clang-3.8 generates several .data.rel.ro sections when compiling Xen.

c/s eb2952b4 "x86: move alternative.c data fully into .init.*" cited "While at
it also drop the non-local section names from SPECIAL_DATA_SECTIONS - they
can't be safely converted." without any further information, and google isn't
overly helpful.

One solution to fix the compilation is:

diff --git a/xen/Rules.mk b/xen/Rules.mk
index f29491e..b4f13f0 100644
--- a/xen/Rules.mk
+++ b/xen/Rules.mk
@@ -177,7 +177,7 @@ SPECIAL_DATA_SECTIONS := rodata $(foreach a,1 2 4 8 16, \
 $(filter %.init.o,$(obj-y) $(obj-bin-y) $(extra-y)): %.init.o: %.o Makefile
$(OBJDUMP) -h $< | sed -n '/[0-9]/{s,00*,0,g;p;}' | while read idx name 
sz rest; do \
case "$$name" in \
-   .*.local) ;; \
+   .*.local|.data.rel.ro) ;; \
.text|.text.*|.data|.data.*|.bss) \
test $$sz != 0 || continue; \
echo "Error: size of $<:$$name is 0x$$sz" >&2; \

but this goes against the statement in c/s eb2952b4.

The alternative is in the body of this patch, which shuffles the data such
that Clang doesn't create problematic relocations.

Given no obvious guidance on why .data.rel.ro relocations are unsafe, I can't
judge which is the correct approach to take.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
---
 xen/arch/x86/alternative.c |  4 ++--
 xen/common/efi/boot.c  | 58 +++---
 2 files changed, 21 insertions(+), 41 deletions(-)

diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 46ac0fd..02b5e92 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -38,7 +38,7 @@ static const unsigned char k8nops[] __initconst = {
 K8_NOP7,
 K8_NOP8
 };
-static const unsigned char * const k8_nops[ASM_NOP_MAX+1] = {
+static const unsigned char * const k8_nops[ASM_NOP_MAX+1] __initconst = {
 NULL,
 k8nops,
 k8nops + 1,
@@ -62,7 +62,7 @@ static const unsigned char p6nops[] __initconst = {
 P6_NOP7,
 P6_NOP8
 };
-static const unsigned char * const p6_nops[ASM_NOP_MAX+1] = {
+static const unsigned char * const p6_nops[ASM_NOP_MAX+1] __initconst = {
 NULL,
 p6nops,
 p6nops + 1,
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 53c7452..64ed433 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -241,53 +241,33 @@ static void __init noreturn blexit(const CHAR16 *str)
 /* generic routine for printing error messages */
 static void __init PrintErrMesg(const CHAR16 *mesg, EFI_STATUS ErrCode)
 {
+static __initconst CHAR16* ErrCodeToStr[] = {
+[EFI_NOT_FOUND & ~EFI_ERROR_MASK]   = L"Not found",
+[EFI_NO_MEDIA & ~EFI_ERROR_MASK]= L"The device has no media",
+[EFI_MEDIA_CHANGED & ~EFI_ERROR_MASK]   = L"Media changed",
+[EFI_DEVICE_ERROR & ~EFI_ERROR_MASK]= L"Device error",
+[EFI_VOLUME_CORRUPTED & ~EFI_ERROR_MASK]= L"Volume corrupted",
+[EFI_ACCESS_DENIED & ~EFI_ERROR_MASK]   = L"Access denied",
+[EFI_OUT_OF_RESOURCES & ~EFI_ERROR_MASK]= L"Out of resources",
+[EFI_VOLUME_FULL & ~EFI_ERROR_MASK] = L"Volume is full",
+[EFI_SECURITY_VIOLATION & ~EFI_ERROR_MASK]  = L"Security violation",
+[EFI_CRC_ERROR & ~EFI_ERROR_MASK]   = L"CRC error",
+[EFI_COMPROMISED_DATA & ~EFI_ERROR_MASK]= L"Compromised data",
+[EFI_BUFFER_TOO_SMALL & ~EFI_ERROR_MASK]= L"Buffer too small",
+};
+EFI_STATUS ErrIdx = ErrCode & ~EFI_ERROR_MASK;
+
 StdOut = StdErr;
 PrintErr((CHAR16 *)mesg);
 PrintErr(L": ");
 
-switch (ErrCode)
+if( (ErrIdx < ARRAY_SIZE(ErrCodeToStr)) && ErrCodeToStr[ErrIdx] )
+mesg = ErrCodeToStr[ErrIdx];
+else
 {
-case EFI_NOT_FOUND:
-mesg = L"Not found";
-break;
-case EFI_NO_MEDIA:
-mesg = L"The device has no media";
-break;
-case EFI_MEDIA_CHANGED:
-mesg = L"Media changed";
-break;
-case EFI_DEVICE_ERROR:
-mesg = L"Device error";
-break;
-case EFI_VOLUME_CORRUPTED:
-mesg = L"Volume corrupted";
-break;
-case EFI_ACCESS_DENIED:
-mesg = L"Access denied";
-break;
-case EFI_OUT_OF_RESOURCES:
-mesg = L"Out of resources";
-break;
-case EFI_VOLUME_FULL:
-mesg = L"Volume is full";
-break;
-case EFI_SECURITY_VIOLATION:
-mesg = L"Security violation";
-break;
-case EFI_CRC_ERROR:
-mesg = L"CRC error";
-break;
-case EFI_COMPROMISED_DATA:
-mesg = L"Compromised data";
-break;
-case EFI_BUFFER_TOO_SMALL:
-mesg = L"Buffer too small";
-break;
-default:
 PrintErr(L"ErrCode: ");
 DisplayUint(ErrCode, 0);
 mesg = NULL;
-break;
 }
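
The switch-to-table conversion in the PrintErrMesg() hunk above can be tried in isolation. What follows is a minimal sketch of the same pattern (a sparse array filled with designated initialisers, indexed by the status code with its flag bit masked off, plus a bounds check), not the patch's actual code. The SKETCH_* codes and the flag bit are made up for illustration, and plain char strings stand in for CHAR16.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes: an "error" flag bit plus a small index. */
#define SKETCH_ERROR_MASK  0x8000u
#define SKETCH_NO_MEDIA    (SKETCH_ERROR_MASK | 12u)
#define SKETCH_NOT_FOUND   (SKETCH_ERROR_MASK | 14u)
#define SKETCH_CRC_ERROR   (SKETCH_ERROR_MASK | 27u)

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Sparse table: every slot without an initialiser is NULL. */
static const char *const sketch_err_to_str[] = {
    [SKETCH_NO_MEDIA  & ~SKETCH_ERROR_MASK] = "The device has no media",
    [SKETCH_NOT_FOUND & ~SKETCH_ERROR_MASK] = "Not found",
    [SKETCH_CRC_ERROR & ~SKETCH_ERROR_MASK] = "CRC error",
};

/* Returns the message for a code, or NULL for holes and out-of-range codes. */
const char *sketch_err_name(unsigned int code)
{
    unsigned int idx = code & ~SKETCH_ERROR_MASK;

    if ( idx < SKETCH_ARRAY_SIZE(sketch_err_to_str) )
        return sketch_err_to_str[idx];

    return NULL;
}

/* Small self-check: one hit and one hole. */
void sketch_err_self_check(void)
{
    assert(strcmp(sketch_err_name(SKETCH_CRC_ERROR), "CRC error") == 0);
    assert(sketch_err_name(SKETCH_ERROR_MASK | 13u) == NULL);
}
```

One property of this pattern worth keeping in mind: masking discards the flag bit, so in the sketch a bare 14 and SKETCH_NOT_FOUND resolve to the same string. Whether that matters for the real function depends on whether it can ever be handed a non-error status.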
 

Re: [Xen-devel] [RFC PATCH 3/7] VT-d: Add iommu_lookup_page support

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:10:31AM +, Malcolm Crossley wrote:
> Function does not need to handle shared EPT use of IOMMU as core code
> already handles this.

Could you mention which part of 'core code' handles this?

Also you may want to say this code can only deal with 4K pages but
not with larger ones.
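
To make the 4K-only point concrete, here is a minimal sketch of the arithmetic the lookup below relies on. It assumes a 12-bit page shift and a 9-bit level stride (512 entries per table), which is what the VT-d headers appear to use; the sketch_* names are hypothetical, and nothing here touches real page tables.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants mirroring PAGE_SHIFT_4K, PAGE_MASK_4K and LEVEL_MASK. */
#define SKETCH_PAGE_SHIFT_4K 12
#define SKETCH_PAGE_MASK_4K  (~((UINT64_C(1) << SKETCH_PAGE_SHIFT_4K) - 1))
#define SKETCH_LEVEL_STRIDE  9
#define SKETCH_LEVEL_MASK    ((UINT64_C(1) << SKETCH_LEVEL_STRIDE) - 1)

/* Index of the leaf entry for this frame inside its 512-entry table. */
unsigned int sketch_leaf_index(uint64_t gfn)
{
    return (unsigned int)(gfn & SKETCH_LEVEL_MASK);
}

/*
 * Frame number held in a leaf entry.  This is only meaningful when the
 * entry maps a single 4K page; a superpage entry describes a larger,
 * more coarsely aligned region.
 */
uint64_t sketch_pte_to_mfn(uint64_t pte_val)
{
    return (pte_val & SKETCH_PAGE_MASK_4K) >> SKETCH_PAGE_SHIFT_4K;
}

/* Small self-check with a made-up entry value (low bits are permissions). */
void sketch_pte_self_check(void)
{
    assert(sketch_pte_to_mfn(UINT64_C(0x12345003)) == UINT64_C(0x12345));
    assert(sketch_leaf_index(UINT64_C(0x12345)) == 0x145);
}
```

Note that this mirrors the new dma_get_pte_addr() in masking only the low 12 bits. The existing dma_pte_addr() additionally applies PADDR_MASK, so any high bits set in an entry would survive here but not there.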

> 
> Signed-off-by: Malcolm Crossley 
> --
> Cc: kevin.t...@intel.com
> Cc: feng...@intel.com
> Cc: xen-devel@lists.xen.org
> ---
>  xen/drivers/passthrough/vtd/iommu.c | 31 +++
>  xen/drivers/passthrough/vtd/iommu.h |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.c 
> b/xen/drivers/passthrough/vtd/iommu.c
> index ec31c6b..0c79b48 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1754,6 +1754,36 @@ static int intel_iommu_unmap_page(struct domain *d, 
> unsigned long gfn)
>  return 0;
>  }
>  
> +static int intel_iommu_lookup_page(
> +struct domain *d, unsigned long gfn, unsigned long *mfn)
> +{
> +struct hvm_iommu *hd = domain_hvm_iommu(d);
> +struct dma_pte *page = NULL, *pte = NULL, old;
> +u64 pg_maddr;
> +
> +spin_lock(&hd->arch.mapping_lock);
> +
> +pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << PAGE_SHIFT_4K, 1);
> +if ( pg_maddr == 0 )
> +{
> +spin_unlock(&hd->arch.mapping_lock);
> +return -ENOMEM;
> +}
> +page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
> +pte = page + (gfn & LEVEL_MASK);
> +old = *pte;
> +if (!dma_pte_present(old)) {
> +unmap_vtd_domain_page(page);
> +spin_unlock(&hd->arch.mapping_lock);
> +return -ENOMEM;
> +}
> +unmap_vtd_domain_page(page);
> +spin_unlock(&hd->arch.mapping_lock);

All this code looks close to lookup routine in dma_pte_clear_one and
intel_iommu_map_page.

Could you move most of this lookup code in a static function that your
code and the other ones could call?

> +
> +*mfn = dma_get_pte_addr(old) >> PAGE_SHIFT_4K;
> +return 0;
> +}
> +
>  void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
>   int order, int present)
>  {
> @@ -2534,6 +2564,7 @@ const struct iommu_ops intel_iommu_ops = {
>  .teardown = iommu_domain_teardown,
>  .map_page = intel_iommu_map_page,
>  .unmap_page = intel_iommu_unmap_page,
> +.lookup_page = intel_iommu_lookup_page,
>  .free_page_table = iommu_free_page_table,
>  .reassign_device = reassign_device_ownership,
>  .get_device_group_id = intel_iommu_group_id,
> diff --git a/xen/drivers/passthrough/vtd/iommu.h 
> b/xen/drivers/passthrough/vtd/iommu.h
> index c55ee08..03583ef 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -275,6 +275,7 @@ struct dma_pte {
>  #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
>  #define dma_set_pte_addr(p, addr) do {\
>  (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
> +#define dma_get_pte_addr(p) (((p).val & PAGE_MASK_4K))
>  #define dma_pte_present(p) (((p).val & DMA_PTE_PROT) != 0)
>  #define dma_pte_superpage(p) (((p).val & DMA_PTE_SP) != 0)
>  
> -- 
> 1.7.12.4
> 
> 



Re: [Xen-devel] [RFC PATCH 2/7] iommu: add iommu_lookup_page to lookup guest gfn for a particular IOMMU mapping

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:10:30AM +, Malcolm Crossley wrote:
> If IOMMU driver does not implement lookup_page function then it returns 
> -ENOMEM.

That is a very odd return code. Could you explain why -ENOMEM?

> 
> Returns 0 on success and any other value on failure.
> 
> Signed-off-by: Malcolm Crossley 
> --
> Cc: jbeul...@suse.com
> Cc: xen-devel@lists.xen.org
> ---
>  xen/drivers/passthrough/iommu.c | 21 +
>  xen/include/xen/iommu.h |  2 ++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 0b2abf4..06f21ee 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -257,6 +257,27 @@ int iommu_unmap_page(struct domain *d, unsigned long gfn)
>  return hd->platform_ops->unmap_page(d, gfn);
>  }
>  
> +int iommu_lookup_page(struct domain *d, unsigned long bfn, unsigned long 
> *gfn)
> +{
> +struct hvm_iommu *hd = domain_hvm_iommu(d);
> +
> +/* 
> + * BFN maps 1:1 to GFN when iommu passthrough is enabled or 
> + * when IOMMU shared page tables is in use

Missing full stop.

> + */
> +if ( iommu_use_hap_pt(d) || (iommu_passthrough && is_hardware_domain(d)) 
> )

The comment is not in sync with the code. What I am curious about: if
dom0 is PVH and shared EPT is disabled, which path should we take?

I would think we would need to skip this and go to the ->lookup_page?

> +{
> +*gfn = bfn;
> +return 0;
> +}
> +
> +if ( !iommu_enabled || !hd->platform_ops ||
> +!hd->platform_ops->lookup_page )
> +return -ENOMEM;

-ENOMEM ? ENXIO ? Or maybe -ENOSYS for the case when hd->platform_ops
is not set nor ->lookup_page?
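
One way to read that suggestion, as a minimal sketch with mock types rather than the real struct hvm_iommu and struct iommu_ops: report a missing hook differently from a disabled IOMMU, and leave -ENOMEM for genuine allocation failures. Which errno fits each case is exactly the open question here, so the choices below are an assumption, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock stand-ins for the driver ops table and the per-domain IOMMU state. */
struct sketch_iommu_ops {
    int (*lookup_page)(unsigned long bfn, unsigned long *gfn);
};

struct sketch_iommu {
    int enabled;
    const struct sketch_iommu_ops *ops;
};

int sketch_iommu_lookup_page(const struct sketch_iommu *hd,
                             unsigned long bfn, unsigned long *gfn)
{
    if ( !hd->enabled )
        return -ENXIO;   /* Assumed choice: no IOMMU to ask. */

    if ( !hd->ops || !hd->ops->lookup_page )
        return -ENOSYS;  /* Assumed choice: driver lacks the hook. */

    return hd->ops->lookup_page(bfn, gfn);
}

/* Trivial driver hook for the demo instances: identity mapping. */
static int sketch_identity_lookup(unsigned long bfn, unsigned long *gfn)
{
    *gfn = bfn;
    return 0;
}

static const struct sketch_iommu_ops sketch_full_ops = {
    .lookup_page = sketch_identity_lookup,
};
static const struct sketch_iommu_ops sketch_empty_ops = { NULL };

/* Demo instances covering each branch. */
const struct sketch_iommu sketch_disabled = { 0, &sketch_full_ops };
const struct sketch_iommu sketch_no_ops   = { 1, NULL };
const struct sketch_iommu sketch_no_hook  = { 1, &sketch_empty_ops };
const struct sketch_iommu sketch_working  = { 1, &sketch_full_ops };
unsigned long sketch_out;
```

With distinct codes a caller can tell "not supported on this platform" apart from a transient failure, which -ENOMEM alone does not allow.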

> +
> +return hd->platform_ops->lookup_page(d, bfn, gfn);
> +}
> +
>  static void iommu_free_pagetables(unsigned long unused)
>  {
>  do {
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 8217cb7..49ca087 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -77,6 +77,7 @@ void iommu_teardown(struct domain *d);
>  int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
> unsigned int flags);
>  int iommu_unmap_page(struct domain *d, unsigned long gfn);
> +int iommu_lookup_page(struct domain *d, unsigned long bfn, unsigned long 
> *gfn);
>  
>  enum iommu_feature
>  {
> @@ -151,6 +152,7 @@ struct iommu_ops {
>  int (*map_page)(struct domain *d, unsigned long gfn, unsigned long mfn,
>  unsigned int flags);
>  int (*unmap_page)(struct domain *d, unsigned long gfn);
> +int (*lookup_page)(struct domain *d, unsigned long bfn, unsigned long 
> *gfn);
>  void (*free_page_table)(struct page_info *);
>  #ifdef CONFIG_X86
>  void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, 
> unsigned int value);
> -- 
> 1.7.12.4
> 
> 



[Xen-devel] [linux-3.18 test] 82928: regressions - FAIL

2016-02-17 Thread osstest service owner
flight 82928 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82928/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 79037
 build-i386-rumpuserxen    6 xen-build fail REGR. vs. 79037

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-libvirt-qcow2        6 xen-boot  fail pass in 82793
 test-amd64-i386-qemut-rhel6hvm-intel  6 xen-boot  fail pass in 82793

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat fail in 82793 like 79037
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail in 82793 like 79037
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 79037
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 79037
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 79037

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386    1 build-check(1)            blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)            blocked  n/a
 test-armhf-armhf-libvirt-qcow2     11 migrate-support-check     fail in 82793 never pass
 test-armhf-armhf-libvirt-qcow2     13 guest-saverestore         fail in 82793 never pass
 test-amd64-amd64-xl-pvh-intel      14 guest-saverestore         fail   never pass
 test-amd64-amd64-xl-pvh-amd        11 guest-start               fail   never pass
 test-armhf-armhf-libvirt           14 guest-saverestore         fail   never pass
 test-armhf-armhf-libvirt           12 migrate-support-check     fail   never pass
 test-amd64-i386-libvirt-xsm        12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-xsm            13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-xsm            12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt-xsm       12 migrate-support-check     fail   never pass
 test-amd64-amd64-libvirt           12 migrate-support-check     fail   never pass
 test-amd64-amd64-qemuu-nested-amd  16 debian-hvm-install/l1/l2  fail   never pass
 test-armhf-armhf-libvirt-raw       13 guest-saverestore         fail   never pass
 test-armhf-armhf-libvirt-raw       11 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm       12 migrate-support-check     fail   never pass
 test-armhf-armhf-libvirt-xsm       14 guest-saverestore         fail   never pass
 test-armhf-armhf-xl-arndale        12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-arndale        13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-i386-libvirt            12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-rtds           13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-rtds           12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-credit2        13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-credit2        12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-multivcpu      13 saverestore-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu      12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-cubietruck     12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-cubietruck     13 saverestore-support-check fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl                12 migrate-support-check     fail   never pass
 test-armhf-armhf-xl                13 saverestore-support-check fail   never pass
 test-amd64-amd64-libvirt-vhd       11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd            11 migrate-support-check     fail   never pass
 test-armhf-armhf-xl-vhd            12 saverestore-support-check fail   never pass

version targeted for testing:
 linux                2c07053b8e1e0c22bb54dfbdf8e86a70f8bf00fc
baseline version:
 linux                707e840c5e24bb2df1ea4e753964275e257ec816

Last test of basis    79037  2016-01-25 18:28:55 Z   23 days
Testing same since    82793  2016-02-16 05:49:14 Z    1 days    2 attempts


People who touched revisions under test:
  "J. Bruce Fields" 
  
  Adrian Hunter 
  Al Viro 
  Alan Stern 
  Alex Deucher 
  Alexander Aring 
  Alexandru Cornea 
  Alexei Starovoitov 
  amish 
  Andrew Elble 
  Andrew Gabbasov 
  Andrew Morton 
  Andrey Ryabinin 
  Andy Gospodarek 

Re: [Xen-devel] [RFC PATCH 0/7] Implement Xen PV-IOMMU interface

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:33:24AM +, Malcolm Crossley wrote:
> This RFC series implements the PV-IOMMU interface according to the PV-IOMMU
> design draft D.
> 
> The PV-IOMMU interface is currently restricted to being used by the hardware
> domain only as non hardware domain callers have not been fully tested yet.
> 
> Significant effort was put into implementing a m2b tracking structure without
> increasing the size of struct page but no union was found that could be safely
> used in all cases when a page is allocated to an HVM domain. Comments and
> feedback on the m2b design are most welcome.
> 
> The hardware domain specific IOMMU pre map mechanism was implemented in order
> to keep performance parity with current out of tree mechanisms to obtain BFNs
> for foreign guest owned memory. The pre map mechanism is not a weakening of
> the current security model of Xen and is only allowed when the hardware domain
> is allowed relaxed IOMMU mappings.

I would recommend you add to this patchset:

- docs/misc/pv-iommu.txt describing the design of this.
- Add yourself to the maintainers file for this code: xen/common/pv_iommu? 
perhaps?
- Compile this under ARM as well - and figure out what is missing there?
- Add a Kconfig entry so folks have the option of not compiling the Xen
  hypervisor with this?
- Have these patches in a git tree for easier testing.

> 
> 
> Malcolm Crossley (7):
>   common/pv-iommu: Add stub hypercall for PV-IOMMU
>   iommu: add iommu_lookup_page to lookup guest gfn for a particular
> IOMMU mapping
>   VT-d: Add iommu_lookup_page support
>   common/pv-iommu: Add query, map and unmap ops
>   x86/m2b: Add a tracking structure for mfn to bfn mappings per page
>   common/pv-iommu: Add foreign ops to PV-IOMMU interface
>   common/pv-iommu: Allow hardware_domain to pre IOMMU map foreign
> memory
> 
>  xen/arch/x86/domain.c   |  12 +-
>  xen/arch/x86/mm/Makefile|   1 +
>  xen/arch/x86/mm/m2b.c   | 211 
>  xen/arch/x86/x86_64/compat/entry.S  |   2 +
>  xen/arch/x86/x86_64/entry.S |   2 +
>  xen/common/Makefile |   1 +
>  xen/common/memory.c |  11 +
>  xen/common/pv_iommu.c   | 633 
> 
>  xen/drivers/passthrough/iommu.c |  21 ++
>  xen/drivers/passthrough/vtd/iommu.c |  31 ++
>  xen/drivers/passthrough/vtd/iommu.h |   1 +
>  xen/include/asm-x86/domain.h|   1 +
>  xen/include/asm-x86/m2b.h   |  65 
>  xen/include/asm-x86/mm.h|  12 +-
>  xen/include/public/hvm/ioreq.h  |   1 +
>  xen/include/public/pv-iommu.h   |  93 ++
>  xen/include/public/xen.h|   1 +
>  xen/include/xen/iommu.h |   2 +
>  xen/include/xen/sched.h |   4 +
>  xen/include/xsm/dummy.h |   6 +
>  xen/include/xsm/xsm.h   |   6 +
>  xen/xsm/dummy.c |   1 +
>  22 files changed, 1116 insertions(+), 2 deletions(-)
>  create mode 100644 xen/arch/x86/mm/m2b.c
>  create mode 100644 xen/common/pv_iommu.c
>  create mode 100644 xen/include/asm-x86/m2b.h
>  create mode 100644 xen/include/public/pv-iommu.h
> 



Re: [Xen-devel] [RFC PATCH 1/7] common/pv-iommu: Add stub hypercall for PV-IOMMU

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:10:29AM +, Malcolm Crossley wrote:
> Signed-off-by: Malcolm Crossley 
> --
> Cc: jbeul...@suse.com
> Cc: andrew.coop...@citrix.com
> Cc: ian.campb...@citrix.com
> Cc: k...@xen.org
> Cc: t...@xen.org
> Cc: xen-devel@lists.xen.org
> ---
>  xen/arch/x86/x86_64/compat/entry.S |  2 ++
>  xen/arch/x86/x86_64/entry.S|  2 ++
>  xen/common/Makefile|  1 +
>  xen/common/pv_iommu.c  | 38 
> ++
>  xen/include/public/xen.h   |  1 +
>  5 files changed, 44 insertions(+)
>  create mode 100644 xen/common/pv_iommu.c
> 
> diff --git a/xen/arch/x86/x86_64/compat/entry.S 
> b/xen/arch/x86/x86_64/compat/entry.S
> index 3088aa7..53a1689 100644
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -436,6 +436,7 @@ ENTRY(compat_hypercall_table)
>  .quad do_tmem_op
>  .quad do_ni_hypercall   /* reserved for XenClient */
>  .quad do_xenpmu_op  /* 40 */
> +.quad do_iommu_op
>  .rept __HYPERVISOR_arch_0-((.-compat_hypercall_table)/8)
>  .quad compat_ni_hypercall
>  .endr
> @@ -487,6 +488,7 @@ ENTRY(compat_hypercall_args_table)
>  .byte 1 /* do_tmem_op   */
>  .byte 0 /* reserved for XenClient   */
>  .byte 2 /* do_xenpmu_op */  /* 40 */
> +.byte 2 /* do_iommu_op  */
>  .rept __HYPERVISOR_arch_0-(.-compat_hypercall_args_table)
>  .byte 0 /* compat_ni_hypercall  */
>  .endr
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 94a54aa..fee7191 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -769,6 +769,7 @@ ENTRY(hypercall_table)
>  .quad do_tmem_op
>  .quad do_ni_hypercall   /* reserved for XenClient */
>  .quad do_xenpmu_op  /* 40 */
> +.quad do_iommu_op
>  .rept __HYPERVISOR_arch_0-((.-hypercall_table)/8)
>  .quad do_ni_hypercall
>  .endr
> @@ -820,6 +821,7 @@ ENTRY(hypercall_args_table)
>  .byte 1 /* do_tmem_op   */
>  .byte 0 /* reserved for XenClient */
>  .byte 2 /* do_xenpmu_op */  /* 40 */
> +.byte 2 /* do_iommu_op  */
>  .rept __HYPERVISOR_arch_0-(.-hypercall_args_table)
>  .byte 0 /* do_ni_hypercall  */
>  .endr
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 6e82b33..b498589 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -25,6 +25,7 @@ obj-y += notifier.o
>  obj-y += page_alloc.o
>  obj-$(CONFIG_HAS_PDX) += pdx.o
>  obj-y += preempt.o
> +obj-y += pv_iommu.o

Perhaps have a Kconfig entry for it?

Also you seem to be missing ARM code?

>  obj-y += random.o
>  obj-y += rangeset.o
>  obj-y += radix-tree.o
> diff --git a/xen/common/pv_iommu.c b/xen/common/pv_iommu.c
> new file mode 100644
> index 000..304fccf
> --- /dev/null
> +++ b/xen/common/pv_iommu.c
> @@ -0,0 +1,38 @@
> +/**
> + * common/pv_iommu.c
> + * 
> + * Paravirtualised IOMMU functionality
> + * 
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include 
> +
> +#define ret_t long

? What is wrong with just using 'long'?

> +
> +ret_t do_iommu_op(XEN_GUEST_HANDLE_PARAM(void) arg, unsigned int count)
> +{
> +return -ENOSYS;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> +
> diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
> index 7b629b1..ff50e7a 100644
> --- a/xen/include/public/xen.h
> +++ b/xen/include/public/xen.h
> @@ -102,6 +102,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
>  #define __HYPERVISOR_tmem_op  38
>  #define __HYPERVISOR_xc_reserved_op   39 /* reserved for XenClient */
>  #define __HYPERVISOR_xenpmu_op40
> +#define __HYPERVISOR_iommu_op 41

I would think there would be a pv_iommu.h header file as well?
>  
>  /* Architecture-specific hypercall definitions. */
>  #define __HYPERVISOR_arch_0   48
> -- 
> 1.7.12.4
> 
> 
> 

Re: [Xen-devel] [PATCH v2 03/11] xen/hvmlite: Initialize HVMlite kernel

2016-02-17 Thread Luis R. Rodriguez
On Mon, Feb 1, 2016 at 7:38 AM, Boris Ostrovsky
 wrote:
> +   pv_info.paravirt_enabled = 1;

As is being discussed, we want to remove paravirt_enabled, so this
series should not be merged with this:

http://kernelnewbies.org/KernelProjects/remove-paravirt-enabled

 Luis

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 3/3] paravirt: rename paravirt_enabled to paravirt_legacy

2016-02-17 Thread Luis R. Rodriguez
OK so here's a wiki to keep track of progress of the different uses:

http://kernelnewbies.org/KernelProjects/remove-paravirt-enabled

It seems we have a resolution one way or another for all except for
the use on arch/x86/mm/dump_pagetables.c, is that right? So to be
clear we should not merge more users of paravirt_enabled.

 Luis



Re: [Xen-devel] Is: PVH dom0 - MWAIT detection logic to get deeper C-states exposed in ACPI AML code. Was:Re: [PATCH v2 10/30] xen/x86: Annotate VM applicability in featureset

2016-02-17 Thread Boris Ostrovsky

On 02/17/2016 02:02 PM, Konrad Rzeszutek Wilk wrote:

On Mon, Feb 15, 2016 at 03:41:41PM +, Andrew Cooper wrote:

On 15/02/16 15:02, Jan Beulich wrote:

On 15.02.16 at 15:53,  wrote:

On 15/02/16 14:50, Jan Beulich wrote:

On 15.02.16 at 15:38,  wrote:

On 15/02/16 09:20, Jan Beulich wrote:

On 12.02.16 at 18:42,  wrote:

On 12/02/16 17:05, Jan Beulich wrote:

On 05.02.16 at 14:42,  wrote:

  #define X86_FEATURE_MWAITX        ( 3*32+29) /*   MWAIT extension (MONITORX/MWAITX) */

Why not exposed to HVM (also for _MWAIT as I now notice)?

Because that is a good chunk of extra work to support.  We would need to
use 4K monitor widths, and extra p2m handling.

I don't understand: The base (_MWAIT) feature being exposed to
guests today, and kernels making use of the feature when available
suggests to me that things work. Are you saying you know
otherwise? (And if there really is a reason to mask the feature all of
the sudden, this should again be justified in the commit message.)

PV guests had it clobbered by Xen in traps.c

HVM guests have:

vmx.c:
 case EXIT_REASON_MWAIT_INSTRUCTION:
 case EXIT_REASON_MONITOR_INSTRUCTION:
[...]
 hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
 break;

and svm.c:
 case VMEXIT_MONITOR:
 case VMEXIT_MWAIT:
 hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
 break;

I don't see how a guest could actually use this feature.

Do you see the respective intercepts getting enabled anywhere?
(I don't outside of nested code, which I didn't check in detail.)

Yes - the intercepts are always enabled to prevent the guest actually
putting the processor to sleep.

Hmm, you're right, somehow I've managed to ignore the relevant
lines grep reported. Yet - how do things work then, without the
MWAIT feature flag currently getting cleared?



We whitelist CPUID0x0001.ecx features in 
libxc/xc_cpuid_x86.c:xc_cpuid_hvm_policy() so MWAIT is never set.


-boris
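
The whitelisting idea above can be sketched roughly as follows. This is not the libxc code: the mask contents and the function name are made up for illustration, and only the bit positions (CPUID leaf 1, ECX) are architectural. The point is that a feature bit absent from the mask, such as MONITOR/MWAIT (bit 3), never reaches the guest regardless of what the host reports.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits (architectural positions). */
#define ECX_SSE3     (1u << 0)
#define ECX_MONITOR  (1u << 3)   /* MONITOR/MWAIT */
#define ECX_SSSE3    (1u << 9)
#define ECX_POPCNT   (1u << 23)

/* Hypothetical whitelist, for illustration only; not the real libxc policy. */
#define HVM_LEAF1_ECX_WHITELIST (ECX_SSE3 | ECX_SSSE3 | ECX_POPCNT)

/* Keep only features that are both present on the host and whitelisted. */
static uint32_t filter_leaf1_ecx(uint32_t host_ecx)
{
    /* Invariant of this sketch: MWAIT is deliberately not whitelisted. */
    assert((HVM_LEAF1_ECX_WHITELIST & ECX_MONITOR) == 0);
    return host_ecx & HVM_LEAF1_ECX_WHITELIST;
}
```

With a whitelist, a newly introduced hardware feature stays hidden until someone adds it to the mask, which is the opposite default from clearing known-bad bits one by one.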



I have never observed it being used.  Do you have some local patches in
the SLES hypervisor?

There is some gross layer violation in xen/enlighten.c to pretend that
MWAIT is present to trick the ACPI code into evaluating _CST() methods
to report back to Xen.  (This is yet another PV-ism which will cause a
headache for a DMLite dom0)

Yes indeed. CC-ing Roger, and Boris.


~Andrew



Re: [Xen-devel] [PATCH v2 24/30] tools/libxc: Modify bitmap operations to take void pointers

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 10, 2016 at 10:07:12AM +, Andrew Cooper wrote:
> On 08/02/16 16:36, Ian Campbell wrote:
> > On Mon, 2016-02-08 at 16:23 +, Tim Deegan wrote:
> >> At 13:42 + on 05 Feb (1454679737), Andrew Cooper wrote:
> >>> The type of the pointer to a bitmap is not interesting; it does not
> >>> affect the
> >>> representation of the block of bits being pointed to.
> >> It does affect the alignment, though.  Is this safe on ARM?
> > Good point. These constructs in the patch:
> >
> > +const unsigned long *addr = _addr;
> >
> > Would be broken if _addr were not suitably aligned for an unsigned long.
> >
> > That probably rules out this approach unfortunately.
> 
> What about reworking libxc bitops in terms of unsigned char?  That
> should cover all alignment issues.

See 3cab67ac83b1d56c3daedd9c4adfed497a114246

"+/*
+ * xc_bitops.h has macros that do this as well - however they assume that
+ * the bitmask is word aligned but xc_cpumap_t is only guaranteed to be
+ * byte aligned and so we need byte versions for architectures which do
+ * not support misaligned accesses (which is basically everyone
+ * but x86, although even on x86 it can be inefficient).
+ */
"
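
For reference, byte versions of the kind that comment describes might look like the sketch below. The helper names are invented here and are not the libxc ones; the only point is that going through unsigned char places no alignment requirement on the buffer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Byte-granular bitmap helpers (illustrative names, not the libxc API).
 * Every access is a single unsigned char, so the map may start at any
 * address, including an odd one.
 */
static void bytemap_set_bit(unsigned char *map, size_t bit)
{
    assert(map != NULL);
    map[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
}

static void bytemap_clear_bit(unsigned char *map, size_t bit)
{
    assert(map != NULL);
    map[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
}

static int bytemap_test_bit(const unsigned char *map, size_t bit)
{
    assert(map != NULL);
    return (map[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}
```

One thing to keep in mind: the in-memory layout of a byte-indexed bitmap matches an unsigned long indexed one only on little-endian machines, so the two families of helpers should not be mixed on the same buffer.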

> 
> ~Andrew
> 


[Xen-devel] [PATCH] xen: Avoid left shifting into a sign bit

2016-02-17 Thread Andrew Cooper
Clang 3.8 notices, and objects because it is undefined behaviour.

"error: shifting a negative signed value is undefined 
[-Werror,-Wshift-negative-value]"

Use unsigned constants rather than signed ones.

Signed-off-by: Andrew Cooper 
---
CC: Jan Beulich 
CC: Tim Deegan 
CC: Ian Campbell 
CC: Kevin Tian 
CC: Feng Wu 
---
 xen/common/page_alloc.c   | 2 +-
 xen/drivers/passthrough/vtd/x86/ats.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 624a266..7179d67 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1675,7 +1675,7 @@ void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
 ASSERT(!in_irq());
 
 if ( xenheap_bits && (memflags >> _MEMF_bits) > xenheap_bits )
-memflags &= ~MEMF_bits(~0);
+memflags &= ~MEMF_bits(~0U);
 if ( !(memflags >> _MEMF_bits) )
 memflags |= MEMF_bits(xenheap_bits);
 
diff --git a/xen/drivers/passthrough/vtd/x86/ats.c b/xen/drivers/passthrough/vtd/x86/ats.c
index 7c797f6..334b9c1 100644
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -133,7 +133,7 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
 case DMA_TLB_GLOBAL_FLUSH:
 /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
 sbit = 1;
-addr = (~0 << PAGE_SHIFT_4K) & 0x7FFF;
+addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFF;
 rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
  sid, sbit, addr);
 break;
@@ -145,7 +145,7 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
 sbit = size_order ? 1 : 0;
 
 /* clear lower bits */
-addr &= ~0 << PAGE_SHIFT_4K;
+addr &= ~0UL << PAGE_SHIFT_4K;
 
 /* if sbit == 1, zero out size_order bit and set lower bits to 1 */
 if ( sbit )
-- 
2.1.4
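
As a standalone illustration of the pattern this patch applies (a sketch only; mask_above(), align_down_4k() and PAGE_SHIFT_4K_DEMO are names invented here, not Xen identifiers): `~0` has type int and value -1, so `~0 << n` left-shifts a negative value, which the C standard leaves undefined. Starting from an unsigned constant keeps the shift well defined.

```c
#include <assert.h>
#include <limits.h>

/*
 * Build a mask with the low `shift` bits clear. ~0UL is an unsigned
 * value with all bits set, so shifting it left is well defined as long
 * as the shift count is smaller than the width of the type.
 */
static unsigned long mask_above(unsigned int shift)
{
    assert(shift < sizeof(unsigned long) * CHAR_BIT);
    return ~0UL << shift;
}

/* Local stand-in for a 4K page shift; not the Xen definition. */
#define PAGE_SHIFT_4K_DEMO 12

/* Round an address down to a 4K boundary by clearing its low 12 bits. */
static unsigned long align_down_4k(unsigned long addr)
{
    return addr & mask_above(PAGE_SHIFT_4K_DEMO);
}
```

The suffix matters as well: `~0U` gives an unsigned int mask, which is fine for an unsigned int such as memflags, while a 64-bit address needs `~0UL` (or wider) so the upper bits of the mask are set too.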




Re: [Xen-devel] [PATCH] xenpaging: do not leak if --pagefile given twice

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 17, 2016 at 02:14:38PM -0500, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 17, 2016 at 02:58:33PM +, Ian Campbell wrote:
> > By freeing filename (which is either NULL or the previous iteration of
> > this argument). This implements a semantic where the last --pagefile
> > given on the command line takes precedence.
> > 
> > This is the same semantic as the other options have.
> > 
> > CID: 1198792
> > 
> 
> Reviewed-by: Konrad Rzeszutek Wilk 

and applied. Figured nobody would mind this fix.
> 
> > Signed-off-by: Ian Campbell 
> > ---
> >  tools/xenpaging/xenpaging.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/tools/xenpaging/xenpaging.c b/tools/xenpaging/xenpaging.c
> > index 0377507..6157d3a 100644
> > --- a/tools/xenpaging/xenpaging.c
> > +++ b/tools/xenpaging/xenpaging.c
> > @@ -232,6 +232,7 @@ static int xenpaging_getopts(struct xenpaging *paging, 
> > int argc, char *argv[])
> >  paging->vm_event.domain_id = atoi(optarg);
> >  break;
> >  case 'f':
> > +free(filename);
> >  filename = strdup(optarg);
> >  break;
> >  case 'm':
> > -- 
> > 2.1.4
> > 
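
The "last one wins, nothing leaks" semantic from the patch can be sketched as a small helper. This is an illustration, not xenpaging code: set_string_option() is a made-up name, and it copies with malloc()/memcpy() instead of the strdup() the patch uses, purely so the sketch stays within ISO C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Replace *slot with a private copy of value, releasing any earlier copy.
 * free(NULL) is a no-op, so the first assignment needs no special case.
 * Returns 0 on success, -1 on allocation failure (old value is kept).
 */
static int set_string_option(char **slot, const char *value)
{
    size_t len;
    char *copy;

    assert(slot != NULL && value != NULL);

    len = strlen(value) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, value, len);

    free(*slot);   /* drop the previous iteration of this option, if any */
    *slot = copy;
    return 0;
}
```

Called once per occurrence of the option inside the getopt loop, this leaves exactly one live allocation no matter how many times the option is repeated.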
> > 


Re: [Xen-devel] [PATCH v4 1/2] xenoprof: fix up ability to disable it

2016-02-17 Thread Andrew Cooper
On 17/02/16 18:02, Doug Goldstein wrote:
> Allow Xenoprof to be fully disabled when toggling the option off.
>
> Signed-off-by: Doug Goldstein 
> ---
> CC: Keir Fraser 
> CC: Jan Beulich 
> CC: Andrew Cooper 
>
> change since v3:
> - drop (void)var; from static inlines
> - fix typo that broke build (must have forgotten to do XEN_CONFIG_EXPERT=y
>   make)
> change since v2:
> - move all functions in xenoprof.h inside CONFIG_XENOPROF as suggested by
>   Andrew Cooper
> change since v1:
> - switch to #define empty 'functions' as suggested by Andrew Cooper
> ---
>  xen/arch/x86/Makefile  |  2 +-
>  xen/arch/x86/Rules.mk  |  2 ++
>  xen/arch/x86/x86_64/compat/entry.S |  4 
>  xen/arch/x86/x86_64/entry.S|  4 
>  xen/include/asm-x86/config.h   |  1 -
>  xen/include/asm-x86/xenoprof.h | 19 +++
>  xen/include/xen/xenoprof.h | 25 +++--
>  7 files changed, 49 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index 8e6e901..434d985 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -3,7 +3,7 @@ subdir-y += cpu
>  subdir-y += genapic
>  subdir-y += hvm
>  subdir-y += mm
> -subdir-y += oprofile
> +subdir-$(xenoprof) += oprofile
>  subdir-y += x86_64
>  
>  obj-bin-y += alternative.init.o
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index a1cdae0..94e4efd 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -10,6 +10,8 @@ CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
>  CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
>  CFLAGS += '-D__OBJECT_LABEL__=$(subst /,$$,$(subst -,_,$(subst 
> $(BASEDIR)/,,$(CURDIR))/$@))'
>  
> +CFLAGS-$(xenoprof) += -DCONFIG_XENOPROF
> +
>  # Prevent floating-point variables from creeping into Xen.
>  CFLAGS += -msoft-float
>  
> diff --git a/xen/arch/x86/x86_64/compat/entry.S 
> b/xen/arch/x86/x86_64/compat/entry.S
> index 3088aa7..6424ed0 100644
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -394,6 +394,10 @@ compat_crash_page_fault:
>  #define compat_kexec_op do_ni_hypercall
>  #endif
>  
> +#ifndef CONFIG_XENOPROF
> +#define compat_xenoprof_op do_ni_hypercall
> +#endif
> +
>  ENTRY(compat_hypercall_table)
>  .quad compat_set_trap_table /*  0 */
>  .quad do_mmu_update
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 94a54aa..0a73878 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -727,6 +727,10 @@ ENTRY(exception_table)
>  #define do_kexec_op do_ni_hypercall
>  #endif
>  
> +#ifndef CONFIG_XENOPROF
> +#define do_xenoprof_op do_ni_hypercall
> +#endif
> +
>  ENTRY(hypercall_table)
>  .quad do_set_trap_table /*  0 */
>  .quad do_mmu_update
> diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
> index d97877d..a45d3ee 100644
> --- a/xen/include/asm-x86/config.h
> +++ b/xen/include/asm-x86/config.h
> @@ -47,7 +47,6 @@
>  #define CONFIG_VGA 1
>  #define CONFIG_VIDEO 1
>  
> -#define CONFIG_XENOPROF 1
>  #define CONFIG_WATCHDOG 1
>  
>  #define CONFIG_MULTIBOOT 1
> diff --git a/xen/include/asm-x86/xenoprof.h b/xen/include/asm-x86/xenoprof.h
> index b006ddc..7044084 100644
> --- a/xen/include/asm-x86/xenoprof.h
> +++ b/xen/include/asm-x86/xenoprof.h
> @@ -67,9 +67,28 @@ void xenoprof_backtrace(struct vcpu *, const struct 
> cpu_user_regs *,
>   "xenoprof/x86 with autotranslated mode enabled"\
>   "isn't supported yet\n");  \
>  } while (0)
> +
> +#ifdef CONFIG_XENOPROF

Sorry for not commenting before you resubmitted.

I would suggest having this #ifdef immediately under this file's
inclusion guard.  That way you hide all symbols and declarations in the
!CONFIG_XENOPROF case.  Given that this patch compiles, I don't expect
you will encounter any issues from moving it.

>  int passive_domain_do_rdmsr(unsigned int msr, uint64_t *msr_content);
>  int passive_domain_do_wrmsr(unsigned int msr, uint64_t msr_content);
>  void passive_domain_destroy(struct vcpu *v);
> +#else

As a matter of style, please put newlines around here to space it out a
bit, and /* CONFIG_XENOPROF */ after the else, as the original #if is a
long way away.

> +static inline int passive_domain_do_rdmsr(unsigned int msr,
> +  uint64_t *msr_content)
> +{
> +return 0;
> +}
> +
> +static inline int passive_domain_do_wrmsr(unsigned int msr,
> +  uint64_t msr_content)
> +{
> +return 0;
> +}
> +
> +static inline void passive_domain_destroy(struct vcpu *v)

For brevity, "static inline void passive_domain_destroy(struct vcpu *v)
{}" is perfectly fine here.
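
Putting the three suggestions together, the header could end up shaped roughly like the sketch below. It is cut down to the three passive_domain_* functions, the guard name DEMO_XENOPROF_H and the caller demo_handle_rdmsr() are invented for the example, and it is not the final patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef DEMO_XENOPROF_H
#define DEMO_XENOPROF_H

struct vcpu;  /* opaque here; only pointers are passed around */

#ifdef CONFIG_XENOPROF

int passive_domain_do_rdmsr(unsigned int msr, uint64_t *msr_content);
int passive_domain_do_wrmsr(unsigned int msr, uint64_t msr_content);
void passive_domain_destroy(struct vcpu *v);

#else /* !CONFIG_XENOPROF */

static inline int passive_domain_do_rdmsr(unsigned int msr,
                                          uint64_t *msr_content)
{
    return 0;
}

static inline int passive_domain_do_wrmsr(unsigned int msr,
                                          uint64_t msr_content)
{
    return 0;
}

static inline void passive_domain_destroy(struct vcpu *v) {}

#endif /* CONFIG_XENOPROF */

#endif /* DEMO_XENOPROF_H */

/* Hypothetical caller: builds unchanged whether or not the option is set. */
static int demo_handle_rdmsr(unsigned int msr, uint64_t *val)
{
    assert(val != NULL);
    return passive_domain_do_rdmsr(msr, val);
}
```

Because the stubs are static inline and do nothing, callers need no #ifdef of their own and the compiler can drop the calls entirely when the option is off.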

> +{
> +}
> +#endif
>  
>  #endif /* __ASM_X86_XENOPROF_H__ */
>  
> diff --git 

Re: [Xen-devel] [PATCH] xenpaging: do not leak if --pagefile given twice

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 17, 2016 at 02:58:33PM +, Ian Campbell wrote:
> By freeing filename (which is either NULL or the previous iteration of
> this argument). This implements a semantic where the last --pagefile
> given on the command line takes precedence.
> 
> This is the same semantic as the other options have.
> 
> CID: 1198792
> 

Reviewed-by: Konrad Rzeszutek Wilk 

> Signed-off-by: Ian Campbell 
> ---
>  tools/xenpaging/xenpaging.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/xenpaging/xenpaging.c b/tools/xenpaging/xenpaging.c
> index 0377507..6157d3a 100644
> --- a/tools/xenpaging/xenpaging.c
> +++ b/tools/xenpaging/xenpaging.c
> @@ -232,6 +232,7 @@ static int xenpaging_getopts(struct xenpaging *paging, 
> int argc, char *argv[])
>  paging->vm_event.domain_id = atoi(optarg);
>  break;
>  case 'f':
> +free(filename);
>  filename = strdup(optarg);
>  break;
>  case 'm':
> -- 
> 2.1.4
> 
> 


Re: [Xen-devel] [PATCH v2 17/30] x86/cpu: Common infrastructure for levelling context switching

2016-02-17 Thread Konrad Rzeszutek Wilk
On Fri, Feb 05, 2016 at 01:42:10PM +, Andrew Cooper wrote:
> This change is purely scaffolding to reduce the complexity of the following
> three patches.

Keep in mind that the patches may not be applied right after this.

It would be easier to just spell out the three patches.



[Xen-devel] Is: PVH dom0 - MWAIT detection logic to get deeper C-states exposed in ACPI AML code. Was:Re: [PATCH v2 10/30] xen/x86: Annotate VM applicability in featureset

2016-02-17 Thread Konrad Rzeszutek Wilk
On Mon, Feb 15, 2016 at 03:41:41PM +, Andrew Cooper wrote:
> On 15/02/16 15:02, Jan Beulich wrote:
>  On 15.02.16 at 15:53,  wrote:
> >> On 15/02/16 14:50, Jan Beulich wrote:
> >> On 15.02.16 at 15:38,  wrote:
>  On 15/02/16 09:20, Jan Beulich wrote:
>  On 12.02.16 at 18:42,  wrote:
> >> On 12/02/16 17:05, Jan Beulich wrote:
> >> On 05.02.16 at 14:42,  wrote:
>   #define X86_FEATURE_MWAITX        ( 3*32+29) /*   MWAIT extension (MONITORX/MWAITX) */
> >>> Why not exposed to HVM (also for _MWAIT as I now notice)?
> >> Because that is a good chunk of extra work to support.  We would need 
> >> to
> >> use 4K monitor widths, and extra p2m handling.
> > I don't understand: The base (_MWAIT) feature being exposed to
> > guests today, and kernels making use of the feature when available
> > suggests to me that things work. Are you saying you know
> > otherwise? (And if there really is a reason to mask the feature all of
> > the sudden, this should again be justified in the commit message.)
>  PV guests had it clobbered by Xen in traps.c
> 
>  HVM guests have:
> 
>  vmx.c:
>  case EXIT_REASON_MWAIT_INSTRUCTION:
>  case EXIT_REASON_MONITOR_INSTRUCTION:
>  [...]
>  hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
>  break;
> 
>  and svm.c:
>  case VMEXIT_MONITOR:
>  case VMEXIT_MWAIT:
>  hvm_inject_hw_exception(TRAP_invalid_op, 
>  HVM_DELIVER_NO_ERROR_CODE);
>  break;
> 
>  I don't see how a guest could actually use this feature.
> >>> Do you see the respective intercepts getting enabled anywhere?
> >>> (I don't outside of nested code, which I didn't check in detail.)
> >> Yes - the intercepts are always enabled to prevent the guest actually
> >> putting the processor to sleep.
> > Hmm, you're right, somehow I've managed to ignore the relevant
> > lines grep reported. Yet - how do things work then, without the
> > MWAIT feature flag currently getting cleared?
> 
> I have never observed it being used.  Do you have some local patches in
> the SLES hypervisor?
> 
> There is some gross layer violation in xen/enlighten.c to pretend that
> MWAIT is present to trick the ACPI code into evaluating _CST() methods
> to report back to Xen.  (This is yet another PV-ism which will cause a
> headache for a DMLite dom0)

Yes indeed. CC-ing Roger, and Boris.

> 
> ~Andrew
> 


Re: [Xen-devel] [PATCH 5/5] x86: drop failsafe callback invocation from assembly

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:38, Jan Beulich wrote:
> Afaict this was never necessary on a 64-bit hypervisor, and was instead
> just blindly cloned over from 32-bit code: We don't fiddle with (and
> hence don't reload) any of DS, ES, FS, or GS, and an exception on IRET
> itself can equally well be reported to the guest as that very exception
> on the target of that IRET.
>
> Signed-off-by: Jan Beulich 

As best as I can tell, this looks safe.

Reviewed-by: Andrew Cooper 

This is one area I intend to write an XTF test for, but I haven't had
time to yet.  I will see if I can dig out the start of the test and
complete it.

~Andrew



Re: [Xen-devel] 82599 passthrough problem

2016-02-17 Thread Konrad Rzeszutek Wilk
On Thu, Feb 04, 2016 at 02:42:32PM +0800, Norton.Zhu wrote:
> XEN HVM, after it hung, no serial log printed.
> it hang in __pirq_guest_unbind. its call stack as follows:
> (XEN)[] __pirq_guest_unbind+0x36/0x350
> (XEN)[] __pirq_guest_unbind+0x36/0x350
> (XEN)[] do_invalid_op+0x30b/0x3f0
> (XEN)[] flush_area_mask+0x7c/0x130
> (XEN)[] handle_exception_saved+0x30/0x6e
> (XEN)[] __pirq_guest_unbind+0x2c/0x350
> (XEN)[] domain_spin_lock_irq_desc+0x64/0xa0
> (XEN)[] pirq_guest_unbind+0x5d/0x170
> (XEN)[] pci_release_devices+0x13e/0x230
> (XEN)[] domain_relinquish_resources+0xa8/0x2a0

Any chance you can provide the xen-syms? Also what version of Xen
is this?

Is the passthrough done via pci-attach or via 'pci=' guest config?

Does this happen all the time? What happens if you have 'sync_console'
on the Xen command line?

What happens if you detach (pci-detach) the PCI device from the guest
before shutdown? Does it work?

What version of QEMU are you using? qemu-xen or qemu-traditional?

Thanks.
> 
> On 2016/2/4 1:54, Konrad Rzeszutek Wilk wrote:
> > On Tue, Feb 02, 2016 at 12:04:36PM +0800, Norton.Zhu wrote:
> >> Hi,all:
> >>
> >> I met a problem when passthrough 82599 to domU(for some reasons, I use pci 
> >> passthrough not SRIOV).
> > 
> > Is this PV or HVM?
> > 
> >> It works well after passthrough, but the host hung after destroy domU.
> >> Btw, no log prints even from serial port, but I found it hung after unbind 
> >> irq.
> >> anyone knows what's wrong with it? thanks.
> > 
> > Serial logs?
> >>
> >>
> > 
> > 
> 



[Xen-devel] [linux-linus test] 82911: regressions - trouble: blocked/broken/fail/pass

2016-02-17 Thread osstest service owner
flight 82911 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/82911/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen          6 xen-build                       fail REGR. vs. 59254
 build-amd64-rumpuserxen         6 xen-build                       fail REGR. vs. 59254
 test-amd64-amd64-xl-xsm        15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-i386-xl-xsm         15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-amd64-xl            15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-amd64-xl-credit2    15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-amd64-xl-multivcpu  15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-i386-xl             15 guest-localmigrate              fail REGR. vs. 59254
 test-amd64-amd64-pair          22 guest-migrate/dst_host/src_host fail REGR. vs. 59254
 test-amd64-i386-pair           22 guest-migrate/dst_host/src_host fail REGR. vs. 59254
 test-armhf-armhf-xl-credit2    15 guest-start/debian.repeat       fail REGR. vs. 59254
 test-armhf-armhf-xl-multivcpu  15 guest-start/debian.repeat       fail REGR. vs. 59254
 test-armhf-armhf-xl-cubietruck 16 guest-start.2                   fail REGR. vs. 59254
 test-armhf-armhf-xl             8 leak-check/basis(8)             fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-rtds       15 guest-localmigrate              fail REGR. vs. 59254
 test-armhf-armhf-xl-rtds       11 guest-start                     fail REGR. vs. 59254
 test-amd64-amd64-libvirt-pair  22 guest-migrate/dst_host/src_host fail baseline untested
 test-amd64-i386-libvirt-pair   22 guest-migrate/dst_host/src_host fail baseline untested
 test-armhf-armhf-xl-vhd         9 debian-di-install               fail baseline untested
 test-amd64-i386-libvirt-xsm    15 guest-saverestore.2             fail blocked in 59254
 test-amd64-amd64-libvirt       15 guest-saverestore.2             fail blocked in 59254
 test-amd64-i386-libvirt        15 guest-saverestore.2             fail blocked in 59254
 test-amd64-amd64-libvirt-xsm   15 guest-saverestore.2             fail blocked in 59254
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop                fail like 59254
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop                fail like 59254
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 59254

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386     1 build-check(1)             blocked n/a
 test-amd64-amd64-rumpuserxen-amd64   1 build-check(1)             blocked n/a
 test-amd64-i386-libvirt-xsm         12 migrate-support-check      fail never pass
 test-amd64-amd64-libvirt            12 migrate-support-check      fail never pass
 test-amd64-i386-libvirt             12 migrate-support-check      fail never pass
 test-amd64-amd64-xl-pvh-amd         11 guest-start                fail never pass
 test-amd64-amd64-xl-pvh-intel       14 guest-saverestore          fail never pass
 test-armhf-armhf-libvirt-xsm        12 migrate-support-check      fail never pass
 test-armhf-armhf-libvirt-xsm        14 guest-saverestore          fail never pass
 test-amd64-amd64-qemuu-nested-intel 13 xen-boot/l1                fail never pass
 test-armhf-armhf-xl-arndale         12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-arndale         13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2         13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-credit2         12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-multivcpu       13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-multivcpu       12 migrate-support-check      fail never pass
 test-amd64-amd64-qemuu-nested-amd   13 xen-boot/l1                fail never pass
 test-armhf-armhf-xl-xsm             13 saverestore-support-check  fail never pass
 test-armhf-armhf-xl-xsm             12 migrate-support-check      fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck      12 migrate-support-check      fail never pass
 test-armhf-armhf-xl-cubietruck      13 saverestore-support-check  fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt-xsm        12 migrate-support-check      fail never pass
 test-amd64-amd64-libvirt-vhd        11 migrate-support-check      fail never pass
 test-armhf-armhf-libvirt            14 guest-saverestore          fail never pass
 test-armhf-armhf-libvirt            12 migrate-support-check      fail never pass
 test-armhf-armhf-libvirt-raw        13 guest-saverestore          fail never pass
 test-armhf-armhf-libvirt-raw        11 migrate-support-check      fail never pass
 test-armhf-armhf-libvirt-qcow2      11 migrate-support-check      fail never pass
 test-armhf-armhf-libvirt-qcow2      13 guest-saverestore          fail never pass

version targeted for testing:
 linux

Re: [Xen-devel] [PATCH 4/5] VMX: fold redundant code

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:37, Jan Beulich wrote:
> No need to do this in two slightly different ways, possibly keeping the
> compiler from folding the code for us.
>
> Signed-off-by: Jan Beulich 
>
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3103,6 +3103,7 @@ void vmx_vmexit_handler(struct cpu_user_
>   && vector != TRAP_nmi 
>   && vector != TRAP_machine_check ) 

It would be nice if you could nuke the two bits of whitespace after the
TRAP_* constants while you are editing this area.

Reviewed-by: Andrew Cooper 



[Xen-devel] [PATCH v4 1/2] xenoprof: fix up ability to disable it

2016-02-17 Thread Doug Goldstein
Allow Xenoprof to be fully disabled when toggling the option off.

Signed-off-by: Doug Goldstein 
---
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 

change since v3:
- drop (void)var; from static inlines
- fix typo that broke build (must have forgotten to do XEN_CONFIG_EXPERT=y make)
change since v2:
- move all functions in xenoprof.h inside CONFIG_XENOPROF as suggested by
  Andrew Cooper
change since v1:
- switch to #define empty 'functions' as suggested by Andrew Cooper
---
 xen/arch/x86/Makefile  |  2 +-
 xen/arch/x86/Rules.mk  |  2 ++
 xen/arch/x86/x86_64/compat/entry.S |  4 
 xen/arch/x86/x86_64/entry.S|  4 
 xen/include/asm-x86/config.h   |  1 -
 xen/include/asm-x86/xenoprof.h | 19 +++
 xen/include/xen/xenoprof.h | 25 +++--
 7 files changed, 49 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 8e6e901..434d985 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -3,7 +3,7 @@ subdir-y += cpu
 subdir-y += genapic
 subdir-y += hvm
 subdir-y += mm
-subdir-y += oprofile
+subdir-$(xenoprof) += oprofile
 subdir-y += x86_64
 
 obj-bin-y += alternative.init.o
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index a1cdae0..94e4efd 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -10,6 +10,8 @@ CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
 CFLAGS += '-D__OBJECT_LABEL__=$(subst /,$$,$(subst -,_,$(subst $(BASEDIR)/,,$(CURDIR))/$@))'
 
+CFLAGS-$(xenoprof) += -DCONFIG_XENOPROF
+
 # Prevent floating-point variables from creeping into Xen.
 CFLAGS += -msoft-float
 
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 3088aa7..6424ed0 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -394,6 +394,10 @@ compat_crash_page_fault:
 #define compat_kexec_op do_ni_hypercall
 #endif
 
+#ifndef CONFIG_XENOPROF
+#define compat_xenoprof_op do_ni_hypercall
+#endif
+
 ENTRY(compat_hypercall_table)
 .quad compat_set_trap_table /*  0 */
 .quad do_mmu_update
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 94a54aa..0a73878 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -727,6 +727,10 @@ ENTRY(exception_table)
 #define do_kexec_op do_ni_hypercall
 #endif
 
+#ifndef CONFIG_XENOPROF
+#define do_xenoprof_op do_ni_hypercall
+#endif
+
 ENTRY(hypercall_table)
 .quad do_set_trap_table /*  0 */
 .quad do_mmu_update
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index d97877d..a45d3ee 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -47,7 +47,6 @@
 #define CONFIG_VGA 1
 #define CONFIG_VIDEO 1
 
-#define CONFIG_XENOPROF 1
 #define CONFIG_WATCHDOG 1
 
 #define CONFIG_MULTIBOOT 1
diff --git a/xen/include/asm-x86/xenoprof.h b/xen/include/asm-x86/xenoprof.h
index b006ddc..7044084 100644
--- a/xen/include/asm-x86/xenoprof.h
+++ b/xen/include/asm-x86/xenoprof.h
@@ -67,9 +67,28 @@ void xenoprof_backtrace(struct vcpu *, const struct cpu_user_regs *,
  "xenoprof/x86 with autotranslated mode enabled"\
  "isn't supported yet\n");  \
 } while (0)
+
+#ifdef CONFIG_XENOPROF
 int passive_domain_do_rdmsr(unsigned int msr, uint64_t *msr_content);
 int passive_domain_do_wrmsr(unsigned int msr, uint64_t msr_content);
 void passive_domain_destroy(struct vcpu *v);
+#else
+static inline int passive_domain_do_rdmsr(unsigned int msr,
+  uint64_t *msr_content)
+{
+return 0;
+}
+
+static inline int passive_domain_do_wrmsr(unsigned int msr,
+  uint64_t msr_content)
+{
+return 0;
+}
+
+static inline void passive_domain_destroy(struct vcpu *v)
+{
+}
+#endif
 
 #endif /* __ASM_X86_XENOPROF_H__ */
 
diff --git a/xen/include/xen/xenoprof.h b/xen/include/xen/xenoprof.h
index 9b9ef56..8148c01 100644
--- a/xen/include/xen/xenoprof.h
+++ b/xen/include/xen/xenoprof.h
@@ -64,19 +64,32 @@ struct xenoprof {
 #endif
 
 struct domain;
-int is_active(struct domain *d);
-int is_passive(struct domain *d);
-void free_xenoprof_pages(struct domain *d);
-
-int xenoprof_add_trace(struct vcpu *, uint64_t pc, int mode);
-
 #define PMU_OWNER_NONE  0
 #define PMU_OWNER_XENOPROF  1
 #define PMU_OWNER_HVM   2
+
+#ifdef CONFIG_XENOPROF
 int acquire_pmu_ownership(int pmu_ownership);
 void release_pmu_ownership(int pmu_ownership);
 
+int is_active(struct domain *d);
+int is_passive(struct domain *d);
+void free_xenoprof_pages(struct domain *d);
+
+int xenoprof_add_trace(struct vcpu *, uint64_t pc, int mode);
+
 void xenoprof_log_event(struct vcpu *, const struct cpu_user_regs *,

[Xen-devel] [PATCH v4 2/2] build: convert xenoprof to Kconfig

2016-02-17 Thread Doug Goldstein
Convert the xenoprof x86 build time option to Kconfig.

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
Reviewed-by: Andrew Cooper 
Signed-off-by: Doug Goldstein 
Acked-by: Jan Beulich 
---
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 

change since v3:
- move xenoprof entry to the main sources list
- combine 'default' and 'bool' into 'def_bool'
change since v2:
- require EXPERT for XENOPROF as suggested by Jan Beulich
change since v1:
- fix name of Kconfig entry as suggested by Andrew Cooper
---
 xen/arch/x86/Makefile |  2 +-
 xen/arch/x86/Rules.mk |  3 ---
 xen/common/Kconfig| 13 +
 xen/common/Makefile   |  2 +-
 4 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 434d985..1bcb08b 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -3,7 +3,7 @@ subdir-y += cpu
 subdir-y += genapic
 subdir-y += hvm
 subdir-y += mm
-subdir-$(xenoprof) += oprofile
+subdir-$(CONFIG_XENOPROF) += oprofile
 subdir-y += x86_64
 
 obj-bin-y += alternative.init.o
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 94e4efd..14519e3 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -3,15 +3,12 @@
 
 HAS_NUMA := y
 HAS_CORE_PARKING := y
-xenoprof := y
 
 CFLAGS += -I$(BASEDIR)/include 
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
 CFLAGS += '-D__OBJECT_LABEL__=$(subst /,$$,$(subst -,_,$(subst $(BASEDIR)/,,$(CURDIR))/$@))'
 
-CFLAGS-$(xenoprof) += -DCONFIG_XENOPROF
-
 # Prevent floating-point variables from creeping into Xen.
 CFLAGS += -msoft-float
 
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 6f404b4..49de790 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -84,6 +84,19 @@ config LATE_HWDOM
 
  If unsure, say N.
 
+# Adds support for Xenoprof
+config XENOPROF
+   def_bool y
+   prompt "Xen Oprofile Support" if EXPERT = "y"
+   depends on X86
+   ---help---
+ Xen OProfile (Xenoprof) is a system-wide profiler for Xen virtual
+ machine environments, capable of profiling the Xen virtual machine
+ monitor, multiple Linux guest operating systems, and applications
+ running on them.
+
+ If unsure, say Y.
+
 # Enable/Disable XSM support
 config XSM
bool "Xen Security Modules support"
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 0d76efe..57f4ed7 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -57,13 +57,13 @@ obj-y += vm_event.o
 obj-y += vmap.o
 obj-y += vsprintf.o
 obj-y += wait.o
+obj-$(CONFIG_XENOPROF) += xenoprof.o
 obj-y += xmalloc_tlsf.o
 
 obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o)
 
 obj-$(perfc)   += perfc.o
 obj-$(crash_debug) += gdbstub.o
-obj-$(xenoprof)+= xenoprof.o
 
 obj-$(CONFIG_COMPAT) += $(addprefix compat/,domain.o kernel.o memory.o multicall.o tmem_xen.o xlat.o)
 
-- 
2.4.10


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH 3/5] x86emul: simplify IRET logic

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:36, Jan Beulich wrote:
> Since we only handle real mode, we need to consider neither non-ring0
> nor IOPL. Also for POPF the mode_iopl() check can really be inside the
> not-ring-0 body.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



Re: [Xen-devel] [PATCH 2/5] x86emul: limit-check branch targets

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:35, Jan Beulich wrote:
> All branches need to #GP when their target violates the segment limit
> (in 16- and 32-bit modes) or is non-canonical (in 64-bit mode). For
> near branches facilitate this via a zero-byte instruction fetch from
> the target address (resulting in address translation and validation
> without an actual read from memory), while far branches get dealt with
> by breaking up the segment register loading into a read-and-validate
> part and a write one. The latter at once allows correcting some
> ordering issues in how the individual emulation steps get carried out:
> Before updating machine state, all exceptions unrelated to that state
> updating should have got raised (i.e. the only ones possibly resulting
> in partly updated state are faulting memory writes [pushes]).
>
> Note that while not immediately needed here, write and distinct read
> emulation routines get updated to deal with zero byte accesses too, for
> overall consistency.
>
> Reported-by: 刘令
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 
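
For readers skimming the thread, the target check described in the commit message can be sketched roughly as follows. This is only an illustration, not the emulator's code: `enum cpu_mode_sketch`, `is_canonical_sketch()` and `branch_target_ok_sketch()` are hypothetical names, and the canonical test assumes 48 implemented virtual address bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mode selector; not an x86emul type. */
enum cpu_mode_sketch { MODE_16BIT, MODE_32BIT, MODE_64BIT };

/*
 * Assumes 48 implemented virtual address bits: an address is canonical
 * when bits 63..47 are either all clear or all set.
 */
static bool is_canonical_sketch(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* bits 63..47, 17 bits wide */

    return top == 0 || top == 0x1ffff;
}

/*
 * Returns true if a branch to 'target' is acceptable and false if it
 * should raise #GP. 'limit' is the highest valid offset in the code
 * segment and is only consulted in 16- and 32-bit modes.
 */
static bool branch_target_ok_sketch(enum cpu_mode_sketch mode,
                                    uint64_t target, uint32_t limit)
{
    if (mode == MODE_64BIT)
        return is_canonical_sketch(target);

    return target <= limit;
}
```

The patch itself performs this validation through a zero-byte instruction fetch (near branches) or a split segment register load (far branches); the sketch only shows the condition being tested.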



Re: [Xen-devel] [PATCH v5 0/2] Vm-events: move monitor vm-events code to common-side.

2016-02-17 Thread Corneliu ZUZU

On 2/17/2016 5:09 PM, Konrad Rzeszutek Wilk wrote:
> On Wed, Feb 17, 2016 at 09:33:58AM +0200, Corneliu ZUZU wrote:
>> This patch series is an attempt to move some of the monitor vm-events code to
>> the common-side. Done to make it easier to move additional parts that can be
>> moved to common when ARM-side implementations are to be added.
>
> Both applied.
>
>> Patches summary:
>> 1. Fix file comment
>>  Acked-by: Stefano Stabellini 
>> 2. Move monitor_domctl to common-side.
>>  Acked-by: Stefano Stabellini 
>>  Acked-by: Razvan Cojocaru 
>>
>> Note: ARM support for guest-request, control-register write monitor vm-events
>> will follow after review of this patch-series.
>>
>> ---
>> Changed since v4:
>>    1/2: nothing changed
>>    2/2: arch_monitor_domctl_event:
>>  replaced !old_status w/ requested_status (equivalent but more readable)
>>
>> Corneliu ZUZU (2):
>>    xen/arm: fix file comments
>>    xen/vm-events: Move parts of monitor_domctl code to common-side.
>>
>>   MAINTAINERS   |   1 +
>>   xen/arch/arm/hvm.c|  29 +++-
>>   xen/arch/x86/monitor.c| 153 +++---
>>   xen/common/Makefile   |   1 +
>>   xen/common/domctl.c   |   2 +-
>>   xen/common/monitor.c  |  69 +++
>>   xen/include/asm-arm/monitor.h |  30 +++--
>>   xen/include/asm-x86/monitor.h |  53 +--
>>   xen/include/xen/monitor.h |  30 +
>>   9 files changed, 245 insertions(+), 123 deletions(-)
>>   create mode 100644 xen/common/monitor.c
>>   create mode 100644 xen/include/xen/monitor.h
>>
>> --
>> 2.5.0


Thanks!

Corneliu.



Re: [Xen-devel] [PATCH 1/5] x86emul: fix rIP handling

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:32, Jan Beulich wrote:
> Deal with rIP just like with any other register: Truncate to designated
> width upon entry, write back the zero-extended 32-bit value when
> emulating 32-bit code, and leave the upper 48 bits unchanged for 16-bit
> code.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 
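
As a rough illustration of the write-back rule in the commit message (not the code from the patch), the three cases might look like the sketch below. `writeback_ip_sketch()` and its `width` parameter are hypothetical names introduced only for this example.

```c
#include <stdint.h>

/*
 * Merge an updated instruction pointer into the previous rIP value.
 * 'width' is the code width in bits (16, 32 or 64).
 *
 *  16-bit code: only the low 16 bits change, the upper 48 are preserved.
 *  32-bit code: the 32-bit value is written back zero-extended.
 *  64-bit code: the full value is written back.
 */
static uint64_t writeback_ip_sketch(uint64_t old_rip, uint64_t new_ip,
                                    unsigned int width)
{
    switch (width) {
    case 16:
        return (old_rip & ~0xffffULL) | (new_ip & 0xffffULL);
    case 32:
        return new_ip & 0xffffffffULL;
    default:
        return new_ip;
    }
}
```

This mirrors how writes to general purpose registers behave, which is the point the commit message makes about treating rIP "just like any other register".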



Re: [Xen-devel] [PATCH] x86/mm: slightly simplify mod_l1_entry()

2016-02-17 Thread Andrew Cooper
On 17/02/16 16:10, Jan Beulich wrote:
> Re-order code to simplify error cleanup.
>
> Signed-off-by: Jan Beulich 

Reviewed-by: Andrew Cooper 



[Xen-devel] Xen Project Infrastructure Maintenance: Feb 25th and 1st of March

2016-02-17 Thread Lars Kurth
Hi everyone,

we are rebooting a number of Xen Project services in the next few days to 
upgrade operating systems. This means that a few services may be temporarily 
unavailable. The affected services and their maintenance windows are listed 
below. If you notice any issues after the reboot, please reply to this 
mail or check on the #xeninfra IRC channel.

= Thursday, Feb 25th, 7am - 9am UTC : mailing lists & mail server =
Affected services: 
- all foo @ lists at xenproject dot org mailing lists
- forwarding of mails to foo @ xenproject dot org accounts
- downloads.xenproject.org
- DNS

Note that mails to all mailing lists will be delayed during the maintenance 
window.

= Tuesday, March 1st, 7am - 9am UTC : xenbits and xen etherpads =
Affected services: 
- Xen Project source code repositories
- Automatically generated xenproject docs
- List of security vulnerabilities
- Xen Project etherpads

If you need to clone or commit to any affected repositories, please do so before 
or after the maintenance window.

Best Regards
Lars




Re: [Xen-devel] [PATCH V2 2/3] docs: fix typo in xl-disk-configuration.txt

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 17, 2016 at 10:20:58AM -0700, Jim Fehlig wrote:
> Signed-off-by: Jim Fehlig 
> Acked-by: Ian Campbell 

applied
> ---
>  docs/misc/xl-disk-configuration.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xl-disk-configuration.txt 
> b/docs/misc/xl-disk-configuration.txt
> index 6a2118d..29f6ddb 100644
> --- a/docs/misc/xl-disk-configuration.txt
> +++ b/docs/misc/xl-disk-configuration.txt
> @@ -160,7 +160,7 @@ Mandatory: No
>  Default value: Automatically determine which backend to use.
>  
>  This does not affect the guest's view of the device.  It controls
> -which software implementation of the Xen backend driver us used.
> +which software implementation of the Xen backend driver is used.
>  
>  Not all backend drivers support all combinations of other options.
>  For example, "phy" does not support formats other than "raw".
> -- 
> 2.1.4
> 
> 



Re: [Xen-devel] [PATCH V2 1/3] libxlu_cfg: reject unknown characters following '\'

2016-02-17 Thread Konrad Rzeszutek Wilk
On Wed, Feb 17, 2016 at 10:20:57AM -0700, Jim Fehlig wrote:
> When dequoting config strings in xlu__cfgl_dequote(), unknown
> characters following a '\', and the '\' itself, are discarded.
> E.g. a disk configuration string containing
> 
>   rbd:pool/image:mon_host=192.168.0.100\:6789
> 
> would be dequoted as
> 
>   rbd:pool/image:mon_host=192.168.0.1006789
> 
> Instead of discarding the '\' and unknown character, reject the
> string and set error to EINVAL.
> 
> Signed-off-by: Jim Fehlig 
> Acked-by: Ian Campbell 

applied
> ---
>  tools/libxl/libxlu_cfg.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
> index 1d70909..5838f68 100644
> --- a/tools/libxl/libxlu_cfg.c
> +++ b/tools/libxl/libxlu_cfg.c
> @@ -533,6 +533,11 @@ char *xlu__cfgl_dequote(CfgParseContext *ctx, const char 
> *src) {
>  NUMERIC_CHAR(2,2,16,"hex");
>  } else if (nc>='0' && nc<='7') {
>  NUMERIC_CHAR(1,3,10,"octal");
> +} else {
> +xlu__cfgl_lexicalerror(ctx,
> +   "invalid character after backlash in quoted 
> string");
> +ctx->err= EINVAL;
> +goto x;
>  }
>  assert(p <= src+len-1);
>  } else {
> -- 
> 2.1.4
> 
> 



[Xen-devel] Stabilising some tools only HVMOPs?

2016-02-17 Thread Wei Liu
Hi all

Tools people are in the process of splitting libxenctrl into a set of
stable libraries. One of the proposed libraries is libxendevicemodel,
which has a collection of APIs that can be used by device models.

Currently we use QEMU as a reference to extract symbols and go through
them one by one. Along the way we discovered that QEMU is using some
tools-only HVMOPs.

The tools-only HVMOPs used by QEMU are:

  #define HVMOP_track_dirty_vram 6
  #define HVMOP_modified_memory 7
  #define HVMOP_set_mem_type 8
  #define HVMOP_inject_msi 16
  #define HVMOP_create_ioreq_server 17
  #define HVMOP_get_ioreq_server_info 18
  #define HVMOP_map_io_range_to_ioreq_server 19
  #define HVMOP_unmap_io_range_from_ioreq_server 20
  #define HVMOP_destroy_ioreq_server 21
  #define HVMOP_set_ioreq_server_state 22

I'm curious about the rationale for making them tools only in the
first place and what needs to be done to make them stable.

The option to build stable library APIs on top of unstable hypervisor
APIs is always there, but that looks suboptimal to me.

Wei.



Re: [Xen-devel] [PATCH 1/3] libxlu_cfg: reject unknown characters following '\'

2016-02-17 Thread Jim Fehlig
On 02/17/2016 03:11 AM, Ian Campbell wrote:
> On Wed, 2016-02-17 at 10:05 +0000, Ian Campbell wrote:
>> On Tue, 2016-02-16 at 20:54 -0700, Jim Fehlig wrote:
>>> When dequoting config strings in xlu__cfgl_dequote(), unknown
>>> characters following a '\', and the '\' itself, are discarded.
>>> E.g. a disk configuration string containing
>>>
>>>   rbd:pool/image:mon_host=192.168.0.100\:6789
>>>
>>> would be dequoted as
>>>
>>>   rbd:pool/image:mon_host=192.168.0.1006789
>>>
>>> Instead of discarding the '\' and unknown character, reject the
>>> string and set error to EINVAL.
>> Missing your S-o-b.
>>
>> Other than that:
>>
>>> +xlu__cfgl_lexicalerror(ctx, "invalid character after
>>> backlash "
>>> +   "in quoted string");
>> Please try where possible not to split string constants (so log messages
>> can more easily be grepped for).
> I see now that this parsing code is pretty liberally ignoring this advice
> already. So apart from the missing S-o-b this patch is

Your suggestion is a good one and is common throughout much of libxl, so I
pulled back the error string indentation in V2.

Regards,
Jim




Re: [Xen-devel] [PATCH 3/3] docs: add more info about target= in disk config

2016-02-17 Thread Jim Fehlig
On 02/17/2016 03:10 AM, Ian Campbell wrote:
> On Tue, 2016-02-16 at 20:54 -0700, Jim Fehlig wrote:
>> target= in disk config can be used to convey arbitrary
>> configuration information to backends. Add a bit more info
>> to xl-disk-configuration.txt to clarify this, including some
>> simple nbd and rbd qdisk configurations.
> Missing S-o-b.
>
>> ---
>>  docs/misc/xl-disk-configuration.txt | 10 +-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/misc/xl-disk-configuration.txt b/docs/misc/xl-disk-
>> configuration.txt
>> index 29f6ddb..0918fb8 100644
>> --- a/docs/misc/xl-disk-configuration.txt
>> +++ b/docs/misc/xl-disk-configuration.txt
>> @@ -75,7 +75,15 @@ Special syntax:
>> the target was already specified as a positional parameter.  This
>> is the only way to specify a target string containing metacharacters
>> such as commas and (in some cases) colons, which would otherwise be
>> -   misinterpreted.
>> +   misinterpreted. Meta-information in a target string can be used to
>> +   specify configuration information for a qdisk block backend. For
>> +   example the nbd and rbd qdisk block backends can be configured with
>> +
>> + target=nbd:192.168.0.1:
>> + target=rbd:pool/image:mon_host=192.186.0.1\\:6789
>> +
>> +   Note the use of double backslash ('\\') for metacharacters that need
>> +   escaped.
> "need to be escaped".
>
> However I wouldn't describe "\\" that way, I think I would say "note that \
> is used to escape metacharacters and therefore to get a literal backslash
> "\\" is required".

Agreed. I changed the text and sent a V2 of the series.

>
> The general concept of escaping metacharacters is not mentioned in this doc
> at all, i.e. there is no mention of which characters need such escaping nor
> of the various "special" codes (\t and \n etc), nor of the octal and hex
> escape codes. Maybe that's a topic for another patch though.

I could do that in a follow-up, but I have no clue what those mean or how they
are used.

Regards,
Jim




[Xen-devel] [PATCH V2 2/3] docs: fix typo in xl-disk-configuration.txt

2016-02-17 Thread Jim Fehlig
Signed-off-by: Jim Fehlig 
Acked-by: Ian Campbell 
---
 docs/misc/xl-disk-configuration.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/misc/xl-disk-configuration.txt 
b/docs/misc/xl-disk-configuration.txt
index 6a2118d..29f6ddb 100644
--- a/docs/misc/xl-disk-configuration.txt
+++ b/docs/misc/xl-disk-configuration.txt
@@ -160,7 +160,7 @@ Mandatory: No
 Default value: Automatically determine which backend to use.
 
 This does not affect the guest's view of the device.  It controls
-which software implementation of the Xen backend driver us used.
+which software implementation of the Xen backend driver is used.
 
 Not all backend drivers support all combinations of other options.
 For example, "phy" does not support formats other than "raw".
-- 
2.1.4




[Xen-devel] [PATCH V2 3/3] docs: add more info about target= in disk config

2016-02-17 Thread Jim Fehlig
target= in disk config can be used to convey arbitrary
configuration information to backends. Add a bit more info
to xl-disk-configuration.txt to clarify this, including some
simple nbd and rbd qdisk configurations.

Signed-off-by: Jim Fehlig 
---
 docs/misc/xl-disk-configuration.txt | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xl-disk-configuration.txt 
b/docs/misc/xl-disk-configuration.txt
index 29f6ddb..79f1e4a 100644
--- a/docs/misc/xl-disk-configuration.txt
+++ b/docs/misc/xl-disk-configuration.txt
@@ -75,7 +75,15 @@ Special syntax:
the target was already specified as a positional parameter.  This
is the only way to specify a target string containing metacharacters
such as commas and (in some cases) colons, which would otherwise be
-   misinterpreted.
+   misinterpreted. Meta-information in a target string can be used to
+   specify configuration information for a qdisk block backend. For
+   example the nbd and rbd qdisk block backends can be configured with
+
+ target=nbd:192.168.0.1:
+ target=rbd:pool/image:mon_host=192.186.0.1\\:6789
+
+   Note that '\' is used to escape metacharacters. Literal backslashes
+   in target= strings must use '\\'.
 
Future parameter and flag names will start with an ascii letter and
contain only ascii alphanumerics, hyphens and underscores, and will
-- 
2.1.4




[Xen-devel] [PATCH V2 1/3] libxlu_cfg: reject unknown characters following '\'

2016-02-17 Thread Jim Fehlig
When dequoting config strings in xlu__cfgl_dequote(), unknown
characters following a '\', and the '\' itself, are discarded.
E.g. a disk configuration string containing

  rbd:pool/image:mon_host=192.168.0.100\:6789

would be dequoted as

  rbd:pool/image:mon_host=192.168.0.1006789

Instead of discarding the '\' and unknown character, reject the
string and set error to EINVAL.

Signed-off-by: Jim Fehlig 
Acked-by: Ian Campbell 
---
 tools/libxl/libxlu_cfg.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/tools/libxl/libxlu_cfg.c b/tools/libxl/libxlu_cfg.c
index 1d70909..5838f68 100644
--- a/tools/libxl/libxlu_cfg.c
+++ b/tools/libxl/libxlu_cfg.c
@@ -533,6 +533,11 @@ char *xlu__cfgl_dequote(CfgParseContext *ctx, const char 
*src) {
 NUMERIC_CHAR(2,2,16,"hex");
 } else if (nc>='0' && nc<='7') {
 NUMERIC_CHAR(1,3,10,"octal");
+} else {
+xlu__cfgl_lexicalerror(ctx,
+   "invalid character after backlash in quoted 
string");
+ctx->err= EINVAL;
+goto x;
 }
 assert(p <= src+len-1);
 } else {
-- 
2.1.4
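
To make the behaviour change concrete, here is a much-simplified sketch of a dequoting loop that rejects unknown escapes instead of dropping them. It is not xlu__cfgl_dequote(): `dequote_sketch()` is a hypothetical helper, it only knows a handful of escapes (no hex or octal forms), and it reports errors through an `int *err` out-parameter rather than a parse context.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a malloc'd copy of 'src' with backslash escapes resolved.
 * On an unknown escape (or a trailing backslash) returns NULL and sets
 * *err to EINVAL, rather than silently discarding the characters.
 */
static char *dequote_sketch(const char *src, int *err)
{
    char *dst = malloc(strlen(src) + 1);
    char *p = dst;

    *err = 0;
    if (!dst) {
        *err = ENOMEM;
        return NULL;
    }

    while (*src) {
        if (*src != '\\') {
            *p++ = *src++;
            continue;
        }
        src++;                          /* skip the backslash itself */
        switch (*src) {
        case '\\': *p++ = '\\'; break;
        case '"':  *p++ = '"';  break;
        case 'n':  *p++ = '\n'; break;
        case 't':  *p++ = '\t'; break;
        default:                        /* unknown escape: reject */
            free(dst);
            *err = EINVAL;
            return NULL;
        }
        src++;
    }
    *p = '\0';
    return dst;
}
```

With this rule, the example string from the commit message, `rbd:pool/image:mon_host=192.168.0.100\:6789`, is refused rather than turned into `...192.168.0.1006789`, while a doubled backslash still yields a single literal one.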




[Xen-devel] [PATCH V2 0/3] libxl and docs: small improvements in qdisk support

2016-02-17 Thread Jim Fehlig
This series contains a few improvements related to libxl's
support for the various qdisk types. Patch1 is a small fix
for libxlu_cfg to error when encountering unknown backslash-
escaped characters instead of silently dropping them. Patch2
is actually unrelated and fixes a typo noticed while reviewing
xl-disk-configuration.txt, which is improved a bit in patch3
wrt target= syntax.

V2:
Add forgotten SOB in all patches

Improve patch3 doc text

Jim Fehlig (3):
  libxlu_cfg: reject unknown characters following '\'
  docs: fix typo in xl-disk-configuration.txt
  docs: add more info about target= in disk config

 docs/misc/xl-disk-configuration.txt | 12 ++++++++++--
 tools/libxl/libxlu_cfg.c            |  5 +++++
 2 files changed, 15 insertions(+), 2 deletions(-)

-- 
2.1.4




[Xen-devel] [PATCH v6] libxl: allow 'phy' backend to use empty files

2016-02-17 Thread Roger Pau Monne
This bug was introduced by 97ee1f (~5 years ago), but probably never
surfaced because most people used regular files as CDROM images, so the PHY
backend was never actually selected. A year ago this was changed, and now
regular RAW files are also handled by the PHY backend, which has made this
bug surface.

Fix it by allowing empty disks to use the PHY backend, skipping the stat
tests.

Signed-off-by: Roger Pau Monné 
Reported-by: Alex Braunegg 
---
Cc: Ian Jackson 
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: Alex Braunegg 
---
Changes since v4:
 - Split from the rest of the series.
 - Fix disk_try_backend.
---
 tools/libxl/libxl_device.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 8bb5e93..2e08108 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -196,6 +196,14 @@ static int disk_try_backend(disk_try_backend_args *a,
 goto bad_format;
 }
 
+if (a->disk->format == LIBXL_DISK_FORMAT_EMPTY) {
+assert(a->disk->pdev_path == NULL ||
+   !strcmp(a->disk->pdev_path, ""));
+LOG(DEBUG, "Disk vdev=%s is empty, skipping physical device check",
+a->disk->vdev);
+return backend;
+}
+
 if (a->disk->backend_domid != LIBXL_TOOLSTACK_DOMID) {
 LOG(DEBUG, "Disk vdev=%s, is using a storage driver domain, "
"skipping physical device check", a->disk->vdev);
-- 
2.5.4 (Apple Git-61)
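
The idea of the fix, reduced to a stand-alone sketch: an empty disk has no path to inspect, so the check that would stat() the physical device has to be skipped for it. Everything named here (`enum disk_format_sketch`, `phy_backend_usable_sketch()`) is hypothetical and only loosely modelled on disk_try_backend(); the real function has more cases and different return values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the libxl disk format enumeration. */
enum disk_format_sketch { FORMAT_RAW_SKETCH, FORMAT_EMPTY_SKETCH };

/*
 * Returns true if the "phy" backend may be used for this disk.
 * Empty disks (e.g. a CDROM drive without media) are accepted without
 * touching the filesystem, because there is no path to stat.
 */
static bool phy_backend_usable_sketch(enum disk_format_sketch format,
                                      const char *path)
{
    struct stat st;

    if (format == FORMAT_EMPTY_SKETCH)
        return true;                    /* skip the physical device check */

    if (path == NULL || stat(path, &st) != 0)
        return false;

    /* Assumption for this sketch: block devices and regular files qualify. */
    return S_ISBLK(st.st_mode) || S_ISREG(st.st_mode);
}
```

Without the early return, an empty disk would fall through to stat() with a NULL or empty path and be rejected, which appears to be the failure the commit message describes.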




[Xen-devel] [PATCH v3 07/10] xen: factor out allocation of page tables into separate function

2016-02-17 Thread Juergen Gross
Do the allocation of page tables in a separate function. This will
allow the allocation to be done at different times of the boot preparations
depending on the features the kernel is supporting.

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c | 91 -
 1 file changed, 56 insertions(+), 35 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index 69d0e65..3bcf4c8 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -51,6 +51,9 @@ struct xen_loader_state {
   struct start_info *virt_start_info;
   grub_xen_mfn_t console_pfn;
   grub_uint64_t max_addr;
+  grub_uint64_t *virt_pgtable;
+  grub_uint64_t pgtbl_start;
+  grub_uint64_t pgtbl_end;
   struct xen_multiboot_mod_list *module_info_page;
   grub_uint64_t modules_target_start;
   grub_size_t n_modules;
@@ -110,17 +113,17 @@ get_pgtable_size (grub_uint64_t total_pages, 
grub_uint64_t virt_base)
 
 static void
 generate_page_table (grub_uint64_t *where, grub_uint64_t paging_start,
-grub_uint64_t total_pages, grub_uint64_t virt_base,
-grub_xen_mfn_t *mfn_list)
+grub_uint64_t paging_end, grub_uint64_t total_pages,
+grub_uint64_t virt_base, grub_xen_mfn_t *mfn_list)
 {
   if (!virt_base)
-total_pages++;
+paging_end++;
 
   grub_uint64_t lx[NUMBER_OF_LEVELS], lxs[NUMBER_OF_LEVELS];
   grub_uint64_t nlx, nls, sz = 0;
   int l;
 
-  nlx = total_pages;
+  nlx = paging_end;
   nls = virt_base >> PAGE_SHIFT;
   for (l = 0; l < NUMBER_OF_LEVELS; l++)
 {
@@ -164,7 +167,7 @@ generate_page_table (grub_uint64_t *where, grub_uint64_t 
paging_start,
   if (pr)
 pg += POINTERS_PER_PAGE;
 
-  for (j = 0; j < total_pages; j++)
+  for (j = 0; j < paging_end; j++)
 {
   if (j >= paging_start && j < lp)
pg[j + lxs[0]] = page2offset (mfn_list[j]) | 5;
@@ -271,24 +274,12 @@ grub_xen_special_alloc (void)
 }
 
 static grub_err_t
-grub_xen_boot (void)
+grub_xen_pt_alloc (void)
 {
   grub_relocator_chunk_t ch;
   grub_err_t err;
   grub_uint64_t nr_info_pages;
   grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
-  struct gnttab_set_version gnttab_setver;
-  grub_size_t i;
-
-  if (grub_xen_n_allocated_shared_pages)
-return grub_error (GRUB_ERR_BUG, "active grants");
-
-  err = grub_xen_p2m_alloc ();
-  if (err)
-return err;
-  err = grub_xen_special_alloc ();
-  if (err)
-return err;
 
   xen_state.next_start.pt_base =
 xen_state.max_addr + xen_state.xen_inf.virt_base;
@@ -309,31 +300,61 @@ grub_xen_boot (void)
   nr_pages = nr_need_pages;
 }
 
-  grub_dprintf ("xen", "bootstrap domain %llx+%llx\n",
-   (unsigned long long) xen_state.xen_inf.virt_base,
-   (unsigned long long) page2offset (nr_pages));
-
   err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
 xen_state.max_addr,
 page2offset (nr_pt_pages));
   if (err)
 return err;
 
-  err = set_mfns (xen_state.console_pfn);
-  if (err)
-return err;
-
-  generate_page_table (get_virtual_current_address (ch),
-  xen_state.max_addr >> PAGE_SHIFT, nr_pages,
-  xen_state.xen_inf.virt_base, xen_state.virt_mfn_list);
-
+  xen_state.virt_pgtable = get_virtual_current_address (ch);
+  xen_state.pgtbl_start = xen_state.max_addr >> PAGE_SHIFT;
   xen_state.max_addr += page2offset (nr_pt_pages);
   xen_state.state.stack =
 xen_state.max_addr + STACK_SIZE + xen_state.xen_inf.virt_base;
-  xen_state.state.entry_point = xen_state.xen_inf.entry_point;
-
-  xen_state.next_start.nr_pt_frames = nr_pt_pages;
   xen_state.state.paging_size = nr_pt_pages;
+  xen_state.next_start.nr_pt_frames = nr_pt_pages;
+  xen_state.max_addr = page2offset (nr_pages);
+  xen_state.pgtbl_end = nr_pages;
+
+  return GRUB_ERR_NONE;
+}
+
+static grub_err_t
+grub_xen_boot (void)
+{
+  grub_err_t err;
+  grub_uint64_t nr_pages;
+  struct gnttab_set_version gnttab_setver;
+  grub_size_t i;
+
+  if (grub_xen_n_allocated_shared_pages)
+return grub_error (GRUB_ERR_BUG, "active grants");
+
+  err = grub_xen_p2m_alloc ();
+  if (err)
+return err;
+  err = grub_xen_special_alloc ();
+  if (err)
+return err;
+  err = grub_xen_pt_alloc ();
+  if (err)
+return err;
+
+  err = set_mfns (xen_state.console_pfn);
+  if (err)
+return err;
+
+  nr_pages = xen_state.max_addr >> PAGE_SHIFT;
+
+  grub_dprintf ("xen", "bootstrap domain %llx+%llx\n",
+   (unsigned long long) xen_state.xen_inf.virt_base,
+   (unsigned long long) page2offset (nr_pages));
+
+  generate_page_table (xen_state.virt_pgtable, xen_state.pgtbl_start,
+  xen_state.pgtbl_end, nr_pages,
+  xen_state.xen_inf.virt_base, xen_state.virt_mfn_list);
+
+  xen_state.state.entry_point = xen_state.xen_inf.entry_point;
 
   

[Xen-devel] [PATCH v3 09/10] xen: modify page table construction

2016-02-17 Thread Juergen Gross
Modify the page table construction to allow multiple virtual regions
to be mapped. This is done as preparation for removing the p2m list
from the initial kernel mapping in order to support huge pv domains.

This allows a cleaner approach for mapping the relocator page by
using this capability.

The interface to the assembler level of the relocator has to be changed
in order to be able to process multiple page table areas.

Signed-off-by: Juergen Gross 
---
V3: use constants instead of numbers as requested by Daniel Kiper
add lots of comments to assembly code as requested by Daniel Kiper
---
 grub-core/lib/i386/xen/relocator.S   |  87 ++
 grub-core/lib/x86_64/xen/relocator.S | 134 ++-
 grub-core/lib/xen/relocator.c|  25 ++-
 grub-core/loader/i386/xen.c  | 325 ---
 include/grub/i386/memory.h   |   7 +
 include/grub/xen/relocator.h |   6 +-
 6 files changed, 354 insertions(+), 230 deletions(-)

diff --git a/grub-core/lib/i386/xen/relocator.S 
b/grub-core/lib/i386/xen/relocator.S
index 694a54c..f1c729e 100644
--- a/grub-core/lib/i386/xen/relocator.S
+++ b/grub-core/lib/i386/xen/relocator.S
@@ -16,6 +16,8 @@
  *  along with GRUB.  If not, see .
  */
 
+#include 
+#include 
 #include 
 #include 
 
@@ -23,78 +25,86 @@
 
 VARIABLE(grub_relocator_xen_remap_start)
 LOCAL(base):
-   /* mov imm32, %ebx */
+   /* Remap the remapper to its new address. */
+   /* mov imm32, %ebx - %ebx: new virtual address of remapper */
.byte   0xbb
 VARIABLE(grub_relocator_xen_remapper_virt)
.long   0
 
-   /* mov imm32, %ecx */
+   /* mov imm32, %ecx - %ecx: low part of page table entry */
.byte   0xb9
 VARIABLE(grub_relocator_xen_remapper_map)
.long   0
 
-   /* mov imm32, %edx */
+   /* mov imm32, %edx  - %edx: high part of page table entry */
.byte   0xba
 VARIABLE(grub_relocator_xen_remapper_map_high)
.long   0
 
-   movl%ebx, %ebp
+   movl%ebx, %ebp  /* %ebx is clobbered by hypercall */
 
-   movl$2, %esi
+   movl$UVMF_INVLPG, %esi  /* esi: flags (inv. single entry) */
movl$__HYPERVISOR_update_va_mapping, %eax
int $0x82
 
movl%ebp, %ebx
addl   $(LOCAL(cont) - LOCAL(base)), %ebx
 
-   jmp *%ebx
+   jmp *%ebx   /* Continue with new virtual address */
 
 LOCAL(cont):
-   xorl%eax, %eax
-   movl%eax, %ebp
+   /* Modify mappings of new page tables to be read-only. */
+   /* mov imm32, %eax */
+   .byte   0xb8
+VARIABLE(grub_relocator_xen_paging_areas_addr)
+   .long   0
+   movl%eax, %ebx
 1:
+   movl0(%ebx), %ebp   /* Get start pfn of the current area */
+   movlGRUB_TARGET_SIZEOF_LONG(%ebx), %ecx /* Get # of pg tables */
+   testl   %ecx, %ecx  /* 0 -> last area reached */
+   jz  3f
+   addl$(2 * GRUB_TARGET_SIZEOF_LONG), %ebx
+   movl%ebx, %esp  /* Save current area pointer */
 
+2:
+   movl%ecx, %edi
/* mov imm32, %eax */
.byte   0xb8
 VARIABLE(grub_relocator_xen_mfn_list)
.long   0
-   movl%eax, %edi
-   movl%ebp, %eax
-   movl0(%edi, %eax, 4), %ecx
-
-   /* mov imm32, %ebx */
-   .byte   0xbb
-VARIABLE(grub_relocator_xen_paging_start)
-   .long   0
-   shll$12, %eax
-   addl%eax, %ebx
+   movl0(%eax, %ebp, 4), %ecx  /* mfn */
+   movl%ebp, %ebx
+   shll$PAGE_SHIFT, %ebx   /* virtual address (1:1 mapping) */
movl%ecx, %edx
-   shll$12,  %ecx
-   shrl$20,  %edx
-   orl $5, %ecx
-   movl$2, %esi
+   shll$PAGE_SHIFT,  %ecx  /* prepare pte low part */
+   shrl$(32 - PAGE_SHIFT),  %edx   /* pte high part */
+   orl $(GRUB_PAGE_PRESENT | GRUB_PAGE_USER), %ecx /* pte low */
+   movl$UVMF_INVLPG, %esi
movl$__HYPERVISOR_update_va_mapping, %eax
-   int $0x82
+   int $0x82   /* parameters: eax, ebx, ecx, edx, esi */
 
-   incl%ebp
-   /* mov imm32, %ecx */
-   .byte   0xb9
-VARIABLE(grub_relocator_xen_paging_size)
-   .long   0
-   cmpl%ebp, %ecx
+   incl%ebp/* next pfn */
+   movl%edi, %ecx
 
-   ja  1b
+   loop2b
 
+   mov %esp, %ebx  /* restore area pointer */
+   jmp 1b
+
+3:
+   /* Switch page tables: pin new L3 pt, load cr3, unpin old L3. */
/* mov imm32, %ebx */
.byte   0xbb
 VARIABLE(grub_relocator_xen_mmu_op_addr)
.long  0
-   movl   $3, %ecx
-   movl   $0, %edx
-   movl   $0x7FF0, %esi
+   movl   $3, %ecx /* 3 mmu ops */
+   movl   $0, %edx /* pdone (not used) */
+   movl   $DOMID_SELF, %esi
movl   $__HYPERVISOR_mmuext_op, %eax
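
The 32-bit remapper builds each 64-bit page table entry as two 32-bit halves before calling update_va_mapping. The C sketch below shows the same arithmetic the assembly performs on %ecx/%edx. The names are hypothetical, and the constant values are assumptions: 12 for the page shift, and 0x1 and 0x4 for the present and user bits, which together give the literal 5 that the old code used.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PAGE_PRESENT 0x1u        /* assumed value of GRUB_PAGE_PRESENT */
#define SKETCH_PAGE_USER    0x4u        /* assumed value of GRUB_PAGE_USER */

/* The two halves of a 64-bit page table entry, as held in %ecx and %edx. */
struct pte_halves_sketch {
    uint32_t low;
    uint32_t high;
};

/* Build a user-accessible, present, read-only PTE for machine frame 'mfn'. */
static struct pte_halves_sketch make_ro_pte_sketch(uint32_t mfn)
{
    struct pte_halves_sketch pte;

    /* Low half: frame address bits that fit in 32 bits, plus the flags. */
    pte.low = (mfn << SKETCH_PAGE_SHIFT)
              | SKETCH_PAGE_PRESENT | SKETCH_PAGE_USER;
    /* High half: the frame number bits shifted out of the low half. */
    pte.high = mfn >> (32 - SKETCH_PAGE_SHIFT);

    return pte;
}
```

Leaving the writable bit clear is what makes the new page tables read-only, which the hypervisor requires before they can be pinned.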
 

[Xen-devel] [PATCH v3 08/10] xen: add capability to load initrd outside of initial mapping

2016-02-17 Thread Juergen Gross
Modern pvops linux kernels support an initrd not covered by the initial
mapping. This capability is flagged by an elf-note.

In case the elf-note is set by the kernel, don't place the initrd into
the initial mapping. This allows loading larger initrds and/or
supporting domains with larger memory, as the initial mapping is limited
to 2GB and it contains the p2m list.

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c| 61 ++
 grub-core/loader/i386/xen_fileXX.c |  3 ++
 include/grub/xen_file.h|  1 +
 3 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index 3bcf4c8..7ac74f6 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -58,6 +58,7 @@ struct xen_loader_state {
   grub_uint64_t modules_target_start;
   grub_size_t n_modules;
   int loaded;
+  int alloc_end_called;
 };
 
 static struct xen_loader_state xen_state;
@@ -320,6 +321,28 @@ grub_xen_pt_alloc (void)
 }
 
 static grub_err_t
+grub_xen_alloc_end (void)
+{
+  grub_err_t err;
+
+  if (xen_state.alloc_end_called)
+return GRUB_ERR_NONE;
+  xen_state.alloc_end_called = 1;
+
+  err = grub_xen_p2m_alloc ();
+  if (err)
+return err;
+  err = grub_xen_special_alloc ();
+  if (err)
+return err;
+  err = grub_xen_pt_alloc ();
+  if (err)
+return err;
+
+  return GRUB_ERR_NONE;
+}
+
+static grub_err_t
 grub_xen_boot (void)
 {
   grub_err_t err;
@@ -330,13 +353,7 @@ grub_xen_boot (void)
   if (grub_xen_n_allocated_shared_pages)
 return grub_error (GRUB_ERR_BUG, "active grants");
 
-  err = grub_xen_p2m_alloc ();
-  if (err)
-return err;
-  err = grub_xen_special_alloc ();
-  if (err)
-return err;
-  err = grub_xen_pt_alloc ();
+  err = grub_xen_alloc_end ();
   if (err)
 return err;
 
@@ -609,6 +626,13 @@ grub_cmd_initrd (grub_command_t cmd __attribute__ 
((unused)),
   goto fail;
 }
 
+  if (xen_state.xen_inf.unmapped_initrd)
+{
+  err = grub_xen_alloc_end ();
+  if (err)
+   goto fail;
+}
+
   if (grub_initrd_init (argc, argv, &initrd_ctx))
 goto fail;
 
@@ -626,14 +650,24 @@ grub_cmd_initrd (grub_command_t cmd __attribute__ 
((unused)),
goto fail;
 }
 
-  xen_state.next_start.mod_start =
-xen_state.max_addr + xen_state.xen_inf.virt_base;
-  xen_state.next_start.mod_len = size;
-
-  xen_state.max_addr = ALIGN_UP (xen_state.max_addr + size, PAGE_SIZE);
+  if (xen_state.xen_inf.unmapped_initrd)
+{
+  xen_state.next_start.flags |= SIF_MOD_START_PFN;
+  xen_state.next_start.mod_start = xen_state.max_addr >> PAGE_SHIFT;
+  xen_state.next_start.mod_len = size;
+}
+  else
+{
+  xen_state.next_start.mod_start =
+   xen_state.max_addr + xen_state.xen_inf.virt_base;
+  xen_state.next_start.mod_len = size;
+}
 
   grub_dprintf ("xen", "Initrd, addr=0x%x, size=0x%x\n",
-   (unsigned) xen_state.next_start.mod_start, (unsigned) size);
+   (unsigned) (xen_state.max_addr + xen_state.xen_inf.virt_base),
+   (unsigned) size);
+
+  xen_state.max_addr = ALIGN_UP (xen_state.max_addr + size, PAGE_SIZE);
 
 fail:
   grub_initrd_close (&initrd_ctx);
@@ -685,6 +719,7 @@ grub_cmd_module (grub_command_t cmd __attribute__ 
((unused)),
 
   if (!xen_state.module_info_page)
 {
+  xen_state.xen_inf.unmapped_initrd = 0;
   xen_state.n_modules = 0;
   xen_state.max_addr = ALIGN_UP (xen_state.max_addr, PAGE_SIZE);
   xen_state.modules_target_start = xen_state.max_addr;
diff --git a/grub-core/loader/i386/xen_fileXX.c 
b/grub-core/loader/i386/xen_fileXX.c
index 1f7f71d..d68634d 100644
--- a/grub-core/loader/i386/xen_fileXX.c
+++ b/grub-core/loader/i386/xen_fileXX.c
@@ -259,6 +259,9 @@ parse_note (grub_elf_t elf, struct grub_xen_file_info *xi,
  descsz == 2 ? 2 : 3) == 0)
xi->arch = GRUB_XEN_FILE_I386;
  break;
+   case XEN_ELFNOTE_MOD_START_PFN:
+ xi->unmapped_initrd = !!grub_le_to_cpu32(*(grub_uint32_t *) desc);
+ break;
default:
  grub_dprintf ("xen", "unknown note type %d\n", nh->n_type);
  break;
diff --git a/include/grub/xen_file.h b/include/grub/xen_file.h
index 4b2ccba..ed749fa 100644
--- a/include/grub/xen_file.h
+++ b/include/grub/xen_file.h
@@ -36,6 +36,7 @@ struct grub_xen_file_info
   int has_note;
   int has_xen_guest;
   int extended_cr3;
+  int unmapped_initrd;
   enum
   {
 GRUB_XEN_FILE_I386 = 1,
-- 
2.6.2


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH v3 02/10] xen: reduce number of global variables in xen loader

2016-02-17 Thread Juergen Gross
The loader for the Xen paravirtualized environment uses lots of global
variables. Reduce the number by either making them local or putting
them into a single state structure.
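
A minimal sketch of the idea, not the code of this patch: the struct
below is a hypothetical, trimmed-down stand-in for xen_loader_state,
and loader_state_reset() and loader_add_module() are illustrative
names. It shows why a single state structure helps: the whole loader
state can be reset in one place.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the loader state.  */
struct loader_state
{
  uint64_t max_addr;             /* Next free address.  */
  uint64_t modules_target_start; /* Where the first module was placed.  */
  size_t n_modules;              /* Modules recorded so far.  */
  int loaded;
};

static struct loader_state state;

/* One memset replaces a list of per-variable resets.  */
static void
loader_state_reset (void)
{
  memset (&state, 0, sizeof (state));
}

/* Record a module of the given size and return its start address.  */
static uint64_t
loader_add_module (uint64_t size)
{
  uint64_t start = state.max_addr;

  assert (size > 0);
  if (state.n_modules == 0)
    state.modules_target_start = start;
  state.max_addr += size;
  state.n_modules++;
  return start;
}
```

With separate globals, every new variable needs its own reset line;
with the struct, a forgotten field cannot survive a reset.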

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c | 259 +++-
 1 file changed, 138 insertions(+), 121 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index ff7c553..d5fe168 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -42,16 +42,20 @@
 
 GRUB_MOD_LICENSE ("GPLv3+");
 
-static struct grub_relocator *relocator = NULL;
-static grub_uint64_t max_addr;
+struct xen_loader_state {
+  struct grub_relocator *relocator;
+  struct start_info next_start;
+  struct grub_xen_file_info xen_inf;
+  grub_uint64_t max_addr;
+  struct xen_multiboot_mod_list *module_info_page;
+  grub_uint64_t modules_target_start;
+  grub_size_t n_modules;
+  int loaded;
+};
+
+static struct xen_loader_state xen_state;
+
 static grub_dl_t my_mod;
-static int loaded = 0;
-static struct start_info next_start;
-static void *kern_chunk_src;
-static struct grub_xen_file_info xen_inf;
-static struct xen_multiboot_mod_list *xen_module_info_page;
-static grub_uint64_t modules_target_start;
-static grub_size_t n_modules;
 
 #define PAGE_SIZE 4096
 #define MAX_MODULES (PAGE_SIZE / sizeof (struct xen_multiboot_mod_list))
@@ -225,50 +229,55 @@ grub_xen_boot (void)
   if (grub_xen_n_allocated_shared_pages)
 return grub_error (GRUB_ERR_BUG, "active grants");
 
-  state.mfn_list = max_addr;
-  next_start.mfn_list = max_addr + xen_inf.virt_base;
-  next_start.first_p2m_pfn = max_addr >> PAGE_SHIFT;   /* Is this right? */
+  state.mfn_list = xen_state.max_addr;
+  xen_state.next_start.mfn_list =
+xen_state.max_addr + xen_state.xen_inf.virt_base;
+  xen_state.next_start.first_p2m_pfn = xen_state.max_addr >> PAGE_SHIFT;
   pgtsize = sizeof (grub_xen_mfn_t) * grub_xen_start_page_addr->nr_pages;
-  err = grub_relocator_alloc_chunk_addr (relocator, &ch, max_addr, pgtsize);
-  next_start.nr_p2m_frames = (pgtsize + PAGE_SIZE - 1) >> PAGE_SHIFT;
+  err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
+xen_state.max_addr, pgtsize);
+  xen_state.next_start.nr_p2m_frames = (pgtsize + PAGE_SIZE - 1) >> PAGE_SHIFT;
   if (err)
 return err;
   new_mfn_list = get_virtual_current_address (ch);
   grub_memcpy (new_mfn_list,
   (void *) grub_xen_start_page_addr->mfn_list, pgtsize);
-  max_addr = ALIGN_UP (max_addr + pgtsize, PAGE_SIZE);
+  xen_state.max_addr = ALIGN_UP (xen_state.max_addr + pgtsize, PAGE_SIZE);
 
-  err = grub_relocator_alloc_chunk_addr (relocator, &ch,
-max_addr, sizeof (next_start));
+  err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
+xen_state.max_addr,
+sizeof (xen_state.next_start));
   if (err)
 return err;
-  state.start_info = max_addr + xen_inf.virt_base;
+  state.start_info = xen_state.max_addr + xen_state.xen_inf.virt_base;
   nst = get_virtual_current_address (ch);
-  max_addr = ALIGN_UP (max_addr + sizeof (next_start), PAGE_SIZE);
+  xen_state.max_addr =
+ALIGN_UP (xen_state.max_addr + sizeof (xen_state.next_start), PAGE_SIZE);
 
-  next_start.nr_pages = grub_xen_start_page_addr->nr_pages;
-  grub_memcpy (next_start.magic, grub_xen_start_page_addr->magic,
-  sizeof (next_start.magic));
-  next_start.store_mfn = grub_xen_start_page_addr->store_mfn;
-  next_start.store_evtchn = grub_xen_start_page_addr->store_evtchn;
-  next_start.console.domU = grub_xen_start_page_addr->console.domU;
-  next_start.shared_info = grub_xen_start_page_addr->shared_info;
+  xen_state.next_start.nr_pages = grub_xen_start_page_addr->nr_pages;
+  grub_memcpy (xen_state.next_start.magic, grub_xen_start_page_addr->magic,
+  sizeof (xen_state.next_start.magic));
+  xen_state.next_start.store_mfn = grub_xen_start_page_addr->store_mfn;
+  xen_state.next_start.store_evtchn = grub_xen_start_page_addr->store_evtchn;
+  xen_state.next_start.console.domU = grub_xen_start_page_addr->console.domU;
+  xen_state.next_start.shared_info = grub_xen_start_page_addr->shared_info;
 
-  err = set_mfns (new_mfn_list, max_addr >> PAGE_SHIFT);
+  err = set_mfns (new_mfn_list, xen_state.max_addr >> PAGE_SHIFT);
   if (err)
 return err;
-  max_addr += 2 * PAGE_SIZE;
+  xen_state.max_addr += 2 * PAGE_SIZE;
 
-  next_start.pt_base = max_addr + xen_inf.virt_base;
-  state.paging_start = max_addr >> PAGE_SHIFT;
+  xen_state.next_start.pt_base =
+xen_state.max_addr + xen_state.xen_inf.virt_base;
+  state.paging_start = xen_state.max_addr >> PAGE_SHIFT;
 
-  nr_info_pages = max_addr >> PAGE_SHIFT;
+  nr_info_pages = xen_state.max_addr >> PAGE_SHIFT;
   nr_pages = nr_info_pages;
 
   while (1)
 {
   nr_pages = ALIGN_UP (nr_pages, (ALIGN_SIZE >> 

[Xen-devel] [PATCH v3 01/10] xen: make xen loader callable multiple times

2016-02-17 Thread Juergen Gross
The loader for the Xen paravirtualized environment can't be called
multiple times, as it doesn't free any memory in case of failure.

Call grub_relocator_unload(), as other modules do, before allocating
a new relocator or when unloading the module.
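
The cleanup style used for this can be sketched as follows. This is an
illustration in plain C, not GRUB code: parse_key_hex() is a
hypothetical helper, and malloc()/free() stand in for
grub_malloc()/grub_free(). Every error path jumps to one label, so the
buffer is released no matter where parsing stops.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser: returns 0 on success, -1 on error.  */
static int
parse_key_hex (const char *text, const char *key, unsigned long long *value)
{
  int err = -1;
  size_t keylen;
  char *buf;
  char *end;

  assert (text && key && value);
  keylen = strlen (key);

  /* Scratch copy, standing in for a buffer read from a file.  */
  buf = malloc (strlen (text) + 1);
  if (!buf)
    return -1;
  strcpy (buf, text);

  if (strncmp (buf, key, keylen) != 0)
    goto out;                   /* Key not found.  */

  *value = strtoull (buf + keylen, &end, 16);
  if (end == buf + keylen)
    goto out;                   /* No hex digits after the key.  */

  err = 0;

out:
  free (buf);                   /* Single exit: freed on every path.  */
  return err;
}
```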

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c| 28 +++-
 grub-core/loader/i386/xen_fileXX.c | 17 +++--
 2 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index c4d9689..ff7c553 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -316,11 +316,23 @@ grub_xen_boot (void)
  xen_inf.virt_base);
 }
 
+static void
+grub_xen_reset (void)
+{
+  grub_memset (&next_start, 0, sizeof (next_start));
+  xen_module_info_page = NULL;
+  n_modules = 0;
+
+  grub_relocator_unload (relocator);
+  relocator = NULL;
+  loaded = 0;
+}
+
 static grub_err_t
 grub_xen_unload (void)
 {
+  grub_xen_reset ();
   grub_dl_unref (my_mod);
-  loaded = 0;
   return GRUB_ERR_NONE;
 }
 
@@ -403,10 +415,7 @@ grub_cmd_xen (grub_command_t cmd __attribute__ ((unused)),
 
   grub_loader_unset ();
 
-  grub_memset (&next_start, 0, sizeof (next_start));
-
-  xen_module_info_page = NULL;
-  n_modules = 0;
+  grub_xen_reset ();
 
   grub_create_loader_cmdline (argc - 1, argv + 1,
  (char *) next_start.cmd_line,
@@ -503,16 +512,17 @@ grub_cmd_xen (grub_command_t cmd __attribute__ ((unused)),
   goto fail;
 
 fail:
+  err = grub_errno;
 
   if (elf)
 grub_elf_close (elf);
   else if (file)
 grub_file_close (file);
 
-  if (grub_errno != GRUB_ERR_NONE)
-loaded = 0;
+  if (err != GRUB_ERR_NONE)
+grub_xen_reset ();
 
-  return grub_errno;
+  return err;
 }
 
 static grub_err_t
@@ -552,7 +562,7 @@ grub_cmd_initrd (grub_command_t cmd __attribute__ 
((unused)),
 {
   err = grub_relocator_alloc_chunk_addr (relocator, &ch, max_addr, size);
   if (err)
-   return err;
+   goto fail;
 
   if (grub_initrd_load (&initrd_ctx, argv,
get_virtual_current_address (ch)))
diff --git a/grub-core/loader/i386/xen_fileXX.c 
b/grub-core/loader/i386/xen_fileXX.c
index 1ba5649..5475819 100644
--- a/grub-core/loader/i386/xen_fileXX.c
+++ b/grub-core/loader/i386/xen_fileXX.c
@@ -35,7 +35,8 @@ parse_xen_guest (grub_elf_t elf, struct grub_xen_file_info 
*xi,
   if (grub_file_read (elf->file, buf, sz) != (grub_ssize_t) sz)
 {
   if (grub_errno)
-   return grub_errno;
+   goto out;
+  grub_free (buf);
   return grub_error (GRUB_ERR_BAD_OS, N_("premature end of file %s"),
 elf->file->name);
 }
@@ -123,14 +124,14 @@ parse_xen_guest (grub_elf_t elf, struct 
grub_xen_file_info *xi,
{
  xi->virt_base = grub_strtoull (ptr + sizeof ("VIRT_BASE=") - 1, &ptr, 16);
  if (grub_errno)
-   return grub_errno;
+   goto out;
  continue;
}
   if (grub_strncmp (ptr, "VIRT_ENTRY=", sizeof ("VIRT_ENTRY=") - 1) == 0)
{
  xi->entry_point = grub_strtoull (ptr + sizeof ("VIRT_ENTRY=") - 1, &ptr, 16);
  if (grub_errno)
-   return grub_errno;
+   goto out;
  continue;
}
   if (grub_strncmp (ptr, "HYPERCALL_PAGE=", sizeof ("HYPERCALL_PAGE=") - 
1) == 0)
@@ -138,7 +139,7 @@ parse_xen_guest (grub_elf_t elf, struct grub_xen_file_info 
*xi,
  xi->hypercall_page = grub_strtoull (ptr + sizeof ("HYPERCALL_PAGE=") - 1, &ptr, 16);
  xi->has_hypercall_page = 1;
  if (grub_errno)
-   return grub_errno;
+   goto out;
  continue;
}
   if (grub_strncmp (ptr, "ELF_PADDR_OFFSET=", sizeof ("ELF_PADDR_OFFSET=") 
- 1) == 0)
@@ -146,7 +147,7 @@ parse_xen_guest (grub_elf_t elf, struct grub_xen_file_info 
*xi,
  xi->paddr_offset = grub_strtoull (ptr + sizeof ("ELF_PADDR_OFFSET=") - 1, &ptr, 16);
  has_paddr = 1;
  if (grub_errno)
-   return grub_errno;
+   goto out;
  continue;
}
 }
@@ -154,7 +155,11 @@ parse_xen_guest (grub_elf_t elf, struct grub_xen_file_info 
*xi,
 xi->hypercall_page = (xi->hypercall_page << 12) + xi->virt_base;
   if (!has_paddr)
 xi->paddr_offset = xi->virt_base;
-  return GRUB_ERR_NONE;
+
+out:
+  grub_free (buf);
+
+  return grub_errno;
 }
 
 #pragma GCC diagnostic ignored "-Wcast-align"
-- 
2.6.2


[Xen-devel] [PATCH v3 03/10] xen: add elfnote.h to avoid using numbers instead of constants

2016-02-17 Thread Juergen Gross
Various features and parameters of a pv-kernel are specified via
ELF notes in the kernel image. Those notes are part of the interface
between the Xen hypervisor and the kernel.

Instead of using bare numbers in the code when interpreting the ELF
notes, make use of the header supplied by Xen for that purpose.
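
As a rough illustration of the benefit (a sketch, not the GRUB parser):
the four constants below carry the values this patch substitutes for
the literals 1 to 4, while struct note_info and apply_note() are
hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values as used by the patch in place of the literals 1..4.  */
#define XEN_ELFNOTE_ENTRY          1
#define XEN_ELFNOTE_HYPERCALL_PAGE 2
#define XEN_ELFNOTE_VIRT_BASE      3
#define XEN_ELFNOTE_PADDR_OFFSET   4

/* Hypothetical subset of the per-kernel information.  */
struct note_info
{
  uint64_t entry_point;
  uint64_t hypercall_page;
  uint64_t virt_base;
  uint64_t paddr_offset;
  int has_hypercall_page;
};

/* Store one numeric note; returns 1 if the type was recognized.  */
static int
apply_note (struct note_info *xi, uint32_t type, uint64_t value)
{
  assert (xi);
  switch (type)
    {
    case XEN_ELFNOTE_ENTRY:
      xi->entry_point = value;
      return 1;
    case XEN_ELFNOTE_HYPERCALL_PAGE:
      xi->hypercall_page = value;
      xi->has_hypercall_page = 1;
      return 1;
    case XEN_ELFNOTE_VIRT_BASE:
      xi->virt_base = value;
      return 1;
    case XEN_ELFNOTE_PADDR_OFFSET:
      xi->paddr_offset = value;
      return 1;
    default:
      return 0;                 /* Unknown note type: ignored.  */
    }
}
```

Each case label now documents itself, and a new note type can be added
without looking up its number in the Xen sources.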

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen_fileXX.c |  19 +--
 include/xen/elfnote.h  | 281 +
 2 files changed, 291 insertions(+), 9 deletions(-)
 create mode 100644 include/xen/elfnote.h

diff --git a/grub-core/loader/i386/xen_fileXX.c 
b/grub-core/loader/i386/xen_fileXX.c
index 5475819..1f7f71d 100644
--- a/grub-core/loader/i386/xen_fileXX.c
+++ b/grub-core/loader/i386/xen_fileXX.c
@@ -18,6 +18,7 @@
 
 #include 
 #include 
+#include <xen/elfnote.h>
 
 static grub_err_t
 parse_xen_guest (grub_elf_t elf, struct grub_xen_file_info *xi,
@@ -201,35 +202,35 @@ parse_note (grub_elf_t elf, struct grub_xen_file_info *xi,
   xi->has_note = 1;
   switch (nh->n_type)
{
-   case 1:
+   case XEN_ELFNOTE_ENTRY:
  xi->entry_point = grub_le_to_cpu_addr (*(Elf_Addr *) desc);
  break;
-   case 2:
+   case XEN_ELFNOTE_HYPERCALL_PAGE:
  xi->hypercall_page = grub_le_to_cpu_addr (*(Elf_Addr *) desc);
  xi->has_hypercall_page = 1;
  break;
-   case 3:
+   case XEN_ELFNOTE_VIRT_BASE:
  xi->virt_base = grub_le_to_cpu_addr (*(Elf_Addr *) desc);
  break;
-   case 4:
+   case XEN_ELFNOTE_PADDR_OFFSET:
  xi->paddr_offset = grub_le_to_cpu_addr (*(Elf_Addr *) desc);
  break;
-   case 5:
+   case XEN_ELFNOTE_XEN_VERSION:
  grub_dprintf ("xen", "xenversion = `%s'\n", (char *) desc);
  break;
-   case 6:
+   case XEN_ELFNOTE_GUEST_OS:
  grub_dprintf ("xen", "name = `%s'\n", (char *) desc);
  break;
-   case 7:
+   case XEN_ELFNOTE_GUEST_VERSION:
  grub_dprintf ("xen", "version = `%s'\n", (char *) desc);
  break;
-   case 8:
+   case XEN_ELFNOTE_LOADER:
  if (descsz < 7
  || grub_memcmp (desc, "generic", descsz == 7 ? 7 : 8) != 0)
return grub_error (GRUB_ERR_BAD_OS, "invalid loader");
  break;
  /* PAE */
-   case 9:
+   case XEN_ELFNOTE_PAE_MODE:
  grub_dprintf ("xen", "pae = `%s', %d, %d\n", (char *) desc,
xi->arch, descsz);
  if (xi->arch != GRUB_XEN_FILE_I386
diff --git a/include/xen/elfnote.h b/include/xen/elfnote.h
new file mode 100644
index 000..353985f
--- /dev/null
+++ b/include/xen/elfnote.h
@@ -0,0 +1,281 @@
+/**
+ * elfnote.h
+ *
+ * Definitions used for the Xen ELF notes.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2006, Ian Campbell, XenSource Ltd.
+ */
+
+#ifndef __XEN_PUBLIC_ELFNOTE_H__
+#define __XEN_PUBLIC_ELFNOTE_H__
+
+/*
+ * `incontents 200 elfnotes ELF notes
+ *
+ * The notes should live in a PT_NOTE segment and have "Xen" in the
+ * name field.
+ *
+ * Numeric types are either 4 or 8 bytes depending on the content of
+ * the desc field.
+ *
+ * LEGACY indicated the fields in the legacy __xen_guest string which
+ * this a note type replaces.
+ *
+ * String values (for non-legacy) are NULL terminated ASCII, also known
+ * as ASCIZ type.
+ */
+
+/*
+ * NAME=VALUE pair (string).
+ */
+#define XEN_ELFNOTE_INFO   0
+
+/*
+ * The virtual address of the entry point (numeric).
+ *
+ * LEGACY: VIRT_ENTRY
+ */
+#define XEN_ELFNOTE_ENTRY  1
+
+/* The virtual address of the hypercall transfer page (numeric).
+ *
+ * LEGACY: HYPERCALL_PAGE. (n.b. legacy value is a physical page
+ * number not a virtual address)
+ */
+#define XEN_ELFNOTE_HYPERCALL_PAGE 2
+
+/* The virtual address where the kernel image should be mapped (numeric).

[Xen-devel] [PATCH v3 10/10] xen: add capability to load p2m list outside of kernel mapping

2016-02-17 Thread Juergen Gross
Modern pvops Linux kernels support a p2m list not covered by the
kernel mapping. This capability is flagged by an ELF note specifying
the virtual address the kernel is expecting the p2m list to be mapped
to.

If the ELF note is set by the kernel, don't place the p2m list
into the kernel mapping, but map it to the given address. This
allows supporting domains with larger memory, as the kernel mapping is
limited to 2GB and a domain with huge memory in the TB range will have
a p2m list larger than this.
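
The size argument can be checked with a little arithmetic. The sketch
below assumes 4 KiB pages and 8-byte p2m entries (a 64-bit guest);
p2m_list_size() and p2m_fits_kernel_mapping() are hypothetical helpers,
and the check ignores everything else that has to share the kernel
mapping.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Kernel mapping limit of 2GB, as stated in the commit message.  */
#define KERNEL_MAPPING_LIMIT (2ULL << 30)

/* Bytes needed for a p2m list covering mem_bytes of guest memory,
   rounded up to whole pages.  Assumes one 8-byte entry per page.  */
static uint64_t
p2m_list_size (uint64_t mem_bytes)
{
  uint64_t nr_pages;
  uint64_t size;

  assert ((mem_bytes & (PAGE_SIZE - 1)) == 0);
  nr_pages = mem_bytes >> PAGE_SHIFT;
  size = nr_pages * sizeof (uint64_t);
  return (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
}

/* Non-zero if the p2m list alone is smaller than the mapping limit.  */
static int
p2m_fits_kernel_mapping (uint64_t mem_bytes)
{
  return p2m_list_size (mem_bytes) < KERNEL_MAPPING_LIMIT;
}
```

Under these assumptions a 1 GiB domain needs a 2 MiB list, while a
1 TiB domain needs 2 GiB for the list alone, which already uses up the
whole kernel mapping.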

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c| 53 +++---
 grub-core/loader/i386/xen_fileXX.c |  4 +++
 include/grub/xen_file.h|  2 ++
 3 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index c514637..5b9bcd7 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -316,21 +316,47 @@ static grub_err_t
 grub_xen_p2m_alloc (void)
 {
   grub_relocator_chunk_t ch;
-  grub_size_t p2msize;
+  grub_size_t p2msize, p2malloc;
   grub_err_t err;
+  struct grub_xen_mapping *map;
+
+  map = xen_state.mappings + xen_state.n_mappings;
+  p2msize = ALIGN_UP (sizeof (grub_xen_mfn_t) *
+ grub_xen_start_page_addr->nr_pages, PAGE_SIZE);
+  if (xen_state.xen_inf.has_p2m_base)
+{
+  err = get_pgtable_size (xen_state.xen_inf.p2m_base,
+ xen_state.xen_inf.p2m_base + p2msize,
+ (xen_state.max_addr + p2msize) >> PAGE_SHIFT);
+  if (err)
+   return err;
+
+  map->area.pfn_start = xen_state.max_addr >> PAGE_SHIFT;
+  p2malloc = p2msize + page2offset (map->area.n_pt_pages);
+  xen_state.n_mappings++;
+  xen_state.next_start.mfn_list = xen_state.xen_inf.p2m_base;
+  xen_state.next_start.first_p2m_pfn = map->area.pfn_start;
+  xen_state.next_start.nr_p2m_frames = p2malloc >> PAGE_SHIFT;
+}
+  else
+{
+  xen_state.next_start.mfn_list =
+   xen_state.max_addr + xen_state.xen_inf.virt_base;
+  p2malloc = p2msize;
+}
 
   xen_state.state.mfn_list = xen_state.max_addr;
-  xen_state.next_start.mfn_list =
-xen_state.max_addr + xen_state.xen_inf.virt_base;
-  p2msize = sizeof (grub_xen_mfn_t) * grub_xen_start_page_addr->nr_pages;
   err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
-xen_state.max_addr, p2msize);
+xen_state.max_addr, p2malloc);
   if (err)
 return err;
   xen_state.virt_mfn_list = get_virtual_current_address (ch);
+  if (xen_state.xen_inf.has_p2m_base)
+map->where = (grub_uint64_t *) xen_state.virt_mfn_list +
+p2msize / sizeof (grub_uint64_t);
   grub_memcpy (xen_state.virt_mfn_list,
   (void *) grub_xen_start_page_addr->mfn_list, p2msize);
-  xen_state.max_addr = ALIGN_UP (xen_state.max_addr + p2msize, PAGE_SIZE);
+  xen_state.max_addr += p2malloc;
 
   return GRUB_ERR_NONE;
 }
@@ -445,9 +471,12 @@ grub_xen_alloc_end (void)
 return GRUB_ERR_NONE;
   xen_state.alloc_end_called = 1;
 
-  err = grub_xen_p2m_alloc ();
-  if (err)
-return err;
+  if (!xen_state.xen_inf.has_p2m_base)
+{
+  err = grub_xen_p2m_alloc ();
+  if (err)
+   return err;
+}
   err = grub_xen_special_alloc ();
   if (err)
 return err;
@@ -472,6 +501,12 @@ grub_xen_boot (void)
   err = grub_xen_alloc_end ();
   if (err)
 return err;
+  if (xen_state.xen_inf.has_p2m_base)
+{
+  err = grub_xen_p2m_alloc ();
+  if (err)
+   return err;
+}
 
   err = set_mfns (xen_state.console_pfn);
   if (err)
diff --git a/grub-core/loader/i386/xen_fileXX.c 
b/grub-core/loader/i386/xen_fileXX.c
index d68634d..c92b807 100644
--- a/grub-core/loader/i386/xen_fileXX.c
+++ b/grub-core/loader/i386/xen_fileXX.c
@@ -259,6 +259,10 @@ parse_note (grub_elf_t elf, struct grub_xen_file_info *xi,
  descsz == 2 ? 2 : 3) == 0)
xi->arch = GRUB_XEN_FILE_I386;
  break;
+   case XEN_ELFNOTE_INIT_P2M:
+ xi->p2m_base = grub_le_to_cpu_addr (*(Elf_Addr *) desc);
+ xi->has_p2m_base = 1;
+ break;
case XEN_ELFNOTE_MOD_START_PFN:
  xi->unmapped_initrd = !!grub_le_to_cpu32(*(grub_uint32_t *) desc);
  break;
diff --git a/include/grub/xen_file.h b/include/grub/xen_file.h
index ed749fa..6587999 100644
--- a/include/grub/xen_file.h
+++ b/include/grub/xen_file.h
@@ -32,9 +32,11 @@ struct grub_xen_file_info
   grub_uint64_t entry_point;
   grub_uint64_t hypercall_page;
   grub_uint64_t paddr_offset;
+  grub_uint64_t p2m_base;
   int has_hypercall_page;
   int has_note;
   int has_xen_guest;
+  int has_p2m_base;
   int extended_cr3;
   int unmapped_initrd;
   enum
-- 
2.6.2


[Xen-devel] [PATCH v3 06/10] xen: factor out allocation of special pages into separate function

2016-02-17 Thread Juergen Gross
Do the allocation of special pages (start info, console and xenbus
ring buffers) in a separate function. This will allow doing the
allocation at different times of the boot preparations depending on
the features the kernel is supporting.
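
The layout this function produces can be sketched as a simple bump
allocation. struct layout and special_alloc() are hypothetical names
for this example; the arithmetic follows the patch: the start info is
placed at max_addr, then two pages (console and xenbus) follow on the
next page boundary.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define ALIGN_UP(addr, align) \
  (((addr) + (align) - 1) & ~((uint64_t) (align) - 1))

/* Hypothetical subset of the loader state.  */
struct layout
{
  uint64_t max_addr;            /* Next free physical address.  */
  uint64_t start_info_addr;     /* Where the start info is placed.  */
  uint64_t console_pfn;         /* First of the two ring pages.  */
};

/* Reserve the start info plus the console and xenbus pages.  */
static void
special_alloc (struct layout *l, uint64_t start_info_size)
{
  assert ((l->max_addr & (PAGE_SIZE - 1)) == 0);

  l->start_info_addr = l->max_addr;
  l->max_addr = ALIGN_UP (l->max_addr + start_info_size, PAGE_SIZE);

  /* Console page first, xenbus page directly after it.  */
  l->console_pfn = l->max_addr >> PAGE_SHIFT;
  l->max_addr += 2 * PAGE_SIZE;
}
```

Because console_pfn is remembered in the state, the later call
corresponding to set_mfns() no longer has to happen right where the
pages are reserved.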

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c | 48 +
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index ca293ac..69d0e65 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -48,6 +48,8 @@ struct xen_loader_state {
   struct start_info next_start;
   struct grub_xen_file_info xen_inf;
   grub_xen_mfn_t *virt_mfn_list;
+  struct start_info *virt_start_info;
+  grub_xen_mfn_t console_pfn;
   grub_uint64_t max_addr;
   struct xen_multiboot_mod_list *module_info_page;
   grub_uint64_t modules_target_start;
@@ -240,22 +242,10 @@ grub_xen_p2m_alloc (void)
 }
 
 static grub_err_t
-grub_xen_boot (void)
+grub_xen_special_alloc (void)
 {
   grub_relocator_chunk_t ch;
   grub_err_t err;
-  struct start_info *nst;
-  grub_uint64_t nr_info_pages;
-  grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
-  struct gnttab_set_version gnttab_setver;
-  grub_size_t i;
-
-  if (grub_xen_n_allocated_shared_pages)
-return grub_error (GRUB_ERR_BUG, "active grants");
-
-  err = grub_xen_p2m_alloc ();
-  if (err)
-return err;
 
   err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
 xen_state.max_addr,
@@ -263,9 +253,11 @@ grub_xen_boot (void)
   if (err)
 return err;
   xen_state.state.start_info = xen_state.max_addr + 
xen_state.xen_inf.virt_base;
-  nst = get_virtual_current_address (ch);
+  xen_state.virt_start_info = get_virtual_current_address (ch);
   xen_state.max_addr =
 ALIGN_UP (xen_state.max_addr + sizeof (xen_state.next_start), PAGE_SIZE);
+  xen_state.console_pfn = xen_state.max_addr >> PAGE_SHIFT;
+  xen_state.max_addr += 2 * PAGE_SIZE;
 
   xen_state.next_start.nr_pages = grub_xen_start_page_addr->nr_pages;
   grub_memcpy (xen_state.next_start.magic, grub_xen_start_page_addr->magic,
@@ -275,10 +267,28 @@ grub_xen_boot (void)
   xen_state.next_start.console.domU = grub_xen_start_page_addr->console.domU;
   xen_state.next_start.shared_info = grub_xen_start_page_addr->shared_info;
 
-  err = set_mfns (xen_state.max_addr >> PAGE_SHIFT);
+  return GRUB_ERR_NONE;
+}
+
+static grub_err_t
+grub_xen_boot (void)
+{
+  grub_relocator_chunk_t ch;
+  grub_err_t err;
+  grub_uint64_t nr_info_pages;
+  grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
+  struct gnttab_set_version gnttab_setver;
+  grub_size_t i;
+
+  if (grub_xen_n_allocated_shared_pages)
+return grub_error (GRUB_ERR_BUG, "active grants");
+
+  err = grub_xen_p2m_alloc ();
+  if (err)
+return err;
+  err = grub_xen_special_alloc ();
   if (err)
 return err;
-  xen_state.max_addr += 2 * PAGE_SIZE;
 
   xen_state.next_start.pt_base =
 xen_state.max_addr + xen_state.xen_inf.virt_base;
@@ -309,6 +319,10 @@ grub_xen_boot (void)
   if (err)
 return err;
 
+  err = set_mfns (xen_state.console_pfn);
+  if (err)
+return err;
+
   generate_page_table (get_virtual_current_address (ch),
   xen_state.max_addr >> PAGE_SHIFT, nr_pages,
   xen_state.xen_inf.virt_base, xen_state.virt_mfn_list);
@@ -321,7 +335,7 @@ grub_xen_boot (void)
   xen_state.next_start.nr_pt_frames = nr_pt_pages;
   xen_state.state.paging_size = nr_pt_pages;
 
-  *nst = xen_state.next_start;
+  *xen_state.virt_start_info = xen_state.next_start;
 
   grub_memset (&gnttab_setver, 0, sizeof (gnttab_setver));
 
-- 
2.6.2


[Xen-devel] [PATCH v3 04/10] xen: synchronize xen header

2016-02-17 Thread Juergen Gross
Get the current version of include/xen/xen.h from the Xen repository
in order to be able to use constants defined there.

Signed-off-by: Juergen Gross 
---
 include/xen/arch-x86/xen-x86_32.h |  22 +++
 include/xen/arch-x86/xen-x86_64.h |   8 +--
 include/xen/xen.h | 125 +++---
 3 files changed, 105 insertions(+), 50 deletions(-)

diff --git a/include/xen/arch-x86/xen-x86_32.h 
b/include/xen/arch-x86/xen-x86_32.h
index 1504191..7eca6cd 100644
--- a/include/xen/arch-x86/xen-x86_32.h
+++ b/include/xen/arch-x86/xen-x86_32.h
@@ -58,34 +58,31 @@
 #define __HYPERVISOR_VIRT_START_PAE0xF580
 #define __MACH2PHYS_VIRT_START_PAE 0xF580
 #define __MACH2PHYS_VIRT_END_PAE   0xF680
-#define HYPERVISOR_VIRT_START_PAE  \
-mk_unsigned_long(__HYPERVISOR_VIRT_START_PAE)
-#define MACH2PHYS_VIRT_START_PAE   \
-mk_unsigned_long(__MACH2PHYS_VIRT_START_PAE)
-#define MACH2PHYS_VIRT_END_PAE \
-mk_unsigned_long(__MACH2PHYS_VIRT_END_PAE)
+#define HYPERVISOR_VIRT_START_PAE  
xen_mk_ulong(__HYPERVISOR_VIRT_START_PAE)
+#define MACH2PHYS_VIRT_START_PAE   xen_mk_ulong(__MACH2PHYS_VIRT_START_PAE)
+#define MACH2PHYS_VIRT_END_PAE xen_mk_ulong(__MACH2PHYS_VIRT_END_PAE)
 
 /* Non-PAE bounds are obsolete. */
 #define __HYPERVISOR_VIRT_START_NONPAE 0xFC00
 #define __MACH2PHYS_VIRT_START_NONPAE  0xFC00
 #define __MACH2PHYS_VIRT_END_NONPAE0xFC40
 #define HYPERVISOR_VIRT_START_NONPAE   \
-mk_unsigned_long(__HYPERVISOR_VIRT_START_NONPAE)
+xen_mk_ulong(__HYPERVISOR_VIRT_START_NONPAE)
 #define MACH2PHYS_VIRT_START_NONPAE\
-mk_unsigned_long(__MACH2PHYS_VIRT_START_NONPAE)
+xen_mk_ulong(__MACH2PHYS_VIRT_START_NONPAE)
 #define MACH2PHYS_VIRT_END_NONPAE  \
-mk_unsigned_long(__MACH2PHYS_VIRT_END_NONPAE)
+xen_mk_ulong(__MACH2PHYS_VIRT_END_NONPAE)
 
 #define __HYPERVISOR_VIRT_START __HYPERVISOR_VIRT_START_PAE
 #define __MACH2PHYS_VIRT_START  __MACH2PHYS_VIRT_START_PAE
 #define __MACH2PHYS_VIRT_END__MACH2PHYS_VIRT_END_PAE
 
 #ifndef HYPERVISOR_VIRT_START
-#define HYPERVISOR_VIRT_START mk_unsigned_long(__HYPERVISOR_VIRT_START)
+#define HYPERVISOR_VIRT_START xen_mk_ulong(__HYPERVISOR_VIRT_START)
 #endif
 
-#define MACH2PHYS_VIRT_START  mk_unsigned_long(__MACH2PHYS_VIRT_START)
-#define MACH2PHYS_VIRT_ENDmk_unsigned_long(__MACH2PHYS_VIRT_END)
+#define MACH2PHYS_VIRT_START  xen_mk_ulong(__MACH2PHYS_VIRT_START)
+#define MACH2PHYS_VIRT_ENDxen_mk_ulong(__MACH2PHYS_VIRT_END)
 #define MACH2PHYS_NR_ENTRIES  ((MACH2PHYS_VIRT_END-MACH2PHYS_VIRT_START)>>2)
 #ifndef machine_to_phys_mapping
 #define machine_to_phys_mapping ((unsigned long *)MACH2PHYS_VIRT_START)
@@ -104,6 +101,7 @@
 do { if ( sizeof(hnd) == 8 ) *(uint64_t *)&(hnd) = 0;   \
  (hnd).p = val; \
 } while ( 0 )
+#define  int64_aligned_t  int64_t __attribute__((aligned(8)))
 #define uint64_aligned_t uint64_t __attribute__((aligned(8)))
 #define __XEN_GUEST_HANDLE_64(name) __guest_handle_64_ ## name
 #define XEN_GUEST_HANDLE_64(name) __XEN_GUEST_HANDLE_64(name)
diff --git a/include/xen/arch-x86/xen-x86_64.h 
b/include/xen/arch-x86/xen-x86_64.h
index 1c4e159..5e18613 100644
--- a/include/xen/arch-x86/xen-x86_64.h
+++ b/include/xen/arch-x86/xen-x86_64.h
@@ -76,12 +76,12 @@
 #define __MACH2PHYS_VIRT_END0x8040
 
 #ifndef HYPERVISOR_VIRT_START
-#define HYPERVISOR_VIRT_START mk_unsigned_long(__HYPERVISOR_VIRT_START)
-#define HYPERVISOR_VIRT_END   mk_unsigned_long(__HYPERVISOR_VIRT_END)
+#define HYPERVISOR_VIRT_START xen_mk_ulong(__HYPERVISOR_VIRT_START)
+#define HYPERVISOR_VIRT_END   xen_mk_ulong(__HYPERVISOR_VIRT_END)
 #endif
 
-#define MACH2PHYS_VIRT_START  mk_unsigned_long(__MACH2PHYS_VIRT_START)
-#define MACH2PHYS_VIRT_ENDmk_unsigned_long(__MACH2PHYS_VIRT_END)
+#define MACH2PHYS_VIRT_START  xen_mk_ulong(__MACH2PHYS_VIRT_START)
+#define MACH2PHYS_VIRT_ENDxen_mk_ulong(__MACH2PHYS_VIRT_END)
 #define MACH2PHYS_NR_ENTRIES  ((MACH2PHYS_VIRT_END-MACH2PHYS_VIRT_START)>>3)
 #ifndef machine_to_phys_mapping
 #define machine_to_phys_mapping ((unsigned long *)HYPERVISOR_VIRT_START)
diff --git a/include/xen/xen.h b/include/xen/xen.h
index a6a2092..6c9e42b 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -52,6 +52,19 @@ DEFINE_XEN_GUEST_HANDLE(void);
 DEFINE_XEN_GUEST_HANDLE(uint64_t);
 DEFINE_XEN_GUEST_HANDLE(xen_pfn_t);
 DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
+
+/* Turn a plain number into a C unsigned (long) constant. */
+#define __xen_mk_uint(x)  x ## U
+#define __xen_mk_ulong(x) x ## UL
+#define xen_mk_uint(x)__xen_mk_uint(x)
+#define xen_mk_ulong(x)   __xen_mk_ulong(x)
+
+#else
+
+/* In assembly code we cannot use C numeric constant suffixes. */
+#define xen_mk_uint(x)  x
+#define xen_mk_ulong(x) x
+
 #endif
 
 /*
@@ -101,6 +114,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_kexec_op 37
 #define __HYPERVISOR_tmem_op 

[Xen-devel] [PATCH v3 05/10] xen: factor out p2m list allocation into separate function

2016-02-17 Thread Juergen Gross
Do the p2m list allocation of the to-be-loaded kernel in a separate
function. This will allow doing the p2m list allocation at different
times of the boot preparations depending on the features the kernel
is supporting.

While at it, remove the superfluous setting of first_p2m_pfn and
nr_p2m_frames, as those are needed only if the p2m list is not
mapped by the initial kernel mapping.

Signed-off-by: Juergen Gross 
---
 grub-core/loader/i386/xen.c | 87 ++---
 1 file changed, 50 insertions(+), 37 deletions(-)

diff --git a/grub-core/loader/i386/xen.c b/grub-core/loader/i386/xen.c
index d5fe168..ca293ac 100644
--- a/grub-core/loader/i386/xen.c
+++ b/grub-core/loader/i386/xen.c
@@ -44,8 +44,10 @@ GRUB_MOD_LICENSE ("GPLv3+");
 
 struct xen_loader_state {
   struct grub_relocator *relocator;
+  struct grub_relocator_xen_state state;
   struct start_info next_start;
   struct grub_xen_file_info xen_inf;
+  grub_xen_mfn_t *virt_mfn_list;
   grub_uint64_t max_addr;
   struct xen_multiboot_mod_list *module_info_page;
   grub_uint64_t modules_target_start;
@@ -170,7 +172,7 @@ generate_page_table (grub_uint64_t *where, grub_uint64_t 
paging_start,
 }
 
 static grub_err_t
-set_mfns (grub_xen_mfn_t * new_mfn_list, grub_xen_mfn_t pfn)
+set_mfns (grub_xen_mfn_t pfn)
 {
   grub_xen_mfn_t i, t;
   grub_xen_mfn_t cn_pfn = -1, st_pfn = -1;
@@ -179,32 +181,34 @@ set_mfns (grub_xen_mfn_t * new_mfn_list, grub_xen_mfn_t 
pfn)
 
   for (i = 0; i < grub_xen_start_page_addr->nr_pages; i++)
 {
-  if (new_mfn_list[i] == grub_xen_start_page_addr->console.domU.mfn)
+  if (xen_state.virt_mfn_list[i] ==
+ grub_xen_start_page_addr->console.domU.mfn)
cn_pfn = i;
-  if (new_mfn_list[i] == grub_xen_start_page_addr->store_mfn)
+  if (xen_state.virt_mfn_list[i] == grub_xen_start_page_addr->store_mfn)
st_pfn = i;
 }
   if (cn_pfn == (grub_xen_mfn_t)-1)
 return grub_error (GRUB_ERR_BUG, "no console");
   if (st_pfn == (grub_xen_mfn_t)-1)
 return grub_error (GRUB_ERR_BUG, "no store");
-  t = new_mfn_list[pfn];
-  new_mfn_list[pfn] = new_mfn_list[cn_pfn];
-  new_mfn_list[cn_pfn] = t;
-  t = new_mfn_list[pfn + 1];
-  new_mfn_list[pfn + 1] = new_mfn_list[st_pfn];
-  new_mfn_list[st_pfn] = t;
+  t = xen_state.virt_mfn_list[pfn];
+  xen_state.virt_mfn_list[pfn] = xen_state.virt_mfn_list[cn_pfn];
+  xen_state.virt_mfn_list[cn_pfn] = t;
+  t = xen_state.virt_mfn_list[pfn + 1];
+  xen_state.virt_mfn_list[pfn + 1] = xen_state.virt_mfn_list[st_pfn];
+  xen_state.virt_mfn_list[st_pfn] = t;
 
-  m2p_updates[0].ptr = page2offset (new_mfn_list[pfn]) | MMU_MACHPHYS_UPDATE;
+  m2p_updates[0].ptr =
+page2offset (xen_state.virt_mfn_list[pfn]) | MMU_MACHPHYS_UPDATE;
   m2p_updates[0].val = pfn;
   m2p_updates[1].ptr =
-page2offset (new_mfn_list[pfn + 1]) | MMU_MACHPHYS_UPDATE;
+page2offset (xen_state.virt_mfn_list[pfn + 1]) | MMU_MACHPHYS_UPDATE;
   m2p_updates[1].val = pfn + 1;
   m2p_updates[2].ptr =
-page2offset (new_mfn_list[cn_pfn]) | MMU_MACHPHYS_UPDATE;
+page2offset (xen_state.virt_mfn_list[cn_pfn]) | MMU_MACHPHYS_UPDATE;
   m2p_updates[2].val = cn_pfn;
   m2p_updates[3].ptr =
-page2offset (new_mfn_list[st_pfn]) | MMU_MACHPHYS_UPDATE;
+page2offset (xen_state.virt_mfn_list[st_pfn]) | MMU_MACHPHYS_UPDATE;
   m2p_updates[3].val = st_pfn;
 
   grub_xen_mmu_update (m2p_updates, 4, NULL, DOMID_SELF);
@@ -213,43 +217,52 @@ set_mfns (grub_xen_mfn_t * new_mfn_list, grub_xen_mfn_t 
pfn)
 }
 
 static grub_err_t
+grub_xen_p2m_alloc (void)
+{
+  grub_relocator_chunk_t ch;
+  grub_size_t p2msize;
+  grub_err_t err;
+
+  xen_state.state.mfn_list = xen_state.max_addr;
+  xen_state.next_start.mfn_list =
+xen_state.max_addr + xen_state.xen_inf.virt_base;
+  p2msize = sizeof (grub_xen_mfn_t) * grub_xen_start_page_addr->nr_pages;
+  err = grub_relocator_alloc_chunk_addr (xen_state.relocator, &ch,
+xen_state.max_addr, p2msize);
+  if (err)
+return err;
+  xen_state.virt_mfn_list = get_virtual_current_address (ch);
+  grub_memcpy (xen_state.virt_mfn_list,
+  (void *) grub_xen_start_page_addr->mfn_list, p2msize);
+  xen_state.max_addr = ALIGN_UP (xen_state.max_addr + p2msize, PAGE_SIZE);
+
+  return GRUB_ERR_NONE;
+}
+
+static grub_err_t
 grub_xen_boot (void)
 {
-  struct grub_relocator_xen_state state;
   grub_relocator_chunk_t ch;
   grub_err_t err;
-  grub_size_t pgtsize;
   struct start_info *nst;
   grub_uint64_t nr_info_pages;
   grub_uint64_t nr_pages, nr_pt_pages, nr_need_pages;
   struct gnttab_set_version gnttab_setver;
-  grub_xen_mfn_t *new_mfn_list;
   grub_size_t i;
 
   if (grub_xen_n_allocated_shared_pages)
 return grub_error (GRUB_ERR_BUG, "active grants");
 
-  state.mfn_list = xen_state.max_addr;
-  xen_state.next_start.mfn_list =
-xen_state.max_addr + xen_state.xen_inf.virt_base;
-  xen_state.next_start.first_p2m_pfn = xen_state.max_addr >> PAGE_SHIFT;

[Xen-devel] [PATCH v3 00/10] grub-xen: support booting huge pv-domains

2016-02-17 Thread Juergen Gross
The Xen hypervisor supports starting a dom0 with large memory (up to
the TB range) by not including the initrd and p2m list in the initial
kernel mapping. The p2m list in particular can grow larger than the
virtual address space available in the initial mapping.
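
To give a feel for the sizes involved, here is a minimal sketch of the
arithmetic, assuming 4 KiB pages and one 8-byte p2m entry per guest page
(the 64-bit case). The helper names nr_pages_for() and p2m_list_bytes()
are made up for illustration and are not part of grub or Xen.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define PAGE_SHIFT 12

/* Number of guest pages backing mem_bytes of memory. */
static uint64_t nr_pages_for(uint64_t mem_bytes)
{
  return mem_bytes >> PAGE_SHIFT;
}

/* Size in bytes of a p2m list holding one entry per guest page.
   entry_size is assumed to be 8 for a 64-bit guest. */
static uint64_t p2m_list_bytes(uint64_t nr_pages, uint64_t entry_size)
{
  return nr_pages * entry_size;
}
```

Under these assumptions a guest with 900 GiB of memory has 235929600
pages, so p2m_list_bytes(nr_pages_for(900ULL << 30), 8) yields
1887436800 bytes (1800 MiB), while a 1 GiB guest needs only 2 MiB.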

The kernel being started indicates its support for each feature via
ELF notes.

This series enables grub-xen to do the same as the hypervisor.

Tested with:
- 32 bit domU (kernel not supporting unmapped initrd)
- 32 bit domU (kernel supporting unmapped initrd)
- 1 GB 64 bit domU (kernel supporting unmapped initrd, not p2m)
- 1 GB 64 bit domU (kernel supporting unmapped initrd and p2m)
- 900 GB 64 bit domU (kernel supporting unmapped initrd and p2m)

Changes in V3:
- added new patch 1 (free memory in case of error) as requested by
  Daniel Kiper
- added new patch 2 (avoid global variables) as requested by Daniel Kiper
- added new patch 3 (use constants for elf notes) as requested by Daniel Kiper
- added new patch 4 (sync with new Xen headers) in order to use constants
  in assembly code
- modified patch 9 (was patch 5) to use constants instead of numbers and
  added lots of comments to assembly code as requested by Daniel Kiper

Changes in V2:
- rebased patch 5 to current master

Juergen Gross (10):
  xen: make xen loader callable multiple times
  xen: reduce number of global variables in xen loader
  xen: add elfnote.h to avoid using numbers instead of constants
  xen: synchronize xen header
  xen: factor out p2m list allocation into separate function
  xen: factor out allocation of special pages into separate function
  xen: factor out allocation of page tables into separate function
  xen: add capability to load initrd outside of initial mapping
  xen: modify page table construction
  xen: add capability to load p2m list outside of kernel mapping

 grub-core/lib/i386/xen/relocator.S   |  87 ++--
 grub-core/lib/x86_64/xen/relocator.S | 134 +++---
 grub-core/lib/xen/relocator.c        |  25 +-
 grub-core/loader/i386/xen.c          | 778 +++
 grub-core/loader/i386/xen_fileXX.c   |  43 +-
 include/grub/i386/memory.h           |   7 +
 include/grub/xen/relocator.h         |   6 +-
 include/grub/xen_file.h              |   3 +
 include/xen/arch-x86/xen-x86_32.h    |  22 +-
 include/xen/arch-x86/xen-x86_64.h    |   8 +-
 include/xen/elfnote.h                | 281 +
 include/xen/xen.h                    | 125 --
 12 files changed, 1070 insertions(+), 449 deletions(-)
 create mode 100644 include/xen/elfnote.h

-- 
2.6.2


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

