[Xen-devel] [qemu-upstream-4.3-testing test] 56373: regressions - FAIL

2015-05-14 Thread osstest service user
flight 56373 qemu-upstream-4.3-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/56373/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-amd64 15 guest-localmigrate.2   fail REGR. vs. 50282

Tests which are failing intermittently (not blocking):
 test-amd64-i386-freebsd10-amd64 13 guest-localmigrate fail in 55875 pass in 56373
 test-amd64-amd64-xl-qemuu-win7-amd64 15 guest-localmigrate/x10 fail in 55875 pass in 56373
 test-amd64-i386-xl-qemuu-win7-amd64 12 guest-localmigrate fail in 55875 pass in 56373
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail in 55875 pass in 56373
 test-amd64-i386-libvirt  11 guest-start fail pass in 55875

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  12 migrate-support-check fail in 55875 never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64  9 debian-hvm-install fail never pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  9 debian-hvm-install  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 16 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 16 guest-stop   fail never pass

version targeted for testing:
 qemuu d7b34893e0ad5c84d898b34fda8a465dfd7e8376
baseline version:
 qemuu 7f34050dc014ae8f4078d48aec97ec6553151bf2


People who touched revisions under test:
  Petr Matousek 


jobs:
 build-amd64                                  pass
 build-i386                                   pass
 build-amd64-libvirt                          pass
 build-i386-libvirt                           pass
 build-amd64-pvops                            pass
 build-i386-pvops                             pass
 test-amd64-amd64-xl                          pass
 test-amd64-i386-xl                           pass
 test-amd64-i386-qemuu-rhel6hvm-amd           pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-freebsd10-amd64              fail
 test-amd64-amd64-xl-qemuu-ovmf-amd64         fail
 test-amd64-i386-xl-qemuu-ovmf-amd64          fail
 test-amd64-amd64-xl-qemuu-win7-amd64         fail
 test-amd64-i386-xl-qemuu-win7-amd64          fail
 test-amd64-amd64-xl-credit2                  pass
 test-amd64-i386-freebsd10-i386               pass
 test-amd64-i386-qemuu-rhel6hvm-intel         pass
 test-amd64-amd64-libvirt                     pass
 test-amd64-i386-libvirt                      fail
 test-amd64-amd64-xl-multivcpu                pass
 test-amd64-amd64-pair                        pass
 test-amd64-i386-pair                         pass
 test-amd64-amd64-xl-sedf-pin                 pass
 test-amd64-amd64-pv                          pass
 test-amd64-i386-pv                           pass
 test-amd64-amd64-xl-sedf                     pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1     fail
 test-amd64-amd64-xl-qemuu-winxpsp3           fail



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/osstest/pub/logs
images: /home/osstest/pub/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.


commit d7b34893e0ad5c84d898b34fda8a465dfd7e8376
Author: Petr Matousek 
Date:   Wed May 6 09:48:59 2015 +0200

fdc: force the fifo access to be in bounds of the allocated buffer

During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.

Fix this by making sure that the index is always bounded by the
allocated memory.

This is CVE-2015-3456.

Signed-off-by: Petr Matousek 
Revie
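
The bounding idea in the commit message above can be sketched as follows. This is only an illustration under assumptions, not the QEMU patch itself: the names toy_fdc, FIFO_LEN and fifo_write are hypothetical, and the buffer size is picked arbitrarily. The point is that the guest-influenced index is reduced modulo the buffer length before every access.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_LEN 512u   /* hypothetical buffer size for this sketch */

struct toy_fdc {
    uint8_t  fifo[FIFO_LEN];
    uint32_t data_pos;  /* write index that guest commands can advance */
};

/* Store one byte. The index is reduced modulo the buffer length first,
 * so the access cannot land outside fifo[] whatever data_pos holds. */
static void fifo_write(struct toy_fdc *fdc, uint8_t value)
{
    uint32_t pos = fdc->data_pos++ % FIFO_LEN;

    assert(pos < FIFO_LEN);
    fdc->fifo[pos] = value;
}
```

With data_pos already past the end of the buffer, the write wraps around to the start of fifo[] instead of corrupting adjacent memory.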

Re: [Xen-devel] [RFC][PATCH 13/13] hvmloader/e820: construct guest e820 table

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 08:39,  wrote:
> On 2015/5/15 14:25, Jan Beulich wrote:
> On 15.05.15 at 08:11,  wrote:
>>> Even we may separate the
>>> low memory to construct memory_map.map[]...
>>
>> ???
> 
> Sorry I just mean that the low memory is not represented with only one 
> memory_map.map[] in some cases.

That's correct.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 0/9] Porting the intel_pstate driver to Xen

2015-05-14 Thread Wang, Wei W
Hi Jan,

> On 28/04/2015 16:37, Wei Wang wrote
> > Changes:
> > *NO.1 The intel_pstate driver can be controlled via two ways:
> > A. min_perf_pct and max_perf_pct
> >The user directly adjusts min_perf_pct and max_perf_pct to get what
> >they want. For example, if min_perf_pct=max_perf_pct=60%, then the
> >user is asking for something similar to a userspace governor with
> >setting the requested performance=60%.
> > B. set-scaling-governor
> >This one is functionally redundant, since A. can achieve all the
> > >governor functions. It is retained to give people time to get
> > >familiar with method A.
> > >Users can choose from the four governors: Powersave, Ondemand,
> > >Userspace, Performance. The driver achieves the functionality of
> >the selected governor via adjusting the min_perf_pct and max_perf_pct
> >itself.
> >
> > *NO.2 The xenpm "get-cpufreq-para" displays the following things:
> > cpu id   : 10
> > affected_cpus: 10
> > cpuinfo frequency: max [370] min [120] cur [140]
> > scaling_driver   : intel_pstate
> > scaling_avail_gov: performance powersave userspace ondemand
> > current_governor : ondemand
> > max_perf_pct : 100
> > min_perf_pct : 32
> > turbo_pct: 54
> > turbo mode   : enabled
> >
> > *NO.3 Changed "intel_pstate=disable" to "intel_pstate=enable".
> > If "intel_pstate=enable" is added, but the CPU does not support the
> > intel_pstate driver, the old P-state driver (acpi-cpufreq) will be loaded.
> >
> > *NO.4 Moved the declarations under xen/include/acpi to an x86-specific
> > header.
> >
> > ** Basic Description **
> > This patch series ports the intel_pstate driver from the Linux kernel to 
> > Xen.
> > The intel_pstate driver is used to tune P states for SandyBridge+
> > processors.
> > It needs to be enabled by adding "intel_pstate=enable" to the booting
> > parameter list.
> >
> > The intel_pstate.c file under xen/arch/x86/acpi/cpufreq/ contains all
> > the logic for selecting the current P-state. It follows its
> > implementation in the kernel. In order to better support future Intel
> > CPUs (e.g. the HWP feature on
> > Skylake+), intel_pstate changes to tune P-state based on percentage
> > values.
> >
> > The xenpm tool is also upgraded to support the intel_pstate driver. If
> > intel_pstate is used, "get-cpufreq-para" displays percentage value
> > based feedback. If the intel_pstate driver is not enabled, xenpm will
> > work in the old style.
> >
> >
> > Wei Wang (9):
> >   x86/acpi: add a common interface for x86 cpu matching
> >   x86/intel_pstate: add some calculation related support
> >   x86/cpu_hotplug: add the unregister_cpu_notifier function to support
> > CPU hotplug
> >   x86/intel_pstate: add new policy fields and a new driver interface
> >   x86/intel_pstate: relocate the driver register/unregister function
> >   x86/intel_pstate: the main boby of the intel_pstate driver
> >   x86/intel_pstate: add a booting param to select the driver to load
> >   x86/intel_pstate: support the use of intel_pstate in pmstat.c
> >   x86/intel_pstate: enable xenpm to control the intel_pstate driver
> >
> >  tools/libxc/include/xenctrl.h|  14 +-
> >  tools/libxc/xc_pm.c  |  17 +-
> >  tools/misc/xenpm.c   | 104 +++-
> >  xen/arch/x86/acpi/cpufreq/Makefile   |   1 +
> >  xen/arch/x86/acpi/cpufreq/cpufreq.c  |   9 +-
> >  xen/arch/x86/acpi/cpufreq/intel_pstate.c | 869 +++
> >  xen/arch/x86/cpu/common.c|  39 ++
> >  xen/arch/x86/cpu/mwait-idle.c|  30 +-
> >  xen/common/cpu.c |   7 +
> >  xen/drivers/acpi/pmstat.c| 106 +++-
> >  xen/drivers/cpufreq/cpufreq.c|  27 +-
> >  xen/drivers/cpufreq/utility.c|   5 +
> >  xen/include/acpi/cpufreq/cpufreq.h   |  45 +-
> >  xen/include/asm-x86/acpi.h   |   4 +
> >  xen/include/asm-x86/cpufeature.h |   1 +
> >  xen/include/asm-x86/div64.h  |  68 +++
> >  xen/include/asm-x86/msr-index.h  |   3 +
> >  xen/include/asm-x86/processor.h  |  10 +
> >  xen/include/public/sysctl.h  |  16 +-
> >  xen/include/xen/cpu.h|   1 +
> >  xen/include/xen/kernel.h |  30 ++
> >  21 files changed, 1300 insertions(+), 106 deletions(-)
> >  create mode 100644 xen/arch/x86/acpi/cpufreq/intel_pstate.c
> >
> > --
> > 1.9.1

Do you have any comments on this version?
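
For readers following the min_perf_pct/max_perf_pct description quoted above, here is a rough sketch of how two percentage limits can bound a requested P-state. It is not taken from the patch series: clamp_pstate and all of its parameters are hypothetical, and the real driver's arithmetic may well differ.

```c
#include <assert.h>

/* Clamp a requested P-state to the window described by the two
 * percentage limits. All names here are hypothetical. */
static int clamp_pstate(int requested, int min_pstate, int turbo_pstate,
                        int min_perf_pct, int max_perf_pct)
{
    int lo, hi;

    assert(min_perf_pct >= 0 && min_perf_pct <= max_perf_pct &&
           max_perf_pct <= 100);
    assert(min_pstate > 0 && min_pstate <= turbo_pstate);

    /* Percentages are taken relative to the highest P-state. */
    lo = turbo_pstate * min_perf_pct / 100;
    hi = turbo_pstate * max_perf_pct / 100;

    /* Never go below what the hardware supports. */
    if (lo < min_pstate)
        lo = min_pstate;
    if (hi < lo)
        hi = lo;

    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}
```

Setting both limits to the same value pins the result to a single P-state, which is the "userspace governor" behaviour described for method A.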

Best,
Wei




Re: [Xen-devel] [RFC][PATCH 13/13] hvmloader/e820: construct guest e820 table

2015-05-14 Thread Chen, Tiejun

On 2015/5/15 14:25, Jan Beulich wrote:

On 15.05.15 at 08:11,  wrote:

On 2015/4/20 22:29, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

@@ -119,10 +120,6 @@ int build_e820_table(struct e820entry *e820,

   /* Low RAM goes here. Reserve space for special pages. */
   BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
-e820[nr].addr = 0x100000;
-e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
-nr++;


I think the above comment needs adjustment with all this code
removed. I also wonder how meaningful the BUG_ON() is with
->low_mem_pgend no longer used for E820 table construction.
Perhaps this needs another BUG_ON() validating that the field
matches some value from memory_map.map[]?


But I think hvm_info->low_mem_pgend is still correct, right?


I think so, but as said it's becoming less used and hence less
relevant here.


Understood.




And
additionally, there's no any obvious flag to indicate which
memory_map.map[x] is that last low memory map.


I didn't imply it would be immediately obvious _how_ to do this.
I'm merely wanting to avoid leaving meaningless BUG_ON()s in
the code, while meaningful ones are amiss.


Maybe we should lookup all .map[] to get the lowest memory map and then 
BUG_ON?





Even we may separate the
low memory to construct memory_map.map[]...


???


Sorry I just mean that the low memory is not represented with only one 
memory_map.map[] in some cases. Is it impossible? Even in the future? Or 
actually we always consider the lowest memory map?
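
One way to picture the check being discussed: walk every map entry, take the highest end address among RAM entries that lie below 4 GiB, and compare that with low_mem_pgend. The sketch below assumes a three-field struct e820entry and the usual E820_RAM value of 1; low_ram_end is a hypothetical helper, not hvmloader code. It copes with low memory being split across several entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM   1u   /* conventional E820 type value for usable RAM */
#define PAGE_SHIFT 12

struct e820entry {      /* assumed layout, for illustration only */
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* End address of the highest RAM entry lying entirely below 4 GiB,
 * or 0 if there is none. */
static uint64_t low_ram_end(const struct e820entry *map, unsigned int nr)
{
    const uint64_t four_gib = (uint64_t)1 << 32;
    uint64_t end = 0;
    unsigned int i;

    assert(map != NULL || nr == 0);
    for (i = 0; i < nr; i++) {
        uint64_t e = map[i].addr + map[i].size;

        if (map[i].type == E820_RAM && e <= four_gib && e > end)
            end = e;
    }
    return end;
}
```

A caller could then assert that (low_ram_end(...) >> PAGE_SHIFT) equals the low_mem_pgend field, which is the kind of BUG_ON() suggested earlier in the thread.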





@@ -159,16 +156,37 @@ int build_e820_table(struct e820entry *e820,
   nr++;
   }

-
-if ( hvm_info->high_mem_pgend )
+/* Construct the remaining according memory_map[]. */
+for ( i = 0; i < memory_map.nr_map; i++ )
   {
-e820[nr].addr = ((uint64_t)1 << 32);
-e820[nr].size =
-((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
+e820[nr].addr = memory_map.map[i].addr;
+e820[nr].size = memory_map.map[i].size;
+e820[nr].type = memory_map.map[i].type;


Afaict you could use structure assignment here to make this
more readable.


Sorry, are you saying this?

memcpy(&e820[nr], &memory_map.map[i], sizeof(struct e820entry));


No, structure assignment (which, other than memcpy(), is type safe):

 e820[nr] = memory_map.map[i];



Understood.

Thanks
Tiejun



Re: [Xen-devel] [PATCH v6 1/2] xen/pvh: use a custom IO bitmap for PVH hardware domains

2015-05-14 Thread Jan Beulich
>>> On 14.05.15 at 17:27,  wrote:
> El 13/05/15 a les 11.53, Jan Beulich ha escrit:
> On 11.05.15 at 16:57,  wrote:
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -42,6 +42,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>  
>>>  /* Linux config option: propageted to domain0 */
>>>  /* xen_processor_pmbits: xen control Cx, Px, ... */
>>> @@ -219,6 +220,7 @@ static int late_hwdom_init(struct domain *d)
>>>  rangeset_swap(d->iomem_caps, dom0->iomem_caps);
>>>  #ifdef CONFIG_X86
>>>  rangeset_swap(d->arch.ioport_caps, dom0->arch.ioport_caps);
>>> +setup_io_bitmap(d);
>>>  #endif
>> 
>> Considering that rangesets are getting swapped rather than
>> copied, I think you also need to reset Dom0's I/O bitmap here
>> to the ordinary, non-hardware domain one.
> 
> Yes. Would it be fine to memset it and just call setup_io_bitmap on it
> again, or would you prefer to exchange it with the static one and free it?

Following how the rangesets are being treated, simply swapping
the two I/O bitmaps would seem to be the right approach here.

Jan




Re: [Xen-devel] [RFC][PATCH 09/13] xen: enable XENMEM_set_memory_map in hvm

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 08:24,  wrote:
> On 2015/5/15 14:12, Jan Beulich wrote:
> On 15.05.15 at 04:33,  wrote:
>>> On 2015/4/20 21:46, Jan Beulich wrote:
>>> On 10.04.15 at 11:22,  wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4729,7 +4729,6 @@ static long hvm_memory_op(int cmd,
>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>
>switch ( cmd & MEMOP_CMD_MASK )
>{
> -case XENMEM_memory_map:

 Title and description talk about XENMEM_set_memory_map only. As
 I think the implementation is right, the former will need updating. Do
 you actually need a HVM domain to be able to XENMEM_set_memory_map
>>>
>>> Yes. Actually we need to enable two hypercalls here,
>>>
>>> #1. XENMEM_set_memory_map --> Set
>>> #2. XENMEM_memory_map --> Get
>>
>> You say "yes" without saying why, and ...
> 
> Instead of constructing e820 in the case of hvmloader, now we'd like to 
> set up a basic e820 while building hvm, so we need to enable 
> XENMEM_set_memory_map/XENMEM_memory_map to own this approach in hvm case.

You continue to ignore ...

 on itself? If not, it should probably replace XENMEM_memory_map here.

... the "on itself" here. Of course the tool stack needs to be able to
invoke this. But does the guest itself need to?

Jan




Re: [Xen-devel] [PATCH v5 3/5] libxl: add support for vscsi

2015-05-14 Thread Olaf Hering
On Wed, May 13, Ian Campbell wrote:

> On Wed, 2015-05-06 at 13:28 +0000, Olaf Hering wrote:
> > +++ b/docs/man/xl.pod.1
> > @@ -1328,6 +1328,24 @@ List virtual trusted platform modules for a domain.
> >  
> >  =back
> >  
> > +=head2 PVSCSI DEVICES
> > +
> > +=over 4
> > +
> > +=item B I I I,I<[feature-host]>
> 
> Unlike in the xl.cfg disk spec the pdev and vdev are separated with
> space rather than ",", is that deliberate? (I don't mind, just want to
> check it's intended).

Yes, pdev and vdev are space separated. I have to double check how xm
handled the additional feature-host option, it was most likely comma
separated.

> > +=item B I I<[domain-id] ...>
> > +
> > +List vscsi devices for the domain specified by I.
> 
> Does/could omitting the domid list them all?

No, at least one domid is required. Not sure how desirable (and racy) it
is to walk every domain and look for vscsi devices.

Olaf



Re: [Xen-devel] [RFC][PATCH 13/13] hvmloader/e820: construct guest e820 table

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 08:11,  wrote:
> On 2015/4/20 22:29, Jan Beulich wrote:
> On 10.04.15 at 11:22,  wrote:
>>> @@ -119,10 +120,6 @@ int build_e820_table(struct e820entry *e820,
>>>
>>>   /* Low RAM goes here. Reserve space for special pages. */
>>>   BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
>>> -e820[nr].addr = 0x100000;
>>> -e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
>>> -e820[nr].type = E820_RAM;
>>> -nr++;
>>
>> I think the above comment needs adjustment with all this code
>> removed. I also wonder how meaningful the BUG_ON() is with
>> ->low_mem_pgend no longer used for E820 table construction.
>> Perhaps this needs another BUG_ON() validating that the field
>> matches some value from memory_map.map[]?
> 
> But I think hvm_info->low_mem_pgend is still correct, right?

I think so, but as said it's becoming less used and hence less
relevant here.

> And 
> additionally, there's no any obvious flag to indicate which 
> memory_map.map[x] is that last low memory map.

I didn't imply it would be immediately obvious _how_ to do this.
I'm merely wanting to avoid leaving meaningless BUG_ON()s in
the code, while meaningful ones are amiss.

> Even we may separate the 
> low memory to construct memory_map.map[]...

???

>>> @@ -159,16 +156,37 @@ int build_e820_table(struct e820entry *e820,
>>>   nr++;
>>>   }
>>>
>>> -
>>> -if ( hvm_info->high_mem_pgend )
>>> +/* Construct the remaining according memory_map[]. */
>>> +for ( i = 0; i < memory_map.nr_map; i++ )
>>>   {
>>> -e820[nr].addr = ((uint64_t)1 << 32);
>>> -e820[nr].size =
>>> -((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
>>> -e820[nr].type = E820_RAM;
>>> +e820[nr].addr = memory_map.map[i].addr;
>>> +e820[nr].size = memory_map.map[i].size;
>>> +e820[nr].type = memory_map.map[i].type;
>>
>> Afaict you could use structure assignment here to make this
>> more readable.
> 
> Sorry, are you saying this?
> 
> memcpy(&e820[nr], &memory_map.map[i], sizeof(struct e820entry));

No, structure assignment (which, other than memcpy(), is type safe):

e820[nr] = memory_map.map[i];
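
To make the type-safety point concrete, here is a small sketch assuming a three-field struct e820entry (the real definition may differ); the two helper names are hypothetical. Both functions copy one entry, but only the second is checked by the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct e820entry {      /* assumed layout, for illustration only */
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* memcpy() takes void pointers, so it would accept a source of any
 * type and any length without a diagnostic. */
static void copy_with_memcpy(struct e820entry *dst,
                             const struct e820entry *src)
{
    memcpy(dst, src, sizeof(*dst));
}

/* Structure assignment only compiles when both sides have the same
 * struct type, and the size is implied by that type. */
static void copy_with_assignment(struct e820entry *dst,
                                 const struct e820entry *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}
```

The result is the same for matching types; the difference is that a mismatched source type is a compile error with assignment and silent with memcpy().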

Jan




Re: [Xen-devel] [RFC][PATCH 09/13] xen: enable XENMEM_set_memory_map in hvm

2015-05-14 Thread Chen, Tiejun

On 2015/5/15 14:12, Jan Beulich wrote:

On 15.05.15 at 04:33,  wrote:




On 2015/4/20 21:46, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4729,7 +4729,6 @@ static long hvm_memory_op(int cmd,

XEN_GUEST_HANDLE_PARAM(void) arg)


   switch ( cmd & MEMOP_CMD_MASK )
   {
-case XENMEM_memory_map:


Title and description talk about XENMEM_set_memory_map only. As
I think the implementation is right, the former will need updating. Do
you actually need a HVM domain to be able to XENMEM_set_memory_map


Yes. Actually we need to enable two hypercalls here,

#1. XENMEM_set_memory_map --> Set
#2. XENMEM_memory_map --> Get


You say "yes" without saying why, and ...


Instead of constructing e820 in the case of hvmloader, now we'd like to 
set up a basic e820 while building hvm, so we need to enable 
XENMEM_set_memory_map/XENMEM_memory_map to own this approach in hvm case.





on itself? If not, it should probably replace XENMEM_memory_map here.



Just rephrase,

xen: enable XENMEM set/get memory_map in hvm

This patch enables XENMEM_set_memory_map in hvm and then we can use
it to setup the e820 mappings, and finally hvmloader can get
these mappings with XENMEM_memory_map.


... according to this wording of yours it's not needed.



Sorry, is one of us confused about something here?

Thanks
Tiejun



[Xen-devel] [PATCH] docs: fix typo in xl.cfg:vfb=

2015-05-14 Thread Olaf Hering
Use singular for option, it refers to vfb= itself.

Signed-off-by: Olaf Hering 
Cc: Ian Campbell 
Cc: Ian Jackson 
---
 docs/man/xl.cfg.pod.5 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 566e343..5a0ca50 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -570,7 +570,7 @@ emulation in backend driver is bypassed when "feature-host" is specified.
 Specifies the paravirtual framebuffer devices which should be supplied
 to the domain.
 
-This options does not control the emulated graphics card presented to
+This option does not control the emulated graphics card presented to
 an HVM guest. See L below for how to
 configure the emulated device. If L options
 are used in a PV guest configuration, xl will pick up B, B,



Re: [Xen-devel] [RFC][PATCH 12/13] hvmloader/pci: skip reserved ranges

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 05:18,  wrote:
> On 2015/4/20 22:21, Jan Beulich wrote:
> On 10.04.15 at 11:22,  wrote:
>>> --- a/tools/firmware/hvmloader/pci.c
>>> +++ b/tools/firmware/hvmloader/pci.c
>>> @@ -59,8 +59,8 @@ void pci_setup(void)
>>>   uint32_t bar_reg;
>>>   uint64_t bar_sz;
>>>   } *bars = (struct bars *)scratch_start;
>>> -unsigned int i, nr_bars = 0;
>>> -uint64_t mmio_hole_size = 0;
>>> +unsigned int i, j, nr_bars = 0;
>>> +uint64_t mmio_hole_size = 0, reserved_end;
>>>
>>>   const char *s;
>>>   /*
>>> @@ -393,8 +393,23 @@ void pci_setup(void)
>>>   }
>>>
>>>   base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
>>> + reallocate_mmio:
>>>   bar_data |= (uint32_t)base;
>>>   bar_data_upper = (uint32_t)(base >> 32);
>>> +for ( j = 0; j < memory_map.nr_map ; j++ )
>>> +{
>>> +if ( memory_map.map[j].type != E820_RAM )
>>> +{
>>> +reserved_end = memory_map.map[j].addr + memory_map.map[j].size;
>>> +if ( check_hole_conflict(base, bar_sz,
>>> + memory_map.map[j].addr,
>>> + memory_map.map[j].size) )
>>> +{
>>> +base = (reserved_end  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
>>> +goto reallocate_mmio;
>>> +}
>>> +}
>>> +}
>>>   base += bar_sz;
>>>
>>>   if ( (base < resource->base) || (base > resource->max) )
>>
> 
> Actually some original codes are missing here,
> 
>  if ( (base < resource->base) || (base > resource->max) )
>  {
>  printf("pci dev %02x:%x bar %02x size "PRIllx": no space for "
> "resource!\n", devfn>>3, devfn&7, bar_reg,
> PRIllx_arg(bar_sz));
>  continue;
>  }
> 
> I think this can guarantee the MMIO regions just fit in the available RAM.
> 
> Or am I wrong?

The code you cite guarantees almost nothing, it simply skips assigning
resources. Your changes potentially growing the space needed to fit
all MMIO BARs therefore also needs to adjust the up front calculation,
such that if necessary more RAM can be relocated to make the hole
large enough.
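
The placement step under discussion (align the base to the BAR size, then push it past any reserved range it collides with) can be sketched without the goto as below. This is a simplified illustration: struct range, align_up, overlaps and place_bar are hypothetical names, and the sketch covers placement only. It does not address the up-front hole sizing raised above, and the caller still has to compare the result against the end of the MMIO window.

```c
#include <assert.h>
#include <stdint.h>

struct range {          /* hypothetical reserved-range record */
    uint64_t addr;
    uint64_t size;
};

/* Round base up to a multiple of align, which must be a power of two. */
static uint64_t align_up(uint64_t base, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (base + align - 1) & ~(align - 1);
}

/* Nonzero if [start, start + size) intersects the range r. */
static int overlaps(uint64_t start, uint64_t size, const struct range *r)
{
    return start < r->addr + r->size && r->addr < start + size;
}

/* Lowest bar_sz-aligned address at or above base that avoids every
 * reserved range. After each move all ranges are checked again. */
static uint64_t place_bar(uint64_t base, uint64_t bar_sz,
                          const struct range *rsv, unsigned int nr)
{
    unsigned int i = 0;

    base = align_up(base, bar_sz);
    while (i < nr) {
        if (overlaps(base, bar_sz, &rsv[i])) {
            base = align_up(rsv[i].addr + rsv[i].size, bar_sz);
            i = 0;      /* re-check all ranges against the new base */
        } else {
            i++;
        }
    }
    return base;
}
```

Because every skip can move a BAR upwards by more than its own size, the total space consumed can exceed the sum of the BAR sizes, which is why the hole size computed beforehand may no longer be sufficient.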

Jan




Re: [Xen-devel] [RFC][PATCH 10/13] tools: extend XENMEM_set_memory_map

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 04:57,  wrote:
> On 2015/4/20 21:51, Jan Beulich wrote:
> On 10.04.15 at 11:22,  wrote:
>>> --- a/tools/libxl/libxl_dom.c
>>> +++ b/tools/libxl/libxl_dom.c
>>> @@ -787,6 +787,70 @@ out:
>>>   return rc;
>>>   }
>>>
>>> +static int libxl__domain_construct_memmap(libxl_ctx *ctx,
>>> +  libxl_domain_config *d_config,
>>> +  uint32_t domid,
>>> +  struct xc_hvm_build_args *args,
>>> +  int num_pcidevs,
>>> +  libxl_device_pci *pcidevs)
>>> +{
>>> +unsigned int nr = 0, i;
>>> +/* We always own at least one lowmem entry. */
>>> +unsigned int e820_entries = 1;
>>> +uint64_t highmem_end = 0, highmem_size = args->mem_size - args->lowmem_size;
>>> +struct e820entry *e820 = NULL;
>>> +
>>> +/* Add all rdm entries. */
>>> +e820_entries += d_config->num_rdms;
>>> +
>>> +/* If we should have a highmem range. */
>>> +if (highmem_size)
>>> +{
>>> +highmem_end = (1ull<<32) + highmem_size;
>>> +e820_entries++;
>>> +}
>>> +
>>> +e820 = malloc(sizeof(struct e820entry) * e820_entries);
>>> +if (!e820) {
>>> +return -1;
>>> +}
>>> +
>>> +/* Low memory */
>>> +e820[nr].addr = 0x100000;
>>> +e820[nr].size = args->lowmem_size - 0x100000;
>>> +e820[nr].type = E820_RAM;
>>
>> If you really mean it to be this lax (not covering the low 1Mb), then
>> you need to explain why in a comment (and the consuming side
>> should also have a similar explanation then).
>>
> 
> Okay, here may need this,
> 
> /*
>  * Low RAM starts at least from 1M to make sure all standard regions
>  * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
>  * have enough space.
>  */
> #define GUEST_LOW_MEM_START_DEFAULT 0x100000

But this only states a generic fact, but doesn't explain why you can
lump together all the different things below 1Mb into a single E820
entry.

>>> +nr++;
>>> +
>>> +/* RDM mapping */
>>> +for (i = 0; i < d_config->num_rdms; i++) {
>>> +/*
>>> + * We should drop this kind of rdm entry.
>>> + */
>>> +if (d_config->rdms[i].flag == LIBXL_RDM_RESERVE_FLAG_INVALID)
>>> +continue;
>>> +
>>> +e820[nr].addr = d_config->rdms[i].start;
>>> +e820[nr].size = d_config->rdms[i].size;
>>> +e820[nr].type = E820_RESERVED;
>>> +nr++;
>>> +}
>>
>> Is this guaranteed not to produce overlapping entries?
>>
> 
> Right, I would add this at the beginning,
> 
>  if (e820_entries >= E820MAX) {
>  LOG(ERROR, "Ooops! Too many entries in the memory map!\n");
>  return -1;
>  }

That would be a protection against too many entries, but not against
overlapping ones.
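
A check against overlapping entries, as opposed to too many entries, could look roughly like this. It is a sketch under assumptions: struct e820entry is given an assumed three-field layout and e820_has_overlap is a hypothetical helper, not part of libxl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct e820entry {      /* assumed layout, for illustration only */
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Returns 1 if any two entries intersect, 0 otherwise. The pairwise
 * comparison is quadratic, which is acceptable for small tables. */
static int e820_has_overlap(const struct e820entry *e, unsigned int nr)
{
    unsigned int i, j;

    assert(e != NULL || nr == 0);
    for (i = 0; i < nr; i++)
        for (j = i + 1; j < nr; j++)
            if (e[i].addr < e[j].addr + e[j].size &&
                e[j].addr < e[i].addr + e[i].size)
                return 1;
    return 0;
}
```

Running this over the table after the RDM entries have been appended would catch a reserved region that falls inside the low RAM entry.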

Jan




Re: [Xen-devel] [RFC][PATCH 13/13] hvmloader/e820: construct guest e820 table

2015-05-14 Thread Chen, Tiejun

On 2015/4/20 22:29, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -73,7 +73,8 @@ int build_e820_table(struct e820entry *e820,
   unsigned int lowmem_reserved_base,
   unsigned int bios_image_base)
  {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+struct e820entry tmp;


The declaration of "tmp" belongs in the most narrow scope you need
it in.


Right.




@@ -119,10 +120,6 @@ int build_e820_table(struct e820entry *e820,

  /* Low RAM goes here. Reserve space for special pages. */
  BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
-e820[nr].addr = 0x100000;
-e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
-nr++;


I think the above comment needs adjustment with all this code
removed. I also wonder how meaningful the BUG_ON() is with
->low_mem_pgend no longer used for E820 table construction.
Perhaps this needs another BUG_ON() validating that the field
matches some value from memory_map.map[]?


But I think hvm_info->low_mem_pgend is still correct, right? And
additionally, there is no obvious flag to indicate which
memory_map.map[x] is the last low memory map. We may even split the
low memory across several memory_map.map[] entries...





@@ -159,16 +156,37 @@ int build_e820_table(struct e820entry *e820,
  nr++;
  }

-
-if ( hvm_info->high_mem_pgend )
+/* Construct the remaining according memory_map[]. */
+for ( i = 0; i < memory_map.nr_map; i++ )
  {
-e820[nr].addr = ((uint64_t)1 << 32);
-e820[nr].size =
-((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
+e820[nr].addr = memory_map.map[i].addr;
+e820[nr].size = memory_map.map[i].size;
+e820[nr].type = memory_map.map[i].type;


Afaict you could use structure assignment here to make this
more readable.


Sorry, are you saying this?

memcpy(&e820[nr], &memory_map.map[i], sizeof(struct e820entry));




  nr++;
  }

+/* May need to reorder all e820 entries. */
+for ( j = 0; j < nr-1; j++ )
+{
+for ( i = j+1; i < nr; i++ )
+{
+if ( e820[j].addr > e820[i].addr )
+{
+tmp.addr = e820[j].addr;
+tmp.size = e820[j].size;
+tmp.type = e820[j].type;
+
+e820[j].addr = e820[i].addr;
+e820[j].size = e820[i].size;
+e820[j].type = e820[i].type;
+
+e820[i].addr = tmp.addr;
+e820[i].size = tmp.size;
+e820[i].type = tmp.type;


Please again use structure assignments to make this more readable.



And here,

for ( j = 0; j < nr-1; j++ )
{
for ( i = j+1; i < nr; i++ )
{
if ( e820[j].addr > e820[i].addr )
{
struct e820entry tmp;

memcpy(&tmp, &e820[j], sizeof(struct e820entry));

memcpy(&e820[j], &e820[i], sizeof(struct e820entry));

memcpy(&e820[i], &tmp, sizeof(struct e820entry));
}
}
}

If I'm wrong please correct me.
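
For comparison, the same reordering written with structure assignment instead of memcpy() might look like the sketch below. The struct layout is assumed and e820_sort is a hypothetical wrapper name. The loop bound is written as j + 1 < nr so that an empty table does not wrap the unsigned counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct e820entry {      /* assumed layout, for illustration only */
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* Sort entries by ascending start address. Whole entries are swapped
 * with structure assignment, and tmp lives in the narrowest scope. */
static void e820_sort(struct e820entry *e820, unsigned int nr)
{
    unsigned int i, j;

    assert(e820 != NULL || nr == 0);
    for (j = 0; j + 1 < nr; j++) {
        for (i = j + 1; i < nr; i++) {
            if (e820[j].addr > e820[i].addr) {
                struct e820entry tmp = e820[j];

                e820[j] = e820[i];
                e820[i] = tmp;
            }
        }
    }
}
```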

Thanks
Tiejun



Re: [Xen-devel] [RFC][PATCH 09/13] xen: enable XENMEM_set_memory_map in hvm

2015-05-14 Thread Jan Beulich
>>> On 15.05.15 at 04:33,  wrote:

> 
> On 2015/4/20 21:46, Jan Beulich wrote:
> On 10.04.15 at 11:22,  wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -4729,7 +4729,6 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>
>>>   switch ( cmd & MEMOP_CMD_MASK )
>>>   {
>>> -case XENMEM_memory_map:
>>
>> Title and description talk about XENMEM_set_memory_map only. As
>> I think the implementation is right, the former will need updating. Do
>> you actually need a HVM domain to be able to XENMEM_set_memory_map
> 
> Yes. Actually we need to enable two hypercalls here,
> 
> #1. XENMEM_set_memory_map --> Set
> #2. XENMEM_memory_map --> Get

You say "yes" without saying why, and ...

>> on itself? If not, it should probably replace XENMEM_memory_map here.
>>
> 
> Just rephrase,
> 
> xen: enable XENMEM set/get memory_map in hvm
> 
> This patch enables XENMEM_set_memory_map in hvm and then we can use
> it to setup the e820 mappings, and finally hvmloader can get
> these mappings with XENMEM_memory_map.

... according to this wording of yours it's not needed.

Jan




Re: [Xen-devel] [PATCH v5 3/5] libxl: add support for vscsi

2015-05-14 Thread Olaf Hering
On Fri, May 15, Jürgen Groß wrote:

> Multi-LUN devices do exist and they are required to be presented as
> those to the guest.

Ok, this means we need the bus concept. I will see if the API for
add/remove can be changed that only devices get passed to the libxl.

Olaf



Re: [Xen-devel] [Pkg-xen-devel] Bug#785187: xen-hypervisor-4.5-amd64: Option ucode=scan is not working

2015-05-14 Thread Stephan Seitz

On Wed, May 13, 2015 at 11:57:55AM -0400, Konrad Rzeszutek Wilk wrote:

> according to the documentation the option ucode=scan should tell XEN to
> look for a microcode update in an uncompressed initrd.
>
> While I don’t use the Debian kernel the tools to generate the initrd are
> part of Debian. The command „cpio -i < /boot/initrd.img-4.0.2-Dom0”
> creates the directory structure „kernel/x86/microcode/GenuineIntel.bin”,
> so I think the initrd is allright.
Is the initramfs compressed? The scanning code can't deal if the 


[stse@osgiliath]: file /boot/initrd.img-4.0.2-Dom0 
/boot/initrd.img-4.0.2-Dom0: ASCII cpio archive (SVR4 with no CRC)


I don’t think the initrd is compressed.

http://lists.xen.org/archives/html/xen-users/2014-05/msg00053.html says 
that I have to use „cpio -H newc” not „cpio -o c”, but I don’t know how 
the Debian tools create the initrd.


Shade and sweet water!

Stephan

--
| Stephan Seitz  E-Mail: s...@fsing.rootsland.net |
| Public Keys: http://fsing.rootsland.net/~stse/keys.html |




Re: [Xen-devel] [PATCH v5 3/5] libxl: add support for vscsi

2015-05-14 Thread Jürgen Groß

On 05/13/2015 07:31 PM, Olaf Hering wrote:
> On Wed, May 13, Ian Campbell wrote:
>
>> Is it important/useful that the user be able to configure/control the
>> number (and addresses) of the buses themselves and which devices are on
>> which, or can we get away with the pvpci model where the libxl user just
>> gives the individual devices and the library internally takes care of
>> what buses need to be created?
>
> I do not know if its important for users of vscsi to build a bus with
> several devices. Giving each device its own bus would work as well,
> which would turn the whole thing into ordinary devices from libxl POV.
>
> No idea where the initial idea came from. Even if the feature-host thing
> is used, the raw passthrough of SCSI commands would continue to work.

Are you talking about single LUNs or only targets?

Multi-LUN devices do exist and they are required to be presented as
those to the guest.


Juergen



Re: [Xen-devel] [RFC][PATCH 12/13] hvmloader/pci: skip reserved ranges

2015-05-14 Thread Chen, Tiejun

On 2015/4/20 22:21, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -59,8 +59,8 @@ void pci_setup(void)
  uint32_t bar_reg;
  uint64_t bar_sz;
  } *bars = (struct bars *)scratch_start;
-unsigned int i, nr_bars = 0;
-uint64_t mmio_hole_size = 0;
+unsigned int i, j, nr_bars = 0;
+uint64_t mmio_hole_size = 0, reserved_end;

  const char *s;
  /*
@@ -393,8 +393,23 @@ void pci_setup(void)
  }

  base = (resource->base  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
+ reallocate_mmio:
  bar_data |= (uint32_t)base;
  bar_data_upper = (uint32_t)(base >> 32);
+for ( j = 0; j < memory_map.nr_map ; j++ )
+{
+if ( memory_map.map[j].type != E820_RAM )
+{
+reserved_end = memory_map.map[j].addr + memory_map.map[j].size;
+if ( check_hole_conflict(base, bar_sz,
+ memory_map.map[j].addr,
+ memory_map.map[j].size) )
+{
+base = (reserved_end  + bar_sz - 1) & ~(uint64_t)(bar_sz - 1);
+goto reallocate_mmio;
+}
+}
+}
  base += bar_sz;

  if ( (base < resource->base) || (base > resource->max) )




Actually, some of the original code is missing from the quote here:

if ( (base < resource->base) || (base > resource->max) )
{
printf("pci dev %02x:%x bar %02x size "PRIllx": no space for "
   "resource!\n", devfn>>3, devfn&7, bar_reg,
   PRIllx_arg(bar_sz));
continue;
}

I think this can guarantee the MMIO regions just fit in the available RAM.

Or am I wrong?

Thanks
Tiejun


But you do nothing to make sure the MMIO regions all fit in the
available window (see the code ahead of this relocating RAM if
necessary).

Jan
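
To illustrate the point under discussion, here is a minimal, self-contained sketch of placing a BAR so that it avoids reserved ranges and then checking that the result still fits in the MMIO window. It is not hvmloader code: `struct range`, `align_up()`, `ranges_overlap()` and `place_bar()` are hypothetical names, and the window is modelled as a simple half-open interval.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved entry of the memory map. */
struct range {
    uint64_t addr;
    uint64_t size;
};

/* Round addr up to a multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Do the half-open intervals [a, a+as) and [b, b+bs) overlap? */
static int ranges_overlap(uint64_t a, uint64_t as, uint64_t b, uint64_t bs)
{
    return a < b + bs && b < a + as;
}

/*
 * Find the lowest bar_sz-aligned base at or above win_start that avoids
 * every reserved range. Returns 0 and stores the base in *out on success,
 * or -1 if the BAR no longer fits inside [win_start, win_end).
 */
static int place_bar(uint64_t win_start, uint64_t win_end, uint64_t bar_sz,
                     const struct range *rsvd, size_t nr_rsvd, uint64_t *out)
{
    uint64_t base = align_up(win_start, bar_sz);
    size_t i = 0;

    while ( i < nr_rsvd )
    {
        if ( ranges_overlap(base, bar_sz, rsvd[i].addr, rsvd[i].size) )
        {
            /* Jump past the conflicting range and rescan from the start. */
            base = align_up(rsvd[i].addr + rsvd[i].size, bar_sz);
            i = 0;
            continue;
        }
        i++;
    }

    /* Skipping reserved ranges may have pushed the BAR out of the window. */
    if ( base < win_start || base + bar_sz > win_end )
        return -1;

    *out = base;
    return 0;
}
```

The final check is the part that matters for the question raised here: skipping a reserved range can move a BAR beyond the end of the window, so the caller has to handle the failure, for example by enlarging the window before allocation starts.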






Re: [Xen-devel] [RFC][PATCH 11/13] hvmloader: get guest memory map into memory_map[]

2015-05-14 Thread Chen, Tiejun

On 2015/4/20 21:57, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -27,6 +27,16 @@
  #include 
  #include 

+int check_hole_conflict(uint64_t start, uint64_t size,
+uint64_t reserved_start, uint64_t reserved_size)
+{
+if ( start + size <= reserved_start ||
+start >= reserved_start + reserved_size )
+return 0;
+else
+return 1;
+}


See the comments on the similar tool stack function. Also please get
indentation right.



Okay.

Thanks
Tiejun



Re: [Xen-devel] [PATCH Remus v5 1/2] libxc/save: implement Remus checkpointed save

2015-05-14 Thread Yang Hongyang



On 05/14/2015 08:47 PM, Ian Campbell wrote:

On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:

With Remus, the save flow should be:
live migration->{ periodically save(checkpointed save) }

Signed-off-by: Yang Hongyang 
Reviewed-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxc/xc_sr_save.c | 80 
  1 file changed, 61 insertions(+), 19 deletions(-)

diff --git a/tools/libxc/xc_sr_save.c b/tools/libxc/xc_sr_save.c
index 1d0a46d..1c5d199 100644
--- a/tools/libxc/xc_sr_save.c
+++ b/tools/libxc/xc_sr_save.c
@@ -57,6 +57,16 @@ static int write_end_record(struct xc_sr_context *ctx)
  }

  /*
+ * Writes an CHECKPOINT record into the stream.


"a CHECKPOINT"


+ */
+static int write_checkpoint_record(struct xc_sr_context *ctx)
+{
+struct xc_sr_record checkpoint = { REC_TYPE_CHECKPOINT, 0, NULL };
+
+return write_record(ctx, &checkpoint);
+}
+
+/*
   * Writes a batch of memory as a PAGE_DATA record into the stream.  The batch
   * is constructed in ctx->save.batch_pfns.
   *
@@ -467,6 +477,14 @@ static int send_domain_memory_live(struct xc_sr_context *ctx)
  DECLARE_HYPERCALL_BUFFER_SHADOW(unsigned long, dirty_bitmap,
  &ctx->save.dirty_bitmap_hbuf);

+/*
+ * With Remus, we will enter checkpointed save after live migration.
+ * In checkpointed save loop, we skip the live part and pause straight
+ * away to send dirty pages between checkpoints.
+ */
+if ( !ctx->save.live )
+goto last_iter;


Rather than use goto would it work to refactor everything from here to
the label into some sort of helper and just call that in the "actually
live" case?

Or perhaps everything from the label to the end should be a helper
function which the caller can also use in the checkpoint case instead of
calling send_domain_memory_live (and which s_d_m_l also calls of
course).


I'm going to refactor send_domain_memory_live() as follows:

Split send_domain_memory_live() into three helper functions:
  - send_memory_live(): handles the actual live case
  - suspend_and_send_dirty(): suspends the guest and sends dirty pages
  - send_memory_verify()

Then:
  - send_domain_memory_live() becomes a combination of those three helpers
  - send_domain_memory_checkpointed() calls suspend_and_send_dirty() and
    send_memory_verify()
  - send_domain_memory_nonlive() stays as it is

Does it make sense?
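
The proposed split can be sketched roughly as follows. This is only an illustration of the call structure, not the libxc implementation: `struct ctx` is a hypothetical stand-in for `struct xc_sr_context`, and each helper merely records that it ran.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, stripped-down context; the real one is far larger. */
struct ctx {
    char trace[64];   /* records which helpers ran, for illustration only */
};

/* Append a one-letter marker to the trace and report success. */
static int step(struct ctx *c, const char *name)
{
    strncat(c->trace, name, sizeof(c->trace) - strlen(c->trace) - 1);
    return 0;
}

/* The three helpers named in the proposal, reduced to stubs. */
static int send_memory_live(struct ctx *c)       { return step(c, "L"); }
static int suspend_and_send_dirty(struct ctx *c) { return step(c, "S"); }
static int send_memory_verify(struct ctx *c)     { return step(c, "V"); }

/* Live migration: iterate live, then suspend, then (optionally) verify. */
static int send_domain_memory_live(struct ctx *c)
{
    int rc = send_memory_live(c);

    if ( rc )
        return rc;
    rc = suspend_and_send_dirty(c);
    if ( rc )
        return rc;
    return send_memory_verify(c);
}

/* Checkpoint: skip the live part and go straight to suspend-and-send. */
static int send_domain_memory_checkpointed(struct ctx *c)
{
    int rc = suspend_and_send_dirty(c);

    if ( rc )
        return rc;
    return send_memory_verify(c);
}
```

With this shape, the `goto last_iter` shortcut is no longer needed, because the checkpointed path never enters the live helper in the first place.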




+if ( ctx->save.checkpointed )
+{
+if ( ctx->save.live )
+{
+/* End of live migration, we are sending checkpointed stream */
+ctx->save.live = false;


I think I'm misunderstanding either the purpose of this code or the
comment (or both).

Is it the case that a checkpoint starts with an iteration of live (to
transfer everything over) and then drops into sending periodical
non-live updates at each checkpoint?

If so then I think a more useful comment would be:

 /*
  * We have now completed the initial live portion of the checkpoint
  * process. Therefore switch into periodically sending synchronous
  * batches of pages.
  */



This is much better, Thank you!


Personally I don't have a problem with just a direct assignment without
the if, since assigning false to an already false value is a nop.


Will drop the if.




+}
+
+rc = write_checkpoint_record(ctx);
+if ( rc )
+goto err;
+
+ctx->save.callbacks->postcopy(ctx->save.callbacks->data);
+
+rc = ctx->save.callbacks->checkpoint(ctx->save.callbacks->data);
+if ( rc > 0 )
+xc_report_progress_single(xch, "Checkpointed save");
+else
+ctx->save.checkpointed = false;
+}
+} while ( ctx->save.checkpointed );

  xc_report_progress_single(xch, "End of stream");







--
Thanks,
Yang.



Re: [Xen-devel] [RFC][PATCH 10/13] tools: extend XENMEM_set_memory_map

2015-05-14 Thread Chen, Tiejun

On 2015/4/20 21:51, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -787,6 +787,70 @@ out:
  return rc;
  }

+static int libxl__domain_construct_memmap(libxl_ctx *ctx,
+  libxl_domain_config *d_config,
+  uint32_t domid,
+  struct xc_hvm_build_args *args,
+  int num_pcidevs,
+  libxl_device_pci *pcidevs)
+{
+unsigned int nr = 0, i;
+/* We always own at least one lowmem entry. */
+unsigned int e820_entries = 1;
+uint64_t highmem_end = 0, highmem_size = args->mem_size - args->lowmem_size;
+struct e820entry *e820 = NULL;
+
+/* Add all rdm entries. */
+e820_entries += d_config->num_rdms;
+
+/* If we should have a highmem range. */
+if (highmem_size)
+{
+highmem_end = (1ull<<32) + highmem_size;
+e820_entries++;
+}
+
+e820 = malloc(sizeof(struct e820entry) * e820_entries);
+if (!e820) {
+return -1;
+}
+
+/* Low memory */
+e820[nr].addr = 0x10;
+e820[nr].size = args->lowmem_size - 0x10;
+e820[nr].type = E820_RAM;


If you really mean it to be this lax (not covering the low 1Mb), then
you need to explain why in a comment (and the consuming side
should also have a similar explanation then).



Okay, we may need this here:

/*
 * Low RAM starts at least from 1M to make sure all standard regions
 * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
 * have enough space.
 */
#define GUEST_LOW_MEM_START_DEFAULT 0x10

On the consuming side, I should clarify that we always preserve 1M.


+nr++;
+
+/* RDM mapping */
+for (i = 0; i < d_config->num_rdms; i++) {
+/*
+ * We should drop this kind of rdm entry.
+ */
+if (d_config->rdms[i].flag == LIBXL_RDM_RESERVE_FLAG_INVALID)
+continue;
+
+e820[nr].addr = d_config->rdms[i].start;
+e820[nr].size = d_config->rdms[i].size;
+e820[nr].type = E820_RESERVED;
+nr++;
+}


Is this guaranteed not to produce overlapping entries?



Right, I would add this at the beginning,

if (e820_entries >= E820MAX) {
LOG(ERROR, "Ooops! Too many entries in the memory map!\n");
return -1;
}

Thanks
Tiejun
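
The question about overlapping entries is separate from the entry-count limit. As a hedged sketch, one way to answer it is to sort the finished map by start address and compare neighbours. `struct e820entry_sketch` and `e820_check_overlap()` are hypothetical names, not libxl code, and the addresses in the usage note are made-up example values.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical minimal entry with the addr/size/type fields the patch uses. */
struct e820entry_sketch {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/* qsort comparator: order entries by ascending start address. */
static int cmp_addr(const void *a, const void *b)
{
    const struct e820entry_sketch *x = a, *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/*
 * Sort the map by start address and report whether any two entries
 * overlap. Returns 0 if the map is clean, -1 on the first overlap.
 * Once sorted, checking adjacent pairs is sufficient.
 */
static int e820_check_overlap(struct e820entry_sketch *map, unsigned int nr)
{
    unsigned int i;

    qsort(map, nr, sizeof(*map), cmp_addr);
    for ( i = 1; i < nr; i++ )
        if ( map[i - 1].addr + map[i - 1].size > map[i].addr )
            return -1;
    return 0;
}
```

For example, a RAM entry covering 1M up to 2G together with a reserved entry at 0x70000000 would be rejected, whereas the same reserved entry placed at 0xe0000000 would pass.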



Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity

2015-05-14 Thread Wei Liu
On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
> The PV network protocol is using 4KB page granularity. The goal of this
> patch is to allow a Linux using 64KB page granularity working as a
> network backend on a non-modified Xen.
> 
> It's only necessary to adapt the ring size and break skb data in small
> chunk of 4KB. The rest of the code is relying on the grant table code.
> 
> Although only simple workload is working (dhcp request, ping). If I try
> to use wget in the guest, it will stall until a tcpdump is started on
> the vif interface in DOM0. I wasn't able to find why.
> 

I think in the wget workload you're more likely to break 64K pages down into
4K pages. Some of your mfn or offset calculations might be wrong.

> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
> it's used for (I have limited knowledge on the network driver).
> 

This is the maximum number of slots a guest packet can use. AIUI the protocol
still works at 4K granularity (you break a 64K page into a bunch of 4K
pages), so you don't need to change this.
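
As a rough illustration of "breaking a 64K page into a bunch of 4K pages", the sketch below splits a byte range inside one large kernel page into pieces that never cross a 4KB boundary. It is standalone user-space C, not netback code: the `_SKETCH` macros, `struct chunk` and `split_into_xen_pages()` are hypothetical, and a 4KB Xen page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define XEN_PAGE_SHIFT_SKETCH 12
#define XEN_PAGE_SIZE_SKETCH  (1UL << XEN_PAGE_SHIFT_SKETCH)

/* One 4KB-granular piece of a larger buffer. */
struct chunk {
    unsigned long frame;   /* index of the 4KB frame inside the kernel page */
    unsigned long offset;  /* offset inside that 4KB frame */
    unsigned long len;     /* number of bytes in this piece */
};

/*
 * Split [offset, offset + size) within one kernel page into pieces that
 * never cross a 4KB boundary. Returns the number of chunks written,
 * which is at most max.
 */
static size_t split_into_xen_pages(unsigned long offset, unsigned long size,
                                   struct chunk *out, size_t max)
{
    size_t n = 0;

    while ( size > 0 && n < max )
    {
        /* Offset within the current 4KB frame, not within the 64KB page. */
        unsigned long off_grant = offset & (XEN_PAGE_SIZE_SKETCH - 1);
        unsigned long bytes = XEN_PAGE_SIZE_SKETCH - off_grant;

        if ( bytes > size )
            bytes = size;

        out[n].frame = offset >> XEN_PAGE_SHIFT_SKETCH;
        out[n].offset = off_grant;
        out[n].len = bytes;
        n++;

        offset += bytes;
        size -= bytes;
    }
    return n;
}
```

Both the frame and the in-frame offset must be derived from the same byte offset; mixing a 64KB-relative offset with a 4KB frame is the kind of mfn/offset mismatch that would corrupt larger transfers while small packets still work.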

> Signed-off-by: Julien Grall 
> Cc: Ian Campbell 
> Cc: Wei Liu 
> Cc: net...@vger.kernel.org
> 
> ---
> 
> Improvement such as support of 64KB grant is not taken into
> consideration in this patch because we have the requirement to run a
> Linux using 64KB pages on a non-modified Xen.
> ---
>  drivers/net/xen-netback/common.h  |  7 ---
>  drivers/net/xen-netback/netback.c | 27 ++-
>  2 files changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 8a495b3..0eda6e9 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -44,6 +44,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  typedef unsigned int pending_ring_idx_t;
> @@ -64,8 +65,8 @@ struct pending_tx_info {
>   struct ubuf_info callback_struct;
>  };
>  
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
>  
>  struct xenvif_rx_meta {
>   int id;
> @@ -80,7 +81,7 @@ struct xenvif_rx_meta {
>  /* Discriminate from any valid pending_idx value. */
>  #define INVALID_PENDING_IDX 0x
>  
> -#define MAX_BUFFER_OFFSET PAGE_SIZE
> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
>  
>  #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 9ae1d43..ea5ce84 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>  {
>   struct gnttab_copy *copy_gop;
>   struct xenvif_rx_meta *meta;
> - unsigned long bytes;
> + unsigned long bytes, off_grant;
>   int gso_type = XEN_NETIF_GSO_TYPE_NONE;
>  
>   /* Data must not cross a page boundary. */
> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>   if (npo->copy_off == MAX_BUFFER_OFFSET)
>   meta = get_next_rx_buffer(queue, npo);
>  
> - bytes = PAGE_SIZE - offset;
> + off_grant = offset & ~XEN_PAGE_MASK;
> + bytes = XEN_PAGE_SIZE - off_grant;
>   if (bytes > size)
>   bytes = size;
>  
> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>   } else {
>   copy_gop->source.domid = DOMID_SELF;
>   copy_gop->source.u.gmfn =
> - virt_to_mfn(page_address(page));
> + virt_to_mfn(page_address(page) + offset);
>   }
> - copy_gop->source.offset = offset;
> + copy_gop->source.offset = off_grant;
>  
>   copy_gop->dest.domid = queue->vif->domid;
>   copy_gop->dest.offset = npo->copy_off;
> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
>   first->size -= txp->size;
>   slots++;
>  
> - if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
> + if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
>   netdev_err(queue->vif->dev, "Cross page boundary, 
> txp->offset: %x, size: %u\n",
>txp->offset, txp->size);
>   xenvif_fatal_tx_err(queue->vif);
> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>   }
>  
>   /* No crossing a page as the payload mustn't fragment. */
> - if (unlikely((txreq.offset + txr

Re: [Xen-devel] [RFC][PATCH 09/13] xen: enable XENMEM_set_memory_map in hvm

2015-05-14 Thread Chen, Tiejun



On 2015/4/20 21:46, Jan Beulich wrote:

On 10.04.15 at 11:22,  wrote:

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4729,7 +4729,6 @@ static long hvm_memory_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)

  switch ( cmd & MEMOP_CMD_MASK )
  {
-case XENMEM_memory_map:


Title and description talk about XENMEM_set_memory_map only. As
I think the implementation is right, the former will need updating. Do
you actually need a HVM domain to be able to XENMEM_set_memory_map


Yes. Actually we need to enable two hypercalls here,

#1. XENMEM_set_memory_map --> Set
#2. XENMEM_memory_map --> Get


on itself? If not, it should probably replace XENMEM_memory_map here.



I'll just rephrase it as:

xen: enable XENMEM set/get memory_map in hvm

This patch enables XENMEM_set_memory_map in hvm and then we can use
it to setup the e820 mappings, and finally hvmloader can get
these mappings with XENMEM_memory_map.

Thanks
Tiejun



[Xen-devel] [ovmf test] 56370: tolerable FAIL - PUSHED

2015-05-14 Thread osstest service user
flight 56370 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/56370/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-freebsd10-amd64 13 guest-localmigrate  fail like 55353
 test-amd64-i386-freebsd10-i386 13 guest-localmigrate   fail like 55353

Tests which did not succeed, but are not blocking:
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm   9 debian-hvm-install    fail never pass
 test-amd64-i386-xl-xsm                        11 guest-start           fail never pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  9 debian-hvm-install    fail never pass
 test-amd64-amd64-libvirt-xsm                  11 guest-start           fail never pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  9 debian-hvm-install    fail never pass
 test-amd64-amd64-xl-xsm                       11 guest-start           fail never pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm   9 debian-hvm-install    fail never pass
 test-amd64-i386-libvirt-xsm                   11 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-intel                 11 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-amd                   11 guest-start           fail never pass
 test-amd64-i386-libvirt                       12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt                      12 migrate-support-check fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64           16 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1      16 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1      16 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64          16 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3            16 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64           16 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64          16 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3             16 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3            16 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3             16 guest-stop            fail never pass

version targeted for testing:
 ovmf 99d9ade85aad554a0fa08fff8586b0fd40570ac3
baseline version:
 ovmf c1b9129c3c1f09a4847c7ff3179ef2edc475cf56


People who touched revisions under test:
  Chao Zhang 
  Eric Dong 
  Feng Tian 
  Gabriel Somlo 
  Heyi Guo 
  Laszlo Ersek 
  Liming Gao 
  Olivier Martin 
  Ruiyu Ni 
  Shifei Lu 
  Star Zeng 


jobs:
 build-amd64-xsm                                  pass
 build-i386-xsm                                   pass
 build-amd64                                      pass
 build-i386                                       pass
 build-amd64-libvirt                              pass
 build-i386-libvirt                               pass
 build-amd64-pvops                                pass
 build-i386-pvops                                 pass
 test-amd64-amd64-xl                              pass
 test-amd64-i386-xl                               pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm    fail
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm     fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm    fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm     fail
 test-amd64-amd64-libvirt-xsm                     fail
 test-amd64-i386-libvirt-xsm                      fail
 test-amd64-amd64-xl-xsm                          fail
 test-amd64-i386-xl-xsm                           fail
 test-amd64-amd64-xl-pvh-amd                      fail
 test-amd64-i386-qemut-rhel6hvm-amd               pass
 test-amd64-i386-qemuu-rhel6hvm-amd               pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64        pass
 test-amd64-i386-xl-qemut-debianhvm-amd64         pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64        pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64         pass
 test-amd64-i386-freebsd10-amd64                  fail
 test-amd64-amd64-xl-qemuu-ovmf-amd64             pass
 test-amd64-i386-xl-qemuu-ovmf-amd64              pass
 test-amd64-amd64-xl-qemut-win7-amd64             fail
 test-amd64-i386-xl-qemut-win7-amd64              fail
 test-amd64-amd64-xl-qemuu-win7-amd64             fail
 test-amd64-i386-xl-qemuu-win7-amd64              fail
 test-amd64-amd64-xl-credit2   

Re: [Xen-devel] [RFC][PATCH 01/13] tools: introduce some new parameters to set rdm policy

2015-05-14 Thread Chen, Tiejun

On 2015/5/11 22:54, Wei Liu wrote:

On Mon, May 11, 2015 at 01:35:06PM +0800, Chen, Tiejun wrote:

On 2015/5/8 21:04, Wei Liu wrote:

Sorry for the late review.



Really thanks for taking your time :)


On Fri, Apr 10, 2015 at 05:21:52PM +0800, Tiejun Chen wrote:

This patch introduces user configurable parameters to specify RDM
resource and according policies,

Global RDM parameter:
 rdm = [ 'host, reserve=force/try' ]
Per-device RDM parameter:
 pci = [ 'sbdf, rdm_reserve=force/try' ]

Global RDM parameter allows user to specify reserved regions explicitly,
e.g. using 'host' to include all reserved regions reported on this platform
which is good to handle hotplug scenario. In the future this parameter
may be further extended to allow specifying random regions, e.g. even
those belonging to another platform as a preparation for live migration
with passthrough devices.

'force/try' policy decides how to handle conflict when reserving RDM
regions in pfn space. If conflict exists, 'force' means an immediate error
so VM will be killed, while 'try' allows moving forward with a warning
message thrown out.

Default per-device RDM policy is 'force', while default global RDM policy
is 'try'. When both policies are specified on a given region, 'force' is
always preferred.

Signed-off-by: Tiejun Chen 
---
  docs/man/xl.cfg.pod.5   | 44 +
  docs/misc/vtd.txt   | 34 
  tools/libxl/libxl_create.c  |  5 +++
  tools/libxl/libxl_types.idl | 18 +++
  tools/libxl/libxlu_pci.c| 78 +
  tools/libxl/libxlutil.h |  4 +++
  tools/libxl/xl_cmdimpl.c| 21 +++-
  7 files changed, 203 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 408653f..9ed3055 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -583,6 +583,36 @@ assigned slave device.

  =back

+=item B
+


Shouldn't this be "TYPE,RDM_RESERVE_STRING" according to your commit
message? If the only available config is just one string, you probably
don't need a list for this?


Yes, based on that design we don't need a list. So

=item B



Note that this is still a list (enclosed by "[]"). Maybe you mean

rdm = "RDM_RESERVE_STRING"

?


Yes, I'll do this.






+(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough usage. One example of
+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
+structure on x86 platform.
+Each B has the form C<["TYPE",KEY=VALUE,KEY=VALUE,...> where:
+


RDM_CHECK_STRING?


And here should be corrected like this,

B has the form ...




+=over 4
+
+=item B<"TYPE">
+
+Currently we just have one type. 'host' means all reserved device memory on
+this platform should be reserved in this VM's pfn space.
+


What are other possible types? If there is only one type then we can


Currently we just have one type and looks that design doesn't make this
clear.


simply ignore the type?


I just think we may introduce something else specific to live migration in
the future... But I'm really not sure right now.



Fair enough. I was just wondering if there would be any other types. If
so we do need provisioning.

In any case, the "type" argument you proposed is a positional argument
(you require it to be the first element of the spec string).
I think you can just make it a key-value pair to make parsing easier.
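
To show why a pure key-value form is easier to parse than a positional first element, here is a small sketch. It is not the libxlu parser: `struct rdm_spec` and `parse_rdm_spec()` are hypothetical, and `type` as a key name is an assumption made for the example (`reserve` and the value names come from this thread).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result of parsing an rdm spec; field names are illustrative. */
struct rdm_spec {
    char type[16];
    char reserve[16];
};

/*
 * Parse "KEY=VALUE,KEY=VALUE,..." into spec. Returns 0 on success and
 * -1 on a malformed pair, an unknown key, or an over-long input.
 * The input string is not modified.
 */
static int parse_rdm_spec(const char *str, struct rdm_spec *spec)
{
    char buf[128];
    char *p = buf;

    if ( strlen(str) >= sizeof(buf) )
        return -1;
    strcpy(buf, str);
    memset(spec, 0, sizeof(*spec));

    while ( *p )
    {
        char *next = strchr(p, ',');
        char *val;

        if ( next )
            *next++ = '\0';

        /* Every element must be KEY=VALUE; there is no positional case. */
        val = strchr(p, '=');
        if ( !val )
            return -1;
        *val++ = '\0';

        if ( !strcmp(p, "type") )
            strncpy(spec->type, val, sizeof(spec->type) - 1);
        else if ( !strcmp(p, "reserve") )
            strncpy(spec->reserve, val, sizeof(spec->reserve) - 1);
        else
            return -1;

        if ( !next )
            break;
        p = next;
    }
    return 0;
}
```

With every element in KEY=VALUE form, the loop needs no special handling for the first token, and new keys can be added without changing the order-dependent parts of the syntax.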


Do you mean this statement?

=item B

...

B has the form C<[KEY=VALUE,KEY=VALUE,...> where:

=over 4

=item B

Possible Bs are:

=over 4

=item B

Currently we just have one type. "host" means all reserved device memory on
this platform should be reserved in this VM's pfn space.

=over 4

=item B
...







+=item B
+
+Possible Bs are:
+
+=over 4
+
+=item B
+
+Conflict may be detected when reserving reserved device memory in gfn space.
+'force' means an unsolved conflict leads to immediate VM destroy, while


Do you mean "immediate VM crash"?


Yes. So I guess I should replace this.




+'try' allows VM moving forward with a warning message thrown out. 'try'
+is default.


Can you please use double quotes for "force", "try" etc.?


Sure. Just note we'd like to use "strict"/"relaxed" to replace "force"/"try"
from next revision according to Jan's suggestion.



No problem.




+
+Note this may be overrided by another sub item, rdm_reserve, in pci device.
+
  =item B

  Specifies the host PCI devices to passthrough to this guest. Each 
B
@@ -645,6 +675,20 @@ dom0 without confirmation.  Please use with care.
  D0-D3hot power management states for the PCI device. False (0) by
  default.

+=item B
+
+(HVM/x86 only) Specifies the information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough usage. One example of
+RDM is reported through ACPI Reserved Memory Region Reporting (RMRR)
+structure on x86 platform.
+
+Conflict may be detec

Re: [Xen-devel] [PATCH v7 07/14] x86: dynamically get/set CBM for a domain

2015-05-14 Thread Chao Peng
On Thu, May 14, 2015 at 11:19:17AM +0200, Dario Faggioli wrote:
> On Fri, 2015-05-08 at 16:56 +0800, Chao Peng wrote:
> > For CAT, COS is maintained in hypervisor only while CBM is exposed to
> > user space directly to allow getting/setting domain's cache capacity.
> > For each specified CBM, hypervisor will either use a existed COS which
> > has the same CBM or allocate a new one if the same CBM is not found. If
> > the allocation fails because of no enough COS available then error is
> > returned. The getting/setting are always operated on a specified socket.
> > For multiple sockets system, the interface may be called several times.
> > 
> > Signed-off-by: Chao Peng 
> >
> Reviewed-by: Dario Faggioli 
> 
> Just, one minor thing, only if you end up resending...
> 
> > diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
> > index 1feb2f6..385807b 100644
> > --- a/xen/arch/x86/psr.c
> > +++ b/xen/arch/x86/psr.c
> > @@ -49,6 +49,14 @@ static unsigned int opt_cos_max = 255;
> >  static uint64_t rmid_mask;
> >  static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
> >  
> > +static unsigned int get_socket_cpu(unsigned int socket)
> > +{
> > +if ( socket < nr_sockets )
> > +return cpumask_any(socket_to_cpumask[socket]);
> > +
> ... What about 
> 
>   if ( likely(socket < nr_sockets) )

Agreed, that is an opportunity for optimization.

Let's see what others think.

Thanks for review.
Chao
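
For readers unfamiliar with the hint being suggested, a minimal sketch follows. It assumes GCC or Clang (`__builtin_expect`) and uses a made-up socket table; `get_socket_cpu_sketch()`, `NR_SOCKETS_SKETCH` and the `~0u` sentinel are hypothetical and not taken from the patch.

```c
#include <assert.h>

/* GCC/Clang branch hint, in the style commonly used by Xen and Linux. */
#define likely(x) __builtin_expect(!!(x), 1)

#define NR_SOCKETS_SKETCH 2u

/* Hypothetical table: one representative CPU per socket. */
static const unsigned int first_cpu_of_socket[NR_SOCKETS_SKETCH] = { 0, 8 };

static unsigned int get_socket_cpu_sketch(unsigned int socket)
{
    /* Tell the compiler that a valid socket is the expected case. */
    if ( likely(socket < NR_SOCKETS_SKETCH) )
        return first_cpu_of_socket[socket];

    /* Out-of-range socket: return an invalid CPU id (assumed sentinel). */
    return ~0u;
}
```

The hint changes only code layout, not behaviour, so it is safe to add or drop independently of the rest of the patch.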



Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore

2015-05-14 Thread Yang Hongyang



On 05/14/2015 09:05 PM, Ian Campbell wrote:

On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:

With Remus, the restore flow should be:
the first full migration stream -> { periodically restore stream }

Signed-off-by: Yang Hongyang 
Signed-off-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxc/xc_sr_common.h  |  14 ++
  tools/libxc/xc_sr_restore.c | 113 
  2 files changed, 117 insertions(+), 10 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index f8121e7..3bf27f1 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -208,6 +208,20 @@ struct xc_sr_context
  /* Plain VM, or checkpoints over time. */
  bool checkpointed;

+/* Currently buffering records between a checkpoint */
+bool buffer_all_records;
+
+/*
+ * With Remus, we buffer the records sent by the primary at checkpoint,
+ * in case the primary will fail, we can recover from the last
+ * checkpoint state.
+ * This should be enough because primary only send dirty pages at
+ * checkpoint.


I'm not sure how it then follows that 1024 buffers is guaranteed to be
enough, unless there is something on the sending side arranging it to be
so?


There are only a few records at every checkpoint in my tests, mostly under 10,
probably because I don't do many operations in the guest. I thought this limit
could be adjusted later through further testing.
Since you and Andy both have doubts about this, I have to reconsider it;
perhaps there should be no limit. Even if the 1024 limit works for
most cases, there might be cases that exceed it. So I will
add another member, 'allocated_rec_num', to the context; when
'buffered_rec_num' exceeds 'allocated_rec_num', I will reallocate the buffer.
The initial buffer size will be 1024 records, which will work for most cases.

/*
 * With Remus, we buffer the records sent by the primary at checkpoint,
 * in case the primary will fail, we can recover from the last
 * checkpoint state.
 * This should be enough for most of the cases because primary only send
 * dirty pages at checkpoint.
 */
#define DEFAULT_BUF_RECORDS 1024
struct xc_sr_record *buffered_records;
unsigned allocated_rec_num;
unsigned buffered_rec_num;
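
A possible shape for the reallocation described here, as a hedged sketch only: the field names follow the proposal, but `struct rec`, `struct rec_buffer`, `buffer_record()` and the doubling policy are assumptions for illustration, not the eventual libxc code.

```c
#include <assert.h>
#include <stdlib.h>

#define DEFAULT_BUF_RECORDS 1024

/* Hypothetical minimal record; the real struct xc_sr_record lives in libxc. */
struct rec {
    unsigned type;
    unsigned length;
    void *data;
};

/* The three members from the proposal, grouped for the example. */
struct rec_buffer {
    struct rec *buffered_records;
    unsigned allocated_rec_num;
    unsigned buffered_rec_num;
};

/*
 * Append a record, growing the array when it is full. The first
 * allocation holds DEFAULT_BUF_RECORDS entries; later ones double.
 * Returns 0 on success or -1 if memory could not be allocated.
 */
static int buffer_record(struct rec_buffer *b, const struct rec *r)
{
    if ( b->buffered_rec_num == b->allocated_rec_num )
    {
        unsigned new_num = b->allocated_rec_num
                           ? b->allocated_rec_num * 2
                           : DEFAULT_BUF_RECORDS;
        struct rec *p = realloc(b->buffered_records, new_num * sizeof(*p));

        if ( !p )
            return -1;   /* the old buffer is still valid and owned by b */

        b->buffered_records = p;
        b->allocated_rec_num = new_num;
    }

    b->buffered_records[b->buffered_rec_num++] = *r;
    return 0;
}
```

Assigning the result of `realloc()` to a temporary keeps the existing records reachable if the allocation fails, which matters here because they represent the last consistent checkpoint.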




+ */
+#define MAX_BUF_RECORDS 1024
+struct xc_sr_record *buffered_records;
+unsigned buffered_rec_num;
+
  /*
   * Xenstore and Console parameters.
   * INPUT:  evtchn & domid
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9ab5760..8468ffc 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -468,11 +468,69 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
  return rc;
  }

+static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
+static int handle_checkpoint(struct xc_sr_context *ctx)
+{
+xc_interface *xch = ctx->xch;
+int rc = 0;
+unsigned i;
+
+if ( !ctx->restore.checkpointed )
+{
+ERROR("Found checkpoint in non-checkpointed stream");
+rc = -1;


Is it usual in migrv2 to set errno as well?


+goto err;
+}
+
+if ( ctx->restore.buffer_all_records )
+{
+IPRINTF("All records buffered");
+
+/*
+ * We need to set buffer_all_records to false in
+ * order to process records instead of buffer records.
+ * buffer_all_records should be set back to true after
+ * we successfully processed all records.
+ */
+ctx->restore.buffer_all_records = false;


I'm not personally a fan of changing global state in order to simulate
the action of what should be a parameter to a function.

Preferable IMHO would be to have process_record gain a parameter to
override the ctx state but become an internal helper (perhaps with a
name change) and then have API function process_record and
process_buffered_records or some such which call it in the right way.

Andy may have a differing opinion though.






--
Thanks,
Yang.



Re: [Xen-devel] [PATCH Remus v5 2/2] libxc/restore: implement Remus checkpointed restore

2015-05-14 Thread Yang Hongyang



On 05/14/2015 10:04 PM, Ian Campbell wrote:

On Thu, 2015-05-14 at 14:17 +0100, Andrew Cooper wrote:

On 14/05/15 14:05, Ian Campbell wrote:

On Thu, 2015-05-14 at 18:06 +0800, Yang Hongyang wrote:

With Remus, the restore flow should be:
the first full migration stream -> { periodically restore stream }

Signed-off-by: Yang Hongyang 
Signed-off-by: Andrew Cooper 
CC: Ian Campbell 
CC: Ian Jackson 
CC: Wei Liu 
---
  tools/libxc/xc_sr_common.h  |  14 ++
  tools/libxc/xc_sr_restore.c | 113 
  2 files changed, 117 insertions(+), 10 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index f8121e7..3bf27f1 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -208,6 +208,20 @@ struct xc_sr_context
  /* Plain VM, or checkpoints over time. */
  bool checkpointed;

+/* Currently buffering records between a checkpoint */
+bool buffer_all_records;
+
+/*
+ * With Remus, we buffer the records sent by the primary at checkpoint,
+ * in case the primary will fail, we can recover from the last
+ * checkpoint state.
+ * This should be enough because primary only send dirty pages at
+ * checkpoint.

I'm not sure how it then follows that 1024 buffers is guaranteed to be
enough, unless there is something on the sending side arranging it to be
so?


+ */
+#define MAX_BUF_RECORDS 1024
+struct xc_sr_record *buffered_records;
+unsigned buffered_rec_num;
+
  /*
   * Xenstore and Console parameters.
   * INPUT:  evtchn & domid
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index 9ab5760..8468ffc 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -468,11 +468,69 @@ static int handle_page_data(struct xc_sr_context *ctx, struct xc_sr_record *rec)
  return rc;
  }

+static int process_record(struct xc_sr_context *ctx, struct xc_sr_record *rec);
+static int handle_checkpoint(struct xc_sr_context *ctx)
+{
+xc_interface *xch = ctx->xch;
+int rc = 0;
+unsigned i;
+
+if ( !ctx->restore.checkpointed )
+{
+ERROR("Found checkpoint in non-checkpointed stream");
+rc = -1;

Is it usual in migrv2 to set errno as well?


If a relevant errno is to be had.


EINVAL or ENOSYS perhaps?


There are a lot of cases which are waiting for some real libxc error
codes before they can propagate numeric error information, although in
all cases the log messages will be accurate (and hopefully helpful).

~Andrew




+goto err;
+}
+
+if ( ctx->restore.buffer_all_records )
+{
+IPRINTF("All records buffered");
+
+/*
+ * We need to set buffer_all_records to false in
+ * order to process records instead of buffer records.
+ * buffer_all_records should be set back to true after
+ * we successfully processed all records.
+ */
+ctx->restore.buffer_all_records = false;

I'm not personally a fan of changing global state in order to simulate
the action of what should be a parameter to a function.

Preferable IMHO would be to have process_record gain a parameter to
override the ctx state but become an internal helper (perhaps with a
name change) and then have API function process_record and
process_buffered_records or some such which call it in the right way.

Andy may have a differing opinion though.


Hmm yes - it would be nice to split the buffering logic away from the
processing logic.

However, the two are slightly related.

Perhaps a process_or_buffer_record() helper, and removing all buffering
logic from process_record().


That seems like it would work.


Good idea, adding a buffer_record() helper should improve the
code and make it clearer, thank you!
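
The separation being agreed on could look roughly like this. It is a sketch under simplifying assumptions: `struct rctx` and `struct rec` are hypothetical, the buffer is a fixed array, and "processing" is reduced to a running sum so the effect is observable.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RECS_SKETCH 8

struct rec { int id; };

/* Hypothetical context holding only what the sketch needs. */
struct rctx {
    bool buffer_all_records;
    struct rec buffered[MAX_RECS_SKETCH];
    unsigned buffered_rec_num;
    int processed_sum;   /* stand-in for the side effects of processing */
};

/* Pure processing: no knowledge of buffering. */
static int process_record(struct rctx *c, const struct rec *r)
{
    c->processed_sum += r->id;
    return 0;
}

/* Pure buffering: stores the record for later. */
static int buffer_record(struct rctx *c, const struct rec *r)
{
    if ( c->buffered_rec_num >= MAX_RECS_SKETCH )
        return -1;
    c->buffered[c->buffered_rec_num++] = *r;
    return 0;
}

/* Entry point for the stream read loop. */
static int process_or_buffer_record(struct rctx *c, const struct rec *r)
{
    return c->buffer_all_records ? buffer_record(c, r)
                                 : process_record(c, r);
}

/* At a checkpoint: drain the buffer without toggling buffer_all_records. */
static int process_buffered_records(struct rctx *c)
{
    unsigned i;

    for ( i = 0; i < c->buffered_rec_num; i++ )
    {
        int rc = process_record(c, &c->buffered[i]);

        if ( rc )
            return rc;
    }
    c->buffered_rec_num = 0;
    return 0;
}
```

Because `process_record()` never consults `buffer_all_records`, the checkpoint handler no longer has to flip that flag to false and back again in order to replay the buffered records.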








--
Thanks,
Yang.



Re: [Xen-devel] [PATCH v6 00/16] Misc patches to aid migration v2 Remus support

2015-05-14 Thread Yang Hongyang



On 05/14/2015 08:35 PM, Ian Campbell wrote:

On Thu, 2015-05-14 at 16:55 +0800, Yang Hongyang wrote:

This is the combination of Andrew Cooper's misc patches and mine
to aid migration v2 Remus support.


Applied, thanks. I made a few changes to some commit logs, I hope that's
ok.



That's great! Thank you!







--
Thanks,
Yang.



Re: [Xen-devel] [RFC 08/23] net/xen-netback: Remove unused code in xenvif_rx_action

2015-05-14 Thread Wei Liu
On Thu, May 14, 2015 at 06:00:48PM +0100, Julien Grall wrote:
> The variables old_req_cons and ring_slots_used are assigned but never
> used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
> always fully coalesce guest Rx packets".
> 
> Signed-off-by: Julien Grall 
> Cc: Ian Campbell 
> Cc: Wei Liu 
> Cc: net...@vger.kernel.org

Acked-by: Wei Liu 

> ---
>  drivers/net/xen-netback/netback.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c 
> b/drivers/net/xen-netback/netback.c
> index 9c6a504..9ae1d43 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
>  
>   while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
>  && (skb = xenvif_rx_dequeue(queue)) != NULL) {
> - RING_IDX old_req_cons;
> - RING_IDX ring_slots_used;
> -
>   queue->last_rx_time = jiffies;
>  
> - old_req_cons = queue->rx.req_cons;
>   XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, 
> queue);
> - ring_slots_used = queue->rx.req_cons - old_req_cons;
>  
>   __skb_queue_tail(&rxq, skb);
>   }
> -- 
> 2.1.4



Re: [Xen-devel] [PATCH v10 01/10] tools: Add vga=vmware

2015-05-14 Thread Don Slutz
On 05/14/15 19:42, Andrew Cooper wrote:
> On 15/05/2015 00:34, Don Slutz wrote:
>> This allows use of QEMU's VMware emulated video card
>>
>> Signed-off-by: Don Slutz 
> 
> Nack.
> 
> Qemu-trad currently has remote code execution vulnerabilities in its
> vmware vga model.  CVE-2014-3689 amongst others.
> 
> Please fix those first before offering an option to configure it.
> 

Ok, will investigate.

  -Don Slutz


> ~Andrew
> 



[Xen-devel] Status of VMware tools support (Was: Xen 4.6 Development Update (four months reminder))

2015-05-14 Thread Don Slutz
On 05/13/15 01:01, wei.l...@citrix.com wrote:

> *  VMware tools support (fair)
>   -  Don Slutz
> 

v10 of patch set posted.  Should be able to move to ok.

   -Don Slutz




Re: [Xen-devel] [PATCH v10 01/10] tools: Add vga=vmware

2015-05-14 Thread Andrew Cooper
On 15/05/2015 00:34, Don Slutz wrote:
> This allows use of QEMU's VMware emulated video card
>
> Signed-off-by: Don Slutz 

Nack.

Qemu-trad currently has remote code execution vulnerabilities in its
vmware vga model.  CVE-2014-3689 amongst others.

Please fix those first before offering an option to configure it.

~Andrew

> ---
> v10: New at v10.
>
>   Was part of "tools: Add vmware_hwver support"
>
>  docs/man/xl.cfg.pod.5   | 2 +-
>  tools/libxl/libxl.h | 6 ++
>  tools/libxl/libxl_dm.c  | 8 
>  tools/libxl/libxl_types.idl | 1 +
>  tools/libxl/xl_cmdimpl.c| 2 ++
>  5 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 8e4154f..ba78374 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1374,7 +1374,7 @@ This option is deprecated, use vga="stdvga" instead.
>  
>  =item B
>  
> -Selects the emulated video card (none|stdvga|cirrus|qxl).
> +Selects the emulated video card (none|stdvga|cirrus|qxl|vmware).
>  The default is cirrus.
>  
>  In general, QXL should work with the Spice remote display protocol
> diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
> index 2ed7194..007a211 100644
> --- a/tools/libxl/libxl.h
> +++ b/tools/libxl/libxl.h
> @@ -192,6 +192,12 @@
>   * is not present, instead of ERROR_INVAL.
>   */
>  #define LIBXL_HAVE_ERROR_DOMAIN_NOTFOUND 1
> +
> +/*
> + * The libxl_vga_interface_type has the type for vmware.
> + */
> +#define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
> +
>  /*
>   * libxl ABI compatibility
>   *
> diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
> index 0c6408d..9a06f9b 100644
> --- a/tools/libxl/libxl_dm.c
> +++ b/tools/libxl/libxl_dm.c
> @@ -251,6 +251,9 @@ static char ** 
> libxl__build_device_model_args_old(libxl__gc *gc,
>  case LIBXL_VGA_INTERFACE_TYPE_NONE:
>  flexarray_append_pair(dm_args, "-vga", "none");
>  break;
> +case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
> +flexarray_append_pair(dm_args, "-vga", "vmware");
> +break;
>  case LIBXL_VGA_INTERFACE_TYPE_QXL:
>  break;
>  }
> @@ -633,6 +636,11 @@ static char ** 
> libxl__build_device_model_args_new(libxl__gc *gc,
>  
> GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
>  (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) 
> );
>  break;
> +case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
> +flexarray_append_pair(dm_args, "-device",
> +GCSPRINTF("vmware-svga,vgamem_mb=%d",
> +libxl__sizekb_to_mb(b_info->video_memkb)));
> +break;
>  }
>  
>  if (b_info->u.hvm.boot) {
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 65d479f..9d6ca45 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -184,6 +184,7 @@ libxl_vga_interface_type = 
> Enumeration("vga_interface_type", [
>  (2, "STD"),
>  (3, "NONE"),
>  (4, "QXL"),
> +(5, "VMWARE"),
>  ], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
>  
>  libxl_vendor_device = Enumeration("vendor_device", [
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 373aa37..0e44b12 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2117,6 +2117,8 @@ skip_vfb:
>  b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
>  } else if (!strcmp(buf, "qxl")) {
>  b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_QXL;
> +} else if (!strcmp(buf, "vmware")) {
> +b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
>  } else {
>  fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
>  exit(1);




[Xen-devel] [PATCH v10 00/10] Xen VMware tools support

2015-05-14 Thread Don Slutz
Changes v9 to v10:
  Split out LIBXL_VGA_INTERFACE_TYPE_VMWARE into its own patch (#1)
  that can stand alone.  In the patch set because a later patch
  depends on it.

  Reworked to be based on:

commit a7511905fae7ba592c5bf63cd77d8ff78087d689
Author: Julien Grall 
Date:   Wed Apr 1 17:21:41 2015 +0100

xen: Extend DOMCTL createdomain to support arch configuration

  rebased onto:

commit e13013dbf1d5997915548a3b5f1c39594d8c1d7b
Author: Yang Hongyang 
Date:   Thu May 14 16:55:18 2015 +0800

libxc/restore: add checkpointed flag to the restore context


  Andrew Cooper (#2: "xen: Add support for VMware cpuid leaves"):
Did not add "Reviewed-by: Andrew Cooper "
because of changes here to do things the new way.
  Reword comment message to reflect new way.

  Ian Campbell (#3 "tools: Add vmware_hwver support"):
LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE &
LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER are arriving together
a single umbrella could be used.
  Since I split the LIBXL_VGA_INTERFACE_TYPE_VMWARE into
  its own patch, this is no longer true.
  But I did use 1 for the 2 c_info changes.
Please use GCSPRINTF.
  Done.
  Remove vga=vmware from here.

  Ian Campbell (#3 "tools: Add vmware_hwver support"):
For "Add IOREQ_TYPE_VMWARE_PORT"
  With those fixed the tools/* bits are:
Acked-by: Ian Campbell   
Did not add Acked-by to "tools: Add vmware_hwver support"
because of the rework for using libxl_domain_create_info.

  Andrew Cooper (#4: "vmware: Add VMware provided include file."):
Added "Acked-by: Andrew Cooper "

  Andrew Cooper (#5 "xen: Add vmware_port support"):
Probably better as EOPNOTSUPP, as it is a configuration problem.
  Done.
vmport_ioport function looks as if it should be static.
  Done.
Why is GETHZ the only one of these with a CPL check?
  Please see thread for detail.
I would suggest putting vmport_register declaration in hvm.h ...
  Done.

  Jan Beulich (#5 "xen: Add vmware_port support"):
As indicated before, I don't think this is a good use case for a
domain creation flag.
  Switch to the new config way.
struct domain *d => struct domain *currd
  Done
Are you sure you don't want to zero the high halves of 64-bit ...
  Comment added.
   Then just have this handled into the default case.
  Reworked new_eax handling.
   is_hvm_domain(currd)
   And - why here rather than before the switch() or even right at the
   start of the function?
  Moved to start.
   With that, is it really correct that OUT updates the other registers
   just like IN? If so, this deserves a comment, so that readers won't
   think this is in error.
 All done in comment at start.

  Andrew Cooper (#6 "xen: Add ring 3 vmware_port support"):
>> This looks horribly invasive.
>>
>> Why are emulation changes needed?  What is wrong with the normal
>> handling with a registered ioport handler?
> Because VMware made a bad way to provide a "hyper call".  They decided to
> allow user access to this.  So when a #GP fault should have been
> reported, they instead do the "hyper call".
>
Urgh - now I remember.

Right.  In the case that vmport is active, we start intercepting #GP
faults and emulating access.  That part of the patch looks ok.

However, the rest is very invasive to the emulation infrastructure.
  Re-worked along the lines suggested.

  Jan Beulich (#6 "xen: Add ring 3 vmware_port support"):
I hope that vmport_check will no longer be needed with the adjustments ...
> Since this is not an architecture feature and I do not expect any real
> CPUs to support this, I do not expect any other use.  But I am happy
> to make it more generic.

Let's see how this ends up looking - the hook is probably indeed
bogus (from an architectural pov) no matter how you name it.
  Last e-mail on thread, so no change.

  Ian Campbell (#7 "tools: Add vmware_port support"):
If..." at the start of the sentence ...
  Used Ian's reword.
Also, why is 7 special?
  Attempted to better explain.

  Paul Durrant & Jan Beulich (#8 "Add IOREQ_TYPE_VMWARE_PORT"):
Now that buf is no longer a bool, could ...
These literals should become an enum
  Added an enum.
I don't think the invalidate type is needed.
  Dropped.
IOREQ_TYPE_VMWARE_PORT as 3 is a re-use.
  Switch to 9.
Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
is unclear.
   Re-worked to a version that Jan likes better.
Comment about "special' range of 1" is not clear.
   Re-worded comments.

  Ian Campbell (#9 "Add xentrace to vmware_port"):
Acked-by
  Readded dropped traces.

  Jan Beulich & Andrew Cooper (#9 "Add xentrace to vmware_port"):
Why is cmd in this patch?
  Because the trace points use it.

  Jan Beulich (#10 "test_x86_emulator.c: Add tests for #GP usage

[Xen-devel] [PATCH v10 08/10] Add IOREQ_TYPE_VMWARE_PORT

2015-05-14 Thread Don Slutz
This adds synchronization of the 6 vcpu registers (only 32bits of
them) that vmport.c needs between Xen and QEMU.

This is to avoid a 2nd and 3rd exchange between QEMU and Xen to
fetch and put these 6 vcpu registers used by the code in vmport.c
and vmmouse.c

In the tools, enable usage of QEMU's vmport code.

The most useful VMware port support that QEMU currently has is the
VMware mouse support.  Xorg includes VMware mouse support that
uses absolute mode.  This makes using a mouse in X11 much nicer.

Signed-off-by: Don Slutz 
Acked-by: Ian Campbell 
---
v10:
  These literals should become an enum.
I don't think the invalidate type is needed.
Code handling "case X86EMUL_UNHANDLEABLE:" in emulate.c
is unclear.
Comment about "special' range of 1" is not clear.


v9:
  New code was presented as an RFC before this.

  Paul Durrant suggested I add support for other IOREQ types
  to HVMOP_map_io_range_to_ioreq_server.
I have done this.

 tools/libxc/xc_hvm_build_x86.c   |   5 +-
 tools/libxl/libxl_dm.c   |   2 +
 xen/arch/x86/hvm/emulate.c   |  78 ++---
 xen/arch/x86/hvm/hvm.c   | 182 ++-
 xen/arch/x86/hvm/io.c|  16 
 xen/include/asm-x86/hvm/domain.h |   3 +-
 xen/include/asm-x86/hvm/hvm.h|   1 +
 xen/include/public/hvm/hvm_op.h  |   5 ++
 xen/include/public/hvm/ioreq.h   |  17 
 xen/include/public/hvm/params.h  |   4 +-
 10 files changed, 274 insertions(+), 39 deletions(-)

diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
index e45ae4a..ffe52eb 100644
--- a/tools/libxc/xc_hvm_build_x86.c
+++ b/tools/libxc/xc_hvm_build_x86.c
@@ -46,7 +46,8 @@
 #define SPECIALPAGE_IOREQ5
 #define SPECIALPAGE_IDENT_PT 6
 #define SPECIALPAGE_CONSOLE  7
-#define NR_SPECIAL_PAGES 8
+#define SPECIALPAGE_VMPORT_REGS 8
+#define NR_SPECIAL_PAGES 9
 #define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
 
 #define NR_IOREQ_SERVER_PAGES 8
@@ -569,6 +570,8 @@ static int setup_guest(xc_interface *xch,
  special_pfn(SPECIALPAGE_BUFIOREQ));
 xc_hvm_param_set(xch, dom, HVM_PARAM_IOREQ_PFN,
  special_pfn(SPECIALPAGE_IOREQ));
+xc_hvm_param_set(xch, dom, HVM_PARAM_VMPORT_REGS_PFN,
+ special_pfn(SPECIALPAGE_VMPORT_REGS));
 xc_hvm_param_set(xch, dom, HVM_PARAM_CONSOLE_PFN,
  special_pfn(SPECIALPAGE_CONSOLE));
 xc_hvm_param_set(xch, dom, HVM_PARAM_PAGING_RING_PFN,
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index c04fa0d..e02766a 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -799,6 +799,8 @@ static char ** libxl__build_device_model_args_new(libxl__gc 
*gc,
 machinearg, max_ram_below_4g);
 }
 }
+if (libxl_defbool_val(c_info->vmware_port))
+machinearg = GCSPRINTF("%s,vmport=on", machinearg);
 flexarray_append(dm_args, machinearg);
 for (i = 0; b_info->extra_hvm && b_info->extra_hvm[i] != NULL; i++)
 flexarray_append(dm_args, b_info->extra_hvm[i]);
diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index d5e6468..0a42d18 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -219,27 +219,70 @@ static int hvmemul_do_io(
 vio->io_state = HVMIO_handle_mmio_awaiting_completion;
 break;
 case X86EMUL_UNHANDLEABLE:
-{
-struct hvm_ioreq_server *s =
-hvm_select_ioreq_server(curr->domain, &p);
-
-/* If there is no suitable backing DM, just ignore accesses */
-if ( !s )
+if ( vmport_check_port(p.addr) )
 {
-hvm_complete_assist_req(&p);
-rc = X86EMUL_OKAY;
-vio->io_state = HVMIO_none;
+struct hvm_ioreq_server *s =
+hvm_select_ioreq_server(curr->domain, &p);
+
+/* If there is no suitable backing DM, just ignore accesses */
+if ( !s )
+{
+hvm_complete_assist_req(&p);
+rc = X86EMUL_OKAY;
+vio->io_state = HVMIO_none;
+}
+else
+{
+rc = X86EMUL_RETRY;
+if ( !hvm_send_assist_req(s, &p) )
+vio->io_state = HVMIO_none;
+else if ( p_data == NULL )
+rc = X86EMUL_OKAY;
+}
 }
 else
 {
-rc = X86EMUL_RETRY;
-if ( !hvm_send_assist_req(s, &p) )
-vio->io_state = HVMIO_none;
-else if ( p_data == NULL )
+struct hvm_ioreq_server *s;
+vmware_regs_t *vr;
+
+BUILD_BUG_ON(sizeof(ioreq_t) < sizeof(vmware_regs_t));
+
+p.type = IOREQ_TYPE_VMWARE_PORT;
+s = hvm_select_ioreq_server(curr->domain, &p);
+vr = get_vmport_regs_any(s, curr);
+
+/*
+

[Xen-devel] [PATCH v10 02/10] xen: Add support for VMware cpuid leaves

2015-05-14 Thread Don Slutz
This is done by adding xen_arch_domainconfig vmware_hw. It is set to
the VMware virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the
code only checks for == 0 or != 0 or >= 7.

If non-zero then
  Return VMware's cpuid leaves.  If >= 7 return data, else
  return 0.

The support of hypervisor cpuid leaves has not been agreed to.

Microsoft Hyper-V (AKA viridian) currently must be at 0x40000000.

VMware currently must be at 0x40000000.

KVM currently must be at 0x40000000 (from Seabios).

Xen can be found at the first otherwise unused 0x100 aligned
offset between 0x40000000 and 0x40010000.

http://download.microsoft.com/download/F/B/0/FB0D01A3-8E3A-4F5F-AA59-08C8026D3B8A/requirements-for-implementing-microsoft-hypervisor-interface.docx

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

http://lwn.net/Articles/301888/
  Attempted to get this cleaned up.

So based on this, I picked the order:

Xen at 0x40000000 or
Viridian or VMware at 0x40000000 and Xen at 0x40000100

If both Viridian and VMware selected, report an error.
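
The ordering above can be sketched as a small helper.  This is only an
illustration, not the code in this patch: pick_xen_cpuid_base() and the
two macros are hypothetical names, and the sketch assumes the
conventional x86 hypervisor CPUID base of 0x40000000 with Xen moving up
by one 0x100-aligned block when another hypervisor interface occupies
the base.

```c
#include <stdbool.h>
#include <stdint.h>

#define HV_CPUID_BASE    0x40000000u /* conventional hypervisor leaf base */
#define XEN_CPUID_STRIDE 0x100u      /* Xen moves up by one 0x100 block */

/*
 * Hypothetical helper: returns 0 and stores the base leaf Xen would use
 * in *xen_base, or -1 when both Viridian and VMware are selected.
 * A non-zero vmware_hwver means the VMware leaves occupy the base.
 */
static int pick_xen_cpuid_base(bool viridian, unsigned int vmware_hwver,
                               uint32_t *xen_base)
{
    bool vmware = vmware_hwver != 0;

    if ( viridian && vmware )
        return -1;

    *xen_base = (viridian || vmware) ? HV_CPUID_BASE + XEN_CPUID_STRIDE
                                     : HV_CPUID_BASE;
    return 0;
}
```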

Since I need to change xen/arch/x86/hvm/Makefile; also add
a newline at end of file.

Signed-off-by: Don Slutz 
Reviewed-by: Andrew Cooper 
---
v10:
Did not add "Reviewed-by: Andrew Cooper "
because of changes here to do things the new way.
  Reword comment message to reflect new way.

v9:
s/vmware_hw/vmware_hwver/i
Change -EXDEV to EOPNOTSUPP.
  Done.
adding another subdirectory: xen/arch/x86/hvm/vmware
Much will depend on the discussion of the subsequent patches.
  TBD.
So for versions < 7 there's effectively no CPUID support at all?
  Changed to check at entry.
The comment /* Params for VMware */ seems wrong...
  Changed to /* emulated VMware Hardware Version */
Also please use d, not _d in #define is_vmware_domain()
  Changed.  Line is now > 80 characters, so split into 2.

v7:
  Prevent setting of HVM_PARAM_VIRIDIAN if HVM_PARAM_VMWARE_HW set.
v5:
  Given how is_viridian and is_vmware are defined I think '||' is more
  appropriate.
Fixed.
  The names of all three functions are bogus.
removed static support routines.
  This hunk is unrelated, but is perhaps something better fixed.
Added to commit message.
  include  (IIRC) please.
Done.
  At least 1 pair of brackets please, especially as the placement of
  brackets affects the result of this particular calculation.
Switch to "100ull / APIC_BUS_CYCLE_NS"  

 xen/arch/x86/domain.c |  2 ++
 xen/arch/x86/hvm/Makefile |  1 +
 xen/arch/x86/hvm/hvm.c| 11 ++
 xen/arch/x86/hvm/vmware/Makefile  |  1 +
 xen/arch/x86/hvm/vmware/cpuid.c   | 75 +++
 xen/arch/x86/traps.c  |  8 +++--
 xen/include/asm-x86/hvm/domain.h  |  3 ++
 xen/include/asm-x86/hvm/hvm.h |  6 
 xen/include/asm-x86/hvm/vmware.h  | 33 +
 xen/include/public/arch-x86/xen.h |  2 +-
 10 files changed, 139 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/x86/hvm/vmware/Makefile
 create mode 100644 xen/arch/x86/hvm/vmware/cpuid.c
 create mode 100644 xen/include/asm-x86/hvm/vmware.h

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1f1550e..bc3d3a5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -518,6 +518,8 @@ int arch_domain_create(struct domain *d, unsigned int 
domcr_flags,
 hvm_funcs.hap_supported &&
 (domcr_flags & DOMCRF_hap);
 d->arch.hvm_domain.mem_sharing_enabled = 0;
+if ( config )
+d->arch.hvm_domain.vmware_hwver = config->vmware_hwver;
 
 d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
 
diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
index 69af47f..284ca75 100644
--- a/xen/arch/x86/hvm/Makefile
+++ b/xen/arch/x86/hvm/Makefile
@@ -1,5 +1,6 @@
 subdir-y += svm
 subdir-y += vmx
+subdir-y += vmware
 
 obj-y += asid.o
 obj-y += emulate.o
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 689e402..05c80e9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -4253,6 +4254,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, 
unsigned int *ebx,
 if ( cpuid_viridian_leaves(input, eax, ebx, ecx, edx) )
 return;
 
+if ( cpuid_vmware_leaves(input, eax, ebx, ecx, edx) )
+return;
+
 if ( cpuid_hypervisor_leaves(input, count, eax, ebx, ecx, edx) )
 return;
 
@@ -5656,6 +5660,13 @@ static int hvm_allow_set_param(struct domain *d,
 {
 /* The following parameters should only be changed once. */
 case HVM_PARAM_VIRIDIAN:
+/* Disallow if vmware_hwver */
+if ( d->arch.hvm_domain.vmware_hwver )
+{
+rc = -EOPNOTSUPP;
+break;
+}
+/* F

[Xen-devel] [PATCH v10 10/10] test_x86_emulator.c: Add tests for #GP usage

2015-05-14 Thread Don Slutz
Test out special #GP handling for the VMware port 0x5658.
This is done in two "modes", j=0 and j=1.  Testing 4
instructions (all the basic PIO) in both modes.

The port used is based on j.

For IN, eax should change.  For OUT eax should not change.

All 4 PIO instructions are 1 byte long, so eip should only
change by 1.

Signed-off-by: Don Slutz 
---
v10:
  More comments and simpler error checking.
 Dropped un-needed new routines.

 tools/tests/x86_emulator/test_x86_emulator.c | 189 +++
 1 file changed, 189 insertions(+)

diff --git a/tools/tests/x86_emulator/test_x86_emulator.c 
b/tools/tests/x86_emulator/test_x86_emulator.c
index 1b78bf7..a509dad 100644
--- a/tools/tests/x86_emulator/test_x86_emulator.c
+++ b/tools/tests/x86_emulator/test_x86_emulator.c
@@ -154,6 +154,47 @@ int get_fpu(
 return X86EMUL_OKAY;
 }
 
+static int read_io(
+unsigned int port,
+unsigned int bytes,
+unsigned long *val,
+struct x86_emulate_ctxt *ctxt)
+{
+*val = 0x;
+if ( port == 0x5658 )
+{
+ctxt->regs->_ebx++;
+ctxt->regs->_ecx++;
+ctxt->regs->_esi++;
+}
+return X86EMUL_OKAY;
+}
+
+static int write_io(
+unsigned int port,
+unsigned int bytes,
+unsigned long val,
+struct x86_emulate_ctxt *ctxt)
+{
+if ( port == 0x5658 )
+{
+ctxt->regs->_ebx++;
+ctxt->regs->_ecx++;
+ctxt->regs->_esi++;
+}
+return X86EMUL_OKAY;
+}
+
+static int vmport_check(
+   unsigned int first_port,
+   struct x86_emulate_ctxt *ctxt)
+{
+if ( first_port == 0x5658 )
+return 0;
+else
+return 1;
+}
+
 static struct x86_emulate_ops emulops = {
 .read   = read,
 .insn_fetch = fetch,
@@ -163,6 +204,13 @@ static struct x86_emulate_ops emulops = {
 .get_fpu= get_fpu,
 };
 
+static struct x86_emulate_ops emulops_gp = {
+.insn_fetch = fetch,
+.read_io= read_io,
+.write_io   = write_io,
+.vmport_check = vmport_check,
+};
+
 int main(int argc, char **argv)
 {
 struct x86_emulate_ctxt ctxt;
@@ -928,6 +976,147 @@ int main(int argc, char **argv)
 goto fail;
 printf("okay\n");
 
+/*
+ * Test out special #GP handling for the VMware port 0x5658.
+ * This is done in two "modes", j=0 and j=1.  Testing 4
+ * instructions (all the basic PIO) in both modes.
+ *
+ * The port used is based on j.
+ *
+ * For IN, eax should change.  For OUT eax should not change.
+ *
+ * All 4 PIO instructions are 1 byte long, so eip should only
+ * change by 1.
+ */
+for ( j = 0; j <= 1; j++ )
+{
+regs.eflags = 0x20002;
+regs.edx= 0x5658 + j;
+printf("Testing %s dx=%x ...   ", "in (%dx),%eax", (int)regs.edx);
+instr[0] = 0xed; /* in (%dx),%eax or in (%dx),%ax */
+regs.eip= (unsigned long)&instr[0];
+regs.eax= 0x12345678;
+regs.ebx= 0;
+regs.ecx= 0;
+regs.esi= 0;
+rc = x86_emulate(&ctxt, &emulops_gp);
+/*
+ * In j=0, there should not be an error returned.
+ * In j=1, there should be an error returned.
+ */
+if ( rc == X86EMUL_OKAY ? j : !j )
+goto fail;
+/* Check for only 1 byte used or 0 if #GP. */
+if ( regs.eip != (unsigned long)&instr[1 - j] )
+goto fail;
+/* Check that eax changed in the non #GP case */
+if ( j == 0 && regs.eax == 0x12345678 )
+goto fail;
+/* Check that ebx has the correct value */
+if ( regs.ebx == j )
+goto fail;
+/* Check that ecx has the correct value */
+if ( regs.ecx == j )
+goto fail;
+/* Check that esi has the correct value */
+if ( regs.esi == j )
+goto fail;
+printf("okay\n");
+
+printf("Testing %s  dx=%x ...   ", "in (%dx),%al", (int)regs.edx);
+instr[0] = 0xec; /* in (%dx),%al */
+regs.eip= (unsigned long)&instr[0];
+regs.eax= 0x12345678;
+regs.ebx= 0;
+regs.ecx= 0;
+regs.esi= 0;
+rc = x86_emulate(&ctxt, &emulops_gp);
+/*
+ * In j=0, there should not be an error returned.
+ * In j=1, there should be an error returned.
+ */
+if ( rc == X86EMUL_OKAY ? j : !j )
+goto fail;
+/* Check for only 1 byte used or 0 if #GP. */
+if ( regs.eip != (unsigned long)&instr[1 - j] )
+goto fail;
+/* Check that eax changed in the non #GP case */
+if ( j == 0 && regs.eax == 0x12345678 )
+goto fail;
+/* Check that ebx has the correct value */
+if ( regs.ebx == j )
+goto fail;
+/* Check that ecx has the correct value */
+if ( regs.ecx == j )
+goto fail;
+/* Check that esi has the correct value */
+if ( regs.esi == j )
+got

[Xen-devel] [PATCH v10 04/10] vmware: Add VMware provided include file.

2015-05-14 Thread Don Slutz
This file: backdoor_def.h comes from:

http://packages.vmware.com/tools/esx/3.5latest/rhel4/SRPMS/index.html
 open-vm-tools-kmod-7.4.8-396269.423167.src.rpm
  open-vm-tools-kmod-7.4.8.tar.gz
   vmhgfs/backdoor_def.h

and is unchanged.

Added the badly named include file includeCheck.h also.  It only has
a comment and is provided so that backdoor_def.h can be used without
change.

Signed-off-by: Don Slutz 
Acked-by: Andrew Cooper 
---
v10:
   Add Acked-by: Andrew Cooper

v9:
Either the description is wrong, or the patch is stale.
  stale commit message -- fixed.
I'd say a file with a single comment line in it would suffice.
  Done.


 xen/arch/x86/hvm/vmware/backdoor_def.h | 167 +
 xen/arch/x86/hvm/vmware/includeCheck.h |   1 +
 2 files changed, 168 insertions(+)
 create mode 100644 xen/arch/x86/hvm/vmware/backdoor_def.h
 create mode 100644 xen/arch/x86/hvm/vmware/includeCheck.h

diff --git a/xen/arch/x86/hvm/vmware/backdoor_def.h 
b/xen/arch/x86/hvm/vmware/backdoor_def.h
new file mode 100644
index 000..e76795f
--- /dev/null
+++ b/xen/arch/x86/hvm/vmware/backdoor_def.h
@@ -0,0 +1,167 @@
+/* **
+ * Copyright 1998 VMware, Inc.  All rights reserved. 
+ * **
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation version 2 and no later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ * or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
+ */
+
+/*
+ * backdoor_def.h --
+ *
+ * This contains backdoor defines that can be included from
+ * an assembly language file.
+ */
+
+
+
+#ifndef _BACKDOOR_DEF_H_
+#define _BACKDOOR_DEF_H_
+
+#define INCLUDE_ALLOW_MODULE
+#define INCLUDE_ALLOW_USERLEVEL
+#define INCLUDE_ALLOW_VMMEXT
+#define INCLUDE_ALLOW_VMCORE
+#define INCLUDE_ALLOW_VMKERNEL
+#include "includeCheck.h"
+
+/*
+ * If you want to add a new low-level backdoor call for a guest userland
+ * application, please consider using the GuestRpc mechanism instead. --hpreg
+ */
+
+#define BDOOR_MAGIC 0x564D5868
+
+/* Low-bandwidth backdoor port. --hpreg */
+
+#define BDOOR_PORT 0x5658
+
+#define BDOOR_CMD_GETMHZ  1
+/*
+ * BDOOR_CMD_APMFUNCTION is used by:
+ *
+ * o The FrobOS code, which instead should either program the virtual chipset
+ *   (like the new BIOS code does, matthias offered to implement that), or not
+ *   use any VM-specific code (which requires that we correctly implement
+ *   "power off on CLI HLT" for SMP VMs, boris offered to implement that)
+ *
+ * o The old BIOS code, which will soon be jettisoned
+ *
+ *  --hpreg
+ */
+#define BDOOR_CMD_APMFUNCTION 2
+#define BDOOR_CMD_GETDISKGEO  3
+#define BDOOR_CMD_GETPTRLOCATION 4
+#define BDOOR_CMD_SETPTRLOCATION 5
+#define BDOOR_CMD_GETSELLENGTH6
+#define BDOOR_CMD_GETNEXTPIECE7
+#define BDOOR_CMD_SETSELLENGTH8
+#define BDOOR_CMD_SETNEXTPIECE9
+#define BDOOR_CMD_GETVERSION 10
+#define BDOOR_CMD_GETDEVICELISTELEMENT 11
+#define BDOOR_CMD_TOGGLEDEVICE12
+#define BDOOR_CMD_GETGUIOPTIONS   13
+#define BDOOR_CMD_SETGUIOPTIONS   14
+#define BDOOR_CMD_GETSCREENSIZE   15
+#define BDOOR_CMD_MONITOR_CONTROL   16
+#define BDOOR_CMD_GETHWVERSION  17
+#define BDOOR_CMD_OSNOTFOUND18
+#define BDOOR_CMD_GETUUID   19
+#define BDOOR_CMD_GETMEMSIZE20
+#define BDOOR_CMD_HOSTCOPY  21 /* Devel only */
+/* BDOOR_CMD_GETOS2INTCURSOR, 22, is very old and defunct. Reuse. */
+#define BDOOR_CMD_GETTIME   23 /* Deprecated. Use GETTIMEFULL. */
+#define BDOOR_CMD_STOPCATCHUP   24
+#define BDOOR_CMD_PUTCHR   25 /* Devel only */
+#define BDOOR_CMD_ENABLE_MSG   26 /* Devel only */
+#define BDOOR_CMD_GOTO_TCL 27 /* Devel only */
+#define BDOOR_CMD_INITPCIOPROM 28
+#define BDOOR_CMD_INT1329
+#define BDOOR_CMD_MESSAGE   30
+#define BDOOR_CMD_RSVD0 31
+#define BDOOR_CMD_RSVD1 32
+#define BDOOR_CMD_RSVD2 33
+#define BDOOR_CMD_ISACPIDISABLED   34
+#define BDOOR_CMD_TOE  35 /* Not in use */
+/* BDOOR_CMD_INITLSIOPROM, 36, was merged with 28. Reuse. */
+#define BDOOR_CMD_PATCH_SMBIOS_STRUCTS  37
+#define BDOOR_CMD_MAPMEM38 /* 

[Xen-devel] [PATCH v10 07/10] tools: Add vmware_port support

2015-05-14 Thread Don Slutz
This new libxl_domain_create_info field is used to set
XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK in the xc_domain_configuration_t
for x86.

In xen it is is_vmware_port_enabled.

If is_vmware_port_enabled then
  enable a limited support of VMware's hyper-call.

VMware's hyper-call is also known as VMware Backdoor I/O Port.

If vmware_port is not specified in the config file, let
"vmware_hwver != 0" be the default value.  This means that only
vmware_hwver = 7 needs to be specified to enable both features.

vmware_hwver = 7 is special because it is the lowest value that
enables the CPUID leaves for VMware (vmware_hwver >= 7).

Note: vmware_port and nestedhvm cannot be specified at the
same time.
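
As an illustration of the defaulting rule and the nestedhvm conflict
described above, here is a minimal sketch.  The tri-state type, struct
cfg and resolve_vmware_port() are hypothetical names loosely modelled
on how libxl_defbool behaves; they are not the libxl code in this
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tri-state, loosely modelled on libxl_defbool. */
enum tristate { TS_DEFAULT = 0, TS_FALSE, TS_TRUE };

/* Hypothetical subset of the domain configuration. */
struct cfg {
    uint64_t vmware_hwver;     /* 0 means VMware emulation is off */
    enum tristate vmware_port; /* TS_DEFAULT if absent from the config */
    bool nestedhvm;
};

/* Apply a default only if the user did not set the value explicitly. */
static void tristate_setdefault(enum tristate *t, bool def)
{
    if ( *t == TS_DEFAULT )
        *t = def ? TS_TRUE : TS_FALSE;
}

/*
 * Resolve vmware_port: it defaults to "vmware_hwver != 0", and it is
 * rejected when combined with nestedhvm.  Returns 0 on success and -1
 * on an invalid combination.
 */
static int resolve_vmware_port(struct cfg *c)
{
    tristate_setdefault(&c->vmware_port, c->vmware_hwver != 0);

    if ( c->vmware_port == TS_TRUE && c->nestedhvm )
        return -1;

    return 0;
}
```

An explicit vmware_port = 0 in the config file still wins over the
default, which is why the value has to be tri-state rather than a plain
boolean.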

Signed-off-by: Don Slutz 
---
v10:
If..." at the start of the sentence ...
Also, why is 7 special?


 docs/man/xl.cfg.pod.5   | 16 +++-
 tools/libxl/libxl.h |  5 +
 tools/libxl/libxl_create.c  |  9 +
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/libxl_x86.c |  2 ++
 tools/libxl/xl_cmdimpl.c|  1 +
 6 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index f62d9f2..3bd0643 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1339,7 +1339,8 @@ Turns on or off the exposure of VMware cpuid.  The number 
is
 VMware's hardware version number, where 0 is off.  A number >= 7
 is needed to enable exposure of VMware cpuid.
 
-If not zero it changes the default VGA to VMware's VGA.
+If not zero it changes the default VGA to VMware's VGA and the
+default for vmware_port to on.
 
 The hardware version number (vmware_hwver) come from VMware config files.
 
@@ -1352,6 +1353,19 @@ For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
 
 =back
 
+=item B
+
+Turns on or off the exposure of VMware port.  This is known as
+vmport in QEMU.  Also called VMware Backdoor I/O Port.  Not all
+defined VMware backdoor commands are implemented.  All of the
+ones that Linux kernel uses are defined.
+
+Defaults to enabled if vmware_hwver is non-zero (i.e. enabled)
+otherwise defaults to disabled.
+
+Note: vmware_port and nestedhvm cannot be specified at the
+same time.
+
 =back
 
 =head3 Emulated VGA Graphics Device
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 61d89be..4f4b41c 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -199,6 +199,11 @@
 #define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
 
 /*
+ * libxl_domain_create_info has the vmware_hwver and vmware_port field.
+ */
+#define LIBXL_HAVE_CREATEINFO_VMWARE 1
+
+/*
  * libxl ABI compatibility
  *
  * The only guarantee which libxl makes regarding ABI compatibility
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index b7818bc..e93e0fe 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -39,6 +39,7 @@ int libxl__domain_create_info_setdefault(libxl__gc *gc,
 libxl_defbool_setdefault(&c_info->hap, libxl_defbool_val(c_info->pvh));
 }
 
+libxl_defbool_setdefault(&c_info->vmware_port, c_info->vmware_hwver != 0);
 libxl_defbool_setdefault(&c_info->run_hotplug_scripts, true);
 libxl_defbool_setdefault(&c_info->driver_domain, false);
 
@@ -912,6 +913,14 @@ static void initiate_domain_create(libxl__egc *egc,
d_config->c_info.vmware_hwver != 0);
 if (ret) goto error_out;
 
+if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
+libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
+libxl_defbool_val(d_config->c_info.vmware_port)) {
+LOG(ERROR,
+"vmware_port and nestedhvm cannot be enabled simultaneously\n");
+ret = ERROR_INVAL;
+goto error_out;
+}
 if (!sched_params_valid(gc, domid, &d_config->b_info.sched_params)) {
 LOG(ERROR, "Invalid scheduling parameters\n");
 ret = ERROR_INVAL;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 501bb48..9ef6f0a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -344,6 +344,7 @@ libxl_domain_create_info = Struct("domain_create_info",[
 ("pvh",  libxl_defbool),
 ("driver_domain",libxl_defbool),
 ("vmware_hwver", uint64),
+("vmware_port",  libxl_defbool),
 ], dir=DIR_IN)
 
 libxl_domain_restore_params = Struct("domain_restore_params", [
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index fd7dafa..404904a 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -6,6 +6,8 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
   xc_domain_configuration_t *xc_config)
 {
 xc_config->vmware_hwver = d_config->c_info.vmware_hwver;
+if (libxl_defbool_val(d_config->c_info.vmware_port))
+xc_config->arch_flags |= XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK;
 return 0;
 }
 
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 18ba70f..100efce 100644
--- a/tools/libxl/xl_cmd

[Xen-devel] [PATCH v10 03/10] tools: Add vmware_hwver support

2015-05-14 Thread Don Slutz
This is used to set xen_arch_domainconfig vmware_hw. It is set to
the emulated VMware virtual hardware version.

Currently 0, 3-4, 6-11 are good values.  However the code only
checks for == 0, != 0, or < 7.

If non-zero then
  default VGA to VMware's VGA.
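
For readers following the series, the defaulting rules can be summarised in a
small standalone sketch.  It is not libxl code: the tri-state type, the struct
and the function names are hypothetical stand-ins for libxl_defbool and the
c_info/b_info fields.  Only the rules themselves come from the patch
descriptions: a non-zero vmware_hwver selects the VMware VGA and turns
vmware_port on by default, a value >= 7 is needed for the CPUID exposure, and
vmware_port cannot be combined with nestedhvm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tri-state, standing in for libxl_defbool. */
enum tristate { TS_DEFAULT = 0, TS_FALSE, TS_TRUE };

/* Hypothetical VGA kinds; only the two relevant here. */
enum vga_kind { VGA_UNSET = 0, VGA_CIRRUS, VGA_VMWARE };

struct vmware_cfg {
    uint64_t vmware_hwver;     /* 0 means "off" */
    enum tristate vmware_port; /* TS_DEFAULT: not set in the config file */
    enum vga_kind vga;         /* VGA_UNSET: not set in the config file */
    bool nested_hvm;
};

/* Fill in unset fields; return 0 on success, -1 on a forbidden combination. */
int vmware_cfg_setdefault(struct vmware_cfg *c)
{
    assert(c != NULL);
    bool hwver_on = c->vmware_hwver != 0;

    if (c->vmware_port == TS_DEFAULT)
        c->vmware_port = hwver_on ? TS_TRUE : TS_FALSE;
    if (c->vga == VGA_UNSET)
        c->vga = hwver_on ? VGA_VMWARE : VGA_CIRRUS;

    /* vmware_port and nestedhvm cannot be enabled simultaneously. */
    if (c->vmware_port == TS_TRUE && c->nested_hvm)
        return -1;
    return 0;
}

/* The VMware cpuid is only exposed for hardware version >= 7. */
bool vmware_cpuid_exposed(const struct vmware_cfg *c)
{
    assert(c != NULL);
    return c->vmware_hwver >= 7;
}
```

With this model, a config that sets only vmware_hwver = 7 ends up with the
VMware VGA, vmware_port enabled and the cpuid exposed, while an explicit
setting of either field is left untouched.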

Signed-off-by: Don Slutz 
---
v10:
LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE &
LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER are arriving together
a single umbrella could be used.
  Since I split LIBXL_VGA_INTERFACE_TYPE_VMWARE into
  its own patch, this is no longer true.
  But I did use 1 for the 2 c_info changes.
Please use GCSPRINTF.
  Remove vga=vmware from here.

v9:
  I assumed that s/vmware_hw/vmware_hwver/ is not a big enough
  change to drop the Reviewed-by.  Did a minor edit to the
  commit message to add 7 to the list of values checked.

v7:
Default handling of hvm.vga.kind bad.
  Fixed.
Default of vmware_port should be based on vmware_hw.
  Done. 

v5:
  Anything looking for Xen according to the Xen cpuid instructions...
Adjusted doc to new wording.

 docs/man/xl.cfg.pod.5| 25 -
 tools/libxc/xc_domain.c  |  2 +-
 tools/libxl/libxl.c  |  4 +++-
 tools/libxl/libxl.h  |  1 +
 tools/libxl/libxl_create.c   | 18 +-
 tools/libxl/libxl_dm.c   |  2 +-
 tools/libxl/libxl_dom.c  |  7 ---
 tools/libxl/libxl_internal.h |  3 ++-
 tools/libxl/libxl_types.idl  |  1 +
 tools/libxl/libxl_x86.c  |  3 +--
 tools/libxl/xl_cmdimpl.c |  9 ++---
 11 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index ba78374..f62d9f2 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1333,6 +1333,25 @@ The viridian option can be specified as a boolean. A value of true (1)
 is equivalent to the list [ "defaults" ], and a value of false (0) is
 equivalent to an empty list.
 
+=item B
+
+Turns on or off the exposure of VMware cpuid.  The number is
+VMware's hardware version number, where 0 is off.  A number >= 7
+is needed to enable exposure of VMware cpuid.
+
+If not zero it changes the default VGA to VMware's VGA.
+
+The hardware version number (vmware_hwver) come from VMware config files.
+
+=over 4
+
+In a .vmx it is virtualHW.version
+
+In a .ovf it is part of the value of vssd:VirtualSystemType.
+For vssd:VirtualSystemType == vmx-07, vmware_hwver = 7.
+
+=back
+
 =back
 
 =head3 Emulated VGA Graphics Device
@@ -1372,10 +1391,14 @@ later (e.g. Windows XP onwards) then you should enable this.
 stdvga supports more video ram and bigger resolutions than Cirrus.
 This option is deprecated, use vga="stdvga" instead.
 
+The deprecated B prevents the usage of vmware by default
+if B is non-zero.
+
 =item B
 
 Selects the emulated video card (none|stdvga|cirrus|qxl|vmware).
-The default is cirrus.
+The default is cirrus unless B is non-zero in which case it
+is vmware.
 
 In general, QXL should work with the Spice remote display protocol
 for acceleration, and QXL driver is necessary in guest in this case.
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index a7079a1..40ff6ba 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -64,7 +64,7 @@ int xc_domain_create(xc_interface *xch,
 memset(&config, 0, sizeof(config));
 
 #if defined (__i386) || defined(__x86_64__)
-/* No arch-specific configuration for now */
+/* No arch-specific default configuration for now */
 #elif defined (__arm__) || defined(__aarch64__)
 config.gic_version = XEN_DOMCTL_CONFIG_GIC_DEFAULT;
 config.nr_spis = 0;
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index a6eb2df..c154246 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -4930,12 +4930,14 @@ int libxl_get_memory_target(libxl_ctx *ctx, uint32_t domid,
 }
 
 int libxl_domain_need_memory(libxl_ctx *ctx, libxl_domain_build_info *b_info,
+ libxl_domain_create_info *c_info,
  uint32_t *need_memkb)
 {
 GC_INIT(ctx);
 int rc;
 
-rc = libxl__domain_build_info_setdefault(gc, b_info);
+rc = libxl__domain_build_info_setdefault(gc, b_info,
+ c_info->vmware_hwver != 0);
 if (rc) goto out;
 
 *need_memkb = b_info->target_memkb;
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 007a211..61d89be 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -1099,6 +1099,7 @@ int libxl_get_memory_target(libxl_ctx *ctx, uint32_t domid, uint32_t *out_target
  */
 /* how much free memory in the system a domain needs to be built */
 int libxl_domain_need_memory(libxl_ctx *ctx, libxl_domain_build_info *b_info,
+ libxl_domain_create_info *c_info,
  uint32_t *need_memkb);
 /* how much free memory is available in the system */
 int libxl_get_free_memory(libxl_ctx *ctx, uint32_t *memkb);
d

[Xen-devel] [PATCH v10 05/10] xen: Add vmware_port support

2015-05-14 Thread Don Slutz
This includes adding is_vmware_port_enabled

This is a new xen_arch_domainconfig flag,
XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK.

This enables limited support of VMware's hyper-call.

This is both more complete than the support currently provided by
QEMU and/or KVM, and less complete.  The missing part requires QEMU
changes and has been left out until the QEMU patches are accepted upstream.

VMware's hyper-call is also known as VMware Backdoor I/O Port.

Note: this support does not depend on vmware_hw being non-zero.

Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially.  Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
"in (%dx),%al" will still do things, only AL part of EAX will be
changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
unchanged.
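
The register-width rule above can be illustrated with a small standalone
sketch.  It is not the Xen implementation: the helper names are hypothetical,
and the magic constant is written out as an assumption based on VMware's
published backdoor definitions (the patch itself takes BDOOR_PORT and
BDOOR_MAGIC from backdoor_def.h).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the patch uses BDOOR_PORT / BDOOR_MAGIC from backdoor_def.h. */
#define SKETCH_BDOOR_PORT  0x5658u
#define SKETCH_BDOOR_MAGIC 0x564D5868u

/*
 * Model of what the guest sees in EAX after an IN of 1, 2 or 4 bytes:
 * only the low `bytes` bytes of EAX are replaced by the result.
 */
uint32_t sketch_in_result(uint32_t old_eax, uint32_t result, unsigned int bytes)
{
    assert(bytes == 1 || bytes == 2 || bytes == 4);
    uint32_t mask = (bytes == 4) ? 0xFFFFFFFFu : ((1u << (bytes * 8)) - 1u);
    return (old_eax & ~mask) | (result & mask);
}

/* An OUT of any width leaves EAX unchanged. */
uint32_t sketch_out_result(uint32_t old_eax)
{
    return old_eax;
}

/* Only the backdoor port with the magic value in EAX is treated specially. */
int sketch_is_backdoor(uint32_t port, uint32_t eax)
{
    return port == SKETCH_BDOOR_PORT && eax == SKETCH_BDOOR_MAGIC;
}
```

This is why "in (%dx),%eax" is the form to use: with the narrower forms, part
of the magic value is still sitting in the upper bytes of EAX afterwards.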

An open source example of using this is:

http://open-vm-tools.sourceforge.net/

Which only uses "inl (%dx)".  Also

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458

Some of the best info is at:

https://sites.google.com/site/chitchatvmback/backdoor

Signed-off-by: Don Slutz 
---
v10:
Probably better as EOPNOTSUPP, as it is a configuration problem.
This function looks as if it should be static.
I would suggest putting vmport_register declaration in hvm.h ...
As indicated before, I don't think this is a good use case for a
domain creation flag.
  Switch to the new config way.
struct domain *d => struct domain *currd
Are you sure you don't want to zero the high halves of 64-bit ...
  Comment added.
   Then just have this handled into the default case.
  Reworked new_eax handling.
   is_hvm_domain(currd)
   And - why here rather than before the switch() or even right at the
   start of the function?
  Moved to start.
   With that, is it really correct that OUT updates the other registers
   just like IN? If so, this deserves a comment, so that readers won't
   think this is in error.
 All done in comment at start.


v9:
  Switch to x86_emulator to handle #GP code moved to next patch.
Can you explain why a HVM param isn't suitable here?
  Issue with changing QEMU on the fly.
  Andrew Cooper: My recommendation is still to use a creation flag
So no change.
Please move SVM's identical definition into ...
  Did this as #1.  No longer needed, but since the patch was ready
  I have included it.
-- Lots of questions about code that is no longer part of this patch. --
With this, is handling other than 32-bit in/out really
meaningful/correct?
  Added comment about this.
Since you can't get here for PV, I can't see what you need this.
  Changed to an ASSERT.
Why version 4?
  Added comment about this.
-- Several questions about register changes.
  Re-coded to use new_eax and set *val to this.
  Changed to generally use reg->_e..
These ei1/ei2 checks belong in the callers imo -
  Moved.
the "port" function parameter isn't even checked
  Add check for exact match.
If dropping the code is safe without also forbidding the
combination of nested and VMware emulation.
  Added a check forbidding the combination of nested and VMware.
  Mostly because, in the nested case, the nested virtualization code
  is the one to handle VMware stuff if needed, not the root one.  Also
  I am having issues testing Xen nested in Xen and using HVM.

v7:
  More on AMD in the commit message.
  Switch to only change 32bit part of registers, what VMware
does.
Too much logging and tracing.
  Dropped a lot of it.  This includes vmport_debug=

v6:
  Dropped the attempt to use svm_nextrip_insn_length via
  __get_instruction_length (added in v2).  Just always look
  at up to 15 bytes on AMD.

v5:
  we should make sure that svm_vmexit_gp_intercept is not executed for
  any other guest.
Added an ASSERT on is_vmware_port_enabled.
  magic integers?
Added #define for them.
  I am fairly certain that you need some brackets here.
Added brackets.

 xen/arch/x86/domain.c |   4 ++
 xen/arch/x86/hvm/hvm.c|   9 +++
 xen/arch/x86/hvm/vmware/Makefile  |   1 +
 xen/arch/x86/hvm/vmware/vmport.c  | 143 ++
 xen/include/asm-x86/hvm/domain.h  |   3 +
 xen/include/asm-x86/hvm/hvm.h |   2 +
 xen/include/public/arch-x86/xen.h |   4 ++
 7 files changed, 166 insertions(+)
 create mode 100644 xen/arch/x86/hvm/vmware/vmport.c

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index bc3d3a5..153048a 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -519,7 +519,11 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags,
 (domcr_flags & DOMCRF_hap);
 d->arch.hvm_domain.mem_sharing_enabled = 0;
 if ( config )
+{
 d->arch.hvm_domain.vmware_hwver = config->vmwar

[Xen-devel] [PATCH v10 09/10] Add xentrace to vmware_port

2015-05-14 Thread Don Slutz
Also added missing TRAP_DEBUG & VLAPIC.

Signed-off-by: Don Slutz 
Acked-by: Ian Campbell 
---
v10:
  Added Acked-by: Ian Campbell
  Added back in the trace point calls.

Why is cmd in this patch?
  Because the trace points use it.

v9:
  Dropped unneeded VMPORT_UNHANDLED, VMPORT_DECODE.

v7:
  Dropped some of the new traces.
  Added HVMTRACE_ND7.

v6:
  Dropped the attempt to use svm_nextrip_insn_length via
  __get_instruction_length (added in v2).  Just always look
  at up to 15 bytes on AMD.

v5:
  exitinfo1 is used twice.
Fixed.

 tools/xentrace/formats   |  5 +
 xen/arch/x86/hvm/io.c|  3 +++
 xen/arch/x86/hvm/vmware/vmport.c | 17 ++---
 xen/include/asm-x86/hvm/trace.h  | 22 ++
 xen/include/public/trace.h   |  3 +++
 5 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index 5d7b72a..eec65f4 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -79,6 +79,11 @@
 0x00082020  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  INTR_WINDOW [ value = 0x%(1)08x ]
 0x00082021  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  NPF [ gpa = 0x%(2)08x%(1)08x mfn = 0x%(4)08x%(3)08x qual = 0x%(5)04x p2mt = 0x%(6)04x ]
 0x00082023  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP[ vector = 0x%(1)02x ]
+0x00082024  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  TRAP_DEBUG  [ exit_qualification = 0x%(1)08x ]
+0x00082025  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VLAPIC
+0x00082026  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_HANDLED   [ cmd = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
+0x00082027  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_IGNORED   [ port = %(1)d eax = 0x%(2)08x ebx = 0x%(3)08x ecx = 0x%(4)08x edx = 0x%(5)08x esi = 0x%(6)08x edi = 0x%(7)08x ]
+0x00082028  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  VMPORT_QEMU  [ eax = 0x%(1)08x ebx = 0x%(2)08x ecx = 0x%(3)08x edx = 0x%(4)08x esi = 0x%(5)08x edi = 0x%(6)08x ]
 
 0x0010f001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_map  [ domid = %(1)d ]
 0x0010f002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  page_grant_unmap[ domid = %(1)d ]
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index 7684cf0..6a9cfb0 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -206,6 +206,9 @@ void hvm_io_assist(ioreq_t *p)
 regs->_edx = vr->edx;
 regs->_esi = vr->esi;
 regs->_edi = vr->edi;
+HVMTRACE_ND(VMPORT_QEMU, 0, 1/*cycles*/, 6,
+p->data, regs->_ebx, regs->_ecx,
+regs->_edx, regs->_esi, regs->_edi);
 }
 }
 if ( vio->io_size == 4 ) /* Needs zero extension. */
diff --git a/xen/arch/x86/hvm/vmware/vmport.c b/xen/arch/x86/hvm/vmware/vmport.c
index 995031c..408e14f 100644
--- a/xen/arch/x86/hvm/vmware/vmport.c
+++ b/xen/arch/x86/hvm/vmware/vmport.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "backdoor_def.h"
 
@@ -35,6 +36,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
 if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
 {
 uint32_t new_eax = ~0u;
+uint16_t cmd = regs->_ecx;
 uint64_t value;
 struct vcpu *curr = current;
 struct domain *currd = curr->domain;
@@ -46,7 +48,7 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
  * leaving the high 32-bits unchanged, unlike what one would
  * expect to happen.
  */
-switch ( regs->_ecx & 0x )
+switch ( cmd )
 {
 case BDOOR_CMD_GETMHZ:
 new_eax = currd->arch.tsc_khz / 1000;
@@ -118,11 +120,20 @@ static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t *val)
 /* Let backing DM handle */
 return X86EMUL_UNHANDLEABLE;
 }
+HVMTRACE_ND7(VMPORT_HANDLED, 0, 0/*cycles*/, 7,
+ cmd, new_eax, regs->_ebx, regs->_ecx,
+ regs->_edx, regs->_esi, regs->_edi);
 if ( dir == IOREQ_READ )
 *val = new_eax;
 }
-else if ( dir == IOREQ_READ )
-*val = ~0u;
+else
+{
+HVMTRACE_ND7(VMPORT_IGNORED, 0, 0/*cycles*/, 7,
+ port, regs->_eax, regs->_ebx, regs->_ecx,
+ regs->_edx, regs->_esi, regs->_edi);
+if ( dir == IOREQ_READ )
+*val = ~0u;
+}
 
 return X86EMUL_OKAY;
 }
diff --git a/xen/include/asm-x86/hvm/trace.h b/xen/include/asm-x86/hvm/trace.h
index de802a6..0ad805f 100644
--- a/xen/include/asm-x86/hvm/trace.h
+++ b/xen/include/asm-x86/hvm/trace.h
@@ -54,6 +54,9 @@
 #define DO_TRC_HVM_TRAP DEFAULT_HVM_MISC
 #define DO_TRC_HVM_TRAP_DEBUG   DEFAULT_HVM_MISC
 #define DO_TRC_HVM_VLAPIC   DEFAULT_HVM_MISC
+#define DO_TRC_HVM_VMPORT_HANDLE

[Xen-devel] [PATCH v10 06/10] xen: Add ring 3 vmware_port support

2015-05-14 Thread Don Slutz
Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
to port 0x5658 specially.  Note: since many operations return data
in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
"in (%dx),%al" will still do things, only AL part of EAX will be
changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
unchanged.

This instruction is allowed to be used from ring 3.  To
support this the vmexit for GP needs to be enabled.  I have not
fully tested that nested HVM is doing the right thing for this.

Enable no-fault of pio in x86_emulate for VMware port

Also adjust the emulation registers after doing a VMware
backdoor operation.

Add new routine hvm_emulate_one_gp() to be used by the #GP fault
handler.
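
The idea behind a separate ops table for the #GP path can be sketched in
isolation.  The types and names below are hypothetical and much simpler than
struct x86_emulate_ops; the sketch only shows the pattern visible in the diff,
where the #GP table keeps the I/O hooks but installs write handlers that
always report an exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes, loosely mirroring X86EMUL_OKAY / X86EMUL_EXCEPTION. */
enum { SK_OKAY = 0, SK_EXCEPTION = 1 };

/* Hypothetical, reduced ops table. */
struct sk_emulate_ops {
    int (*write)(unsigned long offset, const void *data, unsigned int bytes);
    int (*read_io)(unsigned int port, unsigned int bytes, uint32_t *val);
};

int sk_write_normal(unsigned long offset, const void *data, unsigned int bytes)
{
    (void)offset; (void)data; (void)bytes;
    return SK_OKAY; /* a real handler would perform the memory write */
}

/* #GP path: the write hook simply reports an exception. */
int sk_write_gp(unsigned long offset, const void *data, unsigned int bytes)
{
    (void)offset; (void)data; (void)bytes;
    return SK_EXCEPTION;
}

int sk_read_io(unsigned int port, unsigned int bytes, uint32_t *val)
{
    (void)port; (void)bytes;
    assert(val != NULL);
    *val = ~0u; /* placeholder result */
    return SK_OKAY;
}

const struct sk_emulate_ops sk_ops_normal = {
    .write = sk_write_normal, .read_io = sk_read_io,
};
const struct sk_emulate_ops sk_ops_gp = {
    .write = sk_write_gp, .read_io = sk_read_io,
};

/* One emulation step dispatches through whichever table the caller chose. */
int sk_emulate_write(const struct sk_emulate_ops *ops)
{
    uint8_t byte = 0;
    return ops->write(0, &byte, sizeof(byte));
}
```

Choosing the table at the call site keeps the emulator core unchanged: the
same step function serves both paths, and only the hooks differ.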

Some of the best info is at:

https://sites.google.com/site/chitchatvmback/backdoor

Signed-off-by: Don Slutz 
---
v10:
   Re-worked to be simpler.

v9:
   Split #GP handling (or skipping of #GP) code out of previous
   patch to help with the review process.
   Switch to x86_emulator to handle #GP
   I think the hvm_emulate_ops_gp() covers all needed ops.  Not able to validate
   all paths through _hvm_emulate_one().

 xen/arch/x86/hvm/emulate.c | 54 --
 xen/arch/x86/hvm/svm/svm.c | 26 
 xen/arch/x86/hvm/svm/vmcb.c|  2 ++
 xen/arch/x86/hvm/vmware/vmport.c   | 11 +++
 xen/arch/x86/hvm/vmx/vmcs.c|  2 ++
 xen/arch/x86/hvm/vmx/vmx.c | 37 +++
 xen/arch/x86/x86_emulate/x86_emulate.c | 13 +++-
 xen/arch/x86/x86_emulate/x86_emulate.h |  5 
 xen/include/asm-x86/hvm/emulate.h  |  2 ++
 xen/include/asm-x86/hvm/hvm.h  |  1 +
 10 files changed, 150 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index ac9c9d6..d5e6468 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -803,6 +803,27 @@ static int hvmemul_wbinvd_discard(
 return X86EMUL_OKAY;
 }
 
+static int hvmemul_write_gp(
+unsigned int seg,
+unsigned long offset,
+void *p_data,
+unsigned int bytes,
+struct x86_emulate_ctxt *ctxt)
+{
+return X86EMUL_EXCEPTION;
+}
+
+static int hvmemul_cmpxchg_gp(
+unsigned int seg,
+unsigned long offset,
+void *old,
+void *new,
+unsigned int bytes,
+struct x86_emulate_ctxt *ctxt)
+{
+return X86EMUL_EXCEPTION;
+}
+
 static int hvmemul_cmpxchg(
 enum x86_segment seg,
 unsigned long offset,
@@ -1356,6 +1377,13 @@ static int hvmemul_invlpg(
 return rc;
 }
 
+static int hvmemul_vmport_check(
+unsigned int first_port,
+struct x86_emulate_ctxt *ctxt)
+{
+return vmport_check_port(first_port);
+}
+
 static const struct x86_emulate_ops hvm_emulate_ops = {
 .read  = hvmemul_read,
 .insn_fetch= hvmemul_insn_fetch,
@@ -1379,7 +1407,8 @@ static const struct x86_emulate_ops hvm_emulate_ops = {
 .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
 .get_fpu   = hvmemul_get_fpu,
 .put_fpu   = hvmemul_put_fpu,
-.invlpg= hvmemul_invlpg
+.invlpg= hvmemul_invlpg,
+.vmport_check  = hvmemul_vmport_check,
 };
 
 static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
@@ -1405,7 +1434,22 @@ static const struct x86_emulate_ops hvm_emulate_ops_no_write = {
 .inject_sw_interrupt = hvmemul_inject_sw_interrupt,
 .get_fpu   = hvmemul_get_fpu,
 .put_fpu   = hvmemul_put_fpu,
-.invlpg= hvmemul_invlpg
+.invlpg= hvmemul_invlpg,
+.vmport_check  = hvmemul_vmport_check,
+};
+
+static const struct x86_emulate_ops hvm_emulate_ops_gp = {
+.read  = hvmemul_read,
+.insn_fetch= hvmemul_insn_fetch,
+.write = hvmemul_write_gp,
+.cmpxchg   = hvmemul_cmpxchg_gp,
+.read_segment  = hvmemul_read_segment,
+.write_segment = hvmemul_write_segment,
+.read_io   = hvmemul_read_io,
+.write_io  = hvmemul_write_io,
+.inject_hw_exception = hvmemul_inject_hw_exception,
+.inject_sw_interrupt = hvmemul_inject_sw_interrupt,
+.vmport_check  = hvmemul_vmport_check,
 };
 
 static int _hvm_emulate_one(struct hvm_emulate_ctxt *hvmemul_ctxt,
@@ -1522,6 +1566,12 @@ int hvm_emulate_one(
 return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops);
 }
 
+int hvm_emulate_one_gp(
+struct hvm_emulate_ctxt *hvmemul_ctxt)
+{
+return _hvm_emulate_one(hvmemul_ctxt, &hvm_emulate_ops_gp);
+}
+
 int hvm_emulate_one_no_write(
 struct hvm_emulate_ctxt *hvmemul_ctxt)
 {
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 6734fb6..62baf3c 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -2119,6 +2119,28 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
 return;
 }
 
+static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
+struct vcpu *v)
+{
+struct vmcb_struct *vmcb = v->arch.hvm_svm

[Xen-devel] [PATCH v10 01/10] tools: Add vga=vmware

2015-05-14 Thread Don Slutz
This allows use of QEMU's VMware emulated video card

Signed-off-by: Don Slutz 
---
v10: New at v10.

  Was part of "tools: Add vmware_hwver support"

 docs/man/xl.cfg.pod.5   | 2 +-
 tools/libxl/libxl.h | 6 ++
 tools/libxl/libxl_dm.c  | 8 
 tools/libxl/libxl_types.idl | 1 +
 tools/libxl/xl_cmdimpl.c| 2 ++
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 8e4154f..ba78374 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1374,7 +1374,7 @@ This option is deprecated, use vga="stdvga" instead.
 
 =item B
 
-Selects the emulated video card (none|stdvga|cirrus|qxl).
+Selects the emulated video card (none|stdvga|cirrus|qxl|vmware).
 The default is cirrus.
 
 In general, QXL should work with the Spice remote display protocol
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 2ed7194..007a211 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -192,6 +192,12 @@
  * is not present, instead of ERROR_INVAL.
  */
 #define LIBXL_HAVE_ERROR_DOMAIN_NOTFOUND 1
+
+/*
+ * The libxl_vga_interface_type has the type for vmware.
+ */
+#define LIBXL_HAVE_LIBXL_VGA_INTERFACE_TYPE_VMWARE 1
+
 /*
  * libxl ABI compatibility
  *
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 0c6408d..9a06f9b 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -251,6 +251,9 @@ static char ** libxl__build_device_model_args_old(libxl__gc *gc,
 case LIBXL_VGA_INTERFACE_TYPE_NONE:
 flexarray_append_pair(dm_args, "-vga", "none");
 break;
+case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+flexarray_append_pair(dm_args, "-vga", "vmware");
+break;
 case LIBXL_VGA_INTERFACE_TYPE_QXL:
 break;
 }
@@ -633,6 +636,11 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
 GCSPRINTF("qxl-vga,vram_size_mb=%"PRIu64",ram_size_mb=%"PRIu64,
 (b_info->video_memkb/2/1024), (b_info->video_memkb/2/1024) ) );
 break;
+case LIBXL_VGA_INTERFACE_TYPE_VMWARE:
+flexarray_append_pair(dm_args, "-device",
+GCSPRINTF("vmware-svga,vgamem_mb=%d",
+libxl__sizekb_to_mb(b_info->video_memkb)));
+break;
 }
 
 if (b_info->u.hvm.boot) {
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 65d479f..9d6ca45 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -184,6 +184,7 @@ libxl_vga_interface_type = Enumeration("vga_interface_type", [
 (2, "STD"),
 (3, "NONE"),
 (4, "QXL"),
+(5, "VMWARE"),
 ], init_val = "LIBXL_VGA_INTERFACE_TYPE_CIRRUS")
 
 libxl_vendor_device = Enumeration("vendor_device", [
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 373aa37..0e44b12 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -2117,6 +2117,8 @@ skip_vfb:
 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_NONE;
 } else if (!strcmp(buf, "qxl")) {
 b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_QXL;
+} else if (!strcmp(buf, "vmware")) {
+b_info->u.hvm.vga.kind = LIBXL_VGA_INTERFACE_TYPE_VMWARE;
 } else {
 fprintf(stderr, "Unknown vga \"%s\" specified\n", buf);
 exit(1);
-- 
1.8.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] libxl: assigned a default ssid_label (XSM label) to guests

2015-05-14 Thread Daniel De Graaf

On 05/14/2015 07:54 AM, Ian Campbell wrote:

On Thu, 2015-05-14 at 12:21 +0100, Julien Grall wrote:

Hi Ian,

On 14/05/15 11:33, Ian Campbell wrote:

system_u:system_r:domU_t is defined in the default policy and makes as
much sense as anything for a default.


So you rule out the possibility to run an unlabelled domain? This is
possible if the policy explicitly authorized it. That's a significant
change in the libxl behavior.


I didn't realise this was a possibility, wouldn't such a domain be
system_u:system_r:unlabeled_t or something?


Yes.  FLASK resolves any numeric SID value that is unused (including zero)
to the unlabeled sid (defined in tools/flask/policy/policy/initial_sids
to be system_u:system_r:unlabeled_t).  Because this could be the result of
an error (in the hypervisor, toolstack, etc), the use of unlabeled_t for
real objects is discouraged in SELinux and XSM/FLASK.
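
As an illustration of the fallback described above, a minimal sketch of the
lookup follows.  The table, the SID numbers and the function name are made up
for the example and do not reflect the real FLASK sidtab; only the rule that
an unused SID, including zero, resolves to the unlabeled context is taken from
the explanation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SID table; real SIDs and contexts come from the loaded policy. */
struct sk_sid_entry {
    uint32_t sid;
    const char *context;
};

const struct sk_sid_entry sk_sid_table[] = {
    { 1, "system_u:system_r:dom0_t" },
    { 2, "system_u:system_r:domU_t" },
};

const char *const sk_unlabeled = "system_u:system_r:unlabeled_t";

/* Resolve a SID to a context; any unused value (including 0) is unlabeled. */
const char *sk_sid_to_context(uint32_t sid)
{
    size_t n = sizeof(sk_sid_table) / sizeof(sk_sid_table[0]);

    for (size_t i = 0; i < n; i++)
        if (sk_sid_table[i].sid == sid)
            return sk_sid_table[i].context;
    return sk_unlabeled;
}
```

A domain created with ssidref 0 therefore does not fail the lookup; it simply
lands on unlabeled_t, and whether it can do anything depends on what the
policy allows for that type.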


Note that this won't override a label which is just '' (i.e. an empty
string rather than NULL). I don't know if that results in the behaviour
you want.

When this was discussed before (in a conversation Wei started, but which
I can't find, maybe it was IRC rather than email) it seemed that
consensus was that by default things should Just Work as if XSM weren't
disabled, which is what I've implemented here.


I agree that this is a useful feature.  It is possible to extend the
initial_sids list with new entries that are used by the toolstack instead
of by the hypervisor, which could be used to define SECINITSID_DOMU as the
default label for a domU created by a toolstack without a label.  This is
better than hard-coding a string that may not be valid in a given security
policy, and it can be associated with a label that better reflects how the
policy wishes to treat domains with an "incomplete" configuration file.

The header file defining these SIDs is buried in the hypervisor source
tree (xen/xsm/flask/include/flask.h) and is only generated during a build
with XSM enabled.  It may be simpler to define the value in a shared header
and add a BUILD_BUG_ON somewhere in the flask code to check for mismatches.
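
A rough sketch of the compile-time check suggested above, using C11
_Static_assert in place of Xen's BUILD_BUG_ON so that it builds on its own.
Both macro names, the accessor and the value 12 are placeholders: the real
number would be whatever the new entry is assigned in the generated flask.h.

```c
#include <assert.h>

/* Would live in a shared public header visible to the toolstack (placeholder). */
#define SK_SHARED_SECINITSID_DOMU 12

/* Stands in for the value generated into flask.h by an XSM-enabled build. */
#define SK_GENERATED_SECINITSID_DOMU 12

/* Compilation fails here if the two definitions ever drift apart. */
_Static_assert(SK_SHARED_SECINITSID_DOMU == SK_GENERATED_SECINITSID_DOMU,
               "shared SECINITSID_DOMU does not match generated flask.h");

/* Hypothetical accessor a toolstack might use as the default domU SID. */
unsigned int sk_default_domu_sid(void)
{
    return SK_SHARED_SECINITSID_DOMU;
}
```

The toolstack then only needs the shared header, while the check itself sits
in the flask code where the generated header is available.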


IMHO, having a default policy doesn't mean libxl should set a default
ssid to make XSM transparent to the user.

The explicit ssid makes clear that the guest is using a ssid foo and if
it's not provided then it will fail to boot.

Setting a default value may hide a bigger issue and apply the wrong
policy when the user forgot to set up an ssid.


Does domU_t really have so many privileges that this is an issue? I'd
expect it to be almost totally privilegeless apart from things which any
domU needs.

The benefits of XSM seem to mainly apply to the various service domains.

Daniel, do you have an opinion here?


In the example policy, domU_t should have the same level of access as a
normal domain (i.e. not device model stubdom) has with XSM disabled.

The only real difference is that the example policy does not allow any
domain to act as a device model to domU_t; it uses domHVM_t and dm_dom_t
for this.  If you want to use configurations with device model stubdoms
that also do not assign labels in the configuration, this distinction
will need to be removed.


This change required moving the call to domain_create_info_setdefault
to be before the ssid_label is translated into ssidref, which also
moves it before some other stuff which consumes things from c_info,
which is correct since setdefault should always be called first. Apart
from the SSID handling there should be no functional change (since
setdefault doesn't actually act on anything which that other stuff
uses).

There is no need to set exec_ssid_label since the default is to leave
the domain using the ssid_label after build.


By setting a ssid label, libxl will print a new warning on Xen not built
with XSM which will confuse the user:

libxl: warning: libxl_create.c:813:initiate_domain_create: XSM Disabled:
init_seclabel not supported


Ah, I didn't try that case. I'll see if I can work out a way to suppress
that warning.


I would be fine with removing that warning completely; someone trying to
use XSM without it enabled will likely be able to figure out the problem
without this error, likely by noticing the "-" labels in xl list -v/-Z.

->8-
Example patch adding SECINITSID_DOMU, for testing/reference.

---
diff --git a/tools/flask/policy/policy/initial_sids b/tools/flask/policy/policy/initial_sids
index 5de0bbf..48aad17 100644
--- a/tools/flask/policy/policy/initial_sids
+++ b/tools/flask/policy/policy/initial_sids
@@ -12,3 +12,4 @@ sid irq gen_context(system_u:object_r:irq_t,s0)
 sid iomem gen_context(system_u:object_r:iomem_t,s0)
 sid ioport gen_context(system_u:object_r:ioport_t,s0)
 sid device gen_context(system_u:object_r:device_t,s0)
+sid domU gen_context(system_u:system_r:domU_t,s0)
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f0da7dc..0c3d4ed 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_cre

Re: [Xen-devel] [PATCH v9 07/13] tools: Add vmware_port support

2015-05-14 Thread Don Slutz
On 03/03/15 09:23, Ian Campbell wrote:
> On Mon, 2015-02-16 at 18:05 -0500, Don Slutz wrote:
> 

I do not see that I ever replied to this :(

>> > +=item B
>> > +
>> > +Turns on or off the exposure of VMware port.  This is known as
>> > +vmport in QEMU.  Also called VMware Backdoor I/O Port.  Not all
>> > +defined VMware backdoor commands are implemented.  All of the
>> > +ones that Linux kernel uses are defined.
>> > +
>> > +if vmware_port is not specified in the config file, let vmware_hwver != 0
>> > +be the default value.  This means that only vmware_hwver = 7 needs to
>> > +be specified to enable both features.
> "If..." at the start of the sentence.
> 
> But I think a clearer wording, which avoids users having to know C
> syntax would be:
> 
> Defaults to enabled if vmware_hwver is non-zero (i.e. enabled)
> otherwise defaults to disabled.

Will use this.

> 
> 
> I think the thing about setting hwver to 7 should be in the hwver space,
> as in words to the effect that setting that option enabled vmware_port
> support by default.
> 
> Also, why is 7 special? The patch which added vmware_hwver didn't seem
> to suggest that vmware_hwver = 7 was what was needed.
> 

7 is special because that is when VMware started providing CPUID leaves.


>> > +Note: both vmware_port and nestedhvm cannot be specified at the
>> > +same time.
> Drop the "both" here (and in the commit message).
> 

Will do.

>> > +
>> >  =back
>> >  
>> >  =head3 Emulated VGA Graphics Device
>> > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
>> > index 0c27e5c..792b569 100644
>> > --- a/tools/libxl/libxl.h
>> > +++ b/tools/libxl/libxl.h
>> > @@ -173,6 +173,11 @@
>> >  #define LIBXL_HAVE_BUILDINFO_HVM_VMWARE_HWVER 1
>> >  
>> >  /*
>> > + * libxl_domain_create_info has the vmware_port field.
>> > + */
>> > +#define LIBXL_HAVE_CREATEINFO_VMWARE_PORT 1
> I think this can be part of the umbrella HAVE I asked you to add
> earlier. This probably means the define should be added alongside the
> final such interface addition in this series.
> 

Will switch to only 1 in this patch.

>> > +
>> > +/*
>> >   * libxl ABI compatibility
>> >   *
>> >   * The only guarantee which libxl makes regarding ABI compatibility
>> > diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
>> > index 8c910c4..439164a 100644
>> > --- a/tools/libxl/libxl_create.c
>> > +++ b/tools/libxl/libxl_create.c
>> > @@ -26,7 +26,8 @@
>> >  #include 
>> >  
>> >  int libxl__domain_create_info_setdefault(libxl__gc *gc,
>> > - libxl_domain_create_info *c_info)
>> > + libxl_domain_create_info *c_info,
>> > + bool vmware_port_default)
> The need to pass this makes me think that vmware_hwver should probably
> be in create info not build info, since the soonest it is needed is
> create time. Unless that changes based on the discussion about how such
> things should be done in the other thread of course.
> 

Moved both to c_info.

> (the split is stupid and annoying, but we are stuck with it)
> 
>> > @@ -876,6 +881,13 @@ static void initiate_domain_create(libxl__egc *egc,
>> >  ret = libxl__domain_build_info_setdefault(gc, &d_config->b_info);
>> >  if (ret) goto error_out;
>> >  
>> > +if (d_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
>> > +libxl_defbool_val(d_config->b_info.u.hvm.nested_hvm) &&
>> > +libxl_defbool_val(d_config->c_info.vmware_port)) {
>> > +LOG(ERROR, "Both vmware_port and nestedhvm can not be enabled\n");
> "vmware_port and nestedhvm cannot be enabled simultaneously"
> 
>> > diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
>> > index b05fa73..c27f9a4 100644
>> > --- a/tools/libxl/libxl_dm.c
>> > +++ b/tools/libxl/libxl_dm.c
>> > @@ -1061,7 +1061,9 @@ void libxl__spawn_stub_dm(libxl__egc *egc, libxl__stub_dm_spawn_state *sdss)
>> >  dm_config->c_info.run_hotplug_scripts =
>> >  guest_config->c_info.run_hotplug_scripts;
>> >  
>> > -ret = libxl__domain_create_info_setdefault(gc, &dm_config->c_info);
>> > +ret = libxl__domain_create_info_setdefault(gc, &dm_config->c_info,
>> > +   dm_config->c_info.type == LIBXL_DOMAIN_TYPE_HVM &&
>> > +   dm_config->b_info.u.hvm.vmware_hwver);
> Isn't this enabling vmware support for the stubdom itself (not the
> target domain)? I think this should just be false here.
> 

Yes, will drop.

   -Don Slutz


> Ian.
> 
> 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0:

2015-05-14 Thread Sander Eikelenboom
Sorry for the resend, I messed up the To's and From's.

Hi Konrad / David,

One big snip on this thread, got some more debug info, hopefully this will 
lead to something:

On a working kernel (with the two seemingly unrelated patches reverted) I get:

[0.717796] pcifront pci-0: Allocated pdev @ 0x880019e11780 
pdev->sh_info @ 0x880018f58000
[0.717848] pcifront pci-0: ?!?!? before alloc gntref: 0
[0.717871] pcifront pci-0: ?!?!? after alloc gntref: 8
[0.717892] pcifront pci-0: ?!?!? before alloc evtchn: -1
[0.717915] pcifront pci-0: ?!?!? after alloc evtchn: 17
[0.717984] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
[0.721640] pcifront pci-0: publishing successful!
[0.723684] usbcore: registered new interface driver udlfb
[0.724664] xen:xen_evtchn: Event-channel device installed
[0.726597] pcifront pci-0: Installing PCI frontend
[0.726853] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[0.727059] pcifront pci-0: Creating PCI Frontend Bus :00
[0.727363] pcifront pci-0: PCI host bridge to bus :00
[0.727391] pci_bus :00: root bus resource [io  0x-0x]
[0.727417] pci_bus :00: root bus resource [mem 
0x-0x]
[0.727452] pci_bus :00: root bus resource [bus 00-ff]
[0.727475] pci_bus :00: scanning bus
[0.727503] pcifront pci-0: read dev=:00:00.0 - offset 0 size 4
[0.728253] Linux agpgart interface v0.103
[0.728387] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, 
margin is 60 seconds).
[0.728474] [drm] Initialized drm 1.1.0 20060810
[0.728551] [drm] radeon kernel modesetting enabled.
[0.730319] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 143164178555170  ns_timeout: 
1431641787541235000 evtchn:17 gnt_ref:8
[0.730319] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
[0.730319] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 
size:4
[0.730319] pcifront pci-0: read got back value 3f6
[0.738845] pcifront pci-0: read dev=:00:00.0 - offset e size 1
[0.744976] brd: module loaded
[0.745204] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 1431641785562852000  ns_timeout: 
143164178755258 evtchn:17 gnt_ref:8
[0.745204] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:14 size:1
[0.745204] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:14 
size:1
[0.745204] pcifront pci-0: read got back value 0
[0.749204] pcifront pci-0: read dev=:00:00.0 - offset 6 size 2
[0.750155] loop: module loaded
[0.752527] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 1431641785570841000  ns_timeout: 
1431641787562917000 evtchn:17 gnt_ref:8
[0.752527] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:6 size:2
[0.752527] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:6 
size:2
[0.752527] pcifront pci-0: read got back value 210
[0.757187] pcifront pci-0: read dev=:00:00.0 - offset 34 size 1


Whereas in the non-working situation I get:

[0.751244] pcifront pci-0: Allocated pdev @ 0x880019ec2e00 
pdev->sh_info @ 0x88001aa51000
[0.751295] pcifront pci-0: ?!?!? before alloc gntref: 0
[0.751315] pcifront pci-0: ?!?!? after alloc gntref: 8
[0.751334] pcifront pci-0: ?!?!? before alloc evtchn: -1
[0.751355] pcifront pci-0: ?!?!? after alloc evtchn: 17
[0.751422] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
[0.755215] pcifront pci-0: publishing successful!
[0.757341] usbcore: registered new interface driver udlfb
[0.758365] xen:xen_evtchn: Event-channel device installed
[0.760419] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[0.760819] pcifront pci-0: Installing PCI frontend
[0.761518] pcifront pci-0: Creating PCI Frontend Bus :00
[0.761684] pcifront pci-0: PCI host bridge to bus :00
[0.761710] pci_bus :00: root bus resource [io  0x-0x]
[0.761733] pci_bus :00: root bus resource [mem 
0x-0x]
[0.761763] pci_bus :00: root bus resource [bus 00-ff]
[0.761783] pci_bus :00: scanning bus
[0.761805] pcifront pci-0: read dev=:00:00.0 - offset 0 size 4
[0.767207] Linux agpgart interface v0.103
[0.767362] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, 
margin is 60 seconds).
[0.767439] [drm] Initialized drm 1.1.0 20060810
[0.767515] [drm] radeon kernel modesetting enabled.
[0.766948] pcifront pci-0: pciback not responding!!! irq:31 
irq_flags:880019ec0028 ns: 1431641983026498000  ns_timeout: 
1431641983026497000 evtchn:0 gnt_ref:0
[0.766948] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
[0.766948] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 
size:4
[0.766948] pcifront pci-0: other err read got ba

Re: [Xen-devel] Regression due to "device property: Make it possible to use secondary firmware nodes" Re: Xen-unstable + linux 4.1-mergewindow: problems with PV guest pci passthrough: pcifront pci-0:

2015-05-14 Thread Konrad Rzeszutek Wilk
Hi Konrad / David,

One big snip on this thread, got some more debug info, hopefully this will 
lead to something:

On a working kernel (with the two seemingly unrelated patches reverted) I get:

[0.717796] pcifront pci-0: Allocated pdev @ 0x880019e11780 
pdev->sh_info @ 0x880018f58000
[0.717848] pcifront pci-0: ?!?!? before alloc gntref: 0
[0.717871] pcifront pci-0: ?!?!? after alloc gntref: 8
[0.717892] pcifront pci-0: ?!?!? before alloc evtchn: -1
[0.717915] pcifront pci-0: ?!?!? after alloc evtchn: 17
[0.717984] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
[0.721640] pcifront pci-0: publishing successful!
[0.723684] usbcore: registered new interface driver udlfb
[0.724664] xen:xen_evtchn: Event-channel device installed
[0.726597] pcifront pci-0: Installing PCI frontend
[0.726853] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[0.727059] pcifront pci-0: Creating PCI Frontend Bus :00
[0.727363] pcifront pci-0: PCI host bridge to bus :00
[0.727391] pci_bus :00: root bus resource [io  0x-0x]
[0.727417] pci_bus :00: root bus resource [mem 
0x-0x]
[0.727452] pci_bus :00: root bus resource [bus 00-ff]
[0.727475] pci_bus :00: scanning bus
[0.727503] pcifront pci-0: read dev=:00:00.0 - offset 0 size 4
[0.728253] Linux agpgart interface v0.103
[0.728387] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, 
margin is 60 seconds).
[0.728474] [drm] Initialized drm 1.1.0 20060810
[0.728551] [drm] radeon kernel modesetting enabled.
[0.730319] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 143164178555170  ns_timeout: 
1431641787541235000 evtchn:17 gnt_ref:8
[0.730319] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
[0.730319] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 
size:4
[0.730319] pcifront pci-0: read got back value 3f6
[0.738845] pcifront pci-0: read dev=:00:00.0 - offset e size 1
[0.744976] brd: module loaded
[0.745204] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 1431641785562852000  ns_timeout: 
143164178755258 evtchn:17 gnt_ref:8
[0.745204] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:14 size:1
[0.745204] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:14 
size:1
[0.745204] pcifront pci-0: read got back value 0
[0.749204] pcifront pci-0: read dev=:00:00.0 - offset 6 size 2
[0.750155] loop: module loaded
[0.752527] pcifront pci-0: ?!?!? pciback responded !!! irq:31 
irq_flags:880019e100a8 ns: 1431641785570841000  ns_timeout: 
1431641787562917000 evtchn:17 gnt_ref:8
[0.752527] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:6 size:2
[0.752527] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:6 
size:2
[0.752527] pcifront pci-0: read got back value 210
[0.757187] pcifront pci-0: read dev=:00:00.0 - offset 34 size 1


Whereas in the non-working situation I get:

[0.751244] pcifront pci-0: Allocated pdev @ 0x880019ec2e00 
pdev->sh_info @ 0x88001aa51000
[0.751295] pcifront pci-0: ?!?!? before alloc gntref: 0
[0.751315] pcifront pci-0: ?!?!? after alloc gntref: 8
[0.751334] pcifront pci-0: ?!?!? before alloc evtchn: -1
[0.751355] pcifront pci-0: ?!?!? after alloc evtchn: 17
[0.751422] pcifront pci-0: ?!?!? bound evtchn:17 to irqhandler:-1 err:31
[0.755215] pcifront pci-0: publishing successful!
[0.757341] usbcore: registered new interface driver udlfb
[0.758365] xen:xen_evtchn: Event-channel device installed
[0.760419] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[0.760819] pcifront pci-0: Installing PCI frontend
[0.761518] pcifront pci-0: Creating PCI Frontend Bus :00
[0.761684] pcifront pci-0: PCI host bridge to bus :00
[0.761710] pci_bus :00: root bus resource [io  0x-0x]
[0.761733] pci_bus :00: root bus resource [mem 
0x-0x]
[0.761763] pci_bus :00: root bus resource [bus 00-ff]
[0.761783] pci_bus :00: scanning bus
[0.761805] pcifront pci-0: read dev=:00:00.0 - offset 0 size 4
[0.767207] Linux agpgart interface v0.103
[0.767362] Hangcheck: starting hangcheck timer 0.9.1 (tick is 180 seconds, 
margin is 60 seconds).
[0.767439] [drm] Initialized drm 1.1.0 20060810
[0.767515] [drm] radeon kernel modesetting enabled.
[0.766948] pcifront pci-0: pciback not responding!!! irq:31 
irq_flags:880019ec0028 ns: 1431641983026498000  ns_timeout: 
1431641983026497000 evtchn:0 gnt_ref:0
[0.766948] pcifront pci-0: ?!?!? op cmd:0 err:0 info:0 offset:0 size:4
[0.766948] pcifront pci-0: ?!?!? active_op cmd:0 err:0 info:0 offset:0 
size:4
[0.766948] pcifront pci-0: other err read got back err:  value: 0
[2.762062] pcifront pci

Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-14 Thread Chong Li
On Mon, May 11, 2015 at 1:57 AM, Jan Beulich  wrote:

> >>> On 11.05.15 at 00:04,  wrote:
> > On Fri, May 8, 2015 at 2:49 AM, Jan Beulich  wrote:
> >> >>> On 07.05.15 at 19:05,  wrote:
> >> > @@ -1110,6 +1113,67 @@ rt_dom_cntl(
> >> >  }
> >> >  spin_unlock_irqrestore(&prv->lock, flags);
> >> >  break;
> >> > +case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
> >> > +op->u.rtds.nr_vcpus = 0;
> >> > +spin_lock_irqsave(&prv->lock, flags);
> >> > +list_for_each( iter, &sdom->vcpu )
> >> > +vcpu_index++;
> >> > +spin_unlock_irqrestore(&prv->lock, flags);
> >> > +op->u.rtds.nr_vcpus = vcpu_index;
> >>
> >> Does dropping of the lock here and re-acquiring it below really work
> >> race free?
> >>
> >
> > Here, the lock is used in the same way as the ones in the two cases
> > above (XEN_DOMCTL_SCHEDOP_get/putinfo). So I think if race free
> > is guaranteed in that two cases, the lock in this case works race free
> > as well.
>
> No - the difference is that in the {get,put}info cases it is being
> acquired just once each.
>

I see. I changed it based on Dario's suggestions.

>
> >> > +vcpu_index = 0;
> >> > +spin_lock_irqsave(&prv->lock, flags);
> >> > +list_for_each( iter, &sdom->vcpu )
> >> > +{
> >> > +struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> >> > +
> >> > +local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
> >> > +local_sched[vcpu_index].period = svc->period / MICROSECS(1);
> >> > +local_sched[vcpu_index].index = vcpu_index;
> >>
> >> What use is this index to the caller? I think you rather want to tell it
> >> the vCPU number. That's especially also taking the use case of a
> >> get/set pair into account - unless you tell me that these indexes can
> >> never change, the indexes passed back into the set operation would
> >> risk to have become stale by the time the hypervisor processes the
> >> request.
> >>
> >
> > I don't quite understand what "stale" means. The array here
> > (local_sched[ ]) and the array (in libxc) that local_sched[ ] is copied
> > to are both used for this get operation only. When users set per-vcpu
> > parameters, there are also dedicated arrays for that set operation.
>
> Just clarify this for me (and maybe yourself): Is the vCPU number
> <-> vcpu_index mapping invariable for the lifetime of a domain?
> If it isn't, the vCPU for a particular vcpu_index during a "get"
> may be different from that for the same vcpu_index during a
> subsequent "set".
>

Here the vcpu_index means the vcpu_id. I'll use svc->vcpu.vcpu_id instead of
the vcpu_index in the next version.


>
> >> > +if( local_sched == NULL )
> >> > +{
> >> > +return -ENOMEM;
> >> > +}
> >> > +copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
> >> > +
> >> > +for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
> >> > +{
> >> > +vcpu_index = 0;
> >> > +spin_lock_irqsave(&prv->lock, flags);
> >> > +list_for_each( iter, &sdom->vcpu )
> >> > +{
> >> > +struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> >> > +if ( local_sched[i].index == vcpu_index )
> >> > +{
> >> > +if ( local_sched[i].period <= 0 || local_sched[i].budget <= 0 )
> >> > + return -EINVAL;
> >> > +
> >> > +svc->period = MICROSECS(local_sched[i].period);
> >> > +svc->budget = MICROSECS(local_sched[i].budget);
> >> > +break;
> >> > +}
> >> > +vcpu_index++;
> >> > +}
> >> > +spin_unlock_irqrestore(&prv->lock, flags);
> >> > +}
> >>
> >> Considering a maximum size guest, these two nested loops could
> >> require a couple of million iterations. That's too much without any
> >> preemption checks in the middle.
> >>
> >
> > The section protected by the lock is only the "list_for_each" loop, whose
> > running time is limited by the number of vcpus of a domain (32 at most).
>
> Since when is 32 the limit on the number of vCPU-s in a domain?
>

Based on Dario's suggestion, I'll use vcpu_id to locate the vcpu to set,
which costs much less time.
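
As a rough illustration of the vcpu_id-based lookup described above, here is a minimal standalone sketch in C. It is not the code from the patch: struct vcpu_sched, struct vcpu_param, us_to_ns() and set_vcpu_params() are hypothetical names, and the flat array only stands in for a per-domain vCPU array such as the d->vcpu[] mentioned in the thread. Indexing by id replaces the inner list walk, so the work is linear in the number of entries passed in.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU scheduler state, kept in nanoseconds. */
struct vcpu_sched {
    uint64_t period_ns;
    uint64_t budget_ns;
};

/* Hypothetical request entry, in microseconds, keyed by vCPU id. */
struct vcpu_param {
    uint32_t vcpuid;
    uint32_t period_us;
    uint32_t budget_us;
};

/* Convert microseconds to nanoseconds without overflowing 32 bits. */
static uint64_t us_to_ns(uint32_t us)
{
    return (uint64_t)us * 1000u;
}

/*
 * Apply nr requested settings to the vcpus[] array of size max_vcpus.
 * Each entry is located directly by its vcpuid, so no nested loop is needed.
 * Returns 0 on success, -EINVAL if any entry is out of range or zero.
 */
static int set_vcpu_params(struct vcpu_sched *vcpus, size_t max_vcpus,
                           const struct vcpu_param *req, size_t nr)
{
    size_t i;

    /* First pass: validate everything, so a bad entry changes nothing. */
    for (i = 0; i < nr; i++) {
        if (req[i].vcpuid >= max_vcpus ||
            req[i].period_us == 0 || req[i].budget_us == 0)
            return -EINVAL;
    }

    /* Second pass: direct index by vCPU id. */
    for (i = 0; i < nr; i++) {
        struct vcpu_sched *v = &vcpus[req[i].vcpuid];

        v->period_ns = us_to_ns(req[i].period_us);
        v->budget_ns = us_to_ns(req[i].budget_us);
    }
    return 0;
}
```

Validating before applying is a choice made for this sketch only; it avoids returning an error after some vCPUs have already been updated.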


>
> > If this does cause problems, I think adding a "hypercall_preempt_check()"
> > at the outside "for" loop may help. Is that right?
>
> Yes.
>
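
The idea of checking for preemption in the outer loop can be sketched outside Xen as follows. Everything here is a stand-in and an assumption: preempt_check_fn plays the role of hypercall_preempt_check(), SKETCH_ERESTART is an arbitrary illustrative value, and the saved *next index models whatever state a real hypercall would carry across a continuation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative error value, not taken from any real header. */
#define SKETCH_ERESTART 1000

/* Hypothetical stand-in for a "should we yield now?" check. */
typedef bool (*preempt_check_fn)(void);

/* Two trivial checkers, handy for exercising the loop. */
static bool never_preempt(void)  { return false; }
static bool always_preempt(void) { return true; }

/*
 * Process items[*next .. nr-1]. Between items, ask whether to yield.
 * On yield, record the resume point in *next and return -SKETCH_ERESTART.
 * The check is skipped before the first item of each call, so every call
 * makes forward progress. Returns 0 once all items are done.
 */
static int process_with_preemption(unsigned int *items, size_t nr,
                                   size_t *next, preempt_check_fn preempt)
{
    size_t i;

    for (i = *next; i < nr; i++) {
        if (i != *next && preempt()) {
            *next = i;
            return -SKETCH_ERESTART;
        }
        items[i] += 1;  /* placeholder for the real per-vCPU update */
    }
    *next = nr;
    return 0;
}
```

A caller would simply invoke the function again with the same *next until it returns 0.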
> >> > --- a/xen/common/schedule.c
> >> > +++ b/xen/common/schedule.c
> >> > @@ -1093,7 +1093,9 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
> >> >
> >> >  if ( (op->sched_id != DOM2OP(d)->sched_id) ||
> >> >   ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> >> > -  (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> >> > +  (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> >> > +  

Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-14 Thread Chong Li
On Mon, May 11, 2015 at 8:11 AM, Dario Faggioli wrote:

> On Thu, 2015-05-07 at 12:05 -0500, Chong Li wrote:
> > Add two hypercalls(XEN_DOMCTL_SCHEDOP_getvcpuinfo/putvcpuinfo) to get/set
> > a domain's per-VCPU parameters. Hypercalls are handled in function rt_dom_cntl.
> >
> And that is because, right now, only code in sched_rt.c is able to deal
> with per-vcpu parameters getting and setting.
>
> That's of course true, but these two new hypercalls are, potentially,
> generic, i.e., other schedulers may want to use them at some point. So,
> why not just put them in good shape for that from the beginning?
>
> To do so, you could deal with the new DOMCTLs in a similar way as
> XEN_DOMCTL_SCHEDOP_{get,put}info are handled, and add a
> new .adjust_vcpu(s?) hook in the scheduler interface.
>
> > diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> > index 7c39a9e..9add5a4 100644
> > --- a/xen/common/sched_rt.c
> > +++ b/xen/common/sched_rt.c
> > @@ -1085,6 +1085,9 @@ rt_dom_cntl(
> >  struct list_head *iter;
> >  unsigned long flags;
> >  int rc = 0;
> > +xen_domctl_sched_rtds_params_t *local_sched;
> > +int vcpu_index=0;
> >
> So, what's this vcpu_index intended meaning/usage?
>

The vcpu_index here equals vcpu_id.

>
> > @@ -1110,6 +1113,67 @@ rt_dom_cntl(
> >  }
> >  spin_unlock_irqrestore(&prv->lock, flags);
> >  break;
> > +case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
> > +op->u.rtds.nr_vcpus = 0;
> > +spin_lock_irqsave(&prv->lock, flags);
> > +list_for_each( iter, &sdom->vcpu )
> > +vcpu_index++;
> >
> > +spin_unlock_irqrestore(&prv->lock, flags);
> >
> This gives you the number of vcpus of sdom, doesn't it? It feels rather
> nasty (especially the lock being dropped and taken again below!).
>
> Aren't there other ways to get the same information that suits your
> needs (e.g., d->max_vcpus)? If not, I think you should consider adding a
> 'nr_vcpu' field in rt_dom, exactly as Credit2 is doing in csched2_dom.
>
> > +spin_lock_irqsave(&prv->lock, flags);
> > +list_for_each( iter, &sdom->vcpu )
> > +{
> > +struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> > +
> > +local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
> > +local_sched[vcpu_index].period = svc->period / MICROSECS(1);
> > +local_sched[vcpu_index].index = vcpu_index;
> > +vcpu_index++;
> >
> And that's why I was asking about index. As Jan is pointing out already,
> used like this, this index/vcpu_index is rather useless.
>
> I mean, you're passing up nr_vcpus structs in an array in which
> the .index field of the i-th element is equal to i. How is this
> important? The caller could well iterate, count, and retrieve the
> position of each element by itself!
>
> What you probably are after, is the vcpu id, isn't it?
>

Yes, it is. Now I use vcpuid instead of vcpu_index.

>
> > +}
> > +spin_unlock_irqrestore(&prv->lock, flags);
> > +copy_to_guest(op->u.rtds.vcpus, local_sched, vcpu_index);
> >
> I'm sure we want some checks about whether we are overflowing the
> userspace provided buffer (and something similar below, for put). I
> appreciate that you, in this patch series, are only calling this from
> libxl, which properly dimension things, etc., but that can not always be
> the case.
>
> There are several examples in the code base on the route to take for
> similar operations. For example, you can try to do some checks and only
> fill as many elements as the buffer allows, or you can give a special
> semantic to calling the hypercall with NULL/0 as parameters, i.e., use
> that for asking Xen about the proper sizes, etc.
>
> Have a look at how XEN_SYSCTL_numainfo and XEN_SYSCTL_cputopoinfo are
> implemented (in Xen, but also in libxc and libxl, to properly understand
> things).
>
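
The two options described above (fill only as many elements as the buffer allows, and treat a NULL buffer as a size query) can be sketched in plain C. This is an illustration under stated assumptions, not how XEN_SYSCTL_numainfo or XEN_SYSCTL_cputopoinfo are actually coded: struct vcpu_state, struct vcpu_info_out and get_vcpu_params() are hypothetical, and an ordinary pointer stands in for a guest handle.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical internal per-vCPU state, in nanoseconds. */
struct vcpu_state {
    uint64_t period_ns;
    uint64_t budget_ns;
};

/* Hypothetical entry reported to the caller, in microseconds. */
struct vcpu_info_out {
    uint32_t vcpuid;
    uint32_t period_us;
    uint32_t budget_us;
};

/*
 * buf == NULL: size query, store the number of vCPUs in *nr, copy nothing.
 * Otherwise:   copy at most *nr entries, then store how many were written.
 * The caller's buffer is therefore never overrun.
 */
static int get_vcpu_params(const struct vcpu_state *vcpus, uint32_t nr_vcpus,
                           struct vcpu_info_out *buf, uint32_t *nr)
{
    uint32_t i, n;

    if (nr == NULL)
        return -EINVAL;

    if (buf == NULL) {
        *nr = nr_vcpus;
        return 0;
    }

    /* Clamp to the smaller of what exists and what the caller has room for. */
    n = (*nr < nr_vcpus) ? *nr : nr_vcpus;
    for (i = 0; i < n; i++) {
        buf[i].vcpuid = i;
        buf[i].period_us = (uint32_t)(vcpus[i].period_ns / 1000u);
        buf[i].budget_us = (uint32_t)(vcpus[i].budget_ns / 1000u);
    }
    *nr = n;
    return 0;
}
```

A caller would typically call once with a NULL buffer to learn the size, allocate, and then call again with the real buffer.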
> > +case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
> > +local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> > +op->u.rtds.nr_vcpus);
> > +if( local_sched == NULL )
> > +{
> > +return -ENOMEM;
> > +}
> > +copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
> > +
> > +for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
> > +{
> > +vcpu_index = 0;
> > +spin_lock_irqsave(&prv->lock, flags);
> > +list_for_each( iter, &sdom->vcpu )
> > +{
> >
> But why the nested loop? I think this is still that 'index' thing
> causing problems. If you use vcpu numbers/ids, you can just use the
> d->vcpu[] array, and get rid of one of the for-s!
>
> Look at, for instance, XEN_DOMCTL_{set,get}vcpuaffinity. You're after
> something that is pretty similar (i.e., altering a per-vcpu property),
> you just want to do it on more than one vcpu at a time.
>
> > +struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu,
> 

Re: [Xen-devel] [libvirt test] 55257: regressions - FAIL

2015-05-14 Thread Jim Fehlig
Jim Fehlig wrote:
> More hint that libvirtd crashed.  Have there been any attempts to
> reproduce this outside of the test rig?  Or capture a core dump?
>   

FYI, I've unsuccessfully tried to reproduce this using config similar to
debian.guest.osstest.cfg.xml.

Regards,
Jim



Re: [Xen-devel] [libvirt test] 55257: regressions - FAIL

2015-05-14 Thread Jim Fehlig
Anthony PERARD wrote:
> On Thu, May 14, 2015 at 11:47:18AM +0100, Ian Campbell wrote:
>   
>> I suppose the openstack CI loop doesn't capture anything more
>> interesting than osstest does?
>> 
>
> No, nothing else interesting. The next step would be to enable more debug
> output from libvirtd by playing with "log_level" and "log_filters" in
> /etc/libvirtd.conf, but I don't know which filter would be interesting.
>   

log_level is already set to DEBUG.  And the xen tool logger used by the
libxl driver is also set to XTL_DEBUG.  I'm not aware of any more debug
or logging to enable.

Regards,
Jim



Re: [Xen-devel] [libvirt test] 55257: regressions - FAIL

2015-05-14 Thread Jim Fehlig
Ian Campbell wrote:
> On Wed, 2015-05-13 at 18:46 +0100, Anthony PERARD wrote:
>   
>> On Wed, May 13, 2015 at 09:46:28AM +0100, Ian Campbell wrote:
>> 
>>> On Mon, 2015-05-11 at 10:36 -0600, Jim Fehlig wrote:
>>> [...]
>>>   
> The qemu log is sadly empty so I've no clue why this timed out.
>   
>   
 I guess qemu didn't run at all...

 
> Perhaps there is something in 
> http://logs.test-lab.xenproject.org/osstest/logs/55257/test-amd64-amd64-libvirt/merlot1---var-log-libvirt-libvirtd.log.gz
> I can't make heads nor tail though.
>   
>   
 Nothing interesting.  Only the unhelpful

 2015-05-11 12:42:17.451+: 4280: error : libxlDomainStart:1032 :
 internal error: libxenlight failed to create new domain
 'debian.guest.osstest'
 
>>> This happened again in
>>> http://logs.test-lab.xenproject.org/osstest/logs/55349/test-amd64-amd64-libvirt/info.html
>>>
>>> Is there anything we could tweak in osstest to produce more helpful
>>> logging?
>>>   
>> Well we can find in var-log-libvirt-libvirtd.log.gz this:
>> 2015-05-12 17:39:35.180+: 4329: error : libxlDomainStart:1032 : internal 
>> error: libxenlight failed to create new domain 'debian.guest.osstest'
>>
>> And for more information we need to look into the driver specific log,
>> libxl logs in var-log-libvirt-libxl-libxl-driver.log:
>> libxl: error: libxl_exec.c:393:spawn_watch_event: domain 1 device model: 
>> startup timed out
>> 
>
> Thanks, all of that was mentioned earlier in the thread too, I was
> looking for ways to get more info.
>
>   
>> I'm seeing this error a lot on our OpenStack CI loop, I thought the error
>> was due to the "host" being very busy, but if osstest is having the same
>> issue, then there is probably something wrong with libxl+libvirt :(.
>> 
>
> Are you able to reproduce at will or is it like osstest and just a
> sporadic failure?
>
> I suppose the openstack CI loop doesn't capture anything more
> interesting than osstest does?
>
> FWIW http://logs.test-lab.xenproject.org/osstest/logs/55443/ seems to
> have two more instances of this (amd64 and i386)

More cases of qemu not starting.  I'm not sure how we can get more
details about that.

>  but with no 
> interesting logs still and a different one on ARM:
>
> http://logs.test-lab.xenproject.org/osstest/logs/55443/test-armhf-armhf-libvirt/11.ts-guest-start.log:
> 2015-05-13 09:23:32.193+: 16389: info : libvirt version: 1.2.16
> 2015-05-13 09:23:32.193+: 16389: warning : virKeepAliveTimerInternal:143 
> : No response from client 0xb7000c38 after 6 keepalive messages in 35 seconds
> 2015-05-13 09:23:32.193+: 16390: warning : virKeepAliveTimerInternal:143 
> : No response from client 0xb7000c38 after 6 keepalive messages in 35 seconds
> error: Failed to create domain from /etc/xen/debian.guest.osstest.cfg.xml
> error: internal error: received hangup / error event on socket
>   

In this case it seems libvirtd crashed.

> In that case the libxl-driver log ends with:
> libxl: debug: libxl_dm.c:1495:libxl__spawn_local_dm: Spawning device-model 
> /usr/local/lib/xen/bin/qemu-system-i386 with arguments:
> [...]
> libxl: debug: libxl_event.c:600:libxl__ev_xswatch_register: watch 
> w=0xb2e07bcc wpath=/local/domain/0/device-model/1/state token=3/0: register 
> slotnum=3
> libxl: debug: libxl_create.c:1560:do_domain_create: ao 0xb2e044f0: 
> inprogress: poller=0xb2e07590, flags=i
> libxl: debug: libxl_event.c:537:watchfd_callback: watch w=0xb2e07bcc 
> wpath=/local/domain/0/device-model/1/state token=3/0: event 
> epath=/local/domain/0/device-model/1/state
>
> Which I don't think is complete, i.e. there should be more? Not sure if
> this gives a hint for the x86 case too?
>   

More hint that libvirtd crashed.  Have there been any attempts to
reproduce this outside of the test rig?  Or capture a core dump?

Regards,
Jim




Re: [Xen-devel] [PATCHv4 3/5] xen: use ticket locks for spin locks

2015-05-14 Thread Jan Beulich
>>> Tim Deegan  05/14/15 10:55 PM >>>
>At 21:05 +0100 on 14 May (1431637535), Jan Beulich wrote:
>> >>> Tim Deegan  05/14/15 12:36 PM >>>
>> >At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>> >> +while ( observe_head(&lock->tickets) != sample.tail )
>> >
>> >This test should be "observe_head(&lock->tickets) == sample.head",
>> >i.e. wait until the thread that held the lock has released it.
>> >Checking for it to reach the tail is unnecessary (other threads that
>> >were queueing for the lock at the sample time don't matter) and
>> >dangerous (on a contended lock head might pass sample.tail without us
>> >happening to observe it being == ).
>> 
>> The observation of there being a problem is correct, but the suggested
>> solution doesn't seem to be. The new code being
>> 
>> if ( sample.head != sample.tail )
>> {
>> while ( observe_head(&lock->tickets) == sample.tail )
>> cpu_relax();
>> 
>> means that if head didn't change between the full sample and the head sample
>> we'd end the loop right away, which can't be right. We really need to wait
>> for head to reach or pass the sampled tail.
>
>I think you misread what I asked for.  We wait until the observed head
>doesn't match the sampled _head_, i.e. for whoever had the lock when
>we sampled it to release it:
>
>if ( sample.head != sample.tail )
>{
>while ( observe_head(&lock->tickets) == sample.head )
>cpu_relax();

Indeed. And in my simultaneously written reply to that mail I (once again)
mixed up head and tail. Sorry for the noise then.

Jan
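
For readers following the head/tail discussion, the condition settled on above (wait until the observed head differs from the sampled head) can be sketched with C11 atomics. This is a simplified illustration and not Xen's implementation: ticket_lock_t and the function names are hypothetical, head and tail are kept as two separate atomics instead of one combined word, and 16-bit wraparound between two observations is ignored.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ticket lock: the lock is free when head == tail. */
typedef struct {
    _Atomic uint16_t head;  /* ticket currently being served */
    _Atomic uint16_t tail;  /* next ticket to hand out */
} ticket_lock_t;

static void ticket_lock_init(ticket_lock_t *l)
{
    atomic_init(&l->head, 0);
    atomic_init(&l->tail, 0);
}

static void ticket_lock(ticket_lock_t *l)
{
    /* Take a ticket, then wait until it is being served. */
    uint16_t mine = atomic_fetch_add(&l->tail, 1);

    while (atomic_load(&l->head) != mine)
        ;
}

static void ticket_unlock(ticket_lock_t *l)
{
    /* Serve the next ticket. */
    atomic_fetch_add(&l->head, 1);
}

/*
 * Wait until whoever held the lock at sample time has released it.
 * Later waiters do not matter, so the loop only watches for head to
 * move away from the sampled head; it never waits for the sampled tail.
 */
static void spin_barrier(ticket_lock_t *l)
{
    uint16_t head = atomic_load(&l->head);
    uint16_t tail = atomic_load(&l->tail);

    if (head != tail) {
        while (atomic_load(&l->head) == head)
            ;
    }
}
```

Comparing against the sampled head means the barrier finishes as soon as one release happens, even on a heavily contended lock where head may move past the sampled tail between two observations.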




Re: [Xen-devel] [Pkg-xen-devel] Bug#785187: xen-hypervisor-4.5-amd64: Option ucode=scan is not working

2015-05-14 Thread Atom2

On 14.05.15 at 10:41, Ian Campbell wrote:

Thanks Atom2, thanks for the refs, I've added the Debian bug and the
submitter back to the CC.
Unfortunately those recipients originally seem to have escaped my eye 
... but they are included now.

On Wed, 2015-05-13 at 22:11 +0200, Atom2 wrote:

On 13.05.15 at 15:41, Ian Campbell wrote:

I think I remember some discussion of something in this area not too
long ago on xen-devel. CC-s added.

I assume you refer to this discussion which happened to be on the
xen-users mailing list:
http://lists.xen.org/archives/html/xen-users/2014-05/msg00052.html

Especially look at the first answer
(http://lists.xen.org/archives/html/xen-users/2014-05/msg00053.html).

This says "not 'cpio -o c' as some information on the internet
suggests", do you have a link? Is it to something the xenproject
controls and could update (i.e. our wiki and/or in tree docs?)

Cheers,
Ian.
Ian, I am unable to find any of those links that I had read back then 
when I wrote the "not 'cpio ..." part in my answer to the list referred 
to above.


I now have only been able to come up with 
http://xenbits.xen.org/docs/unstable/misc/amd-ucode-container.txt, which 
is dated _after_ my referred mail (both the original and the amended 
version) and clearly also states the "-H newc" option. This document BTW 
contains a pretty good recipe for making this happen and also seems to 
be in line with what I had suggested earlier. So in a nutshell, 
xenproject seems to have things correct.


Apologies for not being able to be more helpful.

Regards Atom2



Re: [Xen-devel] [PATCHv4 3/5] xen: use ticket locks for spin locks

2015-05-14 Thread Tim Deegan
At 21:05 +0100 on 14 May (1431637535), Jan Beulich wrote:
> >>> Tim Deegan  05/14/15 12:36 PM >>>
> >At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
> >>  void _spin_barrier(spinlock_t *lock)
> >>  {
> >> +spinlock_tickets_t sample;
> >>  #ifdef LOCK_PROFILE
> >>  s_time_t block = NOW();
> >> -u64  loop = 0;
> >> +#endif
> >>  
> >>  check_barrier(&lock->debug);
> >> -do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
> >> -if ((loop > 1) && lock->profile)
> >> +smp_mb();
> >> +sample = observe_lock(&lock->tickets);
> >> +if ( sample.head != sample.tail )
> >>  {
> >> -lock->profile->time_block += NOW() - block;
> >> -lock->profile->block_cnt++;
> >> -}
> >> -#else
> >> -check_barrier(&lock->debug);
> >> -do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
> >> +while ( observe_head(&lock->tickets) != sample.tail )
> >
> >This test should be "observe_head(&lock->tickets) == sample.head",
> >i.e. wait until the thread that held the lock has released it.
> >Checking for it to reach the tail is unnecessary (other threads that
> >were queueing for the lock at the sample time don't matter) and
> >dangerous (on a contended lock head might pass sample.tail without us
> >happening to observe it being == ).
> 
> The observation of there being a problem is correct, but the suggested
> solution doesn't seem to be. The new code being
> 
> if ( sample.head != sample.tail )
> {
> while ( observe_head(&lock->tickets) == sample.tail )
> cpu_relax();
> 
> means that if head didn't change between the full sample and the head sample
> we'd end the loop right away, which can't be right. We really need to wait for
> head to reach or pass the sampled tail.

I think you misread what I asked for.  We wait until the observed head
doesn't match the sampled _head_, i.e. for whoever had the lock when
we sampled it to release it:

 if ( sample.head != sample.tail )
 {
 while ( observe_head(&lock->tickets) == sample.head )
 cpu_relax();

Cheers,

Tim.



Re: [Xen-devel] [PATCHv4 3/5] xen: use ticket locks for spin locks

2015-05-14 Thread Jan Beulich
>>> "Jan Beulich"  05/14/15 10:06 PM >>>
 Tim Deegan  05/14/15 12:36 PM >>>
>>At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>>>  void _spin_barrier(spinlock_t *lock)
>>>  {
>>> +spinlock_tickets_t sample;
>>>  #ifdef LOCK_PROFILE
>>>  s_time_t block = NOW();
>>> -u64  loop = 0;
>>> +#endif
>>>  
>>>  check_barrier(&lock->debug);
>>> -do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>>> -if ((loop > 1) && lock->profile)
>>> +smp_mb();
>>> +sample = observe_lock(&lock->tickets);
>>> +if ( sample.head != sample.tail )
>>>  {
>>> -lock->profile->time_block += NOW() - block;
>>> -lock->profile->block_cnt++;
>>> -}
>>> -#else
>>> -check_barrier(&lock->debug);
>>> -do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>>> +while ( observe_head(&lock->tickets) != sample.tail )
>>
>>This test should be "observe_head(&lock->tickets) == sample.head",
>>i.e. wait until the thread that held the lock has released it.
>>Checking for it to reach the tail is unnecessary (other threads that
>>were queueing for the lock at the sample time don't matter) and
>>dangerous (on a contended lock head might pass sample.tail without us
>>happening to observe it being == ).
>
>The observation of there being a problem is correct, but the suggested solution
>doesn't seem to be. The new code being
>
>if ( sample.head != sample.tail )
>{
>while ( observe_head(&lock->tickets) == sample.tail )
>cpu_relax();
>
>means that if head didn't change between the full sample and the head sample
>we'd end the loop right away, which can't be right. We really need to wait for
>head to reach or pass the sampled tail.

Just realized that this would be too heavy again (and more difficult to do than
necessary) - I think on the left side we need observe_tail() instead of
observe_head(), and then the == is correct.

Jan





Re: [Xen-devel] [RFC][PATCH 07/13] xen/passthrough: extend hypercall to support rdm reservation policy

2015-05-14 Thread Jan Beulich
>>> "Chen, Tiejun"  05/14/15 7:48 AM >>>
>On 2015/5/11 18:57, Jan Beulich wrote:
>> Yeah, the constant name probably shouldn't refer to PCI, but simply
>> to pass-through.
>
>What about XEN_DOMCTL_DEV_RDM_XXX? I mean this may be specific to 
>device, right?

Fine with me.

Jan




Re: [Xen-devel] [PATCHv4 3/5] xen: use ticket locks for spin locks

2015-05-14 Thread Jan Beulich
>>> Tim Deegan  05/14/15 12:36 PM >>>
>At 15:37 +0100 on 11 May (1431358623), David Vrabel wrote:
>>  void _spin_barrier(spinlock_t *lock)
>>  {
>> +spinlock_tickets_t sample;
>>  #ifdef LOCK_PROFILE
>>  s_time_t block = NOW();
>> -u64  loop = 0;
>> +#endif
>>  
>>  check_barrier(&lock->debug);
>> -do { smp_mb(); loop++;} while ( _raw_spin_is_locked(&lock->raw) );
>> -if ((loop > 1) && lock->profile)
>> +smp_mb();
>> +sample = observe_lock(&lock->tickets);
>> +if ( sample.head != sample.tail )
>>  {
>> -lock->profile->time_block += NOW() - block;
>> -lock->profile->block_cnt++;
>> -}
>> -#else
>> -check_barrier(&lock->debug);
>> -do { smp_mb(); } while ( _raw_spin_is_locked(&lock->raw) );
>> +while ( observe_head(&lock->tickets) != sample.tail )
>
>This test should be "observe_head(&lock->tickets) == sample.head",
>i.e. wait until the thread that held the lock has released it.
>Checking for it to reach the tail is unnecessary (other threads that
>were queueing for the lock at the sample time don't matter) and
>dangerous (on a contended lock head might pass sample.tail without us
>happening to observe it being == ).

The observation of there being a problem is correct, but the suggested solution
doesn't seem to be. The new code being

if ( sample.head != sample.tail )
{
while ( observe_head(&lock->tickets) == sample.tail )
cpu_relax();

means that if head didn't change between the full sample and the head sample
we'd end the loop right away, which can't be right. We really need to wait for
head to reach or pass the sampled tail.

Jan




Re: [Xen-devel] [PATCH V2] xen/vm_event: Clean up control-register-write vm_events

2015-05-14 Thread Razvan Cojocaru
On 05/14/2015 08:31 PM, Tamas K Lengyel wrote:
> On Thu, May 14, 2015 at 7:11 PM, Razvan Cojocaru
>  wrote:
>> On 05/14/2015 07:55 PM, Tamas K Lengyel wrote:
>>>> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
>>>> index 45b5283..1dd49dd 100644
>>>> --- a/xen/include/asm-x86/domain.h
>>>> +++ b/xen/include/asm-x86/domain.h
>>>> @@ -341,19 +341,13 @@ struct arch_domain
>>>>
>>>>  /* Monitor options */
>>>>  struct {
>>>> -uint16_t mov_to_cr0_enabled  : 1;
>>>> -uint16_t mov_to_cr0_sync : 1;
>>>> -uint16_t mov_to_cr0_onchangeonly : 1;
>>>> -uint16_t mov_to_cr3_enabled  : 1;
>>>> -uint16_t mov_to_cr3_sync : 1;
>>>> -uint16_t mov_to_cr3_onchangeonly : 1;
>>>> -uint16_t mov_to_cr4_enabled  : 1;
>>>> -uint16_t mov_to_cr4_sync : 1;
>>>> -uint16_t mov_to_cr4_onchangeonly : 1;
>>>> -uint16_t mov_to_msr_enabled  : 1;
>>>> -uint16_t mov_to_msr_extended : 1;
>>>> -uint16_t singlestep_enabled  : 1;
>>>> -uint16_t software_breakpoint_enabled : 1;
>>>> +uint32_t write_ctrlreg_enabled   : 8;
>>>> +uint32_t write_ctrlreg_sync  : 8;
>>>> +uint32_t write_ctrlreg_onchangeonly  : 8;
>>>
>>> Any particular reason why you have these bitmaps 8-bits wide? There
>>> are only 4 events defined at the moment that would use these.
>>
>> ARM control registers have been mentioned, so I thought I would leave
>> some space for a few more events. Other than that, they don't _need_ to
>> be 8-bits wide. If compactness matters more I'll change them to 4.
>>
>>
>> Thanks,
>> Razvan
> 
> IMHO it's better to widen the field when there is an actual need for
> it. For now I think 4 would be appropriate.

Fair enough. I'll change it to 4 and we'll just have to be careful when
patches touching these events come in.
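
One way to make "being careful" mechanical is a compile-time check tying the number of events to the bitmap width. The following is only a sketch under assumptions: the event index names are hypothetical, and only the field names and the width of 4 come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event indices; only the count (4) comes from the thread. */
enum { CTRLREG_EV0, CTRLREG_EV1, CTRLREG_EV2, CTRLREG_EV3, CTRLREG_EV_COUNT };

#define CTRLREG_BITS 4  /* width agreed on in the discussion */

struct monitor_opts {
    uint32_t write_ctrlreg_enabled      : CTRLREG_BITS;
    uint32_t write_ctrlreg_sync         : CTRLREG_BITS;
    uint32_t write_ctrlreg_onchangeonly : CTRLREG_BITS;
};

/* Adding a fifth event without widening the bitmaps fails the build. */
static_assert(CTRLREG_EV_COUNT <= CTRLREG_BITS,
              "widen the write_ctrlreg_* bitmaps");

/* Mark the event with the given index as enabled. */
void ctrlreg_enable(struct monitor_opts *m, unsigned int index)
{
    assert(index < CTRLREG_BITS);
    m->write_ctrlreg_enabled |= 1u << index;
}

/* Out-of-range indices are reported as disabled rather than read. */
bool ctrlreg_enabled(const struct monitor_opts *m, unsigned int index)
{
    return index < CTRLREG_BITS && (m->write_ctrlreg_enabled & (1u << index));
}
```

The runtime helpers are optional; the static_assert is the part that would flag a later patch adding an event the 4-bit bitmaps cannot hold.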


Thanks,
Razvan




Re: [Xen-devel] [RFC 0/4] Support more than 8 vcpus on arm64 with GICv3

2015-05-14 Thread Julien Grall
On 14/05/15 18:48, Julien Grall wrote:
> Although you would need to reshuffle a bit the layout. With your
> solution, if the guest is using 128 vCPUs it will overlap the
> grant-table region, magic page (xenstore, xenconsole,...) and the
> beginning of the RAM. whoops ;).

Hmmm... forget this paragraph, I miscalculated the final value :/.

There is enough space for accommodating 128 vCPUs.

Regards,

-- 
Julien Grall



Re: [Xen-devel] [PATCH OSSTEST] Toolstack::xl: Support for ACPI fallback for shutdown

2015-05-14 Thread Ian Jackson
Ian Campbell writes ("Re: [Xen-devel] [PATCH OSSTEST] Toolstack::xl: Support for ACPI fallback for shutdown"):
> You acked the xl one, so here is a separate one.

Acked-by: Ian Jackson 

Thanks,
Ian.



Re: [Xen-devel] [RFC 4/4] xen/arm: Remove unnecessary GUEST_GICV3_GICR0_SIZE macro.

2015-05-14 Thread Julien Grall
Hi Chen,

On 14/05/15 15:14, Chen Baozi wrote:
> From: Chen Baozi 
> 
> Signed-off-by: Chen Baozi 
> ---
>  xen/include/public/arch-arm.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index c029e0f..cbcda74 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -388,8 +388,7 @@ struct xen_arch_domainconfig {
>  #define GUEST_GICV3_RDIST_STRIDE   0x2ULL
>  #define GUEST_GICV3_RDIST_REGIONS  1
>  
> -#define GUEST_GICV3_GICR0_BASE 0x0302ULL/* vCPU0 - vCPU7 */

The /* vCPU0 - vCPU7 */ comment was useful. Please update it to whatever
you will be using.

> -#define GUEST_GICV3_GICR0_SIZE 0x0010ULL
> +#define GUEST_GICV3_GICR0_BASE 0x0302ULL

Please don't drop it. Even if you dropped all the usages, it's necessary
for documentation purposes.

If someone wants to change the guest layout, they will know that the
re-dist region will be this size.

I would also like to keep the BUILD_BUG_ON in vgic-v3 in order to check
if there is enough space reserved in the guest layout for the re-dist
(see your patch #2).

Regards,


-- 
Julien Grall



Re: [Xen-devel] [OSSTEST v6 13/24] distros: support PV guest install from Debian netinst media.

2015-05-14 Thread Ian Jackson
Ian Campbell writes ("Re: [OSSTEST v6 13/24] distros: support PV guest install from Debian netinst media."):
> On Tue, 2015-05-12 at 16:52 +0100, Ian Jackson wrote:
> > Wrap damage on my screen.  I appreciate that you want to retain the
> > tabular nature, but perhaps 
> > 
> > my %arch_props = (
> > amd64 => [ "multi-arch", "amd64-i386", "/install.amd/xen" },
> > i386  => [ "multi-arch", "amd64-i386", "/install.386/xen" },
> > armhf => [ "armhf",  "armhf",  "/install.armhf" },
> > )
> > my   ( $path_arch,   $file_arch,   $iso_path ) =
> > @{ $arch_props{$arch} };
...
> More Perl-fu than I could muster by myself, but looks good to me ;-)

You didn't spot the [...} then ? :-)


> > > +my $baseurl = $cd eq "current" ?
> > > +  "http://cdimage.debian.org/debian-cd/current/$props->{PathArch}/jigdo-cd" :
> > > +  "http://cdimage.debian.org/cdimage/weekly-builds/$props->{PathArch}/jigdo-cd";
> > 
> > This should surely come from a runvar (or perhaps a config option)
> > rather than being hardcoded.
> 
> $cd above came from $r{"$gho->{Guest}_cd"} and takes the values either
> "current" or "weekly".
> 
> I could push the logic which maps those to an actual URL into
> make-distros-flight if you would prefer? Or perhaps push the two
> prefixes (up to but not including "$props->.../jigdo-cd") into the cfg
> file? So it would become
> "$c{DebianCdURL_$cd}/$props->{PathArch}/jigdo-cd"

(We had a conversation IRL about some of this, but:)

Hardcoding something in make-flight is not as bad as hardcoding it in
the script, IMO, but:

> The latter sounds a bit preferable I think, hardcoding URLs in
> make-flight seems as wrong as hardcoding them here.

Having it in the config is best.  You might want to consider whether
the config ought to be plumbed through a runvar by make-flight, so
that it is possible to (eg) use cs-adjust-flight or whatever to do
tests of different things.

Ian.



Re: [Xen-devel] [OSSTEST v6 14/24] Test pygrub and pvgrub on the regular flights

2015-05-14 Thread Ian Jackson
Ian Campbell writes ("Re: [OSSTEST v6 14/24] Test pygrub and pvgrub on the regular flights"):
> On Tue, 2015-05-12 at 16:54 +0100, Ian Jackson wrote:
> > I think we should consider at this point which jobs we can drop as a
> > result of these new ones being added.  Ideally this patch would
> > contain hunks adding "delete this later" style comments to some
> > existing tests.
> 
> After these tests are added the set of test-* jobs (taken from a random
> adhoc flight I ran) are below. I'm not sure which could sensibly be
> dropped. I appreciate this is adding a lot of jobs though :-(

Hmm.  Well, fair enough, then, I guess.

Acked-by: Ian Jackson 

Ian.



Re: [Xen-devel] [RFC 2/4] xen/arm64: increase MAX_VIRT_CPUS to 128 on arm64

2015-05-14 Thread Julien Grall
Hi Chen,

On 14/05/15 15:14, Chen Baozi wrote:
> From: Chen Baozi 
> 
> GIC-500 supports up to 128 cores in a single SoC. Since the
> redistributor register map is no longer set by fixed size, which limits
> the number of vcpu, we increase MAX_VIRT_CPUS to 128 and remove the
> corresponding restriction.
> 
> Signed-off-by: Chen Baozi 
> ---
>  xen/arch/arm/vgic-v3.c   | 3 ---
>  xen/include/asm-arm/config.h | 4 
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index a0c1de9..83e7344 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -906,7 +906,6 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info)
>  rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
>  DABT_DOUBLE_WORD);
>  if ( rank == NULL ) goto write_ignore;
> -BUG_ON(v->domain->max_vcpus > 8);

This was here to catch a bump of MAX_VIRT_CPUS made without adapting the
vGIC code.

The current vGICv3 does not support more than 16 vCPUs because it only
cares about AFF0.

>  new_irouter = *r;
>  vgic_lock_rank(v, rank, flags);
>  
> @@ -1203,8 +1202,6 @@ static int vgic_v3_domain_init(struct domain *d)
>  d->arch.vgic.nr_regions = GUEST_GICV3_RDIST_REGIONS;
>  d->arch.vgic.rdist_stride = GUEST_GICV3_RDIST_STRIDE;
>  
> -/* The first redistributor should contain enough space for all CPUs */
> -BUILD_BUG_ON((GUEST_GICV3_GICR0_SIZE / GUEST_GICV3_RDIST_STRIDE) < MAX_VIRT_CPUS);

I'd like to keep this BUILD_BUG_ON. It ensures that we reserved enough
space in the guest layout for the redistributors.
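
As a rough illustration of what such a check guards, here is a standalone sketch using C11 static_assert in place of BUILD_BUG_ON. The sizes below are illustrative assumptions, not values taken from the patch, and the macro names are shortened on purpose so they are not mistaken for the real ones.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, NOT taken from the patch under review. */
#define RDIST_STRIDE  UINT64_C(0x20000)    /* assumed bytes per redistributor */
#define RDIST_REGION  UINT64_C(0x1000000)  /* assumed bytes reserved in the layout */
#define MAX_VCPUS     128

/* Number of redistributors a region of the given size can hold. */
static inline uint64_t rdist_capacity(uint64_t region_size, uint64_t stride)
{
    return region_size / stride;
}

/* Compile-time counterpart of the BUILD_BUG_ON discussed above. */
static_assert(RDIST_REGION / RDIST_STRIDE >= MAX_VCPUS,
              "guest layout reserves too little space for the redistributors");
```

Under these assumed numbers a 0x100000-byte region would hold 8 redistributors, and a 0x1000000-byte region 128; bumping MAX_VCPUS without growing the region would stop the build.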

Regards,

-- 
Julien Grall



Re: [Xen-devel] [Qemu-devel] [PATCH] Do not emulate a floppy drive when -nodefaults

2015-05-14 Thread John Snow


On 05/14/2015 10:07 AM, Michael S. Tsirkin wrote:
> On Thu, May 14, 2015 at 02:02:04PM +0200, Markus Armbruster wrote:
>> Correct.
>>
>> Here's how I think it should be done:
>>
>> * Create a machine option to control the FDC
>>
>>   This is a machine-specific option.  It should only exist for machine
>>   types that have an optional FDC.
>>
>>   Default must be "on" for old machine types.  Default may be "off" for
>>   new machine types.
>>
>>   It should certainly be off for pc-q35-2.4 and newer.  Real Q35 boards
>>   commonly don't have an FDC (depends on the Super I/O chip used).
>>
>>   We may want to keep it off for pc-i440fx-2.4 and newer.  I doubt
>>   there's a real i440FX without an FDC, but our virtual i440FX is quite
>>   unlike a real one in other ways already.
> 
> I think making it off by default is a bad idea, it will break
> command-line users.
> 
> 

If we can add a flag to disable it, I still think I wouldn't mind that,
if it could be worked out to not be hacky and gross.

>> * Create the FDC only if the option is "on".
>>
>> * Optional: make -drive if=floppy,... auto-enable it
> 
> Every time we do such auto hacks, we regret this later.
> Just do what we are told, fail if=floppy if disabled.
> 

I agree very much. Just because the current drive/device syntax is
almost totally hosed doesn't mean we should put more wood on the fire.

>>   I wouldn't bother doing the same for -global isa-fdc.driveA=... and
>>   such.
>>
>> Stefano, if you're willing to tackle this, go right ahead!


I'm definitely against a "--seriously-nothing" flag because the line for
what is embedded or not is fuzzy. Paolo raises some good points against
where you draw the line for what we decide to allow users to
include/exclude that is otherwise considered part of the board.

Still, given the hype train, if there is an API we could introduce that
is likely not to make our code gross (or make us belly-ache about how
dumb we were in 5 years) that disables the FDC, I don't think I would
mind terribly. I'll leave that to minds more opinionated than mine to
hash out, though.

Maybe the best option here really is to carefully separate optional from
non-optional components (FDC vs. Floppy Drive, Floppy Disk code) and
just give the core FDC code a good scrubbing.

--js



Re: [Xen-devel] [RFC 1/4] xen/arm64: Map the redistributor region by max_vcpus of domU danamically

2015-05-14 Thread Julien Grall
Hi Chen,

On 14/05/15 15:14, Chen Baozi wrote:
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index e571e76..43b9f79 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -33,6 +33,11 @@
>  #include 
>  #include 
>  
> +#ifdef CONFIG_ARM_64
> +#include 
> +#include 
> +#endif
> +
>  static DEFINE_SPINLOCK(domctl_lock);
>  DEFINE_SPINLOCK(vcpu_alloc_lock);
>  
> @@ -680,6 +685,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  d->max_vcpus = max;
>  }
>  
> +#ifdef CONFIG_ARM_64
> +if (!is_hardware_domain(d) && d->arch.vgic.version == GIC_V3)
> +vgic_v3_rdist_map(d);
> +#endif

That is very hackish. This code is common to the other architectures, and
we are trying to be vGIC agnostic in general. A vGIC callback would be
more suitable.

Although, as said in the cover letter, it's not necessary to allocate
dynamically. We can expand the current region to support up to 16/128 vCPUs.

Regards,

-- 
Julien Grall



[Xen-devel] [PATCH] [RFC] run QEMU as non-root

2015-05-14 Thread Stefano Stabellini
Run QEMU as non-root. Starting from uid 6000, the chosen uid is
base+domid. If the uid doesn't exist, try just 6000. This is less
secure: ideally we don't want different domains having their QEMUs
running with the same uid. Finally if uid 6000 doesn't exist either,
fall back to running QEMU as root.
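
The selection order can be sketched in isolation as below. This is an illustration only, not the libxl code: the callback type, the list-based lookup and the function names are hypothetical, added so the fallback chain can be exercised without creating real users; only getpwuid() and the base uid of 6000 come from the patch.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

#define QEMU_BASE_UID 6000  /* same value as LIBXL_QEMU_BASE_UID in the patch */

/* Hypothetical callback reporting whether a uid exists. */
typedef int (*uid_exists_fn)(uid_t uid, void *ctx);

/* Real lookup: ask the passwd database. */
int uid_in_passwd(uid_t uid, void *ctx)
{
    (void)ctx;
    return getpwuid(uid) != NULL;
}

/* Dry-run lookup: check a caller-supplied list instead of the system. */
struct uid_list { const uid_t *uids; size_t count; };

int uid_in_list(uid_t uid, void *ctx)
{
    const struct uid_list *list = ctx;

    assert(list != NULL);
    for (size_t i = 0; i < list->count; i++)
        if (list->uids[i] == uid)
            return 1;
    return 0;
}

/* Returns the uid to run QEMU as, or -1 meaning "no uid found, stay root". */
long pick_qemu_uid(unsigned int domid, uid_exists_fn exists, void *ctx)
{
    uid_t per_domain = QEMU_BASE_UID + domid;

    if (exists(per_domain, ctx))
        return (long)per_domain;
    if (exists(QEMU_BASE_UID, ctx))  /* shared uid: weaker isolation */
        return QEMU_BASE_UID;
    return -1;
}
```

Calling pick_qemu_uid(domid, uid_in_passwd, NULL) would consult the real passwd database; the list variant exists only for testing the order of the fallbacks.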

The uids need to be manually created by the user or, more likely, by the
xen package maintainer.

To actually secure QEMU when running in Dom0, we need at least to
deprivilege the privcmd and xenstore interfaces; this is just the first
step in that direction.

Signed-off-by: Stefano Stabellini 
---
 tools/libxl/libxl_dm.c   |   17 +
 tools/libxl/libxl_internal.h |2 ++
 2 files changed, 19 insertions(+)

diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 0c6408d..942c5df 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -19,6 +19,8 @@
 
 #include "libxl_internal.h"
 #include 
+#include 
+#include 
 
 static const char *libxl_tapif_script(libxl__gc *gc)
 {
@@ -439,6 +441,7 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
 int i, connection, devid;
 uint64_t ram_size;
 const char *path, *chardev;
+struct passwd *user = NULL;
 
 dm_args = flexarray_make(gc, 16, 1);
 
@@ -878,6 +881,20 @@ static char ** libxl__build_device_model_args_new(libxl__gc *gc,
 default:
 break;
 }
+
+user = getpwuid(LIBXL_QEMU_BASE_UID + guest_domid);
+if (user == NULL) {
+LIBXL__LOG(ctx, LIBXL__LOG_WARNING, "Could not find uid %d, falling back to %d\n",
+LIBXL_QEMU_BASE_UID + guest_domid, LIBXL_QEMU_BASE_UID);
+user = getpwuid(LIBXL_QEMU_BASE_UID);
+if (user == NULL)
+LIBXL__LOG(ctx, LIBXL__LOG_WARNING, "Could not find uid %d, starting QEMU as root\n",
+LIBXL_QEMU_BASE_UID);
+}
+if (user) {
+flexarray_append(dm_args, "-runas");
+flexarray_append(dm_args, user->pw_name);
+}
 }
 flexarray_append(dm_args, NULL);
 return (char **) flexarray_contents(dm_args);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 8eb38aa..065ff98 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -3692,6 +3692,8 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
  */
 void libxl__bitmap_copy_best_effort(libxl__gc *gc, libxl_bitmap *dptr,
 const libxl_bitmap *sptr);
+
+#define LIBXL_QEMU_BASE_UID (6000)
 #endif
 
 /*
-- 
1.7.10.4




Re: [Xen-devel] [OSSTEST PATCH 5/6] ts-xen-build-prep: mkfs a new /home/osstest, don't resize2fs

2015-05-14 Thread Ian Jackson
Ian Campbell writes ("Re: [OSSTEST PATCH 5/6] ts-xen-build-prep: mkfs a new /home/osstest, don't resize2fs"):
> On Wed, 2015-05-13 at 12:15 +0100, Ian Jackson wrote:
> > +   echo '$mapper $dir $fstype defaults 0 0' >>/etc/fstab
> 
> This doesn't update /etc/fstab so in standalone mode (when a host may
> share build and test duties) the reboots will cause the
> old /home/osstest to reappear and then I'm not sure if this all does
> what we want or not with the existing lv.

That line there is the update to /etc/fstab.

The result is that if the host reboots the new fs should be mounted,
and then when this runs again this

> > +   if mount | sed -e 's/^[^ ].* on //; s/ .*//' | grep -F '$dir'; then
> > +   exit 0
> > +   fi

is supposed to arrange that the whole thing is a no-op.

Also note that the script makes the mount point with mode 2700 so that
normal build scripts will crash if the fs is not mounted for some
reason:

> + mkdir -m 2700 $dir

Ian.



Re: [Xen-devel] [RFC 0/4] Support more than 8 vcpus on arm64 with GICv3

2015-05-14 Thread Julien Grall
Hi Chen,

On 14/05/15 15:14, Chen Baozi wrote:
> From: Chen Baozi 
> 
> Currently the number of vcpus on arm64 with GICv3 is limited up to 8 due
> to the fixed size of redistributor mmio region. In this patch series, I
> postpone setting the size of GICR0 to the point when max_vcpus of a domU is
> determined to support more than 8 redistributors.

I don't think postponing is necessary. We have plenty of space in the
RAM guest layout to reserve a region for 128 redistributors.

If the guest tries to access the wrong redistributor, it will
receive a data abort. That's already the case today when the guest is
using fewer than 8 vCPUs.

Although you would need to reshuffle the layout a bit. With your
solution, if the guest is using 128 vCPUs it will overlap the
grant-table region, magic page (xenstore, xenconsole,...) and the
beginning of the RAM. whoops ;).

> However, I am not quite sure that decoupling the rdist base and size
> setting of domU to different functions is appropriate, though I am now
> able to create both a dom0 and a domU of 8+ vcpus with these patches.
> So any comments/suggestions are welcomed.

I'm afraid to say that your suggestion is only enough to support up to
16 vCPUs per guest.

The vGIC is only using the affinity 0 of the MPIDR (AFF1, AFF2 and AFF3
are ignored).

Affinity 0 represents the CPU in a cluster and AFF{1,2,3} the cluster
ID. Each cluster can support up to 16 CPUs.

You will also need to change the way the domain creates the MPIDR;
currently it assumes that AFF0 == vcpu_id (see vcpu_initialise).
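
To make the mapping concrete, here is a small sketch of spreading vCPU IDs over clusters of 16. It is only an illustration: the function names are hypothetical, and AFF2/AFF3 are left at zero. The bit positions (Aff0 in bits [7:0], Aff1 in bits [15:8]) follow the MPIDR layout.

```c
#include <assert.h>
#include <stdint.h>

#define CPUS_PER_CLUSTER 16u  /* AFF0 values used per cluster, as discussed */

/* Build the affinity part of an MPIDR value from a vCPU ID. */
uint64_t vcpuid_to_vaffinity(unsigned int vcpu_id)
{
    unsigned int aff0 = vcpu_id % CPUS_PER_CLUSTER;
    unsigned int aff1 = vcpu_id / CPUS_PER_CLUSTER;

    assert(aff1 <= 0xff);  /* this sketch never uses AFF2/AFF3 */
    return ((uint64_t)aff1 << 8) | aff0;
}

/* Inverse mapping: recover the vCPU ID from AFF1 and AFF0. */
unsigned int vaffinity_to_vcpuid(uint64_t mpidr)
{
    unsigned int aff0 = mpidr & 0xff;
    unsigned int aff1 = (mpidr >> 8) & 0xff;

    return aff1 * CPUS_PER_CLUSTER + aff0;
}
```

With this scheme vCPU 16 becomes AFF1=1, AFF0=0 rather than AFF0=16, so the vGIC would have to decode AFF1 as well when routing interrupts.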

Lastly, you need more care for the GICv2 case. We don't want a user to
create a guest with more than 8 vCPUs.

Even though supporting more than 16 vCPUs would be nice, it would require
more work. I would be fine if you decide to only bump to 16 vCPUs for now.

Regards,

-- 
Julien Grall



[Xen-devel] [OSSTEST PATCH] ts-logs-capture: Collect /var/log/messages and /var/log/debug

2015-05-14 Thread Ian Jackson
Signed-off-by: Ian Jackson 
---
 ts-logs-capture |2 ++
 1 file changed, 2 insertions(+)

diff --git a/ts-logs-capture b/ts-logs-capture
index 4ad55b9..6445e03 100755
--- a/ts-logs-capture
+++ b/ts-logs-capture
@@ -102,6 +102,8 @@ sub fetch_logs_host_guests () {
   /var/log/kern.log*
   /var/log/syslog*
   /var/log/daemon.log*
+  /var/log/messages*
+  /var/log/debug*
 
   /var/log/dmesg*
   /var/log/user.log*
-- 
1.7.10.4




Re: [Xen-devel] [ovmf test] 55883: regressions - FAIL

2015-05-14 Thread Ian Jackson
osstest service user writes ("[ovmf test] 55883: regressions - FAIL"):
> flight 55883 ovmf real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/55883/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  build-amd64-xsm   5 xen-build fail REGR. vs. 55353

home/osstest/build.55883.build-amd64-xsm/xen/tools/firmware/ovmf-dir-remote/MdePkg/Include/Library/BaseLib.h:1560:1:
 note: expected 'const CHAR8 *' but argument is of type 'UINT8 *'
/home/osstest/build.55883.build-amd64-xsm/xen/tools/firmware/ovmf-dir-remote/MdeModulePkg/Universal/HiiDatabaseDxe/ConfigKeywordHandler.c:1139:11:
 error: pointer targets in passing argument 1 of 'AsciiStrToUnicodeStr' differ 
in signedness [-Werror=pointer-sign]
In file included from 
/home/osstest/build.55883.build-amd64-xsm/xen/tools/firmware/ovmf-dir-remote/MdeModulePkg/Universal/HiiDatabaseDxe/HiiDatabase.h:38:0,
 from 
/home/osstest/build.55883.build-amd64-xsm/xen/tools/firmware/ovmf-dir-remote/MdeModulePkg/Universal/HiiDatabaseDxe/ConfigKeywordHandler.c:16:
/home/osstest/build.55883.build-amd64-xsm/xen/tools/firmware/ovmf-dir-remote/MdePkg/Include/Library/BaseLib.h:1560:1:
 note: expected 'const CHAR8 *' but argument is of type 'UINT8 *'

Ian.



Re: [Xen-devel] [PATCH V2] xen/vm_event: Clean up control-register-write vm_events

2015-05-14 Thread Tamas K Lengyel
On Thu, May 14, 2015 at 7:11 PM, Razvan Cojocaru
 wrote:
> On 05/14/2015 07:55 PM, Tamas K Lengyel wrote:
>>> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
>>> index 45b5283..1dd49dd 100644
>>> --- a/xen/include/asm-x86/domain.h
>>> +++ b/xen/include/asm-x86/domain.h
>>> @@ -341,19 +341,13 @@ struct arch_domain
>>>
>>>  /* Monitor options */
>>>  struct {
>>> -uint16_t mov_to_cr0_enabled  : 1;
>>> -uint16_t mov_to_cr0_sync : 1;
>>> -uint16_t mov_to_cr0_onchangeonly : 1;
>>> -uint16_t mov_to_cr3_enabled  : 1;
>>> -uint16_t mov_to_cr3_sync : 1;
>>> -uint16_t mov_to_cr3_onchangeonly : 1;
>>> -uint16_t mov_to_cr4_enabled  : 1;
>>> -uint16_t mov_to_cr4_sync : 1;
>>> -uint16_t mov_to_cr4_onchangeonly : 1;
>>> -uint16_t mov_to_msr_enabled  : 1;
>>> -uint16_t mov_to_msr_extended : 1;
>>> -uint16_t singlestep_enabled  : 1;
>>> -uint16_t software_breakpoint_enabled : 1;
>>> +uint32_t write_ctrlreg_enabled   : 8;
>>> +uint32_t write_ctrlreg_sync  : 8;
>>> +uint32_t write_ctrlreg_onchangeonly  : 8;
>>
>> Any particular reason why you have these bitmaps 8-bits wide? There
>> are only 4 events defined at the moment that would use these.
>
> ARM control registers have been mentioned, so I thought I would leave
> some space for a few more events. Other than that, they don't _need_ to
> be 8-bits wide. If compactness matters more I'll change them to 4.
>
>
> Thanks,
> Razvan

IMHO it's better to widen the field when there is an actual need for
it. For now I think 4 would be appropriate.

Thanks,
Tamas



[Xen-devel] [RFC 11/23] xen: Add Xen specific page definition

2015-05-14 Thread Julien Grall
The Xen hypercall interface always uses 4K page granularity on the ARM
and x86 architectures.

With the incoming support for 64K page granularity in ARM64 guests, it
won't be possible to reuse the Linux page definitions in Xen drivers.

Introduce Xen page definition helpers based on the Linux page
definitions. They have exactly the same names, but with a XEN_/xen_
prefix.

Also modify page_to_mfn to use the new Xen page definitions.
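
For readers who want to check the arithmetic, the frame-number conversions can be sketched in user space as follows. The 64KB Linux page size is an assumption for the example, and the LINUX_* names and helper functions are hypothetical stand-ins for the kernel's PAGE_* definitions and the macros in the diff below.

```c
#include <assert.h>
#include <stdint.h>

#define LINUX_PAGE_SHIFT  16  /* assumed: 64KB Linux pages */
#define LINUX_PAGE_SIZE   (1UL << LINUX_PAGE_SHIFT)
#define XEN_PAGE_SHIFT    12  /* the hypercall interface uses 4KB pages */
#define XEN_PAGE_SIZE     (1UL << XEN_PAGE_SHIFT)
#define XEN_PAGE_MASK     (~(XEN_PAGE_SIZE - 1))
#define XEN_PFN_PER_PAGE  (LINUX_PAGE_SIZE / XEN_PAGE_SIZE)

static_assert(XEN_PFN_PER_PAGE == 16, "one 64KB page spans sixteen 4KB frames");

/* First Xen (4KB) frame number covered by a Linux frame number. */
static inline unsigned long linux_pfn_to_xen_pfn(unsigned long pfn)
{
    return (pfn << LINUX_PAGE_SHIFT) >> XEN_PAGE_SHIFT;
}

/* Linux frame number that contains a given Xen frame number. */
static inline unsigned long xen_pfn_to_linux_pfn(unsigned long xen_pfn)
{
    return (xen_pfn << XEN_PAGE_SHIFT) >> LINUX_PAGE_SHIFT;
}

/* Byte offset of an address inside its 4KB Xen page. */
static inline unsigned long xen_offset_in_page(unsigned long addr)
{
    return addr & ~XEN_PAGE_MASK;
}
```

When both shifts are 12, as on x86 or 4KB ARM kernels, both conversions reduce to the identity.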

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 include/xen/page.h | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/xen/page.h b/include/xen/page.h
index c5ed20b..89ae01c 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -1,11 +1,28 @@
 #ifndef _XEN_PAGE_H
 #define _XEN_PAGE_H
 
+#include 
+
+/* The hypercall interface supports only 4KB page */
+#define XEN_PAGE_SHIFT 12
+#define XEN_PAGE_SIZE  (_AC(1,UL) << XEN_PAGE_SHIFT)
+#define XEN_PAGE_MASK  (~(XEN_PAGE_SIZE-1))
+#define xen_offset_in_page(p)  ((unsigned long)(p) & ~XEN_PAGE_MASK)
+#define xen_pfn_to_page(pfn)   \
+   ((pfn_to_page(((unsigned long)(pfn) << XEN_PAGE_SHIFT) >> PAGE_SHIFT)))
+#define xen_page_to_pfn(page)  \
+   (((page_to_pfn(page)) << PAGE_SHIFT) >> XEN_PAGE_SHIFT)
+
+#define XEN_PFN_PER_PAGE   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define XEN_PFN_DOWN(x)((x) >> XEN_PAGE_SHIFT)
+#define XEN_PFN_PHYS(x)((phys_addr_t)(x) << XEN_PAGE_SHIFT)
+
 #include 
 
 static inline unsigned long page_to_mfn(struct page *page)
 {
-   return pfn_to_mfn(page_to_pfn(page));
+   return pfn_to_mfn(xen_page_to_pfn(page));
 }
 
 struct xen_memory_region {
-- 
2.1.4




[Xen-devel] [RFC 12/23] xen: Extend page_to_mfn to take an offset in the page

2015-05-14 Thread Julien Grall
With 64KB page granularity support in Linux, a page will be split across
multiple MFNs (Xen is using 4KB page granularity). Those MFNs may not be
contiguous.

With the offset in the page, the helper will be able to know which MFN
the driver needs to retrieve.
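
A toy model of the lookup, assuming 64KB Linux pages: struct fake_page and fake_page_to_mfn are hypothetical stand-ins in which a per-page table plays the role of pfn_to_mfn(), so the non-contiguous case is easy to see.

```c
#include <assert.h>
#include <stddef.h>

#define XEN_PAGE_SHIFT    12
#define LINUX_PAGE_SIZE   (1UL << 16)  /* assumed 64KB Linux page */
#define XEN_PFN_PER_PAGE  (LINUX_PAGE_SIZE >> XEN_PAGE_SHIFT)

/* Hypothetical stand-in for struct page: the MFN behind each 4KB slice. */
struct fake_page {
    unsigned long mfn[XEN_PFN_PER_PAGE];
};

/* Pick the MFN backing the 4KB slice that contains 'offset'. */
unsigned long fake_page_to_mfn(const struct fake_page *page, unsigned int offset)
{
    assert(offset < LINUX_PAGE_SIZE);
    return page->mfn[offset >> XEN_PAGE_SHIFT];
}
```

Offsets 0 through 4095 resolve to the first slice, 4096 through 8191 to the second, and so on; nothing requires consecutive slices to have consecutive MFNs.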

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: net...@vger.kernel.org
---
 drivers/net/xen-netfront.c | 2 +-
 include/xen/page.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 381d38f..6a0e329 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -431,7 +431,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
BUG_ON((signed short)ref < 0);
 
gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-   page_to_mfn(page), GNTMAP_readonly);
+   page_to_mfn(page, 0), GNTMAP_readonly);
 
queue->tx_skbs[id].skb = skb;
queue->grant_tx_page[id] = page;
diff --git a/include/xen/page.h b/include/xen/page.h
index 89ae01c..8848da1 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -20,9 +20,9 @@
 
 #include 
 
-static inline unsigned long page_to_mfn(struct page *page)
+static inline unsigned long page_to_mfn(struct page *page, unsigned int offset)
 {
-   return pfn_to_mfn(xen_page_to_pfn(page));
+   return pfn_to_mfn(xen_page_to_pfn(page) + (offset >> XEN_PAGE_SHIFT));
 }
 
 struct xen_memory_region {
-- 
2.1.4




[Xen-devel] [RFC 20/23] net/xen-netfront: Make it running on 64KB page granularity

2015-05-14 Thread Julien Grall
The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux guest using 64KB page granularity to use a
network device on an unmodified Xen.

It's only necessary to adapt the ring size and break the skb data into
small 4KB chunks. The rest of the code relies on the grant table code.

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.
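
The chunking rule can be sketched on its own as below. This is a simplified illustration, not the driver code: struct chunk and split_into_grants are hypothetical, and only the off_grant/len computation mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

#define XEN_PAGE_SIZE 4096u
#define XEN_PAGE_MASK (~(XEN_PAGE_SIZE - 1))

/* One piece of data that fits inside a single 4KB grant. */
struct chunk {
    unsigned int offset;  /* offset within the (possibly 64KB) Linux page */
    unsigned int len;
};

/*
 * Split [offset, offset + len) into pieces that never cross a 4KB boundary.
 * Returns the number of chunks written, at most 'max'.
 */
size_t split_into_grants(unsigned int offset, unsigned int len,
                         struct chunk *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (len > 0 && n < max) {
        unsigned int off_grant = offset & ~XEN_PAGE_MASK;
        unsigned int room = XEN_PAGE_SIZE - off_grant;
        unsigned int take = len < room ? len : room;

        out[n].offset = offset;
        out[n].len = take;
        n++;
        offset += take;
        len -= take;
    }
    return n;
}
```

For example, 10000 bytes starting at offset 100 become three chunks of 3996, 4096 and 1908 bytes, each of which would get its own tx request and grant.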

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: net...@vger.kernel.org

---

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a Linux
using 64KB pages on an unmodified Xen.

Tested with workloads such as ping, ssh, wget, git... I would be happy if
someone could give details on how to test all the paths.
---
 drivers/net/xen-netfront.c | 43 ++-
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 6a0e329..32a1cb2 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF  0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -267,7 +267,7 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
kfree_skb(skb);
return NULL;
}
-   skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
+   skb_add_rx_frag(skb, 0, page, 0, 0, XEN_PAGE_SIZE);
 
/* Align ip header to a 16 bytes boundary */
skb_reserve(skb, NET_IP_ALIGN);
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
struct sk_buff *skb;
unsigned short id;
grant_ref_t ref;
-   unsigned long pfn;
+   unsigned long mfn;
struct xen_netif_rx_request *req;
 
skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,12 +307,12 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
BUG_ON((signed short)ref < 0);
queue->grant_rx_ref[id] = ref;
 
-   pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+   mfn = page_to_mfn(skb_frag_page(&skb_shinfo(skb)->frags[0]), 0);
 
req = RING_GET_REQUEST(&queue->rx, req_prod);
gnttab_grant_foreign_access_ref(ref,
queue->info->xbdev->otherend_id,
-   pfn_to_mfn(pfn),
+   mfn,
0);
 
req->id = id;
@@ -422,8 +422,10 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
unsigned int id;
struct xen_netif_tx_request *tx;
grant_ref_t ref;
+   unsigned int off_grant;
 
-   len = min_t(unsigned int, PAGE_SIZE - offset, len);
+   off_grant = offset & ~XEN_PAGE_MASK;
+   len = min_t(unsigned int, XEN_PAGE_SIZE - off_grant, len);
 
id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
@@ -431,7 +433,8 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
BUG_ON((signed short)ref < 0);
 
gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-   page_to_mfn(page, 0), GNTMAP_readonly);
+   page_to_mfn(page, offset),
+   GNTMAP_readonly);
 
queue->tx_skbs[id].skb = skb;
queue->grant_tx_page[id] = page;
@@ -439,7 +442,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 
tx->id = id;
tx->gref = ref;
-   tx->offset = offset;
+   tx->offset = off_grant;
tx->size = len;
tx->flags = 0;
 
@@ -459,8 +462,11 @@ static struct xen_netif_tx_request *xennet_make_txreqs(
tx->flags |= XEN_NETTXF_more_data;
tx = xennet_make_one_txreq(queue, skb_get(skb),
   page, offset, len);
-   page++;
-   offset = 0;
+   offset += tx->size;
+   if (offset == PAGE_SIZE) {
+   page++;
+   offset = 0;
+   }
len -= tx->size;
}
 
@@ -567,8 +573,11 @@ static int xennet_start_xmit(struct sk_buff *sk

[Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity

2015-05-14 Thread Julien Grall
The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to work as a
network backend on an unmodified Xen.

It's only necessary to adapt the ring size and break the skb data into
small chunks of 4KB. The rest of the code relies on the grant table code.
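
For illustration only (not part of the patch): a minimal user-space sketch
of how a byte range inside a 64KB Linux page can be broken into pieces that
never cross a 4KB boundary, mirroring the off_grant computation used in the
diff. The sketch_* names and the chunk struct are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: Xen frames (and therefore grants) are 4KB. */
#define SKETCH_XEN_PAGE_SHIFT 12
#define SKETCH_XEN_PAGE_SIZE  (1UL << SKETCH_XEN_PAGE_SHIFT)
#define SKETCH_XEN_PAGE_MASK  (~(SKETCH_XEN_PAGE_SIZE - 1))

/* Hypothetical descriptor for one 4KB-bounded piece of a Linux page. */
struct sketch_chunk {
	unsigned long frame;     /* index of the 4KB frame inside the Linux page */
	unsigned long off_grant; /* offset inside that frame */
	unsigned long len;       /* number of bytes in this piece */
};

/*
 * Split [offset, offset + len) into pieces that stay within one 4KB frame.
 * Returns the number of pieces written to out (at most max).
 */
static size_t sketch_split(unsigned long offset, unsigned long len,
			   struct sketch_chunk *out, size_t max)
{
	size_t n = 0;

	assert(out != NULL);
	while (len > 0 && n < max) {
		unsigned long off_grant = offset & ~SKETCH_XEN_PAGE_MASK;
		unsigned long bytes = SKETCH_XEN_PAGE_SIZE - off_grant;

		if (bytes > len)
			bytes = len;

		out[n].frame = offset >> SKETCH_XEN_PAGE_SHIFT;
		out[n].off_grant = off_grant;
		out[n].len = bytes;
		n++;

		offset += bytes;
		len -= bytes;
	}
	return n;
}
```

With offset 4000 and length 5000, this yields three pieces: 96 bytes at the
end of frame 0, all of frame 1, and 808 bytes at the start of frame 2.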

However, only simple workloads work so far (DHCP request, ping). If I try
to use wget in the guest, it stalls until tcpdump is started on
the vif interface in DOM0. I wasn't able to find out why.

I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
it's used for (I have limited knowledge of the network driver).

Signed-off-by: Julien Grall 
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: net...@vger.kernel.org

---

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on an unmodified Xen.
---
 drivers/net/xen-netback/common.h  |  7 ---
 drivers/net/xen-netback/netback.c | 27 ++-
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8a495b3..0eda6e9 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -44,6 +44,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 typedef unsigned int pending_ring_idx_t;
@@ -64,8 +65,8 @@ struct pending_tx_info {
struct ubuf_info callback_struct;
 };
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 struct xenvif_rx_meta {
int id;
@@ -80,7 +81,7 @@ struct xenvif_rx_meta {
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0x
 
-#define MAX_BUFFER_OFFSET PAGE_SIZE
+#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
 
 #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
 
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 9ae1d43..ea5ce84 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
 {
struct gnttab_copy *copy_gop;
struct xenvif_rx_meta *meta;
-   unsigned long bytes;
+   unsigned long bytes, off_grant;
int gso_type = XEN_NETIF_GSO_TYPE_NONE;
 
/* Data must not cross a page boundary. */
@@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
if (npo->copy_off == MAX_BUFFER_OFFSET)
meta = get_next_rx_buffer(queue, npo);
 
-   bytes = PAGE_SIZE - offset;
+   off_grant = offset & ~XEN_PAGE_MASK;
+   bytes = XEN_PAGE_SIZE - off_grant;
if (bytes > size)
bytes = size;
 
@@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
} else {
copy_gop->source.domid = DOMID_SELF;
copy_gop->source.u.gmfn =
-   virt_to_mfn(page_address(page));
+   virt_to_mfn(page_address(page) + offset);
}
-   copy_gop->source.offset = offset;
+   copy_gop->source.offset = off_grant;
 
copy_gop->dest.domid = queue->vif->domid;
copy_gop->dest.offset = npo->copy_off;
@@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
first->size -= txp->size;
slots++;
 
-   if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
+   if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
 txp->offset, txp->size);
xenvif_fatal_tx_err(queue->vif);
@@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
}
 
/* No crossing a page as the payload mustn't fragment. */
-   if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
+   if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
netdev_err(queue->vif->dev,
   "txreq.offset: %x, size: %u, end: %lu\n",
   txreq.offset, txreq.size,
-  (txreq.offset&~PAGE_MASK) + txreq.size);
+  (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
xenvif_fatal_tx_err(queue->vif);
break;
 

[Xen-devel] [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity

2015-05-14 Thread Julien Grall
The hypercall interface (as well as the toolstack) always uses 4KB
page granularity. When the toolstack asks to map a series of
guest PFNs in a batch, it expects the pages to be mapped contiguously in
its virtual memory.

When Linux is using 64KB page granularity, the privcmd driver has
to map multiple Xen PFNs into a single Linux page.

Note that this solution works for any page granularity which is a multiple of
4KB.
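
For illustration only (not part of the patch): a small user-space sketch of
the arithmetic behind packing several 4KB Xen PFNs into one Linux page. The
sketch_* helpers are hypothetical; the rounding corresponds to the
DIV_ROUND_UP on m.num in the diff.

```c
#include <assert.h>

/* Assumption: the Xen/toolstack granularity is 4KB. */
#define SKETCH_XEN_PAGE_SIZE 4096UL

/* Number of 4KB Xen PFNs that fit in one Linux page. */
static unsigned long sketch_xen_pfn_per_page(unsigned long linux_page_size)
{
	assert(linux_page_size >= SKETCH_XEN_PAGE_SIZE);
	assert(linux_page_size % SKETCH_XEN_PAGE_SIZE == 0);
	return linux_page_size / SKETCH_XEN_PAGE_SIZE;
}

/* Linux pages needed to hold nr_xen_pfn contiguous 4KB mappings. */
static unsigned long sketch_linux_pages_needed(unsigned long nr_xen_pfn,
					       unsigned long linux_page_size)
{
	unsigned long per_page = sketch_xen_pfn_per_page(linux_page_size);

	return (nr_xen_pfn + per_page - 1) / per_page;
}

/* Byte offset of the i-th Xen PFN inside the Linux page that holds it. */
static unsigned long sketch_slot_offset(unsigned long i,
					unsigned long linux_page_size)
{
	return (i % sketch_xen_pfn_per_page(linux_page_size)) *
	       SKETCH_XEN_PAGE_SIZE;
}
```

For example, a batch of 17 Xen PFNs needs 2 Linux pages with 64KB pages but
17 Linux pages with 4KB pages.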

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/privcmd.c   |  8 +---
 drivers/xen/xlate_mmu.c | 31 ---
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 5a29616..e8714b4 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -446,7 +446,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, 
int version)
return -EINVAL;
}
 
-   nr_pages = m.num;
+   nr_pages = DIV_ROUND_UP_ULL(m.num, PAGE_SIZE / XEN_PAGE_SIZE);
if ((m.num <= 0) || (nr_pages > (LONG_MAX >> PAGE_SHIFT)))
return -EINVAL;
 
@@ -494,7 +494,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, 
int version)
goto out_unlock;
}
if (xen_feature(XENFEAT_auto_translated_physmap)) {
-   ret = alloc_empty_pages(vma, m.num);
+   ret = alloc_empty_pages(vma, nr_pages);
if (ret < 0)
goto out_unlock;
} else
@@ -518,6 +518,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, 
int version)
state.global_error  = 0;
state.version   = version;
 
+   BUILD_BUG_ON(((PAGE_SIZE / sizeof(xen_pfn_t)) % XEN_PFN_PER_PAGE) != 0);
/* mmap_batch_fn guarantees ret == 0 */
BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
&pagelist, mmap_batch_fn, &state));
@@ -582,12 +583,13 @@ static void privcmd_close(struct vm_area_struct *vma)
 {
struct page **pages = vma->vm_private_data;
int numpgs = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+   int nr_pfn = (vma->vm_end - vma->vm_start) >> XEN_PAGE_SHIFT;
int rc;
 
if (!xen_feature(XENFEAT_auto_translated_physmap) || !numpgs || !pages)
return;
 
-   rc = xen_unmap_domain_mfn_range(vma, numpgs, pages);
+   rc = xen_unmap_domain_mfn_range(vma, nr_pfn, pages);
if (rc == 0)
free_xenballooned_pages(numpgs, pages);
else
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index 58a5389..b9dfe1b 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long 
fgmfn,
 
 struct remap_data {
xen_pfn_t *fgmfn; /* foreign domain's gmfn */
+   xen_pfn_t *egmfn; /* end foreign domain's gmfn */
pgprot_t prot;
domid_t  domid;
struct vm_area_struct *vma;
@@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, 
unsigned long addr,
 {
struct remap_data *info = data;
struct page *page = info->pages[info->index++];
-   unsigned long pfn = page_to_pfn(page);
-   pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
+   unsigned long pfn = xen_page_to_pfn(page);
+   pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
int rc;
-
-   rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
-   *info->err_ptr++ = rc;
-   if (!rc) {
-   set_pte_at(info->vma->vm_mm, addr, ptep, pte);
-   info->mapped++;
+   uint32_t i;
+
+   for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
+   if (info->fgmfn == info->egmfn)
+   break;
+
+   rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
+   *info->err_ptr++ = rc;
+   if (!rc) {
+   set_pte_at(info->vma->vm_mm, addr, ptep, pte);
+   info->mapped++;
+   }
+   info->fgmfn++;
}
-   info->fgmfn++;
 
return 0;
 }
@@ -102,13 +109,14 @@ int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
 {
int err;
struct remap_data data;
-   unsigned long range = nr << PAGE_SHIFT;
+   unsigned long range = round_up(nr, XEN_PFN_PER_PAGE) << XEN_PAGE_SHIFT;
 
/* Kept here for the purpose of making sure code doesn't break
   x86 PVOPS */
BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
 
data.fgmfn = mfn;
+   data.egmfn = mfn + nr;
data.prot  = prot;
data.domid = domid;
data.vma   = vma;
@@ -132,7 +140,8 @@ int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
struct xen_remove_from_physmap xrp;
unsigned

[Xen-devel] [RFC 14/23] tty/hvc: xen: Use xen page definition

2015-05-14 Thread Julien Grall
The console ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall 
Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: David Vrabel 
Cc: Stefano Stabellini 
Cc: Boris Ostrovsky 
Cc: linuxppc-...@lists.ozlabs.org
---
 drivers/tty/hvc/hvc_xen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 5bab1c6..a68d115 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -230,7 +230,7 @@ static int xen_hvm_console_init(void)
if (r < 0 || v == 0)
goto err;
mfn = v;
-   info->intf = xen_remap(mfn << PAGE_SHIFT, PAGE_SIZE);
+   info->intf = xen_remap(mfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
if (info->intf == NULL)
goto err;
info->vtermno = HVC_COOKIE;
@@ -392,7 +392,7 @@ static int xencons_connect_backend(struct xenbus_device 
*dev,
if (xen_pv_domain())
mfn = virt_to_mfn(info->intf);
else
-   mfn = __pa(info->intf) >> PAGE_SHIFT;
+   mfn = __pa(info->intf) >> XEN_PAGE_SHIFT;
ret = gnttab_alloc_grant_references(1, &gref_head);
if (ret < 0)
return ret;
@@ -476,7 +476,7 @@ static int xencons_resume(struct xenbus_device *dev)
struct xencons_info *info = dev_get_drvdata(&dev->dev);
 
xencons_disconnect_backend(info);
-   memset(info->intf, 0, PAGE_SIZE);
+   memset(info->intf, 0, XEN_PAGE_SIZE);
return xencons_connect_backend(dev, info);
 }
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [RFC 18/23] block/xen-blkfront: Make it running on 64KB page granularity

2015-05-14 Thread Julien Grall
From: Julien Grall 

The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to use a block
device on an unmodified Xen.

The block API uses segments which should be at least the size of a
Linux page. Therefore, the driver has to break the page into chunks
of 4KB before giving the page to the backend.

Breaking a 64KB segment into 4KB chunks will leave some chunks with
no data. As the PV protocol always requires data in a chunk, we
have to count the number of Xen pages which will be in use and avoid
sending empty chunks.

Note that a pre-defined number of grants is reserved before preparing
the request. This pre-defined number is based on the number and the
maximum size of the segments. If each segment contains a very small
amount of data, the driver may reserve too many grants (16 grants are
reserved per segment with 64KB page granularity).
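
For illustration only (not part of the patch): a user-space sketch comparing
the number of 4KB grants a segment actually touches with the worst-case
number reserved up front. The sketch_* helpers are hypothetical and assume
4KB Xen pages.

```c
#include <assert.h>

/* Assumption: one grant covers exactly one 4KB Xen page. */
#define SKETCH_XEN_PAGE_SHIFT 12
#define SKETCH_XEN_PAGE_SIZE  (1UL << SKETCH_XEN_PAGE_SHIFT)

/* Grants actually needed for len bytes starting at offset in a Linux page. */
static unsigned long sketch_grants_used(unsigned long offset,
					unsigned long len)
{
	unsigned long first, last;

	if (len == 0)
		return 0; /* empty chunks must not be sent */

	first = offset >> SKETCH_XEN_PAGE_SHIFT;
	last = (offset + len - 1) >> SKETCH_XEN_PAGE_SHIFT;
	return last - first + 1;
}

/* Worst-case reservation: every segment may span a whole Linux page. */
static unsigned long sketch_grants_reserved(unsigned long nr_segments,
					    unsigned long linux_page_size)
{
	assert(linux_page_size % SKETCH_XEN_PAGE_SIZE == 0);
	return nr_segments * (linux_page_size / SKETCH_XEN_PAGE_SIZE);
}
```

A 512-byte segment uses a single grant, yet 16 grants are set aside for it
when the Linux page size is 64KB.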

Furthermore, in the case of persistent grants we allocate one Linux page
per grant although only the first 4KB of the page will effectively be used.
This could be improved by sharing the page between multiple grants.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Roger Pau Monné 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

---

Improvements such as support for 64KB grants are not taken into consideration
in this patch because we have the requirement to run a Linux using 64KB pages
on an unmodified Xen.
---
 drivers/block/xen-blkfront.c | 259 ++-
 1 file changed, 156 insertions(+), 103 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 60cf1d6..c6537ed 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -77,6 +77,7 @@ struct blk_shadow {
struct grant **grants_used;
struct grant **indirect_grants;
struct scatterlist *sg;
+   unsigned int num_sg;
 };
 
 struct split_bio {
@@ -98,7 +99,7 @@ static unsigned int xen_blkif_max_segments = 32;
 module_param_named(max, xen_blkif_max_segments, int, S_IRUGO);
 MODULE_PARM_DESC(max, "Maximum amount of segments in indirect requests (default is 32)");
 
-#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE)
+#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE)
 
 /*
  * We have one of these per vbd, whether ide, scsi or 'other'.  They
@@ -131,6 +132,7 @@ struct blkfront_info
unsigned int discard_granularity;
unsigned int discard_alignment;
unsigned int feature_persistent:1;
+   /* Number of 4K segment handled */
unsigned int max_indirect_segments;
int is_ready;
 };
@@ -158,10 +160,19 @@ static DEFINE_SPINLOCK(minor_lock);
 
 #define DEV_NAME   "xvd"   /* name in /dev */
 
-#define SEGS_PER_INDIRECT_FRAME \
-   (PAGE_SIZE/sizeof(struct blkif_request_segment))
-#define INDIRECT_GREFS(_segs) \
-   ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+/*
+ * Xen use 4K pages. The guest may use different page size (4K or 64K)
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define SEGS_PER_INDIRECT_FRAME\
+   (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT)
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+   (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+
+#define INDIRECT_GREFS(_pages) \
+   ((_pages + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
 
@@ -204,7 +215,7 @@ static int fill_grant_buffer(struct blkfront_info *info, 
int num)
kfree(gnt_list_entry);
goto out_of_memory;
}
-   gnt_list_entry->pfn = page_to_pfn(granted_page);
+   gnt_list_entry->pfn = xen_page_to_pfn(granted_page);
}
 
gnt_list_entry->gref = GRANT_INVALID_REF;
@@ -219,7 +230,7 @@ out_of_memory:
 &info->grants, node) {
list_del(&gnt_list_entry->node);
if (info->feature_persistent)
-   __free_page(pfn_to_page(gnt_list_entry->pfn));
+   __free_page(xen_pfn_to_page(gnt_list_entry->pfn));
kfree(gnt_list_entry);
i--;
}
@@ -389,7 +400,8 @@ static int blkif_queue_request(struct request *req)
struct blkif_request *ring_req;
unsigned long id;
unsigned int fsect, lsect;
-   int i, ref, n;
+   unsigned int shared_off, shared_len, bvec_off, sg_total;
+   int i, ref, n, grant;
struct blkif_request_segment *segments = NULL;
 
/*
@@ -401,18 +413,19 @@ static int blkif_queue_request(struct request *req)
grant_ref_t gref_head;
struct grant *gnt_list_entry = NULL;
struct scatterlist *sg;
-   int nseg, max_grefs;
+   int nseg, max_gref

[Xen-devel] [RFC 17/23] xen/grant-table: Make it running on 64KB granularity

2015-05-14 Thread Julien Grall
The Xen interface uses 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page between multiple
grants. It will require some care with the {Set,Clear}ForeignPage macros.

Note that no changes have been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.

Signed-off-by: Julien Grall 
Cc: Stefano Stabellini 
Cc: Russell King 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 arch/arm/xen/p2m.c| 6 +++---
 drivers/xen/grant-table.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 887596c..0ed01f2 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref 
*map_ops,
for (i = 0; i < count; i++) {
if (map_ops[i].status)
continue;
-   set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
-   map_ops[i].dev_bus_addr >> PAGE_SHIFT);
+   set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+   map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
}
 
return 0;
@@ -108,7 +108,7 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref 
*unmap_ops,
int i;
 
for (i = 0; i < count; i++) {
-   set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
+   set_phys_to_machine(unmap_ops[i].host_addr >> XEN_PAGE_SHIFT,
INVALID_P2M_ENTRY);
}
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 62f591f..dc0a787 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -642,7 +642,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
if (xen_auto_xlat_grant_frames.count)
return -EINVAL;
 
-   vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes);
+   vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
if (vaddr == NULL) {
pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
&addr);
@@ -654,7 +654,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
return -ENOMEM;
}
for (i = 0; i < max_nr_gframes; i++)
-   pfn[i] = PFN_DOWN(addr) + i;
+   pfn[i] = XEN_PFN_DOWN(addr) + i;
 
xen_auto_xlat_grant_frames.vaddr = vaddr;
xen_auto_xlat_grant_frames.pfn = pfn;
@@ -978,7 +978,7 @@ static void gnttab_request_version(void)
 {
/* Only version 1 is used, which will always be available. */
grant_table_version = 1;
-   grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
+   grefs_per_grant_frame = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1);
gnttab_interface = &gnttab_v1_ops;
 
pr_info("Grant tables using version %d layout\n", grant_table_version);
-- 
2.1.4



[Xen-devel] [RFC 23/23] arm/xen: Add support for 64KB page granularity

2015-05-14 Thread Julien Grall
The hypercall interface always uses 4KB page granularity. This
requires using the Xen page definition macros when we deal with hypercalls.

Note that pfn_to_mfn works with a Xen PFN (i.e. 4KB). We may want to
rename pfn_to_mfn to make this explicit.

We also allocate a 64KB page for the shared page even though only the
first 4KB is used. I don't think this is really important for now as it
helps to have the pointer 4KB aligned (XENMEM_add_to_physmap takes a
Xen PFN).
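
For illustration only (not part of the patch): a user-space sketch of the
difference between a Linux frame number and a Xen frame number for the same
physical address. The sketch_* helpers are hypothetical stand-ins for what
XEN_PFN_DOWN and xen_offset_in_page are assumed to compute.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Xen always uses 4KB frames, i.e. a shift of 12. */
#define SKETCH_XEN_PAGE_SHIFT 12

/* Frame number as seen by the hypercall interface (4KB units). */
static uint64_t sketch_xen_pfn_down(uint64_t paddr)
{
	return paddr >> SKETCH_XEN_PAGE_SHIFT;
}

/* Offset inside the 4KB Xen frame. */
static uint64_t sketch_xen_offset_in_page(uint64_t paddr)
{
	return paddr & ((UINT64_C(1) << SKETCH_XEN_PAGE_SHIFT) - 1);
}

/* Frame number as seen by Linux, for a given Linux page shift (12 or 16). */
static uint64_t sketch_linux_pfn_down(uint64_t paddr,
				      unsigned int linux_page_shift)
{
	assert(linux_page_shift >= SKETCH_XEN_PAGE_SHIFT);
	assert(linux_page_shift < 64);
	return paddr >> linux_page_shift;
}
```

For physical address 0x12345, the Xen frame is 0x12 with offset 0x345, while
a 64KB-page Linux sees frame 0x1. Passing the Linux frame number to a
hypercall would therefore name the wrong frame.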

Signed-off-by: Julien Grall 
Cc: Stefano Stabellini 
Cc: Russell King 
---
 arch/arm/include/asm/xen/page.h | 12 ++--
 arch/arm/xen/enlighten.c|  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 1bee8ca..ab6eb9a 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -56,19 +56,19 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
 
 static inline xmaddr_t phys_to_machine(xpaddr_t phys)
 {
-   unsigned offset = phys.paddr & ~PAGE_MASK;
-   return XMADDR(PFN_PHYS(pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
+   unsigned offset = phys.paddr & ~XEN_PAGE_MASK;
+   return XMADDR(XEN_PFN_PHYS(pfn_to_mfn(XEN_PFN_DOWN(phys.paddr))) | offset);
 }
 
 static inline xpaddr_t machine_to_phys(xmaddr_t machine)
 {
-   unsigned offset = machine.maddr & ~PAGE_MASK;
-   return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
+   unsigned offset = machine.maddr & ~XEN_PAGE_MASK;
+   return XPADDR(XEN_PFN_PHYS(mfn_to_pfn(XEN_PFN_DOWN(machine.maddr))) | offset);
 }
 /* VIRT <-> MACHINE conversion */
 #define virt_to_machine(v) (phys_to_machine(XPADDR(__pa(v))))
-#define virt_to_mfn(v) (pfn_to_mfn(virt_to_pfn(v)))
-#define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT))
+#define virt_to_mfn(v) (pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
+#define mfn_to_virt(m) (__va(mfn_to_pfn(m) << XEN_PAGE_SHIFT))
 
 static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
 {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 224081c..dcfe251 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -93,8 +93,8 @@ static void xen_percpu_init(void)
pr_info("Xen: initializing cpu%d\n", cpu);
vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
-   info.mfn = __pa(vcpup) >> PAGE_SHIFT;
-   info.offset = offset_in_page(vcpup);
+   info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
+   info.offset = xen_offset_in_page(vcpup);
 
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
BUG_ON(err);
@@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
xatp.domid = DOMID_SELF;
xatp.idx = 0;
xatp.space = XENMAPSPACE_shared_info;
-   xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+   xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
BUG();
 
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [RFC 10/23] xen/biomerge: WORKAROUND always says the biovec are not mergeable

2015-05-14 Thread Julien Grall
When Linux is using 64K page granularity, every page will be split into
multiple non-contiguous 4K MFNs.

I'm not sure how to efficiently handle the check to know whether we can
merge 2 biovecs in such a case. So for now, always say that biovecs are
not mergeable.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/biomerge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
index 0edb91c..20387c2 100644
--- a/drivers/xen/biomerge.c
+++ b/drivers/xen/biomerge.c
@@ -9,6 +9,9 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
 
+   /* TODO: Implement it correctly */
+   return 0;
+
return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
((mfn1 == mfn2) || ((mfn1+1) == mfn2));
 }
-- 
2.1.4



[Xen-devel] [RFC 15/23] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux

2015-05-14 Thread Julien Grall
For ARM64 guests, Linux is able to support either 64K or 4K page
granularity. However, the hypercall interface is always based on 4K
page granularity.

With 64K page granularity, a single page will be spread over multiple
Xen frames.

When a driver requests/frees a balloon page, the balloon driver has
to split the Linux page into 4K chunks before asking Xen to add/remove the
frames from the guest.

Note that this can work with any page granularity, assuming it's a multiple
of 4K.
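
For illustration only (not part of the patch): a user-space sketch of how a
list of Linux pages can be expanded into a list of 4K Xen frame numbers, in
the spirit of the frame_list loop in the diff. The sketch_* function is
hypothetical, and it assumes the first Xen frame of a Linux page is the
Linux PFN multiplied by the number of Xen frames per page.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Expand nr_pages Linux PFNs into consecutive 4K Xen frame numbers.
 * Only whole Linux pages are emitted, so a page is never half-ballooned.
 * Returns the number of entries written to frame_list.
 */
static size_t sketch_fill_frame_list(const unsigned long *linux_pfns,
				     size_t nr_pages,
				     unsigned long xen_pfn_per_page,
				     unsigned long *frame_list,
				     size_t max_frames)
{
	size_t n = 0;
	size_t p;
	unsigned long j;

	assert(xen_pfn_per_page > 0);
	for (p = 0; p < nr_pages; p++) {
		/* Assumed conversion from a Linux PFN to its first Xen PFN. */
		unsigned long first = linux_pfns[p] * xen_pfn_per_page;

		if (max_frames - n < xen_pfn_per_page)
			break; /* no room left for a whole Linux page */

		for (j = 0; j < xen_pfn_per_page; j++)
			frame_list[n++] = first + j;
	}
	return n;
}
```

With 16 Xen frames per Linux page, Linux PFNs 2 and 5 expand to Xen frames
32..47 and 80..95.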

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Wei Liu 

---

TODO/LIMITATIONS:
- When CONFIG_XEN_HAVE_PVMMU is enabled, only 4K page granularity is supported
- It may be possible to extend the concept for ballooning 2M/1G
pages.
---
 drivers/xen/balloon.c | 93 +--
 1 file changed, 60 insertions(+), 33 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index fd93369..f0d8666 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -91,7 +91,7 @@ struct balloon_stats balloon_stats;
 EXPORT_SYMBOL_GPL(balloon_stats);
 
 /* We increase/decrease in batches which fit in a page */
-static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
+static xen_pfn_t frame_list[XEN_PAGE_SIZE / sizeof(unsigned long)];
 
 
 /* List of ballooned pages, threaded through the mem_map array. */
@@ -326,7 +326,7 @@ static enum bp_state reserve_additional_memory(long credit)
 static enum bp_state increase_reservation(unsigned long nr_pages)
 {
int rc;
-   unsigned long  pfn, i;
+   unsigned long  pfn, i, nr_frames;
struct page   *page;
struct xen_memory_reservation reservation = {
.address_bits = 0,
@@ -343,30 +343,43 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
}
 #endif
 
-   if (nr_pages > ARRAY_SIZE(frame_list))
-   nr_pages = ARRAY_SIZE(frame_list);
+   if (nr_pages > (ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE))
+   nr_pages = ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE;
+
+   nr_frames = nr_pages * XEN_PFN_PER_PAGE;
+
+   pfn = 0; /* make gcc happy */
 
page = list_first_entry_or_null(&ballooned_pages, struct page, lru);
-   for (i = 0; i < nr_pages; i++) {
-   if (!page) {
-   nr_pages = i;
-   break;
+   for (i = 0; i < nr_frames; i++) {
+   if (!(i % XEN_PFN_PER_PAGE)) {
+   if (!page) {
+   nr_frames = i;
+   break;
+   }
+   pfn = xen_page_to_pfn(page);
+   page = balloon_next_page(page);
}
-   frame_list[i] = page_to_pfn(page);
-   page = balloon_next_page(page);
+   frame_list[i] = pfn++;
}
 
set_xen_guest_handle(reservation.extent_start, frame_list);
-   reservation.nr_extents = nr_pages;
+   reservation.nr_extents = nr_frames;
rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
if (rc <= 0)
return BP_EAGAIN;
 
for (i = 0; i < rc; i++) {
-   page = balloon_retrieve(false);
-   BUG_ON(page == NULL);
 
-   pfn = page_to_pfn(page);
+   /* TODO: Make this code cleaner to make CONFIG_XEN_HAVE_PVMMU
+* with 64K Pages
+*/
+   if (!(i % XEN_PFN_PER_PAGE)) {
+   page = balloon_retrieve(false);
+   BUG_ON(page == NULL);
+
+   pfn = page_to_pfn(page);
+   }
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -385,7 +398,8 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
 #endif
 
/* Relinquish the page back to the allocator. */
-   __free_reserved_page(page);
+   if (!(i % XEN_PFN_PER_PAGE))
+   __free_reserved_page(page);
}
 
balloon_stats.current_pages += rc;
@@ -396,7 +410,7 @@ static enum bp_state increase_reservation(unsigned long 
nr_pages)
 static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 {
enum bp_state state = BP_DONE;
-   unsigned long  pfn, i;
+   unsigned long  pfn, i, nr_frames;
struct page   *page;
int ret;
struct xen_memory_reservation reservation = {
@@ -414,19 +428,27 @@ static enum bp_state decrease_reservation(unsigned long 
nr_pages, gfp_t gfp)
}
 #endif
 
-   if (nr_pages > ARRAY_SIZE(frame_list))
-   nr_pages = ARRAY_SIZE(frame_list);
+   if (nr_pages > (ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE))
+   nr_pages = ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE;
 
-   for (i = 0; i < nr_pages; i++) {
-   page = alloc_page(g

[Xen-devel] [RFC 13/23] xen/xenbus: Use Xen page definition

2015-05-14 Thread Julien Grall
The xenstore ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/xenbus/xenbus_probe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c 
b/drivers/xen/xenbus/xenbus_probe.c
index 5390a67..f99933d9 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -713,7 +713,7 @@ static int __init xenstored_local_init(void)
 
xen_store_mfn = xen_start_info->store_mfn =
pfn_to_mfn(virt_to_phys((void *)page) >>
-  PAGE_SHIFT);
+  XEN_PAGE_SHIFT);
 
/* Next allocate a local port which xenstored can bind to */
alloc_unbound.dom= DOMID_SELF;
@@ -804,7 +804,7 @@ static int __init xenbus_init(void)
goto out_error;
xen_store_mfn = (unsigned long)v;
xen_store_interface =
-   xen_remap(xen_store_mfn << PAGE_SHIFT, PAGE_SIZE);
+   xen_remap(xen_store_mfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
break;
default:
pr_warn("Xenstore state unknown\n");
-- 
2.1.4



[Xen-devel] [RFC 19/23] block/xen-blkback: Make it running on 64KB page granularity

2015-05-14 Thread Julien Grall
The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to behave as a
block backend on an unmodified Xen.

It's only necessary to adapt the ring size and the number of requests per
indirect frame. The rest of the code relies on the grant table
code.
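
For illustration only (not part of the patch): a user-space sketch of the
indirect-frame arithmetic. The sketch_* helpers are hypothetical; the entry
size is passed in by the caller, and the 8-byte value used in the example
is an assumption about the size of one segment descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: an indirect frame is one 4KB Xen page. */
#define SKETCH_XEN_PAGE_SIZE 4096UL

/* Segment descriptors (one per 4KB Xen page of data) per indirect frame. */
static unsigned long sketch_entries_per_frame(size_t entry_size)
{
	assert(entry_size > 0 && entry_size <= SKETCH_XEN_PAGE_SIZE);
	return SKETCH_XEN_PAGE_SIZE / entry_size;
}

/* Linux-page-sized segments that one indirect frame can describe. */
static unsigned long sketch_segs_per_frame(size_t entry_size,
					   unsigned long linux_page_size)
{
	assert(linux_page_size % SKETCH_XEN_PAGE_SIZE == 0);
	return sketch_entries_per_frame(entry_size) /
	       (linux_page_size / SKETCH_XEN_PAGE_SIZE);
}

/* Indirect frames needed to describe nr_xen_pages 4KB pages of data. */
static unsigned long sketch_indirect_frames(unsigned long nr_xen_pages,
					    size_t entry_size)
{
	unsigned long per_frame = sketch_entries_per_frame(entry_size);

	return (nr_xen_pages + per_frame - 1) / per_frame;
}
```

Assuming 8-byte descriptors, one indirect frame holds 512 entries, which
covers 512 segments with 4KB Linux pages but only 32 segments with 64KB
Linux pages.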

Note that the grant table code allocates a Linux page per grant,
which wastes 60KB for every grant when Linux is using 64KB
page granularity. This could be improved by sharing the page between
multiple grants.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: "Roger Pau Monné" 
Cc: Boris Ostrovsky 
Cc: David Vrabel 

---

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on an unmodified Xen.

This has been tested only with a loop device. I plan to test passing
through a hard drive partition, but I haven't converted the swiotlb code yet.
---
 drivers/block/xen-blkback/blkback.c |  5 +++--
 drivers/block/xen-blkback/common.h  | 16 +---
 drivers/block/xen-blkback/xenbus.c  |  6 +++---
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c 
b/drivers/block/xen-blkback/blkback.c
index 7049528..1803c07 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -954,7 +954,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request 
*req,
seg[n].nsec = segments[i].last_sect -
segments[i].first_sect + 1;
seg[n].offset = (segments[i].first_sect << 9);
-   if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) ||
+   if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
(segments[i].last_sect < segments[i].first_sect)) {
rc = -EINVAL;
goto unmap;
@@ -1203,6 +1203,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 
req_operation = req->operation == BLKIF_OP_INDIRECT ?
req->u.indirect.indirect_op : req->operation;
+
if ((req->operation == BLKIF_OP_INDIRECT) &&
(req_operation != BLKIF_OP_READ) &&
(req_operation != BLKIF_OP_WRITE)) {
@@ -1261,7 +1262,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
seg[i].nsec = req->u.rw.seg[i].last_sect -
req->u.rw.seg[i].first_sect + 1;
seg[i].offset = (req->u.rw.seg[i].first_sect << 9);
-   if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
+   if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
(req->u.rw.seg[i].last_sect <
 req->u.rw.seg[i].first_sect))
goto fail_response;
diff --git a/drivers/block/xen-blkback/common.h 
b/drivers/block/xen-blkback/common.h
index 7a03e07..ef15ad4 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,12 +51,21 @@
  */
 #define MAX_INDIRECT_SEGMENTS 256
 
-#define SEGS_PER_INDIRECT_FRAME \
-   (PAGE_SIZE/sizeof(struct blkif_request_segment))
+/*
+ * Xen use 4K pages. The guest may use different page size (4K or 64K)
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define SEGS_PER_INDIRECT_FRAME\
+   (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT)
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+   (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+
 #define MAX_INDIRECT_PAGES \
((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
 #define INDIRECT_PAGES(_segs) \
-   ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+   ((_segs + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME)
 
 /* Not a real protocol.  Used to generate ring structs which contain
  * the elements common to all protocols only.  This way we get a
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 6ab69ad..2fcf24e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -217,21 +217,21 @@ static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t gref,
{
struct blkif_sring *sring;
sring = (struct blkif_sring *)blkif->blk_ring;
-   BACK_RING_INIT(&blkif->blk_rings.native, sring, PAGE_SIZE);
+   BACK_RING_INIT(&blkif->blk_rings.native, sring, XEN_PAGE_SIZE);
break;
}
case BLKIF_PROTOCOL_X86_32:
{
struct blkif_x86_32_sring *sring_x86_32;
sring_x86_32 = (struct blkif_x86_32_sring *)blkif->blk_ring;

[Xen-devel] [RFC 16/23] xen/events: fifo: Make it running on 64KB granularity

2015-05-14 Thread Julien Grall
Only use the first 4KB of the page to store the event channel info. It
means that we will waste 60KB every time we allocate a page for:
 * control block: a page is allocated per CPU
 * event array: a page is allocated every time we need to expand it

I think we can reduce the memory waste for the 2 areas by:

* control block: sharing between multiple vCPUs. Although it will
require some bookkeeping in order to not free the page when the CPU
goes offline and the other CPUs sharing the page are still there

* event array: always extend the event array by 64K (i.e. 16 4K
chunks). That would require more care when we fail to expand the
event channel.
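
The arithmetic behind the 60KB figure can be sketched as follows. This is only an illustration, not part of the patch: the `sketch_` names are hypothetical, the page sizes are the 4KB/64KB values discussed in this series, and the 32-bit event word is an assumption about `event_word_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for illustration only; none of these names are in the patch. */
#define SKETCH_XEN_PAGE_SIZE   4096u   /* granularity Xen works with (4KB) */
#define SKETCH_GUEST_PAGE_SIZE 65536u  /* guest kernel page size (64KB) */

typedef uint32_t sketch_event_word_t;  /* stand-in for event_word_t (assumed 32-bit) */

/* Event words that fit in the 4KB region actually shared with Xen. */
static inline size_t sketch_event_words_per_page(void)
{
	return SKETCH_XEN_PAGE_SIZE / sizeof(sketch_event_word_t);
}

/* Bytes left unused when a whole guest page backs a single 4KB region. */
static inline size_t sketch_wasted_bytes_per_page(void)
{
	return SKETCH_GUEST_PAGE_SIZE - SKETCH_XEN_PAGE_SIZE;
}

/* 4KB chunks that one guest page could hold if it were filled completely. */
static inline size_t sketch_chunks_per_guest_page(void)
{
	return SKETCH_GUEST_PAGE_SIZE / SKETCH_XEN_PAGE_SIZE;
}
```

Under these assumptions each allocated page carries 1024 event words and leaves 61440 bytes (60KB) idle, which is the space the two ideas above would try to reclaim.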

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/events/events_base.c | 2 +-
 drivers/xen/events/events_fifo.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 704d36e..24b97bd 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -40,11 +40,11 @@
 #include 
 #include 
 #include 
-#include 
 #endif
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index ed673e1..d53c297 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -54,7 +54,7 @@
 
 #include "events_internal.h"
 
-#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t))
+#define EVENT_WORDS_PER_PAGE (XEN_PAGE_SIZE / sizeof(event_word_t))
 #define MAX_EVENT_ARRAY_PAGES (EVTCHN_FIFO_NR_CHANNELS / EVENT_WORDS_PER_PAGE)
 
 struct evtchn_fifo_queue {
-- 
2.1.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [OSSTEST PATCH] production-config: Use /home/logs, not /home/osstest/pub

2015-05-14 Thread Ian Jackson
The logs and images (including .../logs, .../results, etc.) are now on
their own filesystem on the production osstest VM, which I have called
/home/logs.

Changing this in production config will allow us to tidy up by
removing the symlink I left behind.

Signed-off-by: Ian Jackson 
---
 production-config |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/production-config b/production-config
index 8834110..3a0b768 100644
--- a/production-config
+++ b/production-config
@@ -34,12 +34,12 @@ QueueDaemonHost osstest
 
 ExecutiveDbnamePat dbname=osstestdb;host=db
 
-Stash /home/osstest/pub/logs
-Images /home/osstest/pub/images
-Logs /home/osstest/pub/logs
+Stash /home/logs/logs
+Images /home/logs/images
+Logs /home/logs/logs
 
-Results /home/osstest/pub/results
-PubBaseDir /home/osstest/pub
+Results /home/logs/results
+PubBaseDir /home/logs
 
 OverlayLocal /home/osstest/overlay-local
 
-- 
1.7.10.4




Re: [Xen-devel] [PATCH V2] xen/vm_event: Clean up control-register-write vm_events

2015-05-14 Thread Razvan Cojocaru
On 05/14/2015 07:55 PM, Tamas K Lengyel wrote:
>> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
>> index 45b5283..1dd49dd 100644
>> --- a/xen/include/asm-x86/domain.h
>> +++ b/xen/include/asm-x86/domain.h
>> @@ -341,19 +341,13 @@ struct arch_domain
>>
>>  /* Monitor options */
>>  struct {
>> -uint16_t mov_to_cr0_enabled  : 1;
>> -uint16_t mov_to_cr0_sync : 1;
>> -uint16_t mov_to_cr0_onchangeonly : 1;
>> -uint16_t mov_to_cr3_enabled  : 1;
>> -uint16_t mov_to_cr3_sync : 1;
>> -uint16_t mov_to_cr3_onchangeonly : 1;
>> -uint16_t mov_to_cr4_enabled  : 1;
>> -uint16_t mov_to_cr4_sync : 1;
>> -uint16_t mov_to_cr4_onchangeonly : 1;
>> -uint16_t mov_to_msr_enabled  : 1;
>> -uint16_t mov_to_msr_extended : 1;
>> -uint16_t singlestep_enabled  : 1;
>> -uint16_t software_breakpoint_enabled : 1;
>> +uint32_t write_ctrlreg_enabled   : 8;
>> +uint32_t write_ctrlreg_sync  : 8;
>> +uint32_t write_ctrlreg_onchangeonly  : 8;
> 
> Any particular reason why you have these bitmaps 8-bits wide? There
> are only 4 events defined at the moment that would use these.

ARM control registers have been mentioned, so I thought I would leave
some space for a few more events. Other than that, they don't _need_ to
be 8-bits wide. If compactness matters more I'll change them to 4.
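
For illustration, a minimal sketch of how such 8-bit-wide bitmaps could be indexed per control register. The enum values and helper names below are hypothetical and are not taken from the patch; only the three field names come from the quoted diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event indices; the real patch may number them differently. */
enum sketch_ctrlreg {
	SKETCH_CR0,
	SKETCH_CR3,
	SKETCH_CR4,
	SKETCH_XCR0,
};

/* Same layout idea as the quoted hunk: one bit per control register. */
struct sketch_monitor {
	uint32_t write_ctrlreg_enabled      : 8;
	uint32_t write_ctrlreg_sync         : 8;
	uint32_t write_ctrlreg_onchangeonly : 8;
};

/* Mark the control register at 'index' (0..7) as monitored. */
static inline void sketch_enable(struct sketch_monitor *m, unsigned int index)
{
	m->write_ctrlreg_enabled |= (1u << index);
}

/* Report whether the control register at 'index' is monitored. */
static inline bool sketch_is_enabled(const struct sketch_monitor *m,
				     unsigned int index)
{
	return (m->write_ctrlreg_enabled >> index) & 1u;
}
```

With 8 bits per field, indices 4 to 7 stay free for additional registers; shrinking the fields to 4 bits would cover only the four indices listed here.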


Thanks,
Razvan



Re: [Xen-devel] [PATCH v1 0/4] Enabling XL to set per-VCPU parameters of a domain for RTDS scheduler

2015-05-14 Thread Chong Li
On Mon, May 11, 2015 at 4:56 AM, Dario Faggioli 
wrote:

> On Thu, 2015-05-07 at 12:05 -0500, Chong Li wrote:
> > [Goal]
> > The current xl sched-rtds tool can only set the VCPUs of a domain to the
> > same parameter, although the scheduler supports VCPUs with different
> > parameters. This patchset enables the xl sched-rtds tool to configure the
> > VCPUs of a domain with different parameters.
> >
> > These per-VCPU settings can be used in many scenarios. For example, based
> > on Dario's statement in our previous discussion
> > (http://lists.xen.org/archives/html/xen-devel/2014-09/msg00423.html), if
> > there are two real-time applications, which have different timing
> > requirements, running in a multi-VCPU guest domain, it is beneficial to pin
> > these two applications to two separate VCPUs with different scheduling
> > parameters.
> >
> Right. And in fact, I'm glad to see this is happening, thanks for doing
> this work! :-)
>
> > 1) show the budget and period of each VCPU of each domain, by using the
> > "xl sched-rtds" command. An example would be like:
> >
> > [..]
> >
> > 2) show the budget and period of each VCPU of a specific domain, by
> > using, e.g., the "xl sched-rtds -d vm1" command. The output would be like:
> >
> > [..]
> >
> > 3) set the budget and period of each VCPU of a specific domain, by using,
> > e.g., the "xl sched-rtds -d vm1 -v 0 -p 100 -b 50" command (where "-v 0"
> > specifies the VCPU with ID=0). The parameters would be like:
> >
> > [..]
> >
> > 4) Users can still set the per-domain parameters (the previous xl rtds
> > tool already supported this), e.g., "xl sched-rtds -d vm1 -p 500 -b 250".
> > The parameters would be like:
> >
> The CLI looks nice to me. I'm wondering, what happens if the user tries
> to only alter the budget or the period of a vcpu (or of a domain)? I
> think that is not possible right now, is it?
>

You're right. The current design requires both budget and period in a 'set'
command.


>
> Would it make sense to allow that? I think it would, but this can well
> happen later, once we will have this in.
>

Yes, we can definitely implement that once all the other issues in this
patch have been resolved.


>
> Regards,
> Dario
>



-- 
Chong Li
Department of Computer Science and Engineering
Washington University in St. Louis


[Xen-devel] [RFC 04/23] block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS

2015-05-14 Thread Julien Grall
From: Julien Grall 

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Roger Pau Monné 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/block/xen-blkfront.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 2c61cf8..5c72c25 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -139,8 +139,6 @@ static unsigned int nr_minors;
 static unsigned long *minors;
 static DEFINE_SPINLOCK(minor_lock);
 
-#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
-   (BLKIF_MAX_SEGMENTS_PER_REQUEST * BLK_RING_SIZE)
 #define GRANT_INVALID_REF  0
 
 #define PARTS_PER_DISK 16
-- 
2.1.4




[Xen-devel] [RFC 06/23] block/xen-blkback: s/nr_pages/nr_segs/

2015-05-14 Thread Julien Grall
From: Julien Grall 

Make the code less confusing to read now that Linux may not have the
same page size as Xen.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Roger Pau Monné 
---
 drivers/block/xen-blkback/blkback.c | 10 +-
 drivers/block/xen-blkback/common.h  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 713fc9f..7049528 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -729,7 +729,7 @@ static void xen_blkbk_unmap_and_respond(struct pending_req *req)
struct grant_page **pages = req->segments;
unsigned int invcount;
 
-   invcount = xen_blkbk_unmap_prepare(blkif, pages, req->nr_pages,
+   invcount = xen_blkbk_unmap_prepare(blkif, pages, req->nr_segs,
   req->unmap, req->unmap_pages);
 
work->data = req;
@@ -915,7 +915,7 @@ static int xen_blkbk_map_seg(struct pending_req *pending_req)
int rc;
 
rc = xen_blkbk_map(pending_req->blkif, pending_req->segments,
-  pending_req->nr_pages,
+  pending_req->nr_segs,
   (pending_req->operation != BLKIF_OP_READ));
 
return rc;
@@ -931,7 +931,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req,
int indirect_grefs, rc, n, nseg, i;
struct blkif_request_segment *segments = NULL;
 
-   nseg = pending_req->nr_pages;
+   nseg = pending_req->nr_segs;
indirect_grefs = INDIRECT_PAGES(nseg);
BUG_ON(indirect_grefs > BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST);
 
@@ -1251,7 +1251,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
pending_req->id= req->u.rw.id;
pending_req->operation = req_operation;
pending_req->status= BLKIF_RSP_OKAY;
-   pending_req->nr_pages  = nseg;
+   pending_req->nr_segs   = nseg;
 
if (req->operation != BLKIF_OP_INDIRECT) {
preq.dev   = req->u.rw.handle;
@@ -1372,7 +1372,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 
  fail_flush:
xen_blkbk_unmap(blkif, pending_req->segments,
-   pending_req->nr_pages);
+   pending_req->nr_segs);
  fail_response:
/* Haven't submitted any bio's yet. */
make_response(blkif, req->u.rw.id, req_operation, BLKIF_RSP_ERROR);
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index f620b5d..7a03e07 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -343,7 +343,7 @@ struct grant_page {
 struct pending_req {
struct xen_blkif*blkif;
u64 id;
-   int nr_pages;
+   int nr_segs;
atomic_tpendcnt;
unsigned short  operation;
int status;
-- 
2.1.4




[Xen-devel] [RFC 05/23] block/xen-blkfront: Remove invalid comment

2015-05-14 Thread Julien Grall
From: Julien Grall 

Since commit b764915 "xen-blkfront: use a different scatterlist for each
request", biovec has been replaced by scatterlist when copying back the
data during a completion request.

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: Roger Pau Monné 
---
 drivers/block/xen-blkfront.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5c72c25..60cf1d6 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1056,12 +1056,6 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
s->req.u.indirect.nr_segments : s->req.u.rw.nr_segments;
 
if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
-   /*
-* Copy the data received from the backend into the bvec.
-* Since bv_offset can be different than 0, and bv_len different
-* than PAGE_SIZE, we have to keep track of the current offset,
-* to be sure we are copying the data from the right shared page.
-*/
for_each_sg(s->sg, sg, nseg, i) {
BUG_ON(sg->offset + sg->length > PAGE_SIZE);
shared_data = kmap_atomic(
-- 
2.1.4




[Xen-devel] [RFC 03/23] xen/grant-table: Remove unused macro SPP

2015-05-14 Thread Julien Grall
SPP was used by the grant table v2 code which has been removed in
commit 438b33c7145ca8a5131a30c36d8f59bce119a19a "xen/grant-table:
remove support for V2 tables".

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/grant-table.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index b1c7170..62f591f 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -138,7 +138,6 @@ static struct gnttab_free_callback *gnttab_free_callback_list;
 static int gnttab_expand(unsigned int req_entries);
 
 #define RPP (PAGE_SIZE / sizeof(grant_ref_t))
-#define SPP (PAGE_SIZE / sizeof(grant_status_t))
 
 static inline grant_ref_t *__gnttab_entry(grant_ref_t entry)
 {
-- 
2.1.4




[Xen-devel] [RFC 08/23] net/xen-netback: Remove unused code in xenvif_rx_action

2015-05-14 Thread Julien Grall
The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".

Signed-off-by: Julien Grall 
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: net...@vger.kernel.org
---
 drivers/net/xen-netback/netback.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 9c6a504..9ae1d43 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
   && (skb = xenvif_rx_dequeue(queue)) != NULL) {
-   RING_IDX old_req_cons;
-   RING_IDX ring_slots_used;
-
queue->last_rx_time = jiffies;
 
-   old_req_cons = queue->rx.req_cons;
XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, queue);
-   ring_slots_used = queue->rx.req_cons - old_req_cons;
 
__skb_queue_tail(&rxq, skb);
}
-- 
2.1.4




[Xen-devel] [RFC 02/23] xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring

2015-05-14 Thread Julien Grall
virt_to_mfn should take a void* rather than an unsigned long. While it
doesn't really matter now, it would throw a compiler warning later, once
virt_to_mfn enforces the type.

At the same time, avoid computing a new virtual address on every iteration
of the loop and directly increment the parameter, as we don't use it later.
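
As a rough sketch of the loop shape described here: the pointer is advanced one page at a time instead of being recomputed from the loop index. The names and the 4KB page size are assumptions for illustration; the real code calls virt_to_mfn() and gnttab_grant_foreign_access() where this sketch only records the pointer.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

/*
 * Visit nr_pages consecutive pages starting at vaddr and store the start
 * of each page in out[]. Returns the number of pages visited.
 */
static size_t sketch_walk_pages(void *vaddr, unsigned int nr_pages, void **out)
{
	unsigned int i;

	for (i = 0; i < nr_pages; i++) {
		/* The real code would grant access to this page here. */
		out[i] = vaddr;

		/* Arithmetic on void* is not standard C, hence the cast. */
		vaddr = (char *)vaddr + SKETCH_PAGE_SIZE;
	}

	return i;
}
```

Keeping the value as a pointer throughout means no cast back from unsigned long is needed at the call site, which is the type issue the first paragraph refers to.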

Signed-off-by: Julien Grall 
Cc: Wei Liu 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
---
 drivers/xen/xenbus/xenbus_client.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index a014016..d204562 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -379,16 +379,16 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
int i, j;
 
for (i = 0; i < nr_pages; i++) {
-   unsigned long addr = (unsigned long)vaddr +
-   (PAGE_SIZE * i);
err = gnttab_grant_foreign_access(dev->otherend_id,
- virt_to_mfn(addr), 0);
+ virt_to_mfn(vaddr), 0);
if (err < 0) {
xenbus_dev_fatal(dev, err,
 "granting access to ring page");
goto fail;
}
grefs[i] = err;
+
+   vaddr = (char *)vaddr + PAGE_SIZE;
}
 
return 0;
-- 
2.1.4




[Xen-devel] [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt

2015-05-14 Thread Julien Grall
From: Julien Grall 

Signed-off-by: Julien Grall 
Cc: Stefano Stabellini 
---
 arch/arm/include/asm/xen/page.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 0b579b2..1bee8ca 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -12,7 +12,6 @@
 #include 
 
 #define phys_to_machine_mapping_valid(pfn) (1)
-#define mfn_to_virt(m) (__va(mfn_to_pfn(m) << PAGE_SHIFT))
 
 #define pte_mfnpte_pfn
 #define mfn_ptepfn_pte
-- 
2.1.4




[Xen-devel] [RFC 01/23] xen: Include xen/page.h rather than asm/xen/page.h

2015-05-14 Thread Julien Grall
Using xen/page.h will be necessary later for using common xen page
helpers.

As xen/page.h already includes asm/xen/page.h, always use xen/page.h.

Signed-off-by: Julien Grall 
Cc: Stefano Stabellini 
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: net...@vger.kernel.org
---
 arch/arm/xen/mm.c  | 2 +-
 arch/arm/xen/p2m.c | 2 +-
 drivers/net/xen-netback/netback.c  | 2 +-
 drivers/net/xen-netfront.c | 1 -
 drivers/xen/events/events_base.c   | 2 +-
 drivers/xen/events/events_fifo.c   | 2 +-
 drivers/xen/gntdev.c   | 2 +-
 drivers/xen/manage.c   | 2 +-
 drivers/xen/tmem.c | 2 +-
 drivers/xen/xenbus/xenbus_client.c | 2 +-
 10 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 4983250..03e75fe 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -15,10 +15,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index cb7a14c..887596c 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -10,10 +10,10 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4de46aa..9c6a504 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -44,9 +44,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 
 /* Provide an option to disable split event channels at load time as
  * event channels are limited resource. Split event channels are
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 3f45afd..ff88f31 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -45,7 +45,6 @@
 #include 
 #include 
 
-#include 
 #include 
 #include 
 #include 
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 2b8553b..704d36e 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -39,8 +39,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
+#include 
 #endif
 #include 
 #include 
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 417415d..ed673e1 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -44,13 +44,13 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 
 #include "events_internal.h"
 
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 8927485..67b9163 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -41,9 +41,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include 
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Derek G. Murray , "
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 9e6a851..d10effe 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -19,10 +19,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
-#include 
 #include 
 
 enum shutdown_state {
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index c4211a3..3718b4a 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -17,8 +17,8 @@
 
 #include 
 #include 
+#include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 96b2011..a014016 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -37,7 +37,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
-- 
2.1.4




[Xen-devel] [RFC 07/23] net/xen-netfront: Correct printf format in xennet_get_responses

2015-05-14 Thread Julien Grall
rx->status is an int16_t, print it using %d rather than %u in order to
have a meaningful value when the field is negative.
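
To illustrate the difference, here is a small userspace sketch (not kernel code; the helper name is hypothetical) that formats the same int16_t value both ways. The explicit cast to unsigned int mimics what %u effectively shows for a negative status.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format 'status' into buf either as a signed value (%d) or as the
 * unsigned value that %u would display. Returns snprintf()'s result.
 */
static int sketch_format_status(char *buf, size_t len, int16_t status,
				int as_signed)
{
	if (as_signed)
		return snprintf(buf, len, "%d", status);

	/* A negative status wraps around to a large unsigned number. */
	return snprintf(buf, len, "%u", (unsigned int)status);
}
```

For a status of -1 the signed form yields "-1", while the unsigned form yields a large number (4294967295 where unsigned int is 32 bits wide), which hides the fact that the field carried an error code.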

Signed-off-by: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Boris Ostrovsky 
Cc: David Vrabel 
Cc: net...@vger.kernel.org
---
 drivers/net/xen-netfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index ff88f31..381d38f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -732,7 +732,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
if (unlikely(rx->status < 0 ||
 rx->offset + rx->status > PAGE_SIZE)) {
if (net_ratelimit())
-   dev_warn(dev, "rx->offset: %x, size: %u\n",
+   dev_warn(dev, "rx->offset: %x, size: %d\n",
 rx->offset, rx->status);
xennet_move_rx_slot(queue, skb, ref);
err = -EINVAL;
-- 
2.1.4



