[Xen-devel] [linux-arm-xen test] 58875: tolerable FAIL - PUSHED

2015-06-25 Thread osstest service user
flight 58875 linux-arm-xen real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58875/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-armhf-armhf-xl-cubietruck 11 guest-start fail pass in 58889-bisect

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail in 58889 never pass
 test-armhf-armhf-xl-sedf-pin   12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf       12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-xsm        12 migrate-support-check fail   never pass
 test-armhf-armhf-xl            12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt-xsm   12 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt       12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check fail   never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check fail   never pass

version targeted for testing:
 linux                64972ceb0b0cafc91a09764bc731e1b7f0503b5c
baseline version:
 linux                9f51b5de8c3fdd01a9d692da5633449cc6936688


People who touched revisions under test:
  David S. Miller da...@davemloft.net
  Ian Campbell ian.campb...@citrix.com
  Luis Henriques luis.henriq...@canonical.com
  Wei Liu wei.l...@citrix.com


jobs:
 build-armhf-xsm                                         pass
 build-armhf                                             pass
 build-armhf-libvirt                                     pass
 build-armhf-pvops                                       pass
 test-armhf-armhf-xl                                     pass
 test-armhf-armhf-libvirt-xsm                            pass
 test-armhf-armhf-xl-xsm                                 pass
 test-armhf-armhf-xl-arndale                             pass
 test-armhf-armhf-xl-credit2                             pass
 test-armhf-armhf-xl-cubietruck                          fail
 test-armhf-armhf-libvirt                                pass
 test-armhf-armhf-xl-multivcpu                           pass
 test-armhf-armhf-xl-sedf-pin                            pass
 test-armhf-armhf-xl-sedf                                pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=linux-arm-xen
+ revision=64972ceb0b0cafc91a09764bc731e1b7f0503b5c
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x '!=' x/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/home/osstest/repos/lock
++ exec with-lock-ex -w /home/osstest/repos/lock ./ap-push linux-arm-xen 
64972ceb0b0cafc91a09764bc731e1b7f0503b5c
+ branch=linux-arm-xen
+ revision=64972ceb0b0cafc91a09764bc731e1b7f0503b5c
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{Repos} or die $!;
'
++ repos=/home/osstest/repos
++ repos_lock=/home/osstest/repos/lock
++ '[' x/home/osstest/repos/lock '!=' x/home/osstest/repos/lock ']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case $branch in
+ tree=linux
+ xenbranch=xen-unstable
+ '[' xlinux = xlinux ']'
+ linuxbranch=linux-arm-xen
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xen.org
+++ getconfig OsstestUpstream
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{OsstestUpstream} or die $!;
'
++ :
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://xenbits.xen.org/libvirt.git
++ : osst...@xenbits.xen.org:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xen.org:/home/xen/git/rumpuser-xen.git
+++ 

Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Manish Jaggi



On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall ask xen to create p2m mapping {
virtual BAR : physical BAR } for domU
While implementing, I think that rather than the toolstack, the pciback driver
in dom0 can send the hypercall to map the physical BAR to the virtual BAR.
Thus no xenstore entry is required for BARs. Moreover, a PCI driver would
read BARs only once.

c) domU will not anytime update the BARs, if it does then it is a fault,
till we decide how to handle it

As Julien has noted pciback already deals with this correctly, because
sizing a BAR involves a write, it implements a scheme which allows
either the hardcoded virtual BAR to be written or all 1s (needed for
size detection).
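The scheme described above can be sketched roughly as follows. This is only an illustration of the idea (accept a write of the fixed virtual value or of all 1s, ignore everything else), not pciback's actual code; struct vbar and the vbar_*() helpers are hypothetical names, and a 32-bit memory BAR is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-BAR state; the field names are illustrative only. */
struct vbar {
    uint32_t addr;      /* fixed virtual BAR value set up in advance */
    uint32_t size_mask; /* value a sizing read returns: ~(size - 1) | flag bits */
    int sizing;         /* non-zero after the guest has written all 1s */
};

/* Set up a 32-bit memory BAR; size must be a power of two and >= 16. */
static void vbar_init(struct vbar *b, uint32_t addr, uint32_t size)
{
    assert(size >= 16 && (size & (size - 1)) == 0);
    b->addr = addr;
    b->size_mask = ~(size - 1) | (addr & 0xFu); /* keep the low type bits */
    b->sizing = 0;
}

/* Guest write: returns 0 if the value is accepted, -1 if it is ignored. */
static int vbar_write(struct vbar *b, uint32_t value)
{
    if (value == UINT32_MAX) {  /* size probe */
        b->sizing = 1;
        return 0;
    }
    if (value == b->addr) {     /* restoring the fixed value */
        b->sizing = 0;
        return 0;
    }
    return -1;                  /* any other value: BAR stays where it is */
}

/* Guest read: the size mask while sizing, the fixed address otherwise. */
static uint32_t vbar_read(const struct vbar *b)
{
    return b->sizing ? b->size_mask : b->addr;
}

/* Decode the region size from a sizing read, as a guest driver would. */
static uint32_t vbar_decode_size(uint32_t readback)
{
    return ~(readback & ~0xFu) + 1;
}
```

A guest that writes all 1s, reads back, and then restores the original value sees a consistent size and never moves the BAR, which matches the behaviour discussed in the thread.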


d) when domU queries BAR address from pci-back the virtual BAR address
is provided.

Option2:
b) domU will not anytime update the BARs, if it does then it is a fault,
till we decide how to handle it
c) when domU queries BAR address from pci-back the virtual BAR address
is provided.
d) domU sends a hypercall to map virtual BARs,
e) xen pci code reads the BAR and maps { virtual BAR : physical BAR }
for domU

Which option is better? I think Ian is for (2) and Stefano may be for (1).

In fact I'm now (after Julien pointed out the current behaviour of
pciback) in favour of (1), although I'm not sure if Stefano is too.

(I was never in favour of (2), FWIW, I previously was in favour of (3)
which is like (2) except pciback makes the hypercall to map the virtual
bars to the guest, I'd still favour that over (2) but (1) is now my
preference)

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] [PATCH v4 RFC 6/6] x86/MSI: properly track guest masking requests

2015-06-25 Thread Jan Beulich
 On 24.06.15 at 19:24, andrew.coop...@citrix.com wrote:
 On 22/06/15 15:51, Jan Beulich wrote:
 --- a/xen/arch/x86/msi.c
 +++ b/xen/arch/x86/msi.c
 @@ -1308,6 +1308,39 @@ printk(%04x:%02x:%02x.%u: MSI-X %03x:%u
  return 1;
  }
  
 +entry = find_msi_entry(pdev, -1, PCI_CAP_ID_MSI);
 +if ( entry && entry->msi_attrib.maskbit )
 +{
 +uint16_t cntl;
 +uint32_t unused;
 +
 +pos = entry->msi_attrib.pos;
 +if ( reg < pos || reg >= entry->msi.mpos + 8 )
 +return 0;
 +printk("%04x:%02x:%02x.%u: MSI %03x:%u->%04x\n", seg, bus, slot, func, reg, size, *data);//temp
 +
 +if ( reg == msi_control_reg(pos) )
 +return size == 2 ? 1 : -EACCES;
 +if ( reg < entry->msi.mpos || reg >= entry->msi.mpos + 4 || size != 4 )
 +return -EACCES;
 
 Can we avoid using EACCES to avoid confusing it with a mismatched tools
 version?

What other suitable error code would you see here? I'm not sure
we want this error code to be reserved for exactly one purpose,
the more that here we're on a path that will never have this error
code returned to the guest (and even less so via a domctl/sysctl,
which would be the primary mismatched-tools-version candidates).

It's also odd that you ask for this here, when patch 2 has a use
of this error code too.

Jan




Re: [Xen-devel] Hyper and Xen Project

2015-06-25 Thread Dave Scott

 On 25 Jun 2015, at 02:46, Wang Xu gna...@gmail.com wrote:
 
 Agree, but I think the document is a bit confused
 
  It is important that channel names are globally unique.
 
 https://github.com/mirage/xen/blob/master/docs/misc/channel.txt#L94

I agree— that wording is definitely confusing. Perhaps the docs should compare 
the channel names to TCP/UDP port numbers? We could say that

- the IANA port registry ~=~ the channel registry in the docs/ directory
- a single IP can only have one binding for a particular port at a time ~=~ a 
domain can only have one binding for a particular name at a time
- lots of IPs can bind the same port ~=~ lots of domains can bind the same name

What do you think?

Thanks,
Dave

 
 On Thu, Jun 25, 2015 at 2:29 AM Dave Scott dave.sc...@citrix.com wrote:
 Hi Xu,
 
  On 24 Jun 2015, at 14:44, Wang Xu gna...@gmail.com wrote:
 
  Thank you Dave, I think I can also get work around for that.
 
  By the way, the document says the name should be globally unique, but I can 
  start 2 domains that have channels with the same name. Are there any 
  potential problems?
 
 The name needs to be unique within a domain. It’s ok to have
 
 1. domid 10, channel name ‘agent’
 2. domid 11, channel name ‘agent’
 
 — this will be common, as multiple domains will have the same ‘agent’ 
 software installed.
 
 but it will cause problems if the name is used twice within a domain. It’s a 
 bad idea to have
 
 1: domid 10, channel name ‘agent’
 2: domid 10, channel name ‘agent’
 
 — although this will create 2 distinct /dev/hvc devices, it will be difficult 
 to tell which is which.
 
 If libxl allows the name to be duplicated within a domain, then this is my 
 fault. We should add validation code to check uniqueness.
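The validation mentioned above could look roughly like this. It is a minimal sketch, not libxl code: struct channel_cfg is a hypothetical, simplified stand-in for a domain's channel list, and find_duplicate_channel_name() is an illustrative helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one channel of a single domain. */
struct channel_cfg {
    int devid;
    const char *name;
};

/*
 * Return the index of the first channel whose name repeats an earlier one
 * in the same domain's list, or -1 if every name is unique.
 * Different domains may reuse a name, so only one domain's list is checked.
 */
static int find_duplicate_channel_name(const struct channel_cfg *chs, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        for (size_t j = 0; j < i; j++) {
            if (chs[i].name && chs[j].name &&
                strcmp(chs[i].name, chs[j].name) == 0)
                return (int)i;
        }
    }
    return -1;
}
```

A caller would run this once per domain before creating the devices and reject the configuration if the result is not -1.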
 
 Thanks,
 Dave
 
 
  Cheers
 
  Xu
 
  On Wed, Jun 24, 2015 at 9:03 PM Dave Scott dave.sc...@citrix.com wrote:
  I don’t think the frontend driver in Linux knows about the name key. In my 
  testing I wrote a udev script which looks up the ‘name’ key directly in 
  xenstore and created a named device node using that. For reference my 
  script is here:
 
  https://github.com/mirage/mirage-console/blob/master/udev/xenconsole-setup-tty
 
  Cheers,
  Dave
 
   and I directly test `/dev/hvc1`, and it could communicate with the 
   outside socket. Is there some mistake in my channel
   name configuration?
  
   | static void hyper_config_channel(libxl_device_channel* ch, const char* 
   name, const char* sock, int devid) {
   | libxl_device_channel_init(ch);
   | ch->backend_domid = 0;
   | ch->name = strdup(name);
   | ch->devid = devid;
   | ch->connection = LIBXL_CHANNEL_CONNECTION_SOCKET;
   | ch->u.socket.path = strdup(sock);
   | }
  
   I tried to look at the oVirt code as it is mentioned in the doc, but I 
   did not find xen console in its guest agent code.
  
   So the issue is that the name you assign here to the channel, doesn't
   come up anywhere in the guest. Is that correct?
 
 
  
  
   Thank you!
  
  
   On Tue, Jun 23, 2015 at 7:30 PM, Stefano Stabellini 
   stefano.stabell...@eu.citrix.com wrote:
On Tue, 23 Jun 2015, Wang Xu wrote:
   On Sat, Jun 20, 2015 at 1:10 AM Stefano Stabellini 
   stefano.stabell...@eu.citrix.com wrote:
  Integrating hyper with Xen using libxl was the right decision 
   and it
  looks like you did a good job. I think that you can go ahead 
   with the
  PR!
  
  
  But I did have a few issues building hyper. I am getting:
  
  hyperd.go:11:2: cannot find package hyper/daemon in any of:
  [...]
  
   I tried with a clean 0.2-dev branch
   ./autogen.sh
   ./configure
   make
  
   It looks ok. Are you working on the 0.2-dev branch? I did not write the 
   branch name in the instructions in the Readme, sorry for that.
  
No worries, the most important part at this stage is the code, and 
   that
looks OK :-)
Yes, I was using 0.2-dev and followed those steps. As I usually 
   don't
program in go, it is likely that my go working environment is 
   missing
something, or my go paths are wrong. This is the full error message:
  
CGO_LDFLAGS=-Lhypervisor/xen -lxenlight -lxenctrl -lhyperxl godep 
   go build hyperd.go
hyperd.go:11:2: cannot find package hyper/daemon in any of:
/local/scratch/sstabellini/go/src/hyper/daemon (from 
   $GOROOT)

   /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/daemon 
   (from $GOPATH)
hyperd.go:10:2: cannot find package hyper/engine in any of:
/local/scratch/sstabellini/go/src/hyper/engine (from 
   $GOROOT)

   /local/scratch/sstabellini/hyper/Godeps/_workspace/src/hyper/engine 
   (from $GOPATH)
hyperd.go:12:2: cannot find package hyper/lib/glog in any of:
/local/scratch/sstabellini/go/src/hyper/lib/glog (from 
   $GOROOT)

   

Re: [Xen-devel] [PATCH 5/9] x86/pvh: Set PVH guest's mode in XEN_DOMCTL_set_address_size

2015-06-25 Thread Jan Beulich
 On 24.06.15 at 18:21, boris.ostrov...@oracle.com wrote:
 On 06/24/2015 08:10 AM, Jan Beulich wrote:
 On 24.06.15 at 13:42, boris.ostrov...@oracle.com wrote:
 On 06/24/2015 03:57 AM, Jan Beulich wrote:
 On 24.06.15 at 04:53, boris.ostrov...@oracle.com wrote:
 On 06/23/2015 09:22 AM, Jan Beulich wrote:
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -2320,12 +2320,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
 v->arch.hvm_vcpu.inject_trap.vector = -1;
 
 if ( is_pvh_domain(d) )
 -{
 -v->arch.hvm_vcpu.hcall_64bit = 1;/* PVH 32bitfixme. */
 -/* This is for hvm_long_mode_enabled(v). */
 -v->arch.hvm_vcpu.guest_efer = EFER_LMA | EFER_LME;
 return 0;
 -}
 With this removed, is there any guarantee that hvm_set_mode()
 will be called for each vCPU?
 IIUC, toolstack is required to call XEN_DOMCTL_set_address_size which
 results in a call to switch_compat/native(), which loops over all VCPUs,
 calling set_mode.
 I don't recall this being a strict requirement. I think a PV 64-bit
 guest would start fine without.
 We do call it via libxl__build_pv() -> xc_dom_boot_mem_init() ->
 arch_setup_mem_init() -> x86_compat().
 Right, that's in our tool stack. The question though was whether it's
 a requirement to be called.
 
 Since this change will assume that this domctl is called for both 32- 
 and 64-bit --- yes, this becomes a requirement for 64-bit PVH guests.

But that's the whole point of my question - it isn't right now, and
hence I don't think it should become a requirement. Instead I
think state should start out to be ready for a 64-bit guest just
like it does for PV.

Jan




[Xen-devel] [linux-linus test] 58873: regressions - FAIL

2015-06-25 Thread osstest service user
flight 58873 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58873/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 58793
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 58793
 test-amd64-i386-xl-qemuu-debianhvm-amd64 9 debian-hvm-install fail REGR. vs. 58793
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 18 guest-start/debianhvm.repeat fail REGR. vs. 58793
 build-armhf-pvops 5 kernel-build  fail REGR. vs. 58793

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt  11 guest-start   fail REGR. vs. 58793
 test-amd64-amd64-libvirt 11 guest-start  fail   like 58793
 test-amd64-i386-freebsd10-amd64  9 freebsd-install fail like 58793
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail like 58793
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58793
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58793

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-xsm   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2   1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-cubietruck  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pvh-intel 13 guest-saverestore fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 linux                6eae81a5e2d6646a61146501fd3032a340863c1d
baseline version:
 linux                d2228e4310612a1289c343bcf819831a74ae0366


551 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm                                         pass
 build-armhf-xsm                                         pass
 build-i386-xsm                                          pass
 build-amd64                                             pass
 build-armhf                                             pass
 build-i386                                              pass
 build-amd64-libvirt                                     pass
 build-armhf-libvirt                                     pass
 build-i386-libvirt                                      pass
 build-amd64-pvops                                       pass
 build-armhf-pvops                                       fail
 build-i386-pvops                                        pass
 build-amd64-rumpuserxen                                 pass
 build-i386-rumpuserxen                                  pass
 test-amd64-amd64-xl                                     pass
 test-armhf-armhf-xl                                     blocked
 test-amd64-i386-xl                                      pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm           pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm            fail
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm           pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm            pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm   pass
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm    pass
 test-amd64-amd64-libvirt-xsm                            pass
 test-armhf-armhf-libvirt-xsm                            blocked
 test-amd64-i386-libvirt-xsm                             pass
 test-amd64-amd64-xl-xsm                                 pass
 test-armhf-armhf-xl-xsm                                 blocked
 test-amd64-i386-xl-xsm                                  pass
 test-amd64-amd64-xl-pvh-amd                             fail
 

Re: [Xen-devel] [PATCH] xen: new maintainer for the RTDS scheduler

2015-06-25 Thread Meng Xu
2015-06-25 5:44 GMT-07:00 Dario Faggioli dario.faggi...@citrix.com:
 Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
 ---
 Cc: George Dunlap george.dun...@eu.citrix.com
 Cc: Meng Xu xumengpa...@gmail.com
 ---
  MAINTAINERS |5 +
  1 file changed, 5 insertions(+)

 diff --git a/MAINTAINERS b/MAINTAINERS
 index 6b1068e..e6616d2 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -282,6 +282,11 @@ F: tools/libxl/libxl_nonetbuffer.c
  F: tools/hotplug/Linux/remus-netbuf-setup
  F: tools/hotplug/Linux/block-drbd-probe

 +RTDS SCHEDULER
 +M: Dario Faggioli dario.faggi...@citrix.com
 +S: Supported
 +F: xen/common/sched_rt.c

I'm not sure if the following response is correct and proper, just in
case it is correct. :-)

Reviewed-and-Acked-by: Meng Xu men...@cis.upenn.edu


Thanks,

Meng


---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/



Re: [Xen-devel] [v4][PATCH 14/19] tools/libxl: detect and avoid conflicts with RDM

2015-06-25 Thread Chen, Tiejun

On 2015/6/25 19:23, Wei Liu wrote:

On Tue, Jun 23, 2015 at 05:57:25PM +0800, Tiejun Chen wrote:

While building a VM, HVM domain builder provides struct hvm_info_table{}
to help hvmloader. Currently it includes two fields to construct guest
e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
check them to fix any conflict with RAM.



RAM - RDM?


Fixed.




RMRR can reside in address space beyond 4G theoretically, but we never


[snip]


+static struct xen_reserved_device_memory
+*xc_device_get_rdm(libxl__gc *gc,
+   uint32_t flag,
+   uint16_t seg,
+   uint8_t bus,
+   uint8_t devfn,
+   unsigned int *nr_entries)


I just notice this function lives in libxl_dm.c. The function should be
renamed to libxl__xc_device_get_rdm.

This function should return proper libxl error code (ERROR_FAIL or
something more appropriate). The allocated RDM entries should be


ERROR_FAIL is better.

So I'll refactor this function after addressing all your comments:

static int
libxl__xc_device_get_rdm(libxl__gc *gc,
                         uint32_t flag,
                         uint16_t seg,
                         uint8_t bus,
                         uint8_t devfn,
                         unsigned int *nr_entries,
                         struct xen_reserved_device_memory *xrdm)
{
    int rc;

    /*
     * We really can't presume how many entries we can get in advance.
     */
    *nr_entries = 0;
    rc = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
                                       NULL, nr_entries);
    assert(rc <= 0);
    /* "0" means we have no any rdm entry. */
    if (!rc)
        goto out;

    if (errno == ENOBUFS) {
        xrdm = libxl__malloc(gc,
                             *nr_entries *
                             sizeof(xen_reserved_device_memory_t));
        rc = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
                                           xrdm, nr_entries);
        if (rc) {
            LOG(ERROR, "Could not get reserved device memory maps.\n");
            rc = ERROR_FAIL;
        }
    } else {
        LOG(ERROR, "Could not get reserved device memory maps.\n");
        rc = ERROR_FAIL;
    }

 out:
    if (rc) {
        *nr_entries = 0;
        xrdm = NULL;
    }
    return rc;
}
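
One thing that may be worth checking in the draft above: xrdm is a pointer passed by value, so a buffer allocated inside the function would not be visible to the caller. Below is a minimal sketch of the same two-call pattern that hands the buffer back through a pointer-to-pointer. Everything in it is hypothetical: mock_rdm_map() stands in for xc_reserved_device_memory_map() (its ENOBUFS behaviour is assumed from the errno check in the draft), struct rdm_entry for the real entry type, and plain malloc() for libxl__malloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct xen_reserved_device_memory. */
struct rdm_entry {
    unsigned long start_pfn;
    unsigned long nr_pages;
};

/*
 * Mock of the query call, for illustration only. If the caller's buffer
 * is too small it stores the required count in *nr, sets errno to ENOBUFS
 * and returns -1; otherwise it fills the buffer and returns 0.
 */
static int mock_rdm_map(struct rdm_entry *buf, unsigned int *nr)
{
    static const struct rdm_entry table[] = { {0xab000, 16}, {0xcd000, 32} };
    unsigned int have = sizeof(table) / sizeof(table[0]);

    if (*nr < have) {
        *nr = have;
        errno = ENOBUFS;
        return -1;
    }
    memcpy(buf, table, sizeof(table));
    *nr = have;
    return 0;
}

/* Returns 0 on success; on success *out owns *nr entries (NULL if none). */
static int get_rdm(struct rdm_entry **out, unsigned int *nr)
{
    int rc;

    *out = NULL;
    *nr = 0;

    rc = mock_rdm_map(NULL, nr);    /* first call: ask for the count */
    if (rc == 0)
        return 0;                   /* no entries at all */
    if (errno != ENOBUFS)
        return -1;

    *out = malloc(*nr * sizeof(**out));
    if (!*out) {
        *nr = 0;
        return -1;
    }

    rc = mock_rdm_map(*out, nr);    /* second call: fetch the entries */
    if (rc) {
        free(*out);
        *out = NULL;
        *nr = 0;
        return -1;
    }
    return 0;
}
```

With a gc-managed allocator as in libxl the explicit free() calls would not be needed; they are here only because the sketch uses malloc().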

Thanks
Tiejun



Re: [Xen-devel] [PATCH V5 2/7] libxl_read_file_contents: add new entry to read sysfs file

2015-06-25 Thread Chun Yan Liu


 On 6/25/2015 at 07:09 PM, in message
21899.57676.368102.982...@mariner.uk.xensource.com, Ian Jackson
ian.jack...@eu.citrix.com wrote: 
 Chunyan Liu writes ([PATCH V5 2/7] libxl_read_file_contents: add new entry  
 to read sysfs file): 
  Sysfs file has size=4096 but actual file content is less than that. 
  Current libxl_read_file_contents will treat it as error when file size 
  and actual file content differs, so reading sysfs file content with 
  this function always fails. 
   
  Add a new entry libxl_read_sysfs_file_contents to handle sysfs file 
  specially. It would be used in later pvusb work. 
  
 I think this still fails to detect a situation where the file is 
 unexpectedly longer than the requested size ? 


+} else if (feof(f)) {
+if (rs < datalen && tolerate_shrinking_file) {
+datalen = rs;
+} else {

If the file is bigger than the requested size, it will fall to this branch and 
report error.
Do you mean I should report another error message separately?

- Chunyan

+LOG(ERROR, "%s changed size while we were reading it",
+filename);
+goto xe;
+}
+} else {

  
 As we wrote earlier: 
  
Is there any risk that the file is actually bigger than advertised,  
rather than smaller ?  

   For sysfs file, couldn't be bigger. 
   
  Then you should detect the condition that the file is bigger, and call 
  it an error. 
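
The check requested here can be sketched as below: read up to the advertised size, then probe for one more byte and treat any extra data as an error. This is a standalone illustration using stdio, not the libxl patch; read_bounded() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a file that is expected to hold at most `expected` bytes.
 * Returns 0 and stores the byte count in *got on success.
 * Returns -1 on a read error, if the file is shorter than expected and
 * shrinking is not tolerated, or if the file is longer than expected.
 */
static int read_bounded(FILE *f, char *buf, size_t expected,
                        int tolerate_shrinking, size_t *got)
{
    size_t rs = fread(buf, 1, expected, f);

    if (ferror(f))
        return -1;

    if (rs < expected) {
        /* Hit end of file early: acceptable only for sysfs-style files. */
        if (!tolerate_shrinking)
            return -1;
    } else {
        /* Probe one byte past the expected size. */
        int c = fgetc(f);
        if (c != EOF || ferror(f))
            return -1;      /* more data than advertised, or read error */
    }

    *got = rs;
    return 0;
}
```

For a sysfs file advertised as 4096 bytes, a caller would pass expected = 4096 with tolerate_shrinking set, so a short read succeeds while an unexpectedly longer file is still reported as an error.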
  
 Thanks, 
 Ian. 
  
  





Re: [Xen-devel] [PATCH] Stepping up for being the maintainer of sched_rt.c

2015-06-25 Thread Meng Xu
2015-06-25 5:44 GMT-07:00 Dario Faggioli dario.faggi...@citrix.com:
 I've been involved with this scheduler from the very beginning of the
 upstreaming process (from the RT-Xen project to here).

Right! Thank you, Dario, for your help and advice! :-)


 I've been working with Meng and his group closely since then, and I now feel
 comfortable to be the one that will (N)Ack their patches! :-)

I'm not sure what I should reply, but I'm raising my hands and feet to
vote for it. :-)

Thanks,

Meng


---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/



Re: [Xen-devel] vTPM issues

2015-06-25 Thread Marcos Simó Picó
It worked straight away on Ubuntu 15.04.

Thanks a lot for your advice.
On 25 Jun 2015, at 11:52, Emil Condrea emilcond...@gmail.com wrote:

Timeouts have the standard values.
Good luck with installing 15.04.

On Thu, Jun 25, 2015 at 12:34 PM, Marcos Simó Picó marco...@kth.se wrote:

Okay, /etc/tpm0 is present.

The timeout values are:

752000 200 752000 752000 [adjusted]


I have no problem actually upgrading to Ubuntu 15.04 if that might solve the 
problem.


Thanks a lot for your reply again.


From: Emil Condrea emilcond...@gmail.com
Sent: Thursday, 25 June 2015 11:22
To: Marcos Simó Picó
Cc: xen-devel@lists.xen.org
Subject: Re: [Xen-devel] vTPM issues

Sorry, I misspelled, I meant /dev/tpm0 not /etc/tpm0
I remember that once I had this problem when almost all trousers commands
were returning internal software error in domU.
Can you check what are the timeout values?
cat /sys/devices/vtpm-0/timeouts

I remember that there was a bug in ubuntu 14.04 regarding tpm driver.
You could try 14.04.2. I am using Ubuntu 15.04 as domU guest and tpm commands
run successfully.

On Thu, Jun 25, 2015 at 12:10 PM, Marcos Simó Picó marco...@kth.se wrote:

Yes, I'm indeed using pv guests. After running #tcsd -f  I get:

TCSD TDDL ioctl: (25) Inappropriate ioctl for device
TCSD TDDL Falling back to Read/Write device support.
TCSD trousers 0.3.5git: TCSD up and running.


I don't know if the problem might be there. When I invoke tpm_takeownership -z 
-y -l debug it returns exactly the same messages I sent in my previous email.


On the other hand, /sys/devices/vtpm-0 is present, but /etc/tpm0 is not.


Thanks for your reply.



From: Emil Condrea emilcond...@gmail.com
Sent: Thursday, 25 June 2015 10:21
To: Marcos Simó Picó
Cc: xen-devel@lists.xen.org; Xu, Quan
Subject: Re: [Xen-devel] vTPM issues

I guess you are using pv guests, I don't know exactly if Quan finished 
development for hvm.
I suggest to take a look at tcsd log:
pkill tcsd
tcsd -f 
tpm_takeownership -z -y -l debug
Also can you see if /sys/devices/vtpm-0 and /dev/tpm0 are present?

On Wed, Jun 24, 2015 at 6:16 PM, Marcos Simó Picó marco...@kth.se wrote:

Hello everyone,


I would like to try the vTPM feature, but I'm having some issues. Basically, I 
followed the steps explained in 
https://mhsamsal.wordpress.com/2013/12/05/configuring-virtual-tpm-vtpm-for-xen-4-3-guest-virtual-machines/


I'm running Ubuntu 14.04 as Dom0 on a Dell optiplex-9020. I compiled Xen 4.5.0 
from source. After creating vtpmmgr and vtpm stubdoms, and DomU, I can invoke 
tpm_version from DomU:


root@DomU:/home/xen# tpm_version
  TPM 1.2 Version Info:
  Chip Version:1.2.0.7
  Spec Level:  2
  Errata Revision: 1
  TPM Vendor ID:   ETHZ
  TPM Version: 0101
  Manufacturer Info:   4554485a


I can also see the PCRs status by invoking cat 
/sys/class/misc/tpm0/device/pcrs, however, most of the commands return an 
error. When I invoke takeownership I get the following error:


root@DomU:/home/xen# tpm_takeownership -y -z -l debug
Tspi_Context_Create success
Tspi_Context_Connect success
Tspi_Context_GetTpmObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_Context_CreateObject success
Tspi_GetPolicyObject success
Tspi_Policy_SetSecret success
Tspi_TPM_TakeOwnership failed: 0x2004 - layer=tcs, code=0004 (4), Internal 
software error
Tspi_Context_CloseObject success
Tspi_Context_FreeMemory success
Tspi_Context_Close success


The same error is given when invoking tpm_getpubkey. I have already tried after 
clearing the TPM from BIOS, after having taken ownership and with ownership not 
taken, with the same result when using the vTPM. I have also installed Xen 
4.3.4, with the same result too.


In the end, I would like to use the vTPM to generate and use RSA keys for TLS 
session establishing (using the API provided with GnuTLS). Since I cannot take 
ownership of the vTPM, the GnuTLS' tpmtool complains it doesn't find any SRK.


I really appreciate any help you can provide.


Best regards,

Marcos








Re: [Xen-devel] [PATCH v2 09/12] x86/altp2m: add remaining support routines.

2015-06-25 Thread Sahita, Ravi
On 06/25/2015 06:40 AM, Razvan Cojocaru wrote:
 On 06/25/2015 03:44 PM, Lengyel, Tamas wrote:
 On Wed, Jun 24, 2015 at 2:06 PM, Ed White edmund.h.wh...@intel.com wrote:
 On 06/24/2015 09:15 AM, Lengyel, Tamas wrote:
  +bool_t p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
  + unsigned long pfn, xenmem_access_t
  access)
  +{
 
 
  This function IMHO should be merged with p2m_set_mem_access and should 
 be
  triggerable with the same memop (XENMEM_access_op) hypercall instead of
  introducing a new hvmop one.

 I think we should vote on this. My view is that it makes
 XENMEM_access_op
 too complicated to use.

 The two functions are not very long and share enough code that it 
 would justify merging. The only big change added is the copy from 
 host->alt when the entry doesn't exist in alt, and that itself is 
 pretty self contained. Let's see if we can get a third opinion on it..
 
 At first sight (I admit I'm rather late in the game and haven't had a 
 chance to follow the series closely from the beginning), the two 
 functions do seem to be mergeable (or at least the common code 
 factored out in static helper functions).
 
 Also, if Ed's concern is that the libxc API would look unnatural if
 xc_set_mem_access() is used for both purposes, as far as I can tell 
 the only difference could be a non-zero last altp2m parameter, so I 
 agree with you that the fewer functions doing almost the same thing the 
 better (I have been guilty of this in the past too, for example with 
 my
 xc_enable_introspection() function ;) ).
 
 So I'd say, yes, if possible merge them.

So here are my reasons why I don't think we should merge the hypercalls, in 
more detail:

Although the two hypercalls are similar, they are not identical. For one thing, 
the existing hypercall can only be used cross-domain whereas the altp2m one can 
be used cross-domain or intra-domain. Also, the existing hypercall can be used 
to modify a range of pages and the new one can only modify a single page, and 
that is intentional.

As I see it, the implementation in hvm.c would become a lot less clean, and 
every direct user of the existing hypercall would have to change for no good 
reason.

Razvan's suggestion to merge the functions that implement the p2m changes I'm 
more ambivalent about. Personally, I prefer not to have code that contains lots 
of conditional logic, which would be the result, but I don't feel that strongly 
about it.

Ed

Ravi: This also has implications for the XSM hooks used for these hypercalls. 
The altp2m default policy is to allow intra-domain use, which is not the case 
for XENMEM_access_op. 
Any thoughts on how to manage this difference if we merge them?

Ravi


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [linux-3.4 test] 58878: regressions - FAIL

2015-06-25 Thread osstest service user
flight 58878 linux-3.4 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58878/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot  fail REGR. vs. 30511

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install   fail in 58831 pass in 58878
 test-amd64-amd64-pair               10 xen-boot/dst_host fail pass in 58798
 test-amd64-amd64-pair                9 xen-boot/src_host fail pass in 58798
 test-amd64-amd64-xl-sedf-pin         6 xen-boot          fail pass in 58798
 test-amd64-i386-pair                10 xen-boot/dst_host fail pass in 58831
 test-amd64-i386-pair                 9 xen-boot/src_host fail pass in 58831

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-i386-libvirt-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-multivcpu 6 xen-boot fail baseline untested
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-libvirt-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 6 xen-boot fail baseline untested
 test-amd64-amd64-xl-sedf 6 xen-boot fail like 30406
 test-amd64-i386-libvirt 11 guest-start fail like 30511
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 30511
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail like 30511
 test-amd64-amd64-xl-qemuu-ovmf-amd64 6 xen-boot fail like 53709-bisect
 test-amd64-i386-freebsd10-amd64 6 xen-boot fail like 58780-bisect
 test-amd64-i386-xl-qemuu-winxpsp3 6 xen-boot fail like 58786-bisect
 test-amd64-i386-qemut-rhel6hvm-intel 6 xen-boot fail like 58788-bisect
 test-amd64-i386-rumpuserxen-i386 6 xen-boot fail like 58799-bisect
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 6 xen-boot fail like 58801-bisect
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 6 xen-boot fail like 58803-bisect
 test-amd64-amd64-xl-qemut-winxpsp3 6 xen-boot fail like 58804-bisect
 test-amd64-i386-freebsd10-i386 6 xen-boot fail like 58805-bisect
 test-amd64-i386-xl-qemuu-ovmf-amd64 6 xen-boot fail like 58806-bisect
 test-amd64-amd64-xl-qemuu-winxpsp3 6 xen-boot fail like 58807-bisect
 test-amd64-i386-xl-qemut-winxpsp3 6 xen-boot fail like 58808-bisect
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 6 xen-boot fail like 58809-bisect
 test-amd64-amd64-rumpuserxen-amd64 6 xen-boot fail like 58810-bisect
 test-amd64-i386-xl-qemuu-debianhvm-amd64 6 xen-boot fail like 58811-bisect
 test-amd64-amd64-xl-qemut-debianhvm-amd64 6 xen-boot fail like 58813-bisect
 test-amd64-i386-qemuu-rhel6hvm-intel 6 xen-boot fail like 58814-bisect
 test-amd64-i386-xl-qemut-debianhvm-amd64 6 xen-boot fail like 58815-bisect

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  12 migrate-support-check fail in 58831 never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check fail in 58831 never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail never pass

version targeted for testing:
 linux                cf1b3dad6c5699b977273276bada8597636ef3e2
baseline version:
 linux                bb4a05a0400ed6d2f1e13d1f82f289ff74300a70


500 people touched revisions under test,
not listing them all


jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-amd64-i386-xl

Re: [Xen-devel] [PATCH V5 2/7] libxl_read_file_contents: add new entry to read sysfs file

2015-06-25 Thread Chun Yan Liu


 On 6/25/2015 at 07:09 PM, in message
21899.57676.368102.982...@mariner.uk.xensource.com, Ian Jackson
ian.jack...@eu.citrix.com wrote: 
 Chunyan Liu writes ([PATCH V5 2/7] libxl_read_file_contents: add new entry  
 to read sysfs file): 
  Sysfs file has size=4096 but actual file content is less than that. 
  Current libxl_read_file_contents will treat it as error when file size 
  and actual file content differs, so reading sysfs file content with 
  this function always fails. 
   
  Add a new entry libxl_read_sysfs_file_contents to handle sysfs file 
  specially. It would be used in later pvusb work. 
  
 I think this still fails to detect a situation where the file is 
 unexpectedly longer than the requested size ? 


+} else if (feof(f)) {
+if (rs < datalen && tolerate_shrinking_file) {
+datalen = rs;
+} else {

If the file is bigger than the requested size, it will fall into this branch and 
report an error.
Do you mean I should report a separate error message for that case?

- Chunyan

+LOG(ERROR, "%s changed size while we were reading it",
+filename);
+goto xe;
+}
+} else {

  
 As we wrote earlier: 
  
Is there any risk that the file is actually bigger than advertised,  
rather than smaller ?  

   For sysfs file, couldn't be bigger. 
   
  Then you should detect the condition that the file is bigger, and call 
  it an error. 
  
 Thanks, 
 Ian. 
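
A minimal sketch of the check being discussed, not libxl code: read up to the advertised length, then probe for one extra byte so that a file longer than advertised is reported as an error of its own. The names read_bounded and RB_* are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum rb_status { RB_OK, RB_SHRUNK, RB_GREW, RB_IOERR };

/*
 * Read at most `expected` bytes from f into buf.
 * - Fewer bytes than expected: RB_OK if tolerate_shrinking, else RB_SHRUNK.
 * - More data available after `expected` bytes: RB_GREW (always an error).
 * On RB_OK, *got holds the number of bytes actually read.
 */
static enum rb_status read_bounded(FILE *f, char *buf, size_t expected,
                                   int tolerate_shrinking, size_t *got)
{
    size_t rs = fread(buf, 1, expected, f);

    if (ferror(f))
        return RB_IOERR;

    if (rs < expected) {
        if (!tolerate_shrinking)
            return RB_SHRUNK;
        *got = rs;
        return RB_OK;
    }

    /* Full read: probe one more byte to detect a longer file. */
    if (fgetc(f) != EOF)
        return RB_GREW;
    if (ferror(f))
        return RB_IOERR;

    *got = rs;
    return RB_OK;
}
```

With this shape the "bigger than advertised" case gets a distinct result, so the caller can log a message that differs from the shrinking-file one.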
  
  





Re: [Xen-devel] [OSSTEST Nested PATCH v11 6/7] Compose the main recipe of nested test job

2015-06-25 Thread Ian Jackson
Pang, LongtaoX writes (RE: [OSSTEST Nested PATCH v11 6/7] Compose the main 
recipe of nested test job):
  -Original Message-
  From: Ian Campbell [mailto:ian.campb...@citrix.com]
...
  I think you are correct, the logs capture will fail too.
  
  I'll leave it to Ian to suggest a solution since it will no doubt
  involve some tcl plumbing (I'd be inclined to record 'hosts which are
  actually guests' somewhere and have the infra clean them up
  automatically after doing leak check and log collection).

Sorry I haven't done this yet, it's still on my radar.

  I was thinking more along the lines of creating Osstest/PDU/guest.pm
  with the appropriate methods calling out to toolstack($l0)->foo, setting
  $ho->{Power} = 'guest $l1guestname' somewhere and allowing
  power_cycle_host_setup to do its thing.
 
 I have reviewed the power_cycle_host_setup function. Inside it,
 get_host_method_object is called to obtain a $mo, which is then
 assigned to $ho->{PowerMethobjs}, right?  The power_state function
 will then call pdu_power_state, which is defined in guest.pm,
 right?

Yes.

 So I need to define how to power L1 off/on inside the pdu_power_state
 function? I think we need to use the 'xl destroy' and 'xl create'
 commands to implement the power method.

Indeed.  You'll need to use the appropriate toolstack object, in case
it's libvirt or something.  toolstack($ho) where $ho is the L0.

Ian.



Re: [Xen-devel] [PATCH] x86/arm/mm: use gfn instead of pfn in p2m_get_mem_access/p2m_set_mem_access

2015-06-25 Thread Ian Campbell
On Tue, 2015-06-23 at 18:25 +0200, Vitaly Kuznetsov wrote:
 Jan Beulich jbeul...@suse.com writes:
 
  On 26.05.15 at 15:32, vkuzn...@redhat.com wrote:
  --- a/xen/arch/arm/p2m.c
  +++ b/xen/arch/arm/p2m.c
  @@ -1709,9 +1709,9 @@ bool_t p2m_mem_access_check(paddr_t gpa, vaddr_t 
  gla, 
  const struct npfec npfec)
   
   /*
* Set access type for a region of pfns.
  - * If start_pfn == -1ul, sets the default access type.
  + * If start_gfn == -1ul, sets the default access type.
*/
  -long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
  +long p2m_set_mem_access(struct domain *d, unsigned long start_gfn, 
  uint32_t nr,
   uint32_t start, uint32_t mask, xenmem_access_t 
  access)
   {
   struct p2m_domain *p2m = p2m_get_hostp2m(d);
  @@ -1752,14 +1752,15 @@ long p2m_set_mem_access(struct domain *d, unsigned 
  long pfn, uint32_t nr,
   p2m->mem_access_enabled = true;
   
   /* If request to set default access. */
  -if ( pfn == ~0ul )
  +if ( start_gfn == ~0ul )
   {
   p2m->default_access = a;
   return 0;
   }
   
   rc = apply_p2m_changes(d, MEMACCESS,
  -   pfn_to_paddr(pfn+start), pfn_to_paddr(pfn+nr),
  +   pfn_to_paddr(start_gfn + start),
 
  Particularly due to this expression I'm not really happy about the
  start_ prefix that you're adding here, but I'll let the maintainers
  of the respective pieces of code decide if they're happy with it.
 
 Sorry for the ping but it has been almost one month...

Sorry, I must have missed this one, pinging was absolutely the right
thing to do (after a week or two would have been fine, no need to wait a
month).

I'm not super keen on the start_ prefix either, but I would prefer
consistency between arm and x86 here more than I object to the prefix.
IOW my preference would be to drop it everywhere, but if x86 folks
prefer to keep it then I don't mind but ARM should keep it too.

I've also copied the (new) mem access maintainers in case they have an
opinion.

Ian.




Re: [Xen-devel] [v4][PATCH 09/19] tools/libxc: Expose new hypercall xc_reserved_device_memory_map

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:20PM +0800, Tiejun Chen wrote:
 We will introduce the hypercall xc_reserved_device_memory_map
 approach to libxc. This helps us get rdm entry info according to
 different parameters. If flag == PCI_DEV_RDM_ALL, all entries
 should be exposed. Or we just expose that rdm entry specific to
 a SBDF.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 Reviewed-by: Kevin Tian kevin.t...@intel.com

Acked-by: Wei Liu wei.l...@citrix.com



Re: [Xen-devel] pvUSB backend performance

2015-06-25 Thread Juergen Gross

On 06/25/2015 10:53 AM, Dario Faggioli wrote:

On Wed, 2015-06-24 at 14:06 +0200, Juergen Gross wrote:

Hi,

my qemu integrated pvUSB backend is now running stable enough to do
some basic performance measurements. I've passed a memory-stick with
about 90MB of data on it to a pv-domU. Then I read all the data on
it with tar and looked how long this would take (elapsed time):

in dom0: 5.2s
in domU with kernel backend: 6.1s
in domU with qemu backend:   8.2s

So the qemu backend is about 30% slower than the kernel backend. Is
this acceptable?


If I can ask (I know nothing about USB, let alone pvUSB! :-O), and if
you happen to know, what's the situation of other hypervisors, in term
both of support and performance?


No specific knowledge, sorry.


Juergen




Re: [Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 12:51, paul.durr...@citrix.com wrote:
  -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 25 June 2015 11:47
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 17/17] x86/hvm: track large memory mapped
 accesses by buffer offset
 
  On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
  @@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
 
   for ( ;; )
   {
  -rc = hvmemul_do_mmio_buffer(gpa, one_rep, chunk, dir, 0,
  -*buffer);
  -if ( rc != X86EMUL_OKAY )
  -break;
  +/* Have we already done this chunk? */
  +if ( (*off + chunk) <= vio->mmio_cache[dir].size )
 
 I can see why you would like to get rid of the address check, but
 I'm afraid you can't: You have to avoid getting mixed up multiple
 same kind (reads or writes) memory accesses that a single
 instruction can do. While generally I would assume that
 secondary accesses (like the I/O bitmap read associated with an
 OUTS) wouldn't go to MMIO, CMPS with both operands being
 in MMIO would break even if neither crosses a page boundary
 (not to think of when the emulator starts supporting the
 scatter/gather instructions, albeit supporting them will require
 further changes, or we could choose to do them one element at
 a time).
 
 Ok. Can I assume at most two distinct sets of addresses for read or write? If 
 so then I can just keep two sets of caches in the hvm_io struct.

If we can leave out implicit accesses (like the one mentioned)
as well as stack ones, then there shouldn't be more than two
(disjoint) reads and one write per instruction, but each possibly
crossing a page boundary. 

If we want to support stacks in MMIO, enter and leave would
extend that set, as would said implicit accesses. Of course we
should take into consideration what currently works, and I
think both stack and implicit accesses would currently work as
long as they're aligned (as misalignment would be the only
reason for them to get split up - they're never wider than a
long). I.e. you may want to consider avoiding any ASSERT()s
or other conditionals potentially breaking these special cases.

Jan
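
A rough sketch of the idea under discussion, assuming a small fixed number of tracked ranges per direction and an address check on lookup. This is not the hvm_io layout from the patch: struct mmio_cache, mmio_slot_get and chunk_done are hypothetical names, and page-crossing accesses are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 2   /* assumption: two disjoint ranges per direction */

struct mmio_slot {
    bool     valid;
    uint64_t gpa;    /* guest physical address the slot is tracking */
    size_t   size;   /* bytes already completed for this range */
};

struct mmio_cache {
    struct mmio_slot slot[2][CACHE_SLOTS];   /* [dir][slot]: 0 = read, 1 = write */
};

/* Return the slot tracking `gpa` for `dir`, claiming a free one if needed. */
static struct mmio_slot *mmio_slot_get(struct mmio_cache *c, int dir,
                                       uint64_t gpa)
{
    struct mmio_slot *free_slot = NULL;

    for (int i = 0; i < CACHE_SLOTS; i++) {
        struct mmio_slot *s = &c->slot[dir][i];

        if (s->valid && s->gpa == gpa)
            return s;               /* same access as before: reuse it */
        if (!s->valid && free_slot == NULL)
            free_slot = s;
    }

    if (free_slot != NULL) {
        free_slot->valid = true;
        free_slot->gpa = gpa;
        free_slot->size = 0;
    }
    return free_slot;               /* NULL when every slot is taken */
}

/* Has the chunk at [off, off + chunk) already been done for this range? */
static bool chunk_done(const struct mmio_slot *s, size_t off, size_t chunk)
{
    return off + chunk <= s->size;
}
```

Keying each slot on the address is what keeps two same-direction accesses of one instruction (the CMPS case above) from being mistaken for each other.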




[Xen-devel] [PATCH v4 07/11] x86/intel_pstate: the main boby of the intel_pstate driver

2015-06-25 Thread Wei Wang
The intel_pstate driver is ported following its kernel code logic
(commit: 93f0822d). In order to port the Linux source file with
minimal modifications, some of the variable types are kept intact
(e.g. int current_pstate, which would otherwise be changed to
unsigned int).

In the kernel, a user can adjust the limits via sysfs
(limits.min_sysfs_pct/max_sysfs_pct). In Xen, the
policy->limits.min_perf_pct/max_perf_pct acts as the transit station.
A user interacts with it via xenpm.

The new xen/include/asm-x86/cpufreq.h header file is added.

v4 changes:
1) changed the indentation to be a Tab (same as Linux intel_pstate),
   instead of 4 +$;
2) added a new header file, xen/include/asm-x86/cpufreq.h.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/arch/x86/acpi/cpufreq/Makefile   |   1 +
 xen/arch/x86/acpi/cpufreq/intel_pstate.c | 870 +++
 xen/include/asm-x86/cpufreq.h|  34 ++
 xen/include/asm-x86/msr-index.h  |   3 +
 4 files changed, 908 insertions(+)
 create mode 100644 xen/arch/x86/acpi/cpufreq/intel_pstate.c
 create mode 100644 xen/include/asm-x86/cpufreq.h

diff --git a/xen/arch/x86/acpi/cpufreq/Makefile 
b/xen/arch/x86/acpi/cpufreq/Makefile
index f75da9b..99fa9f4 100644
--- a/xen/arch/x86/acpi/cpufreq/Makefile
+++ b/xen/arch/x86/acpi/cpufreq/Makefile
@@ -1,2 +1,3 @@
 obj-y += cpufreq.o
+obj-y += intel_pstate.o
 obj-y += powernow.o
diff --git a/xen/arch/x86/acpi/cpufreq/intel_pstate.c 
b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
new file mode 100644
index 000..19c74cc
--- /dev/null
+++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
@@ -0,0 +1,870 @@
+#include <xen/kernel.h>
+#include <xen/types.h>
+#include <xen/init.h>
+#include <xen/bitmap.h>
+#include <xen/cpumask.h>
+#include <xen/timer.h>
+#include <asm/msr.h>
+#include <asm/msr-index.h>
+#include <asm/processor.h>
+#include <asm/div64.h>
+#include <asm/cpufreq.h>
+#include <acpi/cpufreq/cpufreq.h>
+
+#define BYT_RATIOS   0x66a
+#define BYT_VIDS 0x66b
+#define BYT_TURBO_RATIOS  0x66c
+#define BYT_TURBO_VIDS   0x66d
+
+#define FRAC_BITS 8
+#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
+#define fp_toint(X) ((X) >> FRAC_BITS)
+
+static inline int32_t mul_fp(int32_t x, int32_t y)
+{
+   return ((int64_t)x * (int64_t)y) >> FRAC_BITS;
+}
+
+static inline int32_t div_fp(int32_t x, int32_t y)
+{
+   return div_s64((int64_t)x << FRAC_BITS, y);
+}
+
+static inline int ceiling_fp(int32_t x)
+{
+   int mask, ret;
+
+   ret = fp_toint(x);
+   mask = (1 << FRAC_BITS) - 1;
+   if (x & mask)
+   ret += 1;
+   return ret;
+}
+
+struct sample {
+   int32_t core_pct_busy;
+   u64 aperf;
+   u64 mperf;
+   int freq;
+   s_time_t time;
+};
+
+struct pstate_data {
+   int current_pstate;
+   int min_pstate;
+   int max_pstate;
+   int scaling;
+   int turbo_pstate;
+};
+
+struct vid_data {
+   int min;
+   int max;
+   int turbo;
+   int32_t ratio;
+};
+
+struct _pid {
+   int setpoint;
+   int32_t integral;
+   int32_t p_gain;
+   int32_t i_gain;
+   int32_t d_gain;
+   int deadband;
+   int32_t last_err;
+};
+
+struct cpudata {
+   int cpu;
+
+   struct timer timer;
+
+   struct pstate_data pstate;
+   struct vid_data vid;
+   struct _pid pid;
+
+   s_time_t last_sample_time;
+   u64 prev_aperf;
+   u64 prev_mperf;
+   struct sample sample;
+};
+
+static struct cpudata **all_cpu_data;
+
+struct pstate_adjust_policy {
+   int sample_rate_ms;
+   int deadband;
+   int setpoint;
+   int p_gain_pct;
+   int d_gain_pct;
+   int i_gain_pct;
+};
+
+struct pstate_funcs {
+   int (*get_max)(void);
+   int (*get_min)(void);
+   int (*get_turbo)(void);
+   int (*get_scaling)(void);
+   void (*set)(struct perf_limits *, struct cpudata *, int pstate);
+   void (*get_vid)(struct cpudata *);
+};
+
+struct cpu_defaults {
+   struct pstate_adjust_policy pid_policy;
+   struct pstate_funcs funcs;
+};
+
+static struct pstate_adjust_policy pid_params;
+static struct pstate_funcs pstate_funcs;
+
+static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
+int deadband, int integral) {
+   pid->setpoint = setpoint;
+   pid->deadband  = deadband;
+   pid->integral  = int_tofp(integral);
+   pid->last_err  = int_tofp(setpoint) - int_tofp(busy);
+}
+
+static inline void pid_p_gain_set(struct _pid *pid, int percent)
+{
+   pid->p_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_i_gain_set(struct _pid *pid, int percent)
+{
+   pid->i_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_d_gain_set(struct _pid *pid, int percent)
+{
+   pid->d_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static signed int pid_calc(struct _pid *pid, int32_t busy)
+{
+   signed int result;
+   int32_t pterm, dterm, 
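
For readers following the fixed-point helpers in this patch, here is a standalone sketch of how a percentage becomes a gain with FRAC_BITS = 8. It assumes plain 64-bit division in place of div_s64(), and gain_from_percent is a hypothetical name that mirrors what the pid_*_gain_set() helpers compute.

```c
#include <stdint.h>

#define FRAC_BITS 8
#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
#define fp_toint(X) ((X) >> FRAC_BITS)

/* Stand-in for div_s64(): plain signed 64-bit division. */
static inline int32_t div_fp(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x << FRAC_BITS) / y);
}

static inline int32_t mul_fp(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> FRAC_BITS);
}

/* Gain given as a percentage, e.g. 20 -> roughly 0.20 in fixed point. */
static inline int32_t gain_from_percent(int percent)
{
    return div_fp((int32_t)int_tofp(percent), (int32_t)int_tofp(100));
}
```

Worked through by hand: int_tofp(20) is 5120 and int_tofp(100) is 25600, so the gain is (5120 << 8) / 25600 = 51, i.e. 51/256, about 0.199. Multiplying that gain by int_tofp(10) gives 510 in fixed point, which fp_toint() truncates to 1 rather than 2, so rounding loss is worth keeping in mind with only 8 fractional bits.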

[Xen-devel] [PATCH v4 09/11] x86/intel_pstate: add a booting param to select the driver to load

2015-06-25 Thread Wei Wang
By default, the old P-state driver (acpi-freq) is used. Add
intel_pstate to the Xen boot parameter list to enable the
intel_pstate driver. However, if intel_pstate is enabled on a
machine which does not support the driver (e.g. Nehalem), the
old P-state driver is loaded instead, because intel_pstate
fails to load.

Also, add the intel_pstate boot parameter to
xen-command-line.markdown.

v4 changes:
1) moved the definition of load_intel_pstate right ahead of
intel_pstate_init();
2) merged the previous patch, adding the booting param to
xen-command-line.markdown, into this one.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 docs/misc/xen-command-line.markdown  | 7 +++
 xen/arch/x86/acpi/cpufreq/cpufreq.c  | 9 ++---
 xen/arch/x86/acpi/cpufreq/intel_pstate.c | 6 ++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 4889e27..249bf65 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -830,6 +830,13 @@ debug hypervisor only).
 ### idle\_latency\_factor
  `= integer`
 
+### intel\_pstate
+ `= boolean`
+
+ Default: `false`
+
+Enable the loading of the intel pstate driver.
+
 ### ioapic\_ack
  `= old | new`
 
diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c 
b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index 643c405..e737437 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -41,6 +41,7 @@
 #include <asm/processor.h>
 #include <asm/percpu.h>
 #include <asm/cpufeature.h>
+#include <asm/cpufreq.h>
 #include <acpi/acpi.h>
 #include <acpi/cpufreq/cpufreq.h>
 
@@ -648,9 +649,11 @@ static int __init cpufreq_driver_init(void)
 int ret = 0;
 
 if ((cpufreq_controller == FREQCTL_xen) &&
-(boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
-ret = cpufreq_register_driver(&acpi_cpufreq_driver);
-else if ((cpufreq_controller == FREQCTL_xen) &&
+(boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
+ret = intel_pstate_init();
+if (ret)
+ret = cpufreq_register_driver(&acpi_cpufreq_driver);
+} else if ((cpufreq_controller == FREQCTL_xen) &&
 (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
 ret = powernow_register_driver();
 
diff --git a/xen/arch/x86/acpi/cpufreq/intel_pstate.c 
b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
index 19c74cc..5e03625 100644
--- a/xen/arch/x86/acpi/cpufreq/intel_pstate.c
+++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
@@ -831,12 +831,18 @@ static void __init copy_cpu_funcs(struct pstate_funcs 
*funcs)
   pstate_funcs.get_vid   = funcs->get_vid;
 }
 
+static bool_t __initdata load_intel_pstate;
+boolean_param("intel_pstate", load_intel_pstate);
+
 int __init intel_pstate_init(void)
 {
int cpu, rc = 0;
const struct x86_cpu_id *id;
struct cpu_defaults *cpu_info;
 
+   if (!load_intel_pstate)
+   return -ENODEV;
+
id = x86_match_cpu(intel_pstate_cpu_ids);
if (!id)
return -ENODEV;
-- 
1.9.1
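
The fallback behaviour described in the commit message can be sketched in isolation as below. This is only an illustration of the control flow, not the Xen code: struct platform, try_intel_pstate and select_intel_driver are hypothetical stand-ins, and the real driver performs its own CPU ID matching.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two drivers; not Xen code. */
enum driver { DRIVER_INTEL_PSTATE, DRIVER_ACPI_CPUFREQ };

struct platform {
    bool intel_pstate_param;   /* "intel_pstate" given on the command line */
    bool cpu_supported;        /* CPU model is in the driver's ID table */
};

/* Mirrors the early exits of the init routine: 0 on success, -ENODEV otherwise. */
static int try_intel_pstate(const struct platform *p)
{
    if (!p->intel_pstate_param)
        return -ENODEV;        /* not requested: behave as if absent */
    if (!p->cpu_supported)
        return -ENODEV;        /* e.g. Nehalem: no matching CPU ID */
    return 0;
}

/* On an Intel system, fall back to the ACPI driver if intel_pstate fails. */
static enum driver select_intel_driver(const struct platform *p)
{
    if (try_intel_pstate(p) == 0)
        return DRIVER_INTEL_PSTATE;
    return DRIVER_ACPI_CPUFREQ;
}
```

Because both the "not requested" and "not supported" cases return the same error, the caller needs only one fallback path.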




[Xen-devel] [PATCH v4 10/11] x86/intel_pstate: support the use of intel_pstate in pmstat.c

2015-06-25 Thread Wei Wang
Add support in the pmstat.c so that the xenpm tool can request to
access the intel_pstate driver.

v4 changes:
1) changed to use the internal_governor struct;
2) coding style change (indentation of gov_num++).

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 tools/libxc/xc_pm.c |   4 +-
 xen/drivers/acpi/pmstat.c   | 148 
 xen/include/public/sysctl.h |  16 -
 3 files changed, 138 insertions(+), 30 deletions(-)

diff --git a/tools/libxc/xc_pm.c b/tools/libxc/xc_pm.c
index 5a7148e..823bab6 100644
--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -265,8 +265,8 @@ int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
 user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
 user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
 user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
-user_para->scaling_max_freq = sys_para->scaling_max_freq;
-user_para->scaling_min_freq = sys_para->scaling_min_freq;
+user_para->scaling_max_freq = sys_para->scaling_max.freq;
+user_para->scaling_min_freq = sys_para->scaling_min.freq;
 user_para->turbo_enabled= sys_para->turbo_enabled;
 
 memcpy(user_para-scaling_driver,
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index daac2da..89628aa 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -192,22 +192,33 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
 uint32_t ret = 0;
 const struct processor_pminfo *pmpt;
 struct cpufreq_policy *policy;
+struct perf_limits *limits;
+struct internal_governor *internal_gov;
 uint32_t gov_num = 0;
 uint32_t *affected_cpus;
 uint32_t *scaling_available_frequencies;
 char *scaling_available_governors;
 struct list_head *pos;
 uint32_t cpu, i, j = 0;
+uint32_t cur_gov;
 
 pmpt = processor_pminfo[op->cpuid];
 policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+limits = &policy->limits;
+internal_gov = policy->internal_gov;
+cur_gov = internal_gov ? internal_gov->cur_gov : 0;
 
 if ( !pmpt || !pmpt->perf.states ||
- !policy || !policy->governor )
+ !policy || (!policy->governor && !policy->internal_gov) )
 return -EINVAL;
 
-list_for_each(pos, &cpufreq_governor_list)
-gov_num++;
+if (internal_gov)
+gov_num = internal_gov->gov_num;
+else
+{
+list_for_each(pos, &cpufreq_governor_list)
+gov_num++;
+}
 
 if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
  (op->u.get_para.freq_num != pmpt->perf.state_count)||
@@ -241,28 +252,47 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
 if ( ret )
 return ret;
 
-if ( !(scaling_available_governors =
-   xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
-return -ENOMEM;
-if ( (ret = read_scaling_available_governors(scaling_available_governors,
-gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
+if (internal_gov)
 {
+scaling_available_governors = internal_gov->avail_gov;
+ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
+if ( ret )
+return ret;
+}
+else
+{
+if ( !(scaling_available_governors =
+   xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
+return -ENOMEM;
+if ( (ret = 
read_scaling_available_governors(scaling_available_governors,
+gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
+{
+xfree(scaling_available_governors);
+return ret;
+}
+ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
 xfree(scaling_available_governors);
-return ret;
+if ( ret )
+return ret;
 }
-ret = copy_to_guest(op->u.get_para.scaling_available_governors,
-scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
-xfree(scaling_available_governors);
-if ( ret )
-return ret;
-
 op->u.get_para.cpuinfo_cur_freq =
 cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
 op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
 op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
 op->u.get_para.scaling_cur_freq = policy->cur;
-op->u.get_para.scaling_max_freq = policy->max;
-op->u.get_para.scaling_min_freq = policy->min;
+if (internal_gov)
+{
+op->u.get_para.scaling_max.pct = limits->max_perf_pct;
+op->u.get_para.scaling_min.pct = limits->min_perf_pct;
+op->u.get_para.scaling_turbo_pct = limits->turbo_pct;
+}
+else
+{
+op->u.get_para.scaling_max.freq = policy->max;
+op->u.get_para.scaling_min.freq = policy->min;
+}
 
 if ( 

Re: [Xen-devel] [PATCH 8/8] xen/x86: Additional SMAP modes to work around buggy 32bit PV guests

2015-06-25 Thread David Vrabel
On 24/06/15 17:31, Andrew Cooper wrote:
 Experimentally, older Linux guests perform construction of `init` with user
 pagetable mappings.  This is fine for native systems as such a guest would not
 set CR4.SMAP itself.
 
 However if Xen uses SMAP itself, 32bit PV guests (whose kernels run in ring1)
 are also affected.  Older Linux guests end up spinning in a loop assuming that
 the SMAP violation pagefaults are spurious, and make no further progress.
 
 One option is to disable SMAP completely, but this is unreasonable.  A better
 alternative is to disable SMAP only in the context of 32bit PV guests, but
 this reduces the effectiveness of SMAP security.  A 3rd option is for Xen to
 fix up behind a 32bit guest if it were SMAP-aware.  It is a heuristic, and does
 result in a guest-visible state change, but allows Xen to keep CR4.SMAP
 unconditionally enabled.
[...]
 --- a/docs/misc/xen-command-line.markdown
 +++ b/docs/misc/xen-command-line.markdown
 @@ -1261,11 +1261,32 @@ Set the serial transmit buffer size.
  Flag to enable Supervisor Mode Execution Protection
  
  ### smap
 - `= boolean`
 + `= boolean | compat | fixup`
  
   Default: `true`
  
 -Flag to enable Supervisor Mode Access Prevention
 +Handling of Supervisor Mode Access Prevention.
 +
 +32bit PV guest kernels qualify as supervisor code, as they execute in ring 1.
 +If Xen uses SMAP protection itself, a PV guest which is not SMAP aware may
 +suffer unexpected pagefaults which it cannot handle. (Experimentally, there
 +are 32bit PV guests which fall foul of SMAP enforcement and spin in an
 +infinite loop taking pagefaults early on boot.)
 +
 +Two further SMAP modes are introduced to work around buggy 32bit PV guests to
 +prevent functional regressions of VMs on newer hardware.  At any point if the
 +guest sets `CR4.SMAP` itself, it is deemed aware, and **compat/fixup** cease
 +to apply.

Guests that are not aware of SMAP or do not support it are not buggy.

 +
 +A SMAP mode of **compat** causes Xen to disable `CR4.SMAP` in the context of
 +an unaware 32bit PV guest.  This prevents the guest from being subject to SMAP
 +enforcement, but also prevents Xen from benefiting from the added security
 +checks.
 +
 +A SMAP mode of **fixup** causes Xen to set `EFLAGS.AC` when discovering a SMAP
 +pagefault in the context of an unaware 32bit PV guest.  This allows Xen to
 +retain the added security from SMAP checks, but results in a guest-visible
 +state change which it might object to.

What does the PV ABI say about the use of EFLAGS.AC?  Have guests
historically been allowed to use this bit?  If so, does Xen fiddling
with it potentially break some guests?

David




[Xen-devel] [PATCH 3/4] xen: credit1: properly deal with pCPUs not in any cpupool

2015-06-25 Thread Dario Faggioli
Ideally, the pCPUs that are 'free', i.e., not assigned
to any cpupool, should not be considered by the scheduler
for load balancing or anything. In Credit1, we fail at
this, because of how we use cpupool_scheduler_cpumask().
In fact, for a free pCPU, cpupool_scheduler_cpumask()
returns a pointer to cpupool_free_cpus, and hence, near
the top of csched_load_balance():

 if ( unlikely(!cpumask_test_cpu(cpu, online)) )
 goto out;

is false (the pCPU _is_ free!), and we therefore do not
jump to the end right away, as we should. This causes
the following splat when resuming from ACPI S3 with
pCPUs not assigned to any pool:

(XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
(XEN) ... ... ...
(XEN) Xen call trace:
(XEN)[82d080122eaa] csched_load_balance+0x213/0x794
(XEN)[82d08012374c] csched_schedule+0x321/0x452
(XEN)[82d08012c85e] schedule+0x12a/0x63c
(XEN)[82d08012fa09] __do_softirq+0x82/0x8d
(XEN)[82d08012fa61] do_softirq+0x13/0x15
(XEN)[82d080164780] idle_loop+0x5b/0x6b
(XEN)
(XEN)
(XEN) 
(XEN) Panic on CPU 8:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

The cure is:
 * use cpupool_online_cpumask(), as a better guard to the
   case when the cpu is being offlined;
 * explicitly check whether the cpu is free.

SEDF is in a similar situation, so fix it too.

Still in Credit1, we must make sure that free (or offline)
CPUs are not considered ticklable. Not doing so would impair
the load balancing algorithm, making the scheduler think that
it is possible to 'ask' the pCPU to pick up some work, while
in reality, that will never happen! Evidence of such behavior
is shown in this trace:

 Name   CPU list
 Pool-0 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14

0.112998198 | ||.|| -|x||-|- d0v0 runstate_change d0v4 offline->runnable
 ]  0.112998198 | ||.|| -|x||-|- d0v0   22006(2:2:6) 1 [ f ]
 ]  0.112999612 | ||.|| -|x||-|- d0v0   28004(2:8:4) 2 [ 0 4 ]
0.113003387 | ||.|| --|x d32767v15 runstate_continue d32767v15 running->running

where 22006(2:2:6) 1 [ f ] means that pCPU 15, which is
free from any pool, is tickled.

The cure, in this case, is to filter out the free pCPUs,
within __runq_tickle().
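
The two guards described above can be modelled with plain bitmasks, as a sketch only. This does not use the Xen cpumask API: mask_t, struct pool, skip_load_balance and ticklable are hypothetical names, and representing a free pCPU as a NULL pool pointer is an assumption made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per pCPU; not the Xen cpumask API. */
typedef uint32_t mask_t;

struct pool {
    mask_t online;   /* pCPUs that belong to this pool and are online */
};

/* True if `cpu` must not try to steal work from its peers. */
static bool skip_load_balance(const struct pool *pool, unsigned int cpu)
{
    if (pool == NULL)                       /* free pCPU: in no pool */
        return true;
    return !(pool->online & (1u << cpu));   /* going offline */
}

/* Idle pCPUs that may be tickled: the idlers restricted to the pool. */
static mask_t ticklable(const struct pool *pool, mask_t idlers)
{
    return pool ? (idlers & pool->online) : 0;
}
```

With pCPUs 0 to 14 in the pool and pCPU 15 free, an idlers mask that includes bit 15 is filtered down to the pool's own idle pCPUs, so the free pCPU is never asked to pick up work.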

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: Juergen Gross jgr...@suse.com
---
 xen/common/sched_credit.c |   23 ---
 xen/common/sched_sedf.c   |3 ++-
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 953ecb0..a1945ac 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -366,12 +366,17 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
 {
 struct csched_vcpu * const cur = CSCHED_VCPU(curr_on_cpu(cpu));
 struct csched_private *prv = CSCHED_PRIV(per_cpu(scheduler, cpu));
-cpumask_t mask, idle_mask;
+cpumask_t mask, idle_mask, *online;
 int balance_step, idlers_empty;
 
 ASSERT(cur);
 cpumask_clear(&mask);
-idlers_empty = cpumask_empty(prv->idlers);
+
+/* cpu is vc->processor, so it must be in a cpupool. */
+ASSERT(per_cpu(cpupool, cpu) != NULL);
+online = cpupool_online_cpumask(per_cpu(cpupool, cpu));
+cpumask_and(&idle_mask, prv->idlers, online);
+idlers_empty = cpumask_empty(&idle_mask);
 
 
 /*
@@ -408,8 +413,8 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
 /* Are there idlers suitable for new (for this balance step)? */
 csched_balance_cpumask(new->vcpu, balance_step,
csched_balance_mask);
-cpumask_and(&idle_mask, prv->idlers, csched_balance_mask);
-new_idlers_empty = cpumask_empty(&idle_mask);
+cpumask_and(csched_balance_mask, csched_balance_mask, &idle_mask);
+new_idlers_empty = cpumask_empty(csched_balance_mask);
 
 /*
  * Let's not be too harsh! If there aren't idlers suitable
@@ -1510,6 +1515,7 @@ static struct csched_vcpu *
 csched_load_balance(struct csched_private *prv, int cpu,
 struct csched_vcpu *snext, bool_t *stolen)
 {
+struct cpupool *c = per_cpu(cpupool, cpu);
 struct csched_vcpu *speer;
 cpumask_t workers;
 cpumask_t *online;
@@ -1517,10 +1523,13 @@ csched_load_balance(struct csched_private *prv, int cpu,
 int node = cpu_to_node(cpu);
 
 BUG_ON( cpu != snext->vcpu->processor );
-online = cpupool_scheduler_cpumask(per_cpu(cpupool, cpu));
+online = cpupool_online_cpumask(c);
 
-/* If this CPU is going offline we shouldn't steal work. */
-if ( unlikely(!cpumask_test_cpu(cpu, online)) )
+/*
+ * If this CPU is going offline, or is not (yet) part of any cpupool
+ * (as it happens, e.g., during cpu bringup), we shouldn't steal work.
+ */
+if ( unlikely(!cpumask_test_cpu(cpu, online) || 

[Xen-devel] [PATCH 0/4] xen: sched / cpupool: fixes and improvements, mostly for when suspend/resume is involved

2015-06-25 Thread Dario Faggioli
This is mostly about fixing bugs showing up during suspend/resume, with
non-default configurations, such as pCPUs free from any cpupool, more than one
cpupool in the system, etc.

I tried a few different approaches for dealing with these cases. For instance,
I tried creating an 'idle cpupool', and then putting the free pCPUs there,
instead of sort-of parking them in cpupool0 (although in a special
condition), like we're doing now, but that introduces other issues.  I think
this series is the least invasive, and yet correct, way of dealing with the
situation.

In some more detail:
 * patch 1 is just refactoring/beautifying dump output;
 * patch 2 is the fix for a bug showing up during resume, when two or more
   cpupools exist;
 * patch 3 fixes a bug (in the suspend/resume path again) and also improves
   Credit1 behavior, i.e., stops it from considering pCPUs that are outside
   of any pool as potential candidates where to execute vCPUs;
 * patch 4 is refactoring again, with the intent of making what made patch
   3 necessary less likely to happen! :-)

Thanks and Regards,
Dario
---
Dario Faggioli (4):
  xen: sched: avoid dumping duplicate information
  xen: x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown
  xen: credit1: properly deal with pCPUs not in any cpupool
  xen: sched: get rid of cpupool_scheduler_cpumask()

 xen/arch/x86/smpboot.c  |1 -
 xen/common/cpupool.c|8 +---
 xen/common/domain.c |5 +++--
 xen/common/domctl.c |4 ++--
 xen/common/sched_arinc653.c |2 +-
 xen/common/sched_credit.c   |   27 ++-
 xen/common/sched_rt.c   |   12 ++--
 xen/common/sched_sedf.c |5 +++--
 xen/common/schedule.c   |   20 ++--
 xen/include/xen/sched-if.h  |   12 ++--
 10 files changed, 62 insertions(+), 34 deletions(-)

--
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems RD Ltd., Cambridge (UK)

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 4/4] xen: sched: get rid of cpupool_scheduler_cpumask()

2015-06-25 Thread Dario Faggioli
and of (almost every) direct use of cpupool_online_cpumask().

In fact, what we really want, most of the time,
is the set of valid pCPUs of the cpupool a certain domain
is part of. Furthermore, in case it's called with a NULL
pool as argument, cpupool_scheduler_cpumask() does more
harm than good, by returning the bitmask of free pCPUs!

This commit, therefore:
 * gets rid of cpupool_scheduler_cpumask(), in favour of
   cpupool_domain_cpumask(), which makes it more evident
   what we are after, and accommodates some sanity checking;
 * replaces some of the calls to cpupool_online_cpumask()
   with calls to the new functions too.
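
As a rough illustration of the difference between the two helpers, here is a
self-contained toy model. All names (toy_cpupool, toy_domain,
toy_scheduler_cpumask(), toy_domain_cpumask(), toy_free_cpus) are hypothetical
stand-ins and a 32-bit word replaces cpumask_t; the real definitions live in
xen/include/xen/sched-if.h and may well differ. The sketch only shows why a
NULL pool silently mapping to the free pCPUs is dangerous, and how taking the
domain as argument leaves room for a sanity check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_cpumask_t;

struct toy_cpupool { toy_cpumask_t cpu_valid; };   /* pCPUs of the pool */
struct toy_domain  { struct toy_cpupool *cpupool; };

/* pCPUs not assigned to any pool (pCPU 15 in this toy setup). */
static toy_cpumask_t toy_free_cpus = 1u << 15;

/* Old style: a NULL pool silently yields the mask of free pCPUs. */
static const toy_cpumask_t *toy_scheduler_cpumask(const struct toy_cpupool *c)
{
    return c == NULL ? &toy_free_cpus : &c->cpu_valid;
}

/* New style: ask on behalf of a domain, which must belong to a pool. */
static const toy_cpumask_t *toy_domain_cpumask(const struct toy_domain *d)
{
    assert(d->cpupool != NULL);   /* the sanity check the old helper lacked */
    return &d->cpupool->cpu_valid;
}
```

A caller that passes a NULL pool to the old-style helper gets back a mask of
pCPUs that no scheduler is managing, which is exactly what must not end up in
a vCPU placement decision.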

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: Juergen Gross jgr...@suse.com
Cc: Robert VanVossen robert.vanvos...@dornerworks.com
Cc: Josh Whitehead josh.whiteh...@dornerworks.com
Cc: Meng Xu men...@cis.upenn.edu
Cc: Sisu Xi xis...@gmail.com
---
 xen/common/domain.c |5 +++--
 xen/common/domctl.c |4 ++--
 xen/common/sched_arinc653.c |2 +-
 xen/common/sched_credit.c   |6 +++---
 xen/common/sched_rt.c   |   12 ++--
 xen/common/sched_sedf.c |2 +-
 xen/common/schedule.c   |2 +-
 xen/include/xen/sched-if.h  |   12 ++--
 8 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3bc52e6..c20accb 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -184,7 +184,8 @@ struct vcpu *alloc_vcpu(
 /* Must be called after making new vcpu visible to for_each_vcpu(). */
 vcpu_check_shutdown(v);
 
-domain_update_node_affinity(d);
+if ( !is_idle_domain(d) )
+domain_update_node_affinity(d);
 
 return v;
 }
@@ -437,7 +438,7 @@ void domain_update_node_affinity(struct domain *d)
 return;
 }
 
-online = cpupool_online_cpumask(d->cpupool);
+online = cpupool_domain_cpumask(d);
 
 spin_lock(&d->node_affinity_lock);
 
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 2a2d203..a399aa6 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -664,7 +664,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 goto maxvcpu_out;
 
 ret = -ENOMEM;
-online = cpupool_online_cpumask(d->cpupool);
+online = cpupool_domain_cpumask(d);
 if ( max > d->max_vcpus )
 {
 struct vcpu **vcpus;
@@ -748,7 +748,7 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 if ( op->cmd == XEN_DOMCTL_setvcpuaffinity )
 {
 cpumask_var_t new_affinity, old_affinity;
-cpumask_t *online = cpupool_online_cpumask(v->domain->cpupool);;
+cpumask_t *online = cpupool_domain_cpumask(v->domain);;
 
 /*
  * We want to be able to restore hard affinity if we are trying
diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index cff5da9..dbe02ed 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -667,7 +667,7 @@ a653sched_pick_cpu(const struct scheduler *ops, struct vcpu *vc)
  * If present, prefer vc's current processor, else
  * just find the first valid vcpu .
  */
-online = cpupool_scheduler_cpumask(vc->domain->cpupool);
+online = cpupool_domain_cpumask(vc->domain);
 
 cpu = cpumask_first(online);
 
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index a1945ac..8c36635 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -309,7 +309,7 @@ __runq_remove(struct csched_vcpu *svc)
 static inline int __vcpu_has_soft_affinity(const struct vcpu *vc,
const cpumask_t *mask)
 {
-return !cpumask_subset(cpupool_online_cpumask(vc->domain->cpupool),
+return !cpumask_subset(cpupool_domain_cpumask(vc->domain),
vc->cpu_soft_affinity) &&
!cpumask_subset(vc->cpu_hard_affinity, vc->cpu_soft_affinity) &&
cpumask_intersects(vc->cpu_soft_affinity, mask);
@@ -374,7 +374,7 @@ __runq_tickle(unsigned int cpu, struct csched_vcpu *new)
 
 /* cpu is vc->processor, so it must be in a cpupool. */
 ASSERT(per_cpu(cpupool, cpu) != NULL);
-online = cpupool_online_cpumask(per_cpu(cpupool, cpu));
+online = cpupool_domain_cpumask(new->sdom->dom);
 cpumask_and(&idle_mask, prv->idlers, online);
 idlers_empty = cpumask_empty(&idle_mask);
 
@@ -641,7 +641,7 @@ _csched_cpu_pick(const struct scheduler *ops, struct vcpu *vc, bool_t commit)
 int balance_step;
 
 /* Store in cpus the mask of online cpus on which the domain can run */
-online = cpupool_scheduler_cpumask(vc->domain->cpupool);
+online = cpupool_domain_cpumask(vc->domain);
 cpumask_and(&cpus, vc->cpu_hard_affinity, online);
 
 for_each_csched_balance_step( balance_step )
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 4372486..08611c8 100644
--- a/xen/common/sched_rt.c

Re: [Xen-devel] [v4][PATCH 12/19] tools/libxl: passes rdm reservation policy

2015-06-25 Thread Ian Campbell
On Tue, 2015-06-23 at 17:57 +0800, Tiejun Chen wrote:
 This patch passes our rdm reservation policy inside libxl
 when we assign a device or attach a device.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
 v4:
 
 * Fix one typo, s/unkwon/unknown
 * In command description, we should use [] to indicate it's optional
   for that extended xl command, pci-attach.
 
  docs/man/xl.pod.1 |  7 ++-
  tools/libxl/libxl_pci.c   | 10 +-
  tools/libxl/xl_cmdimpl.c  | 23 +++
  tools/libxl/xl_cmdtable.c |  2 +-
  4 files changed, 35 insertions(+), 7 deletions(-)
 
 diff --git a/docs/man/xl.pod.1 b/docs/man/xl.pod.1
 index 4eb929d..c5c4809 100644
 --- a/docs/man/xl.pod.1
 +++ b/docs/man/xl.pod.1
 @@ -1368,10 +1368,15 @@ it will also attempt to re-bind the device to its 
 original driver, making it
  usable by Domain 0 again.  If the device is not bound to pciback, it will
  return success.
  
 -=item Bpci-attach Idomain-id IBDF
 +=item Bpci-attach Idomain-id IBDF [Irdm]
  
  Hot-plug a new pass-through pci device to the specified domain.
  BBDF is the PCI Bus/Device/Function of the physical device to pass-through.
 +Brdm policy is about how to handle conflict between reserving reserved 
 device

s/is about/specifies/ and I think s/between/while/

 +memory and guest address space. strict means an unsolved conflict leads to

I think you mean in rather than and?


 +immediate VM crash, while relaxed allows VM moving forward with a warning
 +message thrown out. Here strict is default.

The default is strict.

You've repeated the list of allowed values for this two or three times
now in the various docs, perhaps try and centralise on one definition
and cross reference instead?





Re: [Xen-devel] [PATCH OSSTEST v3 21/22] Debian: Arrange to be able to chainload a xen.efi from grub2

2015-06-25 Thread Ian Jackson
Ian Campbell writes (Re: [PATCH OSSTEST v3 21/22] Debian: Arrange to be able 
to chainload a xen.efi from grub2):
 On Thu, 2015-06-25 at 13:36 +0100, Ian Jackson wrote:
  I think people are working on a better way is what I was looking
  for.  When that change comes along, we can remove 20_linux_xen ?
 
 OK.

By `OK' do you mean `yes' ?

Ian.



Re: [Xen-devel] [PATCH v2 09/12] x86/altp2m: add remaining support routines.

2015-06-25 Thread Razvan Cojocaru
On 06/25/2015 03:44 PM, Lengyel, Tamas wrote:
 On Wed, Jun 24, 2015 at 2:06 PM, Ed White edmund.h.wh...@intel.com
 mailto:edmund.h.wh...@intel.com wrote:
 On 06/24/2015 09:15 AM, Lengyel, Tamas wrote:
  +bool_t p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
  + unsigned long pfn, xenmem_access_t
  access)
  +{
 
 
  This function IMHO should be merged with p2m_set_mem_access and should 
 be
  triggerable with the same memop (XENMEM_access_op) hypercall instead of
  introducing a new hvmop one.
 
 I think we should vote on this. My view is that it makes
 XENMEM_access_op
 too complicated to use.
 
 The two functions are not very long and share enough code that it would
 justify merging. The only big change added is the copy from host->alt
 when the entry doesn't exist in alt, and that itself is pretty self
 contained. Let's see if we can get a third opinion on it..

At first sight (I admit I'm rather late in the game and haven't had a
chance to follow the series closely from the beginning), the two
functions do seem to be mergeable (or at least the common code factored
out in static helper functions).

Also, if Ed's concern is that the libxc API would look unnatural if
xc_set_mem_access() is used for both purposes, as far as I can tell the
only difference could be a non-zero last altp2m parameter, so I agree
with you that the fewer functions doing almost the same thing, the better
(I have been guilty of this in the past too, for example with my
xc_enable_introspection() function ;) ).

So I'd say, yes, if possible merge them.


Regards,
Razvan



Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: 25 June 2015 14:38
 To: Paul Durrant; Jan Beulich
 Cc: xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
 On 25/06/15 14:36, Paul Durrant wrote:
  -Original Message-
  From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
  Sent: 25 June 2015 14:34
  To: Jan Beulich
  Cc: Paul Durrant; xen-de...@lists.xenproject.org; Keir (Xen.org)
  Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
  On 25/06/15 13:46, Jan Beulich wrote:
  On 25.06.15 at 14:21, andrew.coop...@citrix.com wrote:
  On 24/06/15 12:24, Paul Durrant wrote:
  When memory mapped I/O is range checked by internal handlers, the
  length
  of the access should be taken into account.
 
  Signed-off-by: Paul Durrant paul.durr...@citrix.com
  Cc: Keir Fraser k...@xen.org
  Cc: Jan Beulich jbeul...@suse.com
  Cc: Andrew Cooper andrew.coop...@citrix.com
 
  For what purpose?  The length of the access doesn't affect which
 handler
  should accept the IO.
 
  This length check now causes an MMIO handler to not claim an access
  which straddles the upper boundary.
 
  It is probably fine to terminate such an access early, but it isn't fine
  to pass such a straddled access to the default ioreq server.
  No, without involving the length in the check we can end up with
  check() saying Yes, mine but read() or write() saying Not me.
  What I would agree with is for the generic handler to split the
  access if the first byte fits, but the final byte doesn't.
  I discussed this with Paul over lunch.  I had not considered how IO gets
  forwarded to the device model for shared implementations.
 
  Is it reasonable to split a straddled access and direct the halves at
  different handlers? This is not in line with how other hardware behaves
  (PCIe will reject any straddled access).  Furthermore, given small MMIO
  regions and larger registers, there is no guarantee that a single split
  will suffice.
 
  I see in the other thread going on that a domain_crash() is deemed ok
  for now, which is fine by me.
 
  I think that also allows me to simplify the patch since I don't have to modify
 the mmio_check op any more. I simply call it once for the first byte of the
 access and, if it accepts, verify that it also accepts the last byte of the
 access.
 
 At that point, I would say it would be easier to modify the claim check
 to return yes/straddled/no rather than calling it twice.

That's excessive code churn, I think. The check functions are generally cheap 
and the second call is only made if the first accepts.
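
For illustration only, the "check the first byte, then the last byte" approach
could look roughly like the sketch below. Nothing here is taken from the
patch: toy_mmio_region, toy_check(), toy_claim_access() and the enum are
hypothetical names, and a real mmio_check op has a different signature. The
sketch assumes addr + len - 1 does not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A hypothetical handler range: [base, base + size). */
struct toy_mmio_region { uint64_t base; uint64_t size; };

/* Per-address check, standing in for a check op that takes no length. */
static bool toy_check(const struct toy_mmio_region *r, uint64_t addr)
{
    return addr >= r->base && addr - r->base < r->size;
}

enum toy_claim { TOY_NOT_MINE, TOY_MINE, TOY_STRADDLED };

/*
 * Classify an access of 'len' bytes at 'addr'. The cheap check is called
 * once for the first byte and, only if that is accepted, once more for
 * the last byte.
 */
static enum toy_claim toy_claim_access(const struct toy_mmio_region *r,
                                       uint64_t addr, unsigned int len)
{
    if ( len == 0 || !toy_check(r, addr) )
        return TOY_NOT_MINE;

    return toy_check(r, addr + len - 1) ? TOY_MINE : TOY_STRADDLED;
}
```

What the caller does with a straddled access is a separate policy decision;
per the discussion above, a domain_crash() is considered acceptable for now.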

  Paul

 
 ~Andrew



Re: [Xen-devel] [PATCH OSSTEST 1/2] mg-debian-installer-update: Print the correct value for TftpDiVersion

2015-06-25 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST 1/2] mg-debian-installer-update: Print the 
correct value for TftpDiVersion):
 That is, the date without the suite suffix.
...
 -echo $date
 -echo 2 downloaded $dstroot/$arch/$date
 +echo New TftpDiVersion: $date
 +echo 2 downloaded $dstroot/$dst

You could make the output suitable for cp ?

  +echo TftpDiVersion $date

Ian.



Re: [Xen-devel] [PATCH OSSTEST 2/2] mg-debian-installer-update: Update current symlink, if appropriate

2015-06-25 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST 2/2] mg-debian-installer-update: Update 
current symlink, if appropriate):
 Where appropriate means if TftpDiVersion is set to current, which is
 the default in standalone mode. The assumption is that if someone with
 that configuration runs mg-debian-installer-update then they would
 expect the update to be immediately effective.
 
 There was some existing, but commented, code to do this update,
 reinstate it with the correct condition and adjusting for the addition
 of -$suite to the patch many moons ago.
 
 There is no impact on any production configuration, since they always
 set TftpDiVersion.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com



Re: [Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-25 Thread Jan Beulich
 On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
 @@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
  
  for ( ;; )
  {
 -rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
 -*buffer);
 -if ( rc != X86EMUL_OKAY )
 -break;
 +/* Have we already done this chunk? */
 +if ( (*off + chunk) <= vio->mmio_cache[dir].size )

I can see why you would like to get rid of the address check, but
I'm afraid you can't: You have to avoid getting mixed up multiple
same kind (reads or writes) memory accesses that a single
instruction can do. While generally I would assume that
secondary accesses (like the I/O bitmap read associated with an
OUTS) wouldn't go to MMIO, CMPS with both operands being
in MMIO would break even if neither crosses a page boundary
(not to think of when the emulator starts supporting the
scatter/gather instructions, albeit supporting them will require
further changes, or we could choose to do them one element at
a time).

 +{
 +ASSERT(*off + chunk <= vio->mmio_cache[dir].size);

I don't see any difference to the if() expression just above.

 +if ( dir == IOREQ_READ )
 +memcpy(&buffer[*off],
 +   &vio->mmio_cache[IOREQ_READ].buffer[*off],
 +   chunk);
 +else
 +{
 +if ( memcmp(&buffer[*off],

else if please.

 +&vio->mmio_cache[IOREQ_WRITE].buffer[*off],
 +chunk) != 0 )
 +domain_crash(curr->domain);
 +}
 +}
 +else
 +{
 +ASSERT(*off == vio->mmio_cache[dir].size);
 +
 +rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
 +&buffer[*off]);
 +if ( rc != X86EMUL_OKAY )
 +break;
 +
 +/* Note that we have now done this chunk */

Missing stop.

Jan




Re: [Xen-devel] Xen-unstable: pci-passthrough of device using MSI-X interrupts not working after commit x86/MSI: track host and guest masking separately

2015-06-25 Thread Sander Eikelenboom

Thursday, June 25, 2015, 10:48:40 AM, you wrote:

 On 24.06.15 at 21:38, li...@eikelenboom.it wrote:
 I'm having some trouble with a xhci controller passed through with 
 pci-passthrough to one of my HVM guests.
 It uses MSI-X for interrupts, a bisection turned up the following commit:
 
 x86/MSI: track host and guest masking separately
 
 Although at first glance it looks as if the controller is correctly
 initialized during the boot of the HVM guest (no worrying messages in dmesg
 yet), it utterly fails a simple lsusb; this results in the hang pasted below.
 
 Other devices I pass through which use legacy or MSI interrupts seem to be
 unaffected.

 Odd enough, since I'm having a hard time testing MSI (no suitable
 devices), but did a lot of testing with MSI-X.

 Please say so if you need any specific output from Xen debug keys or 
 anything 
 else !

 M and i debug key output would be the first thing. I'd suspect host
 masking to be wrongly active for some reason.

 Jan

Hi Jan,

Attached is the xl-dmesg output of:

- debug-keys M and i before guest boot
- guest boot
- debug-keys M and i after lsusb in the guest that hangs.

The not working controller is :08:00.0.

--
Sander

77] traps.c:2655:d0v1 Domain attempted WRMSR c084 from 
0x00074700 to 0x00047700.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
c081 from 0xe023e008 to 0x00230010.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
c082 from 0x82d0bfffd100 to 0x81b2f010.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
c083 from 0x82d0bfffd120 to 0x81b30f10.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
0174 from 0x to 0x0010.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
0176 from 0x to 0x81b30dc0.
(XEN) [2015-06-25 10:38:46.277] traps.c:2655:d0v2 Domain attempted WRMSR 
c084 from 0x00074700 to 0x00047700.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
c081 from 0xe023e008 to 0x00230010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
c082 from 0x82d0bfffc180 to 0x81b2f010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
c083 from 0x82d0bfffc1a0 to 0x81b30f10.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
0174 from 0x to 0x0010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
0176 from 0x to 0x81b30dc0.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v3 Domain attempted WRMSR 
c084 from 0x00074700 to 0x00047700.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
c081 from 0xe023e008 to 0x00230010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
c082 from 0x82d0bfffb200 to 0x81b2f010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
c083 from 0x82d0bfffb220 to 0x81b30f10.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
0174 from 0x to 0x0010.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
0176 from 0x to 0x81b30dc0.
(XEN) [2015-06-25 10:38:46.278] traps.c:2655:d0v4 Domain attempted WRMSR 
c084 from 0x00074700 to 0x00047700.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
c081 from 0xe023e008 to 0x00230010.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
c082 from 0x82d0bfffa280 to 0x81b2f010.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
c083 from 0x82d0bfffa2a0 to 0x81b30f10.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
0174 from 0x to 0x0010.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
0176 from 0x to 0x81b30dc0.
(XEN) [2015-06-25 10:38:46.279] traps.c:2655:d0v5 Domain attempted WRMSR 
c084 from 0x00074700 to 0x00047700.
(XEN) [2015-06-25 10:38:46.739] PCI add device :00:00.0
(XEN) [2015-06-25 10:38:46.740] PCI add device :00:00.2
(XEN) [2015-06-25 10:38:46.740] PCI add device :00:02.0
(XEN) [2015-06-25 10:38:46.740] PCI add device :00:03.0
(XEN) [2015-06-25 10:38:46.740] PCI add device :00:05.0
(XEN) [2015-06-25 10:38:46.740] PCI add device :00:06.0
(XEN) 

Re: [Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 25 June 2015 11:47
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 17/17] x86/hvm: track large memory mapped
 accesses by buffer offset
 
  On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
  @@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
 
   for ( ;; )
   {
  -rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
  -*buffer);
  -if ( rc != X86EMUL_OKAY )
  -break;
  +/* Have we already done this chunk? */
  +if ( (*off + chunk) <= vio->mmio_cache[dir].size )
 
 I can see why you would like to get rid of the address check, but
 I'm afraid you can't: You have to avoid getting mixed up multiple
 same kind (reads or writes) memory accesses that a single
 instruction can do. While generally I would assume that
 secondary accesses (like the I/O bitmap read associated with an
 OUTS) wouldn't go to MMIO, CMPS with both operands being
 in MMIO would break even if neither crosses a page boundary
 (not to think of when the emulator starts supporting the
 scatter/gather instructions, albeit supporting them will require
 further changes, or we could choose to do them one element at
 a time).

Ok. Can I assume at most two distinct set of addresses for read or write? If so 
then I can just keep two sets of caches in the hvm_io struct.

 
  +{
  +ASSERT(*off + chunk <= vio->mmio_cache[dir].size);
 
 I don't see any difference to the if() expression just above.
 

That's possible  - this has been through a few re-bases.

  +if ( dir == IOREQ_READ )
  +memcpy(&buffer[*off],
  +   &vio->mmio_cache[IOREQ_READ].buffer[*off],
  +   chunk);
  +else
  +{
  +if ( memcmp(&buffer[*off],
 
 else if please.
 

Ok.

  +&vio->mmio_cache[IOREQ_WRITE].buffer[*off],
  +chunk) != 0 )
  +domain_crash(curr->domain);
  +}
  +}
  +else
  +{
  +ASSERT(*off == vio->mmio_cache[dir].size);
  +
  +rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
  +&buffer[*off]);
  +if ( rc != X86EMUL_OKAY )
  +break;
  +
  +/* Note that we have now done this chunk */
 
 Missing stop.
 

Ok.

  Paul

 Jan




Re: [Xen-devel] [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 25 June 2015 11:50
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: RE: [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer
 
  On 25.06.15 at 12:32, paul.durr...@citrix.com wrote:
   -Original Message-
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: 25 June 2015 10:58
  To: Paul Durrant
  Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
  Subject: Re: [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a
 buffer
 
   On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
   If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
   otherwise only the first chunk will be processed. This patch makes
   sure all I/O from a buffer is re-emulated regardless of whether it
   is a read or a write.
 
  I'm not sure I understand this: Isn't the reason for treating reads
  and writes differently due to the fact that MMIO reads may have
  side effects, and hence can't be re-issued (whereas writes are
  always the last thing an instruction does, and hence can't hold up
  retiring of it, and hence don't need retrying)?
 
  Reads were always re-issued, which is why handle_mmio() is called in
  hvm_io_assist(). If the underlying MMIO is deferred to QEMU then this is the
  only way for Xen to pick up the result. This patch adds completion for
  writes.
  If the I/O has been broken down in the underlying hvmemul_write() and a
  'chunk' deferred to QEMU then there is actually a need to re-emulate, otherwise
  any remaining chunks will not be handled.
 
 
  Furthermore, doesn't only the first chunk get represented correctly
  already by informing the caller that only a single iteration of a
  repeated instruction was done, such that further repeats will get
  carried out anyway (resulting in another, fresh cycle through the
  emulator)?
 
 
  No, because we're talking about 'chunks' here and not 'reps'. If a single
  non-rep I/O is broken down into, say, two chunks then we:
 
  - Issue the I/O for the first chunk to QEMU
  - Say we did nothing by returning RETRY
  - Re-issue the emulation from hvm_io_assist()
  - Pick up the result of the first chunk from the ioreq, add it to the cache,
  and issue the second chunk to QEMU
  - Say we did nothing by returning RETRY
  - Re-issue the emulation from hvm_io_assist()
  - Pick up the result of the first chunk from the cache and pick up the result
  of the second chunk from the ioreq
  - Say we completed the I/O by returning OKAY
 
  I agree it's not nice, and bouncing would have been preferable, but that's
  the way 'wide I/O' works.
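
The retry sequence listed above can be modelled with a small standalone
sketch. This is a toy, not the hvmemul code: toy_vio, toy_emulate_read() and
toy_device_model_reply() are invented names, only reads are modelled, and it
assumes len is a multiple of chunk, chunk <= 8 and len <= 16. Each pass
replays already completed chunks from the cache, picks up at most one pending
response, and returns RETRY as soon as it needs the device model again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum toy_rc { TOY_OKAY, TOY_RETRY };

struct toy_vio {
    uint8_t cache[16];      /* data of the chunks completed so far */
    unsigned int size;      /* number of valid bytes in cache */
    bool resp_ready;        /* device model has answered the pending chunk */
    uint8_t resp[8];        /* data of the answered chunk */
};

/* One emulation pass over a read of 'len' bytes done in 'chunk'-byte pieces. */
static enum toy_rc toy_emulate_read(struct toy_vio *vio, uint8_t *buf,
                                    unsigned int len, unsigned int chunk)
{
    unsigned int off;

    assert(chunk != 0 && chunk <= sizeof(vio->resp));
    assert(len <= sizeof(vio->cache) && len % chunk == 0);

    for ( off = 0; off < len; off += chunk )
    {
        if ( off + chunk <= vio->size )
        {
            /* Already done on an earlier pass: replay it from the cache. */
            memcpy(buf + off, vio->cache + off, chunk);
            continue;
        }

        if ( !vio->resp_ready )
            return TOY_RETRY;   /* chunk handed to the device model */

        /* Pick up the response, remember it, and carry on. */
        memcpy(vio->cache + off, vio->resp, chunk);
        vio->size = off + chunk;
        vio->resp_ready = false;
        memcpy(buf + off, vio->cache + off, chunk);
    }

    return TOY_OKAY;
}

/* Stand-in for the device model completing the pending chunk. */
static void toy_device_model_reply(struct toy_vio *vio, const uint8_t *data,
                                   unsigned int chunk)
{
    memcpy(vio->resp, data, chunk);
    vio->resp_ready = true;
}
```

For an 8-byte read done as two 4-byte chunks, the first two passes return
RETRY and the third returns OKAY, matching the steps in the list.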
 
 I see. Which means
 Acked-by: Jan Beulich jbeul...@suse.com
 

Thanks.

  Paul

 Jan




Re: [Xen-devel] [v4][PATCH 13/19] tools/libxc: check to set args.mmio_size before call xc_hvm_build

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:24PM +0800, Tiejun Chen wrote:
 After commit 5dff8e9eedc7, libxc/libxl: fill xc_hvm_build_args in
 libxl is introduced, we won't check to set args.mmio_size inside
 xc_hvm_build as before. So instead, we need to do this before call
 that.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

Acked-by: Wei Liu wei.l...@citrix.com

Sigh. I missed this because libxl doesn't use this function and there is
no in tree xend anymore.

I think you should move this earlier in this series. Presumably your RDM
changes depend on this.

Wei.

 ---
 v4:
 
 * Separate this from current patch #14 since this is specific to xc.
 
  tools/libxc/xc_hvm_build_x86.c | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/tools/libxc/xc_hvm_build_x86.c b/tools/libxc/xc_hvm_build_x86.c
 index 003ea06..7343e87 100644
 --- a/tools/libxc/xc_hvm_build_x86.c
 +++ b/tools/libxc/xc_hvm_build_x86.c
 @@ -754,6 +754,8 @@ int xc_hvm_build_target_mem(xc_interface *xch,
  args.mem_size = (uint64_t)memsize << 20;
  args.mem_target = (uint64_t)target << 20;
  args.image_file_name = image_name;
 +if ( args.mmio_size == 0 )
 +args.mmio_size = HVM_BELOW_4G_MMIO_LENGTH;
  
  return xc_hvm_build(xch, domid, &args);
  }
 -- 
 1.9.1



Re: [Xen-devel] [PATCH V5 2/7] libxl_read_file_contents: add new entry to read sysfs file

2015-06-25 Thread Ian Jackson
Chunyan Liu writes ([PATCH V5 2/7] libxl_read_file_contents: add new entry to 
read sysfs file):
 Sysfs file has size=4096 but actual file content is less than that.
 Current libxl_read_file_contents will treat it as error when file size
 and actual file content differs, so reading sysfs file content with
 this function always fails.
 
 Add a new entry libxl_read_sysfs_file_contents to handle sysfs file
 specially. It would be used in later pvusb work.

I think this still fails to detect a situation where the file is
unexpectedly longer than the requested size ?

As we wrote earlier:

   Is there any risk that the file is actually bigger than advertised, 
   rather than smaller ? 
  
  For sysfs file, couldn't be bigger.
 
 Then you should detect the condition that the file is bigger, and call
 it an error.
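
To make the request concrete, one way to detect the "bigger than advertised"
case is to probe for a single extra byte once the buffer is full. The sketch
below is plain stdio, not libxl code: toy_read_bounded() is a hypothetical
helper and does not reflect the signature of libxl_read_file_contents or of
the proposed sysfs variant.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read at most 'max' bytes from 'f' into 'buf'.
 * Returns the number of bytes read (which may be fewer than 'max', as
 * with sysfs files), or -1 on a read error or if the file holds more
 * than 'max' bytes.
 */
static long toy_read_bounded(FILE *f, char *buf, size_t max)
{
    size_t got = fread(buf, 1, max, f);

    if ( ferror(f) )
        return -1;

    /* Buffer full: one more readable byte means the file is too big. */
    if ( got == max && fgetc(f) != EOF )
        return -1;

    return (long)got;
}
```

A short file is accepted, a file of exactly the advertised size is accepted,
and anything longer is reported as an error rather than silently truncated.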

Thanks,
Ian.



Re: [Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 12:55, paul.durr...@citrix.com wrote:
 From: Paul Durrant
 Sent: 25 June 2015 11:52
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: 25 June 2015 11:47
   On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
   @@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
  
for ( ;; )
{
   -rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
   -*buffer);
   -if ( rc != X86EMUL_OKAY )
   -break;
   +/* Have we already done this chunk? */
   +if ( (*off + chunk) <= vio->mmio_cache[dir].size )
 
  I can see why you would like to get rid of the address check, but
  I'm afraid you can't: You have to avoid getting mixed up multiple
  same kind (reads or writes) memory accesses that a single
  instruction can do. While generally I would assume that
  secondary accesses (like the I/O bitmap read associated with an
  OUTS) wouldn't go to MMIO, CMPS with both operands being
  in MMIO would break even if neither crosses a page boundary
  (not to think of when the emulator starts supporting the
  scatter/gather instructions, albeit supporting them will require
  further changes, or we could choose to do them one element at
  a time).
 
 Ok. Can I assume at most two distinct sets of addresses for read or write? If so,
 then I can just keep two sets of caches in the hvm_io struct.
 
 
 Oh, I mean linear addresses here BTW.

Yes, that's what I implied - afaics switching to using linear addresses
shouldn't result in any problem (but then again I wonder whether
physical addresses really were chosen originally for no real reason).

Jan




Re: [Xen-devel] [v4][PATCH 14/19] tools/libxl: detect and avoid conflicts with RDM

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:25PM +0800, Tiejun Chen wrote:
 While building a VM, HVM domain builder provides struct hvm_info_table{}
 to help hvmloader. Currently it includes two fields to construct guest
 e820 table by hvmloader, low_mem_pgend and high_mem_pgend. So we should
 check them to fix any conflict with RAM.
 

RAM - RDM?

 RMRR can reside in address space beyond 4G theoretically, but we never
 see this in real world. So in order to avoid breaking highmem layout
 we don't solve highmem conflict. Note this means highmem rmrr could still
 be supported if no conflict.
 
 But in the case of lowmem, RMRRs are probably scattered across the whole RAM space.
 Multiple RMRR entries in particular would worsen this and lead to a complicated
 memory layout. And then it's hard to extend hvm_info_table{} to work
 hvmloader out. So here we're trying to figure out a simple solution to
 avoid breaking existing layout. So when a conflict occurs,
 
 #1. Above a predefined boundary (2G)
 - move lowmem_end below reserved region to solve conflict;
 
 #2. Below a predefined boundary (2G)
 - Check strict/relaxed policy.
 strict policy leads to libxl failing. Note when both policies
 are specified on a given region, 'strict' is always preferred.
 relaxed policy issues a warning message and also marks this entry
 INVALID to indicate we shouldn't expose this entry to hvmloader.
 
 Note later we need to provide a parameter to set that predefined boundary
 dynamically.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 Reviewed-by: Kevin Tian kevint.t...@intel.com
 ---
 v4:
 
 * Consistent to use term RDM.
 * Unconditionally set *nr_entries to 0
 * Move all the stuff that provides a parameter to set our predefined boundary
   dynamically into a separate patch later
 
  tools/libxl/libxl_create.c   |   2 +-
  tools/libxl/libxl_dm.c       | 259 +++
  tools/libxl/libxl_dom.c  |  17 ++-
  tools/libxl/libxl_internal.h |  11 +-
  tools/libxl/libxl_types.idl  |   7 ++
  5 files changed, 293 insertions(+), 3 deletions(-)
 
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 6c8ec63..30e6593 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -460,7 +460,7 @@ int libxl__domain_build(libxl__gc *gc,
  
  switch (info->type) {
  case LIBXL_DOMAIN_TYPE_HVM:
 -ret = libxl__build_hvm(gc, domid, info, state);
 +ret = libxl__build_hvm(gc, domid, d_config, state);
  if (ret)
  goto out;
  
 diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
 index 33f9ce6..5436bcf 100644
 --- a/tools/libxl/libxl_dm.c
 +++ b/tools/libxl/libxl_dm.c
 @@ -90,6 +90,265 @@ const char *libxl__domain_device_model(libxl__gc *gc,
  return dm;
  }
  
 +static struct xen_reserved_device_memory
 +*xc_device_get_rdm(libxl__gc *gc,
 +   uint32_t flag,
 +   uint16_t seg,
 +   uint8_t bus,
 +   uint8_t devfn,
 +   unsigned int *nr_entries)

I just noticed this function lives in libxl_dm.c. The function should be
renamed to libxl__xc_device_get_rdm. 

This function should return proper libxl error code (ERROR_FAIL or
something more appropriate). The allocated RDM entries should be
returned with an out parameter.

I had always thought this lived in libxc. Sorry for not having noticed
this earlier.

 +{
 +struct xen_reserved_device_memory *xrdm;
 +int rc;
 +
 +/*
 + * We really can't presume how many entries we can get in advance.
 + */
 +*nr_entries = 0;
 +rc = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
 +   NULL, nr_entries);
 +assert(rc <= 0);
 +/* 0 means we have no any rdm entry. */
 +if (!rc)
 +goto out;
 +
 +if (errno == ENOBUFS) {
 +xrdm = malloc(*nr_entries * sizeof(xen_reserved_device_memory_t));

libxl__malloc(gc, ...);

 +if (!xrdm) {
 +LOG(ERROR, "Could not allocate RDM buffer!\n");
 +goto out;
 +}

Get rid of this.

 +rc = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
 +   xrdm, nr_entries);
 +if (rc) {
 +LOG(ERROR, "Could not get reserved device memory maps.\n");
 +*nr_entries = 0;
 +free(xrdm);
 +xrdm = NULL;

Get rid of free.

 +}
 +} else
 +LOG(ERROR, "Could not get reserved device memory maps.\n");
 +
 + out:
 +return xrdm;
 +}

The rest of this patch looks good to me. It does what we've discussed.

Wei.
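
One possible shape for the reworked helper, as a self-contained sketch:
it returns an error code and hands the entries back through an out
parameter, as requested above. Everything here is a stand-in. The
stub_query() function only imitates the "probe with a NULL buffer, get
ENOBUFS and the required count, then call again" behaviour that the
quoted code relies on, and the SKETCH_* codes stand in for libxl's
ERROR_* values. In libxl proper the allocation would go through
libxl__malloc(gc, ...), which is why the review asks for the NULL check
and the free() to be dropped; plain calloc() is used here only so the
sketch builds on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry type and error codes, for illustration only. */
struct rdm_entry { unsigned long start_pfn, nr_pages; };
enum { SKETCH_OK = 0, SKETCH_ERROR_FAIL = -3, SKETCH_ERROR_NOMEM = -5 };

/* Stub imitating the two-call query: fails with ENOBUFS if *nr is too small. */
static int stub_query(struct rdm_entry *buf, unsigned int *nr)
{
    static const struct rdm_entry table[] = { { 0xab000, 16 }, { 0xcd000, 32 } };
    const unsigned int have = sizeof(table) / sizeof(table[0]);
    unsigned int i;

    if (*nr < have) {
        *nr = have;             /* tell the caller how many are needed */
        errno = ENOBUFS;
        return -1;
    }
    for (i = 0; i < have; i++)
        buf[i] = table[i];
    *nr = have;
    return 0;
}

/* Returns an error code; the entries come back via the out parameters. */
static int sketch_get_rdm(struct rdm_entry **entries_out, unsigned int *nr_out)
{
    struct rdm_entry *entries;
    unsigned int nr = 0;

    assert(entries_out && nr_out);
    *entries_out = NULL;
    *nr_out = 0;

    /* First call: no buffer, just learn how many entries exist. */
    if (stub_query(NULL, &nr) == 0)
        return SKETCH_OK;       /* no entries at all */
    if (errno != ENOBUFS)
        return SKETCH_ERROR_FAIL;

    entries = calloc(nr, sizeof(*entries));
    if (!entries)
        return SKETCH_ERROR_NOMEM;

    /* Second call: fetch the entries into the sized buffer. */
    if (stub_query(entries, &nr)) {
        free(entries);
        return SKETCH_ERROR_FAIL;
    }

    *entries_out = entries;
    *nr_out = nr;
    return SKETCH_OK;
}
```

A caller can then tell "no RDM entries" (SKETCH_OK with a count of 0)
apart from a real failure, which the pointer-returning version cannot do.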



Re: [Xen-devel] [PATCH 0/2] xen: Allow xen tools to run in guest using 64K page granularity

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 11:21 +0100, Wei Liu wrote:
 On Mon, May 11, 2015 at 12:55:34PM +0100, Julien Grall wrote:
  Hi all,
  
  This small series are the only changes required in Xen in order to run a 
  guest
  using 64K page granularity on top of an unmodified Xen.
  
  I'd like feedback from maintainers tools to know if it might be worth to
  introduce a function xc_pagesize() replicating the behavior of getpagesize()
  for Xen.
  
 
 Can we start with documenting the ABI (?) for communicating between
 guests with different page sizes?

We should certainly make it clearer what things are in terms of Xen ABI
page size vs the guest's page size and other things.

I think we can commit these two without that though?

 
 Or at least mention the ring mfn always has the size of XC_PAGE_SIZE (if
 that's the case).
 
 Wei.
 
  Sincerely yours,
  
  Julien Grall (2):
tools/xenstored: Use XC_PAGE_SIZE rather than getpagesize()
tools/xenconsoled: Use XC_PAGE_SIZE rather than getpagesize()
  
   tools/console/daemon/io.c | 4 ++--
   tools/xenstore/xenstored_domain.c | 4 ++--
   2 files changed, 4 insertions(+), 4 deletions(-)
  
  -- 
  2.1.4





Re: [Xen-devel] [v4][PATCH 16/19] tools/libxl: extend XENMEM_set_memory_map

2015-06-25 Thread Wei Liu
The subject line should be changed. You're not extending that hypercall.

libxl: construct e820 map with RDM information for HVM guest 

On Tue, Jun 23, 2015 at 05:57:27PM +0800, Tiejun Chen wrote:
 Here we'll construct a basic guest e820 table via
 XENMEM_set_memory_map. This table includes lowmem, highmem
 and RDMs if they exist. And hvmloader would need this info
 later.
 

I have one question. When RDM is disabled, the generated e820 map should
look exactly the same as before (i.e. without this patch), right?

Whatever the answer is, please say that in your commit log.

 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
 v4:
 
 * Use goto style error handling.
 * Instead of NOGC, we should use libxl__malloc(gc,XXX) to allocate local e820.
 
  tools/libxl/libxl_dom.c  |  5 +++
  tools/libxl/libxl_internal.h | 24 +
  tools/libxl/libxl_x86.c  | 83 
 
  3 files changed, 112 insertions(+)
 
 diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
 index 0987991..bc8fd5b 100644
 --- a/tools/libxl/libxl_dom.c
 +++ b/tools/libxl/libxl_dom.c
 @@ -1004,6 +1004,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
  goto out;
  }
  
 +if (libxl__domain_construct_e820(gc, d_config, domid, args)) {
 +LOG(ERROR, "setting domain memory map failed");
 +goto out;
 +}
 +
  ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
 state->store_mfn, state->console_port,
 state->console_mfn, state->store_domid,
 diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
 index c0acf11..ae2f5e0 100644
 --- a/tools/libxl/libxl_internal.h
 +++ b/tools/libxl/libxl_internal.h
 @@ -3714,6 +3714,30 @@ static inline void libxl__update_config_vtpm(libxl__gc *gc,
   */
  void libxl__bitmap_copy_best_effort(libxl__gc *gc, libxl_bitmap *dptr,
  const libxl_bitmap *sptr);
 +
 +/*
 + * Here we're just trying to set these kinds of e820 mappings:
 + *
 + * #1. Low memory region
 + *
 + * Low RAM starts at least from 1M to make sure all standard regions
 + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
 + * have enough space.
 + * Note: Those stuffs below 1M are still constructed with multiple
 + * e820 entries by hvmloader. At this point we don't change anything.
 + *
 + * #2. RDM region if it exists
 + *
 + * #3. High memory region if it exists
 + *
 + * Note: these regions are not overlapping since we already check
 + * to adjust them. Please refer to libxl__domain_device_construct_rdm().
 + */
 +int libxl__domain_construct_e820(libxl__gc *gc,

hidden

Wei.



Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Manish Jaggi



On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:

On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:

On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall ask xen to create p2m mapping {
virtual BAR : physical BAR } for domU

While implementing this, I think that rather than the toolstack, the pciback
driver in dom0 can send the hypercall to map the physical BAR to the virtual BAR.
Thus no xenstore entry is required for BARs.

pciback doesn't (and shouldn't) have sufficient knowledge of the guest
address space layout to determine what the virtual BAR should be. The
toolstack is the right place for that decision to be made.
Yes, the point is the pciback driver reads the physical BAR regions on 
request from domU.
So it sends a hypercall to map the physical bars into stage2 translation 
for the domU through xen.

Xen would use the holes left in IPA for MMIO.
Xen would return the IPA for pci-back to return to the request to domU.

Moreover a pci driver would read BARs only once.

You can't assume that though, a driver can do whatever it likes, or the
module might be unloaded and reloaded in the guest etc etc.

Are you going to send out a second draft based on the discussion so far?
yes, I was working on that only. I was traveling this week 24 hour 
flights jetlag...


Ian.




Re: [Xen-devel] [PATCH v2] xen/arm: Propagate clock-frequency to DOMU if present in the DT timer node

2015-06-25 Thread Ian Campbell
On Fri, 2015-06-19 at 13:41 +0100, Julien Grall wrote:
 When the property clock-frequency is present in the DT timer node, it
 means that the bootloader/firmware didn't correctly configure the
 CNTFRQ/CNTFRQ_EL0 on each processor.
 
 The best solution would be to fix the offending firmware/bootloader,
 although it may not always be possible to modify and re-flash it.
 
 As it's not possible to trap the register CNTFRQ/CNTFRQ_EL0, we have
 to extend xen_arch_domainconfig to provide the timer frequency to the
 toolstack when the property clock-frequency is present in the host DT
 timer node. Then, a property clock-frequency will be created in the guest
 DT timer node if the value is not 0.
 
 We could have set the property in the guest DT regardless of whether the
 property is present in the host DT. However, we still want to let the guest
 use CNTFRQ in the normal case. After all, the property clock-frequency
 is just a workaround for buggy firmware.
 
 Also add a stub for fdt_property_u32 which is not present in libfdt 
 1.4.0 used by distribution such as Debian Wheezy.
 
 Signed-off-by: Julien Grall julien.gr...@citrix.com
 Tested-by: Chris Brand chris.br...@broadcom.com

Acked + applied, thanks

 This patch requires to regenerate tools/configure.

Done.





Re: [Xen-devel] [PATCH v4 13/17] x86/hvm: remove HVMIO_dispatched I/O state

2015-06-25 Thread Andrew Cooper
On 24/06/15 12:24, Paul Durrant wrote:
 +#define HVMIO_NEED_COMPLETION(_vio) \
 +( ((_vio)->io_state == HVMIO_awaiting_completion) && \
 +  !(_vio)->io_data_is_addr && \
 +  ((_vio)->io_dir == IOREQ_READ) )

Please can this be a static inline which takes a const pointer.

~Andrew
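
A sketch of what the requested static inline might look like. The struct,
enum and constant names below are simplified stand-ins invented for this
example (prefixed sketch_/SKETCH_) so that it compiles on its own; the
real field types and the real function name in the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the Xen definitions; illustrative only. */
enum sketch_hvm_io_state {
    SKETCH_HVMIO_none,
    SKETCH_HVMIO_awaiting_completion,
    SKETCH_HVMIO_completed
};
enum { SKETCH_IOREQ_WRITE = 0, SKETCH_IOREQ_READ = 1 };

struct sketch_hvm_vcpu_io {
    enum sketch_hvm_io_state io_state;
    bool io_data_is_addr;
    int io_dir;
};

/*
 * Same predicate as the macro, but type-checked, and the const pointer
 * documents that the state is only inspected, never modified.
 */
static inline bool sketch_hvmio_need_completion(const struct sketch_hvm_vcpu_io *vio)
{
    assert(vio != NULL);
    return vio->io_state == SKETCH_HVMIO_awaiting_completion &&
           !vio->io_data_is_addr &&
           vio->io_dir == SKETCH_IOREQ_READ;
}
```

Compared with the macro, the argument is evaluated exactly once and
passing the wrong pointer type becomes a compile-time diagnostic.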



Re: [Xen-devel] Xen-unstable: pci-passthrough of device using MSI-X interrupts not working after commit x86/MSI: track host and guest masking separately

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 14:02, li...@eikelenboom.it wrote:
 Thursday, June 25, 2015, 1:29:39 PM, you wrote:
 I'd be curious what the guest view of the MSI-X table entries is at
 that point. Can you still use the console inside the guest? If so,
 sufficiently verbose lspci of the device should be able to tell us
 (hoping that this isn't a Windows guest), or a dd of /dev/mem at
 the right offset. Perhaps there are also ways to get at that from
 qemu, but I do not know how.
 
 The guest(linux) keeps running, only that terminal with the lsusb 
 command hangs, so no problem to gather the lspci output.
 Guest lspci -vvvknn attached.

Hmm, no, this

Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
Vector table: BAR=0 offset=1000
PBA: BAR=0 offset=1080

isn't enough. I was sure I saw lspci capable of listing the individual
table entries...

 Btw., are
 
 (XEN) [2015-06-25 10:44:26.550] traps.c:3227: GPF (): 82d0801d8282 -> 82d080239eec
 (XEN) [2015-06-25 10:44:26.550] traps.c:3227: GPF (): 82d0801d8282 -> 82d080239eec
 (XEN) [2015-06-25 10:44:26.550] traps.c:3227: GPF (): 82d0801d8282 -> 82d080239eec
 (XEN) [2015-06-25 10:44:26.550] traps.c:3227: GPF (): 82d0801d8282 -> 82d080239eec
 
 new? Did you ever try to figure out what they're being caused by?
 
 No, those aren't new (they have been present for at least some months now);
 something in a booting guest kernel triggers those, not only for HVMs but
 also for PV guests (and so they also appear for dom0).

No, the Dom0 ones were different from what I recall.

Jan




Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Andrew Cooper
On 25/06/15 13:46, Jan Beulich wrote:
 On 25.06.15 at 14:21, andrew.coop...@citrix.com wrote:
 On 24/06/15 12:24, Paul Durrant wrote:
 When memory mapped I/O is range checked by internal handlers, the length
 of the access should be taken into account.

 Signed-off-by: Paul Durrant paul.durr...@citrix.com
 Cc: Keir Fraser k...@xen.org
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Andrew Cooper andrew.coop...@citrix.com

 For what purpose?  The length of the access doesn't affect which handler
 should accept the IO.

 This length check now causes an MMIO handler to not claim an access
 which straddles the upper boundary.

 It is probably fine to terminate such an access early, but it isn't fine
 to pass such a straddled access to the default ioreq server.
 No, without involving the length in the check we can end up with
 check() saying "Yes, mine" but read() or write() saying "Not me".
 What I would agree with is for the generic handler to split the
 access if the first byte fits, but the final byte doesn't.

I discussed this with Paul over lunch.  I had not considered how IO gets
forwarded to the device model for shared implementations.

Is it reasonable to split a straddled access and direct the halves at
different handlers? This is not in line with how other hardware behaves
(PCIe will reject any straddled access).  Furthermore, given small MMIO
regions and larger registers, there is no guarantee that a single split
will suffice.

I see in the other thread going on that a domain_crash() is deemed ok
for now, which is fine by me.

~Andrew
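
To make the three cases in this discussion concrete, here is a small
sketch of a length-aware range check. The names (mmio_check_range and the
MMIO_* results) are hypothetical and not taken from the patch. It assumes
a handler owns one contiguous region, lets the first byte decide whether
the handler is the target, and reports separately when the access runs
past the end of the region so the caller can decide what to do (split,
terminate, or crash the domain).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of an access against one handler's region. */
enum mmio_claim { MMIO_NOT_MINE, MMIO_MINE, MMIO_STRADDLES };

/*
 * Region is [base, base + size), access is [addr, addr + len).
 * Assumes size > 0, len > 0 and that neither range wraps around.
 */
static enum mmio_claim mmio_check_range(uint64_t base, uint64_t size,
                                        uint64_t addr, uint64_t len)
{
    uint64_t region_last, access_last;

    assert(size > 0 && len > 0);

    region_last = base + size - 1;
    access_last = addr + len - 1;

    /* The first byte decides which handler the access is aimed at. */
    if (addr < base || addr > region_last)
        return MMIO_NOT_MINE;

    /* The last byte decides whether the whole access fits. */
    return access_last <= region_last ? MMIO_MINE : MMIO_STRADDLES;
}
```

For a region at 0x1000 of size 0x100, a 4-byte access at 0x10fc is fully
inside, while the same access at 0x10fe has its first byte inside and its
last byte outside, which is the straddling case debated above.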



Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 25 June 2015 14:48
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: RE: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
  On 25.06.15 at 15:36, paul.durr...@citrix.com wrote:
  I think that also allows me to simplify the patch since I don't have to
  modify the mmio_check op any more. I simply call it once for the first byte
  of the access and, if it accepts, verify that it also accepts the last byte
  of the access.
 
 That's actually not (generally) okay: There could be a hole in the
 middle. But as long as instructions don't do accesses wider than
 a page, we're fine with that in practice I think. Or wait, no, in the
 MSI-X this could not be okay: A 64-byte read to the 16 bytes
 32 bytes away from a page boundary (and being the last entry
 on one device's MSI-X table) would extend into another device's
 MSI-X table on the next page. I.e. first and last bytes would be
 okay to be accessed, but bytes 16...31 of the access wouldn't.
 Of course the MSI-X read/write handlers don't currently permit
 such wide accesses, but anyway...
 

We could also verify that, for a rep op, all reads/writes come back with OKAY. 
I think that would be ok.

  Paul

 Jan




Re: [Xen-devel] [PATCH v8 09/11] libxc: support XEN_DOMCTL_soft_reset operation

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 06:11:51PM +0200, Vitaly Kuznetsov wrote:
 Introduce xc_domain_soft_reset() function supporting XEN_DOMCTL_soft_reset.
 
 Signed-off-by: Vitaly Kuznetsov vkuzn...@redhat.com

Acked-by: Wei Liu wei.l...@citrix.com

 ---
  tools/libxc/include/xenctrl.h | 3 +++
  tools/libxc/xc_domain.c   | 9 +
  2 files changed, 12 insertions(+)
 
 diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
 index d1d2ab3..7aa0e81 100644
 --- a/tools/libxc/include/xenctrl.h
 +++ b/tools/libxc/include/xenctrl.h
 @@ -1301,6 +1301,9 @@ int xc_domain_setvnuma(xc_interface *xch,
  unsigned int *vcpu_to_vnode,
  unsigned int *vnode_to_pnode);
  
 +int xc_domain_soft_reset(xc_interface *xch,
 + uint32_t domid);
 +
  #if defined(__i386__) || defined(__x86_64__)
  /*
   * PC BIOS standard E820 types and structure.
 diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
 index ce51e69..a59d0b0 100644
 --- a/tools/libxc/xc_domain.c
 +++ b/tools/libxc/xc_domain.c
 @@ -2452,6 +2452,15 @@ int xc_domain_setvnuma(xc_interface *xch,
  return rc;
  }
  
 +
 +int xc_domain_soft_reset(xc_interface *xch,
 + uint32_t domid)
 +{
 +DECLARE_DOMCTL;
 +domctl.cmd = XEN_DOMCTL_soft_reset;
 +domctl.domain = (domid_t)domid;
 +return do_domctl(xch, &domctl);
 +}
  /*
   * Local variables:
   * mode: C
 -- 
 2.4.2



[Xen-devel] [PATCH v4 02/11] x86/intel_pstate: add some calculation related support

2015-06-25 Thread Wei Wang
The added calculation related functions will be used in the intel_pstate.c.
They are copied from the Linux kernel(commit 2418f4f2, f3002134, eb18cba7).

v4 changes:
1) in commit message, kernel changed to Linux kernel
2) if-else coding style change.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/include/asm-x86/div64.h | 78 +
 xen/include/xen/kernel.h| 12 +++
 2 files changed, 90 insertions(+)

diff --git a/xen/include/asm-x86/div64.h b/xen/include/asm-x86/div64.h
index dd49f64..1f171ba 100644
--- a/xen/include/asm-x86/div64.h
+++ b/xen/include/asm-x86/div64.h
@@ -11,4 +11,82 @@
 __rem;  \
 })
 
+static inline uint64_t div_u64_rem(uint64_t dividend, uint32_t divisor,
+  uint32_t *remainder)
+{
+*remainder = do_div(dividend, divisor);
+return dividend;
+}
+
+static inline uint64_t div_u64(uint64_t dividend, uint32_t  divisor)
+{
+uint32_t remainder;
+
+return div_u64_rem(dividend, divisor, &remainder);
+}
+
+/*
+ * div64_u64 - unsigned 64bit divide with 64bit divisor
+ * @dividend:64bit dividend
+ * @divisor:64bit divisor
+ *
+ * This implementation is a modified version of the algorithm proposed
+ * by the book 'Hacker's Delight'.  The original source and full proof
+ * can be found here and is available for use without restriction.
+ *
+ * 'http://www.hackersdelight.org/HDcode/newCode/divDouble.c.txt'
+ */
+static inline uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
+{
+uint32_t high = divisor >> 32;
+uint64_t quot;
+
+if (high == 0)
+quot = div_u64(dividend, divisor);
+else
+{
+int n = 1 + fls(high);
+quot = div_u64(dividend >> n, divisor >> n);
+
+if (quot != 0)
+quot--;
+if ((dividend - quot * divisor) >= divisor)
+quot++;
+}
+return quot;
+}
+
+static inline int64_t div_s64_rem(int64_t dividend, int32_t divisor,
+ int32_t *remainder)
+{
+int64_t quotient;
+
+if (dividend < 0)
+{
+quotient = div_u64_rem(-dividend, ABS(divisor),
+(uint32_t *)remainder);
+*remainder = -*remainder;
+if (divisor > 0)
+quotient = -quotient;
+}
+else
+{
+quotient = div_u64_rem(dividend, ABS(divisor),
+(uint32_t *)remainder);
+if (divisor < 0)
+quotient = -quotient;
+}
+return quotient;
+}
+
+/*
+ * div_s64 - signed 64bit divide with 32bit divisor
+ */
+static inline int64_t div_s64(int64_t dividend, int32_t divisor)
+{
+int32_t remainder;
+
+return div_s64_rem(dividend, divisor, &remainder);
+}
+
 #endif
diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64d..bfdcdb6 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -42,6 +42,18 @@
 #define MIN(x,y) ((x) < (y) ? (x) : (y))
 #define MAX(x,y) ((x) > (y) ? (x) : (y))
 
+/*
+ * clamp_t - return a value clamped to a given range using a given type
+ * @type: the type of variable to use
+ * @val: current value
+ * @lo: minimum allowable value
+ * @hi: maximum allowable value
+ *
+ * This macro does no typechecking and uses temporary variables of type
+ * 'type' to make all the comparisons.
+ */
+#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
+
 /**
  * container_of - cast a member of a structure out to the containing structure
  *
-- 
1.9.1
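
The div64_u64() helper in this patch reduces a 64-by-64 division to a
64-by-32 one by shifting both operands right until the divisor fits in 32
bits, then corrects the estimate by at most one. The following standalone
sketch reproduces that logic so it can be compared against native 64-bit
division. The sketch_ names are made up for this example, sketch_fls() is
a plain-C stand-in for fls() (1-based index of the highest set bit), and
the native '/' operator stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for fls(): 1-based index of the highest set bit, 0 if x == 0. */
static int sketch_fls(uint32_t x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* 64-by-32 division; stands in for div_u64(). */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
    return dividend / divisor;
}

static uint64_t sketch_div64_u64(uint64_t dividend, uint64_t divisor)
{
    uint32_t high = (uint32_t)(divisor >> 32);
    uint64_t quot;
    int n;

    assert(divisor != 0);

    if (high == 0)
        return sketch_div_u64(dividend, (uint32_t)divisor);

    /* Shift so the divisor fits in 32 bits; this gives an estimate. */
    n = 1 + sketch_fls(high);
    quot = sketch_div_u64(dividend >> n, (uint32_t)(divisor >> n));

    /* Step the estimate down by one, then correct upwards if needed. */
    if (quot != 0)
        quot--;
    if ((dividend - quot * divisor) >= divisor)
        quot++;

    return quot;
}
```

Comparing sketch_div64_u64(a, b) with a / b for a handful of operands is
a cheap sanity check on the shift and correction steps.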




[Xen-devel] [PATCH v4 05/11] x86/intel_pstate: relocate the driver register function

2015-06-25 Thread Wei Wang
Register the CPU hotplug notifier when the driver is
registered, and move the driver register function to
the cpufreq.c.

v4 changes:
1) Coding style change (the position of ||).

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/drivers/cpufreq/cpufreq.c  | 14 +++---
 xen/include/acpi/cpufreq/cpufreq.h | 27 +--
 2 files changed, 12 insertions(+), 29 deletions(-)

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 91b6c25..acc4bb5 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -630,10 +630,18 @@ static struct notifier_block cpu_nfb = {
 .notifier_call = cpu_callback
 };
 
-static int __init cpufreq_presmp_init(void)
+int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 {
+if (!driver_data || !driver_data->init ||
+!driver_data->verify || !driver_data->exit ||
+(!driver_data->target == !driver_data->setpolicy))
+return -EINVAL;
+
+if (cpufreq_driver)
+return -EBUSY;
+
+cpufreq_driver = driver_data;
+
 register_cpu_notifier(&cpu_nfb);
 return 0;
 }
-presmp_initcall(cpufreq_presmp_init);
-
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index af37e90..502774f 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -183,32 +183,7 @@ struct cpufreq_driver {
 
 extern struct cpufreq_driver *cpufreq_driver;
 
-static __inline__ 
-int cpufreq_register_driver(struct cpufreq_driver *driver_data)
-{
-if (!driver_data || 
-!driver_data->init   || 
-!driver_data->exit   || 
-!driver_data->verify || 
-!driver_data->target)
-return -EINVAL;
-
-if (cpufreq_driver)
-return -EBUSY;
-
-cpufreq_driver = driver_data;
-return 0;
-}
-
-static __inline__ 
-int cpufreq_unregister_driver(struct cpufreq_driver *driver)
-{
-if (!cpufreq_driver || (driver != cpufreq_driver))
-return -EINVAL;
-
-cpufreq_driver = NULL;
-return 0;
-}
+extern int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 
 static __inline__
 void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
-- 
1.9.1
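
The new validation in cpufreq_register_driver() includes the test
(!target == !setpolicy), which rejects a driver that provides both hooks
and also one that provides neither. A standalone sketch of that check
follows; the struct and function names are simplified stand-ins
(sketch_ prefix) and the notifier registration from the patch is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_policy;   /* opaque, never dereferenced in this sketch */

/* Simplified stand-in for struct cpufreq_driver. */
struct sketch_cpufreq_driver {
    int (*init)(struct sketch_policy *);
    int (*verify)(struct sketch_policy *);
    int (*exit)(struct sketch_policy *);
    int (*target)(struct sketch_policy *, unsigned int freq);
    int (*setpolicy)(struct sketch_policy *);
};

static struct sketch_cpufreq_driver *sketch_registered;

/* Dummy hooks so test drivers can be filled in. */
static int sketch_nop(struct sketch_policy *p) { (void)p; return 0; }
static int sketch_nop_target(struct sketch_policy *p, unsigned int f)
{
    (void)p; (void)f;
    return 0;
}

static int sketch_register_driver(struct sketch_cpufreq_driver *drv)
{
    /*
     * !a == !b is true when both pointers are set or both are NULL,
     * so exactly one of target/setpolicy must be provided.
     */
    if (!drv || !drv->init || !drv->verify || !drv->exit ||
        (!drv->target == !drv->setpolicy))
        return -EINVAL;

    if (sketch_registered)
        return -EBUSY;

    sketch_registered = drv;
    return 0;
}
```

A frequency-table driver would fill in target only, while a driver with
its own internal governor (as intel_pstate is expected to be) would fill
in setpolicy only.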




[Xen-devel] [PATCH v4 06/11] x86/intel_pstate: APERF/MPERF feature detect

2015-06-25 Thread Wei Wang
Add support to detect the APERF/MPERF feature. Also, remove the identical
code in cpufreq.c and powernow.c.

v4 changes:
1) this is a new consolidated patch dealing with the APERF/MPERF feature
detection.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/arch/x86/acpi/cpufreq/cpufreq.c  | 6 ++
 xen/arch/x86/acpi/cpufreq/powernow.c | 6 ++
 xen/arch/x86/cpu/common.c| 3 +++
 xen/include/asm-x86/cpufeature.h | 1 +
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index fa3678d..643c405 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -51,7 +51,6 @@ enum {
 };
 
 #define INTEL_MSR_RANGE (0xull)
-#define CPUID_6_ECX_APERFMPERF_CAPABILITY   (0x1)
 
 struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
@@ -352,10 +351,9 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
 static void feature_detect(void *info)
 {
 struct cpufreq_policy *policy = info;
-unsigned int eax, ecx;
+unsigned int eax;
 
-ecx = cpuid_ecx(6);
-if (ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY) {
+if (boot_cpu_has(X86_FEATURE_APERFMPERF)) {
 policy->aperf_mperf = 1;
 acpi_cpufreq_driver.getavg = get_measured_perf;
 }
diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
index 2c9fea2..b5b752c 100644
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -38,7 +38,6 @@
 #include <acpi/acpi.h>
 #include <acpi/cpufreq/cpufreq.h>
 
-#define CPUID_6_ECX_APERFMPERF_CAPABILITY   (0x1)
 #define CPUID_FREQ_VOLT_CAPABILITIES0x8007
 #define CPB_CAPABLE 0x0200
 #define USE_HW_PSTATE   0x0080
@@ -212,10 +211,9 @@ static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
 static void feature_detect(void *info)
 {
 struct cpufreq_policy *policy = info;
-unsigned int ecx, edx;
+unsigned int edx;
 
-ecx = cpuid_ecx(6);
-if (ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY) {
+if (boot_cpu_has(X86_FEATURE_APERFMPERF)) {
 policy->aperf_mperf = 1;
 powernow_cpufreq_driver.getavg = get_measured_perf;
 }
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index e105aeb..dba29c0 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -238,6 +238,9 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
if ( cpu_has(c, X86_FEATURE_CLFLSH) )
c->x86_clflush_size = ((ebx >> 8) & 0xff) * 8;
 
+   if (cpuid_ecx(6) & 0x1)
+   set_bit(X86_FEATURE_APERFMPERF, c->x86_capability);
+
/* AMD-defined flags: level 0x8001 */
c->extended_cpuid_level = cpuid_eax(0x8000);
if ( (c->extended_cpuid_level & 0x) == 0x8000 ) {
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 7963a3a..efc9711 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -69,6 +69,7 @@
 #define X86_FEATURE_XTOPOLOGY(3*32+13) /* cpu topology enum extensions */
 #define X86_FEATURE_CPUID_FAULTING (3*32+14) /* cpuid faulting */
 #define X86_FEATURE_CLFLUSH_MONITOR (3*32+15) /* clflush reqd with monitor */
+#define X86_FEATURE_APERFMPERF (3*32+28) /* APERFMPERF */
 
 /* Intel-defined CPU features, CPUID level 0x0001 (ecx), word 4 */
 #define X86_FEATURE_XMM3   (4*32+ 0) /* Streaming SIMD Extensions-3 */
-- 
1.9.1
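
The patch encodes the new flag as (3*32+28), i.e. word 3, bit 28 of the
capability array, and sets it when bit 0 of ECX from CPUID leaf 6 is set.
The sketch below shows that mapping in isolation. All names are
illustrative stand-ins with a sketch_/SKETCH_ prefix; the array size is an
arbitrary assumption, and the ECX value is passed in as a parameter rather
than read with a real CPUID instruction so the example stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NCAPINTS 8                        /* assumed array size */
#define SKETCH_FEATURE_APERFMPERF (3*32+28)      /* word 3, bit 28 */

struct sketch_cpuinfo {
    uint32_t capability[SKETCH_NCAPINTS];
};

/* A feature number selects a 32-bit word (n / 32) and a bit in it (n % 32). */
static void sketch_set_feature(struct sketch_cpuinfo *c, unsigned int feature)
{
    assert(feature / 32 < SKETCH_NCAPINTS);
    c->capability[feature / 32] |= UINT32_C(1) << (feature % 32);
}

static bool sketch_has_feature(const struct sketch_cpuinfo *c, unsigned int feature)
{
    assert(feature / 32 < SKETCH_NCAPINTS);
    return (c->capability[feature / 32] >> (feature % 32)) & 1u;
}

/* Mirrors the detection in the patch: CPUID leaf 6, ECX bit 0. */
static void sketch_identify(struct sketch_cpuinfo *c, uint32_t cpuid6_ecx)
{
    if (cpuid6_ecx & 0x1)
        sketch_set_feature(c, SKETCH_FEATURE_APERFMPERF);
}
```

Once the bit is recorded at identify time, both cpufreq drivers can test
the same flag instead of each re-reading CPUID and carrying its own
copy of the bit mask.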




[Xen-devel] [PATCH v4 08/11] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline

2015-06-25 Thread Wei Wang
We set the cpufreq_cpu_policy pointer to NULL only after the call to
cpufreq_driver->exit, because the pointer is still needed in
intel_pstate_set_pstate().

v4 changes:
None.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index acc4bb5..d1b423f 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -335,12 +335,11 @@ int cpufreq_del_cpu(unsigned int cpu)
 
 /* for HW_ALL, stop gov for each core of the _PSD domain */
 /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
-if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
-   perf->domain_info.num_processors))
+if (!policy->internal_gov && (hw_all || (cpumask_weight(cpufreq_dom->map) ==
+   perf->domain_info.num_processors)))
 __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
 cpufreq_statistic_exit(cpu);
-per_cpu(cpufreq_cpu_policy, cpu) = NULL;
 cpumask_clear_cpu(cpu, policy->cpus);
 cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
@@ -349,6 +348,7 @@ int cpufreq_del_cpu(unsigned int cpu)
 free_cpumask_var(policy->cpus);
 xfree(policy);
 }
+per_cpu(cpufreq_cpu_policy, cpu) = NULL;
 
 /* for the last cpu of the domain, clean room */
 /* It's safe here to free freq_table, drv_data and policy */
-- 
1.9.1




[Xen-devel] [PATCH v4 04/11] x86/intel_pstate: avoid calling cpufreq_add_cpu() twice

2015-06-25 Thread Wei Wang
cpufreq_add_cpu() is already called in the hypercall code path
(the bottom of set_px_pminfo() and inside cpufreq_cpu_init()).
So, we remove the redundant call here.

v4 changes:
None.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 xen/drivers/cpufreq/cpufreq.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index ab66884..91b6c25 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -632,8 +632,6 @@ static struct notifier_block cpu_nfb = {
 
 static int __init cpufreq_presmp_init(void)
 {
-void *cpu = (void *)(long)smp_processor_id();
-cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
 register_cpu_notifier(&cpu_nfb);
 return 0;
 }
-- 
1.9.1




Re: [Xen-devel] [PATCH OSSTEST 1/2] mg-debian-installer-update: Print the correct value for TftpDiVersion

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 11:34 +0100, Ian Jackson wrote:
 Ian Campbell writes ([PATCH OSSTEST 1/2] mg-debian-installer-update: Print 
 the correct value for TftpDiVersion):
  That is, the date without the suite suffix.
 ...
  -echo $date
  -echo >&2 downloaded $dstroot/$arch/$date
  +echo New TftpDiVersion: $date
  +echo >&2 downloaded $dstroot/$dst
 
 You could make the output suitable for c&p ?
 
   +echo TftpDiVersion $date

Good idea. Shall I resend or just do it on commit?






[Xen-devel] vif-bridge: ip link set failed, name too long

2015-06-25 Thread Anthony PERARD
Hi,

When one tries to start an HVM guest via OpenStack, which is set up with
Neutron for networking, guest creation always fails.

Here are a few relevant logs:

/var/log/libvirt/libxl/libxl-driver.log:
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: 
/etc/xen/scripts/vif-bridge add [-1] exited with error status 1
libxl: error: libxl_device.c:1085:device_hotplug_child_death_cb: script: ip 
link set vif188.0-emu name tap695cf459-b0-emu failed
libxl: debug: libxl_event.c:618:libxl__ev_xswatch_deregister: watch 
w=0x7f5a9c05ddd0: deregister unregistered
libxl: error: libxl_create.c:1226:domcreate_attach_vtpms: unable to add nic 
devices

/var/log/xen/xen-hotplug.log:
Error: argument tap695cf459-b0-emu is wrong: name too long

The libvirt config, from Nova:
<interface type='bridge'>
  <mac address='fa:16:3e:b0:cd:2a'/>
  <source bridge='qbr695cf459-b0'/>
  <target dev='tap695cf459-b0'/>
</interface>

Thanks,

-- 
Anthony PERARD



Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: 25 June 2015 14:34
 To: Jan Beulich
 Cc: Paul Durrant; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
 On 25/06/15 13:46, Jan Beulich wrote:
  On 25.06.15 at 14:21, andrew.coop...@citrix.com wrote:
  On 24/06/15 12:24, Paul Durrant wrote:
  When memory mapped I/O is range checked by internal handlers, the
 length
  of the access should be taken into account.
 
  Signed-off-by: Paul Durrant paul.durr...@citrix.com
  Cc: Keir Fraser k...@xen.org
  Cc: Jan Beulich jbeul...@suse.com
  Cc: Andrew Cooper andrew.coop...@citrix.com
 
  For what purpose?  The length of the access doesn't affect which handler
  should accept the IO.
 
  This length check now causes an MMIO handler to not claim an access
  which straddles the upper boundary.
 
  It is probably fine to terminate such an access early, but it isn't fine
  to pass such a straddled access to the default ioreq server.
  No, without involving the length in the check we can end up with
  check() saying "Yes, mine" but read() or write() saying "Not me".
  What I would agree with is for the generic handler to split the
  access if the first byte fits, but the final byte doesn't.
 
 I discussed this with Paul over lunch.  I had not considered how IO gets
 forwarded to the device model for shared implementations.
 
 Is it reasonable to split a straddled access and direct the halves at
 different handlers? This is not in line with how other hardware behaves
 (PCIe will reject any straddled access).  Furthermore, given small MMIO
 regions and larger registers, there is no guarantee that a single split
 will suffice.
 
 I see in the other thread going on that a domain_crash() is deemed ok
 for now, which is fine by me.
 

I think that also allows me to simplify the patch since I don't have to modify 
the mmio_check op any more. I simply call it once for the first byte of the 
access and, if it accepts, verify that it also accepts the last byte of the 
access.
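
The simplified check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct mmio_range`, `range_check()` and `claim_access()` are hypothetical names, not the actual Xen handler interface, and the straddling case is merely reported to the caller here, where the thread suggests a domain_crash() for now. Address wrap-around is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical handler: claims every address in [base, base + len). */
struct mmio_range {
    uint64_t base;
    uint64_t len;
};

enum claim {
    CLAIM_NONE,     /* first byte not accepted: try another handler */
    CLAIM_OK,       /* first and last byte both accepted */
    CLAIM_STRADDLE  /* first byte accepted, last byte not */
};

/* Stand-in for a per-handler check op that looks at one address only. */
static bool range_check(const struct mmio_range *r, uint64_t addr)
{
    return addr >= r->base && addr - r->base < r->len;
}

/* Check the first byte, then verify the last byte of the access. */
static enum claim claim_access(const struct mmio_range *r,
                               uint64_t addr, unsigned int size)
{
    assert(size > 0);

    if (!range_check(r, addr))
        return CLAIM_NONE;

    if (!range_check(r, addr + size - 1))
        return CLAIM_STRADDLE;

    return CLAIM_OK;
}
```

With a range of 0x100 bytes at 0x1000, a 4-byte access at 0x10fc is claimed, while the same access at 0x10fe straddles the upper boundary.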

  Paul

 ~Andrew



Re: [Xen-devel] Xen-unstable: pci-passthrough of device using MSI-X interrupts not working after commit x86/MSI: track host and guest masking separately

2015-06-25 Thread linux

On 2015-06-25 15:37, Jan Beulich wrote:

On 25.06.15 at 15:16, li...@eikelenboom.it wrote:

Thursday, June 25, 2015, 2:40:18 PM, you wrote:

Hmm, no, this



Capabilities: [90] MSI-X: Enable+ Count=8 Masked-
Vector table: BAR=0 offset=1000
PBA: BAR=0 offset=1080


isn't enough. I was sure I saw lspci capable of listing the individual
table entries...


It seems to be the most verbose option for my lspci of debian Jessie.
So probably a debug-patch would be best ?


Yes, but I'm not sure when I'd get to it (being on vacation all next
week).

Jan


OK, no problem, no hurry: reverting the commit and the following cleanup,
to get a clean revert, fixes it for me. It can wait (or Andrew should beat you to it ;) )
Have a good vacation !

--
Sander



Re: [Xen-devel] [PATCH OSSTEST v] Add some sanity checks for presence of Repos configuration

2015-06-25 Thread Ian Jackson
Ian Campbell writes ([PATCH OSSTEST v] Add some sanity checks for presence of 
Repos configuration):
 By providing an explicit fetch method in cri-getconfig which checks
 things.
 
 Without this then anything which uses cr-daily-branch produces the
 rather cryptic:
 
 + test -f daily.xsettings
 ++ ./ap-print-url xen-unstable
 with-lock-ex ./ap-print-url: /lock: Permission denied
 + treeurl=
 FAILED rc=255
 
 Which has caught out one or two people using standalone mode.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com



Re: [Xen-devel] [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 12:32, paul.durr...@citrix.com wrote:
  -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: 25 June 2015 10:58
 To: Paul Durrant
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 16/17] x86/hvm: always re-emulate I/O from a buffer
 
  On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
  If memory mapped I/O is 'chunked' then the I/O must be re-emulated,
  otherwise only the first chunk will be processed. This patch makes
  sure all I/O from a buffer is re-emulated regardless of whether it
  is a read or a write.
 
 I'm not sure I understand this: Isn't the reason for treating reads
 and writes differently due to the fact that MMIO reads may have
 side effects, and hence can't be re-issued (whereas writes are
 always the last thing an instruction does, and hence can't hold up
 retiring of it, and hence don't need retrying)?
 
 Reads were always re-issued, which is why handle_mmio() is called in 
 hvm_io_assist(). If the underlying MMIO is deferred to QEMU then this is the 
 only way for Xen to pick up the result. This patch adds completion for 
 writes.
 If the I/O has been broken down in the underlying hvmemul_write() and a 
 'chunk' deferred to QEMU then there is actually a need to re-emulate, otherwise 
 any remaining chunks will not be handled.
 
 
 Furthermore, doesn't only the first chunk get represented correctly
 already by informing the caller that only a single iteration of a
 repeated instruction was done, such that further repeats will get
 carried out anyway (resulting in another, fresh cycle through the
 emulator)?
 
 
 No, because we're talking about 'chunks' here and not 'reps'. If a single 
 non-rep I/O is broken down into, say, two chunks then we:
 
 - Issue the I/O for the first chunk to QEMU
 - Say we did nothing by returning RETRY
 - Re-issue the emulation from hvm_io_assist()
 - Pick up the result of the first chunk from the ioreq, add it to the cache, 
 and issue the second chunk to QEMU
 - Say we did nothing by returning RETRY
 - Re-issue the emulation from hvm_io_assist()
 - Pick up the result of the first chunk from the cache and pick up the result 
 of the second chunk from the ioreq
 - Say we completed the I/O by returning OKAY
 
 I agree it's not nice, and bouncing would have been preferable, but that's 
 the way 'wide I/O' works.
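
The retry sequence listed above can be modelled with a small sketch. It is a toy model only, assuming one chunk in flight at a time; `struct wide_io`, `emulate_pass()` and `passes_needed()` are hypothetical names and do not correspond to the real emulator code. Each pass first picks up the result of the chunk that was in flight, then either issues the next outstanding chunk and reports RETRY, or reports OKAY once every chunk has a result.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CHUNKS 8

enum emul_rc { EMUL_OKAY, EMUL_RETRY };

/* Hypothetical per-access state for an I/O split into chunks. */
struct wide_io {
    unsigned int nr_chunks;
    bool cached[MAX_CHUNKS]; /* result of this chunk already picked up */
    int in_flight;           /* chunk issued to the device model, or -1 */
};

static void wide_io_init(struct wide_io *io, unsigned int nr_chunks)
{
    assert(nr_chunks > 0 && nr_chunks <= MAX_CHUNKS);
    io->nr_chunks = nr_chunks;
    io->in_flight = -1;
    for (unsigned int i = 0; i < MAX_CHUNKS; i++)
        io->cached[i] = false;
}

/* One emulation pass, re-run after each device model completion. */
static enum emul_rc emulate_pass(struct wide_io *io)
{
    /* Pick up the result of the chunk that was in flight, if any. */
    if (io->in_flight >= 0) {
        io->cached[io->in_flight] = true;
        io->in_flight = -1;
    }

    /* Issue the first chunk that has no result yet and ask for a retry. */
    for (unsigned int i = 0; i < io->nr_chunks; i++) {
        if (io->cached[i])
            continue;
        io->in_flight = (int)i;
        return EMUL_RETRY;
    }

    /* Every chunk has a result: the access is complete. */
    return EMUL_OKAY;
}

/* Count how many passes an access with nr_chunks chunks takes. */
static unsigned int passes_needed(unsigned int nr_chunks)
{
    struct wide_io io;
    unsigned int passes = 1;

    wide_io_init(&io, nr_chunks);
    while (emulate_pass(&io) == EMUL_RETRY)
        passes++;
    return passes;
}
```

In this model a two-chunk access takes three passes (RETRY, RETRY, OKAY), matching the sequence in the list.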

I see. Which means
Acked-by: Jan Beulich jbeul...@suse.com

Jan




Re: [Xen-devel] [PATCH 0/2] docs: Build ARM documentation

2015-06-25 Thread Ian Campbell
On Sat, 2015-06-20 at 12:37 +0100, Julien Grall wrote:
 Julien Grall (2):
   docs: Look for documentation in sub-directories
   docs: Update INDEX to give a title for each ARM docs

Acked + Applied.

 
  docs/INDEX|  5 +
  docs/Makefile | 12 ++--
  2 files changed, 11 insertions(+), 6 deletions(-)
 





Re: [Xen-devel] [PATCH v2] Revert libxl_set_memory_target: retain the same maxmem offset on top of the current target

2015-06-25 Thread Ian Campbell
On Tue, 2015-06-23 at 17:07 +0100, Wei Liu wrote:
 This reverts commit 0c029c4da2169159064568ef4fea862a5d2cd84a.
 
 A new memory model that allows QEMU to bump memory behind libxl's back
 was merged a few months ago. We didn't fully understand the
 repercussions back then. Now it breaks migration and has become a blocker for
 the 4.6 release.
 
 It's better to restore the original behaviour at this stage of the
 release cycle. That would put us in a position no worse than before, so
 the release is unblocked.
 
 The said function is still racy after reverting these two patches.
 Making domain memory state consistent requires a bit more work. Separate
 patch(es) will be sent out to deal with that problem.
 
 Fix up conflicts with f5b43e95 (libxl: fix xl mem-set regression from
 0c029c4da2).
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Acked + applied.





Re: [Xen-devel] [PATCH 0/2] xen{trace/analyze}: fix build on FreeBSD

2015-06-25 Thread Ian Campbell
On Fri, 2015-06-19 at 10:58 +0200, Roger Pau Monne wrote:
 Fix the build of xentrace/xenalyze on FreeBSD, and possibly other libcs not 
 having argp. Also fix the usage of fstat64 and O_LARGEFILE.

Both applied, thanks.




Re: [Xen-devel] [PATCH 0/2] Build libxc on rump kernel

2015-06-25 Thread Ian Campbell
On Wed, 2015-06-24 at 11:10 +0100, Wei Liu wrote:
 I have upstreamed a privcmd driver for rump kernel. That driver has the same
 semantics as the NetBSD one so we can just use xc_netbsd for rump kernel.
 
 Wei.
 
 Wei Liu (2):
   NetBSDRump: provide evtchn.h and privcmd.h
   libxc: use xc_netbsd.c for rump kernel

Acked + applied. At some point I may need to pick your brains regarding
the refactoring I'm doing to all this stuff..

 
  tools/include/xen-sys/NetBSDRump/evtchn.h  | 86 ++
  tools/include/xen-sys/NetBSDRump/privcmd.h | 81 ++--
  tools/libxc/Makefile                       |  1 +
  3 files changed, 165 insertions(+), 3 deletions(-)
  create mode 100644 tools/include/xen-sys/NetBSDRump/evtchn.h
 





Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote:
 
 On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:
  On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:
  On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:
  On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:
  On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:
  On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:
  Yes, pciback is already capable of doing that, see
  drivers/xen/xen-pciback/conf_space.c
 
  I am not sure if the pci-back driver can query the guest memory map. 
  Is there an existing hypercall ?
  No, that is missing.  I think it would be OK for the virtual BAR to be
  initialized to the same value as the physical BAR.  But I would let the
  guest change the virtual BAR address and map the MMIO region wherever 
  it
  wants in the guest physical address space with
  XENMEM_add_to_physmap_range.
  I disagree, given that we've apparently survived for years with x86 PV
  guests not being able to write to the BARs I think it would be far
  simpler to extend this to ARM and x86 PVH too than to allow guests to
  start writing BARs which has various complex questions around it.
  All that's needed is for the toolstack to set everything up and write
  some new xenstore nodes in the per-device directory with the BAR
  address/size.
 
  Also most guests apparently don't reassign the PCI bus by default, so
  using a 1:1 by default and allowing it to be changed would require
  modifying the guests to reassign. Easy on Linux, but I don't know about
  others and I imagine some OSes (especially simpler/embedded ones) are
  assuming the firmware sets up something sane by default.
  Does the flow below capture all points?
  a) When assigning a device to domU, toolstack creates a node in per
  device directory with virtual BAR address/size
 
  Option1:
  b) toolstack using some hypercall ask xen to create p2m mapping {
  virtual BAR : physical BAR } for domU
  While implementing, I think that rather than the toolstack, the pciback driver in
  dom0 can send the
  hypercall to map the physical BAR to the virtual BAR.
  Thus no xenstore entry is required for BARs.
  pciback doesn't (and shouldn't) have sufficient knowledge of the guest
  address space layout to determine what the virtual BAR should be. The
  toolstack is the right place for that decision to be made.
 Yes, the point is the pciback driver reads the physical BAR regions on 
 request from domU.
 So it sends a hypercall to map the physical bars into stage2 translation 
 for the domU through xen.
 Xen would use the holes left in IPA for MMIO.

I still think it is the toolstack which should do this, that's where
these sorts of layout decisions belong.

 Xen would return the IPA for pci-back to return to the request to domU.
  Moreover a pci driver would read BARs only once.
  You can't assume that though, a driver can do whatever it likes, or the
  module might be unloaded and reloaded in the guest etc etc.
 
  Are you going to send out a second draft based on the discussion so far?
 yes, I was working on that only. I was traveling this week 24 hour 
 flights jetlag...
 
  Ian.
 
 
 





Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Andrew Cooper
On 24/06/15 12:24, Paul Durrant wrote:
 When memory mapped I/O is range checked by internal handlers, the length
 of the access should be taken into account.

 Signed-off-by: Paul Durrant paul.durr...@citrix.com
 Cc: Keir Fraser k...@xen.org
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Andrew Cooper andrew.coop...@citrix.com


For what purpose?  The length of the access doesn't affect which handler
should accept the IO.

This length check now causes an MMIO handler to not claim an access
which straddles the upper boundary.

It is probably fine to terminate such an access early, but it isn't fine
to pass such a straddled access to the default ioreq server.

~Andrew



Re: [Xen-devel] [PATCH] libxc: delete sent_last_iter

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-18 at 17:37 +0100, Wei Liu wrote:
 It's set in code but never used.  Detected by -Wunused-but-set-variable.
 
 Signed-off-by: Wei Liu wei.l...@citrix.com

Applied thanks (I figured there was no harm even if it is just about to
be deleted)

 ---
  tools/libxc/xc_domain_save.c | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)
 
 diff --git a/tools/libxc/xc_domain_save.c b/tools/libxc/xc_domain_save.c
 index 301e770..3222473 100644
 --- a/tools/libxc/xc_domain_save.c
 +++ b/tools/libxc/xc_domain_save.c
 @@ -811,7 +811,7 @@ int xc_domain_save(xc_interface *xch, int io_fd, uint32_t 
 dom, uint32_t max_iter
  int live  = (flags & XCFLAGS_LIVE);
  int debug = (flags & XCFLAGS_DEBUG);
  int superpages = !!hvm;
 -int race = 0, sent_last_iter, skip_this_iter = 0;
 +int race = 0, skip_this_iter = 0;
  unsigned int sent_this_iter = 0;
  int tmem_saved = 0;
  
 @@ -1014,9 +1014,6 @@ int xc_domain_save(xc_interface *xch, int io_fd, 
 uint32_t dom, uint32_t max_iter
  
  last_iter = !live;
  
 -/* pretend we sent all the pages last iteration */
 -sent_last_iter = dinfo->p2m_size;
 -
  /* Setup to_send / to_fix and to_skip bitmaps */
  to_send = xc_hypercall_buffer_alloc_pages(xch, to_send, NRPAGES(bitmap_size(dinfo->p2m_size)));
  to_skip = xc_hypercall_buffer_alloc_pages(xch, to_skip, NRPAGES(bitmap_size(dinfo->p2m_size)));
 @@ -1586,8 +1583,6 @@ int xc_domain_save(xc_interface *xch, int io_fd, 
 uint32_t dom, uint32_t max_iter
  goto out;
  }
  
 -sent_last_iter = sent_this_iter;
 -
  print_stats(xch, dom, sent_this_iter, time_stats, 
 shadow_stats, 1);
  
  }





Re: [Xen-devel] [v4][PATCH 11/19] tools: introduce some new parameters to set rdm policy

2015-06-25 Thread Ian Jackson
Tiejun Chen writes ([v4][PATCH 11/19] tools: introduce some new parameters to 
set rdm policy):
 This patch introduces user configurable parameters to specify RDM
 resource and according policies,
...
 Global RDM parameter, type, allows user to specify reserved regions
 explicitly, e.g. using 'host' to include all reserved regions reported
 on this platform which is good to handle hotplug scenario. In the future
 this parameter may be further extended to allow specifying random regions,
 e.g. even those belonging to another platform as a preparation for live
 migration with passthrough devices. In contrast, 'none' means we do nothing
 with any reserved regions and ignore all policies, so the guest works as before.

I think the description in the documentation needs to have more
user-focused information.  It's not quite clear to me what the
tradeoffs are of the different options.


(Your use of "random" here is rather informal.  You should say
"arbitrary".)

Ian.



Re: [Xen-devel] [RFC PATCH v3 03/18] xen: console: Add ratelimit support for error message

2015-06-25 Thread Vijay Kilari
On Mon, Jun 22, 2015 at 6:51 PM, Jan Beulich jbeul...@suse.com wrote:
 On 22.06.15 at 14:01, vijay.kil...@gmail.com wrote:
 From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com

 XENLOG_ERR_RATE_LIMIT and XENLOG_G_ERR_RATE_LIMIT
 log levels are added to support rate limit for error messages

 If you mean to say that rate limiting currently doesn't work for
 XENLOG_ERR messages, then that's a problem to be fixed by
 adjusting existing code, not by adding yet another log level.

For GUEST messages, ERR and WARN are rate limited by
setting the lower threshold to 0 and the upper threshold to 2, as below:

#define XENLOG_GUEST_UPPER_THRESHOLD 2 /* Do not print INFO and DEBUG  */
#define XENLOG_GUEST_LOWER_THRESHOLD 0 /* Rate-limit ERR and WARNING   */

So do you recommend setting the same threshold levels for Xen messages,
so that ERR & WARN are rate limited?
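
As a rough sketch of how the two thresholds partition the log levels: the code below assumes that a level below the lower threshold is always printed, a level from the lower threshold up to (but not including) the upper threshold is rate limited, and anything at or above the upper threshold is dropped. That reading matches the two comments quoted above, but the names (`classify_level()`, the `LVL_*` and `PRINT_*` constants) are hypothetical and the actual check in the console code is not reproduced here.

```c
#include <assert.h>

/* Log levels as implied by the comments: lower number = more severe. */
enum { LVL_ERR = 0, LVL_WARNING = 1, LVL_INFO = 2, LVL_DEBUG = 3 };

enum verdict { PRINT_ALWAYS, PRINT_RATE_LIMITED, PRINT_NEVER };

/* Decide what happens to a message of the given level. */
static enum verdict classify_level(int level, int lower, int upper)
{
    assert(lower <= upper);

    if (level < lower)
        return PRINT_ALWAYS;
    if (level < upper)
        return PRINT_RATE_LIMITED;
    return PRINT_NEVER;
}
```

With a lower threshold of 0 and an upper threshold of 2, ERR and WARNING come out as rate limited and INFO and DEBUG as never printed.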

Regards
Vijay



Re: [Xen-devel] [RFC PATCH v3 03/18] xen: console: Add ratelimit support for error message

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 15:14, vijay.kil...@gmail.com wrote:
 On Mon, Jun 22, 2015 at 6:51 PM, Jan Beulich jbeul...@suse.com wrote:
 On 22.06.15 at 14:01, vijay.kil...@gmail.com wrote:
 From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com

 XENLOG_ERR_RATE_LIMIT and XENLOG_G_ERR_RATE_LIMIT
 log levels are added to support rate limit for error messages

 If you mean to say that rate limiting currently doesn't work for
 XENLOG_ERR messages, then that's a problem to be fixed by
 adjusting existing code, not by adding yet another log level.
 
 For GUEST messages, ERR and WARN are rate limited by
 setting the lower threshold to 0 and the upper threshold to 2, as below:
 
 #define XENLOG_GUEST_UPPER_THRESHOLD 2 /* Do not print INFO and DEBUG  */
 #define XENLOG_GUEST_LOWER_THRESHOLD 0 /* Rate-limit ERR and WARNING   */
 
 So do you recommend setting the same threshold levels for Xen messages,
 so that ERR & WARN are rate limited?

I'm not sure I understand what you're asking: I recommend no
change at all, unless you see something broken (in which case
that's what I want clearly described).

Jan




Re: [Xen-devel] [PATCH 6/8] xen/x86: Calculate PV CR4 masks at boot

2015-06-25 Thread Andrew Cooper
On 25/06/15 14:08, Jan Beulich wrote:
 On 24.06.15 at 18:31, andrew.coop...@citrix.com wrote:
 --- a/xen/arch/x86/domain.c
 +++ b/xen/arch/x86/domain.c
 @@ -682,24 +682,47 @@ void arch_domain_unpause(struct domain *d)
  viridian_time_ref_count_thaw(d);
  }
  
 -unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long 
 guest_cr4)
 +/*
 + * These are the masks of CR4 bits (subject to hardware availability) which 
 a
 + * PV guest may not legitimately attempt to modify.
 + */
 +static unsigned long __read_mostly pv_cr4_mask, compat_pv_cr4_mask;
 The patch generally being fine, I still wonder why you chose to use
 "pv" in the names instead of the previous "hv": To me, the latter
 makes more sense: "the bits the hypervisor controls" instead of "the
 bits pv guests do not control".

It is the set of bits Xen doesn't mind the guest attempting to modify,
which is specifically different from the bits Xen actually controls, and
different from the set of bits shadowed in a guest's CR4.

The masks do represent a superset of the shadowed bits, (clamped by
hardware support).  Bits such as PGE and FSGSBASE are deemed ok for a
guest to attempt to modify, but are not shadowed and the guest's
interests are completely ignored.

~Andrew



Re: [Xen-devel] [PATCH 6/8] xen/x86: Calculate PV CR4 masks at boot

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 15:31, andrew.coop...@citrix.com wrote:
 On 25/06/15 14:08, Jan Beulich wrote:
 On 24.06.15 at 18:31, andrew.coop...@citrix.com wrote:
 --- a/xen/arch/x86/domain.c
 +++ b/xen/arch/x86/domain.c
 @@ -682,24 +682,47 @@ void arch_domain_unpause(struct domain *d)
  viridian_time_ref_count_thaw(d);
  }
  
 -unsigned long pv_guest_cr4_fixup(const struct vcpu *v, unsigned long 
 guest_cr4)
 +/*
 + * These are the masks of CR4 bits (subject to hardware availability) 
 which a
 + * PV guest may not legitimately attempt to modify.
 + */
 +static unsigned long __read_mostly pv_cr4_mask, compat_pv_cr4_mask;
 The patch generally being fine, I still wonder why you chose to use
 "pv" in the names instead of the previous "hv": To me, the latter
 makes more sense: "the bits the hypervisor controls" instead of "the
 bits pv guests do not control".
 
 It is the set of bits Xen doesn't mind the guest attempting to modify,

It's the inverse of that set of bits really, isn't it?

Jan

 which is specifically different from the bits Xen actually controls, and
 different from the set of bits shadowed in a guest's CR4.
 
 The masks do represent a superset of the shadowed bits, (clamped by
 hardware support).  Bits such as PGE and FSGSBASE are deemed ok for a
 guest to attempt to modify, but are not shadowed and the guest's
 interests are completely ignored.
 
 ~Andrew






Re: [Xen-devel] [PATCH v4 01/11] x86/acpi: add a common interface for x86 cpu matching

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 13:14, wei.w.w...@intel.com wrote:
 Add a common interface for matching the current cpu against an
 array of x86_cpu_ids. Also change mwait-idle.c to use it.
 
 v4 changes:
 None.
 
 Signed-off-by: Wei Wang wei.w.w...@intel.com

Please avoid re-sending patches that got applied already.

Jan




Re: [Xen-devel] [PATCH 8/8] xen/x86: Additional SMAP modes to work around buggy 32bit PV guests

2015-06-25 Thread Andrew Cooper
On 25/06/15 12:18, David Vrabel wrote:
 On 24/06/15 17:31, Andrew Cooper wrote:
 Experimentally, older Linux guests perform construction of `init` with user
 pagetable mappings.  This is fine for native systems as such a guest would 
 not
 set CR4.SMAP itself.

 However if Xen uses SMAP itself, 32bit PV guests (whose kernels run in ring1)
 are also affected.  Older Linux guests end up spinning in a loop assuming 
 that
 the SMAP violation pagefaults are spurious, and make no further progress.

 One option is to disable SMAP completely, but this is unreasonable.  A better
 alternative is to disable SMAP only in the context of 32bit PV guests, but this
 reduces the effectiveness of SMAP security.  A 3rd option is for Xen to fix up
 behind a 32bit guest if it were SMAP-aware.  It is a heuristic, and does
 result in a guest-visible state change, but allows Xen to keep CR4.SMAP
 unconditionally enabled.
 [...]
 --- a/docs/misc/xen-command-line.markdown
 +++ b/docs/misc/xen-command-line.markdown
 @@ -1261,11 +1261,32 @@ Set the serial transmit buffer size.
  Flag to enable Supervisor Mode Execution Protection
  
  ### smap
 - `= <boolean>`
 + `= <boolean> | compat | fixup`
  
   Default: `true`
  
 -Flag to enable Supervisor Mode Access Prevention
 +Handling of Supervisor Mode Access Prevention.
 +
 +32bit PV guest kernels qualify as supervisor code, as they execute in ring 
 1.
 +If Xen uses SMAP protection itself, a PV guest which is not SMAP aware may
 +suffer unexpected pagefaults which it cannot handle. (Experimentally, there
 +are 32bit PV guests which fall foul of SMAP enforcement and spin in an
 +infinite loop taking pagefaults early on boot.)
 +
 +Two further SMAP modes are introduced to work around buggy 32bit PV guests 
 to
 +prevent functional regressions of VMs on newer hardware.  At any point if 
 the
 +guest sets `CR4.SMAP` itself, it is deemed aware, and **compat/fixup** cease
 +to apply.
 Guests that are not aware of SMAP or do not support it are not buggy.

Taking and not understanding a SMAP #PF is understandable.  The way it
spins in an infinite loop is unquestionably buggy.


 +
 +A SMAP mode of **compat** causes Xen to disable `CR4.SMAP` in the context of
 +an unaware 32bit PV guest.  This prevents the guest from being subject to 
 SMAP
 +enforcement, but also prevents Xen from benefiting from the added security
 +checks.
 +
 +A SMAP mode of **fixup** causes Xen to set `EFLAGS.AC` when discovering a 
 SMAP
 +pagefault in the context of an unaware 32bit PV guest.  This allows Xen to
 +retain the added security from SMAP checks, but results in a guest-visible
 +state change which it might object to.
 What does the PV ABI say about the use of EFLAGS.AC?  Have guests
 historically been allowed to use this bit?  If so, does Xen fiddling
 with it potentially break some guests?

If there were an ABI written down anywhere, I might be able to answer
that question.

32bit PV guest kernels cannot make use of AC themselves; alignment
checking is only available in cpl3.  AC is however able to be changed by
a popf instruction even in cpl3 (which makes it very curious as to why
stac/clac are strictly cpl0 instructions).

Fundamentally, smap=fixup might indeed break a PV guest, but testing
shows that RHEL/CentOS 5/6, SLES 11/12 and Debian 6/7 PV guests are all
fine with it.
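
For illustration, the extended `smap` option described in the quoted documentation could be parsed along the following lines. This is a minimal sketch, not the patch's implementation: `enum smap_mode` and `parse_smap()` are hypothetical names, and the set of accepted boolean spellings is an assumption rather than a copy of what Xen's command line parser accepts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum smap_mode { SMAP_OFF, SMAP_ON, SMAP_COMPAT, SMAP_FIXUP };

/*
 * Parse the value of a hypothetical "smap=" option.
 * Returns 0 and fills *mode on success, -1 on an unrecognised value.
 */
static int parse_smap(const char *s, enum smap_mode *mode)
{
    /* Assumed boolean spellings; the real parser may accept others. */
    static const char *const yes[] = { "1", "true", "yes", "on" };
    static const char *const no[]  = { "0", "false", "no", "off" };
    size_t i;

    assert(s != NULL && mode != NULL);

    if (strcmp(s, "compat") == 0) {
        *mode = SMAP_COMPAT;
        return 0;
    }
    if (strcmp(s, "fixup") == 0) {
        *mode = SMAP_FIXUP;
        return 0;
    }
    for (i = 0; i < sizeof(yes) / sizeof(yes[0]); i++) {
        if (strcmp(s, yes[i]) == 0) {
            *mode = SMAP_ON;
            return 0;
        }
    }
    for (i = 0; i < sizeof(no) / sizeof(no[0]); i++) {
        if (strcmp(s, no[i]) == 0) {
            *mode = SMAP_OFF;
            return 0;
        }
    }
    return -1;
}
```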

~Andrew



Re: [Xen-devel] vif-bridge: ip link set failed, name too long

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 12:36 +0100, Anthony PERARD wrote:
 Error: argument tap695cf459-b0-emu is wrong: name too long

Under Linux IFNAMSIZ is 16, whereas this is 18 characters.

Since our suffix is -emu we are adding 4 to the original 14, so we
could/should pick a 2 character suffix to distinguish PV from emulated
interfaces. -e perhaps?

It looks like the suffix is in both 
tools/hotplug/Linux/vif-common.sh and
tools/libxl/libxl_internal.h:TAP_DEVICE_SUFFIX. We could perhaps arrange
somehow that only the hotplug scripts needed to know this, allowing this
to be a more localised decision but it would no doubt involve a bunch of
faff. I'm inclined to suggest we just change the suffix globally.
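
A quick way to sanity-check a candidate suffix is sketched below. It assumes that the 16-byte IFNAMSIZ limit on Linux includes the terminating NUL, so that a usable name has at most 15 characters; `IFNAMSIZ_LINUX` and `ifname_fits()` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed limit: 16 bytes including the terminating NUL. */
#define IFNAMSIZ_LINUX 16

/* Would base + suffix still be a valid interface name length? */
static bool ifname_fits(const char *base, const char *suffix)
{
    assert(base != NULL && suffix != NULL);
    return strlen(base) + strlen(suffix) < IFNAMSIZ_LINUX;
}
```

Under that assumption, the 14-character name from the report leaves room for only one more character, so it is worth checking any shortened suffix against the longest names Neutron generates before settling on it.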

Ian.




Re: [Xen-devel] [PATCH v2] libxl: Add AHCI support for upstream qemu

2015-06-25 Thread Malcolm Crossley
On 25/06/15 12:15, Fabio Fantoni wrote:
 Il 25/06/2015 12:21, Ian Campbell ha scritto:
 On Tue, 2015-06-23 at 11:15 +0200, Fabio Fantoni wrote:
 Usage:
 ahci=0|1 (default=0)
 I think a global rather than per disk option is OK (I can't think why a
 user would want to mix and match) but maybe we should consider using an
 enum (with values ide and ahci, defaulting to ide in libxl) so that we
 can add support for whatever fancy new disk controller everyone is using
 in 5 years time?
 
 ahci was added 4 years ago in qemu and I don't know of a newer similar
 technology. In the case of an enum, it
 should probably be more generic to allow for more future possibilities, or am I
 wrong? In that case what
 can the name be?
 @stabellini and other developer: any advice about this?

You may want to support the NVMe device interface as well. This would be the newer
similar technology
you are referring to :)





[Xen-devel] [PATCH 1/4] xen: sched: avoid dumping duplicate information

2015-06-25 Thread Dario Faggioli
When dumping scheduling information (debug key 'r'), what
we print as 'Idle cpupool' is pretty much the same as what
we print immediately after as 'Cpupool0'. In fact, if there
are no pCPUs outside of any cpupools, it is exactly the
same.

If there are free pCPUs, there is some valuable information,
but still a lot of duplication:

 (XEN) Online Cpus: 0-15
 (XEN) Free Cpus: 8
 (XEN) Idle cpupool:
 (XEN) Scheduler: SMP Credit Scheduler (credit)
 (XEN) info:
 (XEN)   ncpus  = 13
 (XEN)   master = 0
 (XEN)   credit = 3900
 (XEN)   credit balance = 45
 (XEN)   weight = 1280
 (XEN)   runq_sort  = 11820
 (XEN)   default-weight = 256
 (XEN)   tslice = 30ms
 (XEN)   ratelimit  = 1000us
 (XEN)   credits per msec   = 10
 (XEN)   ticks per tslice   = 3
 (XEN)   migration delay= 0us
 (XEN) idlers: ,6d29
 (XEN) active vcpus:
 (XEN) 1: [1.7] pri=-1 flags=0 cpu=15 credit=-116 [w=256,cap=0] (84+300) 
{a/i=22/21 m=18+5 (k=0)}
 (XEN) 2: [1.3] pri=0 flags=0 cpu=1 credit=-113 [w=256,cap=0] (87+300) 
{a/i=37/36 m=11+544 (k=0)}
 (XEN) 3: [0.15] pri=-1 flags=0 cpu=4 credit=95 [w=256,cap=0] (210+300) 
{a/i=127/126 m=108+9 (k=0)}
 (XEN) 4: [0.10] pri=-2 flags=0 cpu=12 credit=-287 [w=256,cap=0] (-84+300) 
{a/i=163/162 m=36+568 (k=0)}
 (XEN) 5: [0.7] pri=-2 flags=0 cpu=2 credit=-242 [w=256,cap=0] (-42+300) 
{a/i=129/128 m=16+50 (k=0)}
 (XEN) CPU[08]  sort=5791, sibling=,0300, core=,ff00
 (XEN)   run: [32767.8] pri=-64 flags=0 cpu=8
 (XEN) Cpupool 0:
 (XEN) Cpus: 0-5,10-15
 (XEN) Scheduler: SMP Credit Scheduler (credit)
 (XEN) info:
 (XEN)   ncpus  = 13
 (XEN)   master = 0
 (XEN)   credit = 3900
 (XEN)   credit balance = 45
 (XEN)   weight = 1280
 (XEN)   runq_sort  = 11820
 (XEN)   default-weight = 256
 (XEN)   tslice = 30ms
 (XEN)   ratelimit  = 1000us
 (XEN)   credits per msec   = 10
 (XEN)   ticks per tslice   = 3
 (XEN)   migration delay= 0us
 (XEN) idlers: ,6d29
 (XEN) active vcpus:
 (XEN) 1: [1.7] pri=-1 flags=0 cpu=15 credit=-116 [w=256,cap=0] (84+300) 
{a/i=22/21 m=18+5 (k=0)}
 (XEN) 2: [1.3] pri=0 flags=0 cpu=1 credit=-113 [w=256,cap=0] (87+300) 
{a/i=37/36 m=11+544 (k=0)}
 (XEN) 3: [0.15] pri=-1 flags=0 cpu=4 credit=95 [w=256,cap=0] (210+300) 
{a/i=127/126 m=108+9 (k=0)}
 (XEN) 4: [0.10] pri=-2 flags=0 cpu=12 credit=-287 [w=256,cap=0] (-84+300) 
{a/i=163/162 m=36+568 (k=0)}
 (XEN) 5: [0.7] pri=-2 flags=0 cpu=2 credit=-242 [w=256,cap=0] (-42+300) 
{a/i=129/128 m=16+50 (k=0)}
 (XEN) CPU[00]  sort=11801, sibling=,0003, core=,00ff
 (XEN)   run: [32767.0] pri=-64 flags=0 cpu=0
 ... ... ...
 (XEN) CPU[15]  sort=11820, sibling=,c000, core=,ff00
 (XEN)   run: [1.7] pri=-1 flags=0 cpu=15 credit=-116 [w=256,cap=0] (84+300) 
{a/i=22/21 m=18+5 (k=0)}
 (XEN) 1: [32767.15] pri=-64 flags=0 cpu=15
 (XEN) Cpupool 1:
 (XEN) Cpus: 6-7,9
 (XEN) Scheduler: SMP RTDS Scheduler (rtds)
 (XEN) CPU[06]
 (XEN) CPU[07]
 (XEN) CPU[09]

With this change, we get rid of the redundancy, and retain
only the information about the free pCPUs.

(While there, turn a loop index variable from `int' to
`unsigned int' in schedule_dump().)

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Cc: Juergen Gross jgr...@suse.com
Cc: George Dunlap george.dun...@eu.citrix.com
---
 xen/common/cpupool.c  |6 +++---
 xen/common/schedule.c |   18 +-
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index 563864d..5471f93 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -728,10 +728,10 @@ void dump_runq(unsigned char key)
 
     print_cpumap("Online Cpus", &cpu_online_map);
     if ( !cpumask_empty(&cpupool_free_cpus) )
+    {
         print_cpumap("Free Cpus", &cpupool_free_cpus);
-
-    printk("Idle cpupool:\n");
-    schedule_dump(NULL);
+        schedule_dump(NULL);
+    }
 
     for_each_cpupool(c)
     {
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index ecf1545..4ffcd98 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -1473,16 +1473,24 @@ void scheduler_free(struct scheduler *sched)
 
 void schedule_dump(struct cpupool *c)
 {
-    int               i;
+    unsigned int      i;
     struct scheduler *sched;
     cpumask_t        *cpus;
 
     /* Locking, if necessary, must be handled withing each scheduler */
 
-    sched = (c == NULL) ? &ops : c->sched;
-    cpus = cpupool_scheduler_cpumask(c);
-    printk("Scheduler: %s (%s)\n", sched->name, sched->opt_name);
-    SCHED_OP(sched, dump_settings);
+    if ( c != NULL )
+    {
+        sched = c->sched;
+        cpus = c->cpu_valid;
+        printk("Scheduler: %s (%s)\n", sched->name, sched->opt_name);
+        SCHED_OP(sched, dump_settings);
+    }
+    else
+    {
+        sched = &ops;
+cpus 

[Xen-devel] [PATCH 2/4] xen: x86 / cpupool: clear the proper cpu_valid bit on pCPU teardown

2015-06-25 Thread Dario Faggioli
In fact, if a pCPU belonging to some other pool than
cpupool0 goes down, we want to clear the relevant bit
from its actual pool, rather than always from cpupool0.

Before this commit, all the pCPUs in the non-default
pool(s) are considered valid immediately during system
resume, even the ones that have not been brought up
yet. As a result, the (Credit1) scheduler will attempt
to run its load balancing logic on them, causing the
following Oops:

# xl cpupool-cpu-remove Pool-0 8-15
# xl cpupool-create name=\"Pool-1\"
# xl cpupool-cpu-add Pool-1 8-15
-- suspend
-- resume
(XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
(XEN) CPU:8
(XEN) RIP:e008:[82d080123078] csched_schedule+0x4be/0xb97
(XEN) RFLAGS: 00010087   CONTEXT: hypervisor
(XEN) rax: 80007d2f7fccb780   rbx: 0009   rcx: 
(XEN) rdx: 82d08031ed40   rsi: 82d080334980   rdi: 
(XEN) rbp: 8301fe20   rsp: 8301fd40   r8:  0004
(XEN) r9:     r10: 00ff00ff00ff00ff   r11: 0f0f0f0f0f0f0f0f
(XEN) r12: 8303191ea870   r13: 8303226aadf0   r14: 0009
(XEN) r15: 0008   cr0: 8005003b   cr4: 26f0
(XEN) cr3: dba9d000   cr2: 
(XEN) ds:    es:    fs:    gs:    ss:    cs: e008
(XEN) ... ... ...
(XEN) Xen call trace:
(XEN)[82d080123078] csched_schedule+0x4be/0xb97
(XEN)[82d08012c732] schedule+0x12a/0x63c
(XEN)[82d08012f8c8] __do_softirq+0x82/0x8d
(XEN)[82d08012f920] do_softirq+0x13/0x15
(XEN)[82d080164791] idle_loop+0x5b/0x6b
(XEN)
(XEN) 
(XEN) Panic on CPU 8:
(XEN) GENERAL PROTECTION FAULT
(XEN) [error_code=]
(XEN) 

Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Cc: Juergen Gross jgr...@suse.com
Cc: Jan Beulich jbeul...@suse.com
Cc: Andrew Cooper andrew.coop...@citrix.com
---
 xen/arch/x86/smpboot.c |1 -
 xen/common/cpupool.c   |2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 2289284..a4ec396 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -887,7 +887,6 @@ void __cpu_disable(void)
     remove_siblinginfo(cpu);
 
     /* It's now safe to remove this processor from the online map */
-    cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
     cpumask_clear_cpu(cpu, &cpu_online_map);
     fixup_irqs();
 
diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index 5471f93..b48ae17 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -530,6 +530,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
 if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
 {
 cpumask_set_cpu(cpu, (*c)->cpu_suspended);
+cpumask_clear_cpu(cpu, (*c)->cpu_valid);
 break;
 }
 }
@@ -552,6 +553,7 @@ static int cpupool_cpu_remove(unsigned int cpu)
  * If we are not suspending, we are hot-unplugging cpu, and that is
  * allowed only for CPUs in pool0.
  */
+cpumask_clear_cpu(cpu, cpupool0->cpu_valid);
 ret = 0;
 }
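
The effect of the change can be modelled outside Xen. What follows is only
a minimal sketch under stated assumptions: struct toy_pool and
toy_cpu_remove() are hypothetical stand-ins (plain 32-bit masks instead of
Xen's cpumask API), not code from the patch. It shows the bit being cleared
in the pool that actually owns the pCPU rather than always in pool 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpupool: one bit per pCPU in each mask. */
struct toy_pool {
    uint32_t cpu_valid;      /* pCPUs the pool's scheduler may use */
    uint32_t cpu_suspended;  /* pCPUs parked across suspend/resume */
};

/*
 * On teardown, find the pool that owns the pCPU and clear the bit there.
 * Returns the index of the owning pool, or -1 if no pool owns the pCPU.
 * Assumes cpu < 32 in this toy model.
 */
static int toy_cpu_remove(struct toy_pool *pools, int npools, unsigned int cpu)
{
    uint32_t bit = UINT32_C(1) << cpu;

    for (int i = 0; i < npools; i++) {
        if (pools[i].cpu_valid & bit) {
            pools[i].cpu_suspended |= bit;  /* remember it for resume */
            pools[i].cpu_valid &= ~bit;     /* no longer schedulable here */
            return i;
        }
    }
    return -1;
}
```

With pool 0 owning pCPUs 0-7 and pool 1 owning pCPUs 8-15, removing pCPU 8
touches only pool 1's masks, which is the behaviour the commit message asks
for.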
 




[Xen-devel] [xen-4.5-testing test] 58867: regressions - FAIL

2015-06-25 Thread osstest service user
flight 58867 xen-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/58867/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-qemut-rhel6hvm-amd 12 guest-start/redhat.repeat fail REGR. vs. 
58776
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-localmigrate.2 fail REGR. 
vs. 58776
 test-amd64-i386-xl-qemuu-winxpsp3 15 guest-localmigrate/x10 fail REGR. vs. 
58776

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail like 58776
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58776
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail like 58776
 test-amd64-amd64-xl-qemuu-winxpsp3 15 guest-localmigrate/x10   fail like 58776

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-i386-libvirt  12 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl  12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-sedf 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-sedf-pin 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail never pass
 test-armhf-armhf-libvirt 12 migrate-support-check  fail   never pass

version targeted for testing:
 xen  e3bd3cefba5f11062523701bd07051c92a47ef34
baseline version:
 xen  a24672752214b07661db594921ba70c0ee3066c5


People who touched revisions under test:
  Ian Jackson ian.jack...@eu.citrix.com
  Jan Beulich jbeul...@suse.com


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemut-rhel6hvm-amd   fail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-i386-rumpuserxen-i386 pass
 test-amd64-amd64-xl-pvh-intel  fail
 test-amd64-i386-qemut-rhel6hvm-intel pass
 

Re: [Xen-devel] [PATCH OSSTEST v3 21/22] Debian: Arrange to be able to chainload a xen.efi from grub2

2015-06-25 Thread Ian Jackson
Ian Campbell writes (Re: [PATCH OSSTEST v3 21/22] Debian: Arrange to be able 
to chainload a xen.efi from grub2):
 On Thu, 2015-06-25 at 11:33 +0100, Ian Jackson wrote:
  Is there some upstream-friendly way of achieving the same thing ?
 
 Not AFAIK. I could try upstreaming this but given that a) the user still
 needs to manually copy things to the ESP and create a suitable xen.cfg
 and b) people are working on a better way which will just work with the
 existing non-UEFI grub.cfg file entries, I'm not sure how much point
 there is.

I think "people are working on a better way" is what I was looking
for.  When that change comes along, we can remove 20_linux_xen ?

  I'm not really sure what is `specific to us' (or what `us' here means
  - osstest, or Xen on arm64, or ...?)
 
 All the paths are basically specific to us, just the general shape of
 the entry is more generically applicable.

`us' = osstest ?  Xen ?

Ian.



[Xen-devel] [PATCH] xen: new maintainer for the RTDS scheduler

2015-06-25 Thread Dario Faggioli
Signed-off-by: Dario Faggioli dario.faggi...@citrix.com
---
Cc: George Dunlap george.dun...@eu.citrix.com
Cc: Meng Xu xumengpa...@gmail.com
---
 MAINTAINERS |5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6b1068e..e6616d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,11 @@ F: tools/libxl/libxl_nonetbuffer.c
 F: tools/hotplug/Linux/remus-netbuf-setup
 F: tools/hotplug/Linux/block-drbd-probe
 
+RTDS SCHEDULER
+M: Dario Faggioli dario.faggi...@citrix.com
+S: Supported
+F: xen/common/sched_rt.c
+
 SCHEDULING
 M: George Dunlap george.dun...@eu.citrix.com
 S: Supported




Re: [Xen-devel] [PATCH v2 09/12] x86/altp2m: add remaining support routines.

2015-06-25 Thread Lengyel, Tamas
On Wed, Jun 24, 2015 at 2:06 PM, Ed White edmund.h.wh...@intel.com wrote:

 On 06/24/2015 09:15 AM, Lengyel, Tamas wrote:
  +bool_t p2m_set_altp2m_mem_access(struct domain *d, uint16_t idx,
  + unsigned long pfn, xenmem_access_t
  access)
  +{
 
 
  This function IMHO should be merged with p2m_set_mem_access and should be
  triggerable with the same memop (XENMEM_access_op) hypercall instead of
  introducing a new hvmop one.

 I think we should vote on this. My view is that it makes XENMEM_access_op
 too complicated to use.


The two functions are not very long and share enough code to justify
merging them. The only big change added is the copy from host->alt when
the entry doesn't exist in alt, and that itself is pretty self-contained.
Let's see if we can get a third opinion on it.


 It also makes using this one specific altp2m
 capability different to using any of the others


That argument goes both ways - a new mem_access function being introduced
that is different from the others.

Tamas


[Xen-devel] [PATCH] Stepping up for being the maintainer of sched_rt.c

2015-06-25 Thread Dario Faggioli
I've been involved with this scheduler from the very beginning of the
upstreaming process (from the RT-Xen project to here).

I've been working with Meng and his group closely since then, and I now feel
comfortable to be the one that will (N)Ack their patches! :-)

Regards,
Dario
---
Dario Faggioli (1):
  xen: new maintainer for the RTDS scheduler

 MAINTAINERS |5 +
 1 file changed, 5 insertions(+)
--
This happens because I choose it to happen! (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH OSSTEST v3 21/22] Debian: Arrange to be able to chainload a xen.efi from grub2

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 13:36 +0100, Ian Jackson wrote:
 Ian Campbell writes (Re: [PATCH OSSTEST v3 21/22] Debian: Arrange to be able 
 to chainload a xen.efi from grub2):
  On Thu, 2015-06-25 at 11:33 +0100, Ian Jackson wrote:
   Is there some upstream-friendly way of achieving the same thing ?
  
  Not AFAIK. I could try upstreaming this but given that a) the user still
  needs to manually copy things to the ESP and create a suitable xen.cfg
  and b) people are working on a better way which will just work with the
  existing non-UEFI grub.cfg file entries, I'm not sure how much point
  there is.
 
 I think "people are working on a better way" is what I was looking
 for.  When that change comes along, we can remove 20_linux_xen ?

OK.

   I'm not really sure what is `specific to us' (or what `us' here means
   - osstest, or Xen on arm64, or ...?)
  
  All the paths are basically specific to us, just the general shape of
  the entry is more generically applicable.
 
 `us' = osstest ?  Xen ?

Mostly osstest.

 
 Ian.





Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: 25 June 2015 14:47
 To: Paul Durrant; Jan Beulich
 Cc: xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
 On 25/06/15 14:38, Paul Durrant wrote:
  -Original Message-
  From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
  Sent: 25 June 2015 14:38
  To: Paul Durrant; Jan Beulich
  Cc: xen-de...@lists.xenproject.org; Keir (Xen.org)
  Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
  On 25/06/15 14:36, Paul Durrant wrote:
  -Original Message-
  From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
  Sent: 25 June 2015 14:34
  To: Jan Beulich
  Cc: Paul Durrant; xen-de...@lists.xenproject.org; Keir (Xen.org)
  Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op
 
  On 25/06/15 13:46, Jan Beulich wrote:
  On 25.06.15 at 14:21, andrew.coop...@citrix.com wrote:
  On 24/06/15 12:24, Paul Durrant wrote:
  When memory mapped I/O is range checked by internal handlers,
 the
  length
  of the access should be taken into account.
 
  Signed-off-by: Paul Durrant paul.durr...@citrix.com
  Cc: Keir Fraser k...@xen.org
  Cc: Jan Beulich jbeul...@suse.com
  Cc: Andrew Cooper andrew.coop...@citrix.com
 
  For what purpose?  The length of the access doesn't affect which
  handler
  should accept the IO.
 
  This length check now causes an MMIO handler to not claim an
 access
  which straddles the upper boundary.
 
  It is probably fine to terminate such an access early, but it isn't 
  fine
  to pass such a straddled access to the default ioreq server.
  No, without involving the length in the check we can end up with
  check() saying Yes, mine but read() or write() saying Not me.
  What I would agree with is for the generic handler to split the
  access if the first byte fits, but the final byte doesn't.
  I discussed this with Paul over lunch.  I had not considered how IO gets
  forwarded to the device model for shared implementations.
 
  Is it reasonable to split a straddled access and direct the halves at
  different handlers? This is not in line with how other hardware behaves
  (PCIe will reject any straddled access).  Furthermore, given small MMIO
  regions and larger registers, there is no guarantee that a single split
  will suffice.
 
  I see in the other thread going on that a domain_crash() is deemed ok
  for now, which is fine by me.
 
  I think that also allows me to simplify the patch since I don't have to
 modify
  the mmio_check op any more. I simply call it once for the first byte of the
  access and, if it accepts, verify that it also accepts the last byte of the
 access.
 
  At that point, I would say it would be easier to modify the claim check
  to return yes/straddled/no rather than calling it twice.
  That's excessive code churn, I think. The check functions are generally
 cheap and the second call is only made if the first accepts.
 
 You are already churning everything anyway by inserting an extra
 parameter.  I do think it would make the logic cleaner and easier to
 follow (which IMO takes precedence over churn).
 

No, my point was that by making the second call I don't need to add the extra 
parameter. Wait for the revised patch... it's about 6 lines long now ;-)

  Paul

 ~Andrew



Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 15:36, paul.durr...@citrix.com wrote:
 I think that also allows me to simplify the patch since I don't have to
 modify the mmio_check op any more. I simply call it once for the first byte 
 of the access and, if it accepts, verify that it also accepts the last byte 
 of the access.

That's actually not (generally) okay: There could be a hole in the
middle. But as long as instructions don't do accesses wider than
a page, we're fine with that in practice I think. Or wait, no, in the
MSI-X case this might not be okay: A 64-byte read to the 16 bytes
32 bytes away from a page boundary (and being the last entry
on one device's MSI-X table) would extend into another device's
MSI-X table on the next page. I.e. first and last bytes would be
okay to be accessed, but bytes 16...31 of the access wouldn't.
Of course the MSI-X read/write handlers don't currently permit
such wide accesses, but anyway...
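
The hole scenario above can be reproduced with a small stand-alone model.
This is a sketch only: struct toy_range and the toy_claim_*() helpers are
hypothetical, not Xen's MMIO handler interface, and the addresses are
chosen purely to mirror the numbers in the paragraph above (a 16-byte
table entry 32 bytes below a page boundary, followed by another table at
the start of the next page).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A claimed MMIO range, half-open: [start, end). Hypothetical type. */
struct toy_range {
    uint64_t start;
    uint64_t end;
};

/* True if any of the n ranges contains addr. */
static bool toy_byte_claimed(const struct toy_range *r, size_t n,
                             uint64_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= r[i].start && addr < r[i].end)
            return true;
    return false;
}

/* Check only the first and the last byte of the access. */
static bool toy_claim_endpoints(const struct toy_range *r, size_t n,
                                uint64_t addr, uint64_t len)
{
    return len != 0 &&
           toy_byte_claimed(r, n, addr) &&
           toy_byte_claimed(r, n, addr + len - 1);
}

/* Check every byte of the access, so holes in the middle are noticed. */
static bool toy_claim_all_bytes(const struct toy_range *r, size_t n,
                                uint64_t addr, uint64_t len)
{
    if (len == 0)
        return false;
    for (uint64_t off = 0; off < len; off++)
        if (!toy_byte_claimed(r, n, addr + off))
            return false;
    return true;
}
```

With ranges [0xfe0, 0xff0) and [0x1000, 0x1100), a 64-byte access at 0xfe0
passes the endpoint check but fails the per-byte check, because bytes
16...31 of the access fall into the unclaimed gap.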

Jan




Re: [Xen-devel] [PATCH v4 07/17] x86/hvm: add length to mmio check op

2015-06-25 Thread Andrew Cooper
On 25/06/15 14:38, Paul Durrant wrote:
 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: 25 June 2015 14:38
 To: Paul Durrant; Jan Beulich
 Cc: xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op

 On 25/06/15 14:36, Paul Durrant wrote:
 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: 25 June 2015 14:34
 To: Jan Beulich
 Cc: Paul Durrant; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: Re: [PATCH v4 07/17] x86/hvm: add length to mmio check op

 On 25/06/15 13:46, Jan Beulich wrote:
 On 25.06.15 at 14:21, andrew.coop...@citrix.com wrote:
 On 24/06/15 12:24, Paul Durrant wrote:
 When memory mapped I/O is range checked by internal handlers, the
 length
 of the access should be taken into account.

 Signed-off-by: Paul Durrant paul.durr...@citrix.com
 Cc: Keir Fraser k...@xen.org
 Cc: Jan Beulich jbeul...@suse.com
 Cc: Andrew Cooper andrew.coop...@citrix.com

 For what purpose?  The length of the access doesn't affect which
 handler
 should accept the IO.

 This length check now causes an MMIO handler to not claim an access
 which straddles the upper boundary.

 It is probably fine to terminate such an access early, but it isn't fine
 to pass such a straddled access to the default ioreq server.
 No, without involving the length in the check we can end up with
 check() saying Yes, mine but read() or write() saying Not me.
 What I would agree with is for the generic handler to split the
 access if the first byte fits, but the final byte doesn't.
 I discussed this with Paul over lunch.  I had not considered how IO gets
 forwarded to the device model for shared implementations.

 Is it reasonable to split a straddled access and direct the halves at
 different handlers? This is not in line with how other hardware behaves
 (PCIe will reject any straddled access).  Furthermore, given small MMIO
 regions and larger registers, there is no guarantee that a single split
 will suffice.

 I see in the other thread going on that a domain_crash() is deemed ok
 for now, which is fine by me.

 I think that also allows me to simplify the patch since I don't have to
 modify
 the mmio_check op any more. I simply call it once for the first byte of the
 access and, if it accepts, verify that it also accepts the last byte of the 
 access.

 At that point, I would say it would be easier to modify the claim check
 to return yes/straddled/no rather than calling it twice.
 That's excessive code churn, I think. The check functions are generally cheap 
 and the second call is only made if the first accepts.

You are already churning everything anyway by inserting an extra
parameter.  I do think it would make the logic cleaner and easier to
follow (which IMO takes precedence over churn).
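
To make the yes/straddled/no idea concrete, here is a rough sketch. The
enum and toy_claim_check() are hypothetical names for illustration only,
not the mmio_check op or any existing Xen interface; the region is
described by a base and a size, and the arithmetic is written so that it
cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tri-state result of a claim check. */
enum toy_claim {
    TOY_CLAIM_NO,         /* first byte is outside the region */
    TOY_CLAIM_STRADDLED,  /* starts inside, runs past the end */
    TOY_CLAIM_YES         /* access lies entirely inside */
};

/* Classify an access of len bytes at addr against [base, base + size). */
static enum toy_claim toy_claim_check(uint64_t base, uint64_t size,
                                      uint64_t addr, uint64_t len)
{
    if (len == 0 || addr < base || addr - base >= size)
        return TOY_CLAIM_NO;

    /* Bytes left in the region from addr onwards; at least 1 here. */
    if (len > size - (addr - base))
        return TOY_CLAIM_STRADDLED;

    return TOY_CLAIM_YES;
}
```

A caller could then accept TOY_CLAIM_YES, crash the domain (or terminate
the access) on TOY_CLAIM_STRADDLED, and move on to the next handler on
TOY_CLAIM_NO, all from a single call per handler.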

~Andrew



Re: [Xen-devel] [PATCH v2] libxl: Add AHCI support for upstream qemu

2015-06-25 Thread Ian Campbell
On Tue, 2015-06-23 at 11:15 +0200, Fabio Fantoni wrote:
 Usage:
 ahci=0|1 (default=0)

I think a global rather than per disk option is OK (I can't think why a
user would want to mix and match) but maybe we should consider using an
enum (with values ide and ahci, defaulting to ide in libxl) so that we
can add support for whatever fancy new disk controller everyone is using
in 5 years time?
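
As a rough illustration of the enum idea (not a proposal for the actual
libxl IDL): the type and function names below are hypothetical, and the
only behaviour taken from the discussion is "values ide and ahci,
defaulting to ide". New controller types would become new enumerators
instead of new boolean options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical controller type; IDE is value 0 so it is the default. */
enum toy_hdtype {
    TOY_HDTYPE_IDE = 0,
    TOY_HDTYPE_AHCI
};

/*
 * Parse a config string into a controller type.
 * A missing option (NULL) selects the default, IDE.
 * Returns 0 on success, -1 for an unknown value (*out is left untouched).
 */
static int toy_hdtype_parse(const char *s, enum toy_hdtype *out)
{
    if (s == NULL || strcmp(s, "ide") == 0) {
        *out = TOY_HDTYPE_IDE;
        return 0;
    }
    if (strcmp(s, "ahci") == 0) {
        *out = TOY_HDTYPE_AHCI;
        return 0;
    }
    return -1;
}
```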

 If enabled adds ich9 disk controller in ahci mode and uses it with
 upstream qemu to emulate disks instead of ide.
 It doesn't support cdroms which still using ide (cdroms will use
 -device ide-cd as new qemu parameter)

I don't follow this reference to "will use" and a "new qemu parameter",
there seems to be nothing corresponding in this patch AFAICT.

 Ahci requires new qemu parameter but for now other emulated disks cases
 remains with old ones (I did it in other patch, not needed by this one)

You can drop the reference to the other patch I think.

 I did it as libxl parameter disabled by default to avoid possible
 problems:
 - with save/restore/migration (restoring with ahci a domU that was with
 ide instead)
 - windows  8 without pv drivers (a registry key change is needed for
 AHCI-IDE change FWIK to avoid possible blue screen)

What is FWIK?

 - windows XP or older that may not support ahci by default.
 Setting AHCI with libxl parameter and default to disabled seems the best
 solution.
 AHCI increase hvm domUs boot performance. On linux hvm domU I saw up to
 only 20% of the previous total boot time, whereas boot time decrease a
 lot on W7 domUs for most of boots I have done. Small difference in boot
 time compared to ide mode on W8 and newer (probably other xen
 improvements or fixes are needed not ahci related)
 
 Signed-off-by: Fabio Fantoni fabio.fant...@m2r.biz
 
 ---
 
 Changes in v2:
 - libxl_dm.c: small code style fix
 - added vbd-interface.txt changes
 ---
  docs/man/xl.cfg.pod.5   |  9 +
  docs/misc/vbd-interface.txt |  5 +++--
  tools/libxl/libxl.h | 10 ++
  tools/libxl/libxl_create.c  |  1 +
  tools/libxl/libxl_dm.c  | 10 +-
  tools/libxl/libxl_types.idl |  1 +
  tools/libxl/xl_cmdimpl.c|  1 +
  7 files changed, 34 insertions(+), 3 deletions(-)
 
 diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
 index a3e0e2e..7e16123 100644
 --- a/docs/man/xl.cfg.pod.5
 +++ b/docs/man/xl.cfg.pod.5
 @@ -904,6 +904,15 @@ default is B<cd>.
  
  =back
  
 +=item B<ahci=[0|1]>

=item B<ahci=BOOLEAN> please.

 +If enabled adds ich9 disk controller in ahci mode and uses it with
 +upstream qemu to emulate disks instead of ide. It decrease boot time but

decreases

 +may be not supported by default in windows xp and older windows.
 +The default is disabled (0).

may not be supported.

I think AHCI and IDE should be capitalised in the text (not the option
name). As should Windows XP and Windows.

 +
 +=back
 +
  =head3 Paging
  
  The following options control the mechanisms used to virtualise guest
 diff --git a/docs/misc/vbd-interface.txt b/docs/misc/vbd-interface.txt
 index f873db0..afb6846 100644
 --- a/docs/misc/vbd-interface.txt
 +++ b/docs/misc/vbd-interface.txt
 @@ -3,18 +3,19 @@ Xen guest interface
  
  A Xen guest can be provided with block devices.  These are always
  provided as Xen VBDs; for HVM guests they may also be provided as
 -emulated IDE or SCSI disks.
 +emulated IDE, AHCI or SCSI disks.
  
  The abstract interface involves specifying, for each block device:
  
   * Nominal disk type: Xen virtual disk (aka xvd*, the default); SCSI
 -   (sd*); IDE (hd*).
 +   (sd*); IDE or AHCI (hd*).
  
 For HVM guests, each whole-disk hd* and and sd* device is made
 available _both_ via emulated IDE resp. SCSI controller, _and_ as a
 Xen VBD.  The HVM guest is entitled to assume that the IDE or SCSI
 disks available via the emulated IDE controller target the same
 underlying devices as the corresponding Xen VBD (ie, multipath).
 +   In hd* case with ahci=1, disk will be AHCI via emulated ich9 controller.
  
 For PV guests every device is made available to the guest only as a
 Xen VBD.  For these domains the type is advisory, for use by the
 diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
 index 0a7913b..6a3677d 100644
 --- a/tools/libxl/libxl.h
 +++ b/tools/libxl/libxl.h
 @@ -596,6 +596,16 @@ typedef struct libxl__ctx libxl_ctx;
  #define LIBXL_HAVE_SPICE_STREAMINGVIDEO 1
  
  /*
 + * LIBXL_HAVE_AHCI
 + *
 + * If defined, then the u.hvm structure will contain a boolean type:
 + * ahci. This value defines if ahci support is present.
 + *
 + * If this is not defined, the ahci support is ignored.
 + */
 +#define LIBXL_HAVE_AHCI 1
 +
 +/*
   * LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS 1
   *
   * If this is defined, libxl_domain_create_restore()'s API has changed to
 diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
 index 86384d2..8ca2481 100644
 --- a/tools/libxl/libxl_create.c
 +++ b/tools/libxl/libxl_create.c
 @@ -331,6 

Re: [Xen-devel] [PATCH OSSTEST v3 02/22] mg-*: Make package fetching common in new mgi-debian

2015-06-25 Thread Ian Jackson
Ian Campbell writes (Re: [PATCH OSSTEST v3 02/22] mg-*: Make package fetching 
common in new mgi-debian):
 On Wed, 2015-06-24 at 17:00 +0100, Ian Jackson wrote:
...
  Although, another option would be to put this in mgi-common and call
  it fetch_debian_package.
 
 I'm happy either way, which would you prefer?

I'd marginally prefer it all in mgi-common.

Ian.



Re: [Xen-devel] [PATCH 0/2] xen: Allow xen tools to run in guest using 64K page granularity

2015-06-25 Thread Wei Liu
On Mon, May 11, 2015 at 12:55:34PM +0100, Julien Grall wrote:
 Hi all,
 
 This small series contains the only changes required in Xen in order to run
 a guest using 64K page granularity on top of an unmodified Xen.
 
 I'd like feedback from the tools maintainers on whether it would be worth
 introducing a function xc_pagesize() replicating the behavior of
 getpagesize() for Xen.
 

Can we start with documenting the ABI (?) for communicating between
guests with different page sizes?

Or at least mention the ring mfn always has the size of XC_PAGE_SIZE (if
that's the case).
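
To illustrate why the distinction matters, a small sketch follows. It
assumes the Xen interface page is 4 KiB (which is what XC_PAGE_SIZE
expresses); the TOY_* macros and helper names are hypothetical and are not
part of libxc.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the Xen interface page is 4 KiB, mirroring XC_PAGE_SIZE. */
#define TOY_XEN_PAGE_SHIFT 12
#define TOY_XEN_PAGE_SIZE  ((size_t)1 << TOY_XEN_PAGE_SHIFT)

/* Number of Xen-sized frames covered by one page of the running kernel. */
static size_t toy_xen_frames_per_guest_page(size_t guest_page_size)
{
    return guest_page_size >> TOY_XEN_PAGE_SHIFT;
}

/*
 * Bytes a daemon should map for one ring. The daemon's own page size is
 * deliberately ignored: the ring is defined in terms of the Xen page.
 */
static size_t toy_ring_map_len(size_t daemon_page_size)
{
    (void)daemon_page_size;
    return TOY_XEN_PAGE_SIZE;
}
```

On a kernel with 64K pages, getpagesize() would report 65536, i.e. 16
Xen-sized frames, whereas the ring still occupies a single 4 KiB frame in
this model.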

Wei.

 Sincerely yours,
 
 Julien Grall (2):
   tools/xenstored: Use XC_PAGE_SIZE rather than getpagesize()
   tools/xenconsoled: Use XC_PAGE_SIZE rather than getpagesize()
 
  tools/console/daemon/io.c | 4 ++--
  tools/xenstore/xenstored_domain.c | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)
 
 -- 
 2.1.4



Re: [Xen-devel] [PATCH RFC v1 10/13] lib{xc/xl}: allow the creation of HVM domains with a kernel

2015-06-25 Thread Wei Liu
I think the subject line should be changed a bit.

We already support HVM direct kernel boot with QEMU. Now you're
implementing that without QEMU.

Wei.



Re: [Xen-devel] [v4][PATCH 10/19] tools: extend xc_assign_device() to support rdm reservation policy

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:21PM +0800, Tiejun Chen wrote:
 This patch passes rdm reservation policy to xc_assign_device() so the policy
 is checked when assigning devices to a VM.
 
 Note this also brings some fallout to python usage of xc_assign_device().
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 CC: David Scott dave.sc...@eu.citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

Acked-by: Wei Liu wei.l...@citrix.com



Re: [Xen-devel] [PATCH v4 17/17] x86/hvm: track large memory mapped accesses by buffer offset

2015-06-25 Thread Paul Durrant
 -Original Message-
 From: Paul Durrant
 Sent: 25 June 2015 11:52
 To: 'Jan Beulich'
 Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
 Subject: RE: [PATCH v4 17/17] x86/hvm: track large memory mapped
 accesses by buffer offset
 
  -Original Message-
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: 25 June 2015 11:47
  To: Paul Durrant
  Cc: Andrew Cooper; xen-de...@lists.xenproject.org; Keir (Xen.org)
  Subject: Re: [PATCH v4 17/17] x86/hvm: track large memory mapped
  accesses by buffer offset
 
   On 24.06.15 at 13:24, paul.durr...@citrix.com wrote:
   @@ -621,14 +574,41 @@ static int hvmemul_phys_mmio_access(
  
for ( ;; )
{
   -rc = hvmemul_do_mmio_buffer(gpa, one_rep, chunk, dir, 0,
   -*buffer);
   -if ( rc != X86EMUL_OKAY )
   -break;
   +/* Have we already done this chunk? */
   +if ( (*off + chunk) <= vio->mmio_cache[dir].size )
 
  I can see why you would like to get rid of the address check, but
  I'm afraid you can't: You have to avoid getting mixed up multiple
  same kind (reads or writes) memory accesses that a single
  instruction can do. While generally I would assume that
  secondary accesses (like the I/O bitmap read associated with an
  OUTS) wouldn't go to MMIO, CMPS with both operands being
  in MMIO would break even if neither crosses a page boundary
  (not to think of when the emulator starts supporting the
  scatter/gather instructions, albeit supporting them will require
  further changes, or we could choose to do them one element at
  a time).
 
 Ok. Can I assume at most two distinct sets of addresses for read or write?
 If so then I can just keep two sets of caches in the hvm_io struct.
 

Oh, I mean linear addresses here BTW.

  Paul

 
   +{
   +ASSERT(*off + chunk <= vio->mmio_cache[dir].size);
 
  I don't see any difference to the if() expression just above.
 
 
 That's possible  - this has been through a few re-bases.
 
   +if ( dir == IOREQ_READ )
   +memcpy(&buffer[*off],
   +   &vio->mmio_cache[IOREQ_READ].buffer[*off],
   +   chunk);
   +else
   +{
   +if ( memcmp(&buffer[*off],
 
  else if please.
 
 
 Ok.
 
   +&vio->mmio_cache[IOREQ_WRITE].buffer[*off],
   +chunk) != 0 )
   +domain_crash(curr->domain);
   +}
   +}
   +else
   +{
   +ASSERT(*off == vio->mmio_cache[dir].size);
   +
   +rc = hvmemul_do_mmio_buffer(gpa, &one_rep, chunk, dir, 0,
   +&buffer[*off]);
   +if ( rc != X86EMUL_OKAY )
   +break;
   +
   +/* Note that we have now done this chunk */
 
  Missing stop.
 
 
 Ok.
 
   Paul
 
  Jan
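
For readers following the review, the replay logic under discussion can be sketched in isolation. This is only an illustration under assumptions, not the patch itself: `struct mmio_cache`, `access_chunk()`, `device_access()` and `CACHE_MAX` are hypothetical names, and `DIR_READ`/`DIR_WRITE` stand in for `IOREQ_READ`/`IOREQ_WRITE`. Like the hunk above, it tracks progress by buffer offset only, so it shares the limitation Jan points out for instructions that make two accesses of the same kind.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_MAX 64                        /* hypothetical cache capacity */

enum dir { DIR_WRITE = 0, DIR_READ = 1 };   /* stand-ins for IOREQ_WRITE/IOREQ_READ */

struct mmio_cache {
    unsigned int size;                      /* bytes already emulated */
    uint8_t buffer[CACHE_MAX];
};

/* Hypothetical device access; counts calls so that replays are visible. */
static unsigned int device_calls;

static void device_access(enum dir dir, uint8_t *data, unsigned int len)
{
    device_calls++;
    if ( dir == DIR_READ )
        memset(data, 0xab, len);
}

/* Returns 0 on success, -1 if a replayed write differs from the cached one. */
static int access_chunk(struct mmio_cache cache[2], enum dir dir,
                        uint8_t *buffer, unsigned int off, unsigned int chunk)
{
    struct mmio_cache *c = &cache[dir];

    if ( off + chunk <= c->size )
    {
        /* Already done: replay from the cache instead of touching the device. */
        if ( dir == DIR_READ )
            memcpy(&buffer[off], &c->buffer[off], chunk);
        else if ( memcmp(&buffer[off], &c->buffer[off], chunk) != 0 )
            return -1;
        return 0;
    }

    /* New chunk: it must start exactly where the cache currently ends. */
    assert(off == c->size);
    assert(off + chunk <= CACHE_MAX);

    device_access(dir, &buffer[off], chunk);
    memcpy(&c->buffer[off], &buffer[off], chunk);
    c->size += chunk;
    return 0;
}
```

Calling `access_chunk()` twice with the same offset and size reaches the device only once; the second call is served from the cache, and a replayed write with different data is reported as an error, where the patch crashes the domain.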




[Xen-devel] [PATCH v4 11/11] tools: enable xenpm to control the intel_pstate driver

2015-06-25 Thread Wei Wang
The intel_pstate driver receives percentage values to set the
performance limits. This patch adds interfaces to support the
input of percentage values to control the intel_pstate driver.
Also, the get-cpufreq-para is modified to show percentage
based feedback info.

v4 changes:
None.

Signed-off-by: Wei Wang wei.w.w...@intel.com
---
 tools/libxc/include/xenctrl.h |  14 -
 tools/libxc/xc_pm.c   |  17 ---
 tools/misc/xenpm.c| 116 +-
 3 files changed, 115 insertions(+), 32 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 100b89c..a79494a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2266,8 +2266,18 @@ struct xc_get_cpufreq_para {
 uint32_t scaling_cur_freq;
 
 char scaling_governor[CPUFREQ_NAME_LEN];
-uint32_t scaling_max_freq;
-uint32_t scaling_min_freq;
+
+union {
+uint32_t freq;
+uint32_t pct;
+} scaling_max;
+
+union {
+uint32_t freq;
+uint32_t  pct;
+} scaling_min;
+
+uint32_t scaling_turbo_pct;
 
 /* for specific governor */
 union {
diff --git a/tools/libxc/xc_pm.c b/tools/libxc/xc_pm.c
index 823bab6..300de33 100644
--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -261,13 +261,16 @@ int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
 }
 else
 {
-user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
-user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
-user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
-user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
-user_para->scaling_max_freq = sys_para->scaling_max.freq;
-user_para->scaling_min_freq = sys_para->scaling_min.freq;
-user_para->turbo_enabled = sys_para->turbo_enabled;
+user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
+user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
+user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
+user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
+user_para->scaling_max.freq = sys_para->scaling_max.freq;
+user_para->scaling_min.freq = sys_para->scaling_min.freq;
+user_para->scaling_max.pct  = sys_para->scaling_max.pct;
+user_para->scaling_min.pct  = sys_para->scaling_min.pct;
+user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
+user_para->turbo_enabled = sys_para->turbo_enabled;
 
 memcpy(user_para->scaling_driver,
 sys_para->scaling_driver, CPUFREQ_NAME_LEN);
diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index 2f9bd8e..ea6a32f 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -33,6 +33,11 @@
 #define MAX_CORE_RESIDENCIES 8
 
 #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+#define min_t(type,x,y) \
+({ type __x = (x); type __y = (y); __x < __y ? __x: __y; })
+#define max_t(type,x,y) \
+({ type __x = (x); type __y = (y); __x > __y ? __x: __y; })
+#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
 
 static xc_interface *xc_handle;
 static unsigned int max_cpu_nr;
@@ -47,6 +52,9 @@ void show_help(void)
  get-cpuidle-states[cpuid]   list cpu idle info of CPU cpuid or all\n
  get-cpufreq-states[cpuid]   list cpu freq info of CPU cpuid or all\n
  get-cpufreq-para  [cpuid]   list cpu freq parameter of CPU cpuid or all\n
+ set-scaling-max-pct   [cpuid] num set max performance limit in percentage\n
+ or as scaling speed in percentage in userspace governor\n
+ set-scaling-min-pct   [cpuid] num set min performance limit in percentage\n
  set-scaling-maxfreq   [cpuid] HZ  set max cpu frequency HZ on CPU cpuid\n
  or all CPUs\n
  set-scaling-minfreq   [cpuid] HZ  set min cpu frequency HZ on CPU cpuid\n
@@ -60,10 +68,10 @@ void show_help(void)
  set-up-threshold  [cpuid] num set up threshold on CPU cpuid or all\n
  it is used in ondemand governor.\n
  get-cpu-topologyget thread/core/socket topology info\n
- set-sched-smt   enable|disable enable/disable scheduler smt power saving\n
+ set-sched-smt   enable|disable enable/disable scheduler smt power saving\n
  set-vcpu-migration-delay  num set scheduler vcpu migration delay in us\n
  get-vcpu-migration-delayget scheduler vcpu migration delay\n
- set-max-cstatenum set the C-State limitation (num = 0)\n
+ set-max-cstatenum set 
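
The clamping helpers added near the top of xenpm.c in this patch can be tried on their own. The following is a sketch under assumptions: the macros are written with the comparison operators one would expect for min/max, they rely on GNU C statement expressions (so GCC or Clang is needed), and `clamp_pct()` is a hypothetical helper that is not part of the patch.

```c
#include <stdint.h>

/* GNU C statement expressions, as in the patch; needs GCC or Clang. */
#define min_t(type, x, y) \
    ({ type __x = (x); type __y = (y); __x < __y ? __x : __y; })
#define max_t(type, x, y) \
    ({ type __x = (x); type __y = (y); __x > __y ? __x : __y; })
#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)

/* Hypothetical helper: keep a user-supplied percentage within 0..100. */
static uint32_t clamp_pct(long requested)
{
    return (uint32_t)clamp_t(long, requested, 0, 100);
}
```

Doing the comparison in a signed type before converting to `uint32_t` means a negative command-line value is pulled up to 0 rather than wrapping to a huge unsigned number.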

Re: [Xen-devel] [v4][PATCH 11/19] tools: introduce some new parameters to set rdm policy

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:22PM +0800, Tiejun Chen wrote:
 This patch introduces user configurable parameters to specify RDM
 resources and the corresponding policies:
 
 Global RDM parameter:
 rdm = type=none/host,reserve=strict/relaxed
 Per-device RDM parameter:
 pci = [ 'sbdf, rdm_reserve=strict/relaxed' ]
 
 The global RDM parameter, type, allows the user to specify reserved regions
 explicitly, e.g. using 'host' to include all reserved regions reported
 on this platform, which is good for handling the hotplug scenario. In the future
 this parameter may be further extended to allow specifying random regions,
 e.g. even those belonging to another platform, as preparation for live
 migration with passthrough devices. By contrast, 'none' means we do nothing
 with reserved regions and ignore all policies, so the guest works as before.
 
 'strict/relaxed' policy decides how to handle conflict when reserving RDM
 regions in pfn space. If conflict exists, 'strict' means an immediate error
 so VM will be killed, while 'relaxed' allows moving forward with a warning
 message thrown out.
 
 Default per-device RDM policy is 'strict', while default global RDM policy
 is 'relaxed'. When both policies are specified on a given region, 'strict' is
 always preferred.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

The code looks good to me. I will wait for native English speakers to
have a look at the docs.

Wei.
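
The precedence rule in the quoted description ('strict' wins whenever the global and per-device policies are both specified for a region) can be written down compactly. A minimal sketch, assuming hypothetical names: `enum rdm_policy` and `effective_rdm_policy()` are illustrative only and are not the libxl types from the series.

```c
enum rdm_policy {
    RDM_RELAXED,    /* warn on conflict and carry on */
    RDM_STRICT,     /* conflict is an immediate error */
};

/* Hypothetical helper: effective policy for one region of one device. */
static enum rdm_policy effective_rdm_policy(enum rdm_policy global,
                                            enum rdm_policy per_device)
{
    /* 'strict' is always preferred when the two disagree. */
    return (global == RDM_STRICT || per_device == RDM_STRICT)
           ? RDM_STRICT : RDM_RELAXED;
}
```

With the defaults described above (global 'relaxed', per-device 'strict'), this yields 'strict' for a passed-through device unless the per-device setting is relaxed explicitly.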



Re: [Xen-devel] [v4][PATCH 12/19] tools/libxl: passes rdm reservation policy

2015-06-25 Thread Wei Liu
On Tue, Jun 23, 2015 at 05:57:23PM +0800, Tiejun Chen wrote:
 This patch passes our rdm reservation policy inside libxl
 when we assign a device or attach a device.
 
 CC: Ian Jackson ian.jack...@eu.citrix.com
 CC: Stefano Stabellini stefano.stabell...@eu.citrix.com
 CC: Ian Campbell ian.campb...@citrix.com
 CC: Wei Liu wei.l...@citrix.com
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

The code looks good to me. I will wait for native English speakers to
have a look at the docs.

Wei.



Re: [Xen-devel] [PATCH 0/2] xen: Allow xen tools to run in guest using 64K page granularity

2015-06-25 Thread Wei Liu
On Thu, Jun 25, 2015 at 12:23:26PM +0100, Ian Campbell wrote:
 On Thu, 2015-06-25 at 11:21 +0100, Wei Liu wrote:
  On Mon, May 11, 2015 at 12:55:34PM +0100, Julien Grall wrote:
   Hi all,
   
   This small series contains the only changes required in Xen in order to run a
   guest using 64K page granularity on top of an unmodified Xen.
   
   I'd like feedback from the tools maintainers on whether it might be worth
   introducing a function xc_pagesize() replicating the behavior of
   getpagesize() for Xen.
   
  
  Can we start with documenting the ABI (?) for communicating between
  guests with different page sizes?
 
 We should certainly make it clearer what things are in terms of Xen ABI
 page size vs the guest's page size and other things.
 
 I think we can commit these two without that though?
 

It worries me a bit due to the lack of documentation, though I have a hunch
these patches are correct.

Saying that Xen always uses an XC_PAGE_SIZE page for the store and console
mfn is good enough.

Wei.

  
  Or at least mention the ring mfn always has the size of XC_PAGE_SIZE (if
  that's the case).
  
  Wei.
  
   Sincerely yours,
   
   Julien Grall (2):
 tools/xenstored: Use XC_PAGE_SIZE rather than getpagesize()
 tools/xenconsoled: Use XC_PAGE_SIZE rather than getpagesize()
   
tools/console/daemon/io.c | 4 ++--
tools/xenstore/xenstored_domain.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
   
   -- 
   2.1.4
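
To make the distinction in this thread concrete: the guest kernel's page size can differ from the page size the Xen ABI uses for the store and console rings. A small sketch, assuming local stand-ins for the `XC_PAGE_SHIFT`/`XC_PAGE_SIZE` definitions in xenctrl.h (taken here to describe a 4K page) so that it builds without the Xen headers; `xen_pages_per_os_page()` is a hypothetical helper.

```c
#include <unistd.h>

/* Local stand-ins mirroring xenctrl.h, assumed to describe a 4K Xen ABI page. */
#define XC_PAGE_SHIFT 12
#define XC_PAGE_SIZE  (1UL << XC_PAGE_SHIFT)

/* Number of Xen ABI pages that fit in one page of the running kernel. */
static unsigned long xen_pages_per_os_page(void)
{
    long os_page = sysconf(_SC_PAGESIZE);   /* e.g. 65536 with 64K granularity */

    return (unsigned long)os_page / XC_PAGE_SIZE;
}
```

On a kernel using 4K pages the helper returns 1; with 64K granularity it would return 16, which is why code mapping a ring page has to use XC_PAGE_SIZE rather than getpagesize().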
 



Re: [Xen-devel] [PATCH v4 06/11] x86/intel_pstate: APERF/MPERF feature detect

2015-06-25 Thread Jan Beulich
 On 25.06.15 at 13:16, wei.w.w...@intel.com wrote:
 Add support to detect the APERF/MPERF feature. Also, remove the identical
 code in cpufreq.c and powernow.c.
 
 v4 changes:
 1) this is a new consolidated patch dealing with the APERF/MPERF feature
 detection.
 
 Signed-off-by: Wei Wang wei.w.w...@intel.com

I would have taken this right away, if only it had been at the
beginning of the series (or stated that it's independent of the
earlier patches) and, more importantly, ...

 --- a/xen/arch/x86/cpu/common.c
 +++ b/xen/arch/x86/cpu/common.c
 @@ -238,6 +238,9 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
   if ( cpu_has(c, X86_FEATURE_CLFLSH) )
   c->x86_clflush_size = ((ebx >> 8) & 0xff) * 8;
  
 + if (cpuid_ecx(6) & 0x1)
 + set_bit(X86_FEATURE_APERFMPERF, c->x86_capability);

... if you hadn't used this plain 0x1 here when _both_ of the old
code pieces nicely used CPUID_6_ECX_APERFMPERF_CAPABILITY.
Bonus points for also giving a sensible name to leaf 6 and naming
its other bits code in the tree already uses (see CPUID_MWAIT_LEAF).

And of course you should check ->cpuid_level first.

Jan
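
A rough sketch of the shape these review comments point towards: a named leaf, the named ECX bit, and a cpuid_level check before the leaf is consulted. It is an illustration under assumptions only. `CPUID_PM_LEAF`, `struct cpu_sketch` and `detect_aperfmperf()` are hypothetical names, the bit value 0x1 is taken from the patch, and the leaf 6 ECX value is passed in so that the sketch does not execute CPUID itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_PM_LEAF                      6     /* hypothetical name for leaf 6 */
#define CPUID_6_ECX_APERFMPERF_CAPABILITY  0x1

/* Minimal stand-in for the relevant part of struct cpuinfo_x86. */
struct cpu_sketch {
    int cpuid_level;        /* highest supported basic CPUID leaf */
    bool has_aperfmperf;
};

/* leaf6_ecx is supplied by the caller, e.g. the result of cpuid_ecx(6). */
static void detect_aperfmperf(struct cpu_sketch *c, uint32_t leaf6_ecx)
{
    c->has_aperfmperf =
        c->cpuid_level >= CPUID_PM_LEAF &&
        (leaf6_ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY);
}
```

In real code the CPUID instruction would only be issued once the level check has passed; here the check simply gates how the supplied value is interpreted.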




Re: [Xen-devel] [PATCH OSSTEST v3 21/22] Debian: Arrange to be able to chainload a xen.efi from grub2

2015-06-25 Thread Ian Campbell
On Thu, 2015-06-25 at 11:33 +0100, Ian Jackson wrote:
 Ian Campbell writes ([PATCH OSSTEST v3 21/22] Debian: Arrange to be able to 
 chainload a xen.efi from grub2):
  Note that the 20_linux_xen change here is a bit specific to us and not
  really generic enough to go upstream IMHO, hence I haven't.
 
 So if we accept this patch, we are committing to always having
 20_linux_xen (and perhaps updating it to cope with new versions of
 grub).  Originally having this file in osstest was intended as a
 stopgap, pending inclusion of a suitable file upstream.

I originally considered writing NN_osstest_uefi, but it looked like it
was going to involve copying a fair bit of boilerplate from
20_linux_xen.

However, I've changed the approach I was using since then and now I
suspect there wouldn't actually be much duplication. So unless you think
otherwise I'll try that for next time around.

 Is there some upstream-friendly way of achieving the same thing ?

Not AFAIK. I could try upstreaming this but given that a) the user still
needs to manually copy things to the ESP and create a suitable xen.cfg
and b) people are working on a better way which will just work with the
existing non-UEFI grub.cfg file entries, I'm not sure how much point
there is.

 I'm not really sure what is `specific to us' (or what `us' here means
 - osstest, or Xen on arm64, or ...?)

All the paths are basically specific to us, just the general shape of
the entry is more generically applicable.

http://wiki.xen.org/wiki/Xen_EFI already documents how to do things
FWIW.

Ian.




Re: [Xen-devel] [PATCH v2] libxl: Add AHCI support for upstream qemu

2015-06-25 Thread Stefano Stabellini
On Thu, 25 Jun 2015, Fabio Fantoni wrote:
 On 25/06/2015 12:21, Ian Campbell wrote:
  On Tue, 2015-06-23 at 11:15 +0200, Fabio Fantoni wrote:
   Usage:
   ahci=0|1 (default=0)
  I think a global rather than per disk option is OK (I can't think why a
  user would want to mix and match) but maybe we should consider using an
  enum (with values ide and ahci, defaulting to ide in libxl) so that we
  can add support for whatever fancy new disk controller everyone is using
  in 5 years time?
 
 AHCI was added to qemu 4 years ago and I don't know of any newer similar
 technology. If we use an enum, it should probably be more generic so it can
 cover future possibilities, or am I wrong? In that case, what should the name be?
 @stabellini and other developers: any advice about this?

I don't know of any other block technologies that would use hd as
block device names. Virtio-blk uses vd, so it couldn't be confused.
However for the sake of being future proof, it might make sense to
introduce an enum, maybe something like hdtype?

enum hdtype {
ide,
ahci,
};

then in the config file:

hdtype=ahci
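
For illustration, mapping the proposed config value onto such an enum could look like the sketch below. The names `HDTYPE_IDE`, `HDTYPE_AHCI` and `hdtype_from_string()` are hypothetical; libxl would normally generate an enum of this kind and its string conversion from its IDL rather than have it written by hand.

```c
#include <string.h>

enum hdtype {
    HDTYPE_IDE,     /* default, matching ahci=0 in the original proposal */
    HDTYPE_AHCI,
};

/* Returns 0 and sets *out on success, -1 for an unknown value. */
static int hdtype_from_string(const char *s, enum hdtype *out)
{
    if ( strcmp(s, "ide") == 0 )
        *out = HDTYPE_IDE;
    else if ( strcmp(s, "ahci") == 0 )
        *out = HDTYPE_AHCI;
    else
        return -1;
    return 0;
}
```

A new controller type later on would then only need one more enumerator and one more string, instead of another boolean option next to ahci=0|1.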


