date:20141205

Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Olaf Hering wrote:

 On Thu, Dec 04, Konrad Rzeszutek Wilk wrote:
 
  On Thu, Dec 04, 2014 at 08:47:56AM +0100, Olaf Hering wrote:
   Is that something the sysadmin has to adjust, or should the xen source
   provide proper values?
  It would be rather cumbersome if the sysadmin had to adjust it. The goal
  here would be that distros could use it and package it neatly so that it
  works out of the box.
  
  What are the proper values in SuSE?
 
 I have no idea; we don't run with SELinux, at least not by default.
 So what is supposed to be there, why does it happen to work for me?
 
 And if there are changes required to the config file, they should be
 passed in via configure instead of doing a patch.

So looking again at
tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in it seems that it
happens to work for me because XENSTORED_MOUNT_CTX is set within that
file. So if something happens to need a different value for
XENSTORED_MOUNT_CTX it has to be provided in the to-be-created config
file: EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
This config file is not part of xen. 

Does the current state of xen-4.5 (like make rpmball) not work out of
the box on Fedora or anything that uses SELinux? If that's the case it
should probably be covered in the INSTALL file.

Olaf

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Olaf Hering wrote:

 So looking again at
 tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in it seems that it
 happens to work for me because XENSTORED_MOUNT_CTX is set within that
 file. So if something happens to need a different value for
 XENSTORED_MOUNT_CTX it has to be provided in the to-be-created config
 file: EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
 This config file is not part of xen. 

And I wonder why a new config file has to be created, instead of just
reusing the existing tools/hotplug/Linux/init.d/sysconfig.xencommons.in?

I will send out a few patches to adjust the EnvironmentFile handling.

It's just a question of whether a configure --with-selinux-mount-context=VAL
is needed.

Olaf


Re: [Xen-devel] Removing the PVH assert in arch/x86/hvm/io.c:87

2014-12-05 Thread Jan Beulich

 On 04.12.14 at 17:35, roger@citrix.com wrote:
 I've just stumbled upon this assert while testing PVH on different
 hardware. It was added in 7c4870 as a safety belt, but it turns out INS
 and OUTS go through handle_mmio. So using these instructions from a PVH
 guest basically kills Xen.
 
 I've removed it and everything seems fine, so I'm considering sending a
 patch for 4.5 in order to have it removed. I think the path that could
 trigger the crash because of the missing vioapic stuff is already
 guarded by the other chunk added in the same patch.

Iirc we settled on forbidding paths to handle_mmio() for PVH (hence
the ASSERT()). Sadly you provide way too little detail on what is
actually happening in your case: What's the use case of to-be-
emulated INS/OUTS in a PVH kernel? What's the call tree that gets
you into handle_mmio(), considering that both calls to
handle_mmio_with_translation() from hvm_hap_nested_page_fault()
as well as the one to handle_mmio() ought to be unreachable for PVH?

Jan




Re: [Xen-devel] Install Xen on ARM in a bare metal fashion on a Nexus Phone/Tablet or an ARM emulator

2014-12-05 Thread Ian Campbell

Please don't top post.

On Fri, 2014-12-05 at 14:51 +0530, Sagun Garg wrote:
 Thanks Ian for helping me with the links,
 
 
 FYI, I found following link :
 https://blog.xenproject.org/2014/04/01/virtualization-on-arm-with-xen/
 (Here it suggests using the Foundation Model and Linaro, basically an
 emulator to get started, though the advanced emulators offered by ARM
 are paid ones)
 
 
 Would it be possible to install these ARM emulators on say AWS
 Amazon / Digital Ocean with the free subscription to try and test
 these out ?

Those emulators will run on any x86 system, whether it is virtualised by
Amazon/DO or not. They are quite CPU intensive though.

  Amazon uses Xen as the underlying virtualization technology and it
 has also used custom kernels for the last 2 years, so coincidentally it might
 just work, though the question is how (I read this somewhere on a blog,
 but I can't point to a link as I don't remember it; would you
 know of such a tutorial / link where someone else has pursued the
 same ?)

Amazon offers x86 Xen, not ARM Xen. Xen does not do any kind of
cross-architecture virtualisation (i.e. running ARM OSes on X86, or vice
versa). So the fact that Amazon happens to run x86 Xen is of no use when
you want to run ARM Xen.

 or would you know of any RISC as a Service with ARM processors that
 can be provisioned on demand like AWS where we can install XEN on ARM
 directly ? Any cloud offering that can be used would be of great help.

I'm not aware of any cloud service offering ARM at the moment.

 Also I was wondering what are the risks / or rather shortcomings in
 trying directly on the device Nexus / or any other ARM phone which has
 been tested for the same. 

Not sure what sorts of risks you mean, I don't think there is anything
Xen specific here, just the usual stuff with running an untested OS on
any new platform.

I don't know if Nexus devices are brickable, but if so then that might
be an issue with trying any untested OS on them.

Phones and the like aren't typically very good debug platforms (i.e. no
serial, no JTAG etc) so running an untested OS on them can end up being
hard (but not impossible) to debug if it doesn't work, that's why
platforms such as the Arndale exist -- they are mobile phones with all
the extra useful debug stuff brought out to headers.

Ian.




[Xen-devel] [linux-3.10 test] 32086: regressions - trouble: blocked/broken/fail/pass

2014-12-05 Thread xen . org

flight 32086 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32086/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 26303

Regressions which are regarded as allowable (not blocking):
 build-i386-rumpuserxen                     3 host-install(3)                 broken blocked in 26303
 test-amd64-i386-qemuu-rhel6hvm-amd         7 redhat-install                  fail   like 26261
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 17 leak-check/check                fail   blocked in 26303
 test-amd64-i386-pair                      17 guest-migrate/src_host/dst_host fail   like 26303
 test-amd64-amd64-xl-winxpsp3               7 windows-install                 fail   like 26303

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386           1 build-check(1)  blocked  n/a
 test-armhf-armhf-libvirt                   5 xen-boot        fail     never pass
 test-armhf-armhf-xl                        5 xen-boot        fail     never pass
 test-amd64-amd64-xl-qemuu-win7-amd64      14 guest-stop      fail     never pass
 test-amd64-i386-xl-qemut-winxpsp3         14 guest-stop      fail     never pass
 test-amd64-i386-xl-winxpsp3               14 guest-stop      fail     never pass
 test-amd64-amd64-xl-win7-amd64            14 guest-stop      fail     never pass
 test-amd64-i386-libvirt                    9 guest-start     fail     never pass
 test-amd64-i386-xl-qemut-win7-amd64       14 guest-stop      fail     never pass
 test-amd64-i386-xl-winxpsp3-vcpus1        14 guest-stop      fail     never pass
 test-amd64-amd64-libvirt                   9 guest-start     fail     never pass
 test-amd64-i386-xl-win7-amd64             14 guest-stop      fail     never pass
 test-amd64-amd64-xl-qemut-win7-amd64      14 guest-stop      fail     never pass
 test-amd64-amd64-xl-pcipt-intel            9 guest-start     fail     never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14 guest-stop      fail     never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop      fail     never pass
 test-amd64-amd64-xl-qemuu-winxpsp3        14 guest-stop      fail     never pass
 test-amd64-i386-xl-qemuu-win7-amd64       14 guest-stop      fail     never pass
 test-amd64-i386-xl-qemuu-winxpsp3         14 guest-stop      fail     never pass

version targeted for testing:
 linux                252f23ea5987a4730e3399ef1ad5d78efcc786c9
baseline version:
 linux                be67db109090b17b56eb8eb2190cd70700f107aa


774 people touched revisions under test,
not listing them all


jobs:
 build-amd64                                 pass
 build-armhf                                 pass
 build-i386                                  pass
 build-amd64-libvirt                         pass
 build-armhf-libvirt                         pass
 build-i386-libvirt                          pass
 build-amd64-pvops                           pass
 build-armhf-pvops                           pass
 build-i386-pvops                            pass
 build-amd64-rumpuserxen                     pass
 build-i386-rumpuserxen                      broken
 test-amd64-amd64-xl                         pass
 test-armhf-armhf-xl                         fail
 test-amd64-i386-xl                          pass
 test-amd64-i386-rhel6hvm-amd                pass
 test-amd64-i386-qemut-rhel6hvm-amd          pass
 test-amd64-i386-qemuu-rhel6hvm-amd          fail
 test-amd64-amd64-xl-qemut-debianhvm-amd64   pass
 test-amd64-i386-xl-qemut-debianhvm-amd64    pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64   fail
 test-amd64-i386-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-freebsd10-amd64             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64        pass
 test-amd64-i386-xl-qemuu-ovmf-amd64         pass
 test-amd64-amd64-rumpuserxen-amd64          pass
 test-amd64-amd64-xl-qemut-win7-amd64        fail
 test-amd64-i386-xl-qemut-win7-amd64         fail
 test-amd64-amd64-xl-qemuu-win7-amd64        fail
 test-amd64-i386-xl-qemuu-win7-amd64         fail
 test-amd64-amd64-xl-win7-amd64              fail
 test-amd64-i386-xl-win7-amd64               fail
 test-amd64-i386-xl-credit2                  pass
 test-amd64-i386-freebsd10-i386

Re: [Xen-devel] [PATCH v5 2/2] add a new p2m type - p2m_mmio_write_dm

2014-12-05 Thread Tim Deegan

Hi,

At 10:00 +0800 on 05 Dec (1417770044), Yu, Zhang wrote:
  @@ -5978,7 +5982,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
                   goto param_fail4;
               }
               if ( !p2m_is_ram(t) &&
  -                 (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) )
  +                 (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
  +                 t != p2m_mmio_write_dm )
 
  I think that Jan already brought this up, and maybe I missed your
  answer: this relaxation looks wrong to me. I would have thought that
  transitions between p2m_mmio_write_dm and p2m_ram_rw/p2m_ram_logdirty
  would be the only ones you would want to allow.
 
 Ha. Sorry, my negligence, and thanks for pointing it out. :)
 The only transition we use now is between p2m_mmio_write_dm and
 p2m_ram_rw. So how about this:
 
  if ( !p2m_is_ram(t) &&
       (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
       (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) )

Yes, I think that's right. 
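
The agreed condition can be read as a rejection predicate. Below is a minimal, self-contained sketch of that reading. The enum values are simplified stand-ins for Xen's p2m and HVMMEM types, not the real definitions; the helper names and the treatment of only p2m_mmio_dm as a "hole" are assumptions made for illustration, as is the assumption that the condition guards the failure path.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical subset of Xen's p2m types. */
typedef enum {
    P2M_RAM_RW,
    P2M_RAM_LOGDIRTY,
    P2M_MMIO_DM,        /* treated as a "hole" in this sketch */
    P2M_MMIO_DIRECT,
    P2M_MMIO_WRITE_DM
} p2m_type;

/* Simplified, hypothetical subset of the requested HVMMEM types. */
typedef enum {
    HVMMEM_RAM_RW,
    HVMMEM_RAM_RO,
    HVMMEM_MMIO_DM,
    HVMMEM_MMIO_WRITE_DM
} hvmmem_type;

static bool is_ram(p2m_type t)  { return t == P2M_RAM_RW || t == P2M_RAM_LOGDIRTY; }
static bool is_hole(p2m_type t) { return t == P2M_MMIO_DM; }

/* True when the type change must be refused (same shape as the agreed test). */
static bool set_mem_type_rejected(p2m_type t, hvmmem_type req)
{
    return !is_ram(t) &&
           (!is_hole(t) || req != HVMMEM_MMIO_DM) &&
           (t != P2M_MMIO_WRITE_DM || req != HVMMEM_RAM_RW);
}
```

Under these assumptions, a page that is currently p2m_mmio_write_dm can only go back to RAM via HVMMEM_ram_rw; any other request on it is refused.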

Cheers,

Tim.


Re: [Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-05 Thread Jan Beulich

 On 03.12.14 at 17:04, david.vra...@citrix.com wrote:
 The default limit for the number of PIRQs for hardware domains (dom0)
 is not sufficient for some (x86) systems.
 
 Since the pirq structures are individually and dynamically allocated,
 the limit for hardware domains may be increased to the number of
 possible IRQs.

I nevertheless disagree with moving the bound up to the Xen internal
limit unconditionally: What use does it have to allow hwdom to use
thousands of MSIs? If a system got that many, the main purpose of
running Xen on it I would expect to be to hand various of the
respective devices to guests. Hence no need for hwdom to have
that many by default, even if this doesn't result in any extra
resource consumption.

That said, I can see the current default of 256 being too low though.
Quite likely in the absence of a user specified value the default
ought to be derived from nr_irqs - nr_static_irqs rather than being
any fixed number. Considering the default used for nr_irqs, I'd think
along the lines of sqrt(num_present_cpus()) * NR_DYNAMIC_VECTORS
or dom0->max_vcpus * NR_DYNAMIC_VECTORS (or the minimum of
the two) for x86.

Jan
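
As a worked example of the arithmetic suggested above, here is a small sketch that takes the minimum of the two proposed values. The function names are hypothetical, and the number of dynamic vectors is passed in as a parameter rather than taken from Xen's NR_DYNAMIC_VECTORS, so nothing here claims to match the hypervisor's actual constants.

```c
#include <assert.h>

/* Integer square root (floor), so the sketch does not need libm. */
static unsigned int isqrt(unsigned int n)
{
    unsigned int r = 0;

    while ((unsigned long long)(r + 1) * (r + 1) <= n)
        r++;
    return r;
}

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * Hypothetical default for the hardware domain:
 * min(sqrt(present cpus), dom0 vcpus) * dynamic vectors.
 */
static unsigned int default_hwdom_extra_irqs(unsigned int present_cpus,
                                             unsigned int dom0_vcpus,
                                             unsigned int dyn_vectors)
{
    return min_uint(isqrt(present_cpus), dom0_vcpus) * dyn_vectors;
}
```

For instance, with 16 present CPUs, 8 dom0 vCPUs and an assumed 192 dynamic vectors, the sketch yields 4 * 192 = 768, well above the fixed 256 but far below "thousands".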

 The extra_guest_irqs command line option now only allows changes to
 the domU value.  Any argument for dom0 is ignored.
 
 Signed-off-by: David Vrabel david.vra...@citrix.com
 ---
  docs/misc/xen-command-line.markdown |   11 ---
  xen/common/domain.c |7 +--
  2 files changed, 5 insertions(+), 13 deletions(-)
 
 diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
 index 0866df2..d352031 100644
 --- a/docs/misc/xen-command-line.markdown
 +++ b/docs/misc/xen-command-line.markdown
 @@ -594,15 +594,12 @@ except for debugging purposes.
  Force or disable use of EFI runtime services.
  
  ### extra\_guest\_irqs
 - `= [domU number][,dom0 number]`
 + `= [number]`
  
 - Default: `32,256`
 + Default: `32`
  
 -Change the number of PIRQs available for guests.  The optional first number is
 -common for all domUs, while the optional second number (preceded by a comma)
 -is for dom0.  Changing the setting for domU has no impact on dom0 and vice
 -versa.  For example to change dom0 without changing domU, use
 -`extra_guest_irqs=,512`
 +Change the number of PIRQs available for guests. This limit does not
 +apply to hardware domains (dom0).
  
  ### flask\_enabled
   `= integer`
 diff --git a/xen/common/domain.c b/xen/common/domain.c
 index 4a62c1d..a88d829 100644
 --- a/xen/common/domain.c
 +++ b/xen/common/domain.c
 @@ -231,14 +231,11 @@ static int late_hwdom_init(struct domain *d)
  #endif
  }
  
 -static unsigned int __read_mostly extra_dom0_irqs = 256;
  static unsigned int __read_mostly extra_domU_irqs = 32;
  static void __init parse_extra_guest_irqs(const char *s)
  {
      if ( isdigit(*s) )
          extra_domU_irqs = simple_strtoul(s, &s, 0);
 -    if ( *s == ',' && isdigit(*++s) )
 -        extra_dom0_irqs = simple_strtoul(s, &s, 0);
  }
  custom_param("extra_guest_irqs", parse_extra_guest_irqs);
  
 @@ -324,10 +321,8 @@ struct domain *domain_create(
      atomic_inc(&d->pause_count);
  
      if ( !is_hardware_domain(d) )
 -        d->nr_pirqs = nr_static_irqs + extra_domU_irqs;
 +        d->nr_pirqs = min(nr_static_irqs + extra_domU_irqs, nr_irqs);
      else
 -        d->nr_pirqs = nr_static_irqs + extra_dom0_irqs;
 -    if ( d->nr_pirqs > nr_irqs )
          d->nr_pirqs = nr_irqs;
  
  radix_tree_init(d-pirq_tree);
 -- 
 1.7.10.4
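
For readers unfamiliar with the option syntax, the pre-patch `[domU number][,dom0 number]` format and the clamp against the system-wide limit can be sketched in user-space C as follows. This is an illustration only: it uses the standard library's strtoul in place of Xen's simple_strtoul, and the struct and function names are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

struct irq_limits {
    unsigned int domu;  /* extra PIRQs for every domU */
    unsigned int dom0;  /* extra PIRQs for dom0 */
};

/* Parse "[domU number][,dom0 number]"; absent fields keep their defaults. */
static void parse_extra_guest_irqs(const char *s, struct irq_limits *lim)
{
    char *end;

    if (isdigit((unsigned char)*s)) {
        lim->domu = (unsigned int)strtoul(s, &end, 0);
        s = end;
    }
    if (*s == ',' && isdigit((unsigned char)*++s))
        lim->dom0 = (unsigned int)strtoul(s, &end, 0);
}

/* Clamp a domain's PIRQ count to the system-wide IRQ limit. */
static unsigned int clamp_nr_pirqs(unsigned int nr_static, unsigned int extra,
                                   unsigned int nr_irqs)
{
    unsigned int n = nr_static + extra;

    return n > nr_irqs ? nr_irqs : n;
}
```

With defaults of 32 and 256, the string `,512` changes only the dom0 value, which is the case the removed documentation text describes.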





Re: [Xen-devel] PVH cleanups after 4.5

2014-12-05 Thread Tim Deegan

At 09:20 + on 05 Dec (1417767654), Jan Beulich wrote:
  On 04.12.14 at 18:25, t...@xen.org wrote:
  Potential feature flags, based on whiteboard notes at the session.
  Things that are 'Yes' in both columns might not need actual flags :)
  
   'HVM'   'PVH'
  64bit hypercalls  Yes Yes
  32bit hypercalls  Yes No
 
 Iiuc the lack of support of 32-bit hypercalls is simply because PVH
 guests aren't expected to use them as being always 64-bit right
 now. I.e. I can't really see why we couldn't just enable them once
 the 64-bit hypercall tables got combined, in which case we wouldn't
 need a feature flag here either.

Agreed -- I think the same will apply to a few other things, like shadow
pagetables and some of the other MM tricks.  

Tim.


Re: [Xen-devel] [libvirt test] 32083: regressions - FAIL

2014-12-05 Thread Ian Campbell

On Thu, 2014-12-04 at 18:24 +, xen.org wrote:
 flight 32083 libvirt real [real]
 http://www.chiark.greenend.org.uk/~xensrcts/logs/32083/
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  build-i386-libvirt            5 libvirt-build             fail REGR. vs. 32005
  build-amd64-libvirt           5 libvirt-build             fail REGR. vs. 32005
  build-armhf-libvirt           5 libvirt-build             fail REGR. vs. 32005

See
https://www.redhat.com/archives/libvir-list/2014-December/msg00082.html

I replied at:
https://www.redhat.com/archives/libvir-list/2014-December/msg00327.html

Not sure, but I think the answer will be for us to add libxml-xpath-perl
to the set of packages which we install in the build environment.

Ian.



Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-05 Thread Olaf Hering

On Tue, Dec 02, Wei Liu wrote:

 AC_CHECK_LIB fails on Debian Jessie since the ld flag it generates is
 incorrect, even in the event systemd library is available.  Use
 PKG_CHECK_MODULES instead.
 
 Tested on Debian Jessie and Arch Linux.

I just tested this and got this failure. The reason is that the LDFLAGS come
before the objects. If I move LDFLAGS after $^ linking works. Will send a patch
to fix the failure.

Olaf

make[3]: Entering directory '/work/olaf/factory/github/olafhering/xen.git/tools/xenstore'
gcc -Wl,-rpath,/opt/xen/upstream/staging-honor_prefix/lib64 -lsystemd xenstored_core.o xenstored_watch.o xenstored_domain.o xenstored_transaction.o xs_lib.o talloc.o utils.o tdb.o hashtable.o xenstored_posix.o /work/olaf/factory/github/olafhering/xen.git/tools/xenstore/../../tools/libxc/libxenctrl.so -o xenstored
xenstored_core.o: In function `xs_validate_active_socket':
xenstored_core.c:(.text.unlikely+0x38): undefined reference to `sd_notifyf'
xenstored_core.c:(.text.unlikely+0x59): undefined reference to `sd_is_socket_unix'
xenstored_core.c:(.text.unlikely+0x77): undefined reference to `sd_is_socket_unix'
xenstored_core.o: In function `main':
xenstored_core.c:(.text.startup+0x1df): undefined reference to `sd_booted'
xenstored_core.c:(.text.startup+0x23c): undefined reference to `sd_booted'
xenstored_core.c:(.text.startup+0x25b): undefined reference to `sd_listen_fds'
xenstored_core.c:(.text.startup+0x563): undefined reference to `sd_booted'
xenstored_core.c:(.text.startup+0x8f9): undefined reference to `sd_notifyf'
xenstored_core.c:(.text.startup+0x958): undefined reference to `sd_notifyf'
xenstored_core.c:(.text.startup+0xb0d): undefined reference to `sd_notify'
collect2: error: ld returned 1 exit status
Makefile:80: recipe for target 'xenstored' failed
make[3]: *** [xenstored] Error 1



Re: [Xen-devel] A few EFI code questions

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 09:47 +, Jan Beulich wrote:
  On 05.12.14 at 10:33, ian.campb...@citrix.com wrote:
  On Fri, 2014-12-05 at 07:37 +, Jan Beulich wrote:
   On 04.12.14 at 22:22, roy.fr...@linaro.org wrote:
   On Thu, Dec 4, 2014 at 1:35 AM, Jan Beulich jbeul...@suse.com wrote:
   On 03.12.14 at 22:02, daniel.ki...@oracle.com wrote:
   3) Should not we change xen/arch/*/efi/efi-boot.h to
  xen/arch/*/efi/efi-boot.c? efi-boot.h contains more
  code than definitions, declarations and short static
  functions. So, I think that it is more regular *.c file
  than header file.
  
   That's a matter of taste - I'd probably have made it .c too, but
   didn't mind it being .h as done by Roy (presumably on the basis
   that #include directives are preferred to have .h files as their
   operands). The only thing I regret is that I didn't ask for the
   pointless efi- prefix to be dropped.
   
   I don't mind a change here, and I agree that it is more like a .c file
   than a .h.  If a name change is done, is it worth dropping the efi- at
   the same time?
  
  If we indeed want to change the name (post 4.5), making both
  adjustments at once would be kind of a requirement of mine.
  
  Random thought: *.inc for .c files which happen to be embedded into
  another using #include?
 
 That may conflict with certain editors' language detection, as .inc
 may have other meanings (in the x86 Windows world I'd expect this
 to be an assembler include file for example).

Oh, so does my emacs apparently (a leftover .emacs snippet from a
previous life...). Never mind that suggestion then.

The existing comment at the top of the included files is probably
sufficient.

Ian.



Re: [Xen-devel] PVH cleanups after 4.5

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 10:49 +0100, Tim Deegan wrote:
 At 09:20 + on 05 Dec (1417767654), Jan Beulich wrote:
   On 04.12.14 at 18:25, t...@xen.org wrote:
   Potential feature flags, based on whiteboard notes at the session.
   Things that are 'Yes' in both columns might not need actual flags :)
   
'HVM'   'PVH'
   64bit hypercalls  Yes Yes
   32bit hypercalls  Yes No
  
  Iiuc the lack of support of 32-bit hypercalls is simply because PVH
  guests aren't expected to use them as being always 64-bit right
  now. I.e. I can't really see why we couldn't just enable them once
  the 64-bit hypercall tables got combined, in which case we wouldn't
  need a feature flag here either.
 
 Agreed -- I think the same will apply to a few other things, like shadow
 pagetables and some of the other MM tricks.  

Might we want to constrain a given PVH domain to only make 32- or 64-bit
hypercalls?

Or do we consider already having crossed that bridge with HVM enough
reason to allow it for PVH? I wonder whether, even if it is
technically possible not to constrain this, doing so might mitigate some
potential security issues down the line. There's obviously a tradeoff
against in-guest flexibility though.

Ian.



Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 10:51 +0100, Olaf Hering wrote:
 On Tue, Dec 02, Wei Liu wrote:
 
  AC_CHECK_LIB fails on Debian Jessie since the ld flag it generates is
  incorrect, even in the event systemd library is available.  Use
  PKG_CHECK_MODULES instead.
  
  Tested on Debian Jessie and Arch Linux.
 
 I just tested this and got this failure. The reason is that the LDFLAGS come
 before the objects. If I move LDFLAGS after $^ linking works. Will send a
 patch to fix the failure.

Was this a new failure with this change? AFAICT LDFLAGS is still set
(via SYSTEMD_LIBS) in the same place relative to non-systemd stuff.

FWIW the reason I don't see this in my pre-commit test is that I don't
have the systemd headers installed.




[Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-05 Thread Bob Liu

Hey folks,

In recent months I've been working on speeding up the 'xl des' time of Xen
guests with large RAM, but there is still no good solution.

I'm looking forward to more suggestions and would appreciate all of
your input.

(1) The problem
When running 'xl destroy' on a guest with large memory, we have to wait a long
time (~10 minutes for a guest with 1TB of memory). Most of the time is
spent on page scrubbing: every page needs to be scrubbed before it is freed to
the heap_list (the buddy system).

(2) The way I've tried
1. When freeing a page to the buddy system, only mark it with a new flag
'need_scrub' instead of scrubbing it, so 'xl des' can return quickly.

2. Use all idle cpus to do the real page scrubbing in parallel. In:
static void idle_loop(void)
{
    /* iterate the heap_list and scrub any 'need_scrub' page */
}

3. Also in the alloc_heap_page() path, 'need_scrub' pages can be
allocated and scrubbed. (If 'need_scrub' pages were skipped, 'xl create' of a
new guest might fail when the system is busy, since no idle cpus could finish
the scrubbing.)

4. Problem with this approach: lock contention
The heap_list is protected by heap_lock, which is a spinlock.
The alloc/free path may modify the heap list at any time with heap_lock held.

idle_loop() needs to iterate the heap list for every page it scrubs
(it won't modify the list but will scrub page content), so there is heavy lock
contention, which slows down the alloc/free path.

5. Potential workaround
5.1 Use per-cpu list in idle_loop()
Delist a batch of pages from heap_list to a per-cpu list, then scrub the
per-cpu list and free back to heap_list.

But Jan disagreed with this solution:
"You should really drop the idea of removing pages temporarily.
All you need to do is make sure a page being allocated and getting
simultaneously scrubbed by another CPU won't get passed to the
caller until the scrubbing finished."

Another reason was that it's hard to say how many pages should be delisted to
the per-cpu list.

5.2 Use more page flags
Konrad suggested using more page flags and considering the 'cmpxchg'
instruction instead of a spinlock for idle_loop() to iterate the heap_list.
But 'cmpxchg' is only suitable for protecting the content of a single
page; it's difficult to protect against the various race conditions on a list.
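
To make the per-page part of 5.2 concrete, here is a minimal sketch using C11 atomics, where a compare-and-exchange lets exactly one CPU claim a dirty page for scrubbing and the allocator only hands a page out once it is clean. The state names, struct layout and function names are all hypothetical, and, as noted above, this says nothing about how to walk the heap list safely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

enum { PG_CLEAN, PG_DIRTY, PG_SCRUBBING };

struct page {
    atomic_int state;                       /* PG_CLEAN, PG_DIRTY or PG_SCRUBBING */
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Try to take ownership of a dirty page; only one caller can win. */
static bool claim_for_scrub(struct page *pg)
{
    int expected = PG_DIRTY;

    return atomic_compare_exchange_strong(&pg->state, &expected, PG_SCRUBBING);
}

/* Zero the page and publish it as clean. The caller must have claimed it. */
static void scrub_and_release(struct page *pg)
{
    memset(pg->data, 0, sizeof(pg->data));
    atomic_store(&pg->state, PG_CLEAN);
}

/* Allocation-side check: a page may be handed out only once it is clean. */
static bool page_ready(struct page *pg)
{
    if (claim_for_scrub(pg))
        scrub_and_release(pg);  /* scrub synchronously if nobody else has started */
    return atomic_load(&pg->state) == PG_CLEAN;
}
```

A page that another CPU is still scrubbing reports "not ready", which matches the requirement quoted from Jan that such a page must not reach the caller until scrubbing has finished.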

(3) Other solutions for speeding up page scrubbing
1. George suggested:
* Have a clean freelist and a dirty freelist
* When destroying a domain, simply move pages to the dirty freelist
* Have idle vcpus scrub the dirty freelist before going to sleep
 - ...

* In alloc_domheap_pages():
 - If there are pages on the clean freelist, allocate them
 - If there are no pages on the clean freelist but there are on the
dirty freelist, scrub pages from the dirty freelist synchronously.

But lock contention is still a problem and may get worse with two lists.

2. Delay page scrubbing to the page fault path
This means a 'need_scrub' page won't be scrubbed until the page table
mapping is set up in the page fault path. This is a popular approach under Linux.
But Konrad mentioned that this approach is not suitable for Windows guests,
because Windows accesses every page during boot, so the boot time of
Windows might be slowed down.
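
George's clean/dirty freelist idea from item 1 can be sketched as below. This is a single-threaded illustration with hypothetical names; it deliberately omits the locking that the mail identifies as the real difficulty, and it uses plain singly linked lists rather than Xen's buddy allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

struct free_page {
    struct free_page *next;
    unsigned char data[SKETCH_PAGE_SIZE];
};

struct heap {
    struct free_page *clean;  /* scrubbed, ready to allocate */
    struct free_page *dirty;  /* freed by a dying domain, not yet scrubbed */
};

static void push(struct free_page **list, struct free_page *pg)
{
    pg->next = *list;
    *list = pg;
}

static struct free_page *pop(struct free_page **list)
{
    struct free_page *pg = *list;

    if (pg)
        *list = pg->next;
    return pg;
}

/* Domain destruction: just queue the page, no scrubbing on this path. */
static void free_page_deferred(struct heap *h, struct free_page *pg)
{
    push(&h->dirty, pg);
}

/* Idle work: scrub one dirty page and move it to the clean list. */
static int scrub_one(struct heap *h)
{
    struct free_page *pg = pop(&h->dirty);

    if (!pg)
        return 0;
    memset(pg->data, 0, sizeof(pg->data));
    push(&h->clean, pg);
    return 1;
}

/* Allocation: prefer clean pages, otherwise scrub a dirty one synchronously. */
static struct free_page *alloc_page(struct heap *h)
{
    struct free_page *pg = pop(&h->clean);

    if (!pg && scrub_one(h))
        pg = pop(&h->clean);
    return pg;
}
```

In this shape the destroy path does constant work per page, while the cost of scrubbing moves either to idle time or to the first allocation that finds the clean list empty.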

Any better ideas are welcome, and thanks again for your patience in reading
this long email.

-- 
Regards,
-Bob


Re: [Xen-devel] [PATCH for-4.5] systemd: use pkg-config to determine systemd library availability

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Ian Campbell wrote:

 On Fri, 2014-12-05 at 10:51 +0100, Olaf Hering wrote:
  On Tue, Dec 02, Wei Liu wrote:
  
   AC_CHECK_LIB fails on Debian Jessie since the ld flag it generates is
   incorrect, even in the event systemd library is available.  Use
   PKG_CHECK_MODULES instead.
   
   Tested on Debian Jessie and Arch Linux.
  
  I just tested this and got this failure. The reason is that the LDFLAGS come
  before the objects. If I move LDFLAGS after $^ linking works. Will send a
  patch to fix the failure.
 
 Was this a new failure with this change? AFAICT LDFLAGS is still set
 (via SYSTEMD_LIBS) in the same place relative to non-systemd stuff.

No, happens even without it. I just realized that I missed a git rebase.
My own packages do not have systemd-devel yet, so I did not spot this
earlier. Maybe it just happens with the latest toolchain in Factory.
Last time it worked well in 13.1 at least, SLE12 was ok as well.

Olaf


Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute

2014-12-05 Thread David Vrabel

On 04/12/14 15:39, Alex Williamson wrote:
 
 I don't know what workaround you're talking about.  As devices are
 released from the user, vfio-pci attempts to reset them.  If
 pci_reset_function() returns success we mark the device clean, otherwise
 it gets marked dirty.  Each time a device is released, if there are
 dirty devices we test whether we can try a bus/slot reset to clean them.
 In the case of assigning a GPU this typically means that the GPU or
 audio function come through first, there's no reset mechanism so it gets
 marked dirty, the next device comes through and we manage to try a bus
 reset.  vfio-pci does not have any device specific resets, all
 functionality is added to the PCI-core, thank-you-very-much.  I even
 posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
 bad so that pci_reset_function() won't claim that worked.  All VGA
 access quirks are done in QEMU, the kernel doesn't have any business in
 remapping config space over MMIO regions or trapping other config space
 backdoors.

Thanks for the info Alex, I hadn't got around to actually looking at
the vfio-pci code and was just going on what Sander said.

We probably do need to have a more in-depth look at how PCI devices are
handled by both the toolstack and pciback, but in the short term I would
like a simple solution that does not extend the ABI.

David


Re: [Xen-devel] [PATCH v4 8/9] libxl: soft reset support

2014-12-05 Thread Vitaly Kuznetsov

Wei Liu wei.l...@citrix.com writes:

 (I've skipped the internal implementation since I don't know what's
  required to fulfil soft reset.)

 On Wed, Dec 03, 2014 at 06:16:20PM +0100, Vitaly Kuznetsov wrote:
 [...]
 + libxl__domain_create_state *dcs);
  
  /* Each time the dm needs to be saved, we must call suspend and then save */
  _hidden int libxl__domain_suspend_device_model(libxl__gc *gc,
 diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
 index 53611dc..eb833f0 100644
 --- a/tools/libxl/xl_cmdimpl.c
 +++ b/tools/libxl/xl_cmdimpl.c
 @@ -2043,7 +2043,8 @@ static void reload_domain_config(uint32_t domid,
  }
  
  /* Returns 1 if domain should be restarted,
 - * 2 if domain should be renamed then restarted, or 0
 + * 2 if domain should be renamed then restarted,
 + * 3 if domain performed soft reset, or 0
   * Can update r_domid if domain is destroyed etc */
  static int handle_domain_death(uint32_t *r_domid,
 libxl_event *event,
 @@ -2069,6 +2070,9 @@ static int handle_domain_death(uint32_t *r_domid,
  case LIBXL_SHUTDOWN_REASON_WATCHDOG:
  action = d_config->on_watchdog;
  break;
 +case LIBXL_SHUTDOWN_REASON_SOFT_RESET:
 +LOG("Domain performed soft reset.");
 +return 3;

 Would it be useful to provide on_soft_reset option in xl? Will the
 admin be interested in performing some other action when domain does
 soft reset? Say, for security reason admin want to prohibit domain from
 soft resetting itself.


Makes sense, let's add it.

  default:
  LOG("Unknown shutdown reason code %d. Destroying domain.",
  event->u.domain_shutdown.shutdown_reason);
 @@ -2285,6 +2289,7 @@ static void evdisable_disk_ejects(libxl_evgen_disk_eject **diskws,
  static uint32_t create_domain(struct domain_create *dom_info)
  {
  uint32_t domid = INVALID_DOMID;
 +uint32_t domid_old = INVALID_DOMID;
  
  libxl_domain_config d_config;
  
 @@ -2510,7 +2515,18 @@ start:
   * restore/migrate-receive it again.
   */
  restoring = 0;
 -}else{
 +} else if (domid_old != INVALID_DOMID) {
 +/* Do soft reset */
 +d_config.b_info.nodemap.size = 0;

 What's the reason for doing this?

 If you encounter problem with this it should probably be fixed in
 libxl.

Ah, sorry, I forgot about this hackaround (which has been required since
194e7183, if I'm not mistaken). The root cause is that
reload_domain_config() was missing on the soft_reset path and we were
hitting the "Can run NUMA placement only if the domain does not have any
NUMA node affinity set already" clause.

I will fix this along with the on_soft_reset implementation.


 Wei.

 +ret = libxl_domain_soft_reset(ctx, d_config,
 +  domid, domid_old,
 +  0, 0);
 +
 +if ( ret ) {
 +goto error_out;
 +}
 +domid_old = INVALID_DOMID;
 +} else {
  ret = libxl_domain_create_new(ctx, d_config, domid,
0, autoconnect_console_how);
  }
 @@ -2574,6 +2590,8 @@ start:
  event->u.domain_shutdown.shutdown_reason,
  event->u.domain_shutdown.shutdown_reason);
  switch (handle_domain_death(&domid, event, &d_config)) {
 +case 3:
 +domid_old = domid;
  case 2:
  if (!preserve_domain(&domid, event, &d_config)) {
  /* If we fail then exit leaving the old domain in place. */
 -- 
 1.9.3
 
 

-- 
  Vitaly


Re: [Xen-devel] xen/arm: uart interrupts handling

2014-12-05 Thread Julien Grall


Hi Vijay,

On 05/12/2014 00:46, Vijay Kilari wrote:
  Yes, this is the behaviour that I am seeing. In Linux, the uart driver
  masks the TXI interrupt in IMSC if the buffer is empty. However, in Xen
  this scenario is not handled. This is the reason why the cpu does not
  come out of the uart irq routine if a TX interrupt is raised but the
  buffer is empty.

  I have added below changes to fix this on top of your suggested change


Can you send a formal patch (commit message + signed-off-by)?

Also, you will have to make sure you don't break the other serial drivers.

Regards,

--
Julien Grall


Re: [Xen-devel] [PATCH] tools/xenstore: fix link error with libsystemd

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 11:49 +0100, Olaf Hering wrote:
 Linking fails with undefined reference to the used systemd functions.
 Move LDFLAGS after the object files to fix the failure.
 
 Signed-off-by: Olaf Hering o...@aepfle.de
 Cc: Ian Jackson ian.jack...@eu.citrix.com
 Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com

Acked-by: Ian Campbell ian.campb...@citrix.com

This should go into 4.5.

FWIW my suspicion is that this relates to toolstacks using --as-needed
by default.

 Cc: Wei Liu wei.l...@citrix.com
 ---
  tools/xenstore/Makefile | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
 index bff9b25..11b6a06 100644
 --- a/tools/xenstore/Makefile
 +++ b/tools/xenstore/Makefile
 @@ -74,10 +74,10 @@ endif
  init-xenstore-domain.o: CFLAGS += $(CFLAGS_libxenguest)
  
  init-xenstore-domain: init-xenstore-domain.o $(LIBXENSTORE)
 - $(CC) $(LDFLAGS) $^ $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) 
 $(LDLIBS_libxenstore) -o $@ $(APPEND_LDFLAGS)
 + $(CC) $^ $(LDFLAGS) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) 
 $(LDLIBS_libxenstore) -o $@ $(APPEND_LDFLAGS)
  
  xenstored: $(XENSTORED_OBJS)
 - $(CC) $(LDFLAGS) $^ $(LDLIBS_libxenctrl) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
 + $(CC) $^ $(LDFLAGS) $(LDLIBS_libxenctrl) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
  
  xenstored.a: $(XENSTORED_OBJS)
   $(AR) cr $@ $^
 @@ -86,13 +86,13 @@ $(CLIENTS): xenstore
   ln -f xenstore $@
  
  xenstore: xenstore_client.o $(LIBXENSTORE)
 - $(CC) $(LDFLAGS) $< $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
 + $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
  
  xenstore-control: xenstore_control.o $(LIBXENSTORE)
 - $(CC) $(LDFLAGS) $< $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
 + $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(SOCKET_LIBS) -o $@ 
 $(APPEND_LDFLAGS)
  
  xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o
 - $(CC) $(LDFLAGS) $^ -o $@ $(APPEND_LDFLAGS)
 + $(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS)
  
  libxenstore.so: libxenstore.so.$(MAJOR)
   ln -sf $< $@




Re: [Xen-devel] [PATCH for-4.5] xen/arm: Correct the opcode for BUG_INSTR on arm32

2014-12-05 Thread Ian Campbell

On Thu, 2014-12-04 at 14:34 -0500, Konrad Rzeszutek Wilk wrote:
 On Thu, Dec 04, 2014 at 07:26:55PM +, Julien Grall wrote:
  A 0 was forgotten when the arm32 BUG instruction opcode was added in
  commit 3e802c6ca1fb9a9549258c2855a57cad483f3cbd "xen/arm: Correctly
  support WARN_ON".
  
  This results in a valid instruction being used (mcreq 0, 3, r0, cr15,
  cr0, {7}), which inhibits usage of BUG/WARN_ON and co.

Doh!

  
  Signed-off-by: Julien Grall julien.gr...@linaro.org
  
  ---
  
  Not sure why I dropped the 0 when I implemented the patch...
  This is a bug fix for Xen 4.5. It only affects ARM32, where the
  BUG opcode was malformed.
  
  With the malformed opcode, the ASSERT/BUG_ON is skipped and the
  processor may execute another patch (because the compiler has optimized
 
 s/patch/path/ ?

Will fix on commit.

  due the unreachable in both macro).
  
  The code modified is only executed when Xen is in bad state.
 
 Release-Acked-by: Konrad Rzeszutek Wilk konrad.w...@oracle.com 

Acked-by: Ian Campbell ian.campb...@citrix.com

 
  ---
   xen/include/asm-arm/arm32/bug.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/xen/include/asm-arm/arm32/bug.h 
  b/xen/include/asm-arm/arm32/bug.h
  index 155b420..3e66f35 100644
  --- a/xen/include/asm-arm/arm32/bug.h
  +++ b/xen/include/asm-arm/arm32/bug.h
  @@ -6,7 +6,7 @@
   /* ARMv7 provides a list of undefined opcode (see A8.8.247 DDI 0406C.b)
* Use one them encoding A1 to go in exception mode
*/
  -#define BUG_OPCODE  0xe7f00f0
  +#define BUG_OPCODE  0xe7f000f0
   
   #define BUG_INSTR .word  __stringify(BUG_OPCODE)
   
  -- 
  2.1.3
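
For readers following the thread, the effect of the dropped 0 can be checked with a small sketch. It assumes only the A1 encoding of the permanently undefined instruction (condition 0b1110, bits [27:20] = 0x7f, bits [7:4] = 0xf, both immediate fields free). The helper names below are hypothetical and are not part of the Xen tree.

```c
#include <assert.h>
#include <stdint.h>

/* Condition field: bits [31:28] of an A32 instruction word. */
uint32_t arm32_cond(uint32_t insn)
{
    return insn >> 28;
}

/*
 * Encoding A1 of the permanently undefined instruction:
 * cond = 0b1110, bits [27:20] = 0x7f, bits [7:4] = 0xf.
 * The immediate fields (bits [19:8] and [3:0]) are masked out.
 */
int arm32_is_udf_a1(uint32_t insn)
{
    return (insn & 0xfff000f0u) == 0xe7f000f0u;
}

/* Hypothetical sanity check for a candidate BUG opcode. */
int bug_opcode_is_sane(uint32_t insn)
{
    int ok = arm32_is_udf_a1(insn);

    /* Any word matching the mask carries the "always" condition. */
    assert(!ok || arm32_cond(insn) == 0xeu);
    return ok;
}
```

With the malformed value 0xe7f00f0 the word is really 0x0e7f00f0, so the condition field is 0 rather than 0xe and the check fails; with 0xe7f000f0 it passes.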
  
 

Re: [Xen-devel] PVH cleanups after 4.5

2014-12-05 Thread David Vrabel

On 04/12/14 17:25, Tim Deegan wrote:
                        'HVM'   'PVH'
 64bit hypercalls       Yes     Yes
 32bit hypercalls       Yes     No
 Paging assistance      Yes     Yes
 ioreq-servers          Yes     No

Perhaps, but no default one.  This
would be required for supporting virtual GPU passthrough to a PVH guest.

 HVM-style CPUID        Yes     Yes
 Interrupt controllers  Yes     No ([x2]apic, ioapic, pic &c)

Yes, if enough APIC virtualization
hardware is available.

 Timers                 Yes     No (rtc, hpet, pit, pmtimer)
 Machine Check regs     Yes     Yes
 Emulated PCI           Yes     No (PVH to use pcifront?)

David


Re: [Xen-devel] Removing the PVH assert in arch/x86/hvm/io.c:87

2014-12-05 Thread Roger Pau Monné

On 05/12/14 at 10.15, Jan Beulich wrote:
 On 04.12.14 at 17:35, roger@citrix.com wrote:
 I've just stumbled upon this assert while testing PVH on different
 hardware. It was added in 7c4870 as a safety belt, but it turns out INS
 and OUTS go through handle_mmio. So using these instructions from a PVH
 guest basically kills Xen.

 I've removed it and everything seems fine, so I'm considering sending a
 patch for 4.5 in order to have it removed. I think the path that could
 trigger the crash because of the missing vioapic stuff is already
 guarded by the other chunk added in the same patch.
 
 Iirc we settled on forbidding paths to handle_mmio() for PVH (hence
 the ASSERT()). Sadly you provide way too little detail on what is
 actually happening in your case: What's the use case of to-be-
 emulated INS/OUTS in a PVH kernel?

In this specific situation I'm seeing insw instructions executed by the
FreeBSD ATA layer:

http://fxr.watson.org/fxr/source/dev/ata/ata-lowlevel.c#L740

 What's the call tree that gets
 you into handle_mmio(), considering that both calls to
 handle_mmio_with_translation() from hvm_hap_nested_page_fault()
 as well as the one to handle_mmio() ought to be unreachable for PVH?

You can get there from vmx_vmexit_handler if the exit reason is
EXIT_REASON_IO_INSTRUCTION.

Roger.



Re: [Xen-devel] Install Xen on ARM in a bare metal fashion on a Nexus Phone/Tablet or an ARM emulator

2014-12-05 Thread Julien Grall


Hello,

On 05/12/2014 09:32, Ian Campbell wrote:

On Fri, 2014-12-05 at 14:51 +0530, Sagun Garg wrote:
Not sure what sorts of risks you mean, I don't think there is anything
Xen specific here, just the usual stuff with running an untested OS on
any new platform.

I don't know if Nexus devices are brickable, but if so then that might
be an issue with trying any untested OS on them.


Nexus platforms tend to be good platforms for development. I would be 
surprised if you could brick one with an untested OS. That said, I would 
not try to change the bootloader, and I don't know if they support HYP mode.




Phones and the like aren't typically very good debug platforms (i.e. no
serial, no JTAG etc) so running an untested OS on them can end up being
hard (but not impossible) to debug if it doesn't work, that's why
platforms such as the Arndale exist -- they are mobile phones with all
the extra useful debug stuff brought out to headers.


Nexus smartphones (at least the Nexus 4) have a UART hidden in the
headphone jack. I don't know if it's possible to buy the specific cable,
but you should be able to build your own. I haven't tried it myself, though.


The Linux kernel provided on AOSP has the code to enable the UART. So 
you should be able to get the log.


For Xen, you will have to implement the debug UART yourself.

--
Julien Grall


Re: [Xen-devel] Removing the PVH assert in arch/x86/hvm/io.c:87

2014-12-05 Thread David Vrabel

On 05/12/14 11:07, Roger Pau Monné wrote:
 On 05/12/14 at 10.15, Jan Beulich wrote:
 On 04.12.14 at 17:35, roger@citrix.com wrote:
 I've just stumbled upon this assert while testing PVH on different
 hardware. It was added in 7c4870 as a safety belt, but it turns out INS
 and OUTS go through handle_mmio. So using these instructions from a PVH
 guest basically kills Xen.

 I've removed it and everything seems fine, so I'm considering sending a
 patch for 4.5 in order to have it removed. I think the path that could
 trigger the crash because of the missing vioapic stuff is already
 guarded by the other chunk added in the same patch.

 Iirc we settled on forbidding paths to handle_mmio() for PVH (hence
 the ASSERT()). Sadly you provide way too little detail on what is
 actually happening in your case: What's the use case of to-be-
 emulated INS/OUTS in a PVH kernel?
 
 In this specific situation I'm seeing insw instructions executed by the
 FreeBSD ATA layer:
 
 http://fxr.watson.org/fxr/source/dev/ata/ata-lowlevel.c#L740

Why are you running this device driver at all in a PVH guest?  It should
only be using PV block devices.

David


Re: [Xen-devel] Removing the PVH assert in arch/x86/hvm/io.c:87

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 12:07, roger@citrix.com wrote:
 On 05/12/14 at 10.15, Jan Beulich wrote:
 On 04.12.14 at 17:35, roger@citrix.com wrote:
 I've just stumbled upon this assert while testing PVH on different
 hardware. It was added in 7c4870 as a safety belt, but it turns out INS
 and OUTS go through handle_mmio. So using these instructions from a PVH
 guest basically kills Xen.

 I've removed it and everything seems fine, so I'm considering sending a
 patch for 4.5 in order to have it removed. I think the path that could
 trigger the crash because of the missing vioapic stuff is already
 guarded by the other chunk added in the same patch.
 
 Iirc we settled on forbidding paths to handle_mmio() for PVH (hence
 the ASSERT()). Sadly you provide way too little detail on what is
 actually happening in your case: What's the use case of to-be-
 emulated INS/OUTS in a PVH kernel?
 
 In this specific situation I'm seeing insw instructions executed by the
 FreeBSD ATA layer:
 
 http://fxr.watson.org/fxr/source/dev/ata/ata-lowlevel.c#L740 
 
 What's the call tree that gets
 you into handle_mmio(), considering that both calls to
 handle_mmio_with_translation() from hvm_hap_nested_page_fault()
 as well as the one to handle_mmio() ought to be unreachable for PVH?
 
 You can get there from vmx_vmexit_handler if the exit reason is
 EXIT_REASON_IO_INSTRUCTION.

A PVH guest without passed through device shouldn't access I/O
ports in the first place. Are you trying to hand an IDE or SATA
controller to the guest? Or is this happening with just Dom0, in
which case I'd suspect the I/O bitmap isn't being set up properly,
thus causing a VM exit when none is needed?

And yes, guarding the EXIT_REASON_IO_INSTRUCTION handling
in vmx_vmexit_handler() against PVH would seem necessary,
directing control flow to exit_and_crash. I'm pretty certain I had
pointed this out while reviewing the original PVH series.

Jan



Re: [Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-05 Thread Andrew Cooper

On 05/12/14 09:44, Jan Beulich wrote:
 On 03.12.14 at 17:04, david.vra...@citrix.com wrote:
 The default limit for the number of PIRQs for hardware domains (dom0)
 is not sufficient for some (x86) systems.

 Since the pirq structures are individually and dynamically allocated,
 the limit for hardware domains may be increased to the number of
 possible IRQs.
 I nevertheless disagree to moving the bound up to the Xen internal
 limit unconditionally: What use does it have to allow hwdom to use
 thousands of MSIs?

Because systems that big exist.  We have one.  In particular, it needs
somewhere between 288 and 512 pirqs to scan the bus and bring up the
physical functions alone.

 If a system got that many, the main purpose of
 running Xen on it I would expect to be to hand various of the
 respective devices to guests. Hence no need for hwdom to have
 that many by default, even if this doesn't result in any extra
 resource consumption.

 That said, I can see the current default of 256 being too low though.
 Quite likely in the absence of a user specified value the default
 ought to be derived from nr_irqs - nr_static_irqs rather than being
 any fixed number. Considering the default used for nr_irqs, I'd think
 along the lines of sqrt(num_present_cpus()) * NR_DYNAMIC_VECTORS
 or dom0-max_vcpus * NR_DYNAMIC_VECTORS (or the minimum of
 the two) for x86.
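
As a rough illustration of the default suggested in the quoted paragraph above, here is a minimal sketch. It assumes the three inputs are passed in as plain numbers; in Xen they would come from num_present_cpus(), dom0's max_vcpus and NR_DYNAMIC_VECTORS. The function names are hypothetical, and the sketch ignores both a user-specified value and any cap derived from nr_irqs - nr_static_irqs.

```c
#include <assert.h>

/* Integer square root: the largest r with r * r <= n. */
unsigned int isqrt(unsigned int n)
{
    unsigned int r = 0;

    while ( (unsigned long long)(r + 1) * (r + 1) <= n )
        r++;

    return r;
}

/*
 * Hypothetical default: the minimum of
 *   sqrt(present cpus) * dynamic vectors   and
 *   dom0 max vcpus     * dynamic vectors.
 */
unsigned int default_hwdom_pirqs(unsigned int present_cpus,
                                 unsigned int dom0_max_vcpus,
                                 unsigned int dynamic_vectors)
{
    unsigned int by_cpus, by_vcpus;

    assert(present_cpus > 0 && dom0_max_vcpus > 0 && dynamic_vectors > 0);

    by_cpus = isqrt(present_cpus) * dynamic_vectors;
    by_vcpus = dom0_max_vcpus * dynamic_vectors;

    return by_cpus < by_vcpus ? by_cpus : by_vcpus;
}
```

For example, with 64 present CPUs, 4 dom0 vCPUs and a placeholder of 100 dynamic vectors, the vCPU term (400) is the smaller one and becomes the default.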

The hardware domain is ultimately trusted.  It can, amongst other
things, rewrite the bootloader command line and replace xen.gz.  It can
be trusted not to maliciously waste Xen resources.

Having an arbitrary restriction on the hardware domain means only
that, in the case the arbitrary limit is hit, system devices fail to
function properly.  This is far more noticeable if the limit is hit
during probe.  The admin can edit the bootloader and increase the limit,
but only if the root disk's driver was lucky enough to get its
interrupt, or the default network card got its interrupts.

The limit serves no security or resource purpose, but has the chance of
crippling the boot of the system, and making recovery hard or
impossible.  On this justification alone, the limit should be removed.

~Andrew



[Xen-devel] [PATCH 0/5] tools/hotplug: systemd changes for 4.5

2014-12-05 Thread Olaf Hering

Konrad and Michael Young reported SELinux-related failures in
var-lib-xenstored.mount. The first patch tries to address this by
making it easier to change the value of XENSTORED_MOUNT_CTX.

It's not clear to me if the mount option "context=" should be
adjustable by configure --with-selinux-mount-context=VAL to simplify
building via "make rpmball", for example. Without such a new configure
option, it looks like every new "make install" or
"rpm -U --force dist/xen.rpm" would require a readjustment of
XENSTORED_MOUNT_CTX.

The remaining patches are a result of reviewing the service files.
They reference non-existent sysconfig files. We should fix this before
the release of 4.5 to avoid stale sysconfig files if someone wants to
adjust the values.

I have tested this series on openSUSE 13.1.

Please review and apply for 4.5.


Olaf

Olaf Hering (5):
  tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons
  tools/hotplug: use existing sysconfig file for xenconsoled
  tools/hotplug: remove EnvironmentFile from
xen-qemu-dom0-disk-backend.service
  tools/hotplug: remove XENSTORED_ROOTDIR from service file
  tools/hotplug: support XENSTORED_TRACE in systemd

 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 24 --
 tools/hotplug/Linux/init.d/xencommons.in   |  5 +++--
 .../Linux/systemd/var-lib-xenstored.mount.in   |  3 +--
 .../systemd/xen-qemu-dom0-disk-backend.service.in  |  1 -
 tools/hotplug/Linux/systemd/xenconsoled.service.in |  7 ++-
 tools/hotplug/Linux/systemd/xenstored.service.in   |  3 +--
 6 files changed, 29 insertions(+), 14 deletions(-)



[Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons

2014-12-05 Thread Olaf Hering

On a non-SELinux system the mount option context=none works fine. But
with SELinux enabled a proper value has to be defined. To simplify the
required adjustment move XENSTORED_MOUNT_CTX from the service file to
the sysconfig file.

There is no need to require the creation of a new sysconfig file, just
reuse the existing /etc/sysconfig/xencommons file.

Signed-off-by: Olaf Hering o...@aepfle.de
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 7 +++
 tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in | 3 +--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in 
b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index c12fc8a..3a34b33 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -40,3 +40,10 @@
 
 # qemu path
 #QEMU_XEN=@LIBEXEC_BIN@/qemu-system-i386
+
+## Type: string
+## Default: none
+#
+# SELinux context for @XEN_LIB_STORED@ mount point.
+# see mount(8) for the meaning of the context= option
+XENSTORED_MOUNT_CTX=none
diff --git a/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in 
b/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in
index d5e04db..65e0b79 100644
--- a/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in
+++ b/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in
@@ -6,8 +6,7 @@ ConditionPathExists=/proc/xen/capabilities
 RefuseManualStop=true
 
 [Mount]
-Environment=XENSTORED_MOUNT_CTX=none
-EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
+EnvironmentFile=@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 What=xenstore
 Where=@XEN_LIB_STORED@
 Type=tmpfs


[Xen-devel] [PATCH 2/5] tools/hotplug: use existing sysconfig file for xenconsoled

2014-12-05 Thread Olaf Hering

There is no need to require the creation of a new sysconfig file to
pass options to xenconsoled in the systemd service file. Reuse the
existing xencommons file. This file already contains the variable
XENCONSOLED_TRACE, which is used in the sysv runlevel script.

- Adjust the systemd service file to use XENCONSOLED_TRACE instead of
  XENCONSOLED_LOG.
- Move XENCONSOLED_ARGS and XENCONSOLED_LOG_DIR to the sysconfig file.
- Enable XENCONSOLED_TRACE and set its value to none to have a value
  for --log in the service file.
- Adjust the runlevel script to also recognize XENCONSOLED_ARGS and
  XENCONSOLED_LOG_DIR.
- Adjust the runlevel script to handle XENCONSOLED_TRACE properly. If
  an old sysconfig file exists, XENCONSOLED_TRACE will remain empty.

Signed-off-by: Olaf Hering o...@aepfle.de
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/hotplug/Linux/init.d/sysconfig.xencommons.in | 17 +++--
 tools/hotplug/Linux/init.d/xencommons.in   |  5 +++--
 tools/hotplug/Linux/systemd/xenconsoled.service.in |  7 ++-
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in 
b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
index 3a34b33..6271c3e 100644
--- a/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
+++ b/tools/hotplug/Linux/init.d/sysconfig.xencommons.in
@@ -2,8 +2,21 @@
 ## Type: string
 ## Default: none
 #
-# Log xenconsoled messages (cf xl dmesg)
-#XENCONSOLED_TRACE=[none|guest|hv|all]
+# Log xenconsoled messages (cf xl dmesg)
+# This can be [none|guest|hv|all]
+XENCONSOLED_TRACE=none
+
+## Type: string
+## Default: 
+#
+# Additional command line arguments for xenconsoled
+XENCONSOLED_ARGS=
+
+## Type: string
+## Default: @XEN_LOG_DIR@/console
+#
+# Output directory for xenconsoled logfiles.
+XENCONSOLED_LOG_DIR=@XEN_LOG_DIR@/console
 
 ## Type: string
 ## Default: xenstored
diff --git a/tools/hotplug/Linux/init.d/xencommons.in 
b/tools/hotplug/Linux/init.d/xencommons.in
index a1095c2..ddc8daa 100644
--- a/tools/hotplug/Linux/init.d/xencommons.in
+++ b/tools/hotplug/Linux/init.d/xencommons.in
@@ -95,8 +95,9 @@ do_start () {
fi
 
echo Starting xenconsoled...
-   test -z "$XENCONSOLED_TRACE" || XENCONSOLED_ARGS=" --log=$XENCONSOLED_TRACE"
-   ${SBINDIR}/xenconsoled --pid-file=$XENCONSOLED_PIDFILE $XENCONSOLED_ARGS
+   test -z "$XENCONSOLED_LOG_DIR" || XENCONSOLED_LOG_DIR="--log-dir=${XENCONSOLED_LOG_DIR}"
+   test -z "$XENCONSOLED_TRACE" || XENCONSOLED_TRACE=" --log=$XENCONSOLED_TRACE"
+   ${SBINDIR}/xenconsoled --pid-file=$XENCONSOLED_PIDFILE ${XENCONSOLED_LOG_DIR} ${XENCONSOLED_TRACE} $XENCONSOLED_ARGS
    echo Starting QEMU as disk backend for dom0
    test -z "$QEMU_XEN" && QEMU_XEN="${LIBEXEC_BIN}/qemu-system-i386"
    $QEMU_XEN -xen-domid 0 -xen-attach -name dom0 -nographic -M xenpv -daemonize \
diff --git a/tools/hotplug/Linux/systemd/xenconsoled.service.in 
b/tools/hotplug/Linux/systemd/xenconsoled.service.in
index cb44cd6..9f533ff 100644
--- a/tools/hotplug/Linux/systemd/xenconsoled.service.in
+++ b/tools/hotplug/Linux/systemd/xenconsoled.service.in
@@ -6,14 +6,11 @@ ConditionPathExists=/proc/xen/capabilities
 
 [Service]
 Type=simple
-Environment=XENCONSOLED_ARGS=
-Environment=XENCONSOLED_LOG=none
-Environment=XENCONSOLED_LOG_DIR=@XEN_LOG_DIR@/console
-EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenconsoled
+EnvironmentFile=@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 PIDFile=@XEN_RUN_DIR@/xenconsoled.pid
 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities
 ExecStartPre=/bin/mkdir -p ${XENCONSOLED_LOG_DIR}
-ExecStart=@sbindir@/xenconsoled --pid-file @XEN_RUN_DIR@/xenconsoled.pid 
--log=${XENCONSOLED_LOG} --log-dir=${XENCONSOLED_LOG_DIR} $XENCONSOLED_ARGS
+ExecStart=@sbindir@/xenconsoled --pid-file @XEN_RUN_DIR@/xenconsoled.pid 
--log=${XENCONSOLED_TRACE} --log-dir=${XENCONSOLED_LOG_DIR} $XENCONSOLED_ARGS
 
 [Install]
 WantedBy=multi-user.target


[Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd

2014-12-05 Thread Olaf Hering

The sysv runlevel script handles the boolean variable XENSTORED_TRACE
from sysconfig.xencommons to enable tracing. Recognize this variable in
the systemd service file as well.

Signed-off-by: Olaf Hering o...@aepfle.de
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/hotplug/Linux/systemd/xenstored.service.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in 
b/tools/hotplug/Linux/systemd/xenstored.service.in
index 0f0ac58..7e55f4f 100644
--- a/tools/hotplug/Linux/systemd/xenstored.service.in
+++ b/tools/hotplug/Linux/systemd/xenstored.service.in
@@ -14,7 +14,7 @@ EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities
 ExecStartPre=-/bin/rm -f @XEN_LIB_STORED@/tdb*
 ExecStartPre=/bin/mkdir -p @XEN_RUN_DIR@
-ExecStart=/bin/sh -c "exec $XENSTORED --no-fork $XENSTORED_ARGS"
+ExecStart=/bin/sh -c 'if test -n "${XENSTORED_TRACE}" ; then XENSTORED_ARGS=" -T /var/log/xen/xenstored-trace.log" ; fi ; exec $XENSTORED --no-fork $$XENSTORED_ARGS'
 
 [Install]
 WantedBy=multi-user.target


[Xen-devel] [PATCH 3/5] tools/hotplug: remove EnvironmentFile from xen-qemu-dom0-disk-backend.service

2014-12-05 Thread Olaf Hering

The referenced EnvironmentFile does not exist, and the service file
does not make use of variables anyway.

Signed-off-by: Olaf Hering o...@aepfle.de
Cc: Ian Jackson ian.jack...@eu.citrix.com
Cc: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Ian Campbell ian.campb...@citrix.com
Cc: Wei Liu wei.l...@citrix.com
---
 tools/hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service.in | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service.in 
b/tools/hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service.in
index 0a5807a..274cec0 100644
--- a/tools/hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service.in
+++ b/tools/hotplug/Linux/systemd/xen-qemu-dom0-disk-backend.service.in
@@ -8,7 +8,6 @@ ConditionPathExists=/proc/xen/capabilities
 
 [Service]
 Type=simple
-EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
 PIDFile=@XEN_RUN_DIR@/qemu-dom0.pid
 ExecStartPre=/bin/grep -q control_d /proc/xen/capabilities
 ExecStartPre=/bin/mkdir -p @XEN_RUN_DIR@


Re: [Xen-devel] [PATCH] xmalloc: add support for checking the pool integrity

2014-12-05 Thread Jan Beulich

 On 04.12.14 at 18:01, mdo...@bitdefender.com wrote:
 --- a/xen/common/xmalloc_tlsf.c
 +++ b/xen/common/xmalloc_tlsf.c
 @@ -120,9 +120,120 @@ struct xmem_pool {
  char name[MAX_POOL_NAME_LEN];
  };
  
 +static struct xmem_pool *xenpool;
 +
 +static inline void MAPPING_INSERT(unsigned long r, int *fl, int *sl);
 +
  /*
   * Helping functions
   */
 +#ifndef NDEBUG
 +static int xmem_pool_check_size(const struct bhdr *b, int fl, int sl)
 +{
 +while ( b )
 +{
 +int __fl;
 +int __sl;
 +
 +MAPPING_INSERT(b->size, &__fl, &__sl);
 +if ( __fl != fl || __sl != sl )
 +{
 +printk(XENLOG_ERR "xmem_pool: for block %p size = %u, { fl = %d, sl = %d } should be { fl = %d, sl = %d }\n", b, b->size, fl, sl, __fl, __sl);

Long line. Only the format message alone is allowed to exceed 80
characters.

 +return 0;
 +}
 +b = b->ptr.free_ptr.next;
 +}
 +return 1;
 +}
 +
 +/*
 + * This function must be called from a context where pool->lock is
 + * already acquired
 + */
 +#define xmem_pool_check_unlocked(__pool) __xmem_pool_check_unlocked(__FILE__, __LINE__, __pool)

No need for the double underscores on the macro parameter.

 +static int __xmem_pool_check_unlocked(const char *file, int line, const 
 struct xmem_pool *pool)
 +{
 +int i;
 +int woops = 0;
 +static int once = 1;

bool_t

 +
 +for ( i = 0; i < REAL_FLI; i++ )
 +{
 +int fl = ( pool->fl_bitmap & (1 << i) ) ? i : -1;

Bogus spaces inside parentheses.

 +
 +if ( fl >= 0 )
 +{
 +int j;
 +int bitmap_empty = 1;
 +int matrix_empty = 1;

For any of the int-s here and above - can they really all become
negative? If not, they ought to be unsigned int or bool_t.

 +
 +for ( j = 0; j < MAX_SLI; j++ )
 +{
 +int sl = ( pool->sl_bitmap[fl] & (1 << j) ) ? j : -1;
 +
 +if ( sl < 0 )
 +continue;
 +
 +if ( once && !pool->matrix[fl][sl] )
 +{
 +/* The bitmap is corrupted */
 +printk(XENLOG_ERR "xmem_pool:%s:%d the TLSF bitmap is corrupted\n", file, line);
 +__warn((char *)file, line);

Please constify the first parameter of __warn() instead of adding
fragile casts. I also don't see why file and line need printing twice.

 +static int __xmem_pool_check_locked(const char *file, int line, struct 
 xmem_pool *pool)
 +{
 +int err;
 +
 +spin_lock(&pool->lock);
 +err = __xmem_pool_check_unlocked(file, line, pool);

Inversed naming: The caller here should be _unlocked, and the
callee _locked.

 +#define xmem_pool_check_locked(__pool) do { if ( 0 && (__pool) ); } while (0)
 +#define xmem_pool_check_unlocked(__pool) do { if ( 0 && (__pool) ); } while (0)

((void)(pool)) or at least drop the "0 &&" - after all you _want_ the
macro argument to be evaluated (in order to carry out side effects).
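
The difference between the two macro shapes can be seen in a small standalone sketch. The macro and function names are made up for illustration, and the short-circuit variant uses an empty block instead of a bare semicolon only to keep compilers quiet; the evaluation behaviour is the same.

```c
#include <assert.h>

static unsigned int evaluations;

/* Stand-in for a macro argument that has a side effect. */
static int pool_expr(void)
{
    evaluations++;
    return 1;
}

/* Shape from the patch: "0 &&" short-circuits, the argument is never evaluated. */
#define CHECK_SHORT_CIRCUIT(pool) do { if ( 0 && (pool) ) { } } while (0)

/* Suggested shape: the argument is evaluated once and its value is discarded. */
#define CHECK_VOID_CAST(pool) ((void)(pool))

unsigned int count_short_circuit(void)
{
    evaluations = 0;
    CHECK_SHORT_CIRCUIT(pool_expr());
    return evaluations;
}

unsigned int count_void_cast(void)
{
    evaluations = 0;
    CHECK_VOID_CAST(pool_expr());
    return evaluations;
}

/* Usage: the short-circuit form drops the side effect, the cast keeps it. */
void demo_evaluation_counts(void)
{
    assert(count_short_circuit() == 0);
    assert(count_void_cast() == 1);
}
```

Both forms compile away to nothing useful in a non-debug build; only the cast keeps debug and non-debug builds consistent with respect to side effects in the argument.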

 --- a/xen/include/xen/xmalloc.h
 +++ b/xen/include/xen/xmalloc.h
 @@ -123,4 +123,11 @@ unsigned long xmem_pool_get_used_size(struct xmem_pool 
 *pool);
   */
  unsigned long xmem_pool_get_total_size(struct xmem_pool *pool);
  
 +#ifndef NDEBUG
 +#define xmem_pool_check() __xmem_pool_check(__FILE__, __LINE__)
 +int __xmem_pool_check(const char *file, int line);
 +#else
 +#define xmem_pool_check() do { if ( 0 ); } while (0)

((void)0)

or

do {} while (0)

Also perhaps __xmem_pool_check() should have a pool parameter,
with NULL meaning the default one.
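
A minimal sketch of what such a signature could look like, using stand-in types rather than the real struct xmem_pool and xenpool. All names here are hypothetical, and the check itself is a placeholder for the TLSF bitmap walk.

```c
#include <assert.h>
#include <stddef.h>

struct pool { const char *name; };               /* stand-in for struct xmem_pool */

static struct pool default_pool = { "default" }; /* stand-in for xenpool */

/* Resolve the pool to check: NULL selects the default pool. */
const struct pool *resolve_pool(const struct pool *pool)
{
    return pool ? pool : &default_pool;
}

/* Placeholder check; returns 1 when the pool looks consistent. */
int pool_check(const char *file, int line, const struct pool *pool)
{
    pool = resolve_pool(pool);
    assert(file != NULL && line > 0);
    return pool->name != NULL;
}

/* Callers pass NULL to check the default pool. */
#define POOL_CHECK(pool) pool_check(__FILE__, __LINE__, (pool))
```

Callers that only care about the default pool would write POOL_CHECK(NULL), while code owning a private pool passes its own pointer.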

Jan



Re: [Xen-devel] [PATCHv1] xen: increase default number of PIRQs for hardware domains

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 13:02, andrew.coop...@citrix.com wrote:
 On 05/12/14 09:44, Jan Beulich wrote:
 On 03.12.14 at 17:04, david.vra...@citrix.com wrote:
 The default limit for the number of PIRQs for hardware domains (dom0)
 is not sufficient for some (x86) systems.

 Since the pirq structures are individually and dynamically allocated,
 the limit for hardware domains may be increased to the number of
 possible IRQs.
 I nevertheless disagree to moving the bound up to the Xen internal
 limit unconditionally: What use does it have to allow hwdom to use
 thousands of MSIs?
 
 Because systems that big exist.  We have one.  In particular, it needs
 somewhere between 288 and 512 pirqs to scan the bus and bring up the
 physical functions alone.

These are hundreds, not thousands. I also heavily doubt that a system
needs any IRQs at all to scan the bus.

 If a system got that many, the main purpose of
 running Xen on it I would expect to be to hand various of the
 respective devices to guests. Hence no need for hwdom to have
 that many by default, even if this doesn't result in any extra
 resource consumption.

 That said, I can see the current default of 256 being too low though.
 Quite likely in the absence of a user specified value the default
 ought to be derived from nr_irqs - nr_static_irqs rather than being
 any fixed number. Considering the default used for nr_irqs, I'd think
 along the lines of sqrt(num_present_cpus()) * NR_DYNAMIC_VECTORS
 or dom0-max_vcpus * NR_DYNAMIC_VECTORS (or the minimum of
 the two) for x86.
 
 The hardware domain is trusted ultimately.  It can, amongst other
 things, rewrite the bootloader command line and replace xen.gz.  It can
 be trusted not to maliciously waste Xen resource.
 
 Having an arbitrary restriction on the the hardware domains means only
 that, in the case the arbitrary limit is hit, system devices fail to
 function properly.  This is far more noticeable if the limit is hit
 during probe.  The admin can edit the bootloader and increase the limit,
 but only if the root disk was a driver lucky enough to get its
 interrupt, or the default network card got its interrupts.

There's no need to have disk access in order to add a boot option
- any reasonable boot loader ought to allow editing the command
lines.

 The limit serves no security or resource purpose, but has the chance of
 crippling the boot of the system, and making recovery hard or
 impossible.  On this justification alone, the limit should be removed.

But David's patch doesn't remove the limit, it just moves it as high as
is currently deemed reasonable. That may change, even if we can't
foresee it right now. I'm fine with proposing an alternative patch as
requested by David, but I'm not going to ack this one. If another
maintainer wants to commit it nevertheless, my disagreement here
isn't meant to be a veto...

Jan



Re: [Xen-devel] [PATCH 4/5] tools/hotplug: remove XENSTORED_ROOTDIR from service file

2014-12-05 Thread Ian Jackson

Olaf Hering writes ([PATCH 4/5] tools/hotplug: remove XENSTORED_ROOTDIR from 
service file):
 There is no need to export XENSTORED_ROOTDIR. This variable can be
 enabled in sysconfig/xencommons. If the variable is unset xenstored
 will automatically use @XEN_LIB_STORED@.

Acked-by: Ian Jackson ian.jack...@eu.citrix.com


Re: [Xen-devel] A good way to speed up the xl destroy time(guest page scrubbing)

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 11:00, bob@oracle.com wrote:
 5. Potential workaround
 5.1 Use per-cpu list in idle_loop()
 Delist a batch of pages from heap_list to a per-cpu list, then scrub the
 per-cpu list and free back to heap_list.
 
 But Jan disagrees with this solution:
 You should really drop the idea of removing pages temporarily.
 All you need to do is make sure a page being allocated and getting
 simultaneously scrubbed by another CPU won't get passed to the
 caller until the scrubbing finished.

So you don't mention any downsides to this approach. If there are
any, please name them. If there aren't, what's the reason not to
go this route?
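
A minimal sketch of the approach quoted above, assuming a hypothetical
per-page state word (page_state, scrub_begin(), scrub_end() and alloc_try()
are made-up names for illustration, not Xen's heap allocator interfaces).
The page is never delisted; the scrubber marks it while working, and the
allocator refuses to hand it out until the mark is gone.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-page state; this is not Xen's struct page_info. */
enum page_state { PAGE_DIRTY, PAGE_SCRUBBING, PAGE_CLEAN };

struct page {
    _Atomic int state;
};

/* Scrubber side: claim a dirty page. Fails if it is clean or already claimed. */
static bool scrub_begin(struct page *pg)
{
    int expected = PAGE_DIRTY;

    return atomic_compare_exchange_strong(&pg->state, &expected,
                                          PAGE_SCRUBBING);
}

/* Scrubber side: publish the page as clean once the scrub has finished. */
static void scrub_end(struct page *pg)
{
    atomic_store(&pg->state, PAGE_CLEAN);
}

/*
 * Allocator side: returns true if the page may be passed to the caller now.
 * A dirty page is scrubbed by the allocating CPU itself; a page that another
 * CPU is scrubbing is not handed out, so the caller has to retry or wait.
 */
static bool alloc_try(struct page *pg)
{
    if (scrub_begin(pg)) {
        /* (the actual page clearing would happen here) */
        scrub_end(pg);
        return true;
    }
    return atomic_load(&pg->state) == PAGE_CLEAN;
}
```

What the allocator does when alloc_try() returns false (spin, or pick a
different page) is a policy choice that this sketch leaves open.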

Jan



Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd

2014-12-05 Thread Ian Jackson

Olaf Hering writes ([PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in 
systemd):
 The sysv runlevel script handles the boolean variable XENSTORED_TRACE
 from sysconfig.xencommons to enable tracing. Recognize this also in
 the systemd service file.
...
 -ExecStart=/bin/sh -c exec $XENSTORED --no-fork $XENSTORED_ARGS
 +ExecStart=/bin/sh -c 'if test -n ${XENSTORED_TRACE} ; then 
 XENSTORED_ARGS=-T /var/log/xen/xenstored-trace.log ; fi ; exec $XENSTORED 
 --no-fork $$XENSTORED_ARGS'

I'm afraid I'm not happy with the way that this duplicates logic
already found in /etc/init.d/xencommons.

Nacked-by: Ian Jackson ian.jack...@eu.citrix.com

I think the only way to make this work properly is to factor the
necessary parts out of init.d/xencommons into a new script which can
be used by both xencommons and systemd.  I'm not sure such a patch
would be appropriate for 4.5 at this stage.

Ian.


Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Ian Jackson wrote:

 Olaf Hering writes ([PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to 
 sysconfig.xencommons):
  On a non-SELinux system the mount option context=none works fine. But
  with SELinux enabled a proper value has to be defined. To simplify the
  required adjustment move XENSTORED_MOUNT_CTX from the service file to
  the sysconfig file.
 
 This patch looks like just the hook.  It seems to be missing the part
 where the actual selinux context is defined and plumbed through.

The context in xen source is none. As asked in the cover letter (which
unfortunately got sent to just Konrad and xen-devel, no idea how to fix
that) a configure --with-something may be the way to inject it into the
sources, if required.

  There is no need to require the creation of a new sysconfig file, just
  reuse the existing /etc/sysconfig/xencommons file.
 
 This seems to be an unrelated change ?  If not I confess I don't see
 the connection.

The context has to be defined somewhere. And that place is
sysconfig/xencommons.

  --- a/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in
  +++ b/tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in
 ...
   [Mount]
  -Environment=XENSTORED_MOUNT_CTX=none
  -EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
  +EnvironmentFile=@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xencommons
 
 And won't this break existing systems which have an
 /etc/{default,sysconfig}/xenstored ?

Which systems would that be? That file is new in 4.5.

Olaf


[Xen-devel] [PATCH] xen: introduce helper functions to do save read and write accesses

2014-12-05 Thread Juergen Gross

Introduce two helper functions to safely read and write unsigned long
values from or to memory without crashing the system in case of access
failures.

These helpers can be used instead of open coded uses of __get_user()
and __put_user() avoiding the need to do casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This will fix the sparse
warnings when doing make C=1.

Signed-off-by: Juergen Gross jgr...@suse.com
---
 arch/x86/include/asm/xen/page.h | 16 +++-
 arch/x86/xen/p2m.c  |  2 +-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index f5d5de4..330352f 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -60,6 +60,20 @@ extern int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 extern unsigned long m2p_find_override_pfn(unsigned long mfn, unsigned long pfn);
 
 /*
+ * Helper functions to write or read unsigned long values to/from memory.
+ * To be used when accesses might fail.
+ */
+static inline int xen_safe_write_ulong(unsigned long *addr, unsigned long val)
+{
+   return __put_user(val, (unsigned long __user *)addr);
+}
+
+static inline int xen_safe_read_ulong(unsigned long *addr, unsigned long *val)
+{
+   return __get_user(*val, (unsigned long __user *)addr);
+}
+
+/*
  * When to use pfn_to_mfn(), __pfn_to_mfn() or get_phys_to_machine():
  * - pfn_to_mfn() returns either INVALID_P2M_ENTRY or the mfn. No indicator
  *   bits (identity or foreign) are set.
@@ -125,7 +139,7 @@ static inline unsigned long mfn_to_pfn_no_overrides(unsigned long mfn)
     * In such cases it doesn't matter what we return (we return garbage),
     * but we must handle the fault without crashing!
     */
-   ret = __get_user(pfn, &machine_to_phys_mapping[mfn]);
+   ret = xen_safe_read_ulong(&machine_to_phys_mapping[mfn], &pfn);
    if (ret < 0)
        return ~0;
 
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index 8b5db51..edbc7a6 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -625,7 +625,7 @@ bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn)
        return true;
    }
 
-   if (likely(!__put_user(mfn, xen_p2m_addr + pfn)))
+   if (likely(!xen_safe_write_ulong(xen_p2m_addr + pfn, mfn)))
        return true;
 
    ptep = lookup_address((unsigned long)(xen_p2m_addr + pfn), &level);
-- 
2.1.2



Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Ian Jackson wrote:

 I think the only way to make this work properly is to factor the
 necessary parts out of init.d/xencommons into a new script which can
 be used by both xencommons and systemd.  I'm not sure such a patch
 would be appropriate for 4.5 at this stage.

Yes, a helper script to launch just xenstored would help. But which part
would do the final exec? Perhaps the sysv script has to fork a shell
like it's done above. I will have a look at this.

Are you opposed to the idea to support XENSTORED_TRACE for systemd right
in 4.5.0?

Olaf


Re: [Xen-devel] Poor network performance between DomU with multiqueue support

2014-12-05 Thread Wei Liu

On Fri, Dec 05, 2014 at 01:17:16AM +, Zhangleiqiang (Trump) wrote:
[...]
  I think that's expected, because guest RX data path still uses grant_copy 
  while
  guest TX uses grant_map to do zero-copy transmit.
 
 As far as I know, there are three main grant-related operations used in the split
 device model: grant mapping, grant transfer and grant copy.
 Grant transfer is not used now, and grant mapping and grant transfer both
 involve TLB refresh work for the hypervisor, am I right?  Or does only grant
 transfer have this overhead?

Transfer is not used so I can't tell. Grant unmap causes TLB flush.

I saw in an email the other day that the XenServer folks have some planned
improvements to avoid TLB flushes in Xen, to be upstreamed in the 4.6 window.
I can't say for sure they will get upstreamed as I don't work on that.

 Does grant copy surely have more overhead than grant mapping?
 

At the very least the zero-copy TX path is faster than the previous copying
path.

But speaking of the micro operation I'm not sure.

There was once a persistent-map prototype of netback / netfront that
established a memory pool between FE and BE and then used memcpy to copy
data. Unfortunately that prototype was not done right so the result was
not good.

 From the code, I see that in TX, netback will do gnttab_batch_copy as well 
 as gnttab_map_refs:
 
 <code> //netback.c:xenvif_tx_action
   xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);
 
   if (nr_cops == 0)
       return 0;
 
   gnttab_batch_copy(queue->tx_copy_ops, nr_cops);
   if (nr_mops != 0) {
       ret = gnttab_map_refs(queue->tx_map_ops,
                             NULL,
                             queue->pages_to_map,
                             nr_mops);
       BUG_ON(ret);
   }
 </code>
 

The copy is for the packet header. Mapping is for packet data.

We need to copy header from guest so that it doesn't change under
netback's feet.
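
To illustrate why the header is copied rather than mapped, here is a small
user-space sketch (struct pkt_hdr and fetch_and_check_hdr() are hypothetical
and not netback code). The header is copied out of memory the guest can
still write to, and only the private copy is validated and used, so a later
write by the guest cannot invalidate a check that already passed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet header, for illustration only. */
struct pkt_hdr {
    uint16_t len;
    uint16_t flags;
};

/*
 * Snapshot the header from guest-shared memory, then validate the snapshot.
 * 'out' is only written when the snapshot passes the check, so the caller
 * never sees a header that was not validated.
 */
static bool fetch_and_check_hdr(const void *shared, struct pkt_hdr *out,
                                uint16_t max_len)
{
    struct pkt_hdr tmp;

    memcpy(&tmp, shared, sizeof(tmp));   /* single copy of the shared bytes */
    if (tmp.len > max_len)
        return false;

    *out = tmp;                          /* later code uses only this copy */
    return true;
}
```

Validating the shared buffer in place and reading it again afterwards would
leave a window in which the guest could change len between check and use.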

Wei.


Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Ian Jackson wrote:

 Olaf Hering writes (Re: [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX 
 to sysconfig.xencommons):
  On Fri, Dec 05, Ian Jackson wrote:
   This patch looks like just the hook.  It seems to be missing the part
   where the actual selinux context is defined and plumbed through.
  
  The context in xen source is none. As asked in the cover letter (which
  unfortunately got sent to just Konrad and xen-devel, no idea how to fix
  that) a configure --with-something may be the way to inject it into the
  sources, if required.
 
 I confess I don't know very much about selinux, but shouldn't we be
 providing a reasonable default policy, rather than leaving it to the
 distro or user to pass special options to configure ?  Or are things
 in the selinux world so fragmented or fast-moving that such a generic
 policy couldn't be written ?

I know nothing about SELinux.  Not sure why a context= is required
anyway.  But I can find out next week if no one else has an idea how to
deal with SELinux.

Olaf


Re: [Xen-devel] [PATCH 5/5] tools/hotplug: support XENSTORED_TRACE in systemd

2014-12-05 Thread Olaf Hering

On Fri, Dec 05, Ian Jackson wrote:

 Can systemd not launch these daemons by running the existing
 xencommons et al init scripts ?  Obviously that won't give you all of
 systemd's shiny features but IMO it ought to work.

I think the point was to let systemd pass the file descriptors. That's why
the service file does the exec xenstored.

Olaf


[Xen-devel] [PATCH] xen/serial: setup UART idle mode for OMAP

2014-12-05 Thread Oleksandr Dmytryshyn

The UART is not able to receive bytes when idle mode is not
configured properly. When we use Xen with an old Linux
kernel (for example 3.8), that kernel configures the UART
idle mode even if the UART node in the device tree is absent,
so the UART works normally in this case. But newer Linux
kernels (3.12 and later) don't configure idle mode for the
UART, so the UART cannot work normally in this case.

Signed-off-by: Oleksandr Dmytryshyn oleksandr.dmytrys...@globallogic.com
---
 xen/drivers/char/omap-uart.c | 3 +++
 xen/include/xen/8250-uart.h  | 4 
 2 files changed, 7 insertions(+)

diff --git a/xen/drivers/char/omap-uart.c b/xen/drivers/char/omap-uart.c
index a798b8d..16d1454 100644
--- a/xen/drivers/char/omap-uart.c
+++ b/xen/drivers/char/omap-uart.c
@@ -195,6 +195,9 @@ static void __init omap_uart_init_preirq(struct serial_port *port)
 omap_write(uart, UART_MCR, UART_MCR_DTR|UART_MCR_RTS);
 
 omap_write(uart, UART_OMAP_MDR1, UART_OMAP_MDR1_16X_MODE);
+
+/* setup idle mode */
+omap_write(uart, UART_SYSC, OMAP_UART_SYSC_DEF_CONF);
 }
 
 static void __init omap_uart_init_postirq(struct serial_port *port)
diff --git a/xen/include/xen/8250-uart.h b/xen/include/xen/8250-uart.h
index a682bae..304b9dd 100644
--- a/xen/include/xen/8250-uart.h
+++ b/xen/include/xen/8250-uart.h
@@ -32,6 +32,7 @@
 #define UART_MCR  0x04/* Modem control*/
 #define UART_LSR  0x05/* line status  */
 #define UART_MSR  0x06/* Modem status */
+#define UART_SYSC 0x15/* System configuration register */
 #define UART_USR  0x1f/* Status register (DW) */
 #define UART_DLL  0x00/* divisor latch (ls) (DLAB=1) */
 #define UART_DLM  0x01/* divisor latch (ms) (DLAB=1) */
@@ -145,6 +146,9 @@
 /* SCR register bitmasks */
 #define OMAP_UART_SCR_RX_TRIG_GRANU1_MASK (1 << 7)
 
+/* System configuration register */
+#define OMAP_UART_SYSC_DEF_CONF 0x0d /* autoidle mode, wakeup is enabled */
+
 #endif /* __XEN_8250_UART_H__ */
 
 /*
-- 
1.9.1



Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design

2014-12-05 Thread Ian Campbell

On Tue, 2014-12-02 at 23:14 -0700, Chun Yan Liu wrote:
 
  On 11/28/2014 at 11:43 PM, in message 
  1417189409.23604.62.ca...@citrix.com,
 Ian Campbell ian.campb...@citrix.com wrote: 
  On Tue, 2014-11-25 at 02:08 -0700, Chun Yan Liu wrote: 
   Hi, Ian, 

   According to the previous discussion, snapshot delete and revert are
   better done by the high-level application itself, so libxl won't
   supply an API for them.
   
  I thought you had explained a scenario where the toolstack needed to be 
  at least aware of delete, specifically when you are deleting a snapshot 
  from the middle of an active chain.
   
 The reason I posted such an overview here before sending the next
 version is that I'm puzzled about what should be in libxl and what
 in the toolstack after the previous discussion. So I posted here to seek
 some ideas or agreement first. It's not a full design, and it is not
 yet broken down into libxl and toolstack parts.

I guess I thought we had gotten closer to this than we actually have.

  Maybe that's not snapshot delete API in libxl though, but rather a 
  notification API which the toolstack can use to tell libxl something is 
  going on. 
 
 About the notification API, after looking at lvm, vhd-util and qcow2,
 I don't think we need it. No extra work is needed to handle
 the disk snapshot chain.
 lvm: doesn't support snapshot of snapshot. 
 vhd-util: backing file chain, external snapshot. Don't need to 
   delete the disk snapshot when deleting domain snapshot. 
 qcow2: 
 * internal disk snapshot: each snapshot increases the refcount
   of data, deleting a snapshot only decreases the refcount and won't
   affect other snapshots.
 * external disk snapshot: same as vhd-util, backing file chain. 
   Don't need to delete disk snapshot when deleting domain snapshot.

You don't need to, but might a toolstack (or user) want to consolidate
anyway, e.g. to reduce chain length? (which might otherwise be overly
long.)

  I don't believe xl can take a disk snapshot of an active disk, it 
  doesn't have the machinery to deal with that sort of thing, nor should 
  it, this is exactly the sort of thing which libxl is provided to deal 
  with. 
 
 To delete a disk snapshot, for example, xl can call an external command
 (e.g. qemu-img). But it's better to use qmp for that.

The toolstack (xl or libvirt) doesn't have direct access to qmp, it
would have to go via a libxl API, for an Active domain at least.
qemu-img is the right answer for an Inactive domain.

Secondly, the disk snapshot has to happen while the domain is
paused/quiesced for consistency. This happens deep in the bowels of the
libxl save/restore code. So either libxl has to do the disk snapshots at
the same time or we need a callback to the toolstack in order for it to
make the snapshots.

 Anyway, if for domain snapshot create, we should put creating disk
 snapshot process in libxl, then for domain snapshot delete, we
 should put deleting disk snapshot process in libxl. That is, in libxl
 there should be:
 libxl_disk_snapshot_create (which handles creating disk snapshot)
 libxl_disk_snapshot_delete (which handles deleting disk snapshot)
 
 Otherwise I would think it's weird to have in libxl:
 libxl_domain_snapshot_create (wrap saving memory [already has API] 
   and creating disk snapshot)
 libxl_disk_snapshot_delete (deleting disk snapshot)

The create and delete cases are subtly different, so it may be that the
API ends up asymmetric.

The create mechanism (whichever one it is) operates on a single Active
domain and is reasonably well defined.

The delete operation however can potentially operate on multiple Active
domains, e.g. 2 domains are running with a common ancestor snapshot
which is being removed.

How would the delete interface deal with this case? In particular
without libxl becoming involved in storage management.

The reason I'm thinking of a delete notify style interface for Active
domains is that it then applies to a single Active domain at a time. If
multiple domains are affected by a snapshot deletion then the
notification is called multiple times.
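
A rough sketch of what such a notify-style interface could look like. All
names here (struct active_domain, snapshot_delete_notify_fn,
notify_snapshot_delete()) are hypothetical and not part of libxl. The
toolstack owns the storage decision; the library side is only told, once
per affected Active domain, that a snapshot in that domain's chain is going
away.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of a running domain and the snapshot it sits on. */
struct active_domain {
    int domid;
    const char *ancestor_snapshot;
};

/* Hypothetical per-domain notification hook. */
typedef void (*snapshot_delete_notify_fn)(int domid, const char *snapshot,
                                          void *opaque);

/*
 * Call 'notify' once for every Active domain whose chain contains
 * 'snapshot'. Returns the number of notifications issued.
 */
static size_t notify_snapshot_delete(const struct active_domain *doms,
                                     size_t nr_doms, const char *snapshot,
                                     snapshot_delete_notify_fn notify,
                                     void *opaque)
{
    size_t calls = 0;

    for (size_t i = 0; i < nr_doms; i++) {
        if (strcmp(doms[i].ancestor_snapshot, snapshot) == 0) {
            notify(doms[i].domid, snapshot, opaque);
            calls++;
        }
    }
    return calls;
}

/* Example hook: just counts how often it was invoked. */
static void count_notification(int domid, const char *snapshot, void *opaque)
{
    (void)domid;
    (void)snapshot;
    (*(int *)opaque)++;
}
```

With two running domains sharing a common ancestor snapshot, the hook fires
twice, and each call only ever has to reason about a single domain.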

 As for storing and retrieving the snapshot json file, using
 gentype.py to autogenerate xx_to_json and xx_from_json functions
 is very convenient; there would be a group of functions
 set/get/update/delete_snapshot_metadata based on that.
 But I didn't see other such usage in xl, and it's not proper to
 place it in libxl. Is there anywhere it could be placed and still be used by xl?
 Wei might have some ideas about this?

xl hasn't needed to use the autogeneration infrastructure to date, but
there's no reason why it couldn't do so if there was a need. Just create
xl_types.idl and hook it into the Makefile.

It would be harder to extend this to other toolstacks, but I suspect we
don't need to.

Ian.



[Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()

2014-12-05 Thread David Vrabel

A generic dma_get_required_mask() is useful even for architectures (such
as ia64) that define ARCH_HAS_DMA_GET_REQUIRED_MASK.

Signed-off-by: David Vrabel david.vra...@citrix.com
Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
---
 drivers/base/platform.c |   10 --
 include/linux/dma-mapping.h |1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index b2afc29..f9f3930 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -1009,8 +1009,7 @@ int __init platform_bus_init(void)
    return error;
 }
 
-#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK
-u64 dma_get_required_mask(struct device *dev)
+u64 dma_get_required_mask_from_max_pfn(struct device *dev)
 {
    u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT);
    u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT));
@@ -1028,6 +1027,13 @@ u64 dma_get_required_mask(struct device *dev)
    }
    return mask;
 }
+EXPORT_SYMBOL_GPL(dma_get_required_mask_from_max_pfn);
+
+#ifndef ARCH_HAS_DMA_GET_REQUIRED_MASK
+u64 dma_get_required_mask(struct device *dev)
+{
+   return dma_get_required_mask_from_max_pfn(dev);
+}
 EXPORT_SYMBOL_GPL(dma_get_required_mask);
 #endif
 
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index d5d3881..6e2fdfc 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -127,6 +127,7 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask)
    return dma_set_mask_and_coherent(dev, mask);
 }
 
+extern u64 dma_get_required_mask_from_max_pfn(struct device *dev);
 extern u64 dma_get_required_mask(struct device *dev);
 
 #ifndef set_arch_dma_coherent_ops
-- 
1.7.10.4
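
For reference, the mask computation that this patch renames can be tried
out in user space. The sketch below is an approximation under assumptions:
required_mask_from_max_pfn() and fls32() are stand-in names, max_pfn and
the page shift are passed as parameters instead of being kernel globals,
and the early return for fewer than two page frames is an addition that the
kernel code does not have.

```c
#include <stdint.h>

/* 1-based index of the highest set bit, 0 if x == 0 (like the kernel's fls()). */
static int fls32(uint32_t x)
{
    int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * Smallest "all ones" mask that covers the highest RAM address, derived
 * from the number of page frames. Mirrors the low/high split used by the
 * kernel function.
 */
static uint64_t required_mask_from_max_pfn(uint64_t max_pfn,
                                           unsigned int page_shift)
{
    uint32_t low_totalram, high_totalram;

    if (max_pfn < 2)    /* guard added for this sketch only */
        return 0;

    low_totalram = (uint32_t)((max_pfn - 1) << page_shift);
    high_totalram = (uint32_t)((max_pfn - 1) >> (32 - page_shift));

    if (!high_totalram) {
        /* RAM ends below 4 GiB: round up to a mask just covering it. */
        low_totalram = UINT32_C(1) << (fls32(low_totalram) - 1);
        low_totalram += low_totalram - 1;
        return low_totalram;
    }

    /* RAM extends above 4 GiB: all low bits set, plus the needed high bits. */
    high_totalram = UINT32_C(1) << (fls32(high_totalram) - 1);
    high_totalram += high_totalram - 1;
    return ((uint64_t)high_totalram << 32) + UINT32_C(0xffffffff);
}
```

With 4 KiB pages, 1 GiB of RAM (max_pfn 0x40000) gives 0x3fffffff, exactly
4 GiB (max_pfn 0x100000) gives 0xffffffff, and one page more than that
already gives 0x1ffffffff.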



[Xen-devel] [PATCH 2/4] ia64: use common dma_get_required_mask_from_pfn()

2014-12-05 Thread David Vrabel

Signed-off-by: David Vrabel david.vra...@citrix.com
Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
Cc: Tony Luck tony.l...@intel.com
Cc: Fenghua Yu fenghua...@intel.com
Cc: linux-i...@vger.kernel.org
---
 arch/ia64/include/asm/machvec.h  |2 +-
 arch/ia64/include/asm/machvec_init.h |1 -
 arch/ia64/pci/pci.c  |   20 
 3 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/arch/ia64/include/asm/machvec.h b/arch/ia64/include/asm/machvec.h
index 9c39bdf..beaa47d 100644
--- a/arch/ia64/include/asm/machvec.h
+++ b/arch/ia64/include/asm/machvec.h
@@ -287,7 +287,7 @@ extern struct dma_map_ops *dma_get_ops(struct device *);
 # define platform_dma_get_ops  dma_get_ops
 #endif
 #ifndef platform_dma_get_required_mask
-# define  platform_dma_get_required_mask   ia64_dma_get_required_mask
+# define  platform_dma_get_required_mask   dma_get_required_mask_from_max_pfn
 #endif
 #ifndef platform_irq_to_vector
 # define platform_irq_to_vector   __ia64_irq_to_vector
diff --git a/arch/ia64/include/asm/machvec_init.h b/arch/ia64/include/asm/machvec_init.h
index 37a4698..ef964b2 100644
--- a/arch/ia64/include/asm/machvec_init.h
+++ b/arch/ia64/include/asm/machvec_init.h
@@ -3,7 +3,6 @@
 
 extern ia64_mv_send_ipi_t ia64_send_ipi;
 extern ia64_mv_global_tlb_purge_t ia64_global_tlb_purge;
-extern ia64_mv_dma_get_required_mask ia64_dma_get_required_mask;
 extern ia64_mv_irq_to_vector __ia64_irq_to_vector;
 extern ia64_mv_local_vector_to_irq __ia64_local_vector_to_irq;
 extern ia64_mv_pci_get_legacy_mem_t ia64_pci_get_legacy_mem;
diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c
index 291a582..79da21b 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -791,26 +791,6 @@ static void __init set_pci_dfl_cacheline_size(void)
    pci_dfl_cache_line_size = (1 << cci.pcci_line_size) / 4;
 }
 
-u64 ia64_dma_get_required_mask(struct device *dev)
-{
-   u32 low_totalram = ((max_pfn - 1) << PAGE_SHIFT);
-   u32 high_totalram = ((max_pfn - 1) >> (32 - PAGE_SHIFT));
-   u64 mask;
-
-   if (!high_totalram) {
-       /* convert to mask just covering totalram */
-       low_totalram = (1 << (fls(low_totalram) - 1));
-       low_totalram += low_totalram - 1;
-       mask = low_totalram;
-   } else {
-       high_totalram = (1 << (fls(high_totalram) - 1));
-       high_totalram += high_totalram - 1;
-       mask = (((u64)high_totalram) << 32) + 0xffffffff;
-   }
-   return mask;
-}
-EXPORT_SYMBOL_GPL(ia64_dma_get_required_mask);
-
 u64 dma_get_required_mask(struct device *dev)
 {
return platform_dma_get_required_mask(dev);
-- 
1.7.10.4



[Xen-devel] [PATCH 4/4] x86/xen: assume a 64-bit DMA mask is required

2014-12-05 Thread David Vrabel

On a Xen PV guest the DMA addresses and physical addresses are not 1:1,
so the generic dma_get_required_mask() does not return the correct mask
(since it uses max_pfn).

Some device drivers (such as mptsas, mpt2sas) use
dma_get_required_mask() to set the device's DMA mask to allow them to
use only 32-bit DMA addresses in hardware structures.  This results in
unnecessary use of the SWIOTLB if DMA addresses are wider than 32 bits,
impacting performance significantly.

We could base the DMA mask on the maximum MFN but:

a) The hypercall op to get the maximum MFN (XENMEM_maximum_ram_page)
will truncate the result to an int in 32-bit guests.

b) Future uses of the IOMMU in Xen may map frames at bus addresses
above the end of RAM.

So, just assume a 64-bit DMA mask is always required.

Signed-off-by: David Vrabel david.vra...@citrix.com
---
 arch/x86/xen/pci-swiotlb-xen.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/xen/pci-swiotlb-xen.c b/arch/x86/xen/pci-swiotlb-xen.c
index 0e98e5d..35774f8 100644
--- a/arch/x86/xen/pci-swiotlb-xen.c
+++ b/arch/x86/xen/pci-swiotlb-xen.c
@@ -18,6 +18,11 @@
 
 int xen_swiotlb __read_mostly;
 
+static u64 xen_swiotlb_get_required_mask(struct device *dev)
+{
+   return DMA_BIT_MASK(64);
+}
+
 static struct dma_map_ops xen_swiotlb_dma_ops = {
.mapping_error = xen_swiotlb_dma_mapping_error,
.alloc = xen_swiotlb_alloc_coherent,
@@ -31,6 +36,7 @@ static struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
+   .get_required_mask = xen_swiotlb_get_required_mask,
 };
 
 /*
-- 
1.7.10.4
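
A small user-space sketch of the effect described in the commit message.
DMA_BIT_MASK() is redefined here with the same shape as the kernel macro so
that the example stands alone; driver_wants_64bit_dma() and needs_bounce()
are hypothetical helpers modelling a driver that keys its descriptor format
off the required mask, not code taken from mptsas or mpt2sas.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(), redefined for this sketch. */
#define DMA_BIT_MASK(n) \
    (((n) == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << (n)) - 1))

/* Driver policy: only switch to 64-bit addressing if the platform needs it. */
static bool driver_wants_64bit_dma(uint64_t required_mask)
{
    return required_mask > DMA_BIT_MASK(32);
}

/* A buffer must be bounced when its bus address does not fit the device mask. */
static bool needs_bounce(uint64_t bus_addr, uint64_t device_mask)
{
    return (bus_addr & ~device_mask) != 0;
}
```

A PV guest with 2 GiB of pseudo-physical memory would report 0x7fffffff from
the max_pfn based calculation, so such a driver stays with a 32-bit mask,
and any buffer whose machine address lies above 4 GiB (0x240000000, say)
has to go through the SWIOTLB. Reporting DMA_BIT_MASK(64) makes the driver
choose 64-bit addressing, and the same buffer needs no bounce.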



[Xen-devel] [PATCHv5 0/4] dma, x86, xen: reduce SWIOTLB usage in Xen guests

2014-12-05 Thread David Vrabel

On systems where DMA addresses and physical addresses are not 1:1
(such as Xen PV guests), the generic dma_get_required_mask() will not
return the correct mask (since it uses max_pfn).

Some device drivers (such as mptsas, mpt2sas) use
dma_get_required_mask() to set the device's DMA mask to allow them to use
only 32-bit DMA addresses in hardware structures.  This results in
unnecessary use of the SWIOTLB if DMA addresses are wider than 32 bits,
impacting performance significantly.

This series allows Xen PV guests to override the default
dma_get_required_mask() with a more suitable one.

Changes in v5:
- xen_swiotlb_get_required_mask() is x86 only.

Changes in v4:
- Assume 64-bit mask is required.

Changes in v3:
- fix off-by-one in xen_dma_get_required_mask()
- split ia64 changes into separate patch.

Changes in v2:
- split x86 and xen changes into separate patches

David



Re: [Xen-devel] PV DomU running linux 3.17.3 causing xen-netback fatal error in Dom0

2014-12-05 Thread David Vrabel

On 05/12/14 12:48, Zoltan Kiss wrote:
 Hi,
 
 Maybe I'm misreading it, but it seems to me that netfront doesn't slice
 up the linear buffer at all, just blindly sends it. In xennet_start_xmit:

This is handled in the beginning of xennet_make_frags() (which I would
agree isn't the obvious place for it).

David


Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 14:36 +, Julien Grall wrote:
 Hi,
 
 On 05/12/14 14:27, Ian Campbell wrote:
  On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote:
   #define nr_static_irqs NR_IRQS
  +#define arch_hwdom_irqs(domid) NR_IRQS
  
  FWIW gic_number_lines() is the ARM equivalent of getting the number of
  GSIs.
  
  *BUT* we don't actually use pirqs on ARM (everything goes via the
  virtualised interrupt controller). So maybe we should be setting
  nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come
  from an ARM person, so I'm fine with you making this NR_IRQS in the
  meantime.
 
 As we already know that PIRQ is not used on ARM, it would make sense to
 use 0 directly in this patch.

Are you offering to give a tested-by if Jan posts such a patch?

Ian.



Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 15:36, andrew.coop...@citrix.com wrote:
 On 05/12/14 11:31, Jan Beulich wrote:
 Andrew validly points out that even if these masks aren't a formal part
 of the hypercall interface, we aren't free to change them: A guest
 suspended for migration in the middle of a continuation would fail to
 work if resumed on a hypervisor using a different value. Hence add
 respective comments to their definitions.

 Additionally, to help future extensibility as well as in the spirit of
 reducing undefined behavior as much as possible, refuse hypercalls made
 with the respective bits non-zero when the respective sub-ops don't
 make use of those bits.

 Reported-by: Andrew Cooper andrew.coop...@citrix.com
 Signed-off-by: Jan Beulich jbeul...@suse.com
 
 General principle looks good.  A couple of issues.
 
 --- a/xen/arch/x86/mm.c
 +++ b/xen/arch/x86/mm.c
 @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one(
  long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
  {
  int rc;
 -int op = cmd & MEMOP_CMD_MASK;
 
 This needs a blanket start_iter check, as do_memory_op() has not done so.

Not sure what you're asking for - why is removing the masking not
sufficient?

 The ARM code also needs one, as the caller has applied partial checks.

The ARM code never applied a mask.

 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -977,6 +992,9 @@ long do_memory_op(unsigned long cmd, XEN
  unsigned int dom_vnodes, dom_vranges, dom_vcpus;
  struct vnuma_info tmp;
  
 +if ( unlikely(start_extent) )
 +return -ENOSYS;
 +
  /*
   * Guest passes nr_vnodes, number of regions and nr_vcpus thus
   * we know how much memory guest has allocated.
 
 XENMEM_get_vnumainfo needs a guard.

Again - I don't understand what you're asking for: The hunk above
is modifying the XENMEM_get_vnumainfo case.
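
For readers following along, the encoding under discussion can be sketched
like this. The shift value of 6, the two sub-op numbers and the helper names
memop_encode() and memop_dispatch() are assumptions made for illustration;
only the general scheme (sub-op in the low bits of cmd, continuation
progress in the bits above) is taken from the thread.

```c
#include <errno.h>

/* Assumed layout: low bits carry the sub-op, the rest the start extent. */
#define MEMOP_EXTENT_SHIFT 6
#define MEMOP_CMD_MASK     ((1UL << MEMOP_EXTENT_SHIFT) - 1)

/* Hypothetical sub-ops: one that supports continuations, one that doesn't. */
enum { OP_RESUMABLE = 1, OP_ONE_SHOT = 2 };

/* Pack a sub-op and its continuation progress into a single cmd value. */
static unsigned long memop_encode(unsigned int op, unsigned long start_extent)
{
    return op | (start_extent << MEMOP_EXTENT_SHIFT);
}

/*
 * Decode cmd. Sub-ops that never create continuations refuse a non-zero
 * start extent instead of silently ignoring the bits.
 */
static long memop_dispatch(unsigned long cmd)
{
    unsigned int op = (unsigned int)(cmd & MEMOP_CMD_MASK);
    unsigned long start_extent = cmd >> MEMOP_EXTENT_SHIFT;

    switch (op) {
    case OP_RESUMABLE:
        /* Resume where the previous invocation left off. */
        return (long)start_extent;
    case OP_ONE_SHOT:
        if (start_extent)
            return -ENOSYS;
        return 0;
    default:
        return -ENOSYS;
    }
}
```

Because a guest may be migrated in the middle of a continuation, the shift
and mask are effectively part of the ABI: a cmd value encoded on the source
host has to decode to the same sub-op and extent on the destination.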

Jan



Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 15:27, ian.campb...@eu.citrix.com wrote:
 On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote:
  #define nr_static_irqs NR_IRQS
 +#define arch_hwdom_irqs(domid) NR_IRQS
 
 FWIW gic_number_lines() is the ARM equivalent of getting the number of
 GSIs.
 
 *BUT* we don't actually use pirqs on ARM (everything goes via the
 virtualised interrupt controller). So maybe we should be setting
 nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come
 from an ARM person, so I'm fine with you making this NR_IRQS in the
 meantime.

Considering Julien also asking for this, I don't mind changing this to
zero for ARM. Just let me know which way I can get this ack-ed.

Jan



Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 15:48, david.vra...@citrix.com wrote:
 On 05/12/14 13:51, Jan Beulich wrote:
 +d->nr_pirqs = extra_hwdom_irqs ? nr_static_irqs + extra_hwdom_irqs
 +                               : arch_hwdom_irqs(domid);
 
 This means if the user asks for 0 extra (by the command line) for hwdoms
 they get the default, which is non-obvious.

I can certainly add another sentence saying so to the documentation.
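
The behaviour David points out follows directly from the conditional in the
quoted hunk. A stand-alone sketch, where hwdom_nr_pirqs() is a hypothetical
wrapper and the architecture default is passed in as a plain number rather
than coming from arch_hwdom_irqs():

```c
/*
 * Same selection as in the quoted hunk: a non-zero command line value is
 * added to the static IRQ count, while zero falls through to the
 * architecture default rather than meaning "no extra PIRQs".
 */
static unsigned int hwdom_nr_pirqs(unsigned int extra_hwdom_irqs,
                                   unsigned int nr_static_irqs,
                                   unsigned int arch_default)
{
    return extra_hwdom_irqs ? nr_static_irqs + extra_hwdom_irqs
                            : arch_default;
}
```

So with 16 static IRQs and an architecture default of 256, asking for 8
extra yields 24, while asking for 0 yields 256.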

Jan



Re: [Xen-devel] A few EFI code questions

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 15:51, daniel.ki...@oracle.com wrote:
 On Thu, Dec 04, 2014 at 09:35:01AM +, Jan Beulich wrote:
  On 03.12.14 at 22:02, daniel.ki...@oracle.com wrote:
  3) Should not we change xen/arch/*/efi/efi-boot.h to
 xen/arch/*/efi/efi-boot.c? efi-boot.h contains more
 code than definitions, declarations and short static
 functions. So, I think that it is more regular *.c file
 than header file.

 That's a matter of taste - I'd probably have made it .c too, but
 didn't mind it being .h as done by Roy (presumably on the basis
 that #include directives are preferred to have .h files as their
 operands). The only thing I regret is that I didn't ask for the
 pointless efi- prefix to be dropped.
 
 As far as I can see, a few people agree to some extent with my suggestion.
 Great! Sadly, if we want a .c file, then a simple boot.c (as Jan suggested, we
 can drop the efi- prefix) conflicts with the existing boot.c link. Is efi-boot.c
 OK? Or maybe boot-arch.c? boot.h is OK for sure. Which one do you prefer?
 Do you have better ideas?

boot.h would be my preference given how things look right now,
but I don't think this possibility of renaming warrants a much longer
discussion. Please also remember that renaming always implies more
cumbersome backporting, even if only slightly more.

Jan



Re: [Xen-devel] [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to sysconfig.xencommons

2014-12-05 Thread Ian Jackson

Olaf Hering writes (Re: [PATCH 1/5] tools/hotplug: move XENSTORED_MOUNT_CTX to 
sysconfig.xencommons):
 On Fri, Dec 05, Ian Jackson wrote:
  I confess I don't know very much about selinux, but shouldn't we be
  providing a reasonable default policy, rather than leaving it to the
  distro or user to pass special options to configure ?  Or are things
  in the selinux world so fragmented or fast-moving that such a generic
  policy couldn't be written ?
 
 I know nothing about SELinux.  Not sure why a context= is required
 anyway.  But I can find out next week if no one else has an idea how to
 deal with SELinux.

OK, thanks.

Anyway, I don't think this question should stand in the way of this
hunk of your patch, which is IMO obviously a move in the right
direction.

So if you shuffle things about as I suggested I will ack this hunk in
your next version of the series.

Thanks,
Ian.


[Xen-devel] ucode=scan usefulness

2014-12-05 Thread Jan Beulich

Konrad,

having been surprised to find that your cpio scanning code does not work,
I had to realize that it can't possibly work when the initrd is
compressed. Considering that you found this useful nevertheless,
am I to infer that you're running with (and only considering)
non-compressed initrds? Are there plans to support compressed ones too?

Jan
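
To illustrate why a scan for cpio members cannot see into a compressed
image, here is a minimal sketch that classifies the first bytes of a buffer.
It is not the Xen scanning code; classify_initrd() and the enum are
hypothetical names. It relies only on two well-known magic values: the
"070701" header of the cpio "newc" format and the 0x1f 0x8b prefix of gzip.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only for this illustration. */
enum initrd_kind { INITRD_UNKNOWN, INITRD_CPIO_NEWC, INITRD_GZIP };

static enum initrd_kind classify_initrd(const unsigned char *buf, size_t len)
{
    /* An uncompressed "newc" cpio archive starts with ASCII "070701". */
    if ( len >= 6 && memcmp(buf, "070701", 6) == 0 )
        return INITRD_CPIO_NEWC;

    /* A gzip stream starts with 0x1f 0x8b; its cpio headers are hidden. */
    if ( len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b )
        return INITRD_GZIP;

    return INITRD_UNKNOWN;
}
```

A scanner that only walks cpio headers would have to stop at the
INITRD_GZIP case unless it also decompresses the image first.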



[Xen-devel] [xen-4.3-testing test] 32089: regressions - FAIL

2014-12-05 Thread xen . org

flight 32089 xen-4.3-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32089/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 31811

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64         1 build-check(1)     blocked  n/a
 test-amd64-i386-rumpuserxen-i386           1 build-check(1)     blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64        7 debian-hvm-install fail     never pass
 test-amd64-i386-libvirt                    9 guest-start        fail     never pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64       7 debian-hvm-install fail     never pass
 test-amd64-amd64-libvirt                   9 guest-start        fail     never pass
 test-amd64-amd64-xl-pcipt-intel            9 guest-start        fail     never pass
 test-armhf-armhf-xl                        5 xen-boot           fail     never pass
 test-armhf-armhf-libvirt                   5 xen-boot           fail     never pass
 test-amd64-i386-xl-qemut-win7-amd64       14 guest-stop         fail     never pass
 build-i386-rumpuserxen                     6 xen-build          fail     never pass
 test-amd64-i386-xend-winxpsp3             17 leak-check/check   fail     never pass
 build-amd64-rumpuserxen                    6 xen-build          fail     never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  14 guest-stop         fail     never pass
 test-amd64-i386-xl-winxpsp3-vcpus1        14 guest-stop         fail     never pass
 test-amd64-i386-xend-qemut-winxpsp3       17 leak-check/check   fail     never pass
 test-amd64-i386-xl-win7-amd64             14 guest-stop         fail     never pass
 test-amd64-amd64-xl-win7-amd64            14 guest-stop         fail     never pass
 test-amd64-amd64-xl-qemut-win7-amd64      14 guest-stop         fail     never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  14 guest-stop         fail     never pass
 test-amd64-amd64-xl-qemuu-win7-amd64      14 guest-stop         fail     never pass
 test-amd64-i386-xl-qemuu-win7-amd64       14 guest-stop         fail     never pass
 test-amd64-amd64-xl-winxpsp3              14 guest-stop         fail     never pass
 test-amd64-amd64-xl-qemut-winxpsp3        14 guest-stop         fail     never pass
 test-amd64-amd64-xl-qemuu-winxpsp3        14 guest-stop         fail     never pass

version targeted for testing:
 xen  e0921ec746410f0a07eb3767e95e5eda25d4934a
baseline version:
 xen  62f1b78417f3a9afe8d40ee3c0d2f0495240cf47


People who touched revisions under test:
  Jan Beulich jbeul...@suse.com


jobs:
 build-amd64                                   pass
 build-armhf                                   pass
 build-i386                                    pass
 build-amd64-libvirt                           pass
 build-armhf-libvirt                           pass
 build-i386-libvirt                            pass
 build-amd64-pvops                             pass
 build-armhf-pvops                             pass
 build-i386-pvops                              pass
 build-amd64-rumpuserxen                       fail
 build-i386-rumpuserxen                        fail
 test-amd64-amd64-xl                           pass
 test-armhf-armhf-xl                           fail
 test-amd64-i386-xl                            pass
 test-amd64-i386-rhel6hvm-amd                  pass
 test-amd64-i386-qemut-rhel6hvm-amd            pass
 test-amd64-i386-qemuu-rhel6hvm-amd            pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64     pass
 test-amd64-i386-xl-qemut-debianhvm-amd64      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64      pass
 test-amd64-i386-freebsd10-amd64               pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64          fail
 test-amd64-i386-xl-qemuu-ovmf-amd64           fail
 test-amd64-amd64-rumpuserxen-amd64            blocked
 test-amd64-amd64-xl-qemut-win7-amd64          fail
 test-amd64-i386-xl-qemut-win7-amd64           fail
 test-amd64-amd64-xl-qemuu-win7-amd64          fail
 test-amd64-i386-xl-qemuu-win7-amd64           fail
 test-amd64-amd64-xl-win7-amd64                fail
 test-amd64-i386-xl-win7-amd64                 fail
 test-amd64-i386-xl-credit2                    pass

Re: [Xen-devel] [PATCH] lock down hypercall continuation encoding masks

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 16:01, andrew.coop...@citrix.com wrote:
 On 05/12/14 14:47, Jan Beulich wrote:
 On 05.12.14 at 15:36, andrew.coop...@citrix.com wrote:
 On 05/12/14 11:31, Jan Beulich wrote:
 Andrew validly points out that even if these masks aren't a formal part
 of the hypercall interface, we aren't free to change them: A guest
 suspended for migration in the middle of a continuation would fail to
 work if resumed on a hypervisor using a different value. Hence add
 respective comments to their definitions.

 Additionally, to help future extensibility as well as in the spirit of
 reducing undefined behavior as much as possible, refuse hypercalls made
 with the respective bits non-zero when the respective sub-ops don't
 make use of those bits.

 Reported-by: Andrew Cooper andrew.coop...@citrix.com
 Signed-off-by: Jan Beulich jbeul...@suse.com
 General principle looks good.  A couple of issues.

 --- a/xen/arch/x86/mm.c
 +++ b/xen/arch/x86/mm.c
 @@ -4661,9 +4661,8 @@ int xenmem_add_to_physmap_one(
  long arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
  {
  int rc;
 -int op = cmd & MEMOP_CMD_MASK;
 This needs a blanket start_iter check, as do_memory_op() has not done so.
 Not sure what you're asking for - why is removing the masking not
 sufficient?
 
 There is no check to ensure that a non-preemptible arch_memory_op is not
 called with a non-zero start_iter.
 
 This location needs something like
 
 if ( cmd & ~MEMOP_CMD_MASK )
 return -ENOSYS;

I'm sorry - the default case of sub_arch_memory_op() will ensure
this.

 The ARM code also needs one, as the caller has applied partial checks.
 The ARM code never applied a mask.
 
 But the common code does, so the ARM code must follow suit for consistency.
 
 Otherwise, we end up with ARM non-preemptible memory subops not failing
 with -ENOSYS where primary memory ops would.

Again, the default case results in -ENOSYS for any with the high
bits set.

Jan
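
The two positions in this exchange can be compared in a small stand-alone
sketch. It is not the Xen implementation: the sub-op numbers and function
names are hypothetical, and the mask layout (low 6 bits for the command) is
an assumption made for the example. Once the masking is removed, a switch on
the full cmd value sends any command with continuation bits set to the
default case, which gives the same result as an explicit check up front.

```c
#include <errno.h>

/* Assumed layout for illustration: low bits hold the sub-op number. */
#define MEMOP_EXTENT_SHIFT 6
#define MEMOP_CMD_MASK     ((1UL << MEMOP_EXTENT_SHIFT) - 1)

/* Hypothetical non-preemptible sub-ops. */
#define DEMO_OP_A 1UL
#define DEMO_OP_B 2UL

/* Switch on the unmasked value: high bits never match a case label. */
static long demo_arch_memory_op(unsigned long cmd)
{
    switch ( cmd )
    {
    case DEMO_OP_A:
    case DEMO_OP_B:
        return 0;

    default:
        return -ENOSYS;
    }
}

/* The alternative with an explicit up-front check of the high bits. */
static long demo_arch_memory_op_explicit(unsigned long cmd)
{
    if ( cmd & ~MEMOP_CMD_MASK )
        return -ENOSYS;

    return demo_arch_memory_op(cmd);
}
```

Both variants reject DEMO_OP_A with any bit above the mask set; the explicit
check only makes the intent visible at the call site.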



Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Julien Grall

On 05/12/14 14:42, Ian Campbell wrote:
 On Fri, 2014-12-05 at 14:36 +, Julien Grall wrote:
 Hi,

 On 05/12/14 14:27, Ian Campbell wrote:
 On Fri, 2014-12-05 at 13:51 +, Jan Beulich wrote:
  #define nr_static_irqs NR_IRQS
 +#define arch_hwdom_irqs(domid) NR_IRQS

 FWIW gic_number_lines() is the ARM equivalent of getting the number of
 GSIs.

 *BUT* we don't actually use pirqs on ARM (everything goes via the
 virtualised interrupt controller). So maybe we should be setting
 nr_pirqs to 0 on ARM. I appreciate you likely want such a patch to come
 from an ARM person, so I'm fine with you making this NR_IRQS in the
 meantime.

 As we already know that PIRQ is not used on ARM, it would make sense to
 use directly in this patch 0.
 
 Are you offering to give a tested-by if Jan posts such a patch?

nr_pirqs is used in 2 different places (not counting this setting):
- event channel => We don't care on ARM, as alloc_pirq_struct
returns NULL
- XEN_DOMCTL_irq_permission => I don't really understand this bit.
AFAIU the pirq number is different in each domain. But we use it to
check permission on both domains. Shouldn't we translate the pirq to an irq
for current->domain?

Regards,

-- 
Julien Grall


Re: [Xen-devel] [xen-4.4-testing test] 31991: regressions - FAIL [and 1 more messages]

2014-12-05 Thread Ian Jackson

xen.org writes ([xen-4.4-testing test] 31991: regressions - FAIL):
 flight 31991 xen-4.4-testing real [real]
 http://www.chiark.greenend.org.uk/~xensrcts/logs/31991/
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 
 31781

This is the swiotlb problem which is not a recent regression in Xen
4.3, but probably a gradually-regressing kernel problem.

 version targeted for testing:
  xen  a39f202031d7f1d8d9e14b8c3d7d11c812db253e

xen.org writes ([xen-4.3-testing test] 32089: regressions - FAIL):
 flight 32089 xen-4.3-testing real [real]
 http://www.chiark.greenend.org.uk/~xensrcts/logs/32089/
 
 Regressions :-(
 
 Tests which did not succeed and are blocking,
 including tests which could not be run:
  test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 
 31811

Likewise.

 version targeted for testing:
  xen  e0921ec746410f0a07eb3767e95e5eda25d4934a

In both of these cases, that was the only reason osstest didn't do a
push.  Following discussion with Jan on IRC, I am going to do a manual
force push of both trees.

Ian.


Re: [Xen-devel] [PATCH OSSTEST] ts-xen-build-prep: Install libxml-xpath-perl on build machines

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 14:55 +, Ian Campbell wrote:
 Required by latest libvirt, to build docs.
 
 Signed-off-by: Ian Campbell ian.campb...@citrix.com

Ian acked this on IRC and I have pushed it along with some other bits
and bobs floating around already acked. Specifically I have pushed to
pretest:
0d8405e Add simple helper to update DI for all architectures.
e7ed319 ts-kernel-build: enable CONFIG_IKCONFIG{_PROC}
6184712 standalone: Introduce HostGroups for use in OSSTEST_CONFIG
a70253f ts-xen-build-prep: Install libxml-xpath-perl on build machines
60670dd linux-next tests: Use correct branch for baseline




[Xen-devel] [PATCH v2] console: increase initial conring size

2014-12-05 Thread Daniel Kiper

In general, the initial conring size is sufficient. However, if the log
level is increased on platforms which have e.g. a huge number
of memory regions (I have an IBM System x3550 M2 with 8 GiB RAM
which has more than 200 entries in the EFI memory map), then some
of the earlier messages in the console ring are overwritten. This means
that in case of issues deeper analysis can be hindered. Sadly, the
conring_size argument does not help, because the new console buffer
is allocated late, on the heap. This means that it is not possible to
allocate a larger ring earlier. So, in this situation the initial
conring size should be increased. My experiments showed that
even on not so big machines more than 26 KiB of free space is
needed for the initial messages. In theory we could increase the conring
buffer size to 32 KiB. However, I think that this value could be
too small for huge machines with a large number of ACPI tables and
EFI memory regions. Hence, this patch increases the initial conring
size to 64 KiB.

Signed-off-by: Daniel Kiper daniel.ki...@oracle.com
---
This bug (or lack of feature if you prefer) should be fixed, as it
was pointed out by Jan Beulich and Olaf Hering, by allocating conring
earlier. I thought about that before posting this patch (I did not
know beforehand about Olaf's work done in 2011). However, I stated
that it is too late to make such intrusive changes. So, I think we
should (sadly) apply this band-aid to 4.5 because, as you can see
in the Xen-devel archive, this bug hits more and more people and they fix
this issue in the same way as I did in this patch. On the other hand,
I agree that we should finally fix this issue in a better way.
Hence, I am adding this to my TODO list.

v2 - suggestions/fixes:
   - update documentation
 (suggested by Andrew Cooper),
   - add rationale
 (suggested by Jan Beulich).
---
 docs/misc/xen-command-line.markdown |2 +-
 xen/drivers/char/console.c  |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 0866df2..2ad2340 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -286,7 +286,7 @@ A typical setup for most situations might be 
`com1=115200,8n1`
 ### conring\_size
  `= size`
 
- Default: `conring_size=16k`
+ Default: `conring_size=64k`
 
 Specify the size of the console ring buffer.
 
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index 2f03259..429d296 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -67,7 +67,7 @@ custom_param(console_timestamps, parse_console_timestamps);
 static uint32_t __initdata opt_conring_size;
 size_param(conring_size, opt_conring_size);
 
-#define _CONRING_SIZE 16384
+#define _CONRING_SIZE 65536
 #define CONRING_IDX_MASK(i) ((i)&(conring_size-1))
 static char __initdata _conring[_CONRING_SIZE];
 static char *__read_mostly conring = _conring;
-- 
1.7.10.4
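
The overwriting described in the commit message follows from how a
power-of-two ring is indexed with a mask. The following is a minimal
stand-alone sketch, not the Xen console code: the demo_ring type, its helper
functions and the tiny size of 8 bytes are assumptions chosen to make the
effect visible.

```c
/* Hypothetical tiny ring; the size must be a power of two. */
#define DEMO_RING_SIZE   8u
#define DEMO_IDX_MASK(i) ((i) & (DEMO_RING_SIZE - 1))

struct demo_ring {
    char buf[DEMO_RING_SIZE];
    unsigned int prod;          /* total number of bytes ever written */
};

/* Append a string; once the ring is full the oldest bytes are overwritten. */
static void demo_ring_put(struct demo_ring *r, const char *s)
{
    while ( *s )
        r->buf[DEMO_IDX_MASK(r->prod++)] = *s++;
}

/* Number of early bytes that have been lost to wrap-around. */
static unsigned int demo_ring_lost(const struct demo_ring *r)
{
    return r->prod > DEMO_RING_SIZE ? r->prod - DEMO_RING_SIZE : 0;
}
```

Writing 10 bytes into this 8-byte ring loses the first 2. Enlarging the
static buffer, as the patch does, simply raises the point at which early
boot messages start to be lost.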



Re: [Xen-devel] [PATCH 2/4] sysctl/libxl: Add interface for returning IO topology data

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 16:55, jbeul...@suse.com wrote:
 On 02.12.14 at 22:34, boris.ostrov...@oracle.com wrote:
 +struct xen_sysctl_iotopo {
 +uint16_t seg;
 +uint8_t bus;
 +uint8_t devfn;
 +uint32_t node;
 +};
 
 This is PCI-centric without expressing in the name or layout. Perhaps
 the first part should be a union from the very beginning?

And I wonder whether that supposed union part wouldn't be nicely
done using struct physdev_pci_device.

Additionally please add IN and OUT annotations. When I first saw
this I assumed they would all be OUT (in which case the long running
loop problem mentioned in the reply to one of the other patches
wouldn't have been there), matching their CPU counterpart...

Jan



Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Julien Grall

On 05/12/14 15:42, Jan Beulich wrote:
 On 05.12.14 at 16:25, julien.gr...@linaro.org wrote:
  - XEN_DOMCTL_irq_permission => I don't really understand this bit.
 AFAIU the pirq number is different in each domain. But we use it to
 check permission on both domains. Shouldn't we translate the pirq to an irq
 for current->domain?
 
 Indeed, see also
 http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html

Do you plan to send a patch to resolve this problem?

Regards,

-- 
Julien Grall


Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design

2014-12-05 Thread Wei Liu

I have to admit I'm confused by the back and forth discussion. It's hard
to justify the design of new API without knowing what the constraints
and requirements are from your PoV.

Here are my two cents, not about details, but about general constraints.

There are two layers, one is user of libxl (clients -- xl, libvirt etc)
and libxl (the library itself).

1. it's better to *not* have storage management in libxl.

It's likely that clients can have their own management functionality
already.  I'm told that libvirt has that as well as XAPI. Having this
functionality in libxl is a bit redundant and requires lots of work
(enlighten libxl on what a disk looks like and call out to various
utilities).

2. it's *not* a requirement for xl to have the capability to manage
snapshots.

It's the same argument as xl having no idea how to manage snapshots
created by xl save.  This should ease your concern about having to
duplicate code for libvirt and xl.  IMHO xl only needs to have the
capability to create a snapshot and create a domain from a snapshot. The
downside is that now xl and libvirt are disconnected, but I think it's
fine. The argument is that you're not allowed to run two toolstacks on
the same host (think about xl and xend in previous releases).

Do these two constraints make your work easier (or harder)?

Regarding JSON API, as Ian said, feel free to hook it up to libxlu.

Wei.


Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design

2014-12-05 Thread Ian Campbell

On Fri, 2014-12-05 at 16:06 +, Wei Liu wrote:
 Regarding JSON API, as Ian said, feel free to hook it up to libxlu.

*If* it is useful to multiple toolstacks but not suitable for libxl then
libxlu would be the right place.

As I understood things the need for JSON here was xl specific, and it is
IMHO fine for xl to also use the idl infrastructure, without needing to
launder it via libxlu.

Ian.



Re: [Xen-devel] [v8][PATCH 17/17] xen/vtd: re-enable USB device assignment if enable pci_force

2014-12-05 Thread Konrad Rzeszutek Wilk

On Mon, Dec 01, 2014 at 05:24:35PM +0800, Tiejun Chen wrote:
 Before we refine RMRR mechanism, USB RMRR may conflict with guest bios
 region so we always ignore USB RMRR. Now this can be gone when we enable
 pci_force to check/reserve RMRR.
 
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
  xen/drivers/passthrough/vtd/dmar.h  |  1 +
  xen/drivers/passthrough/vtd/iommu.c | 12 
  xen/drivers/passthrough/vtd/utils.c | 18 ++
  3 files changed, 27 insertions(+), 4 deletions(-)
 
 diff --git a/xen/drivers/passthrough/vtd/dmar.h 
 b/xen/drivers/passthrough/vtd/dmar.h
 index a57c0d4..832dc32 100644
 --- a/xen/drivers/passthrough/vtd/dmar.h
 +++ b/xen/drivers/passthrough/vtd/dmar.h
 @@ -132,6 +132,7 @@ do {\
  int vtd_hw_check(void);
  void disable_pmr(struct iommu *iommu);
  int is_usb_device(u16 seg, u8 bus, u8 devfn);
 +int is_reserve_device_memory(struct domain *d, u8 bus, u8 devfn);
  int is_igd_drhd(struct acpi_drhd_unit *drhd);
  
  #endif /* _DMAR_H_ */
 diff --git a/xen/drivers/passthrough/vtd/iommu.c 
 b/xen/drivers/passthrough/vtd/iommu.c
 index ba40209..1f1ceb7 100644
 --- a/xen/drivers/passthrough/vtd/iommu.c
 +++ b/xen/drivers/passthrough/vtd/iommu.c
 @@ -2264,9 +2264,11 @@ static int reassign_device_ownership(
   * remove it from the hardware domain, because BIOS may use RMRR at
   * booting time. Also account for the special casing of USB below (in
   * intel_iommu_assign_device()).
 + * But if we already check to reserve RMRR, this should be fine.
   */
  if ( !is_hardware_domain(source) &&
 - !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) )
 + !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) &&
 + !is_reserve_device_memory(source, pdev->bus, pdev->devfn) )
  {
  const struct acpi_rmrr_unit *rmrr;
  u16 bdf;
 @@ -2315,12 +2317,14 @@ static int intel_iommu_assign_device(
  if ( ret )
  return ret;
  
 -/* FIXME: Because USB RMRR conflicts with guest bios region,
 - * ignore USB RMRR temporarily.
 +/*
 + * Because USB RMRR conflicts with guest bios region,
 + * ignore USB RMRR temporarily in case of non-reserving-RMRR.
   */
  seg = pdev->seg;
  bus = pdev->bus;
 -if ( is_usb_device(seg, bus, pdev->devfn) )
 +if ( is_usb_device(seg, bus, pdev->devfn) &&
 + !is_reserve_device_memory(d, bus, pdev->devfn) )
  return 0;
  
  /* Setup rmrr identity mapping */
 diff --git a/xen/drivers/passthrough/vtd/utils.c 
 b/xen/drivers/passthrough/vtd/utils.c
 index a33564b..1045ac1 100644
 --- a/xen/drivers/passthrough/vtd/utils.c
 +++ b/xen/drivers/passthrough/vtd/utils.c
 @@ -36,6 +36,24 @@ int is_usb_device(u16 seg, u8 bus, u8 devfn)
  return (class == 0xc03);
  }
  
 +int is_reserve_device_memory(struct domain *d, u8 bus, u8 devfn)
 +{
 +int i = 0;
 +
 +if ( d->arch.hvm_domain.pci_force == PCI_DEV_RDM_CHECK )
 +return 1;

Ouch. What if the 'hvm_domain' is not there? Please check
first for that.
 +
 +for ( i = 0; i < d->arch.hvm_domain.num_pcidevs; i++ )
 +{
 +if ( d->arch.hvm_domain.pcidevs[i].bus == bus &&
 + d->arch.hvm_domain.pcidevs[i].devfn == devfn &&
 + d->arch.hvm_domain.pcidevs[i].flags == PCI_DEV_RDM_CHECK )
 +return 1;
 +}
 +
 +return 0;
 +}
 +
  /* Disable vt-d protected memory registers. */
  void disable_pmr(struct iommu *iommu)
  {
 -- 
 1.9.1
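
The review comment about checking for 'hvm_domain' first can be sketched as
follows. This is a simplified stand-alone illustration, not the Xen code:
the demo_ types, the is_hvm flag and the value of DEMO_RDM_CHECK are all
assumptions standing in for the real domain structures and for
PCI_DEV_RDM_CHECK.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RDM_CHECK 1u   /* hypothetical flag value */

struct demo_pcidev {
    uint8_t  bus, devfn;
    uint32_t flags;
};

/* Simplified stand-in for a domain with optional HVM-only state. */
struct demo_domain {
    bool is_hvm;                        /* HVM fields valid only if set */
    uint32_t pci_force;
    unsigned int num_pcidevs;
    const struct demo_pcidev *pcidevs;
};

static int demo_is_reserve_device_memory(const struct demo_domain *d,
                                         uint8_t bus, uint8_t devfn)
{
    unsigned int i;

    /* Guard first: non-HVM domains have no HVM state to look at. */
    if ( !d || !d->is_hvm )
        return 0;

    if ( d->pci_force == DEMO_RDM_CHECK )
        return 1;

    for ( i = 0; i < d->num_pcidevs; i++ )
        if ( d->pcidevs[i].bus == bus &&
             d->pcidevs[i].devfn == devfn &&
             d->pcidevs[i].flags == DEMO_RDM_CHECK )
            return 1;

    return 0;
}
```

With the guard in place, the HVM-only fields are never read for a domain
that does not carry them.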
 


Re: [Xen-devel] [PATCH] tools/hotplug: update systemd dependency to use service instead of socket

2014-12-05 Thread Konrad Rzeszutek Wilk

On Fri, Dec 05, 2014 at 09:28:44AM +0100, Olaf Hering wrote:
 On Fri, Dec 05, Olaf Hering wrote:
 
  So looking again at
  tools/hotplug/Linux/systemd/var-lib-xenstored.mount.in it seems that it
  happens to work for me because XENSTORED_MOUNT_CTX is set within that
  file. So if something happens to need a different value for
  XENSTORED_MOUNT_CTX it has to be provided in the to-be-created config
  file: EnvironmentFile=-@CONFIG_DIR@/@CONFIG_LEAF_DIR@/xenstored
  This config file is not part of xen. 
 
 And I wonder why a new config file has to be created, instead of just
 reusing the existing tools/hotplug/Linux/init.d/sysconfig.xencommons.in?

Right.
 
 I will send out a few patches to adjust the EnvironmentFile handling.

Excellent. Will be happy to test them out.
 
 It's just a question of whether a configure
 --with-selinux-mount-context=VAL is needed.

OK. That might be complicated in that the context could change between
bootup and run-time (I think that is what Michael told me).


 
 Olaf
 


Re: [Xen-devel] [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT

2014-12-05 Thread Luis R. Rodriguez

On Wed, Dec 03, 2014 at 08:39:47PM +0100, Luis R. Rodriguez wrote:
 On Wed, Dec 03, 2014 at 05:37:51AM +0100, Juergen Gross wrote:
  On 12/03/2014 03:28 AM, Luis R. Rodriguez wrote:
  On Tue, Dec 02, 2014 at 11:11:18AM +, David Vrabel wrote:
  On 01/12/14 22:36, Luis R. Rodriguez wrote:
 
  Then I do agree its a fair analogy (and find this obviously odd that how
  widespread cond_resched() is), we just don't have an equivalent for IRQ
  context, why not avoid the special check then and use this all the time 
  in the
  middle of a hypercall on the return from an interrupt (e.g., the timer
  interrupt)?
 
  http://lists.xen.org/archives/html/xen-devel/2014-02/msg01101.html
 
  OK thanks! That explains why we need some asm code but in that submission 
  you
  still also had used is_preemptible_hypercall(regs) and in the new
  implementation you use a CPU variable xen_in_preemptible_hcall prior to 
  calling
  preempt_schedule_irq(). I believe you added the CPU variable because
  preempt_schedule_irq() will preempt first without any checks if it should, 
  I'm
  asking why not do something like cond_resched_irq() where we check with
  should_resched() prior to preempting and that way we can avoid having to 
  use
  the CPU variable?
 
  Because that could preempt at any asynchronous interrupt making the
  no-preempt kernel fully preemptive. 
 
 OK yeah I see. That still doesn't negate the value of using something
 like cond_resched_irq() with a should_resched() on only critical hypercalls.
 The current implementation (patch by David) forces preemption without
 checking for should_resched() so it would preempt unnecessarily at least
 once.
 
  How would you know you are just
  doing a critical hypercall which should be preempted?
 
 You would not, you're right. I was just trying to see if we could generalize
 an API for this to avoid having users having to create their own CPU variables
 but this all seems very specialized as we want to use this on the timer
 so if we do generalize a cond_resched_irq() perhaps the documentation can
 warn about this type of case or abuse.

David's patch had the check, only it was x86-based. If we use cond_resched_irq(),
we can leave that aspect to be done through asm inlines, or it will use the
generic should_resched(); that should save some code in the asm implementations.

I now have some patches which generalize this. I also have more information
about how exactly this can happen, and a way to trigger it on small systems with
some hacks to emulate possible backend behaviour on larger systems. In the worst
case this can be a dangerous situation to be in. I'll send some new RFTs.

 Luis


Re: [Xen-devel] [PATCH v2] console: increase initial conring size

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 16:50, daniel.ki...@oracle.com wrote:
 This bug (or lack of feature if you prefer) should be fixed, as it
 was pointed out by Jan Beulich and Olaf Hering, by allocating conring
 earlier. I thought about that before posting this patch (I did not
 know beforehand about Olaf's work done in 2011). However, I stated
 that it is too late to make such intrusive changes.

I continue to disagree. If anything, I'd rather see us hide (e.g. behind
opt_cpu_info) some of the worst offenders causing the log to become
that large. Even if yielding a bigger patch, that would have less impact
functionality wise and likely benefit more people. Nor do I see the
change to move the allocation earlier all that intrusive.

But then again, considering that all you enlarge is an __initdata item,
perhaps this is acceptable.

Jan



Re: [Xen-devel] [RFC V8 2/3] libxl domain snapshot API design

2014-12-05 Thread Wei Liu

On Fri, Dec 05, 2014 at 04:11:48PM +, Ian Campbell wrote:
 On Fri, 2014-12-05 at 16:06 +, Wei Liu wrote:
  Regarding JSON API, as Ian said, feel free to hook it up to libxlu.
 
 *If* it is useful to multiple toolstacks but not suitable for libxl then
 libxlu would be the right place.
 
 As I understood things the need for JSON here was xl specific, and it is
 IMHO fine for xl to also use the idl infrastructure, without needing to
 launder it via libxlu.
 

Hmm... I was thinking that if by any chance Chunyan wants to unify xl's and
libvirt's knowledge of a domain snapshot, it can go into libxlu.

I'm no libvirt expert though. If libvirt doesn't need this then putting
it in xl is enough.

Wei.

 Ian.


Re: [Xen-devel] [PATCH] have architectures specify the number of PIRQs a hardware domain gets

2014-12-05 Thread Jan Beulich

 On 05.12.14 at 17:05, julien.gr...@linaro.org wrote:
 On 05/12/14 15:42, Jan Beulich wrote:
 On 05.12.14 at 16:25, julien.gr...@linaro.org wrote:
 - XEN_DOMCTL_irq_permission => I don't really understand this bit.
 AFAIU the pirq number is different in each domain. But we use it to
 check permission on both domains. Shouldn't we translate the pirq to an irq
 for current->domain?
 
 Indeed, see also
 http://lists.xenproject.org/archives/html/xen-devel/2014-12/msg00219.html 
 
 Do you plan to send a patch to resolve this problem?

So far I assumed Stefano would, as he was running into an issue
which iirc fixing this would help.

Jan



Re: [Xen-devel] PVH cleanups after 4.5

2014-12-05 Thread Konrad Rzeszutek Wilk

On Fri, Dec 05, 2014 at 10:42:27AM +, Andrew Cooper wrote:
 On 05/12/14 09:54, Ian Campbell wrote:
  On Fri, 2014-12-05 at 10:49 +0100, Tim Deegan wrote:
  At 09:20 + on 05 Dec (1417767654), Jan Beulich wrote:
  On 04.12.14 at 18:25, t...@xen.org wrote:
  Potential feature flags, based on whiteboard notes at the session.
  Things that are 'Yes' in both columns might not need actual flags :)
 
   'HVM'   'PVH'
  64bit hypercalls  Yes Yes
  32bit hypercalls  Yes No
  Iiuc the lack of support of 32-bit hypercalls is simply because PVH
  guests aren't expected to use them as being always 64-bit right
  now. I.e. I can't really see why we couldn't just enable them once
  the 64-bit hypercall tables got combined, in which case we wouldn't
  need a feature flag here either.
  Agreed -- I think the same will apply to a few other things, like shadow
  pagetables and some of the other MM tricks.  
  Might we want to constrain a given PVH domain to only make 32- or 64-bit
  hypercalls?
 
  Or do we consider already having crossed that bridge with HVM enough
  reason to allow it for PVH? I wonder whether, even if it is
  technically possible to support, not doing so might mitigate some
  potential security issues down the line. There's obviously a tradeoff
  against in-guest flexibility though.
 
 Mandating a 32/64bit split serves only to cause booting issues; you need
 to know a priori what the eventual kernel is going to be before you
 build the domain. This is an awkward issue with PV domains which
 *really* wants not to apply to PVH as well.
 
 PVH guests with the plan of HVM - qemu should be able to fully choose
 their operating mode, and allow for in-guest bootstrapping which is far
 superior from a security/isolation point of view than toolstack
 bootstrapping.

Or another use-case: kexec-ing from within a 64-bit PVH guest to a
32-bit PVH guest or vice versa.

 
 ~Andrew
 
 


[Xen-devel] [PATCH] libxl: Set path to console on domain startup.

2014-12-05 Thread Anthony PERARD

The path to the pty of a Xen PV console is set only in
virDomainOpenConsole. But this is done too late. A call to
virDomainGetXMLDesc done before OpenConsole will not have the path to
the pty, but a call after OpenConsole will.

e.g. of the current issue.
Starting a domain with '<console type="pty"/>'
Then:
virDomainGetXMLDesc():
  <devices>
    <console type='pty'>
      <target type='xen' port='0'/>
    </console>
  </devices>
virDomainOpenConsole()
virDomainGetXMLDesc():
  <devices>
    <console type='pty' tty='/dev/pts/30'>
      <source path='/dev/pts/30'/>
      <target type='xen' port='0'/>
    </console>
  </devices>

The patch intends to make the tty path available on the first call of GetXMLDesc.

Signed-off-by: Anthony PERARD anthony.per...@citrix.com
---
 src/libxl/libxl_domain.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index 9c62291..de56054 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -1290,6 +1290,23 @@ libxlDomainStart(libxlDriverPrivatePtr driver, 
virDomainObjPtr vm,
     if (libxlDomainSetVcpuAffinities(driver, vm) < 0)
         goto cleanup_dom;
 
+    if (vm->def->nconsoles) {
+        virDomainChrDefPtr chr = NULL;
+        chr = vm->def->consoles[0];
+        if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) {
+            libxl_console_type console_type;
+            char *console = NULL;
+            console_type =
+                (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ?
+                 LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV);
+            ret = libxl_console_get_tty(priv->ctx, vm->def->id, chr->target.port,
+                                        console_type, &console);
+            if (!ret)
+                ignore_value(VIR_STRDUP(chr->source.data.file.path, console));
+            VIR_FREE(console);
+        }
+    }
+
     if (!start_paused) {
         libxl_domain_unpause(priv->ctx, domid);
         virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_BOOTED);
-- 
Anthony PERARD



Re: [Xen-devel] A few EFI code questions

2014-12-05 Thread Daniel Kiper

On Fri, Dec 05, 2014 at 03:00:14PM +, Jan Beulich wrote:
  On 05.12.14 at 15:51, daniel.ki...@oracle.com wrote:
  On Thu, Dec 04, 2014 at 09:35:01AM +, Jan Beulich wrote:
   On 03.12.14 at 22:02, daniel.ki...@oracle.com wrote:
   3) Should not we change xen/arch/*/efi/efi-boot.h to
  xen/arch/*/efi/efi-boot.c? efi-boot.h contains more
  code than definitions, declarations and short static
  functions. So, I think that it is more regular *.c file
  than header file.
 
  That's a matter of taste - I'd probably have made it .c too, but
  didn't mind it being .h as done by Roy (presumably on the basis
  that #include directives are preferred to have .h files as their
  operands). The only thing I regret is that I didn't ask for the
  pointless efi- prefix to be dropped.
 
  As I can see, a few people agree to some extent with my suggestion.
  Great! Sadly, if we want a .c file, then a simple boot.c (as Jan suggested,
  we can drop the efi- prefix) conflicts with the existing boot.c link. Is
  efi-boot.c OK? Or maybe boot-arch.c? boot.h is OK for sure. Which one do
  you prefer? Do you have better ideas?

 boot.h would be my preference given how things look like right now,

Granted!

 but I don't think this possibility of renaming warrants a much longer
 discussion. Please also remember that renaming always implies more
 cumbersome backporting, even if only slightly more.

I suppose that you are thinking about backporting my EFI + multiboot2
patches somewhere. If you wish I can rename this file after my patch
series or even later to take some fixes for bugs in my code not
discovered earlier. Is it OK for you?

Daniel


Re: [Xen-devel] [PATCH v2] console: increase initial conring size

2014-12-05 Thread Konrad Rzeszutek Wilk

On Fri, Dec 05, 2014 at 04:21:35PM +, Jan Beulich wrote:
  On 05.12.14 at 16:50, daniel.ki...@oracle.com wrote:
  This bug (or lack of feature if you prefer) should be fixed, as it
  was pointed out by Jan Beulich and Olaf Hering, by allocating conring
  earlier. I thought about that before posting this patch (I did not
  know beforehand about Olaf's work made in 2011). However, I stated
  that it is too late to make such intrusive changes.
 
 I continue to disagree. If anything, I'd rather see us hide (e.g. behind
 opt_cpu_info) some of the worst offenders causing the log to become
 that large. Even if yielding a bigger patch, that would have less impact

Nowadays the worst offender is the EFI memmap which can be quite
big. We could hide it behind 'opt_efi_info' and only print out some
rather odd entries. But that would be 4.6 material, while this
patch nicely fixes it for 4.5.

 functionality wise and likely benefit more people. Nor do I see the
 change to move the allocation earlier all that intrusive.
 
 But then again, considering that all you enlarge is an __initdata item,
 perhaps this is acceptable.

This has the other side-benefit that it will help us troubleshoot in
the field without having the customer try extra parameters to extend
the log data.

I am all for fewer round-trips when troubleshooting issues and I can't
see this causing any regressions (unless we have some hard-coded EFL
section data).


 
 Jan
 


[Xen-devel] [PATCH] console: allocate ring buffer earlier

2014-12-05 Thread Jan Beulich

... when conring_size= was specified on the command line. We can't
really do this as early as we would want to when the option was not
specified, as the default depends on knowing the system CPU count. Yet
the parsing of the ACPI tables is one of the things that generates a
lot of output especially on large systems.

I didn't change ARM, as I wasn't sure how far ahead this call could be
pulled.

Signed-off-by: Jan Beulich jbeul...@suse.com

--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne
     }
 
     vm_init();
+    console_init_mem();
     vesa_init();
 
     softirq_init();
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -744,15 +744,14 @@ void __init console_init_preirq(void)
     }
 }
 
-void __init console_init_postirq(void)
+void __init console_init_mem(void)
 {
     char *ring;
     unsigned int i, order, memflags;
-
-    serial_init_postirq();
+    unsigned long flags;
 
     if ( !opt_conring_size )
-        opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh);
+        return;
 
     order = get_order_from_bytes(max(opt_conring_size, conring_size));
     memflags = MEMF_bits(crashinfo_maxaddr_bits);
@@ -763,17 +762,28 @@ void __init console_init_postirq(void)
     }
     opt_conring_size = PAGE_SIZE << order;
 
-    spin_lock_irq(&console_lock);
+    spin_lock_irqsave(&console_lock, flags);
     for ( i = conringc ; i != conringp; i++ )
         ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)];
     conring = ring;
     smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
     conring_size = opt_conring_size;
-    spin_unlock_irq(&console_lock);
+    spin_unlock_irqrestore(&console_lock, flags);
 
     printk("Allocated console ring of %u KiB.\n", opt_conring_size >> 10);
 }
 
+void __init console_init_postirq(void)
+{
+    serial_init_postirq();
+
+    if ( !opt_conring_size )
+        opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh);
+
+    if ( conring == _conring )
+        console_init_mem();
+}
+
 void __init console_endboot(void)
 {
     int i, j;
--- a/xen/include/xen/console.h
+++ b/xen/include/xen/console.h
@@ -14,6 +14,7 @@ struct xen_sysctl_readconsole;
 long read_console_ring(struct xen_sysctl_readconsole *op);
 
 void console_init_preirq(void);
+void console_init_mem(void);
 void console_init_postirq(void);
 void console_endboot(void);
 int console_has(const char *device);
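
For readers following the copy loop in the patch above, here is a minimal
standalone sketch of growing a power-of-two ring while the free-running
producer and consumer indices stay valid. It is not the Xen code: struct ring,
ring_putc(), ring_grow() and ring_peek() are hypothetical names invented for
illustration, and locking and memory allocation are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the console ring state (not the Xen definitions). */
struct ring {
    char *buf;
    unsigned int size;   /* must be a power of two */
    unsigned int cons;   /* free-running consumer index */
    unsigned int prod;   /* free-running producer index */
};

/* Append one character; the oldest data is dropped once the ring is full. */
static void ring_putc(struct ring *r, char c)
{
    r->buf[r->prod++ & (r->size - 1)] = c;
    if ( r->prod - r->cons > r->size )
        r->cons = r->prod - r->size;
}

/* Copy pending characters into a larger buffer; the indices are unchanged. */
static void ring_grow(struct ring *r, char *new_buf, unsigned int new_size)
{
    unsigned int i;

    /* Masking only works if the new size is a power of two. */
    assert(new_size >= r->size && (new_size & (new_size - 1)) == 0);

    for ( i = r->cons; i != r->prod; i++ )
        new_buf[i & (new_size - 1)] = r->buf[i & (r->size - 1)];

    r->buf = new_buf;
    r->size = new_size;
}

/* Read the pending character at offset n from the consumer index. */
static char ring_peek(const struct ring *r, unsigned int n)
{
    return r->buf[(r->cons + n) & (r->size - 1)];
}
```

Because both sizes are powers of two, the same index i can be masked against
either buffer, so nothing has to be renumbered when the buffer is swapped.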




Re: [Xen-devel] [PATCH 1/4] pci: Do not ignore device's PXM information

2014-12-05 Thread Boris Ostrovsky


On 12/05/2014 10:53 AM, Jan Beulich wrote:



--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -56,6 +56,8 @@ struct pci_dev {
  
  u8 phantom_stride;
  
+int node; /* NUMA node */

I don't think we currently support node IDs wider than 8 bits.



I used an int because pxm_to_node() returns an int. OTOH, pxm2node[], 
for which pxm_to_node() is essentially a wrapper, is a u8.
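
A small sketch of how an 8-bit table can sit behind an int-returning wrapper,
which is the mismatch being discussed. The names (pxm2node_sketch,
pxm_to_node_sketch, NO_NODE_SKETCH), the table size and the 0xFF sentinel are
assumptions made for illustration, not the Xen definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PXM_COUNT_SKETCH 256   /* assumed table size */
#define NO_NODE_SKETCH   0xFF  /* assumed "no node" sentinel */

/* 8-bit storage: one node ID per proximity domain. */
static uint8_t pxm2node_sketch[PXM_COUNT_SKETCH];

/* Mark every proximity domain as unmapped. */
static void pxm_table_init_sketch(void)
{
    for (unsigned int i = 0; i < PXM_COUNT_SKETCH; i++)
        pxm2node_sketch[i] = NO_NODE_SKETCH;
}

/* Record a mapping; the sentinel itself is not a valid node ID. */
static void pxm_set_node_sketch(unsigned int pxm, uint8_t node)
{
    assert(pxm < PXM_COUNT_SKETCH && node != NO_NODE_SKETCH);
    pxm2node_sketch[pxm] = node;
}

/* int return type so that -1 can signal "unknown"; valid IDs fit in 8 bits. */
static int pxm_to_node_sketch(unsigned int pxm)
{
    if (pxm >= PXM_COUNT_SKETCH || pxm2node_sketch[pxm] == NO_NODE_SKETCH)
        return -1;
    return pxm2node_sketch[pxm];
}
```

Under these assumptions the wider return type only carries the error value, so
a narrower field would still hold every valid ID provided the -1 case is
handled before storing.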


-boris


[Xen-devel] [PATCH for-4.5] flask/policy: Example policy updates for migration

2014-12-05 Thread Daniel De Graaf

The example XSM policy was missing permission for dom0_t to migrate
domains; add these permissions.

Reported-by: Wei Liu wei.l...@citrix.com
Signed-off-by: Daniel De Graaf dgde...@tycho.nsa.gov
---

This has been tested with xl save/restore on a PV domain, which now
succeeds without producing AVC denials.

 tools/flask/policy/policy/modules/xen/xen.if | 11 +++
 tools/flask/policy/policy/modules/xen/xen.te |  3 +++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/flask/policy/policy/modules/xen/xen.if b/tools/flask/policy/policy/modules/xen/xen.if
index fa69c9d..bf5e135 100644
--- a/tools/flask/policy/policy/modules/xen/xen.if
+++ b/tools/flask/policy/policy/modules/xen/xen.if
@@ -48,11 +48,13 @@ define(`create_domain_common', `
allow $1 $2:domain { create max_vcpus setdomainmaxmem setaddrsize
getdomaininfo hypercall setvcpucontext setextvcpucontext
getscheduler getvcpuinfo getvcpuextstate getaddrsize
-   getaffinity setaffinity };
-   allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim set_max_evtchn set_vnumainfo get_vnumainfo psr_cmt_op configure_domain };
+   getaffinity setaffinity setvcpuextstate };
+   allow $1 $2:domain2 { set_cpuid settsc setscheduler setclaim
+   set_max_evtchn set_vnumainfo get_vnumainfo cacheflush
+   psr_cmt_op configure_domain };
allow $1 $2:security check_context;
allow $1 $2:shadow enable;
-   allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op };
+   allow $1 $2:mmu { map_read map_write adjust memorymap physmap pinpage mmuext_op updatemp };
allow $1 $2:grant setup;
allow $1 $2:hvm { cacheattr getparam hvmctl irqlevel pciroute sethvmc
setparam pcilevel trackdirtyvram nested };
@@ -80,7 +82,7 @@ define(`create_domain_build_label', `
 define(`manage_domain', `
allow $1 $2:domain { getdomaininfo getvcpuinfo getaffinity
getaddrsize pause unpause trigger shutdown destroy
-   setaffinity setdomainmaxmem getscheduler };
+   setaffinity setdomainmaxmem getscheduler resume };
 allow $1 $2:domain2 set_vnumainfo;
 ')
 
@@ -88,6 +90,7 @@ define(`manage_domain', `
 #   Allow creation of a snapshot or migration image from a domain
 #   (inbound migration is the same as domain creation)
 define(`migrate_domain_out', `
+   allow $1 domxen_t:mmu map_read;
allow $1 $2:hvm { gethvmc getparam irqlevel };
allow $1 $2:mmu { stat pageinfo map_read };
allow $1 $2:domain { getaddrsize getvcpucontext getextvcpucontext getvcpuextstate pause destroy };
diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index d214470..c0128aa 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -129,12 +129,14 @@ create_domain(dom0_t, domU_t)
 manage_domain(dom0_t, domU_t)
 domain_comms(dom0_t, domU_t)
 domain_comms(domU_t, domU_t)
+migrate_domain_out(dom0_t, domU_t)
 domain_self_comms(domU_t)
 
 declare_domain(isolated_domU_t)
 create_domain(dom0_t, isolated_domU_t)
 manage_domain(dom0_t, isolated_domU_t)
 domain_comms(dom0_t, isolated_domU_t)
+migrate_domain_out(dom0_t, isolated_domU_t)
 domain_self_comms(isolated_domU_t)
 
 # Declare a boolean that denies creation of prot_domU_t domains
@@ -142,6 +144,7 @@ gen_bool(prot_doms_locked, false)
 declare_domain(prot_domU_t)
 if (!prot_doms_locked) {
create_domain(dom0_t, prot_domU_t)
+   migrate_domain_out(dom0_t, prot_domU_t)
 }
 domain_comms(dom0_t, prot_domU_t)
 domain_comms(domU_t, prot_domU_t)
-- 
1.9.3



Re: [Xen-devel] [PATCH v5 9/9] xen/pciback: Implement PCI reset slot or bus with 'do_flr' SysFS attribute

2014-12-05 Thread Konrad Rzeszutek Wilk

On Fri, Dec 05, 2014 at 10:30:01AM +, David Vrabel wrote:
 On 04/12/14 15:39, Alex Williamson wrote:
  
  I don't know what workaround you're talking about.  As devices are
  released from the user, vfio-pci attempts to reset them.  If
  pci_reset_function() returns success we mark the device clean, otherwise
  it gets marked dirty.  Each time a device is released, if there are
  dirty devices we test whether we can try a bus/slot reset to clean them.
  In the case of assigning a GPU this typically means that the GPU or
  audio function come through first, there's no reset mechanism so it gets
  marked dirty, the next device comes through and we manage to try a bus
  reset.  vfio-pci does not have any device specific resets, all
  functionality is added to the PCI-core, thank-you-very-much.  I even
  posted a generic PCI quirk patch recently that marks AMD VGA PM reset as
  bad so that pci_reset_function() won't claim that worked.  All VGA
  access quirks are done in QEMU, the kernel doesn't have any business in
  remapping config space over MMIO regions or trapping other config space
  backdoors.
 
 Thanks for the info Alex, I hadn't got around to actually looking at
 the vfio-pci code and was just going on what Sander said.
 
 We probably do need to have a more in-depth look at how PCI devices are
 handled by both the toolstack and pciback, but in the short term I would
 like a simple solution that does not extend the ABI.

Could you enumerate the 'simple solution' then please? I am having
a frustrating time figuring out what it is that you are proposing.


 
 David


Re: [Xen-devel] [PATCH] console: allocate ring buffer earlier

2014-12-05 Thread Daniel Kiper

On Fri, Dec 05, 2014 at 04:55:24PM +, Jan Beulich wrote:
 ... when conring_size= was specified on the command line. We can't
 really do this as early as we would want to when the option was not
 specified, as the default depends on knowing the system CPU count. Yet
 the parsing of the ACPI tables is one of the things that generates a
 lot of output especially on large systems.

 I didn't change ARM, as I wasn't sure how far ahead this call could be
 pulled.

 Signed-off-by: Jan Beulich jbeul...@suse.com

Makes sense to me, but I think that we should have the same thing for ARM too.

 --- a/xen/arch/x86/setup.c
 +++ b/xen/arch/x86/setup.c
 @@ -1187,6 +1187,7 @@ void __init noreturn __start_xen(unsigne
  }

  vm_init();
 +console_init_mem();
  vesa_init();

  softirq_init();
 --- a/xen/drivers/char/console.c
 +++ b/xen/drivers/char/console.c
 @@ -744,15 +744,14 @@ void __init console_init_preirq(void)
  }
  }

 -void __init console_init_postirq(void)
 +void __init console_init_mem(void)
  {
  char *ring;
  unsigned int i, order, memflags;
 -
 -serial_init_postirq();
 +unsigned long flags;

  if ( !opt_conring_size )
 -opt_conring_size = num_present_cpus() << (9 + xenlog_lower_thresh);
 +return;

  order = get_order_from_bytes(max(opt_conring_size, conring_size));
  memflags = MEMF_bits(crashinfo_maxaddr_bits);
 @@ -763,17 +762,28 @@ void __init console_init_postirq(void)
  }
  opt_conring_size = PAGE_SIZE << order;

 -spin_lock_irq(&console_lock);
 +spin_lock_irqsave(&console_lock, flags);

I am not sure why you are changing spin_lock_irq() to spin_lock_irqsave() here.
Could you explain this in the commit message?

  for ( i = conringc ; i != conringp; i++ )
  ring[i & (opt_conring_size - 1)] = conring[i & (conring_size - 1)];
  conring = ring;
  smp_wmb(); /* Allow users of console_force_unlock() to see larger buffer. */
  conring_size = opt_conring_size;
 -spin_unlock_irq(&console_lock);
 +spin_unlock_irqrestore(&console_lock, flags);

Ditto.

Daniel
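
On the spin_lock_irq() versus spin_lock_irqsave() question: the two differ
when a function can run both with interrupts enabled and with them still
disabled. Whether that is the motivation for the change is not stated in this
thread, so treat it as an assumption. The sketch below models the
interrupt-enable flag with a plain bool; all names are hypothetical and no
real locking is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Models the CPU interrupt-enable state; false = interrupts disabled. */
static bool irqs_enabled_sketch;

/* _irq variant: disable on lock, unconditionally enable on unlock. */
static void lock_irq_sketch(void)
{
    irqs_enabled_sketch = false;
}

static void unlock_irq_sketch(void)
{
    irqs_enabled_sketch = true;
}

/* _irqsave variant: remember the previous state and put it back on unlock. */
static void lock_irqsave_sketch(bool *flags)
{
    *flags = irqs_enabled_sketch;
    irqs_enabled_sketch = false;
}

static void unlock_irqrestore_sketch(bool flags)
{
    irqs_enabled_sketch = flags;
}

/* A critical section that is safe to call from either context. */
static void critical_section_sketch(void)
{
    bool flags;

    lock_irqsave_sketch(&flags);
    assert(!irqs_enabled_sketch);   /* interrupts are off inside the section */
    unlock_irqrestore_sketch(flags);
}
```

With the _irq pair, a caller that entered with interrupts disabled would
leave with them enabled; the _irqsave pair leaves the caller's state as it
found it.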


Re: [Xen-devel] [Xen-users] 4.5 git: regression in xen systemd shutdown hangs the OS

2014-12-05 Thread Olaf Hering

On Tue, Dec 02, Olaf Hering wrote:

 On Tue, Dec 02, Ian Campbell wrote:
 
  On Mon, 2014-12-01 at 23:41 +, Mark Pryor wrote:
   list,
  
  Thanks. If you've identified a buggy changeset then it is fine to post
  to the devel lists. I've added a CC. I've also CCd everyone listed in
  the commit which you've fingered.
  
  Olaf, does this suggested change look correct? If so then can you turn
  it into a patch please.
 
 Yes, something like this (sed -i 's@socket@service@g' *.in):

But even with that change xendomains hangs if it can't talk to
xenstored for whatever reason. The result is that the system hangs
forever at shutdown.

I will try to fix that for 4.5.

Olaf


Re: [Xen-devel] RFC: Cleaning up the Mini-OS namespace

2014-12-05 Thread Martin Lucina

andrew.coop...@citrix.com said:
 I think this is a very good idea, and I am completely in favour of it.
 
 There are already-identified issues such as MiniOS leaking things like
 ARRAY_SIZE() into linked namespaces, which I haven't yet had enough tuits
 to fix.
 
 I think splitting things like the stub libc away from the MiniOS Xen
 Framework is also a good idea.  Ideally, the result of a MiniOS Build
 would be a small set of .a's which can then be linked against some
 normal C to make a minios guest.  (How feasible this is in reality
 remains to be seen.)

The approach I used for rumprun-xen is to link all of MiniOS' object files
except the startfile into a final .o with ld -r. This then allows me to
use objcopy -w -GPREFIX... to make all symbols in minios.o *except* those
starting with PREFIX local.

This has the advantage that I only had to rename symbols I really wanted to
keep global rather than going through all the MiniOS code adding static
in places where it was missing and sorting out the resulting
inter-dependencies.

 From a not-public-API point of view, all you have to worry about is that
 the existing minios stuff in xen.git, including the stubdom stuff,
 continues to work.  We have never made any guarantees to anyone using
 minios out-of-tree.

Existing minios stuff meaning the default build of extras/mini-os?

What's up with the -DHAVE_LIBC codepaths in mini-os? Who or what uses
these? Grepping around in stubdom/ doesn't come up with anything...

Stubdom stuff meaning the default build of stubdom/, plus the make
c-stubdom and make caml-stubdom examples documented in README?

Anything else? Sorry if this is obvious but I'm not that familiar with all
of xen.git.

Thanks,

Martin


Re: [Xen-devel] [PATCHv5 0/4] dma, x86, xen: reduce SWIOTLB usage in Xen guests

2014-12-05 Thread Greg Kroah-Hartman

On Fri, Dec 05, 2014 at 02:07:59PM +, David Vrabel wrote:
 On systems where DMA addresses and physical addresses are not 1:1
 (such as Xen PV guests), the generic dma_get_required_mask() will not
 return the correct mask (since it uses max_pfn).
 
 Some device drivers (such as mptsas, mpt2sas) use
 dma_get_required_mask() to set the device's DMA mask to allow them to use
 only 32-bit DMA addresses in hardware structures.  This results in
 unnecessary use of the SWIOTLB if DMA addresses are more than 32-bits,
 impacting performance significantly.
 
 This series allows Xen PV guests to override the default
 dma_get_required_mask() with a more suitable one.
 
 Changes in v5:
 - xen_swiotlb_get_required_mask() is x86 only.
 
 Changes in v4:
 - Assume 64-bit mask is required.
 
 Changes in v3:
 - fix off-by-one in xen_dma_get_required_mask()
 - split ia64 changes into separate patch.
 
 Changes in v2:
 - split x86 and xen changes into separate patches
 
 David
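
To illustrate the point in the quoted cover letter about deriving the mask
from max_pfn, here is a rough sketch of the idea: take the highest byte
address implied by max_pfn and widen it to an all-ones mask. This is not the
kernel implementation; the function name, the 4 KiB page size and the
assumption that max_pfn is non-zero and small enough not to overflow are all
made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12   /* assumed 4 KiB pages */

/* Smallest all-ones mask that covers every byte below max_pfn pages. */
static uint64_t required_mask_from_max_pfn_sketch(uint64_t max_pfn)
{
    uint64_t mask;

    assert(max_pfn > 0);

    /* Highest byte address that a page frame below max_pfn can contain. */
    mask = (max_pfn << PAGE_SHIFT_SKETCH) - 1;

    /* Smear the top set bit downwards so all lower bits become 1. */
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;

    return mask;
}
```

A guest with 4 GiB of pseudo-physical memory would get a 32-bit mask from
such a calculation, even though the machine addresses actually used for DMA
may lie above 4 GiB, which is the mismatch the series describes.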

Why are you sending these to me?  Am I the DMA maintainer and forgot
about it?

/me digs in MAINTAINERS...

Nope, not me!  Patches are now deleted from my queue, go use
scripts/get_maintainer.pl like you should have done...

greg k-h


[Xen-devel] [xen-4.4-testing test] 32095: regressions - trouble: blocked/broken/fail/pass

2014-12-05 Thread xen . org

flight 32095 xen-4.4-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32095/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 31781

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-sedf                  8 debian-fixup          fail pass in 32055
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  3 host-install(3)       broken pass in 32055
 test-amd64-amd64-pv                      16 guest-stop            fail in 32055 pass in 32095
 test-amd64-i386-qemut-rhel6hvm-amd        3 host-install(3)       broken in 32055 pass in 32095

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-xl-win7-amd64  7 windows-install  fail in 32055 like 31733

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386          1 build-check(1)        blocked n/a
 test-amd64-amd64-rumpuserxen-amd64        1 build-check(1)        blocked n/a
 test-amd64-i386-libvirt                   9 guest-start           fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt                  9 guest-start           fail never pass
 test-amd64-amd64-libvirt                  9 guest-start           fail never pass
 build-i386-rumpuserxen                    6 xen-build             fail never pass
 build-amd64-rumpuserxen                   6 xen-build             fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xend-winxpsp3            17 leak-check/check      fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xend-qemut-winxpsp3      17 leak-check/check      fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail in 32055 never pass

version targeted for testing:
 xen  a39f202031d7f1d8d9e14b8c3d7d11c812db253e
baseline version:
 xen  7679aeb444ed3bc4de0f473c16c47eab7d2f9d33


People who touched revisions under test:
  Jan Beulich jbeul...@suse.com


jobs:
 build-amd64-xend                              pass
 build-i386-xend                               pass
 build-amd64                                   pass
 build-armhf                                   pass
 build-i386                                    pass
 build-amd64-libvirt                           pass
 build-armhf-libvirt                           pass
 build-i386-libvirt                            pass
 build-amd64-pvops                             pass
 build-armhf-pvops                             pass
 build-i386-pvops                              pass
 build-amd64-rumpuserxen                       fail
 build-i386-rumpuserxen                        fail
 test-amd64-amd64-xl                           pass
 test-armhf-armhf-xl                           pass
 test-amd64-i386-xl                            pass
 test-amd64-i386-rhel6hvm-amd                  pass
 test-amd64-i386-qemut-rhel6hvm-amd            pass
 test-amd64-i386-qemuu-rhel6hvm-amd            pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64     pass
 test-amd64-i386-xl-qemut-debianhvm-amd64      pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64      pass
 test-amd64-i386-freebsd10-amd64               pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64          pass
 test-amd64-i386-xl-qemuu-ovmf-amd64           pass
 test-amd64-amd64-rumpuserxen-amd64            blocked

Re: [Xen-devel] [PATCH 1/4] dma: add dma_get_required_mask_from_max_pfn()

2014-12-05 Thread Greg Kroah-Hartman

On Fri, Dec 05, 2014 at 02:08:00PM +, David Vrabel wrote:
 A generic dma_get_required_mask() is useful even for architectures (such
 as ia64) that define ARCH_HAS_GET_REQUIRED_MASK.
 
 Signed-off-by: David Vrabel david.vra...@citrix.com
 Reviewed-by: Stefano Stabellini stefano.stabell...@eu.citrix.com
 ---
  drivers/base/platform.c |   10 --

Is this why you sent this to me?  The x86 maintainers should handle this
patch set, not me for a tiny 8 lines in just one of the files, sorry.

greg k-h


[Xen-devel] Steps to run XenServer on ARM Platform

2014-12-05 Thread manish jaggi

Hi,

I am trying to find a tutorial to jumpstart installing XenServer / XCP
on an ARM 64-bit platform.
Could the mailing list help?

-Regards
Manish


[Xen-devel] Some questions regarding QEMU, UEFI, PCI/VGA Passthrough, and other things

2014-12-05 Thread Zir Blazer

While I am not a developer myself (I always sucked hard when it comes to
reading and writing code), there are several capabilities of Xen and its
supporting software whose progress I always follow, more out of curiosity
than anything else. However, the documentation usually lags well behind what
is currently implemented in code, and sometimes you catch a mail here with
some useful data on a topic but later you don't hear about it any more,
either because progress went unreported or because the whole topic was
inconclusive. So this mail is pretty much a compilation of small questions
about things I came across that never came up again, but which might help
someone brainstorm, which is why I believe it to be more useful for xen-devel
than xen-users.


   QEMU
Because as a VGA Passthrough user I'm currently forced to use
qemu-xen-traditional (though I hear that some users have had success with
qemu-xen in Xen 4.4, I myself didn't have any luck with it), I'm stuck with
an old QEMU version. However, looking at the changelogs of the latest
versions I always see some interesting features which, as far as I know, Xen
doesn't currently incorporate.


1a - One of the things that newer QEMU versions seem to be capable of doing
is emulating the much newer Intel Q35 chipset, instead of only the current
440FX from the Pentium Pro era. Some data on Q35 emulation here:
www.linux-kvm.org/wiki/images/0/06/2012-forum-Q35.pdf
wiki.qemu.org/Features/Q35

I'm aware that newer doesn't necessarily mean better, especially because the
practical advantages of Q35 over 440FX aren't very clear. There are several
new emulated features, like an AHCI controller and a PCIe bus, which sound
interesting on paper, but I don't know whether they add any useful feature or
increase performance/compatibility. Some comments I read on the matter
wrongly stated that Q35 would be needed to do PCIe Passthrough, but this is
currently possible on 440FX, though I don't know about the low-level
implementation differences. I think most of the idea behind Q35 is to make
the VM look more like real hardware, instead of looking like a ridiculously
obvious emulated platform.
In the case of the AHCI controller, I suppose that the OS would need to
include drivers for the controller at installation time, which if I recall
correctly both Windows Vista/7/8 and Linux should have, though for a Windows
XP install the Q35 AHCI controller drivers would probably need to be
slipstreamed into an install ISO with nLite for it to work.


1b - Another experimental feature that recently appeared in QEMU is IOMMU
emulation. Info here:
www.mulix.org/pubs/iommu/viommu.pdf
www.linux-kvm.org/wiki/images/4/4a/2010-forum-joro-pv-iommu.pdf

IOMMU emulation seems to be useful for doing PCI Passthrough in a nested
virtualization environment. At first sight this looked a bit useless, because
using a DomU to do PCI Passthrough with an emulated IOMMU sounds like too
much overhead when you could simply emulate that device in the nested DomU.
However, I also read about the possibility of Xen using hardware
virtualization for Dom0 instead of it being paravirtualized. In that case,
would it be possible to provide the IOMMU emulation layer to Dom0 so you
could do PCI Passthrough on platforms without proper support for it? It seems
a rather interesting idea.
I think it would also be useful as a standardized debug platform for IOMMU
virtualization and passthrough, because some years ago missing or malformed
ACPI DMAR/IVRS tables were all over the place, and getting IOMMU
virtualization working was pretty much random luck and at the mercy of the
goodwill of the motherboard maker to fix their BIOSes.


   UEFI for DomUs
I managed to get this one working, but it seems to need some clarifications 
here and there.

2a - As far as I know, if you add --enable-ovmf to ./configure before
building Xen, it downloads and builds some extra code from an OVMF repository
which Xen maintains, though I don't know if it's a snapshot of whatever the
edk2 repository had at that time, or if it includes custom patches for the
OVMF firmware to work in Xen. Xen also has another ./configure option,
--with-system-ovmf, which is supposed to be used to specify a path to an OVMF
firmware binary. However, when I tried that option some months ago, I never
managed to get it working, either using a package with a precompiled ovmf.bin
from the Arch Linux User Repository, or using another package with the source
to compile it myself. Both binaries worked with standalone QEMU, though.
Besides that parameter itself being quite hidden, there is absolutely no info
on whether the provided OVMF binary has to comply with some special
requirements, be it some custom patches for OVMF so it works with Xen,
whether it has to be a binary that only includes TianoCore, or the unified
one that includes the NVRAM in a single file. In Arch Linux, for the Xen 4.4 package,

[Xen-devel] [qemu-mainline test] 32096: tolerable FAIL - PUSHED

2014-12-05 Thread xen . org

flight 32096 qemu-mainline real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/32096/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair17 guest-migrate/src_host/dst_host fail like 32029

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt                  9 guest-start           fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-amd64-i386-libvirt                   9 guest-start           fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-amd64-libvirt                  9 guest-start           fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass

version targeted for testing:
 qemuu                54f3a180a3d0b334c55d0f61d6e9fe5c7c6d42d5
baseline version:
 qemuu                0d7954c288e91b8a457f15a0a8e8244facf6594b


People who touched revisions under test:
  Gerd Hoffmann kra...@redhat.com
  Peter Maydell peter.mayd...@linaro.org


jobs:
 build-amd64                                 pass
 build-armhf                                 pass
 build-i386                                  pass
 build-amd64-libvirt                         pass
 build-armhf-libvirt                         pass
 build-i386-libvirt                          pass
 build-amd64-pvops                           pass
 build-armhf-pvops                           pass
 build-i386-pvops                            pass
 test-amd64-amd64-xl                         pass
 test-armhf-armhf-xl                         pass
 test-amd64-i386-xl                          pass
 test-amd64-i386-rhel6hvm-amd                pass
 test-amd64-i386-qemut-rhel6hvm-amd          pass
 test-amd64-i386-qemuu-rhel6hvm-amd          pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64   pass
 test-amd64-i386-xl-qemut-debianhvm-amd64    pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64   pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-freebsd10-amd64             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64        pass
 test-amd64-i386-xl-qemuu-ovmf-amd64         pass
 test-amd64-amd64-xl-qemut-win7-amd64        fail
 test-amd64-i386-xl-qemut-win7-amd64         fail
 test-amd64-amd64-xl-qemuu-win7-amd64        fail
 test-amd64-i386-xl-qemuu-win7-amd64         fail
 test-amd64-amd64-xl-win7-amd64              fail
 test-amd64-i386-xl-win7-amd64               fail
 test-amd64-i386-xl-credit2                  pass
 test-amd64-i386-freebsd10-i386              pass
 test-amd64-amd64-xl-pcipt-intel             fail
 test-amd64-i386-rhel6hvm-intel              pass
 test-amd64-i386-qemut-rhel6hvm-intel        pass
 test-amd64-i386-qemuu-rhel6hvm-intel        pass
 test-amd64-amd64-libvirt                    fail
 test-armhf-armhf-libvirt                    fail
 test-amd64-i386-libvirt                     fail

[Xen-devel] [RFC PATCH] xen/arm: Manage uart TX interrupt correctly

2014-12-05 Thread vijay . kilari

From: Vijaya Kumar K vijaya.ku...@caviumnetworks.com

In pl011.c, when a TX interrupt is received and the
TX buffer is empty, the TX interrupt is not disabled. Hence
the UART interrupt routine always sees the TX interrupt
in the MIS register and the CPU loops infinitely.

With this patch, the TX interrupt is masked and unmasked
when required.
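
The mask/unmask idea described above can be sketched in isolation as follows. This is only an illustration and not the patch itself: the register is simulated by a plain variable, every sketch_* name is hypothetical, and the TX bit position (bit 5) is assumed to match the PL011 IMSC layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: TX interrupt mask bit, taken to be bit 5 as on the PL011. */
#define SKETCH_TXI (1u << 5)

/* Hypothetical stand-in for a UART with a software TX ring buffer. */
struct sketch_uart {
    uint32_t imsc;          /* simulated interrupt mask register */
    size_t txbufp, txbufc;  /* producer and consumer indices */
};

/* Mask the TX interrupt so an empty FIFO stops raising it. */
static void sketch_tx_stop(struct sketch_uart *u)
{
    assert(u != NULL);
    u->imsc &= ~SKETCH_TXI;
}

/* Unmask the TX interrupt before characters are queued. */
static void sketch_tx_start(struct sketch_uart *u)
{
    assert(u != NULL);
    u->imsc |= SKETCH_TXI;
}

/*
 * TX interrupt path: if the software buffer is empty there is nothing
 * to send, so the interrupt is masked and 0 is returned. Otherwise one
 * character is consumed and 1 is returned with the interrupt unmasked.
 */
static int sketch_tx_interrupt(struct sketch_uart *u)
{
    assert(u != NULL);
    if (u->txbufc == u->txbufp) {
        sketch_tx_stop(u);
        return 0;
    }
    sketch_tx_start(u);
    u->txbufc++;  /* pretend one character went into the FIFO */
    return 1;
}
```

Without the masking step, a level-triggered "FIFO has room" interrupt would stay asserted for as long as the buffer is empty, which is the loop the commit message describes.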

Signed-off-by: Vijaya Kumar K vijaya.ku...@caviumnetworks.com
---
 xen/drivers/char/pl011.c  |   18 ++
 xen/drivers/char/serial.c |   30 +-
 xen/include/xen/serial.h  |4 
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/char/pl011.c b/xen/drivers/char/pl011.c
index dd19ce8..ad48df3 100644
--- a/xen/drivers/char/pl011.c
+++ b/xen/drivers/char/pl011.c
@@ -109,6 +109,8 @@ static void __init pl011_init_preirq(struct serial_port *port)
 panic("pl011: No Baud rate configured\n");
 uart->baud = (uart->clock_hz << 2) / divisor;
 }
+/* Trigger RX interrupt at 1/2 full, TX interrupt at 7/8 empty */
+pl011_write(uart, IFLS, (2<<3 | 0));
 /* This write must follow FBRD and IBRD writes. */
 pl011_write(uart, LCR_H, (uart->data_bits - 5) << 5
 | FEN
@@ -197,6 +199,20 @@ static const struct vuart_info *pl011_vuart(struct serial_port *port)
 return &uart->vuart;
 }
 
+static void pl011_tx_stop(struct serial_port *port)
+{
+struct pl011 *uart = port->uart;
+
+pl011_write(uart, IMSC, pl011_read(uart, IMSC) & ~(TXI));
+}
+
+static void pl011_tx_start(struct serial_port *port)
+{
+struct pl011 *uart = port->uart;
+
+pl011_write(uart, IMSC, pl011_read(uart, IMSC) | (TXI));
+}
+
 static struct uart_driver __read_mostly pl011_driver = {
 .init_preirq  = pl011_init_preirq,
 .init_postirq = pl011_init_postirq,
@@ -207,6 +223,8 @@ static struct uart_driver __read_mostly pl011_driver = {
 .putc = pl011_putc,
 .getc = pl011_getc,
 .irq  = pl011_irq,
+.start_tx = pl011_tx_start,
+.stop_tx  = pl011_tx_stop,
 .vuart_info   = pl011_vuart,
 };
 
diff --git a/xen/drivers/char/serial.c b/xen/drivers/char/serial.c
index 44026b1..d2ce8a8 100644
--- a/xen/drivers/char/serial.c
+++ b/xen/drivers/char/serial.c
@@ -76,6 +76,19 @@ void serial_tx_interrupt(struct serial_port *port, struct cpu_user_regs *regs)
 cpu_relax();
 }
 
+if ( port->txbufc == port->txbufp )
+{
+/* Disable TX. nothing to send */
+if ( port->driver->stop_tx != NULL )
+port->driver->stop_tx(port);
+spin_unlock(&port->tx_lock);
+goto out;
+}
+else
+{
+if ( port->driver->tx_ready(port) && (port->driver->start_tx != NULL) )
+port->driver->start_tx(port);
+}
 for ( i = 0, n = port->driver->tx_ready(port); i < n; i++ )
 {
 if ( port->txbufc == port->txbufp )
@@ -117,6 +130,9 @@ static void __serial_putc(struct serial_port *port, char c)
 cpu_relax();
 if ( n > 0 )
 {
+/* Enable TX before sending chars */
+if ( port->driver->start_tx != NULL )
+port->driver->start_tx(port);
 while ( n-- )
 port->driver->putc(
 port,
@@ -135,6 +151,9 @@ static void __serial_putc(struct serial_port *port, char c)
 if ( ((port->txbufp - port->txbufc) == 0) &&
  port->driver->tx_ready(port) > 0 )
 {
+/* Enable TX before sending chars */
+if ( port->driver->start_tx != NULL )
+port->driver->start_tx(port);
 /* Buffer and UART FIFO are both empty, and port is available. */
 port->driver->putc(port, c);
 }
@@ -152,11 +171,18 @@ static void __serial_putc(struct serial_port *port, char c)
 while ( !(n = port->driver->tx_ready(port)) )
 cpu_relax();
 if ( n > 0 )
+{
+/* Enable TX before sending chars */
+if ( port->driver->start_tx != NULL )
+port->driver->start_tx(port);
 port->driver->putc(port, c);
+}
 }
 else
 {
 /* Simple synchronous transmitter. */
+if ( port->driver->start_tx != NULL )
+port->driver->start_tx(port);
 port->driver->putc(port, c);
 }
 }
@@ -403,7 +429,9 @@ void serial_start_sync(int handle)
 if ( n < 0 )
 /* port is unavailable and might not come up until reenabled by
dom0, we can't really do proper sync */
-break;
+break; 
+if ( port->driver->start_tx != NULL )
+port->driver->start_tx(port);
 port->driver->putc(
 port, port->txbuf[mask_serial_txbuf_idx(port->txbufc++)]);
 }
diff --git a/xen/include/xen/serial.h b/xen/include/xen/serial.h
index 9f4451b..71e6ade 100644
--- a/xen/include/xen/serial.h
+++ b/xen/include/xen/serial.h
@@ -81,6 +81,10 @@

Re: [Xen-devel] [PATCH] VMX: don't allow PVH to reach handle_pio() or handle_mmio()

2014-12-05 Thread Mukesh Rathor

On Fri, 05 Dec 2014 14:06:53 +
Jan Beulich jbeul...@suse.com wrote:

 PVH guests are not supposed to access I/O ports they weren't given
 access to (there's nothing to handle emulation of such accesses).
 
 Reported-by: Roger Pau Monné roger@citrix.com
 Signed-off-by: Jan Beulich jbeul...@suse.com
 ---
 Note: Only compile tested so far.
 
 --- a/xen/arch/x86/hvm/vmx/vmx.c
 +++ b/xen/arch/x86/hvm/vmx/vmx.c
 @@ -3082,6 +3082,9 @@ void vmx_vmexit_handler(struct cpu_user_
  }
  
  case EXIT_REASON_IO_INSTRUCTION:
 +if ( unlikely(is_pvh_vcpu(v)) )
 +goto exit_and_crash;
 +
  __vmread(EXIT_QUALIFICATION, &exit_qualification);
  if ( exit_qualification & 0x10 )
  {

Actually, handle_pio() will eventually reach handle_pvh_io(), which
would do the access check via admin_io_okay(), so that path should be
OK, right?

thanks,
Mukesh
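
The kind of access check referred to in the reply above can be pictured with a small stand-alone sketch. It is not Xen's admin_io_okay() and makes no claim about how that function is implemented; the sketch_* names and the bitmap representation are assumptions chosen only to illustrate "allow the access only if every port in the range was granted".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_PORTS 65536u

/* Hypothetical per-domain record of which I/O ports were granted. */
struct sketch_domain {
    uint8_t port_allowed[SKETCH_NR_PORTS / 8];
};

/* Grant access to the inclusive port range [first, last]. */
static void sketch_permit(struct sketch_domain *d,
                          unsigned int first, unsigned int last)
{
    assert(d != NULL && first <= last && last < SKETCH_NR_PORTS);
    for (unsigned int p = first; p <= last; p++)
        d->port_allowed[p / 8] |= (uint8_t)(1u << (p % 8));
}

/* True only if every port touched by the access has been granted. */
static bool sketch_io_okay(const struct sketch_domain *d,
                           unsigned int port, unsigned int bytes)
{
    assert(d != NULL);
    if (bytes == 0 || port >= SKETCH_NR_PORTS ||
        bytes > SKETCH_NR_PORTS - port)
        return false;
    for (unsigned int p = port; p < port + bytes; p++)
        if (!(d->port_allowed[p / 8] & (1u << (p % 8))))
            return false;
    return true;
}
```

A two-byte access that starts on the last granted port is rejected, because its second byte falls outside the granted range.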



[Xen-devel] [PATCH v6 2/2] add a new p2m type - p2m_mmio_write_dm

2014-12-05 Thread Yu Zhang

From: Yu Zhang yu.c.zh...@intel.com

A new p2m type, p2m_mmio_write_dm, is added to trap and emulate
write operations on the GPU's page tables. Handling of this new
p2m type is similar to that of the existing p2m_ram_ro in most
condition checks; the only difference is the final policy of
emulation vs. drop. For p2m_ram_ro pages, write operations do not
trigger the device model and are discarded later in __hvm_copy(),
while for p2m_mmio_write_dm pages, writes go to the device model
via the ioreq server.
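
The policy difference described above can be summarised in a short stand-alone sketch. The enum and function names below are hypothetical and heavily simplified; they are not the Xen types, and only the three cases named in the description are modelled.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the p2m types discussed above. */
enum sketch_p2m_type {
    SKETCH_RAM_RW,
    SKETCH_RAM_RO,
    SKETCH_MMIO_WRITE_DM
};

/* What happens to a guest write, depending on the page type. */
enum sketch_write_action {
    SKETCH_WRITE_DIRECT,   /* write reaches guest memory */
    SKETCH_WRITE_DISCARD,  /* write is dropped */
    SKETCH_WRITE_TO_DM     /* write is forwarded to the device model */
};

static enum sketch_write_action sketch_write_policy(enum sketch_p2m_type t)
{
    switch (t) {
    case SKETCH_RAM_RW:
        return SKETCH_WRITE_DIRECT;
    case SKETCH_RAM_RO:
        return SKETCH_WRITE_DISCARD;
    case SKETCH_MMIO_WRITE_DM:
        return SKETCH_WRITE_TO_DM;
    }
    /* Unknown type: fail loudly in debug builds, drop the write otherwise. */
    assert(!"unknown p2m type");
    return SKETCH_WRITE_DISCARD;
}
```

Both read-only types trap on write; they differ only in whether the trapped write is dropped or emulated.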

Signed-off-by: Yu Zhang yu.c.zh...@linux.intel.com
Signed-off-by: Wei Ye wei...@intel.com
---
 xen/arch/x86/hvm/hvm.c  | 11 ---
 xen/arch/x86/mm/p2m-ept.c   |  1 +
 xen/arch/x86/mm/p2m-pt.c|  1 +
 xen/include/asm-x86/p2m.h   |  4 +++-
 xen/include/public/hvm/hvm_op.h |  1 +
 5 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 967f822..25114fc 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2837,7 +2837,8 @@ int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
  * to the mmio handler.
  */
 if ( (p2mt == p2m_mmio_dm) || 
- (npfec.write_access && (p2m_is_discard_write(p2mt))) )
+ (npfec.write_access &&
+  (p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
 {
 put_gfn(p2m->domain, gfn);
 
@@ -5904,6 +5905,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 get_gfn_query_unlocked(d, a.pfn, &t);
 if ( p2m_is_mmio(t) )
 a.mem_type =  HVMMEM_mmio_dm;
+else if ( t == p2m_mmio_write_dm )
+a.mem_type = HVMMEM_mmio_write_dm;
 else if ( p2m_is_readonly(t) )
 a.mem_type =  HVMMEM_ram_ro;
 else if ( p2m_is_ram(t) )
@@ -5931,7 +5934,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 static const p2m_type_t memtype[] = {
 [HVMMEM_ram_rw]  = p2m_ram_rw,
 [HVMMEM_ram_ro]  = p2m_ram_ro,
-[HVMMEM_mmio_dm] = p2m_mmio_dm
+[HVMMEM_mmio_dm] = p2m_mmio_dm,
+[HVMMEM_mmio_write_dm] = p2m_mmio_write_dm
 };
 
 if ( copy_from_guest(&a, arg, 1) )
@@ -5978,7 +5982,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 goto param_fail4;
 }
 if ( !p2m_is_ram(t) &&
- (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) )
+ (!p2m_is_hole(t) || a.hvmmem_type != HVMMEM_mmio_dm) &&
+ (t != p2m_mmio_write_dm || a.hvmmem_type != HVMMEM_ram_rw) )
 {
 put_gfn(d, pfn);
 goto param_fail4;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 15c6e83..e21a92d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -136,6 +136,7 @@ static void ept_p2m_type_to_flags(ept_entry_t *entry, p2m_type_t type, p2m_acces
 entry->x = 0;
 break;
 case p2m_grant_map_ro:
+case p2m_mmio_write_dm:
 entry->r = 1;
 entry->w = entry->x = 0;
 break;
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index e48b63a..26fb18d 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -94,6 +94,7 @@ static unsigned long p2m_type_to_flags(p2m_type_t t, mfn_t mfn)
 default:
 return flags | _PAGE_NX_BIT;
 case p2m_grant_map_ro:
+case p2m_mmio_write_dm:
 return flags | P2M_BASE_FLAGS | _PAGE_NX_BIT;
 case p2m_ram_ro:
 case p2m_ram_logdirty:
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 42de75d..2cf73ca 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -72,6 +72,7 @@ typedef enum {
 p2m_ram_shared = 12,  /* Shared or sharable memory */
 p2m_ram_broken = 13,  /* Broken page, access cause domain crash */
 p2m_map_foreign  = 14,/* ram pages from foreign domain */
+p2m_mmio_write_dm = 15,   /* Read-only; writes go to the device model */
 } p2m_type_t;
 
 /* Modifiers to the query */
@@ -111,7 +112,8 @@ typedef unsigned int p2m_query_t;
 #define P2M_RO_TYPES (p2m_to_mask(p2m_ram_logdirty) \
   | p2m_to_mask(p2m_ram_ro) \
   | p2m_to_mask(p2m_grant_map_ro)   \
-  | p2m_to_mask(p2m_ram_shared) )
+  | p2m_to_mask(p2m_ram_shared) \
+  | p2m_to_mask(p2m_mmio_write_dm))
 
 /* Write-discard types, which should discard the write operations */
 #define P2M_DISCARD_WRITE_TYPES (p2m_to_mask(p2m_ram_ro) \
diff --git a/xen/include/public/hvm/hvm_op.h b/xen/include/public/hvm/hvm_op.h
index eeb0a60..a4e5345 100644
--- a/xen/include/public/hvm/hvm_op.h
+++ b/xen/include/public/hvm/hvm_op.h
@@ -81,6 +81,7 @@ typedef enum {
 HVMMEM_ram_rw, /*
