date:20200428

Re: [PATCH v11 1/3] x86/tlb: introduce a flush HVM ASIDs flag

2020-04-28 Thread Tim Deegan

At 11:12 +0100 on 27 Apr (1587985955), Wei Liu wrote:
> On Thu, Apr 23, 2020 at 06:33:49PM +0200, Jan Beulich wrote:
> > On 23.04.2020 16:56, Roger Pau Monne wrote:
> > > Introduce a specific flag to request a HVM guest linear TLB flush,
> > > which is an ASID/VPID tickle that forces a guest linear to guest
> > > physical TLB flush for all HVM guests.
> > > 
> > > This was previously unconditionally done in each pre_flush call, but
> > > that's not required: HVM guests not using shadow don't require linear
> > > TLB flushes as Xen doesn't modify the pages tables the guest runs on
> > > in that case (ie: when using HAP). Note that shadow paging code
> > > already takes care of issuing the necessary flushes when the shadow
> > > page tables are modified.
> > > 
> > > In order to keep the previous behavior modify all shadow code TLB
> > > flushes to also flush the guest linear to physical TLB if the guest is
> > > HVM. I haven't looked at each specific shadow code TLB flush in order
> > > to figure out whether it actually requires a guest TLB flush or not,
> > > so there might be room for improvement in that regard.
> > > 
> > > Also perform ASID/VPID flushes when modifying the p2m tables as it's a
> > > requirement for AMD hardware. Finally keep the flush in
> > > switch_cr3_cr4, as it's not clear whether code could rely on
> > > switch_cr3_cr4 also performing a guest linear TLB flush. A following
> > > patch can remove the ASID/VPID tickle from switch_cr3_cr4 if found to
> > > not be necessary.
> > > 
> > > Signed-off-by: Roger Pau Monné 
> > 
> > Reviewed-by: Jan Beulich 
> 
> Tim, ICYMI, this patch needs your ack.

Sorry!  Thanks for the reminder.

Acked-by: Tim Deegan

[qemu-mainline test] 149866: tolerable FAIL - PUSHED

2020-04-28 Thread osstest service owner

flight 149866 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149866/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 149744

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 149744
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 149744
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 149744
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 149744
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 149744
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 149744
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 qemuufdd76fecdde1ad444ff4deb7f1c4f7e4a1ef97d6
baseline version:
 qemuuee573f5326046223b6eef4ae7fbfec31d274e093

Last test of basis   149744  2020-04-23 03:46:20 Z6 days
Testing same since   149866  2020-04-28 17:07:10 Z0 days1 attempts


People who touched revisions under test:
  Peter Maydell 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred

2020-04-28 Thread Jürgen Groß


On 28.04.20 22:55, Julien Grall wrote:

Hi Juergen,

On Tue, 28 Apr 2020 at 16:53, Juergen Gross  wrote:


The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
of the domain's xenstore ring page and the event channel of the
domain for communicating with Xenstore.

The gfn is not really needed. It is stored in the per-domain struct
in xenstored and in case of another XS_INTRODUCE for the domain it
is tested to match the original value. If it doesn't match the
command is aborted via EINVAL.

Today there shouldn't be multiple XS_INTRODUCE requests for the same
domain issued, so the mfn/gfn can just be ignored and multiple
XS_INTRODUCE commands can be rejected without testing the mfn/gfn.


So there is a comment in the else part:

/* Use XS_INTRODUCE for recreating the xenbus event-channel. */

 From the commit message this is not entirely clear why we want to
prevent recreating the event-channel. Can you expand it?


Recreating the event channel would be fine, but I don't see why it
would ever be needed. And XS_INTRODUCE is called only at domain creation
time today, so there is obviously no need for repeating this call.

Maybe the idea was to do this after sending a XS_RESUME command, which
isn't used anywhere in Xen and is another part of Xenstore which doesn't
make any sense today.





Signed-off-by: Juergen Gross 
---
  tools/xenstore/xenstored_domain.c | 47 ---
  1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/tools/xenstore/xenstored_domain.c 
b/tools/xenstore/xenstored_domain.c
index 5858185211..17328f9fc9 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)
 struct domain *domain;
 char *vec[3];
 unsigned int domid;
-   unsigned long mfn;
 evtchn_port_t port;
 int rc;
 struct xenstore_domain_interface *interface;
@@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)
 return EACCES;

 domid = atoi(vec[0]);
-   mfn = atol(vec[1]);
+   /* Ignore the mfn, we don't need it. */


s/mfn/GFN/


Okay, then I should probably change the parameter description, too.




 port = atoi(vec[2]);

 /* Sanity check args. */
@@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)

 domain = find_domain_by_domid(domid);

-   if (domain == NULL) {
-   interface = map_interface(domid);
-   if (!interface)
-   return errno;
-   /* Hang domain off "in" until we're finished. */
-   domain = new_domain(in, domid, port);
-   if (!domain) {
-   rc = errno;
-   unmap_interface(interface);
-   return rc;
-   }
-   domain->interface = interface;
-   domain->mfn = mfn;


AFAICT domain->mfn is not used anymore, so could we remove the field?


Oh, yes, I missed that.


Juergen

Re: [PATCH 02/11] xen: Fix and improve handling of device_add usb-host errors

2020-04-28 Thread Markus Armbruster

Paul Durrant  writes:

>> -Original Message-
>> From: Markus Armbruster 
>> Sent: 24 April 2020 20:20
>> To: qemu-de...@nongnu.org
>> Cc: Stefano Stabellini ; Anthony Perard 
>> ; Paul
>> Durrant ; Gerd Hoffmann ; 
>> xen-devel@lists.xenproject.org
>> Subject: [PATCH 02/11] xen: Fix and improve handling of device_add usb-host 
>> errors
>> 
>> usbback_portid_add() leaks the error when qdev_device_add() fails.
>> Fix that.  While there, use the error to improve the error message.
>> 
>> The qemu_opts_from_qdict() similarly leaks on failure.  But any
>> failure there is a programming error.  Pass &error_abort.
>> 
>> Fixes: 816ac92ef769f9ffc534e49a1bb6177bddce7aa2
>> Cc: Stefano Stabellini 
>> Cc: Anthony Perard 
>> Cc: Paul Durrant 
>> Cc: Gerd Hoffmann 
>> Cc: xen-devel@lists.xenproject.org
>> Signed-off-by: Markus Armbruster 
>> ---
>>  hw/usb/xen-usb.c | 18 --
>>  1 file changed, 8 insertions(+), 10 deletions(-)
>> 
>> diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
>> index 961190d0f7..42643c3390 100644
>> --- a/hw/usb/xen-usb.c
>> +++ b/hw/usb/xen-usb.c
>> @@ -30,6 +30,7 @@
>>  #include "hw/usb.h"
>>  #include "hw/xen/xen-legacy-backend.h"
>>  #include "monitor/qdev.h"
>> +#include "qapi/error.h"
>>  #include "qapi/qmp/qdict.h"
>>  #include "qapi/qmp/qstring.h"
>> 
>> @@ -755,13 +756,15 @@ static void usbback_portid_add(struct usbback_info 
>> *usbif, unsigned port,
>>  qdict_put_int(qdict, "port", port);
>>  qdict_put_int(qdict, "hostbus", atoi(busid));
>>  qdict_put_str(qdict, "hostport", portname);
>> -opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, 
>> &local_err);
>> -if (local_err) {
>> -goto err;
>> -}
>> +opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict,
>> +&error_abort);
>>  usbif->ports[port - 1].dev = USB_DEVICE(qdev_device_add(opts, 
>> &local_err));
>>  if (!usbif->ports[port - 1].dev) {
>> -goto err;
>> +qobject_unref(qdict);
>> +xen_pv_printf(&usbif->xendev, 0,
>> +  "device %s could not be opened: %s\n",
>> +  busid, error_get_pretty(local_err));
>> +error_free(local_err);
>
> Previously the goto caused the function to bail out. Should there not be a 
> 'return' here?

Owww, of course.  Thanks!

>
>>  }
>>  qobject_unref(qdict);
>>  speed = usbif->ports[port - 1].dev->speed;
>> @@ -793,11 +796,6 @@ static void usbback_portid_add(struct usbback_info 
>> *usbif, unsigned port,
>>  usbback_hotplug_enq(usbif, port);
>> 
>>  TR_BUS(&usbif->xendev, "port %d attached\n", port);
>> -return;
>> -
>> -err:
>> -qobject_unref(qdict);
>> -xen_pv_printf(&usbif->xendev, 0, "device %s could not be opened\n", 
>> busid);
>>  }
>> 
>>  static void usbback_process_port(struct usbback_info *usbif, unsigned port)
>> --
>> 2.21.1

[xen-unstable test] 149865: tolerable FAIL - PUSHED

2020-04-28 Thread osstest service owner

flight 149865 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149865/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 149842

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10   fail  like 149842
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 149842
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 149842
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 149842
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 149842
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 149842
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 149842
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 149842
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 149842
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 149842
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass

version targeted for testing:
 xen  2add344e6a4ef690871157b87b0e7d52bc6db9e4
baseline version:
 xen  df669de074c395a3b2eeb975fddd3da4c148da13

Last test of basis   149842  2020-04-27 23:38:49 Z1 days
Testing same since   149865  2020-04-28 16:32:15 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Jan Beulich 
  Wei Liu 

jobs:
 build-amd64-xsm

[ovmf test] 149867: all pass - PUSHED

2020-04-28 Thread osstest service owner

flight 149867 ovmf real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149867/

Perfect :-)
All tests in this flight passed as required
version targeted for testing:
 ovmf 099dfbb29d8bf0a30e397e3f5baf1da437b8f0ba
baseline version:
 ovmf 0f1946b6626e263c7f764c21cc241cc9faf8a1ae

Last test of basis   149827  2020-04-26 06:40:24 Z2 days
Testing same since   149867  2020-04-28 18:09:26 Z0 days1 attempts


People who touched revisions under test:
  Ard Biesheuvel 
  Laszlo Ersek 
  Michael Kubacki 
  Ray Ni 
  Sean Brogan 

jobs:
 build-amd64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/osstest/ovmf.git
   0f1946b662..099dfbb29d  099dfbb29d8bf0a30e397e3f5baf1da437b8f0ba -> 
xen-tested-master

[PATCH v2 1/2] Fix undefined behaviour

2020-04-28 Thread Grzegorz Uriasz

This patch fixes qemu crashes when passing through an IGD device to HVM guests 
under XEN. The problem is that on almost every laptop
reading the IGD ROM from SYSFS will fail, the reason for it is that the IGD rom 
is polymorphic and it modifies itself
during bootup - this results in an invalid rom image - the kernel checks 
whether the image is valid and when that's not the case
it won't allow userspace to read it. It seems that when the code was first 
written(xen_pt_load_rom.c) the kernel's back then didn't check
whether the rom is valid and just passed the contents to userspace - because of 
this qemu also tries to repair the rom later in the code.
When stating the rom the kernel isn't validating the rom contents so it is 
returning the proper size of the rom(32 4kb pages).

This results in undefined behaviour - pci_assign_dev_load_option_rom is 
creating a buffer and then writes the size of the buffer to a pointer.
In pci_assign_dev_load_option_rom under old kernels if the fstat would succeed 
then fread would also succeed - this means if the buffer
is allocated the size of the buffer will be set. On newer kernels this is not 
the case - on all laptops I've tested(spanning a few
generations of IGD) the fstat is successful and the buffer is allocated but the 
fread is failing - as the buffer is not deallocated
the function is returning a valid pointer without setting the size of the 
buffer for the caller. The caller of pci_assign_dev_load_option_rom
is holding the size of the buffer in an uninitialized variable and is just 
checking whether pci_assign_dev_load_option_rom returned a valid pointer.
This later results in cpu_physical_memory_write(0xc, 
result_of_pci_assign_dev_load_option_rom, unitialized_variable) which
depending on the compiler parameters, configure flags, etc... might crash qemu 
or might work - the xen 4.12 stable release with default configure
parameters works but changing the compiler options slightly(for instance by 
enabling qemu debug) results in qemu crashing ¯\_(;-;)_/¯

The only situation when the original pci_assign_dev_load_option_rom works is 
when the IDG is not the boot gpu - this won't happen on any laptop and
will be rare on desktops.

This patch deallocates the buffer and returns NULL if reading the VBIOS fails - 
the caller of the function then properly shuts down qemu.
The next patch in the series introduces a better method for getting the vbios 
so qemu does not exit when pci_assign_dev_load_option_rom fails -
this is the reason I've changed error_report to warn_report as whether a 
failure in pci_assign_dev_load_option_rom is fatal
depends on the caller.

Signed-off-by: Grzegorz Uriasz 
---
 hw/xen/xen_pt_load_rom.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen_pt_load_rom.c b/hw/xen/xen_pt_load_rom.c
index a50a80837e..9f100dc159 100644
--- a/hw/xen/xen_pt_load_rom.c
+++ b/hw/xen/xen_pt_load_rom.c
@@ -38,12 +38,12 @@ void *pci_assign_dev_load_option_rom(PCIDevice *dev,
 fp = fopen(rom_file, "r+");
 if (fp == NULL) {
 if (errno != ENOENT) {
-error_report("pci-assign: Cannot open %s: %s", rom_file, 
strerror(errno));
+warn_report("pci-assign: Cannot open %s: %s", rom_file, 
strerror(errno));
 }
 return NULL;
 }
 if (fstat(fileno(fp), &st) == -1) {
-error_report("pci-assign: Cannot stat %s: %s", rom_file, 
strerror(errno));
+warn_report("pci-assign: Cannot stat %s: %s", rom_file, 
strerror(errno));
 goto close_rom;
 }
 
@@ -59,10 +59,9 @@ void *pci_assign_dev_load_option_rom(PCIDevice *dev,
 memset(ptr, 0xff, st.st_size);
 
 if (!fread(ptr, 1, st.st_size, fp)) {
-error_report("pci-assign: Cannot read from host %s", rom_file);
-error_printf("Device option ROM contents are probably invalid "
- "(check dmesg).\nSkip option ROM probe with rombar=0, "
- "or load from file with romfile=\n");
+warn_report("pci-assign: Cannot read from host %s", rom_file);
+memory_region_unref(&dev->rom);
+ptr = NULL;
 goto close_rom;
 }
 
-- 
2.26.1

[PATCH v2 2/2] Improve legacy vbios handling

2020-04-28 Thread Grzegorz Uriasz

The current method of getting the vbios is broken - it just isn't working on 
any device I've tested - the reason
for this is explained in the previous patch. The vbios is polymorphic and 
getting a proper unmodified copy is
often not possible without reverse engineering the firmware. We don't need an 
unmodified copy for most purposes -
an unmodified copy is only needed for initializing the bios framebuffer and 
providing the bios with a corrupted
copy of the rom won't do any damage as the bios will just ignore the rom.

After the i915 driver takes over the vbios is only needed for reading some 
metadata/configuration stuff etc...
I've tested that not having any kind of vbios in the guest actually works fine 
but on older generations of IGD
there are some slight hiccups. To maximize compatibility the best approach is 
to just copy the results of the vbios
execution directly to the guest. It turns out the vbios is always present on an 
hardcoded memory range in a reserved
memory range from real mode - all we need to do is to memcpy it into the guest.

The following patch does 2 things:
1) When pci_assign_dev_load_option_rom fails to read the vbios from sysfs(this 
works only when the igd is not the
boot gpu - this is unlikely to happen) it falls back to using /dev/mem to copy 
the vbios directly to the guest.
Using /dev/mem should be fine as there is more xen specific pci code which also 
relies on /dev/mem.
2) When dealing with IGD in the more generic code we skip the allocation of the 
rom resource - the reason for this is to prevent
a malicious guest from modifying the vbios in the host -> this is needed as 
someone might try to pwn the i915 driver in the host by doing so
(attach igd to guest, guest modifies vbios, the guest is terminated and the idg 
is reattached to the host, i915 driver in the host uses data from the modified 
vbios).
This is also needed to not overwrite the proper shadow copy made before.

I've tested this patch and it works fine - the guest isn't complaining about 
the missing vbios tables and the pci config
space in the guest looks fine.

Signed-off-by: Grzegorz Uriasz 
---
 hw/xen/xen_pt.c  |  8 +--
 hw/xen/xen_pt_graphics.c | 48 +---
 hw/xen/xen_pt_load_rom.c |  2 +-
 3 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index b91082cb8b..ffc3559dd4 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -483,8 +483,12 @@ static int xen_pt_register_regions(XenPCIPassthroughState 
*s, uint16_t *cmd)
i, r->size, r->base_addr, type);
 }
 
-/* Register expansion ROM address */
-if (d->rom.base_addr && d->rom.size) {
+/*
+ * Register expansion ROM address. If we are dealing with a ROM
+ * shadow copy for legacy vga devices then don't bother to map it
+ * as previous code creates a proper shadow copy
+ */
+if (d->rom.base_addr && d->rom.size && !(is_igd_vga_passthrough(d))) {
 uint32_t bar_data = 0;
 
 /* Re-set BAR reported by OS, otherwise ROM can't be read. */
diff --git a/hw/xen/xen_pt_graphics.c b/hw/xen/xen_pt_graphics.c
index a3bc7e3921..fe0ef2685c 100644
--- a/hw/xen/xen_pt_graphics.c
+++ b/hw/xen/xen_pt_graphics.c
@@ -129,7 +129,7 @@ int xen_pt_unregister_vga_regions(XenHostPCIDevice *dev)
 return 0;
 }
 
-static void *get_vgabios(XenPCIPassthroughState *s, int *size,
+static void *get_sysfs_vgabios(XenPCIPassthroughState *s, int *size,
XenHostPCIDevice *dev)
 {
 return pci_assign_dev_load_option_rom(&s->dev, size,
@@ -137,6 +137,45 @@ static void *get_vgabios(XenPCIPassthroughState *s, int 
*size,
   dev->dev, dev->func);
 }
 
+static void xen_pt_direct_vbios_copy(XenPCIPassthroughState *s, Error **errp)
+{
+int fd = -1;
+void *guest_bios = NULL;
+void *host_vbios = NULL;
+/* This is always 32 pages in the real mode reserved region */
+int bios_size = 32 << XC_PAGE_SHIFT;
+int vbios_addr = 0xc;
+
+fd = open("/dev/mem", O_RDONLY);
+if (fd == -1) {
+error_setg(errp, "Can't open /dev/mem: %s", strerror(errno));
+return;
+}
+host_vbios = mmap(NULL, bios_size,
+PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, fd, vbios_addr);
+close(fd);
+
+if (host_vbios == MAP_FAILED) {
+error_setg(errp, "Failed to mmap host vbios: %s", strerror(errno));
+return;
+}
+
+memory_region_init_ram(&s->dev.rom, OBJECT(&s->dev),
+"legacy_vbios.rom", bios_size, &error_abort);
+guest_bios = memory_region_get_ram_ptr(&s->dev.rom);
+memcpy(guest_bios, host_vbios, bios_size);
+
+if (munmap(host_vbios, bios_size) == -1) {
+XEN_PT_LOG(&s->dev, "Failed to unmap host vbios: %s\n", 
strerror(errno));
+}
+
+cpu_physical_memory_write(vbios_addr, guest_bios, bios_size);
+memory_region_set_address(&s->dev.rom, vbios_addr);
+pci_register_bar(&s->d

[PATCH v2 0/2] Fix QEMU crashes when passing IGD to a guest VM under XEN

2020-04-28 Thread Grzegorz Uriasz

This is the v1 cover letter - the patches now include a detailed description of 
the changes.

Hi,

This patch series is a small subset of a bigger patch set spanning few projects 
aiming to isolate the GPU
in QUBES OS to a dedicated security domain. I'm doing this together with 3 
colleagues as part of our Bachelors thesis.

When passing an Intel Graphic Device to a HVM guest under XEN, QEMU sometimes 
crashes
when starting the VM. It turns out that the code responsible for setting up
the legacy VBIOS for the IGD contains a bug which results in a memcpy of an 
undefined size
between the QEMU heap and the physical memory of the guest.

If the size of the memcpy is small enough qemu does not crash - this means that 
this
bug is actually a small security issue - a hostile guest kernel might determine 
the memory layout of
QEMU simply by looking at physical memory beyond 0xd - this defeats ASLR 
and might make exploitation
easier if other issues were to be found.

The problem is the current mechanism for obtaining a copy of the ROM of the IGD.
We first allocate a buffer which holds the vbios - the size of which is 
obtained from sysfs.
We then try to read the rom from sysfs, if we fail then we just return without 
setting the size of the buffer.
This would be ok if the size of the ROM reported by sysfs would be 0, but the 
size is always 32 pages as this corresponds
to legacy memory ranges. It turns out that reading the ROM fails on every 
single device I've tested(spanning few
generations of IGD), which means qemu never sets the size of the buffer and 
returns a valid pointer to code which
basically does a memcpy of an undefined size.

I'm including two patches.
The first one fixes the security issue by making failing to read the ROM from 
sysfs fatal.
The second patch introduces a better method for obtaining the VBIOS. I've 
haven't yet seen a single device on which
the old code was working, the new code basically creates a shadow copy directly 
by reading from /dev/mem - this
should be fine as a quick grep of the codebase shows that this approach is 
already being used to handle MSI.
I've tested the new code on few different laptops and it works fine and the 
guest VMS finally stopped complaining that
the VBIOS tables are missing.

Grzegorz Uriasz (2):
  Fix undefined behaviour
  Improve legacy vbios handling

 hw/xen/xen_pt.c  |  8 +--
 hw/xen/xen_pt_graphics.c | 48 +---
 hw/xen/xen_pt_load_rom.c | 13 +--
 3 files changed, 57 insertions(+), 12 deletions(-)

-- 
2.26.1

Re: [PATCH] x86/xen: drop an unused parameter gsi_override

2020-04-28 Thread Boris Ostrovsky



On 4/28/20 11:36 AM, Wei Liu wrote:
> All callers within the same file pass in -1 (no override).
>
> Signed-off-by: Wei Liu 


Reviewed-by: Boris Ostrovsky

[linux-linus test] 149854: tolerable FAIL - PUSHED

2020-04-28 Thread osstest service owner

flight 149854 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149854/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-arm64-arm64-examine  8 reboot   fail in 149840 pass in 149854
 test-amd64-amd64-examine4 memdisk-try-append fail in 149840 pass in 149854
 test-arm64-arm64-xl-xsm   7 xen-boot   fail pass in 149840

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 149832

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10  fail blocked in 149832
 test-arm64-arm64-xl-xsm 13 migrate-support-check fail in 149840 never pass
 test-arm64-arm64-xl-xsm 14 saverestore-support-check fail in 149840 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 149832
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 149832
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 149832
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 149832
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 149832
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 149832
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 149832
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 149832
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 linux51184ae37e0518fd90cb437a2fbc953ae558cd0d
baseline version:
 linux6a8b55ed4056ea5559ebe4f6a4b247f627870d4c

Last test of basis

Re: [Xen-devel] [BUG] panic: "IO-APIC + timer doesn't work" - several people have reproduced

2020-04-28 Thread Jason Andryuk

On Fri, Apr 17, 2020 at 5:31 AM Jan Beulich  wrote:
>
> On 17.03.2020 16:23, Jason Andryuk wrote:
> > Below is the diff.  It was messier and I tidied it up some.
>
> I've looked into this some more. I can see how what we currently
> do is not in line with firmware handing off with LegacyReplacement
> mode enabled. However, this case doesn't look to apply here:

Thanks for taking a look again.  Sorry for the slow response.  I
wanted to dig through the linux history to see if there was a reason
for enabling the periodic timer there, but I haven't gotten around to
it.

> So right now the only possible approach I see to address your
> problem is to add yet another fallback mode to check_timer(),
> forcing LegacyReplacement mode to be enabled. But between /
> after which step(s) to put this there isn't at all obvious to me.

I don't really have a good suggestion here.

Aaron, I'm curious to know if you've tried this patch on your hardware
and if it helped.

Thanks again,
Jason

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred

2020-04-28 Thread Julien Grall

Hi Juergen,

On Tue, 28 Apr 2020 at 16:53, Juergen Gross  wrote:
>
> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
> of the domain's xenstore ring page and the event channel of the
> domain for communicating with Xenstore.
>
> The gfn is not really needed. It is stored in the per-domain struct
> in xenstored and in case of another XS_INTRODUCE for the domain it
> is tested to match the original value. If it doesn't match the
> command is aborted via EINVAL.
>
> Today there shouldn't be multiple XS_INTRODUCE requests for the same
> domain issued, so the mfn/gfn can just be ignored and multiple
> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.

So there is a comment in the else part:

/* Use XS_INTRODUCE for recreating the xenbus event-channel. */

>From the commit message this is not entirely clear why we want to
prevent recreating the event-channel. Can you expand it?

>
> Signed-off-by: Juergen Gross 
> ---
>  tools/xenstore/xenstored_domain.c | 47 
> ---
>  1 file changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/tools/xenstore/xenstored_domain.c 
> b/tools/xenstore/xenstored_domain.c
> index 5858185211..17328f9fc9 100644
> --- a/tools/xenstore/xenstored_domain.c
> +++ b/tools/xenstore/xenstored_domain.c
> @@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct 
> buffered_data *in)
> struct domain *domain;
> char *vec[3];
> unsigned int domid;
> -   unsigned long mfn;
> evtchn_port_t port;
> int rc;
> struct xenstore_domain_interface *interface;
> @@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct 
> buffered_data *in)
> return EACCES;
>
> domid = atoi(vec[0]);
> -   mfn = atol(vec[1]);
> +   /* Ignore the mfn, we don't need it. */

s/mfn/GFN/

> port = atoi(vec[2]);
>
> /* Sanity check args. */
> @@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct 
> buffered_data *in)
>
> domain = find_domain_by_domid(domid);
>
> -   if (domain == NULL) {
> -   interface = map_interface(domid);
> -   if (!interface)
> -   return errno;
> -   /* Hang domain off "in" until we're finished. */
> -   domain = new_domain(in, domid, port);
> -   if (!domain) {
> -   rc = errno;
> -   unmap_interface(interface);
> -   return rc;
> -   }
> -   domain->interface = interface;
> -   domain->mfn = mfn;

AFAICT domain->mfn is not used anymore, so could we remove the field?

Cheers,

Re: [PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred

2020-04-28 Thread Andrew Cooper

On 28/04/2020 16:51, Juergen Gross wrote:
> The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
> of the domain's xenstore ring page and the event channel of the
> domain for communicating with Xenstore.
>
> The gfn is not really needed. It is stored in the per-domain struct
> in xenstored and in case of another XS_INTRODUCE for the domain it
> is tested to match the original value. If it doesn't match the
> command is aborted via EINVAL.
>
> Today there shouldn't be multiple XS_INTRODUCE requests for the same
> domain issued, so the mfn/gfn can just be ignored and multiple
> XS_INTRODUCE commands can be rejected without testing the mfn/gfn.
>
> Signed-off-by: Juergen Gross 

In hindsight, this is cleanup following c/s 38eeb3864d "tools/xenstored:
Drop mapping of the ring via foreign map".

Acked-by: Andrew Cooper

[xen-unstable-smoke test] 149863: tolerable all pass - PUSHED

2020-04-28 Thread osstest service owner

flight 149863 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149863/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  4ec07971f1c5a236a0d8c528d806efb6bfd3d1a6
baseline version:
 xen  2add344e6a4ef690871157b87b0e7d52bc6db9e4

Last test of basis   149857  2020-04-28 12:00:23 Z0 days
Testing same since   149863  2020-04-28 16:01:08 Z0 days1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Hongyan Xia 
  Jan Beulich 
  Juergen Gross 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   2add344e6a..4ec07971f1  4ec07971f1c5a236a0d8c528d806efb6bfd3d1a6 -> smoke

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Stefano Stabellini

On Tue, 28 Apr 2020, Jürgen Groß wrote:
> On 28.04.20 09:33, peng@nxp.com wrote:
> > From: Peng Fan 
> > 
> > When booting xen on i.MX8QM, met:
> > "
> > [3.602128] Unable to handle kernel paging request at virtual address
> > 00272d40
> > [3.610804] Mem abort info:
> > [3.613905]   ESR = 0x9604
> > [3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [3.623211]   SET = 0, FnV = 0
> > [3.626628]   EA = 0, S1PTW = 0
> > [3.630128] Data abort info:
> > [3.633362]   ISV = 0, ISS = 0x0004
> > [3.637630]   CM = 0, WnR = 0
> > [3.640955] [00272d40] user address but active_mm is swapper
> > [3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > [3.654137] Modules linked in:
> > [3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
> > [3.677302] Workqueue: events deferred_probe_work_func
> > [3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
> > [3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
> > [3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
> > [3.693993] pci_bus :00: root bus resource [bus 00-ff]
> > [3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
> > "
> > 
> > In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask) or
> > range_straddles_page_boundary(phys, size) are true, it will
> > create contiguous region. So when free, we need to free contiguous
> > region use upper check condition.
> 
> No, this will break PV guests on x86.
> 
> I think there is something wrong with your setup in combination with
> the ARM xen_create_contiguous_region() implementation.
> 
> Stefano?

Let me start by asking Peng a couple of questions:

Peng, could you please add a printk to check which one of the two
conditions is True for you?  Is it (dev_addr + size - 1 > dma_mask) or
range_straddles_page_boundary(phys, size)?

Is hwdev->coherent_dma_mask set for your DMA capable device?

Finally, is this patch supposed to fix the crash you are seeing? If not,
can you tell where is the crash exactly?

Juergen, keep in mind that xen_create_contiguous_region does nothing on
ARM because in dom0 guest_phys == phys. xen_create_contiguous_region
simply sets dma_handle to phys. Whatever condition caused the code to
take the xen_create_contiguous_region branch in
xen_swiotlb_alloc_coherent, it will also cause it to WARN in
xen_swiotlb_free_coherent.

range_straddles_page_boundary should never return True because
guest_phys == phys. That leaves us with the dma_mask check:

  dev_addr + size - 1 <= dma_mask

dev_addr is the dma_handle allocated by xen_alloc_coherent_pages.
dma_mask is either DMA_BIT_MASK(32) or hwdev->coherent_dma_mask.

The implementation of xen_alloc_coherent_pages has been recently changed
to use dma_direct_alloc.

Christoff, does dma_direct_alloc respect hwdev->coherent_dma_mask if
present? Also, can it return highmem pages?

> Juergen
> 
> > 
> > Signed-off-by: Peng Fan 
> > ---
> >   drivers/xen/swiotlb-xen.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index b6d27762c6f8..ab96e468584f 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t
> > size, void *vaddr,
> > /* Convert the size to actually allocated. */
> > size = 1UL << (order + XEN_PAGE_SHIFT);
> >   - if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> > -range_straddles_page_boundary(phys, size)) &&
> > +   if (((dev_addr + size - 1 > dma_mask) ||
> > +   range_straddles_page_boundary(phys, size)) &&
> > TestClearPageXenRemapped(virt_to_page(vaddr)))
> > xen_destroy_contiguous_region(phys, order);
> >   
>

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Joe Jin

On 4/28/20 10:25 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 28, 2020 at 12:19:41PM +0200, Jürgen Groß wrote:
>> On 28.04.20 10:25, Peng Fan wrote:
> 
> Adding Joe Jin.
> 
> Joe, didn't you have some ideas on how this could be implemented?
> 
 Subject: Re: [PATCH] xen/swiotlb: correct the check for
 xen_destroy_contiguous_region

 On 28.04.20 09:33, peng@nxp.com wrote:
> From: Peng Fan 
>
> When booting xen on i.MX8QM, met:
> "
> [3.602128] Unable to handle kernel paging request at virtual address
 00272d40
> [3.610804] Mem abort info:
> [3.613905]   ESR = 0x9604
> [3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
> [3.623211]   SET = 0, FnV = 0
> [3.626628]   EA = 0, S1PTW = 0
> [3.630128] Data abort info:
> [3.633362]   ISV = 0, ISS = 0x0004
> [3.637630]   CM = 0, WnR = 0
> [3.640955] [00272d40] user address but active_mm is
 swapper
> [3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
> [3.654137] Modules linked in:
> [3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
> [3.677302] Workqueue: events deferred_probe_work_func
> [3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
> [3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
> [3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
> [3.693993] pci_bus :00: root bus resource [bus 00-ff]
> [3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
> "
>
> In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
> or range_straddles_page_boundary(phys, size) are true, it will create
> contiguous region. So when free, we need to free contiguous region use
> upper check condition.

 No, this will break PV guests on x86.
>>>
>>> Could you share more details why alloc and free not matching for the check?
>>
>> xen_create_contiguous_region() is needed only in case:
>>
>> - the bus address is not within dma_mask, or
>> - the memory region is not physically contiguous (can happen only for
>>   PV guests)
>>
>> In any case it should arrange for the memory to be suitable for the
>> DMA operation, so to be contiguous and within dma_mask afterwards. So
>> xen_destroy_contiguous_region() should only ever called for areas
>> which match above criteria, as otherwise we can be sure
>> xen_create_contiguous_region() was not used for making the area DMA-able
>> in the beginning.

I agreed with Juergen's explanation, That is my understanding.

Peng, if panic caused by (dev_addr + size - 1 > dma_mask), you should check
how you get the addr, if memory created by xen_create_contiguous_region(),
memory must be with in [0 - dma_mask].

Thanks,
Joe

>>
>> And this is very important in the PV case, as in those guests the page
>> tables are containing the host-PFNs, not the guest-PFNS, and
>> xen_create_contiguous_region() will fiddle with host- vs. guest-PFN
>> arrangements, and xen_destroy_contiguous_region() is reverting this
>> fiddling. Any call of xen_destroy_contiguous_region() for an area it
>> was not intended to be called for might swap physical pages beneath
>> random virtual addresses, which was the reason for this test to be
>> added by me.
>>
>>
>> Juergen
>>
>>>
>>> Thanks,
>>> Peng.
>>>

 I think there is something wrong with your setup in combination with the 
 ARM
 xen_create_contiguous_region() implementation.

 Stefano?


 Juergen

>
> Signed-off-by: Peng Fan 
> ---
>drivers/xen/swiotlb-xen.c | 4 ++--
>1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b6d27762c6f8..ab96e468584f 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,
 size_t size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>
> - if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> -  range_straddles_page_boundary(phys, size)) &&
> + if (((dev_addr + size - 1 > dma_mask) ||
> + range_straddles_page_boundary(phys, size)) &&
>   TestClearPageXenRemapped(virt_to_page(vaddr)))
>   xen_destroy_contiguous_region(phys, order);
>
>
>>>
>>

Re: Golang Xen packages and the golang packaging system

2020-04-28 Thread Nick Rosbrook

> BTW the separate repo isn’t off the table.  But there were some things other 
> Ian pointed out:

After trying (and failing) to get a go module with a remote import
path like `golang.xenproject.org/xenlight` defined in xen.git, I would
like to circle back to the separate repo.

In theory, modules are not supposed to be tightly coupled with vcs,
but in practice they are. It seems like you cannot define a module
with a remote import path without pointing to the repo *root*. I tried
many ways, but every attempt resulted in `go get enr0n.net/xenlight`
downloading the root of xen.git as the "module", while the actual
module in tools/golang/xenlight was excluded since it was seen as a
nested go module. If you want to get around these issues, I think you
would need to define your own module proxy. Or, we would need to just
give up on the "vanity" URL and use
`xenbits.xen.org/git-http/xen.git/tools/golang/xenlight` as the import
path.

For reference, I did get a test module `enr0n.net/xenlight` that
points to github.com/enr0n/xenlight working. That part is trivial if
you are able to point to the repo root.

Overall, between fighting with Go modules, tagging versions, and
making sure code is re-generated on IDL changes, it seems to me that
there are more negatives with putting the module in xen.git than there
are with putting the module in its own repo.

> 1. The GPL requires that you provide the “preferred form for modification” to 
> all the code.  I’m not sure this has been adjudicated in court, but there’s a 
> strong argument that *generated* code doesn’t match that criteria: that to 
> satisfy the GPL you’d need to include libxl_types.idl, idl.py, gengotypes.py, 
> and a Makefile suitable for tying them all together.  (Not that the 
> generation needs to be run with `go build`, but that ideally the 
> infrastructure would be there so that it *could* be run.)

Is there anything involved in solving this issue besides making sure
those files are copied to the repo in addition to the generated go
files? Or is there some concern in doing so?

> 2. Ian was concerned with how someone using the bindings would submit a patch 
> upstream.  Suppose someone cloned our “bindings” repo, made some changes so 
> that it worked for them, then wanted to submit the patch upstream.  How would 
> they do that?

I think we could mostly solve this with a good README explaining what
to do. It's not uncommon to see (a) generated Go code with // DO NOT
EDIT at the top, and (b) repos (mirrors) on github that do not accept
PRs. Or am I oversimplifying?

Thanks,
-NR

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Konrad Rzeszutek Wilk

On Tue, Apr 28, 2020 at 12:19:41PM +0200, Jürgen Groß wrote:
> On 28.04.20 10:25, Peng Fan wrote:

Adding Joe Jin.

Joe, didn't you have some ideas on how this could be implemented?

> > > Subject: Re: [PATCH] xen/swiotlb: correct the check for
> > > xen_destroy_contiguous_region
> > > 
> > > On 28.04.20 09:33, peng@nxp.com wrote:
> > > > From: Peng Fan 
> > > > 
> > > > When booting xen on i.MX8QM, met:
> > > > "
> > > > [3.602128] Unable to handle kernel paging request at virtual address
> > > 00272d40
> > > > [3.610804] Mem abort info:
> > > > [3.613905]   ESR = 0x9604
> > > > [3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
> > > > [3.623211]   SET = 0, FnV = 0
> > > > [3.626628]   EA = 0, S1PTW = 0
> > > > [3.630128] Data abort info:
> > > > [3.633362]   ISV = 0, ISS = 0x0004
> > > > [3.637630]   CM = 0, WnR = 0
> > > > [3.640955] [00272d40] user address but active_mm is
> > > swapper
> > > > [3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > > > [3.654137] Modules linked in:
> > > > [3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
> > > > [3.677302] Workqueue: events deferred_probe_work_func
> > > > [3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
> > > > [3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
> > > > [3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
> > > > [3.693993] pci_bus :00: root bus resource [bus 00-ff]
> > > > [3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
> > > > "
> > > > 
> > > > In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
> > > > or range_straddles_page_boundary(phys, size) are true, it will create
> > > > contiguous region. So when free, we need to free contiguous region use
> > > > upper check condition.
> > > 
> > > No, this will break PV guests on x86.
> > 
> > Could you share more details why alloc and free not matching for the check?
> 
> xen_create_contiguous_region() is needed only in case:
> 
> - the bus address is not within dma_mask, or
> - the memory region is not physically contiguous (can happen only for
>   PV guests)
> 
> In any case it should arrange for the memory to be suitable for the
> DMA operation, so to be contiguous and within dma_mask afterwards. So
> xen_destroy_contiguous_region() should only ever called for areas
> which match above criteria, as otherwise we can be sure
> xen_create_contiguous_region() was not used for making the area DMA-able
> in the beginning.
> 
> And this is very important in the PV case, as in those guests the page
> tables are containing the host-PFNs, not the guest-PFNS, and
> xen_create_contiguous_region() will fiddle with host- vs. guest-PFN
> arrangements, and xen_destroy_contiguous_region() is reverting this
> fiddling. Any call of xen_destroy_contiguous_region() for an area it
> was not intended to be called for might swap physical pages beneath
> random virtual addresses, which was the reason for this test to be
> added by me.
> 
> 
> Juergen
> 
> > 
> > Thanks,
> > Peng.
> > 
> > > 
> > > I think there is something wrong with your setup in combination with the 
> > > ARM
> > > xen_create_contiguous_region() implementation.
> > > 
> > > Stefano?
> > > 
> > > 
> > > Juergen
> > > 
> > > > 
> > > > Signed-off-by: Peng Fan 
> > > > ---
> > > >drivers/xen/swiotlb-xen.c | 4 ++--
> > > >1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > > > index b6d27762c6f8..ab96e468584f 100644
> > > > --- a/drivers/xen/swiotlb-xen.c
> > > > +++ b/drivers/xen/swiotlb-xen.c
> > > > @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,
> > > size_t size, void *vaddr,
> > > > /* Convert the size to actually allocated. */
> > > > size = 1UL << (order + XEN_PAGE_SHIFT);
> > > > 
> > > > -   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> > > > -range_straddles_page_boundary(phys, size)) &&
> > > > +   if (((dev_addr + size - 1 > dma_mask) ||
> > > > +   range_straddles_page_boundary(phys, size)) &&
> > > > TestClearPageXenRemapped(virt_to_page(vaddr)))
> > > > xen_destroy_contiguous_region(phys, order);
> > > > 
> > > > 
> > 
>

[xen-4.12-testing test] 149845: tolerable FAIL - PUSHED

2020-04-28 Thread osstest service owner

flight 149845 xen-4.12-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149845/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-qcow217 guest-localmigrate/x10   fail  like 149646
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stop fail never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop  fail never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stop fail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 xen  e6a2681148382e9227f54b70a5df8e895914c877
baseline version:
 xen  3536f8dc39cc7311715340b87a04a89fac468406

Last test of basis   149646  2020-04-14 13:05:53 Z   14 days
Testing same since   149838  2020-04-27 14:06:02 Z1 days2 attempts


People who touched revisions under test:
  Andrew Cooper 
  Anthony PERARD 
  Harsha Shamsundara Havanur 
  Jan Beulich 
  Julien Grall 
  Roger Pau Monné 
  Wei Liu 

jobs:
 build-amd64-xsm  pass
 build-

[xen-unstable test] 149842: tolerable FAIL - PUSHED

2020-04-28 Thread osstest service owner

flight 149842 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149842/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10  fail blocked in 149831
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 149831
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 149831
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 149831
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 149831
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 149831
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 149831
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 149831
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 149831
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop fail like 149831
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass

version targeted for testing:
 xen  df669de074c395a3b2eeb975fddd3da4c148da13
baseline version:
 xen  f093b08c47b39da6019421a2b61d40745b3e573b

Last test of basis   149831  2020-04-27 01:52:33 Z1 days
Failing since149835  2020-04-27 12:07:17 Z1 days2 attempts
Testing same since   149842  2020-04-27 23:38:49 Z0 days1 attempts


People who touched revisions under test:
  Anthony PERARD 
  George Dunlap 
  Jan Beulich 
  Julien Grall 
  Nick Rosbrook 
  Nick Rosbrook 
  Stefano Stabellini 
  Wei Liu 

jobs:

Re: Xen network domain performance for 10Gb NIC

2020-04-28 Thread Roger Pau Monné

On Tue, Apr 28, 2020 at 04:08:08PM +, tosher 1 wrote:
> > Do you get the expected performance from the driver domain when not
> > using it as a backend? Ie: running the iperf benchmarks directly on
> > the driver domain and not on the guest.
> 
> 
> Yes, the bandwidth between the driver domain and the client machine is close 
> to 10Gbits/sec.

Also note that doing grant map / unmap operations from HVM domains is
much more expensive than doing them from PV domains, even without an
IOMMU. Xen needs to modify the domain p2m and issue a flush, and then
the guest (usually) needs to map the newly populated memory in it's
page tables in order to access it.

This and the addition of the IOMMU TLB flush could well explain why
you are seeing such a degraded performance. I'm afraid the only way to
figure out exactly what is causing the bottleneck will be to time the
different operations in Xen. You can compare against a PV dom0 in
order to get an idea.

Roger.

Re: [PATCH] x86/pass-through: avoid double IRQ unbind during domain cleanup

2020-04-28 Thread Roger Pau Monné

On Tue, Apr 28, 2020 at 02:21:48PM +0200, Jan Beulich wrote:
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTs.
> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
>   domain_kill()
> -> domain_relinquish_resources()
>   -> pci_release_devices()
> -> pci_clean_dpci_irq()
>   -> pirq_guest_unbind()
> -> __pirq_guest_unbind()
> 
> Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
> from the tree being iterated after the first call there. In case such a
> removed entry still has a softirq outstanding, record it and re-check
> upon re-invocation.
> 
> Reported-by: Varad Gautam 
> Signed-off-by: Jan Beulich 
> Tested-by: Varad Gautam 
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
>  }
>  
>  if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
> -BUG();
> +BUG_ON(!d->is_dying);

I think to keep the previous behavior this should be:

BUG_ON(!is_hvm_domain(d) || !d->is_dying);

Since the pirqs will only be removed elsewhere if the domain is HVM?

Thanks, Roger.

Re: Xen network domain performance for 10Gb NIC

2020-04-28 Thread tosher 1

> Do you get the expected performance from the driver domain when not
> using it as a backend? Ie: running the iperf benchmarks directly on
> the driver domain and not on the guest.


Yes, the bandwidth between the driver domain and the client machine is close to 
10Gbits/sec.

Re: [PATCH 5/6] x86/pv: map and unmap page tables in mark_pv_pt_pages_rdonly

2020-04-28 Thread Hongyan Xia

On Tue, 2020-04-28 at 16:55 +0100, Hongyan Xia wrote:
> On Tue, 2020-04-28 at 17:33 +0200, Jan Beulich wrote:
> > On 17.04.2020 11:52, Hongyan Xia wrote:
> > > --- a/xen/arch/x86/pv/dom0_build.c
> > > +++ b/xen/arch/x86/pv/dom0_build.c
> > > @@ -50,17 +50,17 @@ static __init void
> > > mark_pv_pt_pages_rdonly(struct domain *d,
> > >  unsigned long count;
> > >  struct page_info *page;
> > >  l4_pgentry_t *pl4e;
> > > -l3_pgentry_t *pl3e;
> > > -l2_pgentry_t *pl2e;
> > > -l1_pgentry_t *pl1e;
> > > +l3_pgentry_t *pl3e, *l3t;
> > > +l2_pgentry_t *pl2e, *l2t;
> > > +l1_pgentry_t *pl1e, *l1t;
> > 
> > I don't quite see why the new local variables get introduced:
> > unmap_domain_page(), iirc, is quite fine with a non-page-
> > aligned argument.
> 
> You are right, although in this function, where plXe points to may
> not
> be the page we want to unmap. When plXe becomes aligned and points to
> a
> new page, we actually want to unmap the page before it increments to
> an
> aligned value.

Hmm, we can just unmap(plXe - 1) if my logic is correct, and save 3
local variables.

Hongyan

Re: [PATCH 5/6] x86/pv: map and unmap page tables in mark_pv_pt_pages_rdonly

2020-04-28 Thread Hongyan Xia

On Tue, 2020-04-28 at 17:33 +0200, Jan Beulich wrote:
> On 17.04.2020 11:52, Hongyan Xia wrote:
> > --- a/xen/arch/x86/pv/dom0_build.c
> > +++ b/xen/arch/x86/pv/dom0_build.c
> > @@ -50,17 +50,17 @@ static __init void
> > mark_pv_pt_pages_rdonly(struct domain *d,
> >  unsigned long count;
> >  struct page_info *page;
> >  l4_pgentry_t *pl4e;
> > -l3_pgentry_t *pl3e;
> > -l2_pgentry_t *pl2e;
> > -l1_pgentry_t *pl1e;
> > +l3_pgentry_t *pl3e, *l3t;
> > +l2_pgentry_t *pl2e, *l2t;
> > +l1_pgentry_t *pl1e, *l1t;
> 
> I don't quite see why the new local variables get introduced:
> unmap_domain_page(), iirc, is quite fine with a non-page-
> aligned argument.

You are right, although in this function, where plXe points to may not
be the page we want to unmap. When plXe becomes aligned and points to a
new page, we actually want to unmap the page before it increments to an
aligned value.

Hongyan

[PATCH] tools/xenstore: don't store domU's mfn of ring page in xensotred

2020-04-28 Thread Juergen Gross

The XS_INTRODUCE command has two parameters: the mfn (or better: gfn)
of the domain's xenstore ring page and the event channel of the
domain for communicating with Xenstore.

The gfn is not really needed. It is stored in the per-domain struct
in xenstored and in case of another XS_INTRODUCE for the domain it
is tested to match the original value. If it doesn't match the
command is aborted via EINVAL.

Today there shouldn't be multiple XS_INTRODUCE requests for the same
domain issued, so the mfn/gfn can just be ignored and multiple
XS_INTRODUCE commands can be rejected without testing the mfn/gfn.

Signed-off-by: Juergen Gross 
---
 tools/xenstore/xenstored_domain.c | 47 ---
 1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/tools/xenstore/xenstored_domain.c 
b/tools/xenstore/xenstored_domain.c
index 5858185211..17328f9fc9 100644
--- a/tools/xenstore/xenstored_domain.c
+++ b/tools/xenstore/xenstored_domain.c
@@ -369,7 +369,6 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)
struct domain *domain;
char *vec[3];
unsigned int domid;
-   unsigned long mfn;
evtchn_port_t port;
int rc;
struct xenstore_domain_interface *interface;
@@ -381,7 +380,7 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)
return EACCES;
 
domid = atoi(vec[0]);
-   mfn = atol(vec[1]);
+   /* Ignore the mfn, we don't need it. */
port = atoi(vec[2]);
 
/* Sanity check args. */
@@ -390,34 +389,26 @@ int do_introduce(struct connection *conn, struct 
buffered_data *in)
 
domain = find_domain_by_domid(domid);
 
-   if (domain == NULL) {
-   interface = map_interface(domid);
-   if (!interface)
-   return errno;
-   /* Hang domain off "in" until we're finished. */
-   domain = new_domain(in, domid, port);
-   if (!domain) {
-   rc = errno;
-   unmap_interface(interface);
-   return rc;
-   }
-   domain->interface = interface;
-   domain->mfn = mfn;
-
-   /* Now domain belongs to its connection. */
-   talloc_steal(domain->conn, domain);
-
-   fire_watches(NULL, in, "@introduceDomain", false);
-   } else if ((domain->mfn == mfn) && (domain->conn != conn)) {
-   /* Use XS_INTRODUCE for recreating the xenbus event-channel. */
-   if (domain->port)
-   xenevtchn_unbind(xce_handle, domain->port);
-   rc = xenevtchn_bind_interdomain(xce_handle, domid, port);
-   domain->port = (rc == -1) ? 0 : rc;
-   domain->remote_port = port;
-   } else
+   if (domain)
return EINVAL;
 
+   interface = map_interface(domid);
+   if (!interface)
+   return errno;
+   /* Hang domain off "in" until we're finished. */
+   domain = new_domain(in, domid, port);
+   if (!domain) {
+   rc = errno;
+   unmap_interface(interface);
+   return rc;
+   }
+   domain->interface = interface;
+
+   /* Now domain belongs to its connection. */
+   talloc_steal(domain->conn, domain);
+
+   fire_watches(NULL, in, "@introduceDomain", false);
+
domain_conn_reset(domain);
 
send_ack(conn, XS_INTRODUCE);
-- 
2.16.4

Re: [PATCH 5/6] x86/pv: map and unmap page tables in mark_pv_pt_pages_rdonly

2020-04-28 Thread Wei Liu

On Tue, Apr 28, 2020 at 05:33:29PM +0200, Jan Beulich wrote:
> On 17.04.2020 11:52, Hongyan Xia wrote:
> > --- a/xen/arch/x86/pv/dom0_build.c
> > +++ b/xen/arch/x86/pv/dom0_build.c
> > @@ -50,17 +50,17 @@ static __init void mark_pv_pt_pages_rdonly(struct 
> > domain *d,
> >  unsigned long count;
> >  struct page_info *page;
> >  l4_pgentry_t *pl4e;
> > -l3_pgentry_t *pl3e;
> > -l2_pgentry_t *pl2e;
> > -l1_pgentry_t *pl1e;
> > +l3_pgentry_t *pl3e, *l3t;
> > +l2_pgentry_t *pl2e, *l2t;
> > +l1_pgentry_t *pl1e, *l1t;
> 
> I don't quite see why the new local variables get introduced:
> unmap_domain_page(), iirc, is quite fine with a non-page-
> aligned argument.

(Assuming this is actually written by me)

I wanted to make things abundantly clear: plXe points to an entry while
lXt points to the start of a page table.

In a long function the distinction could be helpful; in a short function
(like this one?) not so much.

Wei.

> 
> Jan

RE: [PATCH v2 4/5] common/domain: add a domain context record for shared_info...

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Julien Grall 
> Sent: 20 April 2020 18:35
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Ian Jackson 
> ; Wei Liu ;
> Andrew Cooper ; George Dunlap 
> ; Jan Beulich
> ; Stefano Stabellini 
> Subject: Re: [PATCH v2 4/5] common/domain: add a domain context record for 
> shared_info...
> 
> Hi Paul,
> 
> On 07/04/2020 18:38, Paul Durrant wrote:
> > ... and update xen-domctx to dump some information describing the record.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Ian Jackson 
> > Cc: Wei Liu 
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Stefano Stabellini 
> >
> > v2:
> >   - Drop the header change to define a 'Xen' page size and instead use a
> > variable length struct now that the framework makes this is feasible
> >   - Guard use of 'has_32bit_shinfo' in common code with CONFIG_COMPAT
> > ---
> >   tools/misc/xen-domctx.c   | 11 ++
> >   xen/common/domain.c   | 81 +++
> >   xen/include/public/save.h | 10 -
> >   3 files changed, 101 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/misc/xen-domctx.c b/tools/misc/xen-domctx.c
> > index d663522a8b..a8d3922321 100644
> > --- a/tools/misc/xen-domctx.c
> > +++ b/tools/misc/xen-domctx.c
> > @@ -59,6 +59,16 @@ static void dump_header(struct domain_save_descriptor 
> > *desc)
> >   off += desc->length;
> >   }
> >
> > +static void dump_shared_info(struct domain_save_descriptor *desc)
> > +{
> > +DOMAIN_SAVE_TYPE(SHARED_INFO) s;
> > +READ(s);
> > +printf("SHARED_INFO: field_width %u buffer size: %lu\n",
> > +   s.field_width, desc->length - sizeof(s));
> > +
> > +off += desc->length;
> > +}
> > +
> >   static void dump_end(struct domain_save_descriptor *desc)
> >   {
> >   DOMAIN_SAVE_TYPE(END) e;
> > @@ -125,6 +135,7 @@ int main(int argc, char **argv)
> >   switch (desc.typecode)
> >   {
> >   case DOMAIN_SAVE_CODE(HEADER): dump_header(&desc); break;
> > +case DOMAIN_SAVE_CODE(SHARED_INFO): dump_shared_info(&desc); break;
> >   case DOMAIN_SAVE_CODE(END): dump_end(&desc); return 0;
> >   default:
> >   printf("Unknown type %u: skipping\n", desc.typecode);
> > diff --git a/xen/common/domain.c b/xen/common/domain.c
> > index 3dcd73f67c..8b72462e07 100644
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -33,6 +33,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -1646,6 +1647,86 @@ int continue_hypercall_on_cpu(
> >   return 0;
> >   }
> >
> > +static int save_shared_info(const struct vcpu *v, struct domain_context *c,
> > +bool dry_run)
> > +{
> > +struct domain *d = v->domain;
> > +struct domain_shared_info_context ctxt = {};
> > +size_t hdr_size = offsetof(typeof(ctxt), buffer);
> > +size_t size = hdr_size + PAGE_SIZE;
> > +int rc;
> > +
> > +rc = DOMAIN_SAVE_BEGIN(SHARED_INFO, c, v, size);
> > +if ( rc )
> > +return rc;
> > +
> > +if ( !dry_run )
> 
> NIT: I think the if is not necessary here as you don't skip that much code.
> 

I know, but it is illustrative so I'd rather keep it.

> > +ctxt.field_width =
> > +#ifdef CONFIG_COMPAT
> > +has_32bit_shinfo(d) ? 4 :
> > +#endif
> > +8;
> > +
> > +rc = domain_save_data(c, &ctxt, hdr_size);
> > +if ( rc )
> > +return rc;
> > +
> > +rc = domain_save_data(c, d->shared_info, PAGE_SIZE);
> > +if ( rc )
> > +return rc;
> > +
> > +return domain_save_end(c);
> > +}
> > +
> > +static int load_shared_info(struct vcpu *v, struct domain_context *c)
> > +{
> > +struct domain *d = v->domain;
> > +struct domain_shared_info_context ctxt = {};
> > +size_t hdr_size = offsetof(typeof(ctxt), buffer);
> > +size_t size = hdr_size + PAGE_SIZE;
> > +unsigned int i;
> > +int rc;
> > +
> > +rc = DOMAIN_LOAD_BEGIN(SHARED_INFO, c, v, size, true);
> > +if ( rc )
> > +return rc;
> > +
> > +rc = domain_load_data(c, &ctxt, hdr_size);
> > +if ( rc )
> > +return rc;
> > +
> > +for ( i = 0; i < ARRAY_SIZE(ctxt.pad); i++ )
> > +if ( ctxt.pad[i] )
> > +return -EINVAL;
> > +
> > +switch ( ctxt.field_width )
> > +{
> > +#ifdef CONFIG_COMPAT
> > +case 4:
> > +d->arch.has_32bit_shinfo = 1;
> > +break;
> > +#endif
> > +case 8:
> > +#ifdef CONFIG_COMPAT
> > +d->arch.has_32bit_shinfo = 0;
> > +#endif
> > +break;
> > +
> > +default:
> > +rc = -EINVAL;
> > +break;
> > +}
> > +
> > +rc = domain_load_data(c, d->shared_info, PAGE_SIZE);
> > +if ( rc )
> > +return rc;
> > +
> > +return domain_load_end(c);
> > +}
> > +
> > +DOMAIN_REGISTER_SAVE_RESTORE(SHARED_INFO, false, save_shared_info,
> > +

RE: [PATCH v2 2/5] xen/common/domctl: introduce XEN_DOMCTL_get/setdomaincontext

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Julien Grall 
> Sent: 20 April 2020 18:26
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Daniel De Graaf 
> ; Ian Jackson
> ; Wei Liu ; Andrew Cooper 
> ; George
> Dunlap ; Jan Beulich ; Stefano 
> Stabellini
> 
> Subject: Re: [PATCH v2 2/5] xen/common/domctl: introduce 
> XEN_DOMCTL_get/setdomaincontext
> 
> Hi Paul,
> 
> On 07/04/2020 18:38, Paul Durrant wrote:
> > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
> > index 1ad34c35eb..8ab39acf0c 100644
> > --- a/xen/include/public/domctl.h
> > +++ b/xen/include/public/domctl.h
> > @@ -38,7 +38,7 @@
> >   #include "hvm/save.h"
> >   #include "memory.h"
> >
> > -#define XEN_DOMCTL_INTERFACE_VERSION 0x0012
> > +#define XEN_DOMCTL_INTERFACE_VERSION 0x0013
> >
> >   /*
> >* NB. xen_domctl.domain is an IN/OUT parameter for this operation.
> > @@ -1129,6 +1129,44 @@ struct xen_domctl_vuart_op {
> >*/
> >   };
> >
> > +/*
> > + * Get/Set domain PV context. The same struct xen_domctl_domaincontext
> 
> I think you want to update the comments to match the split.

Oh yes.

> 
> > + * is used for both commands but with slightly different field semantics
> > + * as follows:
> 
> Reviewed-by: Julien Grall 
> 

Thanks,

  Paul

> 
> Cheers,
> 
> 
> --
> Julien Grall

[PATCH] x86/xen: drop an unused parameter gsi_override

2020-04-28 Thread Wei Liu

All callers within the same file pass in -1 (no override).

Signed-off-by: Wei Liu 
---
 arch/x86/pci/xen.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 91220cc25854..e3f1ca316068 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -60,8 +60,7 @@ static int xen_pcifront_enable_irq(struct pci_dev *dev)
 }
 
 #ifdef CONFIG_ACPI
-static int xen_register_pirq(u32 gsi, int gsi_override, int triggering,
-bool set_pirq)
+static int xen_register_pirq(u32 gsi, int triggering, bool set_pirq)
 {
int rc, pirq = -1, irq = -1;
struct physdev_map_pirq map_irq;
@@ -94,9 +93,6 @@ static int xen_register_pirq(u32 gsi, int gsi_override, int 
triggering,
name = "ioapic-level";
}
 
-   if (gsi_override >= 0)
-   gsi = gsi_override;
-
irq = xen_bind_pirq_gsi_to_irq(gsi, map_irq.pirq, shareable, name);
if (irq < 0)
goto out;
@@ -112,12 +108,12 @@ static int acpi_register_gsi_xen_hvm(struct device *dev, 
u32 gsi,
if (!xen_hvm_domain())
return -1;
 
-   return xen_register_pirq(gsi, -1 /* no GSI override */, trigger,
+   return xen_register_pirq(gsi, trigger,
 false /* no mapping of GSI to PIRQ */);
 }
 
 #ifdef CONFIG_XEN_DOM0
-static int xen_register_gsi(u32 gsi, int gsi_override, int triggering, int 
polarity)
+static int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
int rc, irq;
struct physdev_setup_gsi setup_gsi;
@@ -128,7 +124,7 @@ static int xen_register_gsi(u32 gsi, int gsi_override, int 
triggering, int polar
printk(KERN_DEBUG "xen: registering gsi %u triggering %d polarity %d\n",
gsi, triggering, polarity);
 
-   irq = xen_register_pirq(gsi, gsi_override, triggering, true);
+   irq = xen_register_pirq(gsi, triggering, true);
 
setup_gsi.gsi = gsi;
setup_gsi.triggering = (triggering == ACPI_EDGE_SENSITIVE ? 0 : 1);
@@ -148,7 +144,7 @@ static int xen_register_gsi(u32 gsi, int gsi_override, int 
triggering, int polar
 static int acpi_register_gsi_xen(struct device *dev, u32 gsi,
 int trigger, int polarity)
 {
-   return xen_register_gsi(gsi, -1 /* no GSI override */, trigger, 
polarity);
+   return xen_register_gsi(gsi, trigger, polarity);
 }
 #endif
 #endif
@@ -491,7 +487,7 @@ int __init pci_xen_initial_domain(void)
if (acpi_get_override_irq(irq, &trigger, &polarity) == -1)
continue;
 
-   xen_register_pirq(irq, -1 /* no GSI override */,
+   xen_register_pirq(irq,
trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE,
true /* Map GSI to PIRQ */);
}
-- 
2.20.1

RE: [PATCH v2 1/5] xen/common: introduce a new framework for save/restore of 'domain' context

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Julien Grall 
> Sent: 20 April 2020 18:21
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Andrew Cooper 
> ; George Dunlap
> ; Ian Jackson ; Jan 
> Beulich ;
> Stefano Stabellini ; Wei Liu ; 
> Volodymyr Babchuk
> ; Roger Pau Monné 
> Subject: Re: [PATCH v2 1/5] xen/common: introduce a new framework for 
> save/restore of 'domain' context
> 
> Hi Paul,
> 
> On 07/04/2020 18:38, Paul Durrant wrote:
> > To allow enlightened HVM guests (i.e. those that have PV drivers) to be
> > migrated without their co-operation it will be necessary to transfer 'PV'
> > state such as event channel state, grant entry state, etc.
> >
> > Currently there is a framework (entered via the hvm_save/load() functions)
> > that allows a domain's 'HVM' (architectural) state to be transferred but
> > 'PV' state is also common with pure PV guests and so this framework is not
> > really suitable.
> >
> > This patch adds the new public header and low level implementation of a new
> > common framework, entered via the domain_save/load() functions. Subsequent
> > patches will introduce other parts of the framework, and code that will
> > make use of it within the current version of the libxc migration stream.
> >
> > This patch also marks the HVM-only framework as deprecated in favour of the
> > new framework.
> >
> > Signed-off-by: Paul Durrant 
> > ---
> > Cc: Andrew Cooper 
> > Cc: George Dunlap 
> > Cc: Ian Jackson 
> > Cc: Jan Beulich 
> > Cc: Julien Grall 
> > Cc: Stefano Stabellini 
> > Cc: Wei Liu 
> > Cc: Volodymyr Babchuk 
> > Cc: "Roger Pau Monné" 
> >
> > v2:
> >   - Allow multi-stage save/load to avoid the need to double-buffer
> >   - Get rid of the masks and add an 'ignore' flag instead
> >   - Create copy function union to preserve const save buffer
> >   - Deprecate HVM-only framework
> > ---
> >   xen/common/Makefile|   1 +
> >   xen/common/save.c  | 329 +
> >   xen/include/public/arch-arm/hvm/save.h |   5 +
> >   xen/include/public/arch-x86/hvm/save.h |   5 +
> >   xen/include/public/save.h  |  84 +++
> >   xen/include/xen/save.h | 152 
> >   6 files changed, 576 insertions(+)
> >   create mode 100644 xen/common/save.c
> >   create mode 100644 xen/include/public/save.h
> >   create mode 100644 xen/include/xen/save.h
> >
> > diff --git a/xen/common/Makefile b/xen/common/Makefile
> > index e8cde65370..90553ba5d7 100644
> > --- a/xen/common/Makefile
> > +++ b/xen/common/Makefile
> > @@ -37,6 +37,7 @@ obj-y += radix-tree.o
> >   obj-y += rbtree.o
> >   obj-y += rcupdate.o
> >   obj-y += rwlock.o
> > +obj-y += save.o
> >   obj-y += shutdown.o
> >   obj-y += softirq.o
> >   obj-y += sort.o
> > diff --git a/xen/common/save.c b/xen/common/save.c
> > new file mode 100644
> > index 00..6cdac3785b
> > --- /dev/null
> > +++ b/xen/common/save.c
> > @@ -0,0 +1,329 @@
> > +/*
> > + * save.c: Save and restore PV guest state common to all domain types.
> > + *
> > + * Copyright Amazon.com Inc. or its affiliates.
> > + *
> > + * This program is free software; you can redistribute it and/or modify it
> > + * under the terms and conditions of the GNU General Public License,
> > + * version 2, as published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope it will be useful, but WITHOUT
> > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> > + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
> > for
> > + * more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along 
> > with
> > + * this program; If not, see .
> > + */
> > +
> > +#include 
> > +
> > +union domain_copy_entry {
> > +domain_write_entry write;
> > +domain_read_entry read;
> > +};
> > +
> > +struct domain_context {
> > +bool log;
> > +struct domain_save_descriptor desc;
> > +size_t data_len;
> 
> What is data_len?
> 

It’s used for internal accounting.

> > +union domain_copy_entry copy;
> > +void *priv;
> > +};
> > +
> > +static struct {
> > +const char *name;
> > +bool per_vcpu;
> > +domain_save_handler save;
> > +domain_load_handler load;
> > +} handlers[DOMAIN_SAVE_CODE_MAX + 1];
> > +
> > +void __init domain_register_save_type(unsigned int tc, const char *name,
> > +  bool per_vcpu,
> > +  domain_save_handler save,
> > +  domain_load_handler load)
> > +{
> > +BUG_ON(tc > ARRAY_SIZE(handlers));
> > +
> > +ASSERT(!handlers[tc].save);
> > +ASSERT(!handlers[tc].load);
> > +
> > +handlers[tc].name = name;
> > +handlers[tc].per_vcpu = per_vcpu;
> > +handlers[tc].save = save;
> > +handlers[tc].load = load;
> > +}
> > +
> > +int domain_save_begin(struct domain_context *c, un

Re: [PATCH 6/6] x86/pv: map and unmap page table in dom0_construct_pv

2020-04-28 Thread Jan Beulich

On 24.04.2020 11:18, Julien Grall wrote:
> On 17/04/2020 10:52, Hongyan Xia wrote:
>> From: Wei Liu 
>>
>> Signed-off-by: Wei Liu 
>> Signed-off-by: Hongyan Xia 
> 
> Reviewed-by: Julien Grall 

Acked-by: Jan Beulich

Re: [PATCH 5/6] x86/pv: map and unmap page tables in mark_pv_pt_pages_rdonly

2020-04-28 Thread Jan Beulich

On 17.04.2020 11:52, Hongyan Xia wrote:
> --- a/xen/arch/x86/pv/dom0_build.c
> +++ b/xen/arch/x86/pv/dom0_build.c
> @@ -50,17 +50,17 @@ static __init void mark_pv_pt_pages_rdonly(struct domain 
> *d,
>  unsigned long count;
>  struct page_info *page;
>  l4_pgentry_t *pl4e;
> -l3_pgentry_t *pl3e;
> -l2_pgentry_t *pl2e;
> -l1_pgentry_t *pl1e;
> +l3_pgentry_t *pl3e, *l3t;
> +l2_pgentry_t *pl2e, *l2t;
> +l1_pgentry_t *pl1e, *l1t;

I don't quite see why the new local variables get introduced:
unmap_domain_page(), iirc, is quite fine with a non-page-
aligned argument.

Jan

Re: [PATCH 4/6] x86/smpboot: map and unmap page tables in cleanup_cpu_root_pgt

2020-04-28 Thread Jan Beulich

On 24.04.2020 11:13, Julien Grall wrote:
> On 17/04/2020 10:52, Hongyan Xia wrote:
>> From: Wei Liu 
>>
>> Signed-off-by: Wei Liu 
>> Signed-off-by: Hongyan Xia 
> 
> Reviewed-by: Julien Grall 

Acked-by: Jan Beulich

[PATCH v2] mem_sharing: map shared_info page to same gfn during fork

2020-04-28 Thread Tamas K Lengyel

During a VM fork we copy the shared_info page; however, we also need to ensure
that the page is mapped into the same GFN in the fork as its in the parent.

Signed-off-by: Tamas K Lengyel 
Suggested-by: Roger Pau Monne 
---
 xen/arch/x86/mm/mem_sharing.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index 344a5bfb3d..a1dea8fedb 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1656,6 +1656,7 @@ static void copy_tsc(struct domain *cd, struct domain *d)
 static int copy_special_pages(struct domain *cd, struct domain *d)
 {
 mfn_t new_mfn, old_mfn;
+gfn_t new_gfn, old_gfn;
 struct p2m_domain *p2m = p2m_get_hostp2m(cd);
 static const unsigned int params[] =
 {
@@ -1701,6 +1702,30 @@ static int copy_special_pages(struct domain *cd, struct 
domain *d)
 new_mfn = _mfn(virt_to_mfn(cd->shared_info));
 copy_domain_page(new_mfn, old_mfn);
 
+old_gfn = _gfn(get_gpfn_from_mfn(mfn_x(old_mfn)));
+new_gfn = _gfn(get_gpfn_from_mfn(mfn_x(new_mfn)));
+
+if ( !gfn_eq(old_gfn, new_gfn) )
+{
+if ( !gfn_eq(new_gfn, INVALID_GFN) )
+{
+/* if shared_info is mapped to a different gfn just remove it */
+rc = p2m->set_entry(p2m, new_gfn, INVALID_MFN, PAGE_ORDER_4K,
+p2m_invalid, p2m->default_access, -1);
+if ( rc )
+return rc;
+}
+
+if ( !gfn_eq(old_gfn, INVALID_GFN) )
+{
+/* now map it to the same gfn as the parent */
+rc = p2m->set_entry(p2m, old_gfn, new_mfn, PAGE_ORDER_4K,
+p2m_ram_rw, p2m->default_access, -1);
+if ( rc )
+return rc;
+}
+}
+
 return 0;
 }
 
-- 
2.20.1

Re: [PATCH 3/6] x86_64/mm: map and unmap page tables in subarch_memory_op

2020-04-28 Thread Jan Beulich

On 24.04.2020 11:06, Julien Grall wrote:
> On 17/04/2020 10:52, Hongyan Xia wrote:
>> From: Wei Liu 
>>
>> Signed-off-by: Wei Liu 
>> Signed-off-by: Hongyan Xia 
> 
> Reviewed-by: Julien Grall 

Acked-by: Jan Beulich

Re: [PATCH 2/6] x86_64/mm: map and unmap page tables in subarch_init_memory

2020-04-28 Thread Jan Beulich

On 24.04.2020 11:04, Julien Grall wrote:
> On 17/04/2020 10:52, Hongyan Xia wrote:
>> From: Wei Liu 
>>
>> Signed-off-by: Wei Liu 
>> Signed-off-by: Hongyan Xia 
> 
> Reviewed-by: Julien Grall 

Acked-by: Jan Beulich

Re: [PATCH 1/6] x86_64/mm: map and unmap page tables in cleanup_frame_table

2020-04-28 Thread Jan Beulich

On 24.04.2020 11:02, Julien Grall wrote:
> On 17/04/2020 10:52, Hongyan Xia wrote:
>> @@ -763,10 +763,10 @@ static void cleanup_frame_table(struct mem_hotadd_info 
>> *info)
>>   continue;
>>   }
>>   -    ASSERT(l1e_get_flags(l2e_to_l1e(l2e)[l1_table_offset(sva)]) &
>> -    _PAGE_PRESENT);
>> - sva = (sva & ~((1UL << PAGE_SHIFT) - 1)) +
>> -    (1UL << PAGE_SHIFT);
>> +    ASSERT(l1e_get_flags(l1e_from_l2e(l2e, l1_table_offset(sva))) &
>> +   _PAGE_PRESENT);
>> +
>> +    sva = (sva & ~((1UL << PAGE_SHIFT) - 1)) + (1UL << PAGE_SHIFT);
> 
> NIT: While you are modifying the indentation. Couldn't we use PAGE_MASK and 
> PAGE_SIZE here?

And with this (which I think can be done while committing)
Reviewed-by: Jan Beulich 

Jan

[PATCH v5] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Paul Durrant

From: Paul Durrant 

... to specify a separate migration stream that will also be suitable for
live update.

The original scope of the document was to support non-cooperative migration
of guests [1] but, since then, live update of xenstored has been brought into
scope. Thus it makes more sense to define a separate image format for
serializing xenstore state that is suitable for both purposes.

The document has been limited to specifying a new image format. The mechanism
for acquiring the image for live update or migration is not covered as that
is more appropriately dealt with by a patch to docs/misc/xenstore.txt. It is
also expected that, when the first implementation of live update or migration
making use of this specification is committed, that the document is moved from
docs/designs into docs/specs.

NOTE: It will only be necessary to save and restore state for active xenstore
  connections, but the documentation for 'RESUME' in xenstore.txt implies
  otherwise. That command is unused so this patch deletes it from the
  specification.

[1] See 
https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md

Signed-off-by: Paul Durrant 
Reviewed-by: Juergen Gross 
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Stefano Stabellini 
Cc: Wei Liu 

v5:
 - Addressed comments from Julien
 - Dropped 'mfn' from shared ring 'conn-spec' as its feasibility/utility is
   debatable

v4:
 - Move path-len and value-len earlier in NODE_DATA

v3:
 - Address missed comments from Juergen

v2:
 - Address comments from Juergen
---
 docs/designs/xenstore-migration.md | 469 +++--
 docs/misc/xenstore.txt |  17 --
 2 files changed, 307 insertions(+), 179 deletions(-)

diff --git a/docs/designs/xenstore-migration.md 
b/docs/designs/xenstore-migration.md
index 6ab351e8fe..34a2afd17e 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -3,254 +3,399 @@
 ## Background
 
 The design for *Non-Cooperative Migration of Guests*[1] explains that extra
-save records are required in the migrations stream to allow a guest running
-PV drivers to be migrated without its co-operation. Moreover the save
-records must include details of registered xenstore watches as well as
-content; information that cannot currently be recovered from `xenstored`,
-and hence some extension to the xenstore protocol[2] will also be required.
-
-The *libxenlight Domain Image Format* specification[3] already defines a
-record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
-transferring xenstore data pertaining to the domain directly as it is
-specified such that keys are relative to the path
-`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
-define at least one new save record type.
+save records are required in the migrations stream to allow a guest running PV
+drivers to be migrated without its co-operation. Moreover the save records must
+include details of registered xenstore watches as well as content; information
+that cannot currently be recovered from `xenstored`, and hence some extension
+to the xenstored implementations will also be required.
+
+As a similar set of data is needed for transferring xenstore data from one
+instance to another when live updating xenstored this document proposes an
+image format for a 'migration stream' suitable for both purposes.
 
 ## Proposal
 
-### New Save Record
+The image format consists of a _header_ followed by 1 or more _records_. Each
+record consists of a type and length field, followed by any data mandated by
+the record type. At minimum there will be a single record of type `END`
+(defined below).
 
-A new mandatory record type should be defined within the libxenlight Domain
-Image Format:
+### Header
 
-`0x0007: DOMAIN_XENSTORE_DATA`
+The header identifies the stream as a `xenstore` stream, including the version
+of the specification that it complies with.
 
-An arbitrary number of these records may be present in the migration
-stream and may appear in any order. The format of each record should be as
-follows:
+All fields in this header must be in _big-endian_ byte order, regardless of
+the setting of the endianness bit.
 
 
 ```
 0   1   2   3   4   5   6   7octet
 +---+---+---+---+---+---+---+---+
-| type  | record specific data  |
-+---+   |
-...
-+---+
+| ident |
++---+---|
+| version   | flags |
++---+---+
 ```
 
-where type is one of the following values
 
+| Field | Description

Re: [PATCH] tools/xenstore: simplify socket initialization

2020-04-28 Thread Wei Liu

On Tue, Apr 28, 2020 at 04:58:37PM +0200, Juergen Gross wrote:
> The setup of file descriptors for the Xenstore sockets is needlessly
> complicated: the space is allocated dynamically, while two static
> variables really would do the job.
> 
> For tearing down the sockets it is easier to widen the scope of the
> file descriptors from function to file.
> 
> Signed-off-by: Juergen Gross 

Reviewed-by: Wei Liu

[PATCH] tools/xenstore: simplify socket initialization

2020-04-28 Thread Juergen Gross

The setup of file descriptors for the Xenstore sockets is needlessly
complicated: the space is allocated dynamically, while two static
variables really would do the job.

For tearing down the sockets it is easier to widen the scope of the
file descriptors from function to file.

Signed-off-by: Juergen Gross 
---
 tools/xenstore/xenstored_core.c | 69 -
 1 file changed, 27 insertions(+), 42 deletions(-)

diff --git a/tools/xenstore/xenstored_core.c b/tools/xenstore/xenstored_core.c
index 551fe38f57..7bd959f28b 100644
--- a/tools/xenstore/xenstored_core.c
+++ b/tools/xenstore/xenstored_core.c
@@ -70,6 +70,9 @@ static struct pollfd *fds;
 static unsigned int current_array_size;
 static unsigned int nr_fds;
 
+static int sock = -1;
+static int ro_sock = -1;
+
 #define ROUNDUP(_x, _w) (((unsigned long)(_x)+(1UL<<(_w))-1) & 
~((1UL<<(_w))-1))
 
 static bool verbose = false;
@@ -310,8 +313,7 @@ fail:
return -1;
 }
 
-static void initialize_fds(int sock, int *p_sock_pollfd_idx,
-  int ro_sock, int *p_ro_sock_pollfd_idx,
+static void initialize_fds(int *p_sock_pollfd_idx, int *p_ro_sock_pollfd_idx,
   int *ptimeout)
 {
struct connection *conn;
@@ -1789,43 +1791,29 @@ void corrupt(struct connection *conn, const char *fmt, 
...)
check_store();
 }
 
-
-#ifdef NO_SOCKETS
-static void init_sockets(int **psock, int **pro_sock)
-{
-   static int minus_one = -1;
-   *psock = *pro_sock = &minus_one;
-}
-#else
-static int destroy_fd(void *_fd)
+#ifndef NO_SOCKETS
+static void destroy_fds(void)
 {
-   int *fd = _fd;
-   close(*fd);
-   return 0;
+   if (sock >= 0)
+   close(sock);
+   if (ro_sock >= 0)
+   close(ro_sock);
 }
 
-static void init_sockets(int **psock, int **pro_sock)
+static void init_sockets(void)
 {
struct sockaddr_un addr;
-   int *sock, *ro_sock;
const char *soc_str = xs_daemon_socket();
const char *soc_str_ro = xs_daemon_socket_ro();
 
/* Create sockets for them to listen to. */
-   *psock = sock = talloc(talloc_autofree_context(), int);
-   if (!sock)
-   barf_perror("No memory when creating sockets");
-   *sock = socket(PF_UNIX, SOCK_STREAM, 0);
-   if (*sock < 0)
+   atexit(destroy_fds);
+   sock = socket(PF_UNIX, SOCK_STREAM, 0);
+   if (sock < 0)
barf_perror("Could not create socket");
-   *pro_sock = ro_sock = talloc(talloc_autofree_context(), int);
-   if (!ro_sock)
-   barf_perror("No memory when creating sockets");
-   *ro_sock = socket(PF_UNIX, SOCK_STREAM, 0);
-   if (*ro_sock < 0)
+   ro_sock = socket(PF_UNIX, SOCK_STREAM, 0);
+   if (ro_sock < 0)
barf_perror("Could not create socket");
-   talloc_set_destructor(sock, destroy_fd);
-   talloc_set_destructor(ro_sock, destroy_fd);
 
/* FIXME: Be more sophisticated, don't mug running daemon. */
unlink(soc_str);
@@ -1836,24 +1824,21 @@ static void init_sockets(int **psock, int **pro_sock)
if(strlen(soc_str) >= sizeof(addr.sun_path))
barf_perror("socket string '%s' too long", soc_str);
strcpy(addr.sun_path, soc_str);
-   if (bind(*sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
+   if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
barf_perror("Could not bind socket to %s", soc_str);
 
if(strlen(soc_str_ro) >= sizeof(addr.sun_path))
barf_perror("socket string '%s' too long", soc_str_ro);
strcpy(addr.sun_path, soc_str_ro);
-   if (bind(*ro_sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
+   if (bind(ro_sock, (struct sockaddr *)&addr, sizeof(addr)) != 0)
barf_perror("Could not bind socket to %s", soc_str_ro);
 
if (chmod(soc_str, 0600) != 0
|| chmod(soc_str_ro, 0660) != 0)
barf_perror("Could not chmod sockets");
 
-   if (listen(*sock, 1) != 0
-   || listen(*ro_sock, 1) != 0)
+   if (listen(sock, 1) != 0 || listen(ro_sock, 1) != 0)
barf_perror("Could not listen on sockets");
-
-
 }
 #endif
 
@@ -1909,7 +1894,7 @@ int priv_domid = 0;
 
 int main(int argc, char *argv[])
 {
-   int opt, *sock = NULL, *ro_sock = NULL;
+   int opt;
int sock_pollfd_idx = -1, ro_sock_pollfd_idx = -1;
bool dofork = true;
bool outputpid = false;
@@ -1997,7 +1982,9 @@ int main(int argc, char *argv[])
 
talloc_enable_null_tracking();
 
-   init_sockets(&sock, &ro_sock);
+#ifndef NO_SOCKETS
+   init_sockets();
+#endif
 
init_pipe(reopen_log_pipe);
 
@@ -2025,8 +2012,7 @@ int main(int argc, char *argv[])
tracefile = talloc_strdup(NULL, tracefile);
 
/* Get ready to listen to the tools. */
-   initialize_fds(*sock, &sock_pollfd_idx, *ro_sock, &ro_sock_pollfd_idx,
-

Re: [XEN PATCH v5 16/16] build, include: rework compat-build-header.py

2020-04-28 Thread Wei Liu





On 21/04/2020 17:12, Anthony PERARD wrote:

Replace a mix of shell script and python script by all python script.

No change to the final generated headers.

Signed-off-by: Anthony PERARD 


To the best of my knowledge this patch is doing the right transformation.

Acked-by: Wei Liu

Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Jürgen Groß

On 28.04.20 16:50, Paul Durrant wrote:

-Original Message-
From: Jürgen Groß 
Sent: 28 April 2020 13:56
To: p...@xen.org; 'Julien Grall' ; 
xen-devel@lists.xenproject.org
Cc: 'Paul Durrant' ; 'Andrew Cooper' 
; 'George Dunlap'
; 'Ian Jackson' ; 'Jan 
Beulich'
; 'Stefano Stabellini' ; 'Wei Liu' 

Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

On 28.04.20 14:46, Paul Durrant wrote:

-Original Message-
From: Julien Grall 
Sent: 28 April 2020 11:23
To: Jürgen Groß ; Paul Durrant ; 
xen-devel@lists.xenproject.org
Cc: Paul Durrant ; Andrew Cooper 
; George Dunlap
; Ian Jackson ; Jan Beulich

;

Stefano Stabellini ; Wei Liu 
Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

Hi Juergen,

On 28/04/2020 11:10, Jürgen Groß wrote:

On 28.04.20 11:05, Julien Grall wrote:

-where tx_id is the non-zero identifier values of an open transaction.
+| Field | Description   |
+|---|---|
+| `domid`   | The domain-id that owns the shared page   |
+|   |   |
+| `tdomid`  | The domain-id that `domid` acts on behalf of if   |
+|   | it has been subject to an SET_TARGET  |
+|   | operation [2] or DOMID_INVALID [3] otherwise  |
+|   |   |
+| `flags`   | Must be zero  |
+|   |   |
+| `evtchn`  | The port number of the interdomain channel used   |
+|   | by `domid` to communicate with xenstored  |
+|   |   |
+| `mfn` | The MFN of the shared page for a live update or   |
+|   | ~0 (i.e. all bits set) otherwise  |
-### Protocol Extension
+Since the ABI guarantees that entry 1 in `domid`'s grant table will
always
+contain the GFN of the shared page, so for a live update `mfn` can
be used to
+give confidence that `domid` has not been re-cycled during the update.

I am confused, there is no way a XenStored running in an Arm domain
can map the MFN of the shared page. So how is this going to work out?

I guess this will be a MFN for PV guests and a guest PFN else.

If we use Xen terminology (xen/include/xen/mm.h), this would be a GFN.

TBH I'm not sure a GFN would give much confidence about domain recycling as it 
would likely be the

same for many domains, I think. We really need a UUID.

No, we need a per-host domain value, which is associated with a domain
and which is unique during the life-time of the host.

E.g. a global counter, which is incremented at each domain creation and
stored in struct domain.

Can we just drop this and fall back on the fact that libxl now prevents domids 
from being recycled for 60s?

In any case this discussion does not belong to this patch IMO.

Juergen

RE: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Jürgen Groß 
> Sent: 28 April 2020 13:56
> To: p...@xen.org; 'Julien Grall' ; 
> xen-devel@lists.xenproject.org
> Cc: 'Paul Durrant' ; 'Andrew Cooper' 
> ; 'George Dunlap'
> ; 'Ian Jackson' ; 'Jan 
> Beulich'
> ; 'Stefano Stabellini' ; 'Wei Liu' 
> 
> Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration 
> document...
> 
> On 28.04.20 14:46, Paul Durrant wrote:
> >> -Original Message-
> >> From: Julien Grall 
> >> Sent: 28 April 2020 11:23
> >> To: Jürgen Groß ; Paul Durrant ; 
> >> xen-devel@lists.xenproject.org
> >> Cc: Paul Durrant ; Andrew Cooper 
> >> ; George Dunlap
> >> ; Ian Jackson ; Jan 
> >> Beulich
> ;
> >> Stefano Stabellini ; Wei Liu 
> >> Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration 
> >> document...
> >>
> >> Hi Juergen,
> >>
> >> On 28/04/2020 11:10, Jürgen Groß wrote:
> >>> On 28.04.20 11:05, Julien Grall wrote:
> > -where tx_id is the non-zero identifier values of an open transaction.
> > +| Field | Description   |
> > +|---|---|
> > +| `domid`   | The domain-id that owns the shared page   |
> > +|   |   |
> > +| `tdomid`  | The domain-id that `domid` acts on behalf of if   |
> > +|   | it has been subject to an SET_TARGET  |
> > +|   | operation [2] or DOMID_INVALID [3] otherwise  |
> > +|   |   |
> > +| `flags`   | Must be zero  |
> > +|   |   |
> > +| `evtchn`  | The port number of the interdomain channel used   |
> > +|   | by `domid` to communicate with xenstored  |
> > +|   |   |
> > +| `mfn` | The MFN of the shared page for a live update or   |
> > +|   | ~0 (i.e. all bits set) otherwise  |
> > -### Protocol Extension
> > +Since the ABI guarantees that entry 1 in `domid`'s grant table will
> > always
> > +contain the GFN of the shared page, so for a live update `mfn` can
> > be used to
> > +give confidence that `domid` has not been re-cycled during the update.
> 
>  I am confused, there is no way a XenStored running in an Arm domain
>  can map the MFN of the shared page. So how is this going to work out?
> >>>
> >>> I guess this will be a MFN for PV guests and a guest PFN else.
> >>
> >> If we use Xen terminology (xen/include/xen/mm.h), this would be a GFN.
> >>
> >
> > TBH I'm not sure a GFN would give much confidence about domain recycling as 
> > it would likely be the
> same for many domains, I think. We really need a UUID.
> 
> No, we need a per-host domain value, which is associated with a domain
> and which is unique during the life-time of the host.
> 
> E.g. a global counter, which is incremented at each domain creation and
> stored in struct domain.
> 

Can we just drop this and fall back on the fact that libxl now prevents domids 
from being recycled for 60s?

  Paul

[xen-unstable-smoke test] 149857: tolerable all pass - PUSHED

2020-04-28 Thread osstest service owner

flight 149857 xen-unstable-smoke real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149857/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass

version targeted for testing:
 xen  2add344e6a4ef690871157b87b0e7d52bc6db9e4
baseline version:
 xen  dd0882f912005b56653013025ebd160862e360ad

Last test of basis   149839  2020-04-27 21:01:50 Z0 days
Testing same since   149857  2020-04-28 12:00:23 Z0 days1 attempts


People who touched revisions under test:
  Jan Beulich 
  Wei Liu 

jobs:
 build-arm64-xsm  pass
 build-amd64  pass
 build-armhf  pass
 build-amd64-libvirt  pass
 test-armhf-armhf-xl  pass
 test-arm64-arm64-xl-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-amd64-libvirt pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Pushing revision :

To xenbits.xen.org:/home/xen/git/xen.git
   dd0882f912..2add344e6a  2add344e6a4ef690871157b87b0e7d52bc6db9e4 -> smoke

Re: [XEN PATCH v5 16/16] build, include: rework compat-build-header.py

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> Replace a mix of shell script and python script by all python script.
> 
> No change to the final generated headers.
> 
> Signed-off-by: Anthony PERARD 
> ---
> 
> Notes:
> v5:
> - Removed -P from CPP when generating compat/%.i
>   -> keep removing linemarkers and keep de-duplicating empty lines.
>   So that all the blank line that currently exist in the generated
>   headers stays in place.

Thanks for doing these adjustments. However, my Python isn't good
enough to actually ack this patch, I'm afraid.

Jan

Re: [XEN PATCH v5 15/16] build, include: rework compat-build-source.py

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> Improvement are:
> - give the path to xlat.lst as argument
> - include `grep -v` in compat-build-source.py script, we don't need to
>   write this in several scripted language.
> 
> No changes in final compat/%.h headers.
> 
> Signed-off-by: Anthony PERARD 

Acked-by: Jan Beulich

RE: [PATCH] PCI: drop a redundant variable from pci_add_device()

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 28 April 2020 14:00
> To: xen-devel@lists.xenproject.org
> Cc: Paul Durrant 
> Subject: [PATCH] PCI: drop a redundant variable from pci_add_device()
> 
> Surrounding code already uses the available alternative, after all.
> 
> Signed-off-by: Jan Beulich 
> 

Reviewed-by: Paul Durrant

Re: [XEN PATCH v5 08/16] build: Introduce $(cpp_flags)

2020-04-28 Thread Jan Beulich

On 28.04.2020 16:01, Anthony PERARD wrote:
> On Thu, Apr 23, 2020 at 06:48:51PM +0200, Jan Beulich wrote:
>> On 21.04.2020 18:12, Anthony PERARD wrote:
>>> --- a/xen/Rules.mk
>>> +++ b/xen/Rules.mk
>>> @@ -123,6 +123,7 @@ $(obj-bin-y): XEN_CFLAGS := $(filter-out 
>>> -flto,$(XEN_CFLAGS))
>>>  
>>>  c_flags = -MMD -MP -MF $(@D)/.$(@F).d $(XEN_CFLAGS) 
>>> '-D__OBJECT_FILE__="$@"'
>>>  a_flags = -MMD -MP -MF $(@D)/.$(@F).d $(XEN_AFLAGS)
>>> +cpp_flags = $(filter-out -Wa$(comma)%,$(a_flags))
>>
>> I can see this happening to be this way right now, but in principle
>> I could see a_flags to hold items applicable to assembly files only,
>> but not to (the preprocessing of) C files. Hence while this is fine
>> for now, ...
>>
>>> @@ -207,7 +208,7 @@ quiet_cmd_cc_s_c = CC  $@
>>>  cmd_cc_s_c = $(CC) $(filter-out -Wa$(comma)%,$(c_flags)) -S $< -o $@
>>>  
>>>  quiet_cmd_s_S = CPP $@
>>> -cmd_s_S = $(CPP) $(filter-out -Wa$(comma)%,$(a_flags)) $< -o $@
>>> +cmd_s_S = $(CPP) $(cpp_flags) $< -o $@
>>
>> ... this one is a trap waiting for someone to fall in imo. Instead
>> where I'd expect this patch to use $(cpp_flags) is e.g. in
>> xen/arch/x86/mm/Makefile:
>>
>> guest_walk_%.i: guest_walk.c Makefile
>>  $(CPP) $(cpp_flags) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
>>
>> And note how this currently uses $(c_flags), not $(a_flags), which
>> suggests that your deriving from $(a_flags) isn't correct either.
> 
> I think we can drop this patch for now, and change patch "xen/build:
> factorise generation of the linker scripts" to not use $(cpp_flags).
> 
> If we derive $(cpp_flags) from $(c_flags) instead, we would need to
> find out if CPP commands using a_flags can use c_flags instead.
> 
> On the other hand, I've looked at Linux source code, and they use
> $(cpp_flags) for only a few targets, only to generate the .lds scripts.
> For other rules, they use either a_flags or c_flags, for example:
> %.i: %.c ; uses $(c_flags)
> %.i: %.S ; uses $(a_flags)
> %.s: %.S ; uses $(a_flags)

The first on really ought to be use cpp_flags. I couldn't find the
middle one. The last one clearly has to do something about -Wa,
options, but apart from this I'd consider a_flags appropriate to
use there.

> (Also, they use -Qunused-arguments clang's options, so they don't need
> to filter out -Wa,* arguments, I think.)

Maybe we should do so too then?

> So, maybe having a single $(cpp_flags) when running the CPP command
> isn't such a good idea.

Right - after all in particular the use of CPP to produce .lds is
an abuse, as the source file (named .lds.S) isn't really what its
name says.

> So, would dropping $(cpp_flags) for now, and rework the *FLAGS later, be
> good enough?

I don't think so, no, I'm sorry. cpp_flags should be there for its
real purpose. Whether the .lds.S -> .lds rule can use it, or should
use a_flags, or yet something else is a different thing.

Jan

Re: [XEN PATCH v5 14/16] build: use if_changed to build mm/*/guest_%.o

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> Use if_changed for building all guest_%.o objects, and make use of
> command that already exist.
> 
> The current command only runs `CC`, but the runes to build every other
> object in Xen also runs `objcopy` (when CONFIG_ENFORCE_UNIQUE_SYMBOLS=y)
> which modify the file symbol. But with patch
> "xen,symbols: rework file symbols selection", ./symbols should still
> select the file symbols directive intended to be used for guest_%.o
> objects.
> 
> The goal here is to reduce the number of commands written in
> makefiles.
> 
> Signed-off-by: Anthony PERARD 

Reviewed-by: Jan Beulich

Re: [PATCH] PCI: drop a redundant variable from pci_add_device()

2020-04-28 Thread Andrew Cooper

On 28/04/2020 13:59, Jan Beulich wrote:
> Surrounding code already uses the available alternative, after all.
>
> Signed-off-by: Jan Beulich 

Acked-by: Andrew Cooper

Re: [XEN PATCH v5 13/16] xen,symbols: rework file symbols selection

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> We want to use the same rune to build mm/*/guest_*.o as the one use to
> build every other *.o object. The consequence it that file symbols that
> the program ./symbols prefer changes with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y.
> 
> For example, when building arch/x86/mm/guest_walk_2.o from guest_walk.c,
> this would be the difference of file symbol present in the object when
> building with CONFIG_ENFORCE_UNIQUE_SYMBOLS=y:
> 
> (1) Currently we have those two file symbols:
> guest_walk.c
> guest_walk_2.o
> (2) When building with the same rune, we will have:
> arch/x86/mm/guest_walk.c
> guest_walk_2.o

So I had to go to the v4 discussion to understand this again. As said
there, it may be obvious to you but despite having been the one to
introduce the objcopy step there, it is not to me. Hence my continued
desire to have this at least briefly mentioned here, as it's not
otherwise noticeable from looking at the patch itself.

The code change itself looks okay to me, but I'll want to ack this
change only once I can follow the description from a single pass of
reading through it. I'm sorry for the extra trouble.

Jan

Re: [XEN PATCH v5 08/16] build: Introduce $(cpp_flags)

2020-04-28 Thread Anthony PERARD

On Thu, Apr 23, 2020 at 06:48:51PM +0200, Jan Beulich wrote:
> On 21.04.2020 18:12, Anthony PERARD wrote:
> > --- a/xen/Rules.mk
> > +++ b/xen/Rules.mk
> > @@ -123,6 +123,7 @@ $(obj-bin-y): XEN_CFLAGS := $(filter-out 
> > -flto,$(XEN_CFLAGS))
> >  
> >  c_flags = -MMD -MP -MF $(@D)/.$(@F).d $(XEN_CFLAGS) 
> > '-D__OBJECT_FILE__="$@"'
> >  a_flags = -MMD -MP -MF $(@D)/.$(@F).d $(XEN_AFLAGS)
> > +cpp_flags = $(filter-out -Wa$(comma)%,$(a_flags))
> 
> I can see this happening to be this way right now, but in principle
> I could see a_flags to hold items applicable to assembly files only,
> but not to (the preprocessing of) C files. Hence while this is fine
> for now, ...
> 
> > @@ -207,7 +208,7 @@ quiet_cmd_cc_s_c = CC  $@
> >  cmd_cc_s_c = $(CC) $(filter-out -Wa$(comma)%,$(c_flags)) -S $< -o $@
> >  
> >  quiet_cmd_s_S = CPP $@
> > -cmd_s_S = $(CPP) $(filter-out -Wa$(comma)%,$(a_flags)) $< -o $@
> > +cmd_s_S = $(CPP) $(cpp_flags) $< -o $@
> 
> ... this one is a trap waiting for someone to fall in imo. Instead
> where I'd expect this patch to use $(cpp_flags) is e.g. in
> xen/arch/x86/mm/Makefile:
> 
> guest_walk_%.i: guest_walk.c Makefile
>   $(CPP) $(cpp_flags) -DGUEST_PAGING_LEVELS=$* -c $< -o $@
> 
> And note how this currently uses $(c_flags), not $(a_flags), which
> suggests that your deriving from $(a_flags) isn't correct either.

I think we can drop this patch for now, and change patch "xen/build:
factorise generation of the linker scripts" to not use $(cpp_flags).

If we derive $(cpp_flags) from $(c_flags) instead, we would need to
find out if CPP commands using a_flags can use c_flags instead.

On the other hand, I've looked at Linux source code, and they use
$(cpp_flags) for only a few targets, only to generate the .lds scripts.
For other rules, they use either a_flags or c_flags, for example:
%.i: %.c ; uses $(c_flags)
%.i: %.S ; uses $(a_flags)
%.s: %.S ; uses $(a_flags)

(Also, they use -Qunused-arguments clang's options, so they don't need
to filter out -Wa,* arguments, I think.)

So, maybe having a single $(cpp_flags) when running the CPP command
isn't such a good idea.

So, would dropping $(cpp_flags) for now, and rework the *FLAGS later, be
good enough?

Thanks,

-- 
Anthony PERARD

Re: [XEN PATCH v5 11/16] xen/build: factorise generation of the linker scripts

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> In Arm and X86 makefile, generating the linker script is the same, so
> we can simply have both call the same macro.
> 
> We need to add *.lds files into extra-y so that Rules.mk can find the
> .*.cmd dependency file and load it.
> 
> Change made to the command line:
> - Use of $(CPP) instead of $(CC) -E
> - Remove -Ui386.
>   We don't compile Xen for i386 anymore, so that macro is never
>   defined. Also it doesn't make sense on Arm.
> - Use $(cpp_flags) which simply filter -Wa,% options from $(a_flags).
> 
> Signed-off-by: Anthony PERARD 

For the non-Arm bits
Acked-by: Jan Beulich

Re: [XEN PATCH v5 09/16] xen/build: use if_changed on built_in.o

2020-04-28 Thread Jan Beulich

On 21.04.2020 18:12, Anthony PERARD wrote:
> In the case where $(obj-y) is empty, we also replace $(c_flags) by
> $(XEN_CFLAGS) to avoid generating an .%.d dependency file. This avoid
> make trying to include %.h file in the ld command if $(obj-y) isn't
> empty anymore on a second run.
> 
> Signed-off-by: Anthony PERARD 

Acked-by: Jan Beulich 

Personally I'd prefer ...

> --- a/xen/Rules.mk
> +++ b/xen/Rules.mk
> @@ -130,15 +130,24 @@ include $(BASEDIR)/arch/$(TARGET_ARCH)/Rules.mk
>  c_flags += $(CFLAGS-y)
>  a_flags += $(CFLAGS-y) $(AFLAGS-y)
>  
> -built_in.o: $(obj-y) $(extra-y)
> -ifeq ($(obj-y),)
> - $(CC) $(c_flags) -c -x c /dev/null -o $@
> -else
> +quiet_cmd_ld_builtin = LD  $@
>  ifeq ($(CONFIG_LTO),y)
> - $(LD_LTO) -r -o $@ $(filter-out $(extra-y),$^)
> +cmd_ld_builtin = \
> +$(LD_LTO) -r -o $@ $(filter-out $(extra-y),$(real-prereqs))
>  else
> - $(LD) $(XEN_LDFLAGS) -r -o $@ $(filter-out $(extra-y),$^)
> +cmd_ld_builtin = \
> +$(LD) $(XEN_LDFLAGS) -r -o $@ $(filter-out $(extra-y),$(real-prereqs))
>  endif
> +
> +quiet_cmd_cc_builtin = LD  $@
> +cmd_cc_builtin = \
> +$(CC) $(XEN_CFLAGS) -c -x c /dev/null -o $@
> +
> +built_in.o: $(obj-y) $(extra-y) FORCE
> +ifeq ($(obj-y),)
> + $(call if_changed,cc_builtin)
> +else
> + $(call if_changed,ld_builtin)
>  endif

...

   $(call if_changed,$(if $(obj-y),ld,cc)_builtin)

but perhaps I'm the only one.

Jan

[PATCH] PCI: drop a redundant variable from pci_add_device()

2020-04-28 Thread Jan Beulich

Surrounding code already uses the available alternative, after all.

Signed-off-by: Jan Beulich 

--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -760,7 +760,6 @@ int pci_add_device(u16 seg, u8 bus, u8 d
 {
 unsigned int idx = pos + PCI_SRIOV_BAR + i * 4;
 uint32_t bar = pci_conf_read32(pdev->sbdf, idx);
-pci_sbdf_t sbdf = PCI_SBDF3(seg, bus, devfn);
 
 if ( (bar & PCI_BASE_ADDRESS_SPACE) ==
  PCI_BASE_ADDRESS_SPACE_IO )
@@ -771,7 +770,8 @@ int pci_add_device(u16 seg, u8 bus, u8 d
seg, bus, slot, func, i);
 continue;
 }
-ret = pci_size_mem_bar(sbdf, idx, NULL, &pdev->vf_rlen[i],
+ret = pci_size_mem_bar(pdev->sbdf, idx, NULL,
+   &pdev->vf_rlen[i],
PCI_BAR_VF |
((i == PCI_SRIOV_NUM_BARS - 1) ?
 PCI_BAR_LAST : 0));

Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Jürgen Groß

On 28.04.20 14:46, Paul Durrant wrote:

-Original Message-
From: Julien Grall 
Sent: 28 April 2020 11:23
To: Jürgen Groß ; Paul Durrant ; 
xen-devel@lists.xenproject.org
Cc: Paul Durrant ; Andrew Cooper 
; George Dunlap
; Ian Jackson ; Jan Beulich 
;
Stefano Stabellini ; Wei Liu 
Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

Hi Juergen,

On 28/04/2020 11:10, Jürgen Groß wrote:

On 28.04.20 11:05, Julien Grall wrote:

-where tx_id is the non-zero identifier values of an open transaction.
+| Field | Description   |
+|---|---|
+| `domid`   | The domain-id that owns the shared page   |
+|   |   |
+| `tdomid`  | The domain-id that `domid` acts on behalf of if   |
+|   | it has been subject to an SET_TARGET  |
+|   | operation [2] or DOMID_INVALID [3] otherwise  |
+|   |   |
+| `flags`   | Must be zero  |
+|   |   |
+| `evtchn`  | The port number of the interdomain channel used   |
+|   | by `domid` to communicate with xenstored  |
+|   |   |
+| `mfn` | The MFN of the shared page for a live update or   |
+|   | ~0 (i.e. all bits set) otherwise  |
-### Protocol Extension
+Since the ABI guarantees that entry 1 in `domid`'s grant table will
always
+contain the GFN of the shared page, so for a live update `mfn` can
be used to
+give confidence that `domid` has not been re-cycled during the update.

I am confused, there is no way a XenStored running in an Arm domain
can map the MFN of the shared page. So how is this going to work out?

I guess this will be a MFN for PV guests and a guest PFN else.

If we use Xen terminology (xen/include/xen/mm.h), this would be a GFN.

TBH I'm not sure a GFN would give much confidence about domain recycling as it 
would likely be the same for many domains, I think. We really need a UUID.

No, we need a per-host domain value, which is associated with a domain
and which is unique during the life-time of the host.

E.g. a global counter, which is incremented at each domain creation and
stored in struct domain.

Juergen

RE: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Julien Grall 
> Sent: 28 April 2020 10:06
> To: Paul Durrant ; xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Juergen Gross ; 
> Andrew Cooper
> ; George Dunlap ; Ian 
> Jackson
> ; Jan Beulich ; Stefano 
> Stabellini
> ; Wei Liu 
> Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration 
> document...
> 
> Hi Paul,
> 
> On 27/04/2020 16:50, Paul Durrant wrote:
> > diff --git a/docs/designs/xenstore-migration.md 
> > b/docs/designs/xenstore-migration.md
> > index 6ab351e8fe..51d8b85171 100644
> > --- a/docs/designs/xenstore-migration.md
> > +++ b/docs/designs/xenstore-migration.md
> > @@ -3,254 +3,400 @@
> >   ## Background
> >
> >   The design for *Non-Cooperative Migration of Guests*[1] explains that 
> > extra
> > -save records are required in the migrations stream to allow a guest running
> > -PV drivers to be migrated without its co-operation. Moreover the save
> > -records must include details of registered xenstore watches as well as
> > -content; information that cannot currently be recovered from `xenstored`,
> > -and hence some extension to the xenstore protocol[2] will also be required.
> > -
> > -The *libxenlight Domain Image Format* specification[3] already defines a
> > -record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
> > -transferring xenstore data pertaining to the domain directly as it is
> > -specified such that keys are relative to the path
> > -`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
> > -define at least one new save record type.
> > +save records are required in the migrations stream to allow a guest 
> > running PV
> > +drivers to be migrated without its co-operation. Moreover the save records 
> > must
> > +include details of registered xenstore watches as well ascontent; 
> > information
> 
> s/ascontent/as content/
> 

Oh yes.

> [...]
> 
> > +### END
> > +
> > +The end record marks the end of the image, and is the final record
> > +in the stream.
> >
> >   ```
> > -0   1   2   3 octet
> > -+---+---+---+---+
> > -| NODE_DATA |
> > -+---+
> > -| path length   |
> > -+---+
> > -| path data |
> > -...
> > -| pad (0 to 3 octets)   |
> > -+---+
> > -| perm count (N)|
> > -+---+
> > -| perm0 |
> > -+---+
> > -...
> > -+---+
> > -| permN |
> > -+---+
> > -| value length  |
> > -+---+
> > -| value data|
> > -...
> > -| pad (0 to 3 octets)   |
> > -+---+
> > +0   1   2   3   4   5   6   7octet
> > ++---+---+---+---+---+---+---+---+
> >   ```
> >
> > -where perm0..N are formatted as follows:
> >
> > +The end record contains no fields; its body length is 0.
> > +
> > +\pagebreak
> > +
> > +### GLOBAL_DATA
> > +
> > +This record is only relevant for live update. It contains details of global
> > +xenstored state that needs to be restored.
> 
> Reading this paragraph, it sounds like GLOBAL_DATA should always be
> present in the liveupdate stream. However ...
> 
> [...]
> 
> > -path length and value length are specified in octets (excluding the NUL
> > -terminator of the path). perm should be one of the ASCII values `w`, `r`,
> > -`b` or `n` as described in [2]. All pad values should be 0.
> > -All paths should be absolute (i.e. start with `/`) and as described in
> > -[2].
> > +| Field  | Description  |
> > +||--|
> > +| `rw-socket-fd` | The file descriptor of the socket accepting  |
> > +|| read-write connections   |
> > +||  |
> > +| `ro-socket-fd` | The file descriptor of the socket accepting  |
> > +|| read-only connections|
> >
> > +xenstored will resume in the original process context. Hence 
> > `rw-socket-fd` and
> > +`ro-socket-fd` simply specify the file descriptors of the sockets.
> 
> ... sockets may not always be available in XenStored. So should we
> reserve a value for "not in-use socket"?
> 

Ok.

> [...]
> 
> > -wpath length and token length are specified in octets (excluding the NUL
> > -terminator). The wpath should be as described for the `WATCH` operation in
> > -[2]. The token is an arbitrary string of octets not containing any NUL
> > -values.
> >
> > +| Field   | Description |
> > +|-|-|
> > +| `conn-id`   | A non-zero nu

RE: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Julien Grall 
> Sent: 28 April 2020 11:23
> To: Jürgen Groß ; Paul Durrant ; 
> xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Andrew Cooper 
> ; George Dunlap
> ; Ian Jackson ; Jan 
> Beulich ;
> Stefano Stabellini ; Wei Liu 
> Subject: Re: [PATCH v4] docs/designs: re-work the xenstore migration 
> document...
> 
> Hi Juergen,
> 
> On 28/04/2020 11:10, Jürgen Groß wrote:
> > On 28.04.20 11:05, Julien Grall wrote:
> >>> -where tx_id is the non-zero identifier values of an open transaction.
> >>> +| Field | Description   |
> >>> +|---|---|
> >>> +| `domid`   | The domain-id that owns the shared page   |
> >>> +|   |   |
> >>> +| `tdomid`  | The domain-id that `domid` acts on behalf of if   |
> >>> +|   | it has been subject to an SET_TARGET  |
> >>> +|   | operation [2] or DOMID_INVALID [3] otherwise  |
> >>> +|   |   |
> >>> +| `flags`   | Must be zero  |
> >>> +|   |   |
> >>> +| `evtchn`  | The port number of the interdomain channel used   |
> >>> +|   | by `domid` to communicate with xenstored  |
> >>> +|   |   |
> >>> +| `mfn` | The MFN of the shared page for a live update or   |
> >>> +|   | ~0 (i.e. all bits set) otherwise  |
> >>> -### Protocol Extension
> >>> +Since the ABI guarantees that entry 1 in `domid`'s grant table will
> >>> always
> >>> +contain the GFN of the shared page, so for a live update `mfn` can
> >>> be used to
> >>> +give confidence that `domid` has not been re-cycled during the update.
> >>
> >> I am confused, there is no way a XenStored running in an Arm domain
> >> can map the MFN of the shared page. So how is this going to work out?
> >
> > I guess this will be a MFN for PV guests and a guest PFN else.
> 
> If we use Xen terminology (xen/include/xen/mm.h), this would be a GFN.
> 

TBH I'm not sure a GFN would give much confidence about domain recycling as it 
would likely be the same for many domains, I think. We really need a UUID.

> >
> >>
> >> [...]
> >>
> >>> -START_DOMAIN_TRANSACTION||
> >>> +0   1   2   3octet
> >>> ++---+---+---+---+
> >>> +| conn-id   |
> >>> ++---+
> >>> +| tx-id |
> >>> ++---+---+
> >>> +| path-len  | value-len |
> >>> ++---+---+
> >>> +| access| perm-count|
> >>> ++---+---+
> >>> +| perm1 |
> >>> ++---+
> >>> +...
> >>> ++---+
> >>> +| permN |
> >>> ++---+---+
> >>> +| path
> >>> +...
> >>> +| value
> >>> +...
> >>> +```
> >>> +
> >>> +
> >>> +| Field| Description|
> >>> +|--||
> >>> +| `conn-id`| If this value is non-zero then this record |
> >>> +|  | related to a pending transaction   |
> >>
> >> If conn-id is 0, how do you know the owner of the node?
> >
> > The first permission entry's domid is the owner.
> 
> I think this should be spell out in the specification.
> 

In xenstore.txt, there is a reference to https://wiki.xen.org/wiki/XenBus

This explains things reasonably well; I'll reference it here and add some words 
too.

  Paul

RE: [PATCH 1/2] Fix undefined behaviour

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Paul Durrant 
> Sent: 28 April 2020 13:33
> To: 'Artur Puzio' ; 'Grzegorz Uriasz' 
> ; qemu-de...@nongnu.org
> Cc: marma...@invisiblethingslab.com; ja...@bartmin.ski; 
> j.nowa...@student.uw.edu.pl; 'Stefano
> Stabellini' ; 'Anthony Perard' 
> ; xen-
> de...@lists.xenproject.org
> Subject: RE: [PATCH 1/2] Fix undefined behaviour
> 
> > -Original Message-
> > From: Artur Puzio 
> > Sent: 28 April 2020 10:41
> > To: p...@xen.org; 'Grzegorz Uriasz' ; 
> > qemu-de...@nongnu.org
> > Cc: marma...@invisiblethingslab.com; ja...@bartmin.ski; 
> > j.nowa...@student.uw.edu.pl; 'Stefano
> > Stabellini' ; 'Anthony Perard' 
> > ; xen-
> > de...@lists.xenproject.org
> > Subject: Re: [PATCH 1/2] Fix undefined behaviour
> >
> > On 28.04.2020 10:10, Paul Durrant wrote:
> > >> -Original Message-
> > >> From: Grzegorz Uriasz 
> > >> Sent: 28 April 2020 07:29
> > >> To: qemu-de...@nongnu.org
> > >> Cc: Grzegorz Uriasz ; 
> > >> marma...@invisiblethingslab.com; ar...@puzio.waw.pl;
> > >> ja...@bartmin.ski; j.nowa...@student.uw.edu.pl; Stefano Stabellini 
> > >> ;
> > Anthony
> > >> Perard ; Paul Durrant ; 
> > >> xen-devel@lists.xenproject.org
> > >> Subject: [PATCH 1/2] Fix undefined behaviour
> > >>
> > >> Signed-off-by: Grzegorz Uriasz 
> > > I think we need more of a commit comment for both this and patch #2 to 
> > > explain why you are making
> > the changes.
> > >
> > >   Paul
> >
> > I agree Grzegorz should improve the commit messages. In the mean time
> > see email with subject "[PATCH 0/2] Fix QEMU crashes when passing IGD to
> > a guest VM under XEN", it contains quite detailed explanation for both
> > "Fix undefined behaviour" and "Improve legacy vbios handling" patches.
> >
> 
> Ok. Can you please make sure maintainers are cc-ed on patch #0 too.
> 

Actually they are, sorry. My MUA is playing tricks on me.

  Paul

RE: [PATCH 1/2] Fix undefined behaviour

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Artur Puzio 
> Sent: 28 April 2020 10:41
> To: p...@xen.org; 'Grzegorz Uriasz' ; 
> qemu-de...@nongnu.org
> Cc: marma...@invisiblethingslab.com; ja...@bartmin.ski; 
> j.nowa...@student.uw.edu.pl; 'Stefano
> Stabellini' ; 'Anthony Perard' 
> ; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH 1/2] Fix undefined behaviour
> 
> On 28.04.2020 10:10, Paul Durrant wrote:
> >> -Original Message-
> >> From: Grzegorz Uriasz 
> >> Sent: 28 April 2020 07:29
> >> To: qemu-de...@nongnu.org
> >> Cc: Grzegorz Uriasz ; marma...@invisiblethingslab.com; 
> >> ar...@puzio.waw.pl;
> >> ja...@bartmin.ski; j.nowa...@student.uw.edu.pl; Stefano Stabellini 
> >> ;
> Anthony
> >> Perard ; Paul Durrant ; 
> >> xen-devel@lists.xenproject.org
> >> Subject: [PATCH 1/2] Fix undefined behaviour
> >>
> >> Signed-off-by: Grzegorz Uriasz 
> > I think we need more of a commit comment for both this and patch #2 to 
> > explain why you are making
> the changes.
> >
> >   Paul
> 
> I agree Grzegorz should improve the commit messages. In the mean time
> see email with subject "[PATCH 0/2] Fix QEMU crashes when passing IGD to
> a guest VM under XEN", it contains quite detailed explanation for both
> "Fix undefined behaviour" and "Improve legacy vbios handling" patches.
> 

Ok. Can you please make sure maintainers are cc-ed on patch #0 too.

  Paul

RE: [PATCH] x86/pass-through: avoid double IRQ unbind during domain cleanup

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Jan Beulich 
> Sent: 28 April 2020 13:22
> To: xen-devel@lists.xenproject.org
> Cc: Paul Durrant ; Varad Gautam ; Andrew Cooper
> ; Roger Pau Monné ; Wei Liu 
> 
> Subject: [PATCH] x86/pass-through: avoid double IRQ unbind during domain 
> cleanup
> 
> XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTs.
> In that scenario, it is possible to receive multiple _pirq_guest_unbind
> calls for the same pirq from domain_kill, if the pirq has not yet been
> removed from the domain's pirq_tree, as:
>   domain_kill()
> -> domain_relinquish_resources()
>   -> pci_release_devices()
> -> pci_clean_dpci_irq()
>   -> pirq_guest_unbind()
> -> __pirq_guest_unbind()
> 
> Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
> from the tree being iterated after the first call there. In case such a
> removed entry still has a softirq outstanding, record it and re-check
> upon re-invocation.
> 
> Reported-by: Varad Gautam 
> Signed-off-by: Jan Beulich 
> Tested-by: Varad Gautam 

Reviewed-by: Paul Durrant

[PATCH] x86/pass-through: avoid double IRQ unbind during domain cleanup

2020-04-28 Thread Jan Beulich

XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTs.
In that scenario, it is possible to receive multiple _pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
  domain_kill()
-> domain_relinquish_resources()
  -> pci_release_devices()
-> pci_clean_dpci_irq()
  -> pirq_guest_unbind()
-> __pirq_guest_unbind()

Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
from the tree being iterated after the first call there. In case such a
removed entry still has a softirq outstanding, record it and re-check
upon re-invocation.

Reported-by: Varad Gautam 
Signed-off-by: Jan Beulich 
Tested-by: Varad Gautam 

--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
 }
 
 if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
-BUG();
+BUG_ON(!d->is_dying);
 }
 
 /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
 xfree(digl);
 }
 
-return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
+radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
+
+if ( !pt_pirq_softirq_active(pirq_dpci) )
+return 0;
+
+domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
+
+return -ERESTART;
 }
 
 static int pci_clean_dpci_irqs(struct domain *d)
@@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
 hvm_irq_dpci = domain_get_irq_dpci(d);
 if ( hvm_irq_dpci != NULL )
 {
-int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
+int ret = 0;
+
+if ( hvm_irq_dpci->pending_pirq_dpci )
+{
+if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
+ ret = -ERESTART;
+else
+ hvm_irq_dpci->pending_pirq_dpci = NULL;
+}
 
+if ( !ret )
+ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
 if ( ret )
 {
 spin_unlock(&d->event_lock);
--- a/xen/include/asm-x86/hvm/irq.h
+++ b/xen/include/asm-x86/hvm/irq.h
@@ -158,6 +158,8 @@ struct hvm_irq_dpci {
 DECLARE_BITMAP(isairq_map, NR_ISAIRQS);
 /* Record of mapped Links */
 uint8_t link_cnt[NR_LINK];
+/* Clean up: Entry with a softirq invocation pending / in progress. */
+struct hvm_pirq_dpci *pending_pirq_dpci;
 };
 
 /* Machine IRQ to guest device/intx mapping. */

Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

2020-04-28 Thread Jan Beulich

On 28.04.2020 13:58, v...@amazon.com wrote:
> On 3/10/20 3:19 PM, Jan Beulich wrote:
>> On 09.03.2020 18:47, Paul Durrant wrote:
>>> Please suggest code if you think it ought to be done differentely. I tried.
>> How about this? It's admittedly more code, but imo less ad hoc.
>> I've smoke tested it, but I depend on you or Varad to check that
>> it actually addresses the reported issue.
>>
>> Jan
>>
>> x86/pass-through: avoid double IRQ unbind during domain cleanup
> 
> 
> I have tested that this patch prevents __pirq_guest_unbind on an 
> already-unbound pirq
> during the continuation call for domain_kill -ERESTART, by using a modified 
> xen that
> forces an -ERESTART from pirq_guest_unbind to create the continuation. It 
> fixes the
> underlying issue.
> 
> Tested-by: Varad Gautam 

Thanks much; I'll formally submit the patch then.

Jan

Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for shared pirqs

2020-04-28 Thread vrd


Hi Jan,

On 3/10/20 3:19 PM, Jan Beulich wrote:

On 09.03.2020 18:47, Paul Durrant wrote:

-Original Message-
From: Jan Beulich 
Sent: 09 March 2020 16:29
To: p...@xen.org
Cc: xen-devel@lists.xenproject.org; Varad Gautam ; Julien Grall 
; Roger
Pau Monné ; Andrew Cooper 
Subject: Re: [PATCH v4] x86: irq: Do not BUG_ON multiple unbind calls for 
shared pirqs

On 06.03.2020 17:02, p...@xen.org wrote:

From: Varad Gautam 

XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple __pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
   domain_kill()
 -> domain_relinquish_resources()
   -> pci_release_devices()
 -> pci_clean_dpci_irq()
   -> pirq_guest_unbind()
 -> __pirq_guest_unbind()

For a shared pirq (nr_guests > 1), the first call would zap the current
domain from the pirq's guests[] list, but the action handler is never freed
as there are other guests using this pirq. As a result, on the second call,
__pirq_guest_unbind searches for the current domain which has been removed
from the guests[] list, and hits a BUG_ON.

Make __pirq_guest_unbind safe to be called multiple times by letting xen
continue if a shared pirq has already been unbound from this guest. The
PIRQ will be cleaned up from the domain's pirq_tree during the destruction
in complete_domain_destroy anyway.

Signed-off-by: Varad Gautam 
[taking over from Varad at v4]
Signed-off-by: Paul Durrant 
---
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Roger Pau Monné 
Cc: Andrew Cooper 

Roger suggested cleaning the entry from the domain pirq_tree so that
we need not make it safe to re-call __pirq_guest_unbind(). This seems like
a reasonable suggestion but the semantics of the code are almost
impenetrable (e.g. 'pirq' is used to mean an index, a pointer and is also
the name of struct so you generally have little idea what it actally means)
so I prefer to stick with a small fix that I can actually reason about.

v4:
  - Re-work the guest array search to make it clearer

I.e. there are cosmetic differences to v3 (see below), but
technically it's still the same. I can't believe the re-use
of "pirq" for different entities is this big of a problem.

Please suggest code if you think it ought to be done differentely. I tried.

How about this? It's admittedly more code, but imo less ad hoc.
I've smoke tested it, but I depend on you or Varad to check that
it actually addresses the reported issue.

Jan

x86/pass-through: avoid double IRQ unbind during domain cleanup



I have tested that this patch prevents __pirq_guest_unbind on an 
already-unbound pirq
during the continuation call for domain_kill -ERESTART, by using a 
modified xen that
forces an -ERESTART from pirq_guest_unbind to create the continuation. 
It fixes the

underlying issue.

Tested-by: Varad Gautam 




XEN_DOMCTL_destroydomain creates a continuation if domain_kill -ERESTARTS.
In that scenario, it is possible to receive multiple _pirq_guest_unbind
calls for the same pirq from domain_kill, if the pirq has not yet been
removed from the domain's pirq_tree, as:
   domain_kill()
 -> domain_relinquish_resources()
   -> pci_release_devices()
 -> pci_clean_dpci_irq()
   -> pirq_guest_unbind()
 -> __pirq_guest_unbind()

Avoid recurring invocations of pirq_guest_unbind() by removing the pIRQ
from the tree being iterated after the first call there. In case such a
removed entry still has a softirq outstanding, record it and re-check
upon re-invocation.

Reported-by: Varad Gautam 
Signed-off-by: Jan Beulich 

--- unstable.orig/xen/arch/x86/irq.c
+++ unstable/xen/arch/x86/irq.c
@@ -1323,7 +1323,7 @@ void (pirq_cleanup_check)(struct pirq *p
  }

  if ( radix_tree_delete(&d->pirq_tree, pirq->pirq) != pirq )
-BUG();
+BUG_ON(!d->is_dying);
  }

  /* Flush all ready EOIs from the top of this CPU's pending-EOI stack. */
--- unstable.orig/xen/drivers/passthrough/pci.c
+++ unstable/xen/drivers/passthrough/pci.c
@@ -873,7 +873,14 @@ static int pci_clean_dpci_irq(struct dom
  xfree(digl);
  }

-return pt_pirq_softirq_active(pirq_dpci) ? -ERESTART : 0;
+radix_tree_delete(&d->pirq_tree, dpci_pirq(pirq_dpci)->pirq);
+
+if ( !pt_pirq_softirq_active(pirq_dpci) )
+return 0;
+
+domain_get_irq_dpci(d)->pending_pirq_dpci = pirq_dpci;
+
+return -ERESTART;
  }

  static int pci_clean_dpci_irqs(struct domain *d)
@@ -890,8 +897,18 @@ static int pci_clean_dpci_irqs(struct do
  hvm_irq_dpci = domain_get_irq_dpci(d);
  if ( hvm_irq_dpci != NULL )
  {
-int ret = pt_pirq_iterate(d, pci_clean_dpci_irq, NULL);
+int ret = 0;
+
+if ( hvm_irq_dpci->pending_pirq_dpci )
+{
+if ( pt_pirq_softirq_active(hvm_irq_dpci->pending_pirq_dpci) )
+ ret = -ERESTART;
+else
+

Re: [PATCH] mini-os: Avoid segfaults in tc{g,s}etattr

2020-04-28 Thread Andrew Cooper

On 28/04/2020 12:44, Jason Andryuk wrote:
> From: Andrew Cooper 
>
> Andrew Cooper wrote:
>> On 28/04/2020 12:16, Wei Liu wrote:
>> ---
>> I can't get ioemu-stubdom to start without this.  With this, the guest
>> just reboots immediately, but it does that with a non-stubdom
>> device_model_version="qemu-xen-traditional" .  The same guest disk image
>> (cirros 0.5.1) boots with a linux stubdom or non-stubdom Ubuntu
>> qemu-system-x86_64.
 Ubuntu gcc-9 adds -fcf-protection by default.  Somehow that flag
 caused rombios (I think) to restart.  Setting -fcf-protection=none
 like below (probably just the EMBEDDED_EXTRA_CFLAGS part) lets rombios
 start properly.
>> All it does is insert ENDBR{32,64} instructions, which are nops on older
>> processors.
>>
>> I suspect that it is not the -fcf-protection= directly, but some change
>> in alignment of a critical function.
>>
   The hypervisor needs it as well via
 EXTRA_CFLAGS_XEN_CORE=-fcf-protection=none and maybe also added to
 xen/arch/x86/boot/build32.mk .
>>> Are you able to turn this into a proper patch? I suspect you will need
>>> to test the availability of this new (?) flag.
>>>
>>> Also Cc Jan and Andrew because it affects hypervisor build too.
>> I need to chase this up.  It is a GCC bug breaking the hypervisor build,
>> and I'm moderately disinclined to hack around it, seeing as
>> -fcf-protection is something we want in due course.
>>
>> The bug is that GCC falsely declares that -fcf-protection is
>> incompatible with -mindirect-thunk=extern, despite me spending a week
>> during the Spectre embargo period specifically arranging for the two to
>> be compatible, because we knew we'd want to build retpoline-safe
>> binaries which could also use CET on newer hardware.
> The gcc manual states:
>   "Note that -mindirect-branch=thunk-extern is incompatible with
>-fcf-protection=branch since the external thunk cannot be modified
>to disable control-flow check."
>
> https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

Yes.  This is false.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93654

but sadly tumbleweeds.

I'll start a thread on the email list.

>
> Below is what I was preparing to submit as a patch.  So, yes it hacks around
> it, but it isn't messy.
>
> ---
> Disable fcf-protection to build working binaries
>
> Ubuntu gcc-9 enables -fcf-protection by default, which conflicts with
> -mindirect-branch=extern and prevents building the hypervisor with
> CONFIG_INDIRECT_THUNK:
> xmalloc.h:81:1: error: ‘-mindirect-branch’ and ‘-fcf-protection’ are not
> compatible
>
> Stefan Bader also noticed that build32.mk requires -fcf-protection=none
> or else the hypervisor will not boot.
> https://bugs.launchpad.net/ubuntu/+source/gcc-9/+bug/1863260  Similarly,
> rombios reboots almost immediately without -fcf-protection=none.  Both
> of those can be handled by setting it in EMBEDDED_EXTRA_CFLAGS.
>
> CC: Stefan Bader 
> Signed-off-by: Jason Andryuk 

Sadly, this isn't really appropriate.  We specifically do want to use
both -fcf-protection and -mindirect-branch=thunk-extern together, when
GCC isn't broken.

Overriding -fcf-protection is ok but only when we're certain we've got a
buggy GCC, so that when this bug is fixed, we can return to sensible
behaviour.

~Andrew

> ---
>  Config.mk | 1 +
>  xen/arch/x86/Rules.mk | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/Config.mk b/Config.mk
> index 0f303c79b2..efb3d42bc4 100644
> --- a/Config.mk
> +++ b/Config.mk
> @@ -205,6 +205,7 @@ APPEND_CFLAGS += $(foreach i, $(APPEND_INCLUDES), -I$(i))
>  
>  EMBEDDED_EXTRA_CFLAGS := -nopie -fno-stack-protector -fno-stack-protector-all
>  EMBEDDED_EXTRA_CFLAGS += -fno-exceptions
> +EMBEDDED_EXTRA_CFLAGS += -fcf-protection=none
>  
>  XEN_EXTFILES_URL ?= http://xenbits.xen.org/xen-extfiles
>  # All the files at that location were downloaded from elsewhere on
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index 4b7ab78467..c3cbae69d2 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -69,6 +69,7 @@ CFLAGS += -mno-sse $(call cc-option,$(CC),-mskip-rax-setup)
>  ifeq ($(CONFIG_INDIRECT_THUNK),y)
>  CFLAGS += -mindirect-branch=thunk-extern -mindirect-branch-register
>  CFLAGS += -fno-jump-tables
> +$(call cc-option-add,CFLAGS,CC,-fcf-protection=none)
>  endif
>  
>  # If supported by the compiler, reduce stack alignment to 8 bytes. But allow

Re: [PATCH] mini-os: Avoid segfaults in tc{g,s}etattr

2020-04-28 Thread Jason Andryuk

From: Andrew Cooper 

Andrew Cooper wrote:
>On 28/04/2020 12:16, Wei Liu wrote:
> ---
> I can't get ioemu-stubdom to start without this.  With this, the guest
> just reboots immediately, but it does that with a non-stubdom
> device_model_version="qemu-xen-traditional" .  The same guest disk image
> (cirros 0.5.1) boots with a linux stubdom or non-stubdom Ubuntu
> qemu-system-x86_64.
>>> Ubuntu gcc-9 adds -fcf-protection by default.  Somehow that flag
>>> caused rombios (I think) to restart.  Setting -fcf-protection=none
>>> like below (probably just the EMBEDDED_EXTRA_CFLAGS part) lets rombios
>>> start properly.
>
>All it does is insert ENDBR{32,64} instructions, which are nops on older
>processors.
>
>I suspect that it is not the -fcf-protection= directly, but some change
>in alignment of a critical function.
>
>>>   The hypervisor needs it as well via
>>> EXTRA_CFLAGS_XEN_CORE=-fcf-protection=none and maybe also added to
>>> xen/arch/x86/boot/build32.mk .
>> Are you able to turn this into a proper patch? I suspect you will need
>> to test the availability of this new (?) flag.
>>
>> Also Cc Jan and Andrew because it affects hypervisor build too.
>
>I need to chase this up.  It is a GCC bug breaking the hypervisor build,
>and I'm moderately disinclined to hack around it, seeing as
>-fcf-protection is something we want in due course.
>
>The bug is that GCC falsely declares that -fcf-protection is
>incompatible with -mindirect-thunk=extern, despite me spending a week
>during the Spectre embargo period specifically arranging for the two to
>be compatible, because we knew we'd want to build retpoline-safe
>binaries which could also use CET on newer hardware.

The gcc manual states:
  "Note that -mindirect-branch=thunk-extern is incompatible with
   -fcf-protection=branch since the external thunk cannot be modified
   to disable control-flow check."

https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

Below is what I was preparing to submit as a patch.  So, yes it hacks around
it, but it isn't messy.

---
Disable fcf-protection to build working binaries

Ubuntu gcc-9 enables -fcf-protection by default, which conflicts with
-mindirect-branch=extern and prevents building the hypervisor with
CONFIG_INDIRECT_THUNK:
xmalloc.h:81:1: error: ‘-mindirect-branch’ and ‘-fcf-protection’ are not
compatible

Stefan Bader also noticed that build32.mk requires -fcf-protection=none
or else the hypervisor will not boot.
https://bugs.launchpad.net/ubuntu/+source/gcc-9/+bug/1863260  Similarly,
rombios reboots almost immediately without -fcf-protection=none.  Both
of those can be handled by setting it in EMBEDDED_EXTRA_CFLAGS.

CC: Stefan Bader 
Signed-off-by: Jason Andryuk 
---
 Config.mk | 1 +
 xen/arch/x86/Rules.mk | 1 +
 2 files changed, 2 insertions(+)

diff --git a/Config.mk b/Config.mk
index 0f303c79b2..efb3d42bc4 100644
--- a/Config.mk
+++ b/Config.mk
@@ -205,6 +205,7 @@ APPEND_CFLAGS += $(foreach i, $(APPEND_INCLUDES), -I$(i))
 
 EMBEDDED_EXTRA_CFLAGS := -nopie -fno-stack-protector -fno-stack-protector-all
 EMBEDDED_EXTRA_CFLAGS += -fno-exceptions
+EMBEDDED_EXTRA_CFLAGS += -fcf-protection=none
 
 XEN_EXTFILES_URL ?= http://xenbits.xen.org/xen-extfiles
 # All the files at that location were downloaded from elsewhere on
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 4b7ab78467..c3cbae69d2 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -69,6 +69,7 @@ CFLAGS += -mno-sse $(call cc-option,$(CC),-mskip-rax-setup)
 ifeq ($(CONFIG_INDIRECT_THUNK),y)
 CFLAGS += -mindirect-branch=thunk-extern -mindirect-branch-register
 CFLAGS += -fno-jump-tables
+$(call cc-option-add,CFLAGS,CC,-fcf-protection=none)
 endif
 
 # If supported by the compiler, reduce stack alignment to 8 bytes. But allow
-- 
2.20.1

Re: [PATCH] MAINTAINERS: list myself as mini-os reviewer

2020-04-28 Thread Wei Liu

On Tue, Apr 28, 2020 at 01:30:20PM +0200, Samuel Thibault wrote:
> Wei Liu, le mar. 28 avril 2020 12:23:46 +0100, a ecrit:
> > I probably don't have much time to actually review patches, but I do
> > want to be CC'ed such that I can commit patches in a timely manner.
> > 
> > Signed-off-by: Wei Liu 
> 
> I actually thought you were already referenced there...

No I wasn't. Before the introduction of R: tag the only way to get CC'ed
was to step up as a maintainer. And I had had far too many hats
already...

Wei.

Re: [PATCH] MAINTAINERS: list myself as mini-os reviewer

2020-04-28 Thread Jan Beulich

On 28.04.2020 13:30, Samuel Thibault wrote:
> Wei Liu, le mar. 28 avril 2020 12:23:46 +0100, a ecrit:
>> I probably don't have much time to actually review patches, but I do
>> want to be CC'ed such that I can commit patches in a timely manner.
>>
>> Signed-off-by: Wei Liu 
> 
> I actually thought you were already referenced there...
> 
> Reviewed-by: Samuel Thibault 

Acked-by: Jan Beulich

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread Jürgen Groß


On 28.04.20 13:23, George Dunlap wrote:




On Apr 28, 2020, at 9:39 AM, Jan Beulich  wrote:

On 28.04.2020 10:24, George Dunlap wrote:

On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
On 27.04.2020 18:25, George Dunlap wrote:

If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But if he 
insists on some kind of testing for it to be outside of CONFIG_EXPERT, then 
again, the people who want it to be security supported should be the ones who 
do the work to make it happen.


I don't understand this part, I'm afraid: Without a config option,
the code is going to be security supported as long as it doesn't
get marked otherwise (experimental or what not). With an option
depending on EXPERT, what would become security unsupported is the
non-default (i.e. disabled) setting. There's not a whole lot to
test there, it's merely a formal consequence of our general rules.
(Of course, over time dependencies of other code may develop on
the information being available e.g. to Dom0 userland. Just like
there's Linux userland code assuming the kernel config is
available in certain ways [I don't necessarily mean the equivalent
of hypfs here], to then use it in what I'd call abusive ways in at
least some cases.)


Here’s an argument you might make:

“As a member of the security team, I don’t want to be on the hook for issuing 
XSAs for code which isn’t at least smoke-tested.  Therefore, I oppose any patch 
adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* there is a concrete plan 
for getting regular testing for CONFIG_HYPFS=n.”

I’m not saying that’s an argument you *should* make.  But personally I don’t 
have a strong argument against such an argument. So, it seems to me, if you did 
make it, you have a reasonable chance of carrying your point.

Now consider this hypothetical universe where you made that argument and nobody 
opposed it.  In order to get a particular feature (CONFIG_HYPFS=n security 
supported), there is extra work that needs to be done (getting CONFIG_HYPFS=n 
tested regularly).  My point was, the expectation should be that the extra work 
will be done by the people who want or benefit from the feature; the series 
shouldn’t be blocked until Juergen implements CONFIG_HYPFS=n testing (since he 
doesn’t personally have a stake in that feature).

Now obviously, doing work to help someone else out in the community is of 
course a good thing to do; it builds goodwill, uses our aggregate resources 
more efficiently, and makes our community more enjoyable to work with. But the 
goodwill primarily comes from the fact that it was done as a voluntary choice, 
not as a requirement.

Juergen was balking at having to do what he saw as extra work to implement 
CONFIG_HYPFS.  I wanted to make it clear that even though I see value in having 
CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to (although 
it would certainly be appreciated if he did).  And this paragraph was extending 
the same principle into the hypothetical universe where someone insisted that 
CONFIG_HYPFS=n had to be tested before being security supported.

Hope that makes sense. :-)


Yes, it does, thanks for the clarification. I can see what you describe
as a valid perspective to take, but really in my request to Jürgen I
took another: Now that we have Kconfig, additions of larger bodies of
code (possibly also just in terms of binary size) should imo generally
be questioned whether they want/need to be built for everyone. I.e. it
is not to be left to people being worried about binary sizes to arrange
for things to not be built, but for people contributing new but not
entirely essential code to consider making it option from the very
beginning.


I think that’s a reasonable position to take, but needs to be balanced on the 
amount of work that this would actually require.  If it only requires adding a 
handful of #ifdef’s and maybe making a few stubs, then yes, asking the 
submitter to make the change makes sense.  But if it requires three dozen 
#ifdef’s throughout the code and a fairly major architectural change, then I 
think it’s reasonable for a submitter to push back.

I don’t really understand why Juergen thinks adding CONFIG_HYPFS would cause a 
lot of code churn; my argumentation here is based on the assumption that his 
assessment is correct.


The main problem I'm seeing is the setting of runtime parameters.

This will need a complete different set of macros for defining those
parameters, split across multiple patches. And I'm fairly sure I'll
need to touch each custom_runtime_param handling function, too, in
order not to add dead code.


Juergen

Re: [PATCH] MAINTAINERS: list myself as mini-os reviewer

2020-04-28 Thread Samuel Thibault

Wei Liu, le mar. 28 avril 2020 12:23:46 +0100, a ecrit:
> I probably don't have much time to actually review patches, but I do
> want to be CC'ed such that I can commit patches in a timely manner.
> 
> Signed-off-by: Wei Liu 

I actually thought you were already referenced there...

Reviewed-by: Samuel Thibault 

> ---
> Cc: Samuel Thibault 
> Cc: minios-de...@lists.xenproject.org
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8a4c869704b0..e3748167550c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -374,6 +374,7 @@ F:xen/test/livepatch/*
>  
>  MINI-OS
>  M:   Samuel Thibault 
> +R:   Wei Liu 
>  S:   Supported
>  L:   minios-de...@lists.xenproject.org
>  T:   git https://xenbits.xenproject.org/git-http/mini-os.git
> -- 
> 2.20.1
> 

-- 
Samuel
 je la connaissais pas celle la : "make: Entering an unknown directory"
 -+- #ens-mim -+-

Re: [PATCH] mini-os: Avoid segfaults in tc{g,s}etattr

2020-04-28 Thread Andrew Cooper

On 28/04/2020 12:16, Wei Liu wrote:
 ---
 I can't get ioemu-stubdom to start without this.  With this, the guest
 just reboots immediately, but it does that with a non-stubdom
 device_model_version="qemu-xen-traditional" .  The same guest disk image
 (cirros 0.5.1) boots with a linux stubdom or non-stubdom Ubuntu
 qemu-system-x86_64.
>> Ubuntu gcc-9 adds -fcf-protection by default.  Somehow that flag
>> caused rombios (I think) to restart.  Setting -fcf-protection=none
>> like below (probably just the EMBEDDED_EXTRA_CFLAGS part) lets rombios
>> start properly.

All it does is insert ENDBR{32,64} instructions, which are nops on older
processors.

I suspect that it is not the -fcf-protection= directly, but some change
in alignment of a critical function.

>>   The hypervisor needs it as well via
>> EXTRA_CFLAGS_XEN_CORE=-fcf-protection=none and maybe also added to
>> xen/arch/x86/boot/build32.mk .
> Are you able to turn this into a proper patch? I suspect you will need
> to test the availability of this new (?) flag.
>
> Also Cc Jan and Andrew because it affects hypervisor build too.

I need to chase this up.  It is a GCC bug breaking the hypervisor build,
and I'm moderately disinclined to hack around it, seeing as
-fcf-protection is something we want in due course.

The bug is that GCC falsely declares that -fcf-protection is
incompatible with -mindirect-thunk=extern, despite me spending a week
during the Spectre embargo period specifically arranging for the two to
be compatible, because we knew we'd want to build retpoline-safe
binaries which could also use CET on newer hardware.

~Andrew

[PATCH] MAINTAINERS: list myself as mini-os reviewer

2020-04-28 Thread Wei Liu

I probably don't have much time to actually review patches, but I do
want to be CC'ed such that I can commit patches in a timely manner.

Signed-off-by: Wei Liu 
---
Cc: Samuel Thibault 
Cc: minios-de...@lists.xenproject.org
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8a4c869704b0..e3748167550c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -374,6 +374,7 @@ F:  xen/test/livepatch/*
 
 MINI-OS
 M: Samuel Thibault 
+R: Wei Liu 
 S: Supported
 L: minios-de...@lists.xenproject.org
 T: git https://xenbits.xenproject.org/git-http/mini-os.git
-- 
2.20.1

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread George Dunlap



> On Apr 28, 2020, at 9:39 AM, Jan Beulich  wrote:
> 
> On 28.04.2020 10:24, George Dunlap wrote:
>>> On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
>>> On 27.04.2020 18:25, George Dunlap wrote:
 If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But 
 if he insists on some kind of testing for it to be outside of 
 CONFIG_EXPERT, then again, the people who want it to be security supported 
 should be the ones who do the work to make it happen.
>>> 
>>> I don't understand this part, I'm afraid: Without a config option,
>>> the code is going to be security supported as long as it doesn't
>>> get marked otherwise (experimental or what not). With an option
>>> depending on EXPERT, what would become security unsupported is the
>>> non-default (i.e. disabled) setting. There's not a whole lot to
>>> test there, it's merely a formal consequence of our general rules.
>>> (Of course, over time dependencies of other code may develop on
>>> the information being available e.g. to Dom0 userland. Just like
>>> there's Linux userland code assuming the kernel config is
>>> available in certain ways [I don't necessarily mean the equivalent
>>> of hypfs here], to then use it in what I'd call abusive ways in at
>>> least some cases.)
>> 
>> Here’s an argument you might make:
>> 
>> “As a member of the security team, I don’t want to be on the hook for 
>> issuing XSAs for code which isn’t at least smoke-tested.  Therefore, I 
>> oppose any patch adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* 
>> there is a concrete plan for getting regular testing for CONFIG_HYPFS=n.”
>> 
>> I’m not saying that’s an argument you *should* make.  But personally I don’t 
>> have a strong argument against such an argument. So, it seems to me, if you 
>> did make it, you have a reasonable chance of carrying your point.
>> 
>> Now consider this hypothetical universe where you made that argument and 
>> nobody opposed it.  In order to get a particular feature (CONFIG_HYPFS=n 
>> security supported), there is extra work that needs to be done (getting 
>> CONFIG_HYPFS=n tested regularly).  My point was, the expectation should be 
>> that the extra work will be done by the people who want or benefit from the 
>> feature; the series shouldn’t be blocked until Juergen implements 
>> CONFIG_HYPFS=n testing (since he doesn’t personally have a stake in that 
>> feature).
>> 
>> Now obviously, doing work to help someone else out in the community is of 
>> course a good thing to do; it builds goodwill, uses our aggregate resources 
>> more efficiently, and makes our community more enjoyable to work with. But 
>> the goodwill primarily comes from the fact that it was done as a voluntary 
>> choice, not as a requirement.
>> 
>> Juergen was balking at having to do what he saw as extra work to implement 
>> CONFIG_HYPFS.  I wanted to make it clear that even though I see value in 
>> having CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to 
>> (although it would certainly be appreciated if he did).  And this paragraph 
>> was extending the same principle into the hypothetical universe where 
>> someone insisted that CONFIG_HYPFS=n had to be tested before being security 
>> supported.
>> 
>> Hope that makes sense. :-)
> 
> Yes, it does, thanks for the clarification. I can see what you describe
> as a valid perspective to take, but really in my request to Jürgen I
> took another: Now that we have Kconfig, additions of larger bodies of
> code (possibly also just in terms of binary size) should imo generally
> be questioned whether they want/need to be built for everyone. I.e. it
> is not to be left to people being worried about binary sizes to arrange
> for things to not be built, but for people contributing new but not
> entirely essential code to consider making it option from the very
> beginning.

I think that’s a reasonable position to take, but needs to be balanced on the 
amount of work that this would actually require.  If it only requires adding a 
handful of #ifdef’s and maybe making a few stubs, then yes, asking the 
submitter to make the change makes sense.  But if it requires three dozen 
#ifdef’s throughout the code and a fairly major architectural change, then I 
think it’s reasonable for a submitter to push back.

I don’t really understand why Juergen thinks adding CONFIG_HYPFS would cause a 
lot of code churn; my argumentation here is based on the assumption that his 
assessment is correct.

 -George

Re: [PATCH] mini-os: Avoid segfaults in tc{g,s}etattr

2020-04-28 Thread Wei Liu

On Mon, Apr 27, 2020 at 09:54:29AM +0200, Samuel Thibault wrote:
> Jason Andryuk, le dim. 26 avril 2020 23:40:19 -0400, a ecrit:
> > Commit c96c22f1d94 "mini-os: minimal implementations of some termios
> > functions" introduced implementations of tcgetattr and tcsetattr.
> > However, they do not check if files[fildes].cons.dev is non-NULL before
> > dereferencing.  This is not a problem for FDs allocated through
> > alloc_fd, but the files array pre-allocates FDs 0-2 for stdio.  Those
> > entries have a NULL .dev, so tc{g,s}etattr on them segfault.
> > 
> > ioemu-stubdom segfaults when term_init() calls tcgetattr on FD 0.
> > 
> > Restore tcgetattr and tcsetattr behavior when .dev is NULL equivalent to
> > unsupported_function as it was before c96c22f1d94.
> > 
> > Signed-off-by: Jason Andryuk 
> 
> Reviewed-by: Samuel Thibault 
> 
> Thanks!

Applied. Thanks.

Re: [PATCH] mini-os: Avoid segfaults in tc{g,s}etattr

2020-04-28 Thread Wei Liu

On Mon, Apr 27, 2020 at 09:30:50AM -0400, Jason Andryuk wrote:
> On Mon, Apr 27, 2020 at 3:54 AM Samuel Thibault
>  wrote:
> >
> > Jason Andryuk, le dim. 26 avril 2020 23:40:19 -0400, a ecrit:
> > > Commit c96c22f1d94 "mini-os: minimal implementations of some termios
> > > functions" introduced implementations of tcgetattr and tcsetattr.
> > > However, they do not check if files[fildes].cons.dev is non-NULL before
> > > dereferencing.  This is not a problem for FDs allocated through
> > > alloc_fd, but the files array pre-allocates FDs 0-2 for stdio.  Those
> > > entries have a NULL .dev, so tc{g,s}etattr on them segfault.
> > >
> > > ioemu-stubdom segfaults when term_init() calls tcgetattr on FD 0.
> > >
> > > Restore tcgetattr and tcsetattr behavior when .dev is NULL equivalent to
> > > unsupported_function as it was before c96c22f1d94.
> > >
> > > Signed-off-by: Jason Andryuk 
> >
> > Reviewed-by: Samuel Thibault 
> >
> > Thanks!
> 
> Thank you!
> 
> > > ---
> > > I can't get ioemu-stubdom to start without this.  With this, the guest
> > > just reboots immediately, but it does that with a non-stubdom
> > > device_model_version="qemu-xen-traditional" .  The same guest disk image
> > > (cirros 0.5.1) boots with a linux stubdom or non-stubdom Ubuntu
> > > qemu-system-x86_64.
> 
> Ubuntu gcc-9 adds -fcf-protection by default.  Somehow that flag
> caused rombios (I think) to restart.  Setting -fcf-protection=none
> like below (probably just the EMBEDDED_EXTRA_CFLAGS part) lets rombios
> start properly.  The hypervisor needs it as well via
> EXTRA_CFLAGS_XEN_CORE=-fcf-protection=none and maybe also added to
> xen/arch/x86/boot/build32.mk .

Are you able to turn this into a proper patch? I suspect you will need
to test the availability of this new (?) flag.

Also Cc Jan and Andrew because it affects hypervisor build too.

> 
> diff --git a/Config.mk b/Config.mk
> index 0f303c79b2..efb3d42bc4 100644
> --- a/Config.mk
> +++ b/Config.mk
> @@ -205,6 +205,7 @@ APPEND_CFLAGS += $(foreach i, $(APPEND_INCLUDES), -I$(i))
> 
>  EMBEDDED_EXTRA_CFLAGS := -nopie -fno-stack-protector -fno-stack-protector-all
>  EMBEDDED_EXTRA_CFLAGS += -fno-exceptions
> +EMBEDDED_EXTRA_CFLAGS += -fcf-protection=none
> 
>  XEN_EXTFILES_URL ?= http://xenbits.xen.org/xen-extfiles
>  # All the files at that location were downloaded from elsewhere on
> diff --git a/tools/firmware/Rules.mk b/tools/firmware/Rules.mk
> index 26bbddccd4..0d33514d53 100644
> --- a/tools/firmware/Rules.mk
> +++ b/tools/firmware/Rules.mk
> @@ -17,3 +17,4 @@ $(call cc-options-add,CFLAGS,CC,$(EMBEDDED_EXTRA_CFLAGS))
> 
>  # Extra CFLAGS suitable for an embedded type of environment.
>  CFLAGS += -fno-builtin -msoft-float
> +CFLAGS += -fcf-protection=none
>

[libvirt test] 149850: regressions - FAIL

2020-04-28 Thread osstest service owner

flight 149850 libvirt real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149850/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-amd64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-i386-libvirt6 libvirt-buildfail REGR. vs. 146182
 build-arm64-libvirt   6 libvirt-buildfail REGR. vs. 146182
 build-armhf-libvirt   6 libvirt-buildfail REGR. vs. 146182

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-libvirt-vhd  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-pair  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-pair  1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt-raw  1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt-xsm   1 build-check(1)   blocked  n/a
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-xsm  1 build-check(1)   blocked  n/a
 test-arm64-arm64-libvirt-qcow2  1 build-check(1)   blocked  n/a
 test-amd64-i386-libvirt   1 build-check(1)   blocked  n/a
 test-armhf-armhf-libvirt  1 build-check(1)   blocked  n/a

version targeted for testing:
 libvirt  c1165f70c2a51efc389801d25e06c44099fb7c83
baseline version:
 libvirt  a1cd25b919509be2645dbe6f952d5263e0d4e4e5

Last test of basis   146182  2020-01-17 06:00:23 Z  102 days
Failing since146211  2020-01-18 04:18:52 Z  101 days   94 attempts
Testing same since   149850  2020-04-28 04:25:46 Z0 days1 attempts


People who touched revisions under test:
  Andrea Bolognani 
  Arnaud Patard 
  Bjoern Walk 
  Boris Fiuczynski 
  Chen Hanxiao 
  Christian Borntraeger 
  Christian Ehrhardt 
  Christian Schoenebeck 
  Collin Walling 
  Cornelia Huck 
  Daniel Henrique Barboza 
  Daniel P. Berrangé 
  Daniel Veillard 
  Dario Faggioli 
  Erik Skultety 
  Gaurav Agrawal 
  Han Han 
  Jamie Strandboge 
  Jim Fehlig 
  Jiri Denemark 
  Jonathon Jongsma 
  Julio Faracco 
  Ján Tomko 
  Laine Stump 
  Leonid Bloch 
  Lin Ma 
  Marc-André Lureau 
  Marek Marczykowski-Górecki 
  Mark Asselstine 
  Mauro S. M. Rodrigues 
  Michal Privoznik 
  Nikolay Shirokovskiy 
  Pavel Hrdina 
  Pavel Mores 
  Peter Krempa 
  Philipp Hahn 
  Pino Toscano 
  Prathamesh Chavan 
  Rafael Fonseca 
  Richard W.M. Jones 
  Rikard Falkeborn 
  Ryan Moeller 
  Sahid Orentino Ferdjaoui 
  Sebastian Mitterle 
  Seeteena Thoufeek 
  Stefan Berger 
  Stefan Berger 
  Stefan Hajnoczi 
  Thomas Huth 
  Wu Qingliang 
  Yi Li 
  Your Name 
  Zhang Bo 
  zhenwei pi 
  Zhimin Feng 

jobs:
 build-amd64-xsm  pass
 build-arm64-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-arm64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  fail
 build-arm64-libvirt  fail
 build-armhf-libvirt  fail
 build-i386-libvirt   fail
 build-amd64-pvopspass
 build-arm64-pvopspass
 build-armhf-pvopspass
 build-i386-pvops pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm   blocked 
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsmblocked 
 test-amd64-amd64-libvirt-xsm blocked 
 test-arm64-arm64-libvirt-xsm blocked 
 test-amd64-i386-libvirt-xsm  blocked 
 test-amd64-amd64-libvirt blocked 
 test-arm64-arm64-libvirt blocked 
 test-armhf-armhf-libvirt blocked 
 test-amd64-i386-libvirt  blocked 
 test-amd64-amd64-libvirt-pairblocked 
 test-amd64-i386-libvirt-pair blocked 
 test-arm64-arm64-libvi

Re: [PATCH 1/2] Fix undefined behaviour

2020-04-28 Thread Artur Puzio

On 28.04.2020 10:10, Paul Durrant wrote:
>> -Original Message-
>> From: Grzegorz Uriasz 
>> Sent: 28 April 2020 07:29
>> To: qemu-de...@nongnu.org
>> Cc: Grzegorz Uriasz ; marma...@invisiblethingslab.com; 
>> ar...@puzio.waw.pl;
>> ja...@bartmin.ski; j.nowa...@student.uw.edu.pl; Stefano Stabellini 
>> ; Anthony
>> Perard ; Paul Durrant ; 
>> xen-devel@lists.xenproject.org
>> Subject: [PATCH 1/2] Fix undefined behaviour
>>
>> Signed-off-by: Grzegorz Uriasz 
> I think we need more of a commit comment for both this and patch #2 to 
> explain why you are making the changes.
>
>   Paul 

I agree Grzegorz should improve the commit messages. In the mean time
see email with subject "[PATCH 0/2] Fix QEMU crashes when passing IGD to
a guest VM under XEN", it contains quite detailed explanation for both
"Fix undefined behaviour" and "Improve legacy vbios handling" patches.

Artur Puzio

[linux-linus test] 149840: regressions - FAIL

2020-04-28 Thread osstest service owner

flight 149840 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/149840/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-examine  8 reboot   fail REGR. vs. 149832
 test-amd64-amd64-examine  4 memdisk-try-append   fail REGR. vs. 149832

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds16 guest-start/debian.repeat fail REGR. vs. 149832

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-rtds 18 guest-localmigrate/x10  fail blocked in 149832
 test-amd64-amd64-xl-qemut-win7-amd64 17 guest-stopfail like 149832
 test-armhf-armhf-libvirt 14 saverestore-support-checkfail  like 149832
 test-amd64-amd64-xl-qemuu-win7-amd64 17 guest-stopfail like 149832
 test-amd64-amd64-xl-qemut-ws16-amd64 17 guest-stopfail like 149832
 test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 149832
 test-armhf-armhf-libvirt-raw 13 saverestore-support-checkfail  like 149832
 test-amd64-i386-xl-qemut-win7-amd64 17 guest-stop fail like 149832
 test-amd64-amd64-xl-qemuu-ws16-amd64 17 guest-stopfail like 149832
 test-amd64-i386-xl-pvshim12 guest-start  fail   never pass
 test-amd64-amd64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-seattle  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt 13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt-xsm  13 migrate-support-checkfail   never pass
 test-amd64-i386-libvirt  13 migrate-support-checkfail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 11 migrate-support-check 
fail never pass
 test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2  fail never pass
 test-arm64-arm64-xl-xsm  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-xsm  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit1  14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-thunderx 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 13 migrate-support-checkfail   never pass
 test-arm64-arm64-libvirt-xsm 14 saverestore-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  13 migrate-support-checkfail   never pass
 test-arm64-arm64-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-arndale  14 saverestore-support-checkfail   never pass
 test-amd64-amd64-libvirt-vhd 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit2  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-libvirt 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-rtds 14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-multivcpu 13 migrate-support-checkfail  never pass
 test-armhf-armhf-xl-multivcpu 14 saverestore-support-checkfail  never pass
 test-armhf-armhf-xl  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl  14 saverestore-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  13 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-credit1  14 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemuu-ws16-amd64 17 guest-stop  fail never pass
 test-armhf-armhf-libvirt-raw 12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-cubietruck 13 migrate-support-checkfail never pass
 test-armhf-armhf-xl-cubietruck 14 saverestore-support-checkfail never pass
 test-armhf-armhf-xl-vhd  12 migrate-support-checkfail   never pass
 test-armhf-armhf-xl-vhd  13 saverestore-support-checkfail   never pass
 test-amd64-i386-xl-qemut-ws16-amd64 17 guest-stop  fail never pass

version targeted for testing:
 linux51184ae37e0518fd90cb437a2fbc953ae558cd0d
baseline version:
 linux6a8b55ed4056ea5559ebe4f6a4b247f627870d4c

Last test of basis   149832  2020-04-27 03:24:02 Z1 days
Testing same since   14

Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Julien Grall


Hi Juergen,

On 28/04/2020 11:10, Jürgen Groß wrote:

On 28.04.20 11:05, Julien Grall wrote:

-where tx_id is the non-zero identifier values of an open transaction.
+| Field | Description   |
+|---|---|
+| `domid`   | The domain-id that owns the shared page   |
+|   |   |
+| `tdomid`  | The domain-id that `domid` acts on behalf of if   |
+|   | it has been subject to an SET_TARGET  |
+|   | operation [2] or DOMID_INVALID [3] otherwise  |
+|   |   |
+| `flags`   | Must be zero  |
+|   |   |
+| `evtchn`  | The port number of the interdomain channel used   |
+|   | by `domid` to communicate with xenstored  |
+|   |   |
+| `mfn` | The MFN of the shared page for a live update or   |
+|   | ~0 (i.e. all bits set) otherwise  |
-### Protocol Extension
+Since the ABI guarantees that entry 1 in `domid`'s grant table will 
always
+contain the GFN of the shared page, so for a live update `mfn` can 
be used to

+give confidence that `domid` has not been re-cycled during the update.


I am confused, there is no way a XenStored running in an Arm domain 
can map the MFN of the shared page. So how is this going to work out?


I guess this will be a MFN for PV guests and a guest PFN else.


If we use Xen terminology (xen/include/xen/mm.h), this would be a GFN.





[...]


-START_DOMAIN_TRANSACTION    ||
+    0   1   2   3    octet
++---+---+---+---+
+| conn-id   |
++---+
+| tx-id |
++---+---+
+| path-len  | value-len |
++---+---+
+| access    | perm-count    |
++---+---+
+| perm1 |
++---+
+...
++---+
+| permN |
++---+---+
+| path
+...
+| value
+...
+```
+
+
+| Field    | Description    |
+|--||
+| `conn-id`    | If this value is non-zero then this record |
+|  | related to a pending transaction   |


If conn-id is 0, how do you know the owner of the node?


The first permission entry's domid is the owner.


I think this should be spell out in the specification.






+|  |    |
+| `tx-id`  | This value should be ignored if `conn-id` is   |
+|  | zero. Otherwise it specifies the id of the |
+|  | pending transaction    |
+|  |    |
+| `path-len`   | The length (in octets) of `path` including the |
+|  | NUL terminator |
+|  |    |
+| `value-len`  | The length (in octets) of `value` (which will  |
+|  | be zero for a deleted node)    |
+|  |    |
+| `access` | This value should be ignored if this record    |
+|  | does not relate to a pending transaction,  |
+|  | otherwise it specifies the accesses made to    |
+|  | the node and hence is a bitwise OR of: |
+|  |    |
+|  | 0x0001: read   |
+|  | 0x0002: written    |
+|  |    |
+|  | The value will be zero for a deleted node  |
+|  |    |
+| `perm-count` | The number (N) of node permission specifiers   |
+|  | (which will be 0 for a node deleted in a   |
+|  | pending transaction)   |
+|  |    |
+| `perm1..N`   | A list of zero or more node permission |
+|  | specifiers (see below) |


This is a bit odd to start at one. Does it mean perm0 exists and not 
preserved?


Why? perm0 does not exist. If you have N permissions 1..N is just fine.
If you have no permissions (N=0) you won't have any permX entry.

In theory you could say perm0..N-1, but this would result in perm0..-1
for N=0 which would be really odd.


As it is odd to me to start at 1 (I am used to index starting at 0) ;). 
The more it wasn't clear h

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Jürgen Groß

On 28.04.20 10:25, Peng Fan wrote:

Subject: Re: [PATCH] xen/swiotlb: correct the check for
xen_destroy_contiguous_region

On 28.04.20 09:33, peng@nxp.com wrote:

From: Peng Fan 

When booting xen on i.MX8QM, met:
"
[3.602128] Unable to handle kernel paging request at virtual address

00272d40

[3.610804] Mem abort info:
[3.613905]   ESR = 0x9604
[3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
[3.623211]   SET = 0, FnV = 0
[3.626628]   EA = 0, S1PTW = 0
[3.630128] Data abort info:
[3.633362]   ISV = 0, ISS = 0x0004
[3.637630]   CM = 0, WnR = 0
[3.640955] [00272d40] user address but active_mm is

swapper

[3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
[3.654137] Modules linked in:
[3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
[3.677302] Workqueue: events deferred_probe_work_func
[3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
[3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
[3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
[3.693993] pci_bus :00: root bus resource [bus 00-ff]
[3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
"

In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
or range_straddles_page_boundary(phys, size) are true, it will create
contiguous region. So when free, we need to free contiguous region use
upper check condition.

No, this will break PV guests on x86.

Could you share more details why alloc and free not matching for the check?

xen_create_contiguous_region() is needed only in case:

- the bus address is not within dma_mask, or
- the memory region is not physically contiguous (can happen only for
  PV guests)

In any case it should arrange for the memory to be suitable for the
DMA operation, so to be contiguous and within dma_mask afterwards. So
xen_destroy_contiguous_region() should only ever called for areas
which match above criteria, as otherwise we can be sure
xen_create_contiguous_region() was not used for making the area DMA-able
in the beginning.

And this is very important in the PV case, as in those guests the page
tables are containing the host-PFNs, not the guest-PFNS, and
xen_create_contiguous_region() will fiddle with host- vs. guest-PFN
arrangements, and xen_destroy_contiguous_region() is reverting this
fiddling. Any call of xen_destroy_contiguous_region() for an area it
was not intended to be called for might swap physical pages beneath
random virtual addresses, which was the reason for this test to be
added by me.

Juergen

Thanks,
Peng.

I think there is something wrong with your setup in combination with the ARM
xen_create_contiguous_region() implementation.

Stefano?

Juergen

Signed-off-by: Peng Fan 
---
   drivers/xen/swiotlb-xen.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b6d27762c6f8..ab96e468584f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,

size_t size, void *vaddr,

/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);

-   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
-range_straddles_page_boundary(phys, size)) &&
+   if (((dev_addr + size - 1 > dma_mask) ||
+   range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(virt_to_page(vaddr)))
xen_destroy_contiguous_region(phys, order);

Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Jürgen Groß


On 28.04.20 11:05, Julien Grall wrote:

Hi Paul,

On 27/04/2020 16:50, Paul Durrant wrote:
diff --git a/docs/designs/xenstore-migration.md 
b/docs/designs/xenstore-migration.md

index 6ab351e8fe..51d8b85171 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -3,254 +3,400 @@
  ## Background
  The design for *Non-Cooperative Migration of Guests*[1] explains 
that extra
-save records are required in the migrations stream to allow a guest 
running

-PV drivers to be migrated without its co-operation. Moreover the save
-records must include details of registered xenstore watches as well as
-content; information that cannot currently be recovered from 
`xenstored`,
-and hence some extension to the xenstore protocol[2] will also be 
required.

-
-The *libxenlight Domain Image Format* specification[3] already defines a
-record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
-transferring xenstore data pertaining to the domain directly as it is
-specified such that keys are relative to the path
-`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
-define at least one new save record type.
+save records are required in the migrations stream to allow a guest 
running PV
+drivers to be migrated without its co-operation. Moreover the save 
records must
+include details of registered xenstore watches as well ascontent; 
information


s/ascontent/as content/

[...]


+### END
+
+The end record marks the end of the image, and is the final record
+in the stream.
  ```
-    0   1   2   3 octet
-+---+---+---+---+
-| NODE_DATA |
-+---+
-| path length   |
-+---+
-| path data |
-...
-| pad (0 to 3 octets)   |
-+---+
-| perm count (N)    |
-+---+
-| perm0 |
-+---+
-...
-+---+
-| permN |
-+---+
-| value length  |
-+---+
-| value data    |
-...
-| pad (0 to 3 octets)   |
-+---+
+    0   1   2   3   4   5   6   7    octet
++---+---+---+---+---+---+---+---+
  ```
-where perm0..N are formatted as follows:
+The end record contains no fields; its body length is 0.
+
+\pagebreak
+
+### GLOBAL_DATA
+
+This record is only relevant for live update. It contains details of 
global

+xenstored state that needs to be restored.


Reading this paragraph, it sounds like GLOBAL_DATA should always be 
present in the liveupdate stream. However ...


[...]


-path length and value length are specified in octets (excluding the NUL
-terminator of the path). perm should be one of the ASCII values `w`, 
`r`,

-`b` or `n` as described in [2]. All pad values should be 0.
-All paths should be absolute (i.e. start with `/`) and as described in
-[2].
+| Field  | Description  |
+||--|
+| `rw-socket-fd` | The file descriptor of the socket accepting  |
+|    | read-write connections   |
+|    |  |
+| `ro-socket-fd` | The file descriptor of the socket accepting  |
+|    | read-only connections    |
+xenstored will resume in the original process context. Hence 
`rw-socket-fd` and

+`ro-socket-fd` simply specify the file descriptors of the sockets.


... sockets may not always be available in XenStored. So should we 
reserve a value for "not in-use socket"?


Yes, this should be -1 (implying that both fields should be signed
types).



[...]


-wpath length and token length are specified in octets (excluding the NUL
-terminator). The wpath should be as described for the `WATCH` 
operation in

-[2]. The token is an arbitrary string of octets not containing any NUL
-values.
+| Field   | Description |
+|-|-|
+| `conn-id`   | A non-zero number used to identify this |
+| | connection in subsequent connection-specific    |
+| | records |
+| | |
+| `conn-type` | 0x: shared ring |
+| | 0x0001: socket  |


I would write "0x0002 - 0x: reserved for future use" to match the 
rest of the specification.


[...]


-where tx_id is the non-zero identifier values of an open transaction.
+| Field | Description   |
+|---|--

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread Julien Grall


Hi Jan,

On 28/04/2020 10:59, Jan Beulich wrote:

On 28.04.2020 11:43, Julien Grall wrote:

Hi Jan,

On 28/04/2020 09:39, Jan Beulich wrote:

On 28.04.2020 10:24, George Dunlap wrote:

On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
On 27.04.2020 18:25, George Dunlap wrote:

If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But if he 
insists on some kind of testing for it to be outside of CONFIG_EXPERT, then 
again, the people who want it to be security supported should be the ones who 
do the work to make it happen.


I don't understand this part, I'm afraid: Without a config option,
the code is going to be security supported as long as it doesn't
get marked otherwise (experimental or what not). With an option
depending on EXPERT, what would become security unsupported is the
non-default (i.e. disabled) setting. There's not a whole lot to
test there, it's merely a formal consequence of our general rules.
(Of course, over time dependencies of other code may develop on
the information being available e.g. to Dom0 userland. Just like
there's Linux userland code assuming the kernel config is
available in certain ways [I don't necessarily mean the equivalent
of hypfs here], to then use it in what I'd call abusive ways in at
least some cases.)


Here’s an argument you might make:

“As a member of the security team, I don’t want to be on the hook for issuing 
XSAs for code which isn’t at least smoke-tested.  Therefore, I oppose any patch 
adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* there is a concrete plan 
for getting regular testing for CONFIG_HYPFS=n.”

I’m not saying that’s an argument you *should* make.  But personally I don’t 
have a strong argument against such an argument. So, it seems to me, if you did 
make it, you have a reasonable chance of carrying your point.

Now consider this hypothetical universe where you made that argument and nobody 
opposed it.  In order to get a particular feature (CONFIG_HYPFS=n security 
supported), there is extra work that needs to be done (getting CONFIG_HYPFS=n 
tested regularly).  My point was, the expectation should be that the extra work 
will be done by the people who want or benefit from the feature; the series 
shouldn’t be blocked until Juergen implements CONFIG_HYPFS=n testing (since he 
doesn’t personally have a stake in that feature).

Now obviously, doing work to help someone else out in the community is of 
course a good thing to do; it builds goodwill, uses our aggregate resources 
more efficiently, and makes our community more enjoyable to work with.  But the 
goodwill primarily comes from the fact that it was done as a voluntary choice, 
not as a requirement.

Juergen was balking at having to do what he saw as extra work to implement 
CONFIG_HYPFS.  I wanted to make it clear that even though I see value in having 
CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to (although 
it would certainly be appreciated if he did).  And this paragraph was extending 
the same principle into the hypothetical universe where someone insisted that 
CONFIG_HYPFS=n had to be tested before being security supported.

Hope that makes sense. :-)


Yes, it does, thanks for the clarification. I can see what you describe
as a valid perspective to take, but really in my request to Jürgen I
took another: Now that we have Kconfig, additions of larger bodies of
code (possibly also just in terms of binary size) should imo generally
be questioned whether they want/need to be built for everyone. I.e. it
is not to be left to people being worried about binary sizes to arrange
for things to not be built, but for people contributing new but not
entirely essential code to consider making it option from the very
beginning.


I like the idea to have a more configurable Xen but this also comes at the 
expense of the testing/support.

At the moment, we are getting around the problem by gating the new config 
options with CONFIG_EXPERT. I have stoppped counting the number of time I 
sweared because my config got rewritten when using 'make clean' or explain to 
someone else how to use it.

As it stands, CONFIG_EXPERT is unusable and most likely anything behind it will 
rot quite quickly. So if we want to add more stuff behind it, then I would 
suggest to make it more accessible so any developper can experiment with it.


This complaint is not new; what I'm missing are concrete suggestions
on how to improve the situation.


It is quite easy to improve, rather than specifying on the command line 
we could introduce a new Kconfig option so the user can select it normally.


This would not be very different compare to what Linux does.




Going forward, I would expect the embedded folks to want more part of Xen 
configurable. Requesting them to use CONFIG_EXPERT may be an issue as this 
means we would not security support them. At the same time, I understand that 
exposing a CONFIG increase the testing matrix. How about declaring we are 
supporting/t

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread Jan Beulich

On 28.04.2020 11:43, Julien Grall wrote:
> Hi Jan,
> 
> On 28/04/2020 09:39, Jan Beulich wrote:
>> On 28.04.2020 10:24, George Dunlap wrote:
 On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
 On 27.04.2020 18:25, George Dunlap wrote:
> If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But 
> if he insists on some kind of testing for it to be outside of 
> CONFIG_EXPERT, then again, the people who want it to be security 
> supported should be the ones who do the work to make it happen.

 I don't understand this part, I'm afraid: Without a config option,
 the code is going to be security supported as long as it doesn't
 get marked otherwise (experimental or what not). With an option
 depending on EXPERT, what would become security unsupported is the
 non-default (i.e. disabled) setting. There's not a whole lot to
 test there, it's merely a formal consequence of our general rules.
 (Of course, over time dependencies of other code may develop on
 the information being available e.g. to Dom0 userland. Just like
 there's Linux userland code assuming the kernel config is
 available in certain ways [I don't necessarily mean the equivalent
 of hypfs here], to then use it in what I'd call abusive ways in at
 least some cases.)
>>>
>>> Here’s an argument you might make:
>>>
>>> “As a member of the security team, I don’t want to be on the hook for 
>>> issuing XSAs for code which isn’t at least smoke-tested.  Therefore, I 
>>> oppose any patch adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* 
>>> there is a concrete plan for getting regular testing for CONFIG_HYPFS=n.”
>>>
>>> I’m not saying that’s an argument you *should* make.  But personally I 
>>> don’t have a strong argument against such an argument. So, it seems to me, 
>>> if you did make it, you have a reasonable chance of carrying your point.
>>>
>>> Now consider this hypothetical universe where you made that argument and 
>>> nobody opposed it.  In order to get a particular feature (CONFIG_HYPFS=n 
>>> security supported), there is extra work that needs to be done (getting 
>>> CONFIG_HYPFS=n tested regularly).  My point was, the expectation should be 
>>> that the extra work will be done by the people who want or benefit from the 
>>> feature; the series shouldn’t be blocked until Juergen implements 
>>> CONFIG_HYPFS=n testing (since he doesn’t personally have a stake in that 
>>> feature).
>>>
>>> Now obviously, doing work to help someone else out in the community is of 
>>> course a good thing to do; it builds goodwill, uses our aggregate resources 
>>> more efficiently, and makes our community more enjoyable to work with.  But 
>>> the goodwill primarily comes from the fact that it was done as a voluntary 
>>> choice, not as a requirement.
>>>
>>> Juergen was balking at having to do what he saw as extra work to implement 
>>> CONFIG_HYPFS.  I wanted to make it clear that even though I see value in 
>>> having CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to 
>>> (although it would certainly be appreciated if he did).  And this paragraph 
>>> was extending the same principle into the hypothetical universe where 
>>> someone insisted that CONFIG_HYPFS=n had to be tested before being security 
>>> supported.
>>>
>>> Hope that makes sense. :-)
>>
>> Yes, it does, thanks for the clarification. I can see what you describe
>> as a valid perspective to take, but really in my request to Jürgen I
>> took another: Now that we have Kconfig, additions of larger bodies of
>> code (possibly also just in terms of binary size) should imo generally
>> be questioned whether they want/need to be built for everyone. I.e. it
>> is not to be left to people being worried about binary sizes to arrange
>> for things to not be built, but for people contributing new but not
>> entirely essential code to consider making it option from the very
>> beginning.
> 
> I like the idea to have a more configurable Xen but this also comes at the 
> expense of the testing/support.
> 
> At the moment, we are getting around the problem by gating the new config 
> options with CONFIG_EXPERT. I have stoppped counting the number of time I 
> sweared because my config got rewritten when using 'make clean' or explain to 
> someone else how to use it.
> 
> As it stands, CONFIG_EXPERT is unusable and most likely anything behind it 
> will rot quite quickly. So if we want to add more stuff behind it, then I 
> would suggest to make it more accessible so any developper can experiment 
> with it.

This complaint is not new; what I'm missing are concrete suggestions
on how to improve the situation.

> Going forward, I would expect the embedded folks to want more part of Xen 
> configurable. Requesting them to use CONFIG_EXPERT may be an issue as this 
> means we would not security support them. At the same time, I understand that 
> exposing a CONFIG increase the testing matrix. How abo

Re: [PATCH] mem_sharing: map shared_info page to same gfn during fork

2020-04-28 Thread Roger Pau Monné

On Mon, Apr 27, 2020 at 09:36:05AM -0700, Tamas K Lengyel wrote:
> During a VM fork we copy the shared_info page; however, we also need to ensure
> that the page is mapped into the same GFN in the fork as its in the parent.
> 
> Signed-off-by: Tamas K Lengyel 
> Suggested-by: Roger Pau Monne 
> ---
>  xen/arch/x86/mm/mem_sharing.c | 29 +
>  1 file changed, 29 insertions(+)
> 
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index 344a5bfb3d..acbf21b22c 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1656,6 +1656,7 @@ static void copy_tsc(struct domain *cd, struct domain 
> *d)
>  static int copy_special_pages(struct domain *cd, struct domain *d)
>  {
>  mfn_t new_mfn, old_mfn;
> +gfn_t old_gfn;
>  struct p2m_domain *p2m = p2m_get_hostp2m(cd);
>  static const unsigned int params[] =
>  {
> @@ -1701,6 +1702,34 @@ static int copy_special_pages(struct domain *cd, 
> struct domain *d)
>  new_mfn = _mfn(virt_to_mfn(cd->shared_info));
>  copy_domain_page(new_mfn, old_mfn);
>  
> +old_gfn = _gfn(get_gpfn_from_mfn(mfn_x(old_mfn)));
> +
> +if ( !gfn_eq(old_gfn, INVALID_GFN) )

I think you need to compare the parent shared info gfn against the
child shared info gfn, in case the child has mapped the shared info
but the parent doesn't have it mapped. In which case you want to
remove the mapping when doing a fork reset.

> +{
> +/* let's make sure shared_info is mapped to the same gfn */
> +gfn_t new_gfn = _gfn(get_gpfn_from_mfn(mfn_x(new_mfn)));
> +
> +if ( !gfn_eq(new_gfn, INVALID_GFN) && !gfn_eq(old_gfn, new_gfn) )

You can then remove the last condition in this if.

> +{
> +/* if shared info is mapped to a different gfn just remove it */
> +rc = p2m->set_entry(p2m, new_gfn, INVALID_MFN, PAGE_ORDER_4K,
> +p2m_invalid, p2m->default_access, -1);
> +if ( rc )
> +return rc;
> +
> +new_gfn = INVALID_GFN;
> +}
> +
> +if ( gfn_eq(new_gfn, INVALID_GFN) )

And this should then be if ( !gfn_eq(old_gfn, INVALID_GFN) ) ...

Thanks, Roger.

Re: [PATCH 01/12] libxc/save: Shrink code volume where possible

2020-04-28 Thread Wei Liu

On Mon, Apr 27, 2020 at 09:00:30PM +0100, Andrew Cooper wrote:
> On 27/04/2020 20:55, Wei Liu wrote:
> > On Mon, Apr 27, 2020 at 06:19:37PM +0100, Ian Jackson wrote:
> >> Andrew Cooper writes ("Re: [PATCH 01/12] libxc/save: Shrink code volume 
> >> where possible"):
> >>> On 14/01/2020 16:48, Ian Jackson wrote:
>  Andrew Cooper writes ("[PATCH 01/12] libxc/save: Shrink code volume 
>  where possible"):
> > A property of how the error handling (0 on success, nonzero otherwise)
> > allows these calls to be chained together with the ternary operatior.
>  I'm quite surprised to find a suggestion like this coming from you in
>  particular.
> >>> What probably is relevant is that ?: is a common construct in the
> >>> hypervisor, which I suppose does colour my expectation of people knowing
> >>> exactly what it means and how it behaves.
> >> I expect other C programmers to know what ?: does, too.  But I think
> >> using it to implement the error monad is quite unusual.  I asked
> >> around a bit and my feeling is still that this isn't an improvement.
> >>
>  Or just to permit
> rc = write_one_vcpu_basic(ctx, i);if (rc) goto error;
>  (ie on a single line).
> >>> OTOH, it should come as no surprise that I'd rather drop this patch
> >>> entirely than go with these alternatives, both of which detract from
> >>> code clarity. The former for hiding control flow, and the latter for
> >>> being atypical layout which unnecessary cognitive load to follow.
> >> I think, then, that it would be best to drop this patch, unless Wei
> >> (or someone else) disagrees with me.
> > I don't feel strongly either way.
> 
> I'm confused... I dropped this 3 and a half months ago, because it was
> blindingly obvious it was going nowhere.
> 
> This is the v1 series which was totally superseded by the v2 series also
> posted in January.

OK. I saw Ian's reply on 27th and thought it was still in progress.

Wei.

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread Julien Grall


Hi Jan,

On 28/04/2020 09:39, Jan Beulich wrote:

On 28.04.2020 10:24, George Dunlap wrote:

On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
On 27.04.2020 18:25, George Dunlap wrote:

If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But if he 
insists on some kind of testing for it to be outside of CONFIG_EXPERT, then 
again, the people who want it to be security supported should be the ones who 
do the work to make it happen.


I don't understand this part, I'm afraid: Without a config option,
the code is going to be security supported as long as it doesn't
get marked otherwise (experimental or what not). With an option
depending on EXPERT, what would become security unsupported is the
non-default (i.e. disabled) setting. There's not a whole lot to
test there, it's merely a formal consequence of our general rules.
(Of course, over time dependencies of other code may develop on
the information being available e.g. to Dom0 userland. Just like
there's Linux userland code assuming the kernel config is
available in certain ways [I don't necessarily mean the equivalent
of hypfs here], to then use it in what I'd call abusive ways in at
least some cases.)


Here’s an argument you might make:

“As a member of the security team, I don’t want to be on the hook for issuing 
XSAs for code which isn’t at least smoke-tested.  Therefore, I oppose any patch 
adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* there is a concrete plan 
for getting regular testing for CONFIG_HYPFS=n.”

I’m not saying that’s an argument you *should* make.  But personally I don’t 
have a strong argument against such an argument. So, it seems to me, if you did 
make it, you have a reasonable chance of carrying your point.

Now consider this hypothetical universe where you made that argument and nobody 
opposed it.  In order to get a particular feature (CONFIG_HYPFS=n security 
supported), there is extra work that needs to be done (getting CONFIG_HYPFS=n 
tested regularly).  My point was, the expectation should be that the extra work 
will be done by the people who want or benefit from the feature; the series 
shouldn’t be blocked until Juergen implements CONFIG_HYPFS=n testing (since he 
doesn’t personally have a stake in that feature).

Now obviously, doing work to help someone else out in the community is of 
course a good thing to do; it builds goodwill, uses our aggregate resources 
more efficiently, and makes our community more enjoyable to work with.  But the 
goodwill primarily comes from the fact that it was done as a voluntary choice, 
not as a requirement.

Juergen was balking at having to do what he saw as extra work to implement 
CONFIG_HYPFS.  I wanted to make it clear that even though I see value in having 
CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to (although 
it would certainly be appreciated if he did).  And this paragraph was extending 
the same principle into the hypothetical universe where someone insisted that 
CONFIG_HYPFS=n had to be tested before being security supported.

Hope that makes sense. :-)


Yes, it does, thanks for the clarification. I can see what you describe
as a valid perspective to take, but really in my request to Jürgen I
took another: Now that we have Kconfig, additions of larger bodies of
code (possibly also just in terms of binary size) should imo generally
be questioned whether they want/need to be built for everyone. I.e. it
is not to be left to people being worried about binary sizes to arrange
for things to not be built, but for people contributing new but not
entirely essential code to consider making it option from the very
beginning.


I like the idea to have a more configurable Xen but this also comes at 
the expense of the testing/support.


At the moment, we are getting around the problem by gating the new 
config options with CONFIG_EXPERT. I have stoppped counting the number 
of time I sweared because my config got rewritten when using 'make 
clean' or explain to someone else how to use it.


As it stands, CONFIG_EXPERT is unusable and most likely anything behind 
it will rot quite quickly. So if we want to add more stuff behind it, 
then I would suggest to make it more accessible so any developper can 
experiment with it.


Going forward, I would expect the embedded folks to want more part of 
Xen configurable. Requesting them to use CONFIG_EXPERT may be an issue 
as this means we would not security support them. At the same time, I 
understand that exposing a CONFIG increase the testing matrix. How about 
declaring we are supporting/testing a given set of .config? On Arm it 
would be defconfig and tiny.


Cheers,

--
Julien Grall

Re: [PATCH v4] docs/designs: re-work the xenstore migration document...

2020-04-28 Thread Julien Grall


Hi Paul,

On 27/04/2020 16:50, Paul Durrant wrote:

diff --git a/docs/designs/xenstore-migration.md 
b/docs/designs/xenstore-migration.md
index 6ab351e8fe..51d8b85171 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -3,254 +3,400 @@
  ## Background
  
  The design for *Non-Cooperative Migration of Guests*[1] explains that extra

-save records are required in the migrations stream to allow a guest running
-PV drivers to be migrated without its co-operation. Moreover the save
-records must include details of registered xenstore watches as well as
-content; information that cannot currently be recovered from `xenstored`,
-and hence some extension to the xenstore protocol[2] will also be required.
-
-The *libxenlight Domain Image Format* specification[3] already defines a
-record type `EMULATOR_XENSTORE_DATA` but this is not suitable for
-transferring xenstore data pertaining to the domain directly as it is
-specified such that keys are relative to the path
-`/local/domain/$dm_domid/device-model/$domid`. Thus it is necessary to
-define at least one new save record type.
+save records are required in the migrations stream to allow a guest running PV
+drivers to be migrated without its co-operation. Moreover the save records must
+include details of registered xenstore watches as well ascontent; information


s/ascontent/as content/

[...]


+### END
+
+The end record marks the end of the image, and is the final record
+in the stream.
  
  ```

-0   1   2   3 octet
-+---+---+---+---+
-| NODE_DATA |
-+---+
-| path length   |
-+---+
-| path data |
-...
-| pad (0 to 3 octets)   |
-+---+
-| perm count (N)|
-+---+
-| perm0 |
-+---+
-...
-+---+
-| permN |
-+---+
-| value length  |
-+---+
-| value data|
-...
-| pad (0 to 3 octets)   |
-+---+
+0   1   2   3   4   5   6   7octet
++---+---+---+---+---+---+---+---+
  ```
  
-where perm0..N are formatted as follows:
  
+The end record contains no fields; its body length is 0.

+
+\pagebreak
+
+### GLOBAL_DATA
+
+This record is only relevant for live update. It contains details of global
+xenstored state that needs to be restored.


Reading this paragraph, it sounds like GLOBAL_DATA should always be 
present in the liveupdate stream. However ...


[...]


-path length and value length are specified in octets (excluding the NUL
-terminator of the path). perm should be one of the ASCII values `w`, `r`,
-`b` or `n` as described in [2]. All pad values should be 0.
-All paths should be absolute (i.e. start with `/`) and as described in
-[2].
+| Field  | Description  |
+||--|
+| `rw-socket-fd` | The file descriptor of the socket accepting  |
+|| read-write connections   |
+||  |
+| `ro-socket-fd` | The file descriptor of the socket accepting  |
+|| read-only connections|
  
+xenstored will resume in the original process context. Hence `rw-socket-fd` and

+`ro-socket-fd` simply specify the file descriptors of the sockets.


... sockets may not always be available in XenStored. So should we 
reserve a value for "not in-use socket"?


[...]


-wpath length and token length are specified in octets (excluding the NUL
-terminator). The wpath should be as described for the `WATCH` operation in
-[2]. The token is an arbitrary string of octets not containing any NUL
-values.
  
+| Field   | Description |

+|-|-|
+| `conn-id`   | A non-zero number used to identify this |
+| | connection in subsequent connection-specific|
+| | records |
+| | |
+| `conn-type` | 0x: shared ring |
+| | 0x0001: socket  |


I would write "0x0002 - 0x: reserved for future use" to match the 
rest of the specification.


[...]


-where tx_id is the non-zero identifier values of an open transaction.
  
+| Field | Description   |

+|---|---|
+| `domid`   | The domain-id that owns the shared page   |
+|   |

Re: [PATCH 1/2] Fix undefined behaviour

2020-04-28 Thread Peter Maydell

On Tue, 28 Apr 2020 at 08:50, Grzegorz Uriasz  wrote:
>
> Signed-off-by: Grzegorz Uriasz 
> ---
>  hw/xen/xen_pt_load_rom.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)

The subject doesn't match the patch contents and there is
no long-form part of the commit message explaining why...

thanks
-- PMM

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread Jan Beulich

On 28.04.2020 10:24, George Dunlap wrote:
>> On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
>> On 27.04.2020 18:25, George Dunlap wrote:
>>> If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But 
>>> if he insists on some kind of testing for it to be outside of 
>>> CONFIG_EXPERT, then again, the people who want it to be security supported 
>>> should be the ones who do the work to make it happen.
>>
>> I don't understand this part, I'm afraid: Without a config option,
>> the code is going to be security supported as long as it doesn't
>> get marked otherwise (experimental or what not). With an option
>> depending on EXPERT, what would become security unsupported is the
>> non-default (i.e. disabled) setting. There's not a whole lot to
>> test there, it's merely a formal consequence of our general rules.
>> (Of course, over time dependencies of other code may develop on
>> the information being available e.g. to Dom0 userland. Just like
>> there's Linux userland code assuming the kernel config is
>> available in certain ways [I don't necessarily mean the equivalent
>> of hypfs here], to then use it in what I'd call abusive ways in at
>> least some cases.)
> 
> Here’s an argument you might make:
> 
> “As a member of the security team, I don’t want to be on the hook for issuing 
> XSAs for code which isn’t at least smoke-tested.  Therefore, I oppose any 
> patch adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* there is a 
> concrete plan for getting regular testing for CONFIG_HYPFS=n.”
> 
> I’m not saying that’s an argument you *should* make.  But personally I don’t 
> have a strong argument against such an argument. So, it seems to me, if you 
> did make it, you have a reasonable chance of carrying your point.
> 
> Now consider this hypothetical universe where you made that argument and 
> nobody opposed it.  In order to get a particular feature (CONFIG_HYPFS=n 
> security supported), there is extra work that needs to be done (getting 
> CONFIG_HYPFS=n tested regularly).  My point was, the expectation should be 
> that the extra work will be done by the people who want or benefit from the 
> feature; the series shouldn’t be blocked until Juergen implements 
> CONFIG_HYPFS=n testing (since he doesn’t personally have a stake in that 
> feature).
> 
> Now obviously, doing work to help someone else out in the community is of 
> course a good thing to do; it builds goodwill, uses our aggregate resources 
> more efficiently, and makes our community more enjoyable to work with.  But 
> the goodwill primarily comes from the fact that it was done as a voluntary 
> choice, not as a requirement.
> 
> Juergen was balking at having to do what he saw as extra work to implement 
> CONFIG_HYPFS.  I wanted to make it clear that even though I see value in 
> having CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to 
> (although it would certainly be appreciated if he did).  And this paragraph 
> was extending the same principle into the hypothetical universe where someone 
> insisted that CONFIG_HYPFS=n had to be tested before being security supported.
> 
> Hope that makes sense. :-)

Yes, it does, thanks for the clarification. I can see what you describe
as a valid perspective to take, but really in my request to Jürgen I
took another: Now that we have Kconfig, additions of larger bodies of
code (possibly also just in terms of binary size) should imo generally
be questioned whether they want/need to be built for everyone. I.e. it
is not to be left to people being worried about binary sizes to arrange
for things to not be built, but for people contributing new but not
entirely essential code to consider making it option from the very
beginning.

In particular, seeing which patch we're discussing, I find it far less
helpful to have CONFIG_HYPFS_CONFIG when there's no CONFIG_HYPFS, at
least as long as our .config-s are still tiny compared to Linux'es.

Jan

RE: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Peng Fan

> Subject: Re: [PATCH] xen/swiotlb: correct the check for
> xen_destroy_contiguous_region
> 
> On 28.04.20 09:33, peng@nxp.com wrote:
> > From: Peng Fan 
> >
> > When booting xen on i.MX8QM, met:
> > "
> > [3.602128] Unable to handle kernel paging request at virtual address
> 00272d40
> > [3.610804] Mem abort info:
> > [3.613905]   ESR = 0x9604
> > [3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [3.623211]   SET = 0, FnV = 0
> > [3.626628]   EA = 0, S1PTW = 0
> > [3.630128] Data abort info:
> > [3.633362]   ISV = 0, ISS = 0x0004
> > [3.637630]   CM = 0, WnR = 0
> > [3.640955] [00272d40] user address but active_mm is
> swapper
> > [3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
> > [3.654137] Modules linked in:
> > [3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
> > [3.677302] Workqueue: events deferred_probe_work_func
> > [3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
> > [3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
> > [3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
> > [3.693993] pci_bus :00: root bus resource [bus 00-ff]
> > [3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
> > "
> >
> > In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
> > or range_straddles_page_boundary(phys, size) are true, it will create
> > contiguous region. So when free, we need to free contiguous region use
> > upper check condition.
> 
> No, this will break PV guests on x86.

Could you share more details why alloc and free not matching for the check?

Thanks,
Peng.

> 
> I think there is something wrong with your setup in combination with the ARM
> xen_create_contiguous_region() implementation.
> 
> Stefano?
> 
> 
> Juergen
> 
> >
> > Signed-off-by: Peng Fan 
> > ---
> >   drivers/xen/swiotlb-xen.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index b6d27762c6f8..ab96e468584f 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,
> size_t size, void *vaddr,
> > /* Convert the size to actually allocated. */
> > size = 1UL << (order + XEN_PAGE_SHIFT);
> >
> > -   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> > -range_straddles_page_boundary(phys, size)) &&
> > +   if (((dev_addr + size - 1 > dma_mask) ||
> > +   range_straddles_page_boundary(phys, size)) &&
> > TestClearPageXenRemapped(virt_to_page(vaddr)))
> > xen_destroy_contiguous_region(phys, order);
> >
> >

Re: [PATCH v7 08/12] xen: add /buildinfo/config entry to hypervisor filesystem

2020-04-28 Thread George Dunlap

> On Apr 28, 2020, at 8:20 AM, Jan Beulich  wrote:
> 
> On 27.04.2020 18:25, George Dunlap wrote:
>> If Jan is OK with it simply being outside CONFIG_EXPERT, then great.  But if 
>> he insists on some kind of testing for it to be outside of CONFIG_EXPERT, 
>> then again, the people who want it to be security supported should be the 
>> ones who do the work to make it happen.
> 
> I don't understand this part, I'm afraid: Without a config option,
> the code is going to be security supported as long as it doesn't
> get marked otherwise (experimental or what not). With an option
> depending on EXPERT, what would become security unsupported is the
> non-default (i.e. disabled) setting. There's not a whole lot to
> test there, it's merely a formal consequence of our general rules.
> (Of course, over time dependencies of other code may develop on
> the information being available e.g. to Dom0 userland. Just like
> there's Linux userland code assuming the kernel config is
> available in certain ways [I don't necessarily mean the equivalent
> of hypfs here], to then use it in what I'd call abusive ways in at
> least some cases.)

Here’s an argument you might make:

“As a member of the security team, I don’t want to be on the hook for issuing 
XSAs for code which isn’t at least smoke-tested.  Therefore, I oppose any patch 
adding CONFIG_HYPFS outside of CONFIG_EXPERT, *unless* there is a concrete plan 
for getting regular testing for CONFIG_HYPFS=n.”

I’m not saying that’s an argument you *should* make.  But personally I don’t 
have a strong argument against such an argument. So, it seems to me, if you did 
make it, you have a reasonable chance of carrying your point.

Now consider this hypothetical universe where you made that argument and nobody 
opposed it.  In order to get a particular feature (CONFIG_HYPFS=n security 
supported), there is extra work that needs to be done (getting CONFIG_HYPFS=n 
tested regularly).  My point was, the expectation should be that the extra work 
will be done by the people who want or benefit from the feature; the series 
shouldn’t be blocked until Juergen implements CONFIG_HYPFS=n testing (since he 
doesn’t personally have a stake in that feature).

Now obviously, doing work to help someone else out in the community is of 
course a good thing to do; it builds goodwill, uses our aggregate resources 
more efficiently, and makes our community more enjoyable to work with.  But the 
goodwill primarily comes from the fact that it was done as a voluntary choice, 
not as a requirement.

Juergen was balking at having to do what he saw as extra work to implement 
CONFIG_HYPFS.  I wanted to make it clear that even though I see value in having 
CONFIG_HYPFS, *he* doesn’t have to do the work if he doesn’t want to (although 
it would certainly be appreciated if he did).  And this paragraph was extending 
the same principle into the hypothetical universe where someone insisted that 
CONFIG_HYPFS=n had to be tested before being security supported.

Hope that makes sense. :-)

 -George

Re: [PATCH] x86/ioemul: Rewrite stub generation

2020-04-28 Thread Jan Beulich

On 28.04.2020 00:20, Andrew Cooper wrote:
> On 27/04/2020 16:28, Jan Beulich wrote:
>> On 27.04.2020 14:20, Andrew Cooper wrote:
>>> The logic is completely undocumented and almost impossible to follow.  It
>>> actually uses return oriented programming.  Rewrite it to conform to more
>>> normal call mechanics, and leave a big comment explaining thing.  As well as
>>> the code being easier to follow, it will execute faster as it isn't fighting
>>> the branch predictor.
>>>
>>> Move the ioemul_handle_quirk() function pointer from traps.c to
>>> ioport_emulate.c.  There is no reason for it to be in neither of the two
>>> translation units which use it.  Alter the behaviour to return the number of
>>> bytes written into the stub.
>>>
>>> Access the addresses of the host/guest helpers with extern const char 
>>> arrays.
>>> Nothing good will come of C thinking they are regular functions.
>> I agree with the C aspect, but imo the assembly routines should,
>> with the changes you make, be marked as being ordinary functions.
> 
> Despite the changes, they are still very much not ordinary functions,
> and cannot be used by C.
> 
> I have no objection to marking them as STT_FUNCTION (as this doesn't
> mean C function), but...
> 
>> A reasonable linker would then warn about the C file wanting an
>> STT_OBJECT while the assembly file provides an STT_FUNC. I'd
>> therefore prefer, along with marking the functions as such, to
>> have them also declared as functions in C.
> 
> ... there is literally nothing safe which C can do with them other than
> manipulate their address.
> 
> Writing it like this is specifically prevents something from compiling
> which will explode at runtime, is someone is naive enough to try using
> the function from C.

Besides being certain that such an attempt, if it made it into a
submitted patch in the first place, would be caught by review, I
don't see you addressing my main counter argument. Preventing a
declared function to be called can be had by other means, e.g.
__attribute__((__warning__())).

>>> @@ -19,18 +22,16 @@ static bool ioemul_handle_proliant_quirk(
>>>  0xa8, 0x80, /*test $0x80, %al */
>>>  0x75, 0xfb, /*jnz 1b  */
>>>  0x9d,   /*popf*/
>>> -0xc3,   /*ret */
>>>  };
>>>  uint16_t port = regs->dx;
>>>  uint8_t value = regs->al;
>>>  
>>>  if ( (opcode != 0xee) || (port != 0xcd4) || !(value & 0x80) )
>>> -return false;
>>> +return 0;
>>>  
>>>  memcpy(io_emul_stub, stub, sizeof(stub));
>>> -BUILD_BUG_ON(IOEMUL_QUIRK_STUB_BYTES < sizeof(stub));
>> So you treat a build failure for a runtime crash.
> 
> I presume you mean s/treat/trade/ here, and the answer is no, not really.
> 
> There is nothing which actually forced a connection between the build
> time checks and runtime behaviour, so it was only a facade of safety,
> not real safety.

I'm not following, I'm afraid: The above together with

BUILD_BUG_ON(STUB_BUF_SIZE / 2 < MAX(9, /* Default emul stub */
 5 + IOEMUL_QUIRK_STUB_BYTES));

(where the literal numbers live next to what they describe)
did very well provide some level of guarding.

>>  I can see the
>> advantages of the new approach, but the original got away with
>> less stub space.
> 
> Stub space doesn't matter, so long as it fits.  In particular, ...
> 
>> If our L1_CACHE_SHIFT wasn't hard coded to 7
>> just to cover some unusual CPUs, your new approach would, if I
>> got the counting and calculations right, not work (with a value
>> resulting in a 64-byte cache line size).
> 
> ... the SYSCALL stubs use 64 bytes so Xen cannot function with a shift
> less than 7.

Oh, my fault - I read the STUB_BUF_SHIFT expression as min() when
it really is max(). So yes, we can rely on there being 64 bytes.
(Everything further down therefore becomes moot afaict.)

>> Introducing a Kconfig
>> option for this should imo not come with a need to re-work the
>> logic here again. Therefore I'd like us to think about a way
>> to make the space needed not exceed 32 bytes.
> 
> And why would we ever want an option like that?  (I know how you're
> going to answer this, so I'm going to pre-emptively point out that there
> are hundreds of kilobytes of easier-to-shrink per-cpu data structures
> than this one).

Not sure what per-CPU data structures you talk about. I wasn't
thinking of space savings in particular, but rather about getting
our setting in sync with actual hardware. Its value is, afaics,
used in a far more relevant way by xmalloc() and friends.

Jan

> Honestly, this suggestion is a total waste of time and effort.  It is an
> enormous amount of complexity to micro-optimise a problem which doesn't
> exist in the first place.
> 
> The stubs are already 128 bytes per CPU and cannot be made smaller. 
> Attempting to turn this particular stub into <32 has no benefit (the
> stubs don't actually get smaller), and maj

RE: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Peng Fan

> Subject: Re: [PATCH] xen/swiotlb: correct the check for
> xen_destroy_contiguous_region
> 
> On Tue, Apr 28, 2020 at 03:33:45PM +0800, peng@nxp.com wrote:
> >
> > In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
> > or range_straddles_page_boundary(phys, size) are true, it will create
> > contiguous region. So when free, we need to free contiguous region use
> > upper check condition.
> >
> > Signed-off-by: Peng Fan 
> > ---
> >  drivers/xen/swiotlb-xen.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> > index b6d27762c6f8..ab96e468584f 100644
> > --- a/drivers/xen/swiotlb-xen.c
> > +++ b/drivers/xen/swiotlb-xen.c
> > @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,
> size_t size, void *vaddr,
> > /* Convert the size to actually allocated. */
> > size = 1UL << (order + XEN_PAGE_SHIFT);
> >
> > -   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> > -range_straddles_page_boundary(phys, size)) &&
> > +   if (((dev_addr + size - 1 > dma_mask) ||
> > +   range_straddles_page_boundary(phys, size)) &&
> > TestClearPageXenRemapped(virt_to_page(vaddr)))
> 
> No need for the inner braces.
> 
> But more importantly please factor our a helper that can be used by alloc and
> free to make sure that they always stay in sync.  Something

Thanks for reviewing. I'll take your suggestion in v2. Before that,
I would wait to see if there are more comments in this patch,
because there are several history commits touching this place.

Thanks,
Peng.

> like:
> 
> static inline bool xen_swiotlb_need_contiguous_region(struct device *dev,
>   phys_addr_t phys, size_t size)
> {
> 
>   return xen_phys_to_bus(phys) + size - 1 > dev->coherent_dma_mask ||
>   range_straddles_page_boundary(phys, size)) }

RE: [PATCH 1/2] Fix undefined behaviour

2020-04-28 Thread Paul Durrant

> -Original Message-
> From: Grzegorz Uriasz 
> Sent: 28 April 2020 07:29
> To: qemu-de...@nongnu.org
> Cc: Grzegorz Uriasz ; marma...@invisiblethingslab.com; 
> ar...@puzio.waw.pl;
> ja...@bartmin.ski; j.nowa...@student.uw.edu.pl; Stefano Stabellini 
> ; Anthony
> Perard ; Paul Durrant ; 
> xen-devel@lists.xenproject.org
> Subject: [PATCH 1/2] Fix undefined behaviour
> 
> Signed-off-by: Grzegorz Uriasz 

I think we need more of a commit comment for both this and patch #2 to explain 
why you are making the changes.

  Paul

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Jürgen Groß


On 28.04.20 09:33, peng@nxp.com wrote:

From: Peng Fan 

When booting xen on i.MX8QM, met:
"
[3.602128] Unable to handle kernel paging request at virtual address 
00272d40
[3.610804] Mem abort info:
[3.613905]   ESR = 0x9604
[3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
[3.623211]   SET = 0, FnV = 0
[3.626628]   EA = 0, S1PTW = 0
[3.630128] Data abort info:
[3.633362]   ISV = 0, ISS = 0x0004
[3.637630]   CM = 0, WnR = 0
[3.640955] [00272d40] user address but active_mm is swapper
[3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
[3.654137] Modules linked in:
[3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
[3.677302] Workqueue: events deferred_probe_work_func
[3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
[3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
[3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
[3.693993] pci_bus :00: root bus resource [bus 00-ff]
[3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
"

In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask) or
range_straddles_page_boundary(phys, size) are true, it will
create contiguous region. So when free, we need to free contiguous
region use upper check condition.


No, this will break PV guests on x86.

I think there is something wrong with your setup in combination with
the ARM xen_create_contiguous_region() implementation.

Stefano?


Juergen



Signed-off-by: Peng Fan 
---
  drivers/xen/swiotlb-xen.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b6d27762c6f8..ab96e468584f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);
  
-	if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||

-range_straddles_page_boundary(phys, size)) &&
+   if (((dev_addr + size - 1 > dma_mask) ||
+   range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(virt_to_page(vaddr)))
xen_destroy_contiguous_region(phys, order);

Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread Christoph Hellwig

On Tue, Apr 28, 2020 at 03:33:45PM +0800, peng@nxp.com wrote:
> 
> In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask) or
> range_straddles_page_boundary(phys, size) are true, it will
> create contiguous region. So when free, we need to free contiguous
> region use upper check condition.
> 
> Signed-off-by: Peng Fan 
> ---
>  drivers/xen/swiotlb-xen.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index b6d27762c6f8..ab96e468584f 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
> size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>  
> - if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
> -  range_straddles_page_boundary(phys, size)) &&
> + if (((dev_addr + size - 1 > dma_mask) ||
> + range_straddles_page_boundary(phys, size)) &&
>   TestClearPageXenRemapped(virt_to_page(vaddr)))

No need for the inner braces.

But more importantly please factor our a helper that can be used by
alloc and free to make sure that they always stay in sync.  Something
like:

static inline bool xen_swiotlb_need_contiguous_region(struct device *dev,
phys_addr_t phys, size_t size)
{

return xen_phys_to_bus(phys) + size - 1 > dev->coherent_dma_mask ||
range_straddles_page_boundary(phys, size))
}

Ping: [PATCH v3 0/5] (remaining) XSA-292 follow-up

2020-04-28 Thread Jan Beulich

Andrew,

On 25.09.2019 17:19, Jan Beulich wrote:
> 1: x86: suppress XPTI-related TLB flushes when possible
> 2: x86/mm: honor opt_pcid also for 32-bit PV domains

I realize these two weren't entirely uncontroversial. May I
please ask that you get back to them, more than half a year
after their posting? For patch 1 I don't think I got
anything back yet on v2 / v3. For patch 2 see
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg01721.html

> 3: x86/HVM: move NOFLUSH handling out of hvm_set_cr3()

See
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg01689.html
https://lists.xenproject.org/archives/html/xen-devel/2019-09/msg01737.html

> 4: x86/HVM: refuse CR3 loads with reserved (upper) bits set
> 5: x86/HVM: cosmetics to hvm_set_cr3()

These two have your ack, but depend on at least patch 3 afaict.

Jan

[PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

2020-04-28 Thread peng . fan

From: Peng Fan 

When booting xen on i.MX8QM, met:
"
[3.602128] Unable to handle kernel paging request at virtual address 
00272d40
[3.610804] Mem abort info:
[3.613905]   ESR = 0x9604
[3.617332]   EC = 0x25: DABT (current EL), IL = 32 bits
[3.623211]   SET = 0, FnV = 0
[3.626628]   EA = 0, S1PTW = 0
[3.630128] Data abort info:
[3.633362]   ISV = 0, ISS = 0x0004
[3.637630]   CM = 0, WnR = 0
[3.640955] [00272d40] user address but active_mm is swapper
[3.647983] Internal error: Oops: 9604 [#1] PREEMPT SMP
[3.654137] Modules linked in:
[3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
[3.677302] Workqueue: events deferred_probe_work_func
[3.684253] imx6q-pcie 5f00.pcie: PCI host bridge to bus :00
[3.688297] pstate: 6005 (nZCv daif -PAN -UAO)
[3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
[3.693993] pci_bus :00: root bus resource [bus 00-ff]
[3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
"

In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask) or
range_straddles_page_boundary(phys, size) are true, it will
create contiguous region. So when free, we need to free contiguous
region use upper check condition.

Signed-off-by: Peng Fan 
---
 drivers/xen/swiotlb-xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index b6d27762c6f8..ab96e468584f 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
size, void *vaddr,
/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);
 
-   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
-range_straddles_page_boundary(phys, size)) &&
+   if (((dev_addr + size - 1 > dma_mask) ||
+   range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(virt_to_page(vaddr)))
xen_destroy_contiguous_region(phys, order);
 
-- 
2.16.4

1 2 >

1 - 100 of 102 matches

Mail list logo