date:20150721

[Xen-devel] [PATCH v2 1/3] xen-blkfront: introduce blkfront_gather_backend_features()

2015-07-21 Thread Bob Liu

There is a bug when migrating from a !feature-persistent host to a
feature-persistent host, because domU still thinks the new host/backend doesn't
support persistent grants.
Dmesg looks like:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839

The fix is to recheck feature-persistent of new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469

As Roger suggested, we can split the part of blkfront_connect that checks for
optional features (persistent grants, indirect descriptors and
flush/barrier features) into a separate function and call it from both
blkfront_connect and blkif_recover.

Signed-off-by: Bob Liu 
---
Changes in v2:
 * Also put blkfront_setup_indirect() inside
---
 drivers/block/xen-blkfront.c |  122 +++---
 1 file changed, 68 insertions(+), 54 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5b45ee5..3b193cf 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -181,6 +181,7 @@ static DEFINE_SPINLOCK(minor_lock);
 	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
+static int blkfront_gather_backend_features(struct blkfront_info *info);
 
 static int get_id_from_freelist(struct blkfront_info *info)
 {
@@ -1514,7 +1515,7 @@ static int blkif_recover(struct blkfront_info *info)
 	info->shadow_free = info->ring.req_prod_pvt;
 	info->shadow[BLK_RING_SIZE(info)-1].req.u.rw.id = 0x0fff;
 
-	rc = blkfront_setup_indirect(info);
+	rc = blkfront_gather_backend_features(info);
 	if (rc) {
 		kfree(copy);
 		return rc;
@@ -1694,20 +1695,13 @@ static void blkfront_setup_discard(struct blkfront_info *info)
 
 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-	unsigned int indirect_segments, segs;
+	unsigned int segs;
 	int err, i;
 
-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-max-indirect-segments", "%u", &indirect_segments,
-			    NULL);
-	if (err) {
-		info->max_indirect_segments = 0;
+	if (info->max_indirect_segments == 0)
 		segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
-	} else {
-		info->max_indirect_segments = min(indirect_segments,
-						  xen_blkif_max_segments);
+	else
 		segs = info->max_indirect_segments;
-	}
 
 	err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE(info));
 	if (err)
@@ -1771,6 +1765,68 @@ out_of_memory:
 }
 
 /*
+ * Gather all backend feature-*
+ */
+static int blkfront_gather_backend_features(struct blkfront_info *info)
+{
+	int err;
+	int barrier, flush, discard, persistent;
+	unsigned int indirect_segments;
+
+	info->feature_flush = 0;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-barrier", "%d", &barrier,
+			    NULL);
+
+	/*
+	 * If there's no "feature-barrier" defined, then it means
+	 * we're dealing with a very old backend which writes
+	 * synchronously; nothing to do.
+	 *
+	 * If there are barriers, then we use flush.
+	 */
+	if (!err && barrier)
+		info->feature_flush = REQ_FLUSH | REQ_FUA;
+	/*
+	 * And if there is "feature-flush-cache" use that above
+	 * barriers.
+	 */
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-flush-cache", "%d", &flush,
+			    NULL);
+
+	if (!err && flush)
+		info->feature_flush = REQ_FLUSH;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-discard", "%d", &discard,
+			    NULL);
+
+	if (!err && discard)
+		blkfront_setup_discard(info);
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-persistent", "%u", &persistent,
+			    NULL);
+	if (err)
+		info->feature_persistent = 0;
+	else
+		info->feature_persistent = persistent;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-max-indirect-segments", "%u", &indirect_segments,
+			    NULL);
+	if (err)
+		info->max_indirect_segments = 0;
+	else
+		info->max_indirect_segments = min(indirect_segments,
+						  xen_blkif_max_segments);
+
+	return blkfront_setup_indirect(info);
+}
+
+/*
  * Invoked when the backend is finally 'ready' (and has told produced
  * the details about the physical devi

[Xen-devel] [PATCH v2] xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

2015-07-21 Thread Bob Liu

The BUG_ON() in purge_persistent_gnt() will be triggered when the previous purge
work hasn't finished.
There is a work_pending() check before this BUG_ON(), but it doesn't account for
work that is still running.
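
For illustration only (not part of the patch): a minimal user-space sketch of why a pending-only check misses a handler that is still executing. The struct and all model_* names are hypothetical stand-ins, not the kernel's struct work_struct or workqueue API; the sketch only assumes that an item stops being "pending" once a worker starts running it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a work item's state (not struct work_struct). */
enum {
	MODEL_BUSY_PENDING = 1 << 0,	/* queued, handler not started yet */
	MODEL_BUSY_RUNNING = 1 << 1,	/* handler currently executing */
};

struct model_work {
	bool pending;
	bool running;
};

/* Queue the item: it becomes pending. */
static void model_queue(struct model_work *w)
{
	w->pending = true;
}

/* A worker picks the item up: pending is cleared before the handler runs. */
static void model_start(struct model_work *w)
{
	assert(w->pending);
	w->pending = false;
	w->running = true;
}

/* The handler returns. */
static void model_finish(struct model_work *w)
{
	w->running = false;
}

/* Pending-only test, in the spirit of work_pending(). */
static bool model_work_pending(const struct model_work *w)
{
	return w->pending;
}

/* Pending-or-running test, in the spirit of work_busy(). */
static unsigned int model_work_busy(const struct model_work *w)
{
	return (w->pending ? MODEL_BUSY_PENDING : 0) |
	       (w->running ? MODEL_BUSY_RUNNING : 0);
}
```

In this model, between model_start() and model_finish() the pending-only check reports false while the busy check is non-zero, which is the window the BUG_ON() could be hit in.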

Signed-off-by: Bob Liu 
---
Change in v2:
 * Replace with work_busy()
---
 drivers/block/xen-blkback/blkback.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index ced9677..954c002 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -369,8 +369,8 @@ static void purge_persistent_gnt(struct xen_blkif *blkif)
 		return;
 	}
 
-	if (work_pending(&blkif->persistent_purge_work)) {
-		pr_alert_ratelimited("Scheduled work from previous purge is still pending, cannot purge list\n");
+	if (work_busy(&blkif->persistent_purge_work)) {
+		pr_alert_ratelimited("Scheduled work from previous purge is still busy, cannot purge list\n");
 		return;
 	}
 
-- 
1.7.10.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [PATCH v2 2/3] xen-blkfront: don't add indirect pages to list when !feature_persistent

2015-07-21 Thread Bob Liu

We should consider info->feature_persistent when adding an indirect page to the
list info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.

Signed-off-by: Bob Liu 
---
 drivers/block/xen-blkfront.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 3b193cf..5dd591d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1125,8 +1125,10 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
-				list_add(&indirect_page->lru, &info->indirect_pages);
+				if (!info->feature_persistent) {
+					indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+					list_add(&indirect_page->lru, &info->indirect_pages);
+				}
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
 			}
-- 
1.7.10.4



Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-21 Thread Manish Jaggi




On Tuesday 14 July 2015 11:31 PM, Stefano Stabellini wrote:

On Tue, 14 Jul 2015, Julien Grall wrote:

Hi Stefano,

On 14/07/2015 18:46, Stefano Stabellini wrote:

Linux provides a function (pci_for_each_dma_alias) which will return a
requester ID for a given PCI device. It appears that the BDF (the 's' of sBDF
is only internal to Linux and not part of the hardware) is equal to the
requester ID on your platform but we can't assume it for anyone else.

The PCI Express Base Specification states that the requester ID is "The
combination of a Requester's Bus Number, Device Number, and Function
Number that uniquely identifies the Requester."

I think it is safe to assume BDF = requester ID on all platforms.

With the catch that in case of ARI devices
(http://pcisig.com/sites/default/files/specification_documents/ECN-alt-rid-interpretation-070604.pdf),
BDF is actually BF because the device number is always 0 and the
function number is 8 bits.
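
To make the two readings of the 16-bit requester ID concrete, here is a small sketch of the bit layout described above. The helper names rid_from_bdf and rid_from_ari are hypothetical and exist only for this illustration; they are not taken from Xen or Linux.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional reading: 8-bit bus, 5-bit device, 3-bit function. */
static uint16_t rid_from_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
	assert(dev < 32 && fn < 8);
	return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* ARI reading: 8-bit bus, 8-bit function, device number implicitly 0. */
static uint16_t rid_from_ari(uint8_t bus, uint8_t fn)
{
	return (uint16_t)((bus << 8) | fn);
}
```

The same 16-bit value can therefore mean bus 1, device 2, function 3 under the conventional reading, or bus 1, function 0x13 under ARI; only the interpretation of the low byte changes.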

And some other problems, such as broken PCI devices...
Both Xen x86 (domain_context_mapping in drivers/passthrough/vtd/iommu.c) and
Linux (pci_for_each_dma_alias) use code more complex than requesterID = BDF.

So I don't think we can use requesterID = BDF in physdev op unless we are
*strictly* sure this is valid.

The spec is quite clear about it, but I guess there might be hardware quirks.
Can we keep this open and, until there is agreement, use
requesterid = bdf for now?

If you are ok, I will update and send Draft 3.




Although, based on the x86 code, Xen should be able to translate the BDF into
the requester ID...

Yes, that is a good point.




Re: [Xen-devel] [PATCH v6 10/15] x86/altp2m: add remaining support routines.

2015-07-21 Thread Sahita, Ravi

>From: George Dunlap [mailto:george.dun...@citrix.com]
>Sent: Tuesday, July 21, 2015 6:19 AM
>
>On 07/21/2015 12:58 AM, Ed White wrote:
>> Add the remaining routines required to support enabling the alternate
>> p2m functionality.
>>
>> Signed-off-by: Ed White 
>>
>> Reviewed-by: Andrew Cooper 
>> ---
>>  xen/arch/x86/hvm/hvm.c   |  58 +-
>>  xen/arch/x86/mm/hap/Makefile |   1 +
>>  xen/arch/x86/mm/hap/altp2m_hap.c |  98 ++
>>  xen/arch/x86/mm/p2m-ept.c|   3 +
>>  xen/arch/x86/mm/p2m.c| 385
>+++
>>  xen/include/asm-x86/hvm/altp2m.h |   4 +
>>  xen/include/asm-x86/p2m.h|  33 
>>  7 files changed, 576 insertions(+), 6 deletions(-)  create mode
>> 100644 xen/arch/x86/mm/hap/altp2m_hap.c
>>
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index
>> f0ab4d4..38cf0c6 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2856,10 +2856,11 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>  mfn_t mfn;
>>  struct vcpu *curr = current;
>>  struct domain *currd = curr->domain;
>> -struct p2m_domain *p2m;
>> +struct p2m_domain *p2m, *hostp2m;
>>  int rc, fall_through = 0, paged = 0;
>>  int sharing_enomem = 0;
>>  vm_event_request_t *req_ptr = NULL;
>> +bool_t ap2m_active = 0;
>>
>>  /* On Nested Virtualization, walk the guest page table.
>>   * If this succeeds, all is fine.
>> @@ -2919,11 +2920,31 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>  goto out;
>>  }
>>
>> -p2m = p2m_get_hostp2m(currd);
>> -mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
>> +ap2m_active = altp2m_active(currd);
>> +
>> +/* Take a lock on the host p2m speculatively, to avoid potential
>> + * locking order problems later and to handle unshare etc.
>> + */
>> +hostp2m = p2m_get_hostp2m(currd);
>> +mfn = get_gfn_type_access(hostp2m, gfn, &p2mt, &p2ma,
>>P2M_ALLOC | (npfec.write_access ? P2M_UNSHARE 
>> : 0),
>>NULL);
>>
>> +if ( ap2m_active )
>> +{
>> +if ( altp2m_hap_nested_page_fault(curr, gpa, gla, npfec, &p2m) == 1 
>> )
>> +{
>> +/* entry was lazily copied from host -- retry */
>> +__put_gfn(hostp2m, gfn);
>> +rc = 1;
>> +goto out;
>> +}
>> +
>> +mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma, 0, NULL);
>> +}
>> +else
>> +p2m = hostp2m;
>> +
>>  /* Check access permissions first, then handle faults */
>>  if ( mfn_x(mfn) != INVALID_MFN )
>>  {
>> @@ -2963,6 +2984,20 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>> unsigned long gla,
>>
>>  if ( violation )
>>  {
>> +/* Should #VE be emulated for this fault? */
>> +if ( p2m_is_altp2m(p2m) && !cpu_has_vmx_virt_exceptions )
>> +{
>> +bool_t sve;
>> +
>> +p2m->get_entry(p2m, gfn, &p2mt, &p2ma, 0, NULL,
>> + &sve);
>> +
>> +if ( !sve && altp2m_vcpu_emulate_ve(curr) )
>> +{
>> +rc = 1;
>> +goto out_put_gfn;
>> +}
>> +}
>> +
>>  if ( p2m_mem_access_check(gpa, gla, npfec, &req_ptr) )
>>  {
>>  fall_through = 1;
>> @@ -2982,7 +3017,9 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>unsigned long gla,
>>   (npfec.write_access &&
>>(p2m_is_discard_write(p2mt) || (p2mt == p2m_mmio_write_dm))) )
>>  {
>> -put_gfn(currd, gfn);
>> +__put_gfn(p2m, gfn);
>> +if ( ap2m_active )
>> +__put_gfn(hostp2m, gfn);
>>
>>  rc = 0;
>>  if ( unlikely(is_pvh_domain(currd)) ) @@ -3011,6 +3048,7 @@
>> int hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>  /* Spurious fault? PoD and log-dirty also take this path. */
>>  if ( p2m_is_ram(p2mt) )
>>  {
>> +rc = 1;
>>  /*
>>   * Page log dirty is always done with order 0. If this mfn resides 
>> in
>>   * a large page, we do not change other pages type within
>> that large @@ -3019,9 +3057,15 @@ int
>hvm_hap_nested_page_fault(paddr_t gpa, unsigned long gla,
>>  if ( npfec.write_access )
>>  {
>>  paging_mark_dirty(currd, mfn_x(mfn));
>> +/* If p2m is really an altp2m, unlock here to avoid lock 
>> ordering
>> + * violation when the change below is propagated from host p2m 
>> */
>> +if ( ap2m_active )
>> +__put_gfn(p2m, gfn);
>>  p2m_change_type_one(currd, gfn, p2m_ram_logdirty,
>> p2m_ram_rw);
>> +__put_gfn(ap2m_active ? hostp2m : p2m, gfn);
>> +
>> +goto out;
>>  }
>> -rc = 1;
>>  goto out_put_gfn;
>>  }
>>
>> @@ -3031,7 +3075,9 @@ int hvm_hap_nested_page

Re: [Xen-devel] [PATCH v6 05/15] x86/altp2m: basic data structures and support routines.

2015-07-21 Thread Sahita, Ravi

>From: George Dunlap [mailto:george.dun...@citrix.com]
>Sent: Tuesday, July 21, 2015 5:47 AM
>
>On 07/21/2015 11:13 AM, Jan Beulich wrote:
> On 21.07.15 at 01:58,  wrote:
>>> Add the basic data structures needed to support alternate p2m's and
>>> the functions to initialise them and tear them down.
>>>
>>> Although Intel hardware can handle 512 EPTP's per hardware thread
>>> concurrently, only 10 per domain are supported in this patch for
>>> performance reasons.
>>>
>>> The iterator in hap_enable() does need to handle 512, so that is now
>>> uint16_t.
>>
>> Sigh - this one is still here (and the respective code unchanged).
>> I'm not going to NAK the patch just because of this, but it really
>> looks like you aren't trying to address comments even when they're
>> trivial (and quick) to carry out and testing of the change comes as a
>> side effect of you needing to test all the other changes as well.
>>
>>> --- a/xen/arch/x86/hvm/Makefile
>>> +++ b/xen/arch/x86/hvm/Makefile
>>> @@ -1,6 +1,7 @@
>>>  subdir-y += svm
>>>  subdir-y += vmx
>>>
>>> +obj-y += altp2m.o
>>
>> Wasn't the outcome of the earlier discussion to put this in x86/mm, or
>> possibly not even introduce a new file?
>
>That was my recommendation [1] in response to v5, but there was no
>response.  The mail seems to have been seen, however, since Andrew's
>Reviewed-by was dropped.  (Perhaps they didn't notice the additional
>comment further down.)
>
>[1] marc.info/?i=<55a53159.4010...@eu.citrix.com>
>

We got these, and hopefully all the straightforward ones will be addressed in the next rev.

>> Overall the situation didn't really change from v5 - the code from a
>> pure functionality pov looks okay, but I don't see myself giving in on
>> all the "minor" issues the patch introduces. If some were left
>> adjusting of which really takes time to or risks breaking the code,
>> I'd (reluctantly) give my ack, but not this way, I'm afraid.
>
>This is a bit puzzling, and somewhat frustrating too -- to have carefully combed
>through past versions, seen the comments and discussion, and then carefully
>combed through this one to find that nearly none of them have been
>addressed, even minor ones.

Sorry, we did the ABI and some minor ones first in v6; the next rev should have
these and patch 10 sorted out.
Thanks to you both for the reviews.

Ravi

>
> -George



Re: [Xen-devel] [PATCH 2/3] xen-blkfront: rm BUG_ON(info->feature_persistent) in blkif_free

2015-07-21 Thread Bob Liu



On 07/22/2015 12:43 PM, Bob Liu wrote:
> 
> On 07/21/2015 05:25 PM, Roger Pau Monné wrote:
>> El 21/07/15 a les 5.30, Bob Liu ha escrit:
>>> This BUG_ON() in blkif_free() is incorrect, because indirect page can be added
>>> to list info->indirect_pages in blkif_completion() no matter feature_persistent
>>> is true or false.
>>>
>>> Signed-off-by: Bob Liu 
>>
>> Acked-by: Roger Pau Monné 
>>
>> This was probably an oversight from when blkif_completion was changed to
>> check for gnttab_query_foreign_access. It should be backported to stable
>> trees.
>>
> 
> Sorry, this patch is buggy and I haven't figured out why.
> 
> general protection fault:  [#1] SMP 
> Modules linked in:
> CPU: 0 PID: 39 Comm: xenwatch Tainted: GW   
> 4.1.0-rc3-3-g718cf80-dirty #67
> Hardware name: Xen HVM domU, BIOS 4.5.0-rc 11/23/2014
> task: 880283f4eca0 ti: 880283fb4000 task.ti: 880283fb4000
> RIP: 0010:[]  [] blkif_free+0x162/0x5a9
> RSP: 0018:880283fb7c48  EFLAGS: 00010087
> RAX: dead00200200 RBX: 88014140 RCX: 
> RDX: dead00100100 RSI: dead00100100 RDI: 88028f418bb8
> RBP: 880283fb7ca8 R08: dead00200200 R09: 0001
> R10: 0001 R11:  R12: 8801414481c8
> R13: dead001000e0 R14: 8801414481b8 R15: ea00
> FS:  () GS:88028f40() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 01582e08 CR3: 00013345b000 CR4: 001406f0
> Stack:
>  880023aa8420 0286 880283fb7cb7 880023aa8420
>  8800363fe240 81862c50 880283fb7ce8 880023aa8440
>  8187 880023aa8400 88014140 88014148
> Call Trace:
>  [] blkfront_remove+0x4c/0xff
>  [] xenbus_dev_remove+0x76/0xb0
>  [] __device_release_driver+0x84/0xf8
>  [] device_release_driver+0x1e/0x2b
>  [] bus_remove_device+0x12c/0x141
>  [] device_del+0x161/0x1e5
>  [] ? xenbus_thread+0x239/0x239
>  [] device_unregister+0x43/0x4f
>  [] xenbus_dev_changed+0x82/0x17f
>  [] ? xenbus_otherend_changed+0xf0/0xff
>  [] frontend_changed+0x43/0x48
>  [] xenwatch_thread+0xf9/0x127
>  [] ? add_wait_queue+0x44/0x44
>  [] kthread+0xcd/0xd5
>  [] ? alloc_pid+0xe8/0x492
>  [] ? kthread_freezable_should_stop+0x48/0x48
>  [] ret_from_fork+0x42/0x70
>  [] ? kthread_freezable_should_stop+0x48/0x48
> Code: 04 00 4c 8b 28 48 8d 78 e0 49 83 ed 20 eb 3d 48 8b 47 28 48 8b 57 20 48 
> be 00 01 10 00 00 00 ad de 49 b8 00 02 20 00 00 00 ad de <48> 89 42 08 48 89 
> 10 48 89 77 20 4c 89 47 28 31 f6 e8 26 7d cf 
> RIP  [] blkif_free+0x162/0x5a9
>  RSP 
> ---[ end trace 5321d7f1ef8414d0 ]---
> 

The right fix should be:

--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1124,8 +1124,10 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
-				list_add(&indirect_page->lru, &info->indirect_pages);
+				if (!info->feature_persistent) {
+					indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+					list_add(&indirect_page->lru, &info->indirect_pages);
+				}
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
 			}


[Xen-devel] [xen-unstable test] 59795: tolerable FAIL

2015-07-21 Thread osstest service owner

flight 59795 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59795/

Failures :-/ but no regressions.

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-ovmf-amd64 9 debian-hvm-install fail in 59772 pass in 59795
 test-amd64-i386-xl-qemut-win7-amd64 9 windows-install fail in 59772 pass in 59795
 test-armhf-armhf-xl-arndale   6 xen-boot  fail pass in 59772
 test-amd64-i386-xl-qemut-debianhvm-amd64 11 guest-saverestore fail pass in 59772
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 6 xen-boot fail pass in 59772

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate fail in 59772 like 59699
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install fail like 59772
 test-armhf-armhf-xl-rtds 11 guest-start  fail   like 59772
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 59772
 test-amd64-i386-xl-qemuu-win7-amd64  9 windows-install fail like 59772

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-arndale  12 migrate-support-check fail in 59772 never pass
 test-amd64-i386-libvirt  12 migrate-support-check  fail   never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail   never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check  fail   never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-armhf-armhf-xl  12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail   never pass
 test-armhf-armhf-libvirt 12 migrate-support-check  fail   never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check  fail   never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail never pass

version targeted for testing:
 xen  21d9b079e53805b68047d60d28cde224d09bbb40
baseline version:
 xen  21d9b079e53805b68047d60d28cde224d09bbb40

Last test of basis    59795  2015-07-21 09:56:30 Z    0 days
Testing same since        0  1970-01-01 00:00:00 Z 16638 days    0 attempts

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-oldkern  pass
 build-i386-oldkern   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  pass
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  fail
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  fail
 test-amd64-amd64-libvirt-xsm pass
 test-armhf-armhf-libvirt-xsm pass
 test-amd64-i386-libvirt-xsm  pass
 test-amd64-amd64-xl-xsm  pass
 test-armhf-armhf-xl-xsm  pass
 test-amd64-i386-xl-xsm   pass
 test-amd64-amd64-xl-pvh-amd

Re: [Xen-devel] [PATCH v2 03/23] x86: zero BSS using stosl instead of stosb

2015-07-21 Thread Jan Beulich

>>> Daniel Kiper  07/21/15 8:23 PM >>>
>On Tue, Jul 21, 2015 at 03:37:48AM -0600, Jan Beulich wrote:
>> >>> On 20.07.15 at 16:28,  wrote:
>>
>> ... because of ??? Nowadays - with X86_FEATURE_ERMS - rep stosb
>> is expected to be faster than rep stosl.
>
>OK, I did not know about that. However, as far as I know this feature
>was introduced in 2012 with Ivy Bridge. So, I suppose that there
>are still a lot of machines in the wild which do not support it.
>Anyway, because this code is not performance critical I am not going
>to insist on one or another solution. However, Andrew suggested that
>thing, so, please agree with him on which direction we should go.
>I will do whatever you agree on.

ISTR having seen a similar patch from him(?), maybe in another area
of code, before (or was it v1 of this one?), which I responded to with the
same as above.

Jan



Re: [Xen-devel] [PATCH v2 02/23] x86/boot: copy only text section from .lnk file to .bin file

2015-07-21 Thread Jan Beulich

>>> Daniel Kiper  07/21/15 7:24 PM >>>
>On Tue, Jul 21, 2015 at 03:35:07AM -0600, Jan Beulich wrote:
>> >>> On 20.07.15 at 16:28,  wrote:
>>
>> Without any explanation (description) I'm inclined to say this makes
>> things more fragile instead of improving the situation. As it looks
>> like we indeed pointlessly copy .eh_frame, but I think this would
>> better be avoided by suppressing its generation (i.e. add
>> -fno-asynchronous-unwind-tables just like Rules.mk has).
>
>Makes sense; however, there is still room for two small optimizations.
>
>First of all, ld generates a .got.plt section and objcopy copies it to the binary file.
>It is not needed because we do not link our stuff here with shared libraries.
>So, we can use the -R objcopy option to remove it (if you do not like -j .text).
>This way we could save 15 bytes (at least on my machines).

.got.plt shouldn't be generated in the first place (and I don't recall having
seen one here - I'll double check once in the office), i.e. perhaps another
missing compiler option or a linker quirk?

>We could also save another 3 bytes (per one xen/arch/x86/boot C input file)
>in final Xen binary in worst case :-))). We just need to generate output assembly
>files as a string of .byte instead of .long.

If you feel it worth your time, go ahead.

Jan



Re: [Xen-devel] [PATCH 2/3] xen-blkfront: rm BUG_ON(info->feature_persistent) in blkif_free

2015-07-21 Thread Bob Liu


On 07/21/2015 05:25 PM, Roger Pau Monné wrote:
> El 21/07/15 a les 5.30, Bob Liu ha escrit:
>> This BUG_ON() in blkif_free() is incorrect, because indirect page can be added
>> to list info->indirect_pages in blkif_completion() no matter feature_persistent
>> is true or false.
>>
>> Signed-off-by: Bob Liu 
> 
> Acked-by: Roger Pau Monné 
> 
> This was probably an oversight from when blkif_completion was changed to
> check for gnttab_query_foreign_access. It should be backported to stable
> trees.
> 

Sorry, this patch is buggy and I haven't figured out why.

general protection fault:  [#1] SMP 
Modules linked in:
CPU: 0 PID: 39 Comm: xenwatch Tainted: GW   
4.1.0-rc3-3-g718cf80-dirty #67
Hardware name: Xen HVM domU, BIOS 4.5.0-rc 11/23/2014
task: 880283f4eca0 ti: 880283fb4000 task.ti: 880283fb4000
RIP: 0010:[]  [] blkif_free+0x162/0x5a9
RSP: 0018:880283fb7c48  EFLAGS: 00010087
RAX: dead00200200 RBX: 88014140 RCX: 
RDX: dead00100100 RSI: dead00100100 RDI: 88028f418bb8
RBP: 880283fb7ca8 R08: dead00200200 R09: 0001
R10: 0001 R11:  R12: 8801414481c8
R13: dead001000e0 R14: 8801414481b8 R15: ea00
FS:  () GS:88028f40() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 01582e08 CR3: 00013345b000 CR4: 001406f0
Stack:
 880023aa8420 0286 880283fb7cb7 880023aa8420
 8800363fe240 81862c50 880283fb7ce8 880023aa8440
 8187 880023aa8400 88014140 88014148
Call Trace:
 [] blkfront_remove+0x4c/0xff
 [] xenbus_dev_remove+0x76/0xb0
 [] __device_release_driver+0x84/0xf8
 [] device_release_driver+0x1e/0x2b
 [] bus_remove_device+0x12c/0x141
 [] device_del+0x161/0x1e5
 [] ? xenbus_thread+0x239/0x239
 [] device_unregister+0x43/0x4f
 [] xenbus_dev_changed+0x82/0x17f
 [] ? xenbus_otherend_changed+0xf0/0xff
 [] frontend_changed+0x43/0x48
 [] xenwatch_thread+0xf9/0x127
 [] ? add_wait_queue+0x44/0x44
 [] kthread+0xcd/0xd5
 [] ? alloc_pid+0xe8/0x492
 [] ? kthread_freezable_should_stop+0x48/0x48
 [] ret_from_fork+0x42/0x70
 [] ? kthread_freezable_should_stop+0x48/0x48
Code: 04 00 4c 8b 28 48 8d 78 e0 49 83 ed 20 eb 3d 48 8b 47 28 48 8b 57 20 48 
be 00 01 10 00 00 00 ad de 49 b8 00 02 20 00 00 00 ad de <48> 89 42 08 48 89 10 
48 89 77 20 4c 89 47 28 31 f6 e8 26 7d cf 
RIP  [] blkif_free+0x162/0x5a9
 RSP 
---[ end trace 5321d7f1ef8414d0 ]---


[Xen-devel] [PATCH] x86/psr: remove invalid cpu_to_socket call

2015-07-21 Thread Chao Peng

cpu_to_socket() can't give the correct socket value in the CPU_PREPARE notifier,
as at that time phys_proc_id has not yet been initialized (the value is
its default 0 in this case), which is incorrect for sockets other than
socket 0.

cos_to_cbm is now pre-allocated in the CPU_PREPARE notifier and then consumed
in the CPU_STARTING notifier.
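
For illustration only (not part of the patch): a user-space sketch of the "allocate early, attach once the socket is known" pattern described above. All model_* names, the fixed socket count and the use of calloc() are assumptions made for the sketch; it also assumes the starting step leaves the spare buffer untouched when the socket has already been set up.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MODEL_NR_SOCKETS 2	/* hypothetical socket count */
#define MODEL_COS_MAX 255	/* hypothetical highest COS index */

struct model_cbm {
	unsigned long long cbm;
	unsigned int ref;
};

struct model_socket_info {
	struct model_cbm *cos_to_cbm;
};

static struct model_socket_info model_info[MODEL_NR_SOCKETS];
static struct model_cbm *model_temp;	/* spare buffer awaiting an owner */

/* "Prepare" step: may allocate, but the CPU's socket is not known yet. */
static int model_cpu_prepare(void)
{
	if (model_temp == NULL) {
		model_temp = calloc(MODEL_COS_MAX + 1, sizeof(*model_temp));
		if (model_temp == NULL)
			return -ENOMEM;
	}
	return 0;
}

/* "Starting" step: the socket is known, so hand the spare buffer over. */
static void model_cpu_starting(unsigned int socket)
{
	struct model_socket_info *info;

	assert(socket < MODEL_NR_SOCKETS);
	info = &model_info[socket];
	if (info->cos_to_cbm != NULL)
		return;		/* socket already set up; keep the spare */

	assert(model_temp != NULL);
	info->cos_to_cbm = model_temp;
	model_temp = NULL;
}
```

The point of the split is that the step which may allocate never needs the socket number, and the step which knows the socket number never allocates.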

Signed-off-by: Chao Peng 
---
 xen/arch/x86/psr.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 861683f..ed59803 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -50,6 +50,8 @@ static unsigned int __read_mostly opt_cos_max = 255;
 static uint64_t rmid_mask;
 static DEFINE_PER_CPU(struct psr_assoc, psr_assoc);
 
+static struct psr_cat_cbm *temp_cos_to_cbm;
+
 static unsigned int get_socket_cpu(unsigned int socket)
 {
 if ( likely(socket < nr_sockets) )
@@ -451,22 +453,15 @@ void psr_domain_free(struct domain *d)
 
 static int cat_cpu_prepare(unsigned int cpu)
 {
-struct psr_cat_socket_info *info;
-unsigned int socket;
-
 if ( !cat_socket_info )
 return 0;
 
-socket = cpu_to_socket(cpu);
-if ( socket >= nr_sockets )
-return -ENOSPC;
-
-info = cat_socket_info + socket;
-if ( info->cos_to_cbm )
-return 0;
+if ( temp_cos_to_cbm == NULL &&
+ (temp_cos_to_cbm = xzalloc_array(struct psr_cat_cbm,
+  opt_cos_max + 1UL)) == NULL )
+return -ENOMEM;
 
-info->cos_to_cbm = xzalloc_array(struct psr_cat_cbm, opt_cos_max + 1UL);
-return info->cos_to_cbm ? 0 : -ENOMEM;
+return 0;
 }
 
 static void cat_cpu_init(void)
@@ -492,6 +487,8 @@ static void cat_cpu_init(void)
 info->cbm_len = (eax & 0x1f) + 1;
 info->cos_max = min(opt_cos_max, edx & 0x);
 
+info->cos_to_cbm = temp_cos_to_cbm;
+temp_cos_to_cbm = NULL;
 /* cos=0 is reserved as default cbm(all ones). */
 info->cos_to_cbm[0].cbm = (1ull << info->cbm_len) - 1;
 
-- 
1.9.1



Re: [Xen-devel] race condition in xen-gntdev

2015-07-21 Thread Marek Marczykowski-Górecki

On Mon, Jun 29, 2015 at 04:50:10PM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Jun 29, 2015 at 10:39:26AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jun 26, 2015 at 03:28:24AM +0200, Marek Marczykowski-Górecki wrote:
> > > On Mon, Jun 22, 2015 at 03:14:16PM -0400, Daniel De Graaf wrote:
> > > > The reason that gntdev_release didn't have a lock is because there are not
> > > > supposed to be any references to the areas pointed to by priv->maps when it
> > > > is called.  However, since the MMU notifier has not yet been unregistered,
> > > > it is apparently possible to race here; the comment on mmu_notifier_unregister
> > > > seems to confirm this as a possibility (as do the backtraces).
> > > > 
> > > > I think adding the lock will be sufficient.
> > > 
> > > Ok, so here is the patch:
> > 
> > Awesome!
> > 
> > Since you are the one who has been seeing this particular fault - any chance
> > you could give it some soak time? If I recall your emails correctly it takes
> > about a week or so before you saw the crash?
> 
> Sure. I've already installed patched kernel, will report back results
> later.

Ok, after a few weeks I can confirm: this fixes the issue.

> > > ---8<
> > > 
> > > From b876e14888bdafa112c3265e6420543fa74aa709 Mon Sep 17 00:00:00 2001
> > > From: =?UTF-8?q?Marek=20Marczykowski-G=C3=B3recki?=
> > >  
> > > Date: Fri, 26 Jun 2015 02:16:49 +0200
> > > Subject: [PATCH] xen/grant: fix race condition in gntdev_release
> > > 
> > > While gntdev_release is called, MMU notifier is still registered and
> > > can traverse priv->maps list even if no pages are mapped (which is the
> > > case - gntdev_release is called after all). But gntdev_release will
> > > clear that list, so make sure that only one of those things happens at
> > > the same time.
> > > 
> > > Signed-off-by: Marek Marczykowski-Górecki 
> > > 
> > > ---
> > >  drivers/xen/gntdev.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
> > > index 8927485..4bd23bb 100644
> > > --- a/drivers/xen/gntdev.c
> > > +++ b/drivers/xen/gntdev.c
> > > @@ -568,12 +568,14 @@ static int gntdev_release(struct inode *inode, 
> > > struct file *flip)
> > >  
> > >   pr_debug("priv %p\n", priv);
> > >  
> > > + mutex_lock(&priv->lock);
> > >   while (!list_empty(&priv->maps)) {
> > >   map = list_entry(priv->maps.next, struct grant_map, next);
> > >   list_del(&map->next);
> > >   gntdev_put_map(NULL /* already removed */, map);
> > >   }
> > >   WARN_ON(!list_empty(&priv->freeable_maps));
> > > + mutex_unlock(&priv->lock);
> > >  
> > >   if (use_ptemod)
> > >   mmu_notifier_unregister(&priv->mn, priv->mm);
> > > -- 
> > > 1.9.3
> > > 
> > > 
> > > -- 
> > > Best Regards,
> > > Marek Marczykowski-Górecki
> > > Invisible Things Lab
> > > A: Because it messes up the order in which people normally read text.
> > > Q: Why is top-posting such a bad thing?
> > 
> > 
> 



-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
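
For reference, the locking pattern the patch applies can be sketched in user
space. This is only an illustration, assuming POSIX threads; struct
priv_state, struct map_node and the priv_* helpers are hypothetical stand-ins
and not the gntdev code. The point is that the list walker and the teardown
path take the same lock, so only one of them touches the list at a time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-file state; not the kernel types. */
struct map_node {
    struct map_node *next;
    int index;
};

struct priv_state {
    pthread_mutex_t lock;
    struct map_node *maps;   /* singly linked list of live mappings */
};

static void priv_init(struct priv_state *p)
{
    assert(p != NULL);
    pthread_mutex_init(&p->lock, NULL);
    p->maps = NULL;
}

/* Add one mapping; returns 0 on success, -1 if allocation fails. */
static int priv_add(struct priv_state *p, int index)
{
    struct map_node *n = malloc(sizeof(*n));

    if (!n)
        return -1;
    n->index = index;
    pthread_mutex_lock(&p->lock);
    n->next = p->maps;
    p->maps = n;
    pthread_mutex_unlock(&p->lock);
    return 0;
}

/* Plays the role of the notifier: walks the list under the lock. */
static int priv_count(struct priv_state *p)
{
    int count = 0;

    pthread_mutex_lock(&p->lock);
    for (struct map_node *n = p->maps; n; n = n->next)
        count++;
    pthread_mutex_unlock(&p->lock);
    return count;
}

/* Plays the role of release: empties the list under the same lock. */
static int priv_release(struct priv_state *p)
{
    int freed = 0;

    pthread_mutex_lock(&p->lock);
    while (p->maps) {
        struct map_node *n = p->maps;

        p->maps = n->next;
        free(n);
        freed++;
    }
    pthread_mutex_unlock(&p->lock);
    return freed;
}
```

Without the lock in priv_release(), a concurrent priv_count() could follow a
pointer to a node that has just been freed, which is the kind of race the
backtraces pointed at.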


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [xen-4.4-testing test] 59794: tolerable FAIL - PUSHED

2015-07-21 Thread osstest service owner

flight 59794 xen-4.4-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59794/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-winxpsp3   15 guest-localmigrate/x10    fail like 59510
 test-amd64-i386-libvirt              11 guest-start               fail like 59538
 test-amd64-amd64-libvirt             11 guest-start               fail like 59538
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop                fail like 59538
 test-armhf-armhf-xl-multivcpu        15 guest-start/debian.repeat fail like 59538

Tests which did not succeed, but are not blocking:
 test-amd64-i386-rumpuserxen-i386      1 build-check(1)        blocked n/a
 test-amd64-amd64-rumpuserxen-amd64    1 build-check(1)        blocked n/a
 build-amd64-rumpuserxen               6 xen-build             fail never pass
 build-i386-rumpuserxen                6 xen-build             fail never pass
 test-armhf-armhf-libvirt             11 guest-start           fail never pass
 test-armhf-armhf-xl                  12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck       12 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale          12 migrate-support-check fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop            fail never pass
 test-armhf-armhf-xl-multivcpu        12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2          12 migrate-support-check fail never pass
 test-amd64-i386-xend-qemut-winxpsp3  20 leak-check/check      fail never pass

version targeted for testing:
 xen  d273ce76c9d79415000d316ce7a3cad03ddb2865
baseline version:
 xen  33eba764618669b9c394c7d9cd2e335b426862ab

Last test of basis    59538  2015-07-14 08:07:23 Z    7 days
Testing same since    59794  2015-07-21 09:55:06 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Dario Faggioli 
  Elena Ufimtseva 
  Ian Campbell 
  Jan Beulich 
  Juergen Gross 
  Liang Li 
  Yang Zhang 

jobs:
 build-amd64-xend pass
 build-i386-xend  pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-i386-rumpuserxen-i386 blocked 
 test-amd64-i386-qemut-rhel6hvm-intel pass
 test-amd64-i386-qemuu-rhel6hvm-intel

[Xen-devel] [qemu-upstream-4.5-testing test] 59793: tolerable FAIL - PUSHED

2015-07-21 Thread osstest service owner

flight 59793 qemu-upstream-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59793/

Failures :-/ but no regressions.

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop fail REGR. vs. 58384
 test-amd64-i386-libvirt  11 guest-start  fail   like 58413
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58413

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  11 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-amd    11 guest-start           fail never pass
 test-armhf-armhf-xl-arndale    12 migrate-support-check fail never pass
 test-armhf-armhf-xl            12 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check fail never pass
 test-amd64-amd64-libvirt       12 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2    12 migrate-support-check fail never pass
 test-armhf-armhf-libvirt       12 migrate-support-check fail never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
 test-armhf-armhf-xl-rtds       11 guest-start           fail never pass

version targeted for testing:
 qemuu  eb75549f69ca0f3eab26ed39d4ad0fcb6613f64a
baseline version:
 qemuu  d9552b0af21c27535cd3c8549bb31d26bbecd506

Last test of basis    58413  2015-06-11 17:02:58 Z   40 days
Testing same since    59774  2015-07-20 12:44:09 Z    1 days    2 attempts


People who touched revisions under test:
  Gerd Hoffmann 
  Marc-André Lureau 
  Stefano Stabellini 

jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemuu-win7-amd64  fail
 test-armhf-armhf-xl-arndale  pass
 test-amd64-amd64-xl-credit2  pass
 test-armhf-armhf-xl-credit2  pass
 test-armhf-armhf-xl-cubietruck   pass
 test-amd64-i386-freebsd10-i386   pass
 test-amd64-amd64-xl-pvh-intel  fail
 test-amd64-i386-qemuu-rhel6hvm-intel pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  fail
 test-amd64-amd64-xl-multivcpu  pass
 test-armhf-armhf-xl-multivcpu  pass
 test-amd64-amd64-pair  pass
 test-amd64-i386-pair pass
 test-amd64-amd64-xl-rtds pass
 test-armhf-armhf-xl-rtds fail
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
 test-amd64-amd64-xl-qemuu-winxpsp3   pass
 test-amd64-i386-xl-qemuu-winxpsp3pass



sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
http://xenbits.xen.org/g

Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 7:04 PM, Boris Ostrovsky
 wrote:
>
>
> On 07/21/2015 08:49 PM, Andrew Cooper wrote:
>>
>> On 22/07/2015 01:28, Andy Lutomirski wrote:
>>>
>>> On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
>>>  wrote:

>>>> On 22/07/2015 01:07, Andy Lutomirski wrote:
>>>>>
>>>>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>>>>>  wrote:
>>>>>>
>>>>>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>>>>>>
>>>>>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>>>>>>
>>>>>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>>>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>>>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>>>>>>> *mm) {}
>>>>>>>>  #endif
>>>>>>>>  /*
>>>>>>>> + * ldt_structs can be allocated, used, and freed, but they are
>>>>>>>> never
>>>>>>>> + * modified while live.
>>>>>>>> + */
>>>>>>>> +struct ldt_struct {
>>>>>>>> +int size;
>>>>>>>> +int __pad;/* keep the descriptors naturally aligned. */
>>>>>>>> +struct desc_struct entries[];
>>>>>>>> +};
>>>>>>>
>>>>>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>>>>>
>>>>>>> Jan, Andrew?
>>>>>>
>>>>>> PV guests are not permitted to have writeable mappings to the frames
>>>>>> making up the GDT and LDT, so it cannot make unaudited changes to
>>>>>> loadable descriptors.  In particular, for a 32bit PV guest, it is only
>>>>>> the segment limit which protects Xen from the ring1 guest kernel.
>>>>>>
>>>>>> A lot of this code hasn't been touched in years, and it certainly
>>>>>> predates me.  The alignment requirement appears to come from the
>>>>>> virtual
>>>>>> region Xen uses to map the guests GDT and LDT.  Strict alignment is
>>>>>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>>>>>> correct, but the LDT alignment seems to be a side effect of similar
>>>>>> codepaths.
>>>>>>
>>>>>> For an LDT smaller than 8192 entries, I can't see any specific reason
>>>>>> for enforcing alignment, other than "that's the way it has always
>>>>>> been".
>>>>>>
>>>>>> However, the guest would still have to relinquish write access to all
>>>>>> frames which make up the LDT, which looks to be a bit of an issue
>>>>>> given
>>>>>> the snippet above.
>>>>>
>>>>> Does the LDT itself need to be aligned or just the address passed to
>>>>> paravirt_alloc_ldt?
>>>>
>>>> The address which Xen receives needs to be aligned.
>>>>
>>>> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
>>>> it is passed is page aligned, and passes it straight through.
>>>
>>> xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
>>> it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
>>> pointer to the ldt_struct.
>>
>> So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
>> audits to be page aligned.
>>
>> This will allow ldt_struct itself to be page aligned, and for the size
>> field to sit across the base/limit field of what would logically be
>> selector 0x0008  There would be some issues accessing size.  To load
>> frames as an LDT, a guest must drop all refs to the page so that its
>> type may be changed from writeable to segdesc.  After that, an
>> update_descriptor hypercall can be used to change size, and I believe
>> the guest may subsequently recreate read-only mappings to the frames
>> in
>> question (although frankly it is getting late so you will want to
>> double
>> check all of this).
>>
>> Anyhow, this looks like an issue which should be fixed up with
>> slightly
>> more PVOps, rather than enforcing a Xen view of the world on native
>> Linux.
>>
>>>>> I could presumably make the allocation the other way around so the
>>>>> size is at the end.  I could even use two separate allocations if
>>>>> needed.
>
>
> Why not wrap mm_context_t's ldt and size into a struct (just like ldt_struct
> but without __pad) and have a single allocation of ldt?
>
> I.e.
>
> struct ldt_struct {
> int size;
> struct desc_struct *entries;
> }
>
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -9,8 +9,7 @@
>* we put the segment information here.
>*/
>   typedef struct {
> -void *ldt;
> -int size;
> +struct ldt_struct ldt;
> #ifdef CONFIG_X86_64
>   /* True if mm supports a task running in 32 bit compatibility mode. */

I want the atomic read of both of them.  The current code make
interesting assumptions about ordering that may or may not be correct
but are certainly not obviously correct.

--Andy
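
The "atomic read of both" idea can be sketched in user space as follows. This
is an illustration only, assuming C11 atomics rather than the kernel's own
primitives; struct ldt_snapshot, struct desc and the ldt_* helpers are
hypothetical names. Size and entries live in one immutable object, and readers
pick up both through a single pointer load, so they can never see a new size
paired with an old table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical 8-byte descriptor; stands in for struct desc_struct. */
struct desc {
    unsigned int a, b;
};

/* Size and entries travel together and are never modified once published. */
struct ldt_snapshot {
    int size;
    struct desc entries[];
};

static _Atomic(struct ldt_snapshot *) current_ldt;

/* Allocate a zeroed snapshot with room for 'size' descriptors. */
static struct ldt_snapshot *ldt_snapshot_new(int size)
{
    struct ldt_snapshot *s;

    if (size < 0)
        return NULL;
    s = calloc(1, sizeof(*s) + (size_t)size * sizeof(struct desc));
    if (s)
        s->size = size;
    return s;
}

/* Publish a new snapshot and hand back the previous one to the caller. */
static struct ldt_snapshot *ldt_publish(struct ldt_snapshot *new_ldt)
{
    return atomic_exchange_explicit(&current_ldt, new_ldt,
                                    memory_order_acq_rel);
}

/* One pointer load yields a size that matches the entries behind it. */
static int ldt_current_size(void)
{
    struct ldt_snapshot *s =
        atomic_load_explicit(&current_ldt, memory_order_acquire);

    return s ? s->size : 0;
}
```

Safe reclamation of the old snapshot while readers may still hold it is
deliberately left out of this sketch.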


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Boris Ostrovsky




On 07/21/2015 08:49 PM, Andrew Cooper wrote:

On 22/07/2015 01:28, Andy Lutomirski wrote:

On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
 wrote:

On 22/07/2015 01:07, Andy Lutomirski wrote:

On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 wrote:

On 21/07/2015 22:53, Boris Ostrovsky wrote:

On 07/21/2015 03:59 PM, Andy Lutomirski wrote:

--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
*mm) {}
   #endif
 /*
+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+int size;
+int __pad;/* keep the descriptors naturally aligned. */
+struct desc_struct entries[];
+};

This breaks Xen which expects LDT to be page-aligned. Not sure why.

Jan, Andrew?

PV guests are not permitted to have writeable mappings to the frames
making up the GDT and LDT, so it cannot make unaudited changes to
loadable descriptors.  In particular, for a 32bit PV guest, it is only
the segment limit which protects Xen from the ring1 guest kernel.

A lot of this code hasn't been touched in years, and it certainly
predates me.  The alignment requirement appears to come from the virtual
region Xen uses to map the guests GDT and LDT.  Strict alignment is
required for the GDT so Xen's descriptors starting at 0xe0xx are
correct, but the LDT alignment seems to be a side effect of similar
codepaths.

For an LDT smaller than 8192 entries, I can't see any specific reason
for enforcing alignment, other than "that's the way it has always been".

However, the guest would still have to relinquish write access to all
frames which make up the LDT, which looks to be a bit of an issue given
the snippet above.

Does the LDT itself need to be aligned or just the address passed to
paravirt_alloc_ldt?

The address which Xen receives needs to be aligned.

It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
it is passed is page aligned, and passes it straight through.

xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
pointer to the ldt_struct.

So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
audits to be page aligned.


This will allow ldt_struct itself to be page aligned, and for the size
field to sit across the base/limit field of what would logically be
selector 0x0008  There would be some issues accessing size.  To load
frames as an LDT, a guest must drop all refs to the page so that its
type may be changed from writeable to segdesc.  After that, an
update_descriptor hypercall can be used to change size, and I believe
the guest may subsequently recreate read-only mappings to the frames in
question (although frankly it is getting late so you will want to double
check all of this).

Anyhow, this looks like an issue which should be fixed up with slightly
more PVOps, rather than enforcing a Xen view of the world on native Linux.


I could presumably make the allocation the other way around so the
size is at the end.  I could even use two separate allocations if
needed.


Why not wrap mm_context_t's ldt and size into a struct (just like 
ldt_struct but without __pad) and have a single allocation of ldt?


I.e.

struct ldt_struct {
int size;
struct desc_struct *entries;
}

--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
   * we put the segment information here.
   */
  typedef struct {
-void *ldt;
-int size;
+struct ldt_struct ldt;
#ifdef CONFIG_X86_64
  /* True if mm supports a task running in 32 bit compatibility 
mode. */



-boris


I suspect two separate allocations would be the better solution, as it
means that the size field doesn't need to be subject to funny page
permissions.

True.  OTOH we never write to the size field after allocating the thing.

Right, but even reading it is going to cause problems if one of the
paravirt ops can't re-establish ro mappings.

~Andrew
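
The alignment problem discussed in this thread can be made concrete with a
small user-space sketch. It assumes 4 KiB pages and 8-byte descriptors;
SKETCH_PAGE_SIZE and the helper names are hypothetical. A table that starts
right at a page boundary passes the check, while one that sits behind an
8-byte header (two ints, as in the proposed ldt_struct) does not.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages, as on x86 */

/* Non-zero if addr sits exactly on a page boundary. */
static int is_page_aligned(const void *addr)
{
    return ((uintptr_t)addr & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/* Round a byte count up to a whole number of pages. */
static size_t round_up_to_page(size_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) & ~(size_t)(SKETCH_PAGE_SIZE - 1);
}

/* Allocate zeroed, page-aligned storage for nr_entries 8-byte descriptors. */
static void *alloc_descriptor_pages(size_t nr_entries)
{
    size_t bytes = round_up_to_page(nr_entries * 8);
    void *p;

    if (bytes == 0)
        return NULL;
    p = aligned_alloc(SKETCH_PAGE_SIZE, bytes);
    if (p)
        memset(p, 0, bytes);
    return p;
}
```

Keeping the descriptors in their own page-aligned allocation, with the size
stored elsewhere, is what the "two separate allocations" suggestion amounts
to.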


[Xen-devel] [v11][PATCH 13/16] libxl: construct e820 map with RDM information for HVM guest

2015-07-21 Thread Tiejun Chen

Here we construct a basic guest e820 table via
XENMEM_set_memory_map. The table includes lowmem, highmem
and any RDMs that exist; hvmloader needs this information
later.

Note that this guest e820 table is the same as before if the
platform has no RDM or if RDM is disabled (the default).
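
As a rough illustration of the table layout described above (not the libxl
code itself), the sketch below builds one lowmem entry, one reserved entry per
valid RDM and an optional highmem entry. struct e820_entry, struct rdm_range,
build_memmap() and the SKETCH_* constants are hypothetical names; the 1M
lowmem start and the 4G highmem base are assumptions taken from the patch
comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_E820_RAM       1u          /* usable RAM */
#define SKETCH_E820_RESERVED  2u          /* reserved range */
#define SKETCH_LOW_MEM_START  0x100000ull /* assumption: lowmem starts at 1M */

struct e820_entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

struct rdm_range {
    uint64_t start;
    uint64_t size;
    int valid;      /* 0 means "do not expose this entry" */
};

/* Fill 'out' (capacity 'cap'); returns the entry count, or 0 on error. */
static size_t build_memmap(struct e820_entry *out, size_t cap,
                           uint64_t lowmem_end, uint64_t highmem_end,
                           const struct rdm_range *rdms, size_t num_rdms)
{
    const uint64_t four_gb = 1ull << 32;
    uint64_t highmem_size = highmem_end > four_gb ? highmem_end - four_gb : 0;
    size_t needed = 1, nr = 0, i;

    /* Count first, so nothing is written if the table would not fit. */
    for (i = 0; i < num_rdms; i++)
        if (rdms[i].valid)
            needed++;
    if (highmem_size)
        needed++;
    if (needed > cap || lowmem_end <= SKETCH_LOW_MEM_START)
        return 0;

    /* Low memory. */
    out[nr].addr = SKETCH_LOW_MEM_START;
    out[nr].size = lowmem_end - SKETCH_LOW_MEM_START;
    out[nr].type = SKETCH_E820_RAM;
    nr++;

    /* Reserved device memory. */
    for (i = 0; i < num_rdms; i++) {
        if (!rdms[i].valid)
            continue;
        out[nr].addr = rdms[i].start;
        out[nr].size = rdms[i].size;
        out[nr].type = SKETCH_E820_RESERVED;
        nr++;
    }

    /* High memory, if any. */
    if (highmem_size) {
        out[nr].addr = four_gb;
        out[nr].size = highmem_size;
        out[nr].type = SKETCH_E820_RAM;
        nr++;
    }
    return nr;
}
```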

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Acked-by: Wei Liu 
Acked-by: Ian Jackson 
Signed-off-by: Tiejun Chen 
---
v8 ~ v11:

* Nothing is changed.

v7:

* Just sync with the fallout of renaming parameters from patch #10.

v6:

* Nothing is changed.

v5:

* Make this variable "rdm_mem_boundary_memkb" specific to .hvm 

v4:

* Separated from the previous patch to provide a parameter to set that
  predefined boundary dynamically.

 tools/libxl/libxl_arch.h |  7 
 tools/libxl/libxl_arm.c  |  8 +
 tools/libxl/libxl_dom.c  |  5 +++
 tools/libxl/libxl_x86.c  | 83 
 4 files changed, 103 insertions(+)

diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index 9a80d43..bd030b6 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -55,4 +55,11 @@ int libxl__arch_vnuma_build_vmemrange(libxl__gc *gc,
 _hidden
 int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t domid, int irq);
 
+/* arch specific to construct memory mapping function */
+_hidden
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+libxl_domain_config *d_config,
+uint32_t domid,
+struct xc_hvm_build_args *args);
+
 #endif
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index d306905..42ab6d8 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -969,6 +969,14 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t 
domid, int irq)
 return xc_domain_bind_pt_spi_irq(CTX->xch, domid, irq, irq);
 }
 
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+libxl_domain_config *d_config,
+uint32_t domid,
+struct xc_hvm_build_args *args)
+{
+return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 0b7c39d..a76d4b3 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1012,6 +1012,11 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 goto out;
 }
 
+if (libxl__arch_domain_construct_memmap(gc, d_config, domid, &args)) {
+LOG(ERROR, "setting domain memory map failed");
+goto out;
+}
+
 ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
&state->store_mfn, state->console_port,
&state->console_mfn, state->store_domid,
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 8cd15ca..b3cf3e2 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -445,6 +445,89 @@ int libxl__arch_domain_map_irq(libxl__gc *gc, uint32_t 
domid, int irq)
 }
 
 /*
+ * Here we're just trying to set these kinds of e820 mappings:
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ * Note: Those stuffs below 1M are still constructed with multiple
+ * e820 entries by hvmloader. At this point we don't change anything.
+ *
+ * #2. RDM region if it exists
+ *
+ * #3. High memory region if it exists
+ *
+ * Note: these regions are not overlapping since we already check
+ * to adjust them. Please refer to libxl__domain_device_construct_rdm().
+ */
+#define GUEST_LOW_MEM_START_DEFAULT 0x100000
+int libxl__arch_domain_construct_memmap(libxl__gc *gc,
+libxl_domain_config *d_config,
+uint32_t domid,
+struct xc_hvm_build_args *args)
+{
+int rc = 0;
+unsigned int nr = 0, i;
+/* We always own at least one lowmem entry. */
+unsigned int e820_entries = 1;
+struct e820entry *e820 = NULL;
+uint64_t highmem_size =
+args->highmem_end ? args->highmem_end - (1ull << 32) : 0;
+
+/* Add all rdm entries. */
+for (i = 0; i < d_config->num_rdms; i++)
+if (d_config->rdms[i].policy != LIBXL_RDM_RESERVE_POLICY_INVALID)
+e820_entries++;
+
+
+/* If we should have a highmem range. */
+if (highmem_size)
+e820_entries++;
+
+if (e820_entries >= E820MAX) {
+LOG(ERROR, "Ooops! Too many entries in the memory map!\n");
+rc = ERROR_INVAL;
+goto out;
+}
+
+e820 = libxl__malloc(gc, sizeof(struct e820entry) * e820_entries);
+
+/* Low memory */
+e820[nr].addr = GUEST_LOW_MEM_START_DEFAULT;
+e820[nr].size = args->lowmem_end

[Xen-devel] [v11][PATCH 14/16] xen/vtd: enable USB device assignment

2015-07-21 Thread Tiejun Chen

A USB RMRR may conflict with the guest BIOS region. The previous
implementation simply skipped identity mapping setup in that case. The
new policy mechanism handles this scenario cleanly, so the earlier
workaround code can be removed.

CC: Yang Zhang 
CC: Kevin Tian 
Signed-off-by: Tiejun Chen 
Acked-by: Kevin Tian 
---
v5 ~ v11:

* Nothing is changed.

v4:

* Refine the patch head description

 xen/drivers/passthrough/vtd/dmar.h  |  1 -
 xen/drivers/passthrough/vtd/iommu.c | 11 ++-
 xen/drivers/passthrough/vtd/utils.c |  7 ---
 3 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/dmar.h 
b/xen/drivers/passthrough/vtd/dmar.h
index af1feef..af205f5 100644
--- a/xen/drivers/passthrough/vtd/dmar.h
+++ b/xen/drivers/passthrough/vtd/dmar.h
@@ -129,7 +129,6 @@ do {\
 
 int vtd_hw_check(void);
 void disable_pmr(struct iommu *iommu);
-int is_usb_device(u16 seg, u8 bus, u8 devfn);
 int is_igd_drhd(struct acpi_drhd_unit *drhd);
 
 #endif /* _DMAR_H_ */
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index a2f3a66..8a8d763 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2242,11 +2242,9 @@ static int reassign_device_ownership(
 /*
  * If the device belongs to the hardware domain, and it has RMRR, don't
  * remove it from the hardware domain, because BIOS may use RMRR at
- * booting time. Also account for the special casing of USB below (in
- * intel_iommu_assign_device()).
+ * booting time.
  */
-if ( !is_hardware_domain(source) &&
- !is_usb_device(pdev->seg, pdev->bus, pdev->devfn) )
+if ( !is_hardware_domain(source) )
 {
 const struct acpi_rmrr_unit *rmrr;
 u16 bdf;
@@ -2299,13 +2297,8 @@ static int intel_iommu_assign_device(
 if ( ret )
 return ret;
 
-/* FIXME: Because USB RMRR conflicts with guest bios region,
- * ignore USB RMRR temporarily.
- */
 seg = pdev->seg;
 bus = pdev->bus;
-if ( is_usb_device(seg, bus, pdev->devfn) )
-return 0;
 
 /* Setup rmrr identity mapping */
 for_each_rmrr_device( rmrr, bdf, i )
diff --git a/xen/drivers/passthrough/vtd/utils.c 
b/xen/drivers/passthrough/vtd/utils.c
index bd14c02..b8a077f 100644
--- a/xen/drivers/passthrough/vtd/utils.c
+++ b/xen/drivers/passthrough/vtd/utils.c
@@ -29,13 +29,6 @@
 #include "extern.h"
 #include 
 
-int is_usb_device(u16 seg, u8 bus, u8 devfn)
-{
-u16 class = pci_conf_read16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
-PCI_CLASS_DEVICE);
-return (class == 0xc03);
-}
-
 /* Disable vt-d protected memory registers. */
 void disable_pmr(struct iommu *iommu)
 {
-- 
1.9.1



[Xen-devel] [v11][PATCH 03/16] xen/passthrough: extend hypercall to support rdm reservation policy

2015-07-21 Thread Tiejun Chen

This patch extends the existing hypercall to support an RDM reservation policy.
When reserving RDM regions in pfn space, we either return an error or just
print a warning, depending on whether the policy is "strict" or "relaxed".
Note that in some special cases, e.g. adding a device to hwdomain or removing a
device from a user domain, "relaxed" is sufficient since this is always safe
for hwdomain.
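
The strict/relaxed decision can be sketched as below. This is a simplified
stand-alone illustration, not the hypervisor code: SKETCH_RDM_RELAXED and both
helper names are hypothetical, and the flag is modelled as a single bit where
0 means strict and 1 means relaxed.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_RDM_RELAXED 1u   /* hypothetical: bit 0 set means "relaxed" */

/* Range check: reject any flag bits this sketch does not define. */
static int rdm_flag_valid(unsigned int flag)
{
    return (flag & ~SKETCH_RDM_RELAXED) == 0;
}

/*
 * Result to report when a region cannot be reserved because the guest
 * frame is already mapped: relaxed tolerates it, strict fails with -EBUSY.
 */
static int rdm_conflict_result(unsigned int flag)
{
    if (!rdm_flag_valid(flag))
        return -EINVAL;
    return (flag & SKETCH_RDM_RELAXED) ? 0 : -EBUSY;
}
```

In both cases a warning would still be printed; only the return value
differs.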

CC: Tim Deegan 
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Suravee Suthikulpanit 
CC: Aravind Gopalakrishnan 
CC: Ian Campbell 
CC: Stefano Stabellini 
CC: Yang Zhang 
CC: Kevin Tian 
Signed-off-by: Tiejun Chen 
Reviewed-by: George Dunlap 
Acked-by: Jan Beulich 
---
v10 ~ v11:

* Nothing is changed.

v9:

* Correct one check condition of XEN_DOMCTL_DEV_RDM_RELAXED

v8:

* Force to pass "0"(strict) when add or move a device in hardware domain,
  and improve some associated code comments.

v6 ~ v7:

* Nothing is changed.

v5:

* Just leave one bit XEN_DOMCTL_DEV_RDM_RELAXED as our flag, so
  "0" means "strict" and "1" means "relaxed".

* So make DT device ignore the flag field

* Improve the code comments

v4:

* Add code comments to describe why we fix a policy flag in some
  cases, like adding a device to hwdomain and removing a device from a user
  domain.

* Avoid using fixed width types for the parameter of set_identity_p2m_entry()

* Fix one judging condition
  domctl->u.assign_device.flag == XEN_DOMCTL_DEV_NO_RDM
  -> domctl->u.assign_device.flag != XEN_DOMCTL_DEV_NO_RDM

* Add a range check on the flag passed, to make future extensions possible
  (and to avoid ambiguity on what out of range values would mean).

 xen/arch/x86/mm/p2m.c   |  7 --
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 ++-
 xen/drivers/passthrough/arm/smmu.c  |  2 +-
 xen/drivers/passthrough/device_tree.c   |  3 ++-
 xen/drivers/passthrough/pci.c   | 15 
 xen/drivers/passthrough/vtd/iommu.c | 37 ++---
 xen/include/asm-x86/p2m.h   |  2 +-
 xen/include/public/domctl.h |  3 +++
 xen/include/xen/iommu.h |  2 +-
 9 files changed, 55 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 1e763dc..89616b7 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -901,7 +901,7 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, 
mfn_t mfn,
 }
 
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-   p2m_access_t p2ma)
+   p2m_access_t p2ma, unsigned int flag)
 {
 p2m_type_t p2mt;
 p2m_access_t a;
@@ -923,7 +923,10 @@ int set_identity_p2m_entry(struct domain *d, unsigned long 
gfn,
 ret = 0;
 else
 {
-ret = -EBUSY;
+if ( flag & XEN_DOMCTL_DEV_RDM_RELAXED )
+ret = 0;
+else
+ret = -EBUSY;
 printk(XENLOG_G_WARNING
"Cannot setup identity map d%d:%lx,"
" gfn already mapped to %lx.\n",
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index e83bb35..920b35a 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -394,7 +394,8 @@ static int reassign_device(struct domain *source, struct 
domain *target,
 }
 
 static int amd_iommu_assign_device(struct domain *d, u8 devfn,
-   struct pci_dev *pdev)
+   struct pci_dev *pdev,
+   u32 flag)
 {
 struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
 int bdf = PCI_BDF2(pdev->bus, devfn);
diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 6cc4394..9a667e9 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2605,7 +2605,7 @@ static void arm_smmu_destroy_iommu_domain(struct 
iommu_domain *domain)
 }
 
 static int arm_smmu_assign_dev(struct domain *d, u8 devfn,
-  struct device *dev)
+  struct device *dev, u32 flag)
 {
struct iommu_domain *domain;
struct arm_smmu_xen_domain *xen_domain;
diff --git a/xen/drivers/passthrough/device_tree.c 
b/xen/drivers/passthrough/device_tree.c
index 5d3842a..7ff79f8 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -52,7 +52,8 @@ int iommu_assign_dt_device(struct domain *d, struct 
dt_device_node *dev)
 goto fail;
 }
 
-rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev));
+/* The flag field doesn't matter to DT device. */
+rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
 
 if ( rc )
 goto fail;
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e30be43..c7bbf6e 100644
--- a/xen/drivers/passthr

[Xen-devel] [v11][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Tiejun Chen

While building a VM, the HVM domain builder provides struct hvm_info_table{}
to help hvmloader. Currently it includes two fields that hvmloader uses to
construct the guest e820 table, low_mem_pgend and high_mem_pgend. So we should
check them to fix any conflict with RDM.

In theory an RMRR can reside in address space beyond 4G, but we never
see this in the real world. So in order to avoid breaking the highmem layout
we don't resolve highmem conflicts. Note this means a highmem RMRR can still
be supported if there is no conflict.

But in the case of lowmem, RMRRs may be scattered across the whole RAM space.
Multiple RMRR entries in particular would make this worse and lead to a
complicated memory layout, and it would then be hard to extend
hvm_info_table{} so that hvmloader could work it out. So here we try a simple
solution that avoids breaking the existing layout. When a conflict occurs:

#1. Above a predefined boundary (2G)
- move lowmem_end below the reserved region to resolve the conflict;

#2. Below a predefined boundary (2G)
- Check the strict/relaxed policy.
The "strict" policy makes libxl fail. Note that when both policies
are specified on a given region, "strict" is always preferred.
The "relaxed" policy issues a warning message and also marks this entry
INVALID to indicate we shouldn't expose this entry to hvmloader.

Note that later we need to provide a parameter to set that predefined boundary
dynamically.
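
The two cases above can be sketched as a small decision function. This is an
illustration under stated assumptions, not the libxl implementation: the enum
and function names are hypothetical, the boundary is passed in by the caller,
and any RAM displaced from lowmem would need to be accounted for elsewhere,
which the sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

enum rdm_policy { RDM_POLICY_STRICT, RDM_POLICY_RELAXED };

enum rdm_outcome {
    RDM_NO_CONFLICT,    /* region does not touch lowmem */
    RDM_LOWMEM_MOVED,   /* case #1: lowmem_end lowered below the region */
    RDM_ENTRY_MASKED,   /* case #2, relaxed: warn and hide the entry */
    RDM_FAIL            /* case #2, strict: give up */
};

/* True if [a_start, a_start + a_size) and [b_start, b_start + b_size) overlap. */
static int ranges_overlap(uint64_t a_start, uint64_t a_size,
                          uint64_t b_start, uint64_t b_size)
{
    return a_start < b_start + b_size && b_start < a_start + a_size;
}

static enum rdm_outcome resolve_lowmem_conflict(uint64_t *lowmem_end,
                                                uint64_t rdm_start,
                                                uint64_t rdm_size,
                                                uint64_t boundary,
                                                enum rdm_policy policy)
{
    if (!ranges_overlap(0, *lowmem_end, rdm_start, rdm_size))
        return RDM_NO_CONFLICT;

    /* Case #1: the region lies above the boundary, so shrink lowmem. */
    if (rdm_start > boundary) {
        *lowmem_end = rdm_start;
        return RDM_LOWMEM_MOVED;
    }

    /* Case #2: too low to move lowmem_end; the policy decides. */
    return policy == RDM_POLICY_STRICT ? RDM_FAIL : RDM_ENTRY_MASKED;
}
```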

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Acked-by: Wei Liu 
Signed-off-by: Tiejun Chen 
Reviewed-by: Kevin Tian 
---
v11:

* Use GCNEW_ARRAY to replace libxl__malloc()

* #define pfn_to_paddrk is missing safety () around x, and
  move this into libxl_internal.h

* Rename set_rdm_entries() to add_rdm_entry() and put the
  increment at the end so that the assignments are
  to ->rdms[d_config->num_rdms].

* "Simply make it so that if there are any rdms specified
  in the domain config, they are used instead of the
  automatically gathered information (from strategy and
  devices)." So just return if d_config->rdms is valid.

* Shorten some code comments.

v9 ~ v10:

* Nothing is changed.

v8:

* Introduce pfn_to_paddr(x) -> ((uint64_t)x << XC_PAGE_SHIFT)
  and set_rdm_entries() to factor out current codes.

v7:

* Just sync with the fallout of renaming parameters from patch #10.

v6:

* Fix some code style issues
* Refine libxl__xc_device_get_rdm()

v5:

* A little change to make sure the per-device policy always override the global
  policy and correct its associated code comments.
* Fix one typo in the patch head description
* Rename xc_device_get_rdm() with libxl__xc_device_get_rdm(), and then replace
  malloc() with libxl__malloc(), and finally cleanup this fallout.
* libxl__xc_device_get_rdm() should return proper libxl error code, ERROR_FAIL.
  Then instead, the allocated RDM entries would be returned with an out
  parameter.

v4:

* Consistent to use term "RDM".
* Unconditionally set *nr_entries to 0
* Move everything related to providing a parameter that sets our predefined
  boundary dynamically into a separate patch, to follow later

 tools/libxl/libxl_create.c   |   2 +-
 tools/libxl/libxl_dm.c   | 274 +++
 tools/libxl/libxl_dom.c  |  17 ++-
 tools/libxl/libxl_internal.h |  14 ++-
 tools/libxl/libxl_types.idl  |   7 ++
 5 files changed, 311 insertions(+), 3 deletions(-)

diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 7c884c4..5b57062 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -407,7 +407,7 @@ int libxl__domain_build(libxl__gc *gc,
 
 switch (info->type) {
 case LIBXL_DOMAIN_TYPE_HVM:
-ret = libxl__build_hvm(gc, domid, info, state);
+ret = libxl__build_hvm(gc, domid, d_config, state);
 if (ret)
 goto out;
 
diff --git a/tools/libxl/libxl_dm.c b/tools/libxl/libxl_dm.c
index 634b8d2..29476fc 100644
--- a/tools/libxl/libxl_dm.c
+++ b/tools/libxl/libxl_dm.c
@@ -92,6 +92,280 @@ const char *libxl__domain_device_model(libxl__gc *gc,
 return dm;
 }
 
+static int
+libxl__xc_device_get_rdm(libxl__gc *gc,
+ uint32_t flag,
+ uint16_t seg,
+ uint8_t bus,
+ uint8_t devfn,
+ unsigned int *nr_entries,
+ struct xen_reserved_device_memory **xrdm)
+{
+int rc = 0, r;
+
+/*
+ * We really can't presume how many entries we can get in advance.
+ */
+*nr_entries = 0;
+r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+  NULL, nr_entries);
+assert(r <= 0);
+/* "0" means we have no any rdm entry. */
+if (!r) goto out;
+
+if (errno != ENOBUFS) {
+rc = ERROR_FAIL;
+goto out;
+}
+
+GCNEW_ARRAY(*xrdm, *nr_entries);
+r = xc_reserved_device_memory_map(CTX->xch, flag, seg, bus, devfn,
+  *xrdm, n

[Xen-devel] [v11][PATCH 07/16] hvmloader/e820: construct guest e820 table

2015-07-21 Thread Tiejun Chen

Now use the hypervisor-supplied memory map to build our final e820 table:
* Add regions for BIOS ranges and other special mappings not in the
  hypervisor map
* Add in the hypervisor supplied regions
* Adjust the lowmem and highmem regions if we've had to relocate
  memory (adding a highmem region if necessary)
* Sort all the ranges so that they appear in memory order.
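
As a rough illustration of the last two steps in the list above, here is a
minimal sketch, not the patch's code. It assumes a simplified entry layout,
and trim_lowmem() and sort_e820() are hypothetical helper names. The first
cuts the RAM entry that straddles a relocated low_mem_end and reports how
many bytes would have to move to highmem; the second orders entries by
start address.

```c
#include <stddef.h>
#include <stdint.h>

#define E820_RAM      1
#define E820_RESERVED 2

/* Simplified stand-in for hvmloader's e820 entry (assumption). */
struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Shrink the RAM entry that contains low_mem_end so that it stops there.
 * Returns the number of bytes cut off, which a caller would add to highmem.
 * Returns 0 if no RAM entry straddles low_mem_end.
 */
static uint64_t trim_lowmem(struct e820entry *map, size_t nr,
                            uint64_t low_mem_end)
{
    for (size_t i = 0; i < nr; i++) {
        uint64_t start = map[i].addr;
        uint64_t end = start + map[i].size;

        if (map[i].type == E820_RAM &&
            low_mem_end > start && low_mem_end < end) {
            map[i].size = low_mem_end - start;
            return end - low_mem_end;
        }
    }
    return 0;
}

/* Insertion sort by start address; the table is small, so O(n^2) is fine. */
static void sort_e820(struct e820entry *map, size_t nr)
{
    for (size_t i = 1; i < nr; i++) {
        struct e820entry key = map[i];
        size_t j = i;

        while (j > 0 && map[j - 1].addr > key.addr) {
            map[j] = map[j - 1];
            j--;
        }
        map[j] = key;
    }
}
```

A caller would run trim_lowmem() first, grow or create the highmem entry by
the returned amount, and only then sort the table.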

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Reviewed-by: George Dunlap 
Reviewed-by: Jan Beulich 
Signed-off-by: Tiejun Chen 
---
v11:

* Check/sync memory_map.map[] before copying its entries into e820, since
  this ultimately makes sure hvm_info, memory_map.map[] and e820
  are consistent with each other.

* Refine some code implementations

v10:

* Instead of correcting e820, I'd like to correct memory_map.map[]
  and then copy them into e820 directly. I think this can make sure
  hvm_info, memory_map.map[] and e820 are on the same page.

v9:

* Refine the chunk of code that checks/modifies highmem

v8:

* define low_mem_end as uint32_t

* Correct those two wrong loops, memory_map.nr_map -> nr
  when we're trying to revise low/high memory e820 entries.

* Improve code comments and the patch head description

* Add one check if highmem is just populated by hvmloader itself

v5 ~ v7:

* Nothing is changed.

v4:

* Rename local variable, low_mem_pgend, to low_mem_end.

* Improve some code comments

* Adjust highmem after lowmem is changed.
 
 tools/firmware/hvmloader/e820.c | 109 +++-
 1 file changed, 96 insertions(+), 13 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 7a414ab..a6cacdf 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -105,7 +105,11 @@ int build_e820_table(struct e820entry *e820,
  unsigned int lowmem_reserved_base,
  unsigned int bios_image_base)
 {
-unsigned int nr = 0;
+unsigned int nr = 0, i, j;
+uint32_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT;
+uint32_t add_high_mem = 0;
+uint64_t high_mem_end = (uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT;
+uint64_t map_start, map_size, map_end;
 
 if ( !lowmem_reserved_base )
 lowmem_reserved_base = 0xA;
@@ -149,13 +153,6 @@ int build_e820_table(struct e820entry *e820,
 e820[nr].type = E820_RESERVED;
 nr++;
 
-/* Low RAM goes here. Reserve space for special pages. */
-BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20));
-e820[nr].addr = 0x100000;
-e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
-nr++;
-
 /*
  * Explicitly reserve space for special pages.
  * This space starts at RESERVED_MEMBASE an extends to cover various
@@ -191,16 +188,102 @@ int build_e820_table(struct e820entry *e820,
 nr++;
 }
 
+/* Low RAM goes here. Reserve space for special pages. */
+BUG_ON(low_mem_end < (2u << 20));
 
-if ( hvm_info->high_mem_pgend )
+/*
+ * Construct E820 table according to recorded memory map.
+ *
+ * The memory map created by toolstack may include,
+ *
+ * #1. Low memory region
+ *
+ * Low RAM starts at least from 1M to make sure all standard regions
+ * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios,
+ * have enough space.
+ *
+ * #2. Reserved regions if they exist
+ *
+ * #3. High memory region if it exists
+ *
+ * Note we just have one low memory entry and one high memory entry if
+ * exists.
+ *
+ * But we may have relocated RAM to allocate sufficient MMIO previously
+ * so low_mem_pgend would be changed over there. And here memory_map[]
+ * records the original low/high memory, so if low_mem_end is less than
+ * the original we need to revise low/high memory range firstly.
+ */
+for ( i = 0; i < memory_map.nr_map; i++ )
 {
-e820[nr].addr = ((uint64_t)1 << 32);
-e820[nr].size =
-((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr;
-e820[nr].type = E820_RAM;
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+map_end = map_start + map_size;
+
+/* If we need to adjust lowmem. */
+if ( memory_map.map[i].type == E820_RAM &&
+ low_mem_end > map_start && low_mem_end < map_end )
+{
+add_high_mem = map_end - low_mem_end;
+memory_map.map[i].size = low_mem_end - map_start;
+break;
+}
+}
+
+/* If we need to adjust highmem. */
+if ( add_high_mem )
+{
+/* Modify the existing highmem region if it exists. */
+for ( i = 0; i < memory_map.nr_map; i++ )
+{
+map_start = memory_map.map[i].addr;
+map_size = memory_map.map[i].size;
+

[Xen-devel] [v11][PATCH 09/16] tools: extend xc_assign_device() to support rdm reservation policy

2015-07-21 Thread Tiejun Chen

This patch passes rdm reservation policy to xc_assign_device() so the policy
is checked when assigning devices to a VM.

Note this also brings some fallout to the python usage of xc_assign_device().

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
CC: David Scott 
Acked-by: Wei Liu 
Signed-off-by: Tiejun Chen 
---
v6 ~ v11:

* Nothing is changed.

v5:

* Fix the flag field to "0" for DT devices

v4:

* In the patch head description, explain why we need to sync
  the xc.c file

 tools/libxc/include/xenctrl.h   |  3 ++-
 tools/libxc/xc_domain.c |  9 -
 tools/libxl/libxl_pci.c |  3 ++-
 tools/ocaml/libs/xc/xenctrl_stubs.c | 16 
 tools/python/xen/lowlevel/xc/xc.c   | 30 --
 5 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 2991333..5c535c4 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2067,7 +2067,8 @@ int xc_hvm_destroy_ioreq_server(xc_interface *xch,
 /* HVM guest pass-through */
 int xc_assign_device(xc_interface *xch,
  uint32_t domid,
- uint32_t machine_sbdf);
+ uint32_t machine_sbdf,
+ uint32_t flag);
 
 int xc_get_device_group(xc_interface *xch,
  uint32_t domid,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 298b3b5..69e6d8f 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1697,7 +1697,8 @@ int xc_domain_setdebugging(xc_interface *xch,
 int xc_assign_device(
 xc_interface *xch,
 uint32_t domid,
-uint32_t machine_sbdf)
+uint32_t machine_sbdf,
+uint32_t flag)
 {
 DECLARE_DOMCTL;
 
@@ -1705,6 +1706,7 @@ int xc_assign_device(
 domctl.domain = domid;
 domctl.u.assign_device.dev = XEN_DOMCTL_DEV_PCI;
 domctl.u.assign_device.u.pci.machine_sbdf = machine_sbdf;
+domctl.u.assign_device.flag = flag;
 
 return do_domctl(xch, &domctl);
 }
@@ -1792,6 +1794,11 @@ int xc_assign_dt_device(
 
 domctl.u.assign_device.dev = XEN_DOMCTL_DEV_DT;
 domctl.u.assign_device.u.dt.size = size;
+/*
+ * DT doesn't own any RDM so actually DT has nothing to do
+ * for any flag and here just fix that as 0.
+ */
+domctl.u.assign_device.flag = 0;
 set_xen_guest_handle(domctl.u.assign_device.u.dt.path, path);
 
 rc = do_domctl(xch, &domctl);
diff --git a/tools/libxl/libxl_pci.c b/tools/libxl/libxl_pci.c
index e0743f8..632c15e 100644
--- a/tools/libxl/libxl_pci.c
+++ b/tools/libxl/libxl_pci.c
@@ -894,6 +894,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
 FILE *f;
 unsigned long long start, end, flags, size;
 int irq, i, rc, hvm = 0;
+uint32_t flag = XEN_DOMCTL_DEV_RDM_RELAXED;
 
 if (type == LIBXL_DOMAIN_TYPE_INVALID)
 return ERROR_FAIL;
@@ -987,7 +988,7 @@ static int do_pci_add(libxl__gc *gc, uint32_t domid, libxl_device_pci *pcidev, i
 
 out:
 if (!libxl_is_stubdom(ctx, domid, NULL)) {
-rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev));
+rc = xc_assign_device(ctx->xch, domid, pcidev_encode_bdf(pcidev), flag);
 if (rc < 0 && (hvm || errno != ENOSYS)) {
 LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_assign_device failed");
 return ERROR_FAIL;
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index 64f1137..b7de615 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -1172,12 +1172,17 @@ CAMLprim value stub_xc_domain_test_assign_device(value xch, value domid, value d
CAMLreturn(Val_bool(ret == 0));
 }
 
-CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc)
+static int domain_assign_device_rdm_flag_table[] = {
+XEN_DOMCTL_DEV_RDM_RELAXED,
+};
+
+CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc,
+value rflag)
 {
-   CAMLparam3(xch, domid, desc);
+   CAMLparam4(xch, domid, desc, rflag);
int ret;
int domain, bus, dev, func;
-   uint32_t sbdf;
+   uint32_t sbdf, flag;
 
domain = Int_val(Field(desc, 0));
bus = Int_val(Field(desc, 1));
@@ -1185,7 +1190,10 @@ CAMLprim value stub_xc_domain_assign_device(value xch, value domid, value desc)
func = Int_val(Field(desc, 3));
sbdf = encode_sbdf(domain, bus, dev, func);
 
-   ret = xc_assign_device(_H(xch), _D(domid), sbdf);
+   ret = Int_val(Field(rflag, 0));
+   flag = domain_assign_device_rdm_flag_table[ret];
+
+   ret = xc_assign_device(_H(xch), _D(domid), sbdf, flag);
 
if (ret < 0)
failwith_xc(_H(xch));
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index ee3e1d0..c8380d1 100644
--- a/tools/python/

[Xen-devel] [v11][PATCH 01/16] xen: introduce XENMEM_reserved_device_memory_map

2015-07-21 Thread Tiejun Chen

From: Jan Beulich 

This is a prerequisite for punching holes into HVM and PVH guests' P2M
to allow passing through devices that are associated with (on VT-d)
RMRRs.

CC: Jan Beulich 
CC: Yang Zhang 
CC: Kevin Tian 
Signed-off-by: Jan Beulich 
Signed-off-by: Tiejun Chen 
Acked-by: Kevin Tian 
---
v7 ~ v11:

* Nothing is changed.

v6:

* Add a comment to the nr_entries field inside xen_reserved_device_memory_map

v5 ~ v4:

* Nothing is changed.

 xen/common/compat/memory.c   | 66 
 xen/common/memory.c  | 64 ++
 xen/drivers/passthrough/iommu.c  | 10 ++
 xen/drivers/passthrough/vtd/dmar.c   | 32 +
 xen/drivers/passthrough/vtd/extern.h |  1 +
 xen/drivers/passthrough/vtd/iommu.c  |  1 +
 xen/include/public/memory.h  | 37 +++-
 xen/include/xen/iommu.h  | 10 ++
 xen/include/xen/pci.h|  2 ++
 xen/include/xlat.lst |  3 +-
 10 files changed, 224 insertions(+), 2 deletions(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index b258138..b608496 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -17,6 +17,45 @@ CHECK_TYPE(domid);
 CHECK_mem_access_op;
 CHECK_vmemrange;
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+struct compat_reserved_device_memory_map map;
+unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
+{
+struct get_reserved_device_memory *grdm = ctxt;
+u32 sbdf;
+struct compat_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};
+
+sbdf = PCI_SBDF2(grdm->map.seg, grdm->map.bus, grdm->map.devfn);
+if ( (grdm->map.flag & PCI_DEV_RDM_ALL) || (sbdf == id) )
+{
+if ( grdm->used_entries < grdm->map.nr_entries )
+{
+if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+return -ERANGE;
+
+if ( __copy_to_compat_offset(grdm->map.buffer,
+ grdm->used_entries,
+ &rdm,
+ 1) )
+{
+return -EFAULT;
+}
+}
+++grdm->used_entries;
+return 1;
+}
+
+return 0;
+}
+#endif
+
 int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 {
 int split, op = cmd & MEMOP_CMD_MASK;
@@ -303,6 +342,33 @@ int compat_memory_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) compat)
 break;
 }
 
+#ifdef HAS_PASSTHROUGH
+case XENMEM_reserved_device_memory_map:
+{
+struct get_reserved_device_memory grdm;
+
+if ( copy_from_guest(&grdm.map, compat, 1) ||
+ !compat_handle_okay(grdm.map.buffer, grdm.map.nr_entries) )
+return -EFAULT;
+
+grdm.used_entries = 0;
+rc = iommu_get_reserved_device_memory(get_reserved_device_memory,
+  &grdm);
+
+if ( !rc && grdm.map.nr_entries < grdm.used_entries )
+rc = -ENOBUFS;
+
+grdm.map.nr_entries = grdm.used_entries;
+if ( grdm.map.nr_entries )
+{
+if ( __copy_to_guest(compat, &grdm.map, 1) )
+rc = -EFAULT;
+}
+
+return rc;
+}
+#endif
+
 default:
 return compat_arch_memory_op(cmd, compat);
 }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e5d49d8..2fa45d0 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -748,6 +748,43 @@ static int construct_memop_from_reservation(
 return 0;
 }
 
+#ifdef HAS_PASSTHROUGH
+struct get_reserved_device_memory {
+struct xen_reserved_device_memory_map map;
+unsigned int used_entries;
+};
+
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
+{
+struct get_reserved_device_memory *grdm = ctxt;
+u32 sbdf;
+
+sbdf = PCI_SBDF2(grdm->map.seg, grdm->map.bus, grdm->map.devfn);
+if ( (grdm->map.flag & PCI_DEV_RDM_ALL) || (sbdf == id) )
+{
+if ( grdm->used_entries < grdm->map.nr_entries )
+{
+struct xen_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};
+
+if ( __copy_to_guest_offset(grdm->map.buffer,
+grdm->used_entries,
+&rdm,
+1) )
+{
+return -EFAULT;
+}
+}
+++grdm->used_entries;
+return 1;
+}
+
+return 0;
+}
+#endif
+
 long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void

[Xen-devel] [v11][PATCH 06/16] hvmloader/pci: Try to avoid placing BARs in RMRRs

2015-07-21 Thread Tiejun Chen

Try to avoid placing PCI BARs over RMRRs:

- If mmio_hole_size is not specified, and the existing MMIO range has
  RMRRs in it, and there is space to expand the hole in lowmem without
  moving more memory, then make the MMIO hole as large as possible.

- When placing BARs, find the next RMRR higher than the current base
  in the lowmem mmio hole.  If it overlaps, skip ahead of it and find
  the next one.

This certainly won't work in all cases, but it should work in a
significant number of cases.  Additionally, users should be able to
work around problems by setting mmio_hole_size larger in the guest
config.
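
The "skip ahead" idea can be sketched as below. This is only an
illustration under stated assumptions, not hvmloader's implementation:
struct region and place_bar() are hypothetical names, the BAR size is
assumed to be a power of two, and the result is kept naturally aligned to
that size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one reserved (RMRR) range. */
struct region {
    uint64_t addr;
    uint64_t size;
};

/* Half-open overlap test between [start, start+size) and region r. */
static bool overlaps(uint64_t start, uint64_t size, const struct region *r)
{
    return start + size > r->addr && start < r->addr + r->size;
}

/*
 * Return a size-aligned address at or above base where a BAR of the given
 * size (a power of two) avoids every reserved region, or 0 if it does not
 * fit below limit.
 */
static uint64_t place_bar(uint64_t base, uint64_t size, uint64_t limit,
                          const struct region *rsv, size_t nr)
{
    bool moved;

    base = (base + size - 1) & ~(size - 1);
    do {
        moved = false;
        for (size_t i = 0; i < nr; i++) {
            if (overlaps(base, size, &rsv[i])) {
                /* Skip past the conflicting region and realign. */
                base = (rsv[i].addr + rsv[i].size + size - 1) & ~(size - 1);
                moved = true;
            }
        }
    } while (moved && base + size <= limit);

    return (base + size <= limit) ? base : 0;
}
```

Each skip moves base strictly upwards, so the loop terminates. A return
value of 0 corresponds to the case where the hole is too small and the
user would have to raise mmio_hole_size.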

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Reviewed-by: Jan Beulich 
Signed-off-by: George Dunlap 
Signed-off-by: Tiejun Chen 
---
v11:

* Find the lowest RMRR whose _end_ is higher than base.

* Refine some code implementations 

v10:

* This is from George's draft patch, which implements an acceptable solution for
  the current cycle. Here I just implemented check_overlap_all() and some cleanups.

v9:

* A small improvement to the code, but again, this solution is still being
  argued about.

v8:

* Based on the current discussion, it's hard to reshape the original mmio
  allocation mechanism, and we don't have a good, simple way to do so in the
  short term. So instead of adding more complexity to that process, we just
  check for any conflicts and disable all associated devices.

v6 ~ v7:

* Nothing is changed.

v5:

* Rename that field, is_64bar, inside struct bars with flag, and
  then extend to also indicate if this bar is already allocated.

v4:

* We have to re-design this as follows:

  #1. Goal

  MMIO region should exclude all reserved device memory

  #2. Requirements

  #2.1 Still need to make sure the MMIO region fits all pci devices as before

  #2.2 Accommodate unaligned reserved memory regions

  If I'm missing something let me know.

  #3. How to

  #3.1 Address #2.1

  We need to either populate more RAM or expand highmem. But only 64bit-bars
  can work with highmem, and as you mentioned we should also avoid expanding
  highmem where possible. So my implementation allocates 32bit-bars and
  64bit-bars in order.

  1>. The first allocation round covers only 32bit-bars

  If we can finish allocating all 32bit-bars, we go on to allocate 64bit-bars
  with all remaining resources, including low pci memory.

  If not, we need to calculate how much RAM should be populated to allocate the
  remaining 32bit-bars, then populate sufficient RAM as exp_mem_resource and go
  to the second allocation round 2>.

  2>. The second allocation round covers the remaining 32bit-bars

  In theory we should be able to finish allocating all 32bit-bars, then go to
  the third allocation round 3>.

  3>. The third allocation round covers 64bit-bars

  We first try to allocate from the remaining low memory resource. If that
  isn't enough, we try to expand highmem to allocate the 64bit-bars. This
  process should be the same as the original.

  #3.2 Address #2.2

  I'm trying to accommodate the unaligned reserved memory regions:

  We should skip all reserved device memory, but we also need to check whether
  other smaller bars can be allocated if a mmio hole exists between
  resource->base and reserved device memory. If a hole exists between base and
  reserved device memory, we simply move on and try to allocate the next bar,
  since all bars are in descending order of size. If not, we need to move
  resource->base to reserved_end and reallocate this bar.

 tools/firmware/hvmloader/pci.c | 65 ++
 1 file changed, 65 insertions(+)

diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 5ff87a7..74fc080 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -38,6 +38,46 @@ uint64_t pci_hi_mem_start = 0, pci_hi_mem_end = 0;
 enum virtual_vga virtual_vga = VGA_none;
 unsigned long igd_opregion_pgbase = 0;
 
+/* Check if the specified range conflicts with any reserved device memory. */
+static bool check_overlap_all(uint64_t start, uint64_t size)
+{
+unsigned int i;
+
+for ( i = 0; i < memory_map.nr_map; i++ )
+{
+if ( memory_map.map[i].type == E820_RESERVED &&
+ check_overlap(start, size,
+   memory_map.map[i].addr,
+   memory_map.map[i].size) )
+return true;
+}
+
+return false;
+}
+
+/* Find the lowest RMRR higher than base. */
+static int find_next_rmrr(uint32_t base)
+{
+unsigned int i;
+int next_rmrr = -1;
+uint64_t end, min_end = (1ull << 32);
+
+for ( i = 0; i < memory_map.nr_map ; i++ )
+{
+end = memory_map.map[i].addr + memory_map.map[i].size;
+
+if ( memory_map.map[i].type == E820_RESERVED &&
+ end > base &&
+ end < min_end )
+{
+next

[Xen-devel] [v11][PATCH 05/16] hvmloader: get guest memory map into memory_map[]

2015-07-21 Thread Tiejun Chen

Now we get this map layout by calling XENMEM_memory_map and then
save it into one global variable, memory_map[]. It should
include the lowmem range, rdm range and highmem range. Note the
rdm range and highmem range may not exist in some cases.

And here we need to check whether any reserved memory conflicts with
[RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END).
This range is used to allocate memory at the hvmloader level, and
we make hvmloader fail in case of a conflict, since this is
another rare possibility in the real world.
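
To make the half-open range semantics concrete, here is a small sketch. It
is an illustration only: DYN_START and DYN_END are placeholder values
standing in for RESERVED_MEMORY_DYNAMIC_START and
RESERVED_MEMORY_DYNAMIC_END (the real bounds live in hvmloader's headers),
and struct range and conflicts_with_dynamic() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder window bounds (assumption); DYN_END is exclusive. */
#define DYN_START 0xFC001000ULL
#define DYN_END   0xFE000000ULL

struct range {
    uint64_t addr;
    uint64_t size;
};

/* True if [a_start, a_start+a_size) and [b_start, b_start+b_size) intersect. */
static bool ranges_overlap(uint64_t a_start, uint64_t a_size,
                           uint64_t b_start, uint64_t b_size)
{
    return a_start + a_size > b_start && a_start < b_start + b_size;
}

/* True if any reserved range intersects the dynamic allocation window. */
static bool conflicts_with_dynamic(const struct range *rsv, size_t nr)
{
    for (size_t i = 0; i < nr; i++) {
        if (ranges_overlap(DYN_START, DYN_END - DYN_START,
                           rsv[i].addr, rsv[i].size))
            return true;
    }
    return false;
}
```

Because the end is exclusive, a reserved range that ends exactly at
DYN_START or starts exactly at DYN_END does not count as a conflict.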

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Signed-off-by: Tiejun Chen 
Reviewed-by: Kevin Tian 
Reviewed-by: George Dunlap 
Acked-by: Jan Beulich 
---
v10 ~ v11:

* Nothing is changed.

v9:

* Correct [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END]
-> [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END) in
  the patch head description;
  Merge two if{} as one if{};

v8:

* Actually we should check this range started from
  RESERVED_MEMORY_DYNAMIC_START, not RESERVED_MEMORY_DYNAMIC_START - 1.
  So correct this and sync the patch head description.

v5 ~ v7:

* Nothing is changed.

v4:

* Move some code related to e820 to that specific file, e820.c.

* Consolidate "printf()+BUG()" and "BUG_ON()"

* Avoid another fixed width type for the parameter of get_mem_mapping_layout()

 tools/firmware/hvmloader/e820.c  | 32 
 tools/firmware/hvmloader/e820.h  |  7 +++
 tools/firmware/hvmloader/hvmloader.c |  2 ++
 tools/firmware/hvmloader/util.c  | 26 ++
 tools/firmware/hvmloader/util.h  | 12 
 5 files changed, 79 insertions(+)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 2e05e93..7a414ab 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -23,6 +23,38 @@
 #include "config.h"
 #include "util.h"
 
+struct e820map memory_map;
+
+void memory_map_setup(void)
+{
+unsigned int nr_entries = E820MAX, i;
+int rc;
+uint64_t alloc_addr = RESERVED_MEMORY_DYNAMIC_START;
+uint64_t alloc_size = RESERVED_MEMORY_DYNAMIC_END - alloc_addr;
+
+rc = get_mem_mapping_layout(memory_map.map, &nr_entries);
+
+if ( rc || !nr_entries )
+{
+printf("Get guest memory maps[%d] failed. (%d)\n", nr_entries, rc);
+BUG();
+}
+
+memory_map.nr_map = nr_entries;
+
+for ( i = 0; i < nr_entries; i++ )
+{
+if ( memory_map.map[i].type == E820_RESERVED &&
+ check_overlap(alloc_addr, alloc_size,
+   memory_map.map[i].addr, memory_map.map[i].size) )
+{
+printf("Fail to setup memory map due to conflict");
+printf(" on dynamic reserved memory range.\n");
+BUG();
+}
+}
+}
+
 void dump_e820_table(struct e820entry *e820, unsigned int nr)
 {
 uint64_t last_end = 0, start, end;
diff --git a/tools/firmware/hvmloader/e820.h b/tools/firmware/hvmloader/e820.h
index b2ead7f..8b5a9e0 100644
--- a/tools/firmware/hvmloader/e820.h
+++ b/tools/firmware/hvmloader/e820.h
@@ -15,6 +15,13 @@ struct e820entry {
 uint32_t type;
 } __attribute__((packed));
 
+#define E820MAX 128
+
+struct e820map {
+unsigned int nr_map;
+struct e820entry map[E820MAX];
+};
+
 #endif /* __HVMLOADER_E820_H__ */
 
 /*
diff --git a/tools/firmware/hvmloader/hvmloader.c b/tools/firmware/hvmloader/hvmloader.c
index 25b7f08..84c588c 100644
--- a/tools/firmware/hvmloader/hvmloader.c
+++ b/tools/firmware/hvmloader/hvmloader.c
@@ -262,6 +262,8 @@ int main(void)
 
 init_hypercalls();
 
+memory_map_setup();
+
 xenbus_setup();
 
 bios = detect_bios();
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 80d822f..122e3fa 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -27,6 +27,17 @@
 #include 
 #include 
 
+/*
+ * Check whether there exists overlap in the specified memory range.
+ * Returns true if exists, else returns false.
+ */
+bool check_overlap(uint64_t start, uint64_t size,
+   uint64_t reserved_start, uint64_t reserved_size)
+{
+return (start + size > reserved_start) &&
+(start < reserved_start + reserved_size);
+}
+
 void wrmsr(uint32_t idx, uint64_t v)
 {
 asm volatile (
@@ -368,6 +379,21 @@ uuid_to_string(char *dest, uint8_t *uuid)
 *p = '\0';
 }
 
+int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
+{
+int rc;
+struct xen_memory_map memmap = {
+.nr_entries = *max_entries
+};
+
+set_xen_guest_handle(memmap.buffer, entries);
+
+rc = hypercall_memory_op(XENMEM_memory_map, &memmap);
+*max_entries = memmap.nr_entries;
+
+return rc;
+}
+
 void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t nr_mfns)
 {
 static int over_allocated;
diff --git a/tools/firmware/hvmloader/util

[Xen-devel] [v11][PATCH 16/16] tools: parse to enable new rdm policy parameters

2015-07-21 Thread Tiejun Chen

This patch parses user-configurable parameters that specify the RDM
resource and the corresponding policies defined previously:

Global RDM parameter:
rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

The default per-device RDM policy is the same as the default global RDM
policy, 'relaxed'. As with other options, the per-device policy overrides
the global policy.
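
For illustration, the global parameter string could be parsed along the
following lines. This is a simplified sketch and not the state machine the
patch adds to libxlu: struct rdm_opts, enum rdm_policy and parse_rdm() are
hypothetical names, and the 'relaxed' default follows the description above.

```c
#include <stdbool.h>
#include <string.h>

enum rdm_policy { POLICY_STRICT, POLICY_RELAXED };

/* Hypothetical result structure for the parsed "rdm" string. */
struct rdm_opts {
    bool strategy_host;
    enum rdm_policy policy;
};

/* Parse "key=value[,key=value]". Returns 0 on success, -1 on bad input. */
static int parse_rdm(const char *str, struct rdm_opts *out)
{
    char buf[128];
    char *tok = buf;

    if (strlen(str) >= sizeof(buf))
        return -1;
    strcpy(buf, str);

    out->strategy_host = false;
    out->policy = POLICY_RELAXED;      /* default when no policy is given */

    while (tok && *tok) {
        char *next = strchr(tok, ',');
        char *val;

        if (next)
            *next++ = '\0';            /* terminate this key=value pair */
        val = strchr(tok, '=');
        if (!val)
            return -1;
        *val++ = '\0';                 /* split key from value */

        if (!strcmp(tok, "strategy")) {
            if (strcmp(val, "host"))
                return -1;
            out->strategy_host = true;
        } else if (!strcmp(tok, "policy")) {
            if (!strcmp(val, "strict"))
                out->policy = POLICY_STRICT;
            else if (!strcmp(val, "relaxed"))
                out->policy = POLICY_RELAXED;
            else
                return -1;
        } else {
            return -1;
        }
        tok = next;
    }
    return 0;
}
```

Unlike the patch, this sketch accepts the keys in any order. A per-device
rdm_policy value would then simply replace the global policy for that
device.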

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Acked-by: Wei Liu 
Signed-off-by: Tiejun Chen 
---
v9 ~ v11:

* Nothing is changed.

v8:

* Clean up some code style issues.

v7:

* Just sync with the fallout of renaming parameters from patch #10.

v6:

* Just sync those renames introduced by patch #10.

v5:

* Need a rebase after we make all rdm variables specific to .hvm.
* Like other pci option, the per-device policy always follows
  the global policy by default.

v4:

* Separated from the current patch #11 to parse/enable our rdm policy parameters,
  since this makes a lot of sense and this stuff is specific to xl/libxlu.

 tools/libxl/libxlu_pci.c | 92 +++-
 tools/libxl/libxlutil.h  |  4 +++
 tools/libxl/xl_cmdimpl.c | 13 +++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxlu_pci.c b/tools/libxl/libxlu_pci.c
index 26fb143..026413b 100644
--- a/tools/libxl/libxlu_pci.c
+++ b/tools/libxl/libxlu_pci.c
@@ -42,6 +42,9 @@ static int pcidev_struct_fill(libxl_device_pci *pcidev, unsigned int domain,
 #define STATE_OPTIONS_K 6
 #define STATE_OPTIONS_V 7
 #define STATE_TERMINAL  8
+#define STATE_TYPE  9
+#define STATE_RDM_STRATEGY  10
+#define STATE_RESERVE_POLICY 11
 int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char *str)
 {
 unsigned state = STATE_DOMAIN;
@@ -143,7 +146,18 @@ int xlu_pci_parse_bdf(XLU_Config *cfg, libxl_device_pci *pcidev, const char *str
 pcidev->permissive = atoi(tok);
 }else if ( !strcmp(optkey, "seize") ) {
 pcidev->seize = atoi(tok);
-}else{
+} else if (!strcmp(optkey, "rdm_policy")) {
+if (!strcmp(tok, "strict")) {
+pcidev->rdm_policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
+} else if (!strcmp(tok, "relaxed")) {
+pcidev->rdm_policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+} else {
+XLU__PCI_ERR(cfg, "%s is not an valid PCI RDM property"
+  " policy: 'strict' or 'relaxed'.",
+ tok);
+goto parse_error;
+}
+} else {
 XLU__PCI_ERR(cfg, "Unknown PCI BDF option: %s", optkey);
 }
 tok = ptr + 1;
@@ -167,6 +181,82 @@ parse_error:
 return ERROR_INVAL;
 }
 
+int xlu_rdm_parse(XLU_Config *cfg, libxl_rdm_reserve *rdm, const char *str)
+{
+unsigned state = STATE_TYPE;
+char *buf2, *tok, *ptr, *end;
+
+if (NULL == (buf2 = ptr = strdup(str)))
+return ERROR_NOMEM;
+
+for (tok = ptr, end = ptr + strlen(ptr) + 1; ptr < end; ptr++) {
+switch(state) {
+case STATE_TYPE:
+if (*ptr == '=') {
+state = STATE_RDM_STRATEGY;
+*ptr = '\0';
+if (strcmp(tok, "strategy")) {
+XLU__PCI_ERR(cfg, "Unknown RDM state option: %s", tok);
+goto parse_error;
+}
+tok = ptr + 1;
+}
+break;
+case STATE_RDM_STRATEGY:
+if (*ptr == '\0' || *ptr == ',') {
+state = STATE_RESERVE_POLICY;
+*ptr = '\0';
+if (!strcmp(tok, "host")) {
+rdm->strategy = LIBXL_RDM_RESERVE_STRATEGY_HOST;
+} else {
+XLU__PCI_ERR(cfg, "Unknown RDM strategy option: %s", tok);
+goto parse_error;
+}
+tok = ptr + 1;
+}
+break;
+case STATE_RESERVE_POLICY:
+if (*ptr == '=') {
+state = STATE_OPTIONS_V;
+*ptr = '\0';
+if (strcmp(tok, "policy")) {
+XLU__PCI_ERR(cfg, "Unknown RDM property value: %s", tok);
+goto parse_error;
+}
+tok = ptr + 1;
+}
+break;
+case STATE_OPTIONS_V:
+if (*ptr == ',' || *ptr == '\0') {
+state = STATE_TERMINAL;
+*ptr = '\0';
+if (!strcmp(tok, "strict")) {
+rdm->policy = LIBXL_RDM_RESERVE_POLICY_STRICT;
+} else if (!strcmp(tok, "relaxed")) {
+rdm->policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+} else {

[Xen-devel] [v11][PATCH 15/16] xen/vtd: prevent from assign the device with shared rmrr

2015-07-21 Thread Tiejun Chen

Currently we simply refuse to assign this kind of device with a
shared RMRR, since in our previous experience a shared RMRR is
a rare case. But later we can group the devices that share an
RMRR, and then allow all devices within a group to be assigned
to the same domain.
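
The check can be pictured as follows. This is a standalone sketch with
simplified, hypothetical types (struct rmrr_unit,
device_has_shared_rmrr()), not Xen's acpi_rmrr_unit or
for_each_rmrr_device(): a device is rejected when it appears in the device
scope of an RMRR that lists more than one device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for one RMRR and its device scope (assumption). */
struct rmrr_unit {
    uint16_t segment;
    const uint16_t *bdfs;     /* bus/device/function of each scoped device */
    size_t devices_cnt;
};

/* True if (seg, bdf) is covered by an RMRR that other devices also use. */
static bool device_has_shared_rmrr(const struct rmrr_unit *units, size_t nr,
                                   uint16_t seg, uint16_t bdf)
{
    for (size_t i = 0; i < nr; i++) {
        if (units[i].segment != seg || units[i].devices_cnt <= 1)
            continue;
        for (size_t j = 0; j < units[i].devices_cnt; j++) {
            if (units[i].bdfs[j] == bdf)
                return true;
        }
    }
    return false;
}
```

An assignment path would return an error such as -EPERM when this
predicate is true, before any ownership change takes place.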

CC: Yang Zhang 
CC: Kevin Tian 
Signed-off-by: Tiejun Chen 
Acked-by: Kevin Tian 
---
v10 ~ v11:

* Nothing is changed.

v9:

* Correct one indentation issue

v8:

* Merge two if{} as one if{}

* Print RMRR range info when refusing to assign a group device

v5 ~ v7:

* Nothing is changed.

v4:

* Refine one code comment.

 xen/drivers/passthrough/vtd/iommu.c | 30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 8a8d763..ce5c295 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2293,13 +2293,37 @@ static int intel_iommu_assign_device(
 if ( list_empty(&acpi_drhd_units) )
 return -ENODEV;
 
+seg = pdev->seg;
+bus = pdev->bus;
+/*
+ * In rare cases one given rmrr is shared by multiple devices but
+ * obviously this would put the security of a system at risk. So
+ * we should prevent from this sort of device assignment.
+ *
+ * TODO: in the future we can introduce group device assignment
+ * interface to make sure devices sharing RMRR are assigned to the
+ * same domain together.
+ */
+for_each_rmrr_device( rmrr, bdf, i )
+{
+if ( rmrr->segment == seg &&
+ PCI_BUS(bdf) == bus &&
+ PCI_DEVFN2(bdf) == devfn &&
+ rmrr->scope.devices_cnt > 1 )
+{
+printk(XENLOG_G_ERR VTDPREFIX
+   " cannot assign %04x:%02x:%02x.%u"
+   " with shared RMRR at %"PRIx64" for Dom%d.\n",
+   seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+   rmrr->base_address, d->domain_id);
+return -EPERM;
+}
+}
+
 ret = reassign_device_ownership(hardware_domain, d, devfn, pdev);
 if ( ret )
 return ret;
 
-seg = pdev->seg;
-bus = pdev->bus;
-
 /* Setup rmrr identity mapping */
 for_each_rmrr_device( rmrr, bdf, i )
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [v11][PATCH 12/16] tools: introduce a new parameter to set a predefined rdm boundary

2015-07-21 Thread Tiejun Chen

Previously we always fixed that predefined boundary at 2G to handle
conflicts between memory and rdm, but now this predefined boundary
can be changed with the parameter "rdm_mem_boundary" in the .cfg file.
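
The unit handling is easy to get wrong, so here is a small arithmetic
sketch. The macro value mirrors the default the patch defines, while
RDM_MEM_BOUNDARY_MEMKB_DEFAULT, mb_to_memkb() and memkb_to_bytes() are
hypothetical names used only for illustration. The assumption is that the
.cfg value is given in megabytes, stored in KiB, and converted to bytes
before it is compared with addresses.

```c
#include <stdint.h>

/* 2 GiB expressed in KiB, matching the default in the patch. */
#define RDM_MEM_BOUNDARY_MEMKB_DEFAULT (2048 * 1024)

/* Config value in megabytes -> KiB as stored in the build info. */
static uint64_t mb_to_memkb(uint64_t mb)
{
    return mb * 1024;
}

/* KiB -> byte address used when checking for RDM conflicts. */
static uint64_t memkb_to_bytes(uint64_t memkb)
{
    return memkb * 1024;
}
```

With the default, memkb_to_bytes(RDM_MEM_BOUNDARY_MEMKB_DEFAULT) yields
0x80000000, the 2G value that was previously hard-coded.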

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Acked-by: Wei Liu 
Acked-by: Ian Jackson 
Signed-off-by: Tiejun Chen 
---
v8 ~ v10:

* Nothing is changed.

v7:

* Just sync with the fallout of renaming parameters from patch #10.

v6:

* Nothing is changed.

v5:

* Make this variable "rdm_mem_boundary_memkb" specific to .hvm 

v4:

* Separated from the previous patch to provide a parameter to set that
  predefined boundary dynamically.

 docs/man/xl.cfg.pod.5   | 22 ++
 tools/libxl/libxl.h |  6 ++
 tools/libxl/libxl_create.c  |  4 
 tools/libxl/libxl_dom.c |  8 +---
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/xl_cmdimpl.c|  3 +++
 6 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index e6e0f70..ce7ce85 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -845,6 +845,28 @@ More information about Xen gfx_passthru feature is 
available
 on the XenVGAPassthrough L
 wiki page.
 
+=item B
+
+Number of megabytes to set a boundary for checking rdm conflict.
+
+When RDM conflicts with RAM, RDM probably scatter the whole RAM space.
+Especially multiple RDM entries would worsen this to lead a complicated
+memory layout. So here we're trying to figure out a simple solution to
+avoid breaking existing layout. So when a conflict occurs,
+
+#1. Above a predefined boundary
+- move lowmem_end below reserved region to solve conflict;
+
+#2. Below a predefined boundary
+- Check strict/relaxed policy.
+"strict" policy leads to fail libxl. Note when both policies
+are specified on a given region, 'strict' is always preferred.
+"relaxed" policy issue a warning message and also mask this
+entry INVALID to indicate we shouldn't expose this entry to
+hvmloader.
+
+Here the default is 2G.
+
 =item B
 
 Specifies the host device tree nodes to passthrough to this guest. Each
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5a7308d..927b2d8 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -909,6 +909,12 @@ const char *libxl_defbool_to_string(libxl_defbool b);
 #define LIBXL_TIMER_MODE_DEFAULT -1
 #define LIBXL_MEMKB_DEFAULT ~0ULL
 
+/*
+ * We'd like to set a memory boundary to determine if we need to check
+ * any overlap with reserved device memory.
+ */
+#define LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT (2048 * 1024)
+
 #define LIBXL_MS_VM_GENID_LEN 16
 typedef struct {
 uint8_t bytes[LIBXL_MS_VM_GENID_LEN];
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index 5b57062..b27c53a 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -54,6 +54,10 @@ void libxl__rdm_setdefault(libxl__gc *gc, 
libxl_domain_build_info *b_info)
 {
 if (b_info->u.hvm.rdm.policy == LIBXL_RDM_RESERVE_POLICY_INVALID)
 b_info->u.hvm.rdm.policy = LIBXL_RDM_RESERVE_POLICY_RELAXED;
+
+if (b_info->u.hvm.rdm_mem_boundary_memkb == LIBXL_MEMKB_DEFAULT)
+b_info->u.hvm.rdm_mem_boundary_memkb =
+LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
 }
 
 int libxl__domain_build_info_setdefault(libxl__gc *gc,
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 9af3b21..0b7c39d 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -930,12 +930,6 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 int ret, rc = ERROR_FAIL;
 uint64_t mmio_start, lowmem_end, highmem_end;
 libxl_domain_build_info *const info = &d_config->b_info;
-/*
- * Currently we fix this as 2G to guarantee how to handle
- * our rdm policy. But we'll provide a parameter to set
- * this dynamically.
- */
-uint64_t rdm_mem_boundary = 0x80000000;
 
 memset(&args, 0, sizeof(struct xc_hvm_build_args));
 /* The params from the configuration file are in Mb, which are then
@@ -974,7 +968,7 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
 args.mmio_start = mmio_start;
 
 rc = libxl__domain_device_construct_rdm(gc, d_config,
-rdm_mem_boundary,
+
info->u.hvm.rdm_mem_boundary_memkb*1024,
 &args);
 if (rc) {
 LOG(ERROR, "checking reserved device memory failed");
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 157fa59..9caaf44 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -503,6 +503,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("ms_vm_genid",  libxl_ms_vm_genid),
("serial_lis

[Xen-devel] [v11][PATCH 02/16] xen/vtd: create RMRR mapping

2015-07-21 Thread Tiejun Chen

RMRR reserved regions must be set up in the pfn space with an identity
mapping to the reported mfn. However, the existing code fails to set up
the correct mapping when VT-d shares the EPT page table, which causes
problems when assigning devices (e.g. a GPU) that have an RMRR reported.
So instead, this patch sets up the identity mapping in the p2m layer,
regardless of whether EPT is shared or not. We still keep creating the
VT-d table.

We also need to introduce a pair of helpers to create/clear this sort
of identity mapping, as follows:

set_identity_p2m_entry():

If the gfn space is unoccupied, we just set the mapping. If the space
is already occupied by the desired identity mapping, do nothing.
Otherwise, failure is returned.

clear_identity_p2m_entry():

We just define a macro to wrap guest_physmap_remove_page(), which now
returns a value as necessary.
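
The three outcomes described for set_identity_p2m_entry() can be sketched with a toy model. This is only an illustration under stated assumptions: the array-backed table, the `toy_` names and the integer access value are hypothetical stand-ins, not Xen's p2m types or locking.

```c
#include <errno.h>
#include <stdint.h>

/* Toy stand-ins for p2m entry types; not the hypervisor definitions. */
enum toy_type { TOY_INVALID = 0, TOY_MMIO_DM, TOY_MMIO_DIRECT, TOY_RAM };

struct toy_entry {
    enum toy_type type;
    uint64_t mfn;
    int access;
};

#define TOY_NR_GFNS 16

/* Returns 0 on success (newly mapped or already identity-mapped),
 * -EBUSY if the gfn is occupied by some other mapping. */
static int toy_set_identity(struct toy_entry *p2m, uint64_t gfn, int access)
{
    struct toy_entry *e;

    if (gfn >= TOY_NR_GFNS)
        return -EINVAL;
    e = &p2m[gfn];

    /* Case 1: unoccupied, so just set the identity mapping. */
    if (e->type == TOY_INVALID || e->type == TOY_MMIO_DM) {
        e->type = TOY_MMIO_DIRECT;
        e->mfn = gfn;
        e->access = access;
        return 0;
    }

    /* Case 2: already the desired identity mapping, nothing to do. */
    if (e->type == TOY_MMIO_DIRECT && e->mfn == gfn && e->access == access)
        return 0;

    /* Case 3: occupied by something else. */
    return -EBUSY;
}
```

Calling toy_set_identity() twice on the same free gfn succeeds both times, while a gfn that already holds a different mapping yields -EBUSY.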

CC: Tim Deegan 
CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
CC: Yang Zhang 
CC: Kevin Tian 
Reviewed-by: Kevin Tian 
Reviewed-by: Tim Deegan 
Acked-by: George Dunlap 
Signed-off-by: Tiejun Chen 
---
v6 ~ v11:

* Nothing is changed.

v5:

* Fold our original patches #2 and #3 into this new one

* Introduce a new macro, clear_identity_p2m_entry, which wraps
  guest_physmap_remove_page(). We use this to clean up our
  identity mapping.

v4:

* Change that original condition,

  if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
  
  to make sure we catch those invalid mfn mapping as we expected.

* To have

  if ( !paging_mode_translate(p2m->domain) )
return 0;

  at the start, instead of indenting the whole body of the function
  in an inner scope. 

* extend guest_physmap_remove_page() to return a value as a proper
  unmapping helper

* Instead of intel_iommu_unmap_page(), we should use
  guest_physmap_remove_page() to unmap rmrr mapping correctly. 

* Drop iommu_map_page() since actually ept_set_entry() can do this
  internally.

 xen/arch/x86/mm/p2m.c   | 40 +++--
 xen/drivers/passthrough/vtd/iommu.c |  5 ++---
 xen/include/asm-x86/p2m.h   | 13 +---
 3 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6fe6387..1e763dc 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -584,14 +584,16 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long 
gfn, unsigned long mfn,
  p2m->default_access);
 }
 
-void
+int
 guest_physmap_remove_page(struct domain *d, unsigned long gfn,
   unsigned long mfn, unsigned int page_order)
 {
 struct p2m_domain *p2m = p2m_get_hostp2m(d);
+int rc;
 gfn_lock(p2m, gfn, page_order);
-p2m_remove_page(p2m, gfn, mfn, page_order);
+rc = p2m_remove_page(p2m, gfn, mfn, page_order);
 gfn_unlock(p2m, gfn, page_order);
+return rc;
 }
 
 int
@@ -898,6 +900,40 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long 
gfn, mfn_t mfn,
 return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
 }
 
+int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
+   p2m_access_t p2ma)
+{
+p2m_type_t p2mt;
+p2m_access_t a;
+mfn_t mfn;
+struct p2m_domain *p2m = p2m_get_hostp2m(d);
+int ret;
+
+if ( !paging_mode_translate(p2m->domain) )
+return 0;
+
+gfn_lock(p2m, gfn, 0);
+
+mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL);
+
+if ( p2mt == p2m_invalid || p2mt == p2m_mmio_dm )
+ret = p2m_set_entry(p2m, gfn, _mfn(gfn), PAGE_ORDER_4K,
+p2m_mmio_direct, p2ma);
+else if ( mfn_x(mfn) == gfn && p2mt == p2m_mmio_direct && a == p2ma )
+ret = 0;
+else
+{
+ret = -EBUSY;
+printk(XENLOG_G_WARNING
+   "Cannot setup identity map d%d:%lx,"
+   " gfn already mapped to %lx.\n",
+   d->domain_id, gfn, mfn_x(mfn));
+}
+
+gfn_unlock(p2m, gfn, 0);
+return ret;
+}
+
 /* Returns: 0 for success, -errno for failure */
 int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
 {
diff --git a/xen/drivers/passthrough/vtd/iommu.c 
b/xen/drivers/passthrough/vtd/iommu.c
index 9849d0e..5aa482f 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1839,7 +1839,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 
 while ( base_pfn < end_pfn )
 {
-if ( intel_iommu_unmap_page(d, base_pfn) )
+if ( clear_identity_p2m_entry(d, base_pfn, 0) )
 ret = -ENXIO;
 base_pfn++;
 }
@@ -1855,8 +1855,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t 
map,
 
 while ( base_pfn < end_pfn )
 {
-int err = intel_iommu_map_page(d, base_pfn, base_pfn,
-   IOMMUF_readable|IOMMUF_writable);
+int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw);
 
 if ( err )

[Xen-devel] [v11][PATCH 04/16] xen: enable XENMEM_memory_map in hvm

2015-07-21 Thread Tiejun Chen

This patch enables XENMEM_memory_map for HVM guests, so that hvmloader
can use it to set up the e820 mappings.

CC: Keir Fraser 
CC: Jan Beulich 
CC: Andrew Cooper 
Signed-off-by: Tiejun Chen 
Reviewed-by: Tim Deegan 
Reviewed-by: Kevin Tian 
Acked-by: Jan Beulich 
Acked-by: George Dunlap 
---
v5 ~ v11:

* Nothing is changed.

v4:

* Just refine the patch head description as Jan commented.

 xen/arch/x86/hvm/hvm.c | 2 --
 xen/arch/x86/mm.c  | 6 --
 2 files changed, 8 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c07e3ef..d860579 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4855,7 +4855,6 @@ static long hvm_memory_op(int cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 
 switch ( cmd & MEMOP_CMD_MASK )
 {
-case XENMEM_memory_map:
 case XENMEM_machine_memory_map:
 case XENMEM_machphys_mapping:
 return -ENOSYS;
@@ -4931,7 +4930,6 @@ static long hvm_memory_op_compat32(int cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 
 switch ( cmd & MEMOP_CMD_MASK )
 {
-case XENMEM_memory_map:
 case XENMEM_machine_memory_map:
 case XENMEM_machphys_mapping:
 return -ENOSYS;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 342414f..8c887d8 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4717,12 +4717,6 @@ long arch_memory_op(unsigned long cmd, 
XEN_GUEST_HANDLE_PARAM(void) arg)
 return rc;
 }
 
-if ( is_hvm_domain(d) )
-{
-rcu_unlock_domain(d);
-return -EPERM;
-}
-
 e820 = xmalloc_array(e820entry_t, fmap.map.nr_entries);
 if ( e820 == NULL )
 {
-- 
1.9.1


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] [v11][PATCH 00/16] Fix RMRR

2015-07-21 Thread Tiejun Chen

v11:

* Rebase on staging

* Patch #6: hvmloader/pci: Try to avoid placing BARs in RMRRs
  To find the lowest RMRR the _end_ of which is higher than base;
  Refine some code implementations; 

* Patch #7: hvmloader/e820: construct guest e820 table
  To check/sync memory_map.map[] before copy them into e820 since
  ultimately this can make sure hvm_info, memory_map.map[] and e820
  are on the same page;
  Refine some code implementations;

* Patch #11: tools/libxl: detect and avoid conflicts with RDM
  Use GCNEW_ARRAY to replace libxl__malloc();
  Add the missing safety () around x in #define pfn_to_paddr, and
  move it into libxl_internal.h;
  Rename set_rdm_entries() to add_rdm_entry() and put the
  increment at the end so that the assignments are
  to ->rdms[d_config->num_rdms];
  "Simply make it so that if there are any rdms specified
  in the domain config, they are used instead of the
  automatically gathered information (from strategy and
  devices)." So just return if d_config->rdms is valid;
  Shorten some code comments.
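
To illustrate why the "safety ()" matters for a macro like the pfn_to_paddr(x) quoted in the v8 notes, here is a small sketch. TOY_PAGE_SHIFT stands in for XC_PAGE_SHIFT and is assumed to be 12; the `_unsafe`/`_safe` macro names and the helper functions are hypothetical.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12  /* stand-in for XC_PAGE_SHIFT; 4 KiB pages assumed */

/* Argument not parenthesised: the cast and shift bind to parts of the
 * caller's expression rather than to the whole of it. */
#define toy_pfn_to_paddr_unsafe(x) ((uint64_t)x << TOY_PAGE_SHIFT)

/* Argument parenthesised: the whole expression is cast, then shifted. */
#define toy_pfn_to_paddr_safe(x) ((uint64_t)(x) << TOY_PAGE_SHIFT)

/* Expands to ((uint64_t)pfn & 0xff << 12), i.e. pfn & (0xff << 12),
 * because << binds tighter than &. */
static uint64_t toy_low_byte_paddr_unsafe(uint64_t pfn)
{
    return toy_pfn_to_paddr_unsafe(pfn & 0xff);
}

/* Expands to ((uint64_t)(pfn & 0xff) << 12), which is what was meant. */
static uint64_t toy_low_byte_paddr_safe(uint64_t pfn)
{
    return toy_pfn_to_paddr_safe(pfn & 0xff);
}
```

For pfn 0x1234 the safe form gives 0x34000, while the unparenthesised form gives 0x1000.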

v10:

* Patch #6: hvmloader/pci: Try to avoid placing BARs in RMRRs
  This is from George's draft patch, which implements an acceptable
  solution for the current cycle. Here I just implemented check_overlap_all()
  and did some cleanups.

* Patch #7: hvmloader/e820: construct guest e820 table
  Instead of correcting e820, I'd like to correct memory_map.map[]
  and then copy them into e820 directly. I think this can make sure
  hvm_info, memory_map.map[] and e820 are on the same page.

v9:

* Patch #3: xen/passthrough: extend hypercall to support rdm reservation policy
  Correct one check condition of XEN_DOMCTL_DEV_RDM_RELAXED

* Patch #5: hvmloader: get guest memory map into memory_map[]
  Correct the patch head description:
  [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END]
-> [RESERVED_MEMORY_DYNAMIC_START, RESERVED_MEMORY_DYNAMIC_END);
  Merge two if{} as one if{};

* Patch #6: hvmloader/pci: disable all pci devices conflicting with rdm
  A small improvement to the code, but again, this solution is still under
  debate. Personally I would prefer that we take a look at v7 if possible.

* Patch #7: hvmloader/e820: construct guest e820 table
  Refine that chunk of code to check/modify highmem

* Patch #15: xen/vtd: prevent from assign the device with shared rmrr
  Correct one indentation issue

v8:

* Patch #3: xen/passthrough: extend hypercall to support rdm reservation policy
  Force passing "0" (strict) when adding or moving a device in the hardware
  domain, and improve some associated code comments.

* Patch #5: hvmloader: get guest memory map into memory_map[]
  Actually we should check this range started from
  RESERVED_MEMORY_DYNAMIC_START, not RESERVED_MEMORY_DYNAMIC_START - 1.
  So correct this and sync the patch head description.

* Patch #6: hvmloader/pci: disable all pci devices conflicting
  We have a big change to this patch:

  Based on the current discussion, it's hard to reshape the original mmio
  allocation mechanism, and we don't have a good, simple way to do so in the
  short term. So instead, we don't add more complexity to that process, but
  still check for any conflicts and disable all associated devices.

  I know this is still under debate, but I'd like to discuss it based on this
  revision. Thanks for your time.

* Patch #7: hvmloader/e820: construct guest e820 table
  define low_mem_end as uint32_t;
  Correct those two wrong loops, memory_map.nr_map -> nr
  when we're trying to revise low/high memory e820 entries;
  Improve code comments and the patch head description;
  Add one check if highmem is just populated by hvmloader itself

* Patch #11: tools/libxl: detect and avoid conflicts with RDM
  Introduce pfn_to_paddr(x) -> ((uint64_t)x << XC_PAGE_SHIFT)
  and set_rdm_entries() to factor out current codes.

* Patch #13: libxl: construct e820 map with RDM information for HVM guest
  make that core construction function as arch-specific to make sure
  we don't break ARM at this point.

* Patch #15:  xen/vtd: prevent from assign the device with shared rmrr
  Merge two if{} as one if{};
  Print RMRR range info when we stop assigning a group device

* Some minimal code style changes

v7:

* Need to rename some parameters:
  In the xl rdm config parsing, `reserve=' should be `policy='.
  In the xl pci config parsing, `rdm_reserve=' should be `rdm_policy='.
  The type `libxl_rdm_reserve_flag' should be `libxl_rdm_policy'.
  The field name `reserve' in `libxl_rdm_reserve' should be `policy'.

* Just sync with the fallout of renaming parameters above.

Note I also mark patch #10 as Acked by Wei Liu, Ian Jackson and Ian
Campbell. (If I'm wrong, just let me know at this point.) And
as we discussed, I'll make further improvements as a next step after
this round of review.

v6:

* Inside patch #01, add a comments to the nr_entries field inside
  xen_reserved_device_memory_map. Note this is from Jan.

* Inside patch #10,  we need rename something to make our policy reasonable
  "type" ->

[Xen-devel] [v11][PATCH 10/16] tools: introduce some new parameters to set rdm policy

2015-07-21 Thread Tiejun Chen

This patch introduces user-configurable parameters to specify RDM
resources and the corresponding policies:

Global RDM parameter:
rdm = "strategy=host,policy=strict/relaxed"
Per-device RDM parameter:
pci = [ 'sbdf, rdm_policy=strict/relaxed' ]

The global RDM parameter "strategy" allows the user to specify reserved
regions explicitly. Currently, 'host' includes all reserved regions reported
on this platform, which is good for handling the hotplug scenario. In the
future this parameter may be further extended to allow specifying arbitrary
regions, e.g. even those belonging to another platform, as preparation for
live migration with passthrough devices. By default this isn't set, so we
don't check all rdms. Instead, we just check the rdm specific to a given
device if you're assigning this kind of device. Note this option is not
recommended unless you can make sure any conflict does exist.

The 'strict/relaxed' policy decides how to handle a conflict when reserving
RDM regions in pfn space. If a conflict exists, 'strict' means an immediate
error, so the VM can't keep running, while 'relaxed' allows moving forward
with a warning message.

The default per-device RDM policy is the same as the default global RDM
policy, namely 'relaxed'. The per-device policy overrides the global policy,
as with other options.
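
The precedence rules above (per-device over global, with 'relaxed' as the default) can be sketched as follows. This is a minimal illustration only: the enum and function names are hypothetical and are not the libxl types introduced by this patch.

```c
#include <stdio.h>

/* Hypothetical policy values; "invalid" here means "not specified". */
enum toy_rdm_policy {
    TOY_RDM_POLICY_INVALID = 0,
    TOY_RDM_POLICY_STRICT,
    TOY_RDM_POLICY_RELAXED
};

/* A per-device setting wins over the global one; if neither is given,
 * fall back to 'relaxed'. */
static enum toy_rdm_policy toy_effective_policy(enum toy_rdm_policy global,
                                                enum toy_rdm_policy per_device)
{
    if (per_device != TOY_RDM_POLICY_INVALID)
        return per_device;
    if (global != TOY_RDM_POLICY_INVALID)
        return global;
    return TOY_RDM_POLICY_RELAXED;
}

/* Returns -1 if the conflict is fatal ('strict'), 0 if we may continue
 * after emitting a warning ('relaxed'). */
static int toy_handle_conflict(enum toy_rdm_policy policy)
{
    if (policy == TOY_RDM_POLICY_STRICT)
        return -1;
    fprintf(stderr, "warning: RDM conflict, continuing (relaxed)\n");
    return 0;
}
```

For example, a global 'strict' combined with a per-device 'relaxed' resolves to 'relaxed' for that device, so a conflict there only produces a warning.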

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Acked-by: Wei Liu 
Acked-by: Ian Jackson 
Signed-off-by: Tiejun Chen 
---
v9 ~ v11:

* Nothing is changed.

v8:

* One minimal code style change

v7:

* Need to rename some parameters:
  In the xl rdm config parsing, `reserve=' should be `policy='.
  In the xl pci config parsing, `rdm_reserve=' should be `rdm_policy='.
  The type `libxl_rdm_reserve_flag' should be `libxl_rdm_policy'.
  The field name `reserve' in `libxl_rdm_reserve' should be `policy'.

v6:

* Some rename to make our policy reasonable
  "type" -> "strategy"
  "none" -> "ignore"
* Don't expose "ignore" in xl level and just keep that as a default.
  And then sync docs and the patch head description

v5:

* Just make sure the per-device policy always overrides the global policy,
  and so clean up some associated comments and the patch head description.
* A little change to follow one bit, XEN_DOMCTL_DEV_RDM_RELAXED.
* Improve all descriptions in doc.
* Make all rdm variables specific to .hvm

v4:

* No need to define init_val for libxl_rdm_reserve_type since its just zero
* Grab those changes to xl/libxlu to as a final patch

 docs/man/xl.cfg.pod.5| 81 
 docs/misc/vtd.txt| 24 +
 tools/libxl/libxl_create.c   |  7 
 tools/libxl/libxl_internal.h |  2 ++
 tools/libxl/libxl_pci.c  |  9 +
 tools/libxl/libxl_types.idl  | 18 ++
 6 files changed, 141 insertions(+)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 382f30b..e6e0f70 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -633,6 +633,79 @@ assigned slave device.
 
 =back
 
+=item B
+
+(HVM/x86 only) Specifies information about Reserved Device Memory (RDM),
+which is necessary to enable robust device passthrough. One example of RDM
+is reported through ACPI Reserved Memory Region Reporting (RMRR) structure
+on x86 platform.
+
+B has the form C<[KEY=VALUE,KEY=VALUE,...> where:
+
+=over 4
+
+=item B
+
+Possible Bs are:
+
+=over 4
+
+=item B
+
+Currently there is only one valid type:
+
+"host" means all reserved device memory on this platform should be checked to
+reserve regions in this VM's guest address space. This global rdm parameter
+allows user to specify reserved regions explicitly, and using "host" includes
+all reserved regions reported on this platform, which is useful when doing
+hotplug.
+
+By default this isn't set so we don't check all rdms. Instead, we just check
+rdm specific to a given device if you're assigning this kind of device. Note
+this option is not recommended unless you can make sure any conflict does 
exist.
+
+For example, you're trying to set "memory = 2800" to allocate memory to one
+given VM but the platform owns two RDM regions like,
+
+Device A [sbdf_A]: RMRR region_A: base_addr ac6d3000 end_address ac6e6fff
+Device B [sbdf_B]: RMRR region_B: base_addr ad80 end_address afff
+
+In this conflict case,
+
+#1. If B is set to "host", for example,
+
+rdm = "strategy=host,policy=strict" or rdm = "strategy=host,policy=relaxed"
+
+It means all conflicts will be handled according to the policy
+introduced by B as described below.
+
+#2. If B is not set at all, but
+
+pci = [ 'sbdf_A, rdm_policy=x' ]
+
+It means only one conflict of region_A will be handled according to the policy
+introduced by B as described inside pci options.
+
+=item B
+
+Specifies how to deal with conflicts when reserving reserved device
+memory in guest address space.
+
+When that conflict is unsolved,
+
+"strict" means VM can't be created, or the associated device can't be
+attached in the case of hotplug.
+
+"relaxed" all

[Xen-devel] [v11][PATCH 08/16] tools/libxc: Expose new hypercall xc_reserved_device_memory_map

2015-07-21 Thread Tiejun Chen

This patch exposes the hypercall wrapper xc_reserved_device_memory_map
in libxc. It lets us get rdm entry info according to different
parameters. If flag == PCI_DEV_RDM_ALL, all entries are exposed.
Otherwise we expose only the rdm entries specific to a given SBDF.

CC: Ian Jackson 
CC: Stefano Stabellini 
CC: Ian Campbell 
CC: Wei Liu 
Reviewed-by: Kevin Tian 
Acked-by: Wei Liu 
Signed-off-by: Tiejun Chen 
---
v4 ~ v11:

* Nothing is changed.

 tools/libxc/include/xenctrl.h |  8 
 tools/libxc/xc_domain.c   | 36 
 2 files changed, 44 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ce9029c..2991333 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1314,6 +1314,14 @@ int xc_domain_set_memory_map(xc_interface *xch,
 int xc_get_machine_memory_map(xc_interface *xch,
   struct e820entry entries[],
   uint32_t max_entries);
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+  uint32_t flag,
+  uint16_t seg,
+  uint8_t bus,
+  uint8_t devfn,
+  struct xen_reserved_device_memory entries[],
+  uint32_t *max_entries);
 #endif
 int xc_domain_set_time_offset(xc_interface *xch,
   uint32_t domid,
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 6db8d13..298b3b5 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -684,6 +684,42 @@ int xc_domain_set_memory_map(xc_interface *xch,
 
 return rc;
 }
+
+int xc_reserved_device_memory_map(xc_interface *xch,
+  uint32_t flag,
+  uint16_t seg,
+  uint8_t bus,
+  uint8_t devfn,
+  struct xen_reserved_device_memory entries[],
+  uint32_t *max_entries)
+{
+int rc;
+struct xen_reserved_device_memory_map xrdmmap = {
+.flag = flag,
+.seg = seg,
+.bus = bus,
+.devfn = devfn,
+.nr_entries = *max_entries
+};
+DECLARE_HYPERCALL_BOUNCE(entries,
+ sizeof(struct xen_reserved_device_memory) *
+ *max_entries, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+if ( xc_hypercall_bounce_pre(xch, entries) )
+return -1;
+
+set_xen_guest_handle(xrdmmap.buffer, entries);
+
+rc = do_memory_op(xch, XENMEM_reserved_device_memory_map,
+  &xrdmmap, sizeof(xrdmmap));
+
+xc_hypercall_bounce_post(xch, entries);
+
+*max_entries = xrdmmap.nr_entries;
+
+return rc;
+}
+
 int xc_get_machine_memory_map(xc_interface *xch,
   struct e820entry entries[],
   uint32_t max_entries)
-- 
1.9.1



[Xen-devel] [xen-4.5-testing test] 59792: regressions - FAIL

2015-07-21 Thread osstest service owner

flight 59792 xen-4.5-testing real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59792/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-multivcpu         6 xen-boot                     fail REGR. vs. 59527
 test-amd64-i386-xl-qemuu-ovmf-amd64  18 guest-start/debianhvm.repeat fail REGR. vs. 59527

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-rumpuserxen-amd64   15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail REGR. vs. 59527
 test-amd64-i386-xl-qemuu-win7-amd64  15 guest-localmigrate/x10 fail  blocked in 59527
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop             fail  like 59508
 test-amd64-i386-xl-qemut-win7-amd64  16 guest-stop             fail  like 59527
 test-armhf-armhf-xl-rtds             11 guest-start            fail  like 59527

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd          11 guest-start            fail  never pass
 test-amd64-amd64-xl-pvh-intel        11 guest-start            fail  never pass
 test-amd64-i386-libvirt              12 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt             12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-arndale          12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt             12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2          12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck       12 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop             fail  never pass
 test-armhf-armhf-xl                  12 migrate-support-check  fail  never pass

version targeted for testing:
 xen  666b80f239c566283cb1b3435180d99a329d0156
baseline version:
 xen  36a7c54a698db7d087873b312087cfa64de33175

Last test of basis    59527  2015-07-14 01:40:53 Z    7 days
Testing same since    59792  2015-07-21 09:35:47 Z    0 days    1 attempts


People who touched revisions under test:
  Andrew Cooper 
  Dario Faggioli 
  Elena Ufimtseva 
  Ian Campbell 
  Jan Beulich 
  Juergen Gross 
  Liang Li 
  Yang Zhang 

jobs:
 build-amd64                                 pass
 build-armhf                                 pass
 build-i386                                  pass
 build-amd64-libvirt                         pass
 build-armhf-libvirt                         pass
 build-i386-libvirt                          pass
 build-amd64-pvops                           pass
 build-armhf-pvops                           pass
 build-i386-pvops                            pass
 build-amd64-rumpuserxen                     pass
 build-i386-rumpuserxen                      pass
 test-amd64-amd64-xl                         pass
 test-armhf-armhf-xl                         pass
 test-amd64-i386-xl                          pass
 test-amd64-amd64-xl-pvh-amd                 fail
 test-amd64-i386-qemut-rhel6hvm-amd          pass
 test-amd64-i386-qemuu-rhel6hvm-amd          pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64   pass
 test-amd64-i386-xl-qemut-debianhvm-amd64    pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64   pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64    pass
 test-amd64-i386-freebsd10-amd64             pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64        pass
 test-amd64-i386-xl-qemuu-ovmf-amd64         fail
 test-amd64-amd64-rumpuserxen-amd64          fail
 test-amd64-amd64-xl-qemut-win7-amd64        fail
 test-amd64-i386-xl-qemut-win7-amd64         fail
 test-amd64-amd64-xl-qemuu-win7-amd64        fail
 test-amd64-i386-xl-qemuu-win7-amd64         fail
 test-armhf-armhf-xl-arndale                 pass
 test-amd64-amd64-xl-credit2                 pass
 test-armhf-armhf-xl-credit2                 pass
 test-armhf-armhf-xl-cubietruck              pass
 test-amd64-i386-freebsd10-i386              pass
 test-amd64-i386-rumpuserxen-i386            pass
 test-amd64-amd64-xl-pvh-intel               fail
 test-amd64-i386-qemut-rhel6hvm-intel        pass
 test-amd64-i386-qemuu-rhel6hvm-intel        pass
 test-amd64-amd64-libvirt

Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 5:49 PM, Andrew Cooper
 wrote:
> On 22/07/2015 01:28, Andy Lutomirski wrote:
>> On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
>>  wrote:
>>> On 22/07/2015 01:07, Andy Lutomirski wrote:
>>>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>>>>  wrote:
>>>>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>>>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>>>>>> *mm) {}
>>>>>>>   #endif
>>>>>>> /*
>>>>>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>>>>>> + * modified while live.
>>>>>>> + */
>>>>>>> +struct ldt_struct {
>>>>>>> +int size;
>>>>>>> +int __pad;/* keep the descriptors naturally aligned. */
>>>>>>> +struct desc_struct entries[];
>>>>>>> +};
>>>>>>
>>>>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>>>>
>>>>>> Jan, Andrew?
>>>>> PV guests are not permitted to have writeable mappings to the frames
>>>>> making up the GDT and LDT, so it cannot make unaudited changes to
>>>>> loadable descriptors.  In particular, for a 32bit PV guest, it is only
>>>>> the segment limit which protects Xen from the ring1 guest kernel.
>>>>>
>>>>> A lot of this code hasn't been touched in years, and it certainly
>>>>> predates me.  The alignment requirement appears to come from the virtual
>>>>> region Xen uses to map the guests GDT and LDT.  Strict alignment is
>>>>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>>>>> correct, but the LDT alignment seems to be a side effect of similar
>>>>> codepaths.
>>>>>
>>>>> For an LDT smaller than 8192 entries, I can't see any specific reason
>>>>> for enforcing alignment, other than "that's the way it has always been".
>>>>>
>>>>> However, the guest would still have to relinquish write access to all
>>>>> frames which make up the LDT, which looks to be a bit of an issue given
>>>>> the snippet above.
>>>> Does the LDT itself need to be aligned or just the address passed to
>>>> paravirt_alloc_ldt?
>>> The address which Xen receives needs to be aligned.
>>>
>>> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
>>> it is passed is page aligned, and passes it straight through.
>> xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
>> it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
>> pointer to the ldt_struct.
>
> So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
> audits to be page aligned.
>
>>>>> This will allow ldt_struct itself to be page aligned, and for the size
>>>>> field to sit across the base/limit field of what would logically be
>>>>> selector 0x0008  There would be some issues accessing size.  To load
>>>>> frames as an LDT, a guest must drop all refs to the page so that its
>>>>> type may be changed from writeable to segdesc.  After that, an
>>>>> update_descriptor hypercall can be used to change size, and I believe
>>>>> the guest may subsequently recreate read-only mappings to the frames in
>>>>> question (although frankly it is getting late so you will want to double
>>>>> check all of this).
>>>>>
>>>>> Anyhow, this looks like an issue which should be fixed up with slightly
>>>>> more PVOps, rather than enforcing a Xen view of the world on native Linux.
>>>>>
>>>> I could presumably make the allocation the other way around so the
>>>> size is at the end.  I could even use two separate allocations if
>>>> needed.
>>> I suspect two separate allocations would be the better solution, as it
>>> means that the size field doesn't need to be subject to funny page
>>> permissions.
>> True.  OTOH we never write to the size field after allocating the thing.
>
> Right, but even reading it is going to cause problems if one of the
> paravirt ops can't re-establish ro mappings.

Does paravirt_alloc_ldt completely deny access or does it just set it RO?

--Andy

>
> ~Andrew



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper

On 22/07/2015 01:28, Andy Lutomirski wrote:
> On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
>  wrote:
>> On 22/07/2015 01:07, Andy Lutomirski wrote:
>>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>>>  wrote:
>>>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>>>>> *mm) {}
>>>>>>   #endif
>>>>>> /*
>>>>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>>>>> + * modified while live.
>>>>>> + */
>>>>>> +struct ldt_struct {
>>>>>> +int size;
>>>>>> +int __pad;/* keep the descriptors naturally aligned. */
>>>>>> +struct desc_struct entries[];
>>>>>> +};
>>>>>
>>>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>>>
>>>>> Jan, Andrew?
>>>> PV guests are not permitted to have writeable mappings to the frames
>>>> making up the GDT and LDT, so it cannot make unaudited changes to
>>>> loadable descriptors.  In particular, for a 32bit PV guest, it is only
>>>> the segment limit which protects Xen from the ring1 guest kernel.
>>>>
>>>> A lot of this code hasn't been touched in years, and it certainly
>>>> predates me.  The alignment requirement appears to come from the virtual
>>>> region Xen uses to map the guests GDT and LDT.  Strict alignment is
>>>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>>>> correct, but the LDT alignment seems to be a side effect of similar
>>>> codepaths.
>>>>
>>>> For an LDT smaller than 8192 entries, I can't see any specific reason
>>>> for enforcing alignment, other than "that's the way it has always been".
>>>>
>>>> However, the guest would still have to relinquish write access to all
>>>> frames which make up the LDT, which looks to be a bit of an issue given
>>>> the snippet above.
>>> Does the LDT itself need to be aligned or just the address passed to
>>> paravirt_alloc_ldt?
>> The address which Xen receives needs to be aligned.
>>
>> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
>> it is passed is page aligned, and passes it straight through.
> xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
> it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
> pointer to the ldt_struct.

So it is.  It is the linear_addr in xen_set_ldt() which Xen currently
audits to be page aligned.

>>>> This will allow ldt_struct itself to be page aligned, and for the size
>>>> field to sit across the base/limit field of what would logically be
>>>> selector 0x0008  There would be some issues accessing size.  To load
>>>> frames as an LDT, a guest must drop all refs to the page so that its
>>>> type may be changed from writeable to segdesc.  After that, an
>>>> update_descriptor hypercall can be used to change size, and I believe
>>>> the guest may subsequently recreate read-only mappings to the frames in
>>>> question (although frankly it is getting late so you will want to double
>>>> check all of this).
>>>>
>>>> Anyhow, this looks like an issue which should be fixed up with slightly
>>>> more PVOps, rather than enforcing a Xen view of the world on native Linux.
>>>>
>>> I could presumably make the allocation the other way around so the
>>> size is at the end.  I could even use two separate allocations if
>>> needed.
>> I suspect two separate allocations would be the better solution, as it
>> means that the size field doesn't need to be subject to funny page
>> permissions.
> True.  OTOH we never write to the size field after allocating the thing.

Right, but even reading it is going to cause problems if one of the
paravirt ops can't re-establish ro mappings.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun


On 2015/7/21 23:57, Ian Jackson wrote:

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM"):

Sorry, I just ignored the line in brackets, since I tend to think this kind
of thing is not a big deal; next time I should pay more attention to the ().
But indeed, before I posted this whole patch online I also picked out this
chunk of code and asked you to take a look at it. That shows I was not very
sure whether I was addressing this properly. But I didn't get a further
response, so I guessed that it should work for you and then I posted the
whole thing online.


You are talking about <55ae2bb1.9030...@intel.com> I guess.  I replied
to that with several comments about your prose and about the
computation of the new set of rdms.

It's true that I didn't comment on the fact that you had half-done one
of the things I had requested.  It is of course a waste of my time to
be constantly re-reviewing half-done changes.


Next time I should read each line of all comments carefully. Maybe it's a
good idea to use IRC to get your quick advice in advance, and I hope
this makes you feel better. Anyway, sorry for causing this kind of
inconvenience.





Now back on our problem,

static void
add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
{
  d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
  (d_config->num_rdms+1) * sizeof(libxl_device_rdm));

  d_config->rdms[d_config->num_rdms].start = rdm_start;
  d_config->rdms[d_config->num_rdms].size = rdm_size;
  d_config->rdms[d_config->num_rdms].policy = rdm_policy;
  d_config->num_rdms++;
}

Does this work for you? If I'm still wrong, please correct this function
directly to cost you less.


Yes, that is what I meant.



Good to know.

Thanks
Tiejun


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 5:21 PM, Andrew Cooper
 wrote:
> On 22/07/2015 01:07, Andy Lutomirski wrote:
>> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>>  wrote:
>>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>>>> *mm) {}
>>>>>   #endif
>>>>> /*
>>>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>>>> + * modified while live.
>>>>> + */
>>>>> +struct ldt_struct {
>>>>> +int size;
>>>>> +int __pad;/* keep the descriptors naturally aligned. */
>>>>> +struct desc_struct entries[];
>>>>> +};
>>>>
>>>>
>>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>>
>>>> Jan, Andrew?
>>> PV guests are not permitted to have writeable mappings to the frames
>>> making up the GDT and LDT, so it cannot make unaudited changes to
>>> loadable descriptors.  In particular, for a 32bit PV guest, it is only
>>> the segment limit which protects Xen from the ring1 guest kernel.
>>>
>>> A lot of this code hasn't been touched in years, and it certainly
>>> predates me.  The alignment requirement appears to come from the virtual
>>> region Xen uses to map the guests GDT and LDT.  Strict alignment is
>>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>>> correct, but the LDT alignment seems to be a side effect of similar
>>> codepaths.
>>>
>>> For an LDT smaller than 8192 entries, I can't see any specific reason
>>> for enforcing alignment, other than "that's the way it has always been".
>>>
>>> However, the guest would still have to relinquish write access to all
>>> frames which make up the LDT, which looks to be a bit of an issue given
>>> the snippet above.
>> Does the LDT itself need to be aligned or just the address passed to
>> paravirt_alloc_ldt?
>
> The address which Xen receives needs to be aligned.
>
> It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
> it is passed is page aligned, and passes it straight through.

xen_alloc_ldt is just fiddling with protection though, I think.  Isn't
it xen_set_ldt that's the meat?  We could easily pass xen_alloc_ldt a
pointer to the ldt_struct.

>
>>
>>> I think I have a solution, but I doubt it is going to be very popular.
>>>
>>> * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
>>> backend can choose an alignment if needed
>>> * Make absolutely certain that __pad has the value 0 (so size and __pad
>>> combined don't look like a present descriptor)
>>> * Never hand selector 0x0008 to unsuspecting users.
>> Yuck.
>
> I actually meant 0x0004, but yes.  Very much yuck.
>
>>
>>> This will allow ldt_struct itself to be page aligned, and for the size
>>> field to sit across the base/limit field of what would logically be
>>> selector 0x0008  There would be some issues accessing size.  To load
>>> frames as an LDT, a guest must drop all refs to the page so that its
>>> type may be changed from writeable to segdesc.  After that, an
>>> update_descriptor hypercall can be used to change size, and I believe
>>> the guest may subsequently recreate read-only mappings to the frames in
>>> question (although frankly it is getting late so you will want to double
>>> check all of this).
>>>
>>> Anyhow, this looks like an issue which should be fixed up with slightly
>>> more PVOps, rather than enforcing a Xen view of the world on native Linux.
>>>
>> I could presumably make the allocation the other way around so the
>> size is at the end.  I could even use two separate allocations if
>> needed.
>
> I suspect two separate allocations would be the better solution, as it
> means that the size field doesn't need to be subject to funny page
> permissions.

True.  OTOH we never write to the size field after allocating the thing.

--Andy


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper

On 22/07/2015 01:07, Andy Lutomirski wrote:
> On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
>  wrote:
>> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>>> --- a/arch/x86/include/asm/mmu_context.h
>>>> +++ b/arch/x86/include/asm/mmu_context.h
>>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>>> *mm) {}
>>>>   #endif
>>>> /*
>>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>>> + * modified while live.
>>>> + */
>>>> +struct ldt_struct {
>>>> +int size;
>>>> +int __pad;/* keep the descriptors naturally aligned. */
>>>> +struct desc_struct entries[];
>>>> +};
>>>
>>>
>>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>>
>>> Jan, Andrew?
>> PV guests are not permitted to have writeable mappings to the frames
>> making up the GDT and LDT, so it cannot make unaudited changes to
>> loadable descriptors.  In particular, for a 32bit PV guest, it is only
>> the segment limit which protects Xen from the ring1 guest kernel.
>>
>> A lot of this code hasn't been touched in years, and it certainly
>> predates me.  The alignment requirement appears to come from the virtual
>> region Xen uses to map the guests GDT and LDT.  Strict alignment is
>> required for the GDT so Xen's descriptors starting at 0xe0xx are
>> correct, but the LDT alignment seems to be a side effect of similar
>> codepaths.
>>
>> For an LDT smaller than 8192 entries, I can't see any specific reason
>> for enforcing alignment, other than "that's the way it has always been".
>>
>> However, the guest would still have to relinquish write access to all
>> frames which make up the LDT, which looks to be a bit of an issue given
>> the snippet above.
> Does the LDT itself need to be aligned or just the address passed to
> paravirt_alloc_ldt?

The address which Xen receives needs to be aligned.

It looks like xen_alloc_ldt() blindly assumes that the desc_struct *ldt
it is passed is page aligned, and passes it straight through.

>
>> I think I have a solution, but I doubt it is going to be very popular.
>>
>> * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
>> backend can choose an alignment if needed
>> * Make absolutely certain that __pad has the value 0 (so size and __pad
>> combined don't look like a present descriptor)
>> * Never hand selector 0x0008 to unsuspecting users.
> Yuck.

I actually meant 0x0004, but yes.  Very much yuck.
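
For reference, the selector arithmetic behind the 0x0008 vs 0x0004
correction can be sketched as below.  The helper names are made up for
illustration; only the bit layout (index in bits 15..3, table indicator
in bit 2, RPL in bits 1..0) is architectural.

```c
#include <assert.h>
#include <stdint.h>

/* x86 segment selector: bits 15..3 index, bit 2 TI (1 = LDT), bits 1..0 RPL. */
#define SEL_TI_LDT 0x4u

/* Hypothetical helper: build a selector from its three fields. */
uint16_t make_selector(unsigned index, int use_ldt, unsigned rpl)
{
    assert(index < 8192 && rpl < 4);
    return (uint16_t)((index << 3) | (use_ldt ? SEL_TI_LDT : 0u) | rpl);
}

/* Hypothetical helper: descriptor table index encoded in a selector. */
unsigned selector_index(uint16_t sel)
{
    return sel >> 3;
}

/* Hypothetical helper: non-zero if the selector refers to the LDT. */
int selector_is_ldt(uint16_t sel)
{
    return (sel & SEL_TI_LDT) != 0;
}
```

With this encoding, entry 0 of the LDT at RPL 0 is selector 0x0004,
whereas 0x0008 decodes to entry 1 of the GDT.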

>
>> This will allow ldt_struct itself to be page aligned, and for the size
>> field to sit across the base/limit field of what would logically be
>> selector 0x0008  There would be some issues accessing size.  To load
>> frames as an LDT, a guest must drop all refs to the page so that its
>> type may be changed from writeable to segdesc.  After that, an
>> update_descriptor hypercall can be used to change size, and I believe
>> the guest may subsequently recreate read-only mappings to the frames in
>> question (although frankly it is getting late so you will want to double
>> check all of this).
>>
>> Anyhow, this looks like an issue which should be fixed up with slightly
>> more PVOps, rather than enforcing a Xen view of the world on native Linux.
>>
> I could presumably make the allocation the other way around so the
> size is at the end.  I could even use two separate allocations if
> needed.

I suspect two separate allocations would be the better solution, as it
means that the size field doesn't need to be subject to funny page
permissions.
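
A rough userspace sketch of the "two separate allocations" idea, purely
to illustrate the layout.  Everything here is an assumption made for the
example: struct desc stands in for desc_struct, ldt_split and its helpers
are hypothetical names, and malloc/aligned_alloc stand in for whatever
kernel allocators would really be used.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Stand-in for the kernel's 8-byte desc_struct (assumption). */
struct desc { uint64_t raw; };

/* Hypothetical split layout: the size lives apart from the descriptors. */
struct ldt_split {
    struct desc *entries;   /* page-aligned, allocated separately */
    int size;               /* number of entries */
};

struct ldt_split *ldt_split_alloc(int size)
{
    assert(size > 0 && size <= 8192);

    struct ldt_split *ldt = malloc(sizeof(*ldt));
    if (!ldt)
        return NULL;

    /* Round the descriptor array up to whole pages. */
    size_t bytes = (size_t)size * sizeof(struct desc);
    bytes = (bytes + PAGE_SIZE - 1) & ~(size_t)(PAGE_SIZE - 1);

    ldt->entries = aligned_alloc(PAGE_SIZE, bytes);
    if (!ldt->entries) {
        free(ldt);
        return NULL;
    }
    memset(ldt->entries, 0, bytes);
    ldt->size = size;
    return ldt;
}

void ldt_split_free(struct ldt_split *ldt)
{
    if (!ldt)
        return;
    free(ldt->entries);
    free(ldt);
}
```

In this layout only the pages behind entries would ever need to be made
read-only; the size field stays in ordinary writeable memory.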

~Andrew


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andy Lutomirski

On Tue, Jul 21, 2015 at 4:38 PM, Andrew Cooper
 wrote:
> On 21/07/2015 22:53, Boris Ostrovsky wrote:
>> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>>> --- a/arch/x86/include/asm/mmu_context.h
>>> +++ b/arch/x86/include/asm/mmu_context.h
>>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>>> *mm) {}
>>>   #endif
>>> /*
>>> + * ldt_structs can be allocated, used, and freed, but they are never
>>> + * modified while live.
>>> + */
>>> +struct ldt_struct {
>>> +int size;
>>> +int __pad;/* keep the descriptors naturally aligned. */
>>> +struct desc_struct entries[];
>>> +};
>>
>>
>>
>> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>>
>> Jan, Andrew?
>
> PV guests are not permitted to have writeable mappings to the frames
> making up the GDT and LDT, so it cannot make unaudited changes to
> loadable descriptors.  In particular, for a 32bit PV guest, it is only
> the segment limit which protects Xen from the ring1 guest kernel.
>
> A lot of this code hasn't been touched in years, and it certainly
> predates me.  The alignment requirement appears to come from the virtual
> region Xen uses to map the guests GDT and LDT.  Strict alignment is
> required for the GDT so Xen's descriptors starting at 0xe0xx are
> correct, but the LDT alignment seems to be a side effect of similar
> codepaths.
>
> For an LDT smaller than 8192 entries, I can't see any specific reason
> for enforcing alignment, other than "that's the way it has always been".
>
> However, the guest would still have to relinquish write access to all
> frames which make up the LDT, which looks to be a bit of an issue given
> the snippet above.

Does the LDT itself need to be aligned or just the address passed to
paravirt_alloc_ldt?

>
> I think I have a solution, but I doubt it is going to be very popular.
>
> * Make a new paravirt hook for allocation of ldt_struct, so the paravirt
> backend can choose an alignment if needed
> * Make absolutely certain that __pad has the value 0 (so size and __pad
> combined don't look like a present descriptor)
> * Never hand selector 0x0008 to unsuspecting users.

Yuck.

>
> This will allow ldt_struct itself to be page aligned, and for the size
> field to sit across the base/limit field of what would logically be
> selector 0x0008  There would be some issues accessing size.  To load
> frames as an LDT, a guest must drop all refs to the page so that its
> type may be changed from writeable to segdesc.  After that, an
> update_descriptor hypercall can be used to change size, and I believe
> the guest may subsequently recreate read-only mappings to the frames in
> question (although frankly it is getting late so you will want to double
> check all of this).
>
> Anyhow, this looks like an issue which should be fixed up with slightly
> more PVOps, rather than enforcing a Xen view of the world on native Linux.
>

I could presumably make the allocation the other way around so the
size is at the end.  I could even use two separate allocations if
needed.

--Andy


[Xen-devel] [qemu-mainline test] 59791: regressions - FAIL

2015-07-21 Thread osstest service owner

flight 59791 qemu-mainline real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59791/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 12 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-ovmf-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-ovmf-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-winxpsp3 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-freebsd10-amd64 12 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-winxpsp3 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-amd64-xl-qemuu-win7-amd64 11 guest-saverestore  fail REGR. vs. 59059
 test-amd64-i386-xl-qemuu-win7-amd64 11 guest-saverestore  fail REGR. vs. 59059

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  12 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail  never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu 12 migrate-support-check  fail  never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check  fail  never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-rtds 15 guest-start/debian.repeat  fail  never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail  never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail  never pass
 test-armhf-armhf-libvirt-xsm 12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt 12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail  never pass

version targeted for testing:
 qemuu  13566fe3e584e7b14a6f45246976b91677dc2a77
baseline version:
 qemuu  35360642d043c2a5366e8a04a10e5545e7353bd5

Last test of basis  59059  2015-07-05 10:39:20 Z   16 days
Failing since       59109  2015-07-06 14:58:21 Z   15 days   22 attempts
Testing same since  59791  2015-07-21 08:25:15 Z    0 days    1 attempts


People who touched revisions under test:
  Alberto Garcia 
  Alex Williamson 
  Alexander Graf 
  Alexey Kardashevskiy 
  Alvise Rigo 
  Amit Shah 
  Andreas Färber
  Andrew Bennett 
  Andrew Jones 
  Artyom Tarasenko 
  Aurelien Jarno 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Bharata B Rao 
  Brian Kress 
  Chen Hanxiao 
  Christian Borntraeger 
  Christoph Hellwig 
  Claudio Fontana 
  Cormac O'Brien 
  Cornelia Huck 
  Daniel P. Berrange 
  David Gibson 
  Denis V. Lunev 
  Dmitry Osipenko 
  Dr. David Alan Gilbert 
  Eduardo Habkost 
  Eric Auger 
  Fam Zheng 
  Frediano Ziglio 
  Gabriel Laupre 
  Gavin Shan 
  Gerd Hoffmann 
  Gonglei 
  Greg Kurz 
  Hannes Reinecke 
  Hervé Poussineau
  Igor Mammedov 
  James Hogan 
  Jan Kiszka 
  Jason Wang 
  Jeff Cody 
  Johannes Schlatow 
  John Snow 
  Josh Durgin 
  Juan Quintela 
  Justin Ossevoort 
  Keith Busch 
  Kevin Wolf 
  Kirk Allan 
  Laszlo Ersek 
  Laurent Vivier 
  Laurent Vivier 
  Leon Alrae 
  Li Zhijian 
  Liang Li 
  Lin Ma 
  Marc-André Lureau
  Markus Armbruster 
  Max Filippov 
  Michael Roth 
  Michael S. Tsirkin 
  Nikunj A Dadhania 
  Olga Krishtal 
  Pankaj Gupta 
  Paolo Bonzini 
  Paul Durrant 
  Paulo Alcantara 
  Paulo Alcantara 
  Peter Crosthwaite 
  Peter Crosthwaite 
  Peter Maydell 
  Radim Krčmář
  Richard W.M. Jones 
  Scott Feldman 
  Sergey Fedorov 
  Stefan Hajnoczi 
  Stefan Weil 
  Ting Wang 
  Vikram Sethi 
  Wen Congyang 
  Wenshuang Ma 
  Wolfgang Bumiller 
  Xu Wang 
  Yongbok Kim 
  马文霜

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd

Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Andrew Cooper

On 21/07/2015 22:53, Boris Ostrovsky wrote:
> On 07/21/2015 03:59 PM, Andy Lutomirski wrote:
>> --- a/arch/x86/include/asm/mmu_context.h
>> +++ b/arch/x86/include/asm/mmu_context.h
>> @@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct
>> *mm) {}
>>   #endif
>> /*
>> + * ldt_structs can be allocated, used, and freed, but they are never
>> + * modified while live.
>> + */
>> +struct ldt_struct {
>> +int size;
>> +int __pad;/* keep the descriptors naturally aligned. */
>> +struct desc_struct entries[];
>> +};
>
>
>
> This breaks Xen which expects LDT to be page-aligned. Not sure why.
>
> Jan, Andrew?

PV guests are not permitted to have writeable mappings to the frames
making up the GDT and LDT, so it cannot make unaudited changes to
loadable descriptors.  In particular, for a 32bit PV guest, it is only
the segment limit which protects Xen from the ring1 guest kernel.

A lot of this code hasn't been touched in years, and it certainly
predates me.  The alignment requirement appears to come from the virtual
region Xen uses to map the guests GDT and LDT.  Strict alignment is
required for the GDT so Xen's descriptors starting at 0xe0xx are
correct, but the LDT alignment seems to be a side effect of similar
codepaths.

For an LDT smaller than 8192 entries, I can't see any specific reason
for enforcing alignment, other than "that's the way it has always been".

However, the guest would still have to relinquish write access to all
frames which make up the LDT, which looks to be a bit of an issue given
the snippet above.
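
To make the layout problem concrete, here is a small userspace sketch.
It assumes a 4-byte int and uses an 8-byte stand-in for desc_struct; the
helper entries_offset_in_page() is hypothetical.  It shows that even if
the ldt_struct allocation itself is page aligned, the descriptor array
starts 8 bytes into the page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u

/* Stand-in for the kernel's 8-byte desc_struct (assumption). */
struct desc_struct { uint64_t raw; };

/* Same field layout as the struct in the quoted patch. */
struct ldt_struct {
    int size;
    int __pad;                      /* keeps entries naturally aligned */
    struct desc_struct entries[];
};

/*
 * Hypothetical helper: allocate a page-aligned ldt_struct and report
 * where entries[] lands within its page.  Returns (size_t)-1 on
 * allocation failure.
 */
size_t entries_offset_in_page(int nr_entries)
{
    assert(nr_entries > 0);

    size_t bytes = sizeof(struct ldt_struct) +
                   (size_t)nr_entries * sizeof(struct desc_struct);
    bytes = (bytes + PAGE_SIZE - 1) & ~(size_t)(PAGE_SIZE - 1);

    struct ldt_struct *ldt = aligned_alloc(PAGE_SIZE, bytes);
    if (!ldt)
        return (size_t)-1;

    size_t off = (size_t)((uintptr_t)ldt->entries % PAGE_SIZE);
    free(ldt);
    return off;
}
```

So either the address handed to Xen is not page aligned, or the page that
has to lose its writeable mappings also contains the size and __pad
fields.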

I think I have a solution, but I doubt it is going to be very popular.

* Make a new paravirt hook for allocation of ldt_struct, so the paravirt
backend can choose an alignment if needed
* Make absolutely certain that __pad has the value 0 (so size and __pad
combined don't look like a present descriptor)
* Never hand selector 0x0008 to unsuspecting users.

This will allow ldt_struct itself to be page aligned, and for the size
field to sit across the base/limit field of what would logically be
selector 0x0008.  There would be some issues accessing size.  To load
frames as an LDT, a guest must drop all refs to the page so that its
type may be changed from writeable to segdesc.  After that, an
update_descriptor hypercall can be used to change size, and I believe
the guest may subsequently recreate read-only mappings to the frames in
question (although frankly it is getting late so you will want to double
check all of this).
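
A small sketch of why a zero __pad matters, assuming the little-endian
x86 layout in which size occupies the low 32 bits and __pad the high 32
bits of the first 8-byte slot.  The helper names are hypothetical; the
position of the present bit (bit 47 of a descriptor) is architectural.

```c
#include <assert.h>
#include <stdint.h>

/* The P (present) bit is bit 7 of the access byte, i.e. bit 47 overall. */
#define DESC_PRESENT_BIT 47

/* Hypothetical helper: view size/__pad as the 8-byte slot they occupy. */
uint64_t header_as_descriptor(int size, int pad)
{
    assert(size >= 0);
    return (uint64_t)(uint32_t)size | ((uint64_t)(uint32_t)pad << 32);
}

/* Hypothetical helper: 1 if the 8-byte value has its present bit set. */
int descriptor_present(uint64_t desc)
{
    return (int)((desc >> DESC_PRESENT_BIT) & 1u);
}
```

Because size only reaches the low 32 bits, no value of size can set the
present bit; only a non-zero __pad could.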

Anyhow, this looks like an issue which should be fixed up with slightly
more PVOps, rather than enforcing a Xen view of the world on native Linux.

~Andrew


Re: [Xen-devel] [PATCH v2 1/3] x86/ldt: Make modify_ldt synchronous

2015-07-21 Thread Boris Ostrovsky


On 07/21/2015 03:59 PM, Andy Lutomirski wrote:

modify_ldt has questionable locking and does not synchronize
threads.  Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

Cc: sta...@vger.kernel.org
Signed-off-by: Andy Lutomirski 
---
  arch/x86/include/asm/desc.h|  15 ---
  arch/x86/include/asm/mmu.h |   3 +-
  arch/x86/include/asm/mmu_context.h |  48 ++-
  arch/x86/kernel/cpu/common.c   |   4 +-
  arch/x86/kernel/cpu/perf_event.c   |  12 +-
  arch/x86/kernel/ldt.c  | 247 +++--
  arch/x86/kernel/process_64.c   |   4 +-
  arch/x86/kernel/step.c |   6 +-
  arch/x86/power/cpu.c   |   3 +-
  9 files changed, 192 insertions(+), 150 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index a0bf89fd2647..4e10d73cf018 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -280,21 +280,6 @@ static inline void clear_LDT(void)
set_ldt(NULL, 0);
  }
  
-/*

- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc)
-{
-   set_ldt(pc->ldt, pc->size);
-}
-
-static inline void load_LDT(mm_context_t *pc)
-{
-   preempt_disable();
-   load_LDT_nolock(pc);
-   preempt_enable();
-}
-
  static inline unsigned long get_desc_base(const struct desc_struct *desc)
  {
return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24));
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 09b9620a73b4..364d27481a52 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
   * we put the segment information here.
   */
  typedef struct {
-   void *ldt;
-   int size;
+   struct ldt_struct *ldt;
  
  #ifdef CONFIG_X86_64

/* True if mm supports a task running in 32 bit compatibility mode. */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 5e8daee7c5c9..1ff121fbf366 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -34,6 +34,44 @@ static inline void load_mm_cr4(struct mm_struct *mm) {}
  #endif
  
  /*

+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+   int size;
+   int __pad;  /* keep the descriptors naturally aligned. */
+   struct desc_struct entries[];
+};




This breaks Xen which expects LDT to be page-aligned. Not sure why.

Jan, Andrew?

-boris



+
+static inline void load_mm_ldt(struct mm_struct *mm)
+{
+   struct ldt_struct *ldt;
+   DEBUG_LOCKS_WARN_ON(!irqs_disabled());
+
+   /* lockless_dereference synchronizes with smp_store_release */
+   ldt = lockless_dereference(mm->context.ldt);
+
+   /*
+* Any change to mm->context.ldt is followed by an IPI to all
+* CPUs with the mm active.  The LDT will not be freed until
+* after the IPI is handled by all such CPUs.  This means that,
+* if the ldt_struct changes before we return, the values we see
+* will be safe, and the new values will be loaded before we run
+* any user code.
+*
+* NB: don't try to convert this to use RCU without extreme care.
+* We would still need IRQs off, because we don't want to change
+* the local LDT after an IPI loaded a newer value than the one
+* that we can see.
+*/
+
+   if (unlikely(ldt))
+   set_ldt(ldt->entries, ldt->size);
+   else
+   clear_LDT();
+}
+
+/*
   * Used for LDT copy/destruction.
   */
  int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
@@ -78,12 +116,12 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 * was called and then modify_ldt changed
 * prev->context.ldt but suppressed an IPI to this CPU.
 * In this case, prev->context.ldt != NULL, because we
-* never free an LDT while the mm still exists.  That
-* means that next->context.ldt != prev->context.ldt,
-* because mms never share an LDT.
+* never set context.ldt to NULL while the mm still
+* exists.  That means that next->context.ldt !=
+* prev->context.ldt, because mms never share an LDT.
 */
if (unlikely(prev->context.ldt != next->context.ldt))
-   load_LDT_nolock(&next->context);
+   load_mm_ldt(next);
}
  #ifdef CONFIG_SMP
  else {
@@ -106,7 +144,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,

[Xen-devel] [linux-3.18 test] 59785: regressions - FAIL

2015-07-21 Thread osstest service owner

flight 59785 linux-3.18 real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59785/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel 11 guest-start  fail REGR. vs. 58581

Tests which are failing intermittently (not blocking):
 test-amd64-i386-xl-qemuu-ovmf-amd64 18 guest-start/debianhvm.repeat  fail in 59766 pass in 59785
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 18 guest-start/debianhvm.repeat  fail pass in 59766

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt  6 xen-boot  fail REGR. vs. 58581
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 11 guest-saverestore  fail baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 12 guest-localmigrate  fail baseline untested
 test-armhf-armhf-xl-rtds 11 guest-start  fail baseline untested
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 9 debian-hvm-install  fail in 59766 baseline untested
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 14 guest-localmigrate.2  fail in 59766 baseline untested
 test-armhf-armhf-xl-rtds 14 guest-start.2  fail in 59766 baseline untested
 test-amd64-i386-libvirt-xsm  11 guest-start  fail in 59766 like 58558
 test-amd64-amd64-libvirt 11 guest-start  fail in 59766 like 58558
 test-amd64-amd64-libvirt-xsm 11 guest-start  fail in 59766 like 58558
 test-amd64-i386-libvirt  11 guest-start  fail in 59766 like 58581
 test-armhf-armhf-xl-credit2   6 xen-boot  fail like 58581
 test-armhf-armhf-xl   6 xen-boot  fail like 58581
 test-armhf-armhf-xl-multivcpu  6 xen-boot  fail like 58581
 test-armhf-armhf-xl-xsm   6 xen-boot  fail like 58581
 test-armhf-armhf-libvirt-xsm  6 xen-boot  fail like 58581
 test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop  fail like 58581

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-rtds 12 migrate-support-check  fail in 59766 never pass
 test-amd64-i386-freebsd10-i386  9 freebsd-install  fail never pass
 test-amd64-i386-freebsd10-amd64  9 freebsd-install  fail never pass
 test-amd64-amd64-xl-pvh-amd  11 guest-start  fail never pass
 test-amd64-i386-libvirt  12 migrate-support-check  fail never pass
 test-amd64-i386-libvirt-xsm  12 migrate-support-check  fail never pass
 test-amd64-amd64-libvirt 12 migrate-support-check  fail never pass
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop  fail never pass
 test-armhf-armhf-xl-cubietruck  6 xen-boot  fail never pass
 test-amd64-amd64-libvirt-xsm 12 migrate-support-check  fail never pass

version targeted for testing:
 linux  866cebe251f4fb2b435f4ecfe6d3bb4025938533
baseline version:
 linux  d048c068d00da7d4cfa5ea7651933b99026958cf

Last test of basis  58581  2015-06-15 09:42:22 Z   36 days
Failing since       58976  2015-06-29 19:43:23 Z   22 days   31 attempts
Testing same since  59412  2015-07-11 00:18:42 Z   10 days   19 attempts


308 people touched revisions under test,
not listing them all

jobs:
 build-amd64-xsm  pass
 build-armhf-xsm  pass
 build-i386-xsm   pass
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops  pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm

Re: [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest

2015-07-21 Thread Boris Ostrovsky


On 07/20/2015 10:43 AM, Boris Ostrovsky wrote:

On 07/20/2015 10:09 AM, Dario Faggioli wrote:

On Fri, 2015-07-17 at 14:17 -0400, Boris Ostrovsky wrote:

On 07/17/2015 03:27 AM, Dario Faggioli wrote:

In the meanwhile, what should we do? Document this? How? "don't use
vNUMA with PV guest in SMT enabled systems" seems a bit harsh... Is
there a workaround we can put in place/suggest?

I haven't been able to reproduce this on my Intel box because I think I
have different core enumeration.


Yes, most likely, that's highly topology dependent. :-(


Can you try adding
cpuid=['0x1:ebx=0001']
to your config file?


Done (sorry for the delay, the testbox was busy doing other stuff).

Still no joy (.101 is the IP address of the guest, domain id 3):

root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# ssh root@192.168.1.101 "yes > /dev/null 2>&1 &"
root@Zhaman:~# xl vcpu-list 3
Name   ID  VCPU   CPU  State  Time(s)  Affinity (Hard / Soft)
test    3     0     4   r--      23.6  all / 0-7
test    3     1     9   r--      19.8  all / 0-7
test    3     2     8   -b-       0.4  all / 8-15
test    3     3     4   -b-       0.2  all / 8-15

*HOWEVER* it seems to have an effect. In fact, now, topology as it is
shown in /sys/... is different:

root@test:~# cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list

0
(it was 0-1)

This, OTOH, is still the same:
root@test:~# cat /sys/devices/system/cpu/cpu0/topology/core_siblings_list

0-3

Also, I now see this:

[0.150560] [ cut here ]
[0.150560] WARNING: CPU: 2 PID: 0 at ../arch/x86/kernel/smpboot.c:317 topology_sane.isra.2+0x74/0x88()
[0.150560] sched: CPU #2's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.

[0.150560] Modules linked in:
[0.150560] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0+ #1
[0.150560]  0009 88001ee2fdd0 81657c7b 
810bbd2c
[0.150560]  88001ee2fe20 88001ee2fe10 81081510 
88001ee2fea0
[0.150560]  8103aa02 88003ea0a001  
88001f20a040

[0.150560] Call Trace:
[0.150560]  [] dump_stack+0x4f/0x7b
[0.150560]  [] ? up+0x39/0x3e
[0.150560]  [] warn_slowpath_common+0xa1/0xbb
[0.150560]  [] ? topology_sane.isra.2+0x74/0x88
[0.150560]  [] warn_slowpath_fmt+0x46/0x48
[0.150560]  [] ? __cpuid.constprop.0+0x15/0x19
[0.150560]  [] topology_sane.isra.2+0x74/0x88
[0.150560]  [] set_cpu_sibling_map+0x27a/0x444
[0.150560]  [] ? numa_add_cpu+0x98/0x9f
[0.150560]  [] cpu_bringup+0x63/0xa8
[0.150560]  [] cpu_bringup_and_idle+0xe/0x1a
[0.150560] ---[ end trace 63d204896cce9f68 ]---

Notice that it now says 'llc-sibling', while, before, it was saying
'smt-sibling'.


Exactly. You are now passing the first topology test which was to see 
that threads are on the same node. And since each processor has only 
one thread (as evidenced by thread_siblings_list) we are good.


The second test checks that cores (i.e. things that share last level 
cache) are on the same node. And they are not.






On AMD, BTW, we fail a different test so some other bits probably need
to be tweaked. You may fail it too (the LLC sanity check).


Yep, that's the one I guess. Should I try something more/else?



I'll need to see how LLC IDs are calculated, probably also from some 
CPUID bits.



No, can't do this: LLC is calculated from CPUID leaf 4 (on Intel) which 
use indexes in ECX register and xl syntax doesn't allow you to override 
CPUIDs for such leaves.
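
For reference, a rough sketch of how an LLC ID can be derived from CPUID leaf 4, as far as I understand the Linux approach: bits 25:14 of EAX give the number of logical processors sharing the cache minus one, and the LLC ID is the APIC ID with that many low-order bits masked off. The function names are invented here and the EAX value is passed in as a plain number, so nothing below actually executes CPUID.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that (1 << n) >= count. */
static unsigned int count_order(unsigned int count)
{
    unsigned int order = 0;

    assert(count > 0);
    while ((1u << order) < count)
        order++;
    return order;
}

/*
 * eax: value of CPUID.(EAX=4, ECX=index).EAX for the last cache level,
 * where the subleaf index goes in ECX (the part xl cannot override).
 */
static uint32_t llc_id_from_leaf4(uint32_t eax, uint32_t apicid)
{
    /* EAX[25:14] holds "logical processors sharing this cache" - 1. */
    unsigned int sharing = 1 + ((eax >> 14) & 0xfff);
    unsigned int shift = count_order(sharing);

    /* CPUs whose APIC IDs differ only in the low bits share the cache. */
    return apicid & ~((1u << shift) - 1);
}
```

So if the leaf says 16 logical processors share the cache, APIC IDs 0x20 to 0x2f all map to LLC ID 0x20, regardless of which virtual node the toolstack places them on.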


-boris

The question though will be --- what do we do with how cache sizes 
(and TLB sizes for that matter) are presented to the guests. Do we 
scale them down per thread?


-boris



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

[Xen-devel] altp2m: patch 07/15 and 11/15

2015-07-21 Thread Sahita, Ravi

Hi maintainers,

While we are sorting through the last 2 to 3 patches that have the main open
comments (patches 5, 6 and 10/15), could you please ack patches 07/15 and 11/15
if you are ok with them? All previous comments on those have been addressed
(this will allow us to focus on the remaining open items). Sorry for
(unintentionally) missing some emails regarding patches 5 and 10; some changes
were skipped in rev 6 so that we could focus on the Category 1 (ABI) ones.

Thanks,
Ravi



[Xen-devel] [linux-linus test] 59789: regressions - FAIL

2015-07-21 Thread osstest service owner

flight 59789 linux-linus real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59789/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-pvops  5 kernel-build  fail REGR. vs. 59254
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 59254

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-xl-rtds 11 guest-start   fail REGR. vs. 59254

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)  blocked  n/a
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-credit2  1 build-check(1)  blocked  n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-intel  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-intel  1 build-check(1)  blocked  n/a
 test-amd64-i386-pair  1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemut-rhel6hvm-amd  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-pair  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-libvirt  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-rtds  1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-qemuu-rhel6hvm-amd  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-win7-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-ovmf-amd64  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64  1 build-check(1)  blocked  n/a
 test-amd64-i386-freebsd10-i386  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1  1 build-check(1)  blocked  n/a
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemut-winxpsp3  1 build-check(1)  blocked  n/a
 test-amd64-i386-xl-qemuu-winxpsp3  1 build-check(1)  blocked  n/a
 test-armhf-armhf-xl-arndale  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-xsm  12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt  12 migrate-support-check  fail  never pass
 test-armhf-armhf-libvirt-xsm  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-cubietruck  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-credit2  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl-multivcpu  12 migrate-support-check  fail  never pass
 test-armhf-armhf-xl  12 migrate-support-check  fail  never pass

version targeted for testing:
 linux                9d634c410b07be7bf637ea03362d3ff132088fe3
baseline version:
 linux                45820c294fe1b1a9df495d57f40585ef2d069a39

Last test of basis    59254  2015-07-09 04:20:48 Z   12 days
Failing since 59348  2015-07-10 04

Re: [Xen-devel] [PATCH v5 10/15] x86/altp2m: add remaining support routines.

2015-07-21 Thread Sahita, Ravi

>From: Jan Beulich [mailto:jbeul...@suse.com]
>Sent: Monday, July 20, 2015 11:38 PM
>
 +void p2m_altp2m_propagate_change(struct domain *d, gfn_t gfn,
 + mfn_t mfn, unsigned int page_order,
 + p2m_type_t p2mt, p2m_access_t
 +p2ma) {
 +struct p2m_domain *p2m;
 +p2m_access_t a;
 +p2m_type_t t;
 +mfn_t m;
 +uint16_t i;
 +bool_t reset_p2m;
 +unsigned int reset_count = 0;
 +uint16_t last_reset_idx = ~0;
 +
 +if ( !altp2m_active(d) )
 +return;
 +
 +altp2m_list_lock(d);
 +
 +for ( i = 0; i < MAX_ALTP2M; i++ )
 +{
 +if ( d->arch.altp2m_eptp[i] == INVALID_MFN )
 +continue;
 +
 +p2m = d->arch.altp2m_p2m[i];
 +m = get_gfn_type_access(p2m, gfn_x(gfn), &t, &a, 0,
 + NULL);
 +
 +reset_p2m = 0;
 +
 +/* Check for a dropped page that may impact this altp2m */
 +if ( mfn_x(mfn) == INVALID_MFN &&
 + gfn_x(gfn) >= p2m->min_remapped_gfn &&
 + gfn_x(gfn) <= p2m->max_remapped_gfn )
 +reset_p2m = 1;
>>>
>>>Considering that this looks like an optimization, what's the
>>>downside of possibly having min=0 and max=space>?
>>>I.e.
>>>can there a long latency operation result that's this way a guest
>>>can
>>>effect?
>>>
>>
>> ... A p2m is a gfn->mfn map, amongst other things. There is a
>> reverse
>> mfn->gfn map, but that is only valid for the host p2m. Unless the
>> remap altp2m hypercall is used, the gfn->mfn map in every altp2m
>> mirrors the gfn->mfn map in the host p2m (or a subset thereof, due
>> to lazy-copy), so handling removal of an mfn from a guest is simple:
>> do a reverse look up for the host p2m and mark the relevant gfn as
>> invalid, then do the same for every altp2m where that gfn is
>> currently
>>>valid.
>>
>> Remap changes things: it says take gfn1 and replace ->mfn with the
>> ->mfn of gfn2. Here is where the optimization is used and the
>> ->invalidate
>logic is:
>> record the lowest and highest gfn2's that have been used in remap
>> ops; if an mfn is dropped from the hostp2m, for the purposes of
>> altp2m invalidation, see if the gfn derived from the host p2m
>> reverse lookup falls within the range of used gfn2's. If it does,
>> an invalidation is required. Which is why min and max are inited
>> the way they are - hope the explanation clarifies this optimization.
>
>Sadly it doesn't, it just re-states what I already understood and
>doesn't answer the question: What happens if min=0 and
>max=space>? I.e. can the guest nullify the optimization by careful
>space>fiddling
 issuing
>some of the new hypercalls, and if so will this have any negative
>impact on
 the
>hypervisor? I'm asking this from a security standpoint ...
>

 To take that exact case, If min=0 and max=
 then any hostp2m change where the first mfn is dropped, will cause
 all altp2ms to be reset even if the mfn dropped doesn't affect
 altp2ms at all, which wont serve as an optimization at all - Hope that
>clarifies.
>>>
>>>Again - no. I understand the optimization is gone then. But what's the
>effect?
>>>I.e. will the guest, by extending this range to be arbitrarily wide,
>>>be able
>> to
>>>cause a long latency hypervisor operation (i.e. a DoS)?
>>>
>>
>> The extent of the range affects the likelihood of an invalidation. It
>> has no impact on the cost of an invalidation (so no its not a DoS issue).
>> I'm not sure what change you are suggesting here or just clarification
>> (if you think this optimization is confusing perhaps some
>> documentation of this optimization will help?)
>
>Well, the optimization must be optimizing _something_. And hence
>_something_ must go sub-optimal when the optimization is being subverted.
>And the question is how much worse un-optimized is compared to optimized.
>
>It _looks like_ the overall effect really is just to avoid a one time (for a 
>given
>non-preemptible operation) reset, but whether that's really the case depends
>on the calling contexts (which, as said a couple of times before, is hard to 
>see
>for a patch that introduces functions without callers - hence the question).
>
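
To illustrate the optimization being discussed, here is a minimal sketch of the range tracking as Ravi describes it: remap operations widen a [min, max] window of gfn2 values, and a dropped page only forces an altp2m reset if its gfn falls inside that window. The struct and function names are hypothetical, and the initial values (min at the maximum, max at zero) are an assumption about what "inited the way they are" refers to; only the two field names come from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two p2m fields named in the quoted hunk. */
struct remap_range {
    uint64_t min_remapped_gfn;
    uint64_t max_remapped_gfn;
};

/* Start with an empty window: min above max, so no gfn can match. */
static void remap_range_init(struct remap_range *r)
{
    assert(r);
    r->min_remapped_gfn = UINT64_MAX;
    r->max_remapped_gfn = 0;
}

/* Record the gfn2 of a remap operation (gfn1 takes over gfn2's mfn). */
static void remap_range_note(struct remap_range *r, uint64_t gfn2)
{
    assert(r);
    if (gfn2 < r->min_remapped_gfn)
        r->min_remapped_gfn = gfn2;
    if (gfn2 > r->max_remapped_gfn)
        r->max_remapped_gfn = gfn2;
}

/* Does dropping the page behind this gfn require resetting the altp2m? */
static bool drop_needs_reset(const struct remap_range *r, uint64_t gfn)
{
    assert(r);
    return gfn >= r->min_remapped_gfn && gfn <= r->max_remapped_gfn;
}
```

In this model the check itself is two comparisons no matter how wide the window is. A guest that stretches the window from 0 to the top of the address space makes drop_needs_reset() return true for every dropped page, so what changes is how often a reset is triggered, not the cost of the check.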

As you now understand, invalidating an altp2m effectively resets it to be a 
(lazily-copied) exact duplicate of the host p2m again -- so losing any altp2m 
permissions restrictions or remaps. This is a first cut at minimizing the 
likelihood of that happening unnecessarily. There's some discussion on this 
first cut between Tim and Ed going back to February or March. The intention 
co

Re: [Xen-devel] [PATCH v2 03/23] x86: zero BSS using stosl instead of stosb

2015-07-21 Thread Daniel Kiper

On Tue, Jul 21, 2015 at 03:37:48AM -0600, Jan Beulich wrote:
> >>> On 20.07.15 at 16:28,  wrote:
>
> ... because of ??? Nowadays - with X86_FEATURE_ERMS - rep stosb
> is expected to be faster than rep stosl.

OK, I did not know about that. However, as far as I know, this feature
was introduced in 2012 with Ivy Bridge, so I suppose there are still a
lot of machines in the wild which do not support it. Anyway, because
this code is not performance critical, I am not going to insist on one
solution or the other. However, Andrew suggested this change, so please
agree with him on which direction we should go. I will do whatever you
agree on.

Daniel


Re: [Xen-devel] [PATCH v5 05/15] x86/altp2m: basic data structures and support routines.

2015-07-21 Thread Sahita, Ravi

>From: George Dunlap [mailto:george.dun...@eu.citrix.com]
>Sent: Tuesday, July 14, 2015 8:57 AM
>
>On 07/14/2015 01:14 AM, Ed White wrote:
>> Add the basic data structures needed to support alternate p2m's and
>> the functions to initialise them and tear them down.
>>
>> Although Intel hardware can handle 512 EPTP's per hardware thread
>> concurrently, only 10 per domain are supported in this patch for
>> performance reasons.
>>
>> The iterator in hap_enable() does need to handle 512, so that is now
>> uint16_t.
>>
>> This change also splits the p2m lock into one lock type for altp2m's
>> and another type for all other p2m's. The purpose of this is to place
>> the altp2m list lock between the types, so the list lock can be
>> acquired whilst holding the host p2m lock.
>>
>> Signed-off-by: Ed White 
>>
>> Reviewed-by: Andrew Cooper 
>
>With the number of major changes you made here, you definitely should
>have dropped this reviewed-by.
>
>> ---
>>  xen/arch/x86/hvm/Makefile|   1 +
>>  xen/arch/x86/hvm/altp2m.c|  77 +
>>  xen/arch/x86/hvm/hvm.c   |  21 
>>  xen/arch/x86/mm/hap/hap.c|  38 ++-
>>  xen/arch/x86/mm/mm-locks.h   |  46 +-
>>  xen/arch/x86/mm/p2m.c| 102
>+++
>>  xen/include/asm-x86/domain.h |  10 
>>  xen/include/asm-x86/hvm/altp2m.h |  38 +++
>>  xen/include/asm-x86/hvm/hvm.h|  14 ++
>>  xen/include/asm-x86/hvm/vcpu.h   |   9 
>>  xen/include/asm-x86/p2m.h|  32 +++-
>>  11 files changed, 384 insertions(+), 4 deletions(-)  create mode
>> 100644 xen/arch/x86/hvm/altp2m.c  create mode 100644
>> xen/include/asm-x86/hvm/altp2m.h
>>
>> diff --git a/xen/arch/x86/hvm/Makefile b/xen/arch/x86/hvm/Makefile
>> index 69af47f..eb1a37b 100644
>> --- a/xen/arch/x86/hvm/Makefile
>> +++ b/xen/arch/x86/hvm/Makefile
>> @@ -1,6 +1,7 @@
>>  subdir-y += svm
>>  subdir-y += vmx
>>
>> +obj-y += altp2m.o
>>  obj-y += asid.o
>>  obj-y += emulate.o
>>  obj-y += event.o
>> diff --git a/xen/arch/x86/hvm/altp2m.c b/xen/arch/x86/hvm/altp2m.c new
>> file mode 100644 index 000..a10f347
>> --- /dev/null
>> +++ b/xen/arch/x86/hvm/altp2m.c
>> @@ -0,0 +1,77 @@
>> +/*
>> + * Alternate p2m HVM
>> + * Copyright (c) 2014, Intel Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> +modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but
>> +WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of
>MERCHANTABILITY
>> +or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> +along with
>> + * this program; if not, write to the Free Software Foundation, Inc.,
>> +59 Temple
>> + * Place - Suite 330, Boston, MA 02111-1307 USA.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +void
>> +altp2m_vcpu_reset(struct vcpu *v)
>
>OK, so it looks like at the end of this patch series:
>* altp2m_vcpu_reset() isn't called outside this file
>* altp2m_vcpu_initialise() is only called from hvm.c when the guest enables
>the altp2m functionality
>* altp2m_vcpu_destroy() is called when the guest disables altp2m
>functionality, or when the vcpu is destroyed
>
>Looking at the "vcpu_destroy" case, it's hard to tell exactly how much on that
>path is actually useful; but it looks like the only thing that's critical is 
>decreasing
>the active_vcpu count of the p2m that's being used.
>
>Also, it looks like these functions don't do anything specifically with the HVM
>side of things.
>
>So on the whole, it seems like these would better go along with the other
>altp2m functions inside p2m.c.
>
>Thoughts?

George, apologies on this one - I completely missed this email from you -

We could move these functions into p2m.c, except for destroy, where the VMCS
updates are critical.
We will try to get this into what will be our final rev (at least as a 4.6
candidate).
Again, sorry for the snafu on this email.

Ravi

>
> -George




Re: [Xen-devel] [PATCH OSSTEST v2 03/13] osstest migrate support check catch -> variables

2015-07-21 Thread Ian Jackson

Wei Liu writes ("Re: [PATCH OSSTEST v2 03/13] osstest migrate support check 
catch -> variables"):
> Do I need to change anything in this patch? I guess not? It's not very
> clear to me.

Ian C was asking whether the patch (which I wrote) was right, in a
particular respect.  I answered that it is correct.  So no, you don't
need to do anything to this patch.

Ian.


[Xen-devel] [PATCH v3] xl: fix vcpus to vnode assignement in config file

2015-07-21 Thread Dario Faggioli

In fact, right now, the following (legitimate)
configuration:

 vcpus   = '4'
 vnuma = [ [ "pnode=0","size=512","vcpus=0,1","vdistances=10,20"  ],
   [ "pnode=1","size=512","vcpus=2,3","vdistances=20,10"  ] ]

Produces the following error:

 # xl create /etc/xen/test.cfg
 Parsing config from /etc/xen/test.cfg
 xl: maxvcpus < vcpu

That is because we only process the first element of the
"vcpus=" list (of each vnode specification). Therefore,
in the above case, we only see 2 vcpus, out of 4, being
assigned to the vnodes, and hence the error.

What we need is either a multidimensional array, or a
bitmap, to temporarily store the vcpus of a vnode, while
parsing the vnuma config entry. Let's use the latter,
which happens to also make it easier to copy the outcome
of the parsing to its final destination in b_info, if
everything goes ok.

Signed-off-by: Dario Faggioli 
Acked-by: Wei Liu 
---
Changes from v2:
 * added the description of the error to the changelog,
   as requested during review
Changes from v1:
 * fix coding style, as requested during review
---
Cc: Ian Jackson 
Cc: Ian Campbell 
---
 tools/libxl/xl_cmdimpl.c |   34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 5c6d1b0..1d45dd5 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1076,9 +1076,7 @@ static void parse_vnuma_config(const XLU_Config *config,
 /* Temporary storage for parsed vcpus information to avoid
  * parsing config twice. This array has num_vnuma elements.
  */
-struct vcpu_range_parsed {
-unsigned long start, end;
-} *vcpu_range_parsed;
+libxl_bitmap *vcpu_parsed;
 
 libxl_physinfo_init(&physinfo);
 if (libxl_get_physinfo(ctx, &physinfo) != 0) {
@@ -1095,7 +1093,14 @@ static void parse_vnuma_config(const XLU_Config *config,
 
 b_info->num_vnuma_nodes = num_vnuma;
 b_info->vnuma_nodes = xcalloc(num_vnuma, sizeof(libxl_vnode_info));
-vcpu_range_parsed = xcalloc(num_vnuma, sizeof(*vcpu_range_parsed));
+vcpu_parsed = xcalloc(num_vnuma, sizeof(libxl_bitmap));
+for (i = 0; i < num_vnuma; i++) {
+libxl_bitmap_init(&vcpu_parsed[i]);
+if (libxl_cpu_bitmap_alloc(ctx, &vcpu_parsed[i], b_info->max_vcpus)) {
+fprintf(stderr, "libxl_node_bitmap_alloc failed.\n");
+exit(1);
+}
+}
 
 for (i = 0; i < b_info->num_vnuma_nodes; i++) {
 libxl_vnode_info *p = &b_info->vnuma_nodes[i];
@@ -1165,12 +1170,14 @@ static void parse_vnuma_config(const XLU_Config *config,
 split_string_into_string_list(value, ",", &cpu_spec_list);
 len = libxl_string_list_length(&cpu_spec_list);
 
-for (j = 0; j < len; j++)
+for (j = 0; j < len; j++) {
 parse_range(cpu_spec_list[j], &s, &e);
+for (; s <= e; s++) {
+libxl_bitmap_set(&vcpu_parsed[i], s);
+max_vcpus++;
+}
+}
 
-vcpu_range_parsed[i].start = s;
-vcpu_range_parsed[i].end   = e;
-max_vcpus += (e - s + 1);
 libxl_string_list_dispose(&cpu_spec_list);
 } else if (!strcmp("vdistances", option)) {
 libxl_string_list vdist;
@@ -1209,17 +1216,12 @@ static void parse_vnuma_config(const XLU_Config *config,
 
 for (i = 0; i < b_info->num_vnuma_nodes; i++) {
 libxl_vnode_info *p = &b_info->vnuma_nodes[i];
-int cpu;
 
-libxl_cpu_bitmap_alloc(ctx, &p->vcpus, b_info->max_vcpus);
-libxl_bitmap_set_none(&p->vcpus);
-for (cpu = vcpu_range_parsed[i].start;
- cpu <= vcpu_range_parsed[i].end;
- cpu++)
-libxl_bitmap_set(&p->vcpus, cpu);
+libxl_bitmap_copy_alloc(ctx, &p->vcpus, &vcpu_parsed[i]);
+libxl_bitmap_dispose(&vcpu_parsed[i]);
 }
 
-free(vcpu_range_parsed);
+free(vcpu_parsed);
 }
 
 static void parse_config_data(const char *config_source,
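
As a standalone illustration of the approach taken in the patch above, here is a small sketch that walks every comma-separated element of a "vcpus=" value and records each vcpu in a bitmap. It does not use libxl: parse_vcpu_list() is a made-up helper, the bitmap is a plain 64-bit word (so only vcpus 0..63 are handled), and unlike the patch it counts each vcpu once even if it is listed twice.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a "vcpus=" value such as "0,1" or "0-1,3" into a bitmap.
 * Returns the number of distinct vcpus set, or -1 on malformed input.
 */
static int parse_vcpu_list(const char *spec, uint64_t *bitmap)
{
    const char *p = spec;
    int count = 0;

    assert(spec && bitmap);
    *bitmap = 0;

    while (*p) {
        char *end;
        unsigned long s, e;

        /* Each element is either a single number or a "start-end" range. */
        s = strtoul(p, &end, 10);
        if (end == p)
            return -1;
        e = s;
        if (*end == '-') {
            p = end + 1;
            e = strtoul(p, &end, 10);
            if (end == p)
                return -1;
        }
        if (s > e || e > 63)
            return -1;

        /* Record every vcpu of this element, not just the last range. */
        for (; s <= e; s++) {
            if (!(*bitmap & (UINT64_C(1) << s)))
                count++;
            *bitmap |= UINT64_C(1) << s;
        }

        if (*end == ',')
            end++;
        else if (*end != '\0')
            return -1;
        p = end;
    }
    return count;
}
```

For the configuration in the changelog, "0,1" yields bitmap 0x3 and "2,3" yields 0xc, two vcpus each, so the per-vnode totals add up to the 4 vcpus of the guest.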



Re: [Xen-devel] [PATCH OSSTEST v2 03/13] osstest migrate support check catch -> variables

2015-07-21 Thread Wei Liu

On Tue, Jul 21, 2015 at 05:32:25PM +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH OSSTEST v2 03/13] osstest migrate support 
> check catch -> variables"):
> > On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> > > @@ -300,7 +300,9 @@ proc run-job/test-pair {} {
> > >  proc test-guest-migr {g} {
> > > -if {[catch { run-ts . = ts-migrate-support-check + host $g }]} return
> > > +set to_reap [spawn-ts . = ts-migrate-support-check + host $g]
> > 
> > Most other uses of spawn-ts use [eval spawn-ts ]. I think those
> > are just trying to expand a $args into multiple arguments to spawn-ts,
> > and hence that isn't needed here (because $g is a singleton argument
> > already). But TBH I don't know...
> 
> Yes, the effect of the
> set reap [eval spawn-ts $args]
> is to expand the list in $args as arguments to spawn-ts.  $g is a
> singleton as you say, not a list.
> 
> spawn-ts has the same argument convention as run-ts.
> 

Do I need to change anything in this patch? I guess not? It's not very
clear to me.

Wei.

> Ian.


Re: [Xen-devel] [PATCH v2 02/23] x86/boot: copy only text section from .lnk file to .bin file

2015-07-21 Thread Daniel Kiper

On Tue, Jul 21, 2015 at 03:35:07AM -0600, Jan Beulich wrote:
> >>> On 20.07.15 at 16:28,  wrote:
>
> Without any explanation (description) I'm inclined to say this makes
> things more fragile instead of improving the situation. As it looks
> like we indeed pointlessly copy .eh_frame, but I think this would
> better be avoided by suppressing its generation (i.e. add
> -fno-asynchronous-unwind-tables just like Rules.mk has).

Makes sense. However, there is still room for two small optimizations.

First of all, ld generates a .got.plt section and objcopy copies it to the
binary file. It is not needed because we do not link our stuff here with
shared libraries. So we can use the objcopy -R option to remove it (if you do
not like -j .text). This way we could save 15 bytes (at least on my machines).

We could also save another 3 bytes (per xen/arch/x86/boot C input file)
in the final Xen binary in the worst case :-))). We just need to generate the
output assembly files as strings of .byte instead of .long.

Where should I stop?

Daniel


Re: [Xen-devel] [PATCH OSSTEST v2 08/13] ts-libvirt-build: run libvirt test suite

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 17:22 +0100, Wei Liu wrote:
> On Mon, Jul 13, 2015 at 12:25:35PM +0100, Ian Campbell wrote:
> > On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> > > We're interested in xlconfigtest.
> > > 
> > > Signed-off-by: Wei Liu 
> > > Cc: Ian Campbell 
> > > Cc: Ian Jackson 
> > > [...]
> > > + | tee ../libvirt-test-suite-log
> > 
> > Should something be collecting/stashing that log file?
> > 
> 
> Not sure.
> 
> > I think since it is logged to the output of the command it's 
> > probably
> > not needed, but I suppose you added the tee for a reason?
> > 
> 
> I followed suit. There is "tee log" in build command. I can't seem 
> to
> find that log stashed anywhere though.
> 
> What should I do about this?

Given the precedent, I think nothing.

Acked-by: Ian Campbell 




Re: [Xen-devel] [PATCH OSSTEST v2 03/13] osstest migrate support check catch -> variables

2015-07-21 Thread Ian Jackson

Ian Campbell writes ("Re: [PATCH OSSTEST v2 03/13] osstest migrate support 
check catch -> variables"):
> On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> > @@ -300,7 +300,9 @@ proc run-job/test-pair {} {
> >  proc test-guest-migr {g} {
> > -if {[catch { run-ts . = ts-migrate-support-check + host $g }]} return
> > +set to_reap [spawn-ts . = ts-migrate-support-check + host $g]
> 
> Most other uses of spawn-ts use [eval spawn-ts ]. I think those
> are just trying to expand a $args into multiple arguments to spawn-ts,
> and hence that isn't needed here (because $g is a singleton argument
> already). But TBH I don't know...

Yes, the effect of the
set reap [eval spawn-ts $args]
is to expand the list in $args as arguments to spawn-ts.  $g is a
singleton as you say, not a list.

spawn-ts has the same argument convention as run-ts.

Ian.


Re: [Xen-devel] [PATCH v2] xl: fix vcpus to vnode assignement in config file

2015-07-21 Thread Dario Faggioli

On Tue, 2015-07-21 at 15:13 +0100, Ian Campbell wrote:
> On Mon, 2015-07-20 at 11:30 +0200, Dario Faggioli wrote:
> > In fact, right now, if the "vcpus=" list (where the
> > user specifies what vcpus should be part of a vnode)
> > has multiple elements, things don't work.
> > E.g., the following examples all result in failure
> > to create the guest:
> 
> What is the failure?
> 
With the following configuration:
vcpus   = '4'
memory  = '1024'
vnuma = [ [ "pnode=0","size=512","vcpus=0,1","vdistances=10,20"  ],
  [ "pnode=1","size=512","vcpus=2,3","vdistances=20,10"  ] ]

The error message is this one:
xl: maxvcpus < vcpus

The reason is that, without this change, we only process the first
element of the "vcpus=" list, i.e., only vcpu 0 (for vnode 0) and vcpu 2
(for vnode 1), up to a total of 2 (rather than 4) vcpus.

Basically, things only work if for each vnode, its vcpus are specified
by means of a single element.

> > Reason is we need either a multidimentional array,
> > or a bitmap, to temporary store the vcpus of a
> > vnode, while parsing the vnuma config entry.
> 
> That sounds like a cure, not the reason for the failure. Please can you
> explain the nature of the failure, so it becomes clear why this change
> is needed.
> 
Ok, I'll mention this in here.

Thanks and Regards,
Dario
-- 
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)



Re: [Xen-devel] [PATCH OSSTEST v2 08/13] ts-libvirt-build: run libvirt test suite

2015-07-21 Thread Wei Liu

On Mon, Jul 13, 2015 at 12:25:35PM +0100, Ian Campbell wrote:
> On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> > We're interested in xlconfigtest.
> > 
> > Signed-off-by: Wei Liu 
> > Cc: Ian Campbell 
> > Cc: Ian Jackson 
> > [...]
> > + | tee ../libvirt-test-suite-log
> 
> Should something be collecting/stashing that log file?
> 

Not sure.

> I think since it is logged to the output of the command it's probably
> not needed, but I suppose you added the tee for a reason?
> 

I followed suit. There is "tee log" in build command. I can't seem to
find that log stashed anywhere though.

What should I do about this?

Wei.


Re: [Xen-devel] [PATCH OSSTEST v2 09/13] ts-debian-hvm-install: stub out libvirt + ovmf / rombios

2015-07-21 Thread Wei Liu

On Mon, Jul 13, 2015 at 12:27:30PM +0100, Ian Campbell wrote:
> On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> > Libvirt's configuration converter doesn't know how to deal with BIOS
> > selection. The end result is it always uses the default one (seabios).
> > Stub out ovmf and rombios to avoid false positive results.
> 
> It's worth mentioning here whether or not we expect to currently see
> such configurations in osstest today.
> 

I don't expect to see those configurations any time soon. I will explain
this in commit message.

> If we do expect to see them then it would be good to filter them in
> make-flight to avoid wasting lots of test time.
> 

The filtering is done in later patch.

This change is more like another level of safety in case we change
something in make-flight by mistake.

Wei.

> > 
> > This restriction will be removed once libvirt's converter knows how to
> > deal with BIOS selection.
> > 
> > Signed-off-by: Wei Liu 
> > Cc: Ian Campbell 
> > Cc: Ian Jackson 
> > ---
> >  ts-debian-hvm-install | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/ts-debian-hvm-install b/ts-debian-hvm-install
> > index f05b1a7..bd16506 100755
> > --- a/ts-debian-hvm-install
> > +++ b/ts-debian-hvm-install
> > @@ -28,6 +28,13 @@ if (@ARGV && $ARGV[0] =~ m/^--stage(\d+)$/) { $stage=$1; 
> > shift @ARGV; }
> >  
> >  defined($r{bios}) or die "Need to define which bios to use";
> >  
> > +# Libvirt doesn't know anything about bios. It will always use the
> > +# default one (seabios). Stub out rombios and ovmf to avoid false
> > +# positive results.
> > +if ($r{bios} =~ m/ovmf|rombios/ && $r{toolstack} eq 'libvirt') {
> > +die "libvirt + $r{bios} is not supported yet.";
> > +}
> > +
> >  our ($whhost,$gn) = @ARGV;
> >  $whhost ||= 'host';
> >  $gn ||= 'debianhvm';
> 


Re: [Xen-devel] [PATCH for-4.6 3/3] xen: remove non-POSIX error codes

2015-07-21 Thread Jan Beulich

>>> On 21.07.15 at 17:56,  wrote:
> --- a/xen/include/public/errno.h
> +++ b/xen/include/public/errno.h
> @@ -54,8 +54,6 @@ XEN_ERRNO(EDEADLK,  35) /* Resource deadlock would 
> occur */
>  XEN_ERRNO(ENAMETOOLONG,  36) /* File name too long */
>  XEN_ERRNO(ENOLCK,37) /* No record locks available */
>  XEN_ERRNO(ENOSYS,38) /* Function not implemented */
> -XEN_ERRNO(EBADRQC,   56) /* Invalid request code */
> -XEN_ERRNO(EBADSLT,   57) /* Invalid slot */
>  XEN_ERRNO(ENODATA,   61) /* No data available */
>  XEN_ERRNO(ETIME, 62) /* Timer expired */
>  XEN_ERRNO(EBADMSG,   74) /* Not a data message */
> @@ -64,15 +62,12 @@ XEN_ERRNO(EILSEQ, 84) /* Illegal byte sequence */
>  #ifdef __XEN__ /* Internal only, should never be exposed to the guest. */
>  XEN_ERRNO(ERESTART,  85) /* Interrupted system call should be restarted 
> */
>  #endif
> -XEN_ERRNO(EUSERS,87) /* Too many users */
>  XEN_ERRNO(EOPNOTSUPP,95) /* Operation not supported on transport 
> endpoint */
>  XEN_ERRNO(EADDRINUSE,98) /* Address already in use */
>  XEN_ERRNO(EADDRNOTAVAIL, 99) /* Cannot assign requested address */
>  XEN_ERRNO(ENOBUFS,   105)/* No buffer space available */
>  XEN_ERRNO(EISCONN,   106)/* Transport endpoint is already connected */
>  XEN_ERRNO(ENOTCONN,  107)/* Transport endpoint is not connected */
> -XEN_ERRNO(ESHUTDOWN, 108)/* Cannot send after transport endpoint 
> shutdown */
> -XEN_ERRNO(ETOOMANYREFS,  109)/* Too many references: cannot splice */
>  XEN_ERRNO(ETIMEDOUT, 110)/* Connection timed out */

Considering that I stripped several values when putting together
the change that introduced this header, shouldn't we perhaps
consider re-adding a few that could be used as replacements?
ENOTSOCK (possibly usable in patch 1) would be one such
candidate.

Jan



Re: [Xen-devel] [PATCH v2] x86/HVM: avoid pointer wraparound in bufioreq handling

2015-07-21 Thread Stefano Stabellini

On Thu, 18 Jun 2015, Jan Beulich wrote:
> The number of slots per page being 511 (i.e. not a power of two) means
> that the (32-bit) read and write indexes going beyond 2^32 will likely
> disturb operation. Extend I/O req server creation so the caller can
> indicate that it is using suitable atomic accesses where needed (not
> all accesses to the two pointers really need to be atomic), allowing
> the hypervisor to atomically canonicalize both pointers when both have
> gone through at least one cycle.
> 
> Signed-off-by: Jan Beulich 
> Acked-by: Ian Campbell 
> ---
> v2: Limit canonicalization loop to IOREQ_BUFFER_SLOT_NUM iterations.
> Adjust xc_hvm_create_ioreq_server() documentation.
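
A small worked example of the problem described in the commit message, using invented helper names: with 511 slots, the slot is index % 511, and because 2^32 is not a multiple of 511 the slot sequence jumps when a 32-bit index wraps. Subtracting 511 from both indexes once both have passed a full cycle keeps their slots and their distance intact while keeping the values small. This is a non-atomic, single-threaded model only; the real patch has to do the adjustment atomically.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_NUM 511u  /* slots per page, as stated in the commit message */

/* Slot addressed by a free-running 32-bit index. */
static uint32_t slot_of(uint32_t index)
{
    return index % SLOT_NUM;
}

/*
 * Pull both indexes back by one full cycle once both have completed
 * at least one. Slots and the read/write distance are unchanged.
 */
static void canonicalize(uint32_t *read_idx, uint32_t *write_idx)
{
    assert(read_idx && write_idx);
    if (*read_idx >= SLOT_NUM && *write_idx >= SLOT_NUM) {
        *read_idx -= SLOT_NUM;
        *write_idx -= SLOT_NUM;
    }
}
```

Index 0xffffffff maps to slot 31, so the next request should land in slot 32, but the incremented index wraps to 0 and maps to slot 0 instead. With 512 slots (a power of two) the wrap would be harmless, since 2^32 is a multiple of 512.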
> 
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1933,7 +1933,8 @@ int xc_get_hvm_param(xc_interface *handl
>   *
>   * @parm xch a handle to an open hypervisor interface.
>   * @parm domid the domain id to be serviced
> - * @parm handle_bufioreq should the IOREQ Server handle buffered requests?
> + * @parm handle_bufioreq how should the IOREQ Server handle buffered requests
> + *   (HVM_IOREQSRV_BUFIOREQ_*)?
>   * @parm id pointer to an ioservid_t to receive the IOREQ Server id.
>   * @return 0 on success, -1 on failure.
>   */
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -1411,7 +1411,7 @@ int xc_hvm_create_ioreq_server(xc_interf
>  hypercall.arg[1] = HYPERCALL_BUFFER_AS_ARG(arg);
>  
>  arg->domid = domid;
> -arg->handle_bufioreq = !!handle_bufioreq;
> +arg->handle_bufioreq = handle_bufioreq;
>  
>  rc = do_xen_hypercall(xch, &hypercall);
>  
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -921,7 +921,7 @@ static void hvm_ioreq_server_disable(str
>  
>  static int hvm_ioreq_server_init(struct hvm_ioreq_server *s, struct domain *d,
>   domid_t domid, bool_t is_default,
> - bool_t handle_bufioreq, ioservid_t id)
> + int bufioreq_handling, ioservid_t id)

uint8_t?


>  {
>  struct vcpu *v;
>  int rc;
> @@ -938,7 +938,11 @@ static int hvm_ioreq_server_init(struct 
>  if ( rc )
>  return rc;
>  
> -rc = hvm_ioreq_server_setup_pages(s, is_default, handle_bufioreq);
> +if ( bufioreq_handling == HVM_IOREQSRV_BUFIOREQ_ATOMIC )
> +s->bufioreq_atomic = 1;
> +
> +rc = hvm_ioreq_server_setup_pages(
> + s, is_default, bufioreq_handling != HVM_IOREQSRV_BUFIOREQ_OFF);
>  if ( rc )
>  goto fail_map;
>  
> @@ -997,12 +1001,15 @@ static ioservid_t next_ioservid(struct d
>  }
>  
>  static int hvm_create_ioreq_server(struct domain *d, domid_t domid,
> -   bool_t is_default, bool_t handle_bufioreq,
> +   bool_t is_default, int bufioreq_handling,

uint8_t?


> ioservid_t *id)
>  {
>  struct hvm_ioreq_server *s;
>  int rc;
>  
> +if ( bufioreq_handling > HVM_IOREQSRV_BUFIOREQ_ATOMIC )
> +return -EINVAL;
> +
>  rc = -ENOMEM;
>  s = xzalloc(struct hvm_ioreq_server);
>  if ( !s )
> @@ -1015,7 +1022,7 @@ static int hvm_create_ioreq_server(struc
>  if ( is_default && d->arch.hvm_domain.default_ioreq_server != NULL )
>  goto fail2;
>  
> -rc = hvm_ioreq_server_init(s, d, domid, is_default, handle_bufioreq,
> +rc = hvm_ioreq_server_init(s, d, domid, is_default, bufioreq_handling,
> next_ioservid(d));
>  if ( rc )
>  goto fail3;
> @@ -2560,7 +2567,7 @@ int hvm_buffered_io_send(ioreq_t *p)
>  spin_lock(&s->bufioreq_lock);
>  
> -if ( (pg->write_pointer - pg->read_pointer) >=
> +if ( (pg->ptrs.write_pointer - pg->ptrs.read_pointer) >=
>   (IOREQ_BUFFER_SLOT_NUM - qw) )
>  {
>  /* The queue is full: send the iopacket through the normal path. */
> @@ -2568,17 +2575,29 @@ int hvm_buffered_io_send(ioreq_t *p)
>  return 0;
>  }
>  
> -pg->buf_ioreq[pg->write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
> +pg->buf_ioreq[pg->ptrs.write_pointer % IOREQ_BUFFER_SLOT_NUM] = bp;
>  
>  if ( qw )
>  {
>  bp.data = p->data >> 32;
> -pg->buf_ioreq[(pg->write_pointer+1) % IOREQ_BUFFER_SLOT_NUM] = bp;
> +pg->buf_ioreq[(pg->ptrs.write_pointer+1) % IOREQ_BUFFER_SLOT_NUM] = bp;
>  }
>  
>  /* Make the ioreq_t visible /before/ write_pointer. */
>  wmb();
> -pg->write_pointer += qw ? 2 : 1;
> +pg->ptrs.write_pointer += qw ? 2 : 1;
> +
> +/* Canonicalize read/write pointers to prevent their overflow. */
> +while ( s->bufioreq_atomic && qw++ < IOREQ_BUFFER_SLOT_NUM &&
> +pg->ptrs.read_pointer >= IOREQ_BUFFER_SLOT_NUM )
> +{
> +union bufioreq_pointers old = pg->ptrs, new;
> +unsigned int n = old.read_pointer / IOREQ_BUFFER_SLOT_NUM;
> +
> +new.read_pointer = o
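
The wraparound problem in the patch description above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the hypervisor code: the struct and function names are made up for illustration, the real patch updates both pointers atomically, and the sketch assumes write_pointer has not itself wrapped past 2^32 relative to read_pointer. Because 2^32 is not a multiple of 511, a free-running 32-bit index jumps from slot 31 back to slot 0 when it wraps, instead of continuing at slot 32.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_NUM 511u   /* slots per page, as stated in the description */

/* Hypothetical stand-in for the shared read/write index pair. */
struct ring_ptrs {
    uint32_t read_pointer;
    uint32_t write_pointer;
};

/* Slot that a free-running 32-bit index maps to. */
static uint32_t slot_of(uint32_t idx)
{
    return idx % SLOT_NUM;
}

/*
 * Subtract the same whole number of cycles from both indexes.
 * Assumes write_pointer >= read_pointer numerically, so neither the
 * occupancy (write - read) nor either slot number changes.
 */
static void canonicalize(struct ring_ptrs *p)
{
    uint32_t n = p->read_pointer / SLOT_NUM;

    assert(p->write_pointer >= p->read_pointer);
    p->read_pointer  -= n * SLOT_NUM;
    p->write_pointer -= n * SLOT_NUM;
}
```

With a power-of-two slot count the discontinuity would not arise, since 2^32 would then be a whole number of cycles.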

Re: [Xen-devel] [PATCH OSSTEST v2 06/13] toolstack/libvirt: guest migrate, save and restore support

2015-07-21 Thread Wei Liu

On Mon, Jul 13, 2015 at 12:23:52PM +0100, Ian Campbell wrote:
> On Sun, 2015-07-12 at 17:20 +0100, Wei Liu wrote:
> 
> Perhaps the libvirt part of the check_for_command stuff ought to be
> moved here? Otherwise we are claiming support before the code is
> actually willing to try to do so.

I moved this patch before "toolstack: distinguish local and remote
migration support" (the patch that claims support). It should be fine
now.

Wei.

> 
> > Signed-off-by: Wei Liu 
> > Cc: Ian Campbell 
> > Cc: Ian Jackson 
> > Acked-by: Ian Campbell 
> > ---
> >  Osstest/Toolstack/libvirt.pm | 11 ---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Osstest/Toolstack/libvirt.pm b/Osstest/Toolstack/libvirt.pm
> > index ddf84df..3dc1856 100644
> > --- a/Osstest/Toolstack/libvirt.pm
> > +++ b/Osstest/Toolstack/libvirt.pm
> > @@ -105,17 +105,22 @@ sub saverestore_check ($) {
> >  
> >  sub migrate ($) {
> >  my ($self,$gho,$dst,$timeout) = @_;
> > -die "Migration is not yet supported on libvirt.";
> > +my $ho = $self->{Host};
> > +my $gn = $gho->{Name};
> > +target_cmd_root($ho, "virsh migrate $gn $dst", $timeout);
> >  }
> >  
> >  sub save () {
> >  my ($self,$gho,$f,$timeout) = @_;
> > -die "Save is not yet supported on libvirt.";
> > +my $ho = $self->{Host};
> > +my $gn = $gho->{Name};
> > +target_cmd_root($ho, "virsh save $gn $f", $timeout);
> >  }
> >  
> >  sub restore () {
> >  my ($self,$gho,$f,$timeout) = @_;
> > -die "Restore is not yet supported on libvirt.";
> > +my $ho = $self->{Host};
> > +target_cmd_root($ho, "virsh restore $f", $timeout);
> >  }
> >  
> >  1;
> 


Re: [Xen-devel] [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX error codes used by PSR code

2015-07-21 Thread Ian Jackson

Jan Beulich writes ("Re: [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX 
error codes used by PSR code"):
> On 21.07.15 at 17:56,  wrote:
> > PSR was using EBADSLT and EUSERS which are not POSIX error codes, replace
> > them with EINVAL and EOVERFLOW respectively.
> 
> Considering that we use EINVAL for almost everything (well beyond
> parameter checking I'm afraid), I don't think using this value for
> something intended to yield a specific user mode error message is
> really a good choice. Looking at the two respective hypervisor side
> changes - how about e.g. using EDOM, EBADF, or ENXIO instead?

EBADF is rather poor because it's the same error code you would get if
your privcmd fd had been closed.

Ian.


Re: [Xen-devel] [PATCH for-4.6 2/3] xen: replace non-POSIX error codes

2015-07-21 Thread Jan Beulich

>>> On 21.07.15 at 17:56,  wrote:
> Some DOMCTLs returned non-POSIX error codes, replace them with POSIX
> compliant values instead. EBADRQC and EBADSLT are replaced by EINVAL, while
> EUSERS is replaced with EOVERFLOW.

Same here basically - I'd appreciate if we could use EINVAL only as
a last resort error value, to make errors distinguishable.

Jan



Re: [Xen-devel] [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX error codes used by PSR code

2015-07-21 Thread Jan Beulich

>>> On 21.07.15 at 17:56,  wrote:
> PSR was using EBADSLT and EUSERS which are not POSIX error codes, replace
> them with EINVAL and EOVERFLOW respectively.

Considering that we use EINVAL for almost everything (well beyond
parameter checking I'm afraid), I don't think using this value for
something intended to yield a specific user mode error message is
really a good choice. Looking at the two respective hypervisor side
changes - how about e.g. using EDOM, EBADF, or ENXIO instead?

Jan


[Xen-devel] [PATCH v3 2/2] xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool

2015-07-21 Thread Dario Faggioli

And this time, do it right. In fact, a similar change was
attempted in 93be8285a79c6 ("cpupools: update domU's node-affinity
on the cpupool_unassign_cpu() path"). But that was buggy, and got
reverted with 8395b67ab0b8a86.

However, even though reverting was the right thing to do, it
remains true that:
 - calling the function is better done in the cpupool cpu removal
   code, even if just for symmetry with the cpupool cpu adding path;
 - it is not necessary to call it during cpu teardown (for suspend
   or shutdown) code as we either are going down and will never
   come up (shutdown) or, when coming up, we want everything to be
   as before the tearing down process started, and so we would just
   undo any update made during the process.
 - calling it from the teardown path is not only unnecessary, but
   it can trigger an ASSERT(), in case we get, during the process,
   to remove the last online pcpu of a domain's node affinity:

  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ... ... ...
  (XEN) Xen call trace:
  (XEN)[] domain_update_node_affinity+0x113/0x240
  (XEN)[] cpu_disable_scheduler+0x334/0x3f2
  (XEN)[] __cpu_disable+0x313/0x36e
  (XEN)[] take_cpu_down+0x34/0x3b
  (XEN)[] stopmachine_action+0x70/0x99
  (XEN)[] do_tasklet_work+0x78/0xab
  (XEN)[] do_tasklet+0x5e/0x8a
  (XEN)[] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Assertion '!cpumask_empty(dom_cpumask)' failed at domain.c:466
  (XEN) 

Therefore, for all these reasons, move the call from
cpu_disable_scheduler() to cpupool_unassign_cpu_helper().

While there, add some sanity checking (in the latter function), and
make sure that scanning the domain list is done with domlist_read_lock
held, at least when the system is 'live'.

I re-tested the scenario described in here:
 http://permalink.gmane.org/gmane.comp.emulators.xen.devel/235310

which is what led to the revert of 93be8285a79c6, and that is
working ok after this commit.
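
The ASSERT() discussed above can be pictured with a toy model. This is only a sketch based on the description in this message: the mask type and helper names are hypothetical, and the real check operates on Xen cpumasks inside domain_update_node_affinity(). The point is that once the last online pcpu of a domain's affinity goes away during teardown, the intersection is empty and a non-empty assertion on it must fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_cpumask_t;   /* toy mask: bit N set means pcpu N */

/* pcpus the domain may run on that are still online (toy analogue). */
static toy_cpumask_t dom_online_mask(toy_cpumask_t dom_affinity,
                                     toy_cpumask_t online)
{
    return dom_affinity & online;
}

/* True when a "mask must not be empty" assertion would trigger. */
static bool would_assert(toy_cpumask_t dom_affinity, toy_cpumask_t online)
{
    return dom_online_mask(dom_affinity, online) == 0;
}
```

For a domain confined to pcpu 12, taking pcpu 12 offline leaves nothing in the intersection, which matches the "Panic on CPU 12" trace above.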

Signed-off-by: Dario Faggioli
Acked-by: George Dunlap 
Acked-by: Juergen Gross 
---
 xen/common/cpupool.c  |   18 ++
 xen/common/schedule.c |7 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/xen/common/cpupool.c b/xen/common/cpupool.c
index b48ae17..69b984c 100644
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -297,12 +297,25 @@ static int cpupool_assign_cpu_locked(struct cpupool *c, unsigned int cpu)
 static long cpupool_unassign_cpu_helper(void *info)
 {
 int cpu = cpupool_moving_cpu;
+struct cpupool *c = info;
+struct domain *d;
 long ret;
 
 cpupool_dprintk("cpupool_unassign_cpu(pool=%d,cpu=%d)\n",
 cpupool_cpu_moving->cpupool_id, cpu);
 
 spin_lock(&cpupool_lock);
+if ( c != cpupool_cpu_moving )
+{
+ret = -EBUSY;
+goto out;
+}
+
+/*
+ * We need this for scanning the domain list, both in
+ * cpu_disable_scheduler(), and at the bottom of this function.
+ */
+rcu_read_lock(&domlist_read_lock);
 ret = cpu_disable_scheduler(cpu);
 cpumask_set_cpu(cpu, &cpupool_free_cpus);
 if ( !ret )
@@ -319,6 +332,11 @@ static long cpupool_unassign_cpu_helper(void *info)
 cpupool_cpu_moving = NULL;
 }
 
+for_each_domain_in_cpupool(d, c)
+{
+domain_update_node_affinity(d);
+}
+rcu_read_unlock(&domlist_read_lock);
 out:
 spin_unlock(&cpupool_lock);
 cpupool_dprintk("cpupool_unassign_cpu ret=%ld\n", ret);
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 89fc10a..1419064 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -650,6 +650,12 @@ int cpu_disable_scheduler(unsigned int cpu)
 if ( c == NULL )
 return ret;
 
+/*
+ * We'd need the domain RCU lock, but:
+ *  - when we are called from cpupool code, it's acquired there already;
+ *  - when we are called for CPU teardown, we're in stop-machine context,
+ *so that's not a problem.
+ */
 for_each_domain_in_cpupool ( d, c )
 {
 for_each_vcpu ( d, v )
@@ -735,7 +741,6 @@ int cpu_disable_scheduler(unsigned int cpu)
 ret = -EAGAIN;
 }
 }
-domain_update_node_affinity(d);
 }
 
 return ret;



[Xen-devel] [PATCH v3 1/2] xen: sched: reorganize cpu_disable_scheduler()

2015-07-21 Thread Dario Faggioli

The function is called both when we want to remove a cpu
from a cpupool, and during cpu teardown, for suspend or
shutdown. If, however, the boot cpu (cpu 0, most of the
time) is not present in the default cpupool, during
suspend or shutdown, Xen crashes like this:

  root@Zhaman:~# xl cpupool-cpu-remove Pool-0 0
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  ...
  (XEN) Xen call trace:
  (XEN)[] _csched_cpu_pick+0x156/0x61f
  (XEN)[] csched_cpu_pick+0xe/0x10
  (XEN)[] vcpu_migrate+0x18e/0x321
  (XEN)[] cpu_disable_scheduler+0x1cf/0x2ac
  (XEN)[] __cpu_disable+0x313/0x36e
  (XEN)[] take_cpu_down+0x34/0x3b
  (XEN)[] stopmachine_action+0x70/0x99
  (XEN)[] do_tasklet_work+0x78/0xab
  (XEN)[] do_tasklet+0x5e/0x8a
  (XEN)[] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 15:
  (XEN) Assertion 'cpu < nr_cpu_ids' failed at ...URCES/xen/xen/xen.git/xen/include/xen/cpumask.h:97
  (XEN) 

There also are problems when we try to suspend or shutdown
with a cpupool configured with just one cpu (no matter, in
this case, whether that is the boot cpu or not):

  root@Zhaman:~# xl create /etc/xen/test.cfg
  root@Zhaman:~# xl cpupool-migrate test Pool-1
  root@Zhaman:~# xl cpupool-list -c
  Name   CPU list
  Pool-0 0,1,2,3,4,5,6,7,8,9,10,11,13,14,15
  Pool-1 12
  root@Zhaman:~# shutdown -h now
  (XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
  (XEN) CPU:12
  ...
  (XEN) Xen call trace:
  (XEN)[] __cpu_disable+0x317/0x36e
  (XEN)[] take_cpu_down+0x34/0x3b
  (XEN)[] stopmachine_action+0x70/0x99
  (XEN)[] do_tasklet_work+0x78/0xab
  (XEN)[] do_tasklet+0x5e/0x8a
  (XEN)[] idle_loop+0x56/0x6b
  (XEN)
  (XEN)
  (XEN) 
  (XEN) Panic on CPU 12:
  (XEN) Xen BUG at smpboot.c:895
  (XEN) 

In both cases, the problem is the scheduler not being able
to:
 - move all the vcpus to the boot cpu (as the boot cpu is
   not in the cpupool), in the former;
 - move the vcpus away from a cpu at all (as that is the
   only one cpu in the cpupool), in the latter.

The solution is to distinguish, inside cpu_disable_scheduler(),
the two cases of cpupool manipulation and teardown. For
cpupool manipulation, it is correct to ask the scheduler to
take an action, as pathological situations (like there not
being any cpu in the pool to send vcpus to) are taken
care of (i.e., forbidden!) already. For suspend and shutdown,
we don't want the scheduler to be involved at all, as the
final goal is pretty simple: "send all the vcpus to the
boot cpu ASAP", so we just go for it.
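
The teardown case described above can be sketched as a placement that bypasses the scheduler entirely. This is an illustration under stated assumptions, not the code in the patch: the mask type and the function name are hypothetical, and "lowest numbered online pcpu" is an assumed policy that happens to select pcpu 0 (the usual boot cpu) whenever it is still online.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_cpumask_t;   /* toy mask: bit N set means pcpu N is online */

/*
 * Teardown-style placement: no scheduler involved, just take the lowest
 * numbered pcpu that is still online, excluding the one going away.
 * Returns -1 if nothing is left, which the v1 changelog says the real
 * code treats as a BUG.
 */
static int pick_teardown_cpu(toy_cpumask_t online, unsigned int leaving)
{
    toy_cpumask_t candidates;

    assert(leaving < 64);
    candidates = online & ~(UINT64_C(1) << leaving);

    for (int cpu = 0; cpu < 64; cpu++)
        if (candidates & (UINT64_C(1) << cpu))
            return cpu;
    return -1;
}
```

The cpupool manipulation case is deliberately not modelled here, since there the scheduler is asked to pick the destination.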

Signed-off-by: Dario Faggioli 
---
Changes from v2:
 * add a missing spin_unlock, most likely eaten by a
   forgotten `stg refresh' (sorry!)
 * fix a typo

Changes from v1:
 * BUG_ON() if, in the suspend/shutdown case, the mask of
   online pCPUs will ever get empty, as suggested during
   review;
 * reorganize and improve comments inside cpu_disable_scheduler()
   as suggested during review;
 * make it more clear that vcpu_move_nosched() (name
   changed, as suggested during review), should only be
   called from "quiet contexts", such as during suspend
   or shutdown. Do that via both comments and asserts,
   as requested during review;
 * reorganize cpu_disable_scheduler() and vcpu_move_nosched()
   so that the sleep and wakeup functions are only
   called when necessary (i.e., *not* in case we are
   suspending/shutting down), as requested during review.
---
Cc: George Dunlap 
Cc: Juergen Gross 
---
 xen/common/schedule.c |  104 +
 1 file changed, 88 insertions(+), 16 deletions(-)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index df8c1d0..89fc10a 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -454,9 +454,10 @@ void vcpu_unblock(struct vcpu *v)
  * Do the actual movement of a vcpu from old to new CPU. Locks for *both*
  * CPUs needs to have been taken already when calling this!
  */
-static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
-  unsigned int new_cpu)
+static void vcpu_move_locked(struct vcpu *v, unsigned int new_cpu)
 {
+unsigned int old_cpu = v->processor;
+
 /*
  * Transfer urgency status to new CPU before switching CPUs, as
  * once the switch occurs, v->is_urgent is no longer protected by
@@ -478,6 +479,33 @@ static void vcpu_move(struct vcpu *v, unsigned int old_cpu,
 v->processor = new_cpu;
 }
 
+/*
+ * Move a vcpu from its current processor to a target new processor,
+ * without asking the scheduler to do any placement. This is intended
+ * for being called from special contexts, where things are quiet
+ * enough that no contention is supposed to happen (i.e., during
+ * shutdown

[Xen-devel] [PATCH v3 0/2] xen: sched/cpupool: more fixing of (corner?) cases

2015-07-21 Thread Dario Faggioli

v2, which is here:
 http://lists.xen.org/archives/html/xen-devel/2015-07/msg03540.html

suffered from a "missing stg refresh" issue (or, at least,
that's my best guess).

Thanks Juergen for noticing and pointing that out, and
sorry for that.

This posting should be fine.

Regards,
Dario
---
Dario Faggioli (2):
  xen: sched: reorganize cpu_disable_scheduler()
  xen: sched/cpupool: properly update affinity when removing a cpu from a cpupool

 xen/common/cpupool.c  |   18 
 xen/common/schedule.c |  111 +
 2 files changed, 112 insertions(+), 17 deletions(-)
--
<> (Raistlin Majere)
-
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson

Ian Jackson writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
> conflicts with RDM"):
> > Sorry, I just ignore the line in brackets since I always think this kind 
> > of thing is often not a big deal, and next time I should pay more 
> > attention to the (). But indeed, before I post this whole patch online I 
> > also picked up this chunk of code to ask you to take a look that. This 
> > manner means I'm not very sure if I'm addressing this properly. But I 
> > didn't get a further response, so I guess that should work for you and 
> > then I posted the whole online.
> 
> You are talking about <55ae2bb1.9030...@intel.com> I guess.  I replied
> to that with several comments about your prose and about the
> computation of the new set of rdms.
> 
> It's true that I didn't comment on the frat that you had half-done one
 fact
> of the things I had requested.  It is of course a waste of my time to
> be constantly re-reviewing half-done changes.

Ian.


[Xen-devel] [PATCH for-4.6 1/3] xen/x86/libxl: replace non-POSIX error codes used by PSR code

2015-07-21 Thread Roger Pau Monne

PSR was using EBADSLT and EUSERS which are not POSIX error codes, replace
them with EINVAL and EOVERFLOW respectively.
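
As a rough illustration of what the toolstack side looks like after the replacement, here is a condensed sketch. The function name is hypothetical and the fallback string is an assumption; the other message strings are copied from the diff below, where they are spread over several libxl helpers rather than one.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Condensed, hypothetical helper; not the actual libxl function. */
static const char *psr_cmt_err_msg(int err)
{
    switch (err) {
    case ESRCH:     return "invalid domain ID";
    case EINVAL:    return "socket is not supported";   /* was EBADSLT */
    case ENOENT:    return "CMT is not attached to this domain";
    case EOVERFLOW: return "no free RMID available";    /* was EUSERS */
    default:        return "unknown error";             /* assumed fallback */
    }
}
```

One consequence worth keeping in mind: any other hypervisor path that returns EINVAL would also be reported as "socket is not supported" by a mapping like this.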

Signed-off-by: Roger Pau Monné 
Cc: Ian Jackson 
Cc: Stefano Stabellini 
Cc: Ian Campbell 
Cc: Wei Liu 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
 tools/libxl/libxl_psr.c | 6 +++---
 xen/arch/x86/psr.c  | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/libxl/libxl_psr.c b/tools/libxl/libxl_psr.c
index 2a0..f64406a 100644
--- a/tools/libxl/libxl_psr.c
+++ b/tools/libxl/libxl_psr.c
@@ -31,7 +31,7 @@ static void libxl__psr_log_err_msg(libxl__gc *gc, int err)
 case ESRCH:
 msg = "invalid domain ID";
 break;
-case EBADSLT:
+case EINVAL:
 msg = "socket is not supported";
 break;
 case EFAULT:
@@ -59,7 +59,7 @@ static void libxl__psr_cmt_log_err_msg(libxl__gc *gc, int err)
 case ENOENT:
 msg = "CMT is not attached to this domain";
 break;
-case EUSERS:
+case EOVERFLOW:
 msg = "no free RMID available";
 break;
 default:
@@ -81,7 +81,7 @@ static void libxl__psr_cat_log_err_msg(libxl__gc *gc, int err)
 case ENOENT:
 msg = "CAT is not enabled on the socket";
 break;
-case EUSERS:
+case EOVERFLOW:
 msg = "no free COS available";
 break;
 case EEXIST:
diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index 861683f..0185c45 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -176,7 +176,7 @@ int psr_alloc_rmid(struct domain *d)
 if ( rmid > psr_cmt->rmid_max )
 {
 d->arch.psr_rmid = 0;
-return -EUSERS;
+return -EOVERFLOW;
 }
 
 d->arch.psr_rmid = rmid;
@@ -251,7 +251,7 @@ static struct psr_cat_socket_info *get_cat_socket_info(unsigned int socket)
 return ERR_PTR(-ENODEV);
 
 if ( socket >= nr_sockets )
-return ERR_PTR(-EBADSLT);
+return ERR_PTR(-EINVAL);
 
 if ( !test_bit(socket, cat_socket_enable) )
 return ERR_PTR(-ENOENT);
@@ -332,7 +332,7 @@ static int write_l3_cbm(unsigned int socket, unsigned int cos, uint64_t cbm)
 unsigned int cpu = get_socket_cpu(socket);
 
 if ( cpu >= nr_cpu_ids )
-return -EBADSLT;
+return -EINVAL;
 on_selected_cpus(cpumask_of(cpu), do_write_l3_cbm, &info, 1);
 }
 
@@ -381,7 +381,7 @@ int psr_set_l3_cbm(struct domain *d, unsigned int socket, uint64_t cbm)
 if ( !found )
 {
 spin_unlock(&info->cbm_lock);
-return -EUSERS;
+return -EOVERFLOW;
 }
 
 cos = found - map;
-- 
1.9.5 (Apple Git-50.3)



[Xen-devel] [PATCH for-4.6 3/3] xen: remove non-POSIX error codes

2015-07-21 Thread Roger Pau Monne

Xen was using some non-POSIX error codes that are removed in this patch. For
future reference, the list of POSIX error codes has been obtained from:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html

The error codes already present and defined as optional (XSR), have been
left in place.

Signed-off-by: Roger Pau Monné 
Cc: Ian Campbell 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Tim Deegan 
---
 xen/include/public/errno.h | 5 -
 1 file changed, 5 deletions(-)

diff --git a/xen/include/public/errno.h b/xen/include/public/errno.h
index 9a7e411..3498727 100644
--- a/xen/include/public/errno.h
+++ b/xen/include/public/errno.h
@@ -54,8 +54,6 @@ XEN_ERRNO(EDEADLK,35) /* Resource deadlock would occur */
 XEN_ERRNO(ENAMETOOLONG,36) /* File name too long */
 XEN_ERRNO(ENOLCK,  37) /* No record locks available */
 XEN_ERRNO(ENOSYS,  38) /* Function not implemented */
-XEN_ERRNO(EBADRQC, 56) /* Invalid request code */
-XEN_ERRNO(EBADSLT, 57) /* Invalid slot */
 XEN_ERRNO(ENODATA, 61) /* No data available */
 XEN_ERRNO(ETIME,   62) /* Timer expired */
 XEN_ERRNO(EBADMSG, 74) /* Not a data message */
@@ -64,15 +62,12 @@ XEN_ERRNO(EILSEQ,   84) /* Illegal byte sequence */
 #ifdef __XEN__ /* Internal only, should never be exposed to the guest. */
 XEN_ERRNO(ERESTART,85) /* Interrupted system call should be restarted */
 #endif
-XEN_ERRNO(EUSERS,  87) /* Too many users */
 XEN_ERRNO(EOPNOTSUPP,  95) /* Operation not supported on transport endpoint */
 XEN_ERRNO(EADDRINUSE,  98) /* Address already in use */
 XEN_ERRNO(EADDRNOTAVAIL, 99)   /* Cannot assign requested address */
 XEN_ERRNO(ENOBUFS, 105)/* No buffer space available */
 XEN_ERRNO(EISCONN, 106)/* Transport endpoint is already connected */
 XEN_ERRNO(ENOTCONN,107)/* Transport endpoint is not connected */
-XEN_ERRNO(ESHUTDOWN,   108)/* Cannot send after transport endpoint shutdown */
-XEN_ERRNO(ETOOMANYREFS,109)/* Too many references: cannot splice */
 XEN_ERRNO(ETIMEDOUT,   110)/* Connection timed out */
 
 #undef XEN_ERRNO
-- 
1.9.5 (Apple Git-50.3)



[Xen-devel] [PATCH for-4.6 0/3] Get rid of non-POSIX error codes

2015-07-21 Thread Roger Pau Monne

This patch series gets rid of non-POSIX error codes in the hypervisor and 
the toolstack. This is needed for OS compatibility.

I think the patch series is fairly small and non-intrusive, hence the 
for-4.6 tag.

Thanks, Roger.


[Xen-devel] [PATCH for-4.6 2/3] xen: replace non-POSIX error codes

2015-07-21 Thread Roger Pau Monne

Some DOMCTLs returned non-POSIX error codes, replace them with POSIX
compliant values instead. EBADRQC and EBADSLT are replaced by EINVAL, while
EUSERS is replaced with EOVERFLOW.
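
The EUSERS to EOVERFLOW change touches the bounded pause counters in domain.c. As a sketch of that pattern (not the Xen code: this uses C11 atomics instead of Xen's cmpxchg(), and the function and macro names are hypothetical), a saturating increment that refuses to go past 255 could look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define PAUSE_COUNT_MAX 255   /* limit checked in the diff below */

/* Atomically increment *count, or fail with -EOVERFLOW at the limit. */
static int pause_count_inc(atomic_int *count)
{
    int old = atomic_load(count);

    do {
        if (old + 1 > PAUSE_COUNT_MAX)
            return -EOVERFLOW;
        /* On failure, old is refreshed with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(count, &old, old + 1));

    return 0;
}
```

A caller only needs to tell "limit reached" apart from other failures, which is what the choice of error code is about.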

Signed-off-by: Roger Pau Monné 
Cc: George Dunlap 
Cc: Jan Beulich 
Cc: Andrew Cooper 
---
Nothing in libxc or libxl seems to check for those specific error codes, so
I guess it's fine to replace them with whatever we want.
---
 xen/arch/x86/mm/paging.c | 2 +-
 xen/common/domain.c  | 4 ++--
 xen/common/hvm/save.c| 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 7089155..1c3504d 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -766,7 +766,7 @@ long paging_domctl_continuation(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 
 if ( op.interface_version != XEN_DOMCTL_INTERFACE_VERSION ||
  op.cmd != XEN_DOMCTL_shadow_op )
-return -EBADRQC;
+return -EINVAL;
 
 d = rcu_lock_domain_by_id(op.domain);
 if ( d == NULL )
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 8efef5c..791166b 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -900,7 +900,7 @@ int vcpu_pause_by_systemcontroller(struct vcpu *v)
 new = old + 1;
 
 if ( new > 255 )
-return -EUSERS;
+return -EOVERFLOW;
 
 prev = cmpxchg(&v->controller_pause_count, old, new);
 } while ( prev != old );
@@ -980,7 +980,7 @@ int __domain_pause_by_systemcontroller(struct domain *d,
  * toolstack overflowing d->pause_count with many repeated hypercalls.
  */
 if ( new > 255 )
-return -EUSERS;
+return -EOVERFLOW;
 
 prev = cmpxchg(&d->controller_pause_count, old, new);
 } while ( prev != old );
diff --git a/xen/common/hvm/save.c b/xen/common/hvm/save.c
index da6e668..db85fee 100644
--- a/xen/common/hvm/save.c
+++ b/xen/common/hvm/save.c
@@ -114,7 +114,7 @@ int hvm_save_one(struct domain *d, uint16_t typecode, uint16_t instance,
 uint32_t off;
 const struct hvm_save_descriptor *desc;
 
-rv = -EBADSLT;
+rv = -EINVAL;
 for ( off = 0; off < (ctxt.cur - sizeof(*desc)); off += desc->length )
 {
 desc = (void *)(ctxt.data + off);
-- 
1.9.5 (Apple Git-50.3)



Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> Sorry, I just ignore the line in brackets since I always think this kind 
> of thing is often not a big deal, and next time I should pay more 
> attention to the (). But indeed, before I post this whole patch online I 
> also picked up this chunk of code to ask you to take a look that. This 
> manner means I'm not very sure if I'm addressing this properly. But I 
> didn't get a further response, so I guess that should work for you and 
> then I posted the whole online.

You are talking about <55ae2bb1.9030...@intel.com> I guess.  I replied
to that with several comments about your prose and about the
computation of the new set of rdms.

It's true that I didn't comment on the frat that you had half-done one
of the things I had requested.  It is of course a waste of my time to
be constantly re-reviewing half-done changes.

> Now back on our problem,
> 
> static void
> add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
>uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
> {
>  d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
>  (d_config->num_rdms+1) * sizeof(libxl_device_rdm));
> 
>  d_config->rdms[d_config->num_rdms].start = rdm_start;
>  d_config->rdms[d_config->num_rdms].size = rdm_size;
>  d_config->rdms[d_config->num_rdms].policy = rdm_policy;
>  d_config->num_rdms++;
> }
> 
> Does this work for you? If I'm still wrong, please correct this function 
> directly to cost you less.

Yes, that is what I meant.

Ian.
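
For readers without the libxl tree at hand, the agreed shape of the helper (grow the array, fill in the slot at index num_rdms, increment last) can be sketched with plain realloc(). The struct names below are hypothetical stand-ins for libxl_device_rdm and libxl_domain_config, and the explicit failure check is an addition that the libxl__realloc(NOGC, ...) version in this thread does not show.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the libxl types used in the thread. */
struct rdm_entry {
    uint64_t start;
    uint64_t size;
    int policy;
};

struct rdm_config {
    struct rdm_entry *rdms;
    int num_rdms;
};

/* Returns 0 on success, -1 if the allocation fails (cfg is left unchanged). */
static int add_rdm_entry(struct rdm_config *cfg, uint64_t start,
                         uint64_t size, int policy)
{
    struct rdm_entry *grown =
        realloc(cfg->rdms, (size_t)(cfg->num_rdms + 1) * sizeof(*grown));

    if (!grown)
        return -1;
    cfg->rdms = grown;

    /* num_rdms is still the index of the new slot here... */
    cfg->rdms[cfg->num_rdms].start = start;
    cfg->rdms[cfg->num_rdms].size = size;
    cfg->rdms[cfg->num_rdms].policy = policy;
    /* ...and is incremented only once the entry is complete. */
    cfg->num_rdms++;
    return 0;
}
```

Incrementing last keeps every assignment indexed by num_rdms itself, avoiding the "- 1" arithmetic of the earlier version.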


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun


On 2015/7/21 23:09, Ian Jackson wrote:

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts 
with RDM"):

+static void
+add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
+  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
+{
+d_config->num_rdms++;
+d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
+d_config->num_rdms * sizeof(libxl_device_rdm));
+
+d_config->rdms[d_config->num_rdms - 1].start = rdm_start;
+d_config->rdms[d_config->num_rdms - 1].size = rdm_size;
+d_config->rdms[d_config->num_rdms - 1].policy = rdm_policy;
+}



But, I wrote:

Can I suggest a function

   void add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
 uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)

which assumes that d_config->num_rdms is set correctly, and increments
it ?

(Please put the increment at the end so that the assignments are to
->rdms[d_config->num_rdms], or perhaps make a convenience alias.)

Note the last paragraph.

This is now the third time I have posted that text.  It is the fifth
request or clarification I have had to make about this very small
area.  I have to say that I'm finding this rather frustrating.


Sorry, I just ignore the line in brackets since I always think this kind 
of thing is often not a big deal, and next time I should pay more 
attention to the (). But indeed, before I post this whole patch online I 
also picked up this chunk of code to ask you to take a look that. This 
manner means I'm not very sure if I'm addressing this properly. But I 
didn't get a further response, so I guess that should work for you and 
then I posted the whole online.


Now back on our problem,

static void
add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
{
d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
(d_config->num_rdms+1) * sizeof(libxl_device_rdm));

d_config->rdms[d_config->num_rdms].start = rdm_start;
d_config->rdms[d_config->num_rdms].size = rdm_size;
d_config->rdms[d_config->num_rdms].policy = rdm_policy;
d_config->num_rdms++;
}

Does this work for you? If I'm still wrong, please correct this function 
directly to cost you less.


Thanks
Tiejun


Re: [Xen-devel] [PATCH v2 4/6] xen/x86/pvh: Set up descriptors for 32-bit PVH guests

2015-07-21 Thread Boris Ostrovsky


On 07/17/2015 01:52 PM, Boris Ostrovsky wrote:

On 07/17/2015 12:43 PM, Konrad Rzeszutek Wilk wrote:

On Fri, Jul 17, 2015 at 11:36:29AM -0400, Boris Ostrovsky wrote:

On 07/17/2015 11:21 AM, Konrad Rzeszutek Wilk wrote:

On Thu, Jul 16, 2015 at 05:43:39PM -0400, Boris Ostrovsky wrote:

Signed-off-by: Boris Ostrovsky 
---
Changes in v2:
* Set segment selectors using loadsegment() instead of assembly

  arch/x86/xen/enlighten.c | 15 ++-
  1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f8dc398..d665b1d 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1362,12 +1362,12 @@ static void __init xen_boot_params_init_edd(void)

  static void __ref xen_setup_gdt(int cpu)
  {
  if (xen_feature(XENFEAT_auto_translated_physmap)) {
-#ifdef CONFIG_X86_64
-unsigned long dummy;
+unsigned long __attribute__((unused)) dummy;
-load_percpu_segment(cpu); /* We need to access per-cpu area */

You removed that - where are we going to do that? As the
'switch_to_new_gdt' uses the per-cpu GDT table.

load_percpu_segment() is part of switch_to_new_gdt(), so I thought there is
no need to call it here.

But you are right --- switch_to_new_gdt() starts with get_cpu_gdt_table()
which accesses per-CPU area. How did this manage to work then?

I was surprised as well - I was expecting your patch to have blown up.
Unless we are doing something fancy for CPU0 and for the other CPUs we
already have the per-cpu segment setup during bootup (copied from BSP)?



No, %fs is zero when we enter xen_setup_gdt() (for 32-bit).

In any case, I should put load_percpu_segment() back.



No, I shouldn't.

Until the new GDT is loaded we can't load selectors since current GDT 
doesn't have descriptors set up for them. And so any attempt to load 
uninitialized selectors results in a fault.


This worked for 64-bit guests because there we load zero into %gs and 
that is allowed (processor doesn't perform descriptor checks for the 
first 4 indexes). But for 32-bit guests we load %fs with 0xd8.


And the reason the code worked before was because we are using "master" 
per-cpu area and because GDT is the same for all CPUs at that point. Or 
so I think.



-boris
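
For reference, the selector arithmetic behind the explanation above can be sketched as follows. This is a standalone illustration and not kernel code; the helper names are made up for the example. It assumes the usual x86 selector layout (table index in bits 15..3, table indicator in bit 2, RPL in bits 1..0), under which 0xd8 refers to GDT entry 27, while the selector values 0..3 are null selectors that can be loaded into a data segment register without a descriptor lookup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Descriptor table index: bits 15..3 of the selector. */
static unsigned int sel_index(uint16_t sel) { return sel >> 3; }

/* Table indicator: 0 = GDT, 1 = LDT (bit 2). */
static unsigned int sel_ti(uint16_t sel) { return (sel >> 2) & 1; }

/* Requested privilege level: bits 1..0. */
static unsigned int sel_rpl(uint16_t sel) { return sel & 3; }

/* A null selector has index 0 and TI 0, i.e. the values 0..3. */
static bool sel_is_null(uint16_t sel) { return (sel & ~3u) == 0; }
```

With these helpers, 0xd8 decodes to index 27 in the GDT with RPL 0, so loading it requires a valid descriptor in that slot, whereas loading 0 does not.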

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> Indeed, I'm not a fan of Xen tools, so I can't picture how this
> scenario would really happen. So if I'm misunderstanding what you mean,
> please just correct me. Or if you still think it's hard to explain this
> to me, just tell me what I should do. I think this would make your life
> easier.

Please ignore this line of discussion.

Instead, please simply make it so that if there are any rdms specified
in the domain config, they are used instead of the automatically
gathered information (from strategy and devices).

Thanks,
Ian.

Re: [Xen-devel] [PATCH 1/5] xl: Command line: Adjust "Fix segfaults from `xl psr-cat-cbm-set`..."

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 15:36 +0100, Wei Liu wrote:
> On Tue, Jul 21, 2015 at 03:27:26PM +0100, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [PATCH 1/5] xl: Command line: Adjust "Fix 
> > segfaults from `xl psr-cat-cbm-set`...""):
> > > On Fri, 2015-07-17 at 18:00 +0100, Ian Jackson wrote:
> > > 
> > > Replying here in lieu of a 0/N:
> > > 
> > > Is any subset of this aimed at 4.6?
> > 
> > Yes, ideally, all of them.  I think they are bugfixes or minor
> > cleanups.
> > 
> > If Wei wants only a subset, I can probably tease them out.
> > 
> 
> I think this is part of the audit of toolstack code.  They look
> obviously correct.  I'm fine with them going in.
> 
> Currently we have three series with freeze exception. Only RMRR has
> significant number of changes for xl, however that series doesn't
> introduce a new command, so we're probably safe to apply this series
> now.

I applied #1..3, #4 had a good suggestion for an improvement and #5
relies on that change so I left them in this pass.

Ian.

Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Chen, Tiejun


> > Just to your example,
> >
> >    libxl_domain_config cfg;
> >    cfg.stuff = blah;
> >    cfg.rdm.strategy = HOST;
> >
> >    libxl_domain_create_new(&cfg, &domid);
> >    libxl_domain_destroy(domid);
> >
> > Here it looks like you mean d_config->rdms would be changed, right?
> > Currently this shouldn't be allowed. But I think we need further
> > discussion to make this case clear after the feature freeze, since we
> > didn't have this kind of assumption in our previous design.
> >
> >    libxl_domain_create_new(&cfg, &domid);
>
> This response of yours does not lead me to think you have understood
> what I am saying, but I agree that this can be dealt with later (if

Indeed, I'm not a fan of Xen tools, so I can't picture how this
scenario would really happen. So if I'm misunderstanding what you mean,
please just correct me. Or if you still think it's hard to explain this
to me, just tell me what I should do. I think this would make your life
easier.

Thanks
Tiejun

> indeed it needs to be dealt with at all).
>
> Ian.

Re: [Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 10:07 -0400, Brandon Perez wrote:
> 
> I'm not sure that these patches are quite ready yet to be put into
> the Xen repo.

That's ok, but even for an RFC (Request For Comments) please post them
one patch per email in the manner of git send-email. You can use
--subject-prefix='PATCH RFC' to tag them as such.

Thanks,
Ian.

Re: [Xen-devel] [PATCH] build: use correct qemu path in systemd service file and init script

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 00:15 +0800, Ting-Wei Lan wrote:

This all looks pretty good. One comment:

> +if test "x$qemu_xen_path" = "x" || test "x$qemu_xen_path" = "xqemu"; 
> then :
> +
> +qemu_xen_path_service="$LIBEXEC_BIN/qemu-system-i386"

It's a shame we have to repeat the "qemu-system-i386" here and in
libxl_dm.c.

I think rather than adding a new qemu_xen_path_service we should just
make the existing $qemu_xen_path default to the full
$LIBEXEC_BIN/qemu-system-i386 and have it substituted everywhere much
like you've done here.

Then libxl_dm.c:qemu_xen_path() can return QEMU_XEN_PATH always.

What do you think?

Ian.

Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> [Ian Jackson:]
> > That is, an address would be reserved if it was reserved in any of the
> > rdm regions implied by the config.
> 
> Are you saying this point?
> 
> "The union of two sets A and B is the set of elements which are in A, in 
> B, or in both A and B."

Yes.

> > The explicitly specified regions might overlap with the computed ones,
> > without being identical. Computing the union would not be entirely
> > trivial.
> 
> Just to your example,
> 
>libxl_domain_config cfg;
>cfg.stuff = blah;
>cfg.rdm.strategy = HOST;
> 
>libxl_domain_create_new(&cfg, &domid);
>libxl_domain_destroy(domid);
> 
> Here it looks like you mean d_config->rdms would be changed, right?
> Currently this shouldn't be allowed. But I think we need further
> discussion to make this case clear after the feature freeze, since we
> didn't have this kind of assumption in our previous design.
> 
>libxl_domain_create_new(&cfg, &domid);

This response of yours does not lead me to think you have understood
what I am saying, but I agree that this can be dealt with later (if
indeed it needs to be dealt with at all).

Ian.

Re: [Xen-devel] [PATCH 0/4] Fix to libxl migration v2 issue blocking OSSTest

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 13:49 +0100, Wei Liu wrote:
> On Mon, Jul 20, 2015 at 11:11:28AM +0100, Andrew Cooper wrote:
> > On 20/07/15 10:56, Wei Liu wrote:
> > > On Fri, Jul 17, 2015 at 05:51:14PM +0100, Andrew Cooper wrote:
> > > > And three improvements to debugging.
> > > > 
> > > > Note that there is still a bug in libxl__toolstack_save() which
> > > > valgrind identified, but I do not wish to block this bugfix on 
> > > > that
> > > > 
> > > > Andrew Cooper (4):
> > > >   tools/libxc: Identify the path of the kernel image which 
> > > > cannot be
> > > > found
> > > >   tools/libxl: Log the subject fd in datacopier messages
> > > >   tools/libxl: Identify copywhat in stream v2 datacopiers
> > > I think all three patches should wait until next development 
> > > window
> > > opens unless we have nothing else in our queue (which doesn't 
> > > seem to be
> > > the case at the moment).
> > 
> > You mean delay until 4.7? I disagree.  Without these fixes, 
> > debugging
> > issues is substantially harder than they need to be.
> > 
> > They literally are only adding extra information into existing 
> > error
> > messages.
> > 
> 
> Well I am expecting two to three big series getting applied soon, any
> changes that gets applied now has the chance of forcing those series 
> to
> be rebased.

Wei and I discussed this IRL, the concern was the outstanding colopre
patches.

However I did a test apply on top of
https://github.com/macrosheep/xen.git#colo/colo-v9 (the latest colopre)
and there were no rejects due to the remus refactoring.

There were rejects because I already applied 4/4 on Friday, i.e. they
were the inverse of what I fixed up then.

So given the lack of interaction with colopre Wei gave me permission to
go ahead, so I have applied patches 1..3.

4 was applied already, of course.

Ian.

Re: [Xen-devel] [PATCH 0/4] Fix to libxl migration v2 issue blocking OSSTest

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 15:51 +0100, Ian Campbell wrote:
> On Tue, 2015-07-21 at 13:49 +0100, Wei Liu wrote:
> > On Mon, Jul 20, 2015 at 11:11:28AM +0100, Andrew Cooper wrote:
> > > On 20/07/15 10:56, Wei Liu wrote:
> > > > On Fri, Jul 17, 2015 at 05:51:14PM +0100, Andrew Cooper wrote:
> > > > > And three improvements to debugging.
> > > > > 
> > > > > Note that there is still a bug in libxl__toolstack_save() 
> > > > > which
> > > > > valgrind identified, but I do not wish to block this bugfix 
> > > > > on 
> > > > > that
> > > > > 
> > > > > Andrew Cooper (4):
> > > > >   tools/libxc: Identify the path of the kernel image which 
> > > > > cannot be
> > > > > found
> > > > >   tools/libxl: Log the subject fd in datacopier messages
> > > > >   tools/libxl: Identify copywhat in stream v2 datacopiers
> > > > I think all three patches should wait until next development 
> > > > window
> > > > opens unless we have nothing else in our queue (which doesn't 
> > > > seem to be
> > > > the case at the moment).
> > > 
> > > You mean delay until 4.7? I disagree.  Without these fixes, 
> > > debugging
> > > issues is substantially harder than they need to be.
> > > 
> > > They literally are only adding extra information into existing 
> > > error
> > > messages.
> > > 
> > 
> > Well I am expecting two to three big series getting applied soon, 
> > any
> > changes that gets applied now has the chance of forcing those 
> > series 
> > to
> > be rebased.
> 
> Wei and I discussed this IRL, the concern was the outstanding colopre
> patches.
> 
> However I did a test apply on top of  
> https://github.com/macrosheep/xen.git#colo/colo-v9 (the latest 
> colopre)
> and there were no rejects due to the remus refactoring.
> 
> There were rejects because I already applied 4/4 on Friday, i.e. they
> were the inverse of what I fixed up then.
> 
> So given the lack of interaction with colopre Wei gave me permission 
> to
> go ahead, so I have applied patches 1..3.
> 
> 4 was applied already, of course.

In doing this I managed to revert part of #4, thanks to Andy for
noticing so promptly.

I've pushed the following:

From 1287ac109c44ca9b99eb642316d7af83b4081b52 Mon Sep 17 00:00:00 2001
From: Ian Campbell 
Date: Tue, 21 Jul 2015 16:00:19 +0100
Subject: [PATCH] tools: libxl: Refix "Initialise the fd of the unused half of
 a datacopier"

Applying the series out of order led to d72befc35f31 "tools/libxl:
Identify copywhat in stream v2 datacopiers" unintentionally reverting
part of 21d9b079e538 "tools/libxl: Initialise the fd of the unused
half of a datacopier".

Put this back.

Reported-by: Andrew Cooper 
Signed-off-by: Ian Campbell 
---
 tools/libxl/libxl_stream_read.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/libxl/libxl_stream_read.c b/tools/libxl/libxl_stream_read.c
index 3e1cd2a..32a3551 100644
--- a/tools/libxl/libxl_stream_read.c
+++ b/tools/libxl/libxl_stream_read.c
@@ -611,6 +611,7 @@ static void write_emulator_blob(libxl__egc *egc,
 dc->writewhat  = "qemu save file";
 dc->copywhat   = "restore v2 stream";
 dc->writefd= writefd;
+dc->readfd = -1;
 dc->maxsz  = -1;
 dc->callback   = write_emulator_done;
 
-- 
2.1.4


Re: [Xen-devel] [v10][PATCH 11/16] tools/libxl: detect and avoid conflicts with RDM

2015-07-21 Thread Ian Jackson

Chen, Tiejun writes ("Re: [v10][PATCH 11/16] tools/libxl: detect and avoid 
conflicts with RDM"):
> +static void
> +add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
> +  uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)
> +{
> +d_config->num_rdms++;
> +d_config->rdms = libxl__realloc(NOGC, d_config->rdms,
> +d_config->num_rdms * sizeof(libxl_device_rdm));
> +
> +d_config->rdms[d_config->num_rdms - 1].start = rdm_start;
> +d_config->rdms[d_config->num_rdms - 1].size = rdm_size;
> +d_config->rdms[d_config->num_rdms - 1].policy = rdm_policy;
> +}


But, I wrote:

   Can I suggest a function

  void add_rdm_entry(libxl__gc *gc, libxl_domain_config *d_config,
uint64_t rdm_start, uint64_t rdm_size, int rdm_policy)

   which assumes that d_config->num_rdms is set correctly, and increments
   it ?

   (Please put the increment at the end so that the assignments are to
   ->rdms[d_config->num_rdms], or perhaps make a convenience alias.)

Note the last paragraph.

This is now the third time I have posted that text.  It is the fifth
request or clarification I have had to make about this very small
area.  I have to say that I'm finding this rather frustrating.

Ian.
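
The shape being asked for might look roughly like the following. This is a self-contained sketch rather than libxl code: `rdm_entry` and `rdm_config` are stand-ins for `libxl_device_rdm` and `libxl_domain_config`, and plain `realloc` replaces `libxl__realloc(NOGC, ...)`. The only point it makes is that the new slot is written at index `num_rdms` through a convenience alias, and the count is incremented last.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in types, assumed for this sketch only. */
struct rdm_entry {
    uint64_t start;
    uint64_t size;
    int policy;
};

struct rdm_config {
    int num_rdms;
    struct rdm_entry *rdms;
};

/* Assumes cfg->num_rdms is correct on entry; increments it at the end. */
static void add_rdm_entry(struct rdm_config *cfg,
                          uint64_t rdm_start, uint64_t rdm_size,
                          int rdm_policy)
{
    struct rdm_entry *grown =
        realloc(cfg->rdms, (cfg->num_rdms + 1) * sizeof(*grown));
    if (!grown)
        abort();    /* the sketch simply gives up on allocation failure */
    cfg->rdms = grown;

    /* Convenience alias for the slot being filled in. */
    struct rdm_entry *e = &cfg->rdms[cfg->num_rdms];
    e->start = rdm_start;
    e->size = rdm_size;
    e->policy = rdm_policy;

    cfg->num_rdms++;
}
```

Written this way, every assignment targets `->rdms[d_config->num_rdms]` and no `- 1` adjustment is needed.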

Re: [Xen-devel] [Xen-Devel] Enabling IRQ Crossbar (Secondary Interrupt Controller) Support

2015-07-21 Thread Brandon Perez


On 07/21/2015 05:41 AM, Julien Grall wrote:
> On 21/07/2015 00:17, Brandon Perez wrote:
> > Hello All,
>
> Hi Brandon,
>
> We usually send one mail per patch rather than sending them as
> attachments to a single email. It's easier for reviewing the patches.
> You also need to add your Signed-off-by on each patch and CC all the
> relevant maintainers. Please see [1] for all the guidelines to submit a
> patch to Xen.
>
> A couple of comments about this series:
>  - Patch #2: You are allowing any guest to do smc which, unless you
> trust all the guests, is insecure. There was some discussion about
> different solutions to handle SMC back in 2013 [2]. So far I didn't see
> any more updates on it. It may be worth starting a separate thread about
> how to handle SMC.
>  - Patch #3, I can't find any documentation or implementation of the
> property "default-mapping" in Linux. Can you provide a link about it?
>
> I will comment more when you resend the patches inline.
>
> Regards,
>
> [1] http://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
> [2] http://lists.xen.org/archives/html/xen-devel/2013-07/msg02779.html



Hi Julien,

I'm not sure that these patches are quite ready yet to be put into 
the Xen repo. For one, these patches don't solve the problem of the 
crossbar as outlined in the thread this came from [1]. Also, I haven't 
had a chance to clean up these patches yet. I've provided these patches 
at the request of Ian, so that you guys could see the changes I have 
made so far, and we could discuss what changes would be needed for the 
crossbar.


I agree about Patch #2, that makes sense, this was a workaround 
I've been using for now. Perhaps a check could be added to see if the 
domain is the privileged domain?


That's correct, the "default-mapping" property is not standard. 
It's another workaround that I'm working with for now. The interrupts 
property is going to contain the crossbar input number, not the actual 
SPI GIC line, so I needed a way to convey this to Xen.


The patches are inlined below:

Patch #1:

From f2bf190255c8f872d15063d7f8a6382c279e312d Mon Sep 17 00:00:00 2001
From: Brandon Perez 
Date: Mon, 20 Jul 2015 17:56:49 -0400
Subject: DRA7: Add specific mappings for devices/regions not in the 
device tree.


The DRA7 chip, which is similar to the OMAP5 chip, also requires 
specific mappings. These are MMIO mappings which are not explicitly 
stated in the device tree, so Xen does not know to map them. This patch 
adds these regions required by the DRA7 to be mapped.


Signed-off-by: Brandon Perez 

---
 xen/arch/arm/platforms/omap5.c|   27 +++
 xen/include/asm-arm/platforms/omap5.h |3 +++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/arm/platforms/omap5.c b/xen/arch/arm/platforms/omap5.c
index e7bf30d..3c6495a 100644
--- a/xen/arch/arm/platforms/omap5.c
+++ b/xen/arch/arm/platforms/omap5.c
@@ -120,6 +120,32 @@ static int omap5_specific_mapping(struct domain *d)
 return 0;
 }

+/* Additional mappings for dom0 (not in the DTS) */
+static int dra7_specific_mapping(struct domain *d)
+{
+/* Map the PRM module */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRM_BASE), 2,
+ paddr_to_pfn(OMAP5_PRM_BASE));
+
+/* Map the PRM_MPU */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_PRCM_MPU_BASE), 1,
+ paddr_to_pfn(OMAP5_PRCM_MPU_BASE));
+
+/* Map the Wakeup Gen */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_WKUPGEN_BASE), 1,
+ paddr_to_pfn(OMAP5_WKUPGEN_BASE));
+
+/* Map the on-chip SRAM */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_SRAM_PA), 32,
+ paddr_to_pfn(OMAP5_SRAM_PA));
+
+/* Map GPMC address space for NAND flash. */
+map_mmio_regions(d, paddr_to_pfn(OMAP5_GPMC_PA), 65536,
+ paddr_to_pfn(OMAP5_GPMC_PA));
+
+return 0;
+}
+
 static int __init omap5_smp_init(void)
 {
 void __iomem *wugen_base;
@@ -171,6 +197,7 @@ PLATFORM_START(dra7, "TI DRA7")
 .init_time = omap5_init_time,
 .cpu_up = cpu_up_send_sgi,
 .smp_init = omap5_smp_init,
+.specific_mapping = dra7_specific_mapping,

 .dom0_gnttab_start = 0x4b00,
 .dom0_gnttab_size = 0x2,
diff --git a/xen/include/asm-arm/platforms/omap5.h b/xen/include/asm-arm/platforms/omap5.h
index c559c84..d87e7d2 100644
--- a/xen/include/asm-arm/platforms/omap5.h
+++ b/xen/include/asm-arm/platforms/omap5.h
@@ -20,6 +20,9 @@
 #define OMAP_AUX_CORE_BOOT_0_OFFSET 0x800
 #define OMAP_AUX_CORE_BOOT_1_OFFSET 0x804

+#define OMAP5_GPMC_PA   0x0100
+#define OMAP5_TILER_PA  0x6000
+
 #endif /* __ASM_ARM_PLATFORMS_OMAP5_H */

 /*
--
1.7.9.5

Patch #2:

From e53fdc1ea750dd3143e2d7cd62a5d38eb446afde Mon Sep 17 00:00:00 2001
From: Brandon Perez 
Date: Mon, 20 Jul 2015 17:58:24 -0400
Subject: Traps: Enable pass-through SMC call support for guest OS's.

Originally, Xen did not allow for guests to mak

Re: [Xen-devel] [PATCH v4 1/3] libxl: make libxl__strdup and libxl__strndup handle NULL

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 18:03 +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v4 1/3] libxl: make libxl__strdup and 
> libxl__strndup handle NULL"):
> > Signed-off-by: Wei Liu 
> 
> Acked-by: Ian Jackson 

Applied. thanks.

Ian.


Re: [Xen-devel] [PATCH v4 3/3] libxl: call libxl_cpupoolinfo_{init, dispose} in numa_place_domain

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 18:03 +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v4 3/3] libxl: call 
> libxl_cpupoolinfo_{init,dispose} in numa_place_domain"):
> > Call libxl_cpupoolinfo_init at the beginning.  Change two returns 
> > to
> > goto out so that libxl_cpupoolinfo_dispose is called in failure 
> > path.
> 
> Acked-by: Ian Jackson 

Applied.



Re: [Xen-devel] [PATCH] xen: arm: Avoid reading beyond the last module

2015-07-21 Thread Ian Campbell

On Mon, 2015-07-20 at 13:32 +0100, Julien Grall wrote:
> Hi Chris,
> 
> On 17/07/15 21:48, Chris (Christopher) Brand wrote:
> > nr_mods is set in add_boot_module() to the number of module
> > array elements used. This function also ensures that nr_mods
> > never exceeds MAX_MODULES (the size of the array). When looping
> > through the array, the correct maximum index is "nr_mods-1",
> > not "nr_mods". If the array is full, using the latter will in
> > fact access beyond the end of the array.
> > This was done correctly in boot_module_find_by_kind() and
> > consider_modules() but incorrectly in discard_initial_modules()
> > and next_module().
> > 
> > Signed-off-by: Chris Brand 
> 
> Reviewed-by: Julien Grall 

Acked + applied.

Care should be taken when backporting since I think this off-by-one was
the result of us previously not including Xen in nr_mods despite it
being in the array or something like that (i.e. the off-by-one used to
be correct).

Ian.
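
The bound described in the commit message can be illustrated with a small standalone sketch. The types, the `MAX_MODULES` value and `find_by_kind` below are simplified stand-ins invented for the example, not the Xen definitions; the loop condition is the part that matters.

```c
#include <stddef.h>

#define MAX_MODULES 5   /* illustrative size, not necessarily Xen's value */

struct module {
    int kind;
};

struct modules {
    int nr_mods;                        /* number of array elements in use */
    struct module module[MAX_MODULES];
};

/* Return the first module of the given kind, or NULL if there is none. */
static struct module *find_by_kind(struct modules *mods, int kind)
{
    /*
     * Valid indices are 0 .. nr_mods-1.  Using "i <= nr_mods" here would
     * read module[MAX_MODULES] when the array is full.
     */
    for (int i = 0; i < mods->nr_mods; i++) {
        if (mods->module[i].kind == kind)
            return &mods->module[i];
    }
    return NULL;
}
```

With a full array (`nr_mods == MAX_MODULES`), a lookup for a kind that is not present returns NULL without touching memory past the last element.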

Re: [Xen-devel] [PATCH] xen: arm: gic-v3: avoid \n in the middle of log messages

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 15:10 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 17/07/15 14:21, Ian Campbell wrote:
> > These result in log messages such as:
> > 
> > (XEN) d0v0: vGICD: RAZ on reserved register offset 0x0c
> > (XEN) d0v0: vGICR: write r2 offset 0x000180
> > (XEN)  not found<3>traps.c:2417:d0v0 HSR=0x93820046 
> > pc=0xffc000322bfc gva=0xff890180 gpa=0x008d110180
> > 
> > Fix this by rewording without a \n in the middle. Also add one at 
> > the end.
> > 
> > Signed-off-by: Ian Campbell 
> 
> Reviewed-by: Julien Grall 

Applied thanks.

Re: [Xen-devel] [PATCH for-4.6 1/3] libxl: include sys/endian.h for FreeBSD

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 10:36 +0100, Wei Liu wrote:
> On Mon, Jul 20, 2015 at 04:55:00PM +0200, Roger Pau Monne wrote:
> > be64toh and friends are declared in sys/endian.h on FreeBSD, so 
> > include it
> > as part of libxl_osdeps.h.
> > 
> > Signed-off-by: Roger Pau Monné 
> > Cc: Ian Jackson 
> > Cc: Ian Campbell 
> > Cc: Wei Liu 
> 
> Acked-by: Wei Liu 

Applied.

Re: [Xen-devel] [PATCH for-4.6 3/3] hotplug/FreeBSD: fix xendriverdomain rc.d script

2015-07-21 Thread Ian Campbell

On Tue, 2015-07-21 at 12:03 +0100, Wei Liu wrote:
> On Mon, Jul 20, 2015 at 04:55:02PM +0200, Roger Pau Monne wrote:
> > hotplugpath.sh by default is located in /usr/local/etc/xen/scripts 
> > on
> > FreeBSD. Instead of hardcoding its location use the XEN_SCRIPT_DIR 
> > variable
> > like it's used on the xencommons rc.d script.
> > 
> > Signed-off-by: Roger Pau Monné 
> > Cc: Ian Jackson 
> > Cc: Ian Campbell 
> > Cc: Wei Liu 
> 
> Acked-by: Wei Liu 

Me too, plus applied.

Thanks Roger, sorry for the breakage.


Re: [Xen-devel] [PATCH 1/5] xl: Command line: Adjust "Fix segfaults from `xl psr-cat-cbm-set`..."

2015-07-21 Thread Wei Liu

On Tue, Jul 21, 2015 at 03:27:26PM +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH 1/5] xl: Command line: Adjust "Fix segfaults 
> from `xl psr-cat-cbm-set`...""):
> > On Fri, 2015-07-17 at 18:00 +0100, Ian Jackson wrote:
> > 
> > Replying here in lieu of a 0/N:
> > 
> > Is any subset of this aimed at 4.6?
> 
> Yes, ideally, all of them.  I think they are bugfixes or minor
> cleanups.
> 
> If Wei wants only a subset, I can probably tease them out.
> 

I think this is part of the audit of toolstack code.  They look
obviously correct.  I'm fine with them going in.

Currently we have three series with freeze exception. Only RMRR has
significant number of changes for xl, however that series doesn't
introduce a new command, so we're probably safe to apply this series
now.

Wei.

> Ian.

Re: [Xen-devel] [PATCH 1/5] xl: Command line: Adjust "Fix segfaults from `xl psr-cat-cbm-set`..."

2015-07-21 Thread Ian Jackson

Ian Campbell writes ("Re: [PATCH 1/5] xl: Command line: Adjust "Fix segfaults 
from `xl psr-cat-cbm-set`...""):
> On Fri, 2015-07-17 at 18:00 +0100, Ian Jackson wrote:
> 
> Replying here in lieu of a 0/N:
> 
> Is any subset of this aimed at 4.6?

Yes, ideally, all of them.  I think they are bugfixes or minor
cleanups.

If Wei wants only a subset, I can probably tease them out.

Ian.

Re: [Xen-devel] [PATCH for 4.6] xen/tools: Widen the machine_irq in xc_domain_*bind_pt_irq_int

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 15:06 +0100, Julien Grall wrote:
> The DOMCTLs {,un}bind_pt_irq are using uint32_t for the machine_irq
> while the helper is using uint8_t.
> 
> Currently on ARM, we are supporting SPIs whose irq number can go up to
> 1019 which doesn't fit in an uint8_t. The helpers xc_domain_bind_pt_spi
> and xc_domain_unbind_pt_spi are correctly taking an uint16_t so the
> libxc was truncating without notifying the user, which may end up
> routing the wrong IRQ.
> 
> Fix the problem by widening the machine_irq parameter in
> xc_domain_*bind_pt_irq_int.
> 
> Note that XEN_DOMCTL_irq_permission has the same problem but it's not
> used at the moment on ARM. So we can defer the changes after the 
> release
> of Xen 4.7.
> 
> Reported-by: Iurii Konovalenko 
> Signed-off-by: Julien Grall 

Acked-by: Ian Campbell 

I think this is a bugfix and should be applied for 4.6.

Ian.

> 
> ---
> This is based on the patch "arm: irq: increase size of irq from 
> uint8_t
> to uint32_t" [1] by Iurii few months ago.
> 
> The bug has been introduced by the device passthrough series 
> pushed
> in Xen 2 months ago. It prevents to route any IRQ number > 256 on 
> ARM.
> 
> The changes are minimal in order to get it fixed for Xen 4.6. 
> There
> is technically change for x86 as the machine_irq field in the
> DOMCTL was already uint32_t. Only the parameter of the internal
> helper is widen.
> 
> [1] 
> http://lists.xen.org/archives/html/xen-devel/2015-04/msg00681.html
> ---
>  tools/libxc/xc_domain.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
> index 6db8d13..b7a41e4 100644
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -1880,7 +1880,7 @@ int xc_domain_unbind_msi_irq(
>  static int xc_domain_bind_pt_irq_int(
>  xc_interface *xch,
>  uint32_t domid,
> -uint8_t machine_irq,
> +uint32_t machine_irq,
>  uint8_t irq_type,
>  uint8_t bus,
>  uint8_t device,
> @@ -1939,7 +1939,7 @@ int xc_domain_bind_pt_irq(
>  static int xc_domain_unbind_pt_irq_int(
>  xc_interface *xch,
>  uint32_t domid,
> -uint8_t machine_irq,
> +uint32_t machine_irq,
>  uint8_t irq_type,
>  uint8_t bus,
>  uint8_t device,
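
The silent truncation described in the commit message is easy to reproduce in isolation. The sketch below is not libxc code: `bind_request`, `bind_narrow` and `bind_wide` are hypothetical names that only mimic a 32-bit field being filled from a helper parameter of either width.

```c
#include <stdint.h>

/* Stand-in for a request structure whose IRQ field is already 32 bits wide. */
struct bind_request {
    uint32_t machine_irq;
};

/* Old shape: the parameter is narrower than the field it fills. */
static void bind_narrow(struct bind_request *req, uint8_t machine_irq)
{
    req->machine_irq = machine_irq;
}

/* New shape: the parameter matches the field width. */
static void bind_wide(struct bind_request *req, uint32_t machine_irq)
{
    req->machine_irq = machine_irq;
}
```

Passing SPI 1019 through the narrow helper stores 1019 mod 256 = 251 without any diagnostic at the call site, which is how a different IRQ could end up being routed; the wide helper stores 1019 unchanged.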

Re: [Xen-devel] [PATCH v2] xl: fix vcpus to vnode assignement in config file

2015-07-21 Thread Ian Campbell

On Mon, 2015-07-20 at 11:30 +0200, Dario Faggioli wrote:
> In fact, right now, if the "vcpus=" list (where the
> user specifies what vcpus should be part of a vnode)
> has multiple elements, things don't work.
> E.g., the following examples all result in failure
> to create the guest:

What is the failure?

>  [ "pnode=0","size=512","vcpus=0,2","vdistances=10,20"  ]
>  [ "pnode=0","size=512","vcpus=0-1,4","vdistances=10,20"  ]
> 
> Reason is we need either a multidimensional array,
> or a bitmap, to temporarily store the vcpus of a
> vnode, while parsing the vnuma config entry.

That sounds like a cure, not the reason for the failure. Please can you
explain the nature of the failure, so it becomes clear why this change
is needed.

> Let's use the latter, which happens to also make it
> easier to copy the outcome of the parsing to its
> final destination in b_info, if everything goes ok.
> 
> Signed-off-by: Dario Faggioli 
> Acked-by: Wei Liu 

---
> Changes from v1:
>  * fix coding style
> ---
> Cc: Ian Jackson 
> Cc: Ian Campbell 
> ---
>  tools/libxl/xl_cmdimpl.c |   34 ++
>  1 file changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 8cbf30e..b41874a 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -1076,9 +1076,7 @@ static void parse_vnuma_config(const XLU_Config 
> *config,
>  /* Temporary storage for parsed vcpus information to avoid
>   * parsing config twice. This array has num_vnuma elements.
>   */
> -struct vcpu_range_parsed {
> -unsigned long start, end;
> -} *vcpu_range_parsed;
> +libxl_bitmap *vcpu_parsed;
>  
>  libxl_physinfo_init(&physinfo);
>  if (libxl_get_physinfo(ctx, &physinfo) != 0) {
> @@ -1095,7 +1093,14 @@ static void parse_vnuma_config(const 
> XLU_Config *config,
>  
>  b_info->num_vnuma_nodes = num_vnuma;
>  b_info->vnuma_nodes = xcalloc(num_vnuma, 
> sizeof(libxl_vnode_info));
> -vcpu_range_parsed = xcalloc(num_vnuma, 
> sizeof(*vcpu_range_parsed));
> +vcpu_parsed = xcalloc(num_vnuma, sizeof(libxl_bitmap));
> +for (i = 0; i < num_vnuma; i++) {
> +libxl_bitmap_init(&vcpu_parsed[i]);
> +if (libxl_cpu_bitmap_alloc(ctx, &vcpu_parsed[i], b_info
> ->max_vcpus)) {
> +fprintf(stderr, "libxl_node_bitmap_alloc failed.\n");
> +exit(1);
> +}
> +}
>  
>  for (i = 0; i < b_info->num_vnuma_nodes; i++) {
>  libxl_vnode_info *p = &b_info->vnuma_nodes[i];
> @@ -1165,12 +1170,14 @@ static void parse_vnuma_config(const 
> XLU_Config *config,
>  split_string_into_string_list(value, ",", 
> &cpu_spec_list);
>  len = libxl_string_list_length(&cpu_spec_list);
>  
> -for (j = 0; j < len; j++)
> +for (j = 0; j < len; j++) {
>  parse_range(cpu_spec_list[j], &s, &e);
> +for (; s <= e; s++) {
> +libxl_bitmap_set(&vcpu_parsed[i], s);
> +max_vcpus++;
> +}
> +}
>  
> -vcpu_range_parsed[i].start = s;
> -vcpu_range_parsed[i].end   = e;
> -max_vcpus += (e - s + 1);
>  libxl_string_list_dispose(&cpu_spec_list);
>  } else if (!strcmp("vdistances", option)) {
>  libxl_string_list vdist;
> @@ -1209,17 +1216,12 @@ static void parse_vnuma_config(const 
> XLU_Config *config,
>  
>  for (i = 0; i < b_info->num_vnuma_nodes; i++) {
>  libxl_vnode_info *p = &b_info->vnuma_nodes[i];
> -int cpu;
>  
> -libxl_cpu_bitmap_alloc(ctx, &p->vcpus, b_info->max_vcpus);
> -libxl_bitmap_set_none(&p->vcpus);
> -for (cpu = vcpu_range_parsed[i].start;
> - cpu <= vcpu_range_parsed[i].end;
> - cpu++)
> -libxl_bitmap_set(&p->vcpus, cpu);
> +libxl_bitmap_copy_alloc(ctx, &p->vcpus, &vcpu_parsed[i]);
> +libxl_bitmap_dispose(&vcpu_parsed[i]);
>  }
>  
> -free(vcpu_range_parsed);
> +free(vcpu_parsed);
>  }
>  
>  static void parse_config_data(const char *config_source,
> 
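
One way to picture the difference between the two storage schemes in the quoted patch: a single start/end pair per vnode can only remember one range, while a bitmap accumulates every element of a list such as "0-1,4". The sketch below uses a plain 64-bit mask instead of `libxl_bitmap`, and `set_range` and `count_bits` are made-up helpers; it is an illustration of the data structure only, not a statement about the exact failure asked about above.

```c
#include <stdint.h>

/* Set bits s..e (inclusive) in a 64-bit mask; assumes s <= e < 64. */
static uint64_t set_range(uint64_t mask, unsigned int s, unsigned int e)
{
    for (unsigned int cpu = s; cpu <= e; cpu++)
        mask |= UINT64_C(1) << cpu;
    return mask;
}

/* Count the vcpus recorded in the mask. */
static unsigned int count_bits(uint64_t mask)
{
    unsigned int n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

Applying `set_range` once per list element for "0-1,4" yields the mask 0x13 with three vcpus set, whereas keeping only the last parsed pair would record vcpu 4 alone.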

Re: [Xen-devel] [PATCH 1/5] xl: Command line: Adjust "Fix segfaults from `xl psr-cat-cbm-set`..."

2015-07-21 Thread Ian Campbell

On Fri, 2015-07-17 at 18:00 +0100, Ian Jackson wrote:

Replying here in lieu of a 0/N:

Is any subset of this aimed at 4.6?


> This adjusts commit a49077e5 "Fix segfaults from `xl psr-cat-cbm-set`
> command line handling":
> 
>  * Do not use the constant `required_argument' here (we simply use 1
>everywhere else).
> 
>  * Fix the minimum required arguments argument to SWITCH_FOREACH_OPT.
> 
> Leave the separate check on optind, because it checks for too many as
> well as too few arguments.
> 
> (There are many things in xl which fail to check for too many
> arguments.  I do not intend to drain that swamp now: I started but
> decided a complete overhaul of most of xl's command line argument
> processing would be best.)
> 
> This is just a code cleanup with no ultimate functional change.
> 
> Signed-off-by: Ian Jackson 
> CC: Chao Peng 
> CC: Andrew Cooper 
> ---
>  tools/libxl/xl_cmdimpl.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index 7949202..55c041c 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -8397,7 +8397,7 @@ int main_psr_cat_cbm_set(int argc, char **argv)
>      int i, j, len;
>  
>      static struct option opts[] = {
> -        {"socket", required_argument, 0, 's'},
> +        {"socket", 1, 0, 's'},
>          COMMON_LONG_OPTS,
>          {0, 0, 0, 0}
>      };
> @@ -8405,7 +8405,7 @@ int main_psr_cat_cbm_set(int argc, char **argv)
>      libxl_socket_bitmap_alloc(ctx, &target_map, 0);
>      libxl_bitmap_set_none(&target_map);
>  
> -    SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-cat-cbm-set", 1) {
> +    SWITCH_FOREACH_OPT(opt, "s:", opts, "psr-cat-cbm-set", 2) {
>      case 's':
>          trim(isspace, optarg, &value);
>          split_string_into_string_list(value, ",", &socket_list);
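
The two changes in the quoted patch can be illustrated with a minimal standalone sketch. This is not xl's SWITCH_FOREACH_OPT macro: parse_args() and its min_args/max_args parameters are hypothetical names used only here. The sketch shows that the required_argument constant from getopt.h has the value 1, and that a separate bounds check on the arguments left over after option parsing catches too many as well as too few. Resetting optind to 0 for a fresh scan is a glibc/musl convention and is an assumption about the C library in use.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* The literal 1 in an option table means the same as required_argument. */
static void check_required_argument_value(void)
{
    assert(required_argument == 1);
}

/*
 * Hypothetical helper (not part of xl): parse an optional -s/--socket
 * option, then require between min_args and max_args positional
 * arguments. Returns 0 on success, -1 on any error.
 */
static int parse_args(int argc, char **argv, int min_args, int max_args,
                      const char **socket_out)
{
    static const struct option opts[] = {
        {"socket", 1, 0, 's'},   /* 1 == required_argument */
        {0, 0, 0, 0}
    };
    int opt, remaining;

    *socket_out = NULL;
    optind = 0;                  /* assumption: glibc/musl full rescan */

    while ((opt = getopt_long(argc, argv, "s:", opts, NULL)) != -1) {
        switch (opt) {
        case 's':
            *socket_out = optarg;
            break;
        default:                 /* unknown option or missing value */
            return -1;
        }
    }

    /* Positional arguments left after the options have been consumed. */
    remaining = argc - optind;
    if (remaining < min_args || remaining > max_args)
        return -1;
    return 0;
}
```

With min_args and max_args both set to 2, a call such as parse_args(5, argv, 2, 2, &s) on {"cmd", "-s", "0", "7", "0x3"} succeeds, while one positional argument or three are both rejected.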


Re: [Xen-devel] [PATCH RFC] libxl: fix build with glibc < 2.9

2015-07-21 Thread Ian Campbell

On Mon, 2015-07-20 at 15:30 +0100, Jan Beulich wrote:
> htobe*() and be*toh() don't exist there. While replacing the 32-bit
> ones with hton() and ntoh() would be possible, there wouldn't be an
> obvious replacement for the 64-bit ones. Hence just take what current
> glibc (2.21) has (assuming __bswap_*() exists, which it does back to
> at least 2.4 according to my checking).
> 
> Signed-off-by: Jan Beulich 
> ---
> Not sure whether I picked an appropriate header to place this in, or
> an appropriate #ifdef to hook this onto. Hence the RFC.

I think they will do.

I was a bit confused by xl.c including libxl_osdeps.h, but I think that
is an aberration and we don't install the header so it is not really
"public".

Acked-by: Ian Campbell 
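
For context, a minimal sketch of the kind of fallback the patch description refers to. This is not the patch itself: the compat_ prefix and both helper functions are hypothetical, and the sketch uses the public bswap_32()/bswap_64() from <byteswap.h> rather than the __bswap_*() internals mentioned above. It assumes a Linux libc that provides <endian.h> with __BYTE_ORDER and __LITTLE_ENDIAN.

```c
#include <assert.h>
#include <byteswap.h>
#include <endian.h>
#include <stdint.h>

/* Hypothetical compat_ names so the sketch never clashes with libc macros. */
#if __BYTE_ORDER == __LITTLE_ENDIAN
# define compat_htobe32(x) bswap_32(x)
# define compat_be32toh(x) bswap_32(x)
# define compat_htobe64(x) bswap_64(x)
# define compat_be64toh(x) bswap_64(x)
#else
/* On a big-endian host the conversions are the identity. */
# define compat_htobe32(x) (x)
# define compat_be32toh(x) (x)
# define compat_htobe64(x) (x)
# define compat_be64toh(x) (x)
#endif

/* Returns 1 if the converted value stores its most significant byte first. */
static int first_byte_is_msb(uint32_t v)
{
    uint32_t be = compat_htobe32(v);
    const unsigned char *p = (const unsigned char *)&be;

    return p[0] == (unsigned char)(v >> 24);
}

/* Round-trip check: converting to big-endian and back is lossless. */
static void compat_selfcheck(void)
{
    assert(compat_be32toh(compat_htobe32(0x11223344u)) == 0x11223344u);
    assert(compat_be64toh(compat_htobe64(0x0102030405060708ULL))
           == 0x0102030405060708ULL);
}
```

A real fallback would presumably define htobe32() and friends themselves, guarded by an #ifdef so that newer glibc versions keep their own definitions.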



[Xen-devel] [qemu-upstream-unstable test] 59777: regressions - trouble: broken/fail/pass

2015-07-21 Thread osstest service owner

flight 59777 qemu-upstream-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/59777/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl-credit2   3 host-install(3) broken REGR. vs. 58880
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 11 guest-saverestore fail REGR. vs. 58880
 test-armhf-armhf-xl-multivcpu 11 guest-start  fail REGR. vs. 58880

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 17 leak-check/check  fail REGR. vs. 58880

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-amd          11 guest-start           fail  never pass
 test-amd64-amd64-xl-pvh-intel        11 guest-start           fail  never pass
 test-amd64-i386-libvirt              12 migrate-support-check fail  never pass
 test-amd64-amd64-libvirt-xsm         12 migrate-support-check fail  never pass
 test-amd64-i386-libvirt-xsm          12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-rtds             11 guest-start           fail  never pass
 test-armhf-armhf-xl-xsm              12 migrate-support-check fail  never pass
 test-amd64-amd64-libvirt             12 migrate-support-check fail  never pass
 test-armhf-armhf-xl                  12 migrate-support-check fail  never pass
 test-armhf-armhf-xl-cubietruck       12 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt-xsm         12 migrate-support-check fail  never pass
 test-armhf-armhf-libvirt             12 migrate-support-check fail  never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-stop            fail  never pass
 test-amd64-i386-xl-qemuu-win7-amd64  16 guest-stop            fail  never pass
 test-armhf-armhf-xl-arndale          12 migrate-support-check fail  never pass

version targeted for testing:
 qemuu                c3eb5b77be3c731c2ecd6eddab403bb8dabc135a
baseline version:
 qemuu                c4a962ec0c61aa9b860a3635c8424472e6c2cc2c

Last test of basis    58880  2015-06-24 13:45:58 Z   27 days
Testing same since    59777  2015-07-20 12:49:32 Z    1 days    1 attempts


People who touched revisions under test:
  Gerd Hoffmann 
  Marc-André Lureau 
  Stefano Stabellini 

jobs:
 build-amd64-xsm                               pass
 build-armhf-xsm                               pass
 build-i386-xsm                                pass
 build-amd64                                   pass
 build-armhf                                   pass
 build-i386                                    pass
 build-amd64-libvirt                           pass
 build-armhf-libvirt                           pass
 build-i386-libvirt                            pass
 build-amd64-pvops                             pass
 build-armhf-pvops                             pass
 build-i386-pvops                              pass
 test-amd64-amd64-xl                           pass
 test-armhf-armhf-xl                           pass
 test-amd64-i386-xl                            pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm  fail
 test-amd64-amd64-libvirt-xsm                  pass
 test-armhf-armhf-libvirt-xsm                  pass
 test-amd64-i386-libvirt-xsm                   pass
 test-amd64-amd64-xl-xsm                       pass
 test-armhf-armhf-xl-xsm                       pass
 test-amd64-i386-xl-xsm                        pass
 test-amd64-amd64-xl-pvh-amd                   fail
 test-amd64-i386-qemuu-rhel6hvm-amd            pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64     pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64      pass
 test-amd64-i386-freebsd10-amd64               pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64          pass
 test-amd64-i386-xl-qemuu-ovmf-amd64           pass
 test-amd64-amd64-xl-qemuu-win7-amd64          fail
 test-amd64-i386-xl-qemuu-win7-amd64           fail
 test-armhf-armhf-xl-arndale                   pass
 test-amd64-amd64-xl-credit2                   pass
 test-armhf-armhf-xl-credit2                   broken
 test-armhf-armhf-xl-cubietruck                pass
 test-amd64-i386-freebsd10-i386                pass
 test-amd64-amd64-xl-pvh-intel                 fail
 test-amd64-i386-qemuu-rhel6hvm-intel          pass
 test-amd64-amd64-libvirt
