Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-14 Thread Boris Ostrovsky

On 08/14/2015 06:38 AM, Jan Beulich wrote:

On 31.07.15 at 18:06, boris.ostrov...@oracle.com wrote:

On 07/24/2015 05:41 AM, Jan Beulich wrote:

@@ -1693,14 +1703,22 @@ int nvmx_handle_vmclear(struct cpu_user_
   else
   {
   /* Even if this VMCS isn't the current one, we must clear it. */
-vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0);
+bool_t writable;
+
+vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0, &writable);

Since you replaced 'gpa >> PAGE_SHIFT' with 'paddr_to_pfn(gpa)' above,
perhaps it should be replaced here too.

Other than that,
Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com

I took the liberty to downgrade this to an ack (covering SVM) on
v2 (since the SVM part didn't change).


Sure. (I also reviewed v2 so you can use either tag).

-boris



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-11 Thread Jan Beulich
 On 31.07.15 at 18:06, boris.ostrov...@oracle.com wrote:
 On 07/24/2015 05:41 AM, Jan Beulich wrote:
 @@ -1693,14 +1703,22 @@ int nvmx_handle_vmclear(struct cpu_user_
   else
   {
   /* Even if this VMCS isn't the current one, we must clear it. */
 -vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0);
 +bool_t writable;
 +
 +vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0, &writable);
 
 Since you replaced 'gpa >> PAGE_SHIFT' with 'paddr_to_pfn(gpa)' above,
 perhaps it should be replaced here too.

Yes indeed.

 Other than that,
 Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com

Thanks, Jan




Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-11 Thread Tim Deegan
At 07:51 -0600 on 11 Aug (1439279513), Jan Beulich wrote:
  On 27.07.15 at 13:09, t...@xen.org wrote:
  At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
  On 24/07/15 10:41, Jan Beulich wrote:
   Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
   too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
   between the setting of the dirty flag and the actual write happening?
   I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
  
  It does indeed.  (Ideally the dirty bit should probably be held high for 
  the duration that a mapping exists, but that is absolutely infeasible to 
  do).
  
  IMO that would not be very useful -- a well-behaved toolstack will
  have to make sure that relevant mappings are torn down before
  stop-and-copy.  Forcing the dirty bit high in the meantime just makes
  every intermediate pass send a wasted copy of the page, without
  actually closing the race window if the tools are buggy.
 
 Making sure such mappings got torn down in time doesn't help
 when the most recent write happened _after_ the most recent
 clearing of the dirty flag in a pass prior to stop-and-copy.

This is why e.g. __gnttab_unmap_common sets the dirty bit again
as it unmaps.

Cheers,

Tim.



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-11 Thread Jan Beulich
 On 27.07.15 at 13:09, t...@xen.org wrote:
 At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
 On 24/07/15 10:41, Jan Beulich wrote:
  Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
  too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
  between the setting of the dirty flag and the actual write happening?
  I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
 
 It does indeed.  (Ideally the dirty bit should probably be held high for 
 the duration that a mapping exists, but that is absolutely infeasible to 
 do).
 
 IMO that would not be very useful -- a well-behaved toolstack will
 have to make sure that relevant mappings are torn down before
 stop-and-copy.  Forcing the dirty bit high in the meantime just makes
 every intermediate pass send a wasted copy of the page, without
 actually closing the race window if the tools are buggy.

Making sure such mappings got torn down in time doesn't help
when the most recent write happened _after_ the most recent
clearing of the dirty flag in a pass prior to stop-and-copy. But
yes, holding the dirty bit high would cause overhead. Yet
setting it only in hvm_unmap_guest_frame() wouldn't, as I now
realize, address the problem either, as that may happen e.g.
only upon guest destruction (i.e. after the stop-and-copy pass).
I.e. for guest pages currently mapped this way we'd really need
a mechanism that avoids their dirty flags getting set during the
initial passes, but forces them set on the final one.

And contrary to what Andrew says, I think tracking these mappings
(namely the permanent ones) isn't infeasible, especially since there
shouldn't be that many of them. With them tracked, the model would
be to set the dirty flag when removing a page from the tracking set,
and, on the final pass, to report the dirty flag as set for all pages
still in the tracking set (to make this work without interface changes
we could use the suspended state of the domain as the indicator that
the final pass is in progress).
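
A rough sketch of that model, purely for illustration (the structure,
field, and function names below are hypothetical; nothing like this
exists in the tree):

/* Hypothetical per-domain set of permanently mapped, writable frames. */
struct track_entry {
    struct list_head list;
    unsigned long    mfn;
};

/* hvm_map_guest_frame_rw(..., permanent=1): instead of calling
 * paging_mark_dirty() up front, add the frame to the (hypothetical)
 * d->perm_rw_maps list. */

/* hvm_unmap_guest_frame(..., permanent=1): set the dirty flag along
 * with removing the frame from the tracking set. */
static void untrack_and_mark_dirty(struct domain *d, struct track_entry *ent)
{
    list_del(&ent->list);
    paging_mark_dirty(d, ent->mfn);
    xfree(ent);
}

/* paging_log_dirty_op(): on the final pass (e.g. recognized by the
 * domain being suspended) additionally report every mfn still on
 * d->perm_rw_maps as dirty. */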

Jan




Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-11 Thread Jan Beulich
 On 11.08.15 at 16:34, t...@xen.org wrote:
 At 07:51 -0600 on 11 Aug (1439279513), Jan Beulich wrote:
  On 27.07.15 at 13:09, t...@xen.org wrote:
  At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
  On 24/07/15 10:41, Jan Beulich wrote:
   Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
   too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
   between the setting of the dirty flag and the actual write happening?
   I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
  
  It does indeed.  (Ideally the dirty bit should probably be held high for 
  the duration that a mapping exists, but that is absolutely infeasible to 
  do).
  
  IMO that would not be very useful -- a well-behaved toolstack will
  have to make sure that relevant mappings are torn down before
  stop-and-copy.  Forcing the dirty bit high in the meantime just makes
  every intermediate pass send a wasted copy of the page, without
  actually closing the race window if the tools are buggy.
 
 Making sure such mappings got torn down in time doesn't help
 when the most recent write happened _after_ the most recent
 clearing of the dirty flag in a pass prior to stop-and-copy.
 
 This is why e.g. __gnttab_unmap_common sets the dirty bit again
 as it unmaps.

And how does this help when the mapping survives until the guest
gets suspended? And why would doing it _again_ when unmapping
be better than doing it _only_ then?

But in any event I read this as agreement that moving (or in the
worst case replicating) the paging_mark_dirty() done for
hvm_map_guest_frame_rw() into hvm_unmap_guest_frame() would be an
appropriate thing to do.
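
In code terms that would look roughly like the sketch below (not the
actual interface: hvm_unmap_guest_frame() today only takes p and
permanent, so it would need some way of learning whether the mapping
had been writable; and, per the earlier message, permanent mappings
would additionally need the tracking discussed there):

void hvm_unmap_guest_frame(void *p, bool_t permanent, bool_t writable)
{
    unsigned long mfn;

    if ( !p )
        return;

    mfn = domain_page_map_to_mfn(p);

    /*
     * Mark the page dirty when the mapping goes away rather than when
     * it is established (mirroring what __gnttab_unmap_common() does
     * for grant mappings), so that a log-dirty pass cannot clear the
     * flag in the window between the marking and the actual write.
     */
    if ( writable )
        paging_mark_dirty(current->domain, mfn);

    if ( !permanent )
        unmap_domain_page(p);
    else
        vunmap(p);

    put_page(mfn_to_page(mfn));
}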

Jan




Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-08-11 Thread Tim Deegan
At 09:37 -0600 on 11 Aug (1439285833), Jan Beulich wrote:
  On 11.08.15 at 16:34, t...@xen.org wrote:
  At 07:51 -0600 on 11 Aug (1439279513), Jan Beulich wrote:
   On 27.07.15 at 13:09, t...@xen.org wrote:
   At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
   On 24/07/15 10:41, Jan Beulich wrote:
Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
between the setting of the dirty flag and the actual write happening?
I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
   
   It does indeed.  (Ideally the dirty bit should probably be held high 
   for 
   the duration that a mapping exists, but that is absolutely infeasible 
   to 
   do).
   
   IMO that would not be very useful -- a well-behaved toolstack will
   have to make sure that relevant mappings are torn down before
   stop-and-copy.  Forcing the dirty bit high in the meantime just makes
   every intermediate pass send a wasted copy of the page, without
   actually closing the race window if the tools are buggy.
  
  Making sure such mappings got torn down in time doesn't help
  when the most recent write happened _after_ the most recent
  clearing of the dirty flag in a pass prior to stop-and-copy.
  
  This is why e.g. __gnttab_unmap_common sets the dirty bit again
  as it unmaps.
 
 And how does this help when the mapping survives until the guest
 gets suspended?

Suspended is fine, so long as the unmap (and with it the dirty
marking) happens before the final read of the bitmap.

 And why would doing it _again_ when unmapping
 be better than doing it _only_ then?

My mistake - it is of course done _only_ then.

 But in any event I read this as agreement that moving (or in the
 worst case replicating) the paging_mark_dirty() done for
 hvm_map_guest_frame_rw() into hvm_unmap_guest_frame() would be an
 appropriate thing to do.

Yep!

Tim.



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-31 Thread Boris Ostrovsky

On 07/24/2015 05:41 AM, Jan Beulich wrote:
  
--- a/xen/arch/x86/hvm/vmx/vvmx.c

+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1619,10 +1619,23 @@ int nvmx_handle_vmptrld(struct cpu_user_
  
  if ( nvcpu->nv_vvmcxaddr == VMCX_EADDR )

  {
-nvcpu->nv_vvmcx = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 1);
-if ( nvcpu->nv_vvmcx )
-nvcpu->nv_vvmcxaddr = gpa;
-if ( !nvcpu->nv_vvmcx ||
+bool_t writable;
+void *vvmcx = hvm_map_guest_frame_rw(paddr_to_pfn(gpa), 1, &writable);
+


...


@@ -1693,14 +1703,22 @@ int nvmx_handle_vmclear(struct cpu_user_
  else
  {
  /* Even if this VMCS isn't the current one, we must clear it. */
-vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0);
+bool_t writable;
+
+vvmcs = hvm_map_guest_frame_rw(gpa >> PAGE_SHIFT, 0, &writable);


Since you replaced 'gpa >> PAGE_SHIFT' with 'paddr_to_pfn(gpa)' above,
perhaps it should be replaced here too.
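
For reference, the two spellings compute the same frame number, so the
suggestion is purely about consistency with the earlier hunk; roughly
(an approximation, not quoting the actual Xen header):

#define paddr_to_pfn(pa) ((unsigned long)((pa) >> PAGE_SHIFT))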


Other than that,
Reviewed-by: Boris Ostrovsky boris.ostrov...@oracle.com



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-30 Thread Tian, Kevin
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Friday, July 24, 2015 5:41 PM
 
 ... and its callers.
 
 While all non-nested users are made fully honor the semantics of that
 type, doing so in the nested case seemed insane (if doable at all,
 considering VMCS shadowing), and hence there the respective operations
 are simply made fail.
 
 One case not (yet) taken care of is that of a page getting transitioned
 to this type after a mapping got established.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com

Acked-by: Kevin Tian kevin.t...@intel.com




Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-27 Thread Tim Deegan
At 13:02 +0100 on 24 Jul (1437742964), Andrew Cooper wrote:
 On 24/07/15 10:41, Jan Beulich wrote:
  Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
  too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
  between the setting of the dirty flag and the actual write happening?
  I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
 
 It does indeed.  (Ideally the dirty bit should probably be held high for 
 the duration that a mapping exists, but that is absolutely infeasible to 
 do).

IMO that would not be very useful -- a well-behaved toolstack will
have to make sure that relevant mappings are torn down before
stop-and-copy.  Forcing the dirty bit high in the meantime just makes
every intermediate pass send a wasted copy of the page, without
actually closing the race window if the tools are buggy.

If we want to catch these bugs, it might be useful to have a flag
that the tools can set when stop-and-copy begins, to indicate any
subsequent mark_dirty() calls are too late.

Cheers,

Tim.



[Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Jan Beulich
... and its callers.

While all non-nested users are made fully honor the semantics of that
type, doing so in the nested case seemed insane (if doable at all,
considering VMCS shadowing), and hence there the respective operations
are simply made fail.

One case not (yet) taken care of is that of a page getting transitioned
to this type after a mapping got established.

Signed-off-by: Jan Beulich jbeul...@suse.com
---
Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
between the setting of the dirty flag and the actual write happening?
I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?

Note that this is conflicting with the altp2m series (adding another
call to hvm_map_guest_frame_rw(), reviewing of which made me spot the
issue in the first place).

--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3594,8 +3594,8 @@ int hvm_virtual_to_linear_addr(
 
 /* On non-NULL return, we leave this function holding an additional 
  * ref on the underlying mfn, if any */
-static void *__hvm_map_guest_frame(unsigned long gfn, bool_t writable,
-   bool_t permanent)
+static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
+  bool_t *writable)
 {
 void *map;
 p2m_type_t p2mt;
@@ -3618,7 +3618,12 @@ static void *__hvm_map_guest_frame(unsig
 }
 
 if ( writable )
-paging_mark_dirty(d, page_to_mfn(page));
+{
+if ( !p2m_is_discard_write(p2mt) )
+paging_mark_dirty(d, page_to_mfn(page));
+else
+*writable = 0;
+}
 
 if ( !permanent )
 return __map_domain_page(page);
@@ -3630,14 +3635,16 @@ static void *__hvm_map_guest_frame(unsig
 return map;
 }
 
-void *hvm_map_guest_frame_rw(unsigned long gfn, bool_t permanent)
+void *hvm_map_guest_frame_rw(unsigned long gfn, bool_t permanent,
+ bool_t *writable)
 {
-return __hvm_map_guest_frame(gfn, 1, permanent);
+*writable = 1;
+return _hvm_map_guest_frame(gfn, permanent, writable);
 }
 
 void *hvm_map_guest_frame_ro(unsigned long gfn, bool_t permanent)
 {
-return __hvm_map_guest_frame(gfn, 0, permanent);
+return _hvm_map_guest_frame(gfn, permanent, NULL);
 }
 
 void hvm_unmap_guest_frame(void *p, bool_t permanent)
@@ -3657,7 +3664,7 @@ void hvm_unmap_guest_frame(void *p, bool
 put_page(mfn_to_page(mfn));
 }
 
-static void *hvm_map_entry(unsigned long va)
+static void *hvm_map_entry(unsigned long va, bool_t *writable)
 {
 unsigned long gfn;
 uint32_t pfec;
@@ -3680,7 +3687,7 @@ static void *hvm_map_entry(unsigned long
 if ( (pfec == PFEC_page_paged) || (pfec == PFEC_page_shared) )
 goto fail;
 
-v = hvm_map_guest_frame_rw(gfn, 0);
+v = hvm_map_guest_frame_rw(gfn, 0, writable);
 if ( v == NULL )
 goto fail;
 
@@ -3702,6 +3709,7 @@ static int hvm_load_segment_selector(
 struct segment_register desctab, cs, segr;
 struct desc_struct *pdesc, desc;
 u8 dpl, rpl, cpl;
+bool_t writable;
 int fault_type = TRAP_invalid_tss;
 struct cpu_user_regs *regs = guest_cpu_user_regs();
 struct vcpu *v = current;
@@ -3738,7 +3746,7 @@ static int hvm_load_segment_selector(
 if ( ((sel & 0xfff8) + 7) > desctab.limit )
 goto fail;
 
-pdesc = hvm_map_entry(desctab.base + (sel & 0xfff8));
+pdesc = hvm_map_entry(desctab.base + (sel & 0xfff8), &writable);
 if ( pdesc == NULL )
 goto hvm_map_fail;
 
@@ -3797,6 +3805,7 @@ static int hvm_load_segment_selector(
 break;
 }
  } while ( !(desc.b & 0x100) && /* Ensure Accessed flag is set */
+  writable && /* except if we are to discard writes */
   (cmpxchg(&pdesc->b, desc.b, desc.b | 0x100) != desc.b) );
 
 /* Force the Accessed flag in our local copy. */
@@ -3834,6 +3843,7 @@ void hvm_task_switch(
 struct cpu_user_regs *regs = guest_cpu_user_regs();
 struct segment_register gdt, tr, prev_tr, segr;
 struct desc_struct *optss_desc = NULL, *nptss_desc = NULL, tss_desc;
+bool_t otd_writable, ntd_writable;
 unsigned long eflags;
 int exn_raised, rc;
 struct {
@@ -3860,11 +3870,12 @@ void hvm_task_switch(
 goto out;
 }
 
-optss_desc = hvm_map_entry(gdt.base + (prev_tr.sel & 0xfff8));
+optss_desc = hvm_map_entry(gdt.base + (prev_tr.sel & 0xfff8),
+   &otd_writable);
 if ( optss_desc == NULL )
 goto out;
 
-nptss_desc = hvm_map_entry(gdt.base + (tss_sel & 0xfff8));
+nptss_desc = hvm_map_entry(gdt.base + (tss_sel & 0xfff8), &ntd_writable);
 if ( nptss_desc == NULL )
 goto out;
 
@@ -3996,11 +4007,11 @@ void hvm_task_switch(
 v->arch.hvm_vcpu.guest_cr[0] |= X86_CR0_TS;
 hvm_update_guest_cr(v, 0);
 
-if ( (taskswitch_reason == TSW_iret) ||
- (taskswitch_reason == TSW_jmp) )
+if ( (taskswitch_reason 

Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Jan Beulich
 On 24.07.15 at 12:26, wei.l...@citrix.com wrote:
 On Fri, Jul 24, 2015 at 03:41:26AM -0600, Jan Beulich wrote:
 @@ -3618,7 +3618,12 @@ static void *__hvm_map_guest_frame(unsig
  }
  
  if ( writable )
 
 I don't claim I know this piece of code, but checking the pointer but
 not the content looks suspicious.
 
 -paging_mark_dirty(d, page_to_mfn(page));
 +{
 +if ( !p2m_is_discard_write(p2mt) )
 +paging_mark_dirty(d, page_to_mfn(page));
 +else
 +*writable = 0;
 
 You then set *writable here, which makes it even more suspicious.

Why? A caller _wanting_ a writable mapping passes non-NULL as
the pointer argument (pre-initialized to point to a variable holding
TRUE aka 1). Upon return the variable will have got set to FALSE
aka 0 if the page shouldn't be written to.
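
As a minimal usage sketch of that convention (copy_to_guest_frame() and
its gfn/src/len parameters are made up for illustration; it mirrors the
vmclear hunk rather than quoting it):

static int copy_to_guest_frame(unsigned long gfn, const void *src, size_t len)
{
    bool_t writable;   /* hvm_map_guest_frame_rw() initializes this to 1... */
    void *p = hvm_map_guest_frame_rw(gfn, 0 /* transient */, &writable);

    if ( !p )
        return -EINVAL;                /* mapping failed */

    if ( writable )                    /* ...and clears it for p2m_ram_ro */
        memcpy(p, src, len);           /* the page has been marked dirty */
    /* else: silently discard the write, per the type's semantics */

    hvm_unmap_guest_frame(p, 0);
    return 0;
}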

Jan




Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Wei Liu
On Fri, Jul 24, 2015 at 03:41:26AM -0600, Jan Beulich wrote:
 ... and its callers.
 
 While all non-nested users are made fully honor the semantics of that
 type, doing so in the nested case seemed insane (if doable at all,
 considering VMCS shadowing), and hence there the respective operations
 are simply made fail.
 
 One case not (yet) taken care of is that of a page getting transitioned
 to this type after a mapping got established.
 
 Signed-off-by: Jan Beulich jbeul...@suse.com
 ---
 Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
 too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
 between the setting of the dirty flag and the actual write happening?
 I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
 
 Note that this is conflicting with the altp2m series (adding another
 call to hvm_map_guest_frame_rw(), reviewing of which made me spot the
 issue in the first place).
 
 --- a/xen/arch/x86/hvm/hvm.c
 +++ b/xen/arch/x86/hvm/hvm.c
 @@ -3594,8 +3594,8 @@ int hvm_virtual_to_linear_addr(
  
  /* On non-NULL return, we leave this function holding an additional 
   * ref on the underlying mfn, if any */
 -static void *__hvm_map_guest_frame(unsigned long gfn, bool_t writable,
 -   bool_t permanent)
 +static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
 +  bool_t *writable)
  {
  void *map;
  p2m_type_t p2mt;
 @@ -3618,7 +3618,12 @@ static void *__hvm_map_guest_frame(unsig
  }
  
  if ( writable )

I don't claim I know this piece of code, but checking the pointer but
not the content looks suspicious.

 -paging_mark_dirty(d, page_to_mfn(page));
 +{
 +if ( !p2m_is_discard_write(p2mt) )
 +paging_mark_dirty(d, page_to_mfn(page));
 +else
 +*writable = 0;

You then set *writable here, which makes it even more suspicious.

Wei.



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Wei Liu
On Fri, Jul 24, 2015 at 04:37:25AM -0600, Jan Beulich wrote:
  On 24.07.15 at 12:26, wei.l...@citrix.com wrote:
  On Fri, Jul 24, 2015 at 03:41:26AM -0600, Jan Beulich wrote:
  @@ -3618,7 +3618,12 @@ static void *__hvm_map_guest_frame(unsig
   }
   
   if ( writable )
  
  I don't claim I know this piece of code, but checking the pointer but
  not the content looks suspicious.
  
  -paging_mark_dirty(d, page_to_mfn(page));
  +{
  +if ( !p2m_is_discard_write(p2mt) )
  +paging_mark_dirty(d, page_to_mfn(page));
  +else
  +*writable = 0;
  
  You then set *writable here, which makes it even more suspicious.
 
 Why? A caller _wanting_ a writable mapping passes non-NULL as
 the pointer argument (pre-initialized to point to a variable holding
 TRUE aka 1). Upon return the variable will have got set to FALSE
 aka 0 if the page shouldn't be written to.
 

If this is the convention for using this function then the code you
write is of course fine.

Wei.

 Jan



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Andrew Cooper

On 24/07/15 10:41, Jan Beulich wrote:

... and its callers.

While all non-nested users are made fully honor the semantics of that
type, doing so in the nested case seemed insane (if doable at all,
considering VMCS shadowing), and hence there the respective operations
are simply made fail.


Sorry, but I can't parse this sentence.  Surely in the nested case, it 
is the host p2m type which is relevant to whether a mapping should be 
forced read only?




One case not (yet) taken care of is that of a page getting transitioned
to this type after a mapping got established.

Signed-off-by: Jan Beulich jbeul...@suse.com
---
Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
between the setting of the dirty flag and the actual write happening?
I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?


It does indeed.  (Ideally the dirty bit should probably be held high for 
the duration that a mapping exists, but that is absolutely infeasible to 
do).


This is definitely not the only issue between the p2m and logdirty. At 
some point I need to see about making migration safe to use while the 
guest is ballooning.  Both PV and HVM guests break in different ways if 
they actually perform ballooning or physmap alteration while logdirty is 
active for live migration purposes.




Note that this is conflicting with the altp2m series (adding another
call to hvm_map_guest_frame_rw(), reviewing of which made me spot the
issue in the first place).


@@ -3797,6 +3805,7 @@ static int hvm_load_segment_selector(
  break;
  }
   } while ( !(desc.b & 0x100) && /* Ensure Accessed flag is set */
+  writable && /* except if we are to discard writes */
   (cmpxchg(&pdesc->b, desc.b, desc.b | 0x100) != desc.b) );


I can't recall where I read it in the manual, but I believe it is a 
faultable error to load a descriptor from RO memory if the accessed bit 
is not already set.  This was to prevent a processor livelock when 
running with gdtr pointing into ROM (which was a considered usecase).


Otherwise, the rest of the patch looks fine.

~Andrew



Re: [Xen-devel] [PATCH] x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()

2015-07-24 Thread Jan Beulich
 On 24.07.15 at 14:02, andrew.coop...@citrix.com wrote:
 On 24/07/15 10:41, Jan Beulich wrote:
 ... and its callers.

 While all non-nested users are made fully honor the semantics of that
 type, doing so in the nested case seemed insane (if doable at all,
 considering VMCS shadowing), and hence there the respective operations
 are simply made fail.
 
 Sorry, but I can't parse this sentence.  Surely in the nested case, it 
 is the host p2m type which is relevant to whether a mapping should be 
 forced read only?

No, what I mean to say is
- callers outside of nested-HVM code properly obey the write-ignore
  semantics
- callers inside nested-HVM code would be too cumbersome (and
  maybe impossible) to fix, and hence they're being made to return
  failure to their callers.
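
In code terms the nested paths thus end up doing roughly the following
(a paraphrase of the vmptrld hunk quoted elsewhere in the thread, not
the literal patch text; nvmx_fail_op() is a made-up stand-in for the
existing nvmx error handling):

bool_t writable;
void *vvmcx = hvm_map_guest_frame_rw(paddr_to_pfn(gpa), 1, &writable);

if ( vvmcx && !writable )
{
    /* Write-discard (p2m_ram_ro) page: refuse rather than pretend
     * the nested VMCS could be kept there. */
    hvm_unmap_guest_frame(vvmcx, 1);
    vvmcx = NULL;
}
if ( !vvmcx )
    return nvmx_fail_op();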

 Beyond that log-dirty handling in _hvm_map_guest_frame() looks bogus
 too: What if a XEN_DOMCTL_SHADOW_OP_* gets issued and acted upon
 between the setting of the dirty flag and the actual write happening?
 I.e. shouldn't the flag instead be set in hvm_unmap_guest_frame()?
 
 It does indeed.  (Ideally the dirty bit should probably be held high for 
 the duration that a mapping exists, but that is absolutely infeasible to 
 do).

I don't see this being too difficult, especially since for transient
mappings it doesn't really matter (if there's a race, then setting
the flag after the write(s) is good enough). For permanent
mappings I can't see why we wouldn't be able to add a (short)
linked list of pages paging_log_dirty_op() should always set the
dirty flags for.

 @@ -3797,6 +3805,7 @@ static int hvm_load_segment_selector(
   break;
   }
   } while ( !(desc.b & 0x100) && /* Ensure Accessed flag is set */
 +  writable && /* except if we are to discard writes */
 (cmpxchg(&pdesc->b, desc.b, desc.b | 0x100) != desc.b) );
 
 I can't recall where I read it in the manual, but I believe it is a 
 faultable error to load a descriptor from RO memory if the accessed bit 
 is not already set.  This was to prevent a processor livelock when 
 running with gdtr pointing into ROM (which was a considered usecase).

I don't see why a processor would live-lock in such a case. It can do
the write, and ignore whether it actually took effect. I don't see why
it would e.g. spin until it sees the flag set. (Note that a cmpxchg()-like
loop alone wouldn't have that problem, i.e. for a live lock to occur
there would still need to be an outer loop doing the checking).
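
To make the distinction concrete (illustrative fragments only, not code
from the hypervisor; p is assumed to point at the descriptor word, and
stores through it may be silently discarded):

/* (a) A cmpxchg()-style retry loop terminates even when the store is
 * discarded: the observed value never changes, so the compare
 * "succeeds" and the loop exits -- no live-lock. */
uint32_t old;
do {
    old = *p;
} while ( cmpxchg(p, old, old | 0x100) != old );

/* (b) Only an outer loop insisting on reading the Accessed bit back
 * as set could spin forever on such memory. */
while ( !(*p & 0x100) )
    cmpxchg(p, *p, *p | 0x100);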

But even if there were such (perhaps even model-specific) behavior,
without having a pointer to where this is specified (and hence what
precise fault [and error code] to raise), I wouldn't want to go that
route here.

Jan

