Re: [PATCH 07/16] x86/shadow: call sh_update_cr3() directly from sh_page_fault()

2023-03-29 Thread Tim Deegan
At 12:37 +0200 on 28 Mar (1680007032), Jan Beulich wrote:
> On 27.03.2023 17:39, Tim Deegan wrote:
> > At 10:33 +0100 on 22 Mar (1679481226), Jan Beulich wrote:
> >> There's no need for an indirect call here, as the mode is invariant
> >> throughout the entire paging-locked region. All it takes to avoid it is
> >> to have a forward declaration of sh_update_cr3() in place.
> >>
> >> Signed-off-by: Jan Beulich 
> >> ---
> >> I find this and the respective Win7 related comment suspicious: If we
> >> really need to "fix up" L3 entries "on demand", wouldn't we better retry
> >> the shadow_get_and_create_l1e() rather than exit? The spurious page
> >> fault that the guest observes can, after all, not be known to be non-
> >> fatal inside the guest. That's purely an OS policy.
> > 
> > I think it has to be non-fatal because it can happen on real hardware,
> > even if the hardware *does* fill the TLB here (which it is not
> > required to).
> 
> But if hardware filled the TLB, it won't raise #PF. If it didn't fill
> the TLB (e.g. because of not even doing a walk when a PDPTE is non-
> present), a #PF would be legitimate, and the handler would then need
> to reload CR3. But such a #PF is what, according to the comment, Win7
> chokes on.

IIRC the Win7 behaviour is to accept and discard the #PF as spurious
(because it sees the PDPTE is present) *without* reloading CR3, so it
gets stuck in a loop taking pagefaults.  Here, we reload CR3 for it,
to unstick it.
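
A simplified sketch of the fixup being discussed (illustrative only, not
the actual Xen code; guest_is_pae() is a made-up helper and the return
value is schematic):

    ptr_sl1e = shadow_get_and_create_l1e(v, &gw, &sl1mfn, ft);
    if ( unlikely(ptr_sl1e == NULL) )
    {
        /* A guest PDPTE may have appeared since the last CR3 load.
         * Re-read the guest's L3 entries and let the guest retry,
         * instead of injecting a #PF that Win7 would discard as
         * spurious without reloading CR3 (and so loop forever). */
        if ( guest_is_pae(v) )
        {
            sh_update_cr3(v, 0, false);
            return 0; /* fault handled; guest retries the access */
        }
    }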

I can't think of a mental model of the MMU that would have a problem
here -- either the L3Es are special in which case the pagefault is
correct (and we shouldn't even read the new contents) or they're just
like other PTEs in which case the spurious fault is fine.

> > The assert's not true here because the guest can push us down this
> > path by doing exactly what Win 7 does here - loading CR3 with a
> > missing L3E, then filling the L3E and causing a page fault that uses
> > the now-filled L3E.  (Or maybe that's not true any more; my mental
> > model of the pagetable walker code might be out of date)
> 
> Well, I specifically started the paragraph with 'Beyond the "on demand"
> L3 entry creation'. IOW the assertion would look applicable to me if
> we, as suggested higher up, retried shadow_get_and_create_l1e() and it
> failed again. As the comment there says, "we know the guest entries are
> OK", so the missing L3 entry must have appeared.

Ah, I didn't quite understand you.  Yes, if we changed the handler to
rebuild the SL3E then I think the assertion would be valid again.

Cheers,

Tim.



Re: [PATCH 07/16] x86/shadow: call sh_update_cr3() directly from sh_page_fault()

2023-03-27 Thread Tim Deegan
Hi,

At 10:33 +0100 on 22 Mar (1679481226), Jan Beulich wrote:
> There's no need for an indirect call here, as the mode is invariant
> throughout the entire paging-locked region. All it takes to avoid it is
> to have a forward declaration of sh_update_cr3() in place.
> 
> Signed-off-by: Jan Beulich 
> ---
> I find this and the respective Win7 related comment suspicious: If we
> really need to "fix up" L3 entries "on demand", wouldn't we better retry
> the shadow_get_and_create_l1e() rather than exit? The spurious page
> fault that the guest observes can, after all, not be known to be non-
> fatal inside the guest. That's purely an OS policy.

I think it has to be non-fatal because it can happen on real hardware,
even if the hardware *does* fill the TLB here (which it is not
required to).

Filling just one sl3e sounds plausible, though we don't want to go
back to the idea of having L3 shadows on PAE!

> Furthermore the sh_update_cr3() will also invalidate L3 entries which
> were loaded successfully before, but invalidated by the guest
> afterwards. I strongly suspect that the described hardware behavior is
> _only_ to load previously not-present entries from the PDPT, but not
> purge ones already marked present.

Very likely, but we *are* allowed to forget old entries whenever we
want to, so this is at worst a performance problem.

> IOW I think sh_update_cr3() would
> need calling in an "incremental" mode here.

This would be the best way of updating just the one entry - but as far
as I can tell the existing code is correct so I wouldn't add any more
complexity unless we know that this path is causing perf problems.

> In any event emitting a TRC_SHADOW_DOMF_DYING trace record in this case
> looks wrong.

Yep.

> Beyond the "on demand" L3 entry creation I also can't see what guest
> actions could lead to the ASSERT() being inapplicable in the PAE case.
> The 3-level code in shadow_get_and_create_l2e() doesn't consult guest
> PDPTEs, and all other logic is similar to that for other modes.

The assert's not true here because the guest can push us down this
path by doing exactly what Win 7 does here - loading CR3 with a
missing L3E, then filling the L3E and causing a page fault that uses
the now-filled L3E.  (Or maybe that's not true any more; my mental
model of the pagetable walker code might be out of date)

Cheers,

Tim.




Re: [PATCH v3 0/4] x86/P2M: allow 2M superpage use for shadowed guests

2022-08-16 Thread Tim Deegan
At 09:43 +0200 on 12 Aug (1660297391), Jan Beulich wrote:
> I did notice this anomaly in the context of IOMMU side work.
> 
> 1: shadow: slightly consolidate sh_unshadow_for_p2m_change() (part I)
> 2: shadow: slightly consolidate sh_unshadow_for_p2m_change() (part II)
> 3: shadow: slightly consolidate sh_unshadow_for_p2m_change() (part III)
> 4: P2M: allow 2M superpage use for shadowed guests


Acked-by: Tim Deegan 

FWIW I think that adding some kind of mfn_mask() operation would
be neater and more understandable than subtracting the PSE flag.
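
For illustration, such a helper might look like this (mfn_mask() is a
hypothetical name, not an existing Xen function):

    static inline mfn_t mfn_mask(mfn_t mfn, unsigned long mask)
    {
        return _mfn(mfn_x(mfn) & ~mask);
    }

    /* e.g. aligning an MFN down to a 2M superpage boundary: */
    base = mfn_mask(mfn, (1UL << (L2_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1);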

Cheers,

Tim.



Re: [PATCH 0/8] x86: XSA-40{1,2,8} follow-up

2022-07-30 Thread Tim Deegan
At 17:58 +0200 on 26 Jul (1658858332), Jan Beulich wrote:
> Perhaps not entirely unexpected the work there has turned up a few
> further items which want dealing with. Most if not all of these
> aren't interdependent, so could probably be looked at (and go in)
> in (about) any order.

Shadow parts Acked-by: Tim Deegan 

Cheers,

Tim.



Re: [PATCH 1/2] x86/mm: don't open-code p2m_is_pod()

2021-12-02 Thread Tim Deegan
At 12:01 +0100 on 01 Dec (1638360084), Jan Beulich wrote:
> Replace all comparisons against p2m_populate_on_demand (outside of
> switch() statements) with the designated predicate.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH 1/2] x86/shadow: defer/avoid paging_mfn_is_dirty() invocation

2021-12-02 Thread Tim Deegan
At 11:35 +0100 on 01 Dec (1638358515), Jan Beulich wrote:
> paging_mfn_is_dirty() is moderately expensive, so avoid its use unless
> its result might actually change anything. This means moving the
> surrounding if() down below all other checks that can result in clearing
> _PAGE_RW from sflags, in order to then check whether _PAGE_RW is
> actually still set there before calling the function.
> 
> While moving the block of code, fold two if()s and make a few style
> adjustments.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Tim Deegan 

> ---
> TBD: Perhaps the dirty-VRAM piece also wants moving down a little, such
>  that all three "level == 1" conditionals can be folded?

I have no strong feelings on that either way.

Cheers,

Tim.



Re: sh_unshadow_for_p2m_change() vs p2m_set_entry()

2021-10-01 Thread Tim Deegan
At 07:59 +0200 on 01 Oct (1633075173), Jan Beulich wrote:
> On 27.09.2021 22:25, Tim Deegan wrote:
> > At 13:31 +0200 on 24 Sep (1632490304), Jan Beulich wrote:
> >> The 2M logic also first checks _PAGE_PRESENT (and _PAGE_PSE), while
> >> the 4k logic appears to infer that the old page was present from
> >> p2m_is_{valid,grant}().
> > 
> > I think the p2m_type check is the right one (rather than
> > _PAGE_PRESENT), since that's the one that the p2m lookups will obey
> > when causing things to get shadowed.  But the extra _PAGE_PSE check
> > should stay.
> 
> Actually, having transformed things into patch form, I'm now puzzled
> by you explicitly saying this. Wasn't this check wrong in the first
> place? I don't see anything preventing an L1 page table getting
> zapped (or replaced by a 2M mapping) all in one go.

ISTR that this couldn't happen, but I don't remember exactly
why.  It may just be that the p2m update code didn't have that path,
but it's a bit fragile to rely on that.

> The full range
> of GFNs would need checking in this case as well, just like in the
> opposite case (2M mapping getting replaced by an L1 pt).

Yes.  Any case where we're inserting or removing an L1 table would
need to check all 512 L1Es.
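
In sketch form (check_one_gfn() is a placeholder for whatever per-GFN
unshadowing check applies; npte is the new L1 table, or NULL on
removal):

    /* Inserting or removing an L1 table potentially changes every
     * mapping it contains, so all of its entries need checking. */
    for ( i = 0; i < L1_PAGETABLE_ENTRIES; i++ ) /* 512 on x86-64 */
        check_one_gfn(d, gfn_add(gfn, i),
                      npte ? npte[i] : l1e_empty());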

Cheers,

Tim.



Re: sh_unshadow_for_p2m_change() vs p2m_set_entry()

2021-09-27 Thread Tim Deegan
Hi,

At 13:31 +0200 on 24 Sep (1632490304), Jan Beulich wrote:
> I'm afraid you're still my best guess to hopefully get an insight
> on issues like this one.

I'm now very rusty on all this but I'll do my best!  I suspect I'll
just be following you through the code.

> While doing IOMMU superpage work I was, just in the background,
> considering in how far the superpage re-coalescing to be used there
> couldn't be re-used for P2M / EPT / NPT. Which got me to think about
> shadow mode's using of p2m-pt.c: That's purely software use of the
> tables in that case, isn't it? In which case hardware support for
> superpages shouldn't matter at all.

ISTR at the time we used the same table for p2m and NPT.
If that's gone away, then yes, we could have superpages
in the p2m without caring about hardware support.

> The only place where I could spot that P2M superpages would actually
> make a difference to shadow code was sh_unshadow_for_p2m_change().
> That one appears to have been dealing with 2M pages (more below)
> already at the time of 0ca1669871f8a ("P2M: check whether hap mode
> is enabled before using 2mb pages"), so I wonder what "potential
> errors when hap is disabled" this commit's description might be
> talking about.

Sorry, I have no idea what the "potential errors" were here. :(

> As to sh_unshadow_for_p2m_change()'s readiness for at least 2Mb
> pages: The 4k page handling there differs from the 2M one primarily
> in the p2m type checks: "p2m_is_valid(...) || p2m_is_grant(...)"
> vs "p2m_is_valid(...)" plus later "!p2m_is_ram(...)", the first
> three acting on the type taken from the old PTE, while the latter
> acts on the type in the new (split) PTEs.

So I think the type tests on the old entry are correct - as you say,
p2m_is_grant() only applies to 4k entries and otherwise they're the
same.

> Shouldn't the exact same
> checks be used in both cases (less the p2m_is_grant() check which
> can't be true for superpages)? IOW isn't !p2m_is_ram() at least
> superfluous (and perhaps further redundant with the subsequent
> !mfn_eq(l1e_get_mfn(npte[i]), omfn))? Instead I'm missing an
> entry-is-present check, without which l1e_get_mfn(npte[i]) looks
> risky at best. Or is p2m_is_ram() considered a sufficient
> replacement, making assumptions on the behavior of a lot of other
> code?

AFAICT, p2m_is_ram(p2m_flags_to_type(l1e_get_flags(npte[i]))) implies
that npte[i] has a valid MFN in it, but I agree that it would be better
to check _PAGE_PRESENT.

And I think in general it would make sense to factor out the old->new
checks and make them the same between the L1 and L2.  I don't see
anything obviously wrong with the current code but the refactored code
would be more obviously right.
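
Something along these lines, say (a sketch only; the helper name is
made up and the exact predicate would need checking against both
existing paths):

    static bool entry_needs_unshadow(l1_pgentry_t old, l1_pgentry_t new)
    {
        p2m_type_t ot = p2m_flags_to_type(l1e_get_flags(old));

        /* Only entries the p2m lookups would have handed out can
         * have been shadowed. */
        if ( !p2m_is_valid(ot) && !p2m_is_grant(ot) )
            return false;

        /* Unshadow when the mapping disappears or moves. */
        return !(l1e_get_flags(new) & _PAGE_PRESENT) ||
               !mfn_eq(l1e_get_mfn(new), l1e_get_mfn(old));
    }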

> The 2M logic also first checks _PAGE_PRESENT (and _PAGE_PSE), while
> the 4k logic appears to infer that the old page was present from
> p2m_is_{valid,grant}().

I think the p2m_type check is the right one (rather than
_PAGE_PRESENT), since that's the one that the p2m lookups will obey
when causing things to get shadowed.  But the extra _PAGE_PSE check
should stay.

> And isn't this 2M page handling code (because of the commit pointed
> at above) dead right now anyway? If not, where would P2M superpages
> come from?

Yep, that sounds right - p2m_set_entry still only sets 4k entries on
shadow domains, so I think this code is dead right now.  If you had
asked me I would not have remembered that was the case.

Cheers,

Tim.



Re: [PATCH 4/4] x86/PV: properly set shadow allocation for Dom0

2021-08-31 Thread Tim Deegan
At 15:03 +0200 on 30 Aug (1630335824), Jan Beulich wrote:
> Leaving shadow setup just to the L1TF tasklet means running Dom0 on a
> minimally acceptable shadow memory pool, rather than what normally
> would be used (also, for example, for PVH). Populate the pool before
> triggering the tasklet, on a best effort basis (again like done for
> PVH).
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH] x86/shadow: drop callback_mask pseudo-variables

2021-07-03 Thread Tim Deegan
At 08:42 +0200 on 30 Jun (1625042541), Jan Beulich wrote:
> In commit 90629587e16e ("x86/shadow: replace stale literal numbers in
> hash_{vcpu,domain}_foreach()") I had to work around a clang shortcoming
> (if you like), leveraging that gcc tolerates static const variables in
> otherwise integer constant expressions. Roberto suggests that we'd
> better not rely on such behavior. Drop the involved static const-s,
> using their "expansions" in both of the prior use sites each. This then
> allows dropping the short-circuiting of the check for clang.
> 
> Requested-by: Roberto Bagnara 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: Invalid _Static_assert expanded from HASH_CALLBACKS_CHECK

2021-06-07 Thread Tim Deegan
Hi,

At 08:45 +0200 on 31 May (1622450756), Jan Beulich wrote:
> On 28.05.2021 17:44, Tim Deegan wrote:
> > Hi,
> > 
> > At 10:58 +0200 on 25 May (1621940330), Jan Beulich wrote:
> >> On 24.05.2021 06:29, Roberto Bagnara wrote:
> >>> I stumbled upon parsing errors due to invalid uses of
> >>> _Static_assert expanded from HASH_CALLBACKS_CHECK where
> >>> the tested expression is not constant, as mandated by
> >>> the C standard.
> >>>
> >>> Judging from the following comment, there is partial awareness
> >>> of the fact this is an issue:
> >>>
> >>> #ifndef __clang__ /* At least some versions dislike some of the uses. */
> >>> #define HASH_CALLBACKS_CHECK(mask) \
> >>>  BUILD_BUG_ON((mask) > (1U << ARRAY_SIZE(callbacks)) - 1)
> >>>
> >>> Indeed, this is not a fault of Clang: the point is that some
> >>> of the expansions of this macro are not C.  Moreover,
> >>> the fact that GCC sometimes accepts them is not
> >>> something we can rely upon:
> > 
> > Well, that is unfortunate - especially since the older ad-hoc
> > compile-time assertion macros handled this kind of thing pretty well.
> > Why when I were a lad   :)
> 
> So I have to admit I don't understand: The commit introducing
> HASH_CALLBACKS_CHECK() (90629587e16e "x86/shadow: replace stale
> literal numbers in hash_{vcpu,domain}_foreach()") did not replace
> any prior compile-time checking. Hence I wonder what you're
> referring to (and hence what alternative ways of dealing with the
> situation there might be that I'm presently not seeing).

Sorry, I wasn't clear.  Before there was compiler support for
compile-time assertions, people used horrible macros that expanded to
things like int x[(p)?0:-1].  (I don't remember which exact flavour we
had in Xen.)  Those worked fine with static consts because the
predicates only had to be compile-time constant in practice, but now
they have to be constant in principle too.
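
One common flavour of those macros (possibly not the exact one Xen
had) was:

    /* Build fails if 'p' is false: the array would have negative size. */
    #define OLD_BUILD_ASSERT(p) \
        do { int __check[(p) ? 0 : -1]; (void)__check; } while ( 0 )

    static const unsigned int mask = 0x7;
    /* OLD_BUILD_ASSERT(mask <= 7) compiled even though 'mask' is not an
     * integer constant expression in the standard's sense, because gcc
     * folds it to a constant; _Static_assert(mask <= 7, "...") is
     * rejected for exactly that reason. */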

So I don't think there was a better way of adding these assertions in
90629587e16e, I'm just generally grumbling that the official
compile-time assertions are not quite as useful as the hacks they
replaced.

And I am definitely *not* suggesting that we go back to those kind of
hacks just to get around the compiler's insistence on the letter of
the law. :)

Cheers,

Tim.



[PATCH] MAINTAINERS: adjust x86/mm/shadow maintainers

2021-06-03 Thread Tim Deegan
Better reflect reality: Andrew and Jan are active maintainers
and I review patches.  Keep myself as a reviewer so I can help
with historical context.

Signed-off-by: Tim Deegan 
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git MAINTAINERS MAINTAINERS
index d46b08a0d2..09a92534bf 100644
--- MAINTAINERS
+++ MAINTAINERS
@@ -591,7 +591,9 @@ F:  xen/arch/x86/mm/mem_sharing.c
 F: tools/tests/mem-sharing/
 
 X86 SHADOW PAGETABLES
-M: Tim Deegan 
+M: Jan Beulich 
+M: Andrew Cooper 
+R: Tim Deegan 
 S: Maintained
 F: xen/arch/x86/mm/shadow/
 
-- 
2.26.2




Re: Invalid _Static_assert expanded from HASH_CALLBACKS_CHECK

2021-05-28 Thread Tim Deegan
Hi,

At 10:58 +0200 on 25 May (1621940330), Jan Beulich wrote:
> On 24.05.2021 06:29, Roberto Bagnara wrote:
> > I stumbled upon parsing errors due to invalid uses of
> > _Static_assert expanded from HASH_CALLBACKS_CHECK where
> > the tested expression is not constant, as mandated by
> > the C standard.
> > 
> > Judging from the following comment, there is partial awareness
> > of the fact this is an issue:
> > 
> > #ifndef __clang__ /* At least some versions dislike some of the uses. */
> > #define HASH_CALLBACKS_CHECK(mask) \
> >  BUILD_BUG_ON((mask) > (1U << ARRAY_SIZE(callbacks)) - 1)
> > 
> > Indeed, this is not a fault of Clang: the point is that some
> > of the expansions of this macro are not C.  Moreover,
> > the fact that GCC sometimes accepts them is not
> > something we can rely upon:

Well, that is unfortunate - especially since the older ad-hoc
compile-time assertion macros handled this kind of thing pretty well.
Why when I were a lad   :)

> > Finally, I think this can be easily avoided: instead
> > of initializing a static const with a constant expression
> > and then static-asserting the static const, just static-assert
> > the constant initializer.
> 
> Well, yes, but the whole point of constructs like
> 
> HASH_CALLBACKS_CHECK(callback_mask);
> hash_domain_foreach(d, callback_mask, callbacks, gmfn);
> 
> is to make very obvious that the checked mask and the used mask
> match. Hence if anything I'd see us eliminate the static const
> callback_mask variables altogether.

That seems like a good approach.

Cheers,

Tim.



Re: [PATCH] x86/shadow: fix DO_UNSHADOW()

2021-05-24 Thread Tim Deegan
At 14:36 +0200 on 19 May (1621434982), Jan Beulich wrote:
> When adding the HASH_CALLBACKS_CHECK() I failed to properly recognize
> the (somewhat unusually formatted) if() around the call to
> hash_domain_foreach()). Gcc 11 is absolutely right in pointing out the
> apparently misleading indentation. Besides adding the missing braces,
> also adjust the two oddly formatted if()-s in the macro.
> 
> Fixes: 90629587e16e ("x86/shadow: replace stale literal numbers in hash_{vcpu,domain}_foreach()")
> Signed-off-by: Jan Beulich 

Reviewed-by: Tim Deegan 



Re: [PATCH v3 1/5] tools/debugger: Fix PAGE_SIZE redefinition error

2021-05-11 Thread Tim Deegan
At 11:35 +0300 on 10 May (1620646515), Costin Lupu wrote:
> If PAGE_SIZE is already defined in the system (e.g. in /usr/include/limits.h
> header) then gcc will trigger a redefinition error because of -Werror. This
> patch replaces usage of PAGE_* macros with KDD_PAGE_* macros in order to avoid
> confusion between control domain page granularity (PAGE_* definitions) and
> guest domain page granularity (which is what we are dealing with here).
> 
> We chose to define the KDD_PAGE_* macros instead of using XC_PAGE_* macros
> because (1) the code in kdd.c should not include any Xen headers and (2) to add
> consistency for code in both kdd.c and kdd-xen.c.
> 
> Signed-off-by: Costin Lupu 

Reviewed-by: Tim Deegan 

Thanks!

Tim.



Re: [PATCH 1/5] tools/debugger: Fix PAGE_SIZE redefinition error

2021-04-30 Thread Tim Deegan
At 14:36 +0300 on 30 Apr (1619793419), Costin Lupu wrote:
> Hi Tim,
> 
> On 4/29/21 10:58 PM, Tim Deegan wrote:
> > Hi,
> > 
> > At 15:05 +0300 on 27 Apr (1619535916), Costin Lupu wrote:
> >> If PAGE_SIZE is already defined in the system (e.g. in
> >> /usr/include/limits.h header) then gcc will trigger a redefinition error
> >> because of -Werror. This commit also protects PAGE_SHIFT definitions for
> >> keeping consistency.
> > 
> > Thanks for looking into this!  I think properly speaking we should fix
> > this by defining KDD_PAGE_SHIFT and KDD_PAGE_SIZE in kdd.h and using
> > those everywhere we currently use PAGE_SIZE/PAGE_SHIFT. in kdd.c and
> > kdd-xen.c.  If for some reason we ever ended up with a system-defined
> > PAGE_SIZE that wasn't 4096u then we would not want to use it here
> > because it would break our guest operations.
> 
> As discussed for another patch of the series, an agreed solution that
> would apply for other libs as well would be to use XC_PAGE_* macros
> instead of PAGE_* macros. I've just sent a v2 doing that. Please let me
> know if you think it would be better to introduce the KDD_PAGE_*
> definitions instead.

Sorry to be annoying, but yes, please do introduce the KDD_ versions.
All the xen-specific code in KDD lives in kdd-xen.c; kdd.c shouldn't
include any xen headers.

Cheers,

Tim.



Re: [PATCH 1/5] tools/debugger: Fix PAGE_SIZE redefinition error

2021-04-29 Thread Tim Deegan
Hi,

At 15:05 +0300 on 27 Apr (1619535916), Costin Lupu wrote:
> If PAGE_SIZE is already defined in the system (e.g. in
> /usr/include/limits.h header) then gcc will trigger a redefinition error
> because of -Werror. This commit also protects PAGE_SHIFT definitions for
> keeping consistency.

Thanks for looking into this!  I think properly speaking we should fix
this by defining KDD_PAGE_SHIFT and KDD_PAGE_SIZE in kdd.h and using
those everywhere we currently use PAGE_SIZE/PAGE_SHIFT. in kdd.c and
kdd-xen.c.  If for some reason we ever ended up with a system-defined
PAGE_SIZE that wasn't 4096u then we would not want to use it here
because it would break our guest operations.
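
Concretely, something like this in kdd.h (the KDD_* names are the ones
suggested above; the values follow from x86 guest pages):

    /* Guest page granularity: kdd deals only in 4k x86 guest pages,
     * whatever the build host's PAGE_SIZE may be. */
    #define KDD_PAGE_SHIFT 12
    #define KDD_PAGE_SIZE  (1u << KDD_PAGE_SHIFT) /* 4096u */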

Cheers,

Tim



Re: [PATCH v4 1/3] VMX: use a single, global APIC access page

2021-04-26 Thread Tim Deegan
At 16:42 +0200 on 23 Apr (1619196141), Jan Beulich wrote:
> On 23.04.2021 16:17, Roger Pau Monné wrote:
> > On Fri, Apr 23, 2021 at 12:52:57PM +0200, Jan Beulich wrote:
> >> +if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) &&
> > 
> > Nit: I would prefer if assigned mfn outside of the condition, like
> > it's done in the chunk added to shadow_get_page_from_l1e.
> 
> Well, I did it differently here because the variable really is
> only needed inside the if(), whereas in "get" the subsequent
> patches use it elsewhere as well. I'll wait what Tim says.

No strong feelings on this, but since you asked me, I would also
prefer it to be outside the condition.

Cheers,

Tim.



Re: [PATCH v4 1/3] VMX: use a single, global APIC access page

2021-04-26 Thread Tim Deegan
At 12:52 +0200 on 23 Apr (1619182377), Jan Beulich wrote:
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.
> 
> Signed-off-by: Jan Beulich 

Looks good, thanks for the changes!

Acked-by: Tim Deegan 



Re: [PATCH v4] VMX: use a single, global APIC access page

2021-04-22 Thread Tim Deegan
At 11:38 +0200 on 22 Apr (1619091522), Jan Beulich wrote:
> On 22.04.2021 09:42, Tim Deegan wrote:
> > At 13:25 +0200 on 19 Apr (1618838726), Jan Beulich wrote:
> >> On 17.04.2021 21:24, Tim Deegan wrote:
> >>> At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> >>>> --- a/xen/arch/x86/mm/shadow/set.c
> >>>> +++ b/xen/arch/x86/mm/shadow/set.c
> >>>> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
> >>>>  ASSERT(!sh_l1e_is_magic(sl1e));
> >>>>  ASSERT(shadow_mode_refcounts(d));
> >>>>  
> >>>> +/*
> >>>> + * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> >>>> + * get accessed, and hence there's no need to refcount it (and refcounting
> >>>> + * would fail, due to the page having no owner).
> >>>> + */
> >>>> +if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> >>>
> >>> Would it be better to check specifically for mfn == apic_access_mfn
> >>> (and apic_access_mfn != 0, I guess)?
> >>
> >> Roger did ask about the same - I neither want to expose apic_access_mfn
> >> outside its CU, nor do I want to introduce an accessor function. Both
> >> feel like layering violations to me.
> > 
> > I think that this is even more of a layering violation: what we
> > actually want is to allow un-refcounted mappings of the
> > apic_access_mfn, but to do it we're relying on an internal
> > implementation detail (that it happens to be un-owned and PGC_extra)
> > rather than giving ourselves an API.
> > 
> > And so we're tangled up talking about how to write comments to warn
> > our future selves about the possible side-effects.
> > 
> >>>  If we want this behaviour
> >>> for all un-owned PGC_extra MFNs it would be good to explain that in the
> >>> comments.
> >>
> >> This is hard to tell without knowing which (or even if) further such
> >> PGC_extra pages will appear. Hence any comment to that effect would be
> >> guesswork at best. Of course I can add e.g. "Other pages with the same
> >> properties would be treated the same", if that's what you're after?
> > 
> > If you want to go this way there should be a comment here saying that
> > we're allowing this for all PGC_extra pages because we need it for
> > apic_access_mfn, and a comment at PGC_extra saying that it has this
> > effect.
> 
> So (along with a comment to this effect) how about I make
> page_suppress_refcounting() and page_refcounting_suppressed() helpers?
> The former would set PGC_extra on the page and assert the page has no
> owner, while the latter would subsume the checks done here.

That sounds good to me.
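
For concreteness, a sketch of what such helpers might look like (names
as proposed above; the bodies are assumptions):

    static inline void page_suppress_refcounting(struct page_info *pg)
    {
        ASSERT(!page_get_owner(pg));
        pg->count_info |= PGC_extra;
    }

    static inline bool page_refcounting_suppressed(const struct page_info *pg)
    {
        return !page_get_owner(pg) && (pg->count_info & PGC_extra);
    }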

> The only
> question then is what to do with the ASSERT(type == p2m_mmio_direct):
> That's still a property of the APIC access MFN which may or may not
> hold for future such pages. (It can't be part of the new helper anyway
> as "put" doesn't have the type available.)

I think we might drop that assertion, since the new mechanism would be
more general.

Cheers,

Tim.



Re: [PATCH 6/7] x86/shadow: Make _shadow_prealloc() compile at -Og

2021-04-22 Thread Tim Deegan
At 15:01 +0100 on 19 Apr (1618844491), Andrew Cooper wrote:
> When compiling at -Og:
> 
>   In file included from /builds/xen-project/people/andyhhp/xen/xen/include/asm/domain.h:4:0,
>from /builds/xen-project/people/andyhhp/xen/xen/include/xen/domain.h:8,
>from /builds/xen-project/people/andyhhp/xen/xen/include/xen/sched.h:11,
>from /builds/xen-project/people/andyhhp/xen/xen/include/xen/ioreq.h:22,
>from common.c:23:
>   common.c: In function '_shadow_prealloc':
>   /builds/xen-project/people/andyhhp/xen/xen/include/xen/mm.h:252:55: error: 't' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>return page != head->next ? pdx_to_page(page->list.prev) : NULL;
>  ^
>   common.c:933:28: note: 't' was declared here
>struct page_info *sp, *t;
>   ^
> 
> I'm not certain the analysis is correct.  't' is a temporary variable, and is
> clearly initialised before use in foreach_pinned_shadow().  Either way,
> initialising it to NULL placates the compiler.

Yeah, this analysis seems wrong to me too - if nothing else, why does
it not complain about the identical code in shadow_blow_tables() below?

That said, since the non-debug build doesn't complain here, presumably it
will be able to elide this dead store.

Acked-by: Tim Deegan 



Re: [PATCH v4] VMX: use a single, global APIC access page

2021-04-22 Thread Tim Deegan
At 13:25 +0200 on 19 Apr (1618838726), Jan Beulich wrote:
> On 17.04.2021 21:24, Tim Deegan wrote:
> > At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> >> By making this page global, we also eliminate the need to refcount it,
> >> or to assign it to any domain in the first place.
> > 
> > What is the aim here?  To save 4k per domain?  It seems to come out
> > about even for adding and removing code. 
> 
> True, but still it looks wrong to me to use a page per guest when one
> her host suffices. Think about many tiny, short-lived VMs (as in
> Tamas'es VM forking).

OK, fair enough.

> >> --- a/xen/arch/x86/mm/shadow/set.c
> >> +++ b/xen/arch/x86/mm/shadow/set.c
> >> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
> >>  ASSERT(!sh_l1e_is_magic(sl1e));
> >>  ASSERT(shadow_mode_refcounts(d));
> >>  
> >> +/*
> >> + * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> >> + * get accessed, and hence there's no need to refcount it (and refcounting
> >> + * would fail, due to the page having no owner).
> >> + */
> >> +if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )
> > 
> > Would it be better to check specifically for mfn == apic_access_mfn
> > (and apic_access_mfn != 0, I guess)?
> 
> Roger did ask about the same - I neither want to expose apic_access_mfn
> outside its CU, nor do I want to introduce an accessor function. Both
> feel like layering violations to me.

I think that this is even more of a layering violation: what we
actually want is to allow un-refcounted mappings of the
apic_access_mfn, but to do it we're relying on an internal
implementation detail (that it happens to be un-owned and PGC_extra)
rather than giving ourselves an API.

And so we're tangled up talking about how to write comments to warn
our future selves about the possible side-effects.

> >  If we want this behaviour
> > for all un-owned PGC_extra MFNs it would be good to explain that in the
> > comments.
> 
> This is hard to tell without knowing which (or even if) further such
> PGC_extra pages will appear. Hence any comment to that effect would be
> guesswork at best. Of course I can add e.g. "Other pages with the same
> properties would be treated the same", if that's what you're after?

If you want to go this way there should be a comment here saying that
we're allowing this for all PGC_extra pages because we need it for
apic_access_mfn, and a comment at PGC_extra saying that it has this
effect.

Cheers,

Tim



Re: [PATCH v4] VMX: use a single, global APIC access page

2021-04-17 Thread Tim Deegan
Hi,

Apologies for not sending comments before, and thanks for the ping.

At 12:40 +0200 on 12 Apr (1618231248), Jan Beulich wrote:
> The address of this page is used by the CPU only to recognize when to
> access the virtual APIC page instead. No accesses would ever go to this
> page. It only needs to be present in the (CPU) page tables so that
> address translation will produce its address as result for respective
> accesses.
> 
> By making this page global, we also eliminate the need to refcount it,
> or to assign it to any domain in the first place.

What is the aim here?  To save 4k per domain?  It seems to come out
about even for adding and removing code. 

> --- a/xen/arch/x86/mm/shadow/set.c
> +++ b/xen/arch/x86/mm/shadow/set.c
> @@ -94,6 +94,22 @@ shadow_get_page_from_l1e(shadow_l1e_t sl
>  ASSERT(!sh_l1e_is_magic(sl1e));
>  ASSERT(shadow_mode_refcounts(d));
>  
> +/*
> + * VMX'es APIC access MFN is just a surrogate page.  It doesn't actually
> + * get accessed, and hence there's no need to refcount it (and refcounting
> + * would fail, due to the page having no owner).
> + */
> +if ( mfn_valid(mfn = shadow_l1e_get_mfn(sl1e)) )

Would it be better to check specifically for mfn == apic_access_mfn
(and apic_access_mfn != 0, I guess)?  If we want this behaviour
for all un-owned PGC_extra MFNs it would be good to explain that in the
comments.

Cheers,

Tim.



Re: [PATCH v2 11/12] x86/p2m: write_p2m_entry_{pre,post} hooks are HVM-only

2021-04-15 Thread Tim Deegan
At 16:13 +0200 on 12 Apr (1618244032), Jan Beulich wrote:
> Move respective shadow code to its HVM-only source file, thus making it
> possible to exclude the hooks as well. This then shows that
> shadow_p2m_init() also isn't needed in !HVM builds.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH v2 03/12] x86/mm: the gva_to_gfn() hook is HVM-only

2021-04-15 Thread Tim Deegan
At 16:07 +0200 on 12 Apr (1618243630), Jan Beulich wrote:
> As is the adjacent ga_to_gfn() one as well as paging_gva_to_gfn().
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH] x86/shadow: adjust callback arrays

2021-04-15 Thread Tim Deegan
At 18:03 +0200 on 15 Apr (1618509812), Jan Beulich wrote:
> On 15.04.2021 17:59, Tim Deegan wrote:
> > At 12:42 +0200 on 12 Apr (1618231332), Jan Beulich wrote:
> >> Some of them have entries with stale comments. Rather than correcting
> >> these comments, re-arrange how these arrays get populated, shrinking
> >> their sizes at the same time (by omitting trailing NULL entries: Use
> >> dedicated element initializers, serving the purpose of what the
> >> comments did so far. This then also makes these arrays independent of
> >> the actual ordering of the individual SH_type_*.
> >>
> >> While tightening respective ASSERT()s in hash_{vcpu,domain}_foreach(),
> >> also tighten related ones in shadow_hash_{insert,delete}().
> >>
> >> Signed-off-by: Jan Beulich 
> > 
> > Looks good, but please leave the arrays at full size.  With the full
> > array, a bug could lead to an assertion failure or NULL deref; with
> > a short array it could mean following a bogus function pointer.
> > 
> > With that change, Acked-by: Tim Deegan 
> 
> Thanks, but let me ask back about which of the two possble meanings
> of "full" you mean: Dimensioned by SH_type_unused, or dimensioned
> by SH_type_max_shadow + 1? The former would leave the arrays as they
> are now, while the latter would shrink them a little.

As they are now, please.

Cheers,

Tim.



Re: [PATCH 0/2] x86/shadow: shadow_get_page_from_l1e() adjustments

2021-04-15 Thread Tim Deegan
At 13:46 +0200 on 12 Apr (1618235218), Jan Beulich wrote:
> The aspect of the function the second patch here changes has been
> puzzling me for many years. I finally thought I ought to make an
> attempt at reducing this to just a single get_page_from_l1e()
> invocation. The first patch mainly helps readability (of the code
> in general as well as the second patch).
> 
> Note that the first patch here takes "VMX: use a single, global APIC
> access page" as a prereq; it could be re-based ahead of that one.
> 
> 1: re-use variables in shadow_get_page_from_l1e()
> 2: streamline shadow_get_page_from_l1e()

Acked-by: Tim Deegan 

Thanks,

Tim.




Re: [PATCH] x86/shadow: adjust callback arrays

2021-04-15 Thread Tim Deegan
At 12:42 +0200 on 12 Apr (1618231332), Jan Beulich wrote:
> Some of them have entries with stale comments. Rather than correcting
> these comments, re-arrange how these arrays get populated, shrinking
> their sizes at the same time (by omitting trailing NULL entries: Use
> dedicated element initializers, serving the purpose of what the
> comments did so far. This then also makes these arrays independent of
> the actual ordering of the individual SH_type_*.
> 
> While tightening respective ASSERT()s in hash_{vcpu,domain}_foreach(),
> also tighten related ones in shadow_hash_{insert,delete}().
> 
> Signed-off-by: Jan Beulich 

Looks good, but please leave the arrays at full size.  With the full
array, a bug could lead to an assertion failure or NULL deref; with
a short array it could mean following a bogus function pointer.

With that change, Acked-by: Tim Deegan 

Cheers,

Tim.

> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -1565,7 +1565,7 @@ void shadow_hash_insert(struct domain *d
>  
>  ASSERT(paging_locked_by_me(d));
>  ASSERT(d->arch.paging.shadow.hash_table);
> -ASSERT(t);
> +ASSERT(t >= SH_type_min_shadow && t <= SH_type_max_shadow);
>  
>  sh_hash_audit(d);
>  
> @@ -1590,7 +1590,7 @@ void shadow_hash_delete(struct domain *d
>  
>  ASSERT(paging_locked_by_me(d));
>  ASSERT(d->arch.paging.shadow.hash_table);
> -ASSERT(t);
> +ASSERT(t >= SH_type_min_shadow && t <= SH_type_max_shadow);
>  
>  sh_hash_audit(d);
>  
> @@ -1668,7 +1668,7 @@ static void hash_vcpu_foreach(struct vcp
>  {
>  if ( callback_mask & (1 << x->u.sh.type) )
>  {
> -ASSERT(x->u.sh.type < SH_type_unused);
> +ASSERT(x->u.sh.type <= SH_type_max_shadow);
>  ASSERT(callbacks[x->u.sh.type] != NULL);
>  done = callbacks[x->u.sh.type](v, page_to_mfn(x),
> callback_mfn);
> @@ -1715,7 +1715,7 @@ static void hash_domain_foreach(struct d
>  {
>  if ( callback_mask & (1 << x->u.sh.type) )
>  {
> -ASSERT(x->u.sh.type < SH_type_unused);
> +ASSERT(x->u.sh.type <= SH_type_max_shadow);
>  ASSERT(callbacks[x->u.sh.type] != NULL);
>  done = callbacks[x->u.sh.type](d, page_to_mfn(x),
> callback_mfn);
> @@ -1819,26 +1819,16 @@ int sh_remove_write_access(struct domain
> unsigned long fault_addr)
>  {
>  /* Dispatch table for getting per-type functions */
> -static const hash_domain_callback_t callbacks[SH_type_unused] = {
> -NULL, /* none*/
> +static const hash_domain_callback_t callbacks[] = {
>  #ifdef CONFIG_HVM
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* l1_32   */
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2), /* fl1_32  */
> -NULL, /* l2_32   */
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 3), /* l1_pae  */
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 3), /* fl1_pae */
> -NULL, /* l2_pae  */
> +[SH_type_l1_32_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2),
> +[SH_type_fl1_32_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 2),
> +[SH_type_l1_pae_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 3),
> +[SH_type_fl1_pae_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 3),
>  #endif
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 4), /* l1_64   */
> -SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 4), /* fl1_64  */
> -NULL, /* l2_64   */
> -NULL, /* l2h_64  */
> -NULL, /* l3_64   */
> -NULL, /* l4_64   */
> -NULL, /* p2m */
> -NULL  /* unused  */
> +[SH_type_l1_64_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 4),
> +[SH_type_fl1_64_shadow] = SHADOW_INTERNAL_NAME(sh_rm_write_access_from_l1, 4),
>  };
> -
>  static const unsigned int callback_mask = SHF_L1_ANY | SHF_FL1_ANY;
>  struct page_info *pg = mfn_to_page(gmfn);
>  #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
> @@ -2044,26 +2034,16 @@ int sh_remove_all_mappings(struct domain
>  struct page_info *page = mfn_to_page(gmfn);
>  
>  /* Dispatch table for getting per-type functions */
> -static const hash_domain_callback_t callbacks[SH_type_unused] = {
> -NULL, /* none*/
> +static c

Re: [PATCH 10/14] tools/kdd: Use const whenever we point to literal strings

2021-04-06 Thread Tim Deegan
At 16:57 +0100 on 05 Apr (1617641829), Julien Grall wrote:
> From: Julien Grall 
> 
> Literal strings are not meant to be modified. So we should use const
> char * rather than char * when we want to store a pointer to them.
> 
> Signed-off-by: Julien Grall 

Acked-by: Tim Deegan 

Thanks,

Tim.



Re: [PATCH 03/14] xen/x86: shadow: The return type of sh_audit_flags() should be const

2021-04-06 Thread Tim Deegan
At 16:57 +0100 on 05 Apr (1617641822), Julien Grall wrote:
> From: Julien Grall 
> 
> The function sh_audit_flags() is returning pointers to literal strings.
> They should not be modified, so the return is now const and this is
> propagated to the callers.
> 
> Take the opportunity to fix the coding style in the declaration of
> sh_audit_flags.
> 
> Signed-off-by: Julien Grall 

Acked-by: Tim Deegan 

Thanks,

Tim.



Re: [PATCH 2/2][4.15?] x86/shadow: encode full GFN in magic MMIO entries

2021-03-08 Thread Tim Deegan
At 16:37 +0100 on 05 Mar (1614962265), Jan Beulich wrote:
> Since we don't need to encode all of the PTE flags, we have enough bits
> in the shadow entry to store the full GFN. Don't use literal numbers -
> instead derive the involved values. Or, where derivation would become
> too ugly, sanity-check the result (invoking #error to identify failure).
> 
> This then allows dropping from sh_l1e_mmio() again the guarding against
> too large GFNs.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

> I wonder if the respective check in sh_audit_l1_table() is actually
> useful to retain with these changes.

Yes, I think so.  We care about these PTEs being bogus for any reason,
not just the ones that this could address.

> -#define SH_L1E_MAGIC 0x0001ULL
> +#define SH_L1E_MAGIC_NR_META_BITS 4
> +#define SH_L1E_MAGIC_MASK ((~0ULL << (PADDR_BITS - PAGE_SHIFT + \
> +  SH_L1E_MAGIC_NR_META_BITS)) | \
> +   _PAGE_PRESENT)

I don't think this makes the code any more readable, TBH, but if you
prefer it that's OK.  I'd be happier with it if you added a
BUILD_BUG_ON that checks that 1ULL << (PADDR_BITS - 1) is set in the
mask, since that's the main thing we care about.
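
I.e. something like (a sketch of the suggested check):

    /* The magic mask must cover the top physical-address bit, since
     * that's what makes the entry fault on reserved bits. */
    BUILD_BUG_ON(!(SH_L1E_MAGIC_MASK & (1ULL << (PADDR_BITS - 1))));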

> -#define SH_L1E_MMIO_MAGIC   0x0001ULL
> -#define SH_L1E_MMIO_MAGIC_MASK  0x0009ULL
> -#define SH_L1E_MMIO_GFN_MASK0xfff0ULL
> +#define SH_L1E_MMIO_MAGIC   SH_L1E_MAGIC_MASK
> +#define SH_L1E_MMIO_MAGIC_BIT   ((_PAGE_PRESENT | _PAGE_RW | _PAGE_USER) + 1)

IMO this would be more readable as a straight 0x8 (or even _PAGE_PWT).
The ack stands either way.

Cheers,

Tim.




Re: [PATCH 1/2][4.15?] x86/shadow: suppress "fast fault path" optimization when running virtualized

2021-03-08 Thread Tim Deegan
At 16:37 +0100 on 05 Mar (1614962224), Jan Beulich wrote:
> We can't make correctness of our own behavior dependent upon a
> hypervisor underneath us correctly telling us the true physical address
> with hardware uses. Without knowing this, we can't be certain reserved
> bit faults can actually be observed. Therefore, besides evaluating the
> number of address bits when deciding whether to use the optimization,
> also check whether we're running virtualized ourselves.
> 
> Requested-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

I would consider this to be a bug in the underlying hypervisor, but I
agree than in practice it won't be safe to rely on it being correct.

These checks are getting fiddly now.  I think that if we end up adding
any more to them it might be good to set a read-mostly variable at boot
time rather than do them on every MMIO/NP fault.
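
E.g. along these lines (a sketch; the variable name and the exact
conditions folded into it are illustrative):

    static bool __read_mostly sh_fast_fault_path_ok;

    static int __init sh_fast_fault_init(void)
    {
        /* Reserved-bit faults can only be relied on when some physical
         * address bits are genuinely unimplemented and no underlying
         * hypervisor might be misreporting the address width. */
        sh_fast_fault_path_ok = paddr_bits < PADDR_BITS &&
                                !cpu_has_hypervisor;
        return 0;
    }
    __initcall(sh_fast_fault_init);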

Cheers,

Tim.



Re: [PATCH][4.15] x86/shadow: suppress "fast fault path" optimization without reserved bits

2021-02-26 Thread Tim Deegan
Hi,

At 14:03 +0100 on 25 Feb (1614261809), Jan Beulich wrote:
> When none of the physical address bits in PTEs are reserved, we can't
> create any 4k (leaf) PTEs which would trigger reserved bit faults. Hence
> the present SHOPT_FAST_FAULT_PATH machinery needs to be suppressed in
> this case, which is most easily achieved by never creating any magic
> entries.
> 
> To compensate a little, eliminate sh_write_p2m_entry_post()'s impact on
> such hardware.
> 
> While at it, also avoid using an MMIO magic entry when that would
> truncate the incoming GFN.
> 
> Requested-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

> I wonder if subsequently we couldn't arrange for SMEP/SMAP faults to be
> utilized instead, on capable hardware (which might well be all having
> such large a physical address width).

I don't immediately see how, since we don't control the access type
that the guest will use.

> I further wonder whether SH_L1E_MMIO_GFN_MASK couldn't / shouldn't be
> widened. I don't see a reason why it would need confining to the low
> 32 bits of the PTE - using the full space up to bit 50 ought to be fine
> (i.e. just one address bit left set in the magic mask), and we wouldn't
> even need that many to encode a 40-bit GFN (i.e. the extra guarding
> added here wouldn't then be needed in the first place).

Yes, I think it could be reduced to use just one reserved address bit.
IIRC we just used such a large mask so the magic entries would be
really obvious in debugging, and there was no need to support arbitrary
address widths for emulated devices.

Cheers,

Tim.





Re: [PATCH][4.15] x86/shadow: replace bogus return path in shadow_get_page_from_l1e()

2021-02-26 Thread Tim Deegan
At 16:08 +0100 on 26 Feb (1614355713), Jan Beulich wrote:
> Prior to be640b1800bb ("x86: make get_page_from_l1e() return a proper
> error code") a positive return value did indicate an error. Said commit
> failed to adjust this return path, but luckily the only caller has
> always been inside a shadow_mode_refcounts() conditional.
> 
> Subsequent changes caused 1 to end up at the default (error) label in
> the caller's switch() again, but the returning of 1 (== _PAGE_PRESENT)
> is still rather confusing here, and a latent risk.
> 
> Convert to an ASSERT() instead, just in case any new caller would
> appear.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH] x86/shadow: replace stale literal numbers in hash_{vcpu,domain}_foreach()

2021-01-25 Thread Tim Deegan
At 12:07 +0100 on 25 Jan (1611576438), Jan Beulich wrote:
> 15 apparently once used to be the last valid type to request a callback
> for, and the dimension of the respective array. The arrays meanwhile are
> larger than this (in a benign way, i.e. no caller ever sets a mask bit
> higher than 15), dimensioned by SH_type_unused. Have the ASSERT()s
> follow suit and add build time checks at the call sites.
> 
> Also adjust a comment naming the wrong of the two functions.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Tim Deegan 

> ---
> The ASSERT()s being adjusted look redundant with the BUILD_BUG_ON()s
> being added, so I wonder whether dropping them wouldn't be the better
> route.

I'm happy to keep both, as they do slightly different things.

Thanks for fixing this up!

Tim.



Re: [PATCH 15/17] x86/shadow: drop SH_type_l2h_pae_shadow

2021-01-22 Thread Tim Deegan
Hi,

At 17:31 +0100 on 22 Jan (1611336662), Jan Beulich wrote:
> On 22.01.2021 14:11, Tim Deegan wrote:
> > At 16:10 +0100 on 14 Jan (1610640627), Jan Beulich wrote:
> >> hash_{domain,vcpu}_foreach() have a use each of literal 15. It's not
> >> clear to me what the proper replacement constant would be, as it
> >> doesn't look as if SH_type_monitor_table was meant.
> > 
> > Good spot.  I think the '<= 15' should be replaced with '< SH_type_unused'.
> > It originally matched the callback arrays all being coded as
> > "static hash_callback_t callbacks[16]".
> 
> So are you saying this was off by one then prior to this patch
> (when SH_type_unused was still 17), albeit in apparently a
> benign way?

Yes - this assertion is just to catch overruns of the callback table,
and so it was OK for its limit to be too low.  The new types that were
added since then are never put in the hash table, so don't trigger
this assertion.

> And the comments in sh_remove_write_access(),
> sh_remove_all_mappings(), sh_remove_shadows(), and
> sh_reset_l3_up_pointers() are then wrong as well, and would
> instead better be like in shadow_audit_tables()?

Yes, it looks like those comments are also out of date where they
mention 'unused'.

> Because of this having been benign (due to none of the callback
> tables specifying non-NULL entries there), wouldn't it make
> sense to dimension the tables by SH_type_max_shadow + 1 only?
> Or would you consider this too risky?

Yes, I think that would be fine, also changing '<= 15' to
'<= SH_type_max_shadow'.  Maybe add a matching
ASSERT(t <= SH_type_max_shadow) in shadow_hash_insert as well?

Cheers,

Tim.



Re: [PATCH 00/17] x86/PV: avoid speculation abuse through guest accessors plus ...

2021-01-22 Thread Tim Deegan
At 16:01 +0100 on 14 Jan (1610640090), Jan Beulich wrote:
> ... shadow adjustments towards not building 2- and 3-level code
> when !HVM. While the latter isn't functionally related to the
> former, it depends on some of the rearrangements done there.

The shadow changes look good, thank you!
Reviewed-by: Tim Deegan 

I have read the uaccess stuff in passing and it looks OK to me too,
but I didn't review it in detail.

Cheers,

Tim.




Re: [PATCH 15/17] x86/shadow: drop SH_type_l2h_pae_shadow

2021-01-22 Thread Tim Deegan
Hi,

At 16:10 +0100 on 14 Jan (1610640627), Jan Beulich wrote:
> This is a remnant from 32-bit days, having no place anymore where a
> shadow of this type would be created.
> 
> Signed-off-by: Jan Beulich 
> ---
> hash_{domain,vcpu}_foreach() have a use each of literal 15. It's not
> clear to me what the proper replacement constant would be, as it
> doesn't look as if SH_type_monitor_table was meant.

Good spot.  I think the '<= 15' should be replaced with '< SH_type_unused'.
It originally matched the callback arrays all being coded as
"static hash_callback_t callbacks[16]".

Cheers,

Tim



Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook

2020-11-12 Thread Tim Deegan
At 15:04 +0100 on 12 Nov (1605193496), Jan Beulich wrote:
> On 12.11.2020 14:07, Roger Pau Monné wrote:
> > On Thu, Nov 12, 2020 at 01:29:33PM +0100, Jan Beulich wrote:
> >> I agree with all this. If only it was merely about TLB flushes. In
> >> the shadow case, shadow_blow_all_tables() gets invoked, and that
> >> one - looking at the other call sites - wants the paging lock held.
[...]
> > The post hook for shadow could pick the lock again, as I don't think
> > the removal of the tables needs to be strictly done inside of the same
> > locked region?
> 
> I think it does, or else a use of the now stale tables may occur
> before they got blown away. Tim?

Is this the call to shadow_blow_tables() in the write_p2m_entry path?
I think it would be safe to drop and re-take the paging lock there as
long as the call happens before the write is considered to have
finished.

But it would not be a useful performance improvement - any update that
takes this path is going to be very slow regardless.  So unless you
have another pressing reason to split it up, I would be inclined to
leave it as it is.  That way it's easier to see that the locking is
correct.

Cheers,

Tim.



Re: [PATCH v2 9/9] x86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()

2020-11-07 Thread Tim Deegan
At 10:39 +0100 on 06 Nov (1604659168), Jan Beulich wrote:
> Accumulating transient state of d->dirty_cpumask in a local variable is
> unnecessary here: The flush is fine to make with the dirty set at the
> time of the call. With this, move the invocation to a central place at
> the end of the function.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH v2 8/9] x86/shadow: cosmetics to sh_unshadow_for_p2m_change()

2020-11-07 Thread Tim Deegan
At 10:38 +0100 on 06 Nov (1604659127), Jan Beulich wrote:
> Besides the adjustments for style
> - use switch(),
> - widen scope of commonly used variables,
> - narrow scope of other variables.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH v2 7/9] x86/p2m: pass old PTE directly to write_p2m_entry_pre() hook

2020-11-06 Thread Tim Deegan
At 10:38 +0100 on 06 Nov (1604659085), Jan Beulich wrote:
> In no case is a pointer to non-const needed. Since no pointer arithmetic
> is done by the sole user of the hook, passing in the PTE itself is quite
> fine.
> 
> While doing this adjustment also
> - drop the intermediate sh_write_p2m_entry_pre():
>   sh_unshadow_for_p2m_change() can itself be used as the hook function,
>   moving the conditional into there,
> - introduce a local variable holding the flags of the old entry.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 




Re: Ping: [PATCH v3 0/3] x86: shim building adjustments (plus shadow follow-on)

2020-10-29 Thread Tim Deegan
At 14:40 +0100 on 29 Oct (1603982415), Jan Beulich wrote:
> Tim,
> 
> unless you tell me otherwise I'm intending to commit the latter
> two with Roger's acks some time next week. Of course it would
> still be nice to know your view on the first of the TBDs in
> patch 3 (which may result in further changes, in a follow-up).

Ack, sorry for the dropped patches, and thank you for the ping.  I've
now replied to everything that I think is wating for my review.

Cheers,

Tim.



Re: [PATCH] x86/shadow: correct GFN use by sh_unshadow_for_p2m_change()

2020-10-29 Thread Tim Deegan
At 16:42 +0100 on 28 Oct (1603903365), Jan Beulich wrote:
> Luckily sh_remove_all_mappings()'s use of the parameter is limited to
> generation of log messages. Nevertheless we'd better pass correct GFNs
> around:
> - the incoming GFN, when replacing a large page, may not be large page
>   aligned,
> - incrementing by page-size-scaled values can't be right.
> 
> Signed-off-by: Jan Beulich 

Reviewed-by: Tim Deegan 

Thanks!

Tim.



Re: [PATCH 5/5] x86/p2m: split write_p2m_entry() hook

2020-10-29 Thread Tim Deegan
At 10:24 +0100 on 28 Oct (1603880693), Jan Beulich wrote:
> Fair parts of the present handlers are identical; in fact
> nestedp2m_write_p2m_entry() lacks a call to p2m_entry_modify(). Move
> common parts right into write_p2m_entry(), splitting the hooks into a
> "pre" one (needed just by shadow code) and a "post" one.
> 
> For the common parts moved I think that the p2m_flush_nestedp2m() is,
> at least from an abstract perspective, also applicable in the shadow
> case. Hence it doesn't get a 3rd hook put in place.
> 
> The initial comment that was in hap_write_p2m_entry() gets dropped: Its
> placement was bogus, and looking back at the commit introducing it
> (dd6de3ab9985 "Implement Nested-on-Nested") I can't see either what use
> of a p2m it was meant to be associated with.
> 
> Signed-off-by: Jan Beulich 

This seems like a good approach to me.  I'm happy with the shadow
parts but am not confident enough on nested p2m any more to have an
opinion on that side. 

Acked-by: Tim Deegan 




Re: [PATCH 2/5] x86/p2m: collapse the two ->write_p2m_entry() hooks

2020-10-29 Thread Tim Deegan
At 10:22 +0100 on 28 Oct (1603880578), Jan Beulich wrote:
> The struct paging_mode instances get set to the same functions
> regardless of mode by both HAP and shadow code, hence there's no point
> having this hook there. The hook also doesn't need moving elsewhere - we
> can directly use struct p2m_domain's. This merely requires (from a
> strictly formal pov; in practice this may not even be needed) making
> sure we don't end up using safe_write_pte() for nested P2Ms.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH v3 3/3] x86/shadow: sh_{make,destroy}_monitor_table() are "even more" HVM-only

2020-10-29 Thread Tim Deegan
At 10:45 +0200 on 19 Oct (1603104300), Jan Beulich wrote:
> With them depending on just the number of shadow levels, there's no need
> for more than one instance of them, and hence no need for any hook (IOW
> 452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite
> far enough). Move the functions to hvm.c while dropping the dead
> is_pv_32bit_domain() code paths.
>
> While moving the code, replace a stale comment reference to
> sh_install_xen_entries_in_l4(). Doing so made me notice the function
> also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm:
> Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which
> gets done here as well.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

> TBD: In principle both functions could have their first parameter
>  constified. In fact, "destroy" doesn't depend on the vCPU at all
>  and hence could be passed a struct domain *. Not sure whether such
>  an asymmetry would be acceptable.
>  In principle "make" would also not need passing of the number of
>  shadow levels (can be derived from v), which would result in yet
>  another asymmetry.
>  If these asymmetries were acceptable, "make" could then also update
>  v->arch.hvm.monitor_table, instead of doing so at both call sites.

Feel free to add consts, but please don't change the function
parameters any more than that.  I would rather keep them as a matched
pair, and leave the hvm.monitor_table updates in the caller, where
it's easier to see why they're not symmetrical.

Cheers

Tim.



Re: [PATCH v3 2/3] x86/shadow: refactor shadow_vram_{get,put}_l1e()

2020-10-29 Thread Tim Deegan
At 10:44 +0200 on 19 Oct (1603104271), Jan Beulich wrote:
> By passing the functions an MFN and flags, only a single instance of
> each is needed; they were pretty large for being inline functions
> anyway.
> 
> While moving the code, also adjust coding style and add const where
> sensible / possible.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH 0/2] x86/mm: {paging,sh}_{cmpxchg,write}_guest_entry() adjustments

2020-10-03 Thread Tim Deegan
At 13:56 +0200 on 28 Sep (1601301371), Jan Beulich wrote:
> 1: {paging,sh}_{cmpxchg,write}_guest_entry() cannot fault
> 2: remove some indirection from {paging,sh}_cmpxchg_guest_entry()

Acked-by: Tim Deegan 



Re: [PATCH] x86/hvm: Clean up track_dirty_vram() calltree

2020-07-26 Thread Tim Deegan
At 16:15 +0100 on 22 Jul (1595434548), Andrew Cooper wrote:
>  * Rename nr to nr_frames.  A plain 'nr' is confusing to follow in the
>lower levels.
>  * Use DIV_ROUND_UP() rather than opencoding it in several different ways
>  * The hypercall input is capped at uint32_t, so there is no need for
>nr_frames to be unsigned long in the lower levels.
> 
> No functional change.
> 
> Signed-off-by: Andrew Cooper 

Acked-by: Tim Deegan 



Re: [PATCH 5/5] x86/shadow: l3table[] and gl3e[] are HVM only

2020-07-20 Thread Tim Deegan
At 10:55 +0200 on 20 Jul (1595242521), Jan Beulich wrote:
> On 18.07.2020 20:20, Tim Deegan wrote:
> > At 12:00 +0200 on 15 Jul (1594814409), Jan Beulich wrote:
> >> ... by the very fact that they're 3-level specific, while PV always gets
> >> run in 4-level mode. This requires adding some seemingly redundant
> >> #ifdef-s - some of them will be possible to drop again once 2- and
> >> 3-level guest code doesn't get built anymore in !HVM configs, but I'm
> >> afraid there's still quite a bit of disentangling work to be done to
> >> make this possible.
> >>
> >> Signed-off-by: Jan Beulich 
> > 
> > Looks good.  It seems like the new code for '3-level non-HVM' in
> > guest-walks ought to have some sort of assert-unreachable in them too
> > - or is there a reason not to?
> 
> You mean this piece of code
> 
> +#elif !defined(CONFIG_HVM)
> +(void)root_gfn;
> +memset(gw, 0, sizeof(*gw));
> +return false;
> +#else /* PAE */
> 
> If so - sure, ASSERT_UNREACHABLE() could be added there. It simply
> didn't occur to me. I take it your ack for the entire series holds
> here with this addition.

Yes, it does.  Thanks!

Tim.



Re: [PATCH 0/5] x86: mostly shadow related XSA-319 follow-up

2020-07-18 Thread Tim Deegan
At 11:56 +0200 on 15 Jul (1594814214), Jan Beulich wrote:
> This in particular goes a few small steps further towards proper
> !HVM and !PV config handling (i.e. no carrying of unnecessary
> baggage).
> 
> 1: x86/shadow: dirty VRAM tracking is needed for HVM only
> 2: x86/shadow: shadow_table[] needs only one entry for PV-only configs
> 3: x86/PV: drop a few misleading paging_mode_refcounts() checks
> 4: x86/shadow: have just a single instance of sh_set_toplevel_shadow()
> 5: x86/shadow: l3table[] and gl3e[] are HVM only

I sent a question on #5 separatly; otherwise these all seem good to
me, thank you!

Acked-by: Tim Deegan 



Re: [PATCH 5/5] x86/shadow: l3table[] and gl3e[] are HVM only

2020-07-18 Thread Tim Deegan
At 12:00 +0200 on 15 Jul (1594814409), Jan Beulich wrote:
> ... by the very fact that they're 3-level specific, while PV always gets
> run in 4-level mode. This requires adding some seemingly redundant
> #ifdef-s - some of them will be possible to drop again once 2- and
> 3-level guest code doesn't get built anymore in !HVM configs, but I'm
> afraid there's still quite a bit of disentangling work to be done to
> make this possible.
> 
> Signed-off-by: Jan Beulich 

Looks good.  It seems like the new code for '3-level non-HVM' in
guest-walks ought to have some sort of assert-unreachable in them too
- or is there a reason not to?

Cheers,

Tim.




Re: [PATCH for-4.14] kdd: fix build again

2020-07-03 Thread Tim Deegan
At 20:10 +0000 on 03 Jul (1593807001), Wei Liu wrote:
> Restore Tim's patch. The one that was committed was recreated by me
> because git didn't accept my saved copy. I made some mistakes while
> recreating that patch and here we are.
> 
> Fixes: 3471cafbdda3 ("kdd: stop using [0] arrays to access packet contents")
> Reported-by: Michael Young 
> Signed-off-by: Wei Liu 

Reviewed-by: Tim Deegan 

Thanks!

Tim.



Re: Build problems in kdd.c with xen-4.14.0-rc4

2020-07-03 Thread Tim Deegan
Hi Michael,

Thanks for the report!

At 23:21 +0100 on 30 Jun (1593559296), Michael Young wrote:
> I get the following errors when trying to build xen-4.14.0-rc4
> 
> kdd.c: In function 'kdd_tx':
> kdd.c:754:15: error: array subscript 16 is above array bounds of 
> 'uint8_t[16]' {aka 'unsigned char[16]'} [-Werror=array-bounds]
>754 | s->txb[len++] = 0xaa;
>| ~~^~~
> kdd.c:82:17: note: while referencing 'txb'
> 82 | uint8_t txb[sizeof (kdd_hdr)];   /* Marshalling area 
> for tx */
>| ^~~
> kdd.c: In function 'kdd_break':
> kdd.c:819:19: error: array subscript 16 is above array bounds of 
> 'uint8_t[16]' {aka 'unsigned char[16]'} [-Werror=array-bounds]

Oh dear.  The fix for the last kdd bug seems to have gone wrong
somewhere.  The patch I posted has:

-uint8_t txb[sizeof (kdd_hdr) + 65536];   /* Marshalling area for tx */
+uint8_t txb[sizeof (kdd_pkt)];   /* Marshalling area for tx */

but as applied in master it's:

-uint8_t txb[sizeof (kdd_hdr) + 65536];   /* Marshalling area for tx */
+uint8_t txb[sizeof (kdd_hdr)];   /* Marshalling area for tx */

i.e. the marshalling area is only large enough for a header and GCC
is correctly complaining about that.

Wei, it looks like you committed this patch - can you figure out what
happened to it please?

Cheers,

Tim.



Re: [PATCH v1] kdd: remove zero-length arrays

2020-06-10 Thread Tim Deegan
At 15:22 +0200 on 09 Jun (1591716153), Olaf Hering wrote:
> On Tue, 9 Jun 2020 13:15:49 +0100, Tim Deegan wrote:
> 
> > Olaf, can you try dropping the 'payload' field from the header and 
> > replacing the payload[0] in pkt with payload[] ?
> 
> In file included from kdd.c:53:
> kdd.h:325:17: error: flexible array member in union
>   325 | uint8_t payload[];

How tedious.  Well, the only place we actually allocate one of these
we already leave enough space for a max-size packet, so how about
this?

kdd: stop using [0] arrays to access packet contents.

GCC 10 is unhappy about this, and we already use 64k buffers
in the only places where packets are allocated, so move the
64k size into the packet definition.

Reported-by: Olaf Hering 
Signed-off-by: Tim Deegan 
---
 tools/debugger/kdd/kdd.c | 4 ++--
 tools/debugger/kdd/kdd.h | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git tools/debugger/kdd/kdd.c tools/debugger/kdd/kdd.c
index 3ebda9b12c..a7d0976ea4 100644
--- tools/debugger/kdd/kdd.c
+++ tools/debugger/kdd/kdd.c
@@ -79,11 +79,11 @@ typedef struct {
 /* State of the debugger stub */
 typedef struct {
 union {
-uint8_t txb[sizeof (kdd_hdr) + 65536];   /* Marshalling area for tx */
+uint8_t txb[sizeof (kdd_pkt)];   /* Marshalling area for tx */
 kdd_pkt txp; /* Also readable as a packet structure */
 };
 union {
-uint8_t rxb[sizeof (kdd_hdr) + 65536];   /* Marshalling area for rx */
+uint8_t rxb[sizeof (kdd_pkt)];   /* Marshalling area for rx */
 kdd_pkt rxp; /* Also readable as a packet structure */
 };
 unsigned int cur;   /* Offset into rx where we'll put the next byte */
diff --git tools/debugger/kdd/kdd.h tools/debugger/kdd/kdd.h
index bfb00ba5c5..b9a17440df 100644
--- tools/debugger/kdd/kdd.h
+++ tools/debugger/kdd/kdd.h
@@ -68,7 +68,6 @@ typedef struct {
 uint16_t len; /* Payload length, excl. header and trailing byte */
 uint32_t id;  /* Echoed in responses */
 uint32_t sum; /* Unsigned sum of all payload bytes */
-uint8_t payload[0];
 } PACKED kdd_hdr;
 
 #define KDD_PKT_CMD 0x0002  /* Debugger commands (and replies to them) */
@@ -323,7 +322,7 @@ typedef struct {
 kdd_msg msg;
 kdd_reg reg;
 kdd_stc stc;
-uint8_t payload[0];
+uint8_t payload[65536];
 };
 } PACKED kdd_pkt;
 
-- 
2.26.2
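
[A compile-time expression of the "enough space" argument above -- a
sketch assuming C11 static_assert and the kdd.h types, not part of the
patch:]

#include <assert.h>
#include "kdd.h"

/* With payload[65536] inside kdd_pkt, marshalling areas sized by
 * sizeof (kdd_pkt) can never be smaller than the old header + 64KiB. */
static_assert(sizeof (kdd_pkt) >= sizeof (kdd_hdr) + 65536,
              "tx/rx marshalling areas must hold a max-size packet");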




Re: [PATCH v1] kdd: remove zero-length arrays

2020-06-09 Thread Tim Deegan
Hi,

At 09:55 +0100 on 09 Jun (1591696552), Paul Durrant wrote:
> > -Original Message-
> > From: Xen-devel  On Behalf Of Olaf 
> > Hering
> > Sent: 08 June 2020 21:39
> > To: xen-devel@lists.xenproject.org
> > Cc: Ian Jackson ; Olaf Hering ; 
> > Tim Deegan ;
> > Wei Liu 
> > Subject: [PATCH v1] kdd: remove zero-length arrays
> > 
> > Struct 'kdd_hdr' already has a member named 'payload[]' to easily
> > refer to the data after the header.  Remove the payload member from
> > 'kdd_pkt' and always use 'kdd_hdr' to fix the following compile error:
> 
> Is it not sufficient to just change the declaration of payload in kdd_pkt 
> from [0] to []? In fact I can't see any use of the payload
> field in kdd_hdr so it looks like that is the one that ought to be dropped.

Yes, if one of these has to go, it should be the one in the header,
since it's not used and the one in the packet is neatly in the union
with the other descriptions of the packet payload.

I'm not currently in a position to test patches, but might be later in
the week.  Olaf, can you try dropping the 'payload' field from the
header and replacing the payload[0] in pkt with payload[] ?

Cheers,

Tim.



Re: [PATCH] x86/shadow: Reposition sh_remove_write_access_from_sl1p()

2020-05-24 Thread Tim Deegan
At 10:04 +0100 on 21 May (1590055468), Andrew Cooper wrote:
> When compiling with SHOPT_OUT_OF_SYNC disabled, the build fails with:
> 
>   common.c:41:12: error: ‘sh_remove_write_access_from_sl1p’ declared ‘static’ 
> but never defined [-Werror=unused-function]
>static int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
>   ^~~~
> 
> due to an unguarded forward declaration.
> 
> It turns out there is no need to forward declare
> sh_remove_write_access_from_sl1p() to begin with, so move it to just ahead of
> its first users, which is within a larger #ifdef'd SHOPT_OUT_OF_SYNC block.
> 
> Fix up for style while moving it.  No functional change.
> 
> Signed-off-by: Andrew Cooper 

Thank you!  This is fine, either as-is or with the suggested change to
a switch.

Reviewed-by: Tim Deegan 




Re: [PATCH] x86/traps: Rework #PF[Rsvd] bit handling

2020-05-20 Thread Tim Deegan
At 18:09 +0200 on 19 May (1589911795), Jan Beulich wrote:
> static inline int sh_type_has_up_pointer(struct domain *d, unsigned int t)
> {
> /* Multi-page shadows don't have up-pointers */
> if ( t == SH_type_l1_32_shadow
>  || t == SH_type_fl1_32_shadow
>  || t == SH_type_l2_32_shadow )
> return 0;
> /* Pinnable shadows don't have up-pointers either */
> return !sh_type_is_pinnable(d, t);
> }
> 
> It's unclear to me in which way SH_type_l1_32_shadow and
> SH_type_l2_32_shadow are "multi-page" shadows; I'd rather have
> expected all three SH_type_fl1_*_shadow to be. Tim?

They are multi-page in the sense that the shadow itself is more than a
page long (because it needs to have 1024 8-byte entries).

The FL1 shadows are the same size as their equivalent L1s.

Tim.
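
[To make the size arithmetic concrete -- a standalone sketch in plain C,
not Xen source:]

#include <stdio.h>

int main(void)
{
    /* A 32-bit guest L1/L2 has 1024 entries, and the shadow stores each
     * one in 8-byte PAE format, so it outgrows a single 4KiB page. */
    unsigned int shadow_32 = 1024 * 8;   /* 8192 bytes -> 2 pages */
    /* A PAE/64-bit guest table has 512 8-byte entries: exactly 1 page. */
    unsigned int shadow_64 = 512 * 8;    /* 4096 bytes -> 1 page */

    printf("32-bit shadow: %u pages\n", shadow_32 / 4096);
    printf("64-bit shadow: %u pages\n", shadow_64 / 4096);
    return 0;
}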



Re: [PATCH v11 1/3] x86/tlb: introduce a flush HVM ASIDs flag

2020-04-29 Thread Tim Deegan
At 11:12 +0100 on 27 Apr (1587985955), Wei Liu wrote:
> On Thu, Apr 23, 2020 at 06:33:49PM +0200, Jan Beulich wrote:
> > On 23.04.2020 16:56, Roger Pau Monne wrote:
> > > Introduce a specific flag to request a HVM guest linear TLB flush,
> > > which is an ASID/VPID tickle that forces a guest linear to guest
> > > physical TLB flush for all HVM guests.
> > > 
> > > This was previously unconditionally done in each pre_flush call, but
> > > that's not required: HVM guests not using shadow don't require linear
> > > TLB flushes as Xen doesn't modify the pages tables the guest runs on
> > > in that case (ie: when using HAP). Note that shadow paging code
> > > already takes care of issuing the necessary flushes when the shadow
> > > page tables are modified.
> > > 
> > > In order to keep the previous behavior modify all shadow code TLB
> > > flushes to also flush the guest linear to physical TLB if the guest is
> > > HVM. I haven't looked at each specific shadow code TLB flush in order
> > > to figure out whether it actually requires a guest TLB flush or not,
> > > so there might be room for improvement in that regard.
> > > 
> > > Also perform ASID/VPID flushes when modifying the p2m tables as it's a
> > > requirement for AMD hardware. Finally keep the flush in
> > > switch_cr3_cr4, as it's not clear whether code could rely on
> > > switch_cr3_cr4 also performing a guest linear TLB flush. A following
> > > patch can remove the ASID/VPID tickle from switch_cr3_cr4 if found to
> > > not be necessary.
> > > 
> > > Signed-off-by: Roger Pau Monné 
> > 
> > Reviewed-by: Jan Beulich 
> 
> Tim, ICYMI, this patch needs your ack.

Sorry!  Thanks for the reminder.

Acked-by: Tim Deegan 




Re: [PATCH v2 2/4] x86/shadow: sh_update_linear_entries() is a no-op for PV

2020-04-22 Thread Tim Deegan
At 11:11 +0200 on 21 Apr (1587467497), Jan Beulich wrote:
> Consolidate the shadow_mode_external() in here: Check this once at the
> start of the function.
> 
> Signed-off-by: Jan Beulich 
> Acked-by: Andrew Cooper 
> Acked-by: Tim Deegan 
> ---
> v2: Delete stale part of comment.
> ---
> Tim - I'm re-posting as I wasn't entirely sure whether you meant to drop
> the entire PV part of the comment, or only the last two sentences.

Looks good, thanks!

Acked-by: Tim Deegan 



Re: [PATCH] x86/shadow: make sh_remove_write_access() helper HVM only

2020-04-22 Thread Tim Deegan
At 14:05 +0200 on 21 Apr (1587477956), Jan Beulich wrote:
> Despite the inline attribute at least some clang versions warn about
> trace_shadow_wrmap_bf() being unused in !HVM builds. Include the helper
> in the #ifdef region.
> 
> Fixes: 8b8d011ad868 ("x86/shadow: the guess_wrmap() hook is needed for HVM 
> only")
> Reported-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only

2020-04-20 Thread Tim Deegan
At 15:06 +0200 on 20 Apr (1587395210), Jan Beulich wrote:
> On 18.04.2020 11:03, Tim Deegan wrote:
> > At 16:28 +0200 on 17 Apr (1587140897), Jan Beulich wrote:
> >> sh_remove_write_access() bails early for !external guests, and hence its
> >> building and thus the need for the hook can be suppressed altogether in
> >> !HVM configs.
> >>
> >> Signed-off-by: Jan Beulich 
> > 
> >> @@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
> >>  extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
> >>unsigned int level,
> >>unsigned long fault_addr);
> >> +#else
> >> +static inline int sh_remove_write_access(struct domain *d, mfn_t 
> >> readonly_mfn,
> >> + unsigned int level,
> >> + unsigned long fault_addr)
> >> +{
> > 
> > Can we have an ASSERT(!shadow_mode_refcounts(d)) here, please,
> > matching the check that would have made it a noop before?
> 
> I've added one, but I find this quite odd in a !HVM build, where
> 
> #define PG_refcounts   0
> 
> and
> 
> #define paging_mode_refcounts(_d) (!!((_d)->arch.paging.mode & PG_refcounts))
> 
> Perhaps you're wanting this mainly for documentation purposes?


Yeah, that and future-proofing.  If !HVM builds ever start using
paging_mode_refcounts then this assertion will forcibly remind us that
we need changes here.  I'm glad that it compiles away to nothing in
the meantime.
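
[For the record, the stub with the assertion folded in would read
something like this -- reconstructed from the quoted hunk, not copied
from the committed patch:]

static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
                                         unsigned int level,
                                         unsigned long fault_addr)
{
    /* Documentation / future-proofing: !HVM builds never refcount
     * shadows, and with PG_refcounts defined to 0 this compiles away. */
    ASSERT(!shadow_mode_refcounts(d));
    return 0;
}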

Cheers,

Tim.



Re: [PATCH 00/10] x86: mm (mainly shadow) adjustments

2020-04-18 Thread Tim Deegan
At 16:23 +0200 on 17 Apr (1587140581), Jan Beulich wrote:
> Large parts of this series are to further isolate pieces which
> are needed for HVM only, and hence would better not be built
> with HVM=n. But there are also a few other items which I've
> noticed along the road.

Acked-by: Tim Deegan 
with two suggestions that I've sent separately.

Cheers,

Tim.



Re: [PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only

2020-04-18 Thread Tim Deegan
At 16:28 +0200 on 17 Apr (1587140897), Jan Beulich wrote:
> sh_remove_write_access() bails early for !external guests, and hence its
> building and thus the need for the hook can be suppressed altogether in
> !HVM configs.
> 
> Signed-off-by: Jan Beulich 

> @@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
>  extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
>unsigned int level,
>unsigned long fault_addr);
> +#else
> +static inline int sh_remove_write_access(struct domain *d, mfn_t 
> readonly_mfn,
> + unsigned int level,
> + unsigned long fault_addr)
> +{

Can we have an ASSERT(!shadow_mode_refcounts(d)) here, please,
matching the check that would have made it a noop before?

Cheers,

Tim.




Re: [PATCH 04/10] x86/shadow: sh_update_linear_entries() is a no-op for PV

2020-04-18 Thread Tim Deegan
At 16:26 +0200 on 17 Apr (1587140817), Jan Beulich wrote:
> Consolidate the shadow_mode_external() in here: Check this once at the
> start of the function.
> 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3707,34 +3707,30 @@ sh_update_linear_entries(struct vcpu *v)
>   */

This looks good.  Can you please also delete the out-of-date comment
just above that talks about about PAE linear pagetables in PV guests,
to make it clear that PV guests don't need any maintenance here?

Cheers,

Tim.



Re: [Xen-devel] [PATCH v4 4/7] x86/tlb: introduce a flush guests TLB flag

2020-02-13 Thread Tim Deegan
At 18:28 +0100 on 10 Feb (1581359306), Roger Pau Monne wrote:
> Introduce a specific flag to request a HVM guest TLB flush, which is
> an ASID/VPID tickle that forces a linear TLB flush for all HVM guests.
> 
> This was previously unconditionally done in each pre_flush call, but
> that's not required: HVM guests not using shadow don't require linear
> TLB flushes as Xen doesn't modify the guest page tables in that case
> (ie: when using HAP).
> 
> Modify all shadow code TLB flushes to also flush the guest TLB, in
> order to keep the previous behavior. I haven't looked at each specific
> shadow code TLB flush in order to figure out whether it actually
> requires a guest TLB flush or not, so there might be room for
> improvement in that regard.
> 
> Signed-off-by: Roger Pau Monné 
> Reviewed-by: Wei Liu 

Acked-by:  Tim Deegan 


Re: [Xen-devel] [PATCH v4 2/7] x86/paging: add TLB flush hooks

2020-02-13 Thread Tim Deegan
At 09:02 +0000 on 13 Feb (1581584528), Tim Deegan wrote:
> At 18:28 +0100 on 10 Feb (1581359304), Roger Pau Monne wrote:
> > Add shadow and hap implementation specific helpers to perform guest
> > TLB flushes. Note that the code for both is exactly the same at the
> > moment, and is copied from hvm_flush_vcpu_tlb. This will be changed by
> > further patches that will add implementation specific optimizations to
> > them.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Roger Pau Monné 
> > Reviewed-by: Wei Liu 
> 
> Acked-by: Tim Deegan 

Oops, wrong address, sorry.

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH v4 2/7] x86/paging: add TLB flush hooks

2020-02-13 Thread Tim Deegan
At 18:28 +0100 on 10 Feb (1581359304), Roger Pau Monne wrote:
> Add shadow and hap implementation specific helpers to perform guest
> TLB flushes. Note that the code for both is exactly the same at the
> moment, and is copied from hvm_flush_vcpu_tlb. This will be changed by
> further patches that will add implementation specific optimizations to
> them.
> 
> No functional change intended.
> 
> Signed-off-by: Roger Pau Monné 
> Reviewed-by: Wei Liu 

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH v9 0/4] purge free_shared_domheap_page()

2020-02-07 Thread Tim Deegan
At 09:17 +0000 on 06 Feb (1580980664), Durrant, Paul wrote:
> > -Original Message-
> > From: Jan Beulich 
> > On 06.02.2020 09:28, Durrant, Paul wrote:
> > >>  xen/arch/x86/mm/shadow/common.c |  2 +-

> George, Julien, Tim,
> 
>   Can I have acks or otherwise, please?

Acked-by: Tim Deegan 

Cheers,

Tim.


Re: [Xen-devel] [PATCH] xen/x86: p2m: Don't initialize slot 0 of the P2M

2020-02-05 Thread Tim Deegan
At 18:31 +0000 on 03 Feb (1580754711), Julien Grall wrote:
> On 03/02/2020 17:37, George Dunlap wrote:
> > On 2/3/20 5:22 PM, Julien Grall wrote:
> >> On 03/02/2020 17:10, George Dunlap wrote:
> >>> On 2/3/20 4:58 PM, Julien Grall wrote:
>  From: Julien Grall 
> 
>  It is not entirely clear why the slot 0 of each p2m should be populated
>  with empty page-tables. The commit introducing it 759af8e3800 "[HVM]
>  Fix 64-bit HVM domain creation." does not contain meaningful
>  explanation except that it was necessary for shadow.
> >>>
> >>> Tim, any ideas here?

Afraid not, sorry.  I can't think what would rely on the tables being
allocated for slot 0 in particular.  Maybe there's something later
that adds other entries in the bottom 2MB and can't handle a table
allocation failure?

> > Also, it's not clear to me what kind of bug the code you're deleting
> > would fix.  If you read a not-present entry, you should get INVALID_MFN
> > anyway.  Unless you were calling p2m_get_entry_query(), which I'm pretty
> > sure hadn't been introduced at this point.
> 
> I can't find this function you mention in staging. Was it removed recently?
> 
> The code is allocating all page-tables for _gfn(0). I would not expect 
> the common code to care whether a table is allocated or not. So this 
> would suggest that an internal implementation (of the shadow?) is 
> relying on this.
> 
> However, I can't find anything obvious suggesting that is necessary. If 
> there was anything, I would expect to happen during domain creation, as 
> neither Xen nor a guest could rely on this (there are way to make those 
> pages disappear with the MEMORY op hypercall).

That may not have been true at the time (and so whatever it was that
needed this may have been fixed when it became true?)

Cheers,

Tim.


Re: [Xen-devel] Issues/improvements performing flush of guest TLBs

2020-01-16 Thread Tim Deegan
Hi,

At 12:16 +0100 on 15 Jan (1579090561), Roger Pau Monné wrote:
>  - Shadow: it's not clear to me exactly which parts of sh_update_cr3
>are needed in order to perform a guest TLB flush. I think calling:
> 
> #if (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB)
> /* No longer safe to use cached gva->gfn translations */
> vtlb_flush(v);
> #endif
> #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
> /* Need to resync all the shadow entries on a TLB flush. */
> shadow_resync_current_vcpu(v);
> #endif
> 
> if ( is_hvm_domain(d) )
> /*
>  * Linear mappings might be cached in non-root mode when ASID/VPID is
>  * in use and hence they need to be flushed here.
>  */
> hvm_asid_flush_vcpu(v);
> 
>Should be enough but I'm not very familiar with the shadow code,
>and hence would like some feedback from someone more familiar with
>shadow in order to assert exactly what's required to perform a
>guest TLB flush.

I would advise keeping the whole thing until you have measurements
that show that it's worthwhile being clever here (e.g. that the IPI
costs don't dominate).

But I think for safety we need at least the code you mention and also:
 - the code that reloads the PAE top-level entries; and
 - the shadow_resync_other_vcpus() at the end.
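
[Putting those pieces together, a sketch of what a single helper might
look like; sh_reload_pae_toplevel() is a hypothetical stand-in for the
PAE top-level reload code, the other functions are those named in the
thread:]

static void sh_flush_guest_tlb(struct vcpu *v)
{
#if (SHADOW_OPTIMIZATIONS & SHOPT_VIRTUAL_TLB)
    /* No longer safe to use cached gva->gfn translations. */
    vtlb_flush(v);
#endif
#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
    /* Resync the shadow entries of this vCPU... */
    shadow_resync_current_vcpu(v);
#endif

    /* Hypothetical: reload the PAE top-level (l3) entries. */
    sh_reload_pae_toplevel(v);

    if ( is_hvm_domain(v->domain) )
        /* Linear mappings may be cached under ASID/VPID in non-root mode. */
        hvm_asid_flush_vcpu(v);

#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
    /* ... and resync the other vCPUs at the end. */
    shadow_resync_other_vcpus(v);
#endif
}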

>Also, AFAICT sh_update_cr3 is not safe to be called on vCPUs
>currently running on remote pCPUs, albeit there are no assertions
>to that end. It's also not clear which parts of sh_update_cr3 are
>safe to be called while the vCPU is running.

Yeah, sh_update_cr3 makes a bunch of state changes and assumes
that the vcpu can't do TLB loads part-way through.  It may be possible
to do some of it remotely but as you say it would take a lot of
thinking, and if the guest is running you're going to need an IPI
anyway to flush the actual TLB.

> FWIW, there also seems to be a lot of unneeded flushes of HVM guests
> TLB, as do_tlb_flush will unconditionally clear all HVM guest TLBs on
> the pCPU by calling hvm_asid_flush_core which I don't think it's
> necessary/intended by quite a lot of the Xen TLB flush callers. I
> guess this would also warrant a different discussion, as there seems
> to be room for improvement in this area.

There may be room for improvement, but do be careful - the Xen MM
safety rules depend on TLB flushes when a page's type or ownership
changes, and that does mean flushing even the guest TLBs.  ISTR
discussing this at the time that vTLBs were introduced and deciding
that it wasn't worth adding all the tracking that would be necessary;
that may have changed now that the p2m infrastructure is better
developed.

Cheers,

Tim.


Re: [Xen-devel] [PATCH] x86/shadow: use single (atomic) MOV for emulated writes

2020-01-16 Thread Tim Deegan
At 15:29 -0500 on 16 Jan (1579188566), Jason Andryuk wrote:
> This is the corresponding change to the shadow code as made by
> bf08a8a08a2e "x86/HVM: use single (atomic) MOV for aligned emulated
> writes" to the non-shadow HVM code.
> 
> The bf08a8a08a2e commit message:
> Using memcpy() may result in multiple individual byte accesses
> (depending how memcpy() is implemented and how the resulting insns,
> e.g. REP MOVSB, get carried out in hardware), which isn't what we
> want/need for carrying out guest insns as correctly as possible. Fall
> back to memcpy() only for accesses not 2, 4, or 8 bytes in size.
> 
> Signed-off-by: Jason Andryuk 

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH] Fix the KDD_LOG statements to use appropriate format specifier for printing uint64_t

2019-12-01 Thread Tim Deegan
At 03:11 -0500 on 30 Nov (1575083478), Julian Tuminaro wrote:
> Previous commit in kdd.c had a small issue which led to warning/error while 
> compiling
> on 32-bit systems due to mismatch of type size while doing type cast from 
> uint64_t to
> void *
> 
> Signed-off-by: Jenish Rakholiya 
> Signed-off-by: Julian Tuminaro 

Acked-by: Tim Deegan 

Thanks for the fix!

Tim.


Re: [Xen-devel] [PATCH V2] kdd.c: Add support for initial handshake in KD protocol for Win 7, 8 and 10 (64 bit)

2019-11-14 Thread Tim Deegan
Hi,

At 23:55 -0500 on 13 Nov (1573689341), Julian Tuminaro wrote:
> From: Julian Tuminaro and Jenish Rakholiya  rakholiyajenish...@gmail.com>
> 
> Current implementation of find_os is based on the hard-coded values for
> different Windows versions. It uses the value to get the address to
> start looking for the DOS header in the given specified range. However, this
> is not scalable to all versions of Windows as it will require us to keep
> adding new entries, and also, due to KASLR, the chances of not hitting the PE
> header are significant. We implement a way for 64-bit systems to use an IDT
> entry to get a valid exception/interrupt handler and then move back into
> the memory to find the valid DOS header. Since IDT entries are protected
> by PatchGuard, we think our assumption that IDT entries will not be
> corrupted is valid for our purpose. Once we have the image base, we
> search for the DBGKD_GET_VERSION64 structure type in .data section to
> get information required for handshake.

Thanks for the updates, this looks good!

Reviewed-by: Tim Deegan 



Re: [Xen-devel] [PATCH 1/1] kdd.c: Add support for initial handshake in KD protocol for Win 7, 8 and 10 (64 bit)

2019-11-10 Thread Tim Deegan
Hi,

At 21:24 -0500 on 05 Nov (1572989067), Julian Tuminaro wrote:
> Current implementation of find_os is based on the hard-coded values for
> different Windows versions. It uses the value to get the address to
> start looking for the DOS header in the given specified range. However, this
> is not scalable to all versions of Windows as it will require us to keep
> adding new entries, and also, due to KASLR, the chances of not hitting the PE
> header are significant. We implement a way for 64-bit systems to use an IDT
> entry to get a valid exception/interrupt handler and then move back into
> the memory to find the valid DOS header. Since IDT entries are protected
> by PatchGuard, we think our assumption that IDT entries will not be
> corrupted is valid for our purpose. Once we have the image base, we
> search for the DBGKD_GET_VERSION64 structure type in .data section to
> get information required for handshake.
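
[In outline, the submitted approach amounts to the following sketch;
read_idt0_handler() is a hypothetical helper for fetching the
divide-by-zero handler address from the IDT, and the backward scan is
bounded purely for illustration:]

static uint64_t find_kernel_base(kdd_state *s)
{
    uint64_t handler = read_idt0_handler(s);     /* hypothetical */
    uint64_t page = handler & ~0xfffULL;

    for (int i = 0; i < 0x4000; i++, page -= 0x1000) {  /* <= 64MiB back */
        uint16_t magic;
        if (kdd_read_virtual(s, s->cpuid, page, sizeof magic,
                             &magic) != sizeof magic)
            continue;                  /* unmapped page: keep walking */
        if (magic == 0x5a4d)           /* 'MZ' DOS header */
            return page;               /* candidate kernel image base */
    }
    return 0;                          /* not found */
}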

Thank you very much for this - it looks super-useful!  Does this
technique only work if the guest kernel has debugging enabled, or can
it work on all systems?

I have some comments on the code, below.

>  /* Windows version details */
>  typedef struct {
>  uint32_t build; 
> @@ -62,6 +64,7 @@ typedef struct {
>  uint32_t version;   /* +-> NtBuildNumber */
>  uint32_t modules;   /* +-> PsLoadedModuleList */
>  uint32_t prcbs; /* +-> KiProcessorBlock */
> +uint32_t kddl;

This needs a comment describing the Windows name of what it points to.

> +/**
> + * @brief Parse the memory at \a filebase as a valid DOS header and get 
> virtual
> + * address offset and size for any given section name (if it exists)
> + *
> + * @param s Pointer to the kdd_state structure
> + * @param filebase Base address of the file structure
> + * @param sectname Pointer to the section name c-string to look for
> + * @param vaddr Pointer to write the virtual address of section start to
> + * (if found)
> + * @param vsize Pointer to write the section size to (if found)
> + *
> + * @return -1 on failure to find the section name
> + * @return 0 on success
> + */
> +static int get_pe64_sections(kdd_state *s, uint64_t filebase, char *sectname,
> +uint64_t *vaddr, uint32_t *vsize)

These new functions don't belong in the 'Utility functions' section.
Please move them to beside the other OS-finding code.

> +{
> +uint8_t buf[0x30];

PE_SECT_ENT_SZ, please.

> +uint64_t pe_hdr;
> +uint64_t sect_start;
> +uint16_t num_sections;
> +int ret;
> +
> +ret = -1;
> +
> +if (!s->os.w64) {
> +return ret;
> +}
> +
> +// read PE header offset
> +if (kdd_read_virtual(s, s->cpuid, filebase + DOS_HDR_PE_OFF, 
> DOS_HDR_PE_SZ,
> +buf) != DOS_HDR_PE_SZ) {
> +return -1;
> +}
> +pe_hdr = filebase + *(uint32_t *)buf;

Here and elsewhere, please read directly into the variables, e.g.:

  uint32_t pe_hdr;
  kdd_read_virtual(s, s->cpuid, filebase + DOS_HDR_PE_OFF,
   sizeof pe_hdr, &pe_hdr);
  pe_hdr += filebase;

That gives neater code and avoids any confusion about the sizes of
various copies and buffers.

> +
> +// read number of sections
> +if (kdd_read_virtual(s, s->cpuid, pe_hdr + PE_NUM_SECTION_OFF,
> +PE_NUM_SECTION_SZ, &buf) != PE_NUM_SECTION_SZ) {
> +return -1;
> +}
> +num_sections = *(uint16_t *)buf;

This needs a check for very large numbers -- loading 65,535 section
headers might take a long time.

> +// read size of optional header
> +if (kdd_read_virtual(s, s->cpuid, pe_hdr + PE_OPT_HDR_SZ_OFF,
> +PE_OPT_HDR_SZ_SZ, &buf) != PE_OPT_HDR_SZ_SZ) {
> +return -1;
> +}
> +
> +// 0x18 is the size of PE header
> +sect_start = pe_hdr + PE_HDR_SZ + *(uint16_t *)buf;
> +
> +for (int i = 0; i < num_sections; i++) {
> +if (kdd_read_virtual(s, s->cpuid, sect_start + (i * PE_SECT_ENT_SZ),
> +PE_SECT_ENT_SZ, &buf) != PE_SECT_ENT_SZ) {
> +return -1;
> +}
> +
> +if (!strncmp(sectname, (char *)(buf + PE_SECT_NAME_OFF),
> +PE_SECT_NAME_SZ)) {
> +*vaddr = filebase + *(uint32_t *)(buf + PE_SECT_RVA_OFF);
> +*vsize = *(uint32_t *)(buf + PE_SECT_VSIZE_OFF);
> +ret = 0;
> +break;

Just 'return 0' will do here, and..

> +}
> +}
> +
> +return ret;

return -1 here, and drop 'ret'.
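
[Pulling those comments together, the tail of the function might then
read like this -- a sketch only, reusing the patch's constants, with an
arbitrary 256-section sanity cap:]

    uint16_t num_sections;
    if (kdd_read_virtual(s, s->cpuid, pe_hdr + PE_NUM_SECTION_OFF,
                         sizeof num_sections,
                         &num_sections) != sizeof num_sections)
        return -1;
    if (num_sections > 256)        /* cap against absurdly long walks */
        return -1;

    for (int i = 0; i < num_sections; i++) {
        uint8_t sect[PE_SECT_ENT_SZ];
        if (kdd_read_virtual(s, s->cpuid, sect_start + i * PE_SECT_ENT_SZ,
                             sizeof sect, sect) != sizeof sect)
            return -1;
        if (!strncmp(sectname, (char *)(sect + PE_SECT_NAME_OFF),
                     PE_SECT_NAME_SZ)) {
            *vaddr = filebase + *(uint32_t *)(sect + PE_SECT_RVA_OFF);
            *vsize = *(uint32_t *)(sect + PE_SECT_VSIZE_OFF);
            return 0;              /* found; no 'ret' variable needed */
        }
    }
    return -1;                     /* section not found */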

> +}
> +
> +/**
> + * @brief Get the OS information like base address, minor version,
> + * PsLoadedModuleList and DebuggerDataList (basically the fields of
> + * DBGKD_GET_VERSION64 structure required to do handshake?).
> + *
> + * This is done by reading the IDT entry for divide-by-zero exception and
> + * searching back into the memory for DOS header (which is our kernel base).
> + * Once we have the kernel base, we parse the PE header and look for kernel
> + * base address in the .data section. Once we have possible values, we look 

[Xen-devel] [PATCH] MAINTAINERS: drop Tim Deegan from 'The Rest'

2019-10-17 Thread Tim Deegan
I have not been active in this role for a while now.

Signed-off-by: Tim Deegan 
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 533cfdc08f..f60d765baf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -537,7 +537,6 @@ M:  Jan Beulich 
 M: Julien Grall 
 M: Konrad Rzeszutek Wilk 
 M: Stefano Stabellini 
-M: Tim Deegan 
 M: Wei Liu 
 L: xen-devel@lists.xenproject.org
 S: Supported
-- 
2.20.1


Re: [Xen-devel] [PATCH] x86/shadow: fold p2m page accounting into sh_min_allocation()

2019-09-11 Thread Tim Deegan
At 10:34 +0200 on 05 Sep (1567679687), Jan Beulich wrote:
> This is to make the function live up to the promise its name makes. And
> it simplifies all callers.
> 
> Suggested-by: Andrew Cooper 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 



Re: [Xen-devel] [PATCH] x86/shadow: don't enable shadow mode with too small a shadow allocation (part 2)

2019-09-05 Thread Tim Deegan
At 09:55 +0200 on 04 Sep (1567590940), Jan Beulich wrote:
> Commit 2634b997af ("x86/shadow: don't enable shadow mode with too small
> a shadow allocation") was incomplete: The adjustment done there to
> shadow_enable() is also needed in shadow_one_bit_enable(). The (new)
> problem report was (apparently) a failed PV guest migration followed by
> another migration attempt for that same guest. Disabling log-dirty mode
> after the first one had left a couple of shadow pages allocated (perhaps
> something that also wants fixing), and hence the second enabling of
> log-dirty mode wouldn't have allocated anything further.
> 
> Reported-by: James Wang 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -2864,7 +2864,8 @@ static int shadow_one_bit_enable(struct
>  
>  mode |= PG_SH_enable;
>  
> -if ( d->arch.paging.shadow.total_pages == 0 )
> +if ( d->arch.paging.shadow.total_pages <
> + sh_min_allocation(d) + d->arch.paging.shadow.p2m_pages )
>  {
>  /* Init the shadow memory allocation if the user hasn't done so */
>  if ( shadow_set_allocation(d, 1, NULL) != 0 )


Re: [Xen-devel] [PATCH v8 1/6] x86/domain: remove the 'oos_off' flag

2019-09-03 Thread Tim Deegan
At 15:50 +0100 on 02 Sep (1567439409), Paul Durrant wrote:
> The flag is not needed since the domain 'options' can now be tested
> directly.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH 4/6] x86/domain: remove the 'oos_off' flag

2019-07-24 Thread Tim Deegan
At 17:06 +0100 on 23 Jul (1563901567), Paul Durrant wrote:
> The flag is not needed since the domain 'createflags' can now be tested
> directly.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Tim Deegan 

though some of this change seems to have got into patch 3, maybe they
were reordered at some point?

Cheers,

Tim.


> ---
> Cc: George Dunlap 
> Cc: Jan Beulich 
> Cc: Andrew Cooper 
> Cc: Wei Liu 
> Cc: "Roger Pau Monné" 
> ---
>  xen/arch/x86/mm/shadow/common.c | 3 +--
>  xen/include/asm-x86/domain.h| 1 -
>  2 files changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
> index 320ea0db21..2c7fafa4fb 100644
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -62,7 +62,6 @@ int shadow_domain_init(struct domain *d)
>  
>  #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
>  d->arch.paging.shadow.oos_active = 0;
> -d->arch.paging.shadow.oos_off = d->createflags & XEN_DOMCTL_CDF_oos_off;
>  #endif
>  d->arch.paging.shadow.pagetable_dying_op = 0;
>  
> @@ -2523,7 +2522,7 @@ static void sh_update_paging_modes(struct vcpu *v)
>  #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
>  /* We need to check that all the vcpus have paging enabled to
>   * unsync PTs. */
> -if ( is_hvm_domain(d) && !d->arch.paging.shadow.oos_off )
> +if ( is_hvm_domain(d) && !(d->createflags & XEN_DOMCTL_CDF_oos_off) )
>  {
>  int pe = 1;
>  struct vcpu *vptr;
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 933b85901f..5f9899469c 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -115,7 +115,6 @@ struct shadow_domain {
>  
>  /* OOS */
>  bool_t oos_active;
> -bool_t oos_off;
>  
>  /* Has this domain ever used HVMOP_pagetable_dying? */
>  bool_t pagetable_dying_op;
> -- 
> 2.20.1.2.gb21ebb671
> 


Re: [Xen-devel] [PATCH] x86/shadow: Drop incorrect diagnostic when shadowing TSS.RSP0

2019-04-16 Thread Tim Deegan
At 13:32 +0100 on 08 Apr (1554730320), Andrew Cooper wrote:
> On 08/04/2019 13:11, Jan Beulich wrote:
> >>>> On 08.04.19 at 13:37,  wrote:
> >> On 08/04/2019 11:14, Jan Beulich wrote:
> >>>>>> On 05.04.19 at 21:09,  wrote:
> >>>> --- a/xen/arch/x86/mm/shadow/multi.c
> >>>> +++ b/xen/arch/x86/mm/shadow/multi.c
> >>>> @@ -3305,8 +3305,9 @@ static int sh_page_fault(struct vcpu *v,
> >>>>  {
> >>>>  /*
> >>>>   * If we are in the middle of injecting an exception or 
> >>>> interrupt then
> >>>> - * we should not emulate: it is not the instruction at %eip 
> >>>> that caused
> >>>> - * the fault. Furthermore it is almost certainly the case the 
> >>>> handler
> >>>> + * we should not emulate: the fault is a side effect of the 
> >>>> processor
> >>>> + * trying to push an exception frame onto a stack which has yet 
> >>>> to be
> >>>> + * shadowed.  Furthermore it is almost certainly the case the 
> >>>> handler
> >>>>   * stack is currently considered to be a page table, so we 
> >>>> should
> >>>>   * unshadow the faulting page before exiting.
> >>>>   */
> > I'm (at least) mildly confused: I follow what you write (I think), but
> > you again say "the stack always becomes shadowed". My original
> > question was whether you really mean that, as stacks, if at all,
> > should get shadowed only unintentionally (and hence get un-shadowed
> > immediately when that condition is detected). That is, my (slightly
> > rephrased) question stands: Do you perhaps mean the page tables
> > mapping the stack to become shadowed, rather than the stack itself?
> 
> I guess this is an issue of terminology, to which I defer to Tim to judge.

Hi,

Sorry for the delay; I have been away.

FWIW I prefer the original comment, and I think that referring to the
stack as "not yet shadowed" is confusing when the problem might be
that the stack itself is indeed shadowed.  I'd also be happy with just
saying "the fault is a side effect of the processor trying to push an
exception frame onto the stack."

Happy to see the debug message being removed if it's being triggered
in the real world.  IIRC it has not proved to be useful since that
code was first developed.  So, with the comment adjusted,
Acked-by: Tim Deegan 

Cheers,

Tim.


Re: [Xen-devel] [PATCH 0/2] x86/shadow: two tiny further bits of PV/HVM separation

2019-03-11 Thread Tim Deegan
At 10:56 -0600 on 11 Mar (1552301785), Jan Beulich wrote:
> 1: sh_validate_guest_pt_write() is HVM-only
> 2: sh_{write,cmpxchg}_guest_entry() are PV-only

Acked-by: Tim Deegan 

Thanks,

Tim.


Re: [Xen-devel] [PATCH] x86/shadow: don't use map_domain_page_global() on paths that may not fail

2019-02-22 Thread Tim Deegan
At 08:15 -0700 on 20 Feb (1550650529), Jan Beulich wrote:
> The assumption (according to one comment) and hope (according to
> another) that map_domain_page_global() can't fail are both wrong on
> large enough systems. Do away with the guest_vtable field altogether,
> and establish / tear down the desired mapping as necessary.
> 
> The alternatives, discarded as being undesirable, would have been to
> either crash the guest in sh_update_cr3() when the mapping fails, or to
> bubble up an error indicator, which upper layers would have a hard time
> to deal with (other than again by crashing the guest).
> 
> Signed-off-by: Jan Beulich 

I follow your argument, so Acked-by: Tim Deegan 

I would expect this to have a measurable cost on page fault times (on
configurations where global map isn't just a directmap).  It would be
good to know if that's the case.
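
[The per-use mapping pattern the patch adopts, in outline -- names
illustrative, not the exact hunks:]

    guest_l4e_t *gl4e = map_domain_page(gmfn);   /* cannot fail */
    /* ... use the guest table for this walk only ... */
    unmap_domain_page(gl4e);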

Cheers,

Tim.



Re: [Xen-devel] [PATCH v2] x86/shadow: adjust minimum allocation calculations

2019-02-09 Thread Tim Deegan
At 04:41 -0700 on 07 Feb (1549514492), Jan Beulich wrote:
> A previously bad situation has become worse with the early setting of
> ->max_vcpus: The value returned by shadow_min_acceptable_pages() has
> further grown, and hence now holds back even more memory from use for
> the p2m.
> 
> Make sh_min_allocation() account for all p2m memory needed for
> shadow_enable() to succeed during domain creation (at which point the
> domain has no memory at all allocated to it yet, and hence use of
> d->tot_pages is meaningless).
> 
> Also make shadow_min_acceptable_pages() no longer needlessly add 1 to
> the vCPU count.
> 
> Finally make the debugging printk() in shadow_alloc_p2m_page() a little
> more useful by logging some of the relevant domain settings.
> 
> Reported-by: Roger Pau Monné 
> Signed-off-by: Jan Beulich 
> Reviewed-by: Roger Pau Monné 

Acked-by: Tim Deegan 

I don't think it's worth bikeshedding too hard over individual
pages on this path; these numbers look OK.

Cheers,

Tim.



Re: [Xen-devel] [PATCH] x86/shadow: don't enable shadow mode with too small a shadow allocation

2018-11-30 Thread Tim Deegan
At 07:53 -0700 on 29 Nov (1543477992), Jan Beulich wrote:
> We've had more than one report of host crashes after failed migration,
> and in at least one case we've had a hint towards a too far shrunk
> shadow allocation pool. Instead of just checking the pool for being
> empty, check whether the pool is smaller than what
> shadow_set_allocation() would minimally bump it to if it was invoked in
> the first place.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

Thanks,

Tim.


Re: [Xen-devel] [PATCH] x86/shadow: un-hide "full" auditing code

2018-11-20 Thread Tim Deegan
At 08:59 -0700 on 20 Nov (1542704343), Jan Beulich wrote:
> In particular sh_oos_audit() has become stale due to changes elsewhere,
> and the need for adjustment was not noticed because both "full audit"
> flags are off in both release and debug builds. Switch away from pre-
> processor conditionals, thus exposing the code to the compiler at all
> times. This obviously requires correcting the accumulated issues with
> the so far hidden code.
> 
> Note that shadow_audit_tables() now also gains an effect with "full
> entry audit" mode disabled; the prior code structure suggests that this
> was originally intended anyway.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 

Thanks!

Tim.


Re: [Xen-devel] [PATCH v2 3/3] x86/shadow: emulate_gva_to_mfn() should respect p2m_ioreq_server

2018-11-15 Thread Tim Deegan
At 05:51 -0700 on 15 Nov (1542261108), Jan Beulich wrote:
> Writes to such pages need to be handed to the emulator.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH 5/3] x86/shadow: emulate_gva_to_mfn() should respect p2m_ioreq_server

2018-11-14 Thread Tim Deegan
At 12:44 +0000 on 14 Nov (1542199496), Paul Durrant wrote:
> > -Original Message-
> > From: Tim Deegan [mailto:t...@xen.org]
> > Sent: 14 November 2018 12:42
> > To: Jan Beulich 
> > Cc: Paul Durrant ; Andrew Cooper
> > ; Wei Liu ; xen-devel
> > 
> > Subject: Re: [PATCH 5/3] x86/shadow: emulate_gva_to_mfn() should respect
> > p2m_ioreq_server
> > 
> > At 04:22 -0700 on 13 Nov (1542082936), Jan Beulich wrote:
> > > >>> On 13.11.18 at 11:59,  wrote:
> > > >> Subject: [PATCH 5/3] x86/shadow: emulate_gva_to_mfn() should respect
> > > >> p2m_ioreq_server
> > > >>
> > > >> Writes to such pages would need to be handed to the emulator, which
> > we're
> > > >> not prepared to do at this point.
> > > >>
> > > >> Signed-off-by: Jan Beulich 
> > > >>
> > > >> --- a/xen/arch/x86/mm/shadow/hvm.c
> > > >> +++ b/xen/arch/x86/mm/shadow/hvm.c
> > > >> @@ -338,7 +338,7 @@ static mfn_t emulate_gva_to_mfn(struct v
> > > >>  {
> > > >>  return _mfn(BAD_GFN_TO_MFN);
> > > >>  }
> > > >> -if ( p2m_is_discard_write(p2mt) )
> > > >> +if ( p2m_is_discard_write(p2mt) || p2mt == p2m_ioreq_server )
> > > >>  {
> > > >>  put_page(page);
> > > >>  return _mfn(READONLY_GFN);
> > > >
> > > > Is this what we want here? I would have thought we want to return
> > > > BAD_GFN_TO_MFN in the p2m_ioreq_server case so that the caller treats
> > this in
> > > > the same way it would MMIO.
> > >
> > > I'm not sure which behavior is better; I'm certainly fine with switching
> > > as you say, but I'd first like to see Tim's opinion as well.
> > 
> > I'm not clear on what behaviour you want for this kind of page in
> > general -- I suspect I have missed or forgotten some context.  If the
> > guest's not supposed to write to it, then IMO you should add it to
> > P2M_DISCARD_WRITE_TYPES rather than special-casing it here.
> 
> The type has an odd semantic... it needs to read as normal RAM but writes 
> need to hit emulation. The use-case is for GVT-g where the emulator needs to 
> intercept writes to the graphics pagetables in guest RAM.

I see, thanks.  In that case, I agree with you that you should signal
this as BAD_GFN_TO_MFN here to trigger the MMIO path.
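
[I.e. in emulate_gva_to_mfn(), the agreed direction is, in sketch form:]

    if ( p2mt == p2m_ioreq_server )
    {
        put_page(page);
        return _mfn(BAD_GFN_TO_MFN);   /* caller takes the MMIO path */
    }
    if ( p2m_is_discard_write(p2mt) )
    {
        put_page(page);
        return _mfn(READONLY_GFN);
    }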

Cheers,

Tim.


Re: [Xen-devel] [PATCH 5/3] x86/shadow: emulate_gva_to_mfn() should respect p2m_ioreq_server

2018-11-14 Thread Tim Deegan
At 04:22 -0700 on 13 Nov (1542082936), Jan Beulich wrote:
> >>> On 13.11.18 at 11:59,  wrote:
> >> Subject: [PATCH 5/3] x86/shadow: emulate_gva_to_mfn() should respect
> >> p2m_ioreq_server
> >> 
> >> Writes to such pages would need to be handed to the emulator, which we're
> >> not prepared to do at this point.
> >> 
> >> Signed-off-by: Jan Beulich 
> >> 
> >> --- a/xen/arch/x86/mm/shadow/hvm.c
> >> +++ b/xen/arch/x86/mm/shadow/hvm.c
> >> @@ -338,7 +338,7 @@ static mfn_t emulate_gva_to_mfn(struct v
> >>  {
> >>  return _mfn(BAD_GFN_TO_MFN);
> >>  }
> >> -if ( p2m_is_discard_write(p2mt) )
> >> +if ( p2m_is_discard_write(p2mt) || p2mt == p2m_ioreq_server )
> >>  {
> >>  put_page(page);
> >>  return _mfn(READONLY_GFN);
> > 
> > Is this what we want here? I would have thought we want to return 
> > BAD_GFN_TO_MFN in the p2m_ioreq_server case so that the caller treats this 
> > in 
> > the same way it would MMIO.
> 
> I'm not sure which behavior is better; I'm certainly fine with switching
> as you say, but I'd first like to see Tim's opinion as well.

I'm not clear on what behaviour you want for this kind of page in
general -- I suspect I have missed or forgotten some context.  If the
guest's not supposed to write to it, then IMO you should add it to
P2M_DISCARD_WRITE_TYPES rather than special-casing it here.

Cheers,

Tim.



Re: [Xen-devel] [PATCH 18/18] tools/debugger/kdd: Install as `xen-kdd', not just `kdd'

2018-10-07 Thread Tim Deegan
At 18:29 +0100 on 05 Oct (1538764157), Ian Jackson wrote:
> `kdd' is an unfortunate namespace landgrab.

Bah, humbug, etc. :)  Can we have a note in the changelog for the next
release to warn the few kdd users that we've done this?

> Signed-off-by: Ian Jackson 

Acked-by: Tim Deegan 

Cheers,

Tim.


Re: [Xen-devel] Ping: [PATCH] x86: improve vCPU selection in pagetable_dying()

2018-10-03 Thread Tim Deegan
At 17:56 +0100 on 03 Oct (1538589366), George Dunlap wrote:
> On 09/26/2018 08:04 AM, Jan Beulich wrote:
> > Looking at things again (in particular
> > the comment ahead of pagetable_dying()) I now actually wonder why
> > HVMOP_pagetable_dying is permitted to be called by other than a domain
> > for itself. There's no use of it in the tool stack. Disallowing the unused
> > case would mean the fast-path logic in sh_pagetable_dying() could
> > become the only valid/implemented case. Tim?
> 
> Not so -- a guest could still call pagetable_dying() on the top level PT
> of a process not currently running.
> 
> I would be totally in favor of limiting this call to the guest itself,
> however -- that would simplify the logic even more.

Yes, I think that this can be restricted to the caller's domain, and
so always use current as the vcpu.  I don't recall a use case for
setting this from outside the VM.
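
[In sketch form -- a hypothetical shape for the restriction, not an
actual patch:]

    /* Reject HVMOP_pagetable_dying for foreign domains, after which the
     * handler can always act on 'current' rather than d->vcpu[0]. */
    if ( d != current->domain )
        return -EPERM;

    pagetable_dying(gpa);   /* hypothetical narrowed interface */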

I can't find reason for the vcpu[0] in the history, but it does look
wrong.  I suspect this patch might have been in a XenServer patch
queue for a while, and perhaps the plumbing was fixed up incorrectly
when it was upstreamed.

Cheers,

Tim.


Re: [Xen-devel] [PATCH] x86/mm: Drop {HAP,SHADOW}_ERROR() wrappers

2018-08-30 Thread Tim Deegan
At 19:11 +0100 on 28 Aug (1535483514), Andrew Cooper wrote:
> Unlike the PRINTK/DEBUG wrappers, these go straight out to the console, rather
> than ending up in the debugtrace buffer.
> 
> A number of these users are followed by domain_crash(), and future changes
> will want to combine the printk() into the domain_crash() call.  Expand these
> wrappers in place, using XENLOG_ERR before a BUG(), and XENLOG_G_ERR before a
> domain_crash().
> 
> Perform some %pv/PRI_mfn/etc cleanup while modifying the invocations, and
> explicitly drop some calls which are unnecessary (bad shadow op, and the empty
> stubs for incorrect sh_map_and_validate_gl?e() calls).
> 
> Signed-off-by: Andrew Cooper 

Acked-by:  Tim Deegan 


Re: [Xen-devel] [PATCH 26/34] x86/mm/shadow: split out HVM only code

2018-08-24 Thread Tim Deegan
At 16:12 +0100 on 17 Aug (1534522364), Wei Liu wrote:
> Move the code previously enclosed in CONFIG_HVM into its own file.
> 
> Note that although some code explicitly check is_hvm_*, which hints it
> can be used for PV too, I can't find a code path that would be the
> case.
> 
> Signed-off-by: Wei Liu 

Reviewed-by: Tim Deegan 


Re: [Xen-devel] [PATCH 25/34] x86/mm/shadow: make it build with !CONFIG_HVM

2018-08-24 Thread Tim Deegan
At 16:12 +0100 on 17 Aug (1534522363), Wei Liu wrote:
> Enclose HVM only emulation code under CONFIG_HVM. Add some BUG()s to
> to catch any issue.
> 
> Note that although some code checks is_hvm_*, which hints it can be
> called for PV as well, I can't find such paths.
> 
> Signed-off-by: Wei Liu 

Reviewed-by: Tim Deegan 


Re: [Xen-devel] [PATCH 0/6] x86/mm: Minor non-functional cleanup

2018-08-16 Thread Tim Deegan
At 19:34 +0100 on 15 Aug (1534361671), Andrew Cooper wrote:
> Minor cleanup which has accumulated while doing other work.  No functional
> change anywhere.
> 
> Andrew Cooper (6):
>   x86/mm: Use mfn_eq()/mfn_add() rather than opencoded variations
>   x86/shadow: Use more appropriate conversion functions
>   x86/shadow: Switch shadow_domain.has_fast_mmio_entries to bool
>   x86/shadow: Use MASK_* helpers for the MMIO fastpath PTE manipulation
>   x86/shadow: Clean up the MMIO fastpath helpers
>   x86/shadow: Use mfn_t in shadow_track_dirty_vram()

Reviewed-by: Tim Deegan 
(with the one correction that Roger asked for in patch 1/6)



Re: [Xen-devel] [PATCH RFC] tools/kdd: avoid adversarial optimisation hazard

2018-08-03 Thread Tim Deegan
Hi,

Apologies for the delay.  Several of my other hats were on fire.

> > I suspect the address, from which offset is derived, is bounded. But I
> > haven't found the spec for KD.
> 
> I don’t think there is one.

Indeed not.  The official way to extend windbg  is to write a plugin
that runs on the Windows machine where you run the debugger.

At 13:37 +0100 on 26 Jul (1532612265), Ian Jackson wrote:
> It's still very obscure because this test
> 
> if (offset > sizeof ctrl.c32 || offset + len > sizeof ctrl.c32) {
> 
> depends critically on the size of offset, etc.
> 
> Is it not still possible that this test could be fooled ?  Suppose
> offset is 0xffffffff.  Then before the test, offset is 0xfffffd33.

This is > sizeof ctrl.c32.  But:

> This kind of reasoning is awful.  The code should be rewritten so that
> it is obvious that it won't go wrong.

Yes.  How about this (compile tested only, and I haven't checked the buggy
gcc versions):

diff --git a/tools/debugger/kdd/kdd.c b/tools/debugger/kdd/kdd.c
index 5a019a0a0c..64aacde1ee 100644
--- a/tools/debugger/kdd/kdd.c
+++ b/tools/debugger/kdd/kdd.c
@@ -687,11 +687,11 @@ static void kdd_handle_read_ctrl(kdd_state *s)
 }
 } else {
 /* 32-bit control-register space starts at 0x[2]cc, for 84 bytes */
-uint32_t offset = addr;
-if (offset > 0x200)
-offset -= 0x200;
-offset -= 0xcc;
-if (offset > sizeof ctrl.c32 || offset + len > sizeof ctrl.c32) {
+uint32_t offset = addr - 0xcc;
+if (offset > sizeof ctrl.c32)
+offset = addr - 0x2cc;
+if (offset > sizeof ctrl.c32
+|| len > sizeof ctrl.c32 - offset) {
 KDD_LOG(s, "Request outside of known control space\n");
 len = 0;
 } else {
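
[For what it's worth, a quick standalone check of the reworked test,
assuming sizeof ctrl.c32 == 84 as the comment in kdd.c states:]

#include <stdint.h>
#include <stdio.h>

static int in_range(uint32_t addr, uint32_t len, uint32_t size)
{
    uint32_t offset = addr - 0xcc;       /* 32-bit space starts at 0xcc */
    if (offset > size)
        offset = addr - 0x2cc;           /* ... or at 0x2cc */
    return !(offset > size || len > size - offset);
}

int main(void)
{
    printf("%d\n", in_range(0xcc, 84, 84));        /* 1: whole window */
    printf("%d\n", in_range(0xffffffff, 4, 84));   /* 0: wild address */
    printf("%d\n", in_range(0x2cc + 80, 8, 84));   /* 0: len overruns */
    return 0;
}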



Re: [Xen-devel] [PATCH 3/6] x86/HVM: implement memory read caching

2018-07-23 Thread Tim Deegan
Hi,

At 04:48 -0600 on 19 Jul (1531975717), Jan Beulich wrote:
> Emulation requiring device model assistance uses a form of instruction
> re-execution, assuming that the second (and any further) pass takes
> exactly the same path. This is a valid assumption as far as use of CPU
> registers goes (as those can't change without any other instruction
> executing in between), but is wrong for memory accesses. In particular
> it has been observed that Windows might page out buffers underneath an
> instruction currently under emulation (hitting between two passes). If
> the first pass translated a linear address successfully, any subsequent
> pass needs to do so too, yielding the exact same translation.
> 
> Introduce a cache (used just by guest page table accesses for now) to
> make sure above described assumption holds. This is a very simplistic
> implementation for now: Only exact matches are satisfied (no overlaps or
> partial reads or anything).
> 
> Signed-off-by: Jan Beulich 

For the change to shadow code:

Acked-by: Tim Deegan 
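
[The "exact matches only" behaviour in sketch form -- illustrative
structures, not the patch's actual ones:]

struct rd_cache_ent {
    paddr_t gpa;
    unsigned int size;
    unsigned long data;
    bool valid;
};

static bool rd_cache_lookup(const struct rd_cache_ent *ent, paddr_t gpa,
                            unsigned int size, unsigned long *val)
{
    if ( !ent->valid || ent->gpa != gpa || ent->size != size )
        return false;      /* no overlaps or partial reads */
    *val = ent->data;
    return true;
}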


Re: [Xen-devel] [PATCH 6/6] x86/shadow: a little bit of style cleanup

2018-07-23 Thread Tim Deegan
At 04:51 -0600 on 19 Jul (1531975863), Jan Beulich wrote:
> Correct indentation of a piece of code, adjusting comment style at the
> same time. Constify gl3e pointers and drop a bogus (and useless once
> corrected) cast.
> 
> Signed-off-by: Jan Beulich 

Acked-by: Tim Deegan 


Re: [Xen-devel] [PATCH 07/16] x86/shadow: fetch CPL just once in sh_page_fault()

2018-07-11 Thread Tim Deegan
At 07:29 -0600 on 11 Jul (1531294179), Jan Beulich wrote:
> This isn't so much an optimization as a way to avoid triggering a gcc bug
> affecting 5.x ... 7.x, triggered by any asm() put inside the ad hoc
> "rewalk" loop and taking as an (output?) operand a register variable
> tied to %rdx (an "rdx" clobber is fine). The issue is due to an apparent
> collision in register use with the modulo operation in vtlb_hash(),
> which (with optimization enabled) involves a multiplication of two
> 64-bit values with the upper half (in %rdx) of the 128-bit result being
> of interest.
> 
> Such an asm() was originally meant to be implicitly introduced into the
> code when converting most indirect calls through the hvm_funcs table to
> direct calls (via alternative instruction patching); that model was
> switched to clobbers due to further compiler problems, but I think the
> change here is worthwhile nevertheless.
> 
> Signed-off-by: Jan Beulich 

I don't quite follow what the compiler bug does here -- it would be nice
to say what effect it has on the final code.  In any case, the code
change is fine.

Reviewed-by: Tim Deegan 

Cheers,

Tim.

