On 22.05.2024 15:17, Andrew Cooper wrote: > sh_trace_gfn_va() is very similar to sh_trace_gl1e_va(), and a rather shorter > name than trace_shadow_emulate_other(). Like sh_trace_gl1e_va(), there is no > need to pack the trace record.
Provided record size can freely change (here for the 3-level case) without breaking consumers, i.e. similar to patch 2. > --- a/xen/arch/x86/mm/shadow/multi.c > +++ b/xen/arch/x86/mm/shadow/multi.c > @@ -2010,29 +2010,30 @@ static void sh_trace_gl1e_va(uint32_t event, > guest_l1e_t gl1e, guest_va_t va) > } > } > > -static inline void trace_shadow_emulate_other(u32 event, > - guest_va_t va, > - gfn_t gfn) > +/* Shadow trace event with a gfn, linear address and flags. */ > +static void sh_trace_gfn_va(uint32_t event, gfn_t gfn, guest_va_t va) > { > if ( tb_init_done ) > { > - struct __packed { > - /* for PAE, guest_l1e may be 64 while guest_va may be 32; > - so put it first for alignment sake. */ > + struct { > + /* > + * For GUEST_PAGING_LEVELS=3 (PAE paging), gfn is 64 while > + * guest_va is 32. Put it first to avoid padding. > + */ > #if GUEST_PAGING_LEVELS == 2 > - u32 gfn; > + uint32_t gfn; > #else > - u64 gfn; > + uint64_t gfn; > #endif > guest_va_t va; > - } d; > - > - event |= ((GUEST_PAGING_LEVELS-2)<<8); > - > - d.gfn=gfn_x(gfn); > - d.va = va; > + uint32_t flags; > + } d = { > + .gfn = gfn_x(gfn), > + .va = va, > + .flags = this_cpu(trace_shadow_path_flags), > + }; There's again no function call involved here, so having tb_init_done checked only in (inlined) sh_trace() ought to again be enough? Jan