On 8/14/20 7:52 PM, Frank Chang wrote:
>     probe_pages(env, base + stride * i, nf * esz, ra, access_type);
> and
>     target_ulong addr = base + stride * i + k * esz;
> 
> If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(),
> I would still have to do: (1 << esz) to get the correct element size in the
> above calculations.
> Would it eliminate the performance gain we have in vext_max_elems() instead?

Well, no, it will improve performance, because you'll write

  addr = base + stride * i + (k << esz)

I.e. strength-reduce the multiply to a shift.
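
A minimal sketch of the idea, not the actual patch. The helper names
below are hypothetical and exist only for illustration. It assumes esz
is passed as log2 of the element size in bytes, i.e. the
ctzl(sizeof(type)) value mentioned above. Presumably the nf * esz
operand of probe_pages() becomes a shift in the same way, though that
is an inference rather than something stated above.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not QEMU functions. */

/* Element address with esz given as a byte count: needs a multiply. */
static inline uint64_t elem_addr_mul(uint64_t base, uint64_t stride,
                                     uint32_t i, uint32_t k, uint32_t esz)
{
    return base + stride * i + (uint64_t)k * esz;
}

/* Same address with esz given as log2 of the byte count: shift only. */
static inline uint64_t elem_addr_shift(uint64_t base, uint64_t stride,
                                       uint32_t i, uint32_t k,
                                       uint32_t log2_esz)
{
    return base + stride * i + ((uint64_t)k << log2_esz);
}

/* Bytes covered by nf fields of one element (assumed probe length). */
static inline uint64_t fields_size(uint32_t nf, uint32_t log2_esz)
{
    return (uint64_t)nf << log2_esz;
}
```

For a 4-byte element, elem_addr_mul(..., 4) and elem_addr_shift(..., 2)
compute the same address, so no (1 << esz) is needed at the use sites.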


r~

