On Sat, Aug 15, 2020 at 1:29 PM Richard Henderson <richard.hender...@linaro.org> wrote:

> On 8/14/20 7:52 PM, Frank Chang wrote:
> > probe_pages(env, base + stride * i, nf * esz, ra, access_type);
> > and
> > target_ulong addr = base + stride * i + k * esz;
> >
> > If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(),
> > I would still have to do: (1 << esz) to get the correct element size
> > in the above calculations.
> > Would it eliminate the performance gain we have in vext_max_elems()
> > instead?
>
> Well, no, it will improve performance, because you'll write
>
>   addr = base + stride * i + (k << esz)
>
> I.e. strength-reduce the multiply to a shift.
>

This works like a charm.
Thanks for the advice.

Frank Chang

> r~
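
For anyone reading along, here is a minimal standalone sketch of the strength reduction discussed above. It is not the QEMU code: log2_size(), elem_addr_mul() and elem_addr_shift() are hypothetical names made up for illustration, and log2_size() only stands in for the ctzl(sizeof(type)) mentioned in the thread. The assumption is that the element size is always a power of two, so passing its log2 lets the address computation use a shift instead of a multiply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: log2 of a power-of-two element size in bytes.
 * Stands in for ctzl(sizeof(type)); written as a loop so the sketch
 * does not depend on any compiler builtin.
 */
static inline unsigned log2_size(size_t size)
{
    unsigned shift = 0;

    /* Only valid for non-zero powers of two. */
    assert(size != 0 && (size & (size - 1)) == 0);
    while ((size >> shift) != 1) {
        shift++;
    }
    return shift;
}

/* Multiply form: esz is the element size in bytes. */
static inline uint64_t elem_addr_mul(uint64_t base, uint64_t stride,
                                     uint64_t i, uint64_t k, uint64_t esz)
{
    return base + stride * i + k * esz;
}

/* Shift form: log2_esz is log2 of the element size in bytes. */
static inline uint64_t elem_addr_shift(uint64_t base, uint64_t stride,
                                       uint64_t i, uint64_t k,
                                       unsigned log2_esz)
{
    return base + stride * i + (k << log2_esz);
}
```

Both forms give the same address whenever esz == 1 << log2_esz. The same substitution would presumably apply to the nf * esz length passed to probe_pages(), which becomes nf << esz once esz holds the log2 value.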