On 8/14/20 7:52 PM, Frank Chang wrote:
> probe_pages(env, base + stride * i, nf * esz, ra, access_type);
> and
> target_ulong addr = base + stride * i + k * esz;
>
> If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(),
> I would still have to do: (1 << esz) to get the correct element size in the
> above calculations.
> Would it eliminate the performance gain we have in vext_max_elems() instead?

Well, no, it will improve performance, because you'll write

    addr = base + stride * i + (k << esz)

I.e. strength-reduce the multiply to a shift.

r~
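
For illustration only, here is a minimal standalone sketch of the two forms side by side. It is not the actual helper code: the function names (log2_esz_of, elem_addr_mul, elem_addr_shift, check_forms_agree) are hypothetical, uint64_t stands in for target_ulong, and the element size is assumed to be a power of two, which is what makes passing its log2 possible in the first place.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two element size (1, 2, 4, 8). */
static inline unsigned log2_esz_of(unsigned size)
{
    unsigned n = 0;
    while ((size >> n) > 1) {
        n++;
    }
    return n;
}

/* Original form: esz is the element size in bytes, so this multiplies. */
static inline uint64_t elem_addr_mul(uint64_t base, uint64_t stride,
                                     uint32_t i, uint32_t k, uint32_t esz)
{
    return base + stride * i + (uint64_t)k * esz;
}

/* Suggested form: esz is log2 of the element size, so this shifts. */
static inline uint64_t elem_addr_shift(uint64_t base, uint64_t stride,
                                       uint32_t i, uint32_t k,
                                       uint32_t log2_esz)
{
    return base + stride * i + ((uint64_t)k << log2_esz);
}

/* Check that both forms agree for every power-of-two element size. */
static inline void check_forms_agree(void)
{
    for (unsigned size = 1; size <= 8; size <<= 1) {
        unsigned log2_esz = log2_esz_of(size);
        for (uint32_t i = 0; i < 4; i++) {
            for (uint32_t k = 0; k < 8; k++) {
                assert(elem_addr_mul(0x1000, 24, i, k, size) ==
                       elem_addr_shift(0x1000, 24, i, k, log2_esz));
            }
        }
    }
}
```

Under the same assumption, the size argument in the probe_pages() call quoted above would become nf << esz rather than nf * esz, so (1 << esz) never has to be materialized.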