On Sat, Aug 15, 2020 at 2:36 AM Richard Henderson <richard.hender...@linaro.org> wrote:

> On 8/13/20 7:48 PM, Frank Chang wrote:
> > esz is passed from e.g. GEN_VEXT_LD_STRIDE() macro:
> >
> >> #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN)                    \
> >> void HELPER(NAME)(void *vd, void * v0, target_ulong base,           \
> >>                   target_ulong stride, CPURISCVState *env,          \
> >>                   uint32_t desc)                                    \
> >> {                                                                   \
> >>     uint32_t vm = vext_vm(desc);                                    \
> >>     vext_ldst_stride(vd, v0, base, stride, env, desc, vm, LOAD_FN,  \
> >>                      sizeof(ETYPE), GETPC(), MMU_DATA_LOAD);        \
> >> }
> >>
> >> GEN_VEXT_LD_STRIDE(vlse8_v, int8_t, lde_b)
> >
> > which is calculated by sizeof(ETYPE), so the results would be: 1, 2, 4, 8.
> > and vext_max_elems() is called by e.g. vext_ldst_stride():
>
> Ah, yes.
>
> >> uint32_t max_elems = vext_max_elems(desc, esz);
> >
> > I can add another parameter to the macro and pass the hard-coded log2(esz) number
> > if it's the better way instead of using ctzl().
> > Or if there's another approach to get the log2(esz) number more elegantly?
>
> Using ctzl(sizeof(type)) in the GEN_VEXT_LD_STRIDE macro will work well.  This
> will be constant folded by the compiler.
>
>
> r~
>

I checked the code again: GEN_VEXT_LD_STRIDE() eventually calls
vext_ldst_stride() and passes esz as a parameter.
However, esz is used not only in vext_max_elems() but also in other
calculations, e.g.:

    probe_pages(env, base + stride * i, nf * esz, ra, access_type);

and

    target_ulong addr = base + stride * i + k * esz;

If we pass ctzl(sizeof(type)) in GEN_VEXT_LD_STRIDE(), I would still have to
compute (1 << esz) to get the correct element size in the calculations above.
Would that cancel out the performance gain we get in vext_max_elems()?

Frank Chang
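
To make the question concrete, here is a minimal standalone sketch (not the QEMU code) of how the two calculations above could look if the helper received log2(esz) instead of esz. Everything in it is an assumption for illustration: LOG2_SIZEOF() stands in for ctzl(sizeof(type)) using the GCC/Clang builtin __builtin_ctzl(), and addr_bytes(), addr_log2(), max_elems_log2() and check_equivalence() are hypothetical names. It only shows that "k * esz" and "k << log2_esz" give the same result, so no separate (1 << esz) step is strictly needed; whether this matters for performance would have to be measured.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ctzl(sizeof(type)); uses a GCC/Clang builtin. */
#define LOG2_SIZEOF(type) ((uint32_t)__builtin_ctzl(sizeof(type)))

/* Address of field k of element i, element size given in bytes. */
static uint64_t addr_bytes(uint64_t base, uint64_t stride,
                           uint32_t i, uint32_t k, uint32_t esz)
{
    return base + stride * i + (uint64_t)k * esz;
}

/* Same address, element size given as log2(bytes): multiply becomes shift. */
static uint64_t addr_log2(uint64_t base, uint64_t stride,
                          uint32_t i, uint32_t k, uint32_t log2_esz)
{
    return base + stride * i + ((uint64_t)k << log2_esz);
}

/* Hypothetical element count: register bytes divided by element bytes. */
static uint32_t max_elems_log2(uint32_t vlen_bytes, uint32_t log2_esz)
{
    return vlen_bytes >> log2_esz;
}

/* Checks that both forms agree for a 4-byte element type. */
static void check_equivalence(void)
{
    uint32_t log2_esz = LOG2_SIZEOF(int32_t);

    assert(log2_esz == 2);
    assert(addr_bytes(0x1000, 64, 3, 2, sizeof(int32_t)) ==
           addr_log2(0x1000, 64, 3, 2, log2_esz));
    assert(max_elems_log2(16, log2_esz) == 16 / sizeof(int32_t));
}
```

In this sketch the byte count for a probe_pages()-style length would likewise be written as "nf << log2_esz" rather than "nf * (1 << log2_esz)".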