On 7/22/20 2:15 AM, frank.ch...@sifive.com wrote:
> +static void
> +vext_ldst_whole(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
> +                vext_ldst_elem_fn *ldst_elem, uint32_t esz, uintptr_t ra,
> +                MMUAccessType access_type)
> +{
> +    uint32_t i, k;
> +    uint32_t nf = vext_nf(desc);
> +    uint32_t vlmax = vext_maxsz(desc) / esz;
> +    uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
> +
> +    /* probe every access */
> +    probe_pages(env, base, vlenb * nf * esz, ra, access_type);
> +
> +    /* load bytes from guest memory */
> +    for (i = 0; i < vlenb; i++) {
> +        k = 0;
> +        while (k < nf) {
> +            target_ulong addr = base + (i * nf + k) * esz;
> +            ldst_elem(env, addr, i + k * vlmax, vd, ra);
> +            k++;
> +        }
> +    }
> +}
First, nf != 0 is reserved, so you shouldn't attempt to support it here.

Second, even then, the note in the spec suggests that these two loops should be interchanged -- but I'll also grant that the language could use improvement.

Indeed, the whole vector load/store section seems to need improvement. For instance, nowhere does it say how EEW < SEW load operations are extended. From reading the Spike source code I can infer that they are sign-extended, but that's something a spec should state explicitly.

r~