On Tue, Jun 23, 2020 at 5:21 AM Richard Sandiford <richard.sandif...@arm.com> wrote:
> MVE and Power both set inactive lanes to zero.  But I'm not sure about RVV.
> AIUI, for RVV the approach instead would be to reduce the effective vector
> length for the final iteration of the vector loop, and I'm not sure
> whether in that situation it makes sense to say that the other elements
> still exist and are guaranteed to be zero.
>
> I'm the last person who should be speculating on that though.  Let's see
> whether Jim has any comments.

The RVV spec supports two policies for tail elements, i.e. elements beyond the current vector length. They can be undisturbed or agnostic. In the undisturbed case, the tail elements retain their old values. In the agnostic case, the implementation can choose to either retain their old values or set them to all ones, and this choice can be different from lane to lane.

This flexibility is useful because registers may be wider than the execution unit, and the current vector length may not be a multiple of the width of the execution unit. So for instance, if the vector registers can hold 8 elements, the execution unit works on 4 elements at a time, and the current vector length is 2, then it might make sense to leave the last four elements unmodified to avoid an extra iteration across the register, but the third and fourth elements might be set to all ones because you have to write to them anyway.

The choice is left up to the implementation because we have multiple parties designing vector units, some targeting the low-cost embedded market and some targeting high performance, and they couldn't agree on a single best way to implement this.

Software is expected to choose agnostic only if it doesn't care what happens to tail elements, and undisturbed if it wants to preserve them. The value of all ones was chosen to discourage software developers from trying to use the values in tail elements. The choice of undisturbed or agnostic can be changed every time you set the current vector length and type.

In most cases, I think RVV programs will use agnostic for tail elements, since we can change the vector length at will, and it will be rare that we care about elements beyond the current vector length. Tail elements can't cause exceptions, so there is no need to worry about whether those elements hold valid values.

Jim
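
For illustration only, here is a rough C model of the two tail policies, using the sizes from the example above (8-element registers, a 4-element execution unit). This is a sketch and not RVV code: the names model_vadd, tail_policy, VLMAX and UNIT are made up for the example, and the agnostic branch shows just one behaviour an implementation would be allowed to pick (fill the rest of the partially used chunk with all ones, leave untouched chunks alone).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes taken from the example in the text. */
enum { VLMAX = 8, UNIT = 4 };

typedef enum { TAIL_UNDISTURBED, TAIL_AGNOSTIC } tail_policy;

/* Model of an element-wise add that writes dst[0..vl).
 * Elements dst[vl..VLMAX) are the tail. */
static void model_vadd(uint32_t dst[VLMAX], const uint32_t a[VLMAX],
                       const uint32_t b[VLMAX], size_t vl, tail_policy tp)
{
    for (size_t i = 0; i < vl; i++)
        dst[i] = a[i] + b[i];

    if (tp == TAIL_AGNOSTIC) {
        /* One permitted choice (an assumption of this model): the
         * execution unit has to write the whole chunk containing the
         * last active element, so it fills the rest of that chunk
         * with all ones.  Later chunks are never visited and keep
         * their old values. */
        size_t end = (vl + UNIT - 1) / UNIT * UNIT;
        if (end > VLMAX)
            end = VLMAX;
        for (size_t i = vl; i < end; i++)
            dst[i] = UINT32_MAX;
    }
    /* TAIL_UNDISTURBED: every tail element keeps its old value. */
}
```

With vl = 2, the undisturbed policy leaves elements 2 through 7 as they were, while this particular agnostic model sets elements 2 and 3 to all ones and leaves 4 through 7 alone. Another implementation could equally keep all six old values, which is why software that picks agnostic should not read the tail at all.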