On 8/11/23 03:01, Lehua Ding wrote:
Hi,
This patch reverts the conversion from vmv.s.x to vmv.v.i and adds a new
pattern to optimize the special case when the scalar operand is zero.
Currently, the broadcast pattern where the scalar operand is an imm
will be converted from vmv.s.x to vmv.v.i, and the mask operand will be
converted from 00..01 to 11..11. Each form has advantages and
disadvantages. After discussing them with Juzhe offline, we chose not
to do this transform.
Before:
Advantages: The vsetvli info required by vmv.s.x has better compatibility,
since vmv.s.x only requires the SEW and whether VL is zero or non-zero.
That means there are more opportunities to combine it with other vsetvl
infos in the vsetvl pass.
Disadvantages: For non-zero scalar imm, one more `li rd, imm` instruction
will be needed.
After:
Advantages: No `li rd, imm` instruction is needed, since vmv.v.i supports
an imm operand.
Disadvantages: The converse of the advantage above. Worse compatibility
means more vsetvl instructions are needed.
I can't speak for other uarches, but as a guiding principle for Ventana
we're assuming vsetvl instructions are common and as a result need to be
very cheap in hardware. So converting to vmv.v.i is likely a good
tradeoff for us.
I could see other uarches making different design choices though. So at
a high level, do we want this to be driven by cost modeling in some way?
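To make the cost-modeling question a bit more concrete, here is a rough
sketch in C of what such a decision might look like. Everything in it is
hypothetical: the `uarch_costs` struct, the `prefer_vmv_s_x` helper and
the `extra_vsetvls` estimate are made up for illustration and are not
existing GCC hooks or tuning fields.

```c
#include <stdbool.h>

/* Hypothetical per-uarch tuning values, for illustration only. */
struct uarch_costs {
    int li_cost;      /* cost of one `li rd, imm` */
    int vsetvl_cost;  /* cost of one vsetvl/vsetvli */
};

/* Return true when keeping vmv.s.x looks no more expensive than
   converting the broadcast to vmv.v.i.

   extra_vsetvls is an assumed estimate of how many additional vsetvl
   instructions the conversion would force because of the stricter
   vsetvl requirements of vmv.v.i.  */
static bool
prefer_vmv_s_x(const struct uarch_costs *c, long imm, int extra_vsetvls)
{
    /* Per the patch description, only a non-zero imm needs an extra li. */
    int keep_cost = (imm != 0) ? c->li_cost : 0;

    /* Converting avoids the li but may cost extra vsetvl instructions. */
    int convert_cost = extra_vsetvls * c->vsetvl_cost;

    return keep_cost <= convert_cost;
}
```

With made-up costs of { .li_cost = 1, .vsetvl_cost = 0 } (vsetvl assumed
to be essentially free), a non-zero imm with one extra vsetvl makes this
return false, i.e. convert to vmv.v.i. With { .li_cost = 1,
.vsetvl_cost = 3 } it returns true, i.e. keep vmv.s.x. A zero imm always
keeps vmv.s.x in this model as long as the costs are non-negative.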
Not a review yet. Wanted to get that feedback to you now since the rest
of my day is going to be fairly busy.
jeff