> I also think that, if we create a bunch more of these wrappers:
>
>> +DEF_VFAE_HELPER(8)
>> +DEF_VFAE_HELPER(16)
>> +DEF_VFAE_HELPER(32)
>
> then RT and ZS can be passed in as constant parameters to the above, and then
> the compiler will fold away all of the stuff that's not needed for each
> different case. Which, I think, is significant. These are practically
> different instructions with the different modifiers.
>
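Just to make sure we are talking about the same thing, here is a rough sketch of how I read that suggestion. It is not the patch code: vfae8_generic(), DEF_VFAE8_VARIANT() and the simplified matching semantics are made up for illustration, only the 8-bit element size and the RT/ZS flags are shown, and always_inline is the GCC/Clang attribute spelling.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16

/*
 * Hypothetical generic worker (simplified semantics, not the real
 * instruction). It is always inlined and every wrapper passes rt/zs as
 * compile-time constants, so the compiler can drop the unused branches
 * in each specialization.
 */
static inline __attribute__((always_inline))
int vfae8_generic(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                  bool rt, bool zs)
{
    int first = VEC_BYTES;  /* index of first hit, VEC_BYTES if none */

    memset(dst, 0, VEC_BYTES);
    for (int i = 0; i < VEC_BYTES; i++) {
        /* Assumption: with zs set, a zero byte in a counts as a hit. */
        bool hit = zs && a[i] == 0;

        for (int j = 0; j < VEC_BYTES && !hit; j++) {
            hit = (a[i] == b[j]);
        }
        if (!hit) {
            continue;
        }
        if (i < first) {
            first = i;
        }
        if (!rt) {
            break;          /* index mode: the first hit is enough */
        }
        dst[i] = 0xff;      /* mask mode: flag every matching element */
    }
    if (!rt) {
        dst[7] = (uint8_t)first;  /* index mode: store the byte index */
    }
    return first;
}

/* One thin wrapper per flag combination; the flags are constants here. */
#define DEF_VFAE8_VARIANT(name, RT, ZS)                                \
    static int name(uint8_t *dst, const uint8_t *a, const uint8_t *b) \
    {                                                                  \
        return vfae8_generic(dst, a, b, RT, ZS);                       \
    }

DEF_VFAE8_VARIANT(vfae8, false, false)
DEF_VFAE8_VARIANT(vfae8_rt, true, false)
DEF_VFAE8_VARIANT(vfae8_zs, false, true)
DEF_VFAE8_VARIANT(vfae8_rt_zs, true, true)
```

With only two of the flags and one element size this is already four wrappers, which is where the count below comes from.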
So, we have 4 flags, resulting in 16 variants. Times 3 element sizes ...
48 helpers in total. Do we really want to go down that path?

I can also go ahead and try to identify the most frequent users (in Linux)
and only specialize those.

-- 
Thanks,

David / dhildenb