Sorry for the slow reply.

Christophe Lyon <christophe.l...@linaro.org> writes:
> On Thu, 1 Oct 2020 at 16:10, Richard Sandiford via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> This patch does several things at once:
>>
>> (1) Add vector compare patterns (vec_cmp and vec_cmpu).
>>
>> (2) Add vector selects between floating-point modes when the
>>     values being compared are integers (affects vcond and vcondu).
>>
>> (3) Add vector selects between integer modes when the values being
>>     compared are floating-point (affects vcond).
>>
>> (4) Add standalone vector select patterns (vcond_mask).
>>
>> (5) Tweak the handling of compound comparisons with zeros.
>>
>> Unfortunately it proved too difficult (for me) to separate this
>> out into a series of smaller patches, since everything is so
>> inter-related.  Defining only some of the new patterns does
>> not leave things in a happy state.
>>
>> The handling of comparisons is mostly taken from the vcond patterns.
>> This means that it remains non-compliant with IEEE: “quiet” comparisons
>> use signalling instructions.  But that shouldn't matter for floats,
>> since we require -funsafe-math-optimizations to vectorize for them
>> anyway.
>>
>> It remains the case that comparisons and selects aren't implemented
>> at all for HF vectors.  Implementing those feels like separate work.
>>
>> Tested on arm-linux-gnueabihf and arm-eabi (for MVE).  OK to install?
>>
>> Richard
>>
>
> Hi Richard,
>
> This patch enables a few more tests on armeb-linux-gnueabihf
> --with-cpu cortex-a9 --with-fpu neon-fp16, with these failures:
>     gcc.dg/vect/slp-cond-2-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
>     gcc.dg/vect/slp-cond-2-big-array.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3
>     gcc.dg/vect/slp-cond-2.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3
>     gcc.dg/vect/slp-cond-2.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3
>     gcc.dg/vect/vect-cond-10.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 8
>     gcc.dg/vect/vect-cond-10.c scan-tree-dump-times vect "vectorized 1 loops" 8
>     gcc.dg/vect/vect-cond-8.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 5
>     gcc.dg/vect/vect-cond-8.c scan-tree-dump-times vect "vectorized 1 loops" 5
>     gcc.dg/vect/vect-cond-9.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 10
>     gcc.dg/vect/vect-cond-9.c scan-tree-dump-times vect "vectorized 1 loops" 10
>
> I guess this is expected since vectorization does not work well on
> armeb in general?

Yeah, seems like it, unfortunately.  I think if we wanted to fix this,
we should look at supporting the operations disabled in r176050.  Packs
and unpacks seem to be the problem for at least some of the tests above.
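
As a rough illustration of where a pack comes in, here is a made-up loop
(not one of the tests above; the name select_narrow and its parameters
are hypothetical).  The comparison is done on 32-bit elements but the
selected values are 16-bit, so a vectorizer would typically need to
narrow (pack) the comparison result before it can drive the 16-bit
select.  The widening direction would need unpacks in the same way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the GCC testsuite.
   Compare 32-bit values, select between 16-bit values.  When this
   is vectorized, the mask produced by the 32-bit comparison has to
   be narrowed to match the 16-bit elements being selected.  */
void
select_narrow (int16_t *restrict out, const int32_t *restrict a,
               const int32_t *restrict b, const int16_t *restrict x,
               const int16_t *restrict y, size_t n)
{
  assert (out && a && b && x && y);
  for (size_t i = 0; i < n; ++i)
    out[i] = a[i] < b[i] ? x[i] : y[i];
}
```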

Trying to make the results clean with the current (somewhat artificial)
restrictions seems like a dead end.

Thanks,
Richard
