On November 30, 2020 7:18:37 PM GMT+01:00, Richard Sandiford 
<richard.sandif...@arm.com> wrote:
>Richard Biener <richard.guent...@gmail.com> writes:
>> On November 30, 2020 4:29:41 PM GMT+01:00, Richard Sandiford via
>Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>dse.c:find_shift_sequence tries to represent a store and load
>>>back as a shift right followed by a truncation.  It therefore
>>>needs to find an integer mode in which to do the shift right.
>>>The loop it uses has the form:
>>>
>>>  FOR_EACH_MODE_FROM (new_mode_iter,
>>>                   smallest_int_mode_for_size (GET_MODE_BITSIZE (read_mode)))
>>>
>>>which implicitly assumes that read_mode has an equivalent integer
>mode.
>>>As shown in the testcase, not all modes have such an integer mode.
>>
>> But if no such mode exists iterating won't help either? So why not
>simply fail when the mode does not exist?
>
>You mean test against the maximum integer mode before the loop,
>but keep the smallest_int_mode_for_size call as-is?  I'm not sure
>that's going to be more efficient in practice, and it seems less
>obviously correct.
>
>Alternatively, we could have a version of smallest_int_mode_for_size
>that returns failure instead of asserting. 

OH, didn't remember this one asserts... 

 But
>smallest_int_mode_for_size
>is itself a FOR_EACH_MODE_* iterator, so it would iterate just as much
>as
>the patch does.
>
>smallest_int_mode_for_size also has some __int20 etc. handling that I
>don't think we want here, and probably avoid by luck.  (We already know
>at this point that any shift value would be nonzero, which probably
>weeds out most of the problematic cases for PSImode targets.)

Ok, I see. 

>find_shift_sequence already iterates over the modes itself and it
>already has custom requirements in terms of when to break and when
>to continue.  It just seems simpler and more obvious for the loop to
>iterate over all the modes directly rather than delegate part of the
>iteration to another function.

Fine. Jeff has already acked the patch. 

Richard. 

>Thanks,
>Richard
>
>
>
>>
>>>This patch just makes the code start from the smallest integer mode
>and
>>>skip modes that are too small.  The loop already breaks at the first
>>>mode wider than word_mode.
>>>
>>>Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK for trunk and
>>>GCC 10 branch?
>>>
>>>Richard
>>>
>>>
>>>gcc/
>>>     PR rtl-optimization/98037
>>>     * dse.c (find_shift_sequence): Iterate over all integers and
>>>     skip modes that are too small.
>>>
>>>gcc/testsuite/
>>>     PR rtl-optimization/98037
>>>     * gcc.target/aarch64/sve/acle/general/pr98037.c: New test.
>>>---
>>> gcc/dse.c                                                   | 5
>+++--
>>> gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c | 6
>++++++
>>> 2 files changed, 9 insertions(+), 2 deletions(-)
>>>create mode 100644
>>>gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>
>>>diff --git a/gcc/dse.c b/gcc/dse.c
>>>index d65266b5476..651e6e7e71e 100644
>>>--- a/gcc/dse.c
>>>+++ b/gcc/dse.c
>>>@@ -1757,8 +1757,7 @@ find_shift_sequence (poly_int64 access_size,
>>>      the machine.  */
>>> 
>>>   opt_scalar_int_mode new_mode_iter;
>>>-  FOR_EACH_MODE_FROM (new_mode_iter,
>>>-                  smallest_int_mode_for_size (GET_MODE_BITSIZE (read_mode)))
>>>+  FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)
>>>     {
>>>       rtx target, new_reg, new_lhs;
>>>       rtx_insn *shift_seq, *insn;
>>>@@ -1767,6 +1766,8 @@ find_shift_sequence (poly_int64 access_size,
>>>       new_mode = new_mode_iter.require ();
>>>       if (GET_MODE_BITSIZE (new_mode) > BITS_PER_WORD)
>>>     break;
>>>+      if (maybe_lt (GET_MODE_SIZE (new_mode), GET_MODE_SIZE
>>>(read_mode)))
>>>+    continue;
>>> 
>>>       /* Try a wider mode if truncating the store mode to NEW_MODE
>>>      requires a real instruction.  */
>>>diff --git
>>>a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>new file mode 100644
>>>index 00000000000..b91e940b18e
>>>--- /dev/null
>>>+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr98037.c
>>>@@ -0,0 +1,6 @@
>>>+/* { dg-options "-msve-vector-bits=1024 -O3" } */
>>>+
>>>+typedef __SVInt8_t vec __attribute__((arm_sve_vector_bits(1024)));
>>>+struct pair { vec v[2]; };
>>>+void use (struct pair *);
>>>+vec f (struct pair p) { vec v = p.v[1]; use (&p); return v; }

Reply via email to