From: Kugan Vivekanandarajah <kugan.vivekanandara...@linaro.org>

In order to fix this PR:
 * We need to change the whilelo pattern in the backend.
 * We need to change RTL CSE to:
   - Add support for VEC_DUPLICATE.
   - Handle PARALLEL rtx in cse_insn differently: currently we kill the CSE
     entries defined by all the sets of the parallel rtx at the end.

For example, with patch 1 applied, we now have the following RTL insn:

(insn 19 18 20 3 (parallel [
            (set (reg:VNx4BI 93 [ next_mask_18 ])
                (unspec:VNx4BI [
                        (const_int 0 [0])
                        (reg:DI 95 [ _33 ])
                    ] UNSPEC_WHILE_LO))
            (set (reg:CC 66 cc)
                (compare:CC (unspec:SI [
                            (vec_duplicate:VNx4BI (const_int 1 [0x1]))
                            (reg:VNx4BI 93 [ next_mask_18 ])
                        ] UNSPEC_PTEST_PTRUE)
                    (const_int 0 [0])))
        ]) 4244 {while_ultdivnx4bi}

When cse_insn processes the first set, it records the CSE entry for reg 93.
Then, after processing both sets in the parallel rtx, we invalidate all
expressions that use reg 93, which means the expression in the second set is
invalidated for CSE. The attached patch relaxes this by invalidating before
processing the second set.
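
To illustrate the ordering issue, here is a minimal toy model in Python. It
is only a sketch and not the code in gcc/cse.c: the table layout and the
names process_parallel, invalidate_uses and SETS are hypothetical, and an
expression is reduced to a label plus the set of registers it uses.

```python
# Toy model only: this does not mirror GCC's cse.c data structures or API.
# An "expression" is a (label, registers_used) pair; the table maps a
# destination register number to the expression last recorded for it.

def invalidate_uses(table, reg):
    """Drop every recorded expression that uses register `reg`."""
    return {dest: expr for dest, expr in table.items() if reg not in expr[1]}

def process_parallel(sets, invalidate_at_end):
    """Record the sets of one PARALLEL under one of two invalidation orders."""
    table = {}
    for dest, expr in sets:
        if not invalidate_at_end:
            # Relaxed order: invalidate users of dest before recording the set,
            # so later sets in the same PARALLEL are not affected.
            table = invalidate_uses(table, dest)
        table[dest] = expr
    if invalidate_at_end:
        # Order described above as the existing behaviour: invalidate users
        # of every destination only after all sets have been recorded.
        for dest, _ in sets:
            table = invalidate_uses(table, dest)
    return table

# The two sets of insn 19: reg 93 from UNSPEC_WHILE_LO (uses reg 95), and
# reg 66 (cc) from the compare of UNSPEC_PTEST_PTRUE (uses reg 93).
SETS = [
    (93, ("UNSPEC_WHILE_LO", frozenset({95}))),
    (66, ("compare UNSPEC_PTEST_PTRUE", frozenset({93}))),
]

at_end = process_parallel(SETS, invalidate_at_end=True)
one_by_one = process_parallel(SETS, invalidate_at_end=False)
```

In this model, at_end keeps only the entry for reg 93, because the compare
recorded for reg 66 uses reg 93 and is dropped by the final invalidation,
while one_by_one keeps both entries.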

Bootstrap and regression testing of the current version are ongoing.

Thanks,
Kugan

Kugan Vivekanandarajah (2):
  [PR88836][aarch64] Set CC_REGNUM instead of clobber
  [PR88836][aarch64] Fix CSE to process parallel rtx dest one by one

 gcc/config/aarch64/aarch64-sve.md          |  9 +++-
 gcc/cse.c                                  | 67 ++++++++++++++++++++++++++----
 gcc/testsuite/gcc.target/aarch64/pr88836.c | 14 +++++++
 3 files changed, 80 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr88836.c

-- 
2.7.4
