[PATCH v2, RFC] fsra: gimple final sra pass for parameters and returns
Hi,

As is known, there are a few PRs (meta-bug PR101926) about accessing
aggregate parameters/returns that are passed through registers.

We could use the current SRA pass in a special mode right before
RTL expansion for the incoming/outgoing part, as suggested in:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637935.html

Compared to the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646625.html
this version enhances expand_ARG_PARTS so that bootstrap passes on
aarch64, and merges the previous three patches into one patch for
review.

As mentioned in the previous version:
This patch uses the IFNs ARG_PARTS and SET_RET_PARTS for parameters
and returns, and expands the IFNs according to the incoming/outgoing
registers.

Bootstrapped/regtested on ppc64{,le}, x86_64 and aarch64.

Again, there are a few things that could be enhanced for this patch:
* Multi-register access
* Parameter access across calls
* Optimization of accesses to parameters that are in memory
* Checking more cases/targets

I would like to ask for comments to avoid any major flaw.

BR,
Jeff (Jiufu Guo)


	PR target/108073
	PR target/65421
	PR target/69143

gcc/ChangeLog:

	* cfgexpand.cc (expand_value_return): Update for rtx eq checking.
	(expand_return): Update for scalarized returns.
	* internal-fn.cc (query_position_in_parallel): New function.
	(construct_reg_seq): New function.
	(get_incoming_element): New function.
	(reference_alias_ptr_type): Extern declare.
	(expand_ARG_PARTS): New IFN expand.
	(store_outgoing_element): New function.
	(expand_SET_RET_PARTS): New IFN expand.
	(expand_SET_RET_LAST_PARTS): New IFN expand.
	* internal-fn.def (ARG_PARTS): New IFN.
	(SET_RET_PARTS): New IFN.
	(SET_RET_LAST_PARTS): New IFN.
	* passes.def (pass_sra_final): Add new pass.
	* tree-pass.h (make_pass_sra_final): New function.
	* tree-sra.cc (enum sra_mode): New enum item SRA_MODE_FINAL_INTRA.
	(build_accesses_from_assign): Accept SRA_MODE_FINAL_INTRA.
	(scan_function): Update for argument in fsra.
	(find_var_candidates): Collect candidates for SRA_MODE_FINAL_INTRA.
	(analyze_access_subtree): Update analysis for fsra.
	(generate_subtree_copies): Update to generate new IFNs.
	(final_intra_sra): New function.
	(class pass_sra_final): New class.
	(make_pass_sra_final): New function.

gcc/testsuite/ChangeLog:

	* g++.target/powerpc/pr102024.C: Update instructions.
	* gcc.target/powerpc/pr108073-1.c: New test.
	* gcc.target/powerpc/pr108073.c: New test.
	* gcc.target/powerpc/pr65421.c: New test.
	* gcc.target/powerpc/pr69143.c: New test.

---
 gcc/cfgexpand.cc                              |   6 +-
 gcc/internal-fn.cc                            | 254 ++
 gcc/internal-fn.def                           |   9 +
 gcc/passes.def                                |   2 +
 gcc/tree-pass.h                               |   1 +
 gcc/tree-sra.cc                               | 157 ++-
 gcc/testsuite/g++.target/powerpc/pr102024.C   |   2 +-
 gcc/testsuite/gcc.target/powerpc/pr108073-1.c |  76 ++
 gcc/testsuite/gcc.target/powerpc/pr108073.c   |  74 +
 gcc/testsuite/gcc.target/powerpc/pr65421.c    |  10 +
 gcc/testsuite/gcc.target/powerpc/pr69143.c    |  23 ++
 11 files changed, 598 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr69143.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index eef565eddb5..1ec6c2d8102 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -3759,7 +3759,7 @@ expand_value_return (rtx val)
 
   tree decl = DECL_RESULT (current_function_decl);
   rtx return_reg = DECL_RTL (decl);
-  if (return_reg != val)
+  if (!rtx_equal_p (return_reg, val))
     {
       tree funtype = TREE_TYPE (current_function_decl);
       tree type = TREE_TYPE (decl);
@@ -3832,6 +3832,10 @@ expand_return (tree retval)
      been stored into it, so we don't have to do anything special.  */
   if (TREE_CODE (retval_rhs) == RESULT_DECL)
     expand_value_return (result_rtl);
+  /* return is scalarized by fsra: TODO use FLAG.  */
+  else if (VAR_P (retval_rhs)
+           && rtx_equal_p (result_rtl, DECL_RTL (retval_rhs)))
+    expand_null_return_1 ();
 
   /* If the result is an aggregate that is being returned in one (or more)
      registers, load the registers here.  */
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index fcf47c7fa12..905ee7da005 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3394,6 +3394,260 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
     }
 }
 
+/* In the parallel rtx register series REGS, compute the register position for
+   given {BITPOS, B
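
For readers new to the thread, the kind of source code these PRs concern can be sketched roughly as below. This is only an illustration, not one of the tests added by the patch; the names pair_t, second_of and swap_pair are made up for this example. The assumption is a target whose ABI passes and returns such a small aggregate in registers, so that reading one field of the parameter, or building the return value, would ideally need no trip through memory.

```c
/* Hypothetical small aggregate; on targets whose ABI passes it in
   registers, each field arrives in its own incoming register.  */
typedef struct
{
  double x;
  double y;
} pair_t;

/* Reads a single field of a by-value aggregate parameter.  The access
   to p.y is the "incoming" side discussed in the cover letter.  */
double
second_of (pair_t p)
{
  return p.y;
}

/* Returns an aggregate by value.  Filling the fields of r is the
   "outgoing" side: each field would ideally be written straight to
   the corresponding return register.  */
pair_t
swap_pair (pair_t p)
{
  pair_t r = { p.y, p.x };
  return r;
}
```

Comparing the assembly for these two functions (for example with -O2 -S) with and without the patch would be one way to see what the final SRA pass changes on a given target.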
Re: [PATCH v2, RFC] fsra: gimple final sra pass for parameters and returns
Hi,

Ping this RFC, look forward to comments for the incoming stage1...

Jeff (Jiufu Guo)
Re: [PATCH v2, RFC] fsra: gimple final sra pass for parameters and returns
Hi,

Ping this patch, look forward to comments.

Jeff (Jiufu Guo)