[PATCH] RISC-V: Add fault first load C/C++ support
From: Ju-Zhe Zhong

gcc/ChangeLog:

        * config/riscv/riscv-builtins.cc (riscv_gimple_fold_builtin): New function.
        * config/riscv/riscv-protos.h (riscv_gimple_fold_builtin): Ditto.
        (gimple_fold_builtin): Ditto.
        * config/riscv/riscv-vector-builtins-bases.cc (class read_vl): New class.
        (class vleff): Ditto.
        (BASE): Ditto.
        * config/riscv/riscv-vector-builtins-bases.h: Ditto.
        * config/riscv/riscv-vector-builtins-functions.def (read_vl): Ditto.
        (vleff): Ditto.
        * config/riscv/riscv-vector-builtins-shapes.cc (struct read_vl_def): Ditto.
        (struct fault_load_def): Ditto.
        (SHAPE): Ditto.
        * config/riscv/riscv-vector-builtins-shapes.h: Ditto.
        * config/riscv/riscv-vector-builtins.cc (rvv_arg_type_info::get_tree_type): Add size_ptr.
        (gimple_folder::gimple_folder): New class.
        (gimple_folder::fold): Ditto.
        (gimple_fold_builtin): New function.
        (get_read_vl_instance): Ditto.
        (get_read_vl_decl): Ditto.
        * config/riscv/riscv-vector-builtins.def (size_ptr): Add size_ptr.
        * config/riscv/riscv-vector-builtins.h (class gimple_folder): New class.
        (get_read_vl_instance): New function.
        (get_read_vl_decl): Ditto.
        * config/riscv/riscv-vsetvl.cc (fault_first_load_p): Ditto.
        (read_vl_insn_p): Ditto.
        (available_occurrence_p): Ditto.
        (backward_propagate_worthwhile_p): Ditto.
        (gen_vsetvl_pat): Adapt for vleff support.
        (get_forward_read_vl_insn): New function.
        (get_backward_fault_first_load_insn): Ditto.
        (source_equal_p): Adapt for vleff support.
        (first_ratio_invalid_for_second_sew_p): Remove.
        (first_ratio_invalid_for_second_lmul_p): Ditto.
        (first_lmul_less_than_second_lmul_p): Ditto.
        (first_ratio_less_than_second_ratio_p): Ditto.
        (support_relaxed_compatible_p): New function.
        (vector_insn_info::operator>): Remove.
        (vector_insn_info::operator>=): Refine.
        (vector_insn_info::parse_insn): Adapt for vleff support.
        (vector_insn_info::compatible_p): Ditto.
        (vector_insn_info::update_fault_first_load_avl): New function.
        (pass_vsetvl::transfer_after): Adapt for vleff support.
        (pass_vsetvl::demand_fusion): Ditto.
        (pass_vsetvl::cleanup_insns): Ditto.
        * config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Remove redundant conditions.
        * config/riscv/riscv-vsetvl.h (struct demands_cond): New function.
        * config/riscv/riscv.cc (TARGET_GIMPLE_FOLD_BUILTIN): New target hook.
        * config/riscv/riscv.md: Adapt for vleff support.
        * config/riscv/t-riscv: Ditto.
        * config/riscv/vector-iterators.md: New iterator.
        * config/riscv/vector.md (read_vlsi): New pattern.
        (read_vldi_zero_extend): Ditto.
        (@pred_fault_load): Ditto.

---
 gcc/config/riscv/riscv-builtins.cc            |  31 ++
 gcc/config/riscv/riscv-protos.h               |   2 +
 .../riscv/riscv-vector-builtins-bases.cc      |  86 -
 .../riscv/riscv-vector-builtins-bases.h       |   2 +
 .../riscv/riscv-vector-builtins-functions.def |   7 +-
 .../riscv/riscv-vector-builtins-shapes.cc     |  58
 .../riscv/riscv-vector-builtins-shapes.h      |   2 +
 gcc/config/riscv/riscv-vector-builtins.cc     |  83 -
 gcc/config/riscv/riscv-vector-builtins.def    |   1 +
 gcc/config/riscv/riscv-vector-builtins.h      |  25 ++
 gcc/config/riscv/riscv-vsetvl.cc              | 323 +++--
 gcc/config/riscv/riscv-vsetvl.def             | 189 +-
 gcc/config/riscv/riscv-vsetvl.h               |  10 +-
 gcc/config/riscv/riscv.cc                     |   3 +
 gcc/config/riscv/riscv.md                     |   8 +-
 gcc/config/riscv/t-riscv                      |   3 +-
 gcc/config/riscv/vector-iterators.md          |   1 +
 gcc/config/riscv/vector.md                    |  53 ++-
 18 files changed, 575 insertions(+), 312 deletions(-)

diff --git a/gcc/config/riscv/riscv-builtins.cc b/gcc/config/riscv/riscv-builtins.cc
index 390f8a38309..b1c4b7547d7 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -38,6 +38,9 @@ along with GCC; see the file COPYING3. If not see
 #include "expr.h"
 #include "langhooks.h"
 #include "tm_p.h"
+#include "backend.h"
+#include "gimple.h"
+#include "gimple-iterator.h"

 /* Macros to create an enumeration identifier for a function prototype.  */
 #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE
@@ -332,6 +335,34 @@ riscv_expand_builtin_direct (enum insn_code icode, rtx target, tree exp,
   return riscv_expand_builtin_insn (icode, opno, ops, has_target_p);
 }

+/* Implement TARGET_GIMPLE_FOLD_BUILTIN.  */
+
+bool
+riscv_gimple_fold_builtin (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree fndecl = gimple_call_fndecl (stmt);
+  unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
+  unsigned i
Re: [PATCH] RISC-V: Add fault first load C/C++ support
On 7 March 2023 07:21:23 CET, juzhe.zh...@rivai.ai wrote:
>From: Ju-Zhe Zhong
>

>+class vleff : public function_base
>+{
>+public:
>+  unsigned int call_properties (const function_instance &) const override
>+  {
>+    return CP_READ_MEMORY | CP_WRITE_CSR;
>+  }
>+
>+  gimple *fold (gimple_folder &f) const override
>+  {
>+    /* fold vleff (const *base, size_t *new_vl, size_t vl)
>+
>+       ====> vleff (const *base, size_t vl)
>+             new_vl = MEM_REF[read_vl ()].  */
>+
>+    auto_vec<tree, 8> vargs;

Where is that magic 8 coming from?

Wouldn't you rather have one temporary to hold this manually CSEd

  nargs = gimple_call_num_args (f.call) - 2;

which you would use throughout this function as it does not seem to change?

Would you reserve something based off nargs for the auto_vec above?
If not, please add a comment where the 8 comes from?

thanks,

>+
>+    for (unsigned i = 0; i < gimple_call_num_args (f.call); i++)
>+      {
>+        /* Exclude size_t *new_vl argument.  */
>+        if (i == gimple_call_num_args (f.call) - 2)
>+          continue;
>+
>+        vargs.quick_push (gimple_call_arg (f.call, i));
>+      }
>+
>+    gimple *repl = gimple_build_call_vec (gimple_call_fn (f.call), vargs);
>+    gimple_call_set_lhs (repl, f.lhs);
>+
>+    /* Handle size_t *new_vl by read_vl.  */
>+    tree new_vl = gimple_call_arg (f.call, gimple_call_num_args (f.call) - 2);
>+    if (integer_zerop (new_vl))
>+      {
>+        /* This case happens when user passes the nullptr to new_vl argument.
>+           In this case, we just need to ignore the new_vl argument and return
>+           vleff instruction directly.  */
>+        return repl;
>+      }
>+
>+    tree tmp_var = create_tmp_var (size_type_node, "new_vl");
>+    tree decl = get_read_vl_decl ();
>+    gimple *g = gimple_build_call (decl, 0);
>+    gimple_call_set_lhs (g, tmp_var);
>+    tree indirect
>+      = fold_build2 (MEM_REF, size_type_node,
>+                     gimple_call_arg (f.call,
>+                                      gimple_call_num_args (f.call) - 2),
>+                     build_int_cst (build_pointer_type (size_type_node), 0));
>+    gassign *assign = gimple_build_assign (indirect, tmp_var);
>+
>+    gsi_insert_after (f.gsi, assign, GSI_SAME_STMT);
>+    gsi_insert_after (f.gsi, g, GSI_SAME_STMT);
>+    return repl;
>+  }
>+
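
The review point about hoisting the repeated index computation can be illustrated outside GCC. What follows is only a minimal sketch: `arg_list` and `drop_new_vl_arg` are hypothetical names, and strings stand in for the GIMPLE argument trees the real fold operates on. It computes the position of the `new_vl` argument once and reuses it while copying the remaining arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a call's argument list; the real fold works on
// GIMPLE trees, not strings.
using arg_list = std::vector<std::string>;

// Return a copy of ARGS without its second-to-last entry, which models
// dropping the `size_t *new_vl` argument of vleff.
arg_list drop_new_vl_arg (const arg_list &args)
{
  assert (args.size () >= 2);

  // Computed once, as the review suggests, rather than re-evaluated on
  // every loop iteration and again after the loop.
  const std::size_t new_vl_idx = args.size () - 2;

  arg_list out;
  out.reserve (args.size () - 1);
  for (std::size_t i = 0; i < args.size (); i++)
    {
      if (i == new_vl_idx)
        continue;
      out.push_back (args[i]);
    }
  return out;
}
```

For an input of `{"base", "new_vl", "vl"}` the function returns `{"base", "vl"}`, which mirrors the argument list of the replacement call described in the fold comment.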
Re: [PATCH] RISC-V: Add fault first load C/C++ support
Bernhard Reutner-Fischer via Gcc-patches writes:
> On 7 March 2023 07:21:23 CET, juzhe.zh...@rivai.ai wrote:
>>From: Ju-Zhe Zhong
>>
>
>>+class vleff : public function_base
>>+{
>>+public:
>>+  unsigned int call_properties (const function_instance &) const override
>>+  {
>>+    return CP_READ_MEMORY | CP_WRITE_CSR;
>>+  }
>>+
>>+  gimple *fold (gimple_folder &f) const override
>>+  {
>>+    /* fold vleff (const *base, size_t *new_vl, size_t vl)
>>+
>>+       ====> vleff (const *base, size_t vl)
>>+             new_vl = MEM_REF[read_vl ()].  */
>>+
>>+    auto_vec<tree, 8> vargs;
>
> Where is that magic 8 coming from?

I'm probably not saying anything you don't already know, but:

The second template parameter is just an optimisation.  It reserves a
"small" amount of stack space for the vector, to reduce the likelihood
that a full malloc/free will be needed.  The vector can still grow
arbitrarily large.

So these numbers are always just gut instinct for what a reasonable
common case would be.  There's no particular science to it, and no
particular need to explain away the value.

The second parameter is still useful if the vector size is known at
construction time.  When I've looked at cc1 and cc1plus profiles in the
past, malloc has often been a significant contributor.  Trying to avoid
malloc/free cycles for "petty" arrays seems like a worthwhile thing to do.

Thanks,
Richard
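
The idea of reserving a small inline buffer and falling back to the heap only on overflow can be sketched with a toy container. This is not GCC's auto_vec and makes no claim about how auto_vec is implemented; `small_vec`, `on_heap` and the member names are hypothetical, and the element type is assumed to be default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy illustration only; this is NOT GCC's auto_vec.  The first N elements
// live in storage embedded in the object (on the stack for a local
// variable); anything beyond that spills into a heap-backed std::vector.
template <typename T, std::size_t N>
class small_vec
{
public:
  void push_back (const T &value)
  {
    if (m_size < N)
      m_inline[m_size] = value;   // fits in the reserved inline slots
    else
      m_heap.push_back (value);   // overflow: first use allocates on the heap
    ++m_size;
  }

  const T &operator[] (std::size_t i) const
  {
    assert (i < m_size);
    return i < N ? m_inline[i] : m_heap[i - N];
  }

  std::size_t size () const { return m_size; }

  // True once at least one element has spilled past the inline buffer.
  bool on_heap () const { return !m_heap.empty (); }

private:
  T m_inline[N] {};
  std::vector<T> m_heap;
  std::size_t m_size = 0;
};
```

With `small_vec<int, 8>`, the first eight pushes stay in the inline buffer and `on_heap ()` remains false; the ninth push still succeeds but goes through the heap. That matches the point that the second template parameter only sets the common-case size and does not cap how large the vector can grow.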
Re: Re: [PATCH] RISC-V: Add fault first load C/C++ support
Address comment and fix it in this V2 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613608.html

juzhe.zh...@rivai.ai

From: Bernhard Reutner-Fischer
Date: 2023-03-09 05:16
To: juzhe.zhong; gcc-patches
CC: kito.cheng; Ju-Zhe Zhong
Subject: Re: [PATCH] RISC-V: Add fault first load C/C++ support