Re: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
I think it's harmless to let this patch in GCC-14. So LGTM from my side to land this path in GCC-14.. juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-03-26 01:07 To: Jeff Law; 钟居哲; gcc-patches; palmer; kito.cheng CC: rdapp.gcc Subject: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores. > So where do we stand with this? Juzhe asked it to be rebased, but I > don't see a rebased version in my inbox and I don't see anything that > looks like this on the trunk. I missed this one and figured as we're pretty late in the cycle it can wait until GCC 15. Therefore let's call it "deferred". Regards Robin
Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
> So where do we stand with this? Juzhe asked it to be rebased, but I > don't see a rebased version in my inbox and I don't see anything that > looks like this on the trunk. I missed this one and figured as we're pretty late in the cycle it can wait until GCC 15. Therefore let's call it "deferred". Regards Robin
Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
On 3/1/24 8:07 AM, Robin Dapp wrote: + /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_4; + const int segment_permute_8; Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally. Attached is the proper version, still called it v3... Regards Robin Subject: [PATCH v3] RISC-V: Add initial cost handling for segment loads/stores. This patch makes segment loads and stores more expensive. It adds segment_permute_2 as well as 3 to 8 cost fields to the common vector costs and adds handling to adjust_stmt_cost. gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_permute cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_permute_[2-8] to 1. So where do we stand with this? Juzhe asked it to be rebased, but I don't see a rebased version in my inbox and I don't see anything that looks like this on the trunk. jeff
Re: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
Could you rebase to the trunk ? I don't think segment load store cost depends on previous patch you sent. juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-03-01 23:07 To: 钟居哲; gcc-patches; palmer; kito.cheng CC: rdapp.gcc; Jeff Law Subject: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores. > + /* Segment load/store permute cost. */ > + const int segment_permute_2; > + const int segment_permute_4; > + const int segment_permute_8; > > Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally. Attached is the proper version, still called it v3... Regards Robin Subject: [PATCH v3] RISC-V: Add initial cost handling for segment loads/stores. This patch makes segment loads and stores more expensive. It adds segment_permute_2 as well as 3 to 8 cost fields to the common vector costs and adds handling to adjust_stmt_cost. gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_permute cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_permute_[2-8] to 1. --- gcc/config/riscv/riscv-protos.h| 9 ++ gcc/config/riscv/riscv-vector-costs.cc | 163 ++--- gcc/config/riscv/riscv.cc | 14 +++ 3 files changed, 144 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..90d1fcbb3b1 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,15 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_3; + const int segment_permute_4; + const int segment_permute_5; + const int segment_permute_6; + const int segment_permute_7; + const int segment_permute_8; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..f4da213fe14 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,25 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if (stmt_info + && (kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1086,115 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing - as opposed to scalar loads and stores. - If the address depends on a variable we need an additional - add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + /* Segment loads and stores. When the group size is > 1 + the vectorizer will add a vector load/store statement for + each vector in the group. Here we additionally add permute + costs for each. */ + /* TODO: Indexed and ordered/unordered cost. */ + int group_size = segment_loadstore_group_size (kind, stmt_info); + if (group_size > 1) + { + switch (group_size) + { + case 2: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_2; + else + stmt_cost += costs->vls->segment_permute_2; + break; + case 3: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_3; + else + stmt_cost += costs->vls->segment_permute_3; + break; + case 4: + if (riscv_v_
Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
> + /* Segment load/store permute cost. */ > + const int segment_permute_2; > + const int segment_permute_4; > + const int segment_permute_8; > > Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 No idea why I posted that (wrong) version, I used it for some testing locally. Attached is the proper version, still called it v3... Regards Robin Subject: [PATCH v3] RISC-V: Add initial cost handling for segment loads/stores. This patch makes segment loads and stores more expensive. It adds segment_permute_2 as well as 3 to 8 cost fields to the common vector costs and adds handling to adjust_stmt_cost. gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_permute cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_permute_[2-8] to 1. --- gcc/config/riscv/riscv-protos.h| 9 ++ gcc/config/riscv/riscv-vector-costs.cc | 163 ++--- gcc/config/riscv/riscv.cc | 14 +++ 3 files changed, 144 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..90d1fcbb3b1 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,15 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_3; + const int segment_permute_4; + const int segment_permute_5; + const int segment_permute_6; + const int segment_permute_7; + const int segment_permute_8; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..f4da213fe14 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,25 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if (stmt_info + && (kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1086,115 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing -as opposed to scalar loads and stores. -If the address depends on a variable we need an additional -add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + /* Segment loads and stores. When the group size is > 1 +the vectorizer will add a vector load/store statement for +each vector in the group. Here we additionally add permute +costs for each. */ + /* TODO: Indexed and ordered/unordered cost. */ + int group_size = segment_loadstore_group_size (kind, stmt_info); + if (group_size > 1) + { + switch (group_size) + { + case 2: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_2; + else + stmt_cost += costs->vls->segment_permute_2; + break; + case 3: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_3; + else + stmt_cost += costs->vls->segment_permute_3; + break; + case 4: + if (ri
Re: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
+ /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_4; + const int segment_permute_8; Why do we only have 2/4/8, I think we should have 2/3/4/5/6/7/8 juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-02-28 05:27 To: juzhe.zh...@rivai.ai; gcc-patches; palmer; kito.cheng CC: rdapp.gcc; jeffreyalaw Subject: Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores. > This patch looks odd to me. > I don't see memrefs in the trunk code. It's on top of the vle/vse offset handling patch from a while back that I haven't committed yet. > Also, I prefer list all cost in cost tune info for NF = 2 ~ 8 like ARM SVE > does: I don't mind having separate costs for each but I figured they scale anyway with the number of vectors already. Attached v2 is more similar to aarch64. Regards Robin Subject: [PATCH v2] RISC-V: Add initial cost handling for segment loads/stores. This patch makes segment loads and stores more expensive. It adds segment_permute_2 (as well as 4 and 8) cost fields to the common vector costs and adds handling to adjust_stmt_cost. gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_permute cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_permute_[248] to 1. --- gcc/config/riscv/riscv-protos.h| 5 + gcc/config/riscv/riscv-vector-costs.cc | 139 + gcc/config/riscv/riscv.cc | 6 ++ 3 files changed, 108 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..9b737aca1a3 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,11 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_4; + const int segment_permute_8; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..c8178d71101 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,25 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if (stmt_info + && (kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1086,91 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing - as opposed to scalar loads and stores. - If the address depends on a variable we need an additional - add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + /* Segment loads and stores. When the group size is > 1 + the vectorizer will add a vector load/store statement for + each vector in the group. Here we additionally add permute + costs for each. */ + /* TODO: Indexed and ordered/unordered cost. */ + int group_size = segment_loadstore_group_size (kind, stmt_info); + if (group_size > 1) + { + switch (group_size) + { + case 2: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_2; + else + stmt_cost += costs->vls->segment_permute_2; + break; + case 4: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_4; + else + stmt_cost += costs->vls->segment_per
Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
> This patch looks odd to me. > I don't see memrefs in the trunk code. It's on top of the vle/vse offset handling patch from a while back that I haven't committed yet. > Also, I prefer list all cost in cost tune info for NF = 2 ~ 8 like ARM SVE > does: I don't mind having separate costs for each but I figured they scale anyway with the number of vectors already. Attached v2 is more similar to aarch64. Regards Robin Subject: [PATCH v2] RISC-V: Add initial cost handling for segment loads/stores. This patch makes segment loads and stores more expensive. It adds segment_permute_2 (as well as 4 and 8) cost fields to the common vector costs and adds handling to adjust_stmt_cost. gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_permute cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_permute_[248] to 1. --- gcc/config/riscv/riscv-protos.h| 5 + gcc/config/riscv/riscv-vector-costs.cc | 139 + gcc/config/riscv/riscv.cc | 6 ++ 3 files changed, 108 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..9b737aca1a3 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,11 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store permute cost. */ + const int segment_permute_2; + const int segment_permute_4; + const int segment_permute_8; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..c8178d71101 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,25 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if (stmt_info + && (kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1086,91 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing -as opposed to scalar loads and stores. -If the address depends on a variable we need an additional -add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + /* Segment loads and stores. When the group size is > 1 +the vectorizer will add a vector load/store statement for +each vector in the group. Here we additionally add permute +costs for each. */ + /* TODO: Indexed and ordered/unordered cost. */ + int group_size = segment_loadstore_group_size (kind, stmt_info); + if (group_size > 1) + { + switch (group_size) + { + case 2: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_2; + else + stmt_cost += costs->vls->segment_permute_2; + break; + case 4: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost += costs->vla->segment_permute_4; + else + stmt_cost += costs->vls->segment_permute_4; + break; + case 8: + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + s
Re: [PATCH] RISC-V: Add initial cost handling for segment loads/stores.
This patch looks odd to me. I don't see memrefs in the trunk code. Also, I prefer list all cost in cost tune info for NF = 2 ~ 8 like ARM SVE does: /* Detect cases in which a vector load or store represents an LD[234] or ST[234] instruction. */ switch (aarch64_ld234_st234_vectors (kind, stmt_info)) { case 2: stmt_cost += simd_costs->ld2_st2_permute_cost; break; case 3: stmt_cost += simd_costs->ld3_st3_permute_cost; break; case 4: stmt_cost += simd_costs->ld4_st4_permute_cost; break; } juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-02-26 23:54 To: gcc-patches; palmer; Kito Cheng; juzhe.zh...@rivai.ai CC: rdapp.gcc; jeffreyalaw Subject: [PATCH] RISC-V: Add initial cost handling for segment loads/stores. Hi, This has been sitting on my local tree - I've been wanting to post it for a while but somehow forgot. This patch makes segment loads and stores more expensive. It adds segment_load and segment_store cost fields to the common vector costs and adds handling to adjust_stmt_cost. In the future we could handle this in a more fine-grained manner but let's start somehow. Regtested on rv64. Regards Robin gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_[load/store]_cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_[load/store]_cost to 1. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c: Expect m4 instead of m2. --- gcc/config/riscv/riscv-protos.h | 4 + gcc/config/riscv/riscv-vector-costs.cc| 127 -- gcc/config/riscv/riscv.cc | 4 + .../vect/costmodel/riscv/rvv/pr113112-4.c | 4 +- 4 files changed, 95 insertions(+), 44 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..2e8ab9990a8 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,10 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store cost. */ + const int segment_load_cost; + const int segment_store_cost; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..d3c12444773 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,24 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if ((kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1085,80 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing - as opposed to scalar loads and stores. - If the address depends on a variable we need an additional - add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + int group_size; + if ((group_size += segment_loadstore_group_size (kind, stmt_info)) > 1) { - tree ref; - if (TREE_CODE (dr->ref) != MEM_REF - || !(ref = TREE_OPERAND (dr->ref, 0)) - || TREE_CODE (ref) != SSA_NAME) - break; + /* Segment loads and stores. When the group size is > 1 + the vectorizer will add a vector load/store statement for + each vector in the group. Note that STMT_COST is + overwritten here rather than adjusted. */ + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost + = (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)) + ? costs->vla->segment_load_co
[PATCH] RISC-V: Add initial cost handling for segment loads/stores.
Hi, This has been sitting on my local tree - I've been wanting to post it for a while but somehow forgot. This patch makes segment loads and stores more expensive. It adds segment_load and segment_store cost fields to the common vector costs and adds handling to adjust_stmt_cost. In the future we could handle this in a more fine-grained manner but let's start somehow. Regtested on rv64. Regards Robin gcc/ChangeLog: * config/riscv/riscv-protos.h (struct common_vector_cost): Add segment_[load/store]_cost. * config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Handle segment loads/stores. * config/riscv/riscv.cc: Initialize segment_[load/store]_cost to 1. gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/pr113112-4.c: Expect m4 instead of m2. --- gcc/config/riscv/riscv-protos.h | 4 + gcc/config/riscv/riscv-vector-costs.cc| 127 -- gcc/config/riscv/riscv.cc | 4 + .../vect/costmodel/riscv/rvv/pr113112-4.c | 4 +- 4 files changed, 95 insertions(+), 44 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 80efdf2b7e5..2e8ab9990a8 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -218,6 +218,10 @@ struct common_vector_cost const int gather_load_cost; const int scatter_store_cost; + /* Segment load/store cost. */ + const int segment_load_cost; + const int segment_store_cost; + /* Cost of a vector-to-scalar operation. */ const int vec_to_scalar_cost; diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc index adf9c197df5..d3c12444773 100644 --- a/gcc/config/riscv/riscv-vector-costs.cc +++ b/gcc/config/riscv/riscv-vector-costs.cc @@ -1043,6 +1043,24 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const return vector_costs::better_main_loop_than_p (other); } +/* Returns the group size i.e. the number of vectors to be loaded by a + segmented load/store instruction. Return 0 if it is no segmented + load/store. */ +static int +segment_loadstore_group_size (enum vect_cost_for_stmt kind, + stmt_vec_info stmt_info) +{ + if ((kind == vector_load || kind == vector_store) + && STMT_VINFO_DATA_REF (stmt_info)) +{ + stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info); + if (stmt_info + && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_LOAD_STORE_LANES) + return DR_GROUP_SIZE (stmt_info); +} + return 0; +} + /* Adjust vectorization cost after calling riscv_builtin_vectorization_cost. For some statement, we would like to further fine-grain tweak the cost on top of riscv_builtin_vectorization_cost handling which doesn't have any @@ -1067,55 +1085,80 @@ costs::adjust_stmt_cost (enum vect_cost_for_stmt kind, loop_vec_info loop, case vector_load: case vector_store: { - /* Unit-stride vector loads and stores do not have offset addressing -as opposed to scalar loads and stores. -If the address depends on a variable we need an additional -add/sub for each load/store in the worst case. */ - if (stmt_info && stmt_info->stmt) + if (stmt_info && stmt_info->stmt && STMT_VINFO_DATA_REF (stmt_info)) { - data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); - class loop *father = stmt_info->stmt->bb->loop_father; - if (!loop && father && !father->inner && father->superloops) + int group_size; + if ((group_size + = segment_loadstore_group_size (kind, stmt_info)) > 1) { - tree ref; - if (TREE_CODE (dr->ref) != MEM_REF - || !(ref = TREE_OPERAND (dr->ref, 0)) - || TREE_CODE (ref) != SSA_NAME) - break; + /* Segment loads and stores. When the group size is > 1 +the vectorizer will add a vector load/store statement for +each vector in the group. Note that STMT_COST is +overwritten here rather than adjusted. */ + if (riscv_v_ext_vector_mode_p (loop->vector_mode)) + stmt_cost + = (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)) +? costs->vla->segment_load_cost +: costs->vla->segment_store_cost); + else + stmt_cost + = (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)) +? costs->vls->segment_load_cost +: costs->vls->segment_store_cost); + break; + /* TODO: Indexed and ordered/unordered cost. */ + } + else + { + /* Unit-str