[PATCH v1] Gen-Match: Fix gen_kids_1 right hand braces mis-alignment

2024-05-25 Thread pan2 . li
From: Pan Li Notice some mis-alignment for gen_kids_1 right hand braces as below: if ((_q50 == _q20 && ! TREE_SIDE_EFFECTS (... { if ((_q51 == _q21 && ! TREE_SIDE_EFFECTS (... {

[PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-22 Thread pan2 . li
From: Pan Li There are sorts of match pattern for SAT related cases, there will be some duplicated code to check the dest, op_0, op_1 are same tree types. Aka ternary tree type matches. Thus, add overloaded types_match func do this and avoid match code duplication. The below test suites are

[PATCH v2] Match: Support branch form for unsigned SAT_ADD

2024-05-22 Thread pan2 . li
From: Pan Li This patch would like to support the branch form for unsigned SAT_ADD. For example as below: uint64_t sat_add (uint64_t x, uint64_t y) { return (uint64_t) (x + y) >= x ? (x + y) : -1; } Different to the branchless version, we leverage the simplify to convert the branch version

[PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-21 Thread pan2 . li
From: Pan Li This patch would like to support the __builtin_add_overflow branch form for unsigned SAT_ADD. For example as below: uint64_t sat_add (uint64_t x, uint64_t y) { uint64_t ret; return __builtin_add_overflow (x, y, ) ? -1 : ret; } Different to the branchless version, we leverage

[PATCH v1 1/2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-21 Thread pan2 . li
From: Pan Li This patch would like to support the __builtin_add_overflow branch form for unsigned SAT_ADD. For example as below: uint64_t sat_add (uint64_t x, uint64_t y) { uint64_t ret; return __builtin_add_overflow (x, y, ) ? -1 : ret; } Different to the branchless version, we leverage

[PATCH v1 2/2] RISC-V: Add test cases for __builtin_add_overflow branch form unsigned SAT_ADD

2024-05-21 Thread pan2 . li
From: Pan Li After we support __builtin_add_overflow branch form unsigned SAT_ADD from the middle end. Add more tests case to cover the functionarlities. The below test suites are passed. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h:

[PATCH v3] Match: Extract ternary_integer_types_match_p helper func [NFC]

2024-05-20 Thread pan2 . li
From: Pan Li There are sorts of match pattern for SAT related cases, there will be some duplicated code to check the dest, op_0, op_1 are same tree types. Aka ternary tree type matches. Thus, extract one helper function to do this and avoid match code duplication. The below test suites are

[PATCH v1 2/2] RISC-V: Add test cases for branch form unsigned SAT_ADD

2024-05-20 Thread pan2 . li
From: Pan Li After we support branch form unsigned SAT_ADD from the middle end. Add more tests case to cover the functionarlities. The below test suites are passed. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add branch form test

[PATCH v1 1/2] Match: Support branch form for unsigned SAT_ADD

2024-05-20 Thread pan2 . li
From: Pan Li This patch would like to support the branch form for unsigned SAT_ADD. For example as below: uint64_t sat_add (uint64_t x, uint64_t y) { return (uint64_t) (x + y) >= x ? (x + y) : -1; } Different to the branchless version, we leverage the simplify to convert the branch version

[PATCH v2] Match: Extract integer_types_ternary_match helper to avoid code dup [NFC]

2024-05-20 Thread pan2 . li
From: Pan Li There are sorts of match pattern for SAT related cases, there will be some duplicated code to check the dest, op_0, op_1 are same tree types. Aka ternary tree type matches. Thus, extract one helper function to do this and avoid match code duplication. The below test suites are

[PATCH v1 2/2] RISC-V: Add test cases for __builtin_add_overflow branchless unsigned SAT_ADD

2024-05-19 Thread pan2 . li
From: Pan Li After we support branchless __builtin_add_overflow unsigned SAT_ADD from the middle end. Add more tests case to cover the functionarlities. The below test suites are passed. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/sat_arith.h: Add

[PATCH v1 1/2] Match: Support __builtin_add_overflow for branchless unsigned SAT_ADD

2024-05-19 Thread pan2 . li
From: Pan Li This patch would like to support the branchless form for unsigned SAT_ADD when leverage __builtin_add_overflow. For example as below: uint64_t sat_add_u(uint64_t x, uint64_t y) { uint64_t ret; uint64_t overflow = __builtin_add_overflow (x, y, ); return (T)(-overflow) | ret;

[PATCH v1] Match: Extract integer_types_ternary_match helper to avoid code dup [NFC]

2024-05-18 Thread pan2 . li
From: Pan Li There are sorts of match pattern for SAT related cases, there will be some duplicated code to check the dest, op_0, op_1 are same tree types. Aka ternary tree type matches. Thus, extract one helper function to do this and avoid match code duplication. The below test suites are

[PATCH v6] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-17 Thread pan2 . li
From: Pan Li Update in v6: * Rebase upstream for conflict. Log for v5: The patch implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below vector as example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i;

[PATCH v2 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-15 Thread pan2 . li
From: Pan Li After we support the loop lens for the vectorizable, we would like to implement the feature for the RISC-V target. Given below example: unsigned vect_a[1923]; unsigned vect_b[1923]; void test (unsigned limit, int n) { for (int i = 0; i < n; i++) { vect_b[i] = limit +

[PATCH v2 3/3] RISC-V: Enable vectorizable early exit testsuite

2024-05-15 Thread pan2 . li
From: Pan Li After we supported vectorizable early exit in RISC-V, we would like to enable the gcc vect test for vectorizable early test. The vect-early-break_124-pr114403.c failed to vectorize for now. Because that the __builtin_memcpy with 8 bytes failed to folded into int64 assignment

[PATCH v2 1/3] Vect: Support loop len in vectorizable early exit

2024-05-15 Thread pan2 . li
From: Pan Li This patch adds early break auto-vectorization support for target which use length on partial vectorization. Consider this following example: unsigned vect_a[802]; unsigned vect_b[802]; void test (unsigned x, int n) { for (int i = 0; i < n; i++) { vect_b[i] = x + i;

[PATCH v5 2/3] Vect: Support new IFN SAT_ADD for unsigned vector int

2024-05-14 Thread pan2 . li
From: Pan Li For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name usadd3 in vector mode. The riscv vector backend have insn "Vector Single-Width Saturating Add and Subtract" which can be

[PATCH v5 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-14 Thread pan2 . li
From: Pan Li The patch implement the SAT_ADD in the riscv backend as the sample for both the scalar and vector. Given below vector as example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (-

[PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-14 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: *

[committed] RISC-V: Fix format issue for trailing operator [NFC]

2024-05-13 Thread pan2 . li
From: Pan Li This patch would like to fix below format issue of trailing operator. === ERROR type #1: trailing operator (4 error(s)) === gcc/config/riscv/riscv-vector-builtins.cc:4641:39: if ((exts & RVV_REQUIRE_ELEN_FP_16) && gcc/config/riscv/riscv-vector-builtins.cc:4651:39: if ((exts &

[PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread pan2 . li
From: Pan Li This patch depends on below 2 patches. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651460.html After we supported vectorizable early exit in RISC-V, we would like to enable the gcc vect test for vectorizable

[PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread pan2 . li
From: Pan Li This patch depends on below middle-end implementation. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651459.html After we support the loop lens for the vectorizable, we would like to implement the feature for the RISC-V target. Given below example: unsigned vect_a[1923];

[PATCH v1 1/3] Vect: Support loop len in vectorizable early exit

2024-05-13 Thread pan2 . li
From: Pan Li This patch adds early break auto-vectorization support for target which use length on partial vectorization. Consider this following example: unsigned vect_a[802]; unsigned vect_b[802]; void test (unsigned x, int n) {  for (int i = 0; i < n; i++)  {    vect_b[i] = x + i;    

[PATCH v1] RISC-V: Bugfix ICE for RVV intrinisc vfw on _Float16 scalar

2024-05-11 Thread pan2 . li
From: Pan Li For the vfw vx format RVV intrinsic, the scalar type _Float16 also requires the zvfh extension. Unfortunately, we only check the vector tree type and miss the scalar _Float16 type checking. For example: vfloat32mf2_t test_vfwsub_wf_f32mf2(vfloat32mf2_t vs2, _Float16 rs1, size_t

[PATCH v1] RISC-V: Make full-vec-move1.c test robust for optimization

2024-05-08 Thread pan2 . li
From: Pan Li During investigate the support of early break autovec, we notice the test full-vec-move1.c will be optimized to 'return 0;' in main function body. Because somehow the value of V type is compiler time constant, and then the second loop will be considered as assert (true). Thus,

[PATCH v4 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-06 Thread pan2 . li
From: Pan Li This patch depends on below middle-end enabling patches for scalar and vector. https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650823.html The patch also implement the SAT_ADD in the riscv backend as the sample for

[PATCH v4 2/3] VECT: Support new IFN SAT_ADD for unsigned vector int

2024-05-06 Thread pan2 . li
From: Pan Li This patch depends on below scalar enabling patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650822.html For vectorize, we leverage the existing vect pattern recog to find the pattern similar to scalar and let the vectorizer to perform the rest part for standard name

[PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-06 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: *

[PATCH v4] DSE: Fix ICE after allow vector type in get_stored_val

2024-05-02 Thread pan2 . li
From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, the valididate_subreg treats the vector type's size is less than vector register as invalid. Then we will have ICE here. This patch would like to fix it by filter-out

[PATCH v3] DSE: Fix ICE after allow vector type in get_stored_val

2024-04-30 Thread pan2 . li
From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, the valididate_subreg treats the vector type's size is less than vector register as invalid. Then we will have ICE here. This patch would like to fix it by filter-out

[PATCH v3] Internal-fn: Introduce new internal function SAT_ADD

2024-04-29 Thread pan2 . li
From: Pan Li Update in v3: * Rebase upstream for conflict. Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as

[PATCH v2] RISC-V: Fix ICE for legitimize move on subreg const_poly_int [PR114885]

2024-04-29 Thread pan2 . li
From: Pan Li When we build with isl, there will be a ICE for graphite in both the c/c++ and fortran. The legitimize move cannot take care of below rtl. (set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8)) Then we will have ice similar to below: internal compiler error:

[PATCH v1] RISC-V: Fix ICE for legitimize move on subreg const_poly_move

2024-04-27 Thread pan2 . li
From: Pan Li When we build with isl, there will be a ICE for graphite in both the c/c++ and fortran. The legitimize move cannot take care of below rtl. (set (subreg:DI (reg:TI 237) 8) (subreg:DI (const_poly_int:TI [4, 2]) 8)) Then we will have ice similar to below: internal compiler error:

[PATCH v1] RISC-V: Add test cases for insn does not satisfy its constraints [PR114714]

2024-04-25 Thread pan2 . li
From: Pan Li We have one ICE when RVV register overlap is enabled. We reverted this feature as it is in stage 4 and there is no much time to figure a better solution for this. Thus, for now add the related test cases which will trigger ICE when register overlap enabled. This will gate the RVV

[PATCH v1] RISC-V: Add xfail test case for highpart register overlap of vwcvt

2024-04-24 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. bdad036da32 RISC-V: Support highpart register overlap for vwcvt The below test suites are passed for

[PATCH v1] RISC-V: Add early clobber to the dest of vwsll

2024-04-24 Thread pan2 . li
From: Pan Li We missed the existing early clobber for the dest operand of vwsll pattern when resolve the conflict of revert register overlap. Thus add it back to the pattern. Unfortunately, we have no test to cover this part and will improve this after GCC-15 open. The below tests are passed

[PATCH v1] Revert "RISC-V: Support highpart register overlap for vwcvt"

2024-04-24 Thread pan2 . li
From: Pan Li This reverts commit bdad036da32f72b84a96070518e7d75c21706dc2. --- gcc/config/riscv/constraints.md | 23 gcc/config/riscv/riscv.md | 24 gcc/config/riscv/vector-crypto.md | 21 ++-- gcc/config/riscv/vector.md

[PATCH v1] RISC-V: Add xfail test case for highpart overlap of vext.vf

2024-04-23 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 62685890d88 RISC-V: Support highpart overlap for vext.vf The below test suites are passed for this

[PATCH v1] RISC-V: Adjust overlap attr after revert d3544cea63d and e65aaf8efe1

2024-04-22 Thread pan2 . li
From: Pan Li After we reverted below 2 commits, the reference to attr need some adjustment as the group_overlap is no longer available. * RISC-V: Robostify the W43, W86, W87 constraint enabled attribute * RISC-V: Rename vconstraint into group_overlap The below tests are passed for this patch.

[PATCH v1] RISC-V: Add xfail test case for highpart overlap floating-point widen insn

2024-04-22 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 8614cbb2534 RISC-V: Support highpart overlap for floating-point widen instructions The below test

[PATCH v2] RISC-V: Add xfail test case for indexed load overlap with SRC EEW < DEST EEW

2024-04-22 Thread pan2 . li
From: Pan Li Update in v2: * Add change log to pr112431-34.c. Original log: We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 4418d55bcd1 RISC-V: Support highpart

[PATCH v1] RISC-V: Add xfail test case for indexed load overlap with SRC EEW < DEST EEW

2024-04-22 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 4418d55bcd1 RISC-V: Support highpart overlap for indexed load with SRC EEW < DEST EEW The below test

[PATCH v1] RISC-V: Add xfail test case for highest-number regno ternary overlap

2024-04-22 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 27fde325d64 RISC-V: Support highest-number regno overlap for widen ternary The below test suites are

[PATCH v1] RISC-V: Add xfail test case for widening register overlap of vf4/vf8

2024-04-21 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 303195e2a6b RISC-V: Support widening register overlap for vf4/vf8 The below test suites are passed. *

[PATCH v1] RISC-V: Add xfail test case for highpart register overlap of vx/vf widen

2024-04-20 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. a23415d7572 RISC-V: Support highpart register overlap for widen vx/vf instructions The below test

[PATCH v1] RISC-V: Add xfail test case for incorrect overlap on v0

2024-04-20 Thread pan2 . li
From: Pan Li We reverted below patch for register group overlap, add the related insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 018ba3ac952 RISC-V: Fix overlap group incorrect overlap on v0 The below test suites are passed. * The

[PATCH] RISC-V: Add xfail test case for wv insn highest overlap

2024-04-20 Thread pan2 . li
From: Pan Li We reverted below patch for wv insn overlap, add the related wv insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. 7e854b58084 RISC-V: Support highest overlap for wv instructions The below test suites are passed. * The

[PATCH v2] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread pan2 . li
From: Pan Li We reverted below patch for wv insn overlap, add the related wv insn test and mark it as xfail. And we will remove the xfail after we support the register overlap in GCC-15. b3b2799b872 RISC-V: Support one more overlap for wv instructions gcc/testsuite/ChangeLog: *

[PATCH v1] RISC-V: Add xfail test case for wv insn register overlap

2024-04-19 Thread pan2 . li
From: Pan Li gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr112431-42.c: New test. Signed-off-by: Pan Li --- .../gcc.target/riscv/rvv/base/pr112431-42.c | 30 +++ 1 file changed, 30 insertions(+) create mode 100644

[PATCH v1] RISC-V: Revert RVV wv instructions overlap and xfail tests

2024-04-19 Thread pan2 . li
From: Pan Li The RVV register overlap requires both the dest, and src operands. Thus the rigister filter in constraint cannot cover the fully sematics of the vector register overlap. Thus, revert these overlap patches list and xfail the related test cases. This patch would like to revert

[committed] RISC-V: Fix Werror=sign-compare in riscv_validate_vector_type

2024-04-12 Thread pan2 . li
From: Pan Li This patch would like to fix the Werror=sign-compare similar to below: gcc/config/riscv/riscv.cc: In function ‘void riscv_validate_vector_type(const_tree, const char*)’: gcc/config/riscv/riscv.cc:5614:23: error: comparison of integer expressions of different signedness: ‘int’ and

[PATCH v1] RISC-V: Bugfix ICE non-vector in TARGET_FUNCTION_VALUE_REGNO_P

2024-04-12 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE when vector is not enabled in hook TARGET_FUNCTION_VALUE_REGNO_P implementation. The vector regno is available if and only if the TARGET_VECTOR is true. The previous implement missed this condition and then result in ICE when rv64gc build

[PATCH v1] RISC-V: Remove -Wno-psabi for test build option [NFC]

2024-04-10 Thread pan2 . li
From: Pan Li Just notice there are some test case still have -Wno-psabi option, which is deprecated now. Remove them all for riscv test cases. The below test are passed for this patch. * The riscv rvv regression test. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr109244.C:

[PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread pan2 . li
From: Pan Li This patch would like to fix a ICE in mode sw for below example code. during RTL pass: mode_sw test.c: In function ‘vbool16_t j(vuint64m4_t)’: test.c:15:1: internal compiler error: in create_pre_exit, at mode-switching.cc:451 15 | } | ^ 0x3978f12 create_pre_exit

[PATCH v1] RISC-V: Refine the error msg for RVV intrinisc required ext

2024-04-08 Thread pan2 . li
From: Pan Li The RVV intrinisc API has sorts of required extension from both the march or target attribute. It will have error message similar to below: built-in function '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension However, it is not accurate as we have many additional sub

[PATCH v2] Internal-fn: Introduce new internal function SAT_ADD

2024-04-07 Thread pan2 . li
From: Pan Li Update in v2: * Fix one failure for x86 bootstrap. Original log: This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) |

[PATCH v1] Internal-fn: Introduce new internal function SAT_ADD

2024-04-06 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: *

[PATCH] RISC-V: Fix one unused varable in riscv_subset_list::parse

2024-03-30 Thread pan2 . li
From: Pan Li This patch would like to fix one unused variable as below: ../../gcc/common/config/riscv/riscv-common.cc: In static member function 'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)': ../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused

[PATCH] RISC-V: Fix misspelled term builtin in error message

2024-03-30 Thread pan2 . li
From: Pan Li This patch would like to fix below misspelled term in error message. ../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error: misspelled term 'builtin function' in format; use 'built-in function' instead [-Werror=format-diag] 4592 | "builtin function %qE

[PATCH v1] RISC-V: Allow RVV intrinsic for more function target

2024-03-26 Thread pan2 . li
From: Pan Li In previous, we allowed the target(("arch=+v")) for a function with rv64gc build. This patch would like to support more arch options as below: * zve32x * zve32f * zve64x * zve64f * zve64d * zvfhmin * zvfh For example, we have sample code as below. vfloat32m1_t

[PATCH v1] RISC-V: Allow RVV intrinsic when function target("arch=+v")

2024-03-25 Thread pan2 . li
From: Pan Li This patch would like to allow the RVV intrinsic when function is attributed as target("arch=+v") and build with rv64gc. For example: vint32m1_t __attribute__((target("arch=+v"))) test_1 (vint32m1_t a, vint32m1_t b, size_t vl) { return __riscv_vadd_vv_i32m1 (a, b, vl); } build

[PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-22 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of one existing sizeless RVV types. This attribute is valid if and only if the mrvv-vector-bits=zvl, the only one args should be the integer constant and its'

[PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-21 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE for __attribute__((target("arch=+v")) and likewise extension(s). Given we have sample code as below: void __attribute__((target("arch=+v"))) test_2 (int *a, int *b, int *out, unsigned count) { unsigned i; for (i = 0; i < count; i++)

[PATCH v1] RISC-V: Bugfix function target attribute pollution

2024-03-20 Thread pan2 . li
From: Pan Li This patch depends on below ICE fix. https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html The function target attribute should be on a per-function basis. For example, we have 3 function as below: void test_1 () {} void __attribute__((target("arch=+v"))) test_2 () {}

[PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-18 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE for __attribute__((target("arch=+v")) and likewise extension(s). Given we have sample code as below: void __attribute__((target("arch=+v"))) test_2 (int *a, int *b, int *out, unsigned count) { unsigned i; for (i = 0; i < count; i++)

[PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

2024-03-12 Thread pan2 . li
From: Pan Li Notice some code style issue(s) when add __riscv_v_fixed_vlen, includes: * Meanless empty line. * Line greater than 80 chars. * Indent with 3 space(s). * Argument unalignment. gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_ext_version_value): Fix code style

[PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-12 Thread pan2 . li
From: Pan Li Update in v3: * Add pre-defined __riscv_v_fixed_vlen when zvl. Update in v2: * Cleanup some unused code. * Fix some typo of commit log. Original log: This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of one

[PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-09 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE in vectorizable_store when both the loop_masks and loop_lens are enabled. The ICE looks like below when build with "-march=rv64gcv -O3". during GIMPLE pass: vect test.c: In function ‘d’: test.c:6:6: internal compiler error: in

[PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len and mask

2024-03-07 Thread pan2 . li
From: Pan Li This patch would like to fix one ICE in vectorizable_store for both the loop_masks and loop_lens. The ICE looks like below with "-march=rv64gcv -O3". during GIMPLE pass: vect test.c: In function ‘d’: test.c:6:6: internal compiler error: in vectorizable_store, at

[PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li Update in v2: * Cleanup some unused code. * Fix some typo of commit log. Original log: This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of one existing sizeless RVV types. This attribute is valid if and only

[PATCH v1] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-05 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc attribute for RVV. This attribute is used to define fixed-length variants of one existing sizeless RVV types. This attribute is valid if and only if the mrvv-vector-bits=zvl, the only one args should be the integer constant and its'

[PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]

2024-03-05 Thread pan2 . li
From: Pan Li Cleanup mode_size related code which is not used anymore. Below tests are passed for this patch. * The RVV fully regresssion test. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused mode_size related code. Signed-off-by: Pan Li ---

[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * scalable * zvl The scalable will pick up the zvl*b in the march as the minimal vlen. For example, the minimal vlen

[PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-28 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * scalable * zvl The scalable will pick up the zvl*b in the march as the minimal vlen. For example, the minimal vlen

[PATCH v2] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-27 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * zvl The zvl will pick up the zvl*b from the march option. For example, the mrvv-vector-bits will be 1024 when

[PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-02-26 Thread pan2 . li
From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, we missed to adjust the validate_subreg part accordingly. When the vector type's size is less than vector register, it will be considered as invalid in the

[PATCH v1] RTL: Bugfix ICE after allow vector type in DSE

2024-02-25 Thread pan2 . li
From: Pan Li We allowed vector type for get_stored_val when read is less than or equal to store in previous. Unfortunately, we missed to adjust the validate_subreg part accordingly. For vector type, we don't need to restrict the mode size is greater than the vector register size. Thus, for

[PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-02-24 Thread pan2 . li
From: Pan Li Hi Richard & Tamar, Try the DEF_INTERNAL_INT_EXT_FN as your suggestion. By mapping us_plus$a3 to the RTL representation (us_plus:m x y) in optabs.def. And then expand_US_PLUS in internal-fn.cc. Not very sure if my understanding is correct for DEF_INTERNAL_INT_EXT_FN. I am not

[PATCH v1] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-02-23 Thread pan2 . li
From: Pan Li This patch would like to introduce one new gcc option for RVV. To appoint the bits size of one RVV vector register. Valid arguments to '-mrvv-vector-bits=' are: * 64 * 128 * 256 * 512 * 1024 * 2048 * 4096 * 8192 * 16384 * 32768 * 65536 * scalable * zvl 1. The scalable will be the

[PATCH v1] RISC-V: Upgrade RVV intrinsic version to 0.12

2024-02-20 Thread pan2 . li
From: Pan Li Upgrade the version of RVV intrinsic from 0.11 to 0.12. PR target/114017 gcc/ChangeLog: * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Upgrade the version to 0.12. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-__riscv_v_intrinsic.c:

[PATCH v1] Internal-fn: Add new internal function SAT_ADDU

2024-02-17 Thread pan2 . li
From: Pan Li This patch would like to add the middle-end presentation for the unsigned saturation add. Aka set the result of add to the max when overflow. It will take the pattern similar as below. SAT_ADDU (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will

[PATCH v1] RISC-V: Fix misspelled term args in error_at message

2024-02-10 Thread pan2 . li
From: Pan Li When build with "-Werror=format-diag", there will be one misspelled term args as below. This patch would like fix it by taking the term arguments instead. ../../gcc/config/riscv/riscv-vector-builtins.cc: In function 'tree_node* riscv_vector::resolve_overloaded_builtin(location_t,

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinsic ICE in function checker

2024-02-07 Thread pan2 . li
From: Pan Li There is another corn case when similar as below example: void test (void) { __riscv_vaadd (); } We report error when overloaded function with empty args. For example: test.c: In function 'foo': test.c:8:3: error: no matching function call to '__riscv_vaadd' with empty args

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread pan2 . li
From: Pan Li There is one corn case when similar as below example: void test (void) { __riscv_vfredosum_tu (); } It will meet ICE because of the implement details of overloaded function in gcc. According to the rvv intrinisc doc, we have no such overloaded function with empty args.

[PATCH v1] RISC-V: Cleanup the comments for the psabi

2024-01-30 Thread pan2 . li
From: Pan Li This patch would like to cleanup some comments which are out of date or incorrect. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_get_arg_info): Cleanup comments. (riscv_pass_by_reference): Ditto. (riscv_fntype_abi): Ditto. Signed-off-by: Pan Li ---

[PATCH v2] RISC-V: Bugfix for vls mode aggregated in GPR calling convention

2024-01-30 Thread pan2 . li
From: Pan Li According to the issue as below. https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416 When the mode size of vls integer mode is less than 2 * XLEN, we will take the gpr for both the args and the return values. Instead of the reference. For example the below code:

[PATCH v1] RISC-V: Bugfix for vls integer mode calling convention

2024-01-23 Thread pan2 . li
From: Pan Li According to the issue as below. https://hub.fgit.cf/riscv-non-isa/riscv-elf-psabi-doc/pull/416 When the mode size of vls integer mode is less than 2 * XLEN, we will take the gpr/fpr for both the args and the return values. Instead of the reference. For example the below code:

[PATCH v1] RISC-V: Fix asm checks regression due to recent middle-end change

2024-01-17 Thread pan2 . li
From: Pan Li The recent middle-end change result in some asm check failures. This patch would like to fix the asm check by adjust the times. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/shift-1.c: Fix asm check count. *

[PATCH v1] RISC-V: Update the comments of riscv_v_ext_mode_p [NFC]

2024-01-11 Thread pan2 . li
From: Pan Li gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_ext_mode_p): Update the comments of predicate func riscv_v_ext_mode_p. Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git

[PATCH v5] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-11 Thread pan2 . li
From: Pan Li The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to

[PATCH v4] LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

2024-01-10 Thread pan2 . li
From: Pan Li The insert_var_expansion_initialization depends on the HONOR_SIGNED_ZEROS to initialize the unrolling variables to +0.0f when -0.0f and no-signed-option. Unfortunately, we should always keep the -0.0f here because: * The -0.0f is always the correct initial value. * We need to

[PATCH v3] RISC-V: Bugfix for doesn't honor no-signed-zeros option

2024-01-02 Thread pan2 . li
From: Pan Li According to the sematics of no-signed-zeros option, the backend like RISC-V should treat the minus zero -0.0f as plus zero 0.0f. Consider below example with option -fno-signed-zeros. void test (float *a) { *a = -0.0; } We will generate code as below, which doesn't treat the

[PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-26 Thread pan2 . li
From: Pan Li This patch would like to XFAIL the test case pr30957-1.c for the RVV when build the elf with some configurations (list at the end of the log) It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as

[PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread pan2 . li
From: Pan Li This patch would like to XFail the signbit-5 run test case for the RVV. Given the case has one limitation like "This test does not work when the truth type does not match vector type." in the beginning of the test file. Aka, the RVV vector truth type is not integer type. The

[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread pan2 . li
From: Pan Li This patch would like to XFAIL the test case pr30957-1.c for the RVV when build the elf with some configurations (list at the end of the log) It will be vectorized during vect_transform_loop with a variable factor. It won't benefit from unrolling/peeling and mark the loop->unroll as

[PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-20 Thread pan2 . li
From: Pan Li This patch would like to XFail the signbit-5 run test case for the RVV. Given the case has one limitation like "This test does not work when the truth type does not match vector type." in the beginning of the test file. Aka, the RVV vector truth type is not integer type. The

[PATCH v3] RISC-V: Bugfix for the const vector in single steps

2023-12-20 Thread pan2 . li
From: Pan Li This patch would like to fix the below execution failure when build with "-march=rv64gcv_zvl512b -mabi=lp64d -mcmodel=medlow --param=riscv-autovec-lmul=m8 -ftree-vectorize -fno-vect-cost-model -O3" FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test The will be one

[PATCH v2] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li This patch would like to fix the below execution failure. FAIL: gcc.dg/vect/pr92420.c -flto -ffat-lto-objects execution test The will be one single step const vector like { -4, 4, -3, 5, -2, 6, -1, 7, ...}. For such const vector generation with single step, we will generate vid +

[PATCH v1] RISC-V: Bugfix for the const vector in single steps

2023-12-19 Thread pan2 . li
From: Pan Li For generating the const vector with single step, we have code gen similar as below. We have npatterns = 4. v1= {3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8... } v2 (diff) = {3 - 0, 2 - 1, 1 - 2, 0 - 3, 7 - 4, 6 - 5, 5 - 6, 4 - 7...} = {3, 1, -1, 3, 3, 1, -1, 3 ...} v1 = vd +

  1   2   >