"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes: >> Ah, right, sorry for the bogus suggestion. >> In that case, what specifically goes wrong? Which test in >> test_vector_subregs_modes fails, and what does the ICE look like? > > Because may_be (nunits, 1) return true. For nunits = (1,1), it will execute > test_vector_subregs_modes. > The fail ICE: > ../../../riscv-gcc/gcc/simplify-rtx.cc:8396: test_vector_subregs_modes: FAIL: > ASSERT_RTX_EQ (expected, simplify_subreg (QImode, x, inner_mode, byte)) > expected: (nil) > > actual: (const_int 0 [0]) > cc1: internal compiler error: in assert_rtx_eq_at, at selftest-rtl.cc:57 > 0x1304ee1 selftest::assert_rtx_eq_at(selftest::location const&, char const*, > rtx_def*, rtx_def*) > ../../../riscv-gcc/gcc/selftest-rtl.cc:57 > 0x1332504 test_vector_subregs_modes > ../../../riscv-gcc/gcc/simplify-rtx.cc:8395 > 0x1332988 test_vector_subregs_fore_back > ../../../riscv-gcc/gcc/simplify-rtx.cc:8442 > 0x1332ae7 test_vector_subregs > ../../../riscv-gcc/gcc/simplify-rtx.cc:8467 > 0x1332c57 test_vector_ops > ../../../riscv-gcc/gcc/simplify-rtx.cc:8487 > 0x1332c7b selftest::simplify_rtx_cc_tests() > ../../../riscv-gcc/gcc/simplify-rtx.cc:8547 > 0x21318fa selftest::run_tests() > ../../../riscv-gcc/gcc/selftest-run-tests.cc:115 > 0x1362a76 toplev::run_self_tests() > ../../../riscv-gcc/gcc/toplev.cc:2205 > > I analyzed the codes: > In test_vector_subregs_fore_back, when nunits = (1,1). The expected = > NULL_RTX and simplify_subreg (QImode, x, inner_mode, byte) = const_int 0. > So the assertion fails.
Hmm, ok, so the subreg operation is unexpected succeeding. > This is the test for stepped vector using 2 element per pattern. For > poly_uint16 (1,1), it's true it is possible only has 1 element. The stepped case is 3 elements per pattern rather than 2. In a stepped pattern: a, b, b+n are represented explicitly, then the rest are implicitly b+n*2, b+n*3, etc. The case being handled by this code is instead the 2-element case: a, b are represented explicitly, then the rest are implicitly all b. Why is (1,1) different though? The test is constructing: nunits: 1 + 1x shape: nelts_per_pattern == 2, npatterns == 1 elements: a, b[, b, b, b, b, ...] It then tests subregs starting at 0 + 1x (so starting at the first b). But for (2,2) we should have: nunits: 2 + 2x shape: nelts_per_pattern == 2, npatterns == 2 elements: a1, a2, b1, b2[, b1, b2, b1, b2, ...] and we should test subregs starting at 0 + 2x (so starting at the first b1). The two cases should be very similar, it's just that the (2,2) case doubles the number of patterns. > I think it makes sense to fail the test. However for poly (1,1) machine mode, > can we have the chance that some target define this > kind of machine mode only used for intrinsics? I already developed full RVV > support in GCC (including intrinsc and auto-vectorization). > I only enable auto-vectorization with mode larger than (2,2) and test it > fully. > From my experience, it seems the stepped vector only created during VLA > auto-vectorization. And I think only allow poly (1,1)mode used in intrinsics > will > not create issues. Am I understanding wrong ?Feel free to correct me. Thanks ~ Following on from what I said above, it doesn't look like this particular case is related to stepped vectors. (1,1) shouldn't (need to) be a special case though. Any potentital problems that would occur for (1,1) with npatterns==1 would also occur for (n,n) with npatterns==n. E.g. if stepped vectors are problematic for (1,1) then an interleaving of 2 stepped vectors (i.e. npatterns==2) would be problematic for (2,2). So yeah, preventing a mode being used for autovectorisation would allow the target to have a bit more control over which constants are actually generated. But it shouldn't be necessary to do that for correctness. Thanks, Richard > juzhe.zh...@rivai.ai > > From: Richard Sandiford > Date: 2022-08-19 17:35 > To: juzhe.zhong\@rivai.ai > CC: rguenther; gcc-patches; kito.cheng > Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, 1) > and allow the machine_mode definition with poly_uint16 (1, 1) > "juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes: >> Hi, Richard. I tried the codes: >> if (!nunits.is_constant () && maybe_gt (nunits, 1)) >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> >> It still failed. For nunits = (1,1) , maybe_gt (nunits, 1) return true value. > > Ah, right, sorry for the bogus suggestion. > > In that case, what specifically goes wrong? Which test in > test_vector_subregs_modes fails, and what does the ICE look like? > > Thanks, > Richard > >> But I tried: >> if (!nunits.is_constant () && known_gt (nunits, 1)) >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> It pass. But it report a warning: "warning: comparison between signed and >> unsigned integer expressions [-Wsign-compare]" during the compilation. >> >> Finally, I tried: >> if (!nunits.is_constant () && known_gt (GET_MODE_NUNITS (inner_mode), 1)) >> test_vector_subregs_modes (x, nunits - min_nunits, count); >> It passed with no warning. >> >> Is 'known_gt (GET_MODE_NUNITS (inner_mode), 1)' a good solution for this? >> Thanks! >> >> >> juzhe.zh...@rivai.ai >> >> From: Richard Sandiford >> Date: 2022-08-19 16:03 >> To: juzhe.zhong >> CC: gcc-patches; rguenther; kito.cheng >> Subject: Re: [PATCH] middle-end: skipp stepped vector test of poly_int (1, >> 1) and allow the machine_mode definition with poly_uint16 (1, 1) >> juzhe.zh...@rivai.ai writes: >>> From: zhongjuzhe <juzhe.zh...@rivai.ai> >>> >>> Hello. This patch is preparing for following RVV support. >>> >>> Both ARM SVE and RVV (RISC-V 'V' Extension) support length-agnostic vector. >>> The minimum vector length of ARM SVE is 128-bit and the runtime invariant >>> of ARM SVE is always 128-bit blocks. >>> However, the minimum vector length of RVV can be 32bit in 'Zve32*' >>> sub-extension and 64bit in 'Zve*' sub-extension. >>> >>> So I define the machine_mode as follows: >>> VECTOR_MODE_WITH_PREFIX (VNx, INT, DI, 1, 0); >>> ADJUST_NUNITS (MODE, riscv_vector_chunks); >>> The riscv_vector_chunks = poly_uint16 (1, 1) >>> >>> The compilation is failed for the stepped vector test: >>> (const_vector:VNx1DI repeat [ >>> (const_int 8 [0x8]) >>> (const_int 7 [0x7]) >>> ]) >>> >>> I understand for stepped vector should always have aleast 2 elements and >>> stepped vector initialization is common >>> for VLA (vector-lengthe agnostic) auto-vectorization. It makes sense that >>> report fail for stepped vector of poly_uint16 (1, 1). >>> >>> machine mode with nunits = poly_uint16 (1, 1) needs to implemented in >>> intrinsics. And I would like to enable RVV auto-vectorization >>> with vector mode only nunits is larger than poly_uint16 (2, 2) in RISC-V >>> backend. I think it will not create issue if we define >>> vector mode with nunits = poly_uint16 (1, 1). Feel free to correct me or >>> offer me some other better solutions. Thanks! >>> >>> >>> >>> gcc/ChangeLog: >>> >>> * simplify-rtx.cc (test_vector_subregs_fore_back): skip test for >>> poly_uint16 (1, 1). >>> >>> --- >>> gcc/simplify-rtx.cc | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc >>> index 7d09bf7103d..61e0dfa00d0 100644 >>> --- a/gcc/simplify-rtx.cc >>> +++ b/gcc/simplify-rtx.cc >>> @@ -8438,7 +8438,7 @@ test_vector_subregs_fore_back (machine_mode >>> inner_mode) >>> rtx x = builder.build (); >>> >>> test_vector_subregs_modes (x); >>> - if (!nunits.is_constant ()) >>> + if (!nunits.is_constant () && known_ne (nunits, poly_uint16 (1, 1))) >>> test_vector_subregs_modes (x, nunits - min_nunits, count); >> >> I think instead we should use maybe_gt (nunits, 1), on the basis that >> the fore_back tests require vectors that have a minimum of 2 elements. >> Something like poly_uint16 (1, 2) would have the same problem as >> poly_uint16 (1, 1). ({1, 2} is an unlikely value, but it's OK in >> principle.) >> >> This corresponds to the minimum of 3 elements for the stepped tests: >> >> if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT >> && maybe_gt (GET_MODE_NUNITS (mode), 2)) >> { >> test_vector_ops_series (mode, scalar_reg); >> test_vector_subregs (mode); >> } >> >> Thanks, >> Richard >> >