[Bug target/114499] New: MVE: scatter base offset constraints incorrect
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114499

            Bug ID: 114499
           Summary: MVE: scatter base offset constraints incorrect
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kevin.bracey at alifsemi dot com
  Target Milestone: ---

An attempt to use

    uint32x4_t base;
    float32x4_t value;
    vstrwq_scatter_base_wb_f32(&base, -sizeof(float), value);

generates an "unsupported" error. It does not accept -4 as a valid offset, but it should. It's looking for a multiple of 8 from -1016 to +1016, not a multiple of 4 from -508 to +508 as it should.

Looking at mve.md, I see a number of scatter/gather_base operations have incorrect constraints; they're rather random.

Offsets for VLDRW/VSTRW are always 7-bit with a sign bit, representing +/-0 to +/-127*memory size. So the W and D base forms all take -508 to +508 multiples of 4 ("O"?) or -1016 to +1016 multiples of 8 ("Ri").

The "Rl" constraint was wrongly added for just mve_vstrwq_scatter_base_wb_p_fv4sf (https://github.com/gcc-mirror/gcc/commit/ae180f26109bfaebb4ab0f4d45035fd075cf02c8), and it is not required. If it was really needed for a halfword instruction its range should be -254 to +254.

It seems that mve_vector_mem_operand() handles this range correctly for non-scatter/gather.

Some corrections I think are needed are:

    mve_vldrwq_gather_base_v4si           i  -> O
    mve_vldrwq_gather_base_v2di           i  -> Ri
    mve_vldrwq_gather_base_z_v2di         i  -> Ri
    mve_vldrwq_gather_base_fv4sf          i  -> O
    mve_vldrwq_gather_base_z_fv4sf        i  -> O
    mve_vldrwq_gather_base_wb_v4si        Ri -> O
    mve_vldrwq_gather_base_wb_z_v4si      Ri -> O
    mve_vldrwq_gather_base_wb_fv4sf       Ri -> O
    mve_vldrwq_gather_base_wb_z_fv4sf     Ri -> O
    mve_vstrwq_scatter_base_v4si          i  -> O
    mve_vstrwq_scatter_base_fv4sf         i  -> O
    mve_vstrwq_scatter_base_wb_v4si       Ri -> O
    mve_vstrwq_scatter_base_wb_p_v4si     Ri -> O
    mve_vstrwq_scatter_base_wb_fv4sf      Ri -> O
    mve_vstrwq_scatter_base_wb_p_fv4sf    Rl -> O

But I don't know that that's exhaustive.
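The ranges quoted in this report all follow from one rule: a 7-bit magnitude (0 to 127) with a sign, scaled by the memory access size. The following is only a minimal sketch of that arithmetic. `mve_base_offset_ok` and the `MVE_IMM_*` names are hypothetical, introduced here for illustration; they are not GCC functions and not the actual constraint definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the immediate described in the report:
   7 bits of magnitude plus a sign, scaled by the access size. */
enum { MVE_IMM_BITS = 7, MVE_IMM_MAX = (1 << MVE_IMM_BITS) - 1 };
static_assert(MVE_IMM_MAX == 127, "7-bit magnitude gives 0..127");

/* Returns true if 'offset' is a multiple of 'access_size' bytes within
   +/- 127 * access_size. access_size is 4 for the W forms, 8 for the
   D forms and would be 2 for a halfword form. */
static bool mve_base_offset_ok(long offset, long access_size)
{
    long limit = MVE_IMM_MAX * access_size;

    return offset >= -limit
        && offset <= limit
        && offset % access_size == 0;
}
```

Under this model, `mve_base_offset_ok(-4, 4)` holds, which is the case the intrinsic call above needs, while `mve_base_offset_ok(-4, 8)` does not, which matches the rejection seen when the doubleword range is applied to a word store.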
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515

--- Comment #8 from Kevin Bracey ---
I'm only testing on the Linux trunk because that is what Godbolt has. If it has a bare-metal target, I'm not seeing it. My actual development system is bare-metal, using Arm's embedded GCC releases, and I don't have a set-up to test a trunk GCC build on it at the moment.

Clearly Helium+Linux on Godbolt is a bit confused, because it always uses the non-existent registers Q8 and upwards. There may be a fundamental config error leading to all sorts of strange results. (It mostly reproduces my bare-metal findings, though.)
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515

--- Comment #6 from Kevin Bracey ---
Retesting the Godbolt on trunk, it's now worse - every line produces multiple not-very-informative errors:

    <source>:7:9: error: '_Generic' specifies two compatible types
        7 |     x = vmulq(x, 0.5);    // ok
          |         ^
    <source>:7:9: note: compatible type is here
        7 |     x = vmulq(x, 0.5);    // ok
          |         ^

(repeated 6 times per source line)
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

--- Comment #5 from Kevin Bracey ---
I had a look at the GCC source. The vld2/vst2/vld4/vst4 instructions in mve.md have reused the "Um" constraint used for vld/vst in Neon, which permits both "!" and register offset.

This needs to be tightened up - I can't see an existing equivalent constraint. Perhaps "Um" can be given variant MVE/Neon behaviour, like "Uj".
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515

--- Comment #4 from Kevin Bracey ---
Yes, looking at them it seems clear those patches address what I'm seeing with the `vmulq(x, 6)` issue.
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

--- Comment #4 from Kevin Bracey ---
The assembler's rejection of the vld2 is valid - the only permitted post-indexed form is to use "!" for increment by 32 (the amount read).

Experimenting by changing "inStep" you can see the compiler backend knows that 32 is the only valid constant offset - it generates the "!" form for that correctly - but it apparently hasn't been told not to use register offsets.
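The rule described in this comment can be stated as a small predicate. This is a toy model only, assuming nothing beyond what the comment says: the pointer step after a VLD2 can be folded into the instruction (as "!") only when it is the constant 32, and never when it is a register. `struct post_step` and `vld2_can_fold_post_step` are hypothetical names and do not correspond to anything in the GCC backend.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the pointer update that follows a load. */
struct post_step {
    bool is_register;   /* step held in a register, e.g. "[r3], r2" */
    long imm;           /* constant step in bytes when !is_register */
};

/* Toy model: only a constant step of 32 bytes (the amount read) may be
   folded into the VLD2 as "!"; any other step needs a separate add. */
static bool vld2_can_fold_post_step(struct post_step step)
{
    return !step.is_register && step.imm == 32;
}
```

In these terms, the rejected `vld21.8 {q4,q5},[r3],r2` corresponds to a step with `is_register` set, which the predicate refuses.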
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515

--- Comment #2 from Kevin Bracey ---
I've just spotted another apparent generic selection problem in my reproducer for bug 107714 - should I create a new issue for it?
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

Kevin Bracey changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stammark at gcc dot gnu.org

--- Comment #2 from Kevin Bracey ---
Ah, the vmulq is falling foul of some sort of generic selection problem. Substituting with vmulq_n_u8() gets me the actual 6.

Something in the same area as my bug 107515, perhaps - I've been making liberal use of the generic functions.
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

--- Comment #1 from Kevin Bracey ---
Looking at that assembly output from Compiler Explorer, I'm also at a loss as to what happened to the "6" for the VMUL. Maybe something else to look at?
[Bug target/107714] New: MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

            Bug ID: 107714
           Summary: MVE: Invalid addressing mode generated for VLD2
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kevin.bracey at alifsemi dot com
  Target Milestone: ---

Created attachment 53909
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53909&action=edit
Stripped-down reproducer source

While I was working on some Helium intrinsics, GCC produced some invalid code, meaning my optimisation can only be enabled in our armclang builds. The problem seems to be still present on GCC trunk.

Posted at https://godbolt.org/z/h3EhMvxao

Compilation options: -O2 -mcpu=cortex-m55 -mfloat-abi=hard

    Error: instruction does not accept this addressing mode -- `vld21.8 {q4,q5},[r3],r2'

Compiler Explorer output for trunk shows the same invalid addressing mode. (It also shows non-existent registers q8 and up in use - I don't know why. That is not a problem in my local GCC, obtained from Arm's embedded distribution.)
[Bug target/107515] New: MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515

            Bug ID: 107515
           Summary: MVE: Generic functions do not accept _Float16 scalars
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kevin.bracey at alifsemi dot com
  Target Milestone: ---

Compiling C code, generic functions taking floating point scalars in arm_mve.h do not accept `_Float16` values.

    // Using gcc -mcpu=cortex-m55 -O2
    // Uploaded at https://godbolt.org/z/7jrqWWroY
    #include <arm_mve.h>
    void test(void)
    {
        float16x8_t x;
        x = vmulq(x, 0.5);             // ok
        x = vmulq(x, 0.5f);            // ok
        x = vmulq(x, (__fp16) 0.5);    // ok
        x = vmulq(x, 0.15f16);         // rejected
        x = vmulq(x, (_Float16) 0.15); // rejected
    }

Output:

    <source>:10:9: error: '_Generic' selector of type 'int (*)[4][39]' is not compatible with any association
       10 |     x = vmulq(x, 0.15f16);         // rejected
          |         ^
    <source>:11:9: error: '_Generic' selector of type 'int (*)[4][39]' is not compatible with any association
       11 |     x = vmulq(x, (_Float16) 0.15); // rejected
          |         ^
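The "not compatible with any association" text is what GCC prints when the controlling expression of a `_Generic` selection matches none of the listed types and there is no `default` branch. The sketch below shows that behaviour in a target-independent way. `SCALAR_KIND_STRICT` and `SCALAR_KIND` are hypothetical macros written for this illustration; arm_mve.h builds its selections differently, so this says nothing about where the missing `_Float16` association actually is.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical selection with no fallback: an argument whose type is not
   listed is a compile-time error ("not compatible with any association"). */
#define SCALAR_KIND_STRICT(x) \
    _Generic((x), float: "float", double: "double")

/* Same selection with a default branch, so unlisted types still compile. */
#define SCALAR_KIND(x) \
    _Generic((x), float: "float", double: "double", default: "other")

/* An int is not listed, so this compiles only because of the default
   branch; SCALAR_KIND_STRICT(1) would be rejected by the compiler. */
static const char *kind_of_int(void)
{
    return SCALAR_KIND(1);
}
```

A type that the header author did not anticipate behaves like the `int` here: it is accepted only if some association or a default covers it.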