[Bug c++/114921] Optimization flags cause _Float16 to __bf16 casting to do nothing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114921

--- Comment #2 from GCC Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:87e35da16df74cd1c4729a55d94e7bc592487f48

commit r15-124-g87e35da16df74cd1c4729a55d94e7bc592487f48
Author: Richard Biener
Date:   Thu May 2 13:55:15 2024 +0200

    tree-optimization/114921 - _Float16 -> __bf16 isn't noop

    The vectorizer handles a _Float16 to __bf16 conversion through
    vectorizable_assignment, thinking it's a noop.  The following
    fixes this by requiring the same vector component mode when
    checking for CONVERT_EXPR_CODE_P, being stricter than for
    VIEW_CONVERT_EXPR.

            PR tree-optimization/114921
            * tree-vect-stmts.cc (vectorizable_assignment): Require
            same vector component modes for input and output for
            CONVERT_EXPR_CODE_P.

[Bug c++/114921] Optimization flags cause _Float16 to __bf16 casting to do nothing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114921

Richard Biener changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
           Keywords|                              |wrong-code
     Ever confirmed|0                             |1
   Last reconfirmed|                              |2024-05-02
             Status|UNCONFIRMED                   |ASSIGNED

--- Comment #1 from Richard Biener ---
Confirmed.  We vectorize the loop to

  [local count: 119292720]:
  vect_temp_9.6_3 = MEM [(_Float16 *)f_7(D)];
  vect__4.7_9 = VIEW_CONVERT_EXPR(vect_temp_9.6_3);
  MEM [(__bf16 *)f_7(D)] = vect__4.7_9;
  vect_temp_9.6_17 = MEM [(_Float16 *)f_7(D) + 8B];
  vect__4.7_18 = VIEW_CONVERT_EXPR(vect_temp_9.6_17);
  MEM [(__bf16 *)f_7(D) + 8B] = vect__4.7_18;

likely because the vectorizer thinks this is a noop conversion, it handles
it via vectorizable_assignment.
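
To illustrate why treating this conversion as a noop produces wrong code, here is a minimal, portable sketch. It is not GCC code and not the reporter's test case. It models both formats as raw 16-bit patterns so that it does not depend on compiler support for _Float16 or __bf16. All function names are hypothetical. The decoder covers only zero and normal values, and the bf16 encoder truncates rather than rounds; both are simplifications assumed for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decode IEEE binary16 bits (1 sign, 5 exponent, 10 mantissa bits) to float.
// Simplification: only zero and normal values are handled; subnormals,
// infinities and NaNs are outside the scope of this sketch.
float half_bits_to_float(std::uint16_t h) {
    std::uint32_t sign = (h >> 15) & 0x1u;
    std::uint32_t exp  = (h >> 10) & 0x1Fu;
    std::uint32_t man  = h & 0x3FFu;
    std::uint32_t bits;
    if (exp == 0 && man == 0) {
        bits = sign << 31;  // signed zero
    } else {
        // Rebias the exponent from 15 (binary16) to 127 (binary32) and
        // left-align the 10 mantissa bits in the 23-bit field.
        bits = (sign << 31) | ((exp - 15u + 127u) << 23) | (man << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Encode a float as bfloat16 bits (1 sign, 8 exponent, 7 mantissa bits)
// by keeping the top 16 bits of the binary32 pattern.
// Simplification: truncation, not round-to-nearest.
std::uint16_t float_to_bf16_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return static_cast<std::uint16_t>(bits >> 16);
}

// A value-preserving conversion: the bit pattern has to change.
std::uint16_t convert_half_to_bf16(std::uint16_t h) {
    return float_to_bf16_bits(half_bits_to_float(h));
}

// What a noop (view-convert style) copy does: the bits pass through as-is.
std::uint16_t reinterpret_half_as_bf16(std::uint16_t h) {
    return h;
}

// Hypothetical self-check comparing the two behaviours for 1.0.
void self_check() {
    const std::uint16_t one_as_half = 0x3C00;  // 1.0 in binary16
    assert(convert_half_to_bf16(one_as_half) == 0x3F80);      // 1.0 in bf16
    assert(reinterpret_half_as_bf16(one_as_half) == 0x3C00);  // unchanged bits
}
```

Both types are 16 bits wide, which is presumably why a size-only check sees nothing to do, but the exponent and mantissa fields differ. The pattern 0x3C00 is 1.0 as binary16, whereas read as bf16 it is 2^-7 (0.0078125). A copy that keeps the bits therefore changes the value, which matches the VIEW_CONVERT_EXPR seen in the dump above.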