[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804 --- Comment #6 from Hongtao.liu --- Fixed for GCC14.
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804 --- Comment #5 from CVS Commits ---
The master branch has been updated by hongtao Liu:

https://gcc.gnu.org/g:3279b6223066d36d2e6880a137f80a46d3c82c8f

commit r14-1421-g3279b6223066d36d2e6880a137f80a46d3c82c8f
Author: liuhongt
Date:   Wed Feb 22 17:54:46 2023 +0800

    Enhance NARROW FLOAT_EXPR vectorization by truncating integer to lower precision.

    Similar to WIDEN FLOAT_EXPR, when a direct optab does not exist, try an
    intermediate integer type whenever the gimple ranger can tell it's safe.

    I.e., when there's no direct optab for vector long long -> vector float,
    but the value range of the integer can be represented as int, try
    vector int -> vector float if available.

    gcc/ChangeLog:

            PR tree-optimization/108804
            * tree-vect-patterns.cc (vect_get_range_info): Remove static.
            * tree-vect-stmts.cc (vect_create_vectorized_demotion_stmts):
            Add new parameter narrow_src_p.
            (vectorizable_conversion): Enhance NARROW FLOAT_EXPR
            vectorization by truncating to lower precision.
            * tree-vectorizer.h (vect_get_range_info): New declare.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr108804.c: New test.
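The safety condition in the commit message (the integer's value range must be representable in the narrower type) can be illustrated with a small sketch. This is not the code from tree-vect-stmts.cc. The function name `range_fits_int32` and the explicit bounds are hypothetical stand-ins for what the ranger would report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration only, not GCC's implementation.
   Returns true when every value in [lo, hi] can be held in a 32-bit
   signed integer, i.e. when truncating a 64-bit value to int before
   converting it to float cannot change the result. */
static bool range_fits_int32(int64_t lo, int64_t hi)
{
    return lo >= INT32_MIN && hi <= INT32_MAX;
}
```

For instance, a value masked with 0x3F30 lies in [0, 0x3F30], so the check passes, whereas an unconstrained 64-bit value does not qualify.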
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804 --- Comment #4 from Hongtao.liu ---
Created attachment 54613
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54613&action=edit
Patch pending for GCC14
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804 --- Comment #3 from Hongtao.liu ---
I think the point here is that, although the conversion is uint64_t -> float,
the range of x and y can be represented as int32 ((k & 0x007F) | 0x3F80), so
we can use int32 -> float instructions, which are supported by the backend.
So it looks to me like a middle-end issue.

A simple testcase for which clang generates vcvtdq2ps but gcc doesn't vectorize:

    #include <stdint.h>

    uint64_t d[512];
    float f[1024];

    void foo() {
        for (int i=0; i<512; ++i) {
            uint64_t k = d[i];
            f[i]=(k & 0x3F30);
        }
    }

With a manually added conversion, gcc can also vectorize it:

    #include <stdint.h>

    uint64_t d[512];
    float f[1024];

    void foo() {
        for (int i=0; i<512; ++i) {
            uint64_t k = d[i];
            f[i]=(int)(k & 0x3F30);
        }
    }
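The claim that the masked value converts identically through int can be checked with a minimal sketch like the one below. The helper `count_mismatches` and the multiplier used to generate inputs are assumptions made for illustration and do not come from the bug report.

```c
#include <stdint.h>

/* Hypothetical helper: counts the inputs for which converting the masked
   uint64_t value directly to float differs from converting it through int.
   The multiplier is only a cheap way to spread the input bits. */
static int count_mismatches(uint64_t mask)
{
    int mismatches = 0;
    for (uint64_t i = 0; i < 100000; ++i) {
        uint64_t k = (i * 0x9E3779B97F4A7C15ULL) & mask;
        float direct = (float)k;        /* uint64_t -> float */
        float via_int = (float)(int)k;  /* uint64_t -> int -> float */
        if (direct != via_int)
            ++mismatches;
    }
    return mismatches;
}
```

With the mask 0x3F30 from the testcase, every value fits in int, so both conversion paths agree. With an all-ones mask the values no longer fit, and the results differ on typical targets where the out-of-range conversion to int truncates.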
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2023-02-20

--- Comment #2 from Richard Biener ---
EVRP does

    @@ -38,16 +61,18 @@
       k_12 = k_10 >> 23;
       _2 = k_12 & 8388607;
       y_13 = _2 | 1065353216;
    -  _3 = (float) x_11;
    +  _17 = (signed long) x_11;
    +  _3 = (float) _17;
       f[i_6] = _3;
       _4 = i_6 + 128;
    -  _5 = (float) y_13;
    +  _18 = (signed long) y_13;
    +  _5 = (float) _18;
       f[_4] = _5;

because unsigned long -> float is even more difficult. With -fno-tree-vrp the
conversion is still from uint64_t, but that's not supported either. So it's a
target issue.

Shorter testcase:

    #include <stdint.h>

    uint64_t d[512];
    float f[1024];

    void foo() {
        for (int i=0; i<512; ++i) {
            uint64_t k = d[i];
            f[i]=k;
        }
    }
[Bug target/108804] missed vectorization in presence of conversion from uint64_t to float
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108804

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |target
           Severity|normal                      |enhancement