> On 12/08/2015 04:08 AM, Liang Li wrote:
> > +++ b/util/buffer-zero-avx2.c
> > @@ -0,0 +1,54 @@
> > +#include "qemu-common.h"
> > +
> > +#if defined CONFIG_IFUNC && defined CONFIG_AVX2
> > +#include <immintrin.h>
> > +#define AVX2_VECTYPE        __m256i
> > +#define AVX2_SPLAT(p)       _mm256_set1_epi8(*(p))
> > +#define AVX2_ALL_EQ(v1, v2) \
> > +    (_mm256_movemask_epi8(_mm256_cmpeq_epi8(v1, v2)) == 0xFFFFFFFF)
> > +#define AVX2_VEC_OR(v1, v2) (_mm256_or_si256(v1, v2))
> > +
> > +inline bool
> > +can_use_buffer_find_nonzero_offset_avx2(const void *buf, size_t len)
> > +{
> > +    return (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> > +                * sizeof(AVX2_VECTYPE)) == 0
> > +            && ((uintptr_t) buf) % sizeof(AVX2_VECTYPE) == 0);
> > +}
>
> I'm not keen on adding a new file for this.  You ought to be able to use
> __attribute__((target("-mavx2"))) on any compiler that supports the
> command-line option.  Which means you can do this all in one file with
> static functions.
>

I think you mean '__attribute__((target("avx2")))'. I have tried that
approach. The problem is that without the '-mavx2' option, gcc fails with
the compile error '__m256i undeclared', and
__attribute__((target("avx2"))) does not solve this. Any ideas?

If I put the AVX2 intrinsics and the SSE2 intrinsics in a single file and
build it with '-mavx2', the SSE2 intrinsics will be compiled to AVX2
instructions, which is not what we want.

> Nor am I keen on marking a function inline when we know it must be
> out-of-line because of the ifunc usage.

The inline can be removed.

Thanks
Liang

>
> r~
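
For illustration, here is a minimal sketch of the single-file approach
discussed above. It is not the code from the patch. It assumes a compiler
where <immintrin.h> can be included without '-mavx2' and the intrinsics are
usable inside functions carrying the target attribute (GCC 4.9 and later, as
far as I know; older releases reject this, which would explain the
'__m256i undeclared' error). The function names are hypothetical, and the
runtime check uses __builtin_cpu_supports() instead of ifunc to keep the
example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define SKETCH_HAVE_AVX2_TARGET 1   /* hypothetical macro, for this sketch only */
#endif

/* Portable fallback: built with the file's default target flags. */
static bool sketch_is_zero_scalar(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

#ifdef SKETCH_HAVE_AVX2_TARGET
/* Only this function is compiled for AVX2; the rest of the file is not. */
__attribute__((target("avx2")))
static bool sketch_is_zero_avx2(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    __m256i acc = _mm256_setzero_si256();
    size_t i = 0;

    /* OR together every full 32-byte chunk (unaligned loads are allowed). */
    for (; i + 32 <= len; i += 32) {
        acc = _mm256_or_si256(acc,
                              _mm256_loadu_si256((const __m256i *)(p + i)));
    }
    if (!_mm256_testz_si256(acc, acc)) {
        return false;               /* some bit was set in the chunks */
    }
    /* Remaining tail of fewer than 32 bytes. */
    return sketch_is_zero_scalar(p + i, len - i);
}
#endif

/* Picks an implementation at run time based on CPU support. */
bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
#ifdef SKETCH_HAVE_AVX2_TARGET
    if (__builtin_cpu_supports("avx2")) {
        return sketch_is_zero_avx2(buf, len);
    }
#endif
    return sketch_is_zero_scalar(buf, len);
}
```

Because the file itself is built without '-mavx2', any SSE2 code placed
alongside keeps its normal encoding; only the function tagged with the
attribute may use AVX2 instructions.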