RE: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16

2021-07-15 Thread Wang, Pengfei via Gcc-patches
It seems Clang doesn't support -fexcess-precision=xxx: https://github.com/llvm/llvm-project/blob/main/clang/test/Driver/clang_f_opts.c#L403 Thanks, Pengfei -----Original Message----- From: Hongtao Liu Sent: Thursday, July 15, 2021 2:35 PM To: Wang, Pengfei Cc: Craig Topper; Jakub Jelinek;

RE: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16

2021-07-14 Thread Wang, Pengfei via Gcc-patches
* Clang for AArch64 promotes each individual operation and rounds immediately afterwards. https://godbolt.org/z/qzGfv6nvo note the fcvts between the two fadd operations. It's implemented in the LLVM backend where we can't see what was originally a single expression. Yes, but this is not
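A minimal sketch of why the placement of the rounding step matters, in case it helps readers of this thread. It is not taken from either compiler: it uses float and double as portable stand-ins for _Float16 and float, and the function names are hypothetical. One function rounds back to the narrow type after every operation (the per-operation behaviour described above); the other evaluates the whole expression in the wider type and rounds once.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: float stands in for _Float16 and double for the wider
 * type, so this builds on compilers without _Float16 support. */

/* Promote each operation and round immediately afterwards. */
float sum_round_each_op(float a, float b, float c)
{
    float t = (float)((double)a + (double)b); /* rounded to float here */
    return (float)((double)t + (double)c);    /* and rounded again here */
}

/* Evaluate the whole expression in the wider type, round once. */
float sum_round_once(float a, float b, float c)
{
    return (float)((double)a + (double)b + (double)c);
}

/* Hypothetical demo: 2^24 + 1 is not representable in float, so the
 * intermediate rounding discards each added 1. */
void demo(void)
{
    float a = 16777216.0f; /* 2^24 */
    float b = 1.0f;
    float c = 1.0f;

    printf("round each op: %.1f\n", sum_round_each_op(a, b, c));
    printf("round once:    %.1f\n", sum_round_once(a, b, c));

    assert(sum_round_each_op(a, b, c) != sum_round_once(a, b, c));
}
```

With these inputs the per-operation version stays at 2^24 while the round-once version reaches 2^24 + 2, so the two strategies give visibly different results for the same source expression.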

RE: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-13 Thread Wang, Pengfei via Gcc-patches
Hi H.J., Our LLVM implementation currently uses %xmm0 for both the real part and the imaginary part of _Complex. Is there a special reason to use two registers? We are using one register on X64. Considering performance, especially register pressure, would it be better to use one register for

RE: [llvm-dev] [PATCH] Add optional _Float16 support

2021-07-12 Thread Wang, Pengfei via Gcc-patches
> Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers. Can you please explain the behavior here? Is there a difference between _Float16 and _Complex _Float16 when they are returned? I.e., 1. In which case will _Float16 values be returned in both %xmm0 and %xmm1? 2. For a single _Float16 value,