On Thu, Jul 15, 2021 at 2:58 PM Wang, Pengfei <pengfei.w...@intel.com> wrote:
>
> It seems Clang doesn't support -fexcess-precision=xxx:
> https://github.com/llvm/llvm-project/blob/main/clang/test/Driver/clang_f_opts.c#L403
>
> Thanks
> Pengfei
>
> -----Original Message-----
> From: Hongtao Liu <crazy...@gmail.com>
> Sent: Thursday, July 15, 2021 2:35 PM
> To: Wang, Pengfei <pengfei.w...@intel.com>
> Cc: Craig Topper <craig.top...@gmail.com>; Jakub Jelinek <ja...@redhat.com>;
> Liu, Hongtao <hongtao....@intel.com>; gcc-patches@gcc.gnu.org; Joseph Myers
> <jos...@codesourcery.com>
> Subject: Re: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16
>
> On Thu, Jul 15, 2021 at 10:07 AM Wang, Pengfei <pengfei.w...@intel.com> wrote:
> >
> > Clang for AArch64 promotes each individual operation and rounds immediately
> > afterwards. See https://godbolt.org/z/qzGfv6nvo and note the fcvts between
> > the two fadd operations. It's implemented in the LLVM backend where we can't
> > see what was originally a single expression.
> >
> > Yes, but this is not consistent with the Clang documentation. I think we
> > should ask the Clang FE to do the promotion and truncation.
> >
> > Thanks
> >
> > Pengfei
> >
> > From: llvm-dev <llvm-dev-boun...@lists.llvm.org> On Behalf Of Craig
> > Topper via llvm-dev
> > Sent: Wednesday, July 14, 2021 11:32 PM
> > To: Hongtao Liu <crazy...@gmail.com>
> > Cc: Jakub Jelinek <ja...@redhat.com>; llvm-dev
> > <llvm-...@lists.llvm.org>; Liu, Hongtao <hongtao....@intel.com>;
> > gcc-patches@gcc.gnu.org; Joseph Myers <jos...@codesourcery.com>
> > Subject: Re: [llvm-dev] [PATCH 0/2] Initial support for AVX512FP16
> >
> > On Wed, Jul 14, 2021 at 12:45 AM Hongtao Liu via llvm-dev
> > <llvm-...@lists.llvm.org> wrote:
> >
> > > Setting excess_precision_type to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16, so
> > > that we round after each operation, would keep the semantics right.
> > > And I'll document the difference in exception behavior between soft-fp
> > > and AVX512FP16 instructions.
> > I got some feedback from my colleague who's working on supporting
> > _Float16 for LLVM.
> > The LLVM side wants to set FLT_EVAL_METHOD_PROMOTE_TO_FLOAT for
> > soft-fp so that the generated code can be more efficient.
> > i.e.
> > _Float16 a, b, c, d;
> > d = a + b + c;
> >
> > would be transformed to
> > float tmp, tmp1, a1, b1, c1;
> > a1 = (float) a;
> > b1 = (float) b;
> > c1 = (float) c;
> > tmp = a1 + b1;
> > tmp1 = tmp + c1;
> > d = (_Float16) tmp1;
> >
> > so there's only one truncation, at the end.
> >
> > If users want to round back after every operation, the code should be
> > written explicitly as
> > _Float16 a, b, c, d, e;
> > e = a + b;
> > d = e + c;
> >
> > That's what Clang does, quote from [1]
> > _Float16 arithmetic will be performed using native half-precision
> > support when available on the target (e.g. on ARMv8.2a); otherwise it
> > will be performed at a higher precision (currently always float) and
> > then truncated down to _Float16. Note that C and C++ allow
> > intermediate floating-point operands of an expression to be computed
> > with greater precision than is expressible in their type, so Clang may
> > avoid intermediate truncations in certain cases; this may lead to
> > results that are inconsistent with native arithmetic.
> >
> > Clang for AArch64 promotes each individual operation and rounds immediately
> > afterwards. See https://godbolt.org/z/qzGfv6nvo and note the fcvts between
> > the two fadd operations. It's implemented in the LLVM backend where we can't
> > see what was originally a single expression.
> >
>
> I was reading the documentation for the excess-precision option at
> https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>
> -fexcess-precision=style

With this option we can let users choose whether or not to round back after
each operation, which should be more convenient.
>
> This option allows further control over excess precision on machines where
> floating-point operations occur in a format with more precision or range than
> the IEEE standard and interchange floating-point types.
> By default, -fexcess-precision=fast is in effect; this means that operations
> may be carried out in a wider precision than the types specified in the
> source if that would result in faster code, and it is unpredictable when
> rounding to the types specified in the source code takes place. When
> compiling C, if -fexcess-precision=standard is specified then excess
> precision follows the rules specified in ISO C99; in particular, both casts
> and assignments cause values to be rounded to their semantic types (whereas
> -ffloat-store only affects assignments). This option is enabled by default
> for C if a strict conformance option such as -std=c99 is used. -ffast-math
> enables -fexcess-precision=fast by default regardless of whether a strict
> conformance option is used.
>
> For -fexcess-precision=fast,
> we should set flt_eval_method to FLT_EVAL_METHOD_PROMOTE_TO_FLOAT for
> soft-fp, and FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 for AVX512FP16.
>
> For -fexcess-precision=standard,
> should we set FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_SSE2, so that
> soft-fp rounds back after every operation?
>
> >
> > and so does Arm GCC,
> > quote from arm.c
> >
> > /* We can calculate either in 16-bit range and precision or
> >    32-bit range and precision. Make that decision based on whether
> >    we have native support for the ARMv8.2-A 16-bit floating-point
> >    instructions or not. */
> > return (TARGET_VFP_FP16INST
> >         ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
> >         : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
> >
> > [1] https://clang.llvm.org/docs/LanguageExtensions.html
> >
> > > > --
> > > > Joseph S. Myers
> > > > jos...@codesourcery.com

--
BR,
Hongtao