pengfei added inline comments.
================ Comment at: clang/lib/CodeGen/CGExprComplex.cpp:896 + +ComplexPairTy ComplexExprEmitter::EmitPromoted(const Expr *E) { + if (auto *BinOp = dyn_cast<BinaryOperator>(E->IgnoreParens())) { ---------------- zahiraam wrote: > pengfei wrote: > > rjmccall wrote: > > > pengfei wrote: > > > > rjmccall wrote: > > > > > zahiraam wrote: > > > > > > rjmccall wrote: > > > > > > > `EmitPromoted` should take the promotion type. > > > > > > > > > > > > > > You are missing the logic in this file which handles *unpromoted* > > > > > > > emission (where you have a binary operator but the context wants > > > > > > > an unpromoted value) by recognizing that you need to do a > > > > > > > promoted operation and then truncate. > > > > > > Sorry but not sure what you mean here. A case where we don't want > > > > > > any promotion would be: > > > > > > > > > > > > float _Complex add(float _Complex a, float _Complex b) { > > > > > > return a + b; > > > > > > } > > > > > > > > > > > > In this case, getPromotionType would return the null type and > > > > > > EmitBinOps would just go through the "older" control path where > > > > > > there is no promotion. > > > > > > Unless I misunderstood your comment? > > > > > > > > > > > > > > > > > I'm talking about the unpromoted emission path for an operation that > > > > > we want to *internally* promote. So like your example but using > > > > > `_Float16` — we want to emit the binary `+` as a `float` operation, > > > > > but the context needs an unpromoted value because it's going to be > > > > > returned. > > > > > > > > > > The promoted emission path (the various `EmitPromoted` methods) > > > > > represents a request by the caller to produce a result that doesn't > > > > > match the formal type of the expression. The normal emission path > > > > > `Visit` etc.) represents a request by the caller to produce a result > > > > > normally, i.e. one that matches the formal type. In general, we > > > > > always start in the latter because arbitrary contexts always expect a > > > > > value of the formal type; it's only these recursive calls within > > > > > promoted emitters that contextually want a promoted value. > > > > > > > > > > In your current patch, you're entering the promoted path by > > > > > special-casing one context that frequently terminates promoted > > > > > emission: `EmitComplexExprIntoLValue`, which is used for things like > > > > > assignment. That's not the right way to handle it, though. Instead, > > > > > you should do like I asked you to do with the scalar emitter, which > > > > > is to recognize in the normal emission path for a promotable > > > > > operation (like binary `+`) that you need to promote the operation > > > > > and then unpromote the result. Then things like > > > > > `EmitComplexExprIntoLValue` will continue to simply enter the normal > > > > > path because that's the kind of value they need, and the promotion > > > > > logic will do the right thing from there to ensure we don't emit a > > > > > `_Float16` operation in LLVM IR. > > > > > > > > > > Incidentally, I thought of another context that you ought to do > > > > > promotion for: when we're converting a promotable value to a larger > > > > > floating-point type, we should presumably convert the promoted value > > > > > directly rather than truncating it to the unpromoted type before the > > > > > conversion. > > > > I'm not familiar with the FE details, but I understand the background > > > > is just to promote expressions that have more than one operation. So I > > > > agree with @zahiraam, and > > > > > to ensure we don't emit a _Float16 operation in LLVM IR > > > > is not what we expected here. > > > > > > > > `_Float16` is a ABI type for which a target supported has to lower any > > > > `_Float16` operations in LLVM IR. See what we have done on X86 backend > > > > by D107082. > > > > > > > > That said, a single `a + b` can be and should be emitted as `%c = fadd > > > > half %a, %b`. > > > Well, you can provide operation-by-operation lowering support if you want > > > to, but what's going to come out of Clang IRGen in this mode is not going > > > to rely on it. It's far easier to just emit these operations > > > differently, always using `float`, than to try to recognize the special > > > cases where all the inputs are naturally `half` and so you might as well > > > emit a `half` operation. > > > > > > And I think the postcondition that `half` operations don't come out of > > > IRGen is a generally useful property even if you've already written this > > > lowering code. Among other things, it makes it easy to test that you > > > haven't missed a case where you should be doing promoted emission. > > We define `_Float16` as ABI type and require targets to support the ABI > > lowering. See > > https://clang.llvm.org/docs/LanguageExtensions.html#half-precision-floating-point > > Promoting in FE doesn't mean it works on any targets by default, so I > > think it's not much useful. > > We need to support `-fexcess-precision=16` the same as GCC does, it's > > easier to do nothing than promote and then truncate each of the operations > > in FE. > > I think in general, lowering operations for a type is a business of > > backend. We still need backend to do so for other FEs even we do it in > > Clang. A recent example is BF16, which people chose to lower in backend > > rather than mlir, see D126444. > > > > > Among other things, it makes it easy to test that you haven't missed a > > > case where you should be doing promoted emission. > > I don't understand, how can we check the missed cases if the promotion was > > done in FE? > This patch is doing by default what GCC is doing with the > -fexcess-precision=16. And that would be for expressions with any number of > operators. But it looks like you want promotion/truncation only for > expression with more than one operator? I think I agree with @rjmccall that > it is easier done this way than trying to recognize the special cases. > > What you want is for a+b to generate fadd half (not extension, and no > truncation ) and for a+b*c to generated a set of fpext, fadd float and > fptrunc? > Does GCC use their option for special cases too? I would think that when this > option is used all expression (regardless of the number of operators) are > extended/truncated right? > > I am not sure I like the idea of lowering some cases in the FE and some > others in the BE. I think it would be cleaner to decide that lowering be done > in FE for all expressions and have BE react to it. > > @rmjccall not sure what you mean by: > "And I think the postcondition that half operations don't come out of IRGen > is a generally useful property even if you've already written this lowering > code. Among other things, it makes it easy to test that you haven't missed a > case where you should be doing promoted emission." > > > > Oh, we may misunderstand each other. Let me clarify my understanding about `expression` so that I can make sure we are talking about the same thing. When I mentioned `expression`, I meant to the C/C++ expressions like: ``` _Float16 t1 = a + b; _Float16 t2 = a + b + c; ``` So when I talk about `expression granularity`, I mean we do truncation once not matter how many actual operations in the expression. This is an example of the differences when GCC generates code for them https://godbolt.org/z/YndnP8eM4 > This patch is doing by default what GCC is doing with the > -fexcess-precision=16. And that would be for expressions with any number of > operators. I thought this patch is doing the GCC default behavior that without `-fexcess-precision=16`. For `-fexcess-precision=16`, we don't need to do any promotion at all in FE. > Does GCC use their option for special cases too? I would think that when this > option is used all expression (regardless of the number of operators) are > extended/truncated right? GCC needs the option because it only truncates once per expression by default. Is it the same mean as you mentioned `all expression are extended/truncated`? > I am not sure I like the idea of lowering some cases in the FE and some > others in the BE. I think it would be cleaner to decide that lowering be done > in FE for all expressions and have BE react to it. The `expression granularity` I expected can only be done in the FE, because we don't have the scope information of C/C++ expression in IR. FE can choose to do the promotion/truncation pre operation, or leave it the backend which is what I suggested here. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113107/new/ https://reviews.llvm.org/D113107 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits