Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update

Mark Millard Sun, 28 Feb 2016 14:21:07 -0800

In /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp there is:

>   case Builtin::BI__builtin_dwarf_cfa: {
>     // The offset in bytes from the first argument to the CFA.
>     //
>     // Why on earth is this in the frontend?  Is there any reason at
>     // all that the backend can't reasonably determine this while
>     // lowering llvm.eh.dwarf.cfa()?
>     //
>     // TODO: If there's a satisfactory reason, add a target hook for
>     // this instead of hard-coding 0, which is correct for most targets.
>     int32_t Offset = 0;
>   
>     Value *F = CGM.getIntrinsic(Intrinsic::eh_dwarf_cfa);
>     return RValue::get(Builder.CreateCall(F,
>                                       llvm::ConstantInt::get(Int32Ty, 
> Offset)));
>   }

I would have guessed that the internal argument was how many frames away on the 
stack to go from what 0 produces (high er address direction). g++'s 
__builtin_dwarf_cfa() returns the address for the next frame compared to clang 
3.8.0 (higher address direction).

I'd call that more of a frame depth than an offset. .eh_frame and its cfa 
material use offset terminology as byte offsets. And the comments above talk of 
an offset in bytes --but "next frame" distances in bytes would not be constant.

Looking at a use of LowerFRAMEADDR in a LowerRETURNADDR, for example,

> SDValue ARMTargetLowering::LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) 
> const{
> . . .
>   EVT VT = Op.getValueType();
>   SDLoc dl(Op);
>   unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
>   if (Depth) {
>     SDValue FrameAddr = LowerFRAMEADDR(Op, DAG);
>     SDValue Offset = DAG.getConstant(4, dl, MVT::i32);
>     return DAG.getLoad(VT, dl, DAG.getEntryNode(),
>                        DAG.getNode(ISD::ADD, dl, VT, FrameAddr, Offset),
>                        MachinePointerInfo(), false, false, false, 0);
>   }
> . . .
> }

(PPCTargetLowering::LowerRETURNADDR is similar.) 

This has a mix of Depth and Offset overall, with the depth going to 
LowerFRAMEADDR via Op but Offset used later in GAG.getLoad via adding to the 
FrameAddr.

This would lead me to guess that the terminology and comments in "case 
Builtin::BI__builtin_dwarf_cfa" are wrong and that the Builder.CreateCall has 
been given a frame depth, not an offset.

> SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op,
>                                           SelectionDAG &DAG) const {
>   SDLoc dl(Op);
>   unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
>    
>   MachineFunction &MF = DAG.getMachineFunction();
>   MachineFrameInfo *MFI = MF.getFrameInfo();
>   MFI->setFrameAddressIsTaken(true);
>       
>   EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(MF.getDataLayout());
>   bool isPPC64 = PtrVT == MVT::i64;
>            
>   // Naked functions never have a frame pointer, and so we use r1. For all
>   // other functions, this decision must be delayed until during PEI.
>   unsigned FrameReg;
>   if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
>     FrameReg = isPPC64 ? PPC::X1 : PPC::R1;
>   else
>     FrameReg = isPPC64 ? PPC::FP8 : PPC::FP;
>     
>   SDValue FrameAddr = DAG.getCopyFromReg(DAG.getEntryNode(), dl, FrameReg,
>                                          PtrVT);
>   while (Depth--)
>     FrameAddr = DAG.getLoad(Op.getValueType(), dl, DAG.getEntryNode(),
>                             FrameAddr, MachinePointerInfo(), false, false,
>                             false, 0);
>   return FrameAddr;
> } 

Again Op is called Depth --and is used to get from one frame pointer value to 
the next: a frame depth.

To match g++ 4.2.1 the value to use is 1 for depth.

Overall, at least applied to powerpc/powerpc64:

> . . .

>     // TODO: If there's a satisfactory reason, add a target hook for
>     // this instead of hard-coding 0, which is correct for most targets.
>     int32_t Offset = 0;

I think the comments in this area are actually talking about byte offsets, not 
depths and are just wrong. A byte offset of 0 would make sense relative to 
hardcoding but the value is actually a frame depth --a very different context.

I think that the naming of the variable is just wrong: it should be called 
Depth.

And I think that the comments should have originally talked about using a hard 
coded Depth 1 to match g++ and preexisting library usage of 
__builtin_dwarf_cfa() for C++ and other exception handling (.eh_frame usage). 
ANd the code should avhe matched.

As far as I can tell this error in the "case Builtin::BI__builtin_dwarf_cfa:" 
design was not caught until now.

But since the mess has been around a longtime just switching everything to 
match the g++ context now likely has its own problems. (Not just a FreeBSD 
issue.)

For FreeBSD I expect that Depth 1 could be used for powerpc and powerpc64: if 
it has been wrong for a long time (not just 3.8.0) then powerpc/powerpc64 
FreeBSD has likely been broken for C++ exception handling when buildworld was 
via clang for just as long. (But only recently has clang gotten this near 
working for buildworld for at least one of powerpc/powerpc64. Currently powerpc 
is closer, given that powerpc64 does not support softfloat last I knew.)

For other TARGET_ARCH's:

For FreeBSD armv6 it is less clear to me: it is based on clang as it is and I 
do not know what C++ exception ABI it uses. If a modern gcc/g++ buildworld had 
problems with C++ exception handling, does anything need to be done about it? 
For FreeBSD armv6 and the like: is xtoolchain like support important?

FreeBSD may have similar questions for other TARGET_ARCH's.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 2:46 AM, Mark Millard <markmi at dsl-only.net> wrote:

On 2016-Feb-28, at 12:39 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
> 
> Mark,
> 
> __builtin_dwarf_cfa() is lowered in clang to llvm intrinsic eh_dwarf_cfa.
> There's a depth argument (which defaults to 0, saying it's correct for most
> targets). 
> 
> Then the intrinsic gets lowered in SelectionDAG using
> PPCTargetLowering::LowerFRAMEADDR()
> 
> 
> Can you check that 1) the depth should be 0 for ppc64/ppc32 2) that
> LowerFRAMEADDR() does something sensible?
> 
> There's a loop depth-times, so I wonder if that makes a difference.
> 
> Thanks, Roman

"Lowered"? I'm not familiar with the clang code base or its terminology. Handed 
off to a lower level interface, may be?

As near as I can tell libgcc_s could be made to deal with clang 3.8.0's way of 
__builtin_dwarf_cfa() working for powerpc/powerpc64. But then use with g++ 
would be broken on powerpc/powerpc64 unless there were some sort of live "which 
compiler's type of code" test also involved.

Having only one libgcc_s and multiple compilers using it for a given 
TARGET_ARCH= (for example, devel/powerpc64-xtoolchain-gcc like uses) suggests 
sticking to one convention per TARGET_ARCH= for __builtin_dwarf_cfa().

I would guess that g++ conventions win in this type of context for FreeBSD, 
under the guideline of trying to be gcc 4.2.1 "ABI" compatible. libgcc_s from 
FreeBSD works for C++ exceptions with its gcc 4.2.1 for powerpc and powerpc64 
as things are as far as I know.

But for clang++ FreeBSD is one context among many and other libraries may be 
based on clang 3.8.0's existing interpretation, without gcc/g++ compatibility 
constraints. (I've no experience with earlier clang vintages for the issue.) It 
may be that any change needs to be FreeBSD target-triple specific for all I 
know. In essence: making the convention part of the ABI chosen.

I'll probably get some sleep before looking much at the code that you 
reference. A quick look at part of it suggests a fair amount of research/study 
for me to interpret things reliably.

The loop may be somewhat analogous to _Unwind_RaiseException's loop, but for a 
specific depth. I would currently guess that depth 1 would match gcc 4.2.1's 
result for __builtin_dwarf_cfa().

But there was also some other "address"(?) builtin support routine nearby that 
seemed to call into LowerFRAMEADDR() and I've no clue if g++ 4.2.1 uses the 
same depth-figure standard for both sorts of things or not. For all I know both 
types of builtins(?) might have mismatches with gcc/g++ 4.2.1 and both might 
need fixes.

I do vaguely remember seeing a builtin in FreeBSD code for something that had 
an explicit number in the argument list, possibly 
__builtin_frame_address(n)(?). But I only saw __builtin_dwarf_cfa() with no 
argument in the argument list as far as I remember.

If clang 3.8.0 and gcc 4.2.1 disagreed about what the numbering standard 
referred to for __builtin_frame_address(n) (or whatever it was), that would not 
be good and compatibility would imply conversions between the conventions for 
the 2 FreeBSD contexts.

I have not checked for armv6 related clang++ vs. g++ compatibility for C++ 
exception-handling. If anything is not operating for that context I expect that 
it would be g++ that generated buildworld code that did not work based on the 
FreeBSD source code: clang/clang++ is the normal compiler and kyua seemed to 
operate, unlike on the powerpc/powerpc64.

I've never tried to build armv6 via an equivalent of devel/powerpc64-gcc. I do 
not know if armv6 even uses the same sort of C++ exception-handling ABI or not. 
But I do know that __builtin_dwarf_cfa() is not compatible between clang++ and 
g++ from the 2-line source and comparing objdump -d results.

So more than powerpc/powerpc64 might be involved overall.

===
Mark Millard
markmi at dsl-only.net

_______________________________________________
freebsd-toolchain@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"

Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update

Reply via email to