I will try implementing John's suggestions. Thanks. Sam
From: rjmcc...@apple.com [mailto:rjmcc...@apple.com] Sent: Saturday, March 10, 2018 12:14 PM To: Richard Smith <rich...@metafoo.co.uk> Cc: Liu, Yaxun (Sam) <yaxun....@amd.com>; cfe-commits <cfe-commits@lists.llvm.org> Subject: Re: r326946 - CodeGen: Fix address space of indirect function argument On Mar 9, 2018, at 8:51 PM, Richard Smith <rich...@metafoo.co.uk<mailto:rich...@metafoo.co.uk>> wrote: Hi, This change increases the size of a CallArg, and thus that of a CallArgList, dramatically (an LValue is *much* larger than an RValue, and unlike an RValue, does not appear to be at all optimized for size). This results in CallArgList (which contains inline storage for 16 CallArgs) ballooning in size from ~500 bytes to 2KB, resulting in stack overflows on programs with deep ASTs. Given that this introduces a regression (due to stack overflow), I've reverted in r327195. Sorry about that. Seems reasonable. The right short-term fix is probably a combination of the following: - put a little bit of effort into packing LValue (using fewer than 64 bits for the alignment and packing the bit-fields, I think) - drop the inline argument count to something like 8, which should still capture the vast majority of calls. There are quite a few other space optimizations we could do the LValue afterwards. I think we could eliminate BaseIVarExp completely by storing some sort of ObjC GC classification and then just evaluating the base expression normally. The program it broke looked something like this: struct VectorBuilder { VectorBuilder &operator,(int); }; void f() { VectorBuilder(), 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0, 1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0; } using Eigen's somewhat-ridiculous comma initializer technique (https://eigen.tuxfamily.org/dox/group__TutorialAdvancedInitialization.html) to build an approx 40 x 40 array. An AST 1600 levels deep is somewhat extreme, but it still seems like something we should be able to handle within our default 8MB stack limit. I mean, at some point this really is not supportable, even if it happens to build on current compilers. If we actually documented and enforced our implementation limits like we're supposed to, I do not think we would allow this much call nesting. John. On 7 March 2018 at 13:45, Yaxun Liu via cfe-commits <cfe-commits@lists.llvm.org<mailto:cfe-commits@lists.llvm.org>> wrote: Author: yaxunl Date: Wed Mar 7 13:45:40 2018 New Revision: 326946 URL: http://llvm.org/viewvc/llvm-project?rev=326946&view=rev Log: CodeGen: Fix address space of indirect function argument The indirect function argument is in alloca address space in LLVM IR. However, during Clang codegen for C++, the address space of indirect function argument should match its address space in the source code, i.e., default addr space, even for indirect argument. This is because destructor of the indirect argument may be called in the caller function, and address of the indirect argument may be taken, in either case the indirect function argument is expected to be in default addr space, not the alloca address space. Therefore, the indirect function argument should be mapped to the temp var casted to default address space. The caller will cast it to alloca addr space when passing it to the callee. In the callee, the argument is also casted to the default address space and used. CallArg is refactored to facilitate this fix. Differential Revision: https://reviews.llvm.org/D34367 Added: cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp Modified: cfe/trunk/lib/CodeGen/CGAtomic.cpp cfe/trunk/lib/CodeGen/CGCall.cpp cfe/trunk/lib/CodeGen/CGCall.h cfe/trunk/lib/CodeGen/CGClass.cpp cfe/trunk/lib/CodeGen/CGDecl.cpp cfe/trunk/lib/CodeGen/CGExprCXX.cpp cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp cfe/trunk/lib/CodeGen/CGObjCGNU.cpp cfe/trunk/lib/CodeGen/CGObjCMac.cpp cfe/trunk/lib/CodeGen/CodeGenFunction.h cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl<http://addr-space-struct-arg.cl/> cfe/trunk/test/CodeGenOpenCL/byval.cl<http://byval.cl/> Modified: cfe/trunk/lib/CodeGen/CGAtomic.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGAtomic.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGAtomic.cpp (original) +++ cfe/trunk/lib/CodeGen/CGAtomic.cpp Wed Mar 7 13:45:40 2018 @@ -1160,7 +1160,7 @@ RValue CodeGenFunction::EmitAtomicExpr(A if (UseOptimizedLibcall && Res.getScalarVal()) { llvm::Value *ResVal = Res.getScalarVal(); if (PostOp) { - llvm::Value *LoadVal1 = Args[1].RV.getScalarVal(); + llvm::Value *LoadVal1 = Args[1].getRValue(*this).getScalarVal(); ResVal = Builder.CreateBinOp(PostOp, ResVal, LoadVal1); } if (E->getOp() == AtomicExpr::AO__atomic_nand_fetch) Modified: cfe/trunk/lib/CodeGen/CGCall.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGCall.cpp (original) +++ cfe/trunk/lib/CodeGen/CGCall.cpp Wed Mar 7 13:45:40 2018 @@ -1040,42 +1040,49 @@ void CodeGenFunction::ExpandTypeFromArgs } void CodeGenFunction::ExpandTypeToArgs( - QualType Ty, RValue RV, llvm::FunctionType *IRFuncTy, + QualType Ty, CallArg Arg, llvm::FunctionType *IRFuncTy, SmallVectorImpl<llvm::Value *> &IRCallArgs, unsigned &IRCallArgPos) { auto Exp = getTypeExpansion(Ty, getContext()); if (auto CAExp = dyn_cast<ConstantArrayExpansion>(Exp.get())) { - forConstantArrayExpansion(*this, CAExp, RV.getAggregateAddress(), - [&](Address EltAddr) { - RValue EltRV = - convertTempToRValue(EltAddr, CAExp->EltTy, SourceLocation()); - ExpandTypeToArgs(CAExp->EltTy, EltRV, IRFuncTy, IRCallArgs, IRCallArgPos); - }); + Address Addr = Arg.hasLValue() ? Arg.getKnownLValue().getAddress() + : Arg.getKnownRValue().getAggregateAddress(); + forConstantArrayExpansion( + *this, CAExp, Addr, [&](Address EltAddr) { + CallArg EltArg = CallArg( + convertTempToRValue(EltAddr, CAExp->EltTy, SourceLocation()), + CAExp->EltTy); + ExpandTypeToArgs(CAExp->EltTy, EltArg, IRFuncTy, IRCallArgs, + IRCallArgPos); + }); } else if (auto RExp = dyn_cast<RecordExpansion>(Exp.get())) { - Address This = RV.getAggregateAddress(); + Address This = Arg.hasLValue() ? Arg.getKnownLValue().getAddress() + : Arg.getKnownRValue().getAggregateAddress(); for (const CXXBaseSpecifier *BS : RExp->Bases) { // Perform a single step derived-to-base conversion. Address Base = GetAddressOfBaseClass(This, Ty->getAsCXXRecordDecl(), &BS, &BS + 1, /*NullCheckValue=*/false, SourceLocation()); - RValue BaseRV = RValue::getAggregate(Base); + CallArg BaseArg = CallArg(RValue::getAggregate(Base), BS->getType()); // Recurse onto bases. - ExpandTypeToArgs(BS->getType(), BaseRV, IRFuncTy, IRCallArgs, + ExpandTypeToArgs(BS->getType(), BaseArg, IRFuncTy, IRCallArgs, IRCallArgPos); } LValue LV = MakeAddrLValue(This, Ty); for (auto FD : RExp->Fields) { - RValue FldRV = EmitRValueForField(LV, FD, SourceLocation()); - ExpandTypeToArgs(FD->getType(), FldRV, IRFuncTy, IRCallArgs, + CallArg FldArg = + CallArg(EmitRValueForField(LV, FD, SourceLocation()), FD->getType()); + ExpandTypeToArgs(FD->getType(), FldArg, IRFuncTy, IRCallArgs, IRCallArgPos); } } else if (isa<ComplexExpansion>(Exp.get())) { - ComplexPairTy CV = RV.getComplexVal(); + ComplexPairTy CV = Arg.getKnownRValue().getComplexVal(); IRCallArgs[IRCallArgPos++] = CV.first; IRCallArgs[IRCallArgPos++] = CV.second; } else { assert(isa<NoExpansion>(Exp.get())); + auto RV = Arg.getKnownRValue(); assert(RV.isScalar() && "Unexpected non-scalar rvalue during struct expansion."); @@ -3418,13 +3425,17 @@ void CodeGenFunction::EmitCallArgs( assert(InitialArgSize + 1 == Args.size() && "The code below depends on only adding one arg per EmitCallArg"); (void)InitialArgSize; - RValue RVArg = Args.back().RV; - EmitNonNullArgCheck(RVArg, ArgTypes[Idx], (*Arg)->getExprLoc(), AC, - ParamsToSkip + Idx); - // @llvm.objectsize should never have side-effects and shouldn't need - // destruction/cleanups, so we can safely "emit" it after its arg, - // regardless of right-to-leftness - MaybeEmitImplicitObjectSize(Idx, *Arg, RVArg); + // Since pointer argument are never emitted as LValue, it is safe to emit + // non-null argument check for r-value only. + if (!Args.back().hasLValue()) { + RValue RVArg = Args.back().getKnownRValue(); + EmitNonNullArgCheck(RVArg, ArgTypes[Idx], (*Arg)->getExprLoc(), AC, + ParamsToSkip + Idx); + // @llvm.objectsize should never have side-effects and shouldn't need + // destruction/cleanups, so we can safely "emit" it after its arg, + // regardless of right-to-leftness + MaybeEmitImplicitObjectSize(Idx, *Arg, RVArg); + } } if (!LeftToRight) { @@ -3471,6 +3482,31 @@ struct DisableDebugLocationUpdates { } // end anonymous namespace +RValue CallArg::getRValue(CodeGenFunction &CGF) const { + if (!HasLV) + return RV; + LValue Copy = CGF.MakeAddrLValue(CGF.CreateMemTemp(Ty), Ty); + CGF.EmitAggregateCopy(Copy, LV, Ty, LV.isVolatile()); + IsUsed = true; + return RValue::getAggregate(Copy.getAddress()); +} + +void CallArg::copyInto(CodeGenFunction &CGF, Address Addr) const { + LValue Dst = CGF.MakeAddrLValue(Addr, Ty); + if (!HasLV && RV.isScalar()) + CGF.EmitStoreOfScalar(RV.getScalarVal(), Dst, /*init=*/true); + else if (!HasLV && RV.isComplex()) + CGF.EmitStoreOfComplex(RV.getComplexVal(), Dst, /*init=*/true); + else { + auto Addr = HasLV ? LV.getAddress() : RV.getAggregateAddress(); + LValue SrcLV = CGF.MakeAddrLValue(Addr, Ty); + CGF.EmitAggregateCopy(Dst, SrcLV, Ty, + HasLV ? LV.isVolatileQualified() + : RV.isVolatileQualified()); + } + IsUsed = true; +} + void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E, QualType type) { DisableDebugLocationUpdates Dis(*this, E); @@ -3536,15 +3572,7 @@ void CodeGenFunction::EmitCallArg(CallAr cast<CastExpr>(E)->getCastKind() == CK_LValueToRValue) { LValue L = EmitLValue(cast<CastExpr>(E)->getSubExpr()); assert(L.isSimple()); - if (L.getAlignment() >= getContext().getTypeAlignInChars(type)) { - args.add(L.asAggregateRValue(), type, /*NeedsCopy*/true); - } else { - // We can't represent a misaligned lvalue in the CallArgList, so copy - // to an aligned temporary now. - LValue Dest = MakeAddrLValue(CreateMemTemp(type), type); - EmitAggregateCopy(Dest, L, type, L.isVolatile()); - args.add(RValue::getAggregate(Dest.getAddress()), type); - } + args.addUncopiedAggregate(L, type); return; } @@ -3702,16 +3730,6 @@ CodeGenFunction::EmitCallOrInvoke(llvm:: return llvm::CallSite(Inst); } -/// \brief Store a non-aggregate value to an address to initialize it. For -/// initialization, a non-atomic store will be used. -static void EmitInitStoreOfNonAggregate(CodeGenFunction &CGF, RValue Src, - LValue Dst) { - if (Src.isScalar()) - CGF.EmitStoreOfScalar(Src.getScalarVal(), Dst, /*init=*/true); - else - CGF.EmitStoreOfComplex(Src.getComplexVal(), Dst, /*init=*/true); -} - void CodeGenFunction::deferPlaceholderReplacement(llvm::Instruction *Old, llvm::Value *New) { DeferredReplacements.push_back(std::make_pair(Old, New)); @@ -3804,7 +3822,6 @@ RValue CodeGenFunction::EmitCall(const C for (CallArgList::const_iterator I = CallArgs.begin(), E = CallArgs.end(); I != E; ++I, ++info_it, ++ArgNo) { const ABIArgInfo &ArgInfo = info_it->info; - RValue RV = I->RV; // Insert a padding argument to ensure proper alignment. if (IRFunctionArgs.hasPaddingArg(ArgNo)) @@ -3818,13 +3835,16 @@ RValue CodeGenFunction::EmitCall(const C case ABIArgInfo::InAlloca: { assert(NumIRArgs == 0); assert(getTarget().getTriple().getArch() == llvm::Triple::x86); - if (RV.isAggregate()) { + if (I->isAggregate()) { // Replace the placeholder with the appropriate argument slot GEP. + Address Addr = I->hasLValue() + ? I->getKnownLValue().getAddress() + : I->getKnownRValue().getAggregateAddress(); llvm::Instruction *Placeholder = - cast<llvm::Instruction>(RV.getAggregatePointer()); + cast<llvm::Instruction>(Addr.getPointer()); CGBuilderTy::InsertPoint IP = Builder.saveIP(); Builder.SetInsertPoint(Placeholder); - Address Addr = createInAllocaStructGEP(ArgInfo.getInAllocaFieldIndex()); + Addr = createInAllocaStructGEP(ArgInfo.getInAllocaFieldIndex()); Builder.restoreIP(IP); deferPlaceholderReplacement(Placeholder, Addr.getPointer()); } else { @@ -3837,22 +3857,20 @@ RValue CodeGenFunction::EmitCall(const C // from {}* to (%struct.foo*)*. if (Addr.getType() != MemType) Addr = Builder.CreateBitCast(Addr, MemType); - LValue argLV = MakeAddrLValue(Addr, I->Ty); - EmitInitStoreOfNonAggregate(*this, RV, argLV); + I->copyInto(*this, Addr); } break; } case ABIArgInfo::Indirect: { assert(NumIRArgs == 1); - if (RV.isScalar() || RV.isComplex()) { + if (!I->isAggregate()) { // Make a temporary alloca to pass the argument. Address Addr = CreateMemTemp(I->Ty, ArgInfo.getIndirectAlign(), "indirect-arg-temp", false); IRCallArgs[FirstIRArg] = Addr.getPointer(); - LValue argLV = MakeAddrLValue(Addr, I->Ty); - EmitInitStoreOfNonAggregate(*this, RV, argLV); + I->copyInto(*this, Addr); } else { // We want to avoid creating an unnecessary temporary+copy here; // however, we need one in three cases: @@ -3860,32 +3878,51 @@ RValue CodeGenFunction::EmitCall(const C // source. (This case doesn't occur on any common architecture.) // 2. If the argument is byval, RV is not sufficiently aligned, and // we cannot force it to be sufficiently aligned. - // 3. If the argument is byval, but RV is located in an address space - // different than that of the argument (0). - Address Addr = RV.getAggregateAddress(); + // 3. If the argument is byval, but RV is not located in default + // or alloca address space. + Address Addr = I->hasLValue() + ? I->getKnownLValue().getAddress() + : I->getKnownRValue().getAggregateAddress(); + llvm::Value *V = Addr.getPointer(); CharUnits Align = ArgInfo.getIndirectAlign(); const llvm::DataLayout *TD = &CGM.getDataLayout(); - const unsigned RVAddrSpace = Addr.getType()->getAddressSpace(); - const unsigned ArgAddrSpace = - (FirstIRArg < IRFuncTy->getNumParams() - ? IRFuncTy->getParamType(FirstIRArg)->getPointerAddressSpace() - : 0); - if ((!ArgInfo.getIndirectByVal() && I->NeedsCopy) || - (ArgInfo.getIndirectByVal() && Addr.getAlignment() < Align && - llvm::getOrEnforceKnownAlignment(Addr.getPointer(), - Align.getQuantity(), *TD) - < Align.getQuantity()) || - (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) { + + assert((FirstIRArg >= IRFuncTy->getNumParams() || + IRFuncTy->getParamType(FirstIRArg)->getPointerAddressSpace() == + TD->getAllocaAddrSpace()) && + "indirect argument must be in alloca address space"); + + bool NeedCopy = false; + + if (Addr.getAlignment() < Align && + llvm::getOrEnforceKnownAlignment(V, Align.getQuantity(), *TD) < + Align.getQuantity()) { + NeedCopy = true; + } else if (I->hasLValue()) { + auto LV = I->getKnownLValue(); + auto AS = LV.getAddressSpace(); + if ((!ArgInfo.getIndirectByVal() && + (LV.getAlignment() >= + getContext().getTypeAlignInChars(I->Ty))) || + (ArgInfo.getIndirectByVal() && + ((AS != LangAS::Default && AS != LangAS::opencl_private && + AS != CGM.getASTAllocaAddressSpace())))) { + NeedCopy = true; + } + } + if (NeedCopy) { // Create an aligned temporary, and copy to it. Address AI = CreateMemTemp(I->Ty, ArgInfo.getIndirectAlign(), "byval-temp", false); IRCallArgs[FirstIRArg] = AI.getPointer(); - LValue Dest = MakeAddrLValue(AI, I->Ty); - LValue Src = MakeAddrLValue(Addr, I->Ty); - EmitAggregateCopy(Dest, Src, I->Ty, RV.isVolatileQualified()); + I->copyInto(*this, AI); } else { // Skip the extra memcpy call. - IRCallArgs[FirstIRArg] = Addr.getPointer(); + auto *T = V->getType()->getPointerElementType()->getPointerTo( + CGM.getDataLayout().getAllocaAddrSpace()); + IRCallArgs[FirstIRArg] = getTargetHooks().performAddrSpaceCast( + *this, V, LangAS::Default, CGM.getASTAllocaAddressSpace(), T, + true); } } break; @@ -3902,10 +3939,12 @@ RValue CodeGenFunction::EmitCall(const C ArgInfo.getDirectOffset() == 0) { assert(NumIRArgs == 1); llvm::Value *V; - if (RV.isScalar()) - V = RV.getScalarVal(); + if (!I->isAggregate()) + V = I->getKnownRValue().getScalarVal(); else - V = Builder.CreateLoad(RV.getAggregateAddress()); + V = Builder.CreateLoad( + I->hasLValue() ? I->getKnownLValue().getAddress() + : I->getKnownRValue().getAggregateAddress()); // Implement swifterror by copying into a new swifterror argument. // We'll write back in the normal path out of the call. @@ -3943,12 +3982,12 @@ RValue CodeGenFunction::EmitCall(const C // FIXME: Avoid the conversion through memory if possible. Address Src = Address::invalid(); - if (RV.isScalar() || RV.isComplex()) { + if (!I->isAggregate()) { Src = CreateMemTemp(I->Ty, "coerce"); - LValue SrcLV = MakeAddrLValue(Src, I->Ty); - EmitInitStoreOfNonAggregate(*this, RV, SrcLV); + I->copyInto(*this, Src); } else { - Src = RV.getAggregateAddress(); + Src = I->hasLValue() ? I->getKnownLValue().getAddress() + : I->getKnownRValue().getAggregateAddress(); } // If the value is offset in memory, apply the offset now. @@ -4002,9 +4041,12 @@ RValue CodeGenFunction::EmitCall(const C llvm::Value *tempSize = nullptr; Address addr = Address::invalid(); - if (RV.isAggregate()) { - addr = RV.getAggregateAddress(); + if (I->isAggregate()) { + addr = I->hasLValue() ? I->getKnownLValue().getAddress() + : I->getKnownRValue().getAggregateAddress(); + } else { + RValue RV = I->getKnownRValue(); assert(RV.isScalar()); // complex should always just be direct llvm::Type *scalarType = RV.getScalarVal()->getType(); @@ -4041,7 +4083,7 @@ RValue CodeGenFunction::EmitCall(const C case ABIArgInfo::Expand: unsigned IRArgPos = FirstIRArg; - ExpandTypeToArgs(I->Ty, RV, IRFuncTy, IRCallArgs, IRArgPos); + ExpandTypeToArgs(I->Ty, *I, IRFuncTy, IRCallArgs, IRArgPos); assert(IRArgPos == FirstIRArg + NumIRArgs); break; } @@ -4393,7 +4435,7 @@ RValue CodeGenFunction::EmitCall(const C OffsetValue); } else if (const auto *AA = TargetDecl->getAttr<AllocAlignAttr>()) { llvm::Value *ParamVal = - CallArgs[AA->getParamIndex() - 1].RV.getScalarVal(); + CallArgs[AA->getParamIndex() - 1].getRValue(*this).getScalarVal(); EmitAlignmentAssumption(Ret.getScalarVal(), ParamVal); } } Modified: cfe/trunk/lib/CodeGen/CGCall.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.h?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGCall.h (original) +++ cfe/trunk/lib/CodeGen/CGCall.h Wed Mar 7 13:45:40 2018 @@ -213,12 +213,46 @@ public: }; struct CallArg { - RValue RV; + private: + union { + RValue RV; + LValue LV; /// The argument is semantically a load from this l-value. + }; + bool HasLV; + + /// A data-flow flag to make sure getRValue and/or copyInto are not + /// called twice for duplicated IR emission. + mutable bool IsUsed; + + public: QualType Ty; - bool NeedsCopy; - CallArg(RValue rv, QualType ty, bool needscopy) - : RV(rv), Ty(ty), NeedsCopy(needscopy) - { } + CallArg(RValue rv, QualType ty) + : RV(rv), HasLV(false), IsUsed(false), Ty(ty) {} + CallArg(LValue lv, QualType ty) + : LV(lv), HasLV(true), IsUsed(false), Ty(ty) {} + bool hasLValue() const { return HasLV; } + QualType getType() const { return Ty; } + + /// \returns an independent RValue. If the CallArg contains an LValue, + /// a temporary copy is returned. + RValue getRValue(CodeGenFunction &CGF) const; + + LValue getKnownLValue() const { + assert(HasLV && !IsUsed); + return LV; + } + RValue getKnownRValue() const { + assert(!HasLV && !IsUsed); + return RV; + } + void setRValue(RValue _RV) { + assert(!HasLV); + RV = _RV; + } + + bool isAggregate() const { return HasLV || RV.isAggregate(); } + + void copyInto(CodeGenFunction &CGF, Address A) const; }; /// CallArgList - Type for representing both the value and type of @@ -248,8 +282,10 @@ public: llvm::Instruction *IsActiveIP; }; - void add(RValue rvalue, QualType type, bool needscopy = false) { - push_back(CallArg(rvalue, type, needscopy)); + void add(RValue rvalue, QualType type) { push_back(CallArg(rvalue, type)); } + + void addUncopiedAggregate(LValue LV, QualType type) { + push_back(CallArg(LV, type)); } /// Add all the arguments from another CallArgList to this one. After doing Modified: cfe/trunk/lib/CodeGen/CGClass.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGClass.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGClass.cpp (original) +++ cfe/trunk/lib/CodeGen/CGClass.cpp Wed Mar 7 13:45:40 2018 @@ -2077,7 +2077,8 @@ void CodeGenFunction::EmitCXXConstructor assert(Args.size() == 2 && "unexpected argcount for trivial ctor"); QualType SrcTy = D->getParamDecl(0)->getType().getNonReferenceType(); - Address Src(Args[1].RV.getScalarVal(), getNaturalTypeAlignment(SrcTy)); + Address Src(Args[1].getRValue(*this).getScalarVal(), + getNaturalTypeAlignment(SrcTy)); LValue SrcLVal = MakeAddrLValue(Src, SrcTy); QualType DestTy = getContext().getTypeDeclType(ClassDecl); LValue DestLVal = MakeAddrLValue(This, DestTy); @@ -2131,8 +2132,7 @@ void CodeGenFunction::EmitInheritedCXXCo const CXXConstructorDecl *D, bool ForVirtualBase, Address This, bool InheritedFromVBase, const CXXInheritedCtorInitExpr *E) { CallArgList Args; - CallArg ThisArg(RValue::get(This.getPointer()), D->getThisType(getContext()), - /*NeedsCopy=*/false); + CallArg ThisArg(RValue::get(This.getPointer()), D->getThisType(getContext())); // Forward the parameters. if (InheritedFromVBase && @@ -2196,7 +2196,7 @@ void CodeGenFunction::EmitInlinedInherit assert(Args.size() >= Params.size() && "too few arguments for call"); for (unsigned I = 0, N = Args.size(); I != N; ++I) { if (I < Params.size() && isa<ImplicitParamDecl>(Params[I])) { - const RValue &RV = Args[I].RV; + const RValue &RV = Args[I].getRValue(*this); assert(!RV.isComplex() && "complex indirect params not supported"); ParamValue Val = RV.isScalar() ? ParamValue::forDirect(RV.getScalarVal()) Modified: cfe/trunk/lib/CodeGen/CGDecl.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGDecl.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGDecl.cpp (original) +++ cfe/trunk/lib/CodeGen/CGDecl.cpp Wed Mar 7 13:45:40 2018 @@ -1882,6 +1882,22 @@ void CodeGenFunction::EmitParmDecl(const llvm::Type *IRTy = ConvertTypeForMem(Ty)->getPointerTo(AS); if (DeclPtr.getType() != IRTy) DeclPtr = Builder.CreateBitCast(DeclPtr, IRTy, D.getName()); + // Indirect argument is in alloca address space, which may be different + // from the default address space. + auto AllocaAS = CGM.getASTAllocaAddressSpace(); + auto *V = DeclPtr.getPointer(); + auto SrcLangAS = getLangOpts().OpenCL ? LangAS::opencl_private : AllocaAS; + auto DestLangAS = + getLangOpts().OpenCL ? LangAS::opencl_private : LangAS::Default; + if (SrcLangAS != DestLangAS) { + assert(getContext().getTargetAddressSpace(SrcLangAS) == + CGM.getDataLayout().getAllocaAddrSpace()); + auto DestAS = getContext().getTargetAddressSpace(DestLangAS); + auto *T = V->getType()->getPointerElementType()->getPointerTo(DestAS); + DeclPtr = Address(getTargetHooks().performAddrSpaceCast( + *this, V, SrcLangAS, DestLangAS, T, true), + DeclPtr.getAlignment()); + } // Push a destructor cleanup for this parameter if the ABI requires it. // Don't push a cleanup in a thunk for a method that will also emit a Modified: cfe/trunk/lib/CodeGen/CGExprCXX.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExprCXX.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGExprCXX.cpp (original) +++ cfe/trunk/lib/CodeGen/CGExprCXX.cpp Wed Mar 7 13:45:40 2018 @@ -265,7 +265,7 @@ RValue CodeGenFunction::EmitCXXMemberOrO // when it isn't necessary; just produce the proper effect here. LValue RHS = isa<CXXOperatorCallExpr>(CE) ? MakeNaturalAlignAddrLValue( - (*RtlArgs)[0].RV.getScalarVal(), + (*RtlArgs)[0].getRValue(*this).getScalarVal(), (*(CE->arg_begin() + 1))->getType()) : EmitLValue(*CE->arg_begin()); EmitAggregateAssign(This, RHS, CE->getType()); @@ -1490,7 +1490,7 @@ static void EnterNewDeleteCleanup(CodeGe AllocAlign); for (unsigned I = 0, N = E->getNumPlacementArgs(); I != N; ++I) { auto &Arg = NewArgs[I + NumNonPlacementArgs]; - Cleanup->setPlacementArg(I, Arg.RV, Arg.Ty); + Cleanup->setPlacementArg(I, Arg.getRValue(CGF), Arg.Ty); } return; @@ -1521,8 +1521,8 @@ static void EnterNewDeleteCleanup(CodeGe AllocAlign); for (unsigned I = 0, N = E->getNumPlacementArgs(); I != N; ++I) { auto &Arg = NewArgs[I + NumNonPlacementArgs]; - Cleanup->setPlacementArg(I, DominatingValue<RValue>::save(CGF, Arg.RV), - Arg.Ty); + Cleanup->setPlacementArg( + I, DominatingValue<RValue>::save(CGF, Arg.getRValue(CGF)), Arg.Ty); } CGF.initFullExprCleanup(); Modified: cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp (original) +++ cfe/trunk/lib/CodeGen/CGGPUBuiltin.cpp Wed Mar 7 13:45:40 2018 @@ -83,8 +83,9 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa /* ParamsToSkip = */ 0); // We don't know how to emit non-scalar varargs. - if (std::any_of(Args.begin() + 1, Args.end(), - [](const CallArg &A) { return !A.RV.isScalar(); })) { + if (std::any_of(Args.begin() + 1, Args.end(), [&](const CallArg &A) { + return !A.getRValue(*this).isScalar(); + })) { CGM.ErrorUnsupported(E, "non-scalar arg to printf"); return RValue::get(llvm::ConstantInt::get(IntTy, 0)); } @@ -97,7 +98,7 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa } else { llvm::SmallVector<llvm::Type *, 8> ArgTypes; for (unsigned I = 1, NumArgs = Args.size(); I < NumArgs; ++I) - ArgTypes.push_back(Args[I].RV.getScalarVal()->getType()); + ArgTypes.push_back(Args[I].getRValue(*this).getScalarVal()->getType()); // Using llvm::StructType is correct only because printf doesn't accept // aggregates. If we had to handle aggregates here, we'd have to manually @@ -109,7 +110,7 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa for (unsigned I = 1, NumArgs = Args.size(); I < NumArgs; ++I) { llvm::Value *P = Builder.CreateStructGEP(AllocaTy, Alloca, I - 1); - llvm::Value *Arg = Args[I].RV.getScalarVal(); + llvm::Value *Arg = Args[I].getRValue(*this).getScalarVal(); Builder.CreateAlignedStore(Arg, P, DL.getPrefTypeAlignment(Arg->getType())); } BufferPtr = Builder.CreatePointerCast(Alloca, llvm::Type::getInt8PtrTy(Ctx)); @@ -117,6 +118,6 @@ CodeGenFunction::EmitNVPTXDevicePrintfCa // Invoke vprintf and return. llvm::Function* VprintfFunc = GetVprintfDeclaration(CGM.getModule()); - return RValue::get( - Builder.CreateCall(VprintfFunc, {Args[0].RV.getScalarVal(), BufferPtr})); + return RValue::get(Builder.CreateCall( + VprintfFunc, {Args[0].getRValue(*this).getScalarVal(), BufferPtr})); } Modified: cfe/trunk/lib/CodeGen/CGObjCGNU.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGObjCGNU.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGObjCGNU.cpp (original) +++ cfe/trunk/lib/CodeGen/CGObjCGNU.cpp Wed Mar 7 13:45:40 2018 @@ -1441,7 +1441,7 @@ CGObjCGNU::GenerateMessageSend(CodeGenFu } // Reset the receiver in case the lookup modified it - ActualArgs[0] = CallArg(RValue::get(Receiver), ASTIdTy, false); + ActualArgs[0] = CallArg(RValue::get(Receiver), ASTIdTy); imp = EnforceType(Builder, imp, MSI.MessengerType); Modified: cfe/trunk/lib/CodeGen/CGObjCMac.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGObjCMac.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CGObjCMac.cpp (original) +++ cfe/trunk/lib/CodeGen/CGObjCMac.cpp Wed Mar 7 13:45:40 2018 @@ -1708,7 +1708,7 @@ struct NullReturnState { e = Method->param_end(); i != e; ++i, ++I) { const ParmVarDecl *ParamDecl = (*i); if (ParamDecl->hasAttr<NSConsumedAttr>()) { - RValue RV = I->RV; + RValue RV = I->getRValue(CGF); assert(RV.isScalar() && "NullReturnState::complete - arg not on object"); CGF.EmitARCRelease(RV.getScalarVal(), ARCImpreciseLifetime); @@ -7071,7 +7071,7 @@ CGObjCNonFragileABIMac::EmitVTableMessag CGF.getPointerAlign()); // Update the message ref argument. - args[1].RV = RValue::get(mref.getPointer()); + args[1].setRValue(RValue::get(mref.getPointer())); // Load the function to call from the message ref table. Address calleeAddr = Modified: cfe/trunk/lib/CodeGen/CodeGenFunction.h URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CodeGenFunction.h?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/CodeGenFunction.h (original) +++ cfe/trunk/lib/CodeGen/CodeGenFunction.h Wed Mar 7 13:45:40 2018 @@ -3899,10 +3899,10 @@ private: void ExpandTypeFromArgs(QualType Ty, LValue Dst, SmallVectorImpl<llvm::Value *>::iterator &AI); - /// ExpandTypeToArgs - Expand an RValue \arg RV, with the LLVM type for \arg + /// ExpandTypeToArgs - Expand an CallArg \arg Arg, with the LLVM type for \arg /// Ty, into individual arguments on the provided vector \arg IRCallArgs, /// starting at index \arg IRCallArgPos. See ABIArgInfo::Expand. - void ExpandTypeToArgs(QualType Ty, RValue RV, llvm::FunctionType *IRFuncTy, + void ExpandTypeToArgs(QualType Ty, CallArg Arg, llvm::FunctionType *IRFuncTy, SmallVectorImpl<llvm::Value *> &IRCallArgs, unsigned &IRCallArgPos); Modified: cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp (original) +++ cfe/trunk/lib/CodeGen/ItaniumCXXABI.cpp Wed Mar 7 13:45:40 2018 @@ -1474,8 +1474,7 @@ CGCXXABI::AddedStructorArgs ItaniumCXXAB llvm::Value *VTT = CGF.GetVTTParameter(GlobalDecl(D, Type), ForVirtualBase, Delegating); QualType VTTTy = getContext().getPointerType(getContext().VoidPtrTy); - Args.insert(Args.begin() + 1, - CallArg(RValue::get(VTT), VTTTy, /*needscopy=*/false)); + Args.insert(Args.begin() + 1, CallArg(RValue::get(VTT), VTTTy)); return AddedStructorArgs::prefix(1); // Added one arg. } Modified: cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp (original) +++ cfe/trunk/lib/CodeGen/MicrosoftCXXABI.cpp Wed Mar 7 13:45:40 2018 @@ -1537,8 +1537,7 @@ CGCXXABI::AddedStructorArgs MicrosoftCXX } RValue RV = RValue::get(MostDerivedArg); if (FPT->isVariadic()) { - Args.insert(Args.begin() + 1, - CallArg(RV, getContext().IntTy, /*needscopy=*/false)); + Args.insert(Args.begin() + 1, CallArg(RV, getContext().IntTy)); return AddedStructorArgs::prefix(1); } Args.add(RV, getContext().IntTy); Added: cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp?rev=326946&view=auto ============================================================================== --- cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp (added) +++ cfe/trunk/test/CodeGenCXX/amdgcn-func-arg.cpp Wed Mar 7 13:45:40 2018 @@ -0,0 +1,94 @@ +// RUN: %clang_cc1 -O0 -triple amdgcn -emit-llvm %s -o - | FileCheck %s + +class A { +public: + int x; + A():x(0) {} + ~A() {} +}; + +class B { +int x[100]; +}; + +A g_a; +B g_b; + +void func_with_ref_arg(A &a); +void func_with_ref_arg(B &b); + +// CHECK-LABEL: define void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %a) +// CHECK: %p = alloca %class.A*, align 8, addrspace(5) +// CHECK: %[[r1:.+]] = addrspacecast %class.A* addrspace(5)* %p to %class.A** +// CHECK: %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %a to %class.A* +// CHECK: store %class.A* %[[r0]], %class.A** %[[r1]], align 8 +void func_with_indirect_arg(A a) { + A *p = &a; +} + +// CHECK-LABEL: define void @_Z22test_indirect_arg_autov() +// CHECK: %a = alloca %class.A, align 4, addrspace(5) +// CHECK: %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %a to %class.A* +// CHECK: %agg.tmp = alloca %class.A, align 4, addrspace(5) +// CHECK: %[[r1:.+]] = addrspacecast %class.A addrspace(5)* %agg.tmp to %class.A* +// CHECK: call void @_ZN1AC1Ev(%class.A* %[[r0]]) +// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64 +// CHECK: %[[r4:.+]] = addrspacecast %class.A* %[[r1]] to %class.A addrspace(5)* +// CHECK: call void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %[[r4]]) +// CHECK: call void @_ZN1AD1Ev(%class.A* %[[r1]]) +// CHECK: call void @_Z17func_with_ref_argR1A(%class.A* dereferenceable(4) %[[r0]]) +// CHECK: call void @_ZN1AD1Ev(%class.A* %[[r0]]) +void test_indirect_arg_auto() { + A a; + func_with_indirect_arg(a); + func_with_ref_arg(a); +} + +// CHECK: define void @_Z24test_indirect_arg_globalv() +// CHECK: %agg.tmp = alloca %class.A, align 4, addrspace(5) +// CHECK: %[[r0:.+]] = addrspacecast %class.A addrspace(5)* %agg.tmp to %class.A* +// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64 +// CHECK: %[[r2:.+]] = addrspacecast %class.A* %[[r0]] to %class.A addrspace(5)* +// CHECK: call void @_Z22func_with_indirect_arg1A(%class.A addrspace(5)* %[[r2]]) +// CHECK: call void @_ZN1AD1Ev(%class.A* %[[r0]]) +// CHECK: call void @_Z17func_with_ref_argR1A(%class.A* dereferenceable(4) addrspacecast (%class.A addrspace(1)* @g_a to %class.A*)) +void test_indirect_arg_global() { + func_with_indirect_arg(g_a); + func_with_ref_arg(g_a); +} + +// CHECK-LABEL: define void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %b) +// CHECK: %p = alloca %class.B*, align 8, addrspace(5) +// CHECK: %[[r1:.+]] = addrspacecast %class.B* addrspace(5)* %p to %class.B** +// CHECK: %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %b to %class.B* +// CHECK: store %class.B* %[[r0]], %class.B** %[[r1]], align 8 +void func_with_byval_arg(B b) { + B *p = &b; +} + +// CHECK-LABEL: define void @_Z19test_byval_arg_autov() +// CHECK: %b = alloca %class.B, align 4, addrspace(5) +// CHECK: %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %b to %class.B* +// CHECK: %agg.tmp = alloca %class.B, align 4, addrspace(5) +// CHECK: %[[r1:.+]] = addrspacecast %class.B addrspace(5)* %agg.tmp to %class.B* +// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64 +// CHECK: %[[r4:.+]] = addrspacecast %class.B* %[[r1]] to %class.B addrspace(5)* +// CHECK: call void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %[[r4]]) +// CHECK: call void @_Z17func_with_ref_argR1B(%class.B* dereferenceable(400) %[[r0]]) +void test_byval_arg_auto() { + B b; + func_with_byval_arg(b); + func_with_ref_arg(b); +} + +// CHECK-LABEL: define void @_Z21test_byval_arg_globalv() +// CHECK: %agg.tmp = alloca %class.B, align 4, addrspace(5) +// CHECK: %[[r0:.+]] = addrspacecast %class.B addrspace(5)* %agg.tmp to %class.B* +// CHECK: call void @llvm.memcpy.p0i8.p0i8.i64 +// CHECK: %[[r2:.+]] = addrspacecast %class.B* %[[r0]] to %class.B addrspace(5)* +// CHECK: call void @_Z19func_with_byval_arg1B(%class.B addrspace(5)* byval align 4 %[[r2]]) +// CHECK: call void @_Z17func_with_ref_argR1B(%class.B* dereferenceable(400) addrspacecast (%class.B addrspace(1)* @g_b to %class.B*)) +void test_byval_arg_global() { + func_with_byval_arg(g_b); + func_with_ref_arg(g_b); +} Modified: cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl<http://addr-space-struct-arg.cl/> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl<http://addr-space-struct-arg.cl/> (original) +++ cfe/trunk/test/CodeGenOpenCL/addr-space-struct-arg.cl<http://addr-space-struct-arg.cl/> Wed Mar 7 13:45:40 2018 @@ -1,5 +1,6 @@ // RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -ffake-address-space-map -triple i686-pc-darwin | FileCheck -enable-var-scope -check-prefixes=COM,X86 %s -// RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -triple amdgcn-amdhsa-amd | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN %s +// RUN: %clang_cc1 %s -emit-llvm -o - -O0 -finclude-default-header -triple amdgcn | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN %s +// RUN: %clang_cc1 %s -emit-llvm -o - -cl-std=CL2.0 -O0 -finclude-default-header -triple amdgcn | FileCheck -enable-var-scope -check-prefixes=COM,AMDGCN,AMDGCN20 %s typedef struct { int cells[9]; @@ -35,6 +36,9 @@ struct LargeStructTwoMember { int2 y[20]; }; +#if __OPENCL_C_VERSION__ >= 200 +struct LargeStructOneMember g_s; +#endif // X86-LABEL: define void @foo(%struct.Mat4X4* noalias sret %agg.result, %struct.Mat3X3* byval align 4 %in) // AMDGCN-LABEL: define %struct.Mat4X4 @foo([9 x i32] %in.coerce) @@ -80,10 +84,42 @@ void FuncOneMember(struct StructOneMembe } // AMDGCN-LABEL: define void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %u) +// AMDGCN-NOT: addrspacecast +// AMDGCN: store <2 x i32> %{{.*}}, <2 x i32> addrspace(5)* void FuncOneLargeMember(struct LargeStructOneMember u) { u.x[0] = (int2)(0, 0); } +// AMDGCN20-LABEL: define void @test_indirect_arg_globl() +// AMDGCN20: %[[byval_temp:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5) +// AMDGCN20: %[[r0:.*]] = bitcast %struct.LargeStructOneMember addrspace(5)* %[[byval_temp]] to i8 addrspace(5)* +// AMDGCN20: call void @llvm.memcpy.p5i8.p1i8.i64(i8 addrspace(5)* align 8 %[[r0]], i8 addrspace(1)* align 8 bitcast (%struct.LargeStructOneMember addrspace(1)* @g_s to i8 addrspace(1)*), i64 800, i1 false) +// AMDGCN20: call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[byval_temp]]) +#if __OPENCL_C_VERSION__ >= 200 +void test_indirect_arg_globl(void) { + FuncOneLargeMember(g_s); +} +#endif + +// AMDGCN-LABEL: define amdgpu_kernel void @test_indirect_arg_local() +// AMDGCN: %[[byval_temp:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5) +// AMDGCN: %[[r0:.*]] = bitcast %struct.LargeStructOneMember addrspace(5)* %[[byval_temp]] to i8 addrspace(5)* +// AMDGCN: call void @llvm.memcpy.p5i8.p3i8.i64(i8 addrspace(5)* align 8 %[[r0]], i8 addrspace(3)* align 8 bitcast (%struct.LargeStructOneMember addrspace(3)* @test_indirect_arg_local.l_s to i8 addrspace(3)*), i64 800, i1 false) +// AMDGCN: call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[byval_temp]]) +kernel void test_indirect_arg_local(void) { + local struct LargeStructOneMember l_s; + FuncOneLargeMember(l_s); +} + +// AMDGCN-LABEL: define void @test_indirect_arg_private() +// AMDGCN: %[[p_s:.*]] = alloca %struct.LargeStructOneMember, align 8, addrspace(5) +// AMDGCN-NOT: @llvm.memcpy +// AMDGCN-NEXT: call void @FuncOneLargeMember(%struct.LargeStructOneMember addrspace(5)* byval align 8 %[[p_s]]) +void test_indirect_arg_private(void) { + struct LargeStructOneMember p_s; + FuncOneLargeMember(p_s); +} + // AMDGCN-LABEL: define amdgpu_kernel void @KernelOneMember // AMDGCN-SAME: (<2 x i32> %[[u_coerce:.*]]) // AMDGCN: %[[u:.*]] = alloca %struct.StructOneMember, align 8, addrspace(5) @@ -112,7 +148,6 @@ void FuncLargeTwoMember(struct LargeStru u.y[0] = (int2)(0, 0); } - // AMDGCN-LABEL: define amdgpu_kernel void @KernelTwoMember // AMDGCN-SAME: (%struct.StructTwoMember %[[u_coerce:.*]]) // AMDGCN: %[[u:.*]] = alloca %struct.StructTwoMember, align 8, addrspace(5) Modified: cfe/trunk/test/CodeGenOpenCL/byval.cl<http://byval.cl/> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGenOpenCL/byval.cl?rev=326946&r1=326945&r2=326946&view=diff ============================================================================== --- cfe/trunk/test/CodeGenOpenCL/byval.cl<http://byval.cl/> (original) +++ cfe/trunk/test/CodeGenOpenCL/byval.cl<http://byval.cl/> Wed Mar 7 13:45:40 2018 @@ -1,5 +1,4 @@ // RUN: %clang_cc1 -emit-llvm -o - -triple amdgcn %s | FileCheck %s -// RUN: %clang_cc1 -emit-llvm -o - -triple amdgcn---opencl %s | FileCheck %s struct A { int x[100]; _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org<mailto:cfe-commits@lists.llvm.org> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits