[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-22 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added a comment.

In D125418#3806720 , @efriedma wrote:

>> Sometimes we will emit the alias here but later the function will be inlined 
>> or eliminated by DCE.
>
> If the alias is externally visible, it can't be eliminated; the compiler 
> can't tell whether the symbol is referenced.  If the alias isn't externally 
> visible, it's dead from the outset.  Not sure how this could become an issue.

There will be no functional issue here. I mean that we can avoid generate some 
redundant alias if it is in the arm64eccalllowering.

>> And later we need to emit alias for direct call thunk also, like 
>> $originname$exitthunk.
>
> Direct call thunks aren't directly relevant here; we only emit them for 
> declarations, not definitions.  I guess this does imply that we need to teach 
> arm64eccalllowering how to modify mangled symbol names... and we could use 
> that same code to insert the $$h.
>
>> Put all of them into arm64eccalllowering pass should be better I think.
>
> I really don't want to do demangling in arm64eccalllowering.  But looking at 
> the generated patterns a bit more closely, maybe we don't have to fully parse 
> the mangled symbol.  If we can get away with just parsing the "?symbolname@@" 
> at the beginning of the symbol, and ignore all the type-related stuff, I 
> guess that would be okay.
>
> Alternatively, I guess we could use attributes to communicate the different 
> mangled forms to the backend, but probably better to avoid that if we can.
>
> If we can solve the mangling issues, I guess generating the alias in 
> arm64eccalllowering would be fine.

As far as I know, there are three kinds of alias we need to generate, For 
example:

  extern "C" void function_name(void a)
  arm64 signature: #function_name(native)

If it is function definition, we need to create an alias to be x86 signature, 
and mangle a new name to arm64ec signature. That is what this change do.

  x86 signature: function_name

If it is a function direct call case, we need to create two alias:

  function thunk: #function_name$exit_thunk
  x86 signature: function_name

I don't understand too much why we need to demangle the function name in 
arm64eccallovering. It looks what we need to do is generate the arm64 signature 
name from the default symbol name which is x86 signature by default.
I'm not familiar with normal mangle rules. If `#` and `$$h` are unique, maybe 
we can just insert `#` on the beginning for C symbol name or insert `$$h` after 
first`@@` for C++ symbol name.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-21 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

> Sometimes we will emit the alias here but later the function will be inlined 
> or eliminated by DCE.

If the alias is externally visible, it can't be eliminated; the compiler can't 
tell whether the symbol is referenced.  If the alias isn't externally visible, 
it's dead from the outset.  Not sure how this could become an issue.

> And later we need to emit alias for direct call thunk also, like 
> $originname$exitthunk.

Direct call thunks aren't directly relevant here; we only emit them for 
declarations, not definitions.  I guess this does imply that we need to teach 
arm64eccalllowering how to modify mangled symbol names... and we could use that 
same code to insert the $$h.

> Put all of them into arm64eccalllowering pass should be better I think.

I really don't want to do demangling in arm64eccalllowering.  But looking at 
the generated patterns a bit more closely, maybe we don't have to fully parse 
the mangled symbol.  If we can get away with just parsing the "?symbolname@@" 
at the beginning of the symbol, and ignore all the type-related stuff, I guess 
that would be okay.

Alternatively, I guess we could use attributes to communicate the different 
mangled forms to the backend, but probably better to avoid that if we can.

If we can solve the mangling issues, I guess generating the alias in 
arm64eccalllowering would be fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-21 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added a comment.

A question about the mangle and alias part.
Should we move the code to create alias to backend also?Sometimes we will emit 
the alias here but later the function will be inlined or eliminated by DCE.
And later we need to emit alias for direct call thunk also, like 
$originname$exitthunk. Put all of them into arm64eccalllowering pass should be 
better I think.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-21 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

I think this looks reasonable to me, but I don't think I'm knowledgeable enough 
to give this a proper review, sorry.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-20 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

Ping


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-05 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added reviewers: rjmccall, mstorsjo, dpaoliello, rnk.
efriedma added a comment.

I think I'd like to continue moving forward with approximately this approach, 
at least for the moment.  As far as I know, D132926 
 solves the remaining issues with translating 
the calling conventions.  (I'll try to review D132926 
 soon.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-09-01 Thread Eli Friedman via Phabricator via cfe-commits
efriedma updated this revision to Diff 457457.
efriedma added a comment.

Rebased


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

Files:
  clang/include/clang/AST/Mangle.h
  clang/lib/AST/MicrosoftMangle.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGen/arm64ec.c
  clang/test/CodeGenCXX/arm64ec.cpp

Index: clang/test/CodeGenCXX/arm64ec.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/arm64ec.cpp
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -no-opaque-pointers -triple arm64ec-windows-msvc -emit-llvm -o - %s | FileCheck %s
+
+// CHECK: @"?g@@YAXUA@@UB@@@Z" = alias void ([2 x float], [4 x float]), void ([2 x float], [4 x float])* @"?g@@$$hYAXUA@@UB@@@Z"
+// CHECK: define dso_local void @"?g@@$$hYAXUA@@UB@@@Z"
+typedef struct { float x[2]; } A;
+typedef struct { float x[4]; } B;
+void g(A a, B b) { }
Index: clang/test/CodeGen/arm64ec.c
===
--- /dev/null
+++ clang/test/CodeGen/arm64ec.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -no-opaque-pointers -triple arm64ec-windows-msvc -emit-llvm -o - %s | FileCheck %s
+
+// CHECK: @g = alias void ([2 x float], [4 x float]), void ([2 x float], [4 x float])* @"#g"
+// CHECK: define dso_local void @"#g"
+typedef struct { float x[2]; } A;
+typedef struct { float x[4]; } B;
+void g(A a, B b) { }
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -5289,6 +5289,23 @@
   if (!GV->isDeclaration())
 return;
 
+  if (getTriple().isWindowsArm64EC()) {
+// For ARM64EC targets, a function definition's name is mangled differently
+// from the normal symbol.  We then emit an alias from the normal
+// symbol to the remangled definition.
+// FIXME: MSVC uses IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY, we just emit
+// multiple definition symbols.  Why does this matter?
+// FIXME: For hybrid_patchable functions, the alias doesn't point
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;
+llvm::raw_svector_ostream Out(MangledName);
+getCXXABI().getMangleContext().mangleArm64ECFnDef(GD, Out);
+auto *Alias = llvm::GlobalAlias::create("", GV);
+Alias->takeName(GV);
+GV->setName(MangledName);
+  }
+
   // We need to set linkage and visibility on the function before
   // generating code for it because various parts of IR generation
   // want to propagate this information down (e.g. to local static
Index: clang/lib/AST/MicrosoftMangle.cpp
===
--- clang/lib/AST/MicrosoftMangle.cpp
+++ clang/lib/AST/MicrosoftMangle.cpp
@@ -299,6 +299,7 @@
 return AnonymousNamespaceHash;
   }
 
+  void mangleArm64ECFnDef(GlobalDecl GD, raw_ostream &) override;
 private:
   void mangleInitFiniStub(const VarDecl *D, char CharCode, raw_ostream );
 };
@@ -361,7 +362,7 @@
 
   void mangle(GlobalDecl GD, StringRef Prefix = "?");
   void mangleName(GlobalDecl GD);
-  void mangleFunctionEncoding(GlobalDecl GD, bool ShouldMangle);
+  void mangleFunctionEncoding(GlobalDecl GD, bool ShouldMangle, bool Arm64ECDef = false);
   void mangleVariableEncoding(const VarDecl *VD);
   void mangleMemberDataPointer(const CXXRecordDecl *RD, const ValueDecl *VD,
StringRef Prefix = "$");
@@ -385,6 +386,7 @@
   bool ForceThisQuals = false,
   bool MangleExceptionSpec = true);
   void mangleNestedName(GlobalDecl GD);
+  void mangleArm64ECFnDef(GlobalDecl GD);
 
 private:
   bool isStructorDecl(const NamedDecl *ND) const {
@@ -575,7 +577,8 @@
 }
 
 void MicrosoftCXXNameMangler::mangleFunctionEncoding(GlobalDecl GD,
- bool ShouldMangle) {
+ bool ShouldMangle,
+ bool Arm64ECDef) {
   const FunctionDecl *FD = cast(GD.getDecl());
   //  ::=  
 
@@ -599,6 +602,8 @@
 if (FD->isExternC() && FD->hasAttr())
   Out << "$$J0";
 
+if (Arm64ECDef)
+  Out << "$$h";
 mangleFunctionClass(FD);
 
 mangleFunctionType(FT, FD, false, false);
@@ -3954,6 +3959,29 @@
   Mangler.getStream() << '@';
 }
 
+void MicrosoftMangleContextImpl::mangleArm64ECFnDef(GlobalDecl GD,
+raw_ostream ) {
+  const FunctionDecl *D = cast(GD.getDecl());
+  PrettyStackTraceDecl CrashInfo(D, SourceLocation(),
+ getASTContext().getSourceManager(),
+ "Mangling Arm64EC function def");
+
+  if (!shouldMangleCXXName(D)) {
+Out << '#' << D->getName();
+

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-31 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added a comment.

In D125418#3759174 , @efriedma wrote:

> The reason struct returns require register shuffling is that AArch64 passes 
> the sret pointer in x8 (i.e. RAX), but the x64 calling convention expects in 
> in RCX (i.e. x0).

So, for the function: s64 f(int a):
AArch64 CC:void f(x8, x0)
X64 CC:void f(rcx[x0], rdx[x1])
AArch64 --> X64 we need to add instructions before blr

  mov x1, x0
  mov x0, x8

It can match `iexit_thunk$cdecl$m64$i8` when we call extern function not a 
function pointer.

> Have you tried to see if the Microsoft-generated thunk actually works?  I 
> found at least one bug in MSVC thunk generation and reported it to Microsoft. 
>  (Microsoft didn't acknowledge the report, but that's a different story...)

You are right. For now, I haven't tested too much case runtime. But it looks if 
a DLL import function pass to a function pointer, then call it will cause 
access violation.
Based on the debug result, it should be exit thunk issue, MSVC generate wrong 
thunk type.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-30 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

The reason struct returns require register shuffling is that AArch64 passes the 
sret pointer in x8 (i.e. RAX), but the x64 calling convention expects in in RCX 
(i.e. x0).

Have you tried to see if the Microsoft-generated thunk actually works?  I found 
at least one bug in MSVC thunk generation and reported it to Microsoft.  
(Microsoft didn't acknowledge the report, but that's a different story...)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-30 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added a comment.

In D125418#3756223 , @efriedma wrote:

> There's no way the calling convention can change based on whether you're 
> calling a function vs. a function pointer.  I can't explain why MSVC is 
> generating different code.  I think we should just ignore it, at least for 
> now.

It's OK for me to ignore the difference but I think the main thing is not 
function or function pointer. It's how to generate the exit thunkwhen return 
with structure size value > 16.
https://godbolt.org/z/MWv4YaKdK
Three different way to call extern function, with three kind of exit thunks. 
All of them are keep the return value, not move the return value' point to the 
first argument.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-29 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

There's no way the calling convention can change based on whether you're 
calling a function vs. a function pointer.  I can't explain why MSVC is 
generating different code.  I think we should just ignore it, at least for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-29 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added a comment.

Another thing we need consider here is this case:

  #pragma pack(push, 1)
  struct b64 {
  char a[64];
  };
  #pragma pack(pop)
  
  typedef b64 (fptrtype)(int a);
  
  b64 f(void* p, int a) {
  return ((fptrtype*)p)(a);
  }

For now we generate exit_thunk with type void f(void* sret(b64) ret, int a)

  $iexit_thunk$cdecl$v$i8i8:  // @"$iexit_thunk$cdecl$v$i8i8"
  .seh_proc $iexit_thunk$cdecl$v$i8i8
  // %bb.0:
sub sp, sp, #48
.seh_stackalloc 48
stp x29, x30, [sp, #32] // 16-byte Folded Spill
.seh_save_fplr  32
add x29, sp, #32
.seh_add_fp 32
.seh_endprologue
mov w1, w0
mov x0, x8
adrpx8, __os_arm64x_dispatch_call_no_redirect
ldr x8, [x8, :lo12:__os_arm64x_dispatch_call_no_redirect]
blr x8
.seh_startepilogue
ldp x29, x30, [sp, #32] // 16-byte Folded Reload
.seh_save_fplr  32
add sp, sp, #48
.seh_stackalloc 48
.seh_endepilogue
ret
.seh_endfunclet
.seh_endproc
  // -- End function
.globl  f
.deff;
.scl2;
.type   32;
.endef

But it looks Microsoft generate exit thunk with type void* f(int a)

  |$iexit_thunk$cdecl$i8$i8| PROC
  |$LN2|
pacibsp
stp fp,lr,[sp,#-0x10]!
mov fp,sp
sub sp,sp,#0x20
adrpx8,__os_arm64x_dispatch_call_no_redirect
ldr xip0,[x8,__os_arm64x_dispatch_call_no_redirect]
blr xip0
mov x0,x8
add sp,sp,#0x20
ldp fp,lr,[sp],#0x10
autibsp
ret
  
ENDP  ; |$iexit_thunk$cdecl$i8$i8|

But based on clang x86 on Windows, we also generate the function type with void 
f(void* sret(b64) ret, int a).
It looks clang is different from MSVC even in x86 ABI.
Do we need to follow MSVC to generate $iexit_thunk$cdecl$i8$i8 ? Or just follow 
clang's ABI and ignore the difference?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-12 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

bcl5980 wrote:
> efriedma wrote:
> > bcl5980 wrote:
> > > efriedma wrote:
> > > > bcl5980 wrote:
> > > > > efriedma wrote:
> > > > > > bcl5980 wrote:
> > > > > > > A headache thing here.
> > > > > > > We need to get the function definition with triple x64 to define 
> > > > > > > entry thunk. For now the function definition here is aarch64 
> > > > > > > version.
> > > > > > > For example the case in Microsoft doc "Understanding Arm64EC ABI 
> > > > > > > and assembly code":
> > > > > > > 
> > > > > > > ```
> > > > > > > struct SC {
> > > > > > > char a;
> > > > > > > char b;
> > > > > > > char c;
> > > > > > > };
> > > > > > > int fB(int a, double b, int i1, int i2, int i3);
> > > > > > > int fC(int a, struct SC c, int i1, int i2, int i3);
> > > > > > > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > > > > > > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > x64 version IR for fA is:
> > > > > > > ```
> > > > > > > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr 
> > > > > > > nocapture noundef readonly %c, i32 noundef %i1, i32 noundef %i2, 
> > > > > > > i32 noundef %i3) local_unnamed_addr #0 { ... }
> > > > > > > ```
> > > > > > > aarch64 version IR for fA is:
> > > > > > > 
> > > > > > > ```
> > > > > > > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, 
> > > > > > > i64 %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) 
> > > > > > > #0 {...}
> > > > > > > ```
> > > > > > > Arm64 will allow any size structure to be assigned to a register 
> > > > > > > directly. x64 only allows sizes 1, 2, 4 and 8. 
> > > > > > > Entry thunk follow x64 version function type. But we only have 
> > > > > > > aarch64 version function type.
> > > > > > > 
> > > > > > > I think the best way to do is create a x64 version codeGenModule 
> > > > > > > and use the x64 CGM to generate the function type for entry 
> > > > > > > thunk. But it is hard for me to do here. I tried a little but a 
> > > > > > > lot of issues happen.
> > > > > > > 
> > > > > > > One other way is only modify 
> > > > > > > `AArch64ABIInfo::classifyArgumentType`, copy the x64 code into 
> > > > > > > the function and add a flag to determine which version will the 
> > > > > > > function use. It is easier but I'm not sure it is the only 
> > > > > > > difference between x64 and aarch64. Maybe the classify return 
> > > > > > > also need to do this. And it is not a clean way I think.
> > > > > > Oh, that's annoying... I hadn't considered the case of a struct of 
> > > > > > size 3/5/6/7.
> > > > > > 
> > > > > > Like I noted on D126811, attaching thunks to calls is tricky if we 
> > > > > > try to do it from clang.
> > > > > > 
> > > > > > Computing the right IR type shouldn't be that hard by itself; we 
> > > > > > can call into call lowering code in TargetInfo without modifying 
> > > > > > much else.  (We just need a bit to tell the TargetInfo to redirect 
> > > > > > the call, like D125419.  Use an entry point like 
> > > > > > CodeGenTypes::arrangeCall.)  You don't need to mess with the type 
> > > > > > system or anything like that.
> > > > > > 
> > > > > > The problem is correctly representing the lowered call in IR; we 
> > > > > > really don't want to do lowering early because it will block 
> > > > > > optimizations.  I considered using an operand bundle; we can 
> > > > > > probably make that work, but it's complicated, and probably 
> > > > > > disables some optimizations.
> > > > > > 
> > > > > > I think the best thing we can do here is add an IR attribute to 
> > > > > > mark arguments which are passed directly on AArch64, but need to be 
> > > > > > passed indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering 
> > > > > > can check for the attribute and modify its behavior.  This isn't 
> > > > > > really clean in the sense that it's specific to the x64/aarch64 
> > > > > > pair of calling conventions, but I think the alternative is worse.
> > > > > It looks not only 3/5/6/7, but also all size exclusive larger than 8 
> > > > > and less than 16 are difference between x86 ABI and Aarch64 ABI.
> > > > > Maybe we can emit a function declaration here for the x86ABI thunk, 
> > > > > then define it in Arm64ECCallLowering.
> > > > > 
> > > > I think the sizes between 8 and 16 work correctly already?  All sizes 
> > > > greater than 8 are passed indirectly on x86, and the thunk generation 
> > > > code accounts for that.  But that's not really important for the 
> > > > general question.
> > > > 
> > > > We need to preserve the required semantics for both the AArch64 and x86 
> > > > calling conventions.  There are basically 

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-11 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

efriedma wrote:
> bcl5980 wrote:
> > efriedma wrote:
> > > bcl5980 wrote:
> > > > efriedma wrote:
> > > > > bcl5980 wrote:
> > > > > > A headache thing here.
> > > > > > We need to get the function definition with triple x64 to define 
> > > > > > entry thunk. For now the function definition here is aarch64 
> > > > > > version.
> > > > > > For example the case in Microsoft doc "Understanding Arm64EC ABI 
> > > > > > and assembly code":
> > > > > > 
> > > > > > ```
> > > > > > struct SC {
> > > > > > char a;
> > > > > > char b;
> > > > > > char c;
> > > > > > };
> > > > > > int fB(int a, double b, int i1, int i2, int i3);
> > > > > > int fC(int a, struct SC c, int i1, int i2, int i3);
> > > > > > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > > > > > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > x64 version IR for fA is:
> > > > > > ```
> > > > > > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr 
> > > > > > nocapture noundef readonly %c, i32 noundef %i1, i32 noundef %i2, 
> > > > > > i32 noundef %i3) local_unnamed_addr #0 { ... }
> > > > > > ```
> > > > > > aarch64 version IR for fA is:
> > > > > > 
> > > > > > ```
> > > > > > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 
> > > > > > %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 
> > > > > > {...}
> > > > > > ```
> > > > > > Arm64 will allow any size structure to be assigned to a register 
> > > > > > directly. x64 only allows sizes 1, 2, 4 and 8. 
> > > > > > Entry thunk follow x64 version function type. But we only have 
> > > > > > aarch64 version function type.
> > > > > > 
> > > > > > I think the best way to do is create a x64 version codeGenModule 
> > > > > > and use the x64 CGM to generate the function type for entry thunk. 
> > > > > > But it is hard for me to do here. I tried a little but a lot of 
> > > > > > issues happen.
> > > > > > 
> > > > > > One other way is only modify 
> > > > > > `AArch64ABIInfo::classifyArgumentType`, copy the x64 code into the 
> > > > > > function and add a flag to determine which version will the 
> > > > > > function use. It is easier but I'm not sure it is the only 
> > > > > > difference between x64 and aarch64. Maybe the classify return also 
> > > > > > need to do this. And it is not a clean way I think.
> > > > > Oh, that's annoying... I hadn't considered the case of a struct of 
> > > > > size 3/5/6/7.
> > > > > 
> > > > > Like I noted on D126811, attaching thunks to calls is tricky if we 
> > > > > try to do it from clang.
> > > > > 
> > > > > Computing the right IR type shouldn't be that hard by itself; we can 
> > > > > call into call lowering code in TargetInfo without modifying much 
> > > > > else.  (We just need a bit to tell the TargetInfo to redirect the 
> > > > > call, like D125419.  Use an entry point like 
> > > > > CodeGenTypes::arrangeCall.)  You don't need to mess with the type 
> > > > > system or anything like that.
> > > > > 
> > > > > The problem is correctly representing the lowered call in IR; we 
> > > > > really don't want to do lowering early because it will block 
> > > > > optimizations.  I considered using an operand bundle; we can probably 
> > > > > make that work, but it's complicated, and probably disables some 
> > > > > optimizations.
> > > > > 
> > > > > I think the best thing we can do here is add an IR attribute to mark 
> > > > > arguments which are passed directly on AArch64, but need to be passed 
> > > > > indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering can 
> > > > > check for the attribute and modify its behavior.  This isn't really 
> > > > > clean in the sense that it's specific to the x64/aarch64 pair of 
> > > > > calling conventions, but I think the alternative is worse.
> > > > It looks not only 3/5/6/7, but also all size exclusive larger than 8 
> > > > and less than 16 are difference between x86 ABI and Aarch64 ABI.
> > > > Maybe we can emit a function declaration here for the x86ABI thunk, 
> > > > then define it in Arm64ECCallLowering.
> > > > 
> > > I think the sizes between 8 and 16 work correctly already?  All sizes 
> > > greater than 8 are passed indirectly on x86, and the thunk generation 
> > > code accounts for that.  But that's not really important for the general 
> > > question.
> > > 
> > > We need to preserve the required semantics for both the AArch64 and x86 
> > > calling conventions.  There are basically the following possibilities:
> > > 
> > > - We compute the declaration of the thunk in the frontend, and attach it 
> > > to the call with an operand bundle.  Like I mentioned, I don't want to go 
> > 

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-11 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

bcl5980 wrote:
> efriedma wrote:
> > bcl5980 wrote:
> > > efriedma wrote:
> > > > bcl5980 wrote:
> > > > > A headache thing here.
> > > > > We need to get the function definition with triple x64 to define 
> > > > > entry thunk. For now the function definition here is aarch64 version.
> > > > > For example the case in Microsoft doc "Understanding Arm64EC ABI and 
> > > > > assembly code":
> > > > > 
> > > > > ```
> > > > > struct SC {
> > > > > char a;
> > > > > char b;
> > > > > char c;
> > > > > };
> > > > > int fB(int a, double b, int i1, int i2, int i3);
> > > > > int fC(int a, struct SC c, int i1, int i2, int i3);
> > > > > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > > > > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > > > > }
> > > > > ```
> > > > > 
> > > > > x64 version IR for fA is:
> > > > > ```
> > > > > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr 
> > > > > nocapture noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 
> > > > > noundef %i3) local_unnamed_addr #0 { ... }
> > > > > ```
> > > > > aarch64 version IR for fA is:
> > > > > 
> > > > > ```
> > > > > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 
> > > > > %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
> > > > > ```
> > > > > Arm64 will allow any size structure to be assigned to a register 
> > > > > directly. x64 only allows sizes 1, 2, 4 and 8. 
> > > > > Entry thunk follow x64 version function type. But we only have 
> > > > > aarch64 version function type.
> > > > > 
> > > > > I think the best way to do is create a x64 version codeGenModule and 
> > > > > use the x64 CGM to generate the function type for entry thunk. But it 
> > > > > is hard for me to do here. I tried a little but a lot of issues 
> > > > > happen.
> > > > > 
> > > > > One other way is only modify `AArch64ABIInfo::classifyArgumentType`, 
> > > > > copy the x64 code into the function and add a flag to determine which 
> > > > > version will the function use. It is easier but I'm not sure it is 
> > > > > the only difference between x64 and aarch64. Maybe the classify 
> > > > > return also need to do this. And it is not a clean way I think.
> > > > Oh, that's annoying... I hadn't considered the case of a struct of size 
> > > > 3/5/6/7.
> > > > 
> > > > Like I noted on D126811, attaching thunks to calls is tricky if we try 
> > > > to do it from clang.
> > > > 
> > > > Computing the right IR type shouldn't be that hard by itself; we can 
> > > > call into call lowering code in TargetInfo without modifying much else. 
> > > >  (We just need a bit to tell the TargetInfo to redirect the call, like 
> > > > D125419.  Use an entry point like CodeGenTypes::arrangeCall.)  You 
> > > > don't need to mess with the type system or anything like that.
> > > > 
> > > > The problem is correctly representing the lowered call in IR; we really 
> > > > don't want to do lowering early because it will block optimizations.  I 
> > > > considered using an operand bundle; we can probably make that work, but 
> > > > it's complicated, and probably disables some optimizations.
> > > > 
> > > > I think the best thing we can do here is add an IR attribute to mark 
> > > > arguments which are passed directly on AArch64, but need to be passed 
> > > > indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering can check 
> > > > for the attribute and modify its behavior.  This isn't really clean in 
> > > > the sense that it's specific to the x64/aarch64 pair of calling 
> > > > conventions, but I think the alternative is worse.
> > > It looks not only 3/5/6/7, but also all size exclusive larger than 8 and 
> > > less than 16 are difference between x86 ABI and Aarch64 ABI.
> > > Maybe we can emit a function declaration here for the x86ABI thunk, then 
> > > define it in Arm64ECCallLowering.
> > > 
> > I think the sizes between 8 and 16 work correctly already?  All sizes 
> > greater than 8 are passed indirectly on x86, and the thunk generation code 
> > accounts for that.  But that's not really important for the general 
> > question.
> > 
> > We need to preserve the required semantics for both the AArch64 and x86 
> > calling conventions.  There are basically the following possibilities:
> > 
> > - We compute the declaration of the thunk in the frontend, and attach it to 
> > the call with an operand bundle.  Like I mentioned, I don't want to go down 
> > this path: the operand bundle blocks optimizations, and it becomes more 
> > complicated for other code to generate arm64ec compatible calls.
> > - We don't compute the definition of the thunk in the frontend.  Given 
> > that, the only other way to 

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-10 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

efriedma wrote:
> bcl5980 wrote:
> > efriedma wrote:
> > > bcl5980 wrote:
> > > > A headache thing here.
> > > > We need to get the function definition with triple x64 to define entry 
> > > > thunk. For now the function definition here is aarch64 version.
> > > > For example the case in Microsoft doc "Understanding Arm64EC ABI and 
> > > > assembly code":
> > > > 
> > > > ```
> > > > struct SC {
> > > > char a;
> > > > char b;
> > > > char c;
> > > > };
> > > > int fB(int a, double b, int i1, int i2, int i3);
> > > > int fC(int a, struct SC c, int i1, int i2, int i3);
> > > > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > > > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > > > }
> > > > ```
> > > > 
> > > > x64 version IR for fA is:
> > > > ```
> > > > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr 
> > > > nocapture noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 
> > > > noundef %i3) local_unnamed_addr #0 { ... }
> > > > ```
> > > > aarch64 version IR for fA is:
> > > > 
> > > > ```
> > > > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 
> > > > %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
> > > > ```
> > > > Arm64 will allow any size structure to be assigned to a register 
> > > > directly. x64 only allows sizes 1, 2, 4 and 8. 
> > > > Entry thunk follow x64 version function type. But we only have aarch64 
> > > > version function type.
> > > > 
> > > > I think the best way to do is create a x64 version codeGenModule and 
> > > > use the x64 CGM to generate the function type for entry thunk. But it 
> > > > is hard for me to do here. I tried a little but a lot of issues happen.
> > > > 
> > > > One other way is only modify `AArch64ABIInfo::classifyArgumentType`, 
> > > > copy the x64 code into the function and add a flag to determine which 
> > > > version will the function use. It is easier but I'm not sure it is the 
> > > > only difference between x64 and aarch64. Maybe the classify return also 
> > > > need to do this. And it is not a clean way I think.
> > > Oh, that's annoying... I hadn't considered the case of a struct of size 
> > > 3/5/6/7.
> > > 
> > > Like I noted on D126811, attaching thunks to calls is tricky if we try to 
> > > do it from clang.
> > > 
> > > Computing the right IR type shouldn't be that hard by itself; we can call 
> > > into call lowering code in TargetInfo without modifying much else.  (We 
> > > just need a bit to tell the TargetInfo to redirect the call, like 
> > > D125419.  Use an entry point like CodeGenTypes::arrangeCall.)  You don't 
> > > need to mess with the type system or anything like that.
> > > 
> > > The problem is correctly representing the lowered call in IR; we really 
> > > don't want to do lowering early because it will block optimizations.  I 
> > > considered using an operand bundle; we can probably make that work, but 
> > > it's complicated, and probably disables some optimizations.
> > > 
> > > I think the best thing we can do here is add an IR attribute to mark 
> > > arguments which are passed directly on AArch64, but need to be passed 
> > > indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering can check 
> > > for the attribute and modify its behavior.  This isn't really clean in 
> > > the sense that it's specific to the x64/aarch64 pair of calling 
> > > conventions, but I think the alternative is worse.
> > It looks not only 3/5/6/7, but also all size exclusive larger than 8 and 
> > less than 16 are difference between x86 ABI and Aarch64 ABI.
> > Maybe we can emit a function declaration here for the x86ABI thunk, then 
> > define it in Arm64ECCallLowering.
> > 
> I think the sizes between 8 and 16 work correctly already?  All sizes greater 
> than 8 are passed indirectly on x86, and the thunk generation code accounts 
> for that.  But that's not really important for the general question.
> 
> We need to preserve the required semantics for both the AArch64 and x86 
> calling conventions.  There are basically the following possibilities:
> 
> - We compute the declaration of the thunk in the frontend, and attach it to 
> the call with an operand bundle.  Like I mentioned, I don't want to go down 
> this path: the operand bundle blocks optimizations, and it becomes more 
> complicated for other code to generate arm64ec compatible calls.
> - We don't compute the definition of the thunk in the frontend.  Given that, 
> the only other way to attach the information we need to the call is to use 
> attributes.  The simplest thing is probably to attach the attribute directly 
> to the argument; name it "arm64ec-thunk-pass-indirect", or something like 

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-10 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

bcl5980 wrote:
> efriedma wrote:
> > bcl5980 wrote:
> > > A headache thing here.
> > > We need to get the function definition with triple x64 to define entry 
> > > thunk. For now the function definition here is aarch64 version.
> > > For example the case in Microsoft doc "Understanding Arm64EC ABI and 
> > > assembly code":
> > > 
> > > ```
> > > struct SC {
> > > char a;
> > > char b;
> > > char c;
> > > };
> > > int fB(int a, double b, int i1, int i2, int i3);
> > > int fC(int a, struct SC c, int i1, int i2, int i3);
> > > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > > }
> > > ```
> > > 
> > > x64 version IR for fA is:
> > > ```
> > > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr nocapture 
> > > noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) 
> > > local_unnamed_addr #0 { ... }
> > > ```
> > > aarch64 version IR for fA is:
> > > 
> > > ```
> > > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 
> > > %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
> > > ```
> > > Arm64 will allow any size structure to be assigned to a register 
> > > directly. x64 only allows sizes 1, 2, 4 and 8. 
> > > Entry thunk follow x64 version function type. But we only have aarch64 
> > > version function type.
> > > 
> > > I think the best way to do is create a x64 version codeGenModule and use 
> > > the x64 CGM to generate the function type for entry thunk. But it is hard 
> > > for me to do here. I tried a little but a lot of issues happen.
> > > 
> > > One other way is only modify `AArch64ABIInfo::classifyArgumentType`, copy 
> > > the x64 code into the function and add a flag to determine which version 
> > > will the function use. It is easier but I'm not sure it is the only 
> > > difference between x64 and aarch64. Maybe the classify return also need 
> > > to do this. And it is not a clean way I think.
> > Oh, that's annoying... I hadn't considered the case of a struct of size 
> > 3/5/6/7.
> > 
> > Like I noted on D126811, attaching thunks to calls is tricky if we try to 
> > do it from clang.
> > 
> > Computing the right IR type shouldn't be that hard by itself; we can call 
> > into call lowering code in TargetInfo without modifying much else.  (We 
> > just need a bit to tell the TargetInfo to redirect the call, like D125419.  
> > Use an entry point like CodeGenTypes::arrangeCall.)  You don't need to mess 
> > with the type system or anything like that.
> > 
> > The problem is correctly representing the lowered call in IR; we really 
> > don't want to do lowering early because it will block optimizations.  I 
> > considered using an operand bundle; we can probably make that work, but 
> > it's complicated, and probably disables some optimizations.
> > 
> > I think the best thing we can do here is add an IR attribute to mark 
> > arguments which are passed directly on AArch64, but need to be passed 
> > indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering can check for 
> > the attribute and modify its behavior.  This isn't really clean in the 
> > sense that it's specific to the x64/aarch64 pair of calling conventions, 
> > but I think the alternative is worse.
> It looks not only 3/5/6/7, but also all size exclusive larger than 8 and less 
> than 16 are difference between x86 ABI and Aarch64 ABI.
> Maybe we can emit a function declaration here for the x86ABI thunk, then 
> define it in Arm64ECCallLowering.
> 
I think the sizes between 8 and 16 work correctly already?  All sizes greater 
than 8 are passed indirectly on x86, and the thunk generation code accounts for 
that.  But that's not really important for the general question.

We need to preserve the required semantics for both the AArch64 and x86 calling 
conventions.  There are basically the following possibilities:

- We compute the declaration of the thunk in the frontend, and attach it to the 
call with an operand bundle.  Like I mentioned, I don't want to go down this 
path: the operand bundle blocks optimizations, and it becomes more complicated 
for other code to generate arm64ec compatible calls.
- We don't compute the definition of the thunk in the frontend.  Given that, 
the only other way to attach the information we need to the call is to use 
attributes.  The simplest thing is probably to attach the attribute directly to 
the argument; name it "arm64ec-thunk-pass-indirect", or something like that.  
(I mean, we could compute the whole signature and stuff it into a string 
attribute, but that doesn't really seem like an improvement...)


Repository:
  rG LLVM Github Monorepo

CHANGES 

[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-08-09 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

efriedma wrote:
> bcl5980 wrote:
> > A headache thing here.
> > We need to get the function definition with triple x64 to define entry 
> > thunk. For now the function definition here is aarch64 version.
> > For example the case in Microsoft doc "Understanding Arm64EC ABI and 
> > assembly code":
> > 
> > ```
> > struct SC {
> > char a;
> > char b;
> > char c;
> > };
> > int fB(int a, double b, int i1, int i2, int i3);
> > int fC(int a, struct SC c, int i1, int i2, int i3);
> > int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> > return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> > }
> > ```
> > 
> > x64 version IR for fA is:
> > ```
> > define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr nocapture 
> > noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) 
> > local_unnamed_addr #0 { ... }
> > ```
> > aarch64 version IR for fA is:
> > 
> > ```
> > define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 
> > %c.coerce, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
> > ```
> > Arm64 will allow any size structure to be assigned to a register directly. 
> > x64 only allows sizes 1, 2, 4 and 8. 
> > Entry thunk follow x64 version function type. But we only have aarch64 
> > version function type.
> > 
> > I think the best way to do is create a x64 version codeGenModule and use 
> > the x64 CGM to generate the function type for entry thunk. But it is hard 
> > for me to do here. I tried a little but a lot of issues happen.
> > 
> > One other way is only modify `AArch64ABIInfo::classifyArgumentType`, copy 
> > the x64 code into the function and add a flag to determine which version 
> > will the function use. It is easier but I'm not sure it is the only 
> > difference between x64 and aarch64. Maybe the classify return also need to 
> > do this. And it is not a clean way I think.
> Oh, that's annoying... I hadn't considered the case of a struct of size 
> 3/5/6/7.
> 
> Like I noted on D126811, attaching thunks to calls is tricky if we try to do 
> it from clang.
> 
> Computing the right IR type shouldn't be that hard by itself; we can call 
> into call lowering code in TargetInfo without modifying much else.  (We just 
> need a bit to tell the TargetInfo to redirect the call, like D125419.  Use an 
> entry point like CodeGenTypes::arrangeCall.)  You don't need to mess with the 
> type system or anything like that.
> 
> The problem is correctly representing the lowered call in IR; we really don't 
> want to do lowering early because it will block optimizations.  I considered 
> using an operand bundle; we can probably make that work, but it's 
> complicated, and probably disables some optimizations.
> 
> I think the best thing we can do here is add an IR attribute to mark 
> arguments which are passed directly on AArch64, but need to be passed 
> indirectly for the x64 ABI.  Then AArch64Arm64ECCallLowering can check for 
> the attribute and modify its behavior.  This isn't really clean in the sense 
> that it's specific to the x64/aarch64 pair of calling conventions, but I 
> think the alternative is worse.
It looks not only 3/5/6/7, but also all size exclusive larger than 8 and less 
than 16 are difference between x86 ABI and Aarch64 ABI.
Maybe we can emit a function declaration here for the x86ABI thunk, then define 
it in Arm64ECCallLowering.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-07-19 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

bcl5980 wrote:
> A headache thing here.
> We need to get the function definition with triple x64 to define entry thunk. 
> For now the function definition here is aarch64 version.
> For example the case in Microsoft doc "Understanding Arm64EC ABI and assembly 
> code":
> 
> ```
> struct SC {
> char a;
> char b;
> char c;
> };
> int fB(int a, double b, int i1, int i2, int i3);
> int fC(int a, struct SC c, int i1, int i2, int i3);
> int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
> return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
> }
> ```
> 
> x64 version IR for fA is:
> ```
> define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr nocapture 
> noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) 
> local_unnamed_addr #0 { ... }
> ```
> aarch64 version IR for fA is:
> 
> ```
> define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 %c.coerce, 
> i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
> ```
> Arm64 will allow any size structure to be assigned to a register directly. 
> x64 only allows sizes 1, 2, 4 and 8. 
> Entry thunk follow x64 version function type. But we only have aarch64 
> version function type.
> 
> I think the best way to do is create a x64 version codeGenModule and use the 
> x64 CGM to generate the function type for entry thunk. But it is hard for me 
> to do here. I tried a little but a lot of issues happen.
> 
> One other way is only modify `AArch64ABIInfo::classifyArgumentType`, copy the 
> x64 code into the function and add a flag to determine which version will the 
> function use. It is easier but I'm not sure it is the only difference between 
> x64 and aarch64. Maybe the classify return also need to do this. And it is 
> not a clean way I think.
Oh, that's annoying... I hadn't considered the case of a struct of size 3/5/6/7.

Like I noted on D126811, attaching thunks to calls is tricky if we try to do it 
from clang.

Computing the right IR type shouldn't be that hard by itself; we can call into 
call lowering code in TargetInfo without modifying much else.  (We just need a 
bit to tell the TargetInfo to redirect the call, like D125419.  Use an entry 
point like CodeGenTypes::arrangeCall.)  You don't need to mess with the type 
system or anything like that.

The problem is correctly representing the lowered call in IR; we really don't 
want to do lowering early because it will block optimizations.  I considered 
using an operand bundle; we can probably make that work, but it's complicated, 
and probably disables some optimizations.

I think the best thing we can do here is add an IR attribute to mark arguments 
which are passed directly on AArch64, but need to be passed indirectly for the 
x64 ABI.  Then AArch64Arm64ECCallLowering can check for the attribute and 
modify its behavior.  This isn't really clean in the sense that it's specific 
to the x64/aarch64 pair of calling conventions, but I think the alternative is 
worse.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-07-18 Thread chenglin.bi via Phabricator via cfe-commits
bcl5980 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:5128
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;

A headache thing here.
We need to get the function definition with triple x64 to define entry thunk. 
For now the function definition here is aarch64 version.
For example the case in Microsoft doc "Understanding Arm64EC ABI and assembly 
code":

```
struct SC {
char a;
char b;
char c;
};
int fB(int a, double b, int i1, int i2, int i3);
int fC(int a, struct SC c, int i1, int i2, int i3);
int fA(int a, double b, struct SC c, int i1, int i2, int i3) {
return fB(a, b, i1, i2, i3) + fC(a, c, i1, i2, i3);
}
```

x64 version IR for fA is:
```
define dso_local i32 @fA(i32 noundef %a, double noundef %b, ptr nocapture 
noundef readonly %c, i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) 
local_unnamed_addr #0 { ... }
```
aarch64 version IR for fA is:

```
define dso_local i32 @"#fA"(i32 noundef %a, double noundef %b, i64 %c.coerce, 
i32 noundef %i1, i32 noundef %i2, i32 noundef %i3) #0 {...}
```
Arm64 will allow any size structure to be assigned to a register directly. x64 
only allows sizes 1, 2, 4 and 8. 
Entry thunk follow x64 version function type. But we only have aarch64 version 
function type.

I think the best way to do is create a x64 version codeGenModule and use the 
x64 CGM to generate the function type for entry thunk. But it is hard for me to 
do here. I tried a little but a lot of issues happen.

One other way is only modify `AArch64ABIInfo::classifyArgumentType`, copy the 
x64 code into the function and add a flag to determine which version will the 
function use. It is easier but I'm not sure it is the only difference between 
x64 and aarch64. Maybe the classify return also need to do this. And it is not 
a clean way I think.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125418/new/

https://reviews.llvm.org/D125418

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125418: [Arm64EC 6/?] Implement C/C++ mangling for Arm64EC function definitions.

2022-05-11 Thread Eli Friedman via Phabricator via cfe-commits
efriedma created this revision.
Herald added subscribers: zzheng, kristof.beyls.
Herald added a project: All.
efriedma requested review of this revision.
Herald added a project: clang.

Part of initial Arm64EC patchset.

For the Arm64EC ABI, ARM64 functions have an alternate name.  For C code, this 
name is just the original name prefixed with "#".  For C++ code, we stick a 
"$$h" modifier in the middle of the mangling.

For functions which are not hybrid_patchable, the normal name is then an alias 
for the alternate name.  (For functions that are patchable, we have to do 
something more complicated to tell the linker to generate a stub; I haven't 
tried to implement that yet.)

This doesn't emit quite the same symbols table as MSVC for simple cases: MSVC 
generates a IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY alias, where this just makes 
another symbol pointing at the function definition. This probably matters for 
the hybmp$x table, but I don't have the complete documentation at the moment.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125418

Files:
  clang/include/clang/AST/Mangle.h
  clang/lib/AST/MicrosoftMangle.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGen/arm64ec.c
  clang/test/CodeGenCXX/arm64ec.cpp

Index: clang/test/CodeGenCXX/arm64ec.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/arm64ec.cpp
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -no-opaque-pointers -triple aarch64-windows-msvc_arm64ec -emit-llvm -o - %s | FileCheck %s
+
+// CHECK: @"?g@@YAXUA@@UB@@@Z" = alias void ([2 x float], [4 x float]), void ([2 x float], [4 x float])* @"?g@@$$hYAXUA@@UB@@@Z"
+// CHECK: define dso_local void @"?g@@$$hYAXUA@@UB@@@Z"
+typedef struct { float x[2]; } A;
+typedef struct { float x[4]; } B;
+void g(A a, B b) { }
Index: clang/test/CodeGen/arm64ec.c
===
--- /dev/null
+++ clang/test/CodeGen/arm64ec.c
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -no-opaque-pointers -triple aarch64-windows-msvc_arm64ec -emit-llvm -o - %s | FileCheck %s
+
+// CHECK: @g = alias void ([2 x float], [4 x float]), void ([2 x float], [4 x float])* @"#g"
+// CHECK: define dso_local void @"#g"
+typedef struct { float x[2]; } A;
+typedef struct { float x[4]; } B;
+void g(A a, B b) { }
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -5117,6 +5117,23 @@
   if (!GV->isDeclaration())
 return;
 
+  if (getTriple().isWindowsArm64EC()) {
+// For ARM64EC targets, a function definition's name is mangled differently
+// from the normal symbol.  We then emit an alias from the normal
+// symbol to the remangled definition.
+// FIXME: MSVC uses IMAGE_WEAK_EXTERN_ANTI_DEPENDENCY, we just emit
+// multiple definition symbols.  Why does this matter?
+// FIXME: For hybrid_patchable functions, the alias doesn't point
+// to the function itself; it points to a stub for the compiler.
+// FIXME: We also need to emit an entry thunk.
+SmallString<256> MangledName;
+llvm::raw_svector_ostream Out(MangledName);
+getCXXABI().getMangleContext().mangleArm64ECFnDef(GD, Out);
+auto *Alias = llvm::GlobalAlias::create("", GV);
+Alias->takeName(GV);
+GV->setName(MangledName);
+  }
+
   // We need to set linkage and visibility on the function before
   // generating code for it because various parts of IR generation
   // want to propagate this information down (e.g. to local static
Index: clang/lib/AST/MicrosoftMangle.cpp
===
--- clang/lib/AST/MicrosoftMangle.cpp
+++ clang/lib/AST/MicrosoftMangle.cpp
@@ -298,6 +298,7 @@
 return AnonymousNamespaceHash;
   }
 
+  void mangleArm64ECFnDef(GlobalDecl GD, raw_ostream &) override;
 private:
   void mangleInitFiniStub(const VarDecl *D, char CharCode, raw_ostream );
 };
@@ -360,7 +361,7 @@
 
   void mangle(GlobalDecl GD, StringRef Prefix = "?");
   void mangleName(GlobalDecl GD);
-  void mangleFunctionEncoding(GlobalDecl GD, bool ShouldMangle);
+  void mangleFunctionEncoding(GlobalDecl GD, bool ShouldMangle, bool Arm64ECDef = false);
   void mangleVariableEncoding(const VarDecl *VD);
   void mangleMemberDataPointer(const CXXRecordDecl *RD, const ValueDecl *VD,
StringRef Prefix = "$");
@@ -384,6 +385,7 @@
   bool ForceThisQuals = false,
   bool MangleExceptionSpec = true);
   void mangleNestedName(GlobalDecl GD);
+  void mangleArm64ECFnDef(GlobalDecl GD);
 
 private:
   bool isStructorDecl(const NamedDecl *ND) const {
@@ -573,7 +575,8 @@
 }
 
 void MicrosoftCXXNameMangler::mangleFunctionEncoding(GlobalDecl GD,
- bool ShouldMangle) {
+ bool