[PATCH] D122958: [clang] Corrections for target_clones multiversion functions.

2022-04-05 Thread Tom Honermann via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5531abaf7158: [clang] Corrections for target_clones 
multiversion functions. (authored by tahonermann).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122958/new/

https://reviews.llvm.org/D122958

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/test/CodeGen/attr-target-clones.c
  clang/test/CodeGenCXX/attr-target-clones.cpp

Index: clang/test/CodeGenCXX/attr-target-clones.cpp
===
--- clang/test/CodeGenCXX/attr-target-clones.cpp
+++ clang/test/CodeGenCXX/attr-target-clones.cpp
@@ -12,29 +12,29 @@
 int __attribute__((target_clones("sse4.2", "default"))) overloaded(int) { return 1; }
 // LINUX: define {{.*}}i32 @_Z10overloadedi.sse4.2.0(i32{{.+}})
 // LINUX: define {{.*}}i32 @_Z10overloadedi.default.1(i32{{.+}})
-// LINUX: define i32 (i32)* @_Z10overloadedi.resolver
+// LINUX: define weak_odr i32 (i32)* @_Z10overloadedi.resolver() comdat
 // LINUX: ret i32 (i32)* @_Z10overloadedi.sse4.2.0
 // LINUX: ret i32 (i32)* @_Z10overloadedi.default.1
 
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHH@Z.sse4.2.0"(i32{{.+}})
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHH@Z.default.1"(i32{{.+}})
-// WINDOWS: define dso_local i32 @"?overloaded@@YAHH@Z"(i32{{.+}})
+// WINDOWS: define weak_odr dso_local i32 @"?overloaded@@YAHH@Z"(i32{{.+}}) comdat
 // WINDOWS: call i32 @"?overloaded@@YAHH@Z.sse4.2.0"
 // WINDOWS: call i32 @"?overloaded@@YAHH@Z.default.1"
 
 int __attribute__((target_clones("arch=ivybridge", "default"))) overloaded(const char *) { return 2; }
 // LINUX: define {{.*}}i32 @_Z10overloadedPKc.arch_ivybridge.0(i8*{{.+}})
 // LINUX: define {{.*}}i32 @_Z10overloadedPKc.default.1(i8*{{.+}})
-// LINUX: define i32 (i8*)* @_Z10overloadedPKc.resolver
+// LINUX: define weak_odr i32 (i8*)* @_Z10overloadedPKc.resolver() comdat
 // LINUX: ret i32 (i8*)* @_Z10overloadedPKc.arch_ivybridge.0
 // LINUX: ret i32 (i8*)* @_Z10overloadedPKc.default.1
 
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHPEBD@Z.arch_ivybridge.0"(i8*{{.+}})
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHPEBD@Z.default.1"(i8*{{.+}})
-// WINDOWS: define dso_local i32 @"?overloaded@@YAHPEBD@Z"(i8*{{.+}})
+// WINDOWS: define weak_odr dso_local i32 @"?overloaded@@YAHPEBD@Z"(i8*{{.+}}) comdat
 // WINDOWS: call i32 @"?overloaded@@YAHPEBD@Z.arch_ivybridge.0"
 // WINDOWS: call i32 @"?overloaded@@YAHPEBD@Z.default.1"
-//
+
 void use_overloaded() {
   overloaded(1);
   // LINUX: call noundef i32 @_Z10overloadedi.ifunc
@@ -81,36 +81,40 @@
   // WINDOWS: call noundef i32 @"?foo@?$C@NM@@QEAAHXZ"(%struct.C
 }
 
-// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@FF@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
-// LINUX: define i32 (%struct.C*)* @_ZN1CIssE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C*)* @_ZN1CIssE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C*)* @_ZN1CIssE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C*)* @_ZN1CIssE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@FF@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@FF@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@FF@@QEAAHXZ.default.1"
 
-// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@HF@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
-// LINUX: define i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@HF@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@HF@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@HF@@QEAAHXZ.default.1"
 
-// LINUX: define i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@NM@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@NM@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@NM@@QEAAHXZ.default.1"
 
+// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.sse4.2.0(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.default.1(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.sse4.2.0(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.default.1(%struct.C{{.+}})
 // LINUX: define {{.*}}i32 @_ZN1CIdfE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@NM@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
 // LINUX: define {{.*}}i32 

[PATCH] D122958: [clang] Corrections for target_clones multiversion functions.

2022-04-05 Thread Erich Keane via Phabricator via cfe-commits
erichkeane accepted this revision.
erichkeane added a comment.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122958/new/

https://reviews.llvm.org/D122958

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122958: [clang] Corrections for target_clones multiversion functions.

2022-04-05 Thread Tom Honermann via Phabricator via cfe-commits
tahonermann updated this revision to Diff 420587.
tahonermann added a comment.

Squashed the addition of tests originally made in D122954 
 to this review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122958/new/

https://reviews.llvm.org/D122958

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/test/CodeGen/attr-target-clones.c
  clang/test/CodeGenCXX/attr-target-clones.cpp

Index: clang/test/CodeGenCXX/attr-target-clones.cpp
===
--- clang/test/CodeGenCXX/attr-target-clones.cpp
+++ clang/test/CodeGenCXX/attr-target-clones.cpp
@@ -12,29 +12,29 @@
 int __attribute__((target_clones("sse4.2", "default"))) overloaded(int) { return 1; }
 // LINUX: define {{.*}}i32 @_Z10overloadedi.sse4.2.0(i32{{.+}})
 // LINUX: define {{.*}}i32 @_Z10overloadedi.default.1(i32{{.+}})
-// LINUX: define i32 (i32)* @_Z10overloadedi.resolver
+// LINUX: define weak_odr i32 (i32)* @_Z10overloadedi.resolver() comdat
 // LINUX: ret i32 (i32)* @_Z10overloadedi.sse4.2.0
 // LINUX: ret i32 (i32)* @_Z10overloadedi.default.1
 
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHH@Z.sse4.2.0"(i32{{.+}})
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHH@Z.default.1"(i32{{.+}})
-// WINDOWS: define dso_local i32 @"?overloaded@@YAHH@Z"(i32{{.+}})
+// WINDOWS: define weak_odr dso_local i32 @"?overloaded@@YAHH@Z"(i32{{.+}}) comdat
 // WINDOWS: call i32 @"?overloaded@@YAHH@Z.sse4.2.0"
 // WINDOWS: call i32 @"?overloaded@@YAHH@Z.default.1"
 
 int __attribute__((target_clones("arch=ivybridge", "default"))) overloaded(const char *) { return 2; }
 // LINUX: define {{.*}}i32 @_Z10overloadedPKc.arch_ivybridge.0(i8*{{.+}})
 // LINUX: define {{.*}}i32 @_Z10overloadedPKc.default.1(i8*{{.+}})
-// LINUX: define i32 (i8*)* @_Z10overloadedPKc.resolver
+// LINUX: define weak_odr i32 (i8*)* @_Z10overloadedPKc.resolver() comdat
 // LINUX: ret i32 (i8*)* @_Z10overloadedPKc.arch_ivybridge.0
 // LINUX: ret i32 (i8*)* @_Z10overloadedPKc.default.1
 
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHPEBD@Z.arch_ivybridge.0"(i8*{{.+}})
 // WINDOWS: define dso_local noundef i32 @"?overloaded@@YAHPEBD@Z.default.1"(i8*{{.+}})
-// WINDOWS: define dso_local i32 @"?overloaded@@YAHPEBD@Z"(i8*{{.+}})
+// WINDOWS: define weak_odr dso_local i32 @"?overloaded@@YAHPEBD@Z"(i8*{{.+}}) comdat
 // WINDOWS: call i32 @"?overloaded@@YAHPEBD@Z.arch_ivybridge.0"
 // WINDOWS: call i32 @"?overloaded@@YAHPEBD@Z.default.1"
-//
+
 void use_overloaded() {
   overloaded(1);
   // LINUX: call noundef i32 @_Z10overloadedi.ifunc
@@ -81,36 +81,40 @@
   // WINDOWS: call noundef i32 @"?foo@?$C@NM@@QEAAHXZ"(%struct.C
 }
 
-// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@FF@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
-// LINUX: define i32 (%struct.C*)* @_ZN1CIssE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C*)* @_ZN1CIssE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C*)* @_ZN1CIssE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C*)* @_ZN1CIssE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@FF@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@FF@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@FF@@QEAAHXZ.default.1"
 
-// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@HF@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
-// LINUX: define i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIisE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@HF@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@HF@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@HF@@QEAAHXZ.default.1"
 
-// LINUX: define i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.resolver
+// LINUX: define weak_odr i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.resolver() comdat
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.sse4.2.0
 // LINUX: ret i32 (%struct.C{{.+}})* @_ZN1CIdfE3fooEv.default.1
+
 // WINDOWS: define {{.*}}i32 @"?foo@?$C@NM@@QEAAHXZ"(%struct.C{{.+}})
 // WINDOWS: call i32 @"?foo@?$C@NM@@QEAAHXZ.sse4.2.0"
 // WINDOWS: call i32 @"?foo@?$C@NM@@QEAAHXZ.default.1"
 
+// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.sse4.2.0(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIssE3fooEv.default.1(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.sse4.2.0(%struct.C{{.+}})
+// LINUX: define {{.*}}i32 @_ZN1CIisE3fooEv.default.1(%struct.C{{.+}})
 // LINUX: define {{.*}}i32 @_ZN1CIdfE3fooEv.sse4.2.0(%struct.C{{.+}})
-// WINDOWS: define {{.*}}i32 @"?foo@?$C@NM@@QEAAHXZ.sse4.2.0"(%struct.C{{.+}})
 // LINUX: define {{.*}}i32 @_ZN1CIdfE3fooEv.default.1(%struct.C{{.+}})
-// WINDOWS: define 

[PATCH] D122958: [clang] Corrections for target_clones multiversion functions.

2022-04-04 Thread Erich Keane via Phabricator via cfe-commits
erichkeane accepted this revision.
erichkeane added inline comments.
This revision is now accepted and ready to land.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3492
 
-if (TI.supportsIFunc() || FD->isTargetMultiVersion()) {
-  ResolverFunc = cast(
-  GetGlobalValue((getMangledName(GD) + ".resolver").str()));
-  ResolverFunc->setLinkage(getMultiversionLinkage(*this, GD));
-} else {
-  ResolverFunc = cast(GetGlobalValue(getMangledName(GD)));
-}
+ResolverFunc->setLinkage(getMultiversionLinkage(*this, GD));
 

tahonermann wrote:
> `setLinkage()` was previously only called in the ifunc case. I don't know why 
> that would be, so I "fixed" it here.
Likely that was the only place we 'noticed' it was a problem?  


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122958/new/

https://reviews.llvm.org/D122958

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122958: [clang] Corrections for target_clones multiversion functions.

2022-04-02 Thread Tom Honermann via Phabricator via cfe-commits
tahonermann created this revision.
tahonermann added reviewers: erichkeane, aaron.ballman.
Herald added a subscriber: kristof.beyls.
Herald added a project: All.
tahonermann updated this revision to Diff 419919.
tahonermann added a comment.
tahonermann updated this revision to Diff 420008.
tahonermann published this revision for review.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Removed the in-class declaration of CodeGenModule::EmitTargetClonesResolver().


tahonermann added a comment.

Simplified code to select a resolver function.




Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3492
 
-if (TI.supportsIFunc() || FD->isTargetMultiVersion()) {
-  ResolverFunc = cast(
-  GetGlobalValue((getMangledName(GD) + ".resolver").str()));
-  ResolverFunc->setLinkage(getMultiversionLinkage(*this, GD));
-} else {
-  ResolverFunc = cast(GetGlobalValue(getMangledName(GD)));
-}
+ResolverFunc->setLinkage(getMultiversionLinkage(*this, GD));
 

`setLinkage()` was previously only called in the ifunc case. I don't know why 
that would be, so I "fixed" it here.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3323
   EmitGlobalFunctionDefinition(GD.getWithMultiVersionIndex(I), nullptr);
-// Requires multiple emits.
   } else if (FD->isTargetClonesMultiVersion()) {

This comment seemed oddly placed and didn't seem to convey used information, so 
I gave it the boot.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3411-3461
-void CodeGenModule::EmitTargetClonesResolver(GlobalDecl GD) {
-  const auto *FD = cast(GD.getDecl());
-  assert(FD && "Not a FunctionDecl?");
-  const auto *TC = FD->getAttr();
-  assert(TC && "Not a target_clones Function?");
-
-  llvm::Function *ResolverFunc;

This functionality was all merged into 
`CodeGenModule::emitMultiVersionFunctions()` where much of it was already 
present.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3488-3493
+if (getTarget().supportsIFunc()) {
+  auto *IFunc =
+  cast(GetOrCreateMultiVersionResolver(GD));
+  ResolverFunc = cast(IFunc->getResolver());
+} else
+  ResolverFunc = cast(GetOrCreateMultiVersionResolver(GD));

Use of `GetOrCreateMultiVersionResolver()` here avoids duplication of code to 
determine how resolver functions are named for each multiversion function kind.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3670-3673
+  // The resolver needs to be created. For target and target_clones, defer
+  // creation until the end of the TU.
+  if (FD->isTargetMultiVersion() || FD->isTargetClonesMultiVersion())
 MultiVersionFuncs.push_back(GD);

This is the change that effectively ensures that a `target_clones` function 
resolver is emitted.



Comment at: clang/test/CodeGen/attr-target-clones.c:26
 // LINUX: define {{.*}}i32 @foo.default.1()
-// LINUX: define i32 ()* @foo.resolver() comdat
+// LINUX: define weak_odr i32 ()* @foo.resolver() comdat
 // LINUX: ret i32 ()* @foo.sse4.2.0

`target_clones` resolvers are now emitted consistently with `target` resolvers.



Comment at: clang/test/CodeGen/attr-target-clones.c:152-156
+// LINUX: define linkonce i32 @foo_inline2.arch_sandybridge.0() #[[SB]]
 // LINUX: define linkonce i32 @foo_inline2.default.2() #[[DEF]]
 // LINUX: define linkonce i32 @foo_inline2.sse4.2.1() #[[SSE42]]
 
+// WINDOWS: define linkonce_odr dso_local i32 
@foo_inline2.arch_sandybridge.0() #[[SB]]

Some checks were simply missing previously.



Comment at: clang/test/CodeGen/attr-target-clones.c:161-165
+// LINUX: declare i32 @foo_used_no_defn.default.1()
+// LINUX: declare i32 @foo_used_no_defn.sse4.2.0()
+
+// WINDOWS: declare dso_local i32 @foo_used_no_defn.default.1()
+// WINDOWS: declare dso_local i32 @foo_used_no_defn.sse4.2.0()

Declarations emitted cause the function is no defined!


This change merges code for emit of target and target_clones multiversion
resolver functions and, in doing so, corrects handling of target_clones
functions that are declared but not defined. Previously, a use of such
a target_clones function would result in an attempted emit of an ifunc
that referenced an undefined resolver function. Ifunc references to
undefined resolver functions are not allowed and, when the LLVM verifier
is not disabled (via '-disable-llvm-verifier'), resulted in the verifier
issuing a "IFunc resolver must be a definition" error and aborting the
compilation. With this change, ifuncs and resolver function definitions
are always emitted for used target_clones functions regardless of whether
the target_clones function is defined (if the function is defined, then
the ifunc and resolver are emitted regardless of whether the function is
used).

This change has the side effect of