[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-04-01 Thread Erich Keane via cfe-commits

erichkeane wrote:

> I'm not sure llvm needs to know the priorities. I haven't had time to work on 
> this, but my plan was to have something that attempts to step through the 
> resolver instruction by instruction with known bits for the value loaded from 
> `__aarch64_cpu_features.features` according to the caller's target features. 
> If the return value is known, then we can fold away the resolver for that 
> call site. If we encounter a loop, a call, or some other pattern we don't 
> understand, then bail & leave that call site alone.

That seems sensible to me.  It would be nice to be able to recognize this / get 
this optimization for 'hand rolled' resolvers as well.

https://github.com/llvm/llvm-project/pull/80093
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-04-01 Thread Jon Roelofs via cfe-commits

jroelofs wrote:

I'm not sure llvm needs to know the priorities. I haven't had time to work on 
this, but my plan was to have something that attempts to step through the 
resolver instruction by instruction with known bits for the value loaded from 
`__aarch64_cpu_features.features` according to the caller's target features. If 
the return value is known, then we can fold away the resolver for that call 
site. If we encounter a loop, a call, or some other pattern we don't 
understand, then bail & leave that call site alone.
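
As a rough illustration of the plan above, here is a minimal standalone sketch. It does not use LLVM: `KnownFeatureBits`, `ResolverCheck`, and `foldResolver` are hypothetical names, and the resolver is assumed to be a straight-line chain of `(features & Mask) == Mask` tests in priority order. Anything outside that shape would be the "bail" case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of what is known about the feature word at a call site,
// derived from the caller's target features.
struct KnownFeatureBits {
  uint64_t Ones = 0;  // bits known to be set
  uint64_t Zeros = 0; // bits known to be clear
};

// One resolver test: "if ((features & Mask) == Mask) return Target;"
struct ResolverCheck {
  uint64_t Mask;
  std::string Target;
};

// Walk the checks in order. Returns the chosen target when the known bits
// fully determine the outcome, or std::nullopt to leave the call site alone.
std::optional<std::string>
foldResolver(const std::vector<ResolverCheck> &Checks,
             const std::string &Default, KnownFeatureBits Known) {
  assert((Known.Ones & Known.Zeros) == 0 && "a bit cannot be both 0 and 1");
  for (const ResolverCheck &C : Checks) {
    if ((Known.Ones & C.Mask) == C.Mask)
      return C.Target;     // this test is known to pass
    if ((Known.Zeros & C.Mask) != 0)
      continue;            // this test is known to fail
    return std::nullopt;   // outcome depends on unknown bits: bail
  }
  return Default;
}
```

Note that a caller which only guarantees a lower-priority feature cannot fold the resolver on its own: an earlier, higher-priority test may still pass at runtime, so the sketch returns `std::nullopt` in that case.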

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-04-01 Thread Erich Keane via cfe-commits

erichkeane wrote:

I'd be OK with Clang providing some level of metadata to clarify which is an 
FMV, and what our target features are.  But this sort of analysis still needs 
to happen in LLVM.  

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-03-31 Thread Alexandros Lamprineas via cfe-commits

labrinea wrote:

@erichkeane while I agree that Clang might not be the best place for such an 
optimization, I have some concerns about implementing it in LLVM:
* We cannot distinguish an FMV resolver from any other ifunc resolver.
* There is no information at the LLVM IR level about function versions or which 
resolver they are associated with. 
* We cannot use target-features to determine version priority since this 
information is encoded via front-end features in the TargetParser. We can only 
rely on the resolver's basic block layout under the assumption that predecessor 
basic blocks correspond to versions of higher priority than successor basic 
blocks. This is fragile and unreliable:
```
void discoverResolvedIFuncsInPriorityOrder(GlobalIFunc *IFunc) {
  DenseMap<GlobalIFunc *, SmallVector<Function *>> ResolvedIFuncs;

  std::function<void(Value *)> visitValue = [&](Value *V) {
    if (auto *Func = dyn_cast<Function>(V)) {
      ResolvedIFuncs[IFunc].push_back(Func);
    } else if (auto *Sel = dyn_cast<SelectInst>(V)) {
      visitValue(Sel->getTrueValue());
      visitValue(Sel->getFalseValue());
    } else if (auto *Phi = dyn_cast<PHINode>(V)) {
      for (unsigned I = 0, E = Phi->getNumIncomingValues(); I != E; ++I)
        visitValue(Phi->getIncomingValue(I));
    }
  };

  for (BasicBlock &BB : *IFunc->getResolverFunction())
    if (auto *Ret = dyn_cast_or_null<ReturnInst>(BB.getTerminator()))
      visitValue(Ret->getReturnValue());
  // discard default
  if (!ResolvedIFuncs[IFunc].empty())
    ResolvedIFuncs[IFunc].pop_back();
}
```

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

jroelofs wrote:

Fair. I'll give that a shot.

Doing it in opt has another big advantage I only just realized: it allows LTO 
to do the transformation.

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs closed 
https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Erich Keane via cfe-commits

erichkeane wrote:

> My gut feel was that recovering this information from the callee's resolver's 
> body would take heroics if we tried to do it in the backend.

Opt can already see the feature strings in the llvm-attributes, and can 
introspect into it for the resolver.  I could PERHAPS see value in an 
llvm-attribute on the resolver to tell OPT to try to look through that (that 
is, something that says "this is a generated Function MultiVersion resolver, 
you can trust these conditions match the functions"), but the rest I don't 
think needs to be in the CFE.
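
To make the "feature strings" point concrete, here is a small standalone sketch of the kind of check an optimizer could run on two `"target-features"` attribute strings (the comma-separated `+feat` / `-feat` form). The helper names `enabledFeatures` and `callerImpliesCallee` are hypothetical, not existing LLVM APIs, and a real pass would also have to account for features implied by others.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Parse a "target-features"-style string such as "+neon,+sve,-sve2" into the
// set of enabled feature names. Entries that do not start with '+' are dropped.
std::set<std::string> enabledFeatures(const std::string &Attr) {
  std::set<std::string> Enabled;
  std::istringstream In(Attr);
  std::string Tok;
  while (std::getline(In, Tok, ',')) {
    if (Tok.size() > 1 && Tok[0] == '+')
      Enabled.insert(Tok.substr(1));
  }
  return Enabled;
}

// True when every feature enabled for the callee version is also enabled for
// the caller, i.e. the caller's features are a superset of the callee's.
bool callerImpliesCallee(const std::string &CallerAttr,
                         const std::string &CalleeAttr) {
  std::set<std::string> Caller = enabledFeatures(CallerAttr);
  for (const std::string &F : enabledFeatures(CalleeAttr))
    if (!Caller.count(F))
      return false;
  return true;
}
```

This only answers "could the caller legally run this version"; it says nothing about which version the resolver would actually prefer, which is the priority question raised elsewhere in the thread.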

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

jroelofs wrote:

My gut feel was that recovering this information from the callee's resolver's 
body would take heroics if we tried to do it in the backend.

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs updated 
https://github.com/llvm/llvm-project/pull/80093

From ed52ee4424459ebc046a625341ad8dbbd38bcbe3 Mon Sep 17 00:00:00 2001
From: Jon Roelofs 
Date: Tue, 30 Jan 2024 19:13:42 -0800
Subject: [PATCH 1/6] [clang][FMV] Direct-call multi-versioned callees from
 multi-versioned callers

... when there is a callee with a matching feature set, and no other higher
priority callee.  This optimization helps the inliner see past the
ifunc+resolver to the callee that we know it will always land on.

This is a conservative implementation of: 
https://github.com/llvm/llvm-project/issues/71714
---
 clang/lib/CodeGen/CGCall.cpp  |  72 +
 clang/lib/CodeGen/CodeGenModule.cpp   |   2 +-
 .../test/CodeGen/attr-target-mv-direct-call.c | 245 ++
 3 files changed, 318 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/attr-target-mv-direct-call.c

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 28c211aa631e4..84a04e3ccddd8 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4966,6 +4966,11 @@ static unsigned getMaxVectorWidth(const llvm::Type *Ty) {
   return MaxVectorWidth;
 }
 
+// FIXME: put this somewhere nicer to share
+unsigned
+TargetMVPriority(const TargetInfo &TI,
+                 const CodeGenFunction::MultiVersionResolverOption &RO);
+
 RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
                                  const CGCallee &Callee,
                                  ReturnValueSlot ReturnValue,
@@ -5437,6 +5442,73 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   const CGCallee &ConcreteCallee = Callee.prepareConcreteCallee(*this);
   llvm::Value *CalleePtr = ConcreteCallee.getFunctionPointer();
 
+  // If a multi-versioned caller calls a multi-versioned callee, skip the
+  // resolver when there is a precise match on the feature sets, and no
+  // possibility of a better match at runtime.
+  if (const auto *CallerFD = dyn_cast_or_null<FunctionDecl>(CurGD.getDecl()))
+    if (const auto *CallerTVA = CallerFD->getAttr<TargetVersionAttr>())
+      if (const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(TargetDecl))
+        // FIXME: do the same where either the caller or callee are
+        // target_clones.
+        if (FD->isTargetMultiVersion()) {
+          llvm::SmallVector<StringRef, 8> CallerFeats;
+          CallerTVA->getFeatures(CallerFeats);
+          MultiVersionResolverOption CallerMVRO(nullptr, "", CallerFeats);
+
+          bool HasHigherPriorityCallee = false;
+          llvm::Constant *FoundMatchingCallee = nullptr;
+          getContext().forEachMultiversionedFunctionVersion(
+              FD, [this, FD, &HasHigherPriorityCallee, &FoundMatchingCallee,
+                   &CallerMVRO](const FunctionDecl *CurFD) {
+                const auto *CalleeTVA = CurFD->getAttr<TargetVersionAttr>();
+
+                GlobalDecl CurGD{
+                    (CurFD->isDefined() ? CurFD->getDefinition() : CurFD)};
+                StringRef MangledName = CGM.getMangledName(CurFD);
+
+                llvm::SmallVector<StringRef, 8> CalleeFeats;
+                CalleeTVA->getFeatures(CalleeFeats);
+                MultiVersionResolverOption CalleeMVRO(nullptr, "", CalleeFeats);
+
+                const TargetInfo &TI = getTarget();
+
+                // If there is a higher priority callee, we can't do the
+                // optimization at all, as it would be a valid choice at
+                // runtime.
+                if (TargetMVPriority(TI, CalleeMVRO) >
+                    TargetMVPriority(TI, CallerMVRO)) {
+                  HasHigherPriorityCallee = true;
+                  return;
+                }
+
+                // FIXME: we could allow a lower-priority match when the
+                // features are a proper subset. But for now, to keep things
+                // simpler, we only care about a precise match.
+                if (TargetMVPriority(TI, CalleeMVRO) <
+                    TargetMVPriority(TI, CallerMVRO))
+                  return;
+
+                if (llvm::Constant *Func = CGM.GetGlobalValue(MangledName)) {
+                  FoundMatchingCallee = Func;
+                  return;
+                }
+
+                if (CurFD->isDefined()) {
+                  // FIXME: not sure how to get the address
+                } else {
+                  const CGFunctionInfo &FI =
+                      getTypes().arrangeGlobalDeclaration(FD);
+                  llvm::FunctionType *Ty = getTypes().GetFunctionType(FI);
+                  FoundMatchingCallee =
+                      CGM.GetAddrOfFunction(CurGD, Ty, /*ForVTable=*/false,
+                                            /*DontDefer=*/false, ForDefinition);
+                }
+              });
+
+          if (FoundMatchingCallee && !HasHigherPriorityCallee)
+            CalleePtr = FoundMatchingCallee;
+        }
+
   // If we're using inalloca, set up that argument.
   if (ArgMemory.isValid()) {
     llvm::Value *Arg = ArgMemory.getPointer();
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
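
The decision rule in the hunk above can be restated as a small standalone sketch. This is not the patch's code: `Version` and `pickDirectCallee` are hypothetical names, and priorities are plain integers standing in for whatever `TargetMVPriority` returns. As in the patch, equal priority is used as the proxy for a precise feature match.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one version of a multi-versioned callee.
struct Version {
  std::string Symbol;
  unsigned Priority; // higher priority wins in the resolver
};

// Returns the symbol to call directly, or std::nullopt to keep calling
// through the resolver.
std::optional<std::string>
pickDirectCallee(unsigned CallerPriority, const std::vector<Version> &Callees) {
  std::optional<std::string> Match;
  for (const Version &V : Callees) {
    if (V.Priority > CallerPriority)
      return std::nullopt; // a better version could be chosen at runtime
    if (V.Priority == CallerPriority)
      Match = V.Symbol;    // same priority: treated as the precise match
    // Lower-priority versions are ignored in this conservative form.
  }
  return Match;
}
```

With versions at priorities 0, 10, and 20, only a caller at priority 20 gets a direct call; a caller at 10 is blocked by the higher-priority version, and a caller with no equal-priority callee falls back to the resolver.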

[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Erich Keane via cfe-commits

https://github.com/erichkeane commented:

My immediate response is that this sounds like something that OPT should be 
doing here, not us.  We typically do NOT do this sort of thing in the CFE, and do 
not want to do opt-type stuff in the CFE. 

Is there a good reason that this isn't a part of the inliner?

https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs updated 
https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs updated 
https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-31 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs updated 
https://github.com/llvm/llvm-project/pull/80093


[clang] [clang][FMV] Direct-call FMV callees from FMV callers (PR #80093)

2024-01-30 Thread Jon Roelofs via cfe-commits

https://github.com/jroelofs edited 
https://github.com/llvm/llvm-project/pull/80093

