[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-25 Thread Kristof Beyls via llvm-branch-commits


@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.

kbeyls wrote:

Do you happen to know whether it would be a good idea to adapt what BOLT 
overall considers as tail calls to also include the cases that this function 
adds in addition?
Basically, does there need to be a separate "definition" of what is considered 
a tail call, only for the pauth analysis, versus the "definition" of a tail 
call in all other places in BOLT?

If there is a good reason why there has to be a difference, maybe it makes 
sense to explain in this comment why that is the case?

https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-25 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 27d75c4248864d12381aac765674106f573de923 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2eadaf15d3a65..0a3948e2e278e 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1478,6 +1555,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-25 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 27d75c4248864d12381aac765674106f573de923 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2eadaf15d3a65..0a3948e2e278e 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1478,6 +1555,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-24 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Thanks, mostly looks good, I only have 1 nitpicky comment about the underlying 
reason why the pauth analyzer should have a slightly different "definition" of 
what is considered a tail call versus BOLT overall.

https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-23 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 5d067891f23b106a78dc2f61954fac538ddbd280 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2eadaf15d3a65..0a3948e2e278e 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1478,6 +1555,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-23 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 5d067891f23b106a78dc2f61954fac538ddbd280 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2eadaf15d3a65..0a3948e2e278e 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1478,6 +1555,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-23 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 44f7d3dfc6279e80c840f90eb11b0a9e73556def Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 20c8c2511428c..694d9b5504399 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1476,6 +1553,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-23 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 44f7d3dfc6279e80c840f90eb11b0a9e73556def Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 20c8c2511428c..694d9b5504399 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1476,6 +1553,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-19 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From a0c9617031dd31157686a519743048e12c01ac97 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 05309a47aba40..b5b46390d4586 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1473,6 +1550,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-19 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From a0c9617031dd31157686a519743048e12c01ac97 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 05309a47aba40..b5b46390d4586 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1473,6 +1550,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-16 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From ec07ae02a991c5730fa0529e2479eadae4886211 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 05309a47aba40..b5b46390d4586 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1473,6 +1550,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-06-16 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From ec07ae02a991c5730fa0529e2479eadae4886211 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 05309a47aba40..b5b46390d4586 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1319,6 +1319,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1473,6 +1550,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-28 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From a75cab7070e2167a4be39a4467895a2d1622c4e8 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  80 +++
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 2 files changed, 677 insertions(+)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 6327a2da54d5b..4c7ae3c880db4 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -1307,6 +1307,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those registers used by SrcSafetyAnalysis to initialize the
+  // set of trusted registers on function entry.
+  // Thus, this function basically checks that the precondition expected to be
+  // imposed by a function call instruction (which is hardcoded into the 
target-
+  // specific getTrustedLiveInRegs() function) is also respected on tail calls.
+  SmallVector RegsToCheck = BC.MIB->getTrustedLiveInRegs();
+  LLVM_DEBUG({
+traceInst(BC, "Found tail call inst", Inst);
+traceRegMask(BC, "Trusted regs", S.TrustedRegs);
+  });
+
+  // In musl on AArch64, the _start function sets LR to zero and calls the next
+  // stage initialization function at the end, something along these lines:
+  //
+  //   _start:
+  // mov x30, #0
+  // ; ... other initialization ...
+  // b   _start_c ; performs "exit" system call at some point
+  //
+  // As this would produce a false positive for every executable linked with
+  // such libc, ignore tail calls performed by ELF entry function.
+  if (BC.StartFunctionAddress &&
+  *BC.StartFunctionAddress == Inst.getFunction()->getAddress()) {
+LLVM_DEBUG({ dbgs() << "  Skipping tail call in ELF entry function.\n"; });
+return std::nullopt;
+  }
+
+  // Returns at most one report per instruction - this is probably OK...
+  for (auto Reg : RegsToCheck)
+if (!S.TrustedRegs[Reg])
+  return make_gadget_report(UntrustedLRKind, Inst, Reg);
+
+  return std::nullopt;
+}
+
 static std::optional>
 shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
const SrcState &S) {
@@ -1462,6 +1539,9 @@ void FunctionAnalysisContext::findUnsafeUses(
 if (PacRetGadgetsOnly)
   return;
 
+if (auto Report = shouldReportUnsafeTailCall(BC, BF, Inst, S))
+  Reports.push_back(*Report);
+
 if (auto Report = shouldReportCallGadget(BC, Inst, S))

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-28 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-28 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 5a680425d70dd55179e27a613c9ef29cfaf72086 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2aacb38ee19a9..4c7ae3c880db4 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -737,19 +737,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -759,6 +754,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -773,7 +769,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1311,6 +1307,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-28 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 5a680425d70dd55179e27a613c9ef29cfaf72086 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 2aacb38ee19a9..4c7ae3c880db4 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -737,19 +737,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -759,6 +754,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -773,7 +769,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1311,6 +1307,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-27 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 11094a446c4b193d5b5e3023cdd01de0e619 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index fc6120e922baa..b539f1a211d4f 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -737,19 +737,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -759,6 +754,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -773,7 +769,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1302,6 +1298,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-27 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 11094a446c4b193d5b5e3023cdd01de0e619 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index fc6120e922baa..b539f1a211d4f 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -737,19 +737,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -759,6 +754,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -773,7 +769,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1302,6 +1298,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-26 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From cfc7f4dac44c66f125c8d514c3d0a26d36dc0779 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c34ca620bb50b..49087eab3ce9a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -736,19 +736,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -758,6 +753,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -772,7 +768,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1297,6 +1293,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-26 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From cfc7f4dac44c66f125c8d514c3d0a26d36dc0779 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c34ca620bb50b..49087eab3ce9a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -736,19 +736,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -758,6 +753,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -772,7 +768,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1297,6 +1293,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-26 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 41914bd7fd1eefb7cba31c0c825cc119d01b35b6 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c34ca620bb50b..49087eab3ce9a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -736,19 +736,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -758,6 +753,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -772,7 +768,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1297,6 +1293,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-26 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 41914bd7fd1eefb7cba31c0c825cc119d01b35b6 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index c34ca620bb50b..49087eab3ce9a 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -736,19 +736,14 @@ template  class CFGUnawareAnalysis {
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis,
 public CFGUnawareAnalysis {
   using SrcSafetyAnalysis::BC;
   BinaryFunction &BF;
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -758,6 +753,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -772,7 +768,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis,
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Attach the state *before* this instruction executes.
@@ -1297,6 +1293,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // seen as a reasonable target-independent _approximation_ of "the LR", these
+  // are *exactly* those regis

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-20 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 0e859b759886700bcf551459dd9f5c4b22fbad9a Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  /

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-20 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 0e859b759886700bcf551459dd9f5c4b22fbad9a Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  /

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-16 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 7ed0a41d2162a97103cfa7c3eed10543cacc89df Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  /

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits

atrosinenko wrote:

As this PR improved the handling of leaf functions without CFG information, a 
few more test cases can be cleaned up - updated in 
1377286872187ed191b02bd9632842f4db6dc367.

https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH 1/2] [BOLT] Gadget scanner: detect untrusted LR before tail
 call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  /

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-14 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From d20efbedd0be34942e8e28cc91eefeb28d1b8108 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp|  94 ++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 706 insertions(+), 46 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 7a5d47a3ff812..dfb71575b2b39 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -701,8 +701,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -713,12 +714,6 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
-  /// Creates a state with all registers marked unsafe (not to be confused
-  /// with empty state).
-  SrcState createUnsafeState() const {
-return SrcState(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-  }
-
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -729,6 +724,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -743,7 +739,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1288,6 +1284,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition from BinaryContext::printInstruction.
+  bool IsUnknownControlFlow =
+  BC.MIB->isIndirectBranch(Inst) && !BC.MIB->getJumpTable(Inst);
+
+  if (BF.hasCFG() && IsUnknownControlFlow)
+return true;
+
+  return false;
+}
+
+static std::optional>
+shouldReportUnsafeTailCall(const BinaryContext &BC, const BinaryFunction &BF,
+   const MCInstReference &Inst, const SrcState &S) {
+  static const GadgetKind UntrustedLRKind(
+  "untrusted link register found before tail call");
+
+  if (!shouldAnalyzeTailCallInst(BC, BF, Inst))
+return std::nullopt;
+
+  // Not only the set of registers returned by getTrustedLiveInRegs() can be
+  // see

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 24d83f3847c8492311b322281938e78ddd004394 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index b2c922065da57..bb4aa7c615511 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -659,8 +659,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -671,6 +672,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -681,6 +706,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -695,7 +721,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1240,6 +1266,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-05-05 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 24d83f3847c8492311b322281938e78ddd004394 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index b2c922065da57..bb4aa7c615511 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -659,8 +659,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -671,6 +672,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -681,6 +706,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -695,7 +721,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1240,6 +1266,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-30 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 33158bcada96af1d201a5cb5342f0256f30894cd Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index d4282daad648b..fab92947f6d67 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-30 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 33158bcada96af1d201a5cb5342f0256f30894cd Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index d4282daad648b..fab92947f6d67 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-30 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 7d1cbb97817e615a5fa8841d72a18643b5722114 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index d4282daad648b..fab92947f6d67 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-30 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 7d1cbb97817e615a5fa8841d72a18643b5722114 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index d4282daad648b..fab92947f6d67 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-29 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 2b70037a627edc2d6fee8baaa3954ba0317dc13a Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 917649731884e..47d3d9b1f24df 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-29 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 2b70037a627edc2d6fee8baaa3954ba0317dc13a Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 917649731884e..47d3d9b1f24df 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-29 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 7c7922d617558f5e4a5860015c10b6be32a18596 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 917649731884e..47d3d9b1f24df 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-29 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko updated 
https://github.com/llvm/llvm-project/pull/137224

>From 7c7922d617558f5e4a5860015c10b6be32a18596 Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 917649731884e..47d3d9b1f24df 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -660,8 +660,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -672,6 +673,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -682,6 +707,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -696,7 +722,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1233,6 +1259,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_gadget_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return t

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-24 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-bolt

Author: Anatoly Trosinenko (atrosinenko)


Changes

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, detection of this one involves some amount
of guessing which branch instructions should be checked as tail calls.

---

Patch is 41.46 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/137224.diff


4 Files Affected:

- (modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+109-3) 
- (modified) bolt/test/binary-analysis/AArch64/gs-pacret-autiasp.s (+22-9) 
- (modified) bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s (+2-28) 
- (added) bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s (+597) 


``diff
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 0ce9f51c44af4..d67f10a311396 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -655,8 +655,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -667,6 +668,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -677,6 +702,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -691,7 +717,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1226,6 +1252,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(static_cast(Inst).getOpcode());
+  // Tail call should be a branch (but not necessarily an indirect one).
+  if (!Desc.isBranch())
+return false;
+
+  // Always analyze the branches already marked as tail calls by BOLT.
+  if (BC.MIB->isTailCall(Inst))
+return true;
+
+  // Try to also check the branches marked as "UNKNOWN CONTROL FLOW" - the
+  // below is a simplified condition 

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko ready_for_review 
https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko edited 
https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

https://github.com/atrosinenko created 
https://github.com/llvm/llvm-project/pull/137224

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.

>From 5ccb98144a2625908c6abf3aab9fb6d0c226d80b Mon Sep 17 00:00:00 2001
From: Anatoly Trosinenko 
Date: Tue, 22 Apr 2025 21:43:14 +0300
Subject: [PATCH] [BOLT] Gadget scanner: detect untrusted LR before tail call

Implement the detection of tail calls performed with untrusted link
register, which violates the assumption made on entry to every function.

Unlike other pauth gadgets, this one involves some amount of guessing
which branch instructions should be checked as tail calls.
---
 bolt/lib/Passes/PAuthGadgetScanner.cpp| 112 +++-
 .../AArch64/gs-pacret-autiasp.s   |  31 +-
 .../AArch64/gs-pauth-debug-output.s   |  30 +-
 .../AArch64/gs-pauth-tail-calls.s | 597 ++
 4 files changed, 730 insertions(+), 40 deletions(-)
 create mode 100644 bolt/test/binary-analysis/AArch64/gs-pauth-tail-calls.s

diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp 
b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index 0ce9f51c44af4..d67f10a311396 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -655,8 +655,9 @@ class DataflowSrcSafetyAnalysis
 //
 // Then, a function can be split into a number of disjoint contiguous sequences
 // of instructions without labels in between. These sequences can be processed
-// the same way basic blocks are processed by data-flow analysis, assuming
-// pessimistically that all registers are unsafe at the start of each sequence.
+// the same way basic blocks are processed by data-flow analysis, with the same
+// pessimistic estimation of the initial state at the start of each sequence
+// (except the first instruction of the function).
 class CFGUnawareSrcSafetyAnalysis : public SrcSafetyAnalysis {
   BinaryFunction &BF;
   MCPlusBuilder::AllocatorIdTy AllocId;
@@ -667,6 +668,30 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   BC.MIB->removeAnnotation(I.second, StateAnnotationIndex);
   }
 
+  /// Compute a reasonably pessimistic estimation of the register state when
+  /// the previous instruction is not known for sure. Take the set of registers
+  /// which are trusted at function entry and remove all registers that can be
+  /// clobbered inside this function.
+  SrcState computePessimisticState(BinaryFunction &BF) {
+BitVector ClobberedRegs(NumRegs);
+for (auto &I : BF.instrs()) {
+  MCInst &Inst = I.second;
+  BC.MIB->getClobberedRegs(Inst, ClobberedRegs);
+
+  // If this is a call instruction, no register is safe anymore, unless
+  // it is a tail call. Ignore tail calls for the purpose of estimating the
+  // worst-case scenario, assuming no instructions are executed in the
+  // caller after this point anyway.
+  if (BC.MIB->isCall(Inst) && !BC.MIB->isTailCall(Inst))
+ClobberedRegs.set();
+}
+
+SrcState S = createEntryState();
+S.SafeToDerefRegs.reset(ClobberedRegs);
+S.TrustedRegs.reset(ClobberedRegs);
+return S;
+  }
+
 public:
   CFGUnawareSrcSafetyAnalysis(BinaryFunction &BF,
   MCPlusBuilder::AllocatorIdTy AllocId,
@@ -677,6 +702,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
   }
 
   void run() override {
+const SrcState DefaultState = computePessimisticState(BF);
 SrcState S = createEntryState();
 for (auto &I : BF.instrs()) {
   MCInst &Inst = I.second;
@@ -691,7 +717,7 @@ class CFGUnawareSrcSafetyAnalysis : public 
SrcSafetyAnalysis {
 LLVM_DEBUG({
   traceInst(BC, "Due to label, resetting the state before", Inst);
 });
-S = createUnsafeState();
+S = DefaultState;
   }
 
   // Check if we need to remove an old annotation (this is the case if
@@ -1226,6 +1252,83 @@ shouldReportReturnGadget(const BinaryContext &BC, const 
MCInstReference &Inst,
   return make_report(RetKind, Inst, *RetReg);
 }
 
+/// While BOLT already marks some of the branch instructions as tail calls,
+/// this function tries to improve the coverage by including less obvious cases
+/// when it is possible to do without introducing too many false positives.
+static bool shouldAnalyzeTailCallInst(const BinaryContext &BC,
+  const BinaryFunction &BF,
+  const MCInstReference &Inst) {
+  // Some BC.MIB->isXYZ(Inst) methods simply delegate to MCInstrDesc::isXYZ()
+  // (such as isBranch at the time of writing this comment), some don't (such
+  // as isCall). For that reason, call MCInstrDesc's methods explicitly when
+  // it is important.
+  const MCInstrDesc &Desc =
+  BC.MII->get(stati

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect untrusted LR before tail call (PR #137224)

2025-04-24 Thread Anatoly Trosinenko via llvm-branch-commits

atrosinenko wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/137224?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#137224** https://app.graphite.dev/github/pr/llvm/llvm-project/137224?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/137224?utm_source=stack-comment-view-in-graphite";
 target="_blank">(View in Graphite)
* **#136183** https://app.graphite.dev/github/pr/llvm/llvm-project/136183?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#136151** https://app.graphite.dev/github/pr/llvm/llvm-project/136151?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#135663** https://app.graphite.dev/github/pr/llvm/llvm-project/135663?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#136147** https://app.graphite.dev/github/pr/llvm/llvm-project/136147?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#135662** https://app.graphite.dev/github/pr/llvm/llvm-project/135662?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#135661** https://app.graphite.dev/github/pr/llvm/llvm-project/135661?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#134146** https://app.graphite.dev/github/pr/llvm/llvm-project/134146?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#133461** https://app.graphite.dev/github/pr/llvm/llvm-project/133461?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#135073** https://app.graphite.dev/github/pr/llvm/llvm-project/135073?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`




This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn 
more about https://stacking.dev/?utm_source=stack-comment";>stacking.


https://github.com/llvm/llvm-project/pull/137224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits