[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov ready_for_review 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)


Changes

This PR adds *imputing the fall-throughs for the top-of-stack (TOS)
entries*, addressing fall-through undercounting for branches that appear
on the top of the branch stack more often than they should.

# Branch stack sampling
Branch stack sampling (LBR, BRBE) provides a stack of (branch, target)
entries for the last N taken branches:
- Intel SKL+: 32,
- AMD BRS/LBRv2: 16,
- ARM BRBE: 8/16/32/64 depending on the implementation,
- arbitrary for trace-synthesized branch stacks (ARM ETM, Intel PT).

# DataAggregator representation/traces
The internal representation of a unit of the branch profile is a
_trace_, a triple of branch source, branch target/fall-through start,
and fall-through end/next branch:
https://github.com/llvm/llvm-project/blob/89c61449e60703e42c5f274ed41a21f3bc386cf0/bolt/include/bolt/Profile/DataAggregator.h#L116-L118

Branch source and branch target match the branch stack entry.
Fall-through end is the branch start for the next stack entry.
Naturally, the top-of-stack (TOS) entry doesn't have a fall-through end.
Internally, this case is represented with a magic value `BR_ONLY` in
`To` field:
https://github.com/llvm/llvm-project/blob/89c61449e60703e42c5f274ed41a21f3bc386cf0/bolt/include/bolt/Profile/DataAggregator.h#L111

# Traces and fall-throughs
Traces combine taken branches and fall-throughs into one record.
This helps achieve efficiencies throughout profile pre-aggregation,
profile parsing, and most critically for BOLT, enables distinguishing
*call continuation* fall-throughs from the rest.

This information is used to set the weight for call to call continuation
fall-through CFG edge, in case they are in separate basic blocks. The
effect of this information is non-trivial, providing a measurable CPU 
win over separate branches and non-differentiated fall-throughs.

# Motivation: top-of-stack biased branches
Internal studies indicate that Intel LBR sampling is not completely
fair.

Intuitively, for any branch in the program, the probability for it to be
captured in a branch stack should be proportional to its execution
frequency, and if captured, it should appear in any given entry with
equal probability of ~1/N, N being the size of the branch stack.
However, the observed probability distribution of the percentage of
times an entry appears on the top of the stack does not follow a
binomial distribution.

Relaxing the assumption of a constant probability of 1/N and modeling
the  probability as a beta distribution, the resulting distribution more
closely resembled the actual observed distribution.

This means that for branches that are biased to be at the top of the
stack, the fall-through path is undercounted in the profile.

# Implementation
TOS entries are easily identifiable with a `BR_ONLY` value in `Trace.To`
field. To get an expected fall-through length for such traces:
- group traces by the first two fields (`Branch`, `From`),
- using traces that do have fall-through end, compute *weighed average
  fall-through length*,
- use that for the trace without a fall-through end.

Since traces are sorted by their three fields, the grouping is done
naturally, and valid fall-through traces come first within a group, and
`BR_ONLY` trace comes last in the group. This allows us to compute the
weighted average in a single pass over traces.

Care is taken in two special cases:
- Fall-through start is an unconditional jump (unconditional branch,
  call, or return). In this case, the fall-through length is set to
  zero. This still allows extending it to cover call to continuation
  fall-through CFG edge.
- Traces in external code (DSO, JIT) are skipped.

# Effect
For a large binary, this change improves the profile quality metrics:
- no significant increase in mismatching traces (primary raw profile
  quality metric),
- increases the profile density: 599.4 -> 620.6 (density is computed
  using the fall-through ranges),
- CG flow imbalance: 11.56% -> 8.28%,
- CFG flow imbalance:
  - weighted: 3.63% -> 3.34%,
  - worst: 17.13% -> 22.05% (the only regression).

---
Full diff: https://github.com/llvm/llvm-project/pull/145258.diff


3 Files Affected:

- (modified) bolt/include/bolt/Core/MCPlusBuilder.h (+6) 
- (modified) bolt/include/bolt/Profile/DataAggregator.h (+4) 
- (modified) bolt/lib/Profile/DataAggregator.cpp (+62) 


``diff
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h 
b/bolt/include/bolt/Core/MCPlusBuilder.h
index 804100db80793..50c5f5bcb9303 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -430,6 +430,12 @@ class MCPlusBuilder {
 return Analysis->isIndirectBranch(Inst);
   }
 
+  bool IsUnconditionalJump(const MCInst &Inst) const {
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+// barrier captures returns and unconditional branches
+return Desc.isCall() || Desc.isBarrier();
+  }
+
   /// Returns true if the instruc

[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-23 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/145258
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Impute missing trace fall-through (PR #145258)

2025-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov created 
https://github.com/llvm/llvm-project/pull/145258

None


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits