[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-28 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2245103 , @nikic wrote:

> I have no familiarity with BFI, so possibly stupid question: There is already 
> some similar handling as part of BFIImpl here: 
> https://github.com/llvm/llvm-project/blob/0f14b2e6cbb54c84ed3b00b0db521f5ce2d1e3f2/llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h#L1043-L1058
>  What is the difference to that / why are both needed?

Well this is a funny situation: looks like someone else saw the same problem we 
observed and pushed the fix also using VH callbacks (D75341 
). Both of us coming to the same solution here 
is a good sign that its a good fit.

I'm glad you pointed this out! Looking at my diff with both callbacks 
incorporated the node gets erase() called twice but since the second call isn't 
in the DenseMap erase becomes a no-op. This explains why both changes didn't 
catastrophically collide against each other.

Given all that, the changes here to pass along the BFI information in the loop 
passes and allow usage in LICM still seems meaningful enough to commit. I've 
removed the redundant call-back added. Also updating the description to reflect 
the latest updates.




Comment at: llvm/lib/Transforms/Scalar/LoopDistribute.cpp:1062
+LoopStandardAnalysisResults AR = {AA,  AC,  DT,  LI, SE,
+  TLI, TTI, nullptr, nullptr};
 return LAM.getResult(L, AR);

nikic wrote:
> Huh, surprised that clang-format allows this.
I also thought that was a mis-format at first but the combination of rules 
turns out to prefer this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-28 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288719.
modimo added a comment.

Remove redundant VH callback as @nikic helpfully pointed out!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-N

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-28 Thread Nikita Popov via Phabricator via cfe-commits
nikic added a comment.
Herald added a subscriber: danielkiss.

New compile-time numbers: 
https://llvm-compile-time-tracker.com/compare.php?from=d7c119d89c5f6d0789cfd0a139c80e23912c0bb0&to=e0a1a6cac1b982023f8ceba8285d1ee7bc96bd32&stat=instructions
 The regression is now reduced to 0.2%. I assume that's the overhead of those 
CallbackVHs.

I have no familiarity with BFI, so possibly stupid question: There is already 
some similar handling as part of BFIImpl here: 
https://github.com/llvm/llvm-project/blob/0f14b2e6cbb54c84ed3b00b0db521f5ce2d1e3f2/llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h#L1043-L1058
 What is the difference to that / why are both needed?




Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:43
+  class BFICallbackVH final : public CallbackVH {
+std::weak_ptr BFI;
+const BasicBlock *BB;

Is the shared_ptr/weak_ptr usage here necessary? This type of map deletion 
CallbackVH is a common pattern, but this is the first time I see it with 
weak_ptr.



Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:44
+std::weak_ptr BFI;
+const BasicBlock *BB;
+

It should not be necessary to explicitly store `BB` here, it is already part of 
the value handle (you can access it via `*this` for example).



Comment at: llvm/include/llvm/Analysis/BlockFrequencyInfo.h:48
+BFICallbackVH(std::shared_ptr BFI, const BasicBlock *BB);
+virtual ~BFICallbackVH() = default;
+

I don't believe this virtual dtor is needed (class is final). For that matter, 
the method below also don't need to marked virtual.



Comment at: llvm/lib/Analysis/BlockFrequencyInfo.cpp:212
+  for (const auto *BB : RegisteredBlocks)
+BlockWatchers.emplace_back(BFICallbackVH(BFI, BB));
+

May want to clear BlockWatchers in `BlockFrequencyInfo::releaseMemory()`?



Comment at: llvm/lib/Transforms/Scalar/LoopDistribute.cpp:1062
+LoopStandardAnalysisResults AR = {AA,  AC,  DT,  LI, SE,
+  TLI, TTI, nullptr, nullptr};
 return LAM.getResult(L, AR);

Huh, surprised that clang-format allows this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D86156#2243393 , @asbirlea wrote:

> Diff looks reasonable at this point. Thank you for the patch!
> Please wait on @nikic for compile-time impact or additional feedback.
>
> Just out of curiosity, in D65060 , it was 
> mentioned that using BFI got you ~7% improvement for a CPU related metric 
> (@wenlei). Are you seeing benefits from this patch? And which pass manager 
> are you using?

Thanks for quick review. We got the perf improvement from preventing a bad 
hoisting in a critical loop using BFI. Originally it was with legacy pass 
manager, hence the invalidation issue with new pass manager wasn't caught from 
our usage. This change was made as an internal patch for perf parity when we 
moved to new pass manager, and we've been using it for a while now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Alina Sbirlea via Phabricator via cfe-commits
asbirlea accepted this revision.
asbirlea added a comment.

Diff looks reasonable at this point. Thank you for the patch!
Please wait on @nikic for compile-time impact or additional feedback.

Just out of curiosity, in D65060 , it was 
mentioned that using BFI got you ~7% improvement for a CPU related metric 
(@wenlei). Are you seeing benefits from this patch? And which pass manager are 
you using?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:273
   : nullptr;
+BlockFrequencyInfo *BFI = UseBlockFrequencyInfo
+  ? (&AM.getResult(F))

asbirlea wrote:
> Add `&& F.hasProfileData()` check here, in the NPM as well?
Makes sense, added.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288477.
modimo added a comment.

Condition usage of BFI to PGO in newPM as well


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Alina Sbirlea via Phabricator via cfe-commits
asbirlea added inline comments.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:273
   : nullptr;
+BlockFrequencyInfo *BFI = UseBlockFrequencyInfo
+  ? (&AM.getResult(F))

Add `&& F.hasProfileData()` check here, in the NPM as well?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2242872 , @asbirlea wrote:

> As a general note, it may make sense to include BFI in the set of loop passes 
> always preserved (`getLoopPassPreservedAnalyses()`), if its nature is to 
> always be preserved (with some potential info loss) due to the callbacks 
> deleting blocks. But since we're only looking at LICM effect for now, this 
> can be a follow up when/if needed.

Certainly, that makes a lot of sense and makes it easier for any other loop 
optimization to take advantage of this data. I definitely want to make that 
happen, process-wise agreed that following up makes sense instead of tacking 
this on in addition.




Comment at: llvm/lib/Transforms/Scalar/LICM.cpp:220
   
&getAnalysis().getDomTree(),
+  &getAnalysis().getBFI(),
   &getAnalysis().getTLI(

nikic wrote:
> I believe that to make this actually lazy the getBFI() call needs to be 
> guarded by some other check for presence of profiling data, otherwise it will 
> be computed unconditionally at this point. Typically something like 
> F.hasProfileData() or PSI.hasProfileSummary().
Appreciate the diligence here-your website is great! 

Your assessment is correct, the ".getBFI()" call for lazy will calculate BFI. 
I've guarded it under hasProfileData now.

I see in the "about" section that I can use your setup to test TP changes 
myself, here's my fork of llvm-project if that option is still available: 
https://github.com/modiking/llvm-project



Comment at: llvm/test/Other/opt-O2-pipeline.ll:281
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis

asbirlea wrote:
> Mark LICM to preserve these passes so they get moved above LICM rather than 
> recomputed here (same as they are preserved in unswitch).
Marked, these are no longer calculated again.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288445.
modimo added a comment.

only use BFI when profile is enabled, have LICM mark BFI as preserved


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Nikita Popov via Phabricator via cfe-commits
nikic added a comment.

Current compile-time numbers show a 1% geomean regression: 
https://llvm-compile-time-tracker.com/compare.php?from=d7c119d89c5f6d0789cfd0a139c80e23912c0bb0&to=127615d90c7b4424ec83f5a8ab4256d08f7a8362&stat=instructions

I've left a comment inline with a possible cause.




Comment at: llvm/lib/Transforms/Scalar/LICM.cpp:220
   
&getAnalysis().getDomTree(),
+  &getAnalysis().getBFI(),
   &getAnalysis().getTLI(

I believe that to make this actually lazy the getBFI() call needs to be guarded 
by some other check for presence of profiling data, otherwise it will be 
computed unconditionally at this point. Typically something like 
F.hasProfileData() or PSI.hasProfileSummary().


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Alina Sbirlea via Phabricator via cfe-commits
asbirlea added a comment.

Your understanding is correct. I only have one more comment on preserving 
passes in LICM too.

As a general note, it may make sense to include BFI in the set of loop passes 
always preserved (`getLoopPassPreservedAnalyses()`), if its nature is to always 
be preserved (with some potential info loss) due to the callbacks deleting 
blocks. But since we're only looking at LICM effect for now, this can be a 
follow up when/if needed.

Please allow some time for nikic@ to take a look too.




Comment at: llvm/test/Other/opt-O2-pipeline.ll:281
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis

Mark LICM to preserve these passes so they get moved above LICM rather than 
recomputed here (same as they are preserved in unswitch).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/lib/Passes/PassBuilder.cpp:522
   FPM.addPass(createFunctionToLoopPassAdaptor(
-  std::move(LPM2), /*UseMemorySSA=*/false, DebugLogging));
+  std::move(LPM2), /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/true,
+  DebugLogging));

asbirlea wrote:
> It doesn't look like `UseBlockFrequencyInfo` is used for LPM2 here and below. 
> Would it make sense to set it to `false` at this point?
> 
Good catch, changed to false for LPM2.



Comment at: llvm/test/Other/pass-pipelines.ll:57
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under 
separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and

asbirlea wrote:
> Add `; CHECK-O2: Loop Pass Manager` along with this comment.
> 
> 
> Please consider if splitting the loop pass pipeline has any effects on 
> optimizations. This is for the legacyPM only, so those who switched to the 
> newPM will not be affected.
> The solution may be to mark the analyses preserved in loop unswitch.
To check my understanding here, with the split loop pass pipeline the phases 
look like the following:

```
Lazy Branch Probability Analysis
Lazy Block Frequency Analysis
Loop Pass Manager
  Loop Invariant Code Motion
Loop Pass Manager
  Unswitch loops
```
Walking through the code of Loop Pass Manager by itself it doesn't re-calculate 
or produce additional analysis. Thus the difference appears to arise as follows:
1. combined loop pass: loop opts are run one after the other per loop, so if 
you have loops L1 and L2 the order would be L1(LICM, Unswitch) -> L2 (LICM, 
Unswitch)
2. split loop pass: in this case it would be L1(LICM)->L2(LICM) into 
L1(Unswitch)->L2(Unswitch)

My qualitative assessment is that the impact here is quite minimal. The main 
scenario I can think of where differences could occur is having all LICM 
completed early can change the costing in unswitching for nested loops. 

Unswitching only occurs once in the legacyPM and always after LICM so it seems 
fairly tame to build in this cross-pass dependence by marking lazy BFI/BPI 
preserved. I'd like to know if this level of cross-pass dependence is 
potentially an issue and if there's precedence for doing so if you have that 
context.

I've made the changes in the latest commit so that they exist in the same pass 
pipeline again by marking lazy BFI/FPI as preserved in unswitching. The value 
is definitely there to prevent unwanted side-effects.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288172.
modimo added a comment.

Remove usage need for BFI in LPM2 and set unswitching to preserve lazy BPI/BFI 
so it can remain in the same loop pass as LICM


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,6 +260,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Loop Pass Manager
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,6 +279,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; CH

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Alina Sbirlea via Phabricator via cfe-commits
asbirlea added inline comments.



Comment at: llvm/lib/Passes/PassBuilder.cpp:522
   FPM.addPass(createFunctionToLoopPassAdaptor(
-  std::move(LPM2), /*UseMemorySSA=*/false, DebugLogging));
+  std::move(LPM2), /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/true,
+  DebugLogging));

It doesn't look like `UseBlockFrequencyInfo` is used for LPM2 here and below. 
Would it make sense to set it to `false` at this point?




Comment at: llvm/test/Other/pass-pipelines.ll:57
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under 
separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and

Add `; CHECK-O2: Loop Pass Manager` along with this comment.


Please consider if splitting the loop pass pipeline has any effects on 
optimizations. This is for the legacyPM only, so those who switched to the 
newPM will not be affected.
The solution may be to mark the analyses preserved in loop unswitch.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2231710 , @nikic wrote:

> This change adds three PDT calculations to the standard pipeline. Please try 
> to avoid the PDT calculations if PGO is not used, possibly by using LazyBPI.

Good catch, changed to use LazyBFI which eliminates the extra PDT calculations 
unless this is accessed by PGO


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288125.
modimo added a comment.

Change to LazyBFI for legacy pass manager to prevent rebuilding the 
post-dominator tree


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,8 +97,11 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
+; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Unswitch loops
 ; CHECK-NEXT: Simplify the CFG
 ; CHECK-NEXT: Dominator Tree Construction
@@ -154,6 +157,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,6 +261,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Loop Pass Manager
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,8 +116,11 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-22 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> If only LICM benefits from it, I'd consider creating a BFI instance for the 
> duration of the LICM pass

Currently only LICM uses BFI, but I think it'd be beneficial for other loop 
passes to be able to use BFI too. BFI not accessible to loop passes seems to be 
a non-trivial limitation (comparing to other compilers). I would think it's not 
that other loop passes won't benefit from BFI, but because of the constraints 
imposed by pass manager, we couldn't reap any potential benefit.

I'm hoping that we could find a way to lift that limitation (and of course 
without breaking the invariants we need to keep for pass manager). This change 
is an attempt towards that. It's not ideal, as it only covers invalidation (for 
removal of blocks), but not incremental update for BFI to reflect arbitrary CFG 
changes, which can be really hard. Still, we think it's a practical solution to 
expose BFI to loop passes.

> or only enabling it in loop pipelines that have LICM in them (see example 
> comment in one of the tests).

Makes sense for now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-21 Thread Nikita Popov via Phabricator via cfe-commits
nikic requested changes to this revision.
nikic added a comment.
This revision now requires changes to proceed.

This change adds three PDT calculations to the standard pipeline. Please try to 
avoid the PDT calculations if PGO is not used, possibly by using LazyBPI.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-21 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:408
 FunctionToLoopPassAdaptor
 createFunctionToLoopPassAdaptor(LoopPassT Pass, bool UseMemorySSA = false,
+bool UseBlockFrequencyInfo = false,

@asbirlea Assuming this change matches expectations of making BFI on present 
when LICM or loop passes contain LICM I'm looking at the users of this and they 
seem to fall into 2 categories:
1. Those that specify the optional flags
2. Those that only pass in the LoopPassT

I found as I was updating with a new flag that due to C++ behavior a place that 
wasn't updated to have all 3 optional parameters will place DebugLogging into 
UseBlockFrequencyInfo which is a nasty error. I think a way around it is to 
enforce overloaded functions with either 0 additional parameters or having 
every "optional" parameter specified so it'll be a build time error for the 
previous scenario rather than a runtime issue.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:278
AM.getResult(F),
+   
&AM.getResult(F),
MSSA};

asbirlea wrote:
> This should not be unconditional. See MSSA approach.
Fixed it up to resemble MSSA



Comment at: llvm/test/Transforms/LoopRotate/pr35210.ll:51
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f

asbirlea wrote:
> e.g. there's no use of creating these for LoopRotate.
Changed up so LoopRotate no longer has a dependency on BFI


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-21 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 287152.
modimo added a comment.
Herald added subscribers: lxfind, nikic.

@asbirlea Thanks for taking a look!

I updated BFI to resemble MSSA as recommended which removed the BFI calculation 
unless LICM is invoked. Also removed the spurious diffs from formatting.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -96,9 +96,13 @@
 ; CHECK-NEXT: Scalar Evolution Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
+; CHECK-NEXT: Post-Dominator Tree Construction
+; CHECK-NEXT: Branch Probability Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Memory SSA
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
+; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Unswitch loops
 ; CHECK-NEXT: Simplify the CFG
 ; CHECK-NEXT: Dominator Tree Construction
@@ -147,13 +151,17 @@
 ; CHECK-NEXT: Phi Values Analysis
 ; CHECK-NEXT: Memory Dependence Analysis
 ; CHECK-NEXT: Dead Store Elimination
+; CHECK-NEXT: Natural Loop Information
+; CHECK-NEXT: Post-Dominator Tree Construction
+; CHECK-NEXT: Branch Probability Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Function Alias Analysis Results
 ; CHECK-NEXT: Memory SSA
-; CHECK-NEXT: Natural Loop Information
 ; CHECK-NEXT: Canonicalize natural loops
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -251,11 +259,15 @@
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Combine redundant instructions
+; CHECK-N

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-20 Thread Alina Sbirlea via Phabricator via cfe-commits
asbirlea requested changes to this revision.
asbirlea added a comment.
This revision now requires changes to proceed.

A couple of quick comments, I haven't gone into details yet.

- please split off the changes that are NFCs (deleted spaces) and clang-format 
for ease to review.
- the test changes show a lot of dependencies that overload the loop pipeline. 
If only LICM benefits from it, I'd consider creating a BFI instance for the 
duration of the LICM pass, or only enabling it in loop pipelines that have LICM 
in them (see example comment in one of the tests).




Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:278
AM.getResult(F),
+   
&AM.getResult(F),
MSSA};

This should not be unconditional. See MSSA approach.



Comment at: llvm/test/Transforms/LoopRotate/pr35210.ll:51
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f

e.g. there's no use of creating these for LoopRotate.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 286392.
modimo added a comment.

Commit my changes (crazy I know) so that the diff is actually updated for 
linting


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; MSSA-NEXT: Running analysis: TargetIRAnalysis on f
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; MSSA-NEXT: Starting Loop pass manager run.
 ; MSSA-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; MSSA-NEXT: Folding loop latch bb4 into bb
 ; MSSA-NEXT: Finished Loop pass manager run.
 ; MSSA-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; MSSA-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running pass: ADCEPass on f
 ; MSSA-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; MSSA-NEXT: Finished llvm::Function pass manager run.
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/new-pm-thinlto-defaults.ll

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 286390.
modimo added a comment.

Linting


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -10,6 +10,9 @@
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; MSSA-NEXT: Running analysis: TargetIRAnalysis on f
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; MSSA-NEXT: Starting Loop pass manager run.
 ; MSSA-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; MSSA-NEXT: Folding loop latch bb4 into bb
 ; MSSA-NEXT: Finished Loop pass manager run.
 ; MSSA-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; MSSA-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running pass: ADCEPass on f
 ; MSSA-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; MSSA-NEXT: Finished llvm::Function pass manager run.
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/new-pm-thinlto-defaults.ll
===
--- llvm/test/Other/new-pm-thinlto-defaults.ll
+++ llvm/test/Other/new-pm-thinlto-defaults.

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
modimo added reviewers: wenlei, asbirlea, vsk.
modimo added a project: LLVM.
Herald added subscribers: llvm-commits, cfe-commits, dexonsmith, steven_wu, 
hiraditya.
Herald added a project: clang.
modimo requested review of this revision.

D65060  uncovered that trying to use BFI in 
loop passes can lead to non-deterministic behavior when blocks are re-used 
while retaining old BFI data.

To make sure BFI is preserved through loop passes a Value Handle (VH) callback 
is registered on blocks themselves. When a block is freed it now also wipes out 
the accompanying BFI entry such that stale BFI data can no longer persist 
resolving the determinism issue.

An optimistic approach would be to incrementally update BFI information 
throughout the loop passes rather than only invalidating them on removed 
blocks. The issues with that are:

1. It is not clear how BFI information should be incrementally updated: If a 
block is duplicated does its BFI information come with? How about if it's 
split/modified/moved around?
2. Assuming we can address these problems the implementation here will be a 
massive undertaking.

There's a known need of BFI in LICM analysis which requires correct but not 
incrementally updated BFI data. Other loop passes can now also use this level 
of information and richer updates can be made as needed.

This diff also moves BFI to be a part of LoopStandardAnalysisResults since the 
previous method using getCachedResults now (correctly!) statically asserts 
(D72893 ) that this data isn't static through 
the loop passes.

Testing
Ninja check


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -10,6 +10,9 @@
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: Ru