[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-11-30 Thread Hongtao Yu via Phabricator via cfe-commits
hoy abandoned this revision.
hoy added a comment.

Abandoning this diff, which has been split into four separate diffs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86193/new/

https://reviews.llvm.org/D86193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-09-23 Thread Hongtao Yu via Phabricator via cfe-commits
hoy added a comment.

@davidxl I'm wondering if it is a good time for you to start reviewing the 
patches. Please let me know if you need more time. Thanks!




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-09-21 Thread Wei Mi via Phabricator via cfe-commits
wmi added a comment.
Herald added a subscriber: ecnelises.

The patches split from the main one look good to me. Please see if David has 
further comments.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-26 Thread Hongtao Yu via Phabricator via cfe-commits
hoy added a comment.

In D86193#2240596 , @wmi wrote:

> In D86193#2240502 , @hoy wrote:
>
>> In D86193#2240353 , @wmi wrote:
>>
>>>> There are some optimizations such as if-convert, tail call elimination, 
>>>> that were initially blocked by the pseudo probe intrinsic but is now 
>>>> unblocked by fixes included in this change. With the current change we do 
>>>> not see perf degradation out of SPEC and one of our internal large 
>>>> services.
>>>> The main optimizations left blocked intentionally are those that merge 
>>>> blocks for smaller code size, such as tail merge which is the opposite of 
>>>> jump threading. We believe that those optimizations are not very 
>>>> beneficial for performance and AutoFDO.
>>>
>>> If the optimizations are not very beneficial for performance and AutoFDO 
>>> and should be blocked, it may be better to block them in a more general way 
>>> and not depend on pseudo probe, because blocking them may also be 
>>> beneficial for debug info based AutoFDO.
>>
>> In theory, yes, we should have a black list of transforms (mainly related to 
>> block merge) that are not needed by AutoFDO and block them. In reality it 
>> might take quite some efforts to figure them out. Pseudo probe, on the other 
>> hand, starts with blocking those transforms in the first place and relax the 
>> ones that might actually help AutoFDO.
>>
>>> Another reason is that pseudo probe looks pretty much like debug 
>>> information to me. They are used to annotate the IR but shouldn't affect 
>>> the transformation. Binaries built w/wo debug information are required to 
>>> be identical in LLVM. I think that requirement could be applied on pseudo 
>>> probe as well. It is even better to have some test to enforce it so that no 
>>> change in the future could break the requirement.
>>
>> Good point! Yes, pseudo probe is implemented in a similar way with the debug 
>> intrinsics. However they are not guaranteed to not affect the codegen since 
>> its main purpose is to achieve an accurate profile correlation with low 
>> cost. Regarding the cost, it sits somewhere between the debug intrinsics and 
>> the PGO instrumentation and close to a zero cost in practice.
>
> I see. It makes sense to fix up some important transformations to achieve the 
> goal of low cost. To achieve the goal of not affecting codegen needs a lot 
> more effort to test and fix up all over the pipeline. I don't mean to have it 
> ready in the patch, but I think it maybe something worthy to strive for later 
> on.

Sounds good. We will accumulate a list of AutoFDO-unfriendly transforms over 
time.

>> Agreed that it would be better to have tests protect the pseudo probe cost 
>> from going too high, but not sure which optimizations we should start with. 
>> Maybe to start with some critical optimizations like inlining, vectorization?
>
> The test I have in my mind comes from debug info. It is to bootstrap llvm 
> with and without debug information. The test is to check whether the binaries 
> built after stripping the debug information are identical. I am thinking 
> pseudo probe can have such test setup somewhere sometime in the future. Same 
> as above, it doesn't have to be ready currently.

I like the idea. It would catch pseudo probe regressions introduced by new 
optimization changes. Let me think about it. Thanks!




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-26 Thread Wei Mi via Phabricator via cfe-commits
wmi added a comment.

In D86193#2240502 , @hoy wrote:

> In D86193#2240353 , @wmi wrote:
>
>>> There are some optimizations such as if-convert, tail call elimination, 
>>> that were initially blocked by the pseudo probe intrinsic but is now 
>>> unblocked by fixes included in this change. With the current change we do 
>>> not see perf degradation out of SPEC and one of our internal large services.
>>> The main optimizations left blocked intentionally are those that merge 
>>> blocks for smaller code size, such as tail merge which is the opposite of 
>>> jump threading. We believe that those optimizations are not very beneficial 
>>> for performance and AutoFDO.
>>
>> If the optimizations are not very beneficial for performance and AutoFDO and 
>> should be blocked, it may be better to block them in a more general way and 
>> not depend on pseudo probe, because blocking them may also be beneficial for 
>> debug info based AutoFDO.
>
> In theory, yes, we should have a black list of transforms (mainly related to 
> block merge) that are not needed by AutoFDO and block them. In reality it 
> might take quite some efforts to figure them out. Pseudo probe, on the other 
> hand, starts with blocking those transforms in the first place and relax the 
> ones that might actually help AutoFDO.
>
>> Another reason is that pseudo probe looks pretty much like debug information 
>> to me. They are used to annotate the IR but shouldn't affect the 
>> transformation. Binaries built w/wo debug information are required to be 
>> identical in LLVM. I think that requirement could be applied on pseudo probe 
>> as well. It is even better to have some test to enforce it so that no change 
>> in the future could break the requirement.
>
> Good point! Yes, pseudo probe is implemented in a similar way with the debug 
> intrinsics. However they are not guaranteed to not affect the codegen since 
> its main purpose is to achieve an accurate profile correlation with low cost. 
> Regarding the cost, it sits somewhere between the debug intrinsics and the 
> PGO instrumentation and close to a zero cost in practice.

I see. It makes sense to fix up some important transformations to achieve the 
goal of low cost. Achieving the goal of not affecting codegen needs a lot more 
effort to test and fix up across the whole pipeline. I don't mean it has to be 
ready in this patch, but I think it may be something worth striving for later 
on.

> Agreed that it would be better to have tests protect the pseudo probe cost 
> from going too high, but not sure which optimizations we should start with. 
> Maybe to start with some critical optimizations like inlining, vectorization?

The test I have in mind comes from debug info. It bootstraps LLVM with and 
without debug information, then checks whether the binaries are identical 
after the debug information has been stripped. I am thinking pseudo probe 
could have a similar test setup at some point in the future. As above, it 
doesn't have to be ready now.
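
The final comparison step of such a test could look roughly like the sketch 
below. It is a minimal illustration, not part of the patch and not an existing 
LLVM tool: `Bytes`, `loadBytes`, `firstDifference`, and `compareExample` are 
hypothetical names, and the sketch assumes both builds have already been 
produced and stripped by some earlier step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Raw contents of a binary (hypothetical helper type, not an LLVM type).
using Bytes = std::vector<char>;

// Load a file as raw bytes. Returns an empty buffer if it cannot be opened.
Bytes loadBytes(const std::string &path) {
  std::ifstream in(path, std::ios::binary);
  return Bytes(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
}

// Offset of the first differing byte, or -1 if the buffers are identical.
// A length mismatch is reported at the length of the shorter buffer.
long firstDifference(const Bytes &a, const Bytes &b) {
  const std::size_t n = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < n; ++i)
    if (a[i] != b[i])
      return static_cast<long>(i);
  return a.size() == b.size() ? -1 : static_cast<long>(n);
}

// Usage example on in-memory buffers standing in for two stripped binaries.
void compareExample() {
  Bytes strippedDebugBuild = {'\x7f', 'E', 'L', 'F', '\x01'};
  Bytes plainBuild = {'\x7f', 'E', 'L', 'F', '\x01'};
  assert(firstDifference(strippedDebugBuild, plainBuild) == -1);

  plainBuild[4] = '\x02'; // simulate a codegen difference
  assert(firstDifference(strippedDebugBuild, plainBuild) == 4);
}
```

Reporting the first differing offset rather than a plain yes/no makes it 
easier to find which function or section diverged when the check fails.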




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-26 Thread Hongtao Yu via Phabricator via cfe-commits
hoy added a comment.

In D86193#2240353 , @wmi wrote:

>> There are some optimizations such as if-convert, tail call elimination, that 
>> were initially blocked by the pseudo probe intrinsic but is now unblocked by 
>> fixes included in this change. With the current change we do not see perf 
>> degradation out of SPEC and one of our internal large services.
>> The main optimizations left blocked intentionally are those that merge 
>> blocks for smaller code size, such as tail merge which is the opposite of 
>> jump threading. We believe that those optimizations are not very beneficial 
>> for performance and AutoFDO.
>
> If the optimizations are not very beneficial for performance and AutoFDO and 
> should be blocked, it may be better to block them in a more general way and 
> not depend on pseudo probe, because blocking them may also be beneficial for 
> debug info based AutoFDO.

In theory, yes, we should have a black list of transforms (mainly related to 
block merge) that are not needed by AutoFDO and block them. In reality it might 
take quite some effort to figure them out. Pseudo probe, on the other hand, 
starts by blocking those transforms in the first place and then relaxes the 
ones that might actually help AutoFDO.

> Another reason is that pseudo probe looks pretty much like debug information 
> to me. They are used to annotate the IR but shouldn't affect the 
> transformation. Binaries built w/wo debug information are required to be 
> identical in LLVM. I think that requirement could be applied on pseudo probe 
> as well. It is even better to have some test to enforce it so that no change 
> in the future could break the requirement.

Good point! Yes, pseudo probe is implemented in a similar way to the debug 
intrinsics. However, it is not guaranteed to leave codegen unaffected, since 
its main purpose is to achieve accurate profile correlation at low cost. 
Regarding the cost, it sits somewhere between the debug intrinsics and the PGO 
instrumentation, and is close to zero in practice. Agreed that it would be 
better to have tests that protect the pseudo probe cost from going too high, 
but I am not sure which optimizations we should start with. Maybe start with 
some critical optimizations like inlining and vectorization?




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-26 Thread Wei Mi via Phabricator via cfe-commits
wmi added a comment.

> There are some optimizations such as if-convert, tail call elimination, that 
> were initially blocked by the pseudo probe intrinsic but is now unblocked by 
> fixes included in this change. With the current change we do not see perf 
> degradation out of SPEC and one of our internal large services.
> The main optimizations left blocked intentionally are those that merge blocks 
> for smaller code size, such as tail merge which is the opposite of jump 
> threading. We believe that those optimizations are not very beneficial for 
> performance and AutoFDO.

If the optimizations are not very beneficial for performance and AutoFDO and 
should be blocked, it may be better to block them in a more general way that 
does not depend on pseudo probe, because blocking them may also be beneficial 
for debug info based AutoFDO.

Another reason is that pseudo probes look pretty much like debug information 
to me. They are used to annotate the IR but shouldn't affect the 
transformations. Binaries built with and without debug information are 
required to be identical in LLVM. I think that requirement could be applied to 
pseudo probes as well. It would be even better to have a test that enforces 
it, so that no future change can break the requirement.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-24 Thread Hongtao Yu via Phabricator via cfe-commits
hoy added a comment.

In D86193#2232609 , @wmi wrote:

> Thanks for the patch! A few questions:
>
>> probe blocks some CFG transformations that can mess up profile correlation.
>
> Can you enumerate some CFG transformations which be blocked? Is it possible 
> that some CFG transformations being blocked are actually beneficial for later 
> optimizations?

There are some optimizations, such as if-convert and tail call elimination, 
that were initially blocked by the pseudo probe intrinsic but are now unblocked 
by fixes included in this change. With the current change we do not see perf 
degradation on SPEC or on one of our large internal services.

The main optimizations left blocked intentionally are those that merge blocks 
for smaller code size, such as tail merge, which is the opposite of jump 
threading. We believe that those optimizations are not very beneficial for 
performance and AutoFDO. If that changes, we can always unblock them.
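
As a rough illustration of why a per-block probe stands in the way of block 
merging, here is a toy sketch. It is not LLVM code: `ToyBlock`, `canMerge`, 
`withProbe`, and `mergeExample` are hypothetical names, and it assumes that a 
merge requires the two instruction sequences to be identical and that each 
probe carries an index unique to its block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy block: a list of textual instructions. NOT an LLVM data structure.
using ToyBlock = std::vector<std::string>;

// Toy merge criterion (an assumption of this sketch): two blocks can share
// code only if their instruction sequences are identical.
bool canMerge(const ToyBlock &a, const ToyBlock &b) { return a == b; }

// Prepend a probe that carries a per-block index, mimicking one probe per
// basic block.
ToyBlock withProbe(ToyBlock block, int probeIndex) {
  block.insert(block.begin(), "probe " + std::to_string(probeIndex));
  return block;
}

// Usage example: identical tails merge, but not once each carries its own
// probe.
void mergeExample() {
  ToyBlock thenTail = {"store r1", "ret"};
  ToyBlock elseTail = {"store r1", "ret"};

  // Without probes the two tails look the same and could be merged.
  assert(canMerge(thenTail, elseTail));

  // With distinct probe indices the blocks differ, so the merge is blocked
  // and samples can still be attributed to each original block.
  assert(!canMerge(withProbe(thenTail, 2), withProbe(elseTail, 3)));
}
```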

> Are the intrinsic probes counted when computing bb size and function size?

That's a good question. On the IR level, pseudo probe intrinsics are treated 
similarly to the debug intrinsics and the side-effect intrinsics. On the MIR 
level, pseudo probe intrinsics are implemented as a 
`StandardPseudoInstruction`. So they should not be counted towards real code 
size.
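
The intended size accounting can be sketched with a toy model. This is a 
minimal illustration under stated assumptions, not the code in this patch: 
`Kind`, `ToyInst`, `emitsCode`, `countedSize`, and `sizeExample` are 
hypothetical names that do not exist in LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an instruction kind; NOT an LLVM enum.
enum class Kind { Real, DebugValue, PseudoProbe };

struct ToyInst {
  Kind kind;
};

// In this toy model, debug values and pseudo probes only annotate the
// instruction stream and emit no machine code.
bool emitsCode(const ToyInst &inst) { return inst.kind == Kind::Real; }

// Count only the instructions that contribute to real code size.
std::size_t countedSize(const std::vector<ToyInst> &block) {
  std::size_t total = 0;
  for (const ToyInst &inst : block)
    if (emitsCode(inst))
      ++total;
  return total;
}

// Usage example: a probe at the top of a block does not change its size.
void sizeExample() {
  std::vector<ToyInst> plain = {{Kind::Real}, {Kind::Real}};
  std::vector<ToyInst> probed = {{Kind::PseudoProbe}, {Kind::Real}, {Kind::Real}};
  assert(countedSize(plain) == countedSize(probed));
}
```

Any heuristic that compares block or function size against a threshold would 
then make the same decision with and without probes.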

> And could you split the patches into small parts for easier review. For 
> example,  Add the intrinsic support in IR and MIR. SampleProfileProbe pass. 
> -fpseudo-probe-for-profiling support. changes in various passes.

Thanks for the suggestion. Agreed, the current patch is too big to review. I 
will come up with a breakdown.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-23 Thread Wei Mi via Phabricator via cfe-commits
wmi added a comment.

Thanks for the patch! A few questions:

> probe blocks some CFG transformations that can mess up profile correlation.

Can you enumerate some of the CFG transformations that are blocked? Is it 
possible that some of the blocked CFG transformations are actually beneficial 
for later optimizations?

Are the intrinsic probes counted when computing bb size and function size?

And could you split the patch into smaller parts for easier review? For 
example: the intrinsic support in IR and MIR; the SampleProfileProbe pass; 
-fpseudo-probe-for-profiling support; changes in various passes.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-19 Thread Hongtao Yu via Phabricator via cfe-commits
hoy added a comment.

In D86193#2227129 , @davidxl wrote:

> A heads up -- I won't be able to review patch until mid Sept. Hope this is 
> fine.

Thanks for the heads-up. That's fine. We can wait for your input.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-19 Thread Hongtao Yu via Phabricator via cfe-commits
hoy updated this revision to Diff 286664.
hoy edited the summary of this revision.
hoy added a comment.

Updating D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86193/new/

https://reviews.llvm.org/D86193

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/emit-pseudo-probe.c
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/CodeGen/CommandFlags.h
  llvm/include/llvm/CodeGen/ISDOpcodes.h
  llvm/include/llvm/CodeGen/MachineInstr.h
  llvm/include/llvm/CodeGen/SelectionDAG.h
  llvm/include/llvm/CodeGen/SelectionDAGNodes.h
  llvm/include/llvm/IR/BasicBlock.h
  llvm/include/llvm/IR/IntrinsicInst.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/include/llvm/InitializePasses.h
  llvm/include/llvm/Passes/PassBuilder.h
  llvm/include/llvm/Support/TargetOpcodes.def
  llvm/include/llvm/Target/Target.td
  llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
  llvm/lib/Analysis/AliasSetTracker.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/Analysis/VectorUtils.cpp
  llvm/lib/CodeGen/Analysis.cpp
  llvm/lib/CodeGen/CodeGenPrepare.cpp
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
  llvm/lib/IR/BasicBlock.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Passes/PassRegistry.def
  llvm/lib/Transforms/IPO/CMakeLists.txt
  llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/lib/Transforms/Utils/Evaluator.cpp
  llvm/lib/Transforms/Utils/SimplifyCFG.cpp
  llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll

Index: llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
===
--- /dev/null
+++ llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
@@ -0,0 +1,29 @@
+; RUN: opt < %s -passes=pseudo-probe -function-sections -S -o %t
+; RUN: FileCheck %s < %t --check-prefix=CHECK-IL
+; RUN: llc %t -stop-after=instruction-select -o - | FileCheck %s --check-prefix=CHECK-MIR
+;
+;; Check the generation of pseudoprobe intrinsic call.
+
+define void @foo(i32 %x) {
+bb0:
+  %cmp = icmp eq i32 %x, 0
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID:]], 1, 0
+  br i1 %cmp, label %bb1, label %bb2
+
+bb1:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 3, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb2:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 2, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb3:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4)
+  ret void
+}
Index: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5122,7 +5122,9 @@
 
 if (I->mayReadOrWriteMemory() &&
 (!isa<IntrinsicInst>(I) ||
- cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect)) {
+ (cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect &&
+  cast<IntrinsicInst>(I)->getIntrinsicID() !=
+  Intrinsic::pseudoprobe))) {
   // Update the linked list of memory accessing instructions.
   if (CurrentLoadStore) {
 CurrentLoadStore->NextLoadStore = SD;
Index: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
===
--- llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7167,7 +7167,8 @@
 
   Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);
   if (ID && (ID == Intrinsic::assume || ID == Intrinsic::lifetime_end ||
- ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect))
+ ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect ||
+ ID == Intrinsic::pseudoprobe))
 return nullptr;
 
   auto willWiden = [&](unsigned VF) -> bool {
Index: llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ 

[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks.

2020-08-19 Thread David Li via Phabricator via cfe-commits
davidxl added a comment.

A heads up: I won't be able to review the patch until mid-September. I hope 
this is fine.




[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks

2020-08-18 Thread Hongtao Yu via Phabricator via cfe-commits
hoy updated this revision to Diff 286476.
hoy added a comment.

Updating D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86193/new/

https://reviews.llvm.org/D86193

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/emit-pseudo-probe.c
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/CodeGen/CommandFlags.h
  llvm/include/llvm/CodeGen/ISDOpcodes.h
  llvm/include/llvm/CodeGen/MachineInstr.h
  llvm/include/llvm/CodeGen/SelectionDAG.h
  llvm/include/llvm/CodeGen/SelectionDAGNodes.h
  llvm/include/llvm/IR/BasicBlock.h
  llvm/include/llvm/IR/IntrinsicInst.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/include/llvm/InitializePasses.h
  llvm/include/llvm/Passes/PassBuilder.h
  llvm/include/llvm/Support/TargetOpcodes.def
  llvm/include/llvm/Target/Target.td
  llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
  llvm/lib/Analysis/AliasSetTracker.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/Analysis/VectorUtils.cpp
  llvm/lib/CodeGen/Analysis.cpp
  llvm/lib/CodeGen/CodeGenPrepare.cpp
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
  llvm/lib/IR/BasicBlock.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Passes/PassRegistry.def
  llvm/lib/Transforms/IPO/CMakeLists.txt
  llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/lib/Transforms/Utils/Evaluator.cpp
  llvm/lib/Transforms/Utils/SimplifyCFG.cpp
  llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll

Index: llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
===
--- /dev/null
+++ llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
@@ -0,0 +1,29 @@
+; RUN: opt < %s -passes=pseudo-probe -function-sections -S -o %t
+; RUN: FileCheck %s < %t --check-prefix=CHECK-IL
+; RUN: llc %t -stop-after=instruction-select -o - | FileCheck %s --check-prefix=CHECK-MIR
+;
+;; Check the generation of pseudoprobe intrinsic call.
+
+define void @foo(i32 %x) {
+bb0:
+  %cmp = icmp eq i32 %x, 0
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID:]], 1, 0
+  br i1 %cmp, label %bb1, label %bb2
+
+bb1:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 3, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb2:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 2, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb3:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4)
+  ret void
+}
Index: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5122,7 +5122,9 @@
 
 if (I->mayReadOrWriteMemory() &&
 (!isa<IntrinsicInst>(I) ||
- cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect)) {
+ (cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect &&
+  cast<IntrinsicInst>(I)->getIntrinsicID() !=
+  Intrinsic::pseudoprobe))) {
   // Update the linked list of memory accessing instructions.
   if (CurrentLoadStore) {
 CurrentLoadStore->NextLoadStore = SD;
Index: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
===
--- llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7167,7 +7167,8 @@
 
   Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);
   if (ID && (ID == Intrinsic::assume || ID == Intrinsic::lifetime_end ||
- ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect))
+ ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect ||
+ ID == Intrinsic::pseudoprobe))
 return nullptr;
 
   auto willWiden = [&](unsigned VF) -> bool {
Index: llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@ -666,6 +666,10 @@

[PATCH] D86193: [CSSPGO] Pseudo probe instrumentation for basic blocks

2020-08-18 Thread Hongtao Yu via Phabricator via cfe-commits
hoy created this revision.
Herald added subscribers: llvm-commits, cfe-commits, dang, laytonio, asbirlea, 
hiraditya, mgorny.
Herald added projects: clang, LLVM.
hoy requested review of this revision.
Herald added a subscriber: jdoerfert.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86193

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGen/emit-pseudo-probe.c
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
  llvm/include/llvm/CodeGen/BasicTTIImpl.h
  llvm/include/llvm/CodeGen/CommandFlags.h
  llvm/include/llvm/CodeGen/ISDOpcodes.h
  llvm/include/llvm/CodeGen/MachineInstr.h
  llvm/include/llvm/CodeGen/Passes.h
  llvm/include/llvm/CodeGen/SelectionDAG.h
  llvm/include/llvm/CodeGen/SelectionDAGNodes.h
  llvm/include/llvm/IR/BasicBlock.h
  llvm/include/llvm/IR/IntrinsicInst.h
  llvm/include/llvm/IR/Intrinsics.td
  llvm/include/llvm/InitializePasses.h
  llvm/include/llvm/Passes/PassBuilder.h
  llvm/include/llvm/Support/TargetOpcodes.def
  llvm/include/llvm/Target/Target.td
  llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
  llvm/lib/Analysis/AliasSetTracker.cpp
  llvm/lib/Analysis/ValueTracking.cpp
  llvm/lib/Analysis/VectorUtils.cpp
  llvm/lib/CodeGen/Analysis.cpp
  llvm/lib/CodeGen/CodeGenPrepare.cpp
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
  llvm/lib/IR/BasicBlock.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Passes/PassRegistry.def
  llvm/lib/Transforms/IPO/CMakeLists.txt
  llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/lib/Transforms/Utils/Evaluator.cpp
  llvm/lib/Transforms/Utils/SimplifyCFG.cpp
  llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
  llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll

Index: llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
===
--- /dev/null
+++ llvm/test/Transforms/SampleProfile/emit-pseudo-probe.ll
@@ -0,0 +1,29 @@
+; RUN: opt < %s -passes=pseudo-probe -function-sections -S -o %t
+; RUN: FileCheck %s < %t --check-prefix=CHECK-IL
+; RUN: llc %t -stop-after=instruction-select -o - | FileCheck %s --check-prefix=CHECK-MIR
+;
+;; Check the generation of pseudoprobe intrinsic call.
+
+define void @foo(i32 %x) {
+bb0:
+  %cmp = icmp eq i32 %x, 0
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID:]], 1, 0
+  br i1 %cmp, label %bb1, label %bb2
+
+bb1:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 3, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb2:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3)
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 2, 0
+; CHECK-MIR: PSEUDO_PROBE [[#GUID]], 4, 0
+  br label %bb3
+
+bb3:
+; CHECK-IL: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4)
+  ret void
+}
Index: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -5122,7 +5122,9 @@
 
 if (I->mayReadOrWriteMemory() &&
 (!isa<IntrinsicInst>(I) ||
- cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect)) {
+ (cast<IntrinsicInst>(I)->getIntrinsicID() != Intrinsic::sideeffect &&
+  cast<IntrinsicInst>(I)->getIntrinsicID() !=
+  Intrinsic::pseudoprobe))) {
   // Update the linked list of memory accessing instructions.
   if (CurrentLoadStore) {
 CurrentLoadStore->NextLoadStore = SD;
Index: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
===
--- llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7167,7 +7167,8 @@
 
   Intrinsic::ID ID = getVectorIntrinsicIDForCall(CI, TLI);
   if (ID && (ID == Intrinsic::assume || ID == Intrinsic::lifetime_end ||
- ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect))
+ ID == Intrinsic::lifetime_start || ID == Intrinsic::sideeffect ||
+ ID == Intrinsic::pseudoprobe))
 return nullptr;
 
   auto willWiden = [&](unsigned VF) -> bool {
Index: llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
===
--- llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@