tejohnson marked an inline comment as done.
tejohnson added a comment.

In D73242#1847051 <https://reviews.llvm.org/D73242#1847051>, @evgeny777 wrote:

> > This is an enabler for upcoming enhancements to indirect call promotion, 
> > for example streamlined promotion guard sequences that compare against 
> > vtable address instead of the target function
>
> Can you please describe the whole approach in more detail? At the moment ICP 
> is capable of (a sort of) devirtualization by replacing an indirect vtbl call 
> with a sequence of function address comparisons and direct calls.
>  Are you going to speed this up by comparing vtable pointers instead 
> of function pointers (thus eliminating a single load per vtbl call), or 
> is there also something else in mind?


That's exactly what we want to do here. We found that a relatively significant 
number of cycles are being spent on virtual function pointer loads in these 
sequences; comparing against the vtable address instead moves that load off 
the critical path. I prototyped something like this in ICP a while back and 
found a speedup in an important app.
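
To make the difference between the two guard sequences concrete, here is a 
minimal sketch using a hand-rolled object model so that each load is visible. 
It is not what ICP emits; the names `Object`, `VTable`, `hotGet`, 
`callWithFunctionGuard`, and `callWithVTableGuard` are hypothetical and exist 
only for illustration.

```cpp
#include <cassert>

// Hand-rolled object model so the loads are explicit. Real C++ vtables are
// compiler-generated; this layout is an assumption made for illustration.
struct Object;
using GetFn = int (*)(const Object *);

struct VTable {
  GetFn get; // the single "virtual" slot
};

struct Object {
  const VTable *vptr;
  int value;
};

int hotGet(const Object *o) { return o->value; }   // the profiled hot target
int coldGet(const Object *o) { return -o->value; } // some other override

const VTable kHotVTable{hotGet};
const VTable kColdVTable{coldGet};

// Guard on the target function: the compare depends on two loads
// (the vptr, then the function pointer in the slot).
int callWithFunctionGuard(const Object *o) {
  GetFn target = o->vptr->get;
  if (target == hotGet)
    return hotGet(o); // promoted direct call
  return target(o);   // fallback indirect call
}

// Guard on the vtable address: the compare depends only on the vptr load.
// The slot load is needed only on the fallback path.
int callWithVTableGuard(const Object *o) {
  const VTable *vt = o->vptr;
  if (vt == &kHotVTable)
    return hotGet(o); // promoted direct call
  return vt->get(o);  // fallback indirect call
}

// Both guards must agree with the plain indirect call.
void checkGuardsAgree(const Object *o) {
  int expected = o->vptr->get(o);
  assert(callWithFunctionGuard(o) == expected);
  assert(callWithVTableGuard(o) == expected);
}
```

In the second form the function pointer load no longer feeds the guard 
compare, which is the "moved off the critical path" effect described above.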

> If that's true, what's the next
>  step? Make ICP pass analyze type test intrinsics?

There are a few ways to build the alternate ICP compare sequences. One is to 
use statically available info from the vtable definitions in the module that 
utilize the profiled target. This relies on ThinLTO to import all the relevant 
vtable definitions. The other is to profile vtable addresses with FDO (not just 
the target function pointer). I've got the type profiling implemented, but it 
needs some cleanup before I send it for review. Both of these approaches need 
the type tests to determine the correct address point offset (the offset in the 
type test) to use in the compare sequence. And in both cases you want to weigh 
the number of comparisons each approach needs in order to determine 
whether a vtable compare or a target function compare is better. I.e., if there 
are a lot of vtable definitions that utilize a hot target, it is likely better 
to do a single target function comparison.
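
As a rough illustration of that trade-off, the sketch below counts how many 
vtable definitions contain a hot target and falls back to a function compare 
when that count gets too large. Everything in it is an assumption made for the 
example: `VTableDef`, `countVTablesUsing`, `chooseGuard`, and the 
`maxVTableCompares` threshold are hypothetical and are not part of the patch 
or of any LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a vtable definition visible in the module:
// a name plus the function names stored in its slots.
struct VTableDef {
  std::string name;
  std::vector<std::string> slots;
};

enum class GuardKind { VTableCompare, FunctionCompare };

// Number of vtable definitions that reference the given target in any slot.
std::size_t countVTablesUsing(const std::vector<VTableDef> &defs,
                              const std::string &target) {
  return static_cast<std::size_t>(
      std::count_if(defs.begin(), defs.end(), [&](const VTableDef &d) {
        return std::find(d.slots.begin(), d.slots.end(), target) !=
               d.slots.end();
      }));
}

// A vtable guard needs one compare per vtable that uses the hot target,
// while a function guard needs a single compare. Prefer the vtable guard
// only when the number of vtable compares stays within the (assumed) limit.
GuardKind chooseGuard(const std::vector<VTableDef> &defs,
                      const std::string &hotTarget,
                      std::size_t maxVTableCompares) {
  assert(maxVTableCompares > 0 && "threshold must allow at least one compare");
  std::size_t n = countVTablesUsing(defs, hotTarget);
  if (n == 0 || n > maxVTableCompares)
    return GuardKind::FunctionCompare; // unknown vtables, or too many of them
  return GuardKind::VTableCompare;
}
```

For example, a method inherited unchanged by many derived classes appears in 
many vtables, so a single function compare is likely the better guard there, 
whereas an override that appears in only one vtable is a natural candidate for 
a vtable compare.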



================
Comment at: llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp:1660
+        cast<MetadataAsValue>(CI->getArgOperand(1))->getMetadata();
     // If we found any, add them to CallSlots.
     if (!Assumes.empty()) {
----------------
evgeny777 wrote:
> This change seems to be unrelated
It is needed to have the TypeId available outside this if statement (see the 
map check below).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73242/new/

https://reviews.llvm.org/D73242



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
