On 7/2/19 2:45 AM, Richard Biener wrote:
> On Mon, Jul 1, 2019 at 11:58 PM Gary Oblock <gobl...@marvell.com> wrote:
>> I've been looking at trying to optimize the performance of code for
>> programs that use functions like qsort where a function is passed the
>> name of a function and some constant parameter(s).
>>
>> The function qsort itself is an excellent example of what I want to
>> do, except for being in a library, so please ignore that while I
>> proceed assuming that qsort is not in a library. In qsort the user
>> passes in the size of the array elements and a comparison function
>> name in addition to the location of the array to be sorted. I
>> noticed that for a given call site the size and the comparison
>> function are always the same, so why not create a specialized version
>> of qsort that eliminates them and internally uses a constant value
>> for the size parameter and does a direct call instead of an indirect
>> call. The latter lets the comparison function code be inlined.
>>
>> This seems to me to be a very useful optimization where heavy use is
>> made of this programming idiom. I saw a 30%+ overall improvement when
>> I specialized a function like this by hand in an application.
>>
>> My question is: does anything inside gcc do something similar? I don't
>> want to reinvent the wheel and I want to do something that plays
>> nicely with the rest of gcc so it makes it into the real world. Note, I
>> should mention that I'm an experienced compiler developer and I'm
>> planning on adding this optimization unless it's obvious from the
>> ensuing discussion that either it's a bad idea or that it's a matter
>> of simply tweaking gcc a bit to get this optimization to occur.
> GCC performs interprocedural constant propagation (IPA-CP) and
> this should catch your case already. The IPA-CP function cloning
> might have too constrained limits (on code bloat) to apply on a
> specific testcase but all functionality for the qsort case should
> be available.
>
> Richard.
>
>> Thanks,
>>
>> Gary Oblock

Richard,

I'm planning on using profile-based heuristics that are fairly
conservative. However, I'll also let the user have access to a
parameter to relax the heuristics to the degree they desire.

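For concreteness, here is a minimal sketch of the kind of hand
specialization described in the quoted message. The names generic_sort,
cmp_int, and sort_int are hypothetical, insertion sort stands in for the
real algorithm, and the 64-byte element limit is an assumption made only
to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic routine in the style of qsort: the element size
   and the comparison function are run-time parameters. Insertion sort
   keeps the sketch short; a real qsort uses a different algorithm. */
void generic_sort(void *base, size_t nmemb, size_t size,
                  int (*cmp)(const void *, const void *))
{
    unsigned char *a = base;
    unsigned char tmp[64];  /* assumption: elements are at most 64 bytes */

    assert(size > 0 && size <= sizeof tmp);

    for (size_t i = 1; i < nmemb; i++) {
        memcpy(tmp, a + i * size, size);
        size_t j = i;
        /* Indirect call: cmp is unknown here, so it cannot be inlined. */
        while (j > 0 && cmp(a + (j - 1) * size, tmp) > 0) {
            memcpy(a + j * size, a + (j - 1) * size, size);
            j--;
        }
        memcpy(a + j * size, tmp, size);
    }
}

/* Comparison function that one call site always passes. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x;
    int b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hand-written clone for call sites that always pass sizeof(int) and
   cmp_int: the size is a compile-time constant and the call is direct,
   so the compiler is free to inline cmp_int. */
void sort_int(int *a, size_t nmemb)
{
    for (size_t i = 1; i < nmemb; i++) {
        int tmp = a[i];
        size_t j = i;
        while (j > 0 && cmp_int(&a[j - 1], &tmp) > 0) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
}
```

A call such as generic_sort(v, n, sizeof v[0], cmp_int) would be
rewritten to sort_int(v, n). Whether GCC derives an equivalent clone on
its own depends on the IPA-CP cost limits Richard mentions; cloning is
controlled by -fipa-cp-clone, which is enabled at -O3.
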
Gary