On Tue, Jan 14, 2014 at 1:47 PM, Ilya Enkovich <enkovich....@gmail.com> wrote: > 2014/1/14 Richard Biener <richard.guent...@gmail.com>: >> On Tue, Jan 14, 2014 at 10:15 AM, Ilya Enkovich <enkovich....@gmail.com> >> wrote: >>> Hi, >>> >>> I've been working for some time on the prototype of the Pointer Bounds >>> Checker which uses function clones for instrumentation >>> (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03327.html). After >>> several experiments with this approach I want to share my results and >>> ask for some feedback to make a decision about the future steps. >>> >>> Firstly I want to remind the reasons for digging in this direction. In >>> the original approach bounds of call arguments and input parameters >>> are associated with arguments via special built-in calls. It creates >>> implicit data flow compiler is not aware about which confuses some >>> optimizations resulting in miss-optimization and breaks bounds data >>> flow. Thus optimizations have to be fixed to get better pointers >>> protection. >>> >>> Clones approach does not use special built-in function calls to >>> associate bounds with call arguments and input parameters. Each >>> function which should be instrumented gets an additional version and >>> only this special version will be instrumented.This new version gets >>> additional bound arguments to express input bounds. When function call >>> is instrumented, it is redirected to instrumented version and all >>> bounds are passed as explicit call arguments. Thus we have explicit >>> pointer bounds flow similar to regular function parameters. It should >>> allow to avoid changes in optimization, avoid miss-optimizations, >>> allow existing IPA optimizations to work with bound args (e.g. >>> propagate constant bounds value and remove checks in called function). >>> >>> I made a prototype implementation of this approach in the following way: >>> >>> - Add new IPA pass before early local passes to produce versions for >>> all functions to be instrumented. >>> - Put instrumentation pass after SSA pass. >>> - Add new pass after IPA passes to remove bodies of functions which >>> have instrumented versions. Function nodes may still be required for >>> calls in not instrumented code. But we do not emit this code and >>> therefore function bodies are not needed. >>> >>> Positive changes are: >>> >>> - IPA optimizations are not confused by bound parameters >>> - bounds are now more like regular arguments; it makes their >>> processing in expand easier >>> - functions with bounds not attached to any pointer are allowed >> >> First of all thanks for trying to work in this direction. Comments on the >> issues you encountered below (also CCed Honza as he should be more >> familiar with reachability and clone issues). >> >>> On simple codes this approach worked well but on a bigger tests some >>> issues were revealed. >>> >>> 1. Nodes reachability. Instrumented version is actually always >>> reachable when original function is reachable because it is always >>> emitted instead of the original. Thus I had to fix reachability >>> analysis to achieve it. Another similar problem is check whether node >>> can be removed after inline when inlining instrumented function. Not >>> hard to fix but probably other similar problems exist. >> >> I suppose you do not update the callgraph / the call stmts when >> creating the clones? Btw, is it desirable to inline the uninstrumented >> function and then instrument the result (thus run cloning and >> instrumentation after early inlining?)? Especially handling always_inlines >> before cloning/isntrumentation looks very sensible. > > Right. Created clones have the same code as the original function and > therefore same cgraph edges. I suppose instrumentation after early > inlining is OK and may be preferred because inline shouldn't lead to > any losses of bounds information. I tried variant when instrumentation > works right after early inlining but with cloning still before early > local passes. In general it looked OK. > >> >>> 2. Function processing order. Function processing order is determined >>> before early local passes. But during function instrumentation call >>> graph is modified significantly and used topological order becomes >>> outdated. That causes some troubles. E.g. function marked as 'always >>> inline' cannot be inlined because it is not in SSA form yet. Surely >>> inlining problem may be solved by just putting instrumentation after >>> early inline, but similar problem may exist in other passes too. To >>> resolve this problem I tried to split early local passes into three >>> parts. The first one builds SSA, the second one performs >>> instrumentation, the last one does the rest. Each part is performed on >>> all functions before the next one starts. Thus I get all functions in >>> SSA form and all instrumentation performed before starting early >>> optimizations. Unfortunately such passes order leads to invalid SSA >>> because of local_pure_const optimization affecting callers correctness >>> (in case caller SSA was built before optimization revealed 'pure' or >>> 'const' flag). >> >> Generally the processing order of early_local_passes is chosen >> to get better optimization - changing it shouldn't affect correctness >> and thus the issues you observe demand fixing anyway. >> (I've noted in the past that the early_local_passes processing order >> should more explicitely honor callgraph SCCs, eventually even iterating). >> >> Moving SSA build out of the early_local_passes and into a >> separate lowering stage is possible, the pure-const stuff is >> handled by keeping pass_fixup_cfg where it is now I think. >> In theory you can go into SSA form in all_lowering_passes >> already (but you have to adjust some pieces dealing with >> late created functions). > > Currently pass_fixup_cfg is called at the beginning of the early local > passes and therefore function processing order mattes much here. If > caller is processed before the callee marked as pure then there is no > additional pass_fixup_cfg for the caller after pure flag for callee is > set. Probably additional IPA pass is required after early_local_passes > to fixup all functions?
You probably need an additional fixup_cfg pass somewhere. >> Note that the idea of having stuff inside a single early-local-passes >> pipeline is that it increases cache locality and reduces memory >> usage by reclaiming functions no longer needed after inlining. >> >>> In general I feel that having special function version for >>> instrumentation has a better potential, should lead to less intrusive >>> changes in the compiler and better code quality. >>> >>> But before continue this implementation I would like to get some >>> feedback and probably some advice on how to order passes to get the >>> best result. Currently I incline to have all functions instrumented >>> before any local optimizations and solve pure_const problem by >>> modifying all callers when attribute is computed for some function. >> >> As noted above the pure_const proble is fixed by the fixup_cfg pass. >> >> Generally if instrumentation is not strictly required before inlining >> then doing cloning and instrumentation as a "real" IPA pass would >> be the way to go. You'd create clones during IPA analysis, >> fixup clones during IPA propagate (IPA inline should notify you >> of any decisions it makes), and instantiate and instrument the >> remaining required clones at IPA transform time. > > Instrumentation is not required before inlining but is required before > any optimization pass. Optimizations may change pointers flow and > therefore break bounds flow, resulting in possible false bound > violations. It means instrumentation has to be performed before > early_optimizations. Ah, ok ... > Thus I may split early_local_passes into three > IPA passes: > 1. build SSA + early_inline > 2. Make instrumented versions + release bodies of original functions > (for functions having instrumented version) > 3. Perform early_optimizations. That defeats the purpose and functioning of early inlining though. early inlining wants the callee to have gone through all early local passes - doing the above breaks this. So I suppose your original ordering is the only sensible thing unless somebody comes up with new great ideas ;) (eventually inlining always-inline functions can be split off and done earlier) Richard. > > Thanks, > Ilya > >> >> Generally if doing it before early opts is not too intrusive making it >> a real IPA pass later shouldn't be too difficult (heh ...). >> >> Honza may have more comments here, of course. >> >> Thanks, >> Richard. >> >>> Thanks, >>> Ilya