> How is code-size affected with this patch, non-instrumented vs.
> regular-instrumented vs. sample-instrumented?

I don't have the numbers, but the increase in code size from
regular-instrumented to sample-instrumented is larger than that from
non-instrumented to sample-instrumented.  Easwaran, could you please
build a few binaries at -O2, -O2 -fprofile-generate and -O2
-fprofile-generate -fprofile-generate-sampling, and send out the text
size ratios considering -O2 as the baseline?

> > @@ -292,6 +474,15 @@ gimple_gen_edge_profiler (int edgeno, edge e)
> >                                        gimple_assign_lhs (stmt1), one);
> >   gimple_assign_set_lhs (stmt2, make_ssa_name (gcov_type_tmp_var, stmt2));
> >   stmt3 = gimple_build_assign (unshare_expr (ref), gimple_assign_lhs 
> > (stmt2));
> > +
> > +  if (flag_profile_generate_sampling)
> > +    {
> > +      slot = htab_find_slot_with_hash (instrumentation_to_be_sampled, 
> > stmt1,
> > +                                       htab_hash_pointer (stmt1), INSERT);
> > +      gcc_assert (!*slot);
> > +      *slot = stmt1;
> > +    }
> > +
>
> What's the reason to delay the sampling transform and not apply it here?
> At least a comment should be added for this.

Sorry, I just don't remember.

> > +DEFPARAM (PARAM_PROFILE_GENERATE_SAMPLING_RATE,
> > +         "profile-generate-sampling-rate",
> > +         "sampling rate with -fprofile-generate-sampling",
> > +         100, 0, 2000000000)
>
> Zero is really ok?

It wouldn't break anything, but a rate of 0 doesn't make sense.  It
should be 1.  And the default should be 101 instead of 100 (and
generally a prime).

While we're at it, let me bring up an important practical FDO issue
related to this.  As David mentioned, besides reducing overhead we
have used this to implement selective collection.  This essentially
requires libgcov to provide a basic public interface:
reset()
start()
stop()
save()
I implemented them by playing with the sampling rate, but a clearer
and supported interface would be useful.  Also, there was one annoying
detail that I could not figure out.  When you build with
-fprofile-generate=./fdoprof, the output gets dumped under
./fdoprof/..., but there seems to be no easy way to know this path
within the profiling process.  For Google servers this makes
collection fragile.  The user generally does not have access to the
server file system.  I implemented a collection mechanism so the user
can tell the server "copy the profiles from ./fdoprof to some shared
location".  But now you need to pass the exact path when building
*and* collecting profiles, which may get separated in time, thus the
process is fragile.  It would be good to keep a copy of the path
prefix and have it accessible through a public interface as well.
Then once an instrumented binary is built, it will know the root of
the profile output directory and thus relieve the user from the
responsibility of knowing it.

Silvius

Reply via email to