> How is code-size affected with this patch, non-instrumented vs. > regular-instrumented vs. sample-instrumented?
I don't have the numbers, but the increase in code size from regular-instrumented to sample-instrumented is larger than that from non-instrumented to sample-instrumented. Easwaran, could you please build a few binaries at -O2, -O2 -fprofile-generate and -O2 -fprofile-generate -fprofile-generate-sampling, and send out the text size ratios considering -O2 as the baseline? > > @@ -292,6 +474,15 @@ gimple_gen_edge_profiler (int edgeno, edge e) > > gimple_assign_lhs (stmt1), one); > > gimple_assign_set_lhs (stmt2, make_ssa_name (gcov_type_tmp_var, stmt2)); > > stmt3 = gimple_build_assign (unshare_expr (ref), gimple_assign_lhs > > (stmt2)); > > + > > + if (flag_profile_generate_sampling) > > + { > > + slot = htab_find_slot_with_hash (instrumentation_to_be_sampled, > > stmt1, > > + htab_hash_pointer (stmt1), INSERT); > > + gcc_assert (!*slot); > > + *slot = stmt1; > > + } > > + > > What's the reason to delay the sampling transform and not apply it here? > At least a comment should be added for this. Sorry, I just don't remember. > > +DEFPARAM (PARAM_PROFILE_GENERATE_SAMPLING_RATE, > > + "profile-generate-sampling-rate", > > + "sampling rate with -fprofile-generate-sampling", > > + 100, 0, 2000000000) > > Zero is really ok? It wouldn't break anything, but a rate of 0 doesn't make sense. It should be 1. And the default should be 101 instead of 100 (and generally a prime). While we're at it, let me bring up an important practical FDO issue related to this. As David mentioned, besides reducing overhead we have used this to implement selective collection. This essentially requires libgcov to provide a basic public interface: reset() start() stop() save() I implemented them by playing with the sampling rate, but a clearer and supported interface would be useful. Also, there was one annoying detail that I could not figure out. When you build with -fprofile-generate=./fdoprof, the output gets dumped under ./fdoprof/..., but there seems to be no easy way to know this path within the profiling process. For Google servers this makes collection fragile. The user generally does not have access to the server file system. I implemented a collection mechanism so the user can tell the server "copy the profiles from ./fdoprof to some shared location". But now you need to pass the exact path when building *and* collecting profiles, which may get separated in time, thus the process is fragile. It would be good to keep a copy of the path prefix and have it accessible through a public interface as well. Then once an instrumented binary is built, it will know the root of the profile output directory and thus relieve the user from the responsibility of knowing it. Silvius