Travis Vitek wrote:
> Martin Sebor wrote:
>> [...]
>> Initially, I had a slight bias for the first approach.
>> I started warming up to the second one because the code
>> reuse seems like a cleaner, better design. But now that
>> we've seen how much more costly in terms of system
>> resources during compilation the second approach is,
>> I think the specialization approach might be the way to
>> go after all, not just for traits, but in general.
>
> Yes, thanks for taking the time to evaluate this. If I had to read the
> code I'd prefer to see the first approach, but it does appear that the
> second approach (explicit specialization for each cv-qualified type) is
> going to reduce compile time costs for us and our users.
>
>> I'd still like to measure the compilation performance of the
>> first alternative to get a more complete picture.
>
> I'm unclear what approach you are talking about here. Are you talking
> about doing more testing of the techniques using the remove_cv<> trait,
> or something else?

We've been discussing three approaches:
1. the one that's currently implemented; let's call it the
   specialization approach
2. one where each _RWSTD_IS_XXX() macro strips the cv-qualifiers
   from the type before passing the type to the implementation
   trait; I'll call it the macro approach
3. one where each __rw_is_xxx trait is derived from another
   __rw_is_xxx_impl trait specialized on __rw_remove_cv<T>;
   I called it the metaprogramming approach
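
To make the three alternatives concrete, here is a rough sketch of each one
for an is_void-style trait. All of the names below (sketch_remove_cv,
spec_is_void, impl_is_void, SKETCH_IS_VOID, meta_is_void) are made up for
illustration and are not the library's actual identifiers or code.

```cpp
// Illustration only: every name here is hypothetical.

// A minimal remove_cv, used by approaches 2 and 3.
template <class T> struct sketch_remove_cv                   { typedef T type; };
template <class T> struct sketch_remove_cv<const T>          { typedef T type; };
template <class T> struct sketch_remove_cv<volatile T>       { typedef T type; };
template <class T> struct sketch_remove_cv<const volatile T> { typedef T type; };

// Approach 1 (specialization): the trait is explicitly specialized
// for every cv-qualified form of the type.
template <class T> struct spec_is_void               { enum { value = 0 }; };
template <> struct spec_is_void<void>                { enum { value = 1 }; };
template <> struct spec_is_void<const void>          { enum { value = 1 }; };
template <> struct spec_is_void<volatile void>       { enum { value = 1 }; };
template <> struct spec_is_void<const volatile void> { enum { value = 1 }; };

// Implementation trait shared by approaches 2 and 3: it only knows
// about the unqualified type.
template <class T> struct impl_is_void { enum { value = 0 }; };
template <> struct impl_is_void<void>  { enum { value = 1 }; };

// Approach 2 (macro): the macro strips the cv-qualifiers before the
// type reaches the implementation trait. As written it only works for
// non-dependent types; a dependent T would need "typename".
#define SKETCH_IS_VOID(T) \
    (impl_is_void<sketch_remove_cv<T>::type>::value)

// Approach 3 (metaprogramming): the public trait derives from the
// implementation trait specialized on the cv-stripped type.
template <class T>
struct meta_is_void : impl_is_void<typename sketch_remove_cv<T>::type> {};
```

With these definitions, spec_is_void<const void>::value,
SKETCH_IS_VOID(const void), and meta_is_void<const void>::value all
evaluate to 1. The difference is in how many templates the compiler has
to instantiate to get there.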
I have now benchmarked all three approaches using varying numbers
of types and template instantiations, and with both gcc 4.3 and EDG
eccp 3.9 on Linux. Here are the normalized results:

                        2000x10        200x100        20x1000
APPROACH               GCC   ECCP     GCC   ECCP     GCC   ECCP
[1] specialization     1.4      1     3.3    2.8    34.1   26.3
[2] macro              1.8    1.2       4    3.8    48.2   49.5
[3] metaprogramming    1.4   67.2     3.5   87.8    63.3  268.5

The column labeled 2000x10 shows normalized times for 2000
groups of implicit instantiations of is_void on 10 distinct types
with all combinations of cv-qualifications. The next column labeled
200x100 is for 200 instantiations on 100 distinct types, and the last
one, labeled 20x1000, is for 20 instantiations on 1000 distinct types.
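
For readers who want a feel for what an "RxT" test case means, here is a
toy "3x2" case: 3 groups of instantiations on 2 distinct types. This is
only a guess at the general shape, not the benchmark source. The names
bench_type, bench_is_void, BENCH_GROUP, and bench_sum are invented for the
example, and the trait shown follows the specialization approach.

```cpp
// Illustration only: a toy "3x2" case with hypothetical names.

// Stand-in for "distinct types": each value of N names a new type.
template <int N> struct bench_type {};

// The trait under test (specialization approach shown here).
template <class T> struct bench_is_void               { enum { value = 0 }; };
template <> struct bench_is_void<void>                { enum { value = 1 }; };
template <> struct bench_is_void<const void>          { enum { value = 1 }; };
template <> struct bench_is_void<volatile void>       { enum { value = 1 }; };
template <> struct bench_is_void<const volatile void> { enum { value = 1 }; };

// One group: the trait applied to a type with every combination of
// cv-qualifiers. Each use of ::value implicitly instantiates the
// corresponding specialization if it has not been instantiated yet.
#define BENCH_GROUP(T)                         \
    (  bench_is_void<T>::value                 \
     + bench_is_void<const T>::value           \
     + bench_is_void<volatile T>::value        \
     + bench_is_void<const volatile T>::value)

enum {
    bench_sum =
        // first group: each specialization is instantiated here
          BENCH_GROUP(bench_type<0>) + BENCH_GROUP(bench_type<1>)
        // second and third groups: the same specializations again
        + BENCH_GROUP(bench_type<0>) + BENCH_GROUP(bench_type<1>)
        + BENCH_GROUP(bench_type<0>) + BENCH_GROUP(bench_type<1>)
};
```

Scaling the number of groups changes how often existing specializations
are requested again, while scaling the number of bench_type<N> arguments
changes how many distinct specializations must be generated.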
According to the table, the specialization approach (1) yields
the best compile-time performance with both compilers. The speed
of both compilers is much more affected by the number of distinct
types (i.e., the number of generated distinct specializations of
the templates) than by the number of instantiations, suggesting
that they cache each instantiated specialization and reuse it to
satisfy the next instantiation request. That said, the eccp numbers
for approach (3) are a little mystifying.

Martin