On Nov 8, 2006, at 5:59 AM, Doug Gregor wrote:
However, this approach could have some odd side effects when there are
multiple mappings within one context. For instance, we could have
something like:
typedef int foo_t;
typedef int bar_t;
foo_t* x = strlen("oops");
x is a decl, the decl has a type, the context of that instance of the
type is x.
map(int,x) == foo_t.
It is this because we know that foo_t was used to create x, and we
set map(int,x) equal to foo_t as x is created.
It can never be wrong, because any use of the
type that would have had the wrong value comes from a specific
context, and for that specific context map(type,context) can be set to
_any_ value, including the right value. Put another way, any time
one carries a type around outside of a specific context, one also
needs to carry around the context the type came from.
The error message that pops out would likely reference "bar_t *"
map(int,x) doesn't yield bar_t.
This approach wouldn't help with the implementation of concepts,
because we need to be able to take two distinct types (say, template
type parameters T and U) and consider them equivalent in the type
system.
I'd need to see more specifics, but from just the above: any data
you need that would make them different, you put into
map(type,context); we're not restricted to just the typedef name. Once
you do that, you discover that what's left is identical, and
since the remainders are identical, they have the same address, and
the same address makes them the same type.
The two things this doesn't work on are two different notions of
equality (my scheme, unaltered, can only handle one definition of
equality) and some of the temporal aspects, like: we didn't know T1
and T2 were the same before, but now we do, because they are both
int. The latter case I'm expecting not to be an issue, as to form
the type, you do the substitution, and after you do it, you replace
T1 with int (creating data in map(int,context) if you later need to
know this was a T1 for any reason, such as debugging or error
messages). These bubble up and one is left with the real type, and
then equality remains fast post-substitution. Reasoning about type
equality pre-substitution remains slow.
You can even get fast unsubstituted comparisons for a particular
definition of equality. You boost the substitution bits out as
variants, and notice that you then have nothing left, and nothing is
nothing, so the values wind up being the same again. Now to get
comptypes to work, you just have to add code to compare the boosted
variants at the top of comptypes. Before you say that that is as bad
as what we had before, no, it isn't. If the canonical types are
unequal, you can immediately fail the comparison; this takes care of
90% of the calls. After that you check the variants for equality and
return that. The one address compare doesn't hit memory and can
answer most of the questions by itself. The variants are all on one
cache line, and if the cost to compare them is cheap, it is just two
memory hits.
We can't literally combine T and U into a single canonical
type node, because they start out as different types.
?
Granted, we could layer a union-find implementation (that better
supports concepts) on top of this approach.
Ah, but once you break the fundamental property that different
addresses imply different types, you limit things to structural
equality, and that is slow.
type = type_representative (TREE_TYPE (exp));
if (TREE_CODE (type) == REFERENCE_TYPE)
type = TREE_TYPE (type);
We could find all of these places by "poisoning" TREE_CODE for
TYPE_ALIAS_TYPE nodes, then patch up the compiler to make the
appropriate type_representative calls. We'd want to save the original
type for diagnostics.
Or, you can just save the context the type came from:
type = TREE_TYPE (exp);
type_context = &TREE_TYPE (exp);
same amount of work on the use side, but much faster equality checking.
An alternative to poisoning TREE_CODE would be to have TREE_TYPE do
the mapping itself and have another macro to access the original
(named) type:
#define TREE_TYPE(NODE) type_representative ((NODE)->common.type)
#define TREE_ORIGINAL_TYPE(NODE) ((NODE)->common.type)
Likewise, given those, we could do:
#define TREE_TYPE(NODE) ((NODE)->common.type)
#define TREE_ORIGINAL_TYPE(NODE) \
  (map ((NODE)->common.type, &(NODE)->common.type) \
   ? map ((NODE)->common.type, &(NODE)->common.type) \
   : (NODE)->common.type)
and remain fast for equality.
Since we know that type canonicalization is incremental, could we
work toward type canonicalization in the GCC 4.3 time frame?
If by "we" you mean you, I don't see why that would be a bad
idea. :-) The risk is that if one invests all this effort, and the
win turns out to be < 1% on real code and 10x on benchmark code, one
feels bad.
ConceptGCC has hit the point where compile times have gotten
prohibitive, and we need to illustrate that concepts can be compiled
efficiently. So, I'm stuck working toward type canonicalization in
ConceptGCC for performance reasons. Since type comparison is literally
50% of my compile time now, I'm sure to win. I just don't know if
what I come up with will be a win when concepts aren't present.
:-) At 50%, I think it is clear that if you require types to be
fully canonical, type comparisons trivially go from linear to
constant time, where the constant is around one machine instruction
that doesn't touch memory, not one byte. Trivially, this gets a 2x
faster compiler. We really, really want a 2x faster compiler.
Now, what's the cost? In normal C++, debug information needs to know
the original names, error messages need to know them, and we have to
run a hash for them. The cost is log n for each, instead of
constant. I'd like to think this cost is paid for by the improvement
from linear-time type checking, as error messages are either never
printed or we don't care about compilation speed when they are, and
debug information is only emitted once, if at all. Type comparisons
are done often.
The big question is whether I end up doing this work in ConceptGCC
(only) or also in FSF GCC, and if anyone is willing to help with the
FSF GCC version.
I'd endorse putting type canonicalization into mainline if it were
the same speed or faster. I think/hope it would be a speed win.
However, I don't know that it would be; that'd be the risk of doing
the work. A bad hash could easily kill performance. For the 50%
case, I'm fairly sure it would be a speed win. I think for
template-heavy code, it would be a speed win.