So, we know the standard representation of a sum type is:

        struct _uctor_ { int variant; void *data; };

which is layout compatible with

        int * address

however there are various special representations. For example
if a variant has no argument, it's just represented by the "int" tag.
Safe, since proper use shouldn't fetch the data field for that tag.

Another special one is for a form where there are exactly two
variants, the first having no argument, then we just use

        void *data

will NULL being the first variant, and a non-null address being
the address of the second one. This is efficient but it was chosen
for a more fundamental reason than performance: it means any
"possibly NULL" C pointer has a natural representation in Felix
as a union:

        union maybe_ptr[T] = Null | Ptr of T;

In other words, a possibly NULL C pointer can not only be
considered logically equivalent to an option type with
type T a pointer, it is layout compatible with one.

Another example of this is 

        union list[T] = Empty | Cons of T * list[T]

which is naturally represented by

        struct list { T data; list *next; }

where the next field is NULL at the end of the list.

UNFORTUNATELY there is a natural isomorphism for unions
which is broken by all of this:

        T + T + T + ... + T = N * T

This is the dual of

        T * T * T ... * T = T ^ N

which is true by definition. We would like

        gproj: T ^ N -> N -> N * T

This is the generalised projection specialised to arrays.
In general

        gproj: P -> INT -> ~P

where ~P is the dual of P. For example:

        gproj : int * long -> 2 -> int + long

allows one to select a tuple projection at run time, i.e with
an expression instead of a constant. This allows dynamically
scanning a heterogenous datatype. The degenerate array
case reduces to returning the selected array value AND
the index.

Returning key/value pairs is natural for associative stores
(dictionaries, hash tables, etc) when iterating. As it turns
out we now see why array indexing should do the same:
if we do this we can throw out arrays entirely!!!

You have to consider this data type:

        (int + long) * (int + long)

which is an array

        (int + long) ^ 2

and thus can be indexed dynamically, the returned value
from a standard (NOT generalised) array projection has type

        int + long

What we've done is CONVERT the tuple type int * long into
an array.

Basically this operation COMMUTES with the generalised projection.

Ok, so the conflict here is: if we retain our special representation
of certain unions, if it's to be layout compatible with tuple

        N * T

then that tuple type has to use the same representation. This really
complicates things. If we use a simple representation, we lose both
performance and C compat.

There's more (of course!).

One of the reasons for using unions is that you can pattern match.
Duh!

Pattern matching works with special union representations.
the compiler generates calls to C macros which do the required
extractions to get the variant tag and data value.

BUT .. we also have user defined pattern matching.
It works the same way: you define a function that calculates
the variant tag and checks against a supplied value,
and another that extracts the data. If you match a non-union type
with these functions available the compiler uses them.

The point? We can throw out the special union representations
used by the compiler! At a price!

For example, instead of our definition of a list:

        union list[T] = Empty | Cons of T * list[T];

we can use an abstract type, with the required representation,
and just provide a match checker and match extractor.
Instead of the injections Empty and Cons, we provide ordinary
functions as constructors. I.e. we have user defined constructors
and destructors and an abstract type instead of the compiler generated
constructors and destructors. Then we can use any representation
we want.

There's more! Of course!!!! 

Felix has "compact linear types" which are special compact representations
of certain types such as 3 * 4. We compress the value into a single integer.
We do this because we want polyadic arrays. This allows a simple cast
of a multi-index of type

        3 * 4

into a single integer, allowing a matrix to be accessed as a linear array.

But compact linear types have a serious problem! The components of
a compact linear product aren't addressable by an ordinary pointer.

So our method of eliminating lvalues, to provide projections of pointers
to products:

        var x = "Hello", 42;
        (&x) . 1 <- 99;

fails for compact linear types because the LHS of <- has to be a pointer:

        var x = (case 1 of 3), (case 2 of 4);
        (&x) . 1 <- case 3 of 4; // FAILS

We would need a fat pointer here (an ordinary pointer PLUS a divisor
and a modulo value to shift and mask the compact linear value).

The solution? It may well be "user defined projections".
Which would allow us to THROW OUT compact linear types
from the compiler (put it in the library instead).

So you can see representations are crucial and hard to get right.
Ideally the compiler should use only basic ones. Smart compact
ones are better left to the library, if only we can get enough 
"hooks" to do it.

We DO want our isomorphisms to work as naturally as possible.
In particular the more cases where commuting operations
are represented by casts (i.e. no change in the actual representation)
the better.

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org




------------------------------------------------------------------------------
Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
from Actuate! Instantly Supercharge Your Business Reports and Dashboards
with Interactivity, Sharing, Native Excel Exports, App Integration & more
Get technology previously reserved for billion-dollar corporations, FREE
http://pubads.g.doubleclick.net/gampad/clk?id=164703151&iu=/4140/ostg.clktrk
_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language

Reply via email to