Hello all, Here are some ideas on... Extending Haskell With Attributes --------------------------------- Below I describe a way in which attribute grammars could be added to Haskell, using Bird's replace-minimum as a running example. None of this has yet been implemented, and it might very well turn out that part of these ideas must be revised when implementation is attempted. I'd appreciate to hear your comments on this. Note that I do not explain the How, What, and Why of attribute grammars. --Syntax and semantics-- In a given program context, every value within a type has a fixed set of typed attributes. The names of these attributes are simple identifiers that are unique within this set, and every attribute is either synthesized or inherited. For example, given the datatype > data Tree = Leaf Int | Fork Tree Tree one can declare that every Tree throughout the program has a min attribute by adding this top-level declaration: > attr Tree where synthesized min :: Int For values that have multiple types (such as the empty list [] ) we demand that the attributes of a more generally typed value form a subset of those of a more specificly typed value. For instance, if all values of type [a] have a synthesized integer length attribute, then also all values of types [Int] and [Char] have one: > attr [a] where synthesized length :: Int Conceptually, attribute values are stored in bindings. This makes it possible to decorate the same Tree with different attribute values in different contexts. Attribute values are indicated by these notations: variable^attributeName variable!attributeName The first is used for synthesized attributes, the second for inherited ones. (Note that this notation conflicts with current Haskell operators; ideally we would use upward and downward arrows.) To compute the smallest value of a Tree we could simply take its min attribute: > minTree :: Tree -> Int > minTree t = t^min But for this to work we must also say how the min attribute is calculated: > eq t :: Tree of > Leaf n where t^min = n > Fork l r where t^min = l^min `min` r^min These equations relate the attribute values of a Tree with those of its subvalues, which are described using pattern matching. (It is also possible to use guards to limit matching of a pattern.) To demonstrate the use of inherited attributes, we add an inherited attribute num that is simply distributed throughout a tree. > attr Tree where inherited num :: Int > eq t :: Tree of > Fork l r where l!num = t!num > r!num = t!num Finally, we add a synthesized attribute containing a tree of the same structure, but with every leaf replaced with the num attribute: > attr Tree where synthesized tree :: Tree > eq t :: Tree of > Leaf n where t^tree = Leaf t!num > Fork l r where t^tree = Fork l^tree r^tree Now we can implement Bird's replace-minimum example by > repMin :: Tree -> Tree > repMin t = t^tree where t!num = t^min This transforms a tree into one with the same structure, but every leaf containing the smallest value of the tree. --Remarks-- There are a couple of remarks to be made about the above syntax and semantics. It is not strictly necessary to use different operators for inherited and synthesized attributes, but the distinction gives better readablity, and it enables the compiler/interpreter/translator to catch more errors. When using the translation detailed below, it is easy to add guards to the pattern matching in an equations declaration. We have stated that, in a given context, every value (within a type) has a fixed set of attributes. In the terminology of Johnsson, this means that every value has one attribute sort. (It is therefore not necessary to add attribute sort annotations to bindings.) To emulate attribute sorts, the programmer can declare a new type with the same structure as an existing one, and give it different attributes. Every kind of value can be attributed: we are not restricted to abstract datatypes. We have associated attributes with values, instead of with types. This has been done to support the use of local attributes, which are only attached to a limited set of values from a datatype. The syntax above could be extended to allow `remote upward attribute sets' as in SSL; these are easily translated into additional attributes. --Translation-- The implementation of the new declarations is most easily done by translating to `native' Haskell, roughly as follows: * Translate the equations to so-called visit functions; for each type, these compute the synthesized attributes from the value and its inherited attributes. The attr declarations give their types, and the eq declarations their definitions. * Add calls to the visit functions in places where synthesized attributes are used. * Replace all signs ^ and ! by an apostrophe. In attribute grammar literature this is known as the CIRC mapping. For the above example, this leads to > visitTree :: Tree -> (Int, Tree) > visitTree (Leaf n) t'num = (t'min, t'tree) > where t'min = n > t'tree = Leaf t'num > visitTree (Fork l r) t'num = (t'min, t'tree) > where t'min = l'min `min` r'min > l'num = t'num > r'num = t'num > t'tree = Fork l'tree r'tree > (l'min, l'tree) = visitTree l l'num > (r'min, r'tree) = visitTree r r'num > minTree :: Tree -> Int > minTree t = t'min > where (t'min, _) = visitTree t undefined > repMin :: Tree -> Tree > repMin t = t'tree > where t'num = t'min > (t'min, t'tree) = visitTree t t'num (Note: this code has not been tested.) The important thing about this translation is that it can be done automatically. Of course, an implementation may use a different translation, or may even choose to implement attribute decoration on dedicated hardware, as long as the result is exactly equivalent to the above translation. The easiest implementation is probably as as pre-processor, feeding the result into a native Haskell compiler. A disadvantage of this approach is that it might become more difficult to trace errors. Any comments? <>< Marnix -- Marnix Klooster [EMAIL PROTECTED] // [EMAIL PROTECTED]
