Hi -
As I've been writing a Haskell program over the past few months the main problem I encounter is that record field names are not local to the record type, and any systematic way of making them local (eg by prepending "_Tycon_") results in names that are just too clunky, and I feel that identifiers *must* use systematic naming conventions to get code that will be easy to understand and maintain.

Although it is relatively easy to think up a better record system, it has taken me up till now to discover how such a system could be integrated with Haskell as it is at the moment, since it's unlikely that the existing record system will ever disappear, at least not in the next few years, and in fact, there are some good points about the existing record system that I wouldn't like to lose.

I'll motivate the proposal then talk about how it could be implemented.

Motivation
=======

Consider the following:

   data Vector3 a = Vector3{x :: a, y::a,  z::a }

   data Normal3 a = Normal3{x :: a, y::a, z::a }

We've got a problem because (x) has been introduced twice to the top level namespace. In the above example, it could be argued that I should have written:

   data Arr3 a = Arr3 {x::a, y::a, z::a}

   newtype Vector3 a = Vector3 (Arr3 a)
   newtype Normal3 a = Normal3 (Arr3 a)

but for the sake of argument, let's suppose we can't do this - perhaps one of the record types has some different fields as well.

A related problem is suppose I have:

   data Size = {width :: Int, height :: Int}

   data Rect = Rect{x1, y1, x2, y2::Int}

   width :: Rect -> Int
   width Rect{x1,_,x2,_} = x2 - x1

because a record field of Size has the same name as a top level function.

This second conflict can be avoided by always using the rule that record fields begin with an underscore and all other variables don't.

So applying this to the first problem, we have:

   data Vector3 a = Vector3{_x :: a, _y::a,  _z::a }
   data Normal3 a = Normal3{_x :: a, _y::a, _z::a }

Of course we haven't solved it yet!

But now suppose we introduce a new piece of syntactic sugar, and write:

   data Vector3 a = Vector3{.x :: a, .y::a,  .z::a }
   data Normal3 a = Normal3{.x :: a, .y::a, .z::a }

ie putting a '.' before each field name. The intended meaning is that dotted field names do *not* generate top level functions. Instead they allow the compiler to generate instance decls as follows, where we've introduced a new form of identifier, the dotted id, which behaves as a postfix operator which binds more tightly than function application and can also be used as a class name (by the compiler only):

   class (.x) :: a b | a -> b where
       (.x) :: a -> b

   class (.y) :: a b | a -> b where
       (.y) :: a -> b

   class (.z) :: a b | a -> b where
       (.z) :: a -> b

For each dotted id, there is a class defined as above, which is available globally to the whole program as if a module containing an infinite set of class decls as above was exported by the Prelude.

In the module containing the data decl for the record, the compiler inserts the following:

   instance (.x) (Vector3 a) a where
       (.x) v = ... -- compiler generated code to access the field

Then within the rest of the program we can write:

   magSquared :: Num a => Vector3 a -> a
   magSquared v = v.x*v.x + v.y*v.y + v.z*v.z

   -- explicit type when specific function is required
   vec_x = ((.x) :: (Vector3 a -> a))

The advantage of this proposal is that it is completely backwards compatible with records as they are at the moment, and we can choose which fields we want to be dotted and which we want to just keep as normal top level functions.

The only extra thing we need to do is put a dot before the field names we want to access via the dotted syntax, but we in any case needed to use an underscore when we wanted a systematic way to avoid conflicts between field names and other top-level names so there is no extra effort involved.

Implementation
==========

The above could almost be implemented just by parsing a source file containing uses of dotted fields and using a conversion like:

   data Rec a = Rec { .f :: a}

   let
       rec = Rec {.f = 78}    -- dot is used here too
       p = rec.f * rec.f

==>

   data Rec a = Rec a

   instance Dot__f (Rec a) a where
       __dot_f (Rec a) = a

   let
       rec = Rec 78
       p = __dot_f rec * __dot_f rec

So far so good, but the alert reader :-) will have noticed that we now have a *major* problem with abstraction because although we can write:

   module M (Rec) where ...

code in another module can still say rec.f because in Haskell, all instance decls in a module are always exported [1]. This problem may disappear in Haskell' [2]. http://hackage.haskell.org/trac/haskell-prime/ticket/19

Therefore I think the desugaring would need to take place in the compiler so the compiler could avoid exporting the compiler-generated instances when the fields are not present in the module export list.

Regards, Brian.

[1] http://haskell.org/onlinereport/modules.html#import-instances
[2] http://hackage.haskell.org/trac/haskell-prime/ticket/19
--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to