[Haskell-cafe] A backwards-compatible record proposal

Brian Hulley Sat, 19 Aug 2006 01:21:17 -0700

Hi -

As I've been writing a Haskell program over the past few months the mainproblem I encounter is that record field names are not local to the recordtype, and any systematic way of making them local (eg by prepending"_Tycon_") results in names that are just too clunky, and I feel thatidentifiers *must* use systematic naming conventions to get code that willbe easy to understand and maintain.

Although it is relatively easy to think up a better record system, it hastaken me up till now to discover how such a system could be integrated withHaskell as it is at the moment, since it's unlikely that the existing recordsystem will ever disappear, at least not in the next few years, and in fact,there are some good points about the existing record system that I wouldn'tlike to lose.


I'll motivate the proposal then talk about how it could be implemented.

Motivation
=======

Consider the following:

   data Vector3 a = Vector3{x :: a, y::a,  z::a }

   data Normal3 a = Normal3{x :: a, y::a, z::a }

We've got a problem because (x) has been introduced twice to the top levelnamespace. In the above example, it could be argued that I should havewritten:


   data Arr3 a = Arr3 {x::a, y::a, z::a}

   newtype Vector3 a = Vector3 (Arr3 a)
   newtype Normal3 a = Normal3 (Arr3 a)

but for the sake of argument, let's suppose we can't do this - perhaps oneof the record types has some different fields as well.


A related problem is suppose I have:

   data Size = {width :: Int, height :: Int}

   data Rect = Rect{x1, y1, x2, y2::Int}

   width :: Rect -> Int
   width Rect{x1,_,x2,_} = x2 - x1

because a record field of Size has the same name as a top level function.

This second conflict can be avoided by always using the rule that recordfields begin with an underscore and all other variables don't.


So applying this to the first problem, we have:

   data Vector3 a = Vector3{_x :: a, _y::a,  _z::a }
   data Normal3 a = Normal3{_x :: a, _y::a, _z::a }

Of course we haven't solved it yet!

But now suppose we introduce a new piece of syntactic sugar, and write:

   data Vector3 a = Vector3{.x :: a, .y::a,  .z::a }
   data Normal3 a = Normal3{.x :: a, .y::a, .z::a }

ie putting a '.' before each field name. The intended meaning is that dottedfield names do *not* generate top level functions. Instead they allow thecompiler to generate instance decls as follows, where we've introduced a newform of identifier, the dotted id, which behaves as a postfix operator whichbinds more tightly than function application and can also be used as a classname (by the compiler only):


   class (.x) :: a b | a -> b where
       (.x) :: a -> b

   class (.y) :: a b | a -> b where
       (.y) :: a -> b

   class (.z) :: a b | a -> b where
       (.z) :: a -> b

For each dotted id, there is a class defined as above, which is availableglobally to the whole program as if a module containing an infinite set ofclass decls as above was exported by the Prelude.

In the module containing the data decl for the record, the compiler insertsthe following:


   instance (.x) (Vector3 a) a where
       (.x) v = ... -- compiler generated code to access the field

Then within the rest of the program we can write:

   magSquared :: Num a => Vector3 a -> a
   magSquared v = v.x*v.x + v.y*v.y + v.z*v.z

   -- explicit type when specific function is required
   vec_x = ((.x) :: (Vector3 a -> a))

The advantage of this proposal is that it is completely backwards compatiblewith records as they are at the moment, and we can choose which fields wewant to be dotted and which we want to just keep as normal top levelfunctions.

The only extra thing we need to do is put a dot before the field names wewant to access via the dotted syntax, but we in any case needed to use anunderscore when we wanted a systematic way to avoid conflicts between fieldnames and other top-level names so there is no extra effort involved.


Implementation
==========

The above could almost be implemented just by parsing a source filecontaining uses of dotted fields and using a conversion like:


   data Rec a = Rec { .f :: a}

   let
       rec = Rec {.f = 78}    -- dot is used here too
       p = rec.f * rec.f

==>

   data Rec a = Rec a

   instance Dot__f (Rec a) a where
       __dot_f (Rec a) = a

   let
       rec = Rec 78
       p = __dot_f rec * __dot_f rec

So far so good, but the alert reader :-) will have noticed that we now havea *major* problem with abstraction because although we can write:


   module M (Rec) where ...

code in another module can still say rec.f because in Haskell, all instancedecls in a module are always exported [1]. This problem may disappear inHaskell' [2]. http://hackage.haskell.org/trac/haskell-prime/ticket/19

Therefore I think the desugaring would need to take place in the compiler sothe compiler could avoid exporting the compiler-generated instances when thefields are not present in the module export list.


Regards, Brian.

[1] http://haskell.org/onlinereport/modules.html#import-instances
[2] http://hackage.haskell.org/trac/haskell-prime/ticket/19
--
Logic empowers us and Love gives us purpose.
Yet still phantoms restless for eras long past,
congealed in the present in unthought forms,
strive mightily unseen to destroy us.

http://www.metamilk.com

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] A backwards-compatible record proposal

Reply via email to