Replying to 3 of Ben's 4 emails in one go.

On Wed, 27 Apr 2011 13:22:30 +1000 Ben Lippmeier wrote:

> In Base.ds are the following comments (which I only added at the
> beginning of the year). This was my current plan, though I'm open
> to suggestions.
> 
> -- | A Void# is used as the argument and/or return type for primitive
> --   functions that do not take/return real values.
> --   Don't use (Ptr# Void#) after C. Use Addr# for that instead.
> data Void#
> 
> -- | A pointer to some thing.
> --   A (Ptr# a) should always be a valid pointer, correctly aligned for
> --   addressing some type 'a'. It should also point to an actual 'a', except
> --   in the case where we've just deallocated memory with 'free' or similar.
> --   Do not use to hold NULL pointers. Instead, test a possibly zero valued
> --   Addr# before casting it to the appropriate Ptr# type.
> data Ptr# a
> 
> -- | A raw store address, with enough precision to directly address any byte
> --   of memory reachable from the process. Equivalent to (void*) from C.
> --   An Addr# has a slightly different meaning to Ptr#, as the actual Addr#
> --   value isn't necessarily expected to point to an actual object, or even
> --   to memory owned by the running process.
> foreign import data "Addr"    Addr#

Somehow I missed that. That's a reasonably nice solution to what
in C would be a void* (and in LLVM has to be a char*).

> With that in mind, I think the type should really be:
> 
>     foreign import "primStore_peekDataRS_payload"
>     peekDataRS_payload
>       :: forall a. a -> Addr#
> 
> So the type Addr# is completely uncommitted to what data might be at that
> address. You can represent an Addr# as a (void*) in C though, or whatever
> seems reasonable in LLVM. To read the Int32 we'd cast the Addr# to a
> (Ptr# Int32), then read from that.

Ok, I'm cool with that.
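
To check my understanding of the Addr#/Ptr# split, here is a rough
sketch in plain GHC Haskell rather than Disciple. The names Addr, toPtr,
readInt32 and demo are mine and purely illustrative; Addr is only a
stand-in for Addr#, built on GHC's Foreign.Ptr, and none of this is the
actual Disciple runtime code.

```haskell
import Data.Int (Int32)
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr, castPtr, nullPtr)
import Foreign.Storable (peek)

-- Hypothetical stand-in for Addr#: an address that is uncommitted to
-- what it points at, like (void*) in C.
newtype Addr = Addr (Ptr ())

-- Test a possibly zero-valued address before casting it to a typed
-- pointer, so a typed pointer never holds NULL.
toPtr :: Addr -> Maybe (Ptr a)
toPtr (Addr p)
  | p == nullPtr = Nothing
  | otherwise    = Just (castPtr p)

-- Read an Int32 by first committing the address to (Ptr Int32).
readInt32 :: Addr -> IO (Maybe Int32)
readInt32 addr = case toPtr addr of
  Nothing -> return Nothing
  Just p  -> Just <$> peek (p :: Ptr Int32)

-- Store an Int32 in temporary memory, forget its type, read it back.
demo :: IO (Maybe Int32)
demo = with (42 :: Int32) $ \p -> readInt32 (Addr (castPtr p))
```

The cast is still unchecked, of course; the only thing the types buy us
here is that the NULL test cannot be skipped.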
 
On Wed, 27 Apr 2011 13:40:40 +1000 Ben Lippmeier wrote:

> The RS stands for "Raw Small". A DataRS object can contain raw data
> of any length. By "raw", I mean values that are not pointers that
> the GC needs to worry about.

<snip>

> I think we'll still need coercePtr for now as it's the base library
> code itself that defines the layout of these objects.

Yes, I thought coercePtr would still be needed.

> The "right way" of fixing it would be to follow GHC's approach, where
> we can define data constructors that have unboxed fields.

<snip>

> In the long term I'd hope to be able to define data constructors with
> unboxed fields, then use the projection mechanism to access the fields:
> 
> data Int
>     = Int { unboxed :: Int# }
> 
>  boxedX   = 5
>  unboxedX = boxedX.unboxed

Now that idea I really like.
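
For comparison, GHC can already express something close to this with
the MagicHash extension. The sketch below is GHC Haskell, not Disciple,
and BoxedInt, fromInt, toInt and addBoxed are names I made up for
illustration. GHC uses a field selector function where Disciple would
use the projection syntax.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, (+#))

-- A boxed integer whose single field is a raw machine integer.
data BoxedInt = BoxedInt { unboxed :: Int# }

-- Wrap an ordinary Int by reusing its unboxed payload.
fromInt :: Int -> BoxedInt
fromInt (I# n) = BoxedInt n

-- Convert back to an ordinary Int via the field selector.
toInt :: BoxedInt -> Int
toInt b = I# (unboxed b)

-- The compiler knows the layout of BoxedInt, so the field accesses are
-- type checked and need no pointer coercion.
addBoxed :: BoxedInt -> BoxedInt -> BoxedInt
addBoxed x y = BoxedInt (unboxed x +# unboxed y)
```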
 
> The point here is that the compiler can work out that boxedX has
> type Int, and it knows what the internal structure of that is. In
> contrast, all peekDataRS_payload knows is that it's been passed an
> object of type "a", which could be anything.

In fact it can't even figure out if the object actually contains a
DataRS or whether it's something else.


On Wed, 27 Apr 2011 13:48:42 +1000 Ben Lippmeier wrote:

> On 25/04/2011, at 11:08 AM, Erik de Castro Lopo wrote:
> 
> > Looking at this some more, the call to primAlloc_dataRS really
> > concerns me. Its two parameters are magic numbers, but I know
> > that they are actually the tag that goes in the Obj and the 
> > size of the payload in the DataRS object. However, how does
> > someone else figure that out and make sure it's correct?
> 
> They can't. At some point in the implementation you just have to
> trust that the code is correct.

True, but you don't give regular scissors to a child; you give them
safety scissors so they are less likely to hurt themselves.
Similarly, if you are designing a high level language, I don't see
any reason why the language should provide the programmer with
features that can so easily be abused to create unsafe programs.

I also think it's possible to remove potentially unsafe language
features and still produce a high level language capable of
replacing C for systems programming. In some cases this may mean
moving things from the library to the compiler.

This is very much applying Yaron Minsky's "Make illegal states
unrepresentable" idea to language design.

> There is work on type preserving compilation that maintains type
> information all the way to the assembly level, but that's another
> life's work. Simon Winwood did his thesis on this stuff.

Yes, but with LLVM we are already quite a long way there. LLVM
allows some kinds of casts, but is much stricter about types
than the average C compiler.

<snip>

> If you implement a function like boxPrim, you should give it a
> type class constraint that requires the argument to actually be
> a primitive unboxed type.
> 
> class Boxable a b where
>  box    :: a -> b
>  unbox  :: b -> a 
> 
> You'd then bake the valid instances into the compiler, but allow
> users to define their own if they want.
> 
> instance Boxable Int32# Int32 where
>   box   = ...
>   unbox = ... 
> 
> or something.

Again, that's an idea I really like. Nice!
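
A rough GHC Haskell approximation of the Boxable idea, to see how it
might hang together. This is only a sketch under some assumptions of
mine: I use Int#/Int instead of Int32#/Int32, I pin the kind of the
unboxed parameter to GHC's IntRep, and I added functional dependencies
so that each type determines the other. The helper roundTrip is also
just for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import GHC.Exts (Int (I#), Int#, RuntimeRep (IntRep), TYPE)

-- 'a' is the primitive unboxed type, 'b' is its boxed counterpart.
-- The kind annotation means only unboxed integer-like types fit 'a'.
class Boxable (a :: TYPE 'IntRep) b | a -> b, b -> a where
  box   :: a -> b
  unbox :: b -> a

-- The kind of instance that would be baked into the compiler.
instance Boxable Int# Int where
  box x        = I# x
  unbox (I# x) = x

-- Unbox then box again; the functional dependency picks Int# as the
-- intermediate type.
roundTrip :: Int -> Int
roundTrip n = box (unbox n)
```

The nice property is that calling box on something that isn't a
primitive unboxed type is a compile time error, not a runtime crash.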

At this stage I think I should drop my original idea and look into

  0) Fixing peekDataRS_payload so it has the signature
        :: forall a. a -> Addr#

  1) Doing one or both of these:

        a) data Int32 = Int { unboxed :: Int32# }

        b) class Boxable


Cheers,
Erik
-- 
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

-- 
Disciple-Cafe mailing list
http://groups.google.com/group/disciple-cafe
