Re: ArrayArrays

2015-09-08 Thread Simon Marlow

On 08/09/2015 13:10, Simon Peyton Jones wrote:


|  Without any unlifted kind, we need
|- ArrayArray#
|- a set of new/read/write primops for every element type,
|  either built-in or made from unsafeCoerce#
|
|  With the unlifted kind, we would need
|- ArrayArray#
|- one set of new/read/write primops
|
|  With levity polymorphism, we would need
|- none of this, Array# can be used

I don't think levity polymorphism will work here.  The code for a function 
needs to know whether an intermediate value of type 'a' is strict or not. It 
HAS to choose (unless we compile two versions of every function).  So I don't 
see how to be polymorphic over a type variable that can range over both lifted 
and unlifted types.

The only reason that 'error' is levity-polymorphic over both lifted and 
unlifted types is that it never returns!
   error :: forall (a :: AnyKind).  String -> a
the code for error never manipulates a value of type 'a', so all is well.  But 
it's an incredibly special case.


I think there's a bit of confusion here, Ed's email a bit earlier 
described the proposal for the third option above: 
https://mail.haskell.org/pipermail/ghc-devs/2015-September/009867.html


For generalising these primops it would be fine, there are no thunks 
being built.


Cheers
Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: ArrayArrays

2015-09-08 Thread Simon Peyton Jones

|  Without any unlifted kind, we need
|- ArrayArray#
|- a set of new/read/write primops for every element type,
|  either built-in or made from unsafeCoerce#
|  
|  With the unlifted kind, we would need
|- ArrayArray#
|- one set of new/read/write primops
|  
|  With levity polymorphism, we would need
|- none of this, Array# can be used

I don't think levity polymorphism will work here.  The code for a function 
needs to know whether an intermediate value of type 'a' is strict or not. It 
HAS to choose (unless we compile two versions of every function).  So I don't 
see how to be polymorphic over a type variable that can range over both lifted 
and unlifted types.

The only reason that 'error' is levity-polymorphic over both lifted and 
unlifted types is that it never returns!
  error :: forall (a :: AnyKind).  String -> a
the code for error never manipulates a value of type 'a', so all is well.  But 
it's an incredibly special case.

Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-09-08 Thread Simon Marlow
;>>>>>
  >> >>>>>>> I think 90% of the
needs I have
 are covered just by adding the one
  >> >>>>>>> primitive. The last 10%
gets
 pretty invasive.
  >> >>>>>>>
  >> >>>>>>> -Edward
  >> >>>>>>>
  >> >>>>>>> On Fri, Aug 28, 2015 at
5:30 PM,
 Ryan Newton mailto:rrnew...@gmail.com>
 <mailto:rrnew...@gmail.com
<mailto:rrnew...@gmail.com>>>
  >> >>>>>>> wrote:
  >> >>>>>>>>
  >> >>>>>>>> I like the possibility
of a
 general solution for mutable structs
  >> >>>>>>>> (like Ed said), and
I'm trying
 to fully understand why it's hard.
  >> >>>>>>>>
  >> >>>>>>>> So, we can't unpack
MutVar into
 constructors because of object
      >> >>>>>>>> identity problems. But
what
 about directly supporting an
  >> >>>>>>>> extensible set of
  >> >>>>>>>> unlifted MutStruct#
objects,
 generalizing (and even replacing)
  >> >>>>>>>> MutVar#? That
  >> >>>>>>>> may be too much work,
but is it
 problematic otherwise?
  >> >>>>>>>>
  >> >>>>>>>> Needless to say, this
is also
 critical if we ever want best in
  >> >>>>>>>> class
  >> >>>>>>>> lockfree mutable
structures,
 just like their Stm and sequential
  >> >>>>>>>> counterparts.
  >> >>>>>>>>
  >> >>>>>>>> On Fri, Aug 28, 2015
at 4:43 AM
 Simon Peyton Jones
  >> >>>>>>>> mailto:simo...@microsoft.com>
 <mailto:simo...@microsoft.com
<mailto:simo...@microsoft.com>>> wrote:
  >> >>>>>>>>>
  >> >>>>>>>>> At the very least
I'll take
 this email and turn it into a short
  >> >>>>>>>>> article.
  >> >>>>>>>>>
  >> >>>>>>>>> Yes, please do make
it into a
 wiki page on the GHC Trac, and
  >> >>>>>>>>> maybe
  >> >>>>>>>>> make a ticket for it.
  >> >>>>>>>>>
  >> >>>>>>>>>
  >> >>>>>>>>> Thanks
  >> >>>>>>>>>
  >> >>>>>>>>>

Re: ArrayArrays

2015-09-08 Thread Edward Kmett
>> >>>>>>
>>  >> >>>>>>
>>  >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM
>> Edward Kmett > <mailto:ekm...@gmail.com>>
>>  >> >>>>>> wrote:
>>  >> >>>>>>>
>>  >> >>>>>>> Some form of MutableStruct# with
>> a known number of words and a
>>  >> >>>>>>> known
>>  >> >>>>>>> number of pointers is basically
>> what Ryan Yates was suggesting
>>  >> >>>>>>> above, but
>>  >> >>>>>>> where the word counts were
>> stored in the objects themselves.
>>  >> >>>>>>>
>>  >> >>>>>>> Given that it'd have a couple of
>> words for those counts it'd
>>  >> >>>>>>> likely
>>  >> >>>>>>> want to be something we build in
>> addition to MutVar# rather than a
>>  >> >>>>>>> replacement.
>>  >> >>>>>>>
>>  >> >>>>>>> On the other hand, if we had to
>> fix those numbers and build info
>>  >> >>>>>>> tables that knew them, and
>> typechecker support, for instance, it'd
>>  >> >>>>>>> get
>>  >> >>>>>>> rather invasive.
>>  >> >>>>>>>
>>  >> >>>>>>> Also, a number of things that we
>> can do with the 'sized' versions
>>  >> >>>>>>> above, like working with evil
>> unsized c-style arrays directly
>>  >> >>>>>>> inline at the
>>  >> >>>>>>> end of the structure cease to be
>> possible, so it isn't even a pure
>>  >> >>>>>>> win if we
>>  >> >>>>>>> did the engineering effort.
>>  >> >>>>>>>
>>  >> >>>>>>> I think 90% of the needs I have
>> are covered just by adding the one
>>  >> >>>>>>> primitive. The last 10% gets
>> pretty invasive.
>>  >> >>>>>>>
>>  >> >>>>>>> -Edward
>>  >> >>>>>>>
>>  >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM,
>> Ryan Newton > <mailto:rrnew...@gmail.com>>
>>  >> >>>>>>> wrote:
>>  >> >>>>>>>>
>>  >> >>>>>>>> I like the possibility of a
>> general solution for mutable structs
>>  >> >>>>>>>> (like Ed said), and I'm trying
>> to fully understand why it's hard.
>>  >> >>>>>>>>
>>  >> >>>>>>>> So, we can't unpack MutVar into
>> constructors because of object
>&

Re: ArrayArrays

2015-09-08 Thread Simon Marlow
;>>
 >> >>>>>>> Given that it'd have a couple of
words for those counts it'd
 >> >>>>>>> likely
 >> >>>>>>> want to be something we build in
addition to MutVar# rather than a
 >> >>>>>>> replacement.
 >> >>>>>>>
 >> >>>>>>> On the other hand, if we had to
fix those numbers and build info
 >> >>>>>>> tables that knew them, and
typechecker support, for instance, it'd
 >> >>>>>>> get
 >> >>>>>>> rather invasive.
 >> >>>>>>>
 >> >>>>>>> Also, a number of things that we
can do with the 'sized' versions
 >> >>>>>>> above, like working with evil
unsized c-style arrays directly
 >> >>>>>>> inline at the
 >> >>>>>>> end of the structure cease to be
possible, so it isn't even a pure
 >> >>>>>>> win if we
 >> >>>>>>> did the engineering effort.
 >> >>>>>>>
 >> >>>>>>> I think 90% of the needs I have
are covered just by adding the one
 >> >>>>>>> primitive. The last 10% gets
pretty invasive.
 >> >>>>>>>
 >> >>>>>>> -Edward
 >> >>>>>>>
 >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM,
Ryan Newton mailto:rrnew...@gmail.com>>
 >> >>>>>>> wrote:
 >> >>>>>>>>
 >> >>>>>>>> I like the possibility of a
general solution for mutable structs
     >> >>>>>>>> (like Ed said), and I'm trying
to fully understand why it's hard.
 >> >>>>>>>>
 >> >>>>>>>> So, we can't unpack MutVar into
constructors because of object
 >> >>>>>>>> identity problems. But what
about directly supporting an
 >> >>>>>>>> extensible set of
 >> >>>>>>>> unlifted MutStruct# objects,
generalizing (and even replacing)
 >> >>>>>>>> MutVar#? That
 >> >>>>>>>> may be too much work, but is it
problematic otherwise?
 >> >>>>>>>>
 >> >>>>>>>> Needless to say, this is also
critical if we ever want best in
 >> >>>>>>>> class
 >> >>>>>>>> lockfree mutable structures,
just like their Stm and sequential
 >> >>>>>>>> counterparts.
 >> >>>>>>>>
 >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM
Simon Peyton Jones
 >> >>>>>>>> mailto:simo...@microsoft.com>> wrote:
 >> >>>>>>>>>
 >> >&g

Re: ArrayArrays

2015-09-07 Thread Edward Kmett
Assume we had the ability to talk about Levity in a new way and instead of
just:

data Levity = Lifted | Unlifted

type * = TYPE 'Lifted
type # = TYPE 'Unlifted

we replace had a more nuanced notion of TYPE parameterized on another data
type:

data Levity = Lifted | Unlifted
data Param = Composite | Simple Levity

and we parameterized TYPE with a Param rather than Levity.

Existing strange representations can continue to live in TYPE 'Composite

(# Int# , Double #) :: TYPE 'Composite

and we don't support parametricity in there, just like, currently we don't
allow parametricity in #.

We can include the undefined example from Richard's talk:

undefined :: forall (v :: Param). v

and ultimately lift it into his pi type when it is available just as before.

But we could let consider TYPE ('Simple 'Unlifted) as a form of 'parametric
#' covering unlifted things we're willing to allow polymorphism over
because they are just pointers to something in the heap, that just happens
to not be able to be _|_ or a thunk.

In this setting, recalling that above, I modified Richard's TYPE to take a
Param instead of Levity, we can define a type alias for things that live as
a simple pointer to a heap allocated object:

type GC (l :: Levity) = TYPE ('Simple l)
type * = GC 'Lifted

and then we can look at existing primitives generalized:

Array# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
MutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
SmallArray# :: forall (l :: Levity) (a :: GC l). a -> GC 'Unlifted
SmallMutableArray# :: forall (l :: Levity) (a :: GC l). * -> a -> GC
'Unlifted
MutVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted
MVar# :: forall (l :: Levity) (a :: GC l). * -> a -> GC 'Unlifted

Weak#, StablePtr#, StableName#, etc. all can take similar modifications.

Recall that an ArrayArray# was just an Array# hacked up to be able to hold
onto the subset of # that is collectable.

Almost all of the operations on these data types can work on the more
general kind of argument.

newArray# :: forall (s :: *) (l :: Levity) (a :: GC l). Int# -> a -> State#
s -> (# State# s, MutableArray# s a #)

writeArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a
-> Int# -> a -> State# s -> State# s

readArray# :: forall (s :: *) (l :: Levity) (a :: GC l). MutableArray# s a
-> Int# -> State# s -> (# State# s, a #)

etc.

Only a couple of our existing primitives _can't_ generalize this way. The
one that leaps to mind is atomicModifyMutVar, which would need to stay
constrained to only work on arguments in *, because of the way it operates.

With that we can still talk about

MutableArray# s Int

but now we can also talk about:

MutableArray# s (MutableArray# s Int)

without the layer of indirection through a box in * and without an
explosion of primops. The same newFoo, readFoo, writeFoo machinery works
for both kinds.

The struct machinery doesn't get to take advantage of this, but it would
let us clean house elsewhere in Prim and drastically improve the range of
applicability of the existing primitives with nothing more than a small
change to the levity machinery.

I'm not attached to any of the names above, I coined them just to give us a
concrete thing to talk about.

Here I'm only proposing we extend machinery in GHC.Prim this way, but an
interesting 'now that the barn door is open' question is to consider that
our existing Haskell data types often admit a similar form of parametricity
and nothing in principle prevents this from working for Maybe or [] and
once you permit inference to fire across all of GC l then it seems to me
that you'd start to get those same capabilities there as well when
LevityPolymorphism was turned on.

-Edward

On Mon, Sep 7, 2015 at 5:56 PM, Simon Peyton Jones 
wrote:

> This could make the menagerie of ways to pack
> {Small}{Mutable}Array{Array}# references into a
> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the
> need for folks to descend into the use of the more evil structure
> primitives we're talking about, and letting us keep a few more principles
> around us.
>
>
>
> I’m lost. Can you give some concrete examples that illustrate how levity
> polymorphism will help us?
>
>
> Simon
>
>
>
> *From:* Edward Kmett [mailto:ekm...@gmail.com]
> *Sent:* 07 September 2015 21:17
> *To:* Simon Peyton Jones
> *Cc:* Ryan Newton; Johan Tibell; Simon Marlow; Manuel M T Chakravarty;
> Chao-Hong Chen; ghc-devs; Ryan Scott; Ryan Yates
> *Subject:* Re: ArrayArrays
>
>
>
> I had a brief discussion with Richard during the Haskell Symposium about
> how we might be able to let parametricity help a bit in reducing the space
> of neces

Re: ArrayArrays

2015-09-07 Thread Dan Doel
On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett  wrote:
> Notably, it'd be interesting to explore the ability to allow parametricity
> over the portion of # that is just a gcptr.

Which is also a necessary part of Ed Yang's unlifted types proposal.
This portion of # becomes the `Unlifted` kind, and it should be
possible to have parametric polymorphism for it (and if that isn't
stated outright, several things in the proposal assume you have it).

-- Dan
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-09-07 Thread Edward Kmett
Indeed. I can CAS today with appropriately coerced primitives.

-Edward

On Mon, Sep 7, 2015 at 4:27 PM, Ryan Newton  wrote:

> Ah, incidentally that introduces an interesting difference between
> atomicModify and CAS.  CAS should be able to work on mutable locations in
> that subset of # that are represented by a gcptr, whereas Edward pointed
> out that atomicModify cannot.
>
> (Indeed, to use lock-free algorithms with these new unboxed mutable
> structures we'll need CAS on the slots.)
>
> On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett  wrote:
>
>> I had a brief discussion with Richard during the Haskell Symposium about
>> how we might be able to let parametricity help a bit in reducing the space
>> of necessarily primops to a slightly more manageable level.
>>
>> Notably, it'd be interesting to explore the ability to allow
>> parametricity over the portion of # that is just a gcptr.
>>
>> We could do this if the levity polymorphism machinery was tweaked a bit.
>> You could envision the ability to abstract over things in both * and the
>> subset of # that are represented by a gcptr, then modifying the existing
>> array primitives to be parametric in that choice of levity for their
>> argument so long as it was of a "heap object" levity.
>>
>> This could make the menagerie of ways to pack
>> {Small}{Mutable}Array{Array}# references into a
>> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the
>> need for folks to descend into the use of the more evil structure
>> primitives we're talking about, and letting us keep a few more principles
>> around us.
>>
>> Then in the cases like `atomicModifyMutVar#` where it needs to actually
>> be in * rather than just a gcptr, due to the constructed field selectors it
>> introduces on the heap then we could keep the existing less polymorphic
>> type.
>>
>> -Edward
>>
>> On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones > > wrote:
>>
>>> It was fun to meet and discuss this.
>>>
>>>
>>>
>>> Did someone volunteer to write a wiki page that describes the proposed
>>> design?  And, I earnestly hope, also describes the menagerie of currently
>>> available array types and primops so that users can have some chance of
>>> picking the right one?!
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Simon
>>>
>>>
>>>
>>> *From:* ghc-devs [mailto:ghc-devs-boun...@haskell.org] *On Behalf Of *Ryan
>>> Newton
>>> *Sent:* 31 August 2015 23:11
>>> *To:* Edward Kmett; Johan Tibell
>>> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
>>> Ryan Scott; Ryan Yates
>>> *Subject:* Re: ArrayArrays
>>>
>>>
>>>
>>> Dear Edward, Ryan Yates, and other interested parties --
>>>
>>>
>>>
>>> So when should we meet up about this?
>>>
>>>
>>>
>>> May I propose the Tues afternoon break for everyone at ICFP who is
>>> interested in this topic?  We can meet out in the coffee area and
>>> congregate around Edward Kmett, who is tall and should be easy to find ;-).
>>>
>>>
>>>
>>> I think Ryan is going to show us how to use his new primops for combined
>>> array + other fields in one heap object?
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett  wrote:
>>>
>>> Without a custom primitive it doesn't help much there, you have to store
>>> the indirection to the mask.
>>>
>>>
>>>
>>> With a custom primitive it should cut the on heap root-to-leaf path of
>>> everything in the HAMT in half. A shorter HashMap was actually one of the
>>> motivating factors for me doing this. It is rather astoundingly difficult
>>> to beat the performance of HashMap, so I had to start cheating pretty
>>> badly. ;)
>>>
>>>
>>>
>>> -Edward
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell 
>>> wrote:
>>>
>>> I'd also be interested to chat at ICFP to see if I can use this for my
>>> HAMT implementation.
>>>
>>>
>>>
>>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett  wrote:
>>>
>>> Sounds good to me. Right now I'm just hacking up composable accessors
>>> for "typed slots" in a fairly lens-like fashion, and treating the set of
>>

Re: ArrayArrays

2015-09-07 Thread Ryan Newton
Ah, incidentally that introduces an interesting difference between
atomicModify and CAS.  CAS should be able to work on mutable locations in
that subset of # that are represented by a gcptr, whereas Edward pointed
out that atomicModify cannot.

(Indeed, to use lock-free algorithms with these new unboxed mutable
structures we'll need CAS on the slots.)

On Mon, Sep 7, 2015 at 4:16 PM, Edward Kmett  wrote:

> I had a brief discussion with Richard during the Haskell Symposium about
> how we might be able to let parametricity help a bit in reducing the space
> of necessarily primops to a slightly more manageable level.
>
> Notably, it'd be interesting to explore the ability to allow parametricity
> over the portion of # that is just a gcptr.
>
> We could do this if the levity polymorphism machinery was tweaked a bit.
> You could envision the ability to abstract over things in both * and the
> subset of # that are represented by a gcptr, then modifying the existing
> array primitives to be parametric in that choice of levity for their
> argument so long as it was of a "heap object" levity.
>
> This could make the menagerie of ways to pack
> {Small}{Mutable}Array{Array}# references into a
> {Small}{Mutable}Array{Array}#' actually typecheck soundly, reducing the
> need for folks to descend into the use of the more evil structure
> primitives we're talking about, and letting us keep a few more principles
> around us.
>
> Then in the cases like `atomicModifyMutVar#` where it needs to actually be
> in * rather than just a gcptr, due to the constructed field selectors it
> introduces on the heap then we could keep the existing less polymorphic
> type.
>
> -Edward
>
> On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones 
> wrote:
>
>> It was fun to meet and discuss this.
>>
>>
>>
>> Did someone volunteer to write a wiki page that describes the proposed
>> design?  And, I earnestly hope, also describes the menagerie of currently
>> available array types and primops so that users can have some chance of
>> picking the right one?!
>>
>>
>>
>> Thanks
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* ghc-devs [mailto:ghc-devs-boun...@haskell.org] *On Behalf Of *Ryan
>> Newton
>> *Sent:* 31 August 2015 23:11
>> *To:* Edward Kmett; Johan Tibell
>> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
>> Ryan Scott; Ryan Yates
>> *Subject:* Re: ArrayArrays
>>
>>
>>
>> Dear Edward, Ryan Yates, and other interested parties --
>>
>>
>>
>> So when should we meet up about this?
>>
>>
>>
>> May I propose the Tues afternoon break for everyone at ICFP who is
>> interested in this topic?  We can meet out in the coffee area and
>> congregate around Edward Kmett, who is tall and should be easy to find ;-).
>>
>>
>>
>> I think Ryan is going to show us how to use his new primops for combined
>> array + other fields in one heap object?
>>
>>
>>
>> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett  wrote:
>>
>> Without a custom primitive it doesn't help much there, you have to store
>> the indirection to the mask.
>>
>>
>>
>> With a custom primitive it should cut the on heap root-to-leaf path of
>> everything in the HAMT in half. A shorter HashMap was actually one of the
>> motivating factors for me doing this. It is rather astoundingly difficult
>> to beat the performance of HashMap, so I had to start cheating pretty
>> badly. ;)
>>
>>
>>
>> -Edward
>>
>>
>>
>> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell 
>> wrote:
>>
>> I'd also be interested to chat at ICFP to see if I can use this for my
>> HAMT implementation.
>>
>>
>>
>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett  wrote:
>>
>> Sounds good to me. Right now I'm just hacking up composable accessors for
>> "typed slots" in a fairly lens-like fashion, and treating the set of slots
>> I define and the 'new' function I build for the data type as its API, and
>> build atop that. This could eventually graduate to template-haskell, but
>> I'm not entirely satisfied with the solution I have. I currently
>> distinguish between what I'm calling "slots" (things that point directly to
>> another SmallMutableArrayArray# sans wrapper) and "fields" which point
>> directly to the usual Haskell data types because unifying the two notions
>> meant that I couldn't lift some coercions out "far enough" 

Re: ArrayArrays

2015-09-07 Thread Edward Kmett
I had a brief discussion with Richard during the Haskell Symposium about
how we might be able to let parametricity help a bit in reducing the space
of necessarily primops to a slightly more manageable level.

Notably, it'd be interesting to explore the ability to allow parametricity
over the portion of # that is just a gcptr.

We could do this if the levity polymorphism machinery was tweaked a bit.
You could envision the ability to abstract over things in both * and the
subset of # that are represented by a gcptr, then modifying the existing
array primitives to be parametric in that choice of levity for their
argument so long as it was of a "heap object" levity.

This could make the menagerie of ways to pack {Small}{Mutable}Array{Array}#
references into a {Small}{Mutable}Array{Array}#' actually typecheck
soundly, reducing the need for folks to descend into the use of the more
evil structure primitives we're talking about, and letting us keep a few
more principles around us.

Then in the cases like `atomicModifyMutVar#` where it needs to actually be
in * rather than just a gcptr, due to the constructed field selectors it
introduces on the heap then we could keep the existing less polymorphic
type.

-Edward

On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones 
wrote:

> It was fun to meet and discuss this.
>
>
>
> Did someone volunteer to write a wiki page that describes the proposed
> design?  And, I earnestly hope, also describes the menagerie of currently
> available array types and primops so that users can have some chance of
> picking the right one?!
>
>
>
> Thanks
>
>
>
> Simon
>
>
>
> *From:* ghc-devs [mailto:ghc-devs-boun...@haskell.org] *On Behalf Of *Ryan
> Newton
> *Sent:* 31 August 2015 23:11
> *To:* Edward Kmett; Johan Tibell
> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
> Ryan Scott; Ryan Yates
> *Subject:* Re: ArrayArrays
>
>
>
> Dear Edward, Ryan Yates, and other interested parties --
>
>
>
> So when should we meet up about this?
>
>
>
> May I propose the Tues afternoon break for everyone at ICFP who is
> interested in this topic?  We can meet out in the coffee area and
> congregate around Edward Kmett, who is tall and should be easy to find ;-).
>
>
>
> I think Ryan is going to show us how to use his new primops for combined
> array + other fields in one heap object?
>
>
>
> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett  wrote:
>
> Without a custom primitive it doesn't help much there, you have to store
> the indirection to the mask.
>
>
>
> With a custom primitive it should cut the on heap root-to-leaf path of
> everything in the HAMT in half. A shorter HashMap was actually one of the
> motivating factors for me doing this. It is rather astoundingly difficult
> to beat the performance of HashMap, so I had to start cheating pretty
> badly. ;)
>
>
>
> -Edward
>
>
>
> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell 
> wrote:
>
> I'd also be interested to chat at ICFP to see if I can use this for my
> HAMT implementation.
>
>
>
> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett  wrote:
>
> Sounds good to me. Right now I'm just hacking up composable accessors for
> "typed slots" in a fairly lens-like fashion, and treating the set of slots
> I define and the 'new' function I build for the data type as its API, and
> build atop that. This could eventually graduate to template-haskell, but
> I'm not entirely satisfied with the solution I have. I currently
> distinguish between what I'm calling "slots" (things that point directly to
> another SmallMutableArrayArray# sans wrapper) and "fields" which point
> directly to the usual Haskell data types because unifying the two notions
> meant that I couldn't lift some coercions out "far enough" to make them
> vanish.
>
>
>
> I'll be happy to run through my current working set of issues in person
> and -- as things get nailed down further -- in a longer lived medium than
> in personal conversations. ;)
>
>
>
> -Edward
>
>
>
> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton  wrote:
>
> I'd also love to meet up at ICFP and discuss this.  I think the array
> primops plus a TH layer that lets (ab)use them many times without too much
> marginal cost sounds great.  And I'd like to learn how we could be either
> early users of, or help with, this infrastructure.
>
>
>
> CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping
> in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is
> currently working on concurrent data structures in Haskell, but will not 

Re: ArrayArrays

2015-09-07 Thread Edward Kmett
I volunteered to write something up with the caveat that it would take me a
while after the conference ended to get time to do so.

I'll see what I can do.

-Edward

On Mon, Sep 7, 2015 at 9:59 AM, Simon Peyton Jones 
wrote:

> It was fun to meet and discuss this.
>
>
>
> Did someone volunteer to write a wiki page that describes the proposed
> design?  And, I earnestly hope, also describes the menagerie of currently
> available array types and primops so that users can have some chance of
> picking the right one?!
>
>
>
> Thanks
>
>
>
> Simon
>
>
>
> *From:* ghc-devs [mailto:ghc-devs-boun...@haskell.org] *On Behalf Of *Ryan
> Newton
> *Sent:* 31 August 2015 23:11
> *To:* Edward Kmett; Johan Tibell
> *Cc:* Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs;
> Ryan Scott; Ryan Yates
> *Subject:* Re: ArrayArrays
>
>
>
> Dear Edward, Ryan Yates, and other interested parties --
>
>
>
> So when should we meet up about this?
>
>
>
> May I propose the Tues afternoon break for everyone at ICFP who is
> interested in this topic?  We can meet out in the coffee area and
> congregate around Edward Kmett, who is tall and should be easy to find ;-).
>
>
>
> I think Ryan is going to show us how to use his new primops for combined
> array + other fields in one heap object?
>
>
>
> On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett  wrote:
>
> Without a custom primitive it doesn't help much there, you have to store
> the indirection to the mask.
>
>
>
> With a custom primitive it should cut the on heap root-to-leaf path of
> everything in the HAMT in half. A shorter HashMap was actually one of the
> motivating factors for me doing this. It is rather astoundingly difficult
> to beat the performance of HashMap, so I had to start cheating pretty
> badly. ;)
>
>
>
> -Edward
>
>
>
> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell 
> wrote:
>
> I'd also be interested to chat at ICFP to see if I can use this for my
> HAMT implementation.
>
>
>
> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett  wrote:
>
> Sounds good to me. Right now I'm just hacking up composable accessors for
> "typed slots" in a fairly lens-like fashion, and treating the set of slots
> I define and the 'new' function I build for the data type as its API, and
> build atop that. This could eventually graduate to template-haskell, but
> I'm not entirely satisfied with the solution I have. I currently
> distinguish between what I'm calling "slots" (things that point directly to
> another SmallMutableArrayArray# sans wrapper) and "fields" which point
> directly to the usual Haskell data types because unifying the two notions
> meant that I couldn't lift some coercions out "far enough" to make them
> vanish.
>
>
>
> I'll be happy to run through my current working set of issues in person
> and -- as things get nailed down further -- in a longer lived medium than
> in personal conversations. ;)
>
>
>
> -Edward
>
>
>
> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton  wrote:
>
> I'd also love to meet up at ICFP and discuss this.  I think the array
> primops plus a TH layer that lets (ab)use them many times without too much
> marginal cost sounds great.  And I'd like to learn how we could be either
> early users of, or help with, this infrastructure.
>
>
>
> CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping
> in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is
> currently working on concurrent data structures in Haskell, but will not be
> at ICFP.
>
>
>
>
>
> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates  wrote:
>
> I completely agree.  I would love to spend some time during ICFP and
> friends talking about what it could look like.  My small array for STM
> changes for the RTS can be seen here [1].  It is on a branch somewhere
> between 7.8 and 7.10 and includes irrelevant STM bits and some
> confusing naming choices (sorry), but should cover all the details
> needed to implement it for a non-STM context.  The biggest surprise
> for me was following small array too closely and having a word/byte
> offset miss-match [2].
>
> [1]:
> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
>
> Ryan
>
>
> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett  wrote:
> > I'd love to have that last 10%, but its a lot of work to get there and
> more
> > importantly I don't know quite what it should look like.
> >
> > On the oth

RE: ArrayArrays

2015-09-07 Thread Simon Peyton Jones
It was fun to meet and discuss this.

Did someone volunteer to write a wiki page that describes the proposed design?  
And, I earnestly hope, also describes the menagerie of currently available 
array types and primops so that users can have some chance of picking the right 
one?!

Thanks

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Ryan Newton
Sent: 31 August 2015 23:11
To: Edward Kmett; Johan Tibell
Cc: Simon Marlow; Manuel M T Chakravarty; Chao-Hong Chen; ghc-devs; Ryan Scott; 
Ryan Yates
Subject: Re: ArrayArrays

Dear Edward, Ryan Yates, and other interested parties --

So when should we meet up about this?

May I propose the Tues afternoon break for everyone at ICFP who is interested 
in this topic?  We can meet out in the coffee area and congregate around Edward 
Kmett, who is tall and should be easy to find ;-).

I think Ryan is going to show us how to use his new primops for combined array 
+ other fields in one heap object?

On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett 
mailto:ekm...@gmail.com>> wrote:
Without a custom primitive it doesn't help much there, you have to store the 
indirection to the mask.

With a custom primitive it should cut the on heap root-to-leaf path of 
everything in the HAMT in half. A shorter HashMap was actually one of the 
motivating factors for me doing this. It is rather astoundingly difficult to 
beat the performance of HashMap, so I had to start cheating pretty badly. ;)

-Edward

On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell 
mailto:johan.tib...@gmail.com>> wrote:
I'd also be interested to chat at ICFP to see if I can use this for my HAMT 
implementation.

On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett 
mailto:ekm...@gmail.com>> wrote:
Sounds good to me. Right now I'm just hacking up composable accessors for 
"typed slots" in a fairly lens-like fashion, and treating the set of slots I 
define and the 'new' function I build for the data type as its API, and build 
atop that. This could eventually graduate to template-haskell, but I'm not 
entirely satisfied with the solution I have. I currently distinguish between 
what I'm calling "slots" (things that point directly to another 
SmallMutableArrayArray# sans wrapper) and "fields" which point directly to the 
usual Haskell data types because unifying the two notions meant that I couldn't 
lift some coercions out "far enough" to make them vanish.

I'll be happy to run through my current working set of issues in person and -- 
as things get nailed down further -- in a longer lived medium than in personal 
conversations. ;)

-Edward

On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton 
mailto:rrnew...@gmail.com>> wrote:
I'd also love to meet up at ICFP and discuss this.  I think the array primops 
plus a TH layer that lets (ab)use them many times without too much marginal 
cost sounds great.  And I'd like to learn how we could be either early users 
of, or help with, this infrastructure.

CC'ing in Ryan Scot and Omer Agacan who may also be interested in dropping in 
on such discussions @ICFP, and Chao-Hong Chen, a Ph.D. student who is currently 
working on concurrent data structures in Haskell, but will not be at ICFP.


On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates 
mailto:fryguy...@gmail.com>> wrote:
I completely agree.  I would love to spend some time during ICFP and
friends talking about what it could look like.  My small array for STM
changes for the RTS can be seen here [1].  It is on a branch somewhere
between 7.8 and 7.10 and includes irrelevant STM bits and some
confusing naming choices (sorry), but should cover all the details
needed to implement it for a non-STM context.  The biggest surprise
for me was following small array too closely and having a word/byte
offset miss-match [2].

[1]: 
https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
[2]: https://ghc.haskell.org/trac/ghc/ticket/10413

Ryan

On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett 
mailto:ekm...@gmail.com>> wrote:
> I'd love to have that last 10%, but its a lot of work to get there and more
> importantly I don't know quite what it should look like.
>
> On the other hand, I do have a pretty good idea of how the primitives above
> could be banged out and tested in a long evening, well in time for 7.12. And
> as noted earlier, those remain useful even if a nicer typed version with an
> extra level of indirection to the sizes is built up after.
>
> The rest sounds like a good graduate student project for someone who has
> graduate students lying around. Maybe somebody at Indiana University who has
> an interest in type theory and parallelism can find us one. =)
>
> -Edward
>
> On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates 
> mailto:fryguy...@gmail.com>> wrote:
>>
>> I think from my perspective, the mot

Re: ArrayArrays

2015-08-31 Thread Edward Kmett
>> >>>>>> >> wrote:
>> >>>>>> >> > They just segfault at this level. ;)
>> >>>>>> >> >
>> >>>>>> >> > Sent from my iPhone
>> >>>>>> >> >
>> >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton 
>> >>>>>> >> > wrote:
>> >>>>>> >> >
>> >>>>>> >> > You presumably also save a bounds check on reads by
>> hard-coding
>> >>>>>> >> > the
>> >>>>>> >> > sizes?
>> >>>>>> >> >
>> >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <
>> ekm...@gmail.com>
>> >>>>>> >> > wrote:
>> >>>>>> >> >>
>> >>>>>> >> >> Also there are 4 different "things" here, basically
>> depending on
>> >>>>>> >> >> two
>> >>>>>> >> >> independent questions:
>> >>>>>> >> >>
>> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and
>> >>>>>> >> >> b.) if you want cardmarking.
>> >>>>>> >> >>
>> >>>>>> >> >> Versions with/without cardmarking for different sizes can be
>> >>>>>> >> >> done
>> >>>>>> >> >> pretty
>> >>>>>> >> >> easily, but as noted, the infotable variants are pretty
>> >>>>>> >> >> invasive.
>> >>>>>> >> >>
>> >>>>>> >> >> -Edward
>> >>>>>> >> >>
>> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <
>> ekm...@gmail.com>
>> >>>>>> >> >> wrote:
>> >>>>>> >> >>>
>> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which
>> >>>>>> >> >>> adds up
>> >>>>>> >> >>> if
>> >>>>>> >> >>> they were small enough and there are enough of them. You
>> get a
>> >>>>>> >> >>> bit
>> >>>>>> >> >>> better
>> >>>>>> >> >>> locality of reference in terms of what fits in the first
>> cache
>> >>>>>> >> >>> line of
>> >>>>>> >> >>> them.
>> >>>>>> >> >>>
>> >>>>>> >> >>> -Edward
>> >>>>>> >> >>>
>> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton
>> >>>>>> >> >>> 
>> >>>>>> >> >>> wrote:
>> >>>>>> >> >>>>
>> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will
>> >>>>>> >> >>>> settle with
>> >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
>> >>>>>> >> >>>> unsafeCoercing
>> >>>>>> >> >>>> pointers whose types don't actually match their siblings.
>> >>>>>> >> >>>>
>> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for
>> fixed
>> >>>>>> >> >>>> sized
>> >>>>>> >> >>>> array
>> >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to
>> >>>>>> >> >>>> derecommend, as
>> >>>>>> >> >>>> you mentioned!)
>> >>>>>> >> >>>>
>> >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett
>> >>>>>> >> >>>> 
>> >>>>>> >> >>>> wrote:

Re: ArrayArrays

2015-08-31 Thread Johan Tibell
>
> >>>>>> >> > wrote:
> >>>>>> >> >>
> >>>>>> >> >> Also there are 4 different "things" here, basically depending
> on
> >>>>>> >> >> two
> >>>>>> >> >> independent questions:
> >>>>>> >> >>
> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and
> >>>>>> >> >> b.) if you want cardmarking.
> >>>>>> >> >>
> >>>>>> >> >> Versions with/without cardmarking for different sizes can be
> >>>>>> >> >> done
> >>>>>> >> >> pretty
> >>>>>> >> >> easily, but as noted, the infotable variants are pretty
> >>>>>> >> >> invasive.
> >>>>>> >> >>
> >>>>>> >> >> -Edward
> >>>>>> >> >>
> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <
> ekm...@gmail.com>
> >>>>>> >> >> wrote:
> >>>>>> >> >>>
> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which
> >>>>>> >> >>> adds up
> >>>>>> >> >>> if
> >>>>>> >> >>> they were small enough and there are enough of them. You get
> a
> >>>>>> >> >>> bit
> >>>>>> >> >>> better
> >>>>>> >> >>> locality of reference in terms of what fits in the first
> cache
> >>>>>> >> >>> line of
> >>>>>> >> >>> them.
> >>>>>> >> >>>
> >>>>>> >> >>> -Edward
> >>>>>> >> >>>
> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton
> >>>>>> >> >>> 
> >>>>>> >> >>> wrote:
> >>>>>> >> >>>>
> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will
> >>>>>> >> >>>> settle with
> >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
> >>>>>> >> >>>> unsafeCoercing
> >>>>>> >> >>>> pointers whose types don't actually match their siblings.
> >>>>>> >> >>>>
> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
> >>>>>> >> >>>> sized
> >>>>>> >> >>>> array
> >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to
> >>>>>> >> >>>> derecommend, as
> >>>>>> >> >>>> you mentioned!)
> >>>>>> >> >>>>
> >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett
> >>>>>> >> >>>> 
> >>>>>> >> >>>> wrote:
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> I think both are useful, but the one you suggest requires a
> >>>>>> >> >>>>> lot more
> >>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the
> >>>>>> >> >>>>> other.
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> -Edward
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton
> >>>>>> >> >>>>> 
> >>>>>> >> >>>>> wrote:
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed
> type,
> >>>>>> >> >>>>>> unbounded
> >>>>>> >> >

Re: ArrayArrays

2015-08-31 Thread Ryan Yates
t;>> >> >> invasive.
>>>>>> >> >>
>>>>>> >> >> -Edward
>>>>>> >> >>
>>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett 
>>>>>> >> >> wrote:
>>>>>> >> >>>
>>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which
>>>>>> >> >>> adds up
>>>>>> >> >>> if
>>>>>> >> >>> they were small enough and there are enough of them. You get a
>>>>>> >> >>> bit
>>>>>> >> >>> better
>>>>>> >> >>> locality of reference in terms of what fits in the first cache
>>>>>> >> >>> line of
>>>>>> >> >>> them.
>>>>>> >> >>>
>>>>>> >> >>> -Edward
>>>>>> >> >>>
>>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton
>>>>>> >> >>> 
>>>>>> >> >>> wrote:
>>>>>> >> >>>>
>>>>>> >> >>>> Yes. And for the short term I can imagine places we will
>>>>>> >> >>>> settle with
>>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
>>>>>> >> >>>> unsafeCoercing
>>>>>> >> >>>> pointers whose types don't actually match their siblings.
>>>>>> >> >>>>
>>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
>>>>>> >> >>>> sized
>>>>>> >> >>>> array
>>>>>> >> >>>> objects *other* than using them to fake structs? (Much to
>>>>>> >> >>>> derecommend, as
>>>>>> >> >>>> you mentioned!)
>>>>>> >> >>>>
>>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett
>>>>>> >> >>>> 
>>>>>> >> >>>> wrote:
>>>>>> >> >>>>>
>>>>>> >> >>>>> I think both are useful, but the one you suggest requires a
>>>>>> >> >>>>> lot more
>>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the
>>>>>> >> >>>>> other.
>>>>>> >> >>>>>
>>>>>> >> >>>>> -Edward
>>>>>> >> >>>>>
>>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton
>>>>>> >> >>>>> 
>>>>>> >> >>>>> wrote:
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type,
>>>>>> >> >>>>>> unbounded
>>>>>> >> >>>>>> length) with extra payload.
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
>>>>>> >> >>>>>> especially
>>>>>> >> >>>>>> with the extra payload at front. But wouldn't the general
>>>>>> >> >>>>>> solution
>>>>>> >> >>>>>> for
>>>>>> >> >>>>>> structs be one that that allows new user data type defs for
>>>>>> >> >>>>>> #
>>>>>> >> >>>>>> types?
>>>>>> >> >>>>>>
>>>>>> >> >>>>>>
>>>>>> >> >>>>>>
>>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett
>>>>>> >> >>>>>> 
>>>>>> >> >>>>>> wrote:
>>>>>> >> >>>>>>>
>>>>&g

Re: ArrayArrays

2015-08-31 Thread Ryan Newton
>> >> >>> wrote:
>>>>> >> >>>>
>>>>> >> >>>> Yes. And for the short term I can imagine places we will
>>>>> settle with
>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
>>>>> >> >>>> unsafeCoercing
>>>>> >> >>>> pointers whose types don't actually match their siblings.
>>>>> >> >>>>
>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
>>>>> sized
>>>>> >> >>>> array
>>>>> >> >>>> objects *other* than using them to fake structs? (Much to
>>>>> >> >>>> derecommend, as
>>>>> >> >>>> you mentioned!)
>>>>> >> >>>>
>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett >>>> >
>>>>> >> >>>> wrote:
>>>>> >> >>>>>
>>>>> >> >>>>> I think both are useful, but the one you suggest requires a
>>>>> lot more
>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the other.
>>>>> >> >>>>>
>>>>> >> >>>>> -Edward
>>>>> >> >>>>>
>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <
>>>>> rrnew...@gmail.com>
>>>>> >> >>>>> wrote:
>>>>> >> >>>>>>
>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed type,
>>>>> >> >>>>>> unbounded
>>>>> >> >>>>>> length) with extra payload.
>>>>> >> >>>>>>
>>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
>>>>> >> >>>>>> especially
>>>>> >> >>>>>> with the extra payload at front. But wouldn't the general
>>>>> solution
>>>>> >> >>>>>> for
>>>>> >> >>>>>> structs be one that that allows new user data type defs for #
>>>>> >> >>>>>> types?
>>>>> >> >>>>>>
>>>>> >> >>>>>>
>>>>> >> >>>>>>
>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <
>>>>> ekm...@gmail.com>
>>>>> >> >>>>>> wrote:
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words
>>>>> and a
>>>>> >> >>>>>>> known
>>>>> >> >>>>>>> number of pointers is basically what Ryan Yates was
>>>>> suggesting
>>>>> >> >>>>>>> above, but
>>>>> >> >>>>>>> where the word counts were stored in the objects themselves.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd
>>>>> >> >>>>>>> likely
>>>>> >> >>>>>>> want to be something we build in addition to MutVar# rather
>>>>> than a
>>>>> >> >>>>>>> replacement.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build
>>>>> info
>>>>> >> >>>>>>> tables that knew them, and typechecker support, for
>>>>> instance, it'd
>>>>> >> >>>>>>> get
>>>>> >> >>>>>>> rather invasive.
>>>>> >> >>>>>>>
>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized'
>>>>> versions
>>>>> >> >>>>

Re: ArrayArrays

2015-08-29 Thread Johan Tibell
t;>>> structs be one that that allows new user data type defs for #
>>> >> >>>>>> types?
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>>
>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett >> >
>>> >> >>>>>> wrote:
>>> >> >>>>>>>
>>> >> >>>>>>> Some form of MutableStruct# with a known number of words and a
>>> >> >>>>>>> known
>>> >> >>>>>>> number of pointers is basically what Ryan Yates was suggesting
>>> >> >>>>>>> above, but
>>> >> >>>>>>> where the word counts were stored in the objects themselves.
>>> >> >>>>>>>
>>> >> >>>>>>> Given that it'd have a couple of words for those counts it'd
>>> >> >>>>>>> likely
>>> >> >>>>>>> want to be something we build in addition to MutVar# rather
>>> than a
>>> >> >>>>>>> replacement.
>>> >> >>>>>>>
>>> >> >>>>>>> On the other hand, if we had to fix those numbers and build
>>> info
>>> >> >>>>>>> tables that knew them, and typechecker support, for instance,
>>> it'd
>>> >> >>>>>>> get
>>> >> >>>>>>> rather invasive.
>>> >> >>>>>>>
>>> >> >>>>>>> Also, a number of things that we can do with the 'sized'
>>> versions
>>> >> >>>>>>> above, like working with evil unsized c-style arrays directly
>>> >> >>>>>>> inline at the
>>> >> >>>>>>> end of the structure cease to be possible, so it isn't even a
>>> pure
>>> >> >>>>>>> win if we
>>> >> >>>>>>> did the engineering effort.
>>> >> >>>>>>>
>>> >> >>>>>>> I think 90% of the needs I have are covered just by adding
>>> the one
>>> >> >>>>>>> primitive. The last 10% gets pretty invasive.
>>> >> >>>>>>>
>>> >> >>>>>>> -Edward
>>> >> >>>>>>>
>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <
>>> rrnew...@gmail.com>
>>> >> >>>>>>> wrote:
>>> >> >>>>>>>>
>>> >> >>>>>>>> I like the possibility of a general solution for mutable
>>> structs
>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's
>>> hard.
>>> >> >>>>>>>>
>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of
>>> object
>>> >> >>>>>>>> identity problems. But what about directly supporting an
>>> >> >>>>>>>> extensible set of
>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even
>>> replacing)
>>> >> >>>>>>>> MutVar#? That
>>> >> >>>>>>>> may be too much work, but is it problematic otherwise?
>>> >> >>>>>>>>
>>> >> >>>>>>>> Needless to say, this is also critical if we ever want best
>>> in
>>> >> >>>>>>>> class
>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and
>>> sequential
>>> >> >>>>>>>> counterparts.
>>> >> >>>>>>>>
>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
>>> >> >>>>>>>>  wrote:
>>> >> >>>>>>>>>
>>> >> >>>>>>>>> At the very least I'll take this email and turn it into a
>>> short
>>> >> >

Re: ArrayArrays

2015-08-28 Thread Ryan Yates
sing them to fake structs? (Much to
>> >>>> derecommend, as
>> >>>> you mentioned!)
>> >>>>
>> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett 
>> >>>> wrote:
>> >>>>>
>> >>>>> I think both are useful, but the one you suggest requires a lot more
>> >>>>> plumbing and doesn't subsume all of the usecases of the other.
>> >>>>>
>> >>>>> -Edward
>> >>>>>
>> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton 
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> So that primitive is an array like thing (Same pointed type,
>> >>>>>> unbounded
>> >>>>>> length) with extra payload.
>> >>>>>>
>> >>>>>> I can see how we can do without structs if we have arrays,
>> >>>>>> especially
>> >>>>>> with the extra payload at front. But wouldn't the general solution
>> >>>>>> for
>> >>>>>> structs be one that that allows new user data type defs for #
>> >>>>>> types?
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett 
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> Some form of MutableStruct# with a known number of words and a
>> >>>>>>> known
>> >>>>>>> number of pointers is basically what Ryan Yates was suggesting
>> >>>>>>> above, but
>> >>>>>>> where the word counts were stored in the objects themselves.
>> >>>>>>>
>> >>>>>>> Given that it'd have a couple of words for those counts it'd
>> >>>>>>> likely
>> >>>>>>> want to be something we build in addition to MutVar# rather than a
>> >>>>>>> replacement.
>> >>>>>>>
>> >>>>>>> On the other hand, if we had to fix those numbers and build info
>> >>>>>>> tables that knew them, and typechecker support, for instance, it'd
>> >>>>>>> get
>> >>>>>>> rather invasive.
>> >>>>>>>
>> >>>>>>> Also, a number of things that we can do with the 'sized' versions
>> >>>>>>> above, like working with evil unsized c-style arrays directly
>> >>>>>>> inline at the
>> >>>>>>> end of the structure cease to be possible, so it isn't even a pure
>> >>>>>>> win if we
>> >>>>>>> did the engineering effort.
>> >>>>>>>
>> >>>>>>> I think 90% of the needs I have are covered just by adding the one
>> >>>>>>> primitive. The last 10% gets pretty invasive.
>> >>>>>>>
>> >>>>>>> -Edward
>> >>>>>>>
>> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>> I like the possibility of a general solution for mutable structs
>> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.
>> >>>>>>>>
>> >>>>>>>> So, we can't unpack MutVar into constructors because of object
>> >>>>>>>> identity problems. But what about directly supporting an
>> >>>>>>>> extensible set of
>> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing)
>> >>>>>>>> MutVar#? That
>> >>>>>>>> may be too much work, but is it problematic otherwise?
>> >>>>>>>>
>> >>>>>>>> Needless to say, this is also critical if we ever want best in
>> >>>>>>>> class
>> >>>>>>>> lockfree mutable structures, just like their Stm and sequential
>> >>>>>>>> counterparts.
>> >>>>>>>>
>> >>

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
gt; >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett 
> wrote:
> >>>>>>>
> >>>>>>> Some form of MutableStruct# with a known number of words and a
> known
> >>>>>>> number of pointers is basically what Ryan Yates was suggesting
> above, but
> >>>>>>> where the word counts were stored in the objects themselves.
> >>>>>>>
> >>>>>>> Given that it'd have a couple of words for those counts it'd likely
> >>>>>>> want to be something we build in addition to MutVar# rather than a
> >>>>>>> replacement.
> >>>>>>>
> >>>>>>> On the other hand, if we had to fix those numbers and build info
> >>>>>>> tables that knew them, and typechecker support, for instance, it'd
> get
> >>>>>>> rather invasive.
> >>>>>>>
> >>>>>>> Also, a number of things that we can do with the 'sized' versions
> >>>>>>> above, like working with evil unsized c-style arrays directly
> inline at the
> >>>>>>> end of the structure cease to be possible, so it isn't even a pure
> win if we
> >>>>>>> did the engineering effort.
> >>>>>>>
> >>>>>>> I think 90% of the needs I have are covered just by adding the one
> >>>>>>> primitive. The last 10% gets pretty invasive.
> >>>>>>>
> >>>>>>> -Edward
> >>>>>>>
> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> I like the possibility of a general solution for mutable structs
> >>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.
> >>>>>>>>
> >>>>>>>> So, we can't unpack MutVar into constructors because of object
> >>>>>>>> identity problems. But what about directly supporting an
> extensible set of
> >>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing)
> MutVar#? That
> >>>>>>>> may be too much work, but is it problematic otherwise?
> >>>>>>>>
> >>>>>>>> Needless to say, this is also critical if we ever want best in
> class
> >>>>>>>> lockfree mutable structures, just like their Stm and sequential
> >>>>>>>> counterparts.
> >>>>>>>>
> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
> >>>>>>>>  wrote:
> >>>>>>>>>
> >>>>>>>>> At the very least I'll take this email and turn it into a short
> >>>>>>>>> article.
> >>>>>>>>>
> >>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and
> maybe
> >>>>>>>>> make a ticket for it.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Simon
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> From: Edward Kmett [mailto:ekm...@gmail.com]
> >>>>>>>>> Sent: 27 August 2015 16:54
> >>>>>>>>> To: Simon Peyton Jones
> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
> >>>>>>>>> Subject: Re: ArrayArrays
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It
> >>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> While those live in 

Re: ArrayArrays

2015-08-28 Thread Ryan Yates
t we can do with the 'sized' versions
>>>>>>> above, like working with evil unsized c-style arrays directly inline at 
>>>>>>> the
>>>>>>> end of the structure cease to be possible, so it isn't even a pure win 
>>>>>>> if we
>>>>>>> did the engineering effort.
>>>>>>>
>>>>>>> I think 90% of the needs I have are covered just by adding the one
>>>>>>> primitive. The last 10% gets pretty invasive.
>>>>>>>
>>>>>>> -Edward
>>>>>>>
>>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I like the possibility of a general solution for mutable structs
>>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.
>>>>>>>>
>>>>>>>> So, we can't unpack MutVar into constructors because of object
>>>>>>>> identity problems. But what about directly supporting an extensible 
>>>>>>>> set of
>>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) 
>>>>>>>> MutVar#? That
>>>>>>>> may be too much work, but is it problematic otherwise?
>>>>>>>>
>>>>>>>> Needless to say, this is also critical if we ever want best in class
>>>>>>>> lockfree mutable structures, just like their Stm and sequential
>>>>>>>> counterparts.
>>>>>>>>
>>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
>>>>>>>>  wrote:
>>>>>>>>>
>>>>>>>>> At the very least I'll take this email and turn it into a short
>>>>>>>>> article.
>>>>>>>>>
>>>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe
>>>>>>>>> make a ticket for it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Simon
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> From: Edward Kmett [mailto:ekm...@gmail.com]
>>>>>>>>> Sent: 27 August 2015 16:54
>>>>>>>>> To: Simon Peyton Jones
>>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>>>>>>> Subject: Re: ArrayArrays
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It
>>>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> While those live in #, they are garbage collected objects, so this
>>>>>>>>> all lives on the heap.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> They were added to make some of the DPH stuff fast when it has to
>>>>>>>>> deal with nested arrays.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm currently abusing them as a placeholder for a better thing.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The Problem
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Consider the scenario where you write a classic doubly-linked list
>>>>>>>>> in Haskell.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>>>>>>>>
>>>>>>>>>
>>>>>

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
They just segfault at this level. ;)

Sent from my iPhone

> On Aug 28, 2015, at 7:25 PM, Ryan Newton  wrote:
> 
> You presumably also save a bounds check on reads by hard-coding the sizes?
> 
>> On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett  wrote:
>> Also there are 4 different "things" here, basically depending on two 
>> independent questions: 
>> 
>> a.) if you want to shove the sizes into the info table, and 
>> b.) if you want cardmarking.
>> 
>> Versions with/without cardmarking for different sizes can be done pretty 
>> easily, but as noted, the infotable variants are pretty invasive.
>> 
>> -Edward
>> 
>>> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett  wrote:
>>> Well, on the plus side you'd save 16 bytes per object, which adds up if 
>>> they were small enough and there are enough of them. You get a bit better 
>>> locality of reference in terms of what fits in the first cache line of them.
>>> 
>>> -Edward
>>> 
>>>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton  wrote:
>>>> Yes. And for the short term I can imagine places we will settle with 
>>>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing 
>>>> pointers whose types don't actually match their siblings. 
>>>> 
>>>> Is there anything to recommend the hacks mentioned for fixed sized array 
>>>> objects *other* than using them to fake structs? (Much to derecommend, as 
>>>> you mentioned!)
>>>> 
>>>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett  wrote:
>>>>> I think both are useful, but the one you suggest requires a lot more 
>>>>> plumbing and doesn't subsume all of the usecases of the other.
>>>>> 
>>>>> -Edward
>>>>> 
>>>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton  wrote:
>>>>>> So that primitive is an array like thing (Same pointed type, unbounded 
>>>>>> length) with extra payload. 
>>>>>> 
>>>>>> I can see how we can do without structs if we have arrays, especially 
>>>>>> with the extra payload at front. But wouldn't the general solution for 
>>>>>> structs be one that that allows new user data type defs for # types?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>>>>>>> Some form of MutableStruct# with a known number of words and a known 
>>>>>>> number of pointers is basically what Ryan Yates was suggesting above, 
>>>>>>> but where the word counts were stored in the objects themselves.
>>>>>>> 
>>>>>>> Given that it'd have a couple of words for those counts it'd likely 
>>>>>>> want to be something we build in addition to MutVar# rather than a 
>>>>>>> replacement.
>>>>>>> 
>>>>>>> On the other hand, if we had to fix those numbers and build info tables 
>>>>>>> that knew them, and typechecker support, for instance, it'd get rather 
>>>>>>> invasive.
>>>>>>> 
>>>>>>> Also, a number of things that we can do with the 'sized' versions 
>>>>>>> above, like working with evil unsized c-style arrays directly inline at 
>>>>>>> the end of the structure cease to be possible, so it isn't even a pure 
>>>>>>> win if we did the engineering effort.
>>>>>>> 
>>>>>>> I think 90% of the needs I have are covered just by adding the one 
>>>>>>> primitive. The last 10% gets pretty invasive.
>>>>>>> 
>>>>>>> -Edward
>>>>>>> 
>>>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton  
>>>>>>>> wrote:
>>>>>>>> I like the possibility of a general solution for mutable structs (like 
>>>>>>>> Ed said), and I'm trying to fully understand why it's hard. 
>>>>>>>> 
>>>>>>>> So, we can't unpack MutVar into constructors because of object 
>>>>>>>> identity problems. But what about directly supporting an extensible 
>>>>>>>> set of unlifted MutStruct# objects, generalizing (and even replacing) 
>>>>>>>> MutVar#? That may

Re: ArrayArrays

2015-08-28 Thread Ryan Newton
You presumably also save a bounds check on reads by hard-coding the sizes?

On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett  wrote:

> Also there are 4 different "things" here, basically depending on two
> independent questions:
>
> a.) if you want to shove the sizes into the info table, and
> b.) if you want cardmarking.
>
> Versions with/without cardmarking for different sizes can be done pretty
> easily, but as noted, the infotable variants are pretty invasive.
>
> -Edward
>
> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett  wrote:
>
>> Well, on the plus side you'd save 16 bytes per object, which adds up if
>> they were small enough and there are enough of them. You get a bit better
>> locality of reference in terms of what fits in the first cache line of them.
>>
>> -Edward
>>
>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton  wrote:
>>
>>> Yes. And for the short term I can imagine places we will settle with
>>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing
>>> pointers whose types don't actually match their siblings.
>>>
>>> Is there anything to recommend the hacks mentioned for fixed sized array
>>> objects *other* than using them to fake structs? (Much to derecommend, as
>>> you mentioned!)
>>>
>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett  wrote:
>>>
>>>> I think both are useful, but the one you suggest requires a lot more
>>>> plumbing and doesn't subsume all of the usecases of the other.
>>>>
>>>> -Edward
>>>>
>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton 
>>>> wrote:
>>>>
>>>>> So that primitive is an array like thing (Same pointed type, unbounded
>>>>> length) with extra payload.
>>>>>
>>>>> I can see how we can do without structs if we have arrays, especially
>>>>> with the extra payload at front. But wouldn't the general solution for
>>>>> structs be one that that allows new user data type defs for # types?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>>>>>
>>>>>> Some form of MutableStruct# with a known number of words and a known
>>>>>> number of pointers is basically what Ryan Yates was suggesting above, but
>>>>>> where the word counts were stored in the objects themselves.
>>>>>>
>>>>>> Given that it'd have a couple of words for those counts it'd likely
>>>>>> want to be something we build in addition to MutVar# rather than a
>>>>>> replacement.
>>>>>>
>>>>>> On the other hand, if we had to fix those numbers and build info
>>>>>> tables that knew them, and typechecker support, for instance, it'd get
>>>>>> rather invasive.
>>>>>>
>>>>>> Also, a number of things that we can do with the 'sized' versions
>>>>>> above, like working with evil unsized c-style arrays directly inline at 
>>>>>> the
>>>>>> end of the structure cease to be possible, so it isn't even a pure win if
>>>>>> we did the engineering effort.
>>>>>>
>>>>>> I think 90% of the needs I have are covered just by adding the one
>>>>>> primitive. The last 10% gets pretty invasive.
>>>>>>
>>>>>> -Edward
>>>>>>
>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
>>>>>> wrote:
>>>>>>
>>>>>>> I like the possibility of a general solution for mutable structs
>>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.
>>>>>>>
>>>>>>> So, we can't unpack MutVar into constructors because of object
>>>>>>> identity problems. But what about directly supporting an extensible set 
>>>>>>> of
>>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#?
>>>>>>> That may be too much work, but is it problematic otherwise?
>>>>>>>
>>>>>>> Needless to say, this is also critical if we ever want best in class
>>>>>>> lockfree mutable structures, just like their Stm and sequential
>>>>>>> counterparts.
>>&

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
Also there are 4 different "things" here, basically depending on two
independent questions:

a.) if you want to shove the sizes into the info table, and
b.) if you want cardmarking.

Versions with/without cardmarking for different sizes can be done pretty
easily, but as noted, the infotable variants are pretty invasive.

-Edward

On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett  wrote:

> Well, on the plus side you'd save 16 bytes per object, which adds up if
> they were small enough and there are enough of them. You get a bit better
> locality of reference in terms of what fits in the first cache line of them.
>
> -Edward
>
> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton  wrote:
>
>> Yes. And for the short term I can imagine places we will settle with
>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing
>> pointers whose types don't actually match their siblings.
>>
>> Is there anything to recommend the hacks mentioned for fixed sized array
>> objects *other* than using them to fake structs? (Much to derecommend, as
>> you mentioned!)
>>
>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett  wrote:
>>
>>> I think both are useful, but the one you suggest requires a lot more
>>> plumbing and doesn't subsume all of the usecases of the other.
>>>
>>> -Edward
>>>
>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton  wrote:
>>>
>>>> So that primitive is an array like thing (Same pointed type, unbounded
>>>> length) with extra payload.
>>>>
>>>> I can see how we can do without structs if we have arrays, especially
>>>> with the extra payload at front. But wouldn't the general solution for
>>>> structs be one that that allows new user data type defs for # types?
>>>>
>>>>
>>>>
>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>>>>
>>>>> Some form of MutableStruct# with a known number of words and a known
>>>>> number of pointers is basically what Ryan Yates was suggesting above, but
>>>>> where the word counts were stored in the objects themselves.
>>>>>
>>>>> Given that it'd have a couple of words for those counts it'd likely
>>>>> want to be something we build in addition to MutVar# rather than a
>>>>> replacement.
>>>>>
>>>>> On the other hand, if we had to fix those numbers and build info
>>>>> tables that knew them, and typechecker support, for instance, it'd get
>>>>> rather invasive.
>>>>>
>>>>> Also, a number of things that we can do with the 'sized' versions
>>>>> above, like working with evil unsized c-style arrays directly inline at 
>>>>> the
>>>>> end of the structure cease to be possible, so it isn't even a pure win if
>>>>> we did the engineering effort.
>>>>>
>>>>> I think 90% of the needs I have are covered just by adding the one
>>>>> primitive. The last 10% gets pretty invasive.
>>>>>
>>>>> -Edward
>>>>>
>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
>>>>> wrote:
>>>>>
>>>>>> I like the possibility of a general solution for mutable structs
>>>>>> (like Ed said), and I'm trying to fully understand why it's hard.
>>>>>>
>>>>>> So, we can't unpack MutVar into constructors because of object
>>>>>> identity problems. But what about directly supporting an extensible set 
>>>>>> of
>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#?
>>>>>> That may be too much work, but is it problematic otherwise?
>>>>>>
>>>>>> Needless to say, this is also critical if we ever want best in class
>>>>>> lockfree mutable structures, just like their Stm and sequential
>>>>>> counterparts.
>>>>>>
>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones <
>>>>>> simo...@microsoft.com> wrote:
>>>>>>
>>>>>>> At the very least I'll take this email and turn it into a short
>>>>>>> article.
>>>>>>>
>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe
>>>>>>> make a ticket for it.
>>>>>>>
>>>>&

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
Well, on the plus side you'd save 16 bytes per object, which adds up if
they were small enough and there are enough of them. You get a bit better
locality of reference in terms of what fits in the first cache line of them.

-Edward

On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton  wrote:

> Yes. And for the short term I can imagine places we will settle with
> arrays even if it means tracking lengths unnecessarily and unsafeCoercing
> pointers whose types don't actually match their siblings.
>
> Is there anything to recommend the hacks mentioned for fixed sized array
> objects *other* than using them to fake structs? (Much to derecommend, as
> you mentioned!)
>
> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett  wrote:
>
>> I think both are useful, but the one you suggest requires a lot more
>> plumbing and doesn't subsume all of the usecases of the other.
>>
>> -Edward
>>
>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton  wrote:
>>
>>> So that primitive is an array like thing (Same pointed type, unbounded
>>> length) with extra payload.
>>>
>>> I can see how we can do without structs if we have arrays, especially
>>> with the extra payload at front. But wouldn't the general solution for
>>> structs be one that that allows new user data type defs for # types?
>>>
>>>
>>>
>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>>>
>>>> Some form of MutableStruct# with a known number of words and a known
>>>> number of pointers is basically what Ryan Yates was suggesting above, but
>>>> where the word counts were stored in the objects themselves.
>>>>
>>>> Given that it'd have a couple of words for those counts it'd likely
>>>> want to be something we build in addition to MutVar# rather than a
>>>> replacement.
>>>>
>>>> On the other hand, if we had to fix those numbers and build info tables
>>>> that knew them, and typechecker support, for instance, it'd get rather
>>>> invasive.
>>>>
>>>> Also, a number of things that we can do with the 'sized' versions
>>>> above, like working with evil unsized c-style arrays directly inline at the
>>>> end of the structure cease to be possible, so it isn't even a pure win if
>>>> we did the engineering effort.
>>>>
>>>> I think 90% of the needs I have are covered just by adding the one
>>>> primitive. The last 10% gets pretty invasive.
>>>>
>>>> -Edward
>>>>
>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton 
>>>> wrote:
>>>>
>>>>> I like the possibility of a general solution for mutable structs (like
>>>>> Ed said), and I'm trying to fully understand why it's hard.
>>>>>
>>>>> So, we can't unpack MutVar into constructors because of object
>>>>> identity problems. But what about directly supporting an extensible set of
>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#?
>>>>> That may be too much work, but is it problematic otherwise?
>>>>>
>>>>> Needless to say, this is also critical if we ever want best in class
>>>>> lockfree mutable structures, just like their Stm and sequential
>>>>> counterparts.
>>>>>
>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones <
>>>>> simo...@microsoft.com> wrote:
>>>>>
>>>>>> At the very least I'll take this email and turn it into a short
>>>>>> article.
>>>>>>
>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe
>>>>>> make a ticket for it.
>>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>>
>>>>>>
>>>>>> *From:* Edward Kmett [mailto:ekm...@gmail.com]
>>>>>> *Sent:* 27 August 2015 16:54
>>>>>> *To:* Simon Peyton Jones
>>>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>>>> *Subject:* Re: ArrayArrays
>>>>>>
>>>>>>
>>>>>>
>>>>>> An ArrayArray# is just an Array# with a modified invariant. It points
>>>>>> directly to other unlifted ArrayArray#'s or B

Re: ArrayArrays

2015-08-28 Thread Ryan Newton
Yes. And for the short term I can imagine places we will settle with arrays
even if it means tracking lengths unnecessarily and unsafeCoercing pointers
whose types don't actually match their siblings.

Is there anything to recommend the hacks mentioned for fixed sized array
objects *other* than using them to fake structs? (Much to derecommend, as
you mentioned!)
On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett  wrote:

> I think both are useful, but the one you suggest requires a lot more
> plumbing and doesn't subsume all of the usecases of the other.
>
> -Edward
>
> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton  wrote:
>
>> So that primitive is an array like thing (Same pointed type, unbounded
>> length) with extra payload.
>>
>> I can see how we can do without structs if we have arrays, especially
>> with the extra payload at front. But wouldn't the general solution for
>> structs be one that that allows new user data type defs for # types?
>>
>>
>>
>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>>
>>> Some form of MutableStruct# with a known number of words and a known
>>> number of pointers is basically what Ryan Yates was suggesting above, but
>>> where the word counts were stored in the objects themselves.
>>>
>>> Given that it'd have a couple of words for those counts it'd likely want
>>> to be something we build in addition to MutVar# rather than a replacement.
>>>
>>> On the other hand, if we had to fix those numbers and build info tables
>>> that knew them, and typechecker support, for instance, it'd get rather
>>> invasive.
>>>
>>> Also, a number of things that we can do with the 'sized' versions above,
>>> like working with evil unsized c-style arrays directly inline at the end of
>>> the structure cease to be possible, so it isn't even a pure win if we did
>>> the engineering effort.
>>>
>>> I think 90% of the needs I have are covered just by adding the one
>>> primitive. The last 10% gets pretty invasive.
>>>
>>> -Edward
>>>
>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton  wrote:
>>>
>>>> I like the possibility of a general solution for mutable structs (like
>>>> Ed said), and I'm trying to fully understand why it's hard.
>>>>
>>>> So, we can't unpack MutVar into constructors because of object identity
>>>> problems. But what about directly supporting an extensible set of unlifted
>>>> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
>>>> too much work, but is it problematic otherwise?
>>>>
>>>> Needless to say, this is also critical if we ever want best in class
>>>> lockfree mutable structures, just like their Stm and sequential
>>>> counterparts.
>>>>
>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones <
>>>> simo...@microsoft.com> wrote:
>>>>
>>>>> At the very least I'll take this email and turn it into a short
>>>>> article.
>>>>>
>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe
>>>>> make a ticket for it.
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>
>>>>> Simon
>>>>>
>>>>>
>>>>>
>>>>> *From:* Edward Kmett [mailto:ekm...@gmail.com]
>>>>> *Sent:* 27 August 2015 16:54
>>>>> *To:* Simon Peyton Jones
>>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>>> *Subject:* Re: ArrayArrays
>>>>>
>>>>>
>>>>>
>>>>> An ArrayArray# is just an Array# with a modified invariant. It points
>>>>> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>>>
>>>>>
>>>>>
>>>>> While those live in #, they are garbage collected objects, so this all
>>>>> lives on the heap.
>>>>>
>>>>>
>>>>>
>>>>> They were added to make some of the DPH stuff fast when it has to deal
>>>>> with nested arrays.
>>>>>
>>>>>
>>>>>
>>>>> I'm currently abusing them as a placeholder for a better thing.
>>>>>
>>>>>
>>>>>
>>>>> The Probl

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
I think both are useful, but the one you suggest requires a lot more
plumbing and doesn't subsume all of the usecases of the other.

-Edward

On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton  wrote:

> So that primitive is an array like thing (Same pointed type, unbounded
> length) with extra payload.
>
> I can see how we can do without structs if we have arrays, especially with
> the extra payload at front. But wouldn't the general solution for structs
> be one that that allows new user data type defs for # types?
>
>
>
> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:
>
>> Some form of MutableStruct# with a known number of words and a known
>> number of pointers is basically what Ryan Yates was suggesting above, but
>> where the word counts were stored in the objects themselves.
>>
>> Given that it'd have a couple of words for those counts it'd likely want
>> to be something we build in addition to MutVar# rather than a replacement.
>>
>> On the other hand, if we had to fix those numbers and build info tables
>> that knew them, and typechecker support, for instance, it'd get rather
>> invasive.
>>
>> Also, a number of things that we can do with the 'sized' versions above,
>> like working with evil unsized c-style arrays directly inline at the end of
>> the structure cease to be possible, so it isn't even a pure win if we did
>> the engineering effort.
>>
>> I think 90% of the needs I have are covered just by adding the one
>> primitive. The last 10% gets pretty invasive.
>>
>> -Edward
>>
>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton  wrote:
>>
>>> I like the possibility of a general solution for mutable structs (like
>>> Ed said), and I'm trying to fully understand why it's hard.
>>>
>>> So, we can't unpack MutVar into constructors because of object identity
>>> problems. But what about directly supporting an extensible set of unlifted
>>> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
>>> too much work, but is it problematic otherwise?
>>>
>>> Needless to say, this is also critical if we ever want best in class
>>> lockfree mutable structures, just like their Stm and sequential
>>> counterparts.
>>>
>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones <
>>> simo...@microsoft.com> wrote:
>>>
>>>> At the very least I'll take this email and turn it into a short article.
>>>>
>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make
>>>> a ticket for it.
>>>>
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>> Simon
>>>>
>>>>
>>>>
>>>> *From:* Edward Kmett [mailto:ekm...@gmail.com]
>>>> *Sent:* 27 August 2015 16:54
>>>> *To:* Simon Peyton Jones
>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>>> *Subject:* Re: ArrayArrays
>>>>
>>>>
>>>>
>>>> An ArrayArray# is just an Array# with a modified invariant. It points
>>>> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>>
>>>>
>>>>
>>>> While those live in #, they are garbage collected objects, so this all
>>>> lives on the heap.
>>>>
>>>>
>>>>
>>>> They were added to make some of the DPH stuff fast when it has to deal
>>>> with nested arrays.
>>>>
>>>>
>>>>
>>>> I'm currently abusing them as a placeholder for a better thing.
>>>>
>>>>
>>>>
>>>> The Problem
>>>>
>>>> -
>>>>
>>>>
>>>>
>>>> Consider the scenario where you write a classic doubly-linked list in
>>>> Haskell.
>>>>
>>>>
>>>>
>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>>>
>>>>
>>>>
>>>> Chasing from one DLL to the next requires following 3 pointers on the
>>>> heap.
>>>>
>>>>
>>>>
>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL
>>>> ~> DLL
>>>>
>>>>
>>>>
>>>> That is 3 levels of indirection.
>>>>
>>>>
>>>>
>>>> We can trim one by simply unpacking the IORef

Re: ArrayArrays

2015-08-28 Thread Ryan Newton
So that primitive is an array like thing (Same pointed type, unbounded
length) with extra payload.

I can see how we can do without structs if we have arrays, especially with
the extra payload at front. But wouldn't the general solution for structs
be one that that allows new user data type defs for # types?


On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett  wrote:

> Some form of MutableStruct# with a known number of words and a known
> number of pointers is basically what Ryan Yates was suggesting above, but
> where the word counts were stored in the objects themselves.
>
> Given that it'd have a couple of words for those counts it'd likely want
> to be something we build in addition to MutVar# rather than a replacement.
>
> On the other hand, if we had to fix those numbers and build info tables
> that knew them, and typechecker support, for instance, it'd get rather
> invasive.
>
> Also, a number of things that we can do with the 'sized' versions above,
> like working with evil unsized c-style arrays directly inline at the end of
> the structure cease to be possible, so it isn't even a pure win if we did
> the engineering effort.
>
> I think 90% of the needs I have are covered just by adding the one
> primitive. The last 10% gets pretty invasive.
>
> -Edward
>
> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton  wrote:
>
>> I like the possibility of a general solution for mutable structs (like Ed
>> said), and I'm trying to fully understand why it's hard.
>>
>> So, we can't unpack MutVar into constructors because of object identity
>> problems. But what about directly supporting an extensible set of unlifted
>> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
>> too much work, but is it problematic otherwise?
>>
>> Needless to say, this is also critical if we ever want best in class
>> lockfree mutable structures, just like their Stm and sequential
>> counterparts.
>>
>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones 
>> wrote:
>>
>>> At the very least I'll take this email and turn it into a short article.
>>>
>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make
>>> a ticket for it.
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Simon
>>>
>>>
>>>
>>> *From:* Edward Kmett [mailto:ekm...@gmail.com]
>>> *Sent:* 27 August 2015 16:54
>>> *To:* Simon Peyton Jones
>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>>> *Subject:* Re: ArrayArrays
>>>
>>>
>>>
>>> An ArrayArray# is just an Array# with a modified invariant. It points
>>> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>>
>>>
>>>
>>> While those live in #, they are garbage collected objects, so this all
>>> lives on the heap.
>>>
>>>
>>>
>>> They were added to make some of the DPH stuff fast when it has to deal
>>> with nested arrays.
>>>
>>>
>>>
>>> I'm currently abusing them as a placeholder for a better thing.
>>>
>>>
>>>
>>> The Problem
>>>
>>> -
>>>
>>>
>>>
>>> Consider the scenario where you write a classic doubly-linked list in
>>> Haskell.
>>>
>>>
>>>
>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>>
>>>
>>>
>>> Chasing from one DLL to the next requires following 3 pointers on the
>>> heap.
>>>
>>>
>>>
>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL
>>> ~> DLL
>>>
>>>
>>>
>>> That is 3 levels of indirection.
>>>
>>>
>>>
>>> We can trim one by simply unpacking the IORef with -funbox-strict-fields
>>> or UNPACK
>>>
>>>
>>>
>>> We can trim another by adding a 'Nil' constructor for DLL and worsening
>>> our representation.
>>>
>>>
>>>
>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>>>
>>>
>>>
>>> but now we're still stuck with a level of indirection
>>>
>>>
>>>
>>> DLL ~> MutVar# RealWorld DLL ~> DLL
>>>
>>>
>>>
>>> This means that every operation we perform on this structure will be
>>> about half of the speed of an implementation in most other la

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
Some form of MutableStruct# with a known number of words and a known number
of pointers is basically what Ryan Yates was suggesting above, but where
the word counts were stored in the objects themselves.

Given that it'd have a couple of words for those counts it'd likely want to
be something we build in addition to MutVar# rather than a replacement.

On the other hand, if we had to fix those numbers and build info tables
that knew them, and typechecker support, for instance, it'd get rather
invasive.

Also, a number of things that we can do with the 'sized' versions above,
like working with evil unsized c-style arrays directly inline at the end of
the structure cease to be possible, so it isn't even a pure win if we did
the engineering effort.

I think 90% of the needs I have are covered just by adding the one
primitive. The last 10% gets pretty invasive.

-Edward

On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton  wrote:

> I like the possibility of a general solution for mutable structs (like Ed
> said), and I'm trying to fully understand why it's hard.
>
> So, we can't unpack MutVar into constructors because of object identity
> problems. But what about directly supporting an extensible set of unlifted
> MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
> too much work, but is it problematic otherwise?
>
> Needless to say, this is also critical if we ever want best in class
> lockfree mutable structures, just like their Stm and sequential
> counterparts.
>
> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones 
> wrote:
>
>> At the very least I'll take this email and turn it into a short article.
>>
>> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a
>> ticket for it.
>>
>>
>> Thanks
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* Edward Kmett [mailto:ekm...@gmail.com]
>> *Sent:* 27 August 2015 16:54
>> *To:* Simon Peyton Jones
>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
>> *Subject:* Re: ArrayArrays
>>
>>
>>
>> An ArrayArray# is just an Array# with a modified invariant. It points
>> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>>
>>
>>
>> While those live in #, they are garbage collected objects, so this all
>> lives on the heap.
>>
>>
>>
>> They were added to make some of the DPH stuff fast when it has to deal
>> with nested arrays.
>>
>>
>>
>> I'm currently abusing them as a placeholder for a better thing.
>>
>>
>>
>> The Problem
>>
>> -
>>
>>
>>
>> Consider the scenario where you write a classic doubly-linked list in
>> Haskell.
>>
>>
>>
>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>>
>>
>>
>> Chasing from one DLL to the next requires following 3 pointers on the
>> heap.
>>
>>
>>
>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~>
>> DLL
>>
>>
>>
>> That is 3 levels of indirection.
>>
>>
>>
>> We can trim one by simply unpacking the IORef with -funbox-strict-fields
>> or UNPACK
>>
>>
>>
>> We can trim another by adding a 'Nil' constructor for DLL and worsening
>> our representation.
>>
>>
>>
>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>>
>>
>>
>> but now we're still stuck with a level of indirection
>>
>>
>>
>> DLL ~> MutVar# RealWorld DLL ~> DLL
>>
>>
>>
>> This means that every operation we perform on this structure will be
>> about half of the speed of an implementation in most other languages
>> assuming we're memory bound on loading things into cache!
>>
>>
>>
>> Making Progress
>>
>> --
>>
>>
>>
>> I have been working on a number of data structures where the indirection
>> of going from something in * out to an object in # which contains the real
>> pointer to my target and coming back effectively doubles my runtime.
>>
>>
>>
>> We go out to the MutVar# because we are allowed to put the MutVar# onto
>> the mutable list when we dirty it. There is a well defined write-barrier.
>>
>>
>>
>> I could change out the representation to use
>>
>>
>>
>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>>
>>
>>
>> I can just store two pointers in the MutableArray# every time, but this
>

Re: ArrayArrays

2015-08-28 Thread Ryan Newton
I like the possibility of a general solution for mutable structs (like Ed
said), and I'm trying to fully understand why it's hard.

So, we can't unpack MutVar into constructors because of object identity
problems. But what about directly supporting an extensible set of unlifted
MutStruct# objects, generalizing (and even replacing) MutVar#? That may be
too much work, but is it problematic otherwise?

Needless to say, this is also critical if we ever want best in class
lockfree mutable structures, just like their Stm and sequential
counterparts.

On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones 
wrote:

> At the very least I'll take this email and turn it into a short article.
>
> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a
> ticket for it.
>
>
> Thanks
>
>
>
> Simon
>
>
>
> *From:* Edward Kmett [mailto:ekm...@gmail.com]
> *Sent:* 27 August 2015 16:54
> *To:* Simon Peyton Jones
> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
> *Subject:* Re: ArrayArrays
>
>
>
> An ArrayArray# is just an Array# with a modified invariant. It points
> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>
>
>
> While those live in #, they are garbage collected objects, so this all
> lives on the heap.
>
>
>
> They were added to make some of the DPH stuff fast when it has to deal
> with nested arrays.
>
>
>
> I'm currently abusing them as a placeholder for a better thing.
>
>
>
> The Problem
>
> -
>
>
>
> Consider the scenario where you write a classic doubly-linked list in
> Haskell.
>
>
>
> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>
>
>
> Chasing from one DLL to the next requires following 3 pointers on the heap.
>
>
>
> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~>
> DLL
>
>
>
> That is 3 levels of indirection.
>
>
>
> We can trim one by simply unpacking the IORef with -funbox-strict-fields
> or UNPACK
>
>
>
> We can trim another by adding a 'Nil' constructor for DLL and worsening
> our representation.
>
>
>
> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>
>
>
> but now we're still stuck with a level of indirection
>
>
>
> DLL ~> MutVar# RealWorld DLL ~> DLL
>
>
>
> This means that every operation we perform on this structure will be about
> half of the speed of an implementation in most other languages assuming
> we're memory bound on loading things into cache!
>
>
>
> Making Progress
>
> --
>
>
>
> I have been working on a number of data structures where the indirection
> of going from something in * out to an object in # which contains the real
> pointer to my target and coming back effectively doubles my runtime.
>
>
>
> We go out to the MutVar# because we are allowed to put the MutVar# onto
> the mutable list when we dirty it. There is a well defined write-barrier.
>
>
>
> I could change out the representation to use
>
>
>
> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>
>
>
> I can just store two pointers in the MutableArray# every time, but this
> doesn't help _much_ directly. It has reduced the amount of distinct
> addresses in memory I touch on a walk of the DLL from 3 per object to 2.
>
>
>
> I still have to go out to the heap from my DLL and get to the array object
> and then chase it to the next DLL and chase that to the next array. I do
> get my two pointers together in memory though. I'm paying for a card
> marking table as well, which I don't particularly need with just two
> pointers, but we can shed that with the "SmallMutableArray#" machinery
> added back in 7.10, which is just the old array code a a new data type,
> which can speed things up a bit when you don't have very big arrays:
>
>
>
> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>
>
>
> But what if I wanted my object itself to live in # and have two mutable
> fields and be able to share the sme write barrier?
>
>
>
> An ArrayArray# points directly to other unlifted array types. What if we
> have one # -> * wrapper on the outside to deal with the impedence mismatch
> between the imperative world and Haskell, and then just let the
> ArrayArray#'s hold other arrayarrays.
>
>
>
> data DLL = DLL (MutableArrayArray# RealWorld)
>
>
>
> now I need to make up a new Nil, which I can just make be a special
> MutableArrayArray# I allocate on program startup. I can even abuse pattern
> synonyms. Alternately I can exploit the internals furthe

Re: ArrayArrays

2015-08-28 Thread Edward Kmett
I posted a summary article on "what this lets you do" to

https://www.fpcomplete.com/user/edwardk/unlifted-structures

I can see about making a more proposal/feature-oriented summary for the
Haskell Wiki. It may have to wait until after ICFP though.

-Edward

On Fri, Aug 28, 2015 at 5:42 AM, Simon Peyton Jones 
wrote:

> At the very least I'll take this email and turn it into a short article.
>
> Yes, please do make it into a wiki page on the GHC Trac, and maybe make a
> ticket for it.
>
>
> Thanks
>
>
>
> Simon
>
>
>
> *From:* Edward Kmett [mailto:ekm...@gmail.com]
> *Sent:* 27 August 2015 16:54
> *To:* Simon Peyton Jones
> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs
> *Subject:* Re: ArrayArrays
>
>
>
> An ArrayArray# is just an Array# with a modified invariant. It points
> directly to other unlifted ArrayArray#'s or ByteArray#'s.
>
>
>
> While those live in #, they are garbage collected objects, so this all
> lives on the heap.
>
>
>
> They were added to make some of the DPH stuff fast when it has to deal
> with nested arrays.
>
>
>
> I'm currently abusing them as a placeholder for a better thing.
>
>
>
> The Problem
>
> -
>
>
>
> Consider the scenario where you write a classic doubly-linked list in
> Haskell.
>
>
>
> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
>
>
>
> Chasing from one DLL to the next requires following 3 pointers on the heap.
>
>
>
> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~>
> DLL
>
>
>
> That is 3 levels of indirection.
>
>
>
> We can trim one by simply unpacking the IORef with -funbox-strict-fields
> or UNPACK
>
>
>
> We can trim another by adding a 'Nil' constructor for DLL and worsening
> our representation.
>
>
>
> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
>
>
>
> but now we're still stuck with a level of indirection
>
>
>
> DLL ~> MutVar# RealWorld DLL ~> DLL
>
>
>
> This means that every operation we perform on this structure will be about
> half of the speed of an implementation in most other languages assuming
> we're memory bound on loading things into cache!
>
>
>
> Making Progress
>
> --
>
>
>
> I have been working on a number of data structures where the indirection
> of going from something in * out to an object in # which contains the real
> pointer to my target and coming back effectively doubles my runtime.
>
>
>
> We go out to the MutVar# because we are allowed to put the MutVar# onto
> the mutable list when we dirty it. There is a well defined write-barrier.
>
>
>
> I could change out the representation to use
>
>
>
> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
>
>
>
> I can just store two pointers in the MutableArray# every time, but this
> doesn't help _much_ directly. It has reduced the amount of distinct
> addresses in memory I touch on a walk of the DLL from 3 per object to 2.
>
>
>
> I still have to go out to the heap from my DLL and get to the array object
> and then chase it to the next DLL and chase that to the next array. I do
> get my two pointers together in memory though. I'm paying for a card
> marking table as well, which I don't particularly need with just two
> pointers, but we can shed that with the "SmallMutableArray#" machinery
> added back in 7.10, which is just the old array code a a new data type,
> which can speed things up a bit when you don't have very big arrays:
>
>
>
> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
>
>
>
> But what if I wanted my object itself to live in # and have two mutable
> fields and be able to share the sme write barrier?
>
>
>
> An ArrayArray# points directly to other unlifted array types. What if we
> have one # -> * wrapper on the outside to deal with the impedence mismatch
> between the imperative world and Haskell, and then just let the
> ArrayArray#'s hold other arrayarrays.
>
>
>
> data DLL = DLL (MutableArrayArray# RealWorld)
>
>
>
> now I need to make up a new Nil, which I can just make be a special
> MutableArrayArray# I allocate on program startup. I can even abuse pattern
> synonyms. Alternately I can exploit the internals further to make this
> cheaper.
>
>
>
> Then I can use the readMutableArrayArray# and writeMutableArrayArray#
> calls to directly access the preceding and next entry in the linked list.
>
>
>
> So now we have one DLL wrapper which just 'bootstraps me' int

RE: ArrayArrays

2015-08-28 Thread Simon Peyton Jones
At the very least I'll take this email and turn it into a short article.
Yes, please do make it into a wiki page on the GHC Trac, and maybe make a 
ticket for it.

Thanks

Simon

From: Edward Kmett [mailto:ekm...@gmail.com]
Sent: 27 August 2015 16:54
To: Simon Peyton Jones
Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
Subject: Re: ArrayArrays

An ArrayArray# is just an Array# with a modified invariant. It points directly 
to other unlifted ArrayArray#'s or ByteArray#'s.

While those live in #, they are garbage collected objects, so this all lives on 
the heap.

They were added to make some of the DPH stuff fast when it has to deal with 
nested arrays.

I'm currently abusing them as a placeholder for a better thing.

The Problem
-

Consider the scenario where you write a classic doubly-linked list in Haskell.

data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)

Chasing from one DLL to the next requires following 3 pointers on the heap.

DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~> DLL

That is 3 levels of indirection.

We can trim one by simply unpacking the IORef with -funbox-strict-fields or 
UNPACK

We can trim another by adding a 'Nil' constructor for DLL and worsening our 
representation.

data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

but now we're still stuck with a level of indirection

DLL ~> MutVar# RealWorld DLL ~> DLL

This means that every operation we perform on this structure will be about half 
of the speed of an implementation in most other languages assuming we're memory 
bound on loading things into cache!

Making Progress
--

I have been working on a number of data structures where the indirection of 
going from something in * out to an object in # which contains the real pointer 
to my target and coming back effectively doubles my runtime.

We go out to the MutVar# because we are allowed to put the MutVar# onto the 
mutable list when we dirty it. There is a well defined write-barrier.

I could change out the representation to use

data DLL = DLL (MutableArray# RealWorld DLL) | Nil

I can just store two pointers in the MutableArray# every time, but this doesn't 
help _much_ directly. It has reduced the amount of distinct addresses in memory 
I touch on a walk of the DLL from 3 per object to 2.

I still have to go out to the heap from my DLL and get to the array object and 
then chase it to the next DLL and chase that to the next array. I do get my two 
pointers together in memory though. I'm paying for a card marking table as 
well, which I don't particularly need with just two pointers, but we can shed 
that with the "SmallMutableArray#" machinery added back in 7.10, which is just 
the old array code a a new data type, which can speed things up a bit when you 
don't have very big arrays:

data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

But what if I wanted my object itself to live in # and have two mutable fields 
and be able to share the sme write barrier?

An ArrayArray# points directly to other unlifted array types. What if we have 
one # -> * wrapper on the outside to deal with the impedence mismatch between 
the imperative world and Haskell, and then just let the ArrayArray#'s hold 
other arrayarrays.

data DLL = DLL (MutableArrayArray# RealWorld)

now I need to make up a new Nil, which I can just make be a special 
MutableArrayArray# I allocate on program startup. I can even abuse pattern 
synonyms. Alternately I can exploit the internals further to make this cheaper.

Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls to 
directly access the preceding and next entry in the linked list.

So now we have one DLL wrapper which just 'bootstraps me' into a strict world, 
and everything there lives in #.

next :: DLL -> IO DLL
next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
   (# s', n #) -> (# s', DLL n #)

It turns out GHC is quite happy to optimize all of that code to keep things 
unboxed. The 'DLL' wrappers get removed pretty easily when they are known 
strict and you chain operations of this sort!

Cleaning it Up
--

Now I have one outermost indirection pointing to an array that points directly 
to other arrays.

I'm stuck paying for a card marking table per object, but I can fix that by 
duplicating the code for MutableArrayArray# and using a SmallMutableArray#. I 
can hack up primops that let me store a mixture of SmallMutableArray# fields 
and normal ones in the data structure. Operationally, I can even do so by just 
unsafeCoercing the existing SmallMutableArray# primitives to change the kind of 
one of the arguments it takes.

This is almost ideal, but not quite. I often have fields that would be best 
left unboxed.

data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil

was able to unpack the Int, but we 

Re: ArrayArrays

2015-08-28 Thread Simon Marlow

On 27/08/2015 19:36, Edward Kmett wrote:

On Thu, Aug 27, 2015 at 1:24 PM, Edward Z. Yang mailto:ezy...@mit.edu>> wrote:

It seems to me that we should take a page from OCaml's playbook
and add support for native mutable fields in objects, because
this is essentially what a mix of words and pointers is.


That actually doesn't work as well as one might hope.

We currently treat data constructor closures as so much tissue paper
around a present. We tear them open, rip out all their contents, scatter
them throughout our code and then we build a whole new data constructor
closure when we're done, or we just leave them suspended in closures
awaiting someone to demand we finally make a new data constructor.

Half the time we don't even give back the data constructor closure and
push it into update g frames and we just give back the items on the stack.

With the machinery I mentioned above I get a world where every time I
access an object I can know it is evaluated for real, so this means I'm
not stuck 'entering an unknown closure', and getting it to give me back
a slab of memory that we know is a real data constructor that i can bang
away on mutable entries in.

In a world where things in * could hold mutable pointers we have to care
a lot more about object identity in deeply uncomfortable ways.

With what I've implemented I only care about object identity between
things in # that are gcptrs. The garbage collector may move them around,
but it doesn't put in thunks anywhere.


Yeah, I've actually thought about whether we could have mutable fields 
in constructors a couple of times, and it's far from easy for the 
reasons you describe.  A constructor with mutable fields would need to 
be an object with identity, with precise control over when it is 
created.  This is nothing like an ordinary constructor.


I like the alternative approach in this thread, which is to attack the 
problem from the other end: start with a primitive object and make it 
more like a constructor.


I don't see any reason why we shouldn't add primops to read/write 
SmallArray# and other primitive objects in an ArrayArray#.  Will someone 
make a patch?  It should be pretty straightforward.


Cheers,
Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-27 Thread Edward Kmett
On Thu, Aug 27, 2015 at 1:24 PM, Edward Z. Yang  wrote:

> It seems to me that we should take a page from OCaml's playbook
> and add support for native mutable fields in objects, because
> this is essentially what a mix of words and pointers is.
>

That actually doesn't work as well as one might hope.

We currently treat data constructor closures as so much tissue paper around
a present. We tear them open, rip out all their contents, scatter them
throughout our code and then we build a whole new data constructor closure
when we're done, or we just leave them suspended in closures awaiting
someone to demand we finally make a new data constructor.

Half the time we don't even give back the data constructor closure and push
it into update g frames and we just give back the items on the stack.

With the machinery I mentioned above I get a world where every time I
access an object I can know it is evaluated for real, so this means I'm not
stuck 'entering an unknown closure', and getting it to give me back a slab
of memory that we know is a real data constructor that i can bang away on
mutable entries in.

In a world where things in * could hold mutable pointers we have to care a
lot more about object identity in deeply uncomfortable ways.

With what I've implemented I only care about object identity between things
in # that are gcptrs. The garbage collector may move them around, but it
doesn't put in thunks anywhere.

-Edward
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-27 Thread Edward Z. Yang
nto the SmallArray
> > as slots, removing the limitation  we see here by aping the
> > MutableArrayArray# s API, but that gets really really dangerous!
> >
> > I'm pretty much willing to sacrifice almost anything on the altar of speed
> > here, but I'd like to be able to let the GC move them and collect them which
> > rules out simpler Ptr and Addr based solutions.
> >
> > -Edward
> >
> > On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
> >  wrote:
> >>
> >> That’s an interesting idea.
> >>
> >> Manuel
> >>
> >> > Edward Kmett :
> >> >
> >> > Would it be possible to add unsafe primops to add Array# and SmallArray#
> >> > entries to an ArrayArray#? The fact that the ArrayArray# entries are all
> >> > directly unlifted avoiding a level of indirection for the containing
> >> > structure is amazing, but I can only currently use it if my leaf level 
> >> > data
> >> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be
> >> > able to have the ability to put SmallArray# a stuff down at the leaves to
> >> > hold lifted contents.
> >> >
> >> > I accept fully that if I name the wrong type when I go to access one of
> >> > the fields it'll lie to me, but I suppose it'd do that if i tried to use 
> >> > one
> >> > of the members that held a nested ArrayArray# as a ByteArray# anyways, 
> >> > so it
> >> > isn't like there is a safety story preventing this.
> >> >
> >> > I've been hunting for ways to try to kill the indirection problems I get
> >> > with Haskell and mutable structures, and I could shoehorn a number of 
> >> > them
> >> > into ArrayArrays if this worked.
> >> >
> >> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
> >> > compared to c/java and this could reduce that pain to just 1 level of
> >> > unnecessary indirection.
> >> >
> >> > -Edward
> >> > ___
> >> > ghc-devs mailing list
> >> > ghc-devs@haskell.org
> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>
> >
> >
> > ___
> > ghc-devs mailing list
> > ghc-devs@haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-27 Thread Edward Kmett
An ArrayArray# is just an Array# with a modified invariant. It points
directly to other unlifted ArrayArray#'s or ByteArray#'s.

While those live in #, they are garbage collected objects, so this all
lives on the heap.

They were added to make some of the DPH stuff fast when it has to deal with
nested arrays.

I'm currently abusing them as a placeholder for a better thing.

The Problem
-

Consider the scenario where you write a classic doubly-linked list in
Haskell.

data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)

Chasing from one DLL to the next requires following 3 pointers on the heap.

DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe DLL ~>
DLL

That is 3 levels of indirection.

We can trim one by simply unpacking the IORef with -funbox-strict-fields or
UNPACK

We can trim another by adding a 'Nil' constructor for DLL and worsening our
representation.

data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil

but now we're still stuck with a level of indirection

DLL ~> MutVar# RealWorld DLL ~> DLL

This means that every operation we perform on this structure will be about
half of the speed of an implementation in most other languages assuming
we're memory bound on loading things into cache!

Making Progress
--

I have been working on a number of data structures where the indirection of
going from something in * out to an object in # which contains the real
pointer to my target and coming back effectively doubles my runtime.

We go out to the MutVar# because we are allowed to put the MutVar# onto the
mutable list when we dirty it. There is a well defined write-barrier.

I could change out the representation to use

data DLL = DLL (MutableArray# RealWorld DLL) | Nil

I can just store two pointers in the MutableArray# every time, but this
doesn't help _much_ directly. It has reduced the amount of distinct
addresses in memory I touch on a walk of the DLL from 3 per object to 2.

I still have to go out to the heap from my DLL and get to the array object
and then chase it to the next DLL and chase that to the next array. I do
get my two pointers together in memory though. I'm paying for a card
marking table as well, which I don't particularly need with just two
pointers, but we can shed that with the "SmallMutableArray#" machinery
added back in 7.10, which is just the old array code a a new data type,
which can speed things up a bit when you don't have very big arrays:

data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil

But what if I wanted my object itself to live in # and have two mutable
fields and be able to share the sme write barrier?

An ArrayArray# points directly to other unlifted array types. What if we
have one # -> * wrapper on the outside to deal with the impedence mismatch
between the imperative world and Haskell, and then just let the
ArrayArray#'s hold other arrayarrays.

data DLL = DLL (MutableArrayArray# RealWorld)

now I need to make up a new Nil, which I can just make be a special
MutableArrayArray# I allocate on program startup. I can even abuse pattern
synonyms. Alternately I can exploit the internals further to make this
cheaper.

Then I can use the readMutableArrayArray# and writeMutableArrayArray# calls
to directly access the preceding and next entry in the linked list.

So now we have one DLL wrapper which just 'bootstraps me' into a strict
world, and everything there lives in #.

next :: DLL -> IO DLL
next (DLL m) = IO $ \s -> case readMutableArrayArray# s of
   (# s', n #) -> (# s', DLL n #)

It turns out GHC is quite happy to optimize all of that code to keep things
unboxed. The 'DLL' wrappers get removed pretty easily when they are known
strict and you chain operations of this sort!

Cleaning it Up
--

Now I have one outermost indirection pointing to an array that points
directly to other arrays.

I'm stuck paying for a card marking table per object, but I can fix that by
duplicating the code for MutableArrayArray# and using a SmallMutableArray#.
I can hack up primops that let me store a mixture of SmallMutableArray#
fields and normal ones in the data structure. Operationally, I can even do
so by just unsafeCoercing the existing SmallMutableArray# primitives to
change the kind of one of the arguments it takes.

This is almost ideal, but not quite. I often have fields that would be best
left unboxed.

data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil

was able to unpack the Int, but we lost that. We can currently at best
point one of the entries of the SmallMutableArray# at a boxed or at a
MutableByteArray# for all of our misc. data and shove the int in question
in there.

e.g. if I were to implement a hash-array-mapped-trie I need to store masks
and administrivia as I walk down the tree. Having to go off to the side
costs me the entire win from avoiding the first poi

RE: ArrayArrays

2015-08-27 Thread Simon Peyton Jones
Just to say that I have no idea what is going on in this thread.  What is 
ArrayArray?  What is the issue in general?  Is there a ticket? Is there a wiki 
page?

If it’s important, an ab-initio wiki page + ticket would be a good thing.

Simon

From: ghc-devs [mailto:ghc-devs-boun...@haskell.org] On Behalf Of Edward Kmett
Sent: 21 August 2015 05:25
To: Manuel M T Chakravarty
Cc: Simon Marlow; ghc-devs
Subject: Re: ArrayArrays

When (ab)using them for this purpose, SmallArrayArray's would be very handy as 
well.

Consider right now if I have something like an order-maintenance structure I 
have:

data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} 
!(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))

data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} 
!(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-} 
!(MutVar s (Lower s))

The former contains, logically, a mutable integer and two pointers, one for 
forward and one for backwards. The latter is basically the same thing with a 
mutable reference up pointing at the structure above.

On the heap this is an object that points to a structure for the bytearray, and 
points to another structure for each mutvar which each point to the other 
'Upper' structure. So there is a level of indirection smeared over everything.

So this is a pair of doubly linked lists with an upward link from the structure 
below to the structure above.

Converted into ArrayArray#s I'd get

data Upper s = Upper (MutableArrayArray# s)

w/ the first slot being a pointer to a MutableByteArray#, and the next 2 slots 
pointing to the previous and next previous objects, represented just as their 
MutableArrayArray#s. I can use sameMutableArrayArray# on these for object 
identity, which lets me check for the ends of the lists by tying things back on 
themselves.

and below that

data Lower s = Lower (MutableArrayArray# s)

is similar, with an extra MutableArrayArray slot pointing up to an upper 
structure.

I can then write a handful of combinators for getting out the slots in 
question, while it has gained a level of indirection between the wrapper to put 
it in * and the MutableArrayArray# s in #, that one can be basically erased by 
ghc.

Unlike before I don't have several separate objects on the heap for each thing. 
I only have 2 now. The MutableArrayArray# for the object itself, and the 
MutableByteArray# that it references to carry around the mutable int.

The only pain points are

1.) the aforementioned limitation that currently prevents me from stuffing 
normal boxed data through a SmallArray or Array into an ArrayArray leaving me 
in a little ghetto disconnected from the rest of Haskell,

and

2.) the lack of SmallArrayArray's, which could let us avoid the card marking 
overhead. These objects are all small, 3-4 pointers wide. Card marking doesn't 
help.

Alternately I could just try to do really evil things and convert the whole 
mess to SmallArrays and then figure out how to unsafeCoerce my way to glory, 
stuffing the #'d references to the other arrays directly into the SmallArray as 
slots, removing the limitation  we see here by aping the MutableArrayArray# s 
API, but that gets really really dangerous!

I'm pretty much willing to sacrifice almost anything on the altar of speed 
here, but I'd like to be able to let the GC move them and collect them which 
rules out simpler Ptr and Addr based solutions.

-Edward

On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty 
mailto:c...@cse.unsw.edu.au>> wrote:
That’s an interesting idea.

Manuel

> Edward Kmett mailto:ekm...@gmail.com>>:
>
> Would it be possible to add unsafe primops to add Array# and SmallArray# 
> entries to an ArrayArray#? The fact that the ArrayArray# entries are all 
> directly unlifted avoiding a level of indirection for the containing 
> structure is amazing, but I can only currently use it if my leaf level data 
> can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be 
> able to have the ability to put SmallArray# a stuff down at the leaves to 
> hold lifted contents.
>
> I accept fully that if I name the wrong type when I go to access one of the 
> fields it'll lie to me, but I suppose it'd do that if i tried to use one of 
> the members that held a nested ArrayArray# as a ByteArray# anyways, so it 
> isn't like there is a safety story preventing this.
>
> I've been hunting for ways to try to kill the indirection problems I get with 
> Haskell and mutable structures, and I could shoehorn a number of them into 
> ArrayArrays if this worked.
>
> Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection 
> compared to c/java and this could reduce that pain to just 1 level of 
> unnecessary indirection.
>
> -Edward
> 

Re: ArrayArrays

2015-08-21 Thread Edward Kmett
On Fri, Aug 21, 2015 at 9:49 AM, Ryan Yates  wrote:

> Hi Edward,
>
> I've been working on removing indirection in STM and I added a heap
> object like SmallArray, but with a mix of words and pointers (as well
> as a header with metadata for STM).  It appears to work well now, but
> it is missing the type information.  All the pointers have the same
> type which works fine for your Upper.  In my case I use it to
> represent a red-black tree node [1].
>

This would be perfect for my purposes.


> Also all the structures I make are fixed size and it would be nice if
> the compiler could treat that fix size like a constant in code
> generation.


To make the fixed sized thing work without an extra couple of size
parameters in the arguments, you'd want to be able to build an info table
for each generated size. That sounds messy.


> I don't know what the right design is or what would be
> needed, but it seems simple enough to give the right typing
> information to something like this and basically get a mutable struct.
> I'm talking about this work at HIW and really hope to find someone
> interested in extending this expressiveness to let us write something
> that looks clear in Haskell, but gives the heap representation that we
> really need for performance.


I'll be there. Let's talk.


> From the RTS perspective I think there are any obstacles.
>

FWIW- I was able to get some code put together that let me scribble
unlifted SmallMutableArray#s directly into other SmallMutableArray#s, which
nicely "just works" as long as you fix up all the fields that are supposed
to be arrays before you ever dare use them.

writeSmallMutableArraySmallArray# :: SmallMutableArray# s Any -> Int# ->
SmallMutableArray# s Any -> State# s -> State# s
writeSmallMutableArraySmallArray# m i a s = unsafeCoerce# writeSmallArray#
m i a s
{-# INLINE writeSmallMutableArraySmallArray# #-}

readSmallMutableArraySmallArray# :: SmallMutableArray# s Any -> Int# ->
State# s -> (# State# s, SmallMutableArray# s Any #)
readSmallMutableArraySmallArray# m i s = unsafeCoerce# readSmallArray# m i s
{-# INLINE readSmallMutableArraySmallArray# #-}

With some support for typed 'Field's I can write code now that looks like:
order :: PrimMonad m => Upper (PrimState m) -> Int -> Order (PrimState m)
-> Order (PrimState m) -> m (Order (PrimState m))
order p a l r = st $ do
  this <- primitive $ \s -> case unsafeCoerce# newSmallArray# 4# a s of
(# s', b #) -> (# s', Order b #)
  set parent this p
  set next this l
  set prev this r
  return this

and in there basically build my own little strict, mutable, universe and
with some careful monitoring of the core make sure that the little Order
wrappers as the fringes get removed.

Here I'm using one of the slots as a pointer to a boxed Int for testing,
rather than as a pointer to a MutableByteArray that holds the Int.

-Edward
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-21 Thread Ryan Yates
gt; directly unlifted avoiding a level of indirection for the containing
>> > structure is amazing, but I can only currently use it if my leaf level data
>> > can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be
>> > able to have the ability to put SmallArray# a stuff down at the leaves to
>> > hold lifted contents.
>> >
>> > I accept fully that if I name the wrong type when I go to access one of
>> > the fields it'll lie to me, but I suppose it'd do that if i tried to use 
>> > one
>> > of the members that held a nested ArrayArray# as a ByteArray# anyways, so 
>> > it
>> > isn't like there is a safety story preventing this.
>> >
>> > I've been hunting for ways to try to kill the indirection problems I get
>> > with Haskell and mutable structures, and I could shoehorn a number of them
>> > into ArrayArrays if this worked.
>> >
>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
>> > compared to c/java and this could reduce that pain to just 1 level of
>> > unnecessary indirection.
>> >
>> > -Edward
>> > ___
>> > ghc-devs mailing list
>> > ghc-devs@haskell.org
>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-20 Thread Edward Kmett
When (ab)using them for this purpose, SmallArrayArray's would be very handy
as well.

Consider right now if I have something like an order-maintenance structure
I have:

data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# UNPACK #-}
!(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s))

data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-}
!(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) {-# UNPACK #-}
!(MutVar s (Lower s))

The former contains, logically, a mutable integer and two pointers, one for
forward and one for backwards. The latter is basically the same thing with
a mutable reference up pointing at the structure above.

On the heap this is an object that points to a structure for the bytearray,
and points to another structure for each mutvar which each point to the
other 'Upper' structure. So there is a level of indirection smeared over
everything.

So this is a pair of doubly linked lists with an upward link from the
structure below to the structure above.

Converted into ArrayArray#s I'd get

data Upper s = Upper (MutableArrayArray# s)

w/ the first slot being a pointer to a MutableByteArray#, and the next 2
slots pointing to the previous and next previous objects, represented just
as their MutableArrayArray#s. I can use sameMutableArrayArray# on these for
object identity, which lets me check for the ends of the lists by tying
things back on themselves.

and below that

data Lower s = Lower (MutableArrayArray# s)

is similar, with an extra MutableArrayArray slot pointing up to an upper
structure.

I can then write a handful of combinators for getting out the slots in
question, while it has gained a level of indirection between the wrapper to
put it in * and the MutableArrayArray# s in #, that one can be basically
erased by ghc.

Unlike before I don't have several separate objects on the heap for each
thing. I only have 2 now. The MutableArrayArray# for the object itself, and
the MutableByteArray# that it references to carry around the mutable int.

The only pain points are

1.) the aforementioned limitation that currently prevents me from stuffing
normal boxed data through a SmallArray or Array into an ArrayArray leaving
me in a little ghetto disconnected from the rest of Haskell,

and

2.) the lack of SmallArrayArray's, which could let us avoid the card
marking overhead. These objects are all small, 3-4 pointers wide. Card
marking doesn't help.

Alternately I could just try to do really evil things and convert the whole
mess to SmallArrays and then figure out how to unsafeCoerce my way to
glory, stuffing the #'d references to the other arrays directly into the
SmallArray as slots, removing the limitation  we see here by aping the
MutableArrayArray# s API, but that gets really really dangerous!

I'm pretty much willing to sacrifice almost anything on the altar of speed
here, but I'd like to be able to let the GC move them and collect them
which rules out simpler Ptr and Addr based solutions.

-Edward

On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty <
c...@cse.unsw.edu.au> wrote:

> That’s an interesting idea.
>
> Manuel
>
> > Edward Kmett :
> >
> > Would it be possible to add unsafe primops to add Array# and SmallArray#
> entries to an ArrayArray#? The fact that the ArrayArray# entries are all
> directly unlifted avoiding a level of indirection for the containing
> structure is amazing, but I can only currently use it if my leaf level data
> can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be
> able to have the ability to put SmallArray# a stuff down at the leaves to
> hold lifted contents.
> >
> > I accept fully that if I name the wrong type when I go to access one of
> the fields it'll lie to me, but I suppose it'd do that if i tried to use
> one of the members that held a nested ArrayArray# as a ByteArray# anyways,
> so it isn't like there is a safety story preventing this.
> >
> > I've been hunting for ways to try to kill the indirection problems I get
> with Haskell and mutable structures, and I could shoehorn a number of them
> into ArrayArrays if this worked.
> >
> > Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
> compared to c/java and this could reduce that pain to just 1 level of
> unnecessary indirection.
> >
> > -Edward
> > ___
> > ghc-devs mailing list
> > ghc-devs@haskell.org
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: ArrayArrays

2015-08-20 Thread Manuel M T Chakravarty
That’s an interesting idea.

Manuel

> Edward Kmett :
> 
> Would it be possible to add unsafe primops to add Array# and SmallArray# 
> entries to an ArrayArray#? The fact that the ArrayArray# entries are all 
> directly unlifted avoiding a level of indirection for the containing 
> structure is amazing, but I can only currently use it if my leaf level data 
> can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be 
> able to have the ability to put SmallArray# a stuff down at the leaves to 
> hold lifted contents.
> 
> I accept fully that if I name the wrong type when I go to access one of the 
> fields it'll lie to me, but I suppose it'd do that if i tried to use one of 
> the members that held a nested ArrayArray# as a ByteArray# anyways, so it 
> isn't like there is a safety story preventing this.
> 
> I've been hunting for ways to try to kill the indirection problems I get with 
> Haskell and mutable structures, and I could shoehorn a number of them into 
> ArrayArrays if this worked.
> 
> Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection 
> compared to c/java and this could reduce that pain to just 1 level of 
> unnecessary indirection.
> 
> -Edward
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


ArrayArrays

2015-08-20 Thread Edward Kmett
Would it be possible to add unsafe primops to add Array# and SmallArray#
entries to an ArrayArray#? The fact that the ArrayArray# entries are all
directly unlifted avoiding a level of indirection for the containing
structure is amazing, but I can only currently use it if my leaf level data
can be 100% unboxed and distributed among ByteArray#s. It'd be nice to be
able to have the ability to put SmallArray# a stuff down at the leaves to
hold lifted contents.

I accept fully that if I name the wrong type when I go to access one of the
fields it'll lie to me, but I suppose it'd do that if i tried to use one of
the members that held a nested ArrayArray# as a ByteArray# anyways, so it
isn't like there is a safety story preventing this.

I've been hunting for ways to try to kill the indirection problems I get
with Haskell and mutable structures, and I could shoehorn a number of them
into ArrayArrays if this worked.

Right now I'm stuck paying for 2 or 3 levels of unnecessary indirection
compared to c/java and this could reduce that pain to just 1 level of
unnecessary indirection.

-Edward
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs