Re: ArrayArrays

Johan Tibell Mon, 31 Aug 2015 22:15:07 -0700

Works for me.

On Mon, Aug 31, 2015 at 3:50 PM, Ryan Yates <[email protected]> wrote:


> Any time works for me.
>
> Ryan
>
> On Mon, Aug 31, 2015 at 6:11 PM, Ryan Newton <[email protected]> wrote:
> > Dear Edward, Ryan Yates, and other interested parties --
> >
> > So when should we meet up about this?
> >
> > May I propose the Tues afternoon break for everyone at ICFP who is
> > interested in this topic?  We can meet out in the coffee area and
> congregate
> > around Edward Kmett, who is tall and should be easy to find ;-).
> >
> > I think Ryan is going to show us how to use his new primops for combined
> > array + other fields in one heap object?
> >
> > On Sat, Aug 29, 2015 at 9:24 PM Edward Kmett <[email protected]> wrote:
> >>
> >> Without a custom primitive it doesn't help much there, you have to store
> >> the indirection to the mask.
> >>
> >> With a custom primitive it should cut the on heap root-to-leaf path of
> >> everything in the HAMT in half. A shorter HashMap was actually one of
> the
> >> motivating factors for me doing this. It is rather astoundingly
> difficult to
> >> beat the performance of HashMap, so I had to start cheating pretty
> badly. ;)
> >>
> >> -Edward
> >>
> >> On Sat, Aug 29, 2015 at 5:45 PM, Johan Tibell <[email protected]>
> >> wrote:
> >>>
> >>> I'd also be interested to chat at ICFP to see if I can use this for my
> >>> HAMT implementation.
> >>>
> >>> On Sat, Aug 29, 2015 at 3:07 PM, Edward Kmett <[email protected]>
> wrote:
> >>>>
> >>>> Sounds good to me. Right now I'm just hacking up composable accessors
> >>>> for "typed slots" in a fairly lens-like fashion, and treating the set
> of
> >>>> slots I define and the 'new' function I build for the data type as
> its API,
> >>>> and build atop that. This could eventually graduate to
> template-haskell, but
> >>>> I'm not entirely satisfied with the solution I have. I currently
> distinguish
> >>>> between what I'm calling "slots" (things that point directly to
> another
> >>>> SmallMutableArrayArray# sans wrapper) and "fields" which point
> directly to
> >>>> the usual Haskell data types because unifying the two notions meant
> that I
> >>>> couldn't lift some coercions out "far enough" to make them vanish.
> >>>>
> >>>> I'll be happy to run through my current working set of issues in
> person
> >>>> and -- as things get nailed down further -- in a longer lived medium
> than in
> >>>> personal conversations. ;)
> >>>>
> >>>> -Edward
> >>>>
> >>>> On Sat, Aug 29, 2015 at 7:59 AM, Ryan Newton <[email protected]>
> wrote:
> >>>>>
> >>>>> I'd also love to meet up at ICFP and discuss this.  I think the array
> >>>>> primops plus a TH layer that lets (ab)use them many times without
> too much
> >>>>> marginal cost sounds great.  And I'd like to learn how we could be
> either
> >>>>> early users of, or help with, this infrastructure.
> >>>>>
> >>>>> CC'ing in Ryan Scot and Omer Agacan who may also be interested in
> >>>>> dropping in on such discussions @ICFP, and Chao-Hong Chen, a Ph.D.
> student
> >>>>> who is currently working on concurrent data structures in Haskell,
> but will
> >>>>> not be at ICFP.
> >>>>>
> >>>>>
> >>>>> On Fri, Aug 28, 2015 at 7:47 PM, Ryan Yates <[email protected]>
> >>>>> wrote:
> >>>>>>
> >>>>>> I completely agree.  I would love to spend some time during ICFP and
> >>>>>> friends talking about what it could look like.  My small array for
> STM
> >>>>>> changes for the RTS can be seen here [1].  It is on a branch
> somewhere
> >>>>>> between 7.8 and 7.10 and includes irrelevant STM bits and some
> >>>>>> confusing naming choices (sorry), but should cover all the details
> >>>>>> needed to implement it for a non-STM context.  The biggest surprise
> >>>>>> for me was following small array too closely and having a word/byte
> >>>>>> offset miss-match [2].
> >>>>>>
> >>>>>> [1]:
> >>>>>>
> https://github.com/fryguybob/ghc/compare/ghc-htm-bloom...fryguybob:ghc-htm-mut
> >>>>>> [2]: https://ghc.haskell.org/trac/ghc/ticket/10413
> >>>>>>
> >>>>>> Ryan
> >>>>>>
> >>>>>> On Fri, Aug 28, 2015 at 10:09 PM, Edward Kmett <[email protected]>
> >>>>>> wrote:
> >>>>>> > I'd love to have that last 10%, but its a lot of work to get there
> >>>>>> > and more
> >>>>>> > importantly I don't know quite what it should look like.
> >>>>>> >
> >>>>>> > On the other hand, I do have a pretty good idea of how the
> >>>>>> > primitives above
> >>>>>> > could be banged out and tested in a long evening, well in time for
> >>>>>> > 7.12. And
> >>>>>> > as noted earlier, those remain useful even if a nicer typed
> version
> >>>>>> > with an
> >>>>>> > extra level of indirection to the sizes is built up after.
> >>>>>> >
> >>>>>> > The rest sounds like a good graduate student project for someone
> who
> >>>>>> > has
> >>>>>> > graduate students lying around. Maybe somebody at Indiana
> University
> >>>>>> > who has
> >>>>>> > an interest in type theory and parallelism can find us one. =)
> >>>>>> >
> >>>>>> > -Edward
> >>>>>> >
> >>>>>> > On Fri, Aug 28, 2015 at 8:48 PM, Ryan Yates <[email protected]>
> >>>>>> > wrote:
> >>>>>> >>
> >>>>>> >> I think from my perspective, the motivation for getting the type
> >>>>>> >> checker involved is primarily bringing this to the level where
> >>>>>> >> users
> >>>>>> >> could be expected to build these structures.  it is reasonable to
> >>>>>> >> think that there are people who want to use STM (a context with
> >>>>>> >> mutation already) to implement a straight forward data structure
> >>>>>> >> that
> >>>>>> >> avoids extra indirection penalty.  There should be some places
> >>>>>> >> where
> >>>>>> >> knowing that things are field accesses rather then array indexing
> >>>>>> >> could be helpful, but I think GHC is good right now about
> handling
> >>>>>> >> constant offsets.  In my code I don't do any bounds checking as I
> >>>>>> >> know
> >>>>>> >> I will only be accessing my arrays with constant indexes.  I make
> >>>>>> >> wrappers for each field access and leave all the unsafe stuff in
> >>>>>> >> there.  When things go wrong though, the compiler is no help.
> >>>>>> >> Maybe
> >>>>>> >> template Haskell that generates the appropriate wrappers is the
> >>>>>> >> right
> >>>>>> >> direction to go.
> >>>>>> >> There is another benefit for me when working with these as arrays
> >>>>>> >> in
> >>>>>> >> that it is quite simple and direct (given the hoops already
> jumped
> >>>>>> >> through) to play with alignment.  I can ensure two pointers are
> >>>>>> >> never
> >>>>>> >> on the same cache-line by just spacing things out in the array.
> >>>>>> >>
> >>>>>> >> On Fri, Aug 28, 2015 at 7:33 PM, Edward Kmett <[email protected]>
> >>>>>> >> wrote:
> >>>>>> >> > They just segfault at this level. ;)
> >>>>>> >> >
> >>>>>> >> > Sent from my iPhone
> >>>>>> >> >
> >>>>>> >> > On Aug 28, 2015, at 7:25 PM, Ryan Newton <[email protected]>
> >>>>>> >> > wrote:
> >>>>>> >> >
> >>>>>> >> > You presumably also save a bounds check on reads by hard-coding
> >>>>>> >> > the
> >>>>>> >> > sizes?
> >>>>>> >> >
> >>>>>> >> > On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <
> [email protected]>
> >>>>>> >> > wrote:
> >>>>>> >> >>
> >>>>>> >> >> Also there are 4 different "things" here, basically depending
> on
> >>>>>> >> >> two
> >>>>>> >> >> independent questions:
> >>>>>> >> >>
> >>>>>> >> >> a.) if you want to shove the sizes into the info table, and
> >>>>>> >> >> b.) if you want cardmarking.
> >>>>>> >> >>
> >>>>>> >> >> Versions with/without cardmarking for different sizes can be
> >>>>>> >> >> done
> >>>>>> >> >> pretty
> >>>>>> >> >> easily, but as noted, the infotable variants are pretty
> >>>>>> >> >> invasive.
> >>>>>> >> >>
> >>>>>> >> >> -Edward
> >>>>>> >> >>
> >>>>>> >> >> On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <
> [email protected]>
> >>>>>> >> >> wrote:
> >>>>>> >> >>>
> >>>>>> >> >>> Well, on the plus side you'd save 16 bytes per object, which
> >>>>>> >> >>> adds up
> >>>>>> >> >>> if
> >>>>>> >> >>> they were small enough and there are enough of them. You get
> a
> >>>>>> >> >>> bit
> >>>>>> >> >>> better
> >>>>>> >> >>> locality of reference in terms of what fits in the first
> cache
> >>>>>> >> >>> line of
> >>>>>> >> >>> them.
> >>>>>> >> >>>
> >>>>>> >> >>> -Edward
> >>>>>> >> >>>
> >>>>>> >> >>> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton
> >>>>>> >> >>> <[email protected]>
> >>>>>> >> >>> wrote:
> >>>>>> >> >>>>
> >>>>>> >> >>>> Yes. And for the short term I can imagine places we will
> >>>>>> >> >>>> settle with
> >>>>>> >> >>>> arrays even if it means tracking lengths unnecessarily and
> >>>>>> >> >>>> unsafeCoercing
> >>>>>> >> >>>> pointers whose types don't actually match their siblings.
> >>>>>> >> >>>>
> >>>>>> >> >>>> Is there anything to recommend the hacks mentioned for fixed
> >>>>>> >> >>>> sized
> >>>>>> >> >>>> array
> >>>>>> >> >>>> objects *other* than using them to fake structs? (Much to
> >>>>>> >> >>>> derecommend, as
> >>>>>> >> >>>> you mentioned!)
> >>>>>> >> >>>>
> >>>>>> >> >>>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett
> >>>>>> >> >>>> <[email protected]>
> >>>>>> >> >>>> wrote:
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> I think both are useful, but the one you suggest requires a
> >>>>>> >> >>>>> lot more
> >>>>>> >> >>>>> plumbing and doesn't subsume all of the usecases of the
> >>>>>> >> >>>>> other.
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> -Edward
> >>>>>> >> >>>>>
> >>>>>> >> >>>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton
> >>>>>> >> >>>>> <[email protected]>
> >>>>>> >> >>>>> wrote:
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> So that primitive is an array like thing (Same pointed
> type,
> >>>>>> >> >>>>>> unbounded
> >>>>>> >> >>>>>> length) with extra payload.
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> I can see how we can do without structs if we have arrays,
> >>>>>> >> >>>>>> especially
> >>>>>> >> >>>>>> with the extra payload at front. But wouldn't the general
> >>>>>> >> >>>>>> solution
> >>>>>> >> >>>>>> for
> >>>>>> >> >>>>>> structs be one that that allows new user data type defs
> for
> >>>>>> >> >>>>>> #
> >>>>>> >> >>>>>> types?
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>>
> >>>>>> >> >>>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett
> >>>>>> >> >>>>>> <[email protected]>
> >>>>>> >> >>>>>> wrote:
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> Some form of MutableStruct# with a known number of words
> >>>>>> >> >>>>>>> and a
> >>>>>> >> >>>>>>> known
> >>>>>> >> >>>>>>> number of pointers is basically what Ryan Yates was
> >>>>>> >> >>>>>>> suggesting
> >>>>>> >> >>>>>>> above, but
> >>>>>> >> >>>>>>> where the word counts were stored in the objects
> >>>>>> >> >>>>>>> themselves.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> Given that it'd have a couple of words for those counts
> >>>>>> >> >>>>>>> it'd
> >>>>>> >> >>>>>>> likely
> >>>>>> >> >>>>>>> want to be something we build in addition to MutVar#
> rather
> >>>>>> >> >>>>>>> than a
> >>>>>> >> >>>>>>> replacement.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> On the other hand, if we had to fix those numbers and
> build
> >>>>>> >> >>>>>>> info
> >>>>>> >> >>>>>>> tables that knew them, and typechecker support, for
> >>>>>> >> >>>>>>> instance, it'd
> >>>>>> >> >>>>>>> get
> >>>>>> >> >>>>>>> rather invasive.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> Also, a number of things that we can do with the 'sized'
> >>>>>> >> >>>>>>> versions
> >>>>>> >> >>>>>>> above, like working with evil unsized c-style arrays
> >>>>>> >> >>>>>>> directly
> >>>>>> >> >>>>>>> inline at the
> >>>>>> >> >>>>>>> end of the structure cease to be possible, so it isn't
> even
> >>>>>> >> >>>>>>> a pure
> >>>>>> >> >>>>>>> win if we
> >>>>>> >> >>>>>>> did the engineering effort.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> I think 90% of the needs I have are covered just by
> adding
> >>>>>> >> >>>>>>> the one
> >>>>>> >> >>>>>>> primitive. The last 10% gets pretty invasive.
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> -Edward
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton
> >>>>>> >> >>>>>>> <[email protected]>
> >>>>>> >> >>>>>>> wrote:
> >>>>>> >> >>>>>>>>
> >>>>>> >> >>>>>>>> I like the possibility of a general solution for mutable
> >>>>>> >> >>>>>>>> structs
> >>>>>> >> >>>>>>>> (like Ed said), and I'm trying to fully understand why
> >>>>>> >> >>>>>>>> it's hard.
> >>>>>> >> >>>>>>>>
> >>>>>> >> >>>>>>>> So, we can't unpack MutVar into constructors because of
> >>>>>> >> >>>>>>>> object
> >>>>>> >> >>>>>>>> identity problems. But what about directly supporting an
> >>>>>> >> >>>>>>>> extensible set of
> >>>>>> >> >>>>>>>> unlifted MutStruct# objects, generalizing (and even
> >>>>>> >> >>>>>>>> replacing)
> >>>>>> >> >>>>>>>> MutVar#? That
> >>>>>> >> >>>>>>>> may be too much work, but is it problematic otherwise?
> >>>>>> >> >>>>>>>>
> >>>>>> >> >>>>>>>> Needless to say, this is also critical if we ever want
> >>>>>> >> >>>>>>>> best in
> >>>>>> >> >>>>>>>> class
> >>>>>> >> >>>>>>>> lockfree mutable structures, just like their Stm and
> >>>>>> >> >>>>>>>> sequential
> >>>>>> >> >>>>>>>> counterparts.
> >>>>>> >> >>>>>>>>
> >>>>>> >> >>>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones
> >>>>>> >> >>>>>>>> <[email protected]> wrote:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it
> into a
> >>>>>> >> >>>>>>>>> short
> >>>>>> >> >>>>>>>>> article.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Yes, please do make it into a wiki page on the GHC
> Trac,
> >>>>>> >> >>>>>>>>> and
> >>>>>> >> >>>>>>>>> maybe
> >>>>>> >> >>>>>>>>> make a ticket for it.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Thanks
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Simon
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> From: Edward Kmett [mailto:[email protected]]
> >>>>>> >> >>>>>>>>> Sent: 27 August 2015 16:54
> >>>>>> >> >>>>>>>>> To: Simon Peyton Jones
> >>>>>> >> >>>>>>>>> Cc: Manuel M T Chakravarty; Simon Marlow; ghc-devs
> >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> An ArrayArray# is just an Array# with a modified
> >>>>>> >> >>>>>>>>> invariant. It
> >>>>>> >> >>>>>>>>> points directly to other unlifted ArrayArray#'s or
> >>>>>> >> >>>>>>>>> ByteArray#'s.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> While those live in #, they are garbage collected
> >>>>>> >> >>>>>>>>> objects, so
> >>>>>> >> >>>>>>>>> this
> >>>>>> >> >>>>>>>>> all lives on the heap.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> They were added to make some of the DPH stuff fast when
> >>>>>> >> >>>>>>>>> it has
> >>>>>> >> >>>>>>>>> to
> >>>>>> >> >>>>>>>>> deal with nested arrays.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I'm currently abusing them as a placeholder for a
> better
> >>>>>> >> >>>>>>>>> thing.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> The Problem
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -----------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Consider the scenario where you write a classic
> >>>>>> >> >>>>>>>>> doubly-linked
> >>>>>> >> >>>>>>>>> list
> >>>>>> >> >>>>>>>>> in Haskell.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Chasing from one DLL to the next requires following 3
> >>>>>> >> >>>>>>>>> pointers
> >>>>>> >> >>>>>>>>> on
> >>>>>> >> >>>>>>>>> the heap.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe
> DLL)
> >>>>>> >> >>>>>>>>> ~>
> >>>>>> >> >>>>>>>>> Maybe
> >>>>>> >> >>>>>>>>> DLL ~> DLL
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> That is 3 levels of indirection.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> We can trim one by simply unpacking the IORef with
> >>>>>> >> >>>>>>>>> -funbox-strict-fields or UNPACK
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> We can trim another by adding a 'Nil' constructor for
> DLL
> >>>>>> >> >>>>>>>>> and
> >>>>>> >> >>>>>>>>> worsening our representation.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> but now we're still stuck with a level of indirection
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> This means that every operation we perform on this
> >>>>>> >> >>>>>>>>> structure
> >>>>>> >> >>>>>>>>> will
> >>>>>> >> >>>>>>>>> be about half of the speed of an implementation in most
> >>>>>> >> >>>>>>>>> other
> >>>>>> >> >>>>>>>>> languages
> >>>>>> >> >>>>>>>>> assuming we're memory bound on loading things into
> cache!
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Making Progress
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> ----------------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I have been working on a number of data structures
> where
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> indirection of going from something in * out to an
> object
> >>>>>> >> >>>>>>>>> in #
> >>>>>> >> >>>>>>>>> which
> >>>>>> >> >>>>>>>>> contains the real pointer to my target and coming back
> >>>>>> >> >>>>>>>>> effectively doubles
> >>>>>> >> >>>>>>>>> my runtime.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> We go out to the MutVar# because we are allowed to put
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> MutVar#
> >>>>>> >> >>>>>>>>> onto the mutable list when we dirty it. There is a well
> >>>>>> >> >>>>>>>>> defined
> >>>>>> >> >>>>>>>>> write-barrier.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I could change out the representation to use
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I can just store two pointers in the MutableArray#
> every
> >>>>>> >> >>>>>>>>> time,
> >>>>>> >> >>>>>>>>> but
> >>>>>> >> >>>>>>>>> this doesn't help _much_ directly. It has reduced the
> >>>>>> >> >>>>>>>>> amount of
> >>>>>> >> >>>>>>>>> distinct
> >>>>>> >> >>>>>>>>> addresses in memory I touch on a walk of the DLL from 3
> >>>>>> >> >>>>>>>>> per
> >>>>>> >> >>>>>>>>> object to 2.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I still have to go out to the heap from my DLL and get
> to
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> array
> >>>>>> >> >>>>>>>>> object and then chase it to the next DLL and chase that
> >>>>>> >> >>>>>>>>> to the
> >>>>>> >> >>>>>>>>> next array. I
> >>>>>> >> >>>>>>>>> do get my two pointers together in memory though. I'm
> >>>>>> >> >>>>>>>>> paying for
> >>>>>> >> >>>>>>>>> a card
> >>>>>> >> >>>>>>>>> marking table as well, which I don't particularly need
> >>>>>> >> >>>>>>>>> with just
> >>>>>> >> >>>>>>>>> two
> >>>>>> >> >>>>>>>>> pointers, but we can shed that with the
> >>>>>> >> >>>>>>>>> "SmallMutableArray#"
> >>>>>> >> >>>>>>>>> machinery added
> >>>>>> >> >>>>>>>>> back in 7.10, which is just the old array code a a new
> >>>>>> >> >>>>>>>>> data
> >>>>>> >> >>>>>>>>> type, which can
> >>>>>> >> >>>>>>>>> speed things up a bit when you don't have very big
> >>>>>> >> >>>>>>>>> arrays:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> But what if I wanted my object itself to live in # and
> >>>>>> >> >>>>>>>>> have two
> >>>>>> >> >>>>>>>>> mutable fields and be able to share the sme write
> >>>>>> >> >>>>>>>>> barrier?
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> An ArrayArray# points directly to other unlifted array
> >>>>>> >> >>>>>>>>> types.
> >>>>>> >> >>>>>>>>> What
> >>>>>> >> >>>>>>>>> if we have one # -> * wrapper on the outside to deal
> with
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> impedence
> >>>>>> >> >>>>>>>>> mismatch between the imperative world and Haskell, and
> >>>>>> >> >>>>>>>>> then just
> >>>>>> >> >>>>>>>>> let the
> >>>>>> >> >>>>>>>>> ArrayArray#'s hold other arrayarrays.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> now I need to make up a new Nil, which I can just make
> be
> >>>>>> >> >>>>>>>>> a
> >>>>>> >> >>>>>>>>> special
> >>>>>> >> >>>>>>>>> MutableArrayArray# I allocate on program startup. I can
> >>>>>> >> >>>>>>>>> even
> >>>>>> >> >>>>>>>>> abuse pattern
> >>>>>> >> >>>>>>>>> synonyms. Alternately I can exploit the internals
> further
> >>>>>> >> >>>>>>>>> to
> >>>>>> >> >>>>>>>>> make this
> >>>>>> >> >>>>>>>>> cheaper.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Then I can use the readMutableArrayArray# and
> >>>>>> >> >>>>>>>>> writeMutableArrayArray# calls to directly access the
> >>>>>> >> >>>>>>>>> preceding
> >>>>>> >> >>>>>>>>> and next
> >>>>>> >> >>>>>>>>> entry in the linked list.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> So now we have one DLL wrapper which just 'bootstraps
> me'
> >>>>>> >> >>>>>>>>> into a
> >>>>>> >> >>>>>>>>> strict world, and everything there lives in #.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> next :: DLL -> IO DLL
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s
> >>>>>> >> >>>>>>>>> of
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>    (# s', n #) -> (# s', DLL n #)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> It turns out GHC is quite happy to optimize all of that
> >>>>>> >> >>>>>>>>> code to
> >>>>>> >> >>>>>>>>> keep things unboxed. The 'DLL' wrappers get removed
> >>>>>> >> >>>>>>>>> pretty
> >>>>>> >> >>>>>>>>> easily when they
> >>>>>> >> >>>>>>>>> are known strict and you chain operations of this sort!
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Cleaning it Up
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> ------------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Now I have one outermost indirection pointing to an
> array
> >>>>>> >> >>>>>>>>> that
> >>>>>> >> >>>>>>>>> points directly to other arrays.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I'm stuck paying for a card marking table per object,
> but
> >>>>>> >> >>>>>>>>> I can
> >>>>>> >> >>>>>>>>> fix
> >>>>>> >> >>>>>>>>> that by duplicating the code for MutableArrayArray# and
> >>>>>> >> >>>>>>>>> using a
> >>>>>> >> >>>>>>>>> SmallMutableArray#. I can hack up primops that let me
> >>>>>> >> >>>>>>>>> store a
> >>>>>> >> >>>>>>>>> mixture of
> >>>>>> >> >>>>>>>>> SmallMutableArray# fields and normal ones in the data
> >>>>>> >> >>>>>>>>> structure.
> >>>>>> >> >>>>>>>>> Operationally, I can even do so by just unsafeCoercing
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> existing
> >>>>>> >> >>>>>>>>> SmallMutableArray# primitives to change the kind of one
> >>>>>> >> >>>>>>>>> of the
> >>>>>> >> >>>>>>>>> arguments it
> >>>>>> >> >>>>>>>>> takes.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> This is almost ideal, but not quite. I often have
> fields
> >>>>>> >> >>>>>>>>> that
> >>>>>> >> >>>>>>>>> would
> >>>>>> >> >>>>>>>>> be best left unboxed.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> was able to unpack the Int, but we lost that. We can
> >>>>>> >> >>>>>>>>> currently
> >>>>>> >> >>>>>>>>> at
> >>>>>> >> >>>>>>>>> best point one of the entries of the SmallMutableArray#
> >>>>>> >> >>>>>>>>> at a
> >>>>>> >> >>>>>>>>> boxed or at a
> >>>>>> >> >>>>>>>>> MutableByteArray# for all of our misc. data and shove
> the
> >>>>>> >> >>>>>>>>> int in
> >>>>>> >> >>>>>>>>> question in
> >>>>>> >> >>>>>>>>> there.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I
> >>>>>> >> >>>>>>>>> need to
> >>>>>> >> >>>>>>>>> store masks and administrivia as I walk down the tree.
> >>>>>> >> >>>>>>>>> Having to
> >>>>>> >> >>>>>>>>> go off to
> >>>>>> >> >>>>>>>>> the side costs me the entire win from avoiding the
> first
> >>>>>> >> >>>>>>>>> pointer
> >>>>>> >> >>>>>>>>> chase.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> But, if like Ryan suggested, we had a heap object we
> >>>>>> >> >>>>>>>>> could
> >>>>>> >> >>>>>>>>> construct that had n words with unsafe access and m
> >>>>>> >> >>>>>>>>> pointers to
> >>>>>> >> >>>>>>>>> other heap
> >>>>>> >> >>>>>>>>> objects, one that could put itself on the mutable list
> >>>>>> >> >>>>>>>>> when any
> >>>>>> >> >>>>>>>>> of those
> >>>>>> >> >>>>>>>>> pointers changed then I could shed this last factor of
> >>>>>> >> >>>>>>>>> two in
> >>>>>> >> >>>>>>>>> all
> >>>>>> >> >>>>>>>>> circumstances.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Prototype
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -------------
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Over the last few days I've put together a small
> >>>>>> >> >>>>>>>>> prototype
> >>>>>> >> >>>>>>>>> implementation with a few non-trivial imperative data
> >>>>>> >> >>>>>>>>> structures
> >>>>>> >> >>>>>>>>> for things
> >>>>>> >> >>>>>>>>> like Tarjan's link-cut trees, the list labeling problem
> >>>>>> >> >>>>>>>>> and
> >>>>>> >> >>>>>>>>> order-maintenance.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> https://github.com/ekmett/structs
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Notable bits:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Data.Struct.Internal.LinkCut provides an implementation
> >>>>>> >> >>>>>>>>> of
> >>>>>> >> >>>>>>>>> link-cut
> >>>>>> >> >>>>>>>>> trees in this style.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Data.Struct.Internal provides the rather horrifying
> guts
> >>>>>> >> >>>>>>>>> that
> >>>>>> >> >>>>>>>>> make
> >>>>>> >> >>>>>>>>> it go fast.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Once compiled with -O or -O2, if you look at the core,
> >>>>>> >> >>>>>>>>> almost
> >>>>>> >> >>>>>>>>> all
> >>>>>> >> >>>>>>>>> the references to the LinkCut or Object data
> constructor
> >>>>>> >> >>>>>>>>> get
> >>>>>> >> >>>>>>>>> optimized away,
> >>>>>> >> >>>>>>>>> and we're left with beautiful strict code directly
> >>>>>> >> >>>>>>>>> mutating out
> >>>>>> >> >>>>>>>>> underlying
> >>>>>> >> >>>>>>>>> representation.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> At the very least I'll take this email and turn it
> into a
> >>>>>> >> >>>>>>>>> short
> >>>>>> >> >>>>>>>>> article.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -Edward
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones
> >>>>>> >> >>>>>>>>> <[email protected]> wrote:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Just to say that I have no idea what is going on in
> this
> >>>>>> >> >>>>>>>>> thread.
> >>>>>> >> >>>>>>>>> What is ArrayArray?  What is the issue in general?  Is
> >>>>>> >> >>>>>>>>> there a
> >>>>>> >> >>>>>>>>> ticket? Is
> >>>>>> >> >>>>>>>>> there a wiki page?
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> If it’s important, an ab-initio wiki page + ticket
> would
> >>>>>> >> >>>>>>>>> be a
> >>>>>> >> >>>>>>>>> good
> >>>>>> >> >>>>>>>>> thing.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Simon
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> From: ghc-devs [mailto:[email protected]]
> On
> >>>>>> >> >>>>>>>>> Behalf
> >>>>>> >> >>>>>>>>> Of
> >>>>>> >> >>>>>>>>> Edward Kmett
> >>>>>> >> >>>>>>>>> Sent: 21 August 2015 05:25
> >>>>>> >> >>>>>>>>> To: Manuel M T Chakravarty
> >>>>>> >> >>>>>>>>> Cc: Simon Marlow; ghc-devs
> >>>>>> >> >>>>>>>>> Subject: Re: ArrayArrays
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> When (ab)using them for this purpose, SmallArrayArray's
> >>>>>> >> >>>>>>>>> would be
> >>>>>> >> >>>>>>>>> very handy as well.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Consider right now if I have something like an
> >>>>>> >> >>>>>>>>> order-maintenance
> >>>>>> >> >>>>>>>>> structure I have:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray
> s)
> >>>>>> >> >>>>>>>>> {-#
> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-}
> !(MutVar
> >>>>>> >> >>>>>>>>> s
> >>>>>> >> >>>>>>>>> (Upper s))
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper
> s))
> >>>>>> >> >>>>>>>>> {-#
> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-}
> !(MutVar
> >>>>>> >> >>>>>>>>> s
> >>>>>> >> >>>>>>>>> (Lower s)) {-#
> >>>>>> >> >>>>>>>>> UNPACK #-} !(MutVar s (Lower s))
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> The former contains, logically, a mutable integer and
> two
> >>>>>> >> >>>>>>>>> pointers,
> >>>>>> >> >>>>>>>>> one for forward and one for backwards. The latter is
> >>>>>> >> >>>>>>>>> basically
> >>>>>> >> >>>>>>>>> the same
> >>>>>> >> >>>>>>>>> thing with a mutable reference up pointing at the
> >>>>>> >> >>>>>>>>> structure
> >>>>>> >> >>>>>>>>> above.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> On the heap this is an object that points to a
> structure
> >>>>>> >> >>>>>>>>> for the
> >>>>>> >> >>>>>>>>> bytearray, and points to another structure for each
> >>>>>> >> >>>>>>>>> mutvar which
> >>>>>> >> >>>>>>>>> each point
> >>>>>> >> >>>>>>>>> to the other 'Upper' structure. So there is a level of
> >>>>>> >> >>>>>>>>> indirection smeared
> >>>>>> >> >>>>>>>>> over everything.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> So this is a pair of doubly linked lists with an upward
> >>>>>> >> >>>>>>>>> link
> >>>>>> >> >>>>>>>>> from
> >>>>>> >> >>>>>>>>> the structure below to the structure above.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Converted into ArrayArray#s I'd get
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Upper s = Upper (MutableArrayArray# s)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> w/ the first slot being a pointer to a
> MutableByteArray#,
> >>>>>> >> >>>>>>>>> and
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> next 2 slots pointing to the previous and next previous
> >>>>>> >> >>>>>>>>> objects,
> >>>>>> >> >>>>>>>>> represented
> >>>>>> >> >>>>>>>>> just as their MutableArrayArray#s. I can use
> >>>>>> >> >>>>>>>>> sameMutableArrayArray# on these
> >>>>>> >> >>>>>>>>> for object identity, which lets me check for the ends
> of
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> lists by tying
> >>>>>> >> >>>>>>>>> things back on themselves.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> and below that
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> data Lower s = Lower (MutableArrayArray# s)
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> is similar, with an extra MutableArrayArray slot
> pointing
> >>>>>> >> >>>>>>>>> up to
> >>>>>> >> >>>>>>>>> an
> >>>>>> >> >>>>>>>>> upper structure.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I can then write a handful of combinators for getting
> out
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> slots
> >>>>>> >> >>>>>>>>> in question, while it has gained a level of indirection
> >>>>>> >> >>>>>>>>> between
> >>>>>> >> >>>>>>>>> the wrapper
> >>>>>> >> >>>>>>>>> to put it in * and the MutableArrayArray# s in #, that
> >>>>>> >> >>>>>>>>> one can
> >>>>>> >> >>>>>>>>> be basically
> >>>>>> >> >>>>>>>>> erased by ghc.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Unlike before I don't have several separate objects on
> >>>>>> >> >>>>>>>>> the heap
> >>>>>> >> >>>>>>>>> for
> >>>>>> >> >>>>>>>>> each thing. I only have 2 now. The MutableArrayArray#
> for
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> object itself,
> >>>>>> >> >>>>>>>>> and the MutableByteArray# that it references to carry
> >>>>>> >> >>>>>>>>> around the
> >>>>>> >> >>>>>>>>> mutable
> >>>>>> >> >>>>>>>>> int.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> The only pain points are
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> 1.) the aforementioned limitation that currently
> prevents
> >>>>>> >> >>>>>>>>> me
> >>>>>> >> >>>>>>>>> from
> >>>>>> >> >>>>>>>>> stuffing normal boxed data through a SmallArray or
> Array
> >>>>>> >> >>>>>>>>> into an
> >>>>>> >> >>>>>>>>> ArrayArray
> >>>>>> >> >>>>>>>>> leaving me in a little ghetto disconnected from the
> rest
> >>>>>> >> >>>>>>>>> of
> >>>>>> >> >>>>>>>>> Haskell,
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> and
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> 2.) the lack of SmallArrayArray's, which could let us
> >>>>>> >> >>>>>>>>> avoid the
> >>>>>> >> >>>>>>>>> card marking overhead. These objects are all small, 3-4
> >>>>>> >> >>>>>>>>> pointers
> >>>>>> >> >>>>>>>>> wide. Card
> >>>>>> >> >>>>>>>>> marking doesn't help.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Alternately I could just try to do really evil things
> and
> >>>>>> >> >>>>>>>>> convert
> >>>>>> >> >>>>>>>>> the whole mess to SmallArrays and then figure out how
> to
> >>>>>> >> >>>>>>>>> unsafeCoerce my way
> >>>>>> >> >>>>>>>>> to glory, stuffing the #'d references to the other
> arrays
> >>>>>> >> >>>>>>>>> directly into the
> >>>>>> >> >>>>>>>>> SmallArray as slots, removing the limitation  we see
> here
> >>>>>> >> >>>>>>>>> by
> >>>>>> >> >>>>>>>>> aping the
> >>>>>> >> >>>>>>>>> MutableArrayArray# s API, but that gets really really
> >>>>>> >> >>>>>>>>> dangerous!
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> I'm pretty much willing to sacrifice almost anything on
> >>>>>> >> >>>>>>>>> the
> >>>>>> >> >>>>>>>>> altar
> >>>>>> >> >>>>>>>>> of speed here, but I'd like to be able to let the GC
> move
> >>>>>> >> >>>>>>>>> them
> >>>>>> >> >>>>>>>>> and collect
> >>>>>> >> >>>>>>>>> them which rules out simpler Ptr and Addr based
> >>>>>> >> >>>>>>>>> solutions.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> -Edward
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty
> >>>>>> >> >>>>>>>>> <[email protected]> wrote:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> That’s an interesting idea.
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> Manuel
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> > Edward Kmett <[email protected]>:
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > Would it be possible to add unsafe primops to add
> >>>>>> >> >>>>>>>>> > Array# and
> >>>>>> >> >>>>>>>>> > SmallArray# entries to an ArrayArray#? The fact that
> >>>>>> >> >>>>>>>>> > the
> >>>>>> >> >>>>>>>>> > ArrayArray# entries
> >>>>>> >> >>>>>>>>> > are all directly unlifted avoiding a level of
> >>>>>> >> >>>>>>>>> > indirection for
> >>>>>> >> >>>>>>>>> > the containing
> >>>>>> >> >>>>>>>>> > structure is amazing, but I can only currently use it
> >>>>>> >> >>>>>>>>> > if my
> >>>>>> >> >>>>>>>>> > leaf level data
> >>>>>> >> >>>>>>>>> > can be 100% unboxed and distributed among
> ByteArray#s.
> >>>>>> >> >>>>>>>>> > It'd be
> >>>>>> >> >>>>>>>>> > nice to be
> >>>>>> >> >>>>>>>>> > able to have the ability to put SmallArray# a stuff
> >>>>>> >> >>>>>>>>> > down at
> >>>>>> >> >>>>>>>>> > the leaves to
> >>>>>> >> >>>>>>>>> > hold lifted contents.
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > I accept fully that if I name the wrong type when I
> go
> >>>>>> >> >>>>>>>>> > to
> >>>>>> >> >>>>>>>>> > access
> >>>>>> >> >>>>>>>>> > one of the fields it'll lie to me, but I suppose it'd
> >>>>>> >> >>>>>>>>> > do that
> >>>>>> >> >>>>>>>>> > if i tried to
> >>>>>> >> >>>>>>>>> > use one of the members that held a nested ArrayArray#
> >>>>>> >> >>>>>>>>> > as a
> >>>>>> >> >>>>>>>>> > ByteArray#
> >>>>>> >> >>>>>>>>> > anyways, so it isn't like there is a safety story
> >>>>>> >> >>>>>>>>> > preventing
> >>>>>> >> >>>>>>>>> > this.
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > I've been hunting for ways to try to kill the
> >>>>>> >> >>>>>>>>> > indirection
> >>>>>> >> >>>>>>>>> > problems I get with Haskell and mutable structures,
> and
> >>>>>> >> >>>>>>>>> > I
> >>>>>> >> >>>>>>>>> > could shoehorn a
> >>>>>> >> >>>>>>>>> > number of them into ArrayArrays if this worked.
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of
> >>>>>> >> >>>>>>>>> > unnecessary
> >>>>>> >> >>>>>>>>> > indirection compared to c/java and this could reduce
> >>>>>> >> >>>>>>>>> > that pain
> >>>>>> >> >>>>>>>>> > to just 1
> >>>>>> >> >>>>>>>>> > level of unnecessary indirection.
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> > -Edward
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> > _______________________________________________
> >>>>>> >> >>>>>>>>> > ghc-devs mailing list
> >>>>>> >> >>>>>>>>> > [email protected]
> >>>>>> >> >>>>>>>>> >
> >>>>>> >> >>>>>>>>> >
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>>
> >>>>>> >> >>>>>>>>> _______________________________________________
> >>>>>> >> >>>>>>>>> ghc-devs mailing list
> >>>>>> >> >>>>>>>>> [email protected]
> >>>>>> >> >>>>>>>>>
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>>>
> >>>>>> >> >>>>>
> >>>>>> >> >>>
> >>>>>> >> >>
> >>>>>> >> >
> >>>>>> >> >
> >>>>>> >> > _______________________________________________
> >>>>>> >> > ghc-devs mailing list
> >>>>>> >> > [email protected]
> >>>>>> >> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>>>>> >> >
> >>>>>> >
> >>>>>> >
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> ghc-devs mailing list
> >>>> [email protected]
> >>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >>>>
> >>>
> >>
> >
> > _______________________________________________
> > ghc-devs mailing list
> > [email protected]
> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> >
>

_______________________________________________
ghc-devs mailing list
[email protected]
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Re: ArrayArrays

Reply via email to