You presumably also save a bounds check on reads by hard-coding the sizes? On Fri, Aug 28, 2015 at 3:39 PM, Edward Kmett <[email protected]> wrote:
> Also there are 4 different "things" here, basically depending on two > independent questions: > > a.) if you want to shove the sizes into the info table, and > b.) if you want cardmarking. > > Versions with/without cardmarking for different sizes can be done pretty > easily, but as noted, the infotable variants are pretty invasive. > > -Edward > > On Fri, Aug 28, 2015 at 6:36 PM, Edward Kmett <[email protected]> wrote: > >> Well, on the plus side you'd save 16 bytes per object, which adds up if >> they were small enough and there are enough of them. You get a bit better >> locality of reference in terms of what fits in the first cache line of them. >> >> -Edward >> >> On Fri, Aug 28, 2015 at 6:14 PM, Ryan Newton <[email protected]> wrote: >> >>> Yes. And for the short term I can imagine places we will settle with >>> arrays even if it means tracking lengths unnecessarily and unsafeCoercing >>> pointers whose types don't actually match their siblings. >>> >>> Is there anything to recommend the hacks mentioned for fixed sized array >>> objects *other* than using them to fake structs? (Much to derecommend, as >>> you mentioned!) >>> >>> On Fri, Aug 28, 2015 at 3:07 PM Edward Kmett <[email protected]> wrote: >>> >>>> I think both are useful, but the one you suggest requires a lot more >>>> plumbing and doesn't subsume all of the usecases of the other. >>>> >>>> -Edward >>>> >>>> On Fri, Aug 28, 2015 at 5:51 PM, Ryan Newton <[email protected]> >>>> wrote: >>>> >>>>> So that primitive is an array like thing (Same pointed type, unbounded >>>>> length) with extra payload. >>>>> >>>>> I can see how we can do without structs if we have arrays, especially >>>>> with the extra payload at front. But wouldn't the general solution for >>>>> structs be one that that allows new user data type defs for # types? >>>>> >>>>> >>>>> >>>>> On Fri, Aug 28, 2015 at 4:43 PM Edward Kmett <[email protected]> wrote: >>>>> >>>>>> Some form of MutableStruct# with a known number of words and a known >>>>>> number of pointers is basically what Ryan Yates was suggesting above, but >>>>>> where the word counts were stored in the objects themselves. >>>>>> >>>>>> Given that it'd have a couple of words for those counts it'd likely >>>>>> want to be something we build in addition to MutVar# rather than a >>>>>> replacement. >>>>>> >>>>>> On the other hand, if we had to fix those numbers and build info >>>>>> tables that knew them, and typechecker support, for instance, it'd get >>>>>> rather invasive. >>>>>> >>>>>> Also, a number of things that we can do with the 'sized' versions >>>>>> above, like working with evil unsized c-style arrays directly inline at >>>>>> the >>>>>> end of the structure cease to be possible, so it isn't even a pure win if >>>>>> we did the engineering effort. >>>>>> >>>>>> I think 90% of the needs I have are covered just by adding the one >>>>>> primitive. The last 10% gets pretty invasive. >>>>>> >>>>>> -Edward >>>>>> >>>>>> On Fri, Aug 28, 2015 at 5:30 PM, Ryan Newton <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> I like the possibility of a general solution for mutable structs >>>>>>> (like Ed said), and I'm trying to fully understand why it's hard. >>>>>>> >>>>>>> So, we can't unpack MutVar into constructors because of object >>>>>>> identity problems. But what about directly supporting an extensible set >>>>>>> of >>>>>>> unlifted MutStruct# objects, generalizing (and even replacing) MutVar#? >>>>>>> That may be too much work, but is it problematic otherwise? >>>>>>> >>>>>>> Needless to say, this is also critical if we ever want best in class >>>>>>> lockfree mutable structures, just like their Stm and sequential >>>>>>> counterparts. >>>>>>> >>>>>>> On Fri, Aug 28, 2015 at 4:43 AM Simon Peyton Jones < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> At the very least I'll take this email and turn it into a short >>>>>>>> article. >>>>>>>> >>>>>>>> Yes, please do make it into a wiki page on the GHC Trac, and maybe >>>>>>>> make a ticket for it. >>>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Simon >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *From:* Edward Kmett [mailto:[email protected]] >>>>>>>> *Sent:* 27 August 2015 16:54 >>>>>>>> *To:* Simon Peyton Jones >>>>>>>> *Cc:* Manuel M T Chakravarty; Simon Marlow; ghc-devs >>>>>>>> *Subject:* Re: ArrayArrays >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> An ArrayArray# is just an Array# with a modified invariant. It >>>>>>>> points directly to other unlifted ArrayArray#'s or ByteArray#'s. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> While those live in #, they are garbage collected objects, so this >>>>>>>> all lives on the heap. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> They were added to make some of the DPH stuff fast when it has to >>>>>>>> deal with nested arrays. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I'm currently abusing them as a placeholder for a better thing. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The Problem >>>>>>>> >>>>>>>> ----------------- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Consider the scenario where you write a classic doubly-linked list >>>>>>>> in Haskell. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLL = DLL (IORef (Maybe DLL) (IORef (Maybe DLL) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Chasing from one DLL to the next requires following 3 pointers on >>>>>>>> the heap. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> DLL ~> IORef (Maybe DLL) ~> MutVar# RealWorld (Maybe DLL) ~> Maybe >>>>>>>> DLL ~> DLL >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> That is 3 levels of indirection. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> We can trim one by simply unpacking the IORef with >>>>>>>> -funbox-strict-fields or UNPACK >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> We can trim another by adding a 'Nil' constructor for DLL and >>>>>>>> worsening our representation. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLL = DLL !(IORef DLL) !(IORef DLL) | Nil >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> but now we're still stuck with a level of indirection >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> DLL ~> MutVar# RealWorld DLL ~> DLL >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> This means that every operation we perform on this structure will >>>>>>>> be about half of the speed of an implementation in most other languages >>>>>>>> assuming we're memory bound on loading things into cache! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Making Progress >>>>>>>> >>>>>>>> ---------------------- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I have been working on a number of data structures where the >>>>>>>> indirection of going from something in * out to an object in # which >>>>>>>> contains the real pointer to my target and coming back effectively >>>>>>>> doubles >>>>>>>> my runtime. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> We go out to the MutVar# because we are allowed to put the MutVar# >>>>>>>> onto the mutable list when we dirty it. There is a well defined >>>>>>>> write-barrier. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I could change out the representation to use >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLL = DLL (MutableArray# RealWorld DLL) | Nil >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I can just store two pointers in the MutableArray# every time, but >>>>>>>> this doesn't help _much_ directly. It has reduced the amount of >>>>>>>> distinct >>>>>>>> addresses in memory I touch on a walk of the DLL from 3 per object to >>>>>>>> 2. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I still have to go out to the heap from my DLL and get to the array >>>>>>>> object and then chase it to the next DLL and chase that to the next >>>>>>>> array. >>>>>>>> I do get my two pointers together in memory though. I'm paying for a >>>>>>>> card >>>>>>>> marking table as well, which I don't particularly need with just two >>>>>>>> pointers, but we can shed that with the "SmallMutableArray#" machinery >>>>>>>> added back in 7.10, which is just the old array code a a new data type, >>>>>>>> which can speed things up a bit when you don't have very big arrays: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLL = DLL (SmallMutableArray# RealWorld DLL) | Nil >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> But what if I wanted my object itself to live in # and have two >>>>>>>> mutable fields and be able to share the sme write barrier? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> An ArrayArray# points directly to other unlifted array types. What >>>>>>>> if we have one # -> * wrapper on the outside to deal with the impedence >>>>>>>> mismatch between the imperative world and Haskell, and then just let >>>>>>>> the >>>>>>>> ArrayArray#'s hold other arrayarrays. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLL = DLL (MutableArrayArray# RealWorld) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> now I need to make up a new Nil, which I can just make be a special >>>>>>>> MutableArrayArray# I allocate on program startup. I can even abuse >>>>>>>> pattern >>>>>>>> synonyms. Alternately I can exploit the internals further to make this >>>>>>>> cheaper. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Then I can use the readMutableArrayArray# and >>>>>>>> writeMutableArrayArray# calls to directly access the preceding and next >>>>>>>> entry in the linked list. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> So now we have one DLL wrapper which just 'bootstraps me' into a >>>>>>>> strict world, and everything there lives in #. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> next :: DLL -> IO DLL >>>>>>>> >>>>>>>> next (DLL m) = IO $ \s -> case readMutableArrayArray# s of >>>>>>>> >>>>>>>> (# s', n #) -> (# s', DLL n #) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> It turns out GHC is quite happy to optimize all of that code to >>>>>>>> keep things unboxed. The 'DLL' wrappers get removed pretty easily when >>>>>>>> they >>>>>>>> are known strict and you chain operations of this sort! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Cleaning it Up >>>>>>>> >>>>>>>> ------------------ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Now I have one outermost indirection pointing to an array that >>>>>>>> points directly to other arrays. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I'm stuck paying for a card marking table per object, but I can fix >>>>>>>> that by duplicating the code for MutableArrayArray# and using a >>>>>>>> SmallMutableArray#. I can hack up primops that let me store a mixture >>>>>>>> of >>>>>>>> SmallMutableArray# fields and normal ones in the data structure. >>>>>>>> Operationally, I can even do so by just unsafeCoercing the existing >>>>>>>> SmallMutableArray# primitives to change the kind of one of the >>>>>>>> arguments it >>>>>>>> takes. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> This is almost ideal, but not quite. I often have fields that would >>>>>>>> be best left unboxed. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data DLLInt = DLL !Int !(IORef DLL) !(IORef DLL) | Nil >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> was able to unpack the Int, but we lost that. We can currently at >>>>>>>> best point one of the entries of the SmallMutableArray# at a boxed or >>>>>>>> at a >>>>>>>> MutableByteArray# for all of our misc. data and shove the int in >>>>>>>> question >>>>>>>> in there. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> e.g. if I were to implement a hash-array-mapped-trie I need to >>>>>>>> store masks and administrivia as I walk down the tree. Having to go >>>>>>>> off to >>>>>>>> the side costs me the entire win from avoiding the first pointer chase. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> But, if like Ryan suggested, we had a heap object we could >>>>>>>> construct that had n words with unsafe access and m pointers to other >>>>>>>> heap >>>>>>>> objects, one that could put itself on the mutable list when any of >>>>>>>> those >>>>>>>> pointers changed then I could shed this last factor of two in all >>>>>>>> circumstances. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Prototype >>>>>>>> >>>>>>>> ------------- >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Over the last few days I've put together a small prototype >>>>>>>> implementation with a few non-trivial imperative data structures for >>>>>>>> things >>>>>>>> like Tarjan's link-cut trees, the list labeling problem and >>>>>>>> order-maintenance. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> https://github.com/ekmett/structs >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Notable bits: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Data.Struct.Internal.LinkCut >>>>>>>> <https://github.com/ekmett/structs/blob/9ff2818f888aff4789b7a41077a674a10d15e6ee/src/Data/Struct/Internal/LinkCut.hs> >>>>>>>> provides an implementation of link-cut trees in this style. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Data.Struct.Internal >>>>>>>> <https://github.com/ekmett/structs/blob/9ff2818f888aff4789b7a41077a674a10d15e6ee/src/Data/Struct/Internal.hs> >>>>>>>> provides the rather horrifying guts that make it go fast. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Once compiled with -O or -O2, if you look at the core, almost all >>>>>>>> the references to the LinkCut or Object data constructor get optimized >>>>>>>> away, and we're left with beautiful strict code directly mutating out >>>>>>>> underlying representation. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> At the very least I'll take this email and turn it into a short >>>>>>>> article. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Edward >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Aug 27, 2015 at 9:00 AM, Simon Peyton Jones < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>> Just to say that I have no idea what is going on in this thread. >>>>>>>> What is ArrayArray? What is the issue in general? Is there a ticket? >>>>>>>> Is >>>>>>>> there a wiki page? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> If it’s important, an ab-initio wiki page + ticket would be a good >>>>>>>> thing. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Simon >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> *From:* ghc-devs [mailto:[email protected]] *On Behalf >>>>>>>> Of *Edward Kmett >>>>>>>> *Sent:* 21 August 2015 05:25 >>>>>>>> *To:* Manuel M T Chakravarty >>>>>>>> *Cc:* Simon Marlow; ghc-devs >>>>>>>> *Subject:* Re: ArrayArrays >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> When (ab)using them for this purpose, SmallArrayArray's would be >>>>>>>> very handy as well. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Consider right now if I have something like an order-maintenance >>>>>>>> structure I have: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data Upper s = Upper {-# UNPACK #-} !(MutableByteArray s) {-# >>>>>>>> UNPACK #-} !(MutVar s (Upper s)) {-# UNPACK #-} !(MutVar s (Upper s)) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data Lower s = Lower {-# UNPACK #-} !(MutVar s (Upper s)) {-# >>>>>>>> UNPACK #-} !(MutableByteArray s) {-# UNPACK #-} !(MutVar s (Lower s)) >>>>>>>> {-# >>>>>>>> UNPACK #-} !(MutVar s (Lower s)) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The former contains, logically, a mutable integer and two pointers, >>>>>>>> one for forward and one for backwards. The latter is basically the same >>>>>>>> thing with a mutable reference up pointing at the structure above. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On the heap this is an object that points to a structure for the >>>>>>>> bytearray, and points to another structure for each mutvar which each >>>>>>>> point >>>>>>>> to the other 'Upper' structure. So there is a level of indirection >>>>>>>> smeared >>>>>>>> over everything. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> So this is a pair of doubly linked lists with an upward link from >>>>>>>> the structure below to the structure above. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Converted into ArrayArray#s I'd get >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data Upper s = Upper (MutableArrayArray# s) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> w/ the first slot being a pointer to a MutableByteArray#, and the >>>>>>>> next 2 slots pointing to the previous and next previous objects, >>>>>>>> represented just as their MutableArrayArray#s. I can use >>>>>>>> sameMutableArrayArray# on these for object identity, which lets me >>>>>>>> check >>>>>>>> for the ends of the lists by tying things back on themselves. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> and below that >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> data Lower s = Lower (MutableArrayArray# s) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> is similar, with an extra MutableArrayArray slot pointing up to an >>>>>>>> upper structure. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I can then write a handful of combinators for getting out the slots >>>>>>>> in question, while it has gained a level of indirection between the >>>>>>>> wrapper >>>>>>>> to put it in * and the MutableArrayArray# s in #, that one can be >>>>>>>> basically >>>>>>>> erased by ghc. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Unlike before I don't have several separate objects on the heap for >>>>>>>> each thing. I only have 2 now. The MutableArrayArray# for the object >>>>>>>> itself, and the MutableByteArray# that it references to carry around >>>>>>>> the >>>>>>>> mutable int. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> The only pain points are >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 1.) the aforementioned limitation that currently prevents me from >>>>>>>> stuffing normal boxed data through a SmallArray or Array into an >>>>>>>> ArrayArray >>>>>>>> leaving me in a little ghetto disconnected from the rest of Haskell, >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> and >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> 2.) the lack of SmallArrayArray's, which could let us avoid the >>>>>>>> card marking overhead. These objects are all small, 3-4 pointers wide. >>>>>>>> Card >>>>>>>> marking doesn't help. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Alternately I could just try to do really evil things and convert >>>>>>>> the whole mess to SmallArrays and then figure out how to unsafeCoerce >>>>>>>> my >>>>>>>> way to glory, stuffing the #'d references to the other arrays directly >>>>>>>> into >>>>>>>> the SmallArray as slots, removing the limitation we see here by aping >>>>>>>> the >>>>>>>> MutableArrayArray# s API, but that gets really really dangerous! >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> I'm pretty much willing to sacrifice almost anything on the altar >>>>>>>> of speed here, but I'd like to be able to let the GC move them and >>>>>>>> collect >>>>>>>> them which rules out simpler Ptr and Addr based solutions. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -Edward >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Aug 20, 2015 at 9:01 PM, Manuel M T Chakravarty < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>> That’s an interesting idea. >>>>>>>> >>>>>>>> Manuel >>>>>>>> >>>>>>>> > Edward Kmett <[email protected]>: >>>>>>>> >>>>>>>> > >>>>>>>> > Would it be possible to add unsafe primops to add Array# and >>>>>>>> SmallArray# entries to an ArrayArray#? The fact that the ArrayArray# >>>>>>>> entries are all directly unlifted avoiding a level of indirection for >>>>>>>> the >>>>>>>> containing structure is amazing, but I can only currently use it if my >>>>>>>> leaf >>>>>>>> level data can be 100% unboxed and distributed among ByteArray#s. It'd >>>>>>>> be >>>>>>>> nice to be able to have the ability to put SmallArray# a stuff down at >>>>>>>> the >>>>>>>> leaves to hold lifted contents. >>>>>>>> > >>>>>>>> > I accept fully that if I name the wrong type when I go to access >>>>>>>> one of the fields it'll lie to me, but I suppose it'd do that if i >>>>>>>> tried to >>>>>>>> use one of the members that held a nested ArrayArray# as a ByteArray# >>>>>>>> anyways, so it isn't like there is a safety story preventing this. >>>>>>>> > >>>>>>>> > I've been hunting for ways to try to kill the indirection >>>>>>>> problems I get with Haskell and mutable structures, and I could >>>>>>>> shoehorn a >>>>>>>> number of them into ArrayArrays if this worked. >>>>>>>> > >>>>>>>> > Right now I'm stuck paying for 2 or 3 levels of unnecessary >>>>>>>> indirection compared to c/java and this could reduce that pain to just >>>>>>>> 1 >>>>>>>> level of unnecessary indirection. >>>>>>>> > >>>>>>>> > -Edward >>>>>>>> >>>>>>>> > _______________________________________________ >>>>>>>> > ghc-devs mailing list >>>>>>>> > [email protected] >>>>>>>> > http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> ghc-devs mailing list >>>>>>>> [email protected] >>>>>>>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs >>>>>>>> >>>>>>> >>>>>> >>>> >> >
_______________________________________________ ghc-devs mailing list [email protected] http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
