Hi Michael, Thanks for the info. I stumbled upon Richard's excellent web page describing the internals of OCaml and the write barrier but it does not describe how the pointers actually get rewritten.
Cheers, Jon. > -----Original Message----- > From: [email protected] [mailto:caml-list- > [email protected]] On Behalf Of Michael Ekstrand > Sent: 30 October 2010 18:38 > To: [email protected] > Subject: Re: [Caml-list] How does OCaml update references when values > are moved by the GC? > > On 10/30/2010 12:15 PM, Jon Harrop wrote: > > I was hoping for a little more detail, of course. :-) > > > > How is the mapping from old to new pointers stored? Does the GC > rewrite all > > of the thread-local stacks in series before allowing any of them to > > continue? > > I imagine so. > > > Does the write barrier record pointers written into the major heap > > so only those specific locations are rewritten at minor heap > collections but > > the entire major heap is rewritten upon compaction? > > Yes. The runtime maintains a 'remembered set', a list of pointers from > the major heap back to the minor heap. Maintaining this set is why > mutable data can be expensive in OCaml - any time you store a pointer > into a mutable field, the runtime must check whether the new link is > from the major to the minor heap and update the refs list accordingly. > Richard WM Jones has details here: > > http://rwmj.wordpress.com/2009/08/08/ocaml-internals-part-5-garbage- > collection/ > > > Can the GC distinguish > > between an array of ints and an array of pointers at run-time in > order to > > avoid traversing all of the ints when trying to rewrite pointers? > > Not that I know of. The tag block does not have a documented reserved > value to indicate that - there are values to indicate an unboxed float > array, a string, and an array of opaque values, but not an integer > array > (unless the opaque value flag is set for integer arrays). > > > Also, any idea what the maximum proportion of the running time of a > program > > is spent doing this rewriting? For example, if you fill a float- > >float hash > > table with values does updating the pointers in the major heap > account for a > > significant proportion of the total running time of the program? > > In my data analysis jobs (which wind up allocating quite large heaps), > the compactor almost never (detectably) runs. Minor cycles and major > slices are a much larger concern in my experience. I work around that > by increasing the minor heap size to decrease minor heap thrashing (my > general rule of thumb is that I want each "work unit", whatever that > may > be, to fit in the minor heap). > > It could well be that other applications will have different > characteristics that trigger more compactions. I cannot speak for > those > applications. Further, when I have huge floating-point data > structures, > I'm usually using bigarrays (not because I choose them over arrays, > typically, but because such code in my work frequently has to interact > with BLAS or SPARSKIT at some point). > > - Michael > > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs _______________________________________________ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs
