Re: Weak References?

Benjamin Goldberg Sun, 24 Aug 2003 01:31:36 +0000


Juergen Boemmels wrote:
> 
> Benjamin Goldberg <[EMAIL PROTECTED]> writes:
> 
> > But suppose that at the end of DoD, the object we're weakly referring
> > to gets marked as alive?  Now, we would need to look through that
> > object's list of destroy-functions, and remove all of the
> > weakref-callbacks.
> 
> At the end of DoD nobody gets marked alive anymore. The calling of
> destroy-functions is done at sweep-time. There might be a problem when
> you destroy referent and referee at the same DoD run. Then the
> weakref-callback can be called on a dead but not destroyed object. It
> is a matter of destruction ordering to destroy the referee first,
> which destroys the weakref.


Erm, I had meant to say, "suppose that *by* the time DoD has finished
sweeping, the object we're weakly referring to *has been* marked as
alive some time (after we've done pobject_weakref)," not "what happens
if it gets marked as alive after the end of DoD."

> > But if the weakref-callbacks are stored in a separate data structure,
> > created during DoD, then there's no problem; this data structure will
> > get re-created for each DoD pass, and thus always start empty.
> 
> This is unnecessarily complicated. And it slows down the DoD run by
> creating a data structure.

As opposed to attaching another destructor (which is, after all, a data
structure) to the object being weakly referred to?

Plus, since the data structure for the weakref callbacks gets created at
the start of the DoD function, and we're finished with it at the end of
the DoD function, it doesn't need to be a PMC or any other "generic"
structure; its allocation and deallocation will be quite explicit, and
it can be a special-purpose data structure written for compactness and
speed instead of generality.
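
A minimal sketch of such a per-pass list, under stated assumptions: the
PObj and Interp types here are simplified stand-ins (not Parrot's real
headers), weak_register(), weak_finish() and count_cb() are hypothetical
names, and malloc/free stand in for whatever allocator the DoD code
would actually use. Callbacks fire only when the registering object is
still alive and the referent is dead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Parrot object header; only the liveness
 * flag set by the mark phase matters here. */
typedef struct PObj { int live; } PObj;

typedef void (*weakref_cb)(PObj *referent, void *callback_info);

/* One node per weak reference registered during the current DoD pass. */
typedef struct weak_entry {
    struct weak_entry *next;
    PObj       *owner;     /* object that registered the callback */
    PObj       *referent;  /* object being weakly referred to */
    weakref_cb  cb;
    void       *info;
} weak_entry;

/* Hypothetical interpreter state: just the head of the per-pass list. */
typedef struct Interp { weak_entry *weak_list; } Interp;

/* Called while marking. Returns 0 on success, -1 if malloc fails. */
int weak_register(Interp *in, PObj *owner, PObj *referent,
                  weakref_cb cb, void *info)
{
    weak_entry *e;
    assert(in && owner && referent && cb);
    e = malloc(sizeof *e);
    if (!e)
        return -1;
    e->next     = in->weak_list;
    e->owner    = owner;
    e->referent = referent;
    e->cb       = cb;
    e->info     = info;
    in->weak_list = e;
    return 0;
}

/* Called once marking is complete. Fires the callback for each entry
 * whose owner is alive and whose referent is dead, frees every node,
 * and leaves the list empty for the next pass. Returns the number of
 * callbacks fired. */
int weak_finish(Interp *in)
{
    int fired = 0;
    weak_entry *e = in->weak_list;
    in->weak_list = NULL;
    while (e) {
        weak_entry *next = e->next;
        if (e->owner->live && !e->referent->live) {
            e->cb(e->referent, e->info);
            fired++;
        }
        free(e);
        e = next;
    }
    return fired;
}

/* Example callback: counts how often it fires. */
void count_cb(PObj *referent, void *callback_info)
{
    (void)referent;
    ++*(int *)callback_info;
}
```

Nothing is attached to the objects themselves; the list lives only
between the first weak_register() of a pass and the matching
weak_finish().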

> > Also, by keeping them separate, we can walk all the callback functions
> > before destructing the dead objects.
> 
> This is a problem of destruction ordering. You try to solve it by
> introducing a separate step, which works in this special case. But
> destruction ordering is a much harder problem.

Well, it wouldn't be *harmful* to call the callbacks after the dead
objects are destructed (or even when *some* but not all of them are
destructed) ... however, to be safe we'd then have to make it illegal to
operate on the Pobj pointer as anything other than an opaque pointer, on
which we may only perform equality comparisons.  Caching the information
about what the Pobj *used* to be would become necessary, if we needed to
know that info.

> > (But you're right -- it is too complicated to do a lookup table.  A
> > simple linked list should do fine.)
> >
> > > But one other thing, what happens if the object holding the weakref
> > > dies before the referenced object? Then the callback-function will be
> > > called for a dead object.
> >
> > Each callback-function "belongs" to a pmc.  The DoD should be able to
> > know this, and act on it.  So if the pmc which registered the callback
> > is dead, (or if the object weakly referred has since then come alive),
> > then the callback isn't called.
> >
> > > So pobject_weakref() needs to return a handle
> > > for the weakref and there needs to be a function
> > > weakref_destroy(weakref_handle *handle).
> > >
> > > The other issue is who owns the data structure of the weakref: the
> > > referent, the referee, or will this be garbage-collected (which
> > > makes the weakref_handle a PObj* and weakref_destroy its custom
> > > destroy function)?
> >
> > The garbage collector owns everything except for callback_info, which
> > belongs to the pmc which registered the weakref-callback.
> 
> This is one possibility. Therefore the destroy-function of the
> registering PMC must be extended with freeing the callback_info. As we
> already extend the destroy function of the PMC referenced by the
> weakref this needs no new mechanics.

? Where have we extended the destroy function of ... ?  Hmm, oh, in your
proposed modification of ... .  Ok.  I suppose that works, though it
sounds like it's getting to be rather more complicated than my idea.
And bits of the weakref stuff are now scattered all over the place, stuck
onto each pmc being dealt with, instead of collected all in one nicely
manageable location.

> > > > After DOD finishes, the lookup table is walked; for each entry
> > > > whose Pobj* hasn't been marked as alive, the callbacks are called.
> > > >
> > > > The effect of this of course is that a WeakRef has no cost except
> > > > during Dead Object Detection.
> > >
> > > It only has a cost at object destroy-time. (If the weakrefs are
> > > garbage-collected they have an effect on DOD in the way that there
> > > are more objects to trace.)
> >
> > *blink* More objects?  Oh, you're assuming that pobject_weakref is
> > returning a Pobj* handle.
> 
> Returning a PObj* handle is the other possibility. The registering PMC
> holds a hard reference to the callback_info, and the callback_info
> deregisters itself when it gets destroyed. The advantage of this
> approach is there is no need to malloc/free the memory for
> callback_info, it just uses the standard gc-allocator.

Either you're confused, or you're confusing me.  The callback_info is
just a void* pointer, which can be anything, ranging from an integer
cast to a pointer, or a pointer to global or static memory, or a pointer
into the object doing the registering, or a pointer into the interpreter
struct, or a pointer into some other pmc (one which we're marking as
alive, I hope!), or a pointer to a whole PMC, or a pointer into the object
we're weakly referring to, or a mem_sys_allocate()d pointer.

If it's anything *other* than a mem_sys_allocate()d pointer, then
there's no need whatsoever to do anything to free it -- that'll happen
when necessary (or in the case of an int cast to a pointer, won't *be*
necessary).

If it *is* a mem_sys_allocate()d pointer, then of course someone needs
to free it.  It had better not be one which was allocated by the pmc
being weakly referenced (and which will be freed by that pmc), or you're
in trouble (or at least, it's trouble if the callback may happen after
the referenced pmc is destructed)!

I'll assume that it's the referencing pmc that allocated it, and that
this was done inside of the mark() function, solely for the purposes of
the marking effect.  Obviously, we'll need to keep a copy of this
pointer somewhere in our structure, so that *we* can free it in our
destroy, if necessary.

It's annoying though that if the weakly referenced object is found to be
alive, and we stay alive for a long while, this allocated memory stays
allocated for a long while, since we have no chance to free it earlier.

Hmm... I suppose that if the callback is called *whether or not* the
weakly referenced object dies, (and the callback then has to do a check
of whether the referenced object is really dead) then there's no
problem: it can free the mem_sys_allocate()d memory inside the callback,
and we don't need to store any pointer to it anywhere.
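
As a rough illustration of that variant (a sketch only, with made-up
names): the callback below is assumed to be invoked for every
registration at the end of DoD, whether the referent died or not. The
weak_info struct, weak_info_new() and weak_cb_always() are hypothetical,
PObj is a simplified stand-in, and malloc/free stand in for
mem_sys_allocate() and its matching free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Parrot object header. */
typedef struct PObj { int live; } PObj;

/* Hypothetical payload allocated inside the owner's mark() routine. */
typedef struct weak_info {
    PObj **slot;   /* where the owner keeps its weak pointer */
} weak_info;

/* Allocate the payload; returns NULL if allocation fails. */
weak_info *weak_info_new(PObj **slot)
{
    weak_info *wi;
    assert(slot != NULL);
    wi = malloc(sizeof *wi);
    if (wi)
        wi->slot = slot;
    return wi;
}

/* Invoked once per registration at the end of DoD, dead referent or not.
 * It checks liveness itself and always releases the payload, so the
 * owner never has to remember the pointer for its own destroy. */
void weak_cb_always(PObj *referent, void *callback_info)
{
    weak_info *wi = callback_info;
    if (!referent->live)
        *wi->slot = NULL;   /* referent died: clear the weak pointer */
    free(wi);               /* freed either way */
}
```

The payload's lifetime is then exactly one DoD pass, at the price of one
callback invocation per registration per pass.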

> > Keeping the callback data in a separate list which only exists for the
> > duration of the dod prevents this.  Or rather, you do have to clean up
> > the linked list, of course, but there's no extra bookkeeping.
> 
> *blink too* You don't want to use a weakref, you want a weak_MARK. The
> callbacks are getting registered during each mark, and get used or
> destroyed. But this registering and destructing has a cost too. This
> cost is only paid by the weakmark-using objects, but it needs to be
> paid on every DoD run.

Yes.  What I meant was, no extra data is attached to pmcs, which we
would then have to keep track of.

> > > this is only useful if a hashlookup is fast compared with
> > > string_make.
> >
> > Well, it might be.  Hashing can be quite fast, ya know.
> 
> Only the profiler can tell you which one is faster.

Indeed.  Anyway, doing this to strings *automatically* was merely a
thought -- we don't *have* to do it.  (Well, we will have to when we
implement Java, since that's part of the spec.  But we don't have to for
imcc, or perl6, unless we want to.)

> > Here's a better idea, one you'll have more difficulty arguing with --
> > imagine a debugger, written in parrot.
> >
> > We are going to have one, right?  Hmm, p6tkdb :)
> >
> > It needs to keep references to objects it's interested in, but if
> > they're strong references, then we would have trouble debugging
> > objects with custom destroys (or worse, objects needing timely
> > destruction), since the debugger's references to them would prevent
> > them from being cleaned up.
> >
> > Changing to weakrefs removes this kind of horrible heisenbug.
> 
> Something totally different. DoD runs normally happen in out-of-memory
> situations. Doing things like running a debugger callback before a
> sweep finishes might ask for trouble. If the debugger is written in
> parrot, then the callback-function which deregisters the object is
> surely also written in parrot. Recursive runloops with unfinished
> sweeps, combined with early destruction in the inner runloop. This
> will be really fun.

Blech, you're right.

Ok, how about this:  If the callbacks are only called *after* everything
else is done (destructors called, memory compacted and maybe freed,
etc.), then the GC will be complete when the callback functions are
called.

Copy the pointer to the list of callback stuff into a C auto variable,
and then set the interpreter-> version of it to NULL.  Thus, it should
be safe for a callback function to do something which might happen to
trigger DoD.
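
The detach step might look roughly like this. It is a sketch with
hypothetical names (Interp, cb_node, queue_callback(), run_pending() and
the demo callbacks are all invented for illustration), and malloc/free
stand in for the real allocator. The point is only that the walk uses a
local copy of the list head, so anything a callback queues lands on a
fresh list.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*late_cb)(void *callback_info);

typedef struct cb_node {
    struct cb_node *next;
    late_cb         cb;
    void           *info;
} cb_node;

/* Hypothetical interpreter state: callbacks queued by the current pass. */
typedef struct Interp { cb_node *pending; } Interp;

/* Queue a callback. Returns 0 on success, -1 if malloc fails. */
int queue_callback(Interp *in, late_cb cb, void *info)
{
    cb_node *n;
    assert(in && cb);
    n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->next = in->pending;
    n->cb   = cb;
    n->info = info;
    in->pending = n;
    return 0;
}

/* Run after destructors and compaction are done. Returns the number of
 * callbacks invoked. */
int run_pending(Interp *in)
{
    int      count = 0;
    cb_node *local = in->pending;  /* the C auto variable owns the list */
    in->pending = NULL;            /* a nested DoD starts from empty */
    while (local) {
        cb_node *next = local->next;
        local->cb(local->info);    /* may queue new callbacks safely */
        free(local);
        local = next;
        count++;
    }
    return count;
}

/* Demo callbacks: one that re-enters the queue, one that does nothing. */
typedef struct demo_state { Interp *in; int calls; } demo_state;

void noop_cb(void *info) { (void)info; }

void reentrant_cb(void *info)
{
    demo_state *s = info;
    s->calls++;
    queue_callback(s->in, noop_cb, NULL);  /* as a nested DoD might */
}
```

A callback that queues more work during the walk leaves that work on
interpreter->pending for the next run instead of corrupting the list
being traversed.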

Since we cannot at this point in time deal with the dead objects as
actual Pobj* things (since they've been cleaned up, and their memory
presumably freed), that parameter would have to disappear from the
callback function's parameters (or maybe, rename the parameter as "void*
opaque", and forbid users (in the docs) from casting it into a Pobj*). 
All the information needed to tell us what we *had been* weakly
referencing will now have to go into the callback_info pointer.
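
One way to picture this, as a sketch with invented names (weak_slot,
weak_slot_init() and weak_slot_cb() are hypothetical): the callback_info
caches whatever description of the referent is needed while it is still
alive, and the callback receives only an identity token. The sketch
passes that token as a uintptr_t rather than a void*, since ISO C treats
the value of a pointer to freed storage as indeterminate; that choice is
an assumption of this example, not part of the proposal above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback_info: everything we may later want to know
 * about the referent is copied here while it is still alive. */
typedef struct weak_slot {
    uintptr_t key;       /* identity of the referent; compare only */
    char      desc[32];  /* cached description of what it used to be */
    int       cleared;   /* set once the referent is reported dead */
} weak_slot;

/* Record the referent's identity and a description of it. */
void weak_slot_init(weak_slot *s, const void *referent, const char *desc)
{
    assert(s && referent && desc);
    s->key = (uintptr_t)referent;
    snprintf(s->desc, sizeof s->desc, "%s", desc);
    s->cleared = 0;
}

/* Callback run after GC has finished. `opaque` identifies the object
 * that died; it is never converted back to a pointer or dereferenced. */
void weak_slot_cb(uintptr_t opaque, void *callback_info)
{
    weak_slot *s = callback_info;
    if (!s->cleared && s->key == opaque) {
        s->key = 0;
        s->cleared = 1;
    }
}
```

After the callback, desc is the only remaining record of what was being
weakly referenced.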

-- 
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "[EMAIL PROTECTED]
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}
