Re: Ephemerons

Gil Tene Sat, 23 Jan 2016 10:39:43 -0800

> On Jan 23, 2016, at 5:14 AM, Peter Levart <peter.lev...@gmail.com> wrote:
> 
> Hi Gil, it's good to have this discussion. See comments inline...
> 
> On 01/23/2016 05:13 AM, Gil Tene wrote:
> ....
>>> On Jan 22, 2016, at 2:49 PM, Peter Levart <peter.lev...@gmail.com> wrote:
>>> 
>>> Ephemeron always touches definitions of at least two consecutive strengths 
>>> of reachabilities. The prototype says:
>>> 
>>>  * <li> An object is <em>weakly reachable</em> if it is neither
>>>  * strongly nor softly reachable but can be reached by traversing a
>>>  * weak reference or by traversing an ephemeron through its value while
>>>  * the ephemeron's key is at least weakly reachable.
>>> 
>>>  * <li> An object is <em>ephemerally reachable</em> if it is neither
>>>  * strongly, softly nor weakly reachable but can be reached by traversing an
>>>  * ephemeron through its key or by traversing an ephemeron through its value
>>>  * while its key is at most ephemerally reachable. When the ephemerons that
>>>  * refer to an ephemerally reachable key object are cleared, the key object
>>>  * becomes eligible for finalization.
>> 
>> Looking into this a bit more, I don't think the above is quite right. 
>> Specifically, if an ephemeron's key is either strongly or softly reachable, 
>> you want the value to remain appropriately strongly/softly reachable. 
>> Without this quality, Ephemeron value referents can (and will) be 
>> prematurely collected and finalized while the keys are not. This (IMO) 
>> needed quality is not provided by the behavior you specify…
> 
> This is not quite true. While an ephemeron's value is weakly or even 
> ephemerally-reachable, it is not finalizable, because ephemerally-reachable is 
> stronger than finally-reachable. After an ephemeron's key becomes 
> ephemerally-reachable, the ephemeron is cleared by the GC, which sets its key 
> *and* value to null atomically. The lives of key and value at that moment 
> become untangled. Either of them can have a finalizer or not, and both of 
> them will eventually be collected if not revived by their finalize() methods. 
> But it can never happen that an ephemeron's value is finalized or collected 
> while its key is still reachable through the ephemeron (while the ephemeron 
> is not cleared yet).
> 
> But I agree that it would be desirable for ephemeron's value to follow the 
> reachability of its key. In the above specification, if the key is strongly 
> reachable, the value is weakly reachable, so any WeakReferences or 
> SoftReferences pointing at the Ephemeron's value can already be cleared while 
> the key is still strongly reachable. This is arguably no different than 
> current specification of Soft vs. Weak references. A SoftReference can 
> already be cleared while its referent is still reachable through a 
> WeakReference,


We seem to agree about the cleaner behavior specification (in both of our texts 
below), so these next paragraphs are really about arguing for why this is 
an important design choice if/when adding Ephemerons to Java:

It is true the [current] spec allows for soft references to an object to be 
cleared while weak references to the same object are not: the "determines" in 
"Suppose that the garbage collector determines at a certain point in time that 
an object is RRRR reachable..." part [for RRRR = {soft, weak}] does not have to 
happen at the same "certain point in time".

However, to my knowledge all current implementations present as if this 
determination is happening at the same "point in time" for all weakly and 
softly reachable objects combined. Specifically [in implementations]: if soft 
reachability is determined for an object at some point in time, then weak 
reachability for that object is determined at the same point in time. And the 
weak reachability determination for an object depends on whether the collector 
chose to clear existing soft references to that object at that same point in 
time, with the appearance of the choice to clear (or not to clear) soft 
references to a given object atomically affecting the determination of its 
weak reachability. Since the collector is *required* to act on a weak 
determination when it is made, while it *may* act on a soft determination when 
it is made, making the combined determination at the same "point in time" 
eliminates an obviously confusing situation that is not prohibited by the spec: 
if the determination for weak and soft reachability was not done at the same 
point in time, then an object that was softly reachable and had its soft 
references cleared and queued could later become strongly reachable, and even 
softly reachable again. When reference processing is done as a STW thing, this 
"combined determination" effect is a trivial side-effect of STW. When it is 
done concurrently (or incrementally?), implementations still work to maintain 
the appearance of combined atomic determination of soft and weak reachability. 
I know ours does. In our case, we do it because we had no desire to be the ones 
to argue "I know that all implementations did this atomically because they were 
STW, but the spec allows us to add this bug to your program…".
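
A minimal Java sketch of the observable property described above, for illustration only. The class and method names are hypothetical (nothing here is a JDK API beyond java.lang.ref), and the explicit clear() calls merely stand in for collector actions so that the check can be exercised deterministically:

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical helper, not part of any JDK API.
final class CombinedDeterminationCheck {
    /**
     * Returns true if the soft reference is observed cleared while the weak
     * reference to the same object still yields that object.
     */
    static <T> boolean softClearedButWeakAlive(SoftReference<T> soft, WeakReference<T> weak) {
        // Read the soft reference first: if a collector clears both kinds
        // atomically, a cleared soft reference implies a cleared weak one.
        boolean softCleared = soft.get() == null;
        return softCleared && weak.get() != null;
    }

    public static void main(String[] args) {
        Object referent = new Object();
        SoftReference<Object> soft = new SoftReference<>(referent);
        WeakReference<Object> weak = new WeakReference<>(referent);

        System.out.println(softClearedButWeakAlive(soft, weak)); // nothing cleared yet

        soft.clear(); // simulates a collector that cleared ONLY the soft reference
        System.out.println(softClearedButWeakAlive(soft, weak)); // the "confusing" state

        weak.clear(); // simulates the weak reference being cleared as well
        System.out.println(softClearedButWeakAlive(soft, weak));

        // Keep the referent strongly reachable until here so the real GC
        // cannot clear anything behind our back during the demonstration.
        Reference.reachabilityFence(referent);
    }
}
```

With a collector that makes the soft and weak determinations at the same point in time, a program should never see this check return true as a result of collector activity alone.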

So in actual implementations (to my knowledge), finalization is currently the 
only mechanism that can create this "strange situation" where an object was no 
longer strongly reachable, had actions triggered as a result from loss of 
strong reachability (i.e. actually observed by the program as "known to not be 
strongly reachable"), and later became strongly reachable again. E.g. a 
finalizer can propagate a strong reference to a previously non-strongly 
reachable object ('this' in the finalizer, or anything that 'this' transitively 
refers to and was not otherwise reachable when the finalizer was called). This is 
one of those "undesired" things that the introduction of Reference types was 
meant to deal with (Reference types were introduced in 1.2, after finalization 
was unfortunately already included and spec'ed. And phantom refs were meant to 
allow for a cleaner form that could replace finalization). And while the 
specifications of SoftReference and WeakReference do not prohibit it, 
implementations are not required to allow it, and in practice none of them do (I 
think), as doing so would most likely expose some "interesting" 
spec-allowed-but-extremely-surprising things/bugs that none of us want to have 
to defend...

In this context, it would be a "highly undesirable" design choice to introduce 
Ephemerons in a way that would allow them to return a strong reference to an object 
that has previously been determined to no longer be strongly reachable. 
Structuring the spec to prohibit this is a better design choice.

To highlight the design choice here, let me describe a specific problem 
scenario for which the previous (above) spec would cause "re-strengthening" 
behavior that would break assumptions that are allowed under the current spec: 
in the above/previously specified behavior an object V that is known to have no 
finalizers, but has e.g. 3 WeakReference objects that refer to it, can become 
weakly reachable while both a key referent object K and some ephemeron E (with 
key referent K and value referent V) remain strongly reachable. At such a point (V is weakly 
reachable, K and E are strongly reachable), the collector may determine weak 
reachability for V, [atomically] clear all weak references to V, and enqueue 
those weak reference objects on their respective queues. While V is still 
ephemerally reachable under your previous definition, there are no references 
to it anywhere other than in ephemeron value referent fields, and weak 
references that did refer to it have been cleared and queued. Since the 
ephemeron is still there, and the key is still there, and the ephemeron has not 
been cleared, an Ephemeron.getValue() call would create a strong reference to 
an object that was previously determined to not be weakly reachable. 
Re-creating a strong reference to V after the point where weak references to V 
were cleared and the weak refs to it were enqueued would be "surprising" to 
current weak reference based code (the only thing that could cause this under 
the current spec would be a finalizer), so allowing that (in the spec) is 
likely to break all kinds of logic that depends on currently spec'ed weak 
reference behaviors.
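
The scenario can be sketched in Java as follows. This is only an illustration under stated assumptions: ToyEphemeron is a hypothetical stand-in that simply holds its key and value in plain fields (no real Ephemeron class exists in the JDK), and the collector's action on the weak reference is simulated with explicit clear() and enqueue() calls:

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class RestrengtheningSketch {
    /** Hypothetical stand-in: NOT a real ephemeron, just a holder of two fields. */
    static final class ToyEphemeron<K, V> {
        private final K key;
        private final V value;

        ToyEphemeron(K key, V value) {
            this.key = key;
            this.value = value;
        }

        K getKey() { return key; }
        V getValue() { return value; }
    }

    /**
     * Returns true if a strong reference to V can still be obtained through
     * the ephemeron after the weak reference to V was cleared and enqueued.
     */
    static boolean demonstrate() {
        Object k = new Object();
        Object v = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> weakToV = new WeakReference<>(v, queue);
        ToyEphemeron<Object, Object> e = new ToyEphemeron<>(k, v);

        // Simulate what a collector may do under the earlier definition once
        // V is only weakly reachable while K and E stay strongly reachable.
        weakToV.clear();
        weakToV.enqueue();

        boolean cleared = weakToV.get() == null;
        boolean queued = queue.poll() == weakToV;

        // The ephemeron was not cleared, so getValue() hands out a strong
        // reference to V again: the "re-strengthening" described above.
        Object revived = e.getValue();
        return cleared && queued && revived != null;
    }

    public static void main(String[] args) {
        System.out.println("re-strengthened after clear+enqueue: " + demonstrate());
    }
}
```

Code that treats a dequeued WeakReference as proof that its referent is gone for good (absent finalizers) would be surprised by the last step.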

The spec'ed behavior we seem to be agreeing on (below) would prohibit this 
loophole and would [I think] maintain any reachability-based expectations that 
current weak-ref based logic can make under the current spec. Maintaining this 
continuity is an important design choice for adding Ephemerons into the current 
set of Reference behaviors.
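
As a rough illustration of the difference, here is a toy Java model of how the value referent's strength is derived under each set of rules. It involves no real collector; the names are hypothetical, and the earlier prototype is deliberately simplified (its separate "ephemerally reachable" level is omitted):

```java
// Toy model only; names are hypothetical and no real collector is involved.
final class EphemeronStrengthModel {
    /** Reachability strengths, weakest first, so compareTo() orders them. */
    enum Strength { UNREACHABLE, WEAK, SOFT, STRONG }

    private static Strength stronger(Strength a, Strength b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    /**
     * Agreed rules: the value is as strongly reachable as the stronger of
     * (the key's strength) and (the value's strength via any other path).
     */
    static Strength agreed(Strength key, Strength valueOtherwise) {
        return stronger(key, valueOtherwise);
    }

    /**
     * Earlier prototype, simplified: the path through the ephemeron gives the
     * value at most weak reachability, however strongly the key is reachable.
     */
    static Strength earlier(Strength key, Strength valueOtherwise) {
        Strength viaEphemeron =
                key.compareTo(Strength.WEAK) >= 0 ? Strength.WEAK : Strength.UNREACHABLE;
        return stronger(viaEphemeron, valueOtherwise);
    }

    public static void main(String[] args) {
        // K strongly reachable, V otherwise referenced only by WeakReferences.
        System.out.println("earlier: " + earlier(Strength.STRONG, Strength.WEAK));
        System.out.println("agreed:  " + agreed(Strength.STRONG, Strength.WEAK));
    }
}
```

In the problem scenario (K strongly reachable, V otherwise only weakly reachable) the earlier rules leave V weakly reachable, so its weak references may be cleared, while the agreed rules keep V strongly reachable for as long as K is.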

And since I suspect that all implementations will continue to choose to do the 
"determination" of soft and weak reachability at the same "point in time", this 
will fit well with how people would build this stuff anyway.

Separate note: It would be separately interesting to consider narrowing the 
SoftRef spec to require JVM implementations to atomically clear all soft *and* 
weak references to an object at the same time. I.e. if the garbage collector 
chooses to clear a soft reference to an object that would become weakly 
reachable as a result, then all weak references to that object must be 
[atomically] cleared at the same time. Since I suspect that all current JVM 
implementations actually adhere to this stronger requirement already, this 
would not "hurt" anything or require extra work to comply with. [Anyone from 
Metronome or some other non-STW reference processing implementations want to 
chime in?].

> but for Ephemeron's value this might be confusing. The easier to understand 
> conceptual model for Ephemerons might be a pair of (WeakReference<K>, 
> WeakReference<V>) where the key has a virtual strong reference to the value. 
> And this is what we get if we say that reachability of the value follows 
> reachability of the key.
> 
>> 
>> 
>> For a correctly specified behavior, I think all strengths (from strong down) 
>> need to be affected by key/value Ephemeron relationships, but without adding 
>> an "ephemerally reachable" strength. E.g. I think you fundamentally need 
>> something like this:
>> 
>> - "An object is <em>strongly reachable</em> if it can be reached by (a) some 
>> thread without traversing any reference objects, or by (b) traversing the 
>> value of an Ephemeron whose key is strongly reachable. A newly-created 
>> object is strongly reachable by the thread that created it"
>> 
>> - "An object is <em>softly reachable</em> if it is not strongly reachable 
>> but can be reached by (a) traversing a soft reference or by (b) traversing 
>> the value of an Ephemeron whose key is softly reachable.
>> 
>> - "An object is <em>weakly reachable</em> if it is neither strongly nor 
>> softly reachable but can be reached by (a) traversing a weak reference or by 
>> (b) traversing the value of an ephemeron whose key is weakly reachable.
> 
> ...and that's where we stop, because when we make Ephemeron just a special 
> kind of WeakReference, the next thing that happens is:
> 
>  * <p> Suppose that the garbage collector determines at a certain point in time
>  * that an object is <a href="package-summary.html#reachability">weakly
>  * reachable</a>.  At that time it will atomically clear all weak references to
>  * that object and all weak references to any other weakly-reachable objects
>  * from which that object is reachable through a chain of strong and soft
>  * references.  At the same time it will declare all of the formerly
>  * weakly-reachable objects to be finalizable.  At the same time or at some
>  * later time it will enqueue those newly-cleared weak references that are
>  * registered with reference queues.
> 
> ...where "clearing of the WeakReference" means resetting the key *and* value 
> to null in case it is an Ephemeron; and
> "all weak references to some object" means Ephemerons that have that object 
> as a key (but not those that only have it as a value!) in case of ephemerons
> 
> ...
>> I still think that Ephemeron<K, V> should extend WeakReference<K>, since 
>> that places already established rules and expectations on (a) when it will be 
>> enqueued, (b) when the collector will clear it (when the collector 
>> encounters the <K> key being weakly reachable), and (c) that clearing of all 
>> Ephemeron *and* WeakReference instances that share an identical key value is 
>> done atomically, along with (d) all weak references to any other 
>> weakly-reachable objects from which that object is reachable through a chain 
>> of strong and soft references. These last (c, d) parts are critically 
>> captured since an Ephemeron *is a* WeakReference, and the statement in 
>> WeakReference that says that "… it will atomically clear all weak references 
>> to that object and all weak references to any other weakly-reachable objects 
>> from which that object is reachable through a chain of strong and soft 
>> references." has a clear application.
>> 
>> Here are some suggested edits to the JavaDoc to go with this suggested 
>> spec'ed behavior:
>> /**
>>   * Ephemeron<K, V> objects are a special kind of WeakReference<K> objects, which
>>   * hold two referents (a key referent and a value referent) and do not prevent their
>>   * referents from being made finalizable, finalized, and then reclaimed.
>>   * In addition to the key referent, which adheres to the referent behavior of a
>>   * WeakReference<K>, an ephemeron also holds a value referent whose reachability
>>   * strength is affected by the reachability strength of the key referent:
>>   * The value referent of an Ephemeron instance is considered:
>>   * (a) strongly reachable if the key referent of the same Ephemeron
>>   * object is strongly reachable, or if the value referent is otherwise strongly reachable.
>>   * (b) softly reachable if it is not strongly reachable, and (i) the key referent of
>>   * the same Ephemeron object is softly reachable, or (ii) if the value referent is otherwise
>>   * softly reachable.
>>   * (c) weakly reachable if it is not strongly or softly reachable, and (i) the key referent of
>>   * the same Ephemeron object is weakly reachable, or (ii) if the value referent is otherwise
>>   * weakly reachable.
>>   * <p> When the collector clears an Ephemeron object instance (according to the rules
>>   * expressed for clearing WeakReference object instances), the Ephemeron instance's
>>   * key referent and value referent are simultaneously and atomically cleared.
>>   * <p> For convenience, the Ephemeron's referent is also called the key, and can be
>>   * obtained either by invoking {@link #get} or {@link #getKey} while the value
>>   * can be obtained by invoking the {@link #getValue} method.
>>   *...
> 
> 
> Thanks, this is very nice. I do like this behavior more.
> 
> Let me see what it takes to implement this strategy...
> 
> Regards, Peter
>
