Could you please look at the updated version at [0]?

The new API for ReferenceQueue still targets the problem of
synchronizing Reference handling with CRaC.  An example is a Java
object that becomes unreachable just before the checkpoint, where an
associated Reference needs to be processed to release some native
resource.  Creating an image of the VM with that native resource still
linked is both unsafe (the native resource may not exist at restore --
the CRaC VM does its best to prevent a successful checkpoint in this
case) and inefficient, as every restored instance will repeat the same
processing of the same Reference that was captured in the image.  So we
need to ensure Reference processing is complete before the checkpoint.
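
To make the scenario concrete, here is a minimal sketch of such a
Reference.  All names (NativeHandleRef, NativeRegistry) are
hypothetical and the "native resource" is just a number in a set; this
is not code from the PR.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical Reference that remembers which native handle its referent owned. */
final class NativeHandleRef extends PhantomReference<Object> {
    final long handle;

    NativeHandleRef(Object owner, ReferenceQueue<Object> queue, long handle) {
        super(owner, queue);
        this.handle = handle;
    }
}

/** Hypothetical registry; the "native" resources are only numbers in a set. */
final class NativeRegistry {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the Reference objects themselves reachable until they are processed.
    private final Set<NativeHandleRef> refs = new HashSet<>();
    private final Set<Long> openHandles = new HashSet<>();

    NativeHandleRef register(Object owner, long handle) {
        NativeHandleRef ref = new NativeHandleRef(owner, queue, handle);
        refs.add(ref);
        openHandles.add(handle);
        return ref;
    }

    /** Processes every Reference currently in the queue; returns how many were released. */
    int releasePending() {
        int released = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            NativeHandleRef handleRef = (NativeHandleRef) ref;
            openHandles.remove(handleRef.handle); // stands in for freeing the native resource
            refs.remove(handleRef);
            released++;
        }
        return released;
    }

    int openHandleCount() {
        return openHandles.size();
    }
}
```

If the checkpoint happened after the Reference was enqueued but before
releasePending() ran, the image would still contain the open handle,
and every restored instance would have to release it again.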

For processing done by a thread (or a set of threads), the change
provides an updated API that waits until a given number of threads are
blocked on the Queue awaiting References.  This ensures those threads
have finished processing References from that Queue.
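
The idea can be sketched with a plain blocking queue instead of
ReferenceQueue.  CountingQueue and awaitWaiters below are hypothetical
names for illustration only, not the API from the PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a queue that can report blocked consumers. */
final class CountingQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private int waiters; // threads currently inside remove()

    synchronized void add(T item) {
        items.addLast(item);
        notifyAll();
    }

    /** Blocks until an item is available; the caller counts as a waiter meanwhile. */
    synchronized T remove() throws InterruptedException {
        waiters++;
        notifyAll(); // let awaitWaiters() re-check its condition
        try {
            while (items.isEmpty()) {
                wait();
            }
            return items.removeFirst();
        } finally {
            waiters--;
        }
    }

    /** Blocks until the queue is empty and at least `expected` threads wait in remove(). */
    synchronized void awaitWaiters(int expected) throws InterruptedException {
        while (!items.isEmpty() || waiters < expected) {
            wait();
        }
    }
}

final class CountingQueueDemo {
    /** Returns how many items the worker had processed when awaitWaiters() returned. */
    static int run(int itemCount) {
        CountingQueue<String> queue = new CountingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    queue.remove();
                    processed.incrementAndGet(); // stands in for Reference processing
                }
            } catch (InterruptedException e) {
                // interrupted by the demo: stop the worker
            }
        });
        worker.setDaemon(true);
        worker.start();
        try {
            for (int i = 0; i < itemCount; i++) {
                queue.add("item-" + i);
            }
            queue.awaitWaiters(1);
            return processed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            worker.interrupt();
        }
    }
}
```

Because the single worker only re-enters remove() after it has finished
the previous item, seeing it blocked on an empty queue implies all
items added so far were processed.  The guarantee only holds at the
moment awaitWaiters() returns.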

> Once the method returns then there is no guarantee that the number
> of waiters hasn't changed, but I think you know that

I hoped to guarantee all Queues are empty by waiting for a sufficient
number of waiters on each Queue, in the order in which Queues pass
References to each other (for a single thread). But even there, I now
see that handling a Reference later in the order may make another one
pending, filling up a Queue that was supposed to be empty.
For a strong guarantee that all Queues are empty, some sort of
iteration may be required, one that checks that no Queue has received
a new Reference since the last check.
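
A rough, single-threaded sketch of such an iteration follows.
TrackedQueue, Quiescence and the maxPasses bound are my assumptions
for illustration; nothing like this is in the PR.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical queue of handlers that counts every enqueue it has ever seen. */
final class TrackedQueue {
    private final Deque<Runnable> pending = new ArrayDeque<>();
    private long enqueued;

    void enqueue(Runnable handler) {
        pending.addLast(handler);
        enqueued++;
    }

    long enqueuedCount() {
        return enqueued;
    }

    boolean isEmpty() {
        return pending.isEmpty();
    }

    /** Runs only the handlers that were pending when the call started. */
    void processPending() {
        int count = pending.size();
        for (int i = 0; i < count; i++) {
            pending.removeFirst().run(); // a handler may enqueue into any queue
        }
    }
}

final class Quiescence {
    /**
     * Repeats full passes over all queues until one pass sees no new enqueue.
     * Returns the number of passes; gives up after maxPasses, since the
     * processing is not guaranteed to converge.
     */
    static int awaitQuiescent(List<TrackedQueue> queues, int maxPasses) {
        for (int pass = 1; pass <= maxPasses; pass++) {
            long before = totalEnqueued(queues);
            for (TrackedQueue queue : queues) {
                queue.processPending();
            }
            if (totalEnqueued(queues) == before) {
                return pass; // nothing new arrived, so every queue is empty
            }
        }
        throw new IllegalStateException("no quiescence after " + maxPasses + " passes");
    }

    private static long totalEnqueued(List<TrackedQueue> queues) {
        long total = 0;
        for (TrackedQueue queue : queues) {
            total += queue.enqueuedCount();
        }
        return total;
    }
}
```

Comparing enqueue counters rather than only checking emptiness is what
catches a Queue that was refilled after it had already been visited in
the same pass.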

Processing of a single Reference may enqueue an arbitrary number of
further References. More formally, ReferenceQueues and their
processors form a directed graph, in which nodes are Queues and an edge
is the relation "handling a Reference from the source may enqueue
another Reference into the destination". Edges are defined by the
processing code, not by data. The graph can have an arbitrary form,
e.g. there can be cycles, so Reference processing need not even
converge.

So the only reasonable way to get Reference processing quiescent is to
ensure References for each Queue are processed (by calling the new API),
in the order in which Queues may receive References.
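
When the graph has no cycles, that order is a topological order of the
Queues.  A sketch of computing it, with Queues identified by
hypothetical string names and edges supplied by hand (QueueOrder is
not part of the PR):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class QueueOrder {
    /**
     * edges maps a source Queue to the Queues its handlers may enqueue into.
     * Returns an order in which every source precedes its destinations,
     * or Optional.empty() if the graph has a cycle and no such order exists.
     */
    static Optional<List<String>> processingOrder(Map<String, List<String>> edges) {
        // Count incoming edges for every Queue mentioned as source or destination.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            inDegree.putIfAbsent(entry.getKey(), 0);
            for (String destination : entry.getValue()) {
                inDegree.merge(destination, 1, Integer::sum);
            }
        }

        // Kahn's algorithm: repeatedly take a Queue that nothing else feeds.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.addLast(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String queue = ready.removeFirst();
            order.add(queue);
            for (String destination : edges.getOrDefault(queue, List.of())) {
                if (inDegree.merge(destination, -1, Integer::sum) == 0) {
                    ready.addLast(destination);
                }
            }
        }

        // Queues left unordered are on, or downstream of, a cycle.
        return order.size() == inDegree.size() ? Optional.of(order) : Optional.empty();
    }
}
```

For a cyclic graph the method returns empty, which corresponds to the
case where a single ordered pass cannot be enough.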

I think a public API is needed, as users may have the same problem as
we do. But the current code does not support this (we need to allow
user code to run after JDK Queues are emptied).

The API now fully supports being called from user code.  Each
invocation of the new API ensures all unreachable objects are
discovered and pushed to a Queue before the Queue is checked for
pending References and the number of waiting threads.

> At a high level it should be okay to provide a JDK-internal way to
> await quiescent. You've added it as a public API which might be okay
> for the current exploration but I don't think it would be exposed in
> its current form.

The new ReferenceQueue API has been moved into the jdk.crac.* package,
to avoid polluting the Java API of CRaC EA builds, which are based on
JDK 17 for now.

[0] https://github.com/openjdk/crac/pull/22
[1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/WeakHashMap.java#L361

Thanks,
Anton
