Re: RFR(L): JDK-8046936 : JEP 270: Reserved Stack Areas for Critical Sections

David Holmes Mon, 16 Nov 2015 17:02:16 -0800

Hi Fred,

On 11/11/2015 3:38 AM, frederic parain wrote:

Hi David and Doug,


Thank you for your feedback.

I put some comments below.

On 09/11/2015 08:12, David Holmes wrote:

Hi Doug,

On 9/11/2015 9:31 AM, Doug Lea wrote:

On 11/06/2015 02:23 AM, David Holmes wrote:

Before I look at the code, what is the status of the "Open Issues"
referenced in the JEP?

Also the JDK changes need to be reviewed on core-libs-dev and in
particular must
be seen by the jsr166 maintainers (ie Doug Lea and Martin Buchholz)


Martin and I and others are aware of this work.

As has been said by everyone initially looking at JEP 270, it is a
disappointing effort. It requires that APIs/methods supporting
try/finally-like constructions (especially locks) include new
annotations to substantially reduce the likelihood of failing to
perform the "finally" side actions to release resources (like a lock)
upon StackOverflowError.

Considering that no other readily implementable solution has
been proposed during the many years that this issue has been discussed,
a disappointing band-aid might not be so bad.


If I put on some extra-strength rose-coloured glasses I think I can
almost read that as "something is better than nothing". ;-) As there are
no practical general solutions to the problem (which surely says
something fundamental about the language design!) an annotation-based
point solution seems the only way to make some progress.

But note that it is not the finally part that need be at issue. If one
considers ReentrantLock.lock() (in NonfairSync):

  210         final void lock() {
  211             if (compareAndSetState(0, 1))
  212                 setExclusiveOwnerThread(Thread.currentThread());
  213             else
  214                 acquire(1);
  215         }

if we throw after #211 the lock is half-locked: state says it is locked,
but no owner set. So it can't be locked again, nor can it be unlocked.
We would have to determine actual stack usage for each call path to know
whether that is in fact possible

Assuming the hotspot mechanics are put into place, the main question
is when to use the  @ReservedStackAccess annotation.

The JEP singles out ReentrantLock. But there are also other locks
provided in JDK (StampedLock, ReentrantReadWriteLock), as well
as in other external packages, that should probably use this.
Plus there are other lock/resource APIs, for example Semaphore,
that are often used in try/finally constructions. And internal
before/after bookkeeping inside Executors. Plus, many cases of
try-with-resources constructions. Plus many transactional APIs.

And so on. It would be a never-ending effort to effectively use
@ReservedStackAccess.


Which, to me, is just another way of saying that the general problem is
intractable. The annotation at least gives the mechanism for a point
solution as has been said. The pain point that drove this was the use of
ReentrantLock in ConcurrentHashMap, which in turn was used in class
loading. That particular pain point has been addressed by not using the
problematic class - something we surely do not want to promote as the
way to deal with this problem!

I don't know of a good way to address this. One possibility is
for javac to automatically insert @ReservedStackAccess in any
method with a try-finally that involves at least one call?


I don't see how that would be at all viable as it brings back in the
sizing problem - or greatly exacerbates the sizing problem. Plus as
noted above the problem is not just with the finally part.


Another issue is that writing a rule for javac that is not
overpessimistic is also an intractable problem. The pattern
"atomic operation followed by method invocation to complete
status update" is too general to be the trigger of the annotation.
1) It would lead to an excessive number of methods being annotated,
and by consequence an over-sizing of the reserved area.
2) Another condition to hit the bug is that the stack space of
the callee method must be bigger than the caller method with
the atomic operation. This information depends heavily on runtime
information like HW, compilation policies (inlining) and execution
profile (to know which methods are going to be compiled first).
javac won't have access to these information to annotated methods.

Right - there is no way to know that the atomic operation followed bythe call should really be a single atomic operation that we can'timplement that way. Given many atomic operations have subsequent codeactions, the pattern would degenerate to simply involve the atomicoperation, and that wouldn't work well at all.

Further, it isn't always that pattern. Consider for exampleReentrantReadWriteLock.sync.tryRelease():


            int nextc = getState() - releases;
            boolean free = exclusiveCount(nextc) == 0;
            if (free)
                setExclusiveOwnerThread(null);
            setState(nextc);

here we clear the owner first, but then may fail to set the new state.Actually detecting where a stackoverflow can lead to inconsistent staterequires detailed scrutiny of the code. I would like to be able toreason that setState requires no more stack thansetExclusiveOwnerThread, but I can't do that simply by code inspection.

Perhaps we should simply start with j.u.c.locks as the initial candidate
sets of classes and work out from there. The idiomatic Lock usage:

l.lock();
try {
  // ...
} finally {
   l.unlock();
}

epitomizes the critical-finally-block scenario (though the lock()
problem is much more subtle). AQS is too low-level to flag any specific
function I think; and the other synchronizers perhaps too high-level,
with fewer idiomatic use-cases that obviously benefit from this.

In that regard I don't agree with where Fred has currently placed the
annotation in ReentrantLock and AQS. The placements appear arbitrary at
first glance - though no doubt they are the result of a deep and careful
analysis. But it is far from obvious why the annotation is placed on
NonfairSync.lock but FairSync.tryAcquire(), for example.


The annotation is used on methods with the problematic pattern
"atomic operation followed by method invocation".
Their execution could lead to data corruption if atomic operation
is executed successfully but the following method invocation
fails because of SOE.

In the NonFairSync class, this pattern is in the lock() method,
while the tryAcquire() is only a method invocation. So lock()
is annotated and tryAcquire() is not. Note that the method
invoked by try acquire is nonfairTryAcquire() which has the
problematic pattern and is annotated.

In the FairSync class, the situation is reversed: lock() is
just an method invocation and it is not annotated, and
tryAcquire() has the problematic pattern and is annotated.

I tried to put the annotation as close as possible to the
critical sections in order to minimize the size requirement
for the reserved area.

Okay I now understand the rule you were applying. But it isn't obviousfrom the code. Further, there seem to be other places which could alsosuffer serious problems. In AQS doReleaseShared() we have:


 720                     if (!h.compareAndSetWaitStatus(Node.SIGNAL, 0))
 721                         continue;            // loop to recheck cases
 722                     unparkSuccessor(h);

which could seemingly throw SOE without unparking the successor.

Any compound action that logically forms a "transaction" would need tobe immune to SOE.

I would be tempted to place them on all the public lock/unlock methods
of the Lock implementations, rather than on the internal AQS-derived
functions, and AQS itself.


I've tried that :-) The amount of code being executed with
the ReservedStackAccess privilege tends to increase very
quickly and I was concerned about keeping the size of the
reserved area small.

How much space does each level of calling need? I know that is hard toanswer but are we talking a few bytes, few kb, many kb?

I'd be looking at expanding the current application of the annotation tocover all affected areas of AQS, AQLS, ReentrantLock andReeentrantReadWriteLock. StampedLock is a bit more problematic - thereit seems we do need to annotate the public API - but if we do it thenj.u.c.locks is at least covered.


Thanks,
David

Fred

On 6/11/2015 12:17 AM, Frederic Parain wrote:

Please review the changesets for JEP 270: Reserved Stack Areas for
Critical Sections

CR: https://bugs.openjdk.java.net/browse/JDK-8046936

Webrevs:
   hotspot:
http://cr.openjdk.java.net/~fparain/8046936/webrev.00/hotspot/
   JDK: http://cr.openjdk.java.net/~fparain/8046936/webrev.00/jdk/

The CR contains a detailed description of the issue and the design
of the solution as well as implementation details.

Tests:

     JPRT - hotspot & core
     RBT - vm.quick

Thanks

Fred

Re: RFR(L): JDK-8046936 : JEP 270: Reserved Stack Areas for Critical Sections

Reply via email to