On Mon, 2 Nov 2020 11:59:09 GMT, Maurizio Cimadamore <mcimadam...@openjdk.org> wrote:
>> This patch contains the changes associated with the third incubation round >> of the foreign memory access API incubation (see JEP 393 [1]). This >> iteration focus on improving the usability of the API in 3 main ways: >> >> * first, by providing a way to obtain truly *shared* segments, which can be >> accessed and closed concurrently from multiple threads >> * second, by providing a way to register a memory segment against a >> `Cleaner`, so as to have some (optional) guarantee that the memory will be >> deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they >> first pick up the API; a new `MemoryAccess` class has been added, which >> defines several useful dereference routines; these are really just thin >> wrappers around memory access var handles, but they make the barrier of >> entry for using this API somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of >> `MemorySegment` and `MemoryAddress` is not the same as it used to be; it >> used to be the case that a memory address could (sometimes, not always) have >> a back link to the memory segment which originated it; additionally, memory >> access var handles used `MemoryAddress` as a basic unit of dereference. >> >> This has all changed as per this API refresh; now a `MemoryAddress` is just >> a dumb carrier which wraps a pair of object/long addressing coordinates; >> `MemorySegment` has become the star of the show, as far as dereferencing >> memory is concerned. You cannot dereference memory if you don't have a >> segment. This improves usability in a number of ways - first, it is a lot >> easier to wrap native addresses (`long`, essentially) into a >> `MemoryAddress`; secondly, it is crystal clear what a client has to do in >> order to dereference memory: if a client has a segment, it can use that; >> otherwise, if the client only has an address, it will have to create a >> segment *unsafely* (this can be done by calling >> `MemoryAddress::asSegmentRestricted`). >> >> A list of the API, implementation and test changes is provided below. If >> you have any questions, or need more detailed explanations, I (and the rest >> of the Panama team) will be happy to point at existing discussions, and/or >> to provide the feedback required. >> >> A big thank to Erik Osterlund, Vladimir Ivanov and David Holmes, without >> whom the work on shared memory segment would not have been possible; also >> I'd like to thank Paul Sandoz, whose insights on API design have been very >> helpful in this journey. >> >> Thanks >> Maurizio >> >> Javadoc: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to >> `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing >> entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments >> are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement >> `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns >> the offset (a `long`) of this address relative to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are >> organized by carrier and access mode, where a carrier is one of the usual >> suspect (a Java primitive, minus `boolean`); the access mode can be simple >> (e.g. access base address of given segment), or indexed, in which case the >> accessor takes a segment and either a low-level byte offset,or a high level >> logical index. The classification is reflected in the naming scheme (e.g. >> `getByte` vs. `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which >> takes a `MemorySegment` and a `long` - from which it is easy to derive all >> the other handles using plain var handle combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be >> projected to a `MemoryAddress`. For now, both `MemoryAddress` and >> `MemorySegment` implement it; we have plans, with JEP 389 [2] to add more >> implementations. Clients can largely ignore this interface, which comes in >> really handy when defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and >> the general simplification of the memory access var handle support. >> >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for >> shared segments is notoriously hard to achieve, at least in a way that >> guarantees optimal access performances. This is caused by the fact that, if >> a segment is shared, it would be possible for a thread to close it while >> another is accessing it. >> >> After considering several options (see [3]), we zeroed onto an approach >> which is inspired by an happy idea that Andrew Haley had (and that he >> reminded me of at this year OpenJDK committer workshop - thanks!). The idea >> is that if we could *freeze* the world (e.g. with a GC pause), while a >> segment is closed, we could then prevent segments from being accessed >> concurrently to a close operation. For this to work, it is crucial that no >> GC safepoints can occur between a segment liveness check and the access >> itself (otherwise it would be possible for the accessing thread to stop just >> right before an unsafe call). It also relies on the fact that hotspot/C2 >> should not be able to propagate loads across safepoints. >> >> Sadly, none of these conditions seems to be valid in the current >> implementation, so we needed to resort to a bit of creativity. First, we >> noted that, if we could mark so called *scoped* method with an annotation, >> it would be very simply to check as to whether a thread was in the middle of >> a scoped method when we stopped the world for a close operation (btw, >> instead of stopping the world, we do a much more efficient, thread-local >> polling, thanks to JEP 312 [4]). >> >> The question is, then, once we detect that a thread is accessing the very >> segment we're about to close, what should happen? We first experimented with >> a solution which would install an *asynchronous* exception on the accessing >> thread, thus making it fail. This solution has some desirable properties, in >> that a `close` operation always succeeds. Unfortunately the machinery for >> async exceptions is a bit fragile (e.g. not all the code in hotspot checks >> for async exceptions); to minimize risks, we decided to revert to a simpler >> strategy, where `close` might fail when it finds that another thread is >> accessing the segment being closed. >> >> As written in the javadoc, this doesn't mean that clients should just catch >> and try again; an exception on `close` is a bug in the user code, likely >> arising from lack of synchronization, and should be treated as such. >> >> In terms of gritty implementation, we needed to centralize memory access >> routines in a single place, so that we could have a set of routines closely >> mimicking the primitives exposed by `Unsafe` but which, in addition, also >> provided a liveness check. This way we could mark all these routines with >> the special `@Scoped` annotation, which tells the VM that something >> important is going on. >> >> To achieve this, we created a new (autogenerated) class, called >> `ScopedMemoryAccess`. This class contains all the main memory access >> primitives (including bulk access, like `copyMemory`, or `setMemory`), and >> accepts, in addition to the access coordinates, also a scope object, which >> is tested before access. A reachability fence is also thrown in the mix to >> make sure that the scope is kept alive during access (which is important >> when registering segments against cleaners). >> >> Of course, to make memory access safe, memory access var handles, byte >> buffer var handles, and byte buffer API should use the new >> `ScopedMemoryAccess` class instead of unsafe, so that a liveness check can >> be triggered (in case a scope is present). >> >> `ScopedMemoryAccess` has a `closeScope` method, which initiates the >> thread-local handshakes, and returns `true` if the handshake completed >> successfully. >> >> The implementation of `MemoryScope` (now significantly simplified from what >> we had before), has two implementations, one for confined segments and one >> for shared segments; the main difference between the two is what happens >> when the scope is closed; a confined segment sets a boolean flag to false, >> and returns, whereas a shared segment goes into a `CLOSING` state, then >> starts the handshake, and then updates the state again, to either `CLOSED` >> or `ALIVE` depending on whether the handshake was successful or not. Note >> that when a shared segment is in the `CLOSING` state, >> `MemorySegment::isAlive` will still return `true`, while the liveness check >> upon memory access will fail. >> >> #### Memory access var handles overhaul >> >> The key realization here was that if all memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could >> be derived from this basic var handle form. >> >> This allowed us to remove the on-the-fly var handle generation, and to >> simply derive structural access var handles (such as those obtained by >> calling `MemoryLayout::varHandle`) using *plain* var handle combinators, so >> that e.g. additional offset is injected into a base memory access var handle. >> >> This also helped in simplifying the implementation by removing the special >> `withStride` and `withOffset` combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now >> gone. >> >> #### Test changes >> >> Not much to see here - most of the tests needed to be updated because of the >> API changes. Some were beefed up (like the array test, since now segments >> can be projected into many different kinds of arrays). A test has been added >> to test the `Cleaner` functionality, and another stress test has been added >> for shared segments (`TestHandshake`). Some of the microbenchmarks also >> needed some tweaks - and some of them were also updated to also test >> performance in the shared segment case. >> >> [1] - https://openjdk.java.net/jeps/393 >> [2] - https://openjdk.java.net/jeps/389 >> [3] - https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html >> [4] - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one > additional commit since the last revision: > > Address comments from @AlanBateman test/jdk/java/foreign/TestCleaner.java line 8: > 6: * under the terms of the GNU General Public License version 2 only, as > 7: * published by the Free Software Foundation. Oracle designates this > 8: * particular file as subject to the "Classpath" exception as provided "Classpath exception" could be dropped? ------------- PR: https://git.openjdk.java.net/jdk/pull/548