On Thu, 8 Oct 2020 10:29:24 GMT, Maurizio Cimadamore <mcimadam...@openjdk.org> wrote:
>> This patch contains the changes associated with the third incubation round >> of the foreign memory access API incubation >> (see JEP 393 [1]). This iteration focus on improving the usability of the >> API in 3 main ways: >> * first, by providing a way to obtain truly *shared* segments, which can be >> accessed and closed concurrently from >> multiple threads >> * second, by providing a way to register a memory segment against a >> `Cleaner`, so as to have some (optional) guarantee >> that the memory will be deallocated, eventually >> * third, by not requiring users to dive deep into var handles when they >> first pick up the API; a new `MemoryAccess` class >> has been added, which defines several useful dereference routines; these >> are really just thin wrappers around memory >> access var handles, but they make the barrier of entry for using this API >> somewhat lower. >> >> A big conceptual shift that comes with this API refresh is that the role of >> `MemorySegment` and `MemoryAddress` is not >> the same as it used to be; it used to be the case that a memory address >> could (sometimes, not always) have a back link >> to the memory segment which originated it; additionally, memory access var >> handles used `MemoryAddress` as a basic unit >> of dereference. This has all changed as per this API refresh; now a >> `MemoryAddress` is just a dumb carrier which >> wraps a pair of object/long addressing coordinates; `MemorySegment` has >> become the star of the show, as far as >> dereferencing memory is concerned. You cannot dereference memory if you >> don't have a segment. This improves usability >> in a number of ways - first, it is a lot easier to wrap native addresses >> (`long`, essentially) into a `MemoryAddress`; >> secondly, it is crystal clear what a client has to do in order to >> dereference memory: if a client has a segment, it can >> use that; otherwise, if the client only has an address, it will have to >> create a segment *unsafely* (this can be done >> by calling `MemoryAddress::asSegmentRestricted`). A list of the API, >> implementation and test changes is provided >> below. If you have any questions, or need more detailed explanations, I >> (and the rest of the Panama team) will be >> happy to point at existing discussions, and/or to provide the feedback >> required. A big thank to Erik Osterlund, >> Vladimir Ivanov and David Holmes, without whom the work on shared memory >> segment would not have been possible; also I'd >> like to thank Paul Sandoz, whose insights on API design have been very >> helpful in this journey. Thanks Maurizio >> Javadoc: >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/javadoc/jdk/incubator/foreign/package-summary.html >> Specdiff: >> >> http://cr.openjdk.java.net/~mcimadamore/8254162_v1/specdiff/jdk/incubator/foreign/package-summary.html >> >> CSR: >> >> https://bugs.openjdk.java.net/browse/JDK-8254163 >> >> >> >> ### API Changes >> >> * `MemorySegment` >> * drop factory for restricted segment (this has been moved to >> `MemoryAddress`, see below) >> * added a no-arg factory for a native restricted segment representing >> entire native heap >> * rename `withOwnerThread` to `handoff` >> * add new `share` method, to create shared segments >> * add new `registerCleaner` method, to register a segment against a cleaner >> * add more helpers to create arrays from a segment e.g. `toIntArray` >> * add some `asSlice` overloads (to make up for the fact that now segments >> are more frequently used as cursors) >> * rename `baseAddress` to `address` (so that `MemorySegment` can implement >> `Addressable`) >> * `MemoryAddress` >> * drop `segment` accessor >> * drop `rebase` method and replace it with `segmentOffset` which returns >> the offset (a `long`) of this address relative >> to a given segment >> * `MemoryAccess` >> * New class supporting several static dereference helpers; the helpers are >> organized by carrier and access mode, where a >> carrier is one of the usual suspect (a Java primitive, minus `boolean`); >> the access mode can be simple (e.g. access >> base address of given segment), or indexed, in which case the accessor >> takes a segment and either a low-level byte >> offset,or a high level logical index. The classification is reflected in >> the naming scheme (e.g. `getByte` vs. >> `getByteAtOffset` vs `getByteAtIndex`). >> * `MemoryHandles` >> * drop `withOffset` combinator >> * drop `withStride` combinator >> * the basic memory access handle factory now returns a var handle which >> takes a `MemorySegment` and a `long` - from which >> it is easy to derive all the other handles using plain var handle >> combinators. >> * `Addressable` >> * This is a new interface which is attached to entities which can be >> projected to a `MemoryAddress`. For now, both >> `MemoryAddress` and `MemorySegment` implement it; we have plans, with >> JEP 389 [2] to add more implementations. Clients >> can largely ignore this interface, which comes in really handy when >> defining native bindings with tools like `jextract`. >> * `MemoryLayouts` >> * A new layout, for machine addresses, has been added to the mix. >> >> >> >> ### Implementation changes >> >> There are two main things to discuss here: support for shared segments, and >> the general simplification of the memory >> access var handle support. >> #### Shared segments >> >> The support for shared segments cuts in pretty deep in the VM. Support for >> shared segments is notoriously hard to >> achieve, at least in a way that guarantees optimal access performances. This >> is caused by the fact that, if a segment >> is shared, it would be possible for a thread to close it while another is >> accessing it. After considering several >> options (see [3]), we zeroed onto an approach which is inspired by an happy >> idea that Andrew Haley had (and that he >> reminded me of at this year OpenJDK committer workshop - thanks!). The idea >> is that if we could *freeze* the world >> (e.g. with a GC pause), while a segment is closed, we could then prevent >> segments from being accessed concurrently to a >> close operation. For this to work, it is crucial that no GC safepoints can >> occur between a segment liveness check and >> the access itself (otherwise it would be possible for the accessing thread >> to stop just right before an unsafe call). >> It also relies on the fact that hotspot/C2 should not be able to propagate >> loads across safepoints. Sadly, none of >> these conditions seems to be valid in the current implementation, so we >> needed to resort to a bit of creativity. First, >> we noted that, if we could mark so called *scoped* method with an >> annotation, it would be very simply to check as to >> whether a thread was in the middle of a scoped method when we stopped the >> world for a close operation (btw, instead of >> stopping the world, we do a much more efficient, thread-local polling, >> thanks to JEP 312 [4]). The question is, then, >> once we detect that a thread is accessing the very segment we're about to >> close, what should happen? We first >> experimented with a solution which would install an *asynchronous* exception >> on the accessing thread, thus making it >> fail. This solution has some desirable properties, in that a `close` >> operation always succeeds. Unfortunately the >> machinery for async exceptions is a bit fragile (e.g. not all the code in >> hotspot checks for async exceptions); to >> minimize risks, we decided to revert to a simpler strategy, where `close` >> might fail when it finds that another thread >> is accessing the segment being closed. As written in the javadoc, this >> doesn't mean that clients should just catch and >> try again; an exception on `close` is a bug in the user code, likely arising >> from lack of synchronization, and should >> be treated as such. In terms of gritty implementation, we needed to >> centralize memory access routines in a single >> place, so that we could have a set of routines closely mimicking the >> primitives exposed by `Unsafe` but which, in >> addition, also provided a liveness check. This way we could mark all these >> routines with the special `@Scoped` >> annotation, which tells the VM that something important is going on. To >> achieve this, we created a new (autogenerated) >> class, called `ScopedMemoryAccess`. This class contains all the main memory >> access primitives (including bulk access, >> like `copyMemory`, or `setMemory`), and accepts, in addition to the access >> coordinates, also a scope object, which is >> tested before access. A reachability fence is also thrown in the mix to make >> sure that the scope is kept alive during >> access (which is important when registering segments against cleaners). Of >> course, to make memory access safe, memory >> access var handles, byte buffer var handles, and byte buffer API should use >> the new `ScopedMemoryAccess` class instead >> of unsafe, so that a liveness check can be triggered (in case a scope is >> present). `ScopedMemoryAccess` has a >> `closeScope` method, which initiates the thread-local handshakes, and >> returns `true` if the handshake completed >> successfully. The implementation of `MemoryScope` (now significantly >> simplified from what we had before), has two >> implementations, one for confined segments and one for shared segments; the >> main difference between the two is what >> happens when the scope is closed; a confined segment sets a boolean flag to >> false, and returns, whereas a shared >> segment goes into a `CLOSING` state, then starts the handshake, and then >> updates the state again, to either `CLOSED` or >> `ALIVE` depending on whether the handshake was successful or not. Note that >> when a shared segment is in the `CLOSING` >> state, `MemorySegment::isAlive` will still return `true`, while the liveness >> check upon memory access will fail. #### >> Memory access var handles overhaul The key realization here was that if all >> memory access var handles took a >> coordinate pair of `MemorySegment` and `long`, all other access types could >> be derived from this basic var handle >> form. This allowed us to remove the on-the-fly var handle generation, and >> to simply derive structural access var >> handles (such as those obtained by calling `MemoryLayout::varHandle`) using >> *plain* var handle combinators, so that >> e.g. additional offset is injected into a base memory access var handle. >> This also helped in simplifying the >> implementation by removing the special `withStride` and `withOffset` >> combinators, which previously needed low-level >> access on the innards of the memory access var handle. All that code is now >> gone. #### Test changes Not much to see >> here - most of the tests needed to be updated because of the API changes. >> Some were beefed up (like the array test, >> since now segments can be projected into many different kinds of arrays). A >> test has been added to test the `Cleaner` >> functionality, and another stress test has been added for shared segments >> (`TestHandshake`). Some of the >> microbenchmarks also needed some tweaks - and some of them were also updated >> to also test performance in the shared >> segment case. [1] - https://openjdk.java.net/jeps/393 [2] - >> https://openjdk.java.net/jeps/389 [3] - >> https://mail.openjdk.java.net/pipermail/panama-dev/2020-May/009004.html [4] >> - https://openjdk.java.net/jeps/312 > > Maurizio Cimadamore has updated the pull request incrementally with one > additional commit since the last revision: > > Address review comments make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk line 145: > 143: SCOPE_MEMORY_ACCESS_TYPES := Byte Short Char Int Long Float Double > 144: $(foreach t, $(SCOPE_MEMORY_ACCESS_TYPES), \ > 145: $(eval $(call GenerateScopedOp,BIN_$t,$t))) This indent was fine at 2 spaces. I meant the one below inside the recipe. ------------- PR: https://git.openjdk.java.net/jdk/pull/548