On 28/04/2020 17:12, Peter Levart wrote:
Hi Maurizio,
I'm checking out the thread-confinement in the parallel stream case. I
see the Spliterator.trySplit() is calling AbstractMemorySegmentImpl's:
102 private AbstractMemorySegmentImpl asSliceNoCheck(long offset,
long newSize) {
103 return dup(offset, newSize, mask, owner, scope);
104 }
...so here the "owner" of the slice is still the same as that of
parent segment...
But then later in tryAdvance or forEachRemaining, the segment is
acquired/closed for each element of the stream (in case of tryAdvance)
or for the whole chunk to the end of spliterator (in case of
forEachRemaining). So some pipelines will be more optimal than others...
Not sure I follow here - you have to create a new segment for each
element of the stream since you don't know what thread is gonna process
it anyway no?
Maurizio
So I'm thinking. Would it be possible to "lazily" acquire scope just
once in tryAdvance and then re-use the scope until the end?
Unfortunately Spliterator does not have a close() method to be called
when the pipeline is done with it. Perhaps it could be added to the
API? This is not the 1st time I wished Spliterator had a close method.
I had a similar problem when trying to create a Spliterator with a
database backend. When using JDBC API a separate transaction
(Connection) is typically required for each thread of execution since
several frameworks bind it to the ThreadLocal.
WDYT?
Regards, Peter
On 4/23/20 10:33 PM, Maurizio Cimadamore wrote:
Hi,
time has come for another round of foreign memory access API
incubation (see JEP 383 [3]). This iteration aims at polishing some
of the rough edges of the API, and adds some of the functionalities
that developers have been asking for during this first round of
incubation. The revised API tightens the thread-confinement
constraints (by removing the MemorySegment::acquire method) and
instead provides more targeted support for parallel computation via a
segment spliterator. The API also adds a way to create a custom
native segment; this is, essentially, an unsafe API point, very
similar in spirit to the JNI NewDirectByteBuffer functionality [1].
By using this bit of API, power-users will be able to add support,
via MemorySegment, to *their own memory sources* (e.g. think of a
custom allocator written in C/C++). For now, this API point is called
off as "restricted" and a special read-only JDK property will have to
be set on the command line for calls to this method to succeed. We
are aware there's no precedent for something like this in the Java SE
API - but if Project Panama is to remain true about its ultimate goal
of replacing bits of JNI code with (low level) Java code, stuff like
this has to be *possible*. We anticipate that, at some point, this
property will become a true launcher flag, and that the foreign
restricted machinery will be integrated more neatly into the module
system.
A list of the API, implementation and test changes is provided below.
If you have any questions, or need more detailed explanations, I (and
the rest of the Panama team) will be happy to point at existing
discussions, and/or to provide the feedback required.
Thanks
Maurizio
Webrev:
http://cr.openjdk.java.net/~mcimadamore/8243491_v1/webrev
Javadoc:
http://cr.openjdk.java.net/~mcimadamore/8243491_v1/javadoc
Specdiff:
http://cr.openjdk.java.net/~mcimadamore/8243491_v1/specdiff/overview-summary.html
CSR:
https://bugs.openjdk.java.net/browse/JDK-8243496
API changes
===========
* MemorySegment
- drop support for acquire() method - in its place now you can
obtain a spliterator from a segment, which supports divide-and-conquer
- revamped support for views - e.g. isReadOnly - now segments have
access modes
- added API to do serial confinement hand-off
(MemorySegment::withOwnerThread)
- added unsafe factory to construct a native segment out of an
existing address; this API is "restricted" and only available if the
program is executed using the -Dforeign.unsafe=permit flag.
- the MemorySegment::mapFromPath now returns a MappedMemorySegment
* MappedMemorySegment
- small sub-interface which provides extra capabilities for mapped
segments (load(), unload() and force())
* MemoryAddress
- added distinction between *checked* and *unchecked* addresses;
*unchecked* addresses do not have a segment, so they cannot be
dereferenced
- added NULL memory address (it's an unchecked address)
- added factory to construct MemoryAddress from long value (result
is also an unchecked address)
- added API point to get raw address value (where possible - e.g.
if this is not an address pointing to a heap segment)
* MemoryLayout
- Added support for layout "attributes" - e.g. store metadata
inside MemoryLayouts
- Added MemoryLayout::isPadding predicate
- Added helper function to SequenceLayout to rehape/flatten
sequence layouts (a la NDArray [4])
* MemoryHandles
- add support for general VarHandle combinators (similar to MH
combinators)
- add a combinator to turn a long-VH into a MemoryAddress VH (the
resulting MemoryAddress is also *unchecked* and cannot be dereferenced)
Implementation changes
======================
* add support for VarHandle combinators (e.g. IndirectVH)
The idea here is simple: a VarHandle can almost be thought of as a
set of method handles (one for each access mode supported by the var
handle) that are lazily linked. This gives us a relatively simple
idea upon which to build support for custom var handle adapters: we
could create a VarHandle by passing an existing var handle and also
specify the set of adaptations that should be applied to the method
handle for a given access mode in the original var handle. The result
is a new VarHandle which might support a different carrier type and
more, or less coordinate types. Adding this support was relatively
easy - and it only required one low-level surgery of the lambda forms
generated for adapted var handle (this is required so that the
"right" var handle receiver can be used for dispatching the access
mode call).
All the new adapters in the MemoryHandles API (which are really
defined inside VarHandles) are really just a bunch of MH adapters
that are stitched together into a brand new VH. The only caveat is
that, we could have a checked exception mismatch: the VarHandle API
methods are specified not to throw any checked exception, whereas
method handles can throw any throwable. This means that, potentially,
calling get() on an adapted VarHandle could result in a checked
exception being thrown; to solve this gnarly issue, we decided to
scan all the filter functions passed to the VH combinators and look
for direct method handles which throw checked exceptions. If such MHs
are found (these can be deeply nested, since the MHs can be adapted
on their own), adaptation of the target VH fails fast.
* More ByteBuffer implementation changes
Some more changes to ByteBuffer support were necessary here. First,
we have added support for retrieval of "mapped" properties associated
with a ByteBuffer (e.g. the file descriptor, etc.). This is crucial
if we want to be able to turn an existing byte buffer into the "right
kind" of memory segment.
Conversely, we also have to allow creation of mapped byte buffers
given existing parameters - which is needed when going from (mapped)
segment to a buffer. These two pieces together allow us to go from
segment to buffer and back w/o losing any information about the
underlying memory mapping (which was an issue in the previous
implementation).
Lastly, to support the new MappedMemorySegment abstraction, all the
memory mapped supporting functionalities have been moved into a
common helper class so that MappedMemorySegmentImpl can reuse that
(e.g. for MappedMemorySegment::force).
* Rewritten memory segment hierarchy
The old implementation had a monomorphic memory segment class. In
this round we aimed at splitting the various implementation classes
so that we have a class for heap segments (HeapMemorySegmentImpl),
one for native segments (NativeMemorySegmentImpl) and one for memory
mapped segments (MappedMemorySegmentImpl, which extends from
NativeMemorySegmentImpl). Not much to see here - although one
important point is that, by doing this, we have been able to speed up
performances quite a bit, since now e.g. native/mapped segments are
_guaranteed_ to have a null "base". We have also done few tricks to
make sure that the "base" accessor for heap segment is sharply typed
and also NPE checked, which allows C2 to speculate more and hoist.
With these changes _all_ segment types have comparable performances
and hoisting guarantees (unlike in the old implementation).
* Add workarounds in MemoryAddressProxy, AbstractMemorySegmentImpl to
special case "small segments" so that VM can apply bound check
elimination
This is another important piece which allows to get very good
performances out of indexes memory access var handles; as you might
know, the JIT compiler has troubles in optimizing loops where the
loop variable is a long [2]. To make up for that, in this round we
add an optimization which allows the API to detect whether a segment
is *small* or *large*. For small segments, the API realizes that
there's no need to perform long computation (e.g. to perform bound
checks, or offset additions), so it falls back to integer logic,
which in turns allows bound check elimination.
* renaming of the various var handle classes to conform to "memory
access var handle" terminology
This is mostly stylistic, nothing to see here.
Tests changes
=============
In addition to the tests for the new API changes, we've also added
some stress tests for var handle combinators - e.g. there's a flag
that can be enabled which turns on some "dummy" var handle
adaptations on all var handles created by the runtime. We've used
this flag on existing tests to make sure that things work as expected.
To sanity test the new memory segment spliterator, we have wired the
new segment spliterator with the existing spliterator test harness.
We have also added several micro benchmarks for the memory segment
API (and made some changes to the build script so that native
libraries would be handled correctly).
[1] -
https://docs.oracle.com/en/java/javase/14/docs/specs/jni/functions.html#newdirectbytebuffer
[2] - https://bugs.openjdk.java.net/browse/JDK-8223051
[3] - https://openjdk.java.net/jeps/383
[4] -
https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html#numpy.reshape