RFR: 8331865: Consolidate size and alignment checks in LayoutPath

Maurizio Cimadamore Thu, 16 May 2024 04:04:49 -0700

When creating a nested memory access var handle, we ensure that the segment is 
accessed at the correct alignment for the root layout being accessed. But we do 
not ensure that the segment has at least the size of the accessed root layout. 
Example:



MemoryLayout LAYOUT = sequenceLayout(2, structLayout(JAVA_INT.withName("x"), 
JAVA_INT.withName("y")));
VarHandle X_HANDLE = LAYOUT.varHandle(PathElement.sequenceElement(), 
PathElement.groupElement("x"));


If I have a memory segment whose size is 12, I can successfully call X_HANDLE 
on it with index 1, like so:


MemorySegment segment = Arena.ofAuto().allocate(12);
int x = (int)X_HANDLE.get(segment, 0, 1);


This seems incorrect: after all, the provided segment doesn't have enough space 
to fit _two_ elements of the nested structs. 

This PR adds checks to detect this kind of scenario earlier, thus improving 
usability. To achieve this we could, in principle, just tweak 
`LayoutPath::checkEnclosingLayout` and add the required size check.

But doing so will add yet another redundant check on top of the other already 
added by [pull/19124](https://git.openjdk.org/jdk/pull/19124). Instead, this PR 
rethinks how memory segment var handles are created, in order to eliminate 
redundant checks.

The main observation is that size and alignment checks depend on an _enclosing_ 
layout. This layout *might* be the very same value layout being accessed (this 
is the case when e.g. using `JAVA_INT.varHandle()`). But, in the general case, 
the accessed value layout and the enclosing layout might differ (e.g. think 
about accessing a struct field).

Furthermore, the enclosing layout check depends on the _base_ offset at which 
the segment is accessed, _prior_ to any index computation that occurs if the 
accessed layout path has any open elements. It is important to notice that this 
base offset is only available when looking at the var handle that is returned 
to the user. For instance, an indexed var handle with coordinates:


(MemorySegment, long /* base */, long /* index 1 */, long /* index 2 */, long 
/* index 3 */)


Is obtained through adaptation, by taking a more basic var handle of the form:


(MemorySegment, long /* offset */)


And then injecting the result of the index multiplication into `offset`. As 
such, we can't add an enclosing layout check inside the var handle returned by 
`VarHandles::memorySegmentViewHandle`, as doing so will end up seeing the 
*wrong* offset (e.g. an offset in which the index part has already been mixed 
in).

The only possibility then, is to remove size and alignment checks from the 
*raw* var handles returned by `VarHandles::memorySegmentViewHandle`, and 
perform such checks outside (e.g. in `LayoutPath::dereferenceHandle`). The only 
checks left in the raw var handles are:

* an alignment check to make sure that the access mode selected by the client 
is really available - this alignment check is against the alignment of the 
value layout being selected, and not the enclosing layout alignment (which 
might be stricter)
* a read-only check, to make sure that write access modes are blocked if the 
accessed segment is read-only
* liveness/confinement checks, as mandated by ScopedMemoryAccess

Since these check depends on the particular access mode selected by the client, 
we can't move these checks away from the raw var handle.

These changes come with some consequences:

* Now it is always necessary to adapt a raw var handle, and to insert 
appropriate size and alignment checks (otherwise OOB access might be possible). 
As such, `ValueLayouts` also need to call the path layout variant of 
`MemoryLayout::varHandle`, to make sure that the raw var handle is adapted 
accordingly, before it is cached in its stable field.
* The var handle cache has been moved from `Utils` to 
`ValueLayouts::varHandle`. The cache is used (a) to reduce the number of var 
handle instances created and (b) to make sure that the cached var handle in the 
`ValueLayout` has stable identity. With respect to (a), while it makes sense to 
cache "toplevel" var handles (e.g. `JAVA_INT.varHandle()`) the cost/benefit 
ratio for caching nested var handles seem more unfavourable, as the same var 
handle can be reused with different enclosing layouts, leading to different 
checks. Ultimately, all nested var handles will require some kind of 
adaptation, so it doesn't seem too crucial to have a deeper level of caching 
here.
* The order in which exceptions are thrown might be slightly different. This is 
because the size/alignment checks now take place _before_ the raw var handle is 
even called. This caused a bunch of small updates in code and tests.
* It used to be possible to create a sequence layout with maximal size, like 
`MemoryLayout.sequenceLayout(Long.MAX_VALUE, JAVA_LONG)`, and derive an indexed 
var handle from that layout. Since there used to be no enclosing layout size 
check, access to the sequence element was allowed, as long as the index 
computation did not result in an offset outside the boundary of the accessed 
memory segment. This is now no longer the case: when selecting an element from 
the above layout, the implementation will make sure that the accessed segment 
indeed has the size of that sequence layout (which will probably lead to a 
`IndexOutOfBoundException`). To do indexed access on an unbounded sequence, the 
`MemoryLayout::arrayElementVarHandle` should be used instead (but this is the 
norm anyway for such cases).

-------------

Commit messages:
 - Revert commented lines in test
 - Simplify var handle code
 - Revert test changes
 - Cleanup spurious changes
 - Only use nested var handles
 - Add tests for nested layout size check
 - Drop spurious changes
 - cleanup javadoc
 - Fix benchmark
 - Cache both nested and toplevel var handles
 - ... and 7 more: https://git.openjdk.org/jdk/compare/61aff6db...236007c3

Changes: https://git.openjdk.org/jdk/pull/19251/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19251&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8331865
  Stats: 199 lines in 12 files changed: 73 ins; 37 del; 89 mod
  Patch: https://git.openjdk.org/jdk/pull/19251.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/19251/head:pull/19251

PR: https://git.openjdk.org/jdk/pull/19251

RFR: 8331865: Consolidate size and alignment checks in LayoutPath

Reply via email to