On Sat, 11 Mar 2023 15:57:53 GMT, Roman Kennke <rken...@openjdk.org> wrote:

> > Proposal for omitting the lockstack size check (at least 75% of the 
> > time):
> > 
> > * We know that Thread as well as grown lockstack backing buffers start at 
> > malloc-aligned boundaries. In practice this is 16 bytes (64-bit) or 4-8 
> > bytes (32-bit), so at the very least 4.
> > * Make the initial lockstack this size. Define it so that the initial slot 
> > stack starts at offset 0.
> > * Load the current slot pointer as you do now. Check the lowest 2 bits. If 
> > both are zero, take the slower path (load the current limit and compare 
> > against it, ...).
> > * If bit 0 or bit 1 is set, you can omit this check. You are done, since 
> > you have not yet reached the limit.
> > * You can expand this proposal to any alignment you like. You need to 
> > declare the lockstack slots with `alignof(X)`, and the compiler will take 
> > care that the _initial_ slot stack is always well aligned. As for larger 
> > slot stacks, we will have to allocate them in an aligned fashion using 
> > posix_memalign (we need this as an NMT-wrapped version, but that's trivial).
> 
> This would only work when pushing a single slot, right? Have you seen what 
> we're doing in the compiled (C1 and C2) paths (in x86_64 and aarch64)? There 
> we make a (conservative) estimate of how many lock slots are needed in the 
> method, check for enough slots once upon method entry, and then elide the 
> check altogether in the lock-enter implementation.
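
For readers not familiar with that scheme, here is a rough sketch of the check-once idea in plain C++. It is not the C1/C2 code; `DemoSlotStack`, `has_room_for` and `push_unchecked` are made-up names and the fixed capacity of 8 is an arbitrary assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-thread lock stack (not HotSpot code).
struct DemoSlotStack {
  static constexpr std::size_t kCapacity = 8;  // assumed, arbitrary
  void* slots[kCapacity];
  std::size_t top = 0;                         // index of the next free slot

  // Done once at method entry, with a conservative estimate of how many
  // locks the method may hold at the same time.
  bool has_room_for(std::size_t max_locks) const {
    return kCapacity - top >= max_locks;
  }

  // Lock-enter path: no capacity check, the entry check already covered it.
  void push_unchecked(void* obj) {
    assert(top < kCapacity);  // debug-only sanity check
    slots[top++] = obj;
  }

  void pop() {
    assert(top > 0);
    --top;
  }
};
```

A method that may hold up to three locks would call `has_room_for(3)` once on entry and then use `push_unchecked` for each lock it takes.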

Yeah, I just realized this myself. I started working on the template 
interpreter first, where we push single stack slots. There it may still make 
sense.
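
To make the single-slot case concrete, here is a minimal sketch of the alignment trick from the proposal, as I understand it. It is plain C++, not interpreter code; `DemoLockStack`, `kSlots` and `slow_checks` are names invented for this example, and the 4-slot size is an assumption. Note that the mask covers the whole 4-slot block, which corresponds to the lowest 2 bits of the slot index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed for illustration: 4 slots, block aligned to its own size in bytes.
constexpr std::size_t kSlots = 4;
constexpr std::uintptr_t kMask = kSlots * sizeof(void*) - 1;

// Hypothetical stand-in, not HotSpot's lock stack.
struct alignas(kSlots * sizeof(void*)) DemoLockStack {
  void* slots[kSlots];   // starts at offset 0, so it inherits the alignment
  void** top = slots;    // next free slot
  int slow_checks = 0;   // counts how often the limit compare was needed

  bool push(void* obj) {
    // Only when 'top' sits on a block boundary can it be equal to the
    // limit, so only then do we load and compare against the limit.
    if ((reinterpret_cast<std::uintptr_t>(top) & kMask) == 0) {
      ++slow_checks;
      if (top == slots + kSlots) {
        return false;    // full; real code would grow or inflate here
      }
    }
    *top++ = obj;        // fast path: no limit check
    return true;
  }
};

static_assert(offsetof(DemoLockStack, slots) == 0, "slots must be at offset 0");
```

With four slots, only the push at the block boundary takes the slow check; the other three skip it, which is where the 75% figure comes from.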

-------------

PR: https://git.openjdk.org/jdk/pull/10907
