On Thu, 17 Jul 2025 15:33:37 GMT, Raffaello Giulietti <rgiulie...@openjdk.org> wrote:
> What is the thinking when an `@IntrinsicCandidate` method invokes another > `@IntrinsicCandidate` method? Which part is responsible for the checks? This is a good question. Suppose IC1 calls IC2 and both are intrinsic candidates, and suppose that M1 and M2 are their checked "front doors". I think the answer has to be that, once you start executing IC1, you cannot expect any further checks. Probably some assembler macro implements IC2 and it may be called from more than one place. The tricky thing to prove is that all uses of IC2's intrinsic code, whether direct (via M2) or indirect (via things like M1) have adequate checks. If intrinsics are factored this way (as they are for string methods) I think that IC1 has to advertise that it calls IC2, so that the front door method M1 is responsible for validity checks for both IC1 and IC2. That is because after intrinsic expansion, IC2 is reached without going through M2; the entry was indirectly from M1. So M1 has to duplicate M2's front door checks. To make this workable, it may be that M2's front door checks are factored into a subroutine FD2, so that M1 can refer to FD2, rather than do risky code duplication. If (as in this case) IC2 loops over calls to IC1, then perhaps M2 should have a companion method FD2R which checks a range a range of inputs to IC2, so that M1 can call FD2R. If all goes well, then FD2R has a range check that duplicates the front door logic of M1, so that the JIT can remove the duplicate checking. In the case of `StringUTF16.getChar`, I see it is marked as trusted, and it does not have a front-door method, and does have many callers. In the terms of this PR, perhaps it should be renamed `getChar0` (or the like) to make it more clear (at non-local use points) that it must be called from trusted code. Perhaps it should also have a range check method associated with it, so that some callers can use that range check method, so that the non-local responsibility is more clearly fulfilled. Maybe some callers (if less performance critical) should be changed to call a properly checked front-door method, `getChar` (as opposed to `getChar0`). Remaining callers of `getChar0` should be clearly linked to the front-door checks required by `getChar0`. The above seems to be the principled way to deal with an unchecked intrinsic called from many trusted use sites. The basic idea is that every trusted use site should reaffirm its responsibility locally, not just hope that a non-local assert will catch a bug. We want some kind of reviewable (static/local) proof that each use site (of an unchecked private intrinsics) has correct checks. Some examples: A new front-door `getChar` method can be used in less important places like `AbstractStringBuilder::getChar`. In trusted loops over `getChar` like `String::encodeASCII`, the loop containing `getChar` can be prefaced by a range check which is batched for all the loop iterations, something like `StringUTF16.getCharChecks(val, 0, len)`. The same pattern occurs in `String::encode8859_1` and `encodeUTF8_UTF16` and `computeSizeUTF8_UTF16` and maybe elsewhere. The `val` reference and limit variable `len` or `sl` should be marked `final` to ensure that the batched range check remains correct (because it should not take loop-variant inputs). As I read through `String.java` I see that a batched range check would cover a lot of use cases… I haven't read though all the uses of `getChar`, however. The intrinsic `encodeISOArray0` (was `implEncodeISOArray`) calls `getChar`. This is an example where its front door method (now `encodeISOArray` with no "0") should call a batched check method like `getCharBatchChecks`. Let's look at this in detail. The `getCharBatchChecks` method could look like this: //non-public void getCharBatchChecks(byte[] val, int charStart, int charSize) { Objects.requireNonNull(val, "val"); // *** what style guide mandates this line?? Preconditions.checkFromIndexSize(charStart << 1, charSize << 1, val.length, Preconditions.AIOOBE_FORMATTER); // *** using "char" in the names helps reduce confusion from the mix of byte and char indexes } … static int encodeISOArray(…) { … StringUTF16.getCharBatchChecks(sa, sp, len); // next method loops over getChar(sa, sp++) return encodeISOArray0(sa, sp, da, dp, len); } Note that after inlining, the batch checks exactly match pre-existing checks for the caller intrinsic. Perhaps the caller's checks could be removed manually, or perhaps the JIT removes the duplication. Actually, I think you got this documentation wrong: @param sp the index of the <em>byte (not character!)</em> from the source array to start reading from AFAICT, `sp` is a char index; note that `getChar` scales it as `(index=sp)<<1`. Note that `getChar` has zero javadoc, so you are left to guess helplessly about its index operand. This stuff is complicated to get right. The above exercise in wiring up the checking logic tends to uncover bugs and misconceptions, I think. ------------- PR Comment: https://git.openjdk.org/jdk/pull/25998#issuecomment-3091231708