On Fri, 6 Feb 2026 16:34:38 GMT, Roger Riggs <[email protected]> wrote:

> The encoded form is always bytes, so I don't think 'byte' needs to be in the
> name.

I'd be fine with getEncodedLength(Charset). The javadoc would specify that it's a length in bytes, so perhaps that's sufficient without including 'bytes' in the method name.

I do think that some callers might expect `getEncodedLength(UTF_16)` to return a length in code units and not bytes. There was some related discussion in [JDK-8372338](https://bugs.openjdk.org/browse/JDK-8372338) and also Maurizio's [Pulling the (foreign) string](https://cr.openjdk.org/~mcimadamore/panama/strings_ffm.html#reading-strings-with-known-length) doc.

> The discoverability of the method if placed as
> Charset.getEncodedLength(String) would be very low and would require
> cross-package hacking to gain the performance advantage.

For completeness, here's a demo of it in `CharsetEncoder` (https://github.com/openjdk/jdk/pull/29639). As expected it's possible to implement it that way and preserve equivalent performance, by adding a package visibility method to `String` and using `JavaLangAccess`.

With that change, `string.getByteLength(UTF_8)` could be expressed as:

    try {
      int byteLength = StandardCharsets.UTF_8.newEncoder()
          .onUnmappableCharacter(CodingErrorAction.REPLACE)
          .onMalformedInput(CodingErrorAction.REPLACE)
          .getByteLength(stringData);
    } catch (CharacterCodingException e) {
      throw new IllegalStateException(e);
    }

I can update the CSR to document this as an alternative.

> Should we also consider the inverse operation, that is to compute the length
> of a String had it been decoded from a sequence of bytes? Someone will
> eventually ask for this. I see some potential use case for it in the ZipFile
> implementation where knowing the length ahead of decoding could provide
> efficient rejection of strings without decoding and without looking at String
> contents.

What is the use-case for `decodedLength` in `ZipFile`?
Does 'efficient rejection of strings without decoding' require knowing the decoded length, or just whether the data is a valid encoding?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28454#issuecomment-3872766017
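
To make the UTF-16 point above concrete, here is a minimal sketch that uses only existing APIs. The class `EncodedLengthDemo` and the helper `encodedLengthInBytes` are hypothetical names for illustration, not part of the proposal. The helper encodes into a temporary array and measures it, which is the allocation a dedicated length method would presumably avoid.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedLengthDemo {
    // Hypothetical helper: baseline built on the existing String.getBytes(Charset).
    // It allocates the full encoded byte[] just to read its length.
    static int encodedLengthInBytes(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // 5 chars; U+00E9 takes 2 bytes in UTF-8

        System.out.println(s.length());                                        // 5 UTF-16 code units
        System.out.println(encodedLengthInBytes(s, StandardCharsets.UTF_8));    // 6
        System.out.println(encodedLengthInBytes(s, StandardCharsets.UTF_16LE)); // 10
        // The UTF_16 encoder prepends a 2-byte byte-order mark.
        System.out.println(encodedLengthInBytes(s, StandardCharsets.UTF_16));   // 12
    }
}
```

For `UTF_16` the byte count (12) matches neither the code-unit count (5) nor twice that, because of the byte-order mark, so the javadoc would likely need to spell out the unit regardless of the method name.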
