Hi wenshao,
I think removing Compact Strings is a great idea! As you noted in your first
message, removing it would make String easier to maintain. Just so that everybody
here understands the issues, every string algorithm has THREE implementations:
1. compact strings enabled, using ISO Latin 1 coder
2. compact strings enabled, using UTF-16 coder
3. compact strings disabled
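
To make the three cases concrete, here is a minimal sketch of how a flag plus a per-string coder selects a code path. This is not the java.lang.String source: ToyString, its compactStrings constructor argument (a stand-in for the JVM-wide -XX:+/-CompactStrings setting), and storedBytes() are hypothetical names used only for illustration, and the sketch doesn't model the JIT intrinsics, which have their own variants.

```java
/** Illustrative only: a toy string that models the three cases listed above. */
public final class ToyString {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    private final byte[] value;
    private final byte coder;

    /** compactStrings is a hypothetical stand-in for the JVM-wide setting. */
    public ToyString(String s, boolean compactStrings) {
        boolean fitsLatin1 = s.chars().allMatch(c -> c <= 0xFF);
        if (compactStrings && fitsLatin1) {
            // Case 1: compact strings enabled, Latin-1 coder, one byte per char.
            value = new byte[s.length()];
            for (int i = 0; i < s.length(); i++) {
                value[i] = (byte) s.charAt(i);
            }
            coder = LATIN1;
        } else {
            // Case 2 (enabled, but some char needs 16 bits) and
            // case 3 (disabled): two bytes per char, big-endian here.
            value = new byte[s.length() * 2];
            for (int i = 0; i < s.length(); i++) {
                char c = s.charAt(i);
                value[2 * i] = (byte) (c >> 8);
                value[2 * i + 1] = (byte) c;
            }
            coder = UTF16;
        }
    }

    public int length() {
        return coder == LATIN1 ? value.length : value.length >> 1;
    }

    // Every operation has to branch on the coder like this.
    public char charAt(int i) {
        if (coder == LATIN1) {
            return (char) (value[i] & 0xFF);
        }
        int hi = value[2 * i] & 0xFF;
        int lo = value[2 * i + 1] & 0xFF;
        return (char) ((hi << 8) | lo);
    }

    /** Number of bytes in the backing array. */
    public int storedBytes() {
        return value.length;
    }
}
```

In this toy model, new ToyString("hello", true) stores 5 bytes while new ToyString("hello", false) stores 10. Each method that branches on the coder has to be correct, and tested, under both flag settings.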
In recent years I suspect our test coverage of the compact-strings disabled case is
lacking, as some bugs have occurred in only that case. For example, see JDK-8321514
<https://bugs.openjdk.org/browse/JDK-8321514>, JDK-8316879
<https://bugs.openjdk.org/browse/JDK-8316879>, JDK-8360271
<https://bugs.openjdk.org/browse/JDK-8360271>, JDK-8360255
<https://bugs.openjdk.org/browse/JDK-8360255>, JDK-8221430
<https://bugs.openjdk.org/browse/JDK-8221430>, etc. (Some of these have been fixed,
but some are still open.)
As Alan noted, however, we can't simply remove this case. We also can't simply
deprecate the command-line option; we need to deprecate the feature of running
without Compact Strings before we can remove that feature.
Compact Strings were introduced with JEP 254 <https://openjdk.org/jeps/254>. The JEP
doesn't mention that there is an option to disable compact strings, but the JVM
Guide
<https://docs.oracle.com/en/java/javase/25/vm/java-hotspot-virtual-machine-performance-enhancements.html#GUID-D2E3DC58-D18B-4A6C-8167-4A1DFB4888E4>
describes the Compact Strings feature and also the ability to disable it using the
-XX:-CompactStrings command line option. This section doesn't say much about when
you might want to disable the feature, though; it merely says "This feature can be
disabled if you observe performance regression issues in an application." Articles
like this one from Baeldung <https://www.baeldung.com/java-9-compact-string>, and
vendor documentation from IBM
<https://www.ibm.com/docs/en/sdk-java-technology/8?topic=options-xx-compactstrings>
also document this option, but they offer similarly vague advice.
Since the option is fairly well-known, it's not merely a matter of looking at the
status of the various ports (though those are significant, of course). It could be
that some installations out there are running with the option to disable compact
strings, perhaps because they encountered a performance regression, or for other
reasons. They'll
need to be informed that the feature is going away, and the best way to do that is
with a JEP.
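
For anyone wondering whether a given installation is affected, here is a rough sketch of how a running application could report its own setting. It assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean and a flag named CompactStrings; the class name CompactStringsCheck is made up for this example, and on other JVMs the lookup may simply fail, which the code treats as "unknown".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class CompactStringsCheck {

    /** Returns a human-readable description of the CompactStrings flag, if available. */
    public static String describe() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "CompactStrings: unknown (no HotSpot diagnostic bean)";
            }
            // Throws IllegalArgumentException if the JVM has no such flag.
            VMOption option = bean.getVMOption("CompactStrings");
            return "CompactStrings=" + option.getValue()
                    + " (origin: " + option.getOrigin() + ")";
        } catch (RuntimeException e) {
            // Assumption: any failure here means the flag cannot be queried on this JVM.
            return "CompactStrings: unknown (" + e.getClass().getSimpleName() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The origin part of the output is meant to distinguish a default value from one set explicitly on the command line, which is the case that matters for a deprecation.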
There are some additional issues to consider as well.
* As Alan noted, the ARM32 port has compact strings disabled by default. It's not
clear whether it even works if compact strings are enabled.
* Compact strings increase the storage requirements of CJK character data. Our
/assumption/ has been that even CJK-heavy applications use a lot of ASCII data
for config files, message headers, JSON, etc., and that compact strings are
still a net win for such applications. However, that's an assumption. There's
the possibility that some installations run those applications with compact
strings disabled.
* The JNI GetStringCritical call returns a direct pointer in the non-compact-strings
strings case but makes a copy when compact strings are enabled. Some
applications may suffer regressions because of this; see this Stack Overflow
<https://stackoverflow.com/questions/76913323/string-compact-has-introduced-some-performance-issues-for-the-current-jni-how>
question.
There are probably some other issues we haven't considered yet. The best way to
flush them out is to post a JEP, and then use other channels to publicize the JEP.
The JEP is mostly a formality about changing the official status of running in the
compact-strings-disabled mode to "deprecated". Even though it seems like a lot of
overhead to write a JEP for this, the fact is that many people in the tech press
look only at the list of JEPs for each release and not much else. And many Java
users look only at tech publications to keep up with Java; they don't look at GitHub
or follow the OpenJDK mailing lists. Thus, posting a JEP is the best chance we have
to reach a broad set of Java users, some of whom might be affected by this change.
Actual changes that go along with the deprecation will probably only involve adding
warning messages and possibly updating documentation. We don't need to resolve
issues like the ARM32 port yet. However, that will need to be resolved before we
actually remove the feature.
Since I'm "Dr Deprecator" I'll volunteer to draft the JEP.
s'marks
On 10/27/25 11:56 PM, Alan Bateman wrote:
On 28/10/2025 06:32, wenshao wrote:
Thanks to Alan for your feedback.
Based on Chen Liang's suggestion, I submitted a new draft PR
https://github.com/openjdk/jdk/pull/27995
to add a warning message to the CompactStrings option.
I think the first step has to be to establish what or who might be using
-XX:-CompactStrings in 2025. This means looking into the status of ports. Andrew
Haley is going to check with folks in IBM as some of the bug reports for the
-CompactStrings code paths come from ports there.
-Alan