On 06/09/16 19:04, Xueming Shen wrote:
> On 9/6/16, 10:09 AM, Tim Ellison wrote:
>> Has it been noted that while JEP 254 reduces the space occupied by one
>> byte per character strings, moving from a char[] to byte[]
>> representation universally means that the maximum length of a UTF-16
>> (two bytes per char) string is now halved?

Hey Sherman,

> Yes, it's a known "limit" given the nature of the approach. It is
> not considered to be an "incompatible change", because the max length
> the String class and the corresponding buffer/builder classes can
> support is really an implementation details, not a spec requirement.

Don't confuse spec compliance with compatibility.

Of course, the JEP should not break the formally specified behavior of
String etc., but the goal was to ensure that the implementation be
compatible with prior behavior. As you know, there are many places where
compatible behavior beyond the spec is important to maintain.

> The conclusion from the discussion back then was this is something we
> can trade off for the benefits we gain from the approach.

Out of curiosity, where was that? I did search for previous discussion
of this topic but didn't see it -- it may be just my poor search-fu.

> Do we have a real use case that impacted by this change?

People stash all sorts of things in (immutable) Strings. Reducing the
limits in JDK 9 seems like a regression.

Was there any consideration to using the older Java 8 StringCoding APIs
for UTF-16 strings (already highly perf tuned) and adding additional
methods for compact strings, rather than rewriting everything as byte[]s?

Regards,
Tim

>> Since the goal is "preserving full compatibility", this has been missed
>> by failing to allow for UTF-16 strings of length greater than
>> Integer.MAX_VALUE / 2.
>>
>> Regards,
>> Tim
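
The halving discussed above comes down to simple arithmetic, which can
be sketched as follows. This is only an illustration: the class name
CompactStringLimit and the helper maxChars are hypothetical, and it
assumes a backing array may hold up to Integer.MAX_VALUE elements (real
VMs cap array sizes slightly below that), so the figures are upper
bounds rather than exact limits.

```java
public class CompactStringLimit {

    // Hypothetical helper: the largest number of chars that fits in one
    // byte[] when each char occupies bytesPerChar bytes. Assumes the array
    // can have up to Integer.MAX_VALUE elements, which is an upper bound.
    static int maxChars(int bytesPerChar) {
        return Integer.MAX_VALUE / bytesPerChar;
    }

    public static void main(String[] args) {
        // char[] backing (pre-JEP 254): one array element per char.
        int charArrayLimit = Integer.MAX_VALUE;

        // byte[] backing (JEP 254): one byte per char for one-byte strings,
        // two bytes per char for UTF-16 strings.
        int oneByteLimit = maxChars(1);
        int utf16Limit = maxChars(2);

        System.out.println("char[] backing, any string:   " + charArrayLimit);
        System.out.println("byte[] backing, 1 byte/char:  " + oneByteLimit);
        System.out.println("byte[] backing, UTF-16:       " + utf16Limit);
    }
}
```

Under these assumptions the UTF-16 bound is Integer.MAX_VALUE / 2, i.e.
1073741823 chars, while one-byte-per-char strings keep the old bound.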