On Sep 6, 2016, at 12:58 PM, Charles Oliver Nutter <head...@headius.com> wrote:
>
> On Tue, Sep 6, 2016 at 1:04 PM, Xueming Shen <xueming.s...@oracle.com>
> wrote:
>
>> Yes, it's a known "limit" given the nature of the approach. It is not
>> considered to be an "incompatible change", because the max length the
>> String class and the corresponding buffer/builder classes can support is
>> really an implementation details, not a spec requirement. The conclusion
>> from the discussion back then was this is something we can trade off for
>> the benefits we gain from the approach.
>> Do we have a real use case that impacted by this change?
>>
>
> Well, doesn't this mean that any code out there consuming String data
> that's longer than Integer.MAX_VALUE / 2 will suddenly start failing on
> OpenJDK 9?
>
> Not that such a case is a particularly good pattern, but I'm sure there's
> code out there doing it. On JRuby we routinely get bug reports complaining
> that we can't support strings larger than 2GB (and we have used byte[] for
> strings since 2006).
>
> - Charlie

The most basic scale requirement for strings is that they support class-file constants, which top out at a UTF8-length of 2**16 - 1 bytes. Lengths beyond that, up to the limit of the 'int' return value of String::length, are less well specified.

For the record, we could have chosen char[], int[], or long[] (rather than byte[]) as the backing store for string data. With long[] we could have strings above 4G-chars. But it would have come with a performance tax, since the T[].length field would need to be combined with an extra bit or two (from a flag byte) to complete the length. That's 2-3 extra instructions for loading a string length, or else a redundant length field. So it's a trade-off.

Likewise, choosing a third format deepens the branch depth needed to get to the payload. Likewise, making the second format (of two) carry a length field embedded in the payload section requires a conditional load or branch in order to load the string length. Again, more instructions.

The team has looked at 20 possibilities like these. The current design is fastest. I hope it flies.

-- John
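
To make the length arithmetic in this thread concrete, here is a minimal Java sketch. It is not the JDK source: the class name, the constants, and the flag-byte layout of the long[] variant are all assumptions made for illustration. It shows why a byte[] store with a two-byte format caps the length near Integer.MAX_VALUE / 2, and why a long[] store would need a few extra operations to recover the length.

```java
public class CompactLengthSketch {
    // Hypothetical coder values: the shift that turns a byte count into a char count.
    static final int LATIN1 = 0; // 1 byte per char
    static final int UTF16 = 1;  // 2 bytes per char

    // byte[] store: the length is a single shift of the array length.
    static int length(int byteCount, int coder) {
        return byteCount >> coder;
    }

    // Largest char count for a byte[] of the nominal maximum size.
    // (Real JVMs cap array sizes slightly below Integer.MAX_VALUE.)
    static int maxLength(int coder) {
        return Integer.MAX_VALUE >> coder;
    }

    // Hypothetical long[] store: each long packs 4 UTF-16 chars, and the low
    // two bits of a flag byte say how many char slots in the last word are unused.
    // Recovering the length takes a mask, a shift, and a subtract.
    static long lengthFromWords(int wordCount, int flags) {
        int unused = flags & 3;
        return ((long) wordCount << 2) - unused;
    }

    public static void main(String[] args) {
        System.out.println(maxLength(LATIN1)); // 2147483647
        System.out.println(maxLength(UTF16));  // 1073741823
        System.out.println(lengthFromWords(Integer.MAX_VALUE, 0)); // 8589934588
    }
}
```

The last value exceeds 4G chars, but it no longer fits the 'int' returned by String::length, which is one more reason the wider store is a trade-off and not a free win.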