In the constructor of String, many locations the user-supplied byte or char 
arrays are read multiple times with a plain memory access; if a user previously 
wrote to one of such locations out of happens-before order, distinct plain 
memory reads may result in different unanticipated values.

The main problem caused by such error is that String constructor may 
incorrectly produce a UTF16 coder string with all-LATIN1 compatible characters 
when `COMPACT_STRING` is true, which breaks the contract of String. (The error 
can happen the other way around, but the resulting LATIN1 string is valid; this 
patch does not address that.)

Thus, I modified the String data compression for non-trusted arrays: a LATIN1 
compression first-pass is still done, but if the first compression fails, a 
second compression pass is done on a trusted (that is, copied from the original 
data) data where reading would be consistent. The approach takes a toll on 
UTF16 string construction time, but should not be more costly memory-wise.

A separate routine to decode UTF8 in String constructor that takes byte 
encoding has the same multi-read problem, that the old `offset--` leads to a 
problematic double read. This is resolved by copying the data to decode to a 
local array at first instead of reading from the user-provided byte array. This 
fix also costs more runtime but at no extra memory cost.

Internal APIs such as newStringNoRepl are not guarded by this patch, as they 
are already trusted to be immutable and unshared.

`test/jdk/java/lang/String` tests pass. More testing is needed to see if there 
are other edge cases not covered.

Please review and don't hesitate to critique my approach and patch.

-------------

Commit messages:
 - 8311906 - fix utf16 first

Changes: https://git.openjdk.org/jdk/pull/15902/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15902&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8311906
  Stats: 87 lines in 3 files changed: 48 ins; 10 del; 29 mod
  Patch: https://git.openjdk.org/jdk/pull/15902.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/15902/head:pull/15902

PR: https://git.openjdk.org/jdk/pull/15902

Reply via email to