On Mon, 25 Sep 2023 12:28:40 GMT, Chen Liang <li...@openjdk.org> wrote:

> In the constructor of String, many locations the user-supplied byte or char 
> arrays are read multiple times with a plain memory access; if a user 
> previously wrote to one of such locations out of happens-before order, 
> distinct plain memory reads may result in different unanticipated values.
> 
> The main problem caused by such error is that String constructor may 
> incorrectly produce a UTF16 coder string with all-LATIN1 compatible 
> characters when `COMPACT_STRING` is true, which breaks the contract of 
> String. (The error can happen the other way around, but the resulting LATIN1 
> string is valid; this patch does not address that.)
> 
> Thus, I modified the String data compression for non-trusted arrays: a LATIN1 
> compression first-pass is still done, but if the first compression fails, a 
> second compression pass is done on a trusted (that is, copied from the 
> original data) data where reading would be consistent. The approach takes a 
> toll on UTF16 string construction time, but should not be more costly 
> memory-wise.
> 
> A separate routine to decode UTF8 in String constructor that takes byte 
> encoding has the same multi-read problem, that the old `offset--` leads to a 
> problematic double read. This is resolved by copying the data to decode to a 
> local array at first instead of reading from the user-provided byte array. 
> This fix also costs more runtime but at no extra memory cost.
> 
> Internal APIs such as newStringNoRepl are not guarded by this patch, as they 
> are already trusted to be immutable and unshared.
> 
> `test/jdk/java/lang/String` tests pass. More testing is needed to see if 
> there are other edge cases not covered.
> 
> Please review and don't hesitate to critique my approach and patch.

The JBS is assigned to Roger Riggs, he has changes in progress for this so 
please coordinate to avoid this duplicate effort.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15902#issuecomment-1733621925

Reply via email to