Re: RFR 8170155: StringBuffer and StringBuilder stream methods are not late-binding

Paul Sandoz Thu, 01 Dec 2016 12:06:25 -0800

> On 1 Dec 2016, at 11:17, Xueming Shen <[email protected]> wrote:
> 
> On 11/28/2016 11:27 AM, Paul Sandoz wrote:
>>> On 25 Nov 2016, at 02:47, Tobias Hartmann<[email protected]>  
>>> wrote:
>>> 
>>>> I'm not sure if it is still desired to do the same boundary check in case 
>>>> of LATIN1
>>>> for the benefit of consistency.  Assume there might be concurrent 
>>>> access/operation
>>>> between val = this.value and count = this.count; (for StringBuilder) for 
>>>> example,
>>>> the this.value got doubled and the this.count got increased. One will end 
>>>> up with
>>>> StringIndexOutOfBoundsException() from checkOffset, but the other will be 
>>>> ioobe
>>>> from vm?
>>> You mean when hitting a check in an intrinsic compared to when hitting a 
>>> check in the Java wrapper?
>>> 
>> Not quite. There is a spliterator implementation for each coder, in the case 
>> of the LATIN1 coder there are no associated intrinsics. I think it’s ok just 
>> to perform the explicit bounds check for the UTF16 coder, since for the 
>> LATIN1 bounds checks will be performed by baloads.
>> 
>> However, i think there is bug. The coder could change from LATIN1 to UTF16, 
>> while holding a reference to a byte[] value as LATIN1.
>> 
>> For StringBuilder i could fix that by atomically reading the 
>> value/count/coder, but there is still an issue with StringBuffer. Thus, in 
>> the UTF16 coder spliterator we need to perform the explicit bounds before 
>> *every* StringUTF16.getChar().
>> 
>> Webrev updated:
>> 
>>   
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8170155-string-buffer-builder-late-binding/webrev/
>> 
>> Paul.
>> 
> 
> Hi Paul,
> 
> Seems like we do have a bug here. But I think to use "charAt()" might not be 
> the
> desired/correct approach here?
>


I think the patch is correct for both StringBuffer and StringBuilder.

It is specified that buffer must remain constant during execution of the stream 
terminal operation otherwise the results are undefined. Such an undefined 
result might be the throwing of an IndexOutOfBounds.


> For StringBuffer, since we have to guarantee the correctness in concurrent 
> use scenario,
> I would assume we need a synchronized version of getSupplier(...) to return a 
> late binding
> and thread-safe Supplier in StringBuffer, similar to "getBytes(...)" (we have 
> a un-synced
> version of getBytes() in AbstractStringBuilder and a synced version at the 
> bottom of the
> StringBuffer.java). This should be for the latin1 case as well.
> 

I did ponder that, and moving the stream returning methods into the concrete 
builder classes. I opted for the sneakier approach :-) what do you prefer?


> For StringBuilder, since the only concern is the bounds check, I think the 
> existing
> checkOffset(...) should be good enough, even in case the coder changes from 
> latin1
> to utf16, as the checkOffset(...) checks the "count" against "val.length >> 
> coder" on
> the local copy,  the getChar() access after that should be safe (though might 
> not be
> correct, if it's a utf16 coder on a latin1 byte[]).
> 

Yes, i was being very conservative (I mistakenly thought the original code read 
the coder field twice).

If we go with the single wider bounds check i think we should place it in the a 
root spliterator constructor if possible.

Paul.

Re: RFR 8170155: StringBuffer and StringBuilder stream methods are not late-binding

Reply via email to