Hi,

On 24.11.2016 00:56, Xueming Shen wrote:
> On 11/23/2016 02:39 PM, Paul Sandoz wrote:
>> Hi,
>>
>> Please review:
>>
>>    
>> http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8170155-string-buffer-builder-late-binding/webrev/
>>
>> This patch:
>>
>> 1) updates the StringBuilder/StringBuffer.chars()/codePoints() so they are 
>> late binding.
>>
>> 2) refactors the spliterator late binding and fail fast tests, separating 
>> them out, and to the former additional tests are added for CharSequence 
>> implementations and BitSet.
>>
>> The code in AbstractStringBuilder has the following bounds check call to 
>> checkOffset:
>>
>> 1558         return StreamSupport.intStream(
>> 1559                 () ->  {
>> 1560                     byte[] val = this.value; int count = this.count;
>> 1561                     checkOffset(count, val.length>>  coder);
>> 1562                     return coder == LATIN1
>> 1563                            ? new StringLatin1.CharsSpliterator(val, 0, 
>> count, 0)
>> 1564                            : new StringUTF16.CharsSpliterator(val, 0, 
>> count, 0);
>> 1565                 },
>> 1566                 Spliterator.ORDERED | Spliterator.SIZED | 
>> Spliterator.SUBSIZED,
>> 1567                 false);
>>
>> (Separately checkOffset could be replaced with the internal 
>> Preconditions.checkIndex.)

Yes, if possible, bound checks should be replaced by the intrinsified 
Preconditions.check() (see https://bugs.openjdk.java.net/browse/JDK-8042997).

>> On initial inspection that should be a no-op, but talking to Sherman we 
>> believe it is required for UTF-16 access, since the intrinsics do not 
>> perform bounds checks, so they need to be guarded (for conformance purposes 
>> if a no-op).

If an out of bounds access is possible when using 
StringUTF16.putChar/getChar(), the check in the Java code is required because 
the intrinsic does not implement any bound checks (see below).

> One of the purposes of having StringUTF16.putChar() (other than the charAt())
> is to avoid the expensive boundary check on each/every char access. I'm 
> forwarding
> the email to Tobias to make sure it's still the case on hotspot side.

Yes, this is still correct. For performance reasons, the intrinsic for 
StringUTF16.putChar/getChar() has no boundary checks (see 
LibraryCallKit::inline_string_char_access()). It should *only* be used without 
a bounds check in the Java code if the index is guaranteed to be always in 
bounds.

> I'm not sure if it is still desired to do the same boundary check in case of 
> LATIN1
> for the benefit of consistency.  Assume there might be concurrent 
> access/operation
> between val = this.value and count = this.count; (for StringBuilder) for 
> example,
> the this.value got doubled and the this.count got increased. One will end up 
> with
> StringIndexOutOfBoundsException() from checkOffset, but the other will be 
> ioobe
> from vm?

You mean when hitting a check in an intrinsic compared to when hitting a check 
in the Java wrapper?

Actually, bound checks should always be done in the Java wrapper method that 
calls the (unchecked) intrinsic. In general, the unchecked intrinsic should not 
be called directly. StringUTF16.putChar/getChar() is an exception because there 
are places where we we need to omit the check for performance reasons.

I'm planning to revisit this with JDK-8156534 in JDK 10.

Best regards,
Tobias

> 
> Sherman
> 
> 
>> If so i propose the following to make this clearer:
>>
>> return StreamSupport.intStream(
>>          () ->  {
>>              byte[] val = this.value; int count = this.count;
>>              if (coder == LATIN1) {
>>                  return new StringLatin1.CharsSpliterator(val, 0, count, 0);
>>              } else {
>>                  // Perform an explicit bounds check since HotSpot
>>                  // intrinsics to access UTF-16 characters in the byte[]
>>                  // array will not perform bounds checks
>>                  checkOffset(count, val.length>>  coder);
>>                  return new StringUTF16.CharsSpliterator(val, 0, count, 0);
>>              }
>>          },
>>          Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED,
>>          false);
>>
>> Paul.
> 

Reply via email to