On 11/23/2016 02:39 PM, Paul Sandoz wrote:
Hi,

Please review:

   
http://cr.openjdk.java.net/~psandoz/jdk9/JDK-8170155-string-buffer-builder-late-binding/webrev/

This patch:

1) updates the StringBuilder/StringBuffer.chars()/codePoints() so they are late 
binding.

2) refactors the spliterator late binding and fail fast tests, separating them 
out, and to the former additional tests are added for CharSequence 
implementations and BitSet.

The code in AbstractStringBuilder has the following bounds check call to 
checkOffset:

1558         return StreamSupport.intStream(
1559                 () ->  {
1560                     byte[] val = this.value; int count = this.count;
1561                     checkOffset(count, val.length>>  coder);
1562                     return coder == LATIN1
1563                            ? new StringLatin1.CharsSpliterator(val, 0, 
count, 0)
1564                            : new StringUTF16.CharsSpliterator(val, 0, 
count, 0);
1565                 },
1566                 Spliterator.ORDERED | Spliterator.SIZED | 
Spliterator.SUBSIZED,
1567                 false);

(Separately checkOffset could be replaced with the internal 
Preconditions.checkIndex.)

On initial inspection that should be a no-op, but talking to Sherman we believe 
it is required for UTF-16 access, since the intrinsics do not perform bounds 
checks, so they need to be guarded (for conformance purposes if a no-op).

One of the purposes of having StringUTF16.putChar() (other than the charAt())
is to avoid the expensive boundary check on each/every char access. I'm 
forwarding
the email to Tobias to make sure it's still the case on hotspot side.

I'm not sure if it is still desired to do the same boundary check in case of 
LATIN1
for the benefit of consistency.  Assume there might be concurrent 
access/operation
between val = this.value and count = this.count; (for StringBuilder) for 
example,
the this.value got doubled and the this.count got increased. One will end up 
with
StringIndexOutOfBoundsException() from checkOffset, but the other will be ioobe
from vm?

Sherman


If so i propose the following to make this clearer:

return StreamSupport.intStream(
         () ->  {
             byte[] val = this.value; int count = this.count;
             if (coder == LATIN1) {
                 return new StringLatin1.CharsSpliterator(val, 0, count, 0);
             } else {
                 // Perform an explicit bounds check since HotSpot
                 // intrinsics to access UTF-16 characters in the byte[]
                 // array will not perform bounds checks
                 checkOffset(count, val.length>>  coder);
                 return new StringUTF16.CharsSpliterator(val, 0, count, 0);
             }
         },
         Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED,
         false);

Paul.

Reply via email to