Re: String.subSequence and CR#6924259: Remove offset and count fields from java.lang.String

Mike Duigou Sun, 24 Jun 2012 13:25:30 -0700

As usual, an excellent idea Jason. I'll probably run an internal test/benchmark 
with both this and the CharSequence inner class implementation to see what 
breaks and where there are performance differences between the two. I was 
planning to also test a version of the CharSequence implementation which 
implemented equals() and hashCode() compatible with String and see if this 
produced fewer (or different) test failures.
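
For anyone following along, here is a minimal sketch of what the CharBuffer 
variant might look like. It is only an illustration, not the code under test: 
the class name, the static helper and the 'data' array (standing in for 
String's internal char[]) are assumptions. Note that 
CharBuffer.wrap(char[], int, int) takes an offset and a length rather than a 
begin and end index, so the end index is converted here.

```java
import java.nio.CharBuffer;

// Illustrative only: a stand-in for how String.subSequence() could delegate
// to CharBuffer. 'data' plays the role of String's internal char[].
final class CharBufferSubSequenceSketch {

    // Hypothetical helper with subSequence()-style (begin, end) arguments.
    static CharSequence subSequence(char[] data, int beginIndex, int endIndex) {
        // wrap(char[], offset, length): the third argument is a length,
        // so derive it from the end index. No characters are copied.
        return CharBuffer.wrap(data, beginIndex, endIndex - beginIndex)
                         .asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        char[] data = "if (x) return;".toCharArray();
        CharSequence token = subSequence(data, 7, 13);

        System.out.println(token);                          // return
        System.out.println(token.length());                 // 6
        // String.equals() only accepts String instances.
        System.out.println("return".equals(token));         // false
        // contentEquals() compares characters against any CharSequence.
        System.out.println("return".contentEquals(token));  // true
        // CharBuffer.equals() is content-based, but only against CharBuffers.
        System.out.println(token.equals(CharBuffer.wrap("return"))); // true
    }
}
```

The buffer is a view over the array, so the sketch keeps the O(1) behaviour 
that substring() used to have, at the cost of the String.equals() mismatch 
shown in the output.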


Good idea.

Mike

On Jun 24 2012, at 12:06 , Jason Mehrens wrote:

> Mike,
>  
> Why not implement subSequence as 'java.nio.CharBuffer.wrap(data, beginIndex, 
> endIndex).asReadOnlyBuffer()' ?  Easy to implement and test.  The nice thing 
> is that parsers would know what a 'CharBuffer' is, as opposed to an internal 
> sub sequence class of String.
>  
> Jason
>  
> > Subject: String.subSequence and CR#6924259: Remove offset and count fields  
> > from java.lang.String
> > From: [email protected]
> > Date: Fri, 22 Jun 2012 15:15:40 -0700
> > To: [email protected]
> > CC: [email protected]
> > 
> > I've made a test implementation of subSequence() utilizing an inner class 
> > with offset and count fields to try to understand all the parts that would 
> > be impacted. My observations thus far:
> > 
> > - The specification of the subSequence() method is currently too specific. 
> > It says that the result is a substring(). This would no longer be true. 
> > Hopefully nobody assumed that this meant they could cast the result to 
> > String. I know, why would you if you can just call substring() instead? 
> > I've learned to assume that somebody somewhere always does the most 
> > unexpected thing.
> > - The CharSequences returned by subSequence would follow only the general 
> > CharSequence rules for equals()/hashCode(). Any current usages of the 
> > result of subSequence for equals() or hashing, even though it's not 
> > advised, would break. We could add equals() and hashCode() implementations 
> > to the CharSequence returned but they would probably be expensive.
> > - In general I wonder if parsers will be satisfied with a CharSequence that 
> > only implements identity equals().
> > - I also worry about applications that currently use subSequence and 
> > which will fail when the result is not a String instance, as 
> > String.equals() will return false for all CharSequences that aren't 
> > Strings, i.e. CharSequence token = line.subSequence(start, end); if 
> > (keyword.equals(token)) ... This would now fail.
> > 
> > At this point I wonder if this is a feature worth pursuing.
> > 
> > Mike
> > 
> > On Jun 3 2012, at 13:44 , Peter Levart wrote:
> > 
> > > On Thursday, May 31, 2012 03:22:35 AM [email protected] wrote:
> > >> Changeset: 2c773daa825d
> > >> Author:    mduigou
> > >> Date: 2012-05-17 10:06 -0700
> > >> URL: http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
> > >> 
> > >> 6924259: Remove offset and count fields from java.lang.String
> > >> Summary: Removes the use of shared character array buffers by String 
> > >> along with the two fields needed to support the use of shared buffers.
> > > 
> > > Wow, that's quite a change.
> > > 
> > > So .substring() is not O(1) any more?
> > > 
> > > Doesn't this have an impact on the performance of parsers and such that 
> > > rely on the performance characteristics of the .substring() ?
> > > 
> > > Have you considered then implementing .subSequence() not in terms of just 
> > > delegating to .substring() but returning a special CharSequence view over 
> > > the chars of the sub-sequence?
> >
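
The equals()/hashCode() concern raised in the quoted message can be made 
concrete with a small sketch. The class below is hypothetical (the name 
SubSequenceView and its fields are assumptions, not the test implementation 
described in the thread). It is a view with offset and count fields whose 
hashCode() follows the formula documented for String.hashCode() and whose 
equals() accepts Strings. Even so, String.equals() still returns false for 
it, so the comparison is asymmetric.

```java
// Hypothetical CharSequence view over a shared char[]; for illustration only.
final class SubSequenceView implements CharSequence {
    private final char[] data;  // shared backing array, never copied
    private final int offset;   // index of the first character of the view
    private final int count;    // number of characters in the view

    SubSequenceView(char[] data, int offset, int count) {
        if (offset < 0 || count < 0 || offset > data.length - count) {
            throw new IndexOutOfBoundsException();
        }
        this.data = data;
        this.offset = offset;
        this.count = count;
    }

    @Override public int length() { return count; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= count) throw new IndexOutOfBoundsException();
        return data[offset + index];
    }

    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > count || start > end) {
            throw new IndexOutOfBoundsException();
        }
        return new SubSequenceView(data, offset + start, end - start); // O(1)
    }

    // Same formula as String.hashCode(); O(n) on every call since nothing is cached.
    @Override public int hashCode() {
        int h = 0;
        for (int i = 0; i < count; i++) h = 31 * h + data[offset + i];
        return h;
    }

    // Content comparison, limited to types that share the hash formula above.
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof String) && !(o instanceof SubSequenceView)) return false;
        CharSequence other = (CharSequence) o;
        if (other.length() != count) return false;
        for (int i = 0; i < count; i++) {
            if (other.charAt(i) != data[offset + i]) return false;
        }
        return true;
    }

    @Override public String toString() { return new String(data, offset, count); }

    public static void main(String[] args) {
        String keyword = "return";
        CharSequence token = new SubSequenceView("if (x) return;".toCharArray(), 7, 6);

        System.out.println(token.hashCode() == keyword.hashCode()); // true
        System.out.println(token.equals(keyword));                  // true
        System.out.println(keyword.equals(token));                  // false
        System.out.println(keyword.contentEquals(token));           // true
    }
}
```

Callers that compare through the String side would have to switch to 
contentEquals(), which is the kind of breakage the message anticipates.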
