Re: String.subSequence and CR#6924259: Remove offset and count fields from java.lang.String

Martin Desruisseaux Tue, 26 Jun 2012 07:15:28 -0700

If String.substring(int, int) now performs a copy of the underlying char[] array and if there is no String.subSequence(int, int) providing the old functionality, maybe the following implications should be investigated?


StringBuilder.append(...)
--------------------

Since, in order to avoid a useless array copy, users may be advised to replace the following pattern:


      StringBuilder.append(string.substring(lower, upper));
by:
      StringBuilder.append(string, lower, upper);

would it be worth adding a special case for String in the AbstractStringBuilder.append(CharSequence, int, int) implementation, in order to match the efficiency of the AbstractStringBuilder.append(String) method? The latter copies the data with a single call to System.arraycopy, whereas the former invokes CharSequence.charAt(int) in a loop.
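To make the suggestion concrete, here is a minimal sketch of such a special case. SimpleBuilder is a hypothetical, simplified stand-in written for this message, not the JDK's AbstractStringBuilder, and the real class may organize its buffer differently. The String branch uses String.getChars to copy the range in bulk; every other CharSequence falls back to the charAt loop.

```java
// Hypothetical, simplified stand-in for AbstractStringBuilder (not JDK source).
final class SimpleBuilder {
    private char[] value = new char[16];
    private int count;

    SimpleBuilder append(CharSequence s, int start, int end) {
        if (s == null) {
            s = "null"; // same convention as the JDK builders
        }
        if (start < 0 || start > end || end > s.length()) {
            throw new IndexOutOfBoundsException(
                    "start " + start + ", end " + end + ", length " + s.length());
        }
        int len = end - start;
        ensureCapacity(count + len);
        if (s instanceof String) {
            // Fast path: bulk copy of the requested range.
            ((String) s).getChars(start, end, value, count);
        } else {
            // General path: one charAt call per character.
            for (int i = start, j = count; i < end; i++, j++) {
                value[j] = s.charAt(i);
            }
        }
        count += len;
        return this;
    }

    private void ensureCapacity(int minimum) {
        if (minimum > value.length) {
            value = java.util.Arrays.copyOf(value, Math.max(minimum, value.length * 2));
        }
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}
```

With this shape, callers who switch from append(string.substring(lower, upper)) to append(string, lower, upper) would avoid the intermediate copy without paying for a per-character loop.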



Integer.parseInt(...)
----------------

There was a thread one year ago about allowing Integer.parseInt(String) to accept a CharSequence.


http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-April/thread.html#9801

One reason given was performance: when reading large ASCII files, the cost of calling CharSequence.toString() was measured with the NetBeans profiler as significant (assuming that the CharSequence is not already a String). Now if the new String.substring(...) implementation copies the internal array, we may expect a performance cost similar to that of StringBuilder.toString(). Would it be worth revisiting the Integer.parseInt(String) case, and similar methods in other wrapper classes, to allow CharSequence input?
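As an illustration of what a CharSequence-based variant could look like, here is a rough sketch. CharSequenceParsing and its parseInt(CharSequence, int, int) signature are hypothetical names chosen for this example and are not an existing JDK API. It parses a signed decimal value directly from a range of the sequence, so neither toString() nor substring(...) is needed.

```java
// Hypothetical helper for illustration only; not a JDK API.
final class CharSequenceParsing {
    private CharSequenceParsing() {
    }

    /** Parses a signed decimal int from s[start, end) without creating a String. */
    static int parseInt(CharSequence s, int start, int end) {
        if (s == null || start < 0 || start >= end || end > s.length()) {
            throw new NumberFormatException("null, empty or out-of-range input");
        }
        int i = start;
        boolean negative = false;
        char first = s.charAt(i);
        if (first == '-' || first == '+') {
            negative = (first == '-');
            i++;
            if (i == end) {
                throw new NumberFormatException("sign without digits");
            }
        }
        // Accumulate as a negative number so that Integer.MIN_VALUE fits.
        int limit = negative ? Integer.MIN_VALUE : -Integer.MAX_VALUE;
        int result = 0;
        while (i < end) {
            int digit = Character.digit(s.charAt(i++), 10);
            if (digit < 0) {
                throw new NumberFormatException("not a decimal digit");
            }
            if (result < limit / 10) {
                throw new NumberFormatException("overflow");
            }
            result *= 10;
            if (result < limit + digit) {
                throw new NumberFormatException("overflow");
            }
            result -= digit;
        }
        return negative ? result : -result;
    }
}
```

A caller reading fields out of a StringBuilder or CharBuffer could then write CharSequenceParsing.parseInt(buffer, lower, upper) instead of Integer.parseInt(buffer.subSequence(lower, upper).toString()).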


    Martin



On 23/06/12 00:15, Mike Duigou wrote:

I've made a test implementation of subSequence() utilizing an inner class with 
offset and count fields to try to understand all the parts that would be 
impacted. My observations thus far:

- The specification of the subSequence() method is currently too specific. It 
says that the result is a substring(). This would no longer be true. Hopefully 
nobody assumed that this meant they could cast the result to String. I know, 
why would you if you can just call substring() instead? I've learned to assume 
that somebody somewhere always does the most unexpected thing.
- The CharSequences returned by subSequence would follow only the general 
CharSequence rules for equals()/hashCode(). Any current usages of the result of 
subSequence for equals() or hashing, even though it's not advised, would break. 
We could add equals() and hashCode() implementations to the CharSequence 
returned but they would probably be expensive.
- In general I wonder if parsers will be satisfied with a CharSequence that 
only implements identity equals().
- I also worry about applications that currently use subSequence and which 
will fail when the result is not a String instance, as String.equals() will 
return false for all CharSequences that aren't Strings, i.e. CharSequence 
token = line.subSequence(start, end); if (keyword.equals(token)) ... This 
would now fail.
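The equals() concern can be reproduced with a small sketch. SubSequenceView is a hypothetical offset/count view written for illustration, not the test implementation described above. With such a view, String.equals() returns false even when the characters match, while String.contentEquals(CharSequence) still compares by content.

```java
// Hypothetical offset/count view over a String, for illustration only.
final class SubSequenceView implements CharSequence {
    private final String source;
    private final int offset;
    private final int count;

    SubSequenceView(String source, int start, int end) {
        if (start < 0 || start > end || end > source.length()) {
            throw new IndexOutOfBoundsException("start " + start + ", end " + end);
        }
        this.source = source;
        this.offset = start;
        this.count = end - start;
    }

    @Override
    public int length() {
        return count;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return source.charAt(offset + index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || start > end || end > count) {
            throw new IndexOutOfBoundsException("start " + start + ", end " + end);
        }
        return new SubSequenceView(source, offset + start, offset + end);
    }

    @Override
    public String toString() {
        return source.substring(offset, offset + count);
    }
    // equals() and hashCode() are deliberately inherited from Object (identity).
}

final class KeywordCheck {
    private KeywordCheck() {
    }

    // The pattern from the message: only true when token is a String.
    static boolean matchesByEquals(String keyword, CharSequence token) {
        return keyword.equals(token);
    }

    // Content comparison that works for any CharSequence.
    static boolean matchesByContent(String keyword, CharSequence token) {
        return keyword.contentEquals(token);
    }
}
```

Code that compares tokens with keyword.equals(token) would need to move to contentEquals (or call toString() first) if subSequence stopped returning a String.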

At this point I wonder if this is a feature worth pursuing.

Mike
