On Jun 26 2012, at 07:13, Martin Desruisseaux wrote:

> If String.substring(int, int) now performs a copy of the underlying char[]
> array and if there is no String.subSequence(int, int) providing the old
> functionality, maybe the following implications should be investigated?
>
>
> StringBuilder.append(...)
> --------------------
> Since, in order to avoid a useless array copy, the users may be advised to
> replace the following pattern:
>
>     StringBuilder.append(string.substring(lower, upper));
> by:
>     StringBuilder.append(string, lower, upper);
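For reference, the two patterns quoted above can be put side by side in a small sketch. The class and method names here are made up for illustration; only StringBuilder.append(CharSequence, int, int) and String.substring(int, int) are real JDK methods. Both forms produce the same characters; the difference under discussion is only the temporary String created by the first one.

```java
// Illustrative sketch only: class and method names are hypothetical.
class AppendRangeSketch {

    // Pattern from the thread: creates a temporary String, then appends it.
    static String viaSubstring(String string, int lower, int upper) {
        StringBuilder sb = new StringBuilder();
        sb.append(string.substring(lower, upper));
        return sb.toString();
    }

    // Suggested replacement: append the range directly through
    // StringBuilder.append(CharSequence, int, int), with no temporary String.
    static String viaRange(String string, int lower, int upper) {
        StringBuilder sb = new StringBuilder();
        sb.append(string, lower, upper);
        return sb.toString();
    }

    public static void main(String[] args) {
        String line = "key=value";
        // Both calls append the characters at indices 4 (inclusive) to 9 (exclusive).
        System.out.println(viaSubstring(line, 4, 9));
        System.out.println(viaRange(line, 4, 9));
    }
}
```

This says nothing about which form is faster; that would still need the kind of measurement discussed in this thread.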

This would seem to be a good refactoring regardless of the substring
implementation, as it avoids creating a temporary object.

>
> would it be worth to add a special-case in the
> AbstractStringBuilder.append(CharSequence, int, int) implementation for the
> String case in order to reach the efficiency of the
> AbstractStringBuilder.append(String) method? The latter copies the data with a
> single call to System.arraycopy, as opposed to the former which invokes
> CharSequence.charAt(int) in a loop.

I think a microbenchmark comparing
StringBuilder.append(string.substring(lower, upper)) with
AbstractStringBuilder.append(CharSequence, int, int) would help. I wouldn't
be surprised if the latter is faster when a substring has to be created but
slower when the string is an existing string.

>
> Integer.parseInt(...)
> ----------------
> There was a thread one year ago about allowing Integer.parseInt(String) to
> accept a CharSequence.
>
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-April/thread.html#9801
>
> One invoked reason was performance, since the cost of calling
> CharSequence.toString() has been measured with the NetBeans profiler as
> significant (assuming that the CharSequence is not already a String) when
> reading large ASCII files. Now if the new String.substring(...)
> implementation copies the internal array, we may expect a performance cost
> similar to StringBuilder.toString(). Would it be worth to revisit the
> Integer.parseInt(String) case - and similar methods in other wrapper classes
> - for allowing CharSequence input?

Probably.

> Martin
>
>
>
> On 23/06/12 00:15, Mike Duigou wrote:
>> I've made a test implementation of subSequence() utilizing an inner class
>> with offset and count fields to try to understand all the parts that would
>> be impacted. My observations thus far:
>>
>> - The specification of the subSequence() method is currently too specific.
>> It says that the result is a substring().
>> This would no longer be true.
>> Hopefully nobody assumed that this meant they could cast the result to
>> String. I know, why would you if you can just call substring() instead? I've
>> learned to assume that somebody somewhere always does the most
>> unexpected thing.
>> - The CharSequences returned by subSequence would follow only the general
>> CharSequence rules for equals()/hashCode(). Any current usages of the result
>> of subSequence for equals() or hashing, even though it's not advised, would
>> break. We could add equals() and hashCode() implementations to the
>> CharSequence returned but they would probably be expensive.
>> - In general I wonder if parsers will be satisfied with a CharSequence that
>> only implements identity equals().
>> - I also worry about applications that currently use subSequence
>> and which will fail when the result is not a String instance, as
>> String.equals() will return false for all CharSequences that aren't Strings.
>> i.e. CharSequence token = line.subSequence(start, end); if
>> (keyword.equals(token)) ... This would now fail.
>>
>> At this point I wonder if this is a feature worth pursuing.
>>
>> Mike
>
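
To make the equals() concern in the quoted message concrete, here is a rough sketch of a view-style CharSequence with offset and count fields, in the spirit of the test implementation described above. It is not that implementation: SubSequenceView and EqualsPitfallSketch are hypothetical names invented for this example. It shows String.equals() returning false for a token that is not a String, while String.contentEquals(CharSequence), which is an existing JDK method, still compares by content.

```java
// Illustrative sketch only: SubSequenceView and EqualsPitfallSketch are
// hypothetical names, not JDK classes and not the actual test implementation.
final class SubSequenceView implements CharSequence {
    private final String source; // backing string, shared rather than copied
    private final int offset;    // index of the first character in source
    private final int count;     // number of characters in this view

    SubSequenceView(String source, int start, int end) {
        if (start < 0 || end > source.length() || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        this.source = source;
        this.offset = start;
        this.count = end - start;
    }

    @Override
    public int length() {
        return count;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return source.charAt(offset + index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > count || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        return new SubSequenceView(source, offset + start, offset + end);
    }

    @Override
    public String toString() {
        return source.substring(offset, offset + count);
    }

    // equals() and hashCode() are deliberately not overridden, so this class
    // keeps identity semantics, as discussed in the thread.
}

class EqualsPitfallSketch {

    // The pattern that would break: String.equals() is false for any non-String.
    static boolean matchesWithEquals(String keyword, CharSequence token) {
        return keyword.equals(token);
    }

    // Content comparison that does not depend on the token being a String.
    static boolean matchesWithContentEquals(String keyword, CharSequence token) {
        return keyword.contentEquals(token);
    }

    public static void main(String[] args) {
        String line = "if (x) return;";
        CharSequence token = new SubSequenceView(line, 0, 2);
        System.out.println(matchesWithEquals("if", token));
        System.out.println(matchesWithContentEquals("if", token));
    }
}
```

Callers that already use contentEquals() or toString() would be unaffected; the breakage is limited to code that relies on the result being a String instance.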