Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-23 Thread Zhong Yu
I just realized that I also use String.trim() a lot, and have always
assumed it's very cheap; that's no longer the case.

JDK itself contains many usages of trim(); even a lot of

str.substring().trim()

Just saying.

On Wed, Nov 14, 2012 at 9:14 AM, Zhong Yu zhong.j...@gmail.com wrote:
 Changing String.substring() from O(1) to O(n) is a big deal; we may
 say it breaks compatibility.

 Any code that intends to work across JDK versions before and after 7u6
 cannot use the method, since its behavior is so different in different
 versions.

 Any deployment that upgrades JDK to 7u6 and later needs to review all
 its usages of substring(). That's a ton of work. A quick workaround
 might be to refactor all substring() usages to some `old_substring()`
 method that preserves O(1). Unfortunately `old_substring()` cannot
 exist, so there's no quick workaround possible.

 The memory leak problem of the old substring() method is well-known
 among Java programmers, it's not really a big problem today.

 For the uninitiated, they might expect that substring() is leak-free;
 but they might also expect that substring() is O(1). There's no
 apparent reason why favoring one is better than favoring the other.

 In the old implementation, there's a workaround to achieve leak-free,
 by `new String(String)`.

 In the new implementation, there is no workaround to achieve O(1),
 unless the developer uses something other than String. It is basically
 impossible to change String to another type if it's part of a method
 signature.

 With all these problems, please reconsider this change and see if you
 should roll it back. Thanks.

 Zhong Yu


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-15 Thread Peter Levart
Hi,

This change is 6 months old now. I wonder if Oracle received any complaints
from the users since then. I mean complaints that are based on real
observations of performance degradation in real code - not only speculation.

Regards, Peter

2012/11/15 Zhong Yu zhong.j...@gmail.com

 Since this change is to achieve minor performance boost, it's not fair
 to defend it by saying that it only incurs minor performance
 penalties.

 Java programs are infested with strings, most of which could have used
 a more appropriate type, but it is the insane reality. Any change to
 the behavior of strings should have been backed up by a much more
 thorough analysis.

 Every usage of substring() was (hopefully) the result of some
 conscious reasoning about space-time. Even if this change does not
 significantly alter an application's performance, it invalidates all
 the reasoning, that's the worst blow in my book. There's no problem if
 substring() does copying from day one, but 17 years have passed.

 Zhong Yu

 On Wed, Nov 14, 2012 at 6:58 PM, Vitaly Davidovich vita...@gmail.com
 wrote:
  Personally, I feel like the concern is a bit overstated:
 
  1) the n in O(n) is likely actually fairly small in practice (at least in
  what I'd consider sane code)
  2) I think a lot of people that worry about perf probably aren't using
  substring() anyway
  3) copying char[] is optimized by jit - this is basically a memcpy()-like
  call, which modern machines handle well
  4) the upside is strings are 8 bytes smaller
  5) .NET substring() has always allocated new storage (via an optimized
  internal VM call) and never shared the char[] and I haven't come across
 any
  complaints or seen serious perf problems myself (granted I seldom use
  substring)
 
  So I don't know if this is anything to worry about in practice.
 
  Sent from my phone
 
  On Nov 14, 2012 5:26 PM, Zhong Yu zhong.j...@gmail.com wrote:
 
  On 06/03/2012 11:35 PM, Mike Duigou wrote:
   [I trimmed the distribution list]
  
   On Jun 3 2012, at 13:44 , Peter Levart wrote:
  
   On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.comwrote:
   Changeset: 2c773daa825d
   Author:mduigou
   Date:  2012-05-17 10:06 -0700
   URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
  
   6924259: Remove offset and count fields from java.lang.String
   Summary: Removes the use of shared character array buffers by String
   along
   with the two fields needed to support the use of shared buffers.
   Wow, that's quite a change.
   Indeed. It was a long time in development. It is a change which is
   expected to be overall beneficial though and in the general case a
 positive
   win.
 
  Wow!
 
  If the previous behavior of substring() was once a bug, by now it has
  become a well known feature. People know about it, and people depend
  on it.
 
  This change is a big surprise. Changing O(1) to O(n) is a breach of
  contract. It'll break lots of old code; and meanwhile lots of new code
  are still being written based on the old assumption. After people
  learned about the new behavior, they need to comb through and rewrite
  their code.
 
  The worst part is the same code performs very differently on different
  versions of JDK. What's a programmer supposed to do if his code
  targets JDK6 and above? If the cost of strings are no longer certain,
  what else can we believe in?
 
  Is there any chance in hell to roll it back? Maybe add a new method
  for the new behavior?
 
  Zhong Yu



Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Zhong Yu
Changing String.substring() from O(1) to O(n) is a big deal; we may
say it breaks compatibility.

Any code that intends to work across JDK versions before and after 7u6
cannot use the method, since its behavior is so different in different
versions.

Any deployment that upgrades JDK to 7u6 and later needs to review all
its usages of substring(). That's a ton of work. A quick workaround
might be to refactor all substring() usages to some `old_substring()`
method that preserves O(1). Unfortunately `old_substring()` cannot
exist, so there's no quick workaround possible.

The memory leak problem of the old substring() method is well-known
among Java programmers, it's not really a big problem today.

For the uninitiated, they might expect that substring() is leak-free;
but they might also expect that substring() is O(1). There's no
apparent reason why favoring one is better than favoring the other.

In the old implementation, there's a workaround to achieve leak-free,
by `new String(String)`.

In the new implementation, there is no workaround to achieve O(1),
unless the developer uses something other than String. It is basically
impossible to change String to another type if it's part of a method
signature.

With all these problems, please reconsider this change and see if you
should roll it back. Thanks.

Zhong Yu


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Zhong Yu
The new implementation also introduces a new form of memory leak.
Previously N substrings take O(N) space. Now it takes O(N*m) space
where m is the average length of substrings.

Some applications may be double penalized by the new implementation -
both CPU and memory go up.

On Wed, Nov 14, 2012 at 9:14 AM, Zhong Yu zhong.j...@gmail.com wrote:
 Changing String.substring() from O(1) to O(n) is a big deal; we may
 say it breaks compatibility.

 Any code that intends to work across JDK versions before and after 7u6
 cannot use the method, since its behavior is so different in different
 versions.

 Any deployment that upgrades JDK to 7u6 and later needs to review all
 its usages of substring(). That's a ton of work. A quick workaround
 might be to refactor all substring() usages to some `old_substring()`
 method that preserves O(1). Unfortunately `old_substring()` cannot
 exist, so there's no quick workaround possible.

 The memory leak problem of the old substring() method is well-known
 among Java programmers, it's not really a big problem today.

 For the uninitiated, they might expect that substring() is leak-free;
 but they might also expect that substring() is O(1). There's no
 apparent reason why favoring one is better than favoring the other.

 In the old implementation, there's a workaround to achieve leak-free,
 by `new String(String)`.

 In the new implementation, there is no workaround to achieve O(1),
 unless the developer uses something other than String. It is basically
 impossible to change String to another type if it's part of a method
 signature.

 With all these problems, please reconsider this change and see if you
 should roll it back. Thanks.

 Zhong Yu


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Remi Forax

Hi Zhong Yu,
I agree with you that changing the implementation of something
like String.substring which is widely used is something that is always a 
little hairy.


The memory leak you mention is one side of the problem, the other is that
we want the VM to do memory collocation of String (i.e. allocate the 
array of char

and the String as one object to avoid the double indirection).
For that the creation of the char array and the creation of the String
must be done in the constructor of String.

So this change is a way to look to the bright future instead of looking 
to the dark past :)


Now, I don't know why this change was backported to a jdk update,
but it's more a question to the jdk7 update mailing list.

Rémi

On 11/14/2012 04:14 PM, Zhong Yu wrote:

Changing String.substring() from O(1) to O(n) is a big deal; we may
say it breaks compatibility.

Any code that intends to work across JDK versions before and after 7u6
cannot use the method, since its behavior is so different in different
versions.

Any deployment that upgrades JDK to 7u6 and later needs to review all
its usages of substring(). That's a ton of work. A quick workaround
might be to refactor all substring() usages to some `old_substring()`
method that preserves O(1). Unfortunately `old_substring()` cannot
exist, so there's no quick workaround possible.

The memory leak problem of the old substring() method is well-known
among Java programmers, it's not really a big problem today.

For the uninitiated, they might expect that substring() is leak-free;
but they might also expect that substring() is O(1). There's no
apparent reason why favoring one is better than favoring the other.

In the old implementation, there's a workaround to achieve leak-free,
by `new String(String)`.

In the new implementation, there is no workaround to achieve O(1),
unless the developer uses something other than String. It is basically
impossible to change String to another type if it's part of a method
signature.

With all these problems, please reconsider this change and see if you
should roll it back. Thanks.

Zhong Yu




Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Alan Bateman

On 14/11/2012 16:06, Remi Forax wrote:


Now, I don't know why this change was backported to a jdk update,
but it's more a question to the jdk7 update mailing list.

It was to offset the addition of the hash32 field.

-Alan.


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Zhong Yu
On 06/03/2012 11:35 PM, Mike Duigou wrote:
 [I trimmed the distribution list]

 On Jun 3 2012, at 13:44 , Peter Levart wrote:

 On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
 Changeset: 2c773daa825d
 Author:mduigou
 Date:  2012-05-17 10:06 -0700
 URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d

 6924259: Remove offset and count fields from java.lang.String
 Summary: Removes the use of shared character array buffers by String along
 with the two fields needed to support the use of shared buffers.
 Wow, that's quite a change.
 Indeed. It was a long time in development. It is a change which is expected 
 to be overall beneficial though and in the general case a positive win.

Wow!

If the previous behavior of substring() was once a bug, by now it has
become a well known feature. People know about it, and people depend
on it.

This change is a big surprise. Changing O(1) to O(n) is a breach of
contract. It'll break lots of old code; and meanwhile lots of new code
are still being written based on the old assumption. After people
learned about the new behavior, they need to comb through and rewrite
their code.

The worst part is the same code performs very differently on different
versions of JDK. What's a programmer supposed to do if his code
targets JDK6 and above? If the cost of strings are no longer certain,
what else can we believe in?

Is there any chance in hell to roll it back? Maybe add a new method
for the new behavior?

Zhong Yu


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Vitaly Davidovich
Personally, I feel like the concern is a bit overstated:

1) the n in O(n) is likely actually fairly small in practice (at least in
what I'd consider sane code)
2) I think a lot of people that worry about perf probably aren't using
substring() anyway
3) copying char[] is optimized by jit - this is basically a memcpy()-like
call, which modern machines handle well
4) the upside is strings are 8 bytes smaller
5) .NET substring() has always allocated new storage (via an optimized
internal VM call) and never shared the char[] and I haven't come across any
complaints or seen serious perf problems myself (granted I seldom use
substring)

So I don't know if this is anything to worry about in practice.

Sent from my phone
On Nov 14, 2012 5:26 PM, Zhong Yu zhong.j...@gmail.com wrote:

 On 06/03/2012 11:35 PM, Mike Duigou wrote:
  [I trimmed the distribution list]
 
  On Jun 3 2012, at 13:44 , Peter Levart wrote:
 
  On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
  Changeset: 2c773daa825d
  Author:mduigou
  Date:  2012-05-17 10:06 -0700
  URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
 
  6924259: Remove offset and count fields from java.lang.String
  Summary: Removes the use of shared character array buffers by String
 along
  with the two fields needed to support the use of shared buffers.
  Wow, that's quite a change.
  Indeed. It was a long time in development. It is a change which is
 expected to be overall beneficial though and in the general case a positive
 win.

 Wow!

 If the previous behavior of substring() was once a bug, by now it has
 become a well known feature. People know about it, and people depend
 on it.

 This change is a big surprise. Changing O(1) to O(n) is a breach of
 contract. It'll break lots of old code; and meanwhile lots of new code
 are still being written based on the old assumption. After people
 learned about the new behavior, they need to comb through and rewrite
 their code.

 The worst part is the same code performs very differently on different
 versions of JDK. What's a programmer supposed to do if his code
 targets JDK6 and above? If the cost of strings are no longer certain,
 what else can we believe in?

 Is there any chance in hell to roll it back? Maybe add a new method
 for the new behavior?

 Zhong Yu



Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-11-14 Thread Zhong Yu
Since this change is to achieve minor performance boost, it's not fair
to defend it by saying that it only incurs minor performance
penalties.

Java programs are infested with strings, most of which could have used
a more appropriate type, but it is the insane reality. Any change to
the behavior of strings should have been backed up by a much more
thorough analysis.

Every usage of substring() was (hopefully) the result of some
conscious reasoning about space-time. Even if this change does not
significantly alter an application's performance, it invalidates all
the reasoning, that's the worst blow in my book. There's no problem if
substring() does copying from day one, but 17 years have passed.

Zhong Yu

On Wed, Nov 14, 2012 at 6:58 PM, Vitaly Davidovich vita...@gmail.com wrote:
 Personally, I feel like the concern is a bit overstated:

 1) the n in O(n) is likely actually fairly small in practice (at least in
 what I'd consider sane code)
 2) I think a lot of people that worry about perf probably aren't using
 substring() anyway
 3) copying char[] is optimized by jit - this is basically a memcpy()-like
 call, which modern machines handle well
 4) the upside is strings are 8 bytes smaller
 5) .NET substring() has always allocated new storage (via an optimized
 internal VM call) and never shared the char[] and I haven't come across any
 complaints or seen serious perf problems myself (granted I seldom use
 substring)

 So I don't know if this is anything to worry about in practice.

 Sent from my phone

 On Nov 14, 2012 5:26 PM, Zhong Yu zhong.j...@gmail.com wrote:

 On 06/03/2012 11:35 PM, Mike Duigou wrote:
  [I trimmed the distribution list]
 
  On Jun 3 2012, at 13:44 , Peter Levart wrote:
 
  On Thursday, May 31, 2012 03:22:35 AM mike.duigou at oracle.com wrote:
  Changeset: 2c773daa825d
  Author:mduigou
  Date:  2012-05-17 10:06 -0700
  URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
 
  6924259: Remove offset and count fields from java.lang.String
  Summary: Removes the use of shared character array buffers by String
  along
  with the two fields needed to support the use of shared buffers.
  Wow, that's quite a change.
  Indeed. It was a long time in development. It is a change which is
  expected to be overall beneficial though and in the general case a positive
  win.

 Wow!

 If the previous behavior of substring() was once a bug, by now it has
 become a well known feature. People know about it, and people depend
 on it.

 This change is a big surprise. Changing O(1) to O(n) is a breach of
 contract. It'll break lots of old code; and meanwhile lots of new code
 are still being written based on the old assumption. After people
 learned about the new behavior, they need to comb through and rewrite
 their code.

 The worst part is the same code performs very differently on different
 versions of JDK. What's a programmer supposed to do if his code
 targets JDK6 and above? If the cost of strings are no longer certain,
 what else can we believe in?

 Is there any chance in hell to roll it back? Maybe add a new method
 for the new behavior?

 Zhong Yu


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-06-14 Thread Jonathan Gibbons
For the record, javadoc uses substring very heavily, and might be 
impacted by this change.


-- Jon

On 06/03/2012 02:35 PM, Mike Duigou wrote:

[I trimmed the distribution list]

On Jun 3 2012, at 13:44 , Peter Levart wrote:


On Thursday, May 31, 2012 03:22:35 AM mike.dui...@oracle.com wrote:

Changeset: 2c773daa825d
Author:mduigou
Date:  2012-05-17 10:06 -0700
URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d

6924259: Remove offset and count fields from java.lang.String
Summary: Removes the use of shared character array buffers by String along
with the two fields needed to support the use of shared buffers.

Wow, that's quite a change.

Indeed. It was a long time in development. It is a change which is expected to 
be overall beneficial though and in the general case a positive win.


So .substring() is not O(1) any more?

No. Though with object allocation it probably was only ever roughly O(1) anyway.


Doesn't this have impact on the performance of parsers and such that rely on
the performance caracteristics of the .substring() ?

It does have an impact. We've seen as much as a couple of percent on some 
benchmarks. Parsers which use substring for extraction are definitely impacted 
by this change.


Have you considered then implementing .subSequence() not in terms of just
delegating to .substring() but returning a special CharSequence view over the
chars of the sub-sequence?

It does look that String.subSequence() returning a special view rather than a 
substring would be a good optimization and probably a very good compromise for 
parser developers. Please create an issue and if you have the time and 
expertise a patch would speed things along (though unfortunately almost 
certainly too late for inclusion in 7u6).

Mike


Regards, Peter


Reviewed-by: alanb, mduigou, forax, briangoetz
Contributed-by: brian.dohe...@oracle.com

! src/share/classes/java/lang/Integer.java
! src/share/classes/java/lang/Long.java
! src/share/classes/java/lang/String.java
! src/share/classes/java/lang/StringCoding.java




Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-06-11 Thread Peter Levart
On Thursday, May 31, 2012 03:22:35 AM mike.dui...@oracle.com wrote:
 Changeset: 2c773daa825d
 Author:mduigou
 Date:  2012-05-17 10:06 -0700
 URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
 
 6924259: Remove offset and count fields from java.lang.String
 Summary: Removes the use of shared character array buffers by String along
 with the two fields needed to support the use of shared buffers.

Wow, that's quite a change.

So .substring() is not O(1) any more?

Doesn't this have impact on the performance of parsers and such that rely on 
the performance caracteristics of the .substring() ?

Have you considered then implementing .subSequence() not in terms of just 
delegating to .substring() but returning a special CharSequence view over the 
chars of the sub-sequence?

Regards, Peter

 Reviewed-by: alanb, mduigou, forax, briangoetz
 Contributed-by: brian.dohe...@oracle.com
 
 ! src/share/classes/java/lang/Integer.java
 ! src/share/classes/java/lang/Long.java
 ! src/share/classes/java/lang/String.java
 ! src/share/classes/java/lang/StringCoding.java


Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-06-04 Thread Rémi Forax

On 06/03/2012 11:35 PM, Mike Duigou wrote:

[I trimmed the distribution list]

On Jun 3 2012, at 13:44 , Peter Levart wrote:


On Thursday, May 31, 2012 03:22:35 AM mike.dui...@oracle.com wrote:

Changeset: 2c773daa825d
Author:mduigou
Date:  2012-05-17 10:06 -0700
URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d

6924259: Remove offset and count fields from java.lang.String
Summary: Removes the use of shared character array buffers by String along
with the two fields needed to support the use of shared buffers.

Wow, that's quite a change.

Indeed. It was a long time in development. It is a change which is expected to 
be overall beneficial though and in the general case a positive win.


So .substring() is not O(1) any more?

No. Though with object allocation it probably was only ever roughly O(1) anyway.


Allocation fast path just bump a pointer, so it's O(1).

There are two advantages of the new code.
The String object and the array of chars are now co-located in memory
(at least for small/medium strings) so cpu caches are happy.

This fix a longstanding memory leak issue
  see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4513622
that why some parsers already doesn't use the substring() trick.
BTW Mike, you can now close this bug.




Doesn't this have impact on the performance of parsers and such that rely on
the performance caracteristics of the .substring() ?

It does have an impact. We've seen as much as a couple of percent on some 
benchmarks. Parsers which use substring for extraction are definitely impacted 
by this change.


Have you considered then implementing .subSequence() not in terms of just
delegating to .substring() but returning a special CharSequence view over the
chars of the sub-sequence?

It does look that String.subSequence() returning a special view rather than a 
substring would be a good optimization and probably a very good compromise for 
parser developers. Please create an issue and if you have the time and 
expertise a patch would speed things along (though unfortunately almost 
certainly too late for inclusion in 7u6).


Given that Integer.parseInt() or Double.parseDouble() takes a String and
not a CharSequence, yes you can create a CharSequence view but the only 
way to use it

is to call toString() on it.



Mike


cheers,
Rémi



Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-06-04 Thread Ulf Zibis

In getBytes() you do not need:
int n = srcEnd;
int i = srcBegin;
You could already use srcEnd srcBegin


@@ -1015,16 +968,16 @@ public final class String
 return true;
 }
 if (anObject instanceof String) {
-String anotherString = (String)anObject;
-int n = count;
-if (n == anotherString.count) {
+String anotherString = (String) anObject;
+int n = value.length;
+if (n == anotherString.value.length) {
 char v1[] = value;
 char v2[] = anotherString.value;
-int i = offset;
-int j = anotherString.offset;
+int i = 0;
 while (n-- != 0) {
-if (v1[i++] != v2[j++])
-return false;
+if (v1[i] != v2[i])
+return false;
+i++;
 }
 return true;
 }

Why not simply? :
 if (anObject instanceof String) {
String anotherString = (String) anObject;
int n = value.length;
if (n == anotherString.value.length) {
 char v1[] = value;
 char v2[] = anotherString.value;
 while (n-- != 0) {
if (v1[n] != v2[n])
return false;
 }
 return true;
 }
This additionally has the advantage, that mostly the difference would be found quicker, as strings 
of same length often would differ at the end e.g.:

VeryLongText_1
VeryLongText_2

Same for other equals and compare methods.

BTW: You again have inserted a space after casts. ;-)

-Ulf


Am 31.05.2012 05:22, schrieb mike.dui...@oracle.com:

Changeset: 2c773daa825d
Author:mduigou
Date:  2012-05-17 10:06 -0700
URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d

6924259: Remove offset and count fields from java.lang.String
Summary: Removes the use of shared character array buffers by String along with 
the two fields needed to support the use of shared buffers.
Reviewed-by: alanb, mduigou, forax, briangoetz
Contributed-by: brian.dohe...@oracle.com

! src/share/classes/java/lang/Integer.java
! src/share/classes/java/lang/Long.java
! src/share/classes/java/lang/String.java
! src/share/classes/java/lang/StringCoding.java




Re: hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-06-03 Thread Mike Duigou
[I trimmed the distribution list]

On Jun 3 2012, at 13:44 , Peter Levart wrote:

 On Thursday, May 31, 2012 03:22:35 AM mike.dui...@oracle.com wrote:
 Changeset: 2c773daa825d
 Author:mduigou
 Date:  2012-05-17 10:06 -0700
 URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d
 
 6924259: Remove offset and count fields from java.lang.String
 Summary: Removes the use of shared character array buffers by String along
 with the two fields needed to support the use of shared buffers.
 
 Wow, that's quite a change.

Indeed. It was a long time in development. It is a change which is expected to 
be overall beneficial though and in the general case a positive win. 

 So .substring() is not O(1) any more?

No. Though with object allocation it probably was only ever roughly O(1) anyway.

 Doesn't this have impact on the performance of parsers and such that rely on 
 the performance caracteristics of the .substring() ?

It does have an impact. We've seen as much as a couple of percent on some 
benchmarks. Parsers which use substring for extraction are definitely impacted 
by this change.

 Have you considered then implementing .subSequence() not in terms of just 
 delegating to .substring() but returning a special CharSequence view over the 
 chars of the sub-sequence?

It does look that String.subSequence() returning a special view rather than a 
substring would be a good optimization and probably a very good compromise for 
parser developers. Please create an issue and if you have the time and 
expertise a patch would speed things along (though unfortunately almost 
certainly too late for inclusion in 7u6).

Mike

 Regards, Peter
 
 Reviewed-by: alanb, mduigou, forax, briangoetz
 Contributed-by: brian.dohe...@oracle.com
 
 ! src/share/classes/java/lang/Integer.java
 ! src/share/classes/java/lang/Long.java
 ! src/share/classes/java/lang/String.java
 ! src/share/classes/java/lang/StringCoding.java



hg: jdk8/tl/jdk: 6924259: Remove offset and count fields from java.lang.String

2012-05-30 Thread mike . duigou
Changeset: 2c773daa825d
Author:mduigou
Date:  2012-05-17 10:06 -0700
URL:   http://hg.openjdk.java.net/jdk8/tl/jdk/rev/2c773daa825d

6924259: Remove offset and count fields from java.lang.String
Summary: Removes the use of shared character array buffers by String along with 
the two fields needed to support the use of shared buffers.
Reviewed-by: alanb, mduigou, forax, briangoetz
Contributed-by: brian.dohe...@oracle.com

! src/share/classes/java/lang/Integer.java
! src/share/classes/java/lang/Long.java
! src/share/classes/java/lang/String.java
! src/share/classes/java/lang/StringCoding.java