Re: RFR(2): 7174936: several String methods claim to always create new String
On 21/11/2013 04:04, Stuart Marks wrote: OK, I'll remove the String.subSequence change from this changeset and push it when I get approval. I've filed this bug: https://bugs.openjdk.java.net/browse/JDK-8028757 to cover the CharSequence.subSequence issue and potential spec change. It probably isn't that much work to do, but it probably is easier to handle it separately. Thanks, I think this is the best approach given that the subSequence issue may require clearly documenting the deviation from how it is specified in CharSequence. -Alan
Re: RFR(2): 7174936: several String methods claim to always create new String
On 11/15/13 8:36 AM, Alan Bateman wrote: On 14/11/2013 23:56, Stuart Marks wrote: On 11/14/13 2:04 AM, David Holmes wrote: Sorry for the delay (been enroute). Only issue I have remains the subSequence change - you can't weaken the post-condition of CharSequence.subSequence without breaking subtype substitutability. Yes, in general, what you say about weakening post-conditions is true. In this case, though, I can't see how it can cause any practical harm. I've looked through the webrev and read the exchange between you and David. I think it might be better to leave the subSequence change out for now (the @apiNote is fine) and submit another bug to track the discrepancy between the spec and implementation. From what I can tell, this has existed since CharSequence was added in 1.4, in which case there may be a good argument to just document the potential deviation. OK, I'll remove the String.subSequence change from this changeset and push it when I get approval. I've filed this bug: https://bugs.openjdk.java.net/browse/JDK-8028757 to cover the CharSequence.subSequence issue and potential spec change. It probably isn't that much work to do, but it probably is easier to handle it separately. s'marks
Re: RFR(2): 7174936: several String methods claim to always create new String
On 14/11/2013 23:56, Stuart Marks wrote: On 11/14/13 2:04 AM, David Holmes wrote: Sorry for the delay (been enroute). Only issue I have remains the subSequence change - you can't weaken the post-condition of CharSequence.subSequence without breaking subtype substitutability. Hi David, Yes, in general, what you say about weakening post-conditions is true. In this case, though, I can't see how it can cause any practical harm. I've looked through the webrev and read the exchange between you and David. I think it might be better to leave the subSequence change out for now (the @apiNote is fine) and submit another bug to track the discrepancy between the spec and implementation. From what I can tell, this has existed since CharSequence was added in 1.4, in which case there may be a good argument to just document the potential deviation. -Alan
Re: RFR(2): 7174936: several String methods claim to always create new String
Hi Stuart, On 15/11/2013 9:56 AM, Stuart Marks wrote: On 11/14/13 2:04 AM, David Holmes wrote: Sorry for the delay (been enroute). Only issue I have remains the subSequence change - you can't weaken the post-condition of CharSequence.subSequence without breaking subtype substitutability. Hi David, Yes, in general, what you say about weakening post-conditions is true. In this case, though, I can't see how it can cause any practical harm. I think the CharSequence spec for subSequence() is written with the assumption that the CharSequence is mutable. If I'm not mistaken, all the CharSequence implementations except for String are mutable. Thus, in the statement Returns a new CharSequence that is a subsequence of this sequence. the "new" really means that if the value of this sequence is later mutated, it will not affect the value of the returned sequence. Well, strictly speaking, it's ambiguous; the "new" could refer to either the sequences' values, the reference identity of this vs the returned sequence, or both. I'd say that for the mutable CharSequences, keeping the values independent is the important part, and having independent values pretty much implies that the returned sequence is a distinct object. Since Strings are immutable, there is no concern about mutation of the value of the original String affecting the value of the returned String. That leaves reference identity. Obviously identity of mutable objects is really important. For immutable objects, identity is much less important, and so I'm trying to relax the spec to allow more discretion to the implementation about when to create new objects. I guess our choices are as follows: 0) Current webrev: remove "new" from String's spec for subSequence(). Allows 'this' to be returned in certain cases, but weakens the post-condition of its definining interface, which is usually a no-no. I think this is "wrong" but as you note there is precedent for "adjusting" things this way. 1) Leave String.subSequence() as it stands, requiring a new CharSequence. This fixes the relationship with the CS interface, but the current implementation violates this spec, as it returns 'this' in some cases (where the subSequence represents the entire String). Also wrong as you note. 2) Leave String.subSequence() as it stands, requiring a new CharSequence, and change the implementation so that it always creates a new String, even if the subSequence represents the entire original String. This is correct and less work than making spec changes, but doesn't achieve the objective of the overall change. 3) Change the CharSequence.subSequence() spec to be more precise about what it requires (mutation of one doesn't affect the other), allowing String.subSequence() to return 'this' when it can. This is right and probably not too much more additional work :) But let someone else weigh in on this. Thanks, David Anything else? I don't like (1) as it leaves our implementation in violation of a close reading of the spec. I don't like (2) as it de-optimizes the current implementation. I could live with (3) but it's more work, as new wording will have to be generated, and its impact on the other CS implementations will need to be assessed. It would probably result in the "best" (as in, most precise and self-consistent) specification, but I have to question whether it's worth the effort. I still prefer (0), though I acknowledge it does have this issue of weakening the interface's post-condition. I note that there are other places where an implementation doesn't strictly conform to its interface, such as IdentityHashMap and Map. Another possibility, I guess, would be to do something like adding an @apiNote to String.subSequence() explaining why it's different from CS.subSequence(). s'marks Thanks, David On 12/11/2013 8:43 AM, Stuart Marks wrote: Hi all, Here's an updated version of the String spec change. Changes from the previous version address comments made by Brent Christian and David Holmes. Specifically: - Specify copyValueOf() overloads as "equivalent to" corresponding valueOf() overloads. - Remove extranous changes to subSequence() method - Clarify wording of concat() method doc. Bug report: https://bugs.openjdk.java.net/browse/JDK-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html Thanks! s'marks On 11/7/13 2:31 AM, Stuart Marks wrote: Hi all, Please review the following spec changes to java.lang.String. In several places the specs mention returning "new" strings. This is overspecified; it's sufficient to return "a" string that satisfies the required properties. In some cases the current implementation doesn't create a new string -- for example, substring(0) returns 'this' -- which is strictly a violation of the spec. The solution is to relax the spec requirement to create new str
Re: RFR(2): 7174936: several String methods claim to always create new String
On 11/14/13 2:04 AM, David Holmes wrote: Sorry for the delay (been enroute). Only issue I have remains the subSequence change - you can't weaken the post-condition of CharSequence.subSequence without breaking subtype substitutability. Hi David, Yes, in general, what you say about weakening post-conditions is true. In this case, though, I can't see how it can cause any practical harm. I think the CharSequence spec for subSequence() is written with the assumption that the CharSequence is mutable. If I'm not mistaken, all the CharSequence implementations except for String are mutable. Thus, in the statement Returns a new CharSequence that is a subsequence of this sequence. the "new" really means that if the value of this sequence is later mutated, it will not affect the value of the returned sequence. Well, strictly speaking, it's ambiguous; the "new" could refer to either the sequences' values, the reference identity of this vs the returned sequence, or both. I'd say that for the mutable CharSequences, keeping the values independent is the important part, and having independent values pretty much implies that the returned sequence is a distinct object. Since Strings are immutable, there is no concern about mutation of the value of the original String affecting the value of the returned String. That leaves reference identity. Obviously identity of mutable objects is really important. For immutable objects, identity is much less important, and so I'm trying to relax the spec to allow more discretion to the implementation about when to create new objects. I guess our choices are as follows: 0) Current webrev: remove "new" from String's spec for subSequence(). Allows 'this' to be returned in certain cases, but weakens the post-condition of its definining interface, which is usually a no-no. 1) Leave String.subSequence() as it stands, requiring a new CharSequence. This fixes the relationship with the CS interface, but the current implementation violates this spec, as it returns 'this' in some cases (where the subSequence represents the entire String). 2) Leave String.subSequence() as it stands, requiring a new CharSequence, and change the implementation so that it always creates a new String, even if the subSequence represents the entire original String. 3) Change the CharSequence.subSequence() spec to be more precise about what it requires (mutation of one doesn't affect the other), allowing String.subSequence() to return 'this' when it can. Anything else? I don't like (1) as it leaves our implementation in violation of a close reading of the spec. I don't like (2) as it de-optimizes the current implementation. I could live with (3) but it's more work, as new wording will have to be generated, and its impact on the other CS implementations will need to be assessed. It would probably result in the "best" (as in, most precise and self-consistent) specification, but I have to question whether it's worth the effort. I still prefer (0), though I acknowledge it does have this issue of weakening the interface's post-condition. I note that there are other places where an implementation doesn't strictly conform to its interface, such as IdentityHashMap and Map. Another possibility, I guess, would be to do something like adding an @apiNote to String.subSequence() explaining why it's different from CS.subSequence(). s'marks Thanks, David On 12/11/2013 8:43 AM, Stuart Marks wrote: Hi all, Here's an updated version of the String spec change. Changes from the previous version address comments made by Brent Christian and David Holmes. Specifically: - Specify copyValueOf() overloads as "equivalent to" corresponding valueOf() overloads. - Remove extranous changes to subSequence() method - Clarify wording of concat() method doc. Bug report: https://bugs.openjdk.java.net/browse/JDK-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html Thanks! s'marks On 11/7/13 2:31 AM, Stuart Marks wrote: Hi all, Please review the following spec changes to java.lang.String. In several places the specs mention returning "new" strings. This is overspecified; it's sufficient to return "a" string that satisfies the required properties. In some cases the current implementation doesn't create a new string -- for example, substring(0) returns 'this' -- which is strictly a violation of the spec. The solution is to relax the spec requirement to create new strings. Also, this change cleans up the specs for the copyValueOf() overloads to make them identical to the corresponding valueOf() overloads. Previously, they were gratuitously different. I think that some time in the dim, distant past, probably before JDK 1.0, they had different semantics, but now they're identical. The specs should say they're identical. This change is spec
Re: RFR(2): 7174936: several String methods claim to always create new String
Hi Stuart, Sorry for the delay (been enroute). Only issue I have remains the subSequence change - you can't weaken the post-condition of CharSequence.subSequence without breaking subtype substitutability. Thanks, David On 12/11/2013 8:43 AM, Stuart Marks wrote: Hi all, Here's an updated version of the String spec change. Changes from the previous version address comments made by Brent Christian and David Holmes. Specifically: - Specify copyValueOf() overloads as "equivalent to" corresponding valueOf() overloads. - Remove extranous changes to subSequence() method - Clarify wording of concat() method doc. Bug report: https://bugs.openjdk.java.net/browse/JDK-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html Thanks! s'marks On 11/7/13 2:31 AM, Stuart Marks wrote: Hi all, Please review the following spec changes to java.lang.String. In several places the specs mention returning "new" strings. This is overspecified; it's sufficient to return "a" string that satisfies the required properties. In some cases the current implementation doesn't create a new string -- for example, substring(0) returns 'this' -- which is strictly a violation of the spec. The solution is to relax the spec requirement to create new strings. Also, this change cleans up the specs for the copyValueOf() overloads to make them identical to the corresponding valueOf() overloads. Previously, they were gratuitously different. I think that some time in the dim, distant past, probably before JDK 1.0, they had different semantics, but now they're identical. The specs should say they're identical. This change is spec only, no code changes. Bug report: https://bugs.openjdk.java.net/browse/jdk-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.0/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.0/overview-summary.html Thanks! s'marks
Re: RFR(2): 7174936: several String methods claim to always create new String
The wording changes seem fine to me. Thanks for the specdiff as it made it much easier to review On Nov 12, 2013, at 11:43 AM, Stuart Marks wrote: > Hi all, > > Here's an updated version of the String spec change. Changes from the > previous version address comments made by Brent Christian and David Holmes. > Specifically: > > - Specify copyValueOf() overloads as "equivalent to" corresponding valueOf() > overloads. > - Remove extranous changes to subSequence() method > - Clarify wording of concat() method doc. > > Bug report: > > https://bugs.openjdk.java.net/browse/JDK-7174936 > > Webrev: > > http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/ > > Specdiff: > > > http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html > > Thanks! > > s'marks > > On 11/7/13 2:31 AM, Stuart Marks wrote: >> Hi all, >> >> Please review the following spec changes to java.lang.String. >> >> In several places the specs mention returning "new" strings. This is >> overspecified; it's sufficient to return "a" string that satisfies the >> required >> properties. In some cases the current implementation doesn't create a new >> string >> -- for example, substring(0) returns 'this' -- which is strictly a violation >> of >> the spec. The solution is to relax the spec requirement to create new >> strings. >> >> Also, this change cleans up the specs for the copyValueOf() overloads to make >> them identical to the corresponding valueOf() overloads. Previously, they >> were >> gratuitously different. I think that some time in the dim, distant past, >> probably before JDK 1.0, they had different semantics, but now they're >> identical. The specs should say they're identical. >> >> This change is spec only, no code changes. >> >> Bug report: >> >> https://bugs.openjdk.java.net/browse/jdk-7174936 >> >> Webrev: >> >> http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.0/ >> >> Specdiff: >> >> >> http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.0/overview-summary.html >> >> Thanks! >> >> s'marks Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037 Oracle Java Engineering 1 Network Drive Burlington, MA 01803 lance.ander...@oracle.com
RFR(2): 7174936: several String methods claim to always create new String
Hi all, Here's an updated version of the String spec change. Changes from the previous version address comments made by Brent Christian and David Holmes. Specifically: - Specify copyValueOf() overloads as "equivalent to" corresponding valueOf() overloads. - Remove extranous changes to subSequence() method - Clarify wording of concat() method doc. Bug report: https://bugs.openjdk.java.net/browse/JDK-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.1/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.1/overview-summary.html Thanks! s'marks On 11/7/13 2:31 AM, Stuart Marks wrote: Hi all, Please review the following spec changes to java.lang.String. In several places the specs mention returning "new" strings. This is overspecified; it's sufficient to return "a" string that satisfies the required properties. In some cases the current implementation doesn't create a new string -- for example, substring(0) returns 'this' -- which is strictly a violation of the spec. The solution is to relax the spec requirement to create new strings. Also, this change cleans up the specs for the copyValueOf() overloads to make them identical to the corresponding valueOf() overloads. Previously, they were gratuitously different. I think that some time in the dim, distant past, probably before JDK 1.0, they had different semantics, but now they're identical. The specs should say they're identical. This change is spec only, no code changes. Bug report: https://bugs.openjdk.java.net/browse/jdk-7174936 Webrev: http://cr.openjdk.java.net/~smarks/reviews/7174936/webrev.0/ Specdiff: http://cr.openjdk.java.net/~smarks/reviews/7174936/specdiff.0/overview-summary.html Thanks! s'marks