JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-09 Thread Brian Burkhalter
Hello, Issue: https://bugs.openjdk.java.net/browse/JDK-8039474 Patch: http://cr.openjdk.java.net/~bpb/8039474/webrev.00/ The change is to specify the charset for the String where none had been specified before. The extant test sun/misc/Encode/GetBytes.java appears to suffice for this. Thanks, Brian

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Chris Hegarty
On 10 Apr 2014, at 02:19, Brian Burkhalter wrote: > Hello, > > Issue: https://bugs.openjdk.java.net/browse/JDK-8039474 > Patch: http://cr.openjdk.java.net/~bpb/8039474/webrev.00/ The change looks fine to me, Brian. Trivially, you could (but do not have to) use java.nio.charset.

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Ulf Zibis
Hi Chris, On 10.04.2014 11:04, Chris Hegarty wrote: Trivially, you could (but do not have to) use java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of String to Charset lookup. In earlier tests Sherman and I have found out that the cost of initialization of a new charsets

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Chris Hegarty
On 10 Apr 2014, at 11:03, Ulf Zibis wrote: > Hi Chris, > > Am 10.04.2014 11:04, schrieb Chris Hegarty: >> Trivially, you could ( but of not have to ) use >> java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of String to >> CharSet lookup. > > In earlier tests Sherman and I have f

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Ulf Zibis
Correction ... On 10.04.2014 12:03, Ulf Zibis wrote: Hi Chris, On 10.04.2014 11:04, Chris Hegarty wrote: Trivially, you could (but do not have to) use java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of String to Charset lookup. In earlier tests Sherman and I have found o

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
On Apr 10, 2014, at 3:27 AM, Ulf Zibis wrote: > Correction ... > > Am 10.04.2014 12:03, schrieb Ulf Zibis: >> Hi Chris, >> >> Am 10.04.2014 11:04, schrieb Chris Hegarty: >>> Trivially, you could ( but of not have to ) use >>> java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of S

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Chris Hegarty
On 10 Apr 2014, at 15:57, Brian Burkhalter wrote: > > On Apr 10, 2014, at 3:27 AM, Ulf Zibis wrote: > >> Correction ... >> >> Am 10.04.2014 12:03, schrieb Ulf Zibis: >>> Hi Chris, >>> >>> Am 10.04.2014 11:04, schrieb Chris Hegarty: Trivially, you could ( but of not have to ) use j

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Xueming Shen
Looks fine. Personally I would prefer the "canonicalized/real" name "ISO-8859-1" though. -Sherman On 4/10/14 7:57 AM, Brian Burkhalter wrote: On Apr 10, 2014, at 3:27 AM, Ulf Zibis wrote: Correction ... Am 10.04.2014 12:03, schrieb Ulf Zibis: Hi Chris, Am 10.04.2014 11:04, schrieb Chris
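
The two suggestions above (the StandardCharsets constant and the canonical name) can be compared in a small sketch. This is only an illustration, not code from the webrev: the class name CanonicalNameSketch and the helper lookup are hypothetical, and "ISO8859_1" is used here as one of the aliases that resolves to the canonical name "ISO-8859-1".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not part of the JDK-8039474 patch.
final class CanonicalNameSketch {

    // Resolve a charset by name. Aliases are accepted and names are not case-sensitive.
    static Charset lookup(String name) {
        return Charset.forName(name);
    }

    public static void main(String[] args) throws Exception {
        // An alias resolves to the same charset as the predefined constant.
        Charset byAlias = lookup("ISO8859_1");
        System.out.println(byAlias.name());                              // ISO-8859-1
        System.out.println(byAlias.equals(StandardCharsets.ISO_8859_1)); // true

        // getBytes(String) looks the charset up by name and declares
        // UnsupportedEncodingException; getBytes(Charset) needs no lookup.
        byte[] viaName = "caf\u00e9".getBytes("ISO-8859-1");
        byte[] viaConstant = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(viaName, viaConstant));         // true
    }
}
```

Both forms produce the same bytes; the difference discussed in the thread is only in how the charset is located.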

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mike Duigou
On Apr 10 2014, at 03:21 , Chris Hegarty wrote: > On 10 Apr 2014, at 11:03, Ulf Zibis wrote: > >> Hi Chris, >> >> Am 10.04.2014 11:04, schrieb Chris Hegarty: >>> Trivially, you could ( but of not have to ) use >>> java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of String to >

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mike Duigou
Shouldn't we be using the platform default character set rather than iso8859-1? This change will change the charset used for all platforms not using iso8859-1 as their default. It is certainly odd that sun.misc.CharacterEncoder(byte) and sun.misc.CharacterDecoder(String) are not symmetrical but

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
How can one keep it symmetrical without forcing a particular encoding? Brian On Apr 10, 2014, at 10:54 AM, Mike Duigou wrote: > Shouldn't we be using the platform default character set rather than > iso8859-1? > > This change will change the charset used for all platforms not using > iso8859

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mike Duigou
It won't be symmetrical unless the default charset is ISO8859-1. We can't change sun.misc.CharacterEncoder(byte) to use the default charset because it has the longstanding behaviour of encoding to ISO8859-1 and I would argue we can't change sun.misc.CharacterDecoder(String) from using the defaul
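
As a rough sketch of what "symmetrical" means here, the following checks whether turning bytes into a String and back with the same charset recovers the original bytes. It does not use the sun.misc classes; RoundTripSketch, roundTrips and allByteValues are hypothetical names chosen for the illustration, and UTF-8 merely stands in for a default charset that is not ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not the sun.misc.CharacterEncoder/CharacterDecoder code.
final class RoundTripSketch {

    // True if bytes -> String -> bytes with the given charset returns the input unchanged.
    static boolean roundTrips(byte[] data, Charset cs) {
        String s = new String(data, cs);
        return Arrays.equals(data, s.getBytes(cs));
    }

    // Every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        // ISO-8859-1 maps each byte to exactly one char and back.
        System.out.println(roundTrips(all, StandardCharsets.ISO_8859_1)); // true
        // In UTF-8, lone bytes >= 0x80 are malformed and get replaced on decoding.
        System.out.println(roundTrips(all, StandardCharsets.UTF_8));      // false
    }
}
```

The sketch only shows why a fixed single-byte charset on both ends is lossless; whether a given platform default behaves the same way depends on that default.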

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
That is to say either explicitly or implicitly, i.e., using the default on both ends? On Apr 10, 2014, at 10:59 AM, Brian Burkhalter wrote: > How can one keep it symmetrical without forcing a particular encoding? > > Brian > > On Apr 10, 2014, at 10:54 AM, Mike Duigou wrote: > >> Shouldn't

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Chris Hegarty
> On 10 Apr 2014, at 18:40, Mike Duigou wrote: > > >> On Apr 10 2014, at 03:21 , Chris Hegarty wrote: >> >>> On 10 Apr 2014, at 11:03, Ulf Zibis wrote: >>> >>> Hi Chris, >>> >>> Am 10.04.2014 11:04, schrieb Chris Hegarty: Trivially, you could ( but of not have to ) use java.nio.

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
=> Resolved: Won’t Fix. On Apr 10, 2014, at 11:05 AM, Mike Duigou wrote: > Strange, wrongheaded and nonsensical behaviour, but longstanding.

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Xueming Shen
This fix is to "un-do" a previous changeset (8036848), which replaced the use of the deprecated String.getBytes(int,int,byte[],int) method with String.getBytes() (which uses the platform default charset) and therefore caused a behavioral change. This one is to undo that change to go back to
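
The behavioral difference described above can be sketched as follows. This is an illustration only, not the changeset itself: GetBytesBehaviourSketch and lowOrderBytes are hypothetical names, and UTF-8 is used explicitly as a stand-in for a platform default charset that differs from ISO-8859-1, since the real default varies by platform.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not the code from 8036848 or 8039474.
final class GetBytesBehaviourSketch {

    // The deprecated method copies the low-order 8 bits of each char.
    @SuppressWarnings("deprecation")
    static byte[] lowOrderBytes(String s) {
        byte[] dst = new byte[s.length()];
        s.getBytes(0, s.length(), dst, 0);
        return dst;
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // last char is U+00E9

        byte[] legacy = lowOrderBytes(s);
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        // For chars up to U+00FF the deprecated method matches ISO-8859-1.
        System.out.println(Arrays.equals(legacy, latin1)); // true
        // A different charset yields different bytes for the same String.
        System.out.println(latin1.length);                 // 4
        System.out.println(utf8.length);                   // 5
    }
}
```

An argument-less getBytes() call would behave like whichever of these the platform default happens to be, which is the compatibility concern raised in the message.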

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Xueming Shen
On 04/10/2014 11:08 AM, Chris Hegarty wrote: On 10 Apr 2014, at 18:40, Mike Duigou wrote: On Apr 10 2014, at 03:21 , Chris Hegarty wrote: On 10 Apr 2014, at 11:03, Ulf Zibis wrote: Hi Chris, Am 10.04.2014 11:04, schrieb Chris Hegarty: Trivially, you could ( but of not have to ) use ja

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
Here’s an updated version with the encoder also modified for symmetry. Brian On Apr 10, 2014, at 11:23 AM, Xueming Shen wrote: > String version has the cache mechanism of charset -> CharsetDe/Encoder, so if > cache hits, you don't need to have String->Charset lookup. > > We don't cache the "ex

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Brian Burkhalter
Would have been nice had I included the link: http://cr.openjdk.java.net/~bpb/8039474/webrev.01/ Brian On Apr 10, 2014, at 11:32 AM, Brian Burkhalter wrote: > Here’s an updated version with the encoder also modified for symmetry.

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mike Duigou
On Apr 10 2014, at 11:08 , Chris Hegarty wrote: > >> On 10 Apr 2014, at 18:40, Mike Duigou wrote: >> >> >>> On Apr 10 2014, at 03:21 , Chris Hegarty wrote: >>> On 10 Apr 2014, at 11:03, Ulf Zibis wrote: Hi Chris, Am 10.04.2014 11:04, schrieb Chris Hegarty: >

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Xueming Shen
On 04/10/2014 11:38 AM, Mike Duigou wrote: On Apr 10 2014, at 11:08 , Chris Hegarty wrote: On 10 Apr 2014, at 18:40, Mike Duigou wrote: On Apr 10 2014, at 03:21 , Chris Hegarty wrote: On 10 Apr 2014, at 11:03, Ulf Zibis wrote: Hi Chris, Am 10.04.2014 11:04, schrieb Chris Hegarty: Tr

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Chris Hegarty
On 10 Apr 2014, at 19:50, Xueming Shen wrote: > On 04/10/2014 11:38 AM, Mike Duigou wrote: >> On Apr 10 2014, at 11:08 , Chris Hegarty wrote: >> On 10 Apr 2014, at 18:40, Mike Duigou wrote: > On Apr 10 2014, at 03:21 , Chris Hegarty wrote: > >> On 10 Apr 2014, at

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mandy Chung
On 4/10/14 11:05 AM, Mike Duigou wrote: "Isn't all this sun.misc stuff going to go away soon anyway?" <-- wishful thinking We use them in our implementation and they can't go away, but at least access will be denied with module boundary enforcement. Mandy

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Xueming Shen
On 04/10/2014 12:03 PM, Chris Hegarty wrote: On 10 Apr 2014, at 19:50, Xueming Shen wrote: On 04/10/2014 11:38 AM, Mike Duigou wrote: On Apr 10 2014, at 11:08 , Chris Hegarty wrote: On 10 Apr 2014, at 18:40, Mike Duigou wrote: On Apr 10 2014, at 03:21 , Chris Hegarty wrote: On 10

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Mandy Chung
On 4/10/14 11:34 AM, Brian Burkhalter wrote: Would have been nice had I included the link: http://cr.openjdk.java.net/~bpb/8039474/webrev.01/ Brian - thanks for getting this fixed. Looks good to me. I reviewed the fix for JDK-8036848 and missed the subtle compatibility issue (thanks to Sher

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Ulf Zibis
On 10.04.2014 17:20, Xueming Shen wrote: Looks fine. Personally I would prefer the "canonicalized/real" name "ISO-8859-1" though. Yep, using the canonical name guarantees best performance for the charset lookup. BTW, where are these links gone: Bug 100092 -- Speed-up FastCharsetProvider

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Tim Bell
On 04/10/14 19:26, Ulf Zibis wrote: BTW, where are these links gone: This part of the question I can handle. The six digit Bug numbers came from the legacy OpenJDK bugzilla instance. Before it was shut down, those bug reports were transferred to JBS. In the process, they were assigned new JD

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-10 Thread Ulf Zibis
On 10.04.2014 21:53, Tim Bell wrote: On 04/10/14 19:26, Ulf Zibis wrote: BTW, where are these links gone: This part of the question I can handle. The six digit Bug numbers came from the legacy OpenJDK bugzilla instance. Before it was shut down, those bug reports were transferred to JBS. I

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-15 Thread Ulf Zibis
On 10.04.2014 21:53, Tim Bell wrote: On 04/10/14 19:26, Ulf Zibis wrote: BTW, where are these links gone: This part of the question I can handle. The six digit Bug numbers came from the legacy OpenJDK bugzilla instance. Before it was shut down, those bug reports were transferred to JBS. In

Re: JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

2014-04-15 Thread Tim Bell
On 04/15/14 16:47, Ulf Zibis wrote: But where are the original attachments e.g. webrevs, patches ? Are they lost forever ? No, they are there on the new JBS bug reports. For some reason they are not visible to users outside Oracle. I will see if that can be changed. Regards- Tim