Re: RFR: 8364365: HKSCS encoder does not properly set the replacement character

Xueming Shen Tue, 05 Aug 2025 19:20:30 -0700

On Tue, 5 Aug 2025 08:20:55 GMT, Volkan Yazici <[email protected]> wrote:


>> Fix `HKSCS` encoder to correctly set the replacement character, and add 
>> tests to verify the `CodingErrorAction.REPLACE` behavior of all available 
>> encoders.
>
> test/jdk/sun/nio/cs/TestEncoderReplaceUTF16.java line 146:
> 
>> 144:             System.err.println("Character set is known to be absent of 
>> unmappable non-Latin-1 characters!");
>> 145:             return null;
>> 146:         }
> 
> Without this fast-path, this test takes several minutes to complete because 
> `findUnmappableNonLatin1()` takes ~20 seconds for each character set that 
> has no unmappable non-Latin-1 characters.

We definitely want to exclude some charsets here. Yes, all Unicode variants 
probably should be excluded, as they are expected to have a mapping for every 
Unicode character. Additionally, many charsets are stateful, meaning the 
encoder may shift in and out of an internal state depending on the input. See 
https://www.rfc-editor.org/rfc/rfc1468.html for an example. The encoder 
might/should emit the shift-in/shift-out escape sequences around the 
replacement if the replacement character's target sub-charset does not match 
the currently active sub-charset. I would assume this is really out of scope 
for this PR though :-)
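
To make the stateful case concrete, here is a rough sketch of how one could 
inspect what an encoder emits around a replacement. The class and method names 
(`StatefulReplaceSketch`, `encodeWithReplace`, `hex`) are made up for 
illustration and are not part of the PR or its tests. It assumes `ISO-2022-JP` 
is available in the running JDK and that U+1F600 is unmappable in it; whether 
the output contains escape sequences around the replacement bytes depends on 
the encoder implementation, so no expected output is given.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, for illustration only.
final class StatefulReplaceSketch {

    // Encodes the text with REPLACE configured for both malformed and
    // unmappable input, and returns the raw encoded bytes.
    static byte[] encodeWithReplace(Charset cs, String text)
            throws CharacterCodingException {
        CharsetEncoder encoder = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // The convenience encode() resets, encodes, and flushes the encoder.
        ByteBuffer encoded = encoder.encode(CharBuffer.wrap(text));
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        return out;
    }

    // Formats bytes as space-separated hex so escape bytes (0x1B) are visible.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String name = "ISO-2022-JP"; // assumption: present in this JDK
        if (!Charset.isSupported(name)) {
            System.err.println(name + " is not supported by this runtime");
            return;
        }
        Charset cs = Charset.forName(name);
        // U+3042 (HIRAGANA LETTER A) is in JIS X 0208; U+1F600 is assumed
        // to be unmappable, so it should trigger the replacement.
        String text = "\u3042\uD83D\uDE00\u3042";
        System.out.println("replacement: "
                + hex(cs.newEncoder().replacement()));
        System.out.println("encoded:     "
                + hex(encodeWithReplace(cs, text)));
    }
}
```

Looking for `1B` bytes next to the replacement bytes in the output is one way 
to see whether the encoder switched sub-charsets before writing the 
replacement.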

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26635#discussion_r2255705386
