On Sun, 20 Aug 2023 17:53:52 GMT, Ichiroh Takiguchi <[email protected]> 
wrote:

>> The "character set of font" (font charset) table was created from the "Rich Text 
>> Format Specification 1.9.1":
>> https://interoperability.blob.core.windows.net/files/Archive_References/[MSFT-RTF].pdf
>> It refers to wingdi.h:
>> https://learn.microsoft.com/en-us/windows/win32/api/wingdi/ns-wingdi-textmetrica
>> 
>> Test files and a test case are attached to bug 
>> [JDK-6928542](https://bugs.openjdk.org/browse/JDK-6928542)
>> 
>> Additional change:
>> The special character `\line` should map to `\n`.
>> 
>> Additional information:
>> 
>> Add 2 hash tables:
>> - fcharsetToCP: Predefined conversion table from the `fcharset` control word 
>> (with its number) to a Java charset name; for example, `fcharset0` refers to 
>> the `windows-1252` Java charset name.
>> - fcharsetTable: Conversion table built for each RTF file, from integer font 
>> numbers (the `f` control word with its number) to font Charsets. In the case of 
>> `{\f0\fnil\fcharset0 Segoe UI;}`, `0` refers to the Java Charset `windows-1252`.
>> 
>> When an RTF character set control word (like `\mac`) is used, an unmappable 
>> character returns \u0000 and is not written into the RTF text.
>> When an fcharset control word is used, an unmappable character returns \uFFFD 
>> (the same as the decoder's replacement character); \u0000 is used for DBCS 
>> lead byte detection.
>> If an `f` or `par` control word appears while a lead byte remains in the 
>> decoder's byte buffer, that byte is treated as an invalid character and 
>> \uFFFD is written into the RTF text.
>> 
>> If the `f` control word is used without `fcharset`, the `translationTable` 
>> char array is used.
>> If the `f` control word is used with `fcharset`, the predefined Java Charset 
>> name is used (if it is missing, ISO8859_1 is used as a fallback).
>> 
>> **Note:** The following GitHub action failed:
>> linux-cross-compile / build (riscv64). I opened the following JBS issue:
>>> [JDK-8314624](https://bugs.openjdk.org/browse/JDK-8314624) GHA: RISC-V 
>>> cross-build was failed
>
> Ichiroh Takiguchi has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   6928542: Chinese characters in RTF are not decoded

Hello @prrace.
I'm very sorry for the confusion, and I appreciate your comments.
I added comments to the source files, added or updated some lines, and updated 
the description text.
Please let me know if that is not enough.
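
To illustrate the two tables and the fallback rule from the description, here is a minimal sketch. It is not the code in the PR: the class `FcharsetSketch`, the methods `resolve`, `defineFont`, and `decode`, and the `fcharset134` entry are hypothetical names and values chosen for illustration. Only the `fcharset0` to `windows-1252` mapping, the ISO8859_1 fallback, and the \uFFFD replacement come from the description.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names are hypothetical and do not mirror the PR's code.
final class FcharsetSketch {
    // Stand-in for fcharsetToCP: `fcharsetN` control word -> Java charset name.
    static final Map<String, String> FCHARSET_TO_CP = new HashMap<>();
    static {
        FCHARSET_TO_CP.put("fcharset0", "windows-1252"); // taken from the description
        FCHARSET_TO_CP.put("fcharset134", "GBK");        // assumed entry, for illustration
    }

    // Stand-in for fcharsetTable: font number from `\fN` -> Charset, filled per RTF file.
    final Map<Integer, Charset> fcharsetTable = new HashMap<>();

    // Look up the charset for a control word; fall back to ISO-8859-1
    // when the word is unknown or the charset is unavailable in this JVM.
    static Charset resolve(String controlWord) {
        String name = FCHARSET_TO_CP.get(controlWord);
        if (name != null && Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Record a font table entry such as {\f0\fnil\fcharset0 Segoe UI;}.
    void defineFont(int fontNumber, String fcharsetControlWord) {
        fcharsetTable.put(fontNumber, resolve(fcharsetControlWord));
    }

    // Decode bytes, substituting the decoder's replacement (U+FFFD by default)
    // for malformed or unmappable input instead of failing.
    static String decode(Charset cs, byte[] bytes) {
        try {
            return cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but the checked exception must be handled.
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, calling `defineFont(0, "fcharset0")` stores `windows-1252` for font `0`, and an unknown control word resolves to ISO-8859-1. The sketch does not cover the DBCS lead byte handling or the `translationTable` path.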

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13553#issuecomment-1685636707
