https://bugs.freedesktop.org/show_bug.cgi?id=48446
Bug #: 48446
Summary: RTF Importer does not honor ansicpgN and cpgN control words -> fails to import some non-English documents properly
Classification: Unclassified
Product: LibreOffice
Version: LibO 3.5.2 Release
Platform: Other
OS/Version: Windows (All)
Status: UNCONFIRMED
Severity: normal
Priority: medium
Component: Writer
AssignedTo: libreoffice-bugs@lists.freedesktop.org
ReportedBy: mikekagan...@hotmail.com

Created attachment 59657
  --> https://bugs.freedesktop.org/attachment.cgi?id=59657
Test file showing this behaviour

When an RTF document contains an \ansicpgN control word in the header just after the \ansi control word, a reader should use this code page for ANSI-to-Unicode conversion wherever no other code page is specified for a text run and Unicode RTF isn't used [1]. When a font definition contains an \fcharsetN control word, it overrides the top-level setting, and when there is a \cpgN, it overrides both the top-level setting and \fcharsetN [2].

Currently, when opening an RTF file that contains no codepage/charset data, LO defaults to Latin-1 (see Bug 48023). If such a document contains \ansicpgN, or its fonts have \cpgN, LO ignores this information and still uses Latin-1. Only \fcharsetN is taken into account.

The attachment is the test document from Bug 48023 with the missing language information added manually. It now has \ansicpg1251 in the header, \fcharset204 in one font, and \cpg1251 in another. Only the text using the first font (the one with \fcharset204) is displayed properly.

As for documents that contain no language information at all (and a great number of such documents, generated by various non-MS software, are out there), I believe LO should use the user's language and provide a means of specifying another one on opening, such as a checkbox in the Open dialog labelled "Specify missing charset" that does something similar to the Text Encoded filter.

--
1. Word 2007: Rich Text Format (RTF) Specification, version 1.9.1 (http://www.microsoft.com/download/en/details.aspx?id=10725), page 12: Character Set
2. Ibid., pages 17-20.
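
To illustrate the precedence described in this report (\cpgN over \fcharsetN over \ansicpgN), here is a minimal Python sketch. It is not LibreOffice code: the names resolve_codepage and decode_run are hypothetical, the \fcharsetN table is deliberately partial, and the fallback to code page 1252 is an assumption standing in for whatever default an importer chooses.

```python
# Hypothetical illustration only, not the LibreOffice importer.

# Partial mapping from fcharsetN values to Windows code pages.
FCHARSET_TO_CODEPAGE = {
    0: 1252,    # ANSI
    161: 1253,  # Greek
    162: 1254,  # Turkish
    204: 1251,  # Russian
    238: 1250,  # Eastern European
}


def resolve_codepage(ansicpg=None, fcharset=None, cpg=None, default=1252):
    """Pick the code page for a text run from the three possible sources.

    default=1252 is an assumed fallback for documents with no charset data.
    """
    # A per-font cpgN overrides everything else.
    if cpg is not None:
        return cpg
    # A per-font fcharsetN overrides the document-level setting,
    # provided this sketch knows the matching code page.
    if fcharset in FCHARSET_TO_CODEPAGE:
        return FCHARSET_TO_CODEPAGE[fcharset]
    # Otherwise use the document-level ansicpgN from the header.
    if ansicpg is not None:
        return ansicpg
    return default


def decode_run(raw, **sources):
    """Decode the raw bytes of a non-Unicode text run to str."""
    return raw.decode("cp%d" % resolve_codepage(**sources))
```

For the attached test file, all three cases (header \ansicpg1251 only, a font with \fcharset204, a font with \cpg1251) would resolve to code page 1251 under this sketch, for example decode_run(raw_bytes, ansicpg=1251).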