[ https://issues.apache.org/jira/browse/TIKA-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch updated TIKA-683:
----------------------------

    Attachment: testRTFJapanese.rtf

Add test file. Based on Jp_euc-jp_rtf1.rtf from http://mail-archives.apache.org/mod_mbox/tika-user/201106.mbox/%3cof03cf5cf6.40c9789f-onc22578bc.0035a24f-c22578bc.0036c...@il.ibm.com%3E but with images removed to keep the size sane

> RTF Parser issues with non european characters
> ----------------------------------------------
>
>                 Key: TIKA-683
>                 URL: https://issues.apache.org/jira/browse/TIKA-683
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Nick Burch
>         Attachments: testRTFJapanese.rtf
>
>
> As reported on user@ in "non-West European languages support":
>
> http://mail-archives.apache.org/mod_mbox/tika-user/201107.mbox/%3cof0c0a3275.da7810e9-onc22578cc.0051eede-c22578cc.00525...@il.ibm.com%3E
> The RTF Parser seems to be doubling up some non-european characters

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira