To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=74034
------- Additional comments from [EMAIL PROTECTED] Thu Feb 22 08:40:44 +0000 2007 -------

I discussed MeCab integration with the MeCab developer. He gave me the following helpful advice.

=====
First of all, MeCab is designed to be independent of any specific character encoding, so it works correctly as long as the character encoding of the input string is the same as that of the dictionary. Thus, in principle, we could pass UCS-2(BE|LE) strings to MeCab through the current interface, without creating a new interface, if we encoded the MeCab dictionary in UCS-2(BE|LE).

However, supporting a UCS-2(BE|LE) dictionary would require a lot of modifications, because MeCab uses "char *" strings and treats 0x00 as the end of the string.

In addition, the comment "All internal codes are represented in UCS2," in ucs.h means that MeCab calls the *_to_ucs2 functions to determine the type of the characters contained in unknown words. The processing of known words and the processing of unknown words are distinct, and only the latter calls *_to_ucs2.
=====

In other words, MeCab does not needlessly convert every UTF-8 string to UCS-2 and back; the conversion happens only for unknown words. Writing patches for this seems to be very difficult and, in my humble opinion, such patches would not affect OOo's performance.

In conclusion, the most practical approach is for OOo to pass UTF-8 strings to MeCab. Of course, we should set MECAB_USE_UTF8_ONLY = 1 in order to leave out the unneeded conversion tables.

I'm sorry, but the legal issue has not been solved yet. The external project pages explain how to integrate external source code, so I canceled a mail to mh and I will follow the instructions on the external project website.

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]