To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=100273
                 Issue #|100273
                 Summary|Hyphenation wrongly relies on encodings that are 1-1
                        |mappings to unicode
               Component|lingucomponent
                 Version|DEV300m42
                Platform|All
                     URL|
              OS/Version|Linux
                  Status|NEW
       Status whiteboard|
                Keywords|
              Resolution|
              Issue type|PATCH
                Priority|P3
            Subcomponent|other
             Assigned to|iss...@lingucomponent
             Reported by|cmc





------- Additional comments from c...@openoffice.org Tue Mar 17 14:23:35 +0000 2009 -------
What I mean is that in
lingucomponent/source/hyphenator/altlinuxhyph/hyphen/hyphenimp.cxx we have 

OUString sWord;
OString sConverted(OUStringToOString(sWord, eEnc));

and then a loop effectively of

for (i = 0; i < sConverted.getLength(); ++i)
{
    ... sWord[i];
}

i.e. it is assumed that an index into the 8-bit encoded string can be used to
index the equivalent character in the Unicode string. But that isn't true for
UTF-8 or other multi-byte encodings, where one character can take several bytes.
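
To make the mismatch concrete, here is a minimal sketch in plain standard C++.
It is not taken from hyphenimp.cxx: std::u16string stands in for OUString,
std::string for the UTF-8 encoded OString, and countUtf8CodePoints is a
hypothetical helper that assumes well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of hyphenimp.cxx: counts the characters
// (code points) in a UTF-8 string by skipping continuation bytes (10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t countUtf8CodePoints(const std::string& utf8)
{
    std::size_t n = 0;
    for (unsigned char c : utf8)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}

// The Kannada word U+0C95 U+0CA8 U+0CCD U+0CA8 U+0CA1 in both forms:
// as UTF-16 code units and as UTF-8 bytes (three bytes per character).
const std::u16string kWord = u"\u0C95\u0CA8\u0CCD\u0CA8\u0CA1";
const std::string kConverted =
    "\xE0\xB2\x95" "\xE0\xB2\xA8" "\xE0\xB3\x8D" "\xE0\xB2\xA8" "\xE0\xB2\xA1";
```

Here kWord.size() is 5 while kConverted.size() is 15, so a loop that runs to
the 8-bit length and indexes the 16-bit string with the same i reads past the
end of the word after the fifth iteration.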

In practice this means that the attached sample Kannada hyphenation rules, which
are in a UTF-8 hyphenation format, produce wildly wrong values and other
on-screen weirdness when applied to the attached sample document.

Attached is a patch which I think does the right thing: it builds up the unicode
hyphenation string from the 8bit string correctly, and maps the 8bit positions
to unicode ones.
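
The position mapping can be sketched as follows. This is an illustration
in plain standard C++ and not the attached patch: mapBytesToUtf16 is a
hypothetical name, the input is assumed to be well-formed UTF-8, and the
result is a table from each byte offset to the index of the first UTF-16 code
unit of the character that byte belongs to.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the attached patch: for every byte offset in a
// UTF-8 string, returns the index of the first UTF-16 code unit of the
// character containing that byte. Assumes well-formed UTF-8.
std::vector<std::size_t> mapBytesToUtf16(const std::string& utf8)
{
    std::vector<std::size_t> map(utf8.size());
    std::size_t unit = 0;  // UTF-16 index of the current character
    std::size_t next = 0;  // UTF-16 index where the next character starts
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(utf8[i]);
        if ((c & 0xC0) != 0x80) {         // lead byte: a new character begins
            unit = next;
            next += (c >= 0xF0) ? 2 : 1;  // 4-byte sequences need a surrogate pair
        }
        map[i] = unit;
    }
    return map;
}
```

With such a table, a position reported against byte i of the 8-bit string can
be translated to map[i] before it is used to index the 16-bit string.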

For my Kannada example it seems to do the right thing.
