Hi,

On Fri, Apr 11, 2008 at 4:26 AM, Ignacio Javier <[EMAIL PROTECTED]> wrote:
> What is really doing koha, when doing marc8 decoding/MARC21, is to store in
> database a sequence of:
>
> base character + unicode form of (tilde|cute|grave...etc)
>
> ...that is:
>
> a´, a`,n~, etc...
>
> ...with ` or ´ or etc... in UTF-8 (using 3 native bytes instead of 2 native
> bytes)
>
> Instead of:
>
> á, à, etc... (2 native bytes)
>
> Internet Explorer, not surprisingly for me, renders a´ as á, etc... but no
> other tools do it this way, for example firefox renders í as i with an upper
> to the dot acute.
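
To make the two byte sequences concrete, here is a small sketch. It is in Python purely for illustration (Koha itself is Perl, and this is not Koha code), and the helper name to_nfc is hypothetical. It assumes the stored accent is the combining character U+0301 rather than the spacing accent typed in the quoted message.

```python
import unicodedata

# Decomposed form: base letter followed by a combining mark.
decomposed = "a\u0301"   # 'a' + U+0301 COMBINING ACUTE ACCENT
# Precomposed form: a single code point.
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE

decomposed_bytes = decomposed.encode("utf-8")    # 1 byte for 'a' + 2 for U+0301
precomposed_bytes = precomposed.encode("utf-8")  # 2 bytes for U+00E1


def to_nfc(text):
    """Hypothetical helper: return text in composed (NFC) form."""
    return unicodedata.normalize("NFC", text)


print(len(decomposed_bytes), len(precomposed_bytes))
print(to_nfc(decomposed) == precomposed)
```

Both strings are canonically equivalent under Unicode; they differ only in which code points spell the same character, so any rendering difference comes from the browser or font rather than from invalid data.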

The UTF-8 is valid; it just may not be in the ideal normalization
form. The strings that MARC::Charset produces when it converts from
MARC-8 are in a decomposed Unicode normalization form, either NFD or
NFKD. Some web browsers render NFD strings without any difficulty,
while others seem to work better with NFC. Right now Koha passes UTF-8
strings to the browser without renormalizing them, but perhaps we
should be automatically converting them to NFC?

Regards,

Galen
--
Galen Charlton
Koha Application Developer
LibLime
[EMAIL PROTECTED]
p: 1-888-564-2457 x709