http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=14759
--- Comment #20 from Yuval Hager <yha...@yhager.com> ---

> I suspect that will make the output of Text::Unaccent and
> Text::Unaccent::PurePerl the same.

Not really, it stays the same garbled mess.

> unac_debug($Text::Unaccent::DEBUG_HIGH);
>
> That will also tell you what Text::Unaccent is doing (or probably not doing).

I tested on one string:

unac.c:13708: unac_data0[7] & unac_positions[0][8]: 0x05e7 => untouched
unac.c:13708: unac_data0[24] & unac_positions[0][25]: 0x05b8 => untouched
unac.c:13708: unac_data0[30] & unac_positions[0][31]: 0x05de => untouched
unac.c:13708: unac_data0[24] & unac_positions[0][25]: 0x05b8 => untouched
unac.c:13708: unac_data0[5] & unac_positions[0][6]: 0x05e5 => untouched
Text::Unaccent - קָמָץ => קָ×ָץ

So every code point is reported as untouched, yet the output still comes back garbled.

> Note that nothing seems to happen with the (Japanese?) ideograms that Galen
> tested. I wonder if accents are even a thing with CJK languages...

I am definitely not an authoritative source, but I know a tiny bit of Japanese.
The characters above are Kanji, and to the best of my knowledge they do not
take diacritics.

BUT Japanese has two more scripts, Hiragana and Katakana, and both use
diacritics that CANNOT be removed without changing the sound (and potentially
the meaning).

For example, in the word Hiragana, the first syllable is ひ (Hi, pronounced
Hee). The same syllable with two ticks is び, which sounds like Bee. A circle
makes it ぴ, which sounds like Pee.

Testing those three:

Text::Unaccent - ひびぴ => ã²ã²ã²
Text::Unaccent::PurePerl - ひびぴ => ひひひ
Strip NonspacingMark - ひびぴ => ひひひ

So we've changed 'Hee Bee Pee' to 'Hee Hee Hee'.

The same result (and the same syllables) for Katakana:

Text::Unaccent - ヒビピ => ããã
Text::Unaccent::PurePerl - ヒビピ => ヒヒヒ
Strip NonspacingMark - ヒビピ => ヒヒヒ

So diacritics, at least in those two scripts, should not be removed, to the
best of my knowledge.
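For anyone who wants to reproduce the "Strip NonspacingMark" rows above, here is a minimal sketch using only the core Unicode::Normalize module. It assumes that approach means "decompose with NFD, delete everything matching \p{Mn}, recompose with NFC"; the actual test script is not shown in this thread, so that is a guess. The second function, strip_marks_keep_kana, is purely hypothetical and only illustrates one way to spare the two kana voicing marks (U+3099 and U+309A); it is not proposed Koha code.

```perl
use strict;
use warnings;
use utf8;                              # the string literals below are UTF-8
use Unicode::Normalize qw(NFD NFC);    # core module, no extra install needed

# Assumed meaning of "Strip NonspacingMark": decompose, drop every
# nonspacing mark (\p{Mn}), then recompose whatever is left.
sub strip_nonspacing_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;
    return NFC($decomposed);
}

# Hypothetical variant: same as above, but leave the combining kana
# voicing marks (U+3099 dakuten, U+309A handakuten) in place so that
# NFC can put び / ぴ back together.
sub strip_marks_keep_kana {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/(?![\x{3099}\x{309A}])\p{Mn}//g;
    return NFC($decomposed);
}

binmode STDOUT, ':encoding(UTF-8)';
for my $word ('ひびぴ', 'ヒビピ') {
    printf "%s => %s (strip all) / %s (keep kana marks)\n",
        $word,
        strip_nonspacing_marks($word),
        strip_marks_keep_kana($word);
}
```

The first function collapses the kana the same way the table above shows, because び and ぴ decompose to ひ plus a combining mark that is itself a nonspacing mark. Whether special-casing individual marks is the right fix, as opposed to skipping unaccenting for whole scripts, is a separate question this sketch does not try to answer.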