Hi Goran,

On 25.08.2010 13:11, Goran Rakic wrote:
> Hi Thomas,
>
> Thank you for your message. I have a question that may be offtopic or
> just uninformed.
>
> What is puzzling me is should not ligatures decomposing be implemented
> further down so things like search, thesaurus or other extensions using
> grammar checking API like languagetool can work transparently?
>
> I can see that hyphenation or some typography style checkers would need
> an exception of such decomposing with an option to work on a raw text.

From the user's point of view this is probably true.

All I can say here is that currently we do neither Unicode
normalization nor decomposition, and it is unclear whether we will do
so in the future.

As for some idle talk about the pros and cons:

- You have already noticed that it might be a good idea to have the
  'raw' text available as well for some cases, but always keeping two
  versions of the same text would waste too much memory. Therefore spot
  solutions seem to be required, and thus someone would have to list
  all the relevant cases and which string version is to be used in
  each. But even that solution becomes troublesome if you need to keep
  the raw data but currently use the decomposed text, and later on
  maybe even have to match modified decomposed text (or parts of it)
  back to the raw text. This is likely to be a lot of offset trouble
  and may have a negative performance impact if something like that
  needs to be done on a regular basis.

- According to unicode.org, ligatures should probably not have been
  added to Unicode at all, but were added since they were available in
  quite a number of the 'old' character set tables.

- Many fonts do not even support ligatures (or only a few of them),
  probably since font rendering is pretty much advanced by now and thus
  there is no real need for them anymore just to get a nice layout.

- The usual competitive 'reference product' does not seem to support
  them at all (aside from being able to display them). They don't get
  handled upon uppercase, title case or sentence case conversion, and
  any word containing a ligature is reported as wrong by the spell
  checker (aside from the standalone ff). Search does not work there
  as you would like it to, either.

Thus on the bright side in OOo we now at least have:

- working case conversion with ligatures
- working spell checking with ligatures (i.e.
  if all the required affix files get changed as listed)

Thomas
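
P.S. To make the "offset trouble" mentioned above a bit more concrete,
here is a minimal Python sketch. It is not OOo code. It assumes that
NFKD compatibility decomposition is how ligatures would be expanded,
and the helper name decompose_with_offsets is made up for this
illustration. It also normalizes one character at a time, which is
good enough for ligatures but ignores reordering of combining marks.

```python
import unicodedata


def decompose_with_offsets(raw):
    """Decompose raw text and remember where each character came from.

    Returns (decomposed, offsets), where offsets[i] is the index in
    `raw` of the character that produced decomposed[i].
    """
    pieces = []
    offsets = []
    for index, char in enumerate(raw):
        # NFKD expands compatibility characters such as U+FB03 to "ffi".
        expanded = unicodedata.normalize("NFKD", char)
        pieces.append(expanded)
        offsets.extend([index] * len(expanded))
    return "".join(pieces), offsets


# "office" written with U+FB03 LATIN SMALL LIGATURE FFI
raw = "o\ufb03ce"
decomposed, offsets = decompose_with_offsets(raw)

# Search on the decomposed text, then map the hit back to the raw text.
needle = "fic"
start = decomposed.find(needle)
raw_start = offsets[start]
raw_end = offsets[start + len(needle) - 1] + 1
raw_match = raw[raw_start:raw_end]
```

Here the match for "fic" starts in the middle of the ligature, so the
range mapped back to the raw text has to cover the whole ligature plus
the "c", which is more than what was actually searched for. Every
consumer of the decomposed text would need such an offset table, and
it would have to be rebuilt whenever the text changes.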