Hi Goran,

On 25.08.2010 13:11, Goran Rakic wrote:
> Hi Thomas,
>
> Thank you for your message. I have a question that may be offtopic or
> just uninformed.
>
> What is puzzling me is should not ligatures decomposing be implemented
> further down so things like search, thesaurus or other extensions using
> grammar checking API like languagetool can work transparently?
>
> I can see that hyphenation or some typography style checkers would need
> an exception of such decomposing with an option to work on a raw text.

From the user's point of view this is probably true. All I can say here is
that currently we do neither Unicode normalization nor decomposition.
And it is unclear whether we will do so in the future or not.


As for some idle talk about the pros and cons:

- you have already noticed it might be a good idea to have the 'raw'
text available as well for some cases, but always keeping two versions
of the same text would waste too much memory. Therefore spot solutions
seem to be required, and someone would have to list all the relevant
cases and which string version is to be used in each. But even that
solution becomes troublesome if you need to keep the raw data while
working on the decomposed text, and later on maybe even have to match
modified decomposed text (or parts of it) back to the raw text.
This is likely to cause a lot of trouble with offsets and may have a
negative performance impact if it needs to be done on a regular basis.

- according to unicode.org ligatures should probably not have been added
to Unicode at all, but were added since they were available in quite a
number of the 'old' character set tables.

- many fonts do not even support ligatures (or only a few of them),
probably since font rendering is pretty much advanced by now and thus
there is no real need for them anymore just to get a nice layout.

- the usual competitive 'reference product' does not seem to support
them at all (aside from being able to display them). They don't get
handled upon uppercase, title case or sentence case conversion. And any
word containing a ligature is reported as wrong by the spell checker
(aside from the standalone ff). Search there does not work the way you
would like it to either.
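
To illustrate the offset trouble mentioned in the first point, here is a
minimal sketch in Python. It is not OOo code: the function name
decompose_with_offsets is made up for this example, and it assumes that
compatibility decomposition (NFKD) applied one character at a time is
good enough. A real implementation would have to normalize the string as
a whole, and NFKD also splits accented letters, which may not be wanted.

```python
import unicodedata


def decompose_with_offsets(raw):
    """Decompose `raw` and remember where each output character came from.

    Returns (decomposed_text, offsets), where offsets[i] is the index in
    `raw` of the character that produced decomposed_text[i].
    Hypothetical helper for illustration only.
    """
    pieces = []
    offsets = []
    for raw_index, ch in enumerate(raw):
        # Per-character NFKD is a simplification (see the note above).
        expanded = unicodedata.normalize("NFKD", ch)
        pieces.append(expanded)
        # Every character of the expansion points back to the same raw index.
        offsets.extend([raw_index] * len(expanded))
    return "".join(pieces), offsets


# "office" written with the single-character ffi ligature U+FB03
raw = "o\ufb03ce"
text, offsets = decompose_with_offsets(raw)
print(text)     # office
print(offsets)  # [0, 1, 1, 1, 2, 3]
```

Even in this small case the offsets table is as long as the decomposed
text and has to be rebuilt or patched whenever either version of the
text changes, which is the cost described above.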

Thus on the bright side in OOo we now at least have:
- working case conversion with ligatures
- working spell checking with ligatures (i.e. if all the required affix
files get changed as listed)


Thomas


