On Sat, 22 Oct 2016 13:25:30 +0300
Juha Manninen via Lazarus <lazarus@lists.lazarus-ide.org> wrote:

>[...]
> I guess the biggest complexity is in glyphs and ligatures. I still
> don't understand their details.

There is nothing to understand. Some languages have irregular letters.
Same as English has irregular verbs. You don't "understand" them, you
simply learn them.
As a programmer you don't need to learn them, but you should be aware
that many languages can't be mapped to simple arrays of characters.
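
A small illustration of why "array of characters" breaks down, as a
minimal Python sketch using only the standard unicodedata module. The
helper name code_points is made up for this example. The same visible
letter can be one code point or two, and the two strings only compare
equal after normalization.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

composed = "\u00e9"      # e-acute as one precomposed code point
decomposed = "e\u0301"   # plain e followed by COMBINING ACUTE ACCENT

# Both render as the same letter, but they differ in length and content.
print(len(composed), code_points(composed))
print(len(decomposed), code_points(decomposed))

# A naive comparison treats them as different strings.
print(composed == decomposed)

# Normalizing to the composed form (NFC) makes them comparable.
print(unicodedata.normalize("NFC", decomposed) == composed)
```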


> However for a program that must care about Unicode, like a text layout
> app, the rules for combining codepoints and glyphs are equally
> important. Codepoints for one glyph should never be split or copied
> separately. Isn't it so?

"Never" is wrong here. For example some editors allow to select the single 
letters of a
ligature. Also when comparing words you may want to ignore the
diacritical signs using the decomposed form of Unicode.
But afaik you are right that most programs never have an issue with
ligatures.
Btw, we need a wiki page about collation.
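
Comparing words while ignoring diacritical marks via the decomposed
form could look roughly like this. It is a minimal Python sketch, not
a real collation algorithm; the function names strip_marks and
same_ignoring_marks are hypothetical. It only handles letters that
actually decompose into base letter plus combining mark, so letters
such as "ø" are left untouched.

```python
import unicodedata

def strip_marks(text):
    """Decompose text (NFD) and drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for non-combining characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def same_ignoring_marks(a, b):
    """Compare two words, ignoring diacritical marks (hypothetical helper)."""
    return strip_marks(a) == strip_marks(b)

print(strip_marks("résumé"))
print(same_ignoring_marks("café", "cafe"))
print(same_ignoring_marks("café", "case"))
```

Language-aware comparison needs proper collation rules on top of this.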


>[...]
> Despite problems and incompleteness of our Unicode support, it is
> actually better than most other solutions out there.
> Ok, most programming tools support Unicode somehow but people use them wrong.
> A good example is our forum SMF software. It deals with text layout
> and definitely should handle Unicode but it does not.
> Not even single Codepoints beyond BMP which should be the most easy
> case! No combining rules needed or anything.

Yes, that is basic Unicode encoding. No ligatures, no bidi. I agree
that this is the minimum for supporting Unicode.
SynEdit goes much further.
And the native widgets often have pretty good support for the language
of the user. So the LCL controls that use native widgets automatically
have good Unicode support.
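
For the "code points beyond the BMP" case mentioned above, here is a
minimal Python sketch of what the basic encoding looks like. The
helper utf16_units is a made-up name. One code point above U+FFFF
takes two 16-bit units (a surrogate pair) in UTF-16 and four bytes in
UTF-8, so software that assumes one unit per character breaks on it.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

ch = "\U0001F600"  # a single code point outside the BMP

# One code point, but two UTF-16 code units (high and low surrogate).
print([hex(u) for u in utf16_units(ch)])

# The same code point takes four bytes in UTF-8.
print(len(ch.encode("utf-8")))

# A BMP character such as "A" needs only one UTF-16 code unit.
print(len(utf16_units("A")))
```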

Mattias