On Fri, Oct 21, 2016 at 2:26 PM, Juha Manninen
<juha.mannine...@gmail.com> wrote:
> No, neither FPC nor Lazarus have library code to deal with [combined 
> CodePoints] yet.
> The goal is to have an enumerator for user perceived characters, just
> like LazUnicode unit has for encoding agnostic CodePoints.

Sorry, that was not accurate.
Unit LazUnicode already has TUnicodeCharacterEnumerator which is able
to iterate combined accented Unicode characters.
It calls either function UTF8IsCombining or UTF16IsCombining depending
on the default encoding in use. Yes, Delphi and UTF-16 are supported.
The code was basically copied from SynEdit and then also ported to
UTF-16. It does not implement all the complex rules for combining
CodePoints, but it apparently works well for accented characters in
Western languages.
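
To illustrate the idea only (this is not the LazUnicode code), here is a
minimal sketch that groups a base CodePoint with the combining marks that
follow it in a UTF-8 string. The helpers Utf8SeqLen and IsCombiningAt are
hypothetical names, and the sketch assumes that only U+0300..U+036F
(Combining Diacritical Marks) count as combining, which is a much smaller
set than the real rules cover.

```pascal
program CombiningSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: byte length of the UTF-8 sequence whose lead byte is B.
// Malformed lead bytes are treated as a single byte.
function Utf8SeqLen(B: Byte): Integer;
begin
  if B < $80 then Result := 1
  else if (B and $E0) = $C0 then Result := 2
  else if (B and $F0) = $E0 then Result := 3
  else if (B and $F8) = $F0 then Result := 4
  else Result := 1;
end;

// Hypothetical helper: True if the CodePoint starting at byte index P is in
// U+0300..U+036F. In UTF-8 that range is $CC $80..$BF and $CD $80..$AF.
function IsCombiningAt(const S: String; P: Integer): Boolean;
begin
  Result := False;
  if P + 1 > Length(S) then Exit;   // needs two bytes, so stop near the end
  case Ord(S[P]) of
    $CC: Result := Ord(S[P + 1]) in [$80..$BF];
    $CD: Result := Ord(S[P + 1]) in [$80..$AF];
  end;
end;

var
  S, Cluster: String;
  P, Len: Integer;
begin
  // 'e' + U+0301 (combining acute), 'x', 'a' + U+0308 (combining diaeresis)
  S := 'e'#$CC#$81'xa'#$CC#$88;
  P := 1;
  while P <= Length(S) do
  begin
    Len := Utf8SeqLen(Ord(S[P]));
    // Absorb every combining mark that directly follows the base CodePoint.
    while IsCombiningAt(S, P + Len) do
      Inc(Len, Utf8SeqLen(Ord(S[P + Len])));
    Cluster := Copy(S, P, Len);
    WriteLn(Length(Cluster), ' byte(s)');
    Inc(P, Len);
  end;
end.
```

With the sample string this reports three groups of 3, 1 and 3 bytes: the
accented 'e', the plain 'x' and the accented 'a'.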

This:
 operator Enumerator(A: String): TUnicodeCharacterEnumerator;
would enable it for the for-in loop, but it is commented out now. The
current for-in loop enumerator works with CodePoints.
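
Until that operator is enabled, the enumerator can be driven by hand. The
following is only a sketch: it assumes the constructor takes the string to
iterate and that Current returns a String, as the operator declaration
above suggests. Please check the unit source for the exact signatures. It
also needs the LazUtils package in the unit path.

```pascal
program ManualEnum;
{$mode objfpc}{$H+}
uses
  LazUnicode;
var
  E: TUnicodeCharacterEnumerator;
begin
  // Assumption: Create(const A: String) is the constructor signature.
  // The literal is 'e' + U+0301 (combining acute) followed by 'x'.
  E := TUnicodeCharacterEnumerator.Create('e'#$CC#$81'x');
  try
    // MoveNext/Current follow the usual enumerator pattern.
    while E.MoveNext do
      WriteLn(Length(E.Current), ' code unit(s)');
  finally
    // A for-in loop would free the enumerator itself; here we must do it.
    E.Free;
  end;
end.
```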

There is a test project in components/lazutils/test/LazUnicodeTest.lpi.
It includes combining CodePoints, too. Please take a look if you are interested.

Juha
-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
http://lists.lazarus-ide.org/listinfo/lazarus
