On Fri, Oct 21, 2016 at 2:26 PM, Juha Manninen <juha.mannine...@gmail.com> wrote:
> No, neither FPC nor Lazarus have library code to deal with [combined
> CodePoints] yet.
> The goal is to have an enumerator for user perceived characters, just
> like LazUnicode unit has for encoding agnostic CodePoints.

Sorry, that was not accurate.
Unit LazUnicode already has TUnicodeCharacterEnumerator, which can iterate over combined accented Unicode characters. It calls either UTF8IsCombining or UTF16IsCombining, depending on the default encoding in use. Yes, Delphi and UTF-16 are supported.

The code was basically copied from SynEdit and then also ported to UTF-16. It does not support all the complex rules of combining CodePoints, but it apparently works well for accented characters in Western languages.

This:
  operator Enumerator(A: String): TUnicodeCharacterEnumerator;
would enable it for the for-in loop, but it is commented out now. The current for-in loop enumerator works with CodePoints.

There is a test project in components/lazutils/test/LazUnicodeTest.lpi. It includes combining CodePoints, too.
Please take a look if you are interested.

Juha

-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
http://lists.lazarus-ide.org/listinfo/lazarus