Hello Graeme,

Sunday, November 23, 2008, 9:21:09 AM, you wrote:

GG> So UTF8Decode only supports UCS2 output! Now this is why I think
GG> supporting UTF-8 in fpGUI and Lazarus LCL was a good idea. By design
GG> (utf-8), you have to support the whole unicode range. With UTF-16,
GG> many people take shortcuts and actually only support UCS2 - and it
GG> goes unnoticed like this case for many years!

It is not so bad; technically it is easy to solve all of those issues.
As you saw in the bug report, Delphi has the same problem, and it has
not been fixed in order to keep Delphi compatibility.

My UTF8ToUnicode takes care of all of those problems, including the
surrogate pairs, with a 25% speed penalty (this is not the same version
posted in the bug report, but another one that I have optimized a bit).

GG> I'm busy writing unit tests for all the conversion functions and
GG> implementing some new helper functions as well. Hopefully this will
GG> highlight all the UCS2 shortcuts in UTF-16 implementation and other
GG> possible conversion issues.

There are many test case files at unicode.org, but most of them are
quite complex to code as a test case :(

I would also like to know which basic Unicode functions will be
supported by FPC: only upper/lower, or maybe some more, like decompose,
normalize, and char/word/line/paragraph iterators? I have some of them
written if the FPC team wants them.

-- 
Best regards,
 JoshyFun
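
To make the surrogate-pair point concrete, here is a minimal sketch in Python (not the FPC code, and not my UTF8ToUnicode). The function names are hypothetical, and `utf8_to_ucs2_shortcut` is only an assumed illustration of the kind of UCS2 shortcut discussed above, not a description of what UTF8Decode actually does internally.

```python
def to_utf16_units(code_point):
    """Encode one Unicode code point as a list of UTF-16 code units."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % code_point)
    if code_point < 0x10000:
        # Basic Multilingual Plane: one code unit, identical to UCS2.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 | (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 | (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


def utf8_to_utf16(data):
    """Full conversion: every code point survives, using surrogates if needed."""
    units = []
    for ch in data.decode("utf-8"):
        units.extend(to_utf16_units(ord(ch)))
    return units


def utf8_to_ucs2_shortcut(data):
    """Hypothetical UCS2-only shortcut: anything above U+FFFF becomes '?'."""
    return [ord(ch) if ord(ch) < 0x10000 else ord("?")
            for ch in data.decode("utf-8")]


if __name__ == "__main__":
    sample = b"A\xf0\x9d\x84\x9e"  # 'A' followed by U+1D11E in UTF-8
    print([hex(u) for u in utf8_to_utf16(sample)])
    print([hex(u) for u in utf8_to_ucs2_shortcut(sample)])
```

A unit test built on input like `sample` is the sort of case that would expose a UCS2 shortcut: the two functions agree on every BMP character and differ only when the input contains a code point above U+FFFF.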
