Re: [review] new string type

stephan Wed, 01 Dec 2010 08:46:00 -0800

There's one other issue that should be considered at some stage: normalization and the 
fact that a single "character" can be constructed from several code points. 
(acutes and such)
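
To make the point concrete, here is a small sketch in Python (used only for illustration, since its standard `unicodedata` module exposes the normalization forms; it is not the code discussed in this thread). It shows one user-perceived character, "é", written once as a single code point and once as a base letter plus a combining acute:

```python
import unicodedata

# The same visible character, encoded two different ways.
composed = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT (two code points)

# A code-point-wise comparison treats them as different strings.
same_without_normalization = (composed == decomposed)

# Normalizing both to one form (NFC composes, NFD decomposes) makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)

print(len(composed), len(decomposed), same_without_normalization)
print(nfc == composed, nfd == decomposed)
```

Any string type that compares or indexes by code point will see these two spellings as different unless it normalizes first.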


This is my next little project. May build on Steve's job. (But it's not 
necessary, dchar is enough as a base, I guess.)


Hi Denis, you might want to consider helping us out.

We have got a feature-complete Unicode normalization, case-folding, and concatenation implementation passing all test cases in http://unicode.org/Public/6.0.0/ucd/NormalizationTest.txt (and then some) for all recent Unicode versions. This code was part of a bigger project that we have stopped working on.
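
For readers who have not looked at that file: each data line holds five semicolon-separated columns of hex code points (c1 to c5), and a conforming implementation has to satisfy a fixed set of invariants between them. The following is a rough sketch of such a check, written in Python against its standard `unicodedata` module rather than against our implementation. The helper names `parse_line` and `check` and the two sample lines are assumptions made up for this illustration; the samples are merely written in the file's format.

```python
import unicodedata

def parse_line(line):
    """Return the five columns c1..c5 as strings, or None for non-data lines."""
    data = line.split("#", 1)[0].strip()   # drop the trailing comment
    if not data or data.startswith("@"):   # blank line or "@Part" header
        return None
    fields = data.split(";")[:5]
    return ["".join(chr(int(cp, 16)) for cp in f.split()) for f in fields]

def check(cols):
    """Test the conformance invariants for one parsed line."""
    c1, c2, c3, c4, c5 = cols
    n = unicodedata.normalize
    return (
        all(n("NFC", c) == c2 for c in (c1, c2, c3))
        and all(n("NFC", c) == c4 for c in (c4, c5))
        and all(n("NFD", c) == c3 for c in (c1, c2, c3))
        and all(n("NFD", c) == c5 for c in (c4, c5))
        and all(n("NFKC", c) == c4 for c in cols)
        and all(n("NFKD", c) == c5 for c in cols)
    )

# Illustrative lines in the file's format (canonical and compatibility cases).
sample = "1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE"
ligature = "FB01;FB01;FB01;0066 0069;0066 0069; # LATIN SMALL LIGATURE FI"

for line in (sample, ligature):
    print(check(parse_line(line)))
```

Running the whole file through a loop like this, once per supported Unicode version, is essentially what "passing all test cases" means here.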

We feel that the Unicode normalization part might be useful to others. Therefore we are considering releasing it under an open source license. Before we can do so, we have to clean things up a bit. Some open issues are

a) The code still contains some TODOs and FIXMEs (bugs, inefficiencies, some bigger issues like more efficient storing of data etc.).

b) No profiling and no benchmarking against the ICU implementation (http://site.icu-project.org/) has been done yet (we expect surprises).

c) Implementation of additional Unicode algorithms (e.g. full case mapping, matching, collation).

Since we have stopped working on the bigger project, we haven’t made much progress. Any help would be welcome. Let me know whether this would be of interest to you.
