On 1/17/11 6:44 AM, Steven Schveighoffer wrote:
We need to get some real numbers together. I'll see what I can create
for a type, but someone else needs to supply the input :) I'm short on
Unicode data, and every attempt I've made to create some has ended in
failure. I have one example of a composed character in this thread
that I can cling to, but to produce real numbers we need a large
amount of data.

Oh, one more thing. You don't need a lot of Unicode text containing combining characters to write benchmarks. (You do need it for testing purposes.) Most text won't contain combining characters anyway, so after you implement graphemes, just benchmark them on regular text.
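
To make the split between test data and benchmark data concrete, here is a minimal sketch in Python (not D, and not code from this thread). It assumes a deliberately simplified notion of a grapheme: a base character plus any combining marks that follow it, which is not full UAX #29 segmentation. The names `simple_graphemes`, `combining_count`, `regular_text`, `composed`, and `run_benchmark` are all hypothetical, and the sample strings are made up for illustration.

```python
import timeit
import unicodedata


def simple_graphemes(text):
    """Yield a base character together with any combining marks after it.

    Simplification (assumption): this is not full UAX #29 grapheme
    cluster segmentation, only base + combining-mark grouping.
    """
    cluster = ""
    for ch in text:
        # A non-combining character starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ch
        else:
            cluster += ch
    if cluster:
        yield cluster


def combining_count(text):
    """Count characters with a nonzero canonical combining class."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


# Benchmark input (hypothetical): regular text, no combining marks.
regular_text = "The quick brown fox jumps over the lazy dog. " * 1000

# Test input: 'e' followed by U+0301 COMBINING ACUTE ACCENT, then 'x'.
composed = "e\u0301x"


def run_benchmark(number=5):
    """Time plain code-point iteration against grapheme iteration."""
    by_code_point = timeit.timeit(
        lambda: sum(1 for _ in regular_text), number=number)
    by_grapheme = timeit.timeit(
        lambda: sum(1 for _ in simple_graphemes(regular_text)), number=number)
    return by_code_point, by_grapheme


if __name__ == "__main__":
    # Correctness is checked on text that has combining characters.
    print(list(simple_graphemes(composed)))
    # Speed is measured on regular text, which has none.
    print("combining marks in benchmark text:", combining_count(regular_text))
    print("code points vs graphemes (seconds):", run_benchmark())
```

The small composed string exercises correctness, while the timing runs on ordinary text, since that is the case the overhead of grapheme iteration has to be judged on.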

Andrei
