On 01/15/2011 12:21 AM, Michel Fortin wrote:
> Also, it'd really help this discussion to have some hard numbers about
> the cost of decoding graphemes.

Text has a perf module that provides such numbers for the different stages of Text object construction. (The measured algorithms are not yet stabilised, so those numbers change regularly, but in the right direction ;-) You can try the current version at https://bitbucket.org/denispir/denispir-d/src (the perf module is called chrono.d).

For information: recently, the cost of full text construction (decoding, normalisation (both decomposition & canonical ordering), and piling) was about 5 times that of decoding alone, the heavy part (~70%) being piling. But Stephan just informed me of a new gain in piling that I have not yet tested. This performance places our library between Windows native tools and ICU in terms of speed, which is IMO rather good for a brand-new tool written in a still-unstable language.
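For readers unfamiliar with the piling step, here is a minimal Python sketch (not the Text library's D code) of what grouping code points into piles amounts to: a base character collects its trailing combining marks. The function name `pile` is made up for illustration.

```python
import unicodedata

def pile(text):
    """Group code points into piles: a base character plus its
    trailing combining marks (a simplified grapheme cluster)."""
    piles = []
    for ch in unicodedata.normalize("NFD", text):
        if piles and unicodedata.combining(ch):
            piles[-1] += ch      # combining mark joins the previous pile
        else:
            piles.append(ch)     # base character opens a new pile
    return piles

# "e" + COMBINING ACUTE forms one pile; "a" forms another
print(pile("e\u0301a"))
```

Real grapheme segmentation (UAX #29) has more rules than this, but the sketch shows why piling touches every code point and can dominate construction time.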

I have read your arguments carefully: that Text's approach of systematically "piling" and normalising source texts is not the right one from an efficiency point of view, even for strict use cases of universal text manipulation (because the relative space cost would indirectly cause time cost through cache effects). Instead, you propose that we "pile" and/or normalise on the fly. But, like you, I am rather doubtful on this point without any numbers available.
So, let us produce some benchmark results for both approaches, if you like.
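As a starting point, here is a rough Python sketch (again, not the D library) of the kind of measurement such a benchmark would run: decoding alone versus decoding plus NFD normalisation. The sample text and repetition counts are arbitrary placeholders.

```python
import timeit
import unicodedata

# arbitrary sample: accented text encoded as UTF-8
sample = ("re\u0301sume\u0301 " * 1000).encode("utf-8")

def decode():
    # stage 1 only: UTF-8 decoding
    return sample.decode("utf-8")

def decode_and_normalize():
    # stages 1+2: decoding, then canonical decomposition + ordering (NFD)
    return unicodedata.normalize("NFD", sample.decode("utf-8"))

t_dec = timeit.timeit(decode, number=200)
t_all = timeit.timeit(decode_and_normalize, number=200)
print(f"decode: {t_dec:.4f}s  decode+NFD: {t_all:.4f}s  "
      f"ratio: {t_all / t_dec:.2f}x")
```

Extending this with a piling stage (and an "on the fly" variant that normalises per access) would give the comparison we are after.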


Denis
_________________
vita es estrany
spir.wikidot.com
