On 30.03.2016 19:30, Jack Stouffer wrote:
Just to drive this point home, I made a very simple benchmark. Iterating
over code points when you don't need to is 100x slower than iterating
over code units.
[...]
enum testCount = 1_000_000;
enum var = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Praesent justo ante, vehicula in felis vitae, finibus tincidunt dolor.
Fusce sagittis.";

void test()
{
     auto a = var.array;
}

void test2()
{
     auto a = var.byCodeUnit.array;
}

void test3()
{
     auto a = var.byGrapheme.array;
}
[...]
$ ldc2 -O3 -release -boundscheck=off test.d
$ ./test
auto-decoding            1 μs
byCodeUnit        0 hnsecs
byGrapheme        11 μs

When byCodeUnit takes no time at all, isn't 1 μs infinitely slower, rather than 100 times? And I think auto-decoding's 1 μs is so low that noise is going to mess with any ratio you compute from it.

byCodeUnit taking no time at all suggests that it has been optimized away completely. To avoid that, don't hardcode the test data, and produce some output that depends on the calculations actually being done. There was a little thread about this recently:
http://forum.dlang.org/post/sdmdwyhfgmbppfflk...@forum.dlang.org
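Something along these lines would do it (a rough sketch, untested; it assumes the text comes from a file named on the command line, and it uses benchmark from std.datetime, which newer Phobos versions have since moved to std.datetime.stopwatch):

import std.array : array;
import std.datetime : benchmark;
import std.file : readText;
import std.stdio : writeln;
import std.uni : byGrapheme;
import std.utf : byCodeUnit;

void main(string[] args)
{
    // Load the data at run time so the compiler can't constant-fold it.
    immutable var = readText(args[1]);

    enum testCount = 10_000;
    size_t sink;

    // Same three variants as above, but each result feeds into the
    // output below, so the work can't be elided.
    void test()  { sink += var.array.length; }
    void test2() { sink += var.byCodeUnit.array.length; }
    void test3() { sink += var.byGrapheme.array.length; }

    auto r = benchmark!(test, test2, test3)(testCount);
    writeln("auto-decoding ", r[0].usecs / testCount, " μs");
    writeln("byCodeUnit    ", r[1].usecs / testCount, " μs");
    writeln("byGrapheme    ", r[2].usecs / testCount, " μs");
    writeln(sink); // observable result that depends on the loops
}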

I think creating arrays from the ranges is relatively costly and noisy, and it's not what's of interest when you want to compare iteration speed.
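If I wanted to compare the iteration schemes themselves, I'd fold the elements into a value instead of building arrays, something like this (again just a sketch, and the function names are made up):

import std.uni : byGrapheme;
import std.utf : byCodeUnit;

// No allocation: just walk the range and combine the elements,
// returning the result so the loop stays observable.
size_t sumAutoDecoded(string s)
{
    size_t n;
    foreach (dchar c; s) n += c; // foreach over dchar triggers auto-decoding
    return n;
}

size_t sumCodeUnits(string s)
{
    size_t n;
    foreach (c; s.byCodeUnit) n += c;
    return n;
}

size_t sumGraphemes(string s)
{
    size_t n;
    foreach (g; s.byGrapheme) n += g.length; // code points per grapheme
    return n;
}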
