On Monday, 18 November 2013 at 02:36:40 UTC, Andrei Alexandrescu wrote:
> I think that's a terrific start!

Thanks :)

> (Not sure I understand where the 30 minutes come from...)

On my computer (compiled with `-release -inline -noboundscheck`; `-O` is omitted because it optimizes the summation away altogether, since the result isn't used anywhere):

---
import std.algorithm, std.datetime, std.range, std.stdio;

auto benchmarkSumming() {
    auto proportions = iota(150_000);
    // copied from std.random.dice:
    auto sum = reduce!("(assert(b >= 0), a + b)")(0.0, proportions.save);
}

void main() {
    auto bench = benchmark!benchmarkSumming(1_000);
    writeln(bench.front.msecs, " ms");
}
---

Gives me 1618 ms. The ~150_000 is the number of entries in the probability table (one per surname), so summing it once per name means generating 1_000 names takes about that long. I really wanted 1 million names to sort, and scaling that up means 1618 seconds, which is nearly 27 minutes.

The summation doesn't change between rolls, hence the importance of caching it for repeated runs; recomputing it every time is about 27 minutes of completely wasted work.

None of this is to say that std.random.dice is fundamentally flawed. It's fine, and perfectly suited to one-shot dice rolls; in fact, it's a bit better than a DiceRange for that task. But a DiceRange is really needed to do repeated rolls like this efficiently.
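
To make that concrete, here's a minimal sketch of what such a range could look like. This DiceRange is hypothetical (it isn't anything in std.random): it pays for the summation once in the constructor, and each subsequent roll is just a uniform draw plus a binary search over the cached cumulative sums.

---
import std.random : uniform;
import std.range : assumeSorted;

// Hypothetical sketch, not the std.random API: sum once, roll many times.
struct DiceRange
{
    private double[] cumulative; // running sums of the proportions
    private size_t current;      // index produced by the latest roll

    this(P)(P proportions)
    {
        double sum = 0.0;
        foreach (p; proportions)
        {
            assert(p >= 0);
            sum += p;
            cumulative ~= sum;
        }
        popFront(); // prime the first roll
    }

    enum empty = false; // an endless supply of rolls

    @property size_t front() const { return current; }

    void popFront()
    {
        // Sample a point in [0, total) and take the index of the first
        // cumulative sum above it (exact-boundary ties are glossed over).
        immutable point = uniform(0.0, cumulative[$ - 1]);
        current = cumulative.assumeSorted.lowerBound(point).length;
    }
}
---

With that, something like `DiceRange(proportions).take(1_000_000)` hands back a million rolls while the O(n) summation happens exactly once; each roll costs O(log n) instead of O(n).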

Jeez, we're talking about maximizing sorting efficiency, and now I've gone off talking about making dice rolls faster too! :)
