grauzone wrote:
dsimcha wrote:
== Quote from Walter Bright (newshou...@digitalmars.com)'s article
Andrei Alexandrescu wrote:
2. The global random generator will be allocated per thread. Are you
cool with this too?
That could have speed problems with a tight loop accessing thread local
storage.

I don't think this matters, since for apps that really need it, an RNG can be explicitly passed in. This is purely a convenience feature for when you want random number generation to "just work" even if it's not the most efficient thing in the world. Also, checking thread_needLock() is dirt cheap, in my experience faster than accessing TLS. You could also have a global, non-thread-local RNG that is used if a check to thread_needLock() returns false.

I don't understand this. If implemented right (as with D 2.0 or GCC thread-local variables), TLS can be as fast as a normal global variable. You only need an additional check (a simple if()) to lazily initialize the RNG.
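A minimal sketch of the lazy initialization being described (the names threadRng, tlsRng, and tlsRngSeeded are mine, not from the thread): in D2, module-level variables are thread-local by default, so each thread gets its own generator, and the one-time seeding is guarded by the "simple if()" mentioned above.

```d
import std.random : Mt19937, uniform, unpredictableSeed;

// Thread-local by default in D2: each thread sees its own copy.
private Mt19937 tlsRng;
private bool tlsRngSeeded;

// Lazily seed the per-thread generator on first use.
ref Mt19937 threadRng()
{
    if (!tlsRngSeeded)
    {
        tlsRng.seed(unpredictableSeed);
        tlsRngSeeded = true;
    }
    return tlsRng;
}

void main()
{
    // Every call after the first skips the seeding branch entirely.
    auto x = uniform(0, 100, threadRng());
    assert(x >= 0 && x < 100);
}
```

No locking is needed anywhere on this path; the only cost beyond a plain global is the TLS access plus one predictable branch.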

Regarding "just work": D's philosophy of preferring the simple and "right" thing over raw efficiency seems to have been abandoned. For example, std.algorithm uses weird string mixins, and the predicate is a template argument instead of a normal one (that's IMHO).

Well, I can only welcome the return of persistent flamebaiting to this group. D's philosophy is to prefer to do the right thing, even when that doesn't align with what you believe that is. I don't understand why the pattern of some of your posts is to throw some random (sic) semantic grenade in the hope of picking a fight.

Speaking of "normal" passing of the predicate, I was curious this morning to see how a hand-written max fares against reduce!(max), reduce!("b > a ? b : a"), and an indirect reduce using a delegate passed, ahem, "normally". This is because a nearest-neighbor algorithm I work on needs to compute max over an array in its inner loop, and I wanted to see whether I need to roll my own max implementation versus using std.algorithm.

For 20000 evaluations over a 100000-integer array, reduce!(max), reduce!("b > a ? b : a"), and handMax all finished within 3.4 to 3.5 seconds on my machine. The "normal" version took 11.6 seconds to complete. Trials with various sizes reveal a similar pattern throughout: pass-by-delegate is way behind the others, which all run about equally fast.
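For concreteness, here is a sketch of the four variants being compared (handMax and delegateMax are my own illustrative names; the thread only names handMax). The template-argument forms let the compiler inline the predicate, while the delegate form forces an indirect call per element:

```d
import std.algorithm : reduce, max;

// Hand-written loop: the baseline the template versions are measured against.
int handMax(const(int)[] a)
{
    int m = a[0];
    foreach (x; a[1 .. $])
        if (x > m) m = x;
    return m;
}

// "Normal" run-time parameter: one indirect call per element.
int delegateMax(const(int)[] a, int delegate(int, int) pred)
{
    int m = a[0];
    foreach (x; a[1 .. $])
        m = pred(m, x);
    return m;
}

void main()
{
    auto a = [3, 1, 4, 1, 5, 9, 2, 6];
    assert(reduce!max(a) == 9);              // alias template argument
    assert(reduce!"b > a ? b : a"(a) == 9);  // string-mixin predicate
    assert(handMax(a) == 9);
    assert(delegateMax(a, (m, x) => x > m ? x : m) == 9);
}
```

In the string form, binaryFun binds "a" to the accumulator and "b" to the current element, so "b > a ? b : a" keeps the larger of the two.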

This means that, had std.algorithm taken the "normal" route, it would have been plenty useless for any serious needs. Not taking the "normal" route means that its abstractions rival hand-tuned code so I can use them without worrying they will come unglued when shown a real challenge.

So, to paraphrase the ugly guy in "The Good, The Bad, and The Ugly": If you want to shoot, shoot, don't troll.


Andrei
