I now know why your custom RNG was slow. Racket fixnums are (a) signed and (b) have a width 1 less than the width of the host architecture: [1]
-> (fixnum? (- (expt 2 64) 1)) #f -> (fixnum? (- (expt 2 63) 1)) #f -> (fixnum? (- (expt 2 62) 1)) #t Therefore, your random number generation calculation is constantly overflowing into bignums and allocating. [1] http://www.cs.utah.edu/plt/snapshots/current/doc/reference/numbers.html?q=fixnum#%28tech._fixnum%29 Sam On Tue, Nov 12, 2013 at 8:59 AM, Sam Tobin-Hochstadt <sa...@cs.indiana.edu> wrote: > Here's a start: https://gist.github.com/samth/7431213 > > This is a Typed Racket version of your program, converted to use > flvectors instead of f64vectors. I haven't taken a look at the code > for your random number generator yet, but you could try doing the > timing by creating a large array of random numbers ahead of time, and > then timing the loop just referencing into that array. > > Sam > > On Tue, Nov 12, 2013 at 7:52 AM, William Cushing > <william.cush...@gmail.com> wrote: >> I'm in the process of benchmarking various systems for their ability to >> handle the inner loop of compilations of MCMC from a higher level >> probabilistic modeling language (BLOG). >> >> Attached are examples of the kind of output (instantiations of MCMC) we'd >> like to be able to generate given the tried-and-true Bayes Net linking the >> probabilities of earthquake, burglary, john calling, and mary calling. >> Classic AI fare. >> >> Predictably, C is fastest. But Stalin and SBCL also put in a very fine >> showing. >> >> For 50,000,000 iterations on my machine, I get the following semi-serious, >> semi-bogus timings for a random assortment of systems (petite chez just for >> fun): >> >> 1 SBCL 2.20 >> 2 Bigloo 5.25 >> 3 Chicken 11.49 >> 4 Stalin 2.15 >> 5 Racket 16.67 >> 6 Petite-Chez 30.42 >> 7 GCC 0.82 >> 8 Clang 0.92 >> >> There is much about the comparison that isn't fair. Chiefly, this is >> primarily a test of speed of random number generation, but I haven't >> controlled for the algorithm being used to generate random numbers. Stalin >> may very well be using GLIBC random number generation, for example. I know >> SBCL uses MT; in fact, the SBCL experts responded to my little encoding by >> halving the shown runtime figure via dynamically linking in a new and >> improved (SIMD-aware) implementation of MT. Which is quite impressive, as >> that puts it within 30% of my C version...and my C version is "totally >> cheating" by employing a simple little LCPRNG. >> >> Porting the same generator to Racket in the obvious way more than triples >> its running time. (!) >> (Is it the global var? Is the conversion to double being compiled as a real >> division?) >> >> In any case, even just sticking with Racket's default PRNG, it seems the >> performance of a simple number crunching loop leaves something to be >> desired. I have done what I can to try and shave off time here and there. >> Using unsafe ops, and threading most of the state through the parameters of >> a tail-recursive loop (thus avoiding set!), I managed to shave off 3 seconds >> from my original encoding (so about 15%), which isn't too shabby. Of >> course, for even less effort, Stalin is already 800% times faster. >> >> I couldn't get typed/racket to type check, and /no-check doesn't seem to >> help performance at all (not surprising). I suspect that getting >> typed/racket to be able to process this file might help substantially...but >> I don't really understand its type theory enough to help it figure this >> benchmark out. >> >> I though Racket experts might have some suggestions for how to convince >> Racket to run a simple number crunching loop without quite so much in the >> way of constant-factor slowdown. >> >> -Will >> >> >> ____________________ >> Racket Users list: >> http://lists.racket-lang.org/users >> ____________________ Racket Users list: http://lists.racket-lang.org/users