On Monday, 9 November 2015 at 19:31:14 UTC, Márcio Martins wrote:
I have been running some MCMC simulations in Python and it's hard to cope with how unbelievably slow it is. It takes me almost a minute to run a few hundred thousand samples on my laptop, whereas I can run the same simulation with a million samples in under 100ms on my phone, with JavaScript in a browser.

Then, you spend a minute waiting for the simulation to finish, to find out you had an error in your report code that would have been easily caught with static typing. So annoying...

No matter how interested I am in Bayesian statistics, I would think that an MCMC library is relatively lower in importance than a number of other libraries.

I've written some Gibbs samplers in MATLAB and Python. They are quite slow in those languages, but I didn't notice the Python code being more than 2x or so slower than the MATLAB code, and I believe that was almost entirely due to MATLAB using Intel MKL and NumPy using a slightly less efficient implementation.
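
For anyone who hasn't written one, here is a minimal sketch of the kind of hand-rolled Gibbs sampler I mean. It is not the code from either post. It assumes a toy target (a standard bivariate normal with correlation rho), and the function names are made up for illustration. Every draw goes through the interpreter loop, which is roughly where the time goes in pure Python.

```python
import math
import random


def gibbs_bivariate_normal(n_samples, rho, burn_in=1000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Illustrative toy target: the full conditionals are
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # std. dev. of each conditional
    x, y = 0.0, 0.0
    samples = []
    for i in range(n_samples + burn_in):
        # One sweep: update each coordinate from its full conditional.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        if i >= burn_in:  # discard the warm-up sweeps
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation of the (x, y) draws, as a quick sanity check."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    var_y = sum((y - mean_y) ** 2 for _, y in samples)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(20000, rho=0.8)
    print(len(draws), sample_correlation(draws))
```

The sample correlation should land close to the rho you passed in. A real model would have messier conditionals, but the loop structure is the same.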

While I learned a lot about MCMC by writing my own Gibbs samplers, I don't write them much anymore. I make more use of MC Stan, which can be called from Python with PyStan (maybe easier on Linux than on Windows; I tend to use rstan more, which works easily on Windows). PyMC is another option that I've heard good things about, but I haven't tried it.

I think the simplest way forward would be something like wrappers around functionality in other languages. With PyD, it shouldn't be inconceivable to have a wrapper for PyMC. Alternatively, MC Stan is written in C++ and has interfaces to a number of languages. Being able to call Stan from D would be cool, especially since I don't think there's even a C++ interface yet (you have to use the command line or R or whatever). I have essentially no idea how to do that.

Is anyone doing similar stuff with D? Unfortunately, I couldn't find any plotting libraries nor MATLAB-like numerical/stats libs in dub.

My attitude is the more the better.

This seems like another area where D could easily pick up momentum with RDMD and perhaps an integration with Jupyter, which is becoming very popular.

That would be interesting, but I'm not sure how high a priority it is.
