On Thursday, 7 May 2020 at 14:49:43 UTC, data pulverizer wrote:
After I ran the Julia code past the Julia community, they made some changes (using views rather than passing copies of the array) and their time has come down to ~2.5 seconds. The plot thickens.

I've also run the Chapel code past the Chapel programming language people and they've brought the time down to ~6.5 seconds. I've disallowed calling BLAS because I'm looking at the performance of the language implementations themselves rather than their ability to call other libraries.

So far the times are looking like this:

D:      ~ 1.5 seconds
Julia:  ~ 2.5 seconds
Chapel: ~ 6.5 seconds

I've been working on the Nim benchmark and have written a little set of byte-order functions for big -> little endian conversion (https://gist.github.com/dataPulverizer/744fadf8924ae96135fc600ac86c7060), which was fun to do and provides ntoh, hton, and related functions that can be applied to any basic type. I'm now writing a little matrix type in the same vein as the D matrix type I wrote; then comes the easy bit, which is writing the kernel matrix algorithm itself.

In the end I'll run the benchmark on data of various sizes. Currently I'm just running it on the (10,000 x 784) data set, which outputs a (10,000 x 10,000) matrix. I'll end up running (5,000 x 784), (10,000 x 784), (20,000 x 784), (30,000 x 784), (40,000 x 784), (50,000 x 784), and (60,000 x 784). Ideally I'd measure each one 100 times and plot confidence intervals, but I'll have to settle for measuring each one 3 times and taking the average, otherwise it would take too much time. I don't think that D will have it its own way for all the data sizes; from what I can see, Julia may do better at the largest data set, where SIMD may be a factor.

The data set sizes are not randomly chosen. In many common data science tasks (maybe > 90% of what data scientists currently work on), people work with data sets in this range or even smaller; the big data stuff is much less common unless you're working for Google (or another FANG) or a specialist startup. I remember running a kernel clustering task in oft-used "data science" languages (none of which I'm benchmarking here): it wasn't done after an hour and then hung and crashed, while the version I implemented in Julia was done in a minute. Calculating kernel matrices is the cornerstone of many kernel-based machine learning methods: kernel PCA, kernel clustering, SVMs, and so on. It's a pretty important thing to calculate and shows the potential of these languages in the data science field. I think an article like this is valid for people that implement numerical libraries. I'm also hoping to throw in C++ by way of comparison.

