The good thing about this is that it shows SeqAcc is slower than RandAcc for many of the distance measures. I was wondering whether JNI calls to optimised native code for vector computation would give better results.
On that note, I found this under the New BSD license:
http://code.google.com/p/netlib-java/
Worth a look.

Robin

On Wed, Feb 17, 2010 at 3:11 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Cool, excellent starting point, Robin!
>
> Some additions in the same vein: dot (and other binary ops) with two
> different impls (with both impl1 as caller on impl2 as method param, and
> vice-versa), looking at what the effects of sparsity are, and create
> incrementally (with n different set() operations).
>
> -jake
>
> On Feb 17, 2010 1:24 AM, "Robin Anil" <robin.a...@gmail.com> wrote:
>
> It's checked in under utils:
> org.apache.mahout.benchmark.VectorBenchmarks.
> It currently runs on full vectors, 0-cardinality.
>
> Only create, clone and dot are benchmarked.
> All distance measures are benchmarked, where each unit is k = numOps times
> the time taken to calculate the distance measure between 2 vectors;
> this is to mimic kmeans and other clustering.
> It prints out the number of vectors processed and the number of megabytes
> read (to mimic the speed at which a dataset could be processed).
> I know a lot of assumptions could be wrong, so please feel free to modify.
>
> An output for cardinality = 1000, numVectors = 100, loop = 200, numOps = 10:
>
> http://pastebin.com/f1b687091
>
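
For anyone skimming the thread, the timing scheme described in the quoted message (numOps distance computations per vector, repeated loop times) can be sketched roughly as below. This is not the code in org.apache.mahout.benchmark.VectorBenchmarks: plain double[] arrays stand in for the Mahout vector classes, and the class name, method names, choice of squared Euclidean distance and megabyte accounting are all assumptions made for illustration.

```java
import java.util.Random;

/** Illustrative sketch only; plain double[] stands in for Mahout vectors. */
final class DistanceBenchmarkSketch {

    /** Holds the raw counters of one benchmark run. */
    static final class Result {
        final long distanceCalls;
        final long elapsedNanos;
        final double checksum;

        Result(long distanceCalls, long elapsedNanos, double checksum) {
            this.distanceCalls = distanceCalls;
            this.elapsedNanos = elapsedNanos;
            this.checksum = checksum;
        }
    }

    /** Squared Euclidean distance between two dense vectors of equal length. */
    static double squaredEuclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Builds numVectors dense vectors filled with pseudo-random values. */
    static double[][] randomVectors(int numVectors, int cardinality, long seed) {
        Random random = new Random(seed);
        double[][] vectors = new double[numVectors][cardinality];
        for (double[] vector : vectors) {
            for (int i = 0; i < cardinality; i++) {
                vector[i] = random.nextDouble();
            }
        }
        return vectors;
    }

    /**
     * For every vector, computes numOps distances to other vectors (mimicking
     * k centroids in k-means), and repeats the whole pass loop times.
     */
    static Result run(double[][] vectors, int loop, int numOps) {
        long calls = 0;
        double checksum = 0.0; // accumulated so the JIT cannot discard the work
        long start = System.nanoTime();
        for (int l = 0; l < loop; l++) {
            for (int i = 0; i < vectors.length; i++) {
                for (int k = 0; k < numOps; k++) {
                    double[] other = vectors[(i + k + 1) % vectors.length];
                    checksum += squaredEuclidean(vectors[i], other);
                    calls++;
                }
            }
        }
        return new Result(calls, System.nanoTime() - start, checksum);
    }

    public static void main(String[] args) {
        // Parameters taken from the quoted message.
        int cardinality = 1000, numVectors = 100, loop = 200, numOps = 10;
        Result r = run(randomVectors(numVectors, cardinality, 42L), loop, numOps);

        long vectorsProcessed = (long) loop * numVectors;
        // Assumption: each distance call reads two vectors of 8-byte doubles.
        double megabytesRead = r.distanceCalls * 2.0 * cardinality * 8 / (1024.0 * 1024.0);
        double seconds = r.elapsedNanos / 1e9;
        System.out.printf("vectors=%d distanceCalls=%d MB=%.1f time=%.3fs checksum=%.3f%n",
                vectorsProcessed, r.distanceCalls, megabytesRead, seconds, r.checksum);
    }
}
```

A single timed pass like this includes JIT warm-up, so the numbers are only indicative; swapping squaredEuclidean for a JNI-backed implementation would be one way to try the native-code idea above against the same loop.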