The good thing about this is that it shows SeqAcc is slower than RandAcc on
many of the distance measures. I was wondering whether JNI calls to optimised
native code for the vector computations would perform better.

On that note, I found this, released under the New BSD license:
http://code.google.com/p/netlib-java/
Worth a look.

Robin



On Wed, Feb 17, 2010 at 3:11 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Cool, excellent starting point, Robin!
>
> Some additions in the same vein: dot (and other binary ops) with two
> different impls (with impl1 as the caller and impl2 as the method param, and
> vice versa), looking at what the effects of sparsity are, and creating
> vectors incrementally (with n different set() operations).
>
>  -jake
>
> On Feb 17, 2010 1:24 AM, "Robin Anil" <robin.a...@gmail.com> wrote:
>
> It's checked in under utils:
> org.apache.mahout.benchmark.VectorBenchmarks.
> It currently runs on full vectors (0 to cardinality).
>
> Only create, clone and dot are benchmarked.
> All distance measures are benchmarked, where each unit is k = numOps times
> the time taken to calculate the distance measure between 2 vectors;
> this is to mimic kmeans and other clustering.
> It prints out the number of vectors processed and the number of megabytes
> read (to mimic the speed at which a dataset could be processed).
> I know a lot of assumptions could be wrong, so please feel free to modify.
>
> Sample output for cardinality = 1000, numVectors = 100, loop = 200, numOps = 10:
>
> http://pastebin.com/f1b687091
>
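
For readers following the thread, the benchmark unit described above (numOps distance computations per vector, reporting vectors processed and megabytes read) might look roughly like the sketch below. This is not the code in org.apache.mahout.benchmark.VectorBenchmarks: the class name DistanceBenchSketch, the plain double[] vectors, the Euclidean measure, and the way bytes are counted are all assumptions made for illustration, and the parameter values are simply the ones quoted for the sample run. It is a naive timing loop with no JIT warm-up, so the numbers it prints are only indicative.

```java
import java.util.Random;

// Hypothetical sketch; not the Mahout VectorBenchmarks implementation.
public class DistanceBenchSketch {

    /** Euclidean distance between two dense vectors of equal length. */
    public static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Bytes touched when reading the given number of dense double vectors (assumed accounting). */
    public static long bytesRead(long vectorsRead, int cardinality) {
        return vectorsRead * cardinality * Double.BYTES;
    }

    public static void main(String[] args) {
        // Values taken from the sample run quoted in the thread.
        int cardinality = 1000, numVectors = 100, loop = 200, numOps = 10;

        // Fill fully dense vectors with random values.
        Random rnd = new Random(42);
        double[][] vectors = new double[numVectors][cardinality];
        for (double[] v : vectors) {
            for (int i = 0; i < cardinality; i++) {
                v[i] = rnd.nextDouble();
            }
        }

        double sink = 0.0;      // accumulated so the JIT cannot discard the work
        long vectorsRead = 0;
        long start = System.nanoTime();
        for (int l = 0; l < loop; l++) {
            for (int i = 0; i < numVectors; i++) {
                // One "unit": numOps distance computations for vector i,
                // loosely mimicking a kmeans point compared against k centroids.
                for (int k = 0; k < numOps; k++) {
                    double[] other = vectors[(i + k + 1) % numVectors];
                    sink += euclidean(vectors[i], other);
                    vectorsRead += 2;
                }
            }
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        double megabytes = bytesRead(vectorsRead, cardinality) / (1024.0 * 1024.0);

        System.out.printf("vectors read: %d, MB read: %.1f, seconds: %.3f, MB/s: %.1f (sink=%.3f)%n",
                vectorsRead, megabytes, seconds, megabytes / seconds, sink);
    }
}
```

For numbers worth comparing across implementations, a harness such as JMH that handles warm-up and dead-code elimination would be a safer choice than a hand-rolled loop like this one.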
