On Thu, Feb 3, 2011 at 10:20 AM, Dawid Weiss <[email protected]> wrote:
> We don't use much of math from colt, to be honest.

Excellent.

> Basically matrix decompositions (SVD),

OK. I think I may raid commons math to improve our implementation there.
Are you dependent on assumptions about the ordering of eigenvalues?

> some sorting routines from Sorting class (removed now, but this can be
> replaced)

We will need more details on this.

> and a lot of multiplying/basic operations of vectors and matrices.

Got that, probably.

> One thing we DO use heavily is 2 dimensional matrix representation in a
> double[] array because this allows us to plug in BLAS to work directly
> on Java data, without copying or other manipulations... but then we
> don't have any newer native BLAS build and it's been a pain to compile
> and link it with Java. We care mostly about native Lapack's gesdd (SVD)
> and Blas's gemm (general matrix multiplication); these do provide
> significant speedups when clustering larger data sets using Lingo. But
> I can imagine hardware-accelerated implementations will eventually
> surface inside mahout-math anyway, so we could switch to these instead
> of doing all the trickery we currently do with Colt.

So, the key step here would be to expose a native array from DenseMatrix,
right? What format do you require? Would row-major order be OK?

> So, to summarize: don't worry about us much, really. For now we will
> stick to the mahout-math release that we know works for us. I will try
> to switch to the trunk of mahout-math as a proof of concept (without
> native matrix computations support) and will let you know if I have
> any problems. This is a much larger refactoring than I initially
> thought though.

Sorry about that. I feel your pain, quite literally. But I also really
appreciate your feedback.
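[Editor's note: for readers outside the thread, the row-major double[] layout being discussed can be sketched as follows. Element (i, j) of an m-by-n matrix lives at index i*n + j, which is exactly the flat buffer a native BLAS gemm can operate on without copying. The class and method names below are hypothetical illustrations, not actual mahout-math or Colt API; the gemm here is a naive pure-Java stand-in (alpha=1, beta=0) for the native call.]

```java
import java.util.Arrays;

// Sketch of the row-major flat double[] matrix layout discussed above.
// Names are hypothetical; this is not mahout-math's actual API.
public class RowMajorGemm {

    // Element (i, j) of an m-by-n row-major matrix sits at i * n + j.
    static double get(double[] a, int n, int i, int j) {
        return a[i * n + j];
    }

    // Naive pure-Java equivalent of BLAS gemm (alpha = 1, beta = 0):
    // C (m x n) = A (m x k) * B (k x n), all stored row-major.
    static double[] gemm(double[] a, double[] b, int m, int k, int n) {
        double[] c = new double[m * n];
        for (int i = 0; i < m; i++) {
            for (int p = 0; p < k; p++) {
                double aip = a[i * k + p]; // A(i, p)
                for (int j = 0; j < n; j++) {
                    c[i * n + j] += aip * b[p * n + j]; // C(i, j) += A(i, p) * B(p, j)
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // [[1,2],[3,4]] * [[5,6],[7,8]] = [[19,22],[43,50]]
        double[] a = {1, 2, 3, 4};
        double[] b = {5, 6, 7, 8};
        System.out.println(Arrays.toString(gemm(a, b, 2, 2, 2)));
        // prints [19.0, 22.0, 43.0, 50.0]
    }
}
```

The point of the layout is that the same double[] can be handed either to this Java fallback or, via JNI, to a native dgemm, since both agree on the i*n + j addressing.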
