(I don't know anything Spark-specific, so I'm going to treat this as a Breeze question...)
As I understand it, Spark uses ARPACK via Breeze for SVD, and presumably the same approach can be used for EVD. Basically, you write a function that multiplies your "matrix" (which might be represented implicitly, distributed, whatever) by a breeze.linalg.DenseVector.

This is the Breeze implementation for sparse SVD (which is fully generic and might be hard to follow if you're not used to Breeze/typeclass-heavy Scala...):
https://github.com/dlwh/breeze/blob/aa958688c428db581d853fd92eb35e82f80d8b5c/math/src/main/scala/breeze/linalg/functions/svd.scala#L205-L205

The difference between SVD and EVD in ARPACK (to a first approximation) is that you need to compute A.t * A * x for SVD, and just A * x for EVD.

The basic idea is to implement a Breeze UFunc eig.Impl3 implicit following the svd code (or you could just copy out the body of the function and specialize it). The signature you're looking to implement is:

    implicit def Eig_Sparse_Impl[Mat](implicit mul: OpMulMatrix.Impl2[Mat, DenseVector[Double], DenseVector[Double]],
                                      dimImpl: dim.Impl[Mat, (Int, Int)])
        : eig.Impl3[Mat, Int, Double, EigenvalueResult] = {

The type parameters of Impl3 are: the matrix type, the number of eigenvalues you want (an Int), the tolerance (a Double), and the result type. If you implement this signature, then you can call eig on anything that can be multiplied by a dense vector and that implements dim (to get the number of outputs).

(You'll need to define the EigenvalueResult class to be what you want. I don't immediately know how to unpack ARPACK's answers, but you might look at this scipy code:
https://github.com/thomasnat1/cdcNewsRanker/blob/71b0ff3989d5191dc6a78c40c4a7a9967cbb0e49/venv/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py#L1049 )

I'm happy to help more if you decide to go this route, here, on the scala-breeze Google group, or on GitHub.
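To make the "all the solver needs is a multiply" point concrete, here is a minimal sketch in plain Scala with no Breeze or ARPACK dependency. It is only an illustration under stated assumptions: it swaps in simple power iteration (which finds just the dominant eigenpair of a symmetric operator) where ARPACK would run its implicitly restarted Arnoldi/Lanczos method, and every name in it (MatrixFree, MatVec, powerIteration, times, transposeTimes) is hypothetical rather than Breeze or Spark API. What it shows is that the EVD and SVD cases differ only in the closure you pass in: x => A * x versus x => A.t * A * x.

```scala
// Sketch only: power iteration stands in for ARPACK to show the matrix-free interface.
// All names here are hypothetical; none of them is Breeze or Spark API.
object MatrixFree {
  type Vec = Array[Double]
  type MatVec = Vec => Vec // the only thing the solver knows about the "matrix"

  def dot(x: Vec, y: Vec): Double = {
    var s = 0.0
    var i = 0
    while (i < x.length) { s += x(i) * y(i); i += 1 }
    s
  }

  def norm(x: Vec): Double = math.sqrt(dot(x, x))

  /** Dominant eigenpair of a symmetric operator, given only x => A * x. */
  def powerIteration(mul: MatVec, n: Int, tol: Double = 1e-12,
                     maxIter: Int = 10000): (Double, Vec) = {
    val start: Vec = Array.tabulate(n)(i => 1.0 + i) // arbitrary nonzero start vector
    val startNorm = norm(start)
    var v: Vec = start.map(_ / startNorm)
    var lambda = 0.0
    var iter = 0
    var converged = false
    while (iter < maxIter && !converged) {
      val w = mul(v)
      val next = dot(v, w) // Rayleigh quotient, valid because ||v|| = 1
      val wNorm = norm(w)
      converged = math.abs(next - lambda) <= tol * math.max(1.0, math.abs(next))
      lambda = next
      if (wNorm > 0) v = w.map(_ / wNorm) else converged = true
      iter += 1
    }
    (lambda, v)
  }

  // Dense helpers, used only to build example operators.
  def times(a: Array[Vec])(x: Vec): Vec = a.map(row => dot(row, x))

  def transposeTimes(a: Array[Vec])(y: Vec): Vec =
    Array.tabulate(a(0).length)(j => a.indices.map(i => a(i)(j) * y(i)).sum)

  def main(args: Array[String]): Unit = {
    // EVD case: the operator is x => A * x (A symmetric here).
    val a = Array(Array(2.0, 1.0), Array(1.0, 2.0))
    val (eig, _) = powerIteration(x => times(a)(x), n = 2)

    // SVD case: the operator is x => A.t * A * x; singular value = sqrt(eigenvalue).
    val b = Array(Array(1.0, 1.0), Array(0.0, 1.0))
    val (gram, _) = powerIteration(x => transposeTimes(b)(times(b)(x)), n = 2)

    println(s"largest eigenvalue of a: $eig")
    println(s"largest singular value of b: ${math.sqrt(gram)}")
  }
}
```

In a distributed setting the closure would be where the work gets parallelized (each call to mul being one pass over the data), while the vectors themselves stay small and local. That is the same division of labor the Impl3 signature above encodes through OpMulMatrix.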
-- David

On Tue, Jan 12, 2016 at 10:28 AM, Lydia Ickler <ickle...@googlemail.com> wrote:

> Hi,
>
> I wanted to know if there are any implementations yet within the Machine
> Learning Library or generally that can efficiently solve eigenvalue
> problems?
> Or if not do you have suggestions on how to approach a parallel execution
> maybe with BLAS or Breeze?
>
> Thanks in advance!
> Lydia