(I don't know anything Spark-specific, so I'm going to treat this as a
Breeze question...)

As I understand it, Spark uses ARPACK via Breeze for SVD, and presumably
the same approach can be used for EVD. Basically, you make a function that
multiplies your "matrix" (which might be represented
implicitly/distributed, whatever) by a breeze.linalg.DenseVector.
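
To make the "function that multiplies" idea concrete, here is a minimal
sketch. The Tridiag class and dominantEigenvalue helper are hypothetical
names of mine, not part of Breeze or Spark, and the solver is plain power
iteration rather than what ARPACK actually does (ARPACK uses implicitly
restarted Arnoldi/Lanczos iterations). The point is only that the solver
touches the matrix through the multiply function and nothing else.

```scala
import breeze.linalg.{DenseVector, norm}

object MatrixFreeSketch {
  // Hypothetical implicit "matrix": the n x n tridiagonal matrix with 2 on
  // the diagonal and -1 on the off-diagonals. No entries are ever stored.
  final case class Tridiag(n: Int) {
    def *(x: DenseVector[Double]): DenseVector[Double] = {
      require(x.length == n, s"expected length $n, got ${x.length}")
      DenseVector.tabulate(n) { i =>
        val left  = if (i > 0) x(i - 1) else 0.0
        val right = if (i < n - 1) x(i + 1) else 0.0
        2.0 * x(i) - left - right
      }
    }
  }

  // Power iteration (a stand-in for ARPACK, for illustration only):
  // estimates the eigenvalue of largest magnitude of a symmetric operator
  // using nothing but matrix-vector products.
  def dominantEigenvalue(
      multiply: DenseVector[Double] => DenseVector[Double],
      n: Int,
      iterations: Int = 500): Double = {
    var v = DenseVector.tabulate(n)(i => 1.0 + i) // arbitrary nonzero start
    v = v / norm(v)
    var lambda = 0.0
    for (_ <- 0 until iterations) {
      val w = multiply(v)
      lambda = v dot w // Rayleigh quotient, since v has unit norm
      v = w / norm(w)
    }
    lambda
  }
}
```

For example, with val op = MatrixFreeSketch.Tridiag(5), calling
MatrixFreeSketch.dominantEigenvalue(x => op * x, 5) should approach
2 + sqrt(3), the largest eigenvalue of that matrix. A distributed matrix
would plug in the same way, as long as its multiply returns a local
DenseVector.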

This is the Breeze implementation for sparse SVD (which is fully generic
and might be hard to follow if you're not used to Breeze/typeclass-heavy
Scala...)

https://github.com/dlwh/breeze/blob/aa958688c428db581d853fd92eb35e82f80d8b5c/math/src/main/scala/breeze/linalg/functions/svd.scala#L205-L205

The difference between SVD and EVD in ARPACK (to a first approximation) is
that for SVD you need to compute A.t * A * x, whereas for EVD you just
compute A * x.
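
As a rough illustration of that difference, here are the two operators
written against a local DenseMatrix. The object and method names are made up
for this sketch; in the real setting A would be whatever implicit or
distributed type you have, and only the multiply would change.

```scala
import breeze.linalg.{DenseMatrix, DenseVector}

object OperatorSketch {
  type Op = DenseVector[Double] => DenseVector[Double]

  // EVD: the iteration is driven by x => A * x (A must be square).
  def evdOperator(a: DenseMatrix[Double]): Op =
    x => a * x

  // SVD: the iteration is driven by x => A.t * (A * x). The eigenvalues of
  // A.t * A are the squared singular values of A. Parenthesizing as
  // A.t * (A * x) avoids ever forming the matrix A.t * A.
  def svdOperator(a: DenseMatrix[Double]): Op =
    x => a.t * (a * x)
}
```

Note that A.t * A is always symmetric, whereas for EVD the symmetry of the
operator depends on A itself.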

The basic idea is to implement a Breeze UFunc eig.Impl3 implicit, following
the svd code (or you could just copy out the body of the function and
specialize it). The signature you're looking to implement is:

implicit def Eig_Sparse_Impl[Mat](
    implicit mul: OpMulMatrix.Impl2[Mat, DenseVector[Double], DenseVector[Double]],
    dimImpl: dim.Impl[Mat, (Int, Int)])
  : eig.Impl3[Mat, Int, Double, EigenvalueResult] = {

The type parameters of Impl3 are, in order: the matrix type, the type of the
number of eigenvalues you want (Int), the type of the tolerance (Double),
and the result type. If you implement this signature, then you can call eig
on anything that can be multiplied by a dense vector and that implements
dim (to get the number of outputs).

(You'll need to define the class EigenvalueResult to be whatever you want. I
don't immediately know how to unpack ARPACK's answers, but you might look
at how scipy does it:
https://github.com/thomasnat1/cdcNewsRanker/blob/71b0ff3989d5191dc6a78c40c4a7a9967cbb0e49/venv/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/arpack/arpack.py#L1049
)
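
One possible shape for that result class, purely as a sketch: the field
names and the maxResidual helper are my own assumptions, not an existing
Breeze type, and it assumes real eigenvalues (i.e. a symmetric operator).
The residual check is a cheap way to confirm you unpacked ARPACK's output
correctly.

```scala
import breeze.linalg.{DenseMatrix, DenseVector, norm}

// Hypothetical result type: eigenvalues(i) pairs with column i of
// eigenvectors.
final case class EigenvalueResult(
    eigenvalues: DenseVector[Double],
    eigenvectors: DenseMatrix[Double]) {
  require(eigenvectors.cols == eigenvalues.length,
    "need one eigenvector column per eigenvalue")

  // Largest residual ||A * v_i - lambda_i * v_i|| over all returned pairs,
  // computed through the same multiply function the solver was given.
  def maxResidual(
      multiply: DenseVector[Double] => DenseVector[Double]): Double =
    (0 until eigenvalues.length).foldLeft(0.0) { (worst, i) =>
      val v = eigenvectors(::, i)
      math.max(worst, norm(multiply(v) - v * eigenvalues(i)))
    }
}
```

A residual near the requested tolerance suggests the pairs are right; a
large one usually means the eigenvector columns were sliced out of ARPACK's
workspace in the wrong order or with the wrong stride.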

I'm happy to help more if you decide to go this route, whether here, on the
scala-breeze Google group, or on GitHub.

-- David


On Tue, Jan 12, 2016 at 10:28 AM, Lydia Ickler <ickle...@googlemail.com>
wrote:

> Hi,
>
> I wanted to know if there are any implementations yet within the Machine
> Learning Library or generally that can efficiently solve eigenvalue
> problems?
> Or if not do you have suggestions on how to approach a parallel execution
> maybe with BLAS or Breeze?
>
> Thanks in advance!
> Lydia
>
>
> Sent from my iPhone
