The main bottleneck of the current SVD implementation is driver-node
memory: it requires at least 5*n*k doubles on the driver, because all
right singular vectors are stored there, plus some working memory.
So it is bounded by the smaller dimension of your matrix.
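As a rough sketch of that bound (the 5*n*k-doubles figure is from the message above; the concrete n and k below are made-up example sizes, not values from the thread):

```python
# Rough driver-memory estimate for the SVD setup described above.
# The factor 5*n*k doubles comes from the message; n and k here
# are hypothetical example sizes.
def svd_driver_memory_bytes(n, k, bytes_per_double=8):
    """At least 5 * n * k doubles must fit in driver memory."""
    return 5 * n * k * bytes_per_double

# Example: a matrix with smaller dimension n = 1,000,000, first k = 100
# singular values.
n, k = 1_000_000, 100
gb = svd_driver_memory_bytes(n, k) / 1e9
print(f"~{gb:.0f} GB of driver memory")  # ~4 GB
```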
@Miles, eigen-decomposition of an asymmetric matrix doesn't always give
real-valued solutions, and it lacks the nice properties that a
symmetric matrix has. Usually you want to symmetrize your asymmetric
matrix in some way, e.g. see
http://machinelearning.wustl.edu/mlpapers/paper_files/icml200
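One common symmetrization (an illustration of the idea, not necessarily the scheme used in the linked paper) is B = (A + A^T) / 2, which is always symmetric and therefore has real eigenvalues:

```python
# Symmetrize an asymmetric matrix as B = (A + A^T) / 2.
# This is one standard choice; the linked paper may use a different scheme.
def symmetrize(a):
    n = len(a)
    return [[(a[i][j] + a[j][i]) / 2 for j in range(n)] for i in range(n)]

A = [[0.0, 2.0],
     [4.0, 0.0]]
B = symmetrize(A)
print(B)  # [[0.0, 3.0], [3.0, 0.0]] -- symmetric, so its eigenvalues are real
```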
@Miles, the latest SVD implementation in MLlib is partially distributed.
Matrix-vector multiplication is computed across all workers, but the right
singular vectors are all stored on the driver. If your symmetric matrix is
n x n and you want the first k eigenvalues, you will need to fit n x k
doubles in driver memory.
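The split described above can be sketched as follows; the row-partitioning and function names are illustrative, not MLlib's actual internals:

```python
# Sketch of the "partially distributed" pattern described above: workers
# each hold a block of rows and compute partial A*v products, while the
# driver keeps the n x k basis vectors. Names are illustrative only.
def worker_matvec(row_block, v):
    """One worker multiplies its block of rows by the broadcast vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in row_block]

def distributed_matvec(row_blocks, v):
    """Driver concatenates the workers' partial results into A*v."""
    result = []
    for block in row_blocks:  # in Spark this would be a map over an RDD
        result.extend(worker_matvec(block, v))
    return result

# A 4 x 4 diagonal matrix split across two "workers".
blocks = [[[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]],
          [[0.0, 0.0, 3.0, 0.0], [0.0, 0.0, 0.0, 4.0]]]
v = [1.0, 1.0, 1.0, 1.0]
print(distributed_matvec(blocks, v))  # [1.0, 2.0, 3.0, 4.0]
```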
I like the idea of using Scala to drive the workflow. Spark already comes
with a scheduler; why not write a plugin to schedule other types of tasks
(copy a file, send an email, etc.)? Scala could handle any logic required by
the pipeline, and passing objects (including RDDs) between tasks is also easier.
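A minimal sketch of the idea, with plain functions standing in for tasks and objects flowing between them directly (this only illustrates the pattern; it is not Spark's scheduler API):

```python
# Minimal sketch of a code-driven workflow: tasks are plain functions,
# and each task's output object is passed straight to the next task
# instead of going through files. Not Spark's scheduler API.
def run_pipeline(tasks, initial):
    """Run tasks in order, feeding each task's result to the next."""
    result = initial
    for task in tasks:
        result = task(result)
    return result

def load(path):      return ["alice", "bob"]              # stand-in for an RDD
def transform(data): return [name.upper() for name in data]
def report(data):    return f"processed {len(data)} records"

print(run_pipeline([load, transform, report], "hdfs://input"))
# processed 2 records
```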
http://spark.apache.org/docs/0.9.0/mllib-guide.html#collaborative-filtering-1
One thing that is undocumented: the integers representing users and
items have to be positive; otherwise it throws exceptions.
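If your raw IDs are not positive integers already, one workaround (a sketch; the mapping scheme below is my own, not from the Spark docs) is to remap them before training:

```python
# Remap arbitrary user/item keys to positive integers before feeding ALS,
# since (as noted above) non-positive IDs cause exceptions.
# The mapping scheme here is illustrative, not from the Spark docs.
def build_id_map(keys):
    """Assign each distinct key a positive integer, starting at 1."""
    return {key: i for i, key in enumerate(sorted(set(keys)), start=1)}

raw_users = ["u-42", "u-7", "u-42", "guest"]
user_ids = build_id_map(raw_users)
print(user_ids)  # {'guest': 1, 'u-42': 2, 'u-7': 3}
```

Keep the map around so you can translate ALS's output back to the original keys.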
Li
On 28 avr. 2014, at 10:30, Diana Carroll wrote:
> Hi everyone. I'm trying to run som