Aren't the vectors printed by U.rows.toArray.take(1).foreach(println) and V.toArray.take(s.size).foreach(println) the eigenvectors corresponding to the biggest eigenvalue s.toArray(0)*s.toArray(0)?
xj @ Tokyo

On Fri, Aug 8, 2014 at 12:07 PM, Chunnan Yao <yaochun...@gmail.com> wrote:
> Hi there, what you've suggested is all meaningful. But to make myself
> clearer, my essential problems are:
> 1. My matrix is asymmetric: it is a probabilistic adjacency matrix, whose
> entries a_ij represent the likelihood that user j will broadcast the
> information generated by user i. Apparently a_ij and a_ji differ, because
> "I love you" doesn't necessarily mean you love me (what a sad story!). All
> entries are real.
> 2. I know I can get eigenvalues through SVD. My problem is that I can't
> get the corresponding eigenvectors, which requires solving equations, and
> I also need the eigenvectors in my calculation. In my simulation of this
> paper, I only need the biggest eigenvalue and its corresponding
> eigenvector.
> The paper posted by Shivaram Venkataraman is also concerned with symmetric
> matrices. Could anyone help me out?
>
> 2014-08-08 9:41 GMT+08:00 x <wasedax...@gmail.com>:
>
>> The computed SVD result already contains the singular values in
>> descending order, so you can get the biggest eigenvalue.
>>
>> ---
>>
>> val svd = matrix.computeSVD(matrix.numCols().toInt, computeU = true)
>> val U: RowMatrix = svd.U
>> val s: Vector = svd.s
>> val V: Matrix = svd.V
>>
>> U.rows.toArray.take(1).foreach(println)
>>
>> println(s.toArray(0) * s.toArray(0))
>>
>> V.toArray.take(s.size).foreach(println)
>>
>> ---
>>
>> xj @ Tokyo
>>
>> On Fri, Aug 8, 2014 at 3:06 AM, Shivaram Venkataraman <
>> shiva...@eecs.berkeley.edu> wrote:
>>
>>> If you just want to find the top eigenvalue / eigenvector, you can do
>>> something like the Lanczos method. There is a description of a
>>> MapReduce-based algorithm in Section 4.2 of [1].
>>>
>>> [1] http://www.cs.cmu.edu/~ukang/papers/HeigenPAKDD2011.pdf
>>>
>>> On Thu, Aug 7, 2014 at 10:54 AM, Li Pu <l...@twitter.com.invalid> wrote:
>>>
>>>> @Miles, the latest SVD implementation in mllib is partially
>>>> distributed.
>>>> Matrix-vector multiplication is computed among all the workers, but
>>>> the right singular vectors are all stored in the driver. If your
>>>> symmetric matrix is n x n and you want the first k eigenvalues, you
>>>> will need to fit n x k doubles in the driver's memory. Behind the
>>>> scenes, it calls ARPACK to compute the eigendecomposition of A^T A.
>>>> You can look into the source code for the details.
>>>>
>>>> @Sean, the SVD++ implementation in graphx is not the canonical
>>>> definition of SVD. It doesn't have the orthogonality that SVD holds.
>>>> But we might want to use graphx as the underlying matrix
>>>> representation for mllib.SVD to address the problem of skewed entry
>>>> distribution.
>>>>
>>>> On Thu, Aug 7, 2014 at 10:51 AM, Evan R. Sparks <evan.spa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Reza Zadeh has contributed the distributed implementation of
>>>>> (Tall/Skinny) SVD (
>>>>> http://spark.apache.org/docs/latest/mllib-dimensionality-reduction.html),
>>>>> which is in MLlib (Spark 1.0), and a distributed sparse SVD is coming
>>>>> in Spark 1.1 (https://issues.apache.org/jira/browse/SPARK-1782). If
>>>>> your data is sparse (which it often is in social networks), you may
>>>>> have better luck with this.
>>>>>
>>>>> I haven't tried the GraphX implementation, but those algorithms are
>>>>> often well suited to the power-law distributed graphs you might see
>>>>> in social networks.
>>>>>
>>>>> FWIW, I believe you need to square the elements of the sigma matrix
>>>>> from the SVD to get the eigenvalues.
>>>>>
>>>>> On Thu, Aug 7, 2014 at 10:20 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>>
>>>>>> (-incubator, +user)
>>>>>>
>>>>>> If your matrix is symmetric (and real, I presume), and if my linear
>>>>>> algebra isn't too rusty, then its SVD is its eigendecomposition. The
>>>>>> SingularValueDecomposition object you get back has U and V, both of
>>>>>> which have columns that are the eigenvectors.
>>>>>>
>>>>>> There are a few SVDs in the Spark code.
>>>>>> The one in mllib is not distributed (right?) and is probably not an
>>>>>> efficient means of computing eigenvectors if you really just want a
>>>>>> decomposition of a symmetric matrix.
>>>>>>
>>>>>> The one I see in graphx is distributed? I haven't used it, though.
>>>>>> Maybe it could be part of a solution.
>>>>>>
>>>>>> On Thu, Aug 7, 2014 at 2:21 PM, yaochunnan <yaochun...@gmail.com>
>>>>>> wrote:
>>>>>> > Our lab needs to do some simulation of online social networks. We
>>>>>> > need to handle a 5000 x 5000 adjacency matrix, namely, to get its
>>>>>> > largest eigenvalue and the corresponding eigenvector. Matlab can
>>>>>> > be used, but it is time-consuming. Is Spark effective for linear
>>>>>> > algebra calculations and transformations? Later we will have a
>>>>>> > 5000000 x 5000000 matrix to process. It seems urgent that we find
>>>>>> > a distributed computation platform.
>>>>>> >
>>>>>> > I see SVD has been implemented, and I can get the eigenvalues of a
>>>>>> > matrix through this API. But when I want to get both the
>>>>>> > eigenvalues and eigenvectors, or at least the biggest eigenvalue
>>>>>> > and the corresponding eigenvector, it seems that current Spark
>>>>>> > doesn't have such an API. Is it possible for me to write the
>>>>>> > eigenvalue decomposition from scratch? What should I do? Thanks a
>>>>>> > lot!
>>>>>> >
>>>>>> > Miles Yao
>>>>>> >
>>>>>> > ________________________________
>>>>>> > View this message in context: How can I implement eigenvalue
>>>>>> > decomposition in Spark?
>>>>>> > Sent from the Apache Spark User List mailing list archive at
>>>>>> > Nabble.com.
>>>>
>>>> --
>>>> Li
>>>> @vrilleup
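Since the core question in the thread is how to get just the largest eigenvalue and its eigenvector, note that the Lanczos/power-method family Shivaram mentions needs only repeated matrix-vector products, which is exactly the step one would distribute. Below is a minimal single-machine power-iteration sketch in plain Scala; the 2 x 2 matrix, iteration count, and object name are illustrative choices, not anything from the thread:

```scala
object PowerIteration {
  // Multiply a dense matrix (row-major Array[Array[Double]]) by a vector.
  def matVec(a: Array[Array[Double]], v: Array[Double]): Array[Double] =
    a.map(row => row.zip(v).map { case (x, y) => x * y }.sum)

  def norm(v: Array[Double]): Double = math.sqrt(v.map(x => x * x).sum)

  // Power iteration: returns (dominant eigenvalue, eigenvector).
  // Works for asymmetric matrices too, provided the dominant eigenvalue
  // is real and strictly largest in magnitude.
  def dominantEigenpair(a: Array[Array[Double]],
                        iters: Int = 200): (Double, Array[Double]) = {
    var v = Array.fill(a.length)(1.0 / math.sqrt(a.length.toDouble))
    for (_ <- 1 to iters) {
      val av = matVec(a, v)
      v = av.map(_ / norm(av)) // renormalize each step
    }
    // The Rayleigh quotient v . (A v) recovers the signed eigenvalue.
    val av = matVec(a, v)
    (v.zip(av).map { case (x, y) => x * y }.sum, v)
  }

  def main(args: Array[String]): Unit = {
    val a = Array(Array(2.0, 1.0), Array(1.0, 2.0)) // eigenvalues 3 and 1
    val (lambda, v) = dominantEigenpair(a)
    println(f"lambda = $lambda%.6f")              // ~3.0
    println(v.map(x => f"$x%.6f").mkString(", ")) // ~ (0.707107, 0.707107)
  }
}
```

In a distributed setting only matVec changes: each worker computes the dot products for its partition of rows (for example via mllib's RowMatrix.multiply), while the normalization of the small result vector stays on the driver.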
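Evan's remark about squaring the sigma elements, and Sean's point about symmetric matrices, can both be checked on a tiny example: for real A, the singular values are the square roots of the eigenvalues of A^T A, and for symmetric A they are the absolute values of A's own eigenvalues. A self-contained numeric check in plain Scala (the 2 x 2 matrix is an arbitrary illustration, chosen so the eigenvalues have a closed form):

```scala
object SigmaCheck {
  // Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]] in closed form.
  def symEigenvalues(a: Double, b: Double, d: Double): (Double, Double) = {
    val mean = (a + d) / 2.0
    val r = math.sqrt(((a - d) / 2.0) * ((a - d) / 2.0) + b * b)
    (mean + r, mean - r)
  }

  def main(args: Array[String]): Unit = {
    // Symmetric example: A = [[2, 1], [1, 2]], eigenvalues 3 and 1.
    val (e1, e2) = symEigenvalues(2.0, 1.0, 2.0)

    // A^T A = A * A = [[5, 4], [4, 5]], eigenvalues 9 and 1.
    val (g1, g2) = symEigenvalues(5.0, 4.0, 5.0)

    // Singular values of A = sqrt of eigenvalues of A^T A...
    val (s1, s2) = (math.sqrt(g1), math.sqrt(g2))
    println(s"sigma = ($s1, $s2), eig(A) = ($e1, $e2)")
    // ...so squaring sigma recovers the eigenvalues of A^T A,
    assert(math.abs(s1 * s1 - g1) < 1e-12)
    // and for symmetric A, sigma matches |eigenvalues| of A itself.
    assert(math.abs(s1 - math.abs(e1)) < 1e-12)
    assert(math.abs(s2 - math.abs(e2)) < 1e-12)
  }
}
```

This also illustrates Chunnan's difficulty: for an asymmetric matrix the correspondence breaks down, and U and V hold singular vectors rather than eigenvectors, so the SVD route alone does not give the desired eigenpair.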