I did put a reference to Halko's dissertation in the footnotes, but didn't explicitly reference it in the text (I haven't looked at it closely, so I wasn't sure where to reference it). The parallelization strategy, right?
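While putting the page together I also sanity-checked the algorithm outline numerically. This is just a rough NumPy sketch of the nine steps against a made-up low-rank test matrix (dimensions and seed invented for illustration; not the R simulation, which we could still add for the site):

```python
import numpy as np

def ssvd_sim(A, k, p=15, q=0, seed=1234):
    # Sketch of the modified SSVD outline; defaults mirror dssvd(...),
    # everything else here is made up for illustration.
    m, n = A.shape
    r = k + p
    rng = np.random.default_rng(seed)

    # Steps 1-2: Y = A * Omega, with Omega an n x (k+p) Gaussian matrix
    Omega = rng.standard_normal((n, r))
    Y = A @ Omega

    # Step 3: column-orthonormalize Y via thin QR
    Q, _ = np.linalg.qr(Y)

    # Step 4: B_0 = Q' A
    B = Q.T @ A

    # Step 5: q additional power iterations
    for _ in range(q):
        Q, _ = np.linalg.qr(A @ B.T)
        B = Q.T @ A

    # Step 6: eigensolution of the small (k+p) x (k+p) matrix B B'
    lam, Uhat = np.linalg.eigh(B @ B.T)
    lam, Uhat = lam[::-1], Uhat[:, ::-1]   # eigh sorts ascending; flip

    # Step 7: singular values sigma_i = sqrt(lambda_i)
    s = np.sqrt(np.maximum(lam[:k], 0.0))

    # Steps 8-9: U = Q Uhat, V = B' Uhat Sigma^{-1}
    U = (Q @ Uhat)[:, :k]
    V = (B.T @ Uhat)[:, :k] / s
    return U, V, s

# Compare against the exact SVD on a random low-rank matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 100))
U, V, s = ssvd_sim(A, k=10, q=1)
print(np.allclose(s, np.linalg.svd(A, compute_uv=False)[:10]))
```

Since the test matrix has exact rank 10 and k + p exceeds it, the random projection captures the full range and the recovered singular values match the exact SVD.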
The R code could be good to show for comparison. Will try to update it over the weekend.

Sent from my Verizon Wireless 4G LTE smartphone

-------- Original message --------
From: Dmitriy Lyubimov <[email protected]>
Date: 03/27/2015 8:00 PM (GMT-05:00)
To: [email protected]
Subject: Re: math-scala dssvd docs

and R simulation sources perhaps ...

On Fri, Mar 27, 2015 at 4:57 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Andrew, thanks a lot!
>
> I think an acknowledgement and reference to N. Halko's dissertation from the
> MR page is also worthy of mention on this page as well.
>
> On Fri, Mar 27, 2015 at 4:41 PM, Andrew Palumbo <[email protected]> wrote:
>
>> Ok, I think everything should be good. I'll try to get (d)spca and
>> possibly d-als up. I think it does make sense to put the in-core algos on
>> that page as well. Will do that soon.
>>
>> Yeah, showing that it is almost line for line (a few more, maybe ~1.25
>> lines per step) to follow the algorithms step by step is, I think, good to
>> have up on the site.
>>
>> Please let me know if you see any other changes that need to be made.
>>
>> On 03/27/2015 07:06 PM, Dmitriy Lyubimov wrote:
>>
>>> In fact, the algorithm just executes the outlined formulas. Not always
>>> line for line, but step for step for sure.
>>>
>>> On Fri, Mar 27, 2015 at 4:05 PM, Andrew Palumbo <[email protected]> wrote:
>>>
>>>> Just committed and published it. It's pretty cool to see that there are
>>>> not a whole lot more lines in the implementation than there are steps
>>>> in the algorithm.
>>>>
>>>> On 03/27/2015 06:58 PM, Andrew Palumbo wrote:
>>>>
>>>>> On 03/27/2015 06:46 PM, Dmitriy Lyubimov wrote:
>>>>>
>>>>>> Note also that all these related beasts come in pairs (in-core input
>>>>>> <-> distributed input):
>>>>>>
>>>>>> ssvd - dssvd
>>>>>> spca - dspca
>>>>>
>>>>> Yeah, I've been thinking that I'd give a less detailed description of
>>>>> them (the in-core algos) on an overview page (which would include the
>>>>> basics, operators, etc.). It probably makes sense to reference them
>>>>> here as well. I'd like to get most of the manual easily viewable on
>>>>> different pages.
>>>>>
>>>>>> Yes. Except it doesn't follow the same parallel reordered Givens QR,
>>>>>> but uses Cholesky QR (which we call "thin QR") as an easy-to-implement
>>>>>> shortcut. But this page makes no mention of QR specifics, I think.
>>>>>
>>>>> Right, no QR specifics, just the dqrThin(...) call in the code. I'm
>>>>> going to put the link directly below the Cholesky QR link so that will
>>>>> tie together well.
>>>>>
>>>>>> On Fri, Mar 27, 2015 at 3:45 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>
>>>>>>> But the MR version of SSVD is more stable because of the QR
>>>>>>> differences.
>>>>>>>
>>>>>>> On Fri, Mar 27, 2015 at 3:44 PM, Dmitriy Lyubimov <[email protected]> wrote:
>>>>>>>
>>>>>>>> Yes. Except it doesn't follow the same parallel reordered Givens QR,
>>>>>>>> but uses Cholesky QR (which we call "thin QR") as an
>>>>>>>> easy-to-implement shortcut. But this page makes no mention of QR
>>>>>>>> specifics, I think.
>>>>>>>>
>>>>>>>> On Fri, Mar 27, 2015 at 12:57 PM, Andrew Palumbo <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> math-scala dssvd follows the same algorithm as MR ssvd, correct?
>>>>>>>>> Looking at the code against the algorithm outlined at the bottom of
>>>>>>>>> http://mahout.apache.org/users/dim-reduction/ssvd.html it seems the
>>>>>>>>> same, but I wanted to make sure I'm not missing anything before I
>>>>>>>>> put the following doc up (with the algorithm taken from the MR ssvd
>>>>>>>>> page):
>>>>>>>>>
>>>>>>>>> # Distributed Stochastic Singular Value Decomposition
>>>>>>>>>
>>>>>>>>> ## Intro
>>>>>>>>>
>>>>>>>>> Mahout has a distributed implementation of Stochastic Singular
>>>>>>>>> Value Decomposition [1].
>>>>>>>>>
>>>>>>>>> ## Modified SSVD Algorithm
>>>>>>>>>
>>>>>>>>> Given an `\(m\times n\)` matrix `\(\mathbf{A}\)`, a target rank
>>>>>>>>> `\(k\in\mathbb{N}_{1}\)`, an oversampling parameter
>>>>>>>>> `\(p\in\mathbb{N}_{1}\)`, and the number of additional power
>>>>>>>>> iterations `\(q\in\mathbb{N}_{0}\)`, this procedure computes an
>>>>>>>>> `\(m\times\left(k+p\right)\)` SVD
>>>>>>>>> `\(\mathbf{A}\approx\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}\)`:
>>>>>>>>>
>>>>>>>>> 1. Create a seed for the random `\(n\times\left(k+p\right)\)`
>>>>>>>>> matrix `\(\boldsymbol{\Omega}\)`. The seed defines matrix
>>>>>>>>> `\(\boldsymbol{\Omega}\)` using Gaussian unit vectors per one of
>>>>>>>>> the suggestions in [Halko, Martinsson, Tropp].
>>>>>>>>>
>>>>>>>>> 2. `\(\mathbf{Y}=\mathbf{A}\boldsymbol{\Omega},\,\mathbf{Y}\in\mathbb{R}^{m\times\left(k+p\right)}\)`.
>>>>>>>>>
>>>>>>>>> 3. Column-orthonormalize `\(\mathbf{Y}\rightarrow\mathbf{Q}\)` by
>>>>>>>>> computing the thin decomposition `\(\mathbf{Y}=\mathbf{Q}\mathbf{R}\)`.
>>>>>>>>> Also, `\(\mathbf{Q}\in\mathbb{R}^{m\times\left(k+p\right)},\,\mathbf{R}\in\mathbb{R}^{\left(k+p\right)\times\left(k+p\right)}\)`;
>>>>>>>>> denoted as `\(\mathbf{Q}=\mbox{qr}\left(\mathbf{Y}\right).\mathbf{Q}\)`.
>>>>>>>>>
>>>>>>>>> 4. `\(\mathbf{B}_{0}=\mathbf{Q}^{\top}\mathbf{A}:\,\,\mathbf{B}\in\mathbb{R}^{\left(k+p\right)\times n}\)`.
>>>>>>>>>
>>>>>>>>> 5. If `\(q>0\)`, repeat for `\(i=1..q\)`:
>>>>>>>>> `\(\mathbf{B}_{i}^{\top}=\mathbf{A}^{\top}\mbox{qr}\left(\mathbf{A}\mathbf{B}_{i-1}^{\top}\right).\mathbf{Q}\)`
>>>>>>>>> (power iteration step).
>>>>>>>>>
>>>>>>>>> 6. Compute the eigensolution of the small Hermitian matrix
>>>>>>>>> `\(\mathbf{B}_{q}\mathbf{B}_{q}^{\top}=\hat{\mathbf{U}}\boldsymbol{\Lambda}\hat{\mathbf{U}}^{\top}\)`,
>>>>>>>>> `\(\mathbf{B}_{q}\mathbf{B}_{q}^{\top}\in\mathbb{R}^{\left(k+p\right)\times\left(k+p\right)}\)`.
>>>>>>>>>
>>>>>>>>> 7. Singular values `\(\boldsymbol{\Sigma}=\boldsymbol{\Lambda}^{0.5}\)`,
>>>>>>>>> or, in other words, `\(\sigma_{i}=\sqrt{\lambda_{i}}\)`.
>>>>>>>>>
>>>>>>>>> 8. If needed, compute `\(\mathbf{U}=\mathbf{Q}\hat{\mathbf{U}}\)`.
>>>>>>>>>
>>>>>>>>> 9. If needed, compute `\(\mathbf{V}=\mathbf{B}_{q}^{\top}\hat{\mathbf{U}}\boldsymbol{\Sigma}^{-1}\)`.
>>>>>>>>> Another way is `\(\mathbf{V}=\mathbf{A}^{\top}\mathbf{U}\boldsymbol{\Sigma}^{-1}\)`.
>>>>>>>>>
>>>>>>>>> ## Implementation
>>>>>>>>>
>>>>>>>>> Mahout `dssvd(...)` is implemented in the Mahout `math-scala`
>>>>>>>>> algebraic optimizer, which translates Mahout's R-like linear
>>>>>>>>> algebra operators into a physical plan for both the Spark and H2O
>>>>>>>>> distributed engines.
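(Aside, since the page itself stays away from QR specifics: the dqrThin(...) call in the implementation is the Cholesky "thin QR" shortcut rather than the MR version's parallel reordered Givens QR. A minimal NumPy illustration of the Cholesky-QR idea, not Mahout's actual implementation:)

```python
import numpy as np

def thin_qr(Y):
    # Cholesky ("thin") QR: factor the small (k+p) x (k+p) Gramian Y'Y
    # and back out Q. Only the Gramian needs to be centralized, which is
    # what makes it an easy distributed shortcut; it is less numerically
    # stable than Givens/Householder QR for ill-conditioned Y.
    L = np.linalg.cholesky(Y.T @ Y)        # Y'Y = L L', L lower-triangular
    R = L.T                                # so Y'Y = R'R, R upper-triangular
    Q = np.linalg.solve(R.T, Y.T).T        # Y = Q R  =>  Q = Y R^{-1}
    return Q, R

# Tall-and-skinny Y, as produced by Y = A * Omega in the algorithm.
Y = np.random.default_rng(42).standard_normal((1000, 25))
Q, R = thin_qr(Y)
print(np.allclose(Q.T @ Q, np.eye(25)), np.allclose(Q @ R, Y))
```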
>>>>>>>>>
>>>>>>>>>     def dssvd[K: ClassTag](drmA: DrmLike[K], k: Int, p: Int = 15, q: Int = 0):
>>>>>>>>>         (DrmLike[K], DrmLike[Int], Vector) = {
>>>>>>>>>
>>>>>>>>>       val drmAcp = drmA.checkpoint()
>>>>>>>>>
>>>>>>>>>       val m = drmAcp.nrow
>>>>>>>>>       val n = drmAcp.ncol
>>>>>>>>>       assert(k <= (m min n), "k cannot be greater than smaller of m, n.")
>>>>>>>>>       val pfxed = safeToNonNegInt((m min n) - k min p)
>>>>>>>>>
>>>>>>>>>       // Actual decomposition rank
>>>>>>>>>       val r = k + pfxed
>>>>>>>>>
>>>>>>>>>       // We represent Omega by its seed.
>>>>>>>>>       val omegaSeed = RandomUtils.getRandom().nextInt()
>>>>>>>>>
>>>>>>>>>       // Compute Y = A * Omega.
>>>>>>>>>       var drmY = drmAcp.mapBlock(ncol = r) {
>>>>>>>>>         case (keys, blockA) =>
>>>>>>>>>           val blockY = blockA %*% Matrices.symmetricUniformView(n, r, omegaSeed)
>>>>>>>>>           keys -> blockY
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>       var drmQ = dqrThin(drmY.checkpoint())._1
>>>>>>>>>
>>>>>>>>>       // Checkpoint Q if last iteration
>>>>>>>>>       if (q == 0) drmQ = drmQ.checkpoint()
>>>>>>>>>
>>>>>>>>>       var drmBt = drmAcp.t %*% drmQ
>>>>>>>>>
>>>>>>>>>       // Checkpoint B' if last iteration
>>>>>>>>>       if (q == 0) drmBt = drmBt.checkpoint()
>>>>>>>>>
>>>>>>>>>       for (i <- 0 until q) {
>>>>>>>>>         drmY = drmAcp %*% drmBt
>>>>>>>>>         drmQ = dqrThin(drmY.checkpoint())._1
>>>>>>>>>
>>>>>>>>>         // Checkpoint Q if last iteration
>>>>>>>>>         if (i == q - 1) drmQ = drmQ.checkpoint()
>>>>>>>>>
>>>>>>>>>         drmBt = drmAcp.t %*% drmQ
>>>>>>>>>
>>>>>>>>>         // Checkpoint B' if last iteration
>>>>>>>>>         if (i == q - 1) drmBt = drmBt.checkpoint()
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>       val (inCoreUHat, d) = eigen(drmBt.t %*% drmBt)
>>>>>>>>>       val s = d.sqrt
>>>>>>>>>
>>>>>>>>>       // Since neither drmU nor drmV is actually computed until used,
>>>>>>>>>       // we don't need flags instructing whether or not to compute
>>>>>>>>>       // either of the U, V outputs.
>>>>>>>>>       val drmU = drmQ %*% inCoreUHat
>>>>>>>>>       val drmV = drmBt %*% (inCoreUHat %*% diagv(1 /: s))
>>>>>>>>>
>>>>>>>>>       (drmU(::, 0 until k), drmV(::, 0 until k), s(0 until k))
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>> Note: as a side effect of checkpointing, the U and V values are
>>>>>>>>> returned as logical operators (i.e. they are neither checkpointed
>>>>>>>>> nor computed). Therefore no physical work is actually done to
>>>>>>>>> compute `\(\mathbf{U}\)` or `\(\mathbf{V}\)` until they are used in
>>>>>>>>> a subsequent expression.
>>>>>>>>>
>>>>>>>>> ## Usage
>>>>>>>>>
>>>>>>>>> The Scala `dssvd(...)` method can easily be called in any Spark or
>>>>>>>>> H2O application built with the `math-scala` library and the
>>>>>>>>> corresponding `Spark` or `H2O` engine module as follows:
>>>>>>>>>
>>>>>>>>>     import org.apache.mahout.math._
>>>>>>>>>     import org.apache.mahout.math.decompositions._
>>>>>>>>>
>>>>>>>>>     val (drmU, drmV, s) = dssvd(drmA, k = 40, q = 1)
>>>>>>>>>
>>>>>>>>> ## References
>>>>>>>>>
>>>>>>>>> [1]: [Mahout Scala and Mahout Spark Bindings for Linear Algebra
>>>>>>>>> Subroutines](http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf)
>>>>>>>>>
>>>>>>>>> [2]: [Halko, Martinsson, Tropp](http://arxiv.org/abs/0909.4061)
>>>>>>>>>
>>>>>>>>> [3]: [Mahout Spark and Scala
>>>>>>>>> Bindings](http://mahout.apache.org/users/sparkbindings/home.html)
>>>>>>>>>
>>>>>>>>> [4]: [Randomized methods for computing low-rank approximations of
>>>>>>>>> matrices](http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf)
