[
https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965961#action_12965961
]
Dmitriy Lyubimov commented on MAHOUT-376:
-----------------------------------------
{quote}Doesn't the streaming QR decomposition require that we look at each row
of Y one at a time in a streaming fashion? That is, isn't that a completely
sequential algorithm?{quote}
{quote} Even if it is dense, one such vector would take 8MB memory at a time.
but sparse sequential vectors should be ok too (it will probably require a
little tweak during Y computations to scan it one time sequentially instead of
k+p times as i think it is done now with assumption it can be random). {quote}
Oh. I guess you hinted at the possibility that if we use sparse sequential
vectors for the rows of A, then we are memory-unbound for n! So who cares about
m then! And then we can have billion by billion even with this implementation.
Wow. That's an extremely powerful suggestion. But that definitely requires code
review and performance tests. And we go over A only one time, so there's no need
to revisit the sparse vectors. I'll take a look at it to see if I can engineer a
solution. If it is possible at all, it should be extremely simple.
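A rough sketch of the single-scan idea, in Python rather than the Java of the patch, and only under assumptions: the helper names (omega_entry, y_row_single_pass) are hypothetical, and Omega entries are assumed to be regenerated on demand from a seed instead of being stored. One row of A arrives as a stream of (column, value) pairs, and all k+p entries of the matching Y row are accumulated during one pass, so memory is O(k+p) no matter how large n is.

```python
import random


def omega_entry(seed, j, i):
    """Hypothetical stand-in: regenerate entry (j, i) of the n x (k+p)
    random matrix Omega on demand, so Omega is never held in memory."""
    return random.Random(f"{seed}:{j}:{i}").gauss(0.0, 1.0)


def y_row_single_pass(sparse_row, kp, seed=0):
    """Compute one row of Y = A * Omega from a sparse row of A.

    sparse_row is any iterable of (column_index, value) pairs and is
    consumed exactly once, as a sequential-access sparse vector would be.
    """
    y = [0.0] * kp                    # the only state kept: k+p accumulators
    for j, a_j in sparse_row:         # one sequential scan over the nonzeros
        for i in range(kp):
            y[i] += a_j * omega_entry(seed, j, i)
    return y


def y_row_kp_passes(row_pairs, kp, seed=0):
    """For comparison only: one scan of the row per output column, i.e.
    k+p scans. This needs a row that can be re-read (a list, not a stream)."""
    return [sum(a_j * omega_entry(seed, j, i) for j, a_j in row_pairs)
            for i in range(kp)]
```

Usage, again only as an illustration: y_row_single_pass(iter([(3, 2.0), (999999999, -1.0)]), kp=4) works even though the column index is near one billion, because nothing of size n is ever allocated.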
> Implement Map-reduce version of stochastic SVD
> ----------------------------------------------
>
> Key: MAHOUT-376
> URL: https://issues.apache.org/jira/browse/MAHOUT-376
> Project: Mahout
> Issue Type: Improvement
> Components: Math
> Reporter: Ted Dunning
> Assignee: Ted Dunning
> Fix For: 0.5
>
> Attachments: MAHOUT-376.patch, Modified stochastic svd algorithm for
> mapreduce.pdf, QR decomposition for Map.pdf, QR decomposition for Map.pdf, QR
> decomposition for Map.pdf, sd-bib.bib, sd.pdf, sd.pdf, sd.pdf, sd.pdf,
> sd.tex, sd.tex, sd.tex, sd.tex, SSVD working notes.pdf, SSVD working
> notes.pdf, SSVD working notes.pdf, ssvd-CDH3-or-0.21.patch.gz,
> ssvd-m1.patch.gz, ssvd-m2.patch.gz, ssvd-m3.patch.gz, Stochastic SVD using
> eigensolver trick.pdf
>
>
> See attached pdf for outline of proposed method.
> All comments are welcome.