Thank you, Ted.
I ran some tests with low-rank matrices as you suggested. I again get
perfect results with 1 block, but not with many blocks (although not
significantly worse). I believe the second result is normal stochastic
deviation of this method, but the multiple-block QR results are off
somewhere. I'll keep looking into it, since this is my first pass at
implementing the merging of Q blocks.
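For reference, the merge I'm testing amounts to roughly the following (a
NumPy sketch of a stacked-R block merge, which is my shorthand for the
scheme; not the actual Givens-based Java code):

```python
import numpy as np

def blocked_qr(a, n_blocks):
    """Thin QR of a tall matrix computed block-wise, then merged."""
    blocks = np.array_split(a, n_blocks, axis=0)
    qs, rs = zip(*(np.linalg.qr(b) for b in blocks))
    # A second QR over the stacked per-block R factors yields the final R;
    # the corresponding row slices of its Q merge the per-block Q factors.
    q2, r = np.linalg.qr(np.vstack(rs))
    offsets = np.cumsum([0] + [ri.shape[0] for ri in rs])
    q = np.vstack([qi @ q2[offsets[i]:offsets[i + 1]]
                   for i, qi in enumerate(qs)])
    return q, r

rng = np.random.default_rng(0)
a = rng.standard_normal((800, 20))
q1, r1 = blocked_qr(a, 1)
qm, rm = blocked_qr(a, 20)
# Both variants must reconstruct A and give an orthonormal Q.
print(np.allclose(q1 @ r1, a), np.allclose(qm @ rm, a))   # True True
print(np.allclose(qm.T @ qm, np.eye(20)))                  # True
```

If the multi-block Q passes these checks but the downstream singular values
still diverge, the problem is in the merge bookkeeping rather than in QR
itself.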

-D

On Thu, Nov 18, 2010 at 7:12 AM, Ted Dunning <[email protected]> wrote:

> I would suggest low-rank test matrices at first.  With k+p = 20, you can
> build rank 10, 20 and 30 input matrices.  These
> can be built in R with this:
>
> a.10 = matrix(rnorm(10*800), nrow=800) %*% matrix(rnorm(10*800), ncol=800)
> a.20 = matrix(rnorm(20*800), nrow=800) %*% matrix(rnorm(20*800), ncol=800)
> a.30 = matrix(rnorm(30*800), nrow=800) %*% matrix(rnorm(30*800), ncol=800)
>
> You should get essentially perfect results with a.10, very good results
> with
> a.20 and ok results with a.30.
>
> With block QR and rank(A) <= k, p = 5, I didn't see any important
> difference among block QR, stochastic projection, and SVD.
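The same construction, together with the basic one-block stochastic
projection from Halko et al., can be sketched in NumPy (illustrative
dimensions for the rank-10 case; not the Mahout code):

```python
import numpy as np

rng = np.random.default_rng(42)
n, rank, kp = 800, 10, 20
# Rank-10 test matrix, as in the R snippet above.
a10 = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, n))

omega = rng.standard_normal((n, kp))        # Gaussian test matrix Omega
q, _ = np.linalg.qr(a10 @ omega)            # orthonormal basis for the range
s = np.linalg.svd(q.T @ a10, compute_uv=False)

exact = np.linalg.svd(a10, compute_uv=False)
print(np.allclose(s[:rank], exact[:rank]))  # True: essentially perfect
```

With rank(A) <= k+p the sketch captures the whole column space, so the
recovered singular values match the exact ones to machine precision.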
>
> On Thu, Nov 18, 2010 at 12:32 AM, Dmitriy Lyubimov
> <[email protected]> wrote:
>
> > First test results of this method, including QR blocking, as per the
> > document:
> >
> > Test matrix is 800x200, k+p=20.
> >
> > Colt library results (singular values)
> >
> > 1.204e+04 1.196e+04 1.191e+04 1.181e+04 1.175e+04 1.170e+04 1.159e+04
> > 1.151e+04 1.146e+04 1.143e+04 1.138e+04 1.132e+04 1.129e+04 1.118e+04
> > 1.112e+04 1.106e+04 1.101e+04 1.097e+04 1.090e+04 1.087e+04
> >
> > Singular values, stochastic SVD, 1 QR block
> >
> > 1.019e+04 9.964e+03 9.842e+03 9.638e+03 9.574e+03 9.457e+03 9.275e+03
> > 9.190e+03 9.057e+03 9.022e+03 8.957e+03 8.689e+03 8.673e+03 8.588e+03
> > 8.415e+03 8.300e+03 8.214e+03 8.120e+03 8.007e+03 7.929e+03
> >
> > Singular values, 20 Givens-based QR blocks
> >
> > 9.263e+03 6.941e+03 6.413e+03 6.241e+03 5.985e+03 5.898e+03 5.650e+03
> > 5.246e+03 5.021e+03 4.898e+03 4.836e+03 4.594e+03 4.412e+03 4.191e+03
> > 4.079e+03 3.949e+03 3.786e+03 3.728e+03 3.565e+03 3.408e+03
> >
> > Notes:
> >
> > The source matrix is the same (as is the seed for the random matrix
> > Omega). Omega is random Gaussian numbers, as the paper suggests for a
> > first step.
> > The source matrix is random, which perhaps means it is ill-conditioned.
> > The authors of the paper warned that in such situations results may go
> > astray quite a lot and power iterations are required. With blocking we
> > get almost another 8% of error on the largest singular value (but we
> > probably want to go with k+p far more than 20). Obviously, not all of
> > that 8% can be attributed to QR deficiencies themselves, for the error
> > would be magnified by further calculations in the pipeline. I use the
> > eigensolver from Apache Commons Math, which may use slightly different
> > approaches.
> > It seems, though, that either way, regardless of the number of blocks,
> > stochastic SVD consistently underestimates the singular values of my
> > randomly generated source. Perhaps there are better tests (with
> > better-conditioned matrices to begin with) for which this method should
> > work better, but I am not sure how to generate them or what margin of
> > error to expect.
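The power-iteration remedy mentioned above can be sketched like so (NumPy,
with dimensions assumed to match this experiment; not the Mahout code):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal((800, 200))   # same shape as the test matrix above
kp = 20

def ssvd_values(a, kp, q_iter=0):
    """Singular values via stochastic projection, optionally with power iterations."""
    omega = rng.standard_normal((a.shape[1], kp))
    y = a @ omega
    for _ in range(q_iter):
        # Re-orthogonalize between multiplications for numerical stability.
        y = a @ (a.T @ np.linalg.qr(y)[0])
    q, _ = np.linalg.qr(y)
    return np.linalg.svd(q.T @ a, compute_uv=False)

exact = np.linalg.svd(a, compute_uv=False)
err0 = abs(ssvd_values(a, kp, 0)[0] - exact[0]) / exact[0]
err2 = abs(ssvd_values(a, kp, 2)[0] - exact[0]) / exact[0]
print(err2 < err0)   # power iterations shrink the underestimate
```

On a flat random spectrum like this one, the zero-iteration sketch
underestimates the top singular values noticeably, and a couple of power
iterations pull the estimates back toward the exact values.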
> >
> > Orthogonality tests for Q pass with epsilon set to 1e-10 though.
> >
> > I am worried about numerical stability; I wonder how significantly
> > errors may grow on gigantic matrices.
> > Please let me know whether this is the kind of precision to expect, or
> > whether it should actually be better than that with this method.
> >
> > If this is acceptable (well, Halko et al. call it an 'approximate'
> > method anyway), I will put those subroutines into the map-reduce code.
> > I can certainly run more tests per your suggestions if needed, to
> > figure out more about the limitations of the method.
> >
> > Thanks.
> > -Dmitriy
> >
> > On Mon, Nov 15, 2010 at 9:27 PM, Dmitriy Lyubimov (JIRA)
> > <[email protected]> wrote:
> >
> > >
> > >     [ https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> > >
> > > Dmitriy Lyubimov updated MAHOUT-376:
> > > ------------------------------------
> > >
> > >    Attachment: QR decomposition for Map.pdf
> > >
> > > QR step doc is ready for review.
> > >
> > > Especially sections 3.2 and on, which I still haven't worked up in
> > > the actual code. These reflect my original ideas for dealing with
> > > blockwise QR in a MapReduce setting. I am probably retracing some
> > > block computations already found elsewhere (such as in LAPACK), but
> > > I think it may actually work out OK.
> > >
> > > > Implement Map-reduce version of stochastic SVD
> > > > ----------------------------------------------
> > > >
> > > >                 Key: MAHOUT-376
> > > >                 URL: https://issues.apache.org/jira/browse/MAHOUT-376
> > > >             Project: Mahout
> > > >          Issue Type: Improvement
> > > >          Components: Math
> > > >            Reporter: Ted Dunning
> > > >            Assignee: Ted Dunning
> > > >             Fix For: 0.5
> > > >
> > > >         Attachments: MAHOUT-376.patch, Modified stochastic svd
> > > > algorithm for mapreduce.pdf, QR decomposition for Map.pdf, QR
> > > > decomposition for Map.pdf, sd-bib.bib, sd.pdf, sd.pdf, sd.pdf,
> > > > sd.pdf, sd.tex, sd.tex, sd.tex, sd.tex, Stochastic SVD using
> > > > eigensolver trick.pdf
> > > >
> > > >
> > > > See attached pdf for outline of proposed method.
> > > > All comments are welcome.
> > >
> > > --
> > > This message is automatically generated by JIRA.
> > > -
> > > You can reply to this email to add a comment to the issue online.
> > >
> > >
> >
>