On Thu, Apr 4, 2013 at 9:13 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Typically, to deal with this kind of problem, you need to follow one of two
> courses.
>
> First, you can use a so-called rank-revealing QR which uses a pivoting
> strategy to push all of the small elements of R as far down the diagonal as
> possible.  This gives you a reliable way of finding the problems and can
> give you approximate solutions of a limited rank decomposition of A.
>
> Typically, it is better to use SVD instead of QR in these cases.  You can
> truncate S (the matrix with the singular values) at whatever point you deem
> correct and get an optimal least squares solution.
>
> The Mahout QR that I whipped up a couple of months ago is not rank
> revealing, but it is pretty easy to convert the Gram-Schmidt algorithm to
> be such.  The SVD we have should work just fine.
>
> Mahout QR uses Gram-Schmidt?
Wouldn't it be better to use Householder?

>
>
> On Thu, Apr 4, 2013 at 12:41 PM, Sean Owen <sro...@gmail.com> wrote:
>
> > This is more of a linear algebra question, but I thought it worth
> > posing to the group --
> >
> > As part of a process like ALS, you solve a system like A = X * Y' for
> > X or for Y, given the other two. A is sparse (m x n); X and Y are tall
> > and skinny (m x k, m x n, where k << m,n)
> >
> > For example to solve for X, just:   X = A * Y * (Y' * Y)^-1
> >
> > This fails if the k x k matrix Y' * Y is not invertible of course.
> > This can happen if the data is tiny and k is actually large relative
> > to m,n.
> >
> > It also goes badly if it is nearly not invertible. The solution for X
> > can become very large, for example, for a small A, which is "obviously
> > wrong". You can -- often -- detect this by looking at the diagonal of
> > R in a QR decomposition, looking for near-zero values.
> >
> > However I find a similar behavior even when the rank k seems
> > intuitively fine (easily low enough given the data), but when, for
> > example, the regularization term is way too high. X and Y are so
> > constrained that the inverse above becomes a badly behaved operator
> > too.
> >
> > I think I understand the reasons for this intuitively. The goal isn't
> > to create a valid solution since there is none here; the goal is to
> > define and detect this "bad" situation reliably and suggest a fix to
> > parameters if possible.
> >
> > I have had better success looking at the operator norm of (Y' * Y)^-1
> > (its largest singular value) to get a sense of when it is going to
> > potentially scale its input greatly, since that's a sign it's bad, but
> > I feel like I'm missing the rigorous understanding of what to do with
> > that info. I'm looking for a way to think about a cutoff or threshold
> > for that singular value that will make it be rejected (>1?) but think
> > I have some unknown-unknowns in this space.
> >
> > Any insights or pointers into the next concept that's required here
> > are appreciated.
> >
> > Sean
> >
>

Reply via email to