This is more of a linear algebra question, but I thought it worth
posing to the group --

As part of a process like ALS, you solve a system like A = X * Y' for
X or for Y, given A and the other factor. A is sparse (m x n); X and Y
are tall and skinny (m x k and n x k respectively, where k << m, n).

For example, to solve for X:   X = A * Y * (Y' * Y)^-1
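
(In NumPy terms, roughly the following -- toy dense arrays just for
illustration, since the real A is sparse, and computed as a linear
solve against Y' * Y rather than by forming the inverse explicitly:)

    import numpy as np

    rng = np.random.default_rng(0)
    m, n, k = 100, 80, 5
    A = rng.random((m, n))        # really sparse, m x n
    Y = rng.random((n, k))        # n x k factor

    G = Y.T @ Y                   # k x k Gram matrix
    # X = A * Y * (Y' * Y)^-1, done as a solve
    X = np.linalg.solve(G, (A @ Y).T).T    # m x k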

This fails, of course, if the k x k matrix Y' * Y is not invertible.
That can happen when the data is tiny and k is actually large relative
to m and n.

It also goes badly when Y' * Y is nearly singular. The solution for X
can become very large, for example, for a small A, which is "obviously
wrong". You can often detect this by looking at the diagonal of R in a
QR decomposition and checking for near-zero values.
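
(A rough sketch of the check I mean, taking the QR of Y itself so that
Y' * Y = R' * R; the tolerance is just an illustrative guess:)

    import numpy as np

    def looks_rank_deficient(Y, rel_tol=1e-10):
        # Y = Q R, so near-zero entries on diag(R) signal that
        # Y' * Y is singular or nearly so
        _, R = np.linalg.qr(Y)
        d = np.abs(np.diag(R))
        return bool(np.any(d < rel_tol * d.max()))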

However, I see similar behavior even when the rank k seems
intuitively fine (easily low enough given the data) but, for example,
the regularization term is far too high. X and Y become so constrained
that the inverse above turns into a badly behaved operator as well.
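
(Concretely, assuming the usual ridge-style term -- lam * I added to
the Gram matrix -- the regularized update looks like:)

    import numpy as np

    def solve_x(A, Y, lam):
        k = Y.shape[1]
        # Y'Y + lam*I is invertible for lam > 0, but a huge lam
        # distorts the solution as described above
        G = Y.T @ Y + lam * np.eye(k)
        return np.linalg.solve(G, (A @ Y).T).T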

I think I understand the reasons for this intuitively. The goal isn't
to create a valid solution since there is none here; the goal is to
define and detect this "bad" situation reliably and suggest a fix to
parameters if possible.

I've had better success looking at the operator norm of (Y' * Y)^-1
(its largest singular value) to get a sense of when it will
potentially scale its input greatly, since that's a sign it's bad, but
I feel like I'm missing the rigorous understanding of what to do with
that information. I'm looking for a way to think about a cutoff or
threshold on that singular value beyond which the solve is rejected
(> 1?), but I think I have some unknown unknowns in this space.
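
(What I'm computing is roughly the following: the operator 2-norm of
(Y' * Y)^-1 is 1 / sigma_min(Y' * Y), and the machine-epsilon cutoff
at the end is just one convention I've seen, not something I'm sure
is right:)

    import numpy as np

    def gram_diagnostics(Y):
        G = Y.T @ Y
        s = np.linalg.svd(G, compute_uv=False)  # descending order
        sigma_max, sigma_min = s[0], s[-1]
        # operator 2-norm of inv(G) = 1 / smallest singular value of G
        inv_norm = np.inf if sigma_min == 0 else 1.0 / sigma_min
        cond = np.inf if sigma_min == 0 else sigma_max / sigma_min
        return inv_norm, cond

    # e.g. flag the solve when cond > 1 / (k * eps), i.e. when
    # sigma_min < k * eps * sigma_max, with eps = np.finfo(float).eps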

Any insights or pointers into the next concept that's required here
are appreciated.

Sean
