Does it complete without problems? It may complete without error, but the result may still be garbage. The matrix that's inverted may well end up singular due to round-off. Even if it's not, you may find that the resulting vectors are infinite or very large. In particular, I at least had to make the singularity threshold a lot larger than Double.MIN_VALUE in the QR decomposition.
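As a rough illustration of what I mean by the threshold (a made-up numpy sketch, not Mahout's actual QR code -- the function name and the safety factor are my own): compare the diagonal of R against a tolerance scaled by machine epsilon and the matrix's magnitude, rather than against something as tiny as Double.MIN_VALUE.

```python
import numpy as np

def is_near_singular(a, safety=10.0):
    """Flag near-singularity via the diagonal of R from a QR decomposition.

    Double.MIN_VALUE (~4.9e-324) is far too lenient a threshold; a
    tolerance scaled by machine epsilon and the largest R diagonal
    entry is the usual kind of choice. 'safety' is an arbitrary fudge
    factor for this sketch.
    """
    _, r = np.linalg.qr(a)
    d = np.abs(np.diag(r))
    tol = safety * np.finfo(float).eps * max(a.shape) * d.max()
    return d.min() < tol

well = np.eye(3)                                    # well-conditioned
ill = np.array([[1.0, 1.0],
                [1.0, 1.0 + 1e-15]])                # nearly rank-deficient
print(is_near_singular(well))  # False
print(is_near_singular(ill))   # True
```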
Try some simple dummy data like below, with maybe k=10. If it fails with an error, that's a problem!

0,0,1
0,1,4
0,2,3
1,2,3
2,1,4
2,3,3
2,4,2
3,0,5
3,2,2
3,4,3
4,3,5
5,0,2
5,1,4

On Thu, Apr 4, 2013 at 7:05 PM, Koobas <koo...@gmail.com> wrote:
> I took the MovieLens 100K data without ratings and ran non-weighted ALS in
> Matlab.
> I set the number of features k=2000, which is larger than the input matrix
> (1000 x 1700).
> I used QR to do the inversion.
> It runs without problems.
> Can you share your data?
>
>
> On Thu, Apr 4, 2013 at 1:10 PM, Koobas <koo...@gmail.com> wrote:
>
>> Just to throw in another bit,
>> just like Ted was saying:
>> if you take the largest singular value over the smallest singular value,
>> you get your condition number.
>> If it turns out to be 10^16, then you're losing all the digits of double
>> precision accuracy,
>> meaning that your solver is nothing more than a random number generator.
>>
>>
>> On Thu, Apr 4, 2013 at 12:21 PM, Dan Filimon
>> <dangeorge.fili...@gmail.com> wrote:
>>
>>> For what it's worth, here's what I remember from my Numerical Analysis
>>> course.
>>>
>>> The thing we were taught to use to figure out whether a matrix is
>>> ill-conditioned is its condition number (k(A) = norm(A) * norm(A^-1)).
>>> Here's a nice explanation of it [1].
>>>
>>> Suppose you want to solve Ax = b. How much worse will your results be
>>> if you're not really solving Ax = b but A(x + delta) = b + epsilon
>>> (where x is still a solution of Ax = b)? That is, by perturbing the b
>>> vector by epsilon, how much worse does delta get? There's a short proof
>>> [1, page 4], and the inequality you get is:
>>>
>>> norm(delta) / norm(x) <= k(A) * norm(epsilon) / norm(b)
>>>
>>> The rule of thumb is that if m = log10(k(A)), you lose m decimal digits
>>> of accuracy. Equivalently, if m' = log2(k(A)), you lose m' bits of
>>> accuracy.
>>> Since floats are 32 bits, you can decide that, say, at most 2 bits may
>>> be lost, and therefore any k(A) > 4 is not acceptable.
>>>
>>> Anyway, there are lots of possible norms, and you need to look at ways
>>> of actually interpreting the condition number, but from what I learned
>>> this is probably the thing you want to be looking at.
>>>
>>> Good luck!
>>>
>>> [1] http://www.math.ufl.edu/~kees/ConditionNumber.pdf
>>> [2] http://www.rejonesconsulting.com/CS210_lect07.pdf
>>>
>>>
>>> On Thu, Apr 4, 2013 at 5:26 PM, Sean Owen <sro...@gmail.com> wrote:
>>>
>>> > I think that's what I'm saying, yes. Small rows of X shouldn't become
>>> > large rows of A -- and similarly, small changes in X shouldn't mean
>>> > large changes in A. Not quite the same thing, but both are relevant.
>>> > I see that this is just the ratio of the largest and smallest
>>> > singular values. Is there an established procedure for evaluating the
>>> > ill-conditionedness of matrices -- like a principled choice of
>>> > threshold above which you say it's ill-conditioned, based on k, etc.?
>>> >
>>> > On Thu, Apr 4, 2013 at 3:19 PM, Koobas <koo...@gmail.com> wrote:
>>> > > So, the problem is that the kxk matrix is ill-conditioned, or is
>>> > > there more to it?
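To make the condition-number rule of thumb from the quoted discussion concrete, here's a small illustrative numpy sketch (the matrices are made-up examples, not the ALS data; numpy's `np.linalg.cond` computes k(A) directly, in the 2-norm by default, i.e. the ratio of largest to smallest singular value):

```python
import numpy as np

# k(A) = norm(A) * norm(A^-1); np.linalg.cond computes it directly.
a_good = np.eye(4)
a_bad = np.array([[1.0, 1.0],
                  [1.0, 1.0 + 1e-10]])  # rows nearly linearly dependent

for name, a in [("identity", a_good), ("near-singular", a_bad)]:
    k = np.linalg.cond(a)
    # Rule of thumb: lose about log10(k(A)) decimal digits of accuracy.
    print(f"{name}: k(A) = {k:.3e}, ~{np.log10(k):.1f} decimal digits lost")

# Checking the perturbation bound
#   norm(delta)/norm(x) <= k(A) * norm(epsilon)/norm(b)
# on a mildly ill-conditioned example:
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
x = np.array([1.0, 2.0])
b = A @ x
eps = 1e-8 * np.array([1.0, -1.0])      # small perturbation of b
x_pert = np.linalg.solve(A, b + eps)
lhs = np.linalg.norm(x_pert - x) / np.linalg.norm(x)
rhs = np.linalg.cond(A) * np.linalg.norm(eps) / np.linalg.norm(b)
print(lhs <= rhs)  # True: the inequality holds
```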