This demonstrates the Eckart-Young theorem?

http://en.wikipedia.org/wiki/Singular_value_decomposition#Low-rank_matrix_approximation
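
For reference, the Frobenius-norm form of that statement (the Eckart-Young-Mirsky
theorem) says the truncated SVD A_k = U_k \Sigma_k V_k' is the best rank-k
approximation of A:

    \min_{rank(B) \le k} \| A - B \|_F = \| A - A_k \|_F = \sqrt{ \sum_{i > k} \sigma_i^2 }

which is exactly the identity Ted checks numerically in the R session below.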

On Fri, Jul 8, 2011 at 1:53 AM, Lance Norskog <goks...@gmail.com> wrote:
> Thanks! Very illuminating.
>
>
>
> On Thu, Jul 7, 2011 at 10:15 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>> This means that the rank 2 reconstruction of your matrix is close to your
>> original in the sense that the Frobenius norm of the difference will be
>> small.
>>
>> In particular, the Frobenius norm of the delta should be the same as the
>> Frobenius norm of the dropped singular values (the square root of the sum
>> of their squares, that is).
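>>
>> As a quick sketch of that bookkeeping in R, using the singular values
>> quoted in the session below (note that the loss is measured in squared
>> singular values, not in their plain sum):
>>
>> d = c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)
>> sqrt(sum(d[3:7]^2))       # Frobenius norm of the rank-2 error: 0.05656854
>> sum(d[1:2]^2) / sum(d^2)  # fraction of squared Frobenius norm retained, ~0.994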
>>
>> Here is an example.  First I use a random 20x7 matrix to get an SVD into
>> which I transplant your singular values.  This gives me a matrix whose
>> decomposition is the same as the one you are using.
>>
>> I then do that decomposition and truncate the singular values to get an
>> approximate matrix.  The Frobenius norm of the difference is the same as the
>> Frobenius norm of the missing singular values.
>>
>>> m = matrix(rnorm(20*7), nrow=20)
>>> svd1 = svd(m)
>>> length(svd1$d)
>> [1] 7
>>> d = c(0.7, 0.2,0.05, 0.02, 0.01, 0.01, 0.01)
>>> m2 = svd1$u %*% diag(d) %*% t(svd1$v)
>>> svd = svd(m2)
>>> svd$d
>> [1] 0.70 0.20 0.05 0.02 0.01 0.01 0.01
>>> m3 = svd$u[,1:2] %*% diag(svd$d[1:2]) %*% t(svd$v[,1:2])
>>> dim(m3)
>> [1] 20  7
>>> m2-m3
>>               [,1]          [,2]          [,3]          [,4]          [,5]          [,6]          [,7]
>>  [1,]  0.0069233794  0.0020467352 -0.0071659763 -4.099546e-03  0.0056399256 -0.0023953930 -0.0119370905
>>  [2,] -0.0018546491  0.0011631030  0.0013261685 -1.193252e-03  0.0002839689  0.0014320601  0.0036207164
>>  [3,]  0.0011612177  0.0027845827 -0.0023368373 -4.240565e-03  0.0009362635 -0.0032987483 -0.0019110953
>>  [4,] -0.0061414783  0.0070092709  0.0066429461  2.240401e-03 -0.0003033182 -0.0031444510  0.0027860257
>>  [5,]  0.0004910556 -0.0057660226  0.0014586550 -3.383020e-03 -0.0015763103  0.0011357677  0.0101147998
>>  [6,]  0.0016672016 -0.0043701670 -0.0002311687 -1.706181e-04 -0.0032324629 -0.0033587690  0.0018471306
>>  [7,] -0.0024146270  0.0007510238  0.0052282604  7.724380e-04 -0.0004411600 -0.0026622302  0.0050655693
>>  [8,]  0.0036106469  0.0028629467 -0.0007957853  1.333764e-03  0.0074933368  0.0008158132 -0.0091284389
>>  [9,]  0.0013369776  0.0036364763 -0.0009691292 -2.050044e-03  0.0021208815 -0.0042241753 -0.0043885229
>> [10,]  0.0031153692  0.0003852343 -0.0053822410 -6.538480e-04  0.0005221515 -0.0003594550 -0.0077290438
>> [11,] -0.0012286952  0.0026373981  0.0017958449  4.693112e-05  0.0003753286 -0.0024000476 -0.0001261246
>> [12,] -0.0024890888 -0.0018374670  0.0048781861 -1.065282e-03 -0.0018902396 -0.0013280442  0.0096305420
>> [13,]  0.0099545328 -0.0012843802 -0.0035220130  1.599559e-03  0.0081592109 -0.0047310711 -0.0158840779
>> [14,] -0.0029835169  0.0046807105  0.0016607724  4.339315e-03 -0.0015926183 -0.0026172305 -0.0048268815
>> [15,] -0.0102632616  0.0033271770  0.0104700407  2.728651e-03 -0.0037697307  0.0016053567  0.0145899365
>> [16,] -0.0074888872 -0.0002277379  0.0068370652 -4.688963e-05 -0.0044921343  0.0024889009  0.0150436991
>> [17,] -0.0068501952 -0.0017733601  0.0076497285  1.743932e-03 -0.0055472565  0.0006109667  0.0142443162
>> [18,] -0.0020245716 -0.0011431425  0.0044837803  3.219527e-04  0.0007887701  0.0019836205  0.0070585228
>> [19,] -0.0016059867 -0.0028328316  0.0032223649  2.025061e-03 -0.0019194344  0.0009643023  0.0052452638
>> [20,]  0.0042324909 -0.0063013966 -0.0041269199 -9.848214e-04 -0.0029591571 -0.0015911580 -0.0012584919
>>> sqrt(sum((m2-m3)^2))
>> [1] 0.05656854
>>> sqrt(sum(d[3:7]^2))
>> [1] 0.05656854
>>>
>>
>> On Thu, Jul 7, 2011 at 8:46 PM, Lance Norskog <goks...@gmail.com> wrote:
>>
>>> Rats, "My 3D coordinates" should be 'My 2D coordinates'. There is also a
>>> preposition missing in the first sentence.
>>>
>>> On Thu, Jul 7, 2011 at 8:44 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> > The singular values in my experiments drop like a rock. What is the
>>> > information/probability loss formula when dropping the low-value
>>> > singular vectors?
>>> >
>>> > That is, I start with a 7D vector set, go through this random
>>> > projection/svd exercise, and get these singular values: [0.7, 0.2,
>>> > 0.05, 0.02, 0.01, 0.01, 0.01]. I drop the last five to create a matrix
>>> > that gives 2D coordinates. The sum of the remaining singular values is
>>> > 0.9. My 3D vectors will be missing 0.10 of *something* compared to the
>>> > original 7D vectors. What is this something and what other concepts
>>> > does it plug into?
>>> >
>>> > Lance
>>> >
>>> > On Sat, Jul 2, 2011 at 11:54 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> >> The singular values on my recommender vectors come out as 90, 10, 1.2,
>>> >> 1.1, 1.0, 0.95, ...  This was from playing with your R code. Based on
>>> >> this, I'm adding the QR stuff to my visualization toolkit.
>>> >>
>>> >> Lance
>>> >>
>>> >> On Sat, Jul 2, 2011 at 10:15 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> >>> All pairwise distances are preserved? There must be some qualifier on
>>> >>> that claim for pairwise-distance algorithms.
>>> >>>
>>> >>> On Sat, Jul 2, 2011 at 6:49 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> >>>> Cool. The plots are fun. The way Gaussian spots project into spinning
>>> >>>> chains is very educational about entropy.
>>> >>>>
>>> >>>> For full random projection, a weak random number generator
>>> >>>> (java.lang.Random) will produce a higher standard deviation in the
>>> >>>> distance ratios than a high-quality one like MurmurHash.
>>> >>>>
>>> >>>> On Fri, Jul 1, 2011 at 5:25 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> >>>>> Here is R code that demonstrates what I mean by stunning (aka 15
>>> >>>>> significant figures).  Note that I only check dot products here.  From
>>> >>>>> the fact that the final transform is orthonormal we know that all
>>> >>>>> distances are preserved.
>>> >>>>>
>>> >>>>> # make a big random matrix with rank 20
>>> >>>>>> a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20);
>>> >>>>>> dim(a)
>>> >>>>> [1] 1000 1000
>>> >>>>> # random projection
>>> >>>>>> y = a %*% matrix(rnorm(30000), ncol=30);
>>> >>>>> # get basis for y
>>> >>>>>> q = qr.Q(qr(y))
>>> >>>>>> dim(q)
>>> >>>>> [1] 1000   30
>>> >>>>> # project a onto the basis q to get b, then do svd on the result
>>> >>>>>> b = t(q) %*% a
>>> >>>>>> v = svd(b)$v
>>> >>>>>> d = svd(b)$d
>>> >>>>> # note how singular values drop like a stone at index 21
>>> >>>>>> plot(d)
>>> >>>>>> dim(v)
>>> >>>>> [1] 1000   30
>>> >>>>> # finish svd just for fun
>>> >>>>>> u = q %*% svd(b)$u
>>> >>>>>> dim(u)
>>> >>>>> [1] 1000   30
>>> >>>>> # u is orthogonal, right?
>>> >>>>>> diag(t(u)%*% u)
>>> >>>>>  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>> >>>>> # and u diag(d) v' reconstructs a very precisely, right?
>>> >>>>>> max(abs(a-u %*% diag(d) %*% t(v)))
>>> >>>>> [1] 1.293188e-12
>>> >>>>>
>>> >>>>> # now project a into the reduced dimensional space
>>> >>>>>> aa = a%*%v
>>> >>>>>> dim(aa)
>>> >>>>> [1] 1000   30
>>> >>>>> # check a few dot products
>>> >>>>>> sum(aa[1,] %*% aa[2,])
>>> >>>>> [1] 6835.152
>>> >>>>>> sum(a[1,] %*% a[2,])
>>> >>>>> [1] 6835.152
>>> >>>>>> sum(a[1,] %*% a[3,])
>>> >>>>> [1] 3337.248
>>> >>>>>> sum(aa[1,] %*% aa[3,])
>>> >>>>> [1] 3337.248
>>> >>>>>
>>> >>>>> # wow, that's close.  let's try a hundred dot products
>>> >>>>>> dot1 = rep(0,100);dot2 = rep(0,100)
>>> >>>>>> for (i in 1:100) {dot1[i] = sum(a[1,] * a[i,]); dot2[i] = sum(aa[1,] * aa[i,])}
>>> >>>>>
>>> >>>>> # how close to the same are those?
>>> >>>>>> max(abs(dot1-dot2))
>>> >>>>> # VERY
>>> >>>>> [1] 3.45608e-11
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> On Fri, Jul 1, 2011 at 4:54 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> >>>>>
>>> >>>>>> Yes.  Been there.  Done that.
>>> >>>>>>
>>> >>>>>> The correlation is stunningly good.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>>> Thanks!
>>> >>>>>>>
>>> >>>>>>> Is this true? - "Preserving pairwise distances" means the relative
>>> >>>>>>> distances. So the ratios of new distance:old distance should be
>>> >>>>>>> similar. The standard deviation of the ratios gives a rough&ready
>>> >>>>>>> measure of the fidelity of the reduction. The standard deviation of
>>> >>>>>>> simple RP should be highest, then this RP + orthogonalization, then
>>> >>>>>>> MDS.
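>>> >>>>>>>
>>> >>>>>>> A rough R sketch of that check (illustrative only; it reuses a and
>>> >>>>>>> aa = a %*% v from the R session earlier in the thread, and compares
>>> >>>>>>> against a plain random projection of the same dimension):
>>> >>>>>>>
>>> >>>>>>> rp = a %*% matrix(rnorm(1000 * 30), ncol = 30) / sqrt(30)  # plain RP
>>> >>>>>>> d0 = dist(a[1:50, ])    # original pairwise distances on a subset of rows
>>> >>>>>>> d1 = dist(aa[1:50, ])   # after RP + orthogonalization
>>> >>>>>>> d2 = dist(rp[1:50, ])   # after plain RP
>>> >>>>>>> sd(as.numeric(d1) / as.numeric(d0))  # spread of distance ratios: tiny
>>> >>>>>>> sd(as.numeric(d2) / as.numeric(d0))  # noticeably larger for plain RP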
>>> >>>>>>>
>>> >>>>>>> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> >>>>>>> > Lance,
>>> >>>>>>> >
>>> >>>>>>> > You would get better results from the random projection if you
>>> >>>>>>> > did the first part of the stochastic SVD.  Basically, you do the
>>> >>>>>>> > random projection:
>>> >>>>>>> >
>>> >>>>>>> >       Y = A \Omega
>>> >>>>>>> >
>>> >>>>>>> > where A is your original data, \Omega is the random matrix, and
>>> >>>>>>> > Y is the result.  Y will be tall and skinny.
>>> >>>>>>> >
>>> >>>>>>> > Then, find an orthogonal basis Q of Y:
>>> >>>>>>> >
>>> >>>>>>> >       Q R = Y
>>> >>>>>>> >
>>> >>>>>>> > This orthogonal basis will be very close to the orthogonal basis
>>> >>>>>>> > of A.  In fact, there are strong probabilistic guarantees on how
>>> >>>>>>> > good Q is as a basis of A.  Next, you project A using the
>>> >>>>>>> > transpose of Q:
>>> >>>>>>> >
>>> >>>>>>> >       B = Q' A
>>> >>>>>>> >
>>> >>>>>>> > This gives you a short fat matrix that is the projection of A
>>> >>>>>>> > into a lower dimensional space.  Since this is a left projection,
>>> >>>>>>> > it isn't quite what you want in your work, but it is the standard
>>> >>>>>>> > way to phrase things.  The exact same thing can be done with left
>>> >>>>>>> > random projection:
>>> >>>>>>> >
>>> >>>>>>> >      Y = \Omega A
>>> >>>>>>> >      L Q = Y
>>> >>>>>>> >      B = A Q'
>>> >>>>>>> >
>>> >>>>>>> > In this form, B is tall and skinny as you would like and Q' is
>>> >>>>>>> > essentially an orthogonal reformulation of the random projection.
>>> >>>>>>> > This projection is about as close as you are likely to get to
>>> >>>>>>> > something that exactly preserves distances.  As such, you should
>>> >>>>>>> > be able to use MDS on B to get exactly the same results as you
>>> >>>>>>> > want.
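>>> >>>>>>> >
>>> >>>>>>> > A minimal R sketch of this row-space form, reusing the rank-20
>>> >>>>>>> > matrix a from the R session earlier in the thread (the names
>>> >>>>>>> > omega2, y2, q2, b2 are just for illustration):
>>> >>>>>>> >
>>> >>>>>>> >      omega2 = matrix(rnorm(30 * 1000), nrow = 30)  # 30 x 1000 random test matrix
>>> >>>>>>> >      y2 = omega2 %*% a     # Y = \Omega A, short and fat
>>> >>>>>>> >      q2 = qr.Q(qr(t(y2)))  # columns of q2 are Q', i.e. Y = L Q with Q = t(q2)
>>> >>>>>>> >      b2 = a %*% q2         # B = A Q', tall and skinny: 1000 x 30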
>>> >>>>>>> >
>>> >>>>>>> > Additionally, if you start with the original form and do an SVD
>>> >>>>>>> > of B (which is fast), you will get a very good approximation of
>>> >>>>>>> > the prominent right singular vectors of A.  If you do that, the
>>> >>>>>>> > first few of these should be about as good as MDS for
>>> >>>>>>> > visualization purposes.
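>>> >>>>>>> >
>>> >>>>>>> > A sketch of that visualization step, reusing a and b = Q' A from
>>> >>>>>>> > the R session earlier in the thread (purely illustrative):
>>> >>>>>>> >
>>> >>>>>>> >      v2 = svd(b)$v[, 1:2]            # approximate top two right singular vectors of A
>>> >>>>>>> >      coords = a %*% v2               # 1000 x 2 coordinates
>>> >>>>>>> >      plot(coords[, 1], coords[, 2])  # quick 2-D scatter, roughly what MDS would give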
>>> >>>>>>> >
>>> >>>>>>> > On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <goks...@gmail.com> wrote:
>>> >>>>>>> >
>>> >>>>>>> >> I did some testing and made a lot of pretty charts:
>>> >>>>>>> >>
>>> >>>>>>> >> http://ultrawhizbang.blogspot.com/
>>> >>>>>>> >>
>>> >>>>>>> >> If you want to get quick visualizations of your clusters, this
>>> >>>>>>> >> is a great place to start.
>>> >>>>>>> >>
>>> >>>>>>> >> --
>>> >>>>>>> >> Lance Norskog
>>> >>>>>>> >> goks...@gmail.com
>>> >>>>>>> >>
>>> >>>>>>> >
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>> Lance Norskog
>>> >>>>>>> goks...@gmail.com
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Lance Norskog
>>> >>>> goks...@gmail.com
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Lance Norskog
>>> >>> goks...@gmail.com
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Lance Norskog
>>> >> goks...@gmail.com
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Lance Norskog
>>> > goks...@gmail.com
>>> >
>>>
>>>
>>>
>>> --
>>> Lance Norskog
>>> goks...@gmail.com
>>>
>>
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>



-- 
Lance Norskog
goks...@gmail.com
