Rats "My 3D coordinates" should be 'My 2D coordinates'. The there is a preposition missing in the first sentence.
On Thu, Jul 7, 2011 at 8:44 PM, Lance Norskog <goks...@gmail.com> wrote:
> The singular values in my experiments drop like a rock. What is
> information/probability loss formula when dropping low-value vectors?
>
> That is, I start with a 7D vector set, go through this random
> projection/svd exercise, and get these singular vectors: [0.7, 0.2,
> 0.05, 0.02, 0.01, 0.01, 0.01]. I drop the last five to create a matrix
> that gives 2D coordinates. The sum of the remaining singular values is
> 0.9. My 3D vectors will be missing 0.10 of *something* compared to the
> original 7D vectors. What is this something and what other concepts
> does it plug into?
>
> Lance
>
> On Sat, Jul 2, 2011 at 11:54 PM, Lance Norskog <goks...@gmail.com> wrote:
>> The singular values on my recommender vectors come out: 90, 10, 1.2,
>> 1.1, 1.0, 0.95..... This was playing with your R code. Based on this,
>> I'm adding the QR stuff to my visualization toolkit.
>>
>> Lance
>>
>> On Sat, Jul 2, 2011 at 10:15 PM, Lance Norskog <goks...@gmail.com> wrote:
>>> All pairwise distances are preserved? There must be a qualifier on
>>> pairwise distance algorithms.
>>>
>>> On Sat, Jul 2, 2011 at 6:49 PM, Lance Norskog <goks...@gmail.com> wrote:
>>>> Cool. The plots are fun. The way gaussian spots project into spinning
>>>> chains is very educational about entropy.
>>>>
>>>> For full Random Projection, a lame random number generator
>>>> (java.lang.Random) will generate a higher standard deviation than a
>>>> high-quality one like MurmurHash.
>>>>
>>>> On Fri, Jul 1, 2011 at 5:25 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>> Here is R code that demonstrates what I mean by stunning (aka 15
>>>>> significant figures). Note that I only check dot products here. From
>>>>> the fact that the final transform is orthonormal we know that all
>>>>> distances are preserved.
>>>>>
>>>>> # make a big random matrix with rank 20
>>>>>> a = matrix(rnorm(20000), ncol=20) %*% matrix(rnorm(20000), nrow=20);
>>>>>> dim(a)
>>>>> [1] 1000 1000
>>>>> # random projection
>>>>>> y = a %*% matrix(rnorm(30000), ncol=30);
>>>>> # get basis for y
>>>>>> q = qr.Q(qr(y))
>>>>>> dim(q)
>>>>> [1] 1000 30
>>>>> # re-project b, do svd on result
>>>>>> b = t(q) %*% a
>>>>>> v = svd(b)$v
>>>>>> d = svd(b)$d
>>>>> # note how singular values drop like a stone at index 21
>>>>>> plot(d)
>>>>>> dim(v)
>>>>> [1] 1000 30
>>>>> # finish svd just for fun
>>>>>> u = q %*% svd(b)$u
>>>>>> dim(u)
>>>>> [1] 1000 30
>>>>> # u is orthogonal, right?
>>>>>> diag(t(u) %*% u)
>>>>> [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>>>>> # and u diag(d) v' reconstructs a very precisely, right?
>>>>>> max(abs(a - u %*% diag(d) %*% t(v)))
>>>>> [1] 1.293188e-12
>>>>>
>>>>> # now project a into the reduced dimensional space
>>>>>> aa = a %*% v
>>>>>> dim(aa)
>>>>> [1] 1000 30
>>>>> # check a few dot products
>>>>>> sum(aa[1,] %*% aa[2,])
>>>>> [1] 6835.152
>>>>>> sum(a[1,] %*% a[2,])
>>>>> [1] 6835.152
>>>>>> sum(a[1,] %*% a[3,])
>>>>> [1] 3337.248
>>>>>> sum(aa[1,] %*% aa[3,])
>>>>> [1] 3337.248
>>>>>
>>>>> # wow, that's close. let's try a hundred dot products
>>>>>> dot1 = rep(0,100); dot2 = rep(0,100)
>>>>>> for (i in 1:100) {dot1[i] = sum(a[1,] * a[i,]); dot2[i] = sum(aa[1,] * aa[i,])}
>>>>>
>>>>> # how close to the same are those?
>>>>>> max(abs(dot1-dot2))
>>>>> # VERY
>>>>> [1] 3.45608e-11
>>>>>
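One way to pin down the missing 0.10 of *something* asked about at the top of the quoted thread (an aside, not part of the original messages, but standard linear algebra): the squared Frobenius norm of A equals the sum of the squared singular values, and by the Eckart-Young theorem the rank-k truncation is the best rank-k approximation, with squared error equal to the sum of the squared singular values that were dropped. So the loss is most naturally measured with squared singular values rather than their raw sum, and for column-centered data it is exactly the fraction of total variance discarded. A short R sketch using the singular values from the 7D example above:

# Sketch: what truncating an SVD to rank k actually throws away.
d <- c(0.7, 0.2, 0.05, 0.02, 0.01, 0.01, 0.01)   # singular values from the 7D example
k <- 2                                           # keep the first two

retained <- sum(d[1:k]^2) / sum(d^2)             # fraction of squared Frobenius norm kept
lost     <- 1 - retained                         # fraction discarded by the 2D projection
frob_err <- sqrt(sum(d[(k + 1):length(d)]^2))    # ||A - A_k||_F by Eckart-Young
print(c(retained = retained, lost = lost, frobenius_error = frob_err))

With these numbers the 2D truncation keeps roughly 99% of the squared norm, even though the raw singular values only sum to 0.9 of the total.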
>>>>>
>>>>> On Fri, Jul 1, 2011 at 4:54 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>>
>>>>>> Yes. Been there. Done that.
>>>>>>
>>>>>> The correlation is stunningly good.
>>>>>>
>>>>>> On Fri, Jul 1, 2011 at 4:22 PM, Lance Norskog <goks...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Is this true? - "Preserving pairwise distances" means the relative
>>>>>>> distances. So the ratios of new distance:old distance should be
>>>>>>> similar. The standard deviation of the ratios gives a rough&ready
>>>>>>> measure of the fidelity of the reduction. The standard deviation of
>>>>>>> simple RP should be highest, then this RP + orthogonalization, then
>>>>>>> MDS.
>>>>>>>
>>>>>>> On Fri, Jul 1, 2011 at 11:03 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>>>>> > Lance,
>>>>>>> >
>>>>>>> > You would get better results from the random projection if you did the
>>>>>>> > first part of the stochastic SVD. Basically, you do the random projection:
>>>>>>> >
>>>>>>> > Y = A \Omega
>>>>>>> >
>>>>>>> > where A is your original data, \Omega is the random matrix and Y is the
>>>>>>> > result. Y will be tall and skinny.
>>>>>>> >
>>>>>>> > Then, find an orthogonal basis Q of Y:
>>>>>>> >
>>>>>>> > Q R = Y
>>>>>>> >
>>>>>>> > This orthogonal basis will be very close to the orthogonal basis of A.
>>>>>>> > In fact, there are strong probabilistic guarantees on how good Q is as
>>>>>>> > a basis of A. Next, you project A using the transpose of Q:
>>>>>>> >
>>>>>>> > B = Q' A
>>>>>>> >
>>>>>>> > This gives you a short fat matrix that is the projection of A into a
>>>>>>> > lower dimensional space. Since this is a left projection, it isn't
>>>>>>> > quite what you want in your work, but it is the standard way to phrase
>>>>>>> > things. The exact same thing can be done with left random projection:
>>>>>>> >
>>>>>>> > Y = \Omega A
>>>>>>> > L Q = Y
>>>>>>> > B = A Q'
>>>>>>> >
>>>>>>> > In this form, B is tall and skinny as you would like, and Q' is
>>>>>>> > essentially an orthogonal reformulation of the random projection. This
>>>>>>> > projection is about as close as you are likely to get to something that
>>>>>>> > exactly preserves distances. As such, you should be able to use MDS on
>>>>>>> > B to get exactly the same results as you want.
>>>>>>> >
>>>>>>> > Additionally, if you start with the original form and do an SVD of B
>>>>>>> > (which is fast), you will get a very good approximation of the
>>>>>>> > prominent right singular vectors of A. If you do that, the first few of
>>>>>>> > these should be about as good as MDS for visualization purposes.
>>>>>>> >
>>>>>>> > On Fri, Jul 1, 2011 at 2:44 AM, Lance Norskog <goks...@gmail.com> wrote:
>>>>>>> >
>>>>>>> >> I did some testing and made a lot of pretty charts:
>>>>>>> >>
>>>>>>> >> http://ultrawhizbang.blogspot.com/
>>>>>>> >>
>>>>>>> >> If you want to get quick visualizations of your clusters, this is a
>>>>>>> >> great place to start.
>>>>>>> >>
>>>>>>> >> --
>>>>>>> >> Lance Norskog
>>>>>>> >> goks...@gmail.com

--
Lance Norskog
goks...@gmail.com
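To make the rough-and-ready fidelity measure proposed in the quoted thread concrete, here is a small R sketch (not from the original messages) that compares the spread of new:old pairwise-distance ratios for plain random projection versus the orthogonalized projection, reusing the synthetic rank-20 matrix and the Q'A / SVD step from the R session quoted earlier. Plain RP is rescaled by 1/sqrt(k) so its ratios center near 1; MDS is not included.

# Sketch: standard deviation of new:old pairwise-distance ratios.
set.seed(1)
a <- matrix(rnorm(20000), ncol = 20) %*% matrix(rnorm(20000), nrow = 20)  # rank-20, 1000 x 1000
k <- 30
omega <- matrix(rnorm(1000 * k), ncol = k)

y  <- a %*% omega          # plain random projection, 1000 x 30
q  <- qr.Q(qr(y))          # orthonormal basis of Y
v  <- svd(t(q) %*% a)$v    # right singular vectors of B = Q'A
aa <- a %*% v              # orthogonalized projection, 1000 x 30

ratio_sd <- function(low, orig, n = 200) {
  # sample n random row pairs and compare distances before and after reduction
  i <- sample(nrow(orig), n)
  j <- sample(nrow(orig), n)
  keep <- i != j
  d_old <- sqrt(rowSums((orig[i, ] - orig[j, ])^2))[keep]
  d_new <- sqrt(rowSums((low[i, ] - low[j, ])^2))[keep]
  sd(d_new / d_old)
}

print(ratio_sd(y / sqrt(k), a))   # plain RP: noticeable scatter around 1
print(ratio_sd(aa, a))            # RP + orthogonalization: spread near machine precision

The orthogonalized version should show a ratio spread close to machine precision, while plain Gaussian random projection at k = 30 scatters by roughly 1/sqrt(2k) in relative terms, consistent with the ordering conjectured above.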