Re: More LDA Questions

Jake Mannix Sun, 10 Jan 2010 21:38:43 -0800

On Sun, Jan 10, 2010 at 8:39 PM, David Hall <[email protected]> wrote:
>
>
> In some sense, I've come to believe that assigning a label to a topic
> reifies it more than it really deserves to be. Topics are in a lot of
> ways like eigenvectors/eigenfaces; you don't really assign a name (or
> even a visual word) to the fourth eigenface, even if it looks like it
> might be smiling a little bit...
>


Yeah, this is something that has been nagging at me for a while,
whenever these questions of "human interpretable labels" for
clusters/topics/eigenvectors come up. I don't have deep enough
familiarity with all of the techniques involved to say how it relates
in all cases, but I can say this: for eigenvectors, where if they are
textual you could take out the "top-k terms", or if they are faces you
could try to pick out the "top-k facial structures", the problem of
mixing is pretty significant:

given two orthonormal eigenvectors e1, e2 of a matrix M, with
eigenvalues a1, a2, then even when a1 = 2 * a2, the vector
v = e1 + (e2 / 2) satisfies the eigenvector criterion with an error
of only about 2-3% (meaning the cosine between v and M*v is about
0.976, compared to 1.0 for exact eigenvectors, and compared to
roughly 1/sqrt(num_dimensions) for two randomly chosen unit vectors).
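
To make the arithmetic above concrete, here is a minimal sketch in plain
Python. It assumes e1 and e2 are orthonormal (so everything can be done
in their two coordinates) and picks a2 = 1 for convenience; the helper
name `cosine` is just for illustration.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Assumption: a2 = 1.0, so a1 = 2 * a2 = 2.0.
a1, a2 = 2.0, 1.0

# Coordinates of v = e1 + e2/2 in the (e1, e2) basis.
v = [1.0, 0.5]

# M acts on each eigen-direction by scaling with its eigenvalue.
Mv = [a1 * v[0], a2 * v[1]]

cos_mixed = cosine(v, Mv)
print(round(cos_mixed, 4))  # 0.9762
```

The result is 2.25 / sqrt(1.25 * 4.25), so a vector that is one-third
"wrong" by weight still looks very nearly like an eigenvector under
this criterion.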

What this means, in practical terms, is that when you do real
large scale decompositions (and I'm thinking this is similar with
LDA and the like), numerical errors and imperfect convergence
lead to finding a great eigen-*space*, but the actual basis
vectors you've found in it can be much more of a mix with each
other than you might imagine. Think about it: take the top-k terms
from one eigenvector, and the top-k terms from another, and
now consider a mixture of the two vectors with two parts the
first eigenvector and one part the next. The top-k terms of this
linear combination could be a considerably different set.
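
A toy illustration of that last point, in plain Python. The vocabulary
and the weights are made up for the example (they are not from any real
decomposition, and the two vectors are not normalized or orthogonal);
the only thing being shown is how the top-k set shifts under a 2:1 mix.

```python
def top_k_terms(weights, vocab, k):
    """Return the k terms with the largest absolute weight."""
    ranked = sorted(zip(vocab, weights), key=lambda tw: abs(tw[1]), reverse=True)
    return [term for term, _ in ranked[:k]]

# Hypothetical vocabulary and weight vectors, for illustration only.
vocab = ["apple", "banana", "cherry", "date", "elder", "fig", "grape", "honey"]
e1 = [0.60, 0.50, 0.40, 0.35, 0.30, 0.10, 0.05, 0.05]
e2 = [-0.70, 0.10, -0.60, 0.50, 0.05, 0.65, 0.05, 0.05]

# Two parts the first vector, one part the second.
mix = [2.0 * a + b for a, b in zip(e1, e2)]

top_e1 = top_k_terms(e1, vocab, 3)
top_e2 = top_k_terms(e2, vocab, 3)
top_mix = top_k_terms(mix, vocab, 3)

print(top_e1)   # ['apple', 'banana', 'cherry']
print(top_e2)   # ['apple', 'fig', 'cherry']
print(top_mix)  # ['date', 'banana', 'fig']
```

Here "apple" leads both original lists but drops out of the mixture's
top 3 because the opposite-signed weights partly cancel, while "date"
is in neither original top 3 yet comes first in the mix.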

Of course, you can turn this criticism on its head: you could take
any slightly rotated version of the basis you originally found, and
pick one specifically *because* it is more interpretable than the
others.  Finding an efficient way to do that, though, might be more
challenging than the original problem of computing the decomposition
in the first place.
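
As a rough sketch of what "picking a rotated basis" could look like for
just two vectors: rotate the pair within the plane they span and keep
the angle that makes the entries most peaked. The scoring rule used here
(sum of fourth powers, in the spirit of varimax-style criteria) is an
assumption made for this example and not something proposed in this
thread, and the brute-force grid search is exactly the kind of thing
that would not scale. All function names are hypothetical.

```python
import math

def rotate_pair(u, v, theta):
    """Rotate the orthonormal pair (u, v) by theta within their own span."""
    c, s = math.cos(theta), math.sin(theta)
    u_rot = [c * a + s * b for a, b in zip(u, v)]
    v_rot = [-s * a + c * b for a, b in zip(u, v)]
    return u_rot, v_rot

def peakedness(u, v):
    """Assumed interpretability score: larger when few entries dominate."""
    return sum(x ** 4 for x in u) + sum(x ** 4 for x in v)

def most_peaked_rotation(u, v, steps=900):
    """Grid-search angles in [0, pi/2); a quarter turn only swaps the pair."""
    best = (peakedness(u, v), 0.0)
    for i in range(1, steps):
        theta = (math.pi / 2) * i / steps
        score = peakedness(*rotate_pair(u, v, theta))
        if score > best[0]:
            best = (score, theta)
    return rotate_pair(u, v, best[1])

# Two "clean" axis-aligned vectors, deliberately mixed by 30 degrees
# to stand in for the basis a solver might actually return.
clean_a = [1.0, 0.0, 0.0, 0.0]
clean_b = [0.0, 0.0, 1.0, 0.0]
found_u, found_v = rotate_pair(clean_a, clean_b, math.radians(30))

best_u, best_v = most_peaked_rotation(found_u, found_v)
print([round(abs(x), 3) for x in best_u])
print([round(abs(x), 3) for x in best_v])
```

The search recovers the two axis-aligned directions up to order and
sign. With k basis vectors instead of two, the space of rotations grows
quickly, which is where the efficiency concern comes in.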

  -jake
