Re: [agi] Introducing Steve's "Theory of Everything" in cognition.

Abram Demski Thu, 01 Jan 2009 21:40:17 -0800

Steve,

Sorry for not responding for a little while. Comments follow:


>>
>> PCA attempts to isolate components that give maximum
>> information... so my question to you becomes, do you think that the
>> problem you're pointing towards is suboptimal models that don't
>> predict the data well enough, or models that predict the data fine but
>> aren't directly useful for what you expect them to be useful for?
>
>
> Since prediction is NOT the goal, but rather just a useful measure, I am
> only interested in recognizing that which can be recognized, and NOT in
> expending resources on "understanding" semi-random noise. Further, since
> compression is NOT my goal, I am not interested in combining features in
> ways that minimize the number of components. In short, there is a lot to
> be learned from PCA, but a "perfect" PCA solution is likely a
> less-than-perfect NN solution.

What I am saying is this: a good predictive model will predict
whatever is desired. Unsupervised learning attempts to find such a
model. But a good predictive model will probably also predict lots of
stuff we aren't particularly interested in, so supervised methods were
invented to predict single variables when those variables are of
interest. Still, in principle, we could use unsupervised methods.
Furthermore (as I understand it), if we are dealing with lots of
variables and believe deep patterns are present, unsupervised learning
can outperform supervised learning by grabbing onto patterns that may
ultimately lead to the desired result, which supervised learning would
miss because no immediate value was evident. Anyway, my point is
that I can only see two meanings for the word "goodness":

--usefulness in predicting the data as a whole
--usefulness in predicting reward in particular (the real goal)

(Actually, I can think of a third: usefulness in *getting* reward (i.e.,
motor control). But I feel adding that to the discussion would be
premature... there are interesting issues, but they are separate from
the ones being discussed here...)
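
To make the first two meanings concrete, here is a minimal sketch of how
they can come apart once a model is truncated. It assumes NumPy; the
synthetic data, the noise levels, and the names compare_goodness and
r_squared are all made up for illustration, not taken from anything
discussed in this thread. The reward depends only on a low-variance
input, so a single principal component explains most of the data but
almost none of the reward.

```python
import numpy as np


def r_squared(target, prediction):
    """Fraction of the target's variance explained by the prediction."""
    residual = target - prediction
    return 1.0 - residual.var() / target.var()


def compare_goodness(seed=0, n_samples=5000):
    rng = np.random.default_rng(seed)
    # Hypothetical data: one high-variance input, one low-variance input.
    data = rng.normal(size=(n_samples, 2)) * np.array([3.0, 0.5])
    # Hypothetical reward: depends only on the low-variance input.
    reward = data[:, 1] + 0.1 * rng.normal(size=n_samples)

    # PCA via SVD of the centered data; rows of vt are the components.
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_score = centered @ vt[0]

    # Meaning 1: how well the top component reconstructs the data.
    reconstruction = np.outer(top_score, vt[0])
    data_fit = 1.0 - ((centered - reconstruction) ** 2).sum() / (centered ** 2).sum()

    # Meaning 2: how well the top component alone predicts the reward.
    design = np.column_stack([top_score, np.ones(n_samples)])
    coef, *_ = np.linalg.lstsq(design, reward, rcond=None)
    reward_fit_top = r_squared(reward, design @ coef)

    # Supervised baseline: least squares on both raw inputs.
    full = np.column_stack([centered, np.ones(n_samples)])
    coef_full, *_ = np.linalg.lstsq(full, reward, rcond=None)
    reward_fit_full = r_squared(reward, full @ coef_full)

    return data_fit, reward_fit_top, reward_fit_full


data_fit, reward_fit_top, reward_fit_full = compare_goodness()
print("data variance explained by top component:", round(data_fit, 3))
print("reward R^2 from top component only:", round(reward_fit_top, 3))
print("reward R^2 from both inputs:", round(reward_fit_full, 3))
```

Note that nothing is lost if both components are kept; the gap only
appears because the sketch throws away the low-variance direction.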

>>
>> To that end... you weren't talking about using the *predictions* of
>> the PCA model, but rather the principal components themselves. The
>> components are essentially hidden variables to make the model run.
>
>
> ... or variables smushed together in ways that may work well for
> compression, but poorly for recognition.

What are the variables that you keep worrying might be smushed
together? Can you give an example? If PCA smushes variables together,
that suggests one of three things:

--PCA found suboptimal components
--PCA found optimal components, but the hidden variables that got
smushed really are functionally equivalent (when looked at through
the lens of the available visible variables)
--The true probabilistic situation violates the probabilistic
assumptions behind PCA

The third option is by far the most probable, I think.
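
As a rough illustration of the third case, here is a minimal sketch,
assuming NumPy. The mixing matrix, the uniform sources, and the name
component_source_correlations are hypothetical choices for the example.
Two independent, non-Gaussian hidden variables are mixed
non-orthogonally into two visible variables. PCA only looks for
orthogonal directions of maximum variance, so each component it finds
ends up correlated with both hidden variables rather than isolating one.

```python
import numpy as np


def component_source_correlations(seed=0, n_samples=5000):
    """Return corr[i, j]: correlation of PCA component i with hidden source j."""
    rng = np.random.default_rng(seed)
    # Two independent, non-Gaussian hidden variables (uniform on [-1, 1]).
    sources = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    # Hypothetical non-orthogonal mixing into two visible variables.
    mixing = np.array([[1.0, 0.6],
                       [0.2, 1.0]])
    observed = sources @ mixing.T

    # PCA via SVD of the centered data; rows of vt are the components.
    centered = observed - observed.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt.T

    # corrcoef stacks the 2 score rows and 2 source rows into a 4x4 matrix;
    # the upper-right 2x2 block is component-versus-source correlation.
    return np.corrcoef(scores.T, sources.T)[:2, 2:]


corr = component_source_correlations()
print(np.round(corr, 2))
```

If the mixing matrix were orthogonal and the two sources had different
variances, the components would line up with the sources instead, which
is roughly the situation the second option describes.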

>>
>> or in an attempt to complexify the model to make it more accurate in
>> its predictions, by looking for links between the hidden variables, or
>> patterns over time, et cetera.
>
>
> Setting predictions aside, the next layer of PCA-like neurons would be
> looking for those links.

Absolutely.

--Abram Demski


