On Fri, Jul 3, 2020 at 6:53 PM Ben Goertzel <b...@goertzel.org> wrote:

> ...Under what conditions is it the case that, for prediction based on a
> dataset using realistically limited resources, the smallest of the
> available programs that precisely predicts the training data actually gives
> the best predictions on the test data?


If I may refine this a bit to head off misunderstanding at the outset of
this project:

The CIC* (Compression Information Criterion) hypothesis is this: among
models of a process, each producing an executable archive of the same
training data under the same computation constraints, the one that
produces the smallest executable archive will in general be the most
accurate on the test data.
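A minimal runnable sketch of the selection rule, using off-the-shelf gzip
compressors at two settings as stand-in "models" (the candidate names and
the use of gzip are purely illustrative, not part of the proposal; a real
candidate would emit a self-extracting executable, and its length would be
measured the same way):

```python
import gzip

def cic_select(candidates, training_data):
    """Return (name of candidate with the smallest archive, all archive sizes).

    Each candidate maps the training data to an archive (here just a gzip
    stream); CIC picks the candidate whose archive is shortest.
    """
    sizes = {name: len(archive(training_data))
             for name, archive in candidates.items()}
    return min(sizes, key=sizes.get), sizes

# Illustrative stand-ins: two compression settings playing the role of
# two competing models under the same resource constraint.
candidates = {
    "gzip-1": lambda d: gzip.compress(d, compresslevel=1),
    "gzip-9": lambda d: gzip.compress(d, compresslevel=9),
}

data = b"the quick brown fox jumps over the lazy dog " * 1000
best, sizes = cic_select(candidates, data)
```

The hypothesis is then that `best`, chosen only from archive sizes on the
training data, also tends to be the most accurate predictor on held-out data.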


Run a number of experiments, and for each:
1 Select a nontrivial
1.1 computational resource level as the constraint
1.2 real-world dataset -- no less than 1 GB gzipped
2 Divide the data into training and test sets
3 For each competing model:
3.1 Provide the training set
3.2 Record the length of the executable archive the model produces
3.3 Append the test set to the training set
3.4 Record the length of the executable archive the model produces
4 Produce two rank orders of the models, by
4.1 training-set executable archive sizes
4.2 training-plus-test-set executable archive sizes
5 Record the differences between the two rank orders

The lower the average difference between the two rank orders across
experiments, the more general the criterion.
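Steps 4 and 5 above can be sketched as follows. The archive sizes are
made-up numbers purely for illustration, and the average absolute rank
difference (the Spearman footrule distance divided by the number of
models) is one reasonable reading of "differences between the rank
orders":

```python
def rank_order(sizes):
    """Model names sorted by ascending archive size (smallest first)."""
    return sorted(sizes, key=sizes.get)

def avg_rank_difference(ranking_a, ranking_b):
    """Mean absolute difference in rank position between two rankings."""
    pos_b = {name: i for i, name in enumerate(ranking_b)}
    return sum(abs(i - pos_b[name])
               for i, name in enumerate(ranking_a)) / len(ranking_a)

# Hypothetical archive lengths (bytes) for three competing models:
train_sizes = {"A": 1_200, "B": 1_500, "C": 1_900}            # step 3.2
train_plus_test_sizes = {"A": 2_100, "B": 2_600, "C": 2_400}  # step 3.4

diff = avg_rank_difference(rank_order(train_sizes),          # step 4.1
                           rank_order(train_plus_test_sizes)) # step 4.2
# diff = 0 would mean the training-only ranking perfectly predicted
# the training-plus-test ranking; larger values mean more disagreement.
```

Averaging `diff` over many experiments gives the single figure of merit
proposed in step 5.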

It should be possible to run similar tests on other model selection
criteria, and thereby rank-order the model selection criteria themselves.

*We're going to need a catchy acronym to keep up with:

AIC (Akaike Information Criterion)
BIC (Bayesian Information Criterion)...
...aka
SIC (Schwarz Information Criterion)...
...aka
MDL or MDLP (both travesties of "Minimum Description Length
[Principle]" that should be forever cast into the bottomless pit)
HQIC (Hannan-Quinn Information Criterion)...
KIC (Kullback Information Criterion)
etc. etc.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Ta901988932dbca83-M38f2416839b2c191952a0332