“The assumption that all data sets are equally likely is wrong.”

I am not sure that this is really an assumption necessary for the NFLT to apply. 
I think it is just one consequence, stated as a mathematically clear way to 
define the theorem. But the no-free-lunch theorem applies to today's machine 
learning problems on a daily basis. For example, for some problems a random 
forest does a better job, for others logistic regression does a better job, and 
so on. That is no free lunch in everyday application, and yet all data sets are 
not equally likely. Also, deep learning does better across a broader set of 
problems, but there is no free lunch there either, because more data are needed 
for that broader applicability. And so on.

Occam's Razor is one consequence of the no-free-lunch theorem. If you have two 
models that account for the data equally well, you take the simpler one. See 
the above example of logistic regression (simplest), random forest (more 
complex) and deep learning (most complex). If on data set A you do well with 
logistic regression, you don't want to use any of the more complex models. 
Likewise, if data set B can be accounted for accurately by a random forest, you 
don't want to go any more complex. Why? Because you will have to pay the price. 
For example, a naïve modeler may be tempted to use deep learning because deep 
learning is "so powerful" and can do more things, and may try squeezing data 
sets A and B together into a single deep learning model, so that one model 
accounts for both. Naively, one may think that this is better. But of course, 
we know that this is not a good idea; this is what Occam's Razor tells us. The 
no-free-lunch theorem tells us the same thing, and in addition it explains why: 
we will have to pay a price in something else. In this case we may need several 
orders of magnitude more data to make a model that accounts for both A and B. 
Deep learning performs across the broader data set A+B, but poorly. Logistic 
regression performs better over the narrower data set A but is helpless on B. 
Random forest performs better over another narrow data set, B, but is horrible 
at A. On average, all three perform about equally well across both data sets.

Dummy table:

Model     | A score | B score | Total score
----------|---------|---------|------------
LogReg    |   10    |    0    |     10
RandFor   |    0    |   10    |     10
DeepLearn |    5    |    5    |     10

So, if you work with a single data set, either A or B, Occam's Razor tells you 
to pick the model that scores 10 on that data set. In that way, Occam's Razor 
prescribes how to deal with no free lunch when choosing models.
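
As a toy illustration of this averaging, here is a minimal Python sketch that 
uses only the made-up scores from the dummy table above (not real benchmark 
numbers):

# Made-up scores from the dummy table: each model's score on data sets A and B.
scores = {
    "LogReg":    {"A": 10, "B": 0},
    "RandFor":   {"A": 0,  "B": 10},
    "DeepLearn": {"A": 5,  "B": 5},
}

def total_score(model):
    # Sum of a model's scores across both data sets.
    return sum(scores[model].values())

def best_model(dataset):
    # Occam's-Razor-style choice: the model that scores highest
    # on the single data set you actually work with.
    return max(scores, key=lambda m: scores[m][dataset])

for model in scores:
    # Every total is 10: averaged over both data sets, no model wins (no free lunch).
    print(model, "total:", total_score(model))

print("Best for A:", best_model("A"))  # LogReg
print("Best for B:", best_model("B"))  # RandFor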

Best regards,

Danko 





From: Matt Mahoney
Sent: Monday, 3 February 2020 22:51
To: AGI
Subject: Re: [agi] Re: General Intelligence vs. no-free-lunch theorem


On Mon, Feb 3, 2020, 2:48 PM Danko Nikolic <danko.niko...@gmail.com> wrote:
Thanks everyone for the opinions.

So, I see a lot of disagreement. But let me try to be accurate about what this 
disagreement is about. It seems like nobody is challenging the idea that the 
no-free-lunch theorem contradicts the notion of general intelligence (and vice 
versa). Instead, everyone seems to challenge the validity of the no-free-lunch 
theorem itself. Would that be a correct conclusion so far?

The proof is correct. The assumption that all data sets are equally likely is 
wrong. We know a priori that Occam's Razor applies.
