Nice Occam's Razor argument. It made sense to me because I already knew
that for every observation there are infinitely many possible explanations
more complicated than the simplest one. Without a reason to choose one of
those other interpretations, why choose it? You could look for such reasons
in complex environments, but it would likely be more efficient to wait
until you need a better explanation: it is cheaper to wait for an
inconsistency than to search an infinite set without a reason to do so.

Dave

On Fri, Jul 2, 2010 at 6:08 PM, Matt Mahoney <matmaho...@yahoo.com> wrote:

>   Jim, to address all of your points,
>
> Solomonoff induction claims that the probability of a string is
> proportional to the sum of 2^-|M| over all programs M that output the
> string; that is, each program is weighted by 2^-|M|. The probability is
> dominated by the shortest program (the Kolmogorov complexity), but it is
> not exactly the same. The difference is small enough that we may neglect
> it, just as we neglect differences that depend on the choice of language.
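> As a toy illustration of that weighting (the two-rule interpreter below is
> invented for this sketch and is nothing like a universal machine), the
> prior mass of a string is the sum of 2^-|M| over every program M that
> outputs it:

```python
from itertools import product

def toy_run(program):
    """Hypothetical toy interpreter: '0'+s outputs s; '1'+s outputs s+s."""
    if program.startswith("0"):
        return program[1:]
    if program.startswith("1"):
        return program[1:] * 2
    return None  # the empty program outputs nothing

def solomonoff_weight(x, max_len=12):
    """Sum 2^-|M| over all programs M of length <= max_len that output x."""
    total = 0.0
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            m = "".join(bits)
            if toy_run(m) == x:
                total += 2.0 ** (-n)
    return total
```

> For example, "00" is output both by the program "000" (weight 1/8) and by
> the shorter program "10" (weight 1/4), so its total weight of 3/8 is
> dominated by, but not equal to, the shortest program's contribution.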
>
> Here is the proof that Kolmogorov complexity is not computable. Suppose it
> were. Then I could test the Kolmogorov complexity of strings in increasing
> order of length (breaking ties lexicographically) and describe "the first
> string that cannot be described in less than a million bits", contradicting
> the fact that I just did. (Formally, I could write a program that outputs
> the first string whose Kolmogorov complexity is at least n bits, choosing n
> to be larger than the length of my program.)
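> The paradoxical program has this shape; since K is in fact uncomputable
> (that is the theorem), the K below is a hypothetical computable stand-in
> (plain string length), used only to exhibit the structure of the
> contradiction:

```python
from itertools import count, product

def K(s):
    """Hypothetical computable complexity oracle. Any such oracle yields
    the contradiction described above; string length is just a stand-in."""
    return len(s)

def first_string_with_K_at_least(n):
    """Enumerate binary strings in increasing order of length (ties broken
    lexicographically) and return the first with claimed complexity >= n."""
    for length in count(0):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            if K(s) >= n:
                return s

# This function plus the roughly log2(n) bits needed to specify n is a
# description far shorter than n bits, yet it picks out a string whose
# complexity is supposedly at least n -- the contradiction.
```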
>
> Here is the argument that Occam's Razor and Solomonoff distribution must be
> true. Consider all possible probability distributions p(x) over any infinite
> set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is
> infinite. All such distributions must favor shorter strings over longer
> ones. Consider any x in X. Then p(x) > 0. At most a finite number of
> strings (fewer than 1/p(x)) can be at least as likely as x, so, since X is
> infinite, there must be an infinite number of strings that are less likely
> than x. Of this infinite set, only a finite number (fewer than 2^|x|) can
> be shorter than x, and therefore there must be an infinite number that are
> longer than x. So for each x we can partition X into 4 subsets as follows:
>
> - shorter and more likely than x: finite
> - shorter and less likely than x: finite
> - longer and more likely than x: finite
> - longer and less likely than x: infinite.
>
> So in this sense, any distribution over the set of strings must favor
> shorter strings over longer ones.
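> The four-way count can be checked concretely. The distribution
> p(x) = 2^-(2|x|+1) below is an invented example (chosen only because it
> sums to 1 over all binary strings); per the argument, any distribution on
> an infinite set of strings behaves the same way:

```python
from itertools import product

def p(x):
    """Example distribution over binary strings: sums to 1 because each of
    the 2^n strings of length n gets weight 2^-(2n+1)."""
    return 2.0 ** (-(2 * len(x) + 1))

def partition_counts(x, max_len=10):
    """Count strings of length <= max_len in each of the four subsets
    (strings the same length as x fall outside the partition above)."""
    counts = {"shorter_more": 0, "shorter_less": 0,
              "longer_more": 0, "longer_less": 0}
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if len(s) < len(x):
                key = "shorter"
            elif len(s) > len(x):
                key = "longer"
            else:
                continue
            key += "_more" if p(s) > p(x) else "_less"
            counts[key] += 1
    return counts
```

> For x = "01" this finds 3 strings that are shorter and more likely, none
> that are longer and more likely, and every longer string less likely; the
> "longer and less likely" count keeps growing as max_len grows, while the
> other three subsets stay within the 1/p(x) and 2^|x| bounds.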
>
>
> -- Matt Mahoney, matmaho...@yahoo.com
>
>
>  ------------------------------
> *From:* Jim Bromer <jimbro...@gmail.com>
> *To:* agi <agi@v2.listbox.com>
> *Sent:* Fri, July 2, 2010 4:09:38 PM
>
> *Subject:* Re: [agi] Re: Huge Progress on the Core of AGI
>
>
>
> On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer <jimbro...@gmail.com> wrote:
>>
>>> There cannot be a one-to-one correspondence between the shortest program
>>> that produces a string and the string it produces. This means that if the
>>> consideration of the hypotheses were put into general mathematical form,
>>> it must include the potential for many-to-one relations between candidate
>>> programs (or subprograms) and output strings.
>>>
>>
>
>> But there is also no way to determine which program is the "shortest,"
>> since different programs may have the same length.  That means there is a
>> many-to-one relation between programs and program "length," so the claim
>> that you could simply iterate through programs *by length* is false.  This
>> is the goal of algorithmic information theory, not a premise of a method
>> that can actually be used.  So you have the diagonalization problem.
>>
>
>
> A counter argument is that there are only a finite number of Turing
> machine programs of a given length.  However, since you guys have
> specifically designated that this theorem applies to any construction of a
> Turing machine, it is not clear that this counter argument can be used.
> And there is still the specific problem that you might want to try a
> program that writes a longer program to output a string (or many strings),
> or a program that can be called to write longer programs dynamically.  I
> think these cases, where you might consider a program that outputs a
> longer program (or an instruction string for another Turing machine),
> constitute a serious problem that, at the least, deserves to be answered
> with sound analysis.
>
> Part of my original intuitive argument, which I formed some years ago, was
> that without a heavy constraint on the program's instructions it would be
> practically impossible to test or declare that some program is indeed the
> shortest one.  However, I can't quite get to the point now where I can say
> that there is definitely a diagonalization problem.
>
> Jim Bromer
>
>   *agi* | Archives <https://www.listbox.com/member/archive/303/=now>
> <https://www.listbox.com/member/archive/rss/303/> |
> Modify Your Subscription <https://www.listbox.com/member/?&;>
> <http://www.listbox.com/>
>



-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c
Powered by Listbox: http://www.listbox.com
