Jim, to address all of your points, Solomonoff induction claims that the probability of a string is proportional to the number of programs that output the string, where each program M is weighted by 2^-|M|. The probability is dominated by the shortest program (Kolmogorov complexity), but it is not exactly the same. The difference is small enough that we may neglect it, just as we neglect differences that depend on choice of language.
Here is the proof that Kolmogorov complexity is not computable. Suppose it were. Then I could test the Kolmogorov complexity of strings in increasing order of length (breaking ties lexicographically) and describe "the first string that cannot be described in less than a million bits", contradicting the fact that I just did. (Formally, I could write a program that outputs the first string whose Kolmogorov complexity is at least n bits, choosing n to be larger than my program). Here is the argument that Occam's Razor and Solomonoff distribution must be true. Consider all possible probability distributions p(x) over any infinite set X of possible finite strings x, i.e. any X = {x: p(x) > 0} that is infinite. All such distributions must favor shorter strings over longer ones. Consider any x in X. Then p(x) > 0. There can be at most a finite number (less than 1/p(x)) of strings that are more likely than x, and therefore an infinite number of strings which are less likely than x. Of this infinite set, only a finite number (less than 2^|x|) can be shorter than x, and therefore there must be an infinite number that are longer than x. So for each x we can partition X into 4 subsets as follows: - shorter and more likely than x: finite - shorter and less likely than x: finite - longer and more likely than x: finite - longer and less likely than x: infinite. So in this sense, any distribution over the set of strings must favor shorter strings over longer ones. -- Matt Mahoney, matmaho...@yahoo.com ________________________________ From: Jim Bromer <jimbro...@gmail.com> To: agi <agi@v2.listbox.com> Sent: Fri, July 2, 2010 4:09:38 PM Subject: Re: [agi] Re: Huge Progress on the Core of AGI On Fri, Jul 2, 2010 at 2:25 PM, Jim Bromer <jimbro...@gmail.com> wrote: There cannot be a one to one correspondence to the representation of the shortest program to produce a string and the strings that they produce. This means that if the consideration of the hypotheses were to be put into general mathematical form it must include the potential of many to one relations between candidate programs (or subprograms) and output strings. But, there is also no way to determine what the "shortest" program is, since there may be different programs that are the same length. That means that there is a many to one relation between programs and program "length". So the claim that you could just iterate through programs by length is false. This is the goal of algorithmic information theory not a premise of a methodology that can be used. So you have the diagonalization problem. A counter argument is that there are only a finite number of Turing Machine programs of a given length. However, since you guys have specifically designated that this theorem applies to any construction of a Turing Machine it is not clear that this counter argument can be used. And there is still the specific problem that you might want to try a program that writes a longer program to output a string (or many strings). Or you might want to write a program that can be called to write longer programs on a dynamic basis. I think these cases, where you might consider a program that outputs a longer program, (or another instruction string for another Turing Machine) constitutes a serious problem, that at the least, deserves to be answered with sound analysis. Part of my original intuitive argument, that I formed some years ago, was that without a heavy constraint on the instructions for the program, it will be practically impossible to test or declare that some program is indeed the shortest program. However, I can't quite get to the point now that I can say that there is definitely a diagonalization problem. Jim Bromer agi | Archives | Modify Your Subscription ------------------------------------------- agi Archives: https://www.listbox.com/member/archive/303/=now RSS Feed: https://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c Powered by Listbox: http://www.listbox.com