> > Jim: So, did Solomonoff's original idea involve randomizing whether the > next bit would be a 1 or a 0 in the program?
Abram: Yep. I meant, did Solomonoff's original idea involve randomizing whether the next bit in the program's that are originally used to produce the *prior probabilities* involve the use of randomizing whether the next bit would be a 1 or a 0? I have not been able to find any evidence that it was. I thought that my question was clear but on second thought I guess it wasn't. I think that the part about the coin flips was only a method to express that he was interested in the probability that a particular string would be produced from all possible programs, so that when actually testing the prior probability of a particular string the program that was to be run would have to be randomly generated. Jim Bromer On Wed, Aug 4, 2010 at 10:27 PM, Abram Demski <[email protected]> wrote: > Jim, > > Your function may be convergent but it is not a probability. >> > > True! All the possibilities sum to less than 1. There are ways of > addressing this (ie, multiply by a normalizing constant which must also be > approximated in a convergent manner), but for the most part adherents of > Solomonoff induction don't worry about this too much. What we care about, > mostly, is comparing different hyotheses to decide which to favor. The > normalizing constant doesn't help us here, so it usually isn't mentioned. > > > You said that Solomonoff's original construction involved flipping a coin >> for the next bit. What good does that do? > > > Your intuition is that running totally random programs to get predictions > will just produce garbage, and that is fine. The idea of Solomonoff > induction, though, is that it will produce systematically less garbage than > just flipping coins to get predictions. Most of the garbage programs will be > knocked out of the running by the data itself. This is supposed to be the > least garbage we can manage without domain-specific knowledge > > This is backed up with the proof of dominance, which we haven't talked > about yet, but which is really the key argument for the optimality of > Solomonoff induction. > > > And how does that prove that his original idea was convergent? > > > The proofs of equivalence between all the different formulations of > Solomonoff induction are something I haven't cared to look into too deeply. > > Since his idea is incomputable, there are no algorithms that can be run to >> demonstrate what he was talking about so the basic idea is papered with all >> sorts of unverifiable approximations. > > > I gave you a proof of convergence for one such approximation, and if you > wish I can modify it to include a normalizing constant to ensure that it is > a probability measure. It would be helpful to me if your criticisms were > more specific to that proof. > > So, did Solomonoff's original idea involve randomizing whether the next bit >> would be a 1 or a 0 in the program? >> > > Yep. > > Even ignoring the halting problem what kind of result would that give? >> > > Well, the general idea is this. An even distribution intuitively represents > lack of knowledge. An even distribution over possible data fails horribly, > however, predicting white noise. We want to represent the idea that we are > very ignorant of what the data might be, but not *that* ignorant. To capture > the idea of regularity, ie, similarity between past and future, we instead > take an even distribution over *descriptions* of the data. (The distribution > in the 2nd version of solomonoff induction that I gave, the one in which the > space of possible programs is represented as a continuum, is an even > distribution.) This appears to provide a good amount of regularity without > too much. > > --Abram > > On Wed, Aug 4, 2010 at 8:10 PM, Jim Bromer <[email protected]> wrote: > >> Abram, >> Thanks for the explanation. I still don't get it. Your function may be >> convergent but it is not a probability. You said that Solomonoff's original >> construction involved flipping a coin for the next bit. What good does that >> do? And how does that prove that his original idea was convergent? The >> thing that is wrong with these explanations is that they are not coherent. >> Since his idea is incomputable, there are no algorithms that can be run to >> demonstrate what he was talking about so the basic idea is papered with all >> sorts of unverifiable approximations. >> >> So, did Solomonoff's original idea involve randomizing whether the next >> bit would be a 1 or a 0 in the program? Even ignoring the halting >> problem what kind of result would that give? Have you ever solved the >> problem for some strings and have you ever tested the solutions using a >> simulation? >> >> Jim Bromer >> >> On Mon, Aug 2, 2010 at 5:12 PM, Abram Demski <[email protected]>wrote: >> >>> Jim, >>> >>> Interestingly, the formalization of Solomonoff induction I'm most >>> familiar with uses a construction that relates the space of programs with >>> the real numbers just as you say. This formulation may be due to Solomonoff, >>> or perhaps Hutter... not sure. I re-formulated it to "gloss over" that in >>> order to make it simpler; I'm pretty sure the version I gave is equivalent >>> in the relevant aspects. However, some notes on the original construction. >>> >>> Programs are created by flipping coins to come up with the 1s and 0s. We >>> are to think of it like this: whenever the computer reaches the end of the >>> program and tries to continue on, we flip a coin to decide what the next bit >>> of the program will be. We keep doing this for as long as the computer wants >>> more bits of instruction. >>> >>> This framework makes room for infinitely long programs, but makes them >>> infinitely improbable-- formally, they have probability 0. (We could alter >>> the setup to allow them an infinitesimal probability.) Intuitively, this >>> means that if we keep flipping a coin to tell the computer what to do, >>> eventually we will either create an infinite loop-back (so the computer >>> keeps executing the already-written parts of the program and never asks for >>> more) or write out the "HALT" command. Avoiding doing one or the other >>> forever is just too improbable. >>> >>> This also means all real numbers are output by some program! It just may >>> be one which is infinitely long. >>> >>> However, all the programs that "slip past" my time bound as T increases >>> to infinity will have measure 0, meaning they don't add anything to the sum. >>> This means the convergence is unaffected. >>> >>> Note: in this construction, program space is *still* a well-defined >>> entity. >>> >>> --Abram >>> >>> On Sun, Aug 1, 2010 at 9:05 PM, Jim Bromer <[email protected]> wrote: >>> >>>> Abram, >>>> >>>> This is a very interesting function. I have spent a lot of time >>>> thinking about it. However, I do not believe that does, in any way, >>>> prove or indicate that Solomonoff Induction is convergent. I want to >>>> discuss the function but I need to take more time to study some stuff and >>>> to >>>> work various details out. (Although I have thought a lot about it, I am >>>> writing this under a sense of deadline, so it may not be well composed.) >>>> >>>> >>>> >>>> My argument was that Solomonoff's conjecture, which was based (as far as >>>> I can tell) on 'all possible programs', was fundamentally flawed because >>>> the >>>> idea of 'all possible programs' is not a programmable definition. All >>>> possible programs is a domain, not a class of programs that can be feasibly >>>> defined in the form of an algorithm that could 'reach' all the programs. >>>> >>>> >>>> >>>> The domain of all possible programs is trans-infinite just as the domain >>>> of irrational numbers are. Why do I believe this? Because if we >>>> imagine that infinite algorithms are computable, then we could compute >>>> irrational numbers. That is, there are programs that, given infinite >>>> resources, could compute irrational numbers. We can use the binomial >>>> theorem, for example to compute the square root of 2. And we can use >>>> trial and error methods to compute the nth root of any number. So that >>>> proves that there are infinite irrational numbers that can be computed by >>>> algorithms that run for infinity. >>>> >>>> >>>> >>>> So what does this have to do with Solomonoff's conjecture of all >>>> possible programs? Well, if I could prove that any individual >>>> irrational number could be computed (with programs that ran through >>>> infinity) then I might be able to prove that there are trans-infinite >>>> programs. If I could prove that some trans-infinite subset of >>>> irrational numbers could be computed then I might be able to prove that >>>> 'all >>>> possible programs' is a trans-infinite class. >>>> >>>> >>>> Now Abram said that since his sum, based on runtime and program length, >>>> is convergent it can prove that Solomonoff Induction is convergent. Even >>>> assuming that his convergent sum method could be fixed up a little, I >>>> suspect that this time-length bound is misleading. Since a Turing >>>> Machine allows for erasures this means that a program could last longer >>>> than >>>> his time parameter and still produce an output string that matches the >>>> given >>>> string. And if 'all possible programs' is a trans-infinite class then >>>> there are programs that you are going to miss. Your encoding method will >>>> miss trans-infinite programs (unless you have trans-cended the >>>> trans-infinite.) >>>> >>>> However, I want to study the function and some other ideas related to >>>> this kind of thing a little more. It is an interesting function. >>>> Unfortunately I also have to get back to other-worldly things. >>>> >>>> Jim Bromer >>>> >>>> >>>> On Mon, Jul 26, 2010 at 2:54 PM, Abram Demski <[email protected]>wrote: >>>> >>>>> Jim, >>>>> >>>>> I'll argue that solomonoff probabilities are in fact like Pi, that is, >>>>> computable in the limit. >>>>> >>>>> I still do not understand why you think these combinations are >>>>> necessary. It is not necessary to make some sort of ordering of the sum to >>>>> get it to converge: ordering only matters for infinite sums which include >>>>> negative numbers. (Perhaps that's where you're getting the idea?) >>>>> >>>>> Here's my proof, rewritten from an earlier post, using the properties >>>>> of infinite sums of non-negative numbers. >>>>> >>>>> (preliminaries) >>>>> >>>>> Define the computation as follows: we start with a string S which we >>>>> want to know the Solomonoff probability of. We are given a time-limit T. >>>>> We >>>>> start with P=0, where P is a real number with precision 2*log_4(T) or >>>>> more. >>>>> We use some binary encoding for programs which (unlike normal programming >>>>> languages) does not contain syntactically invalid programs, but still will >>>>> (of course) contain infinite loops and so on. We run program "0" and "1" >>>>> for >>>>> T/4 each, "00", "01", "10" and "11" for T/16 each, and in general run each >>>>> program of length N for floor[T/(4^N)] until T/(4^N) is less than 1. Each >>>>> time we run a program and the result is S, we add 1/(4^N) to P. >>>>> >>>>> (assertion) >>>>> >>>>> P converges to some value as T is increased. >>>>> >>>>> (proof) >>>>> >>>>> If every single program were to output S, then T would converge to 1/4 >>>>> + 1/4 + 1/16 + 1/16 + 1/16 + 1/16 + ... that is, 2*(1/(4^1)) + >>>>> 4*(1/(4^2)) + >>>>> 8*(1/(4^3)) + ... which comes to 1/2 + 1/4 + 1/8 + 1/16 + .. i.e. 1/(2^1) >>>>> + >>>>> 1/(2^2) + 1/(2^3) + ... ; it is well-known that this sequence converges to >>>>> 1. Thus 1 is an upper bound for P: it could only get that high if every >>>>> single program were to output S. >>>>> >>>>> 0 is a lower bound, since we start there and never subtract anything. >>>>> In fact, every time we add more to P, we have a new lower bound: P will >>>>> never go below a number once it reaches it. The sum can only increase. >>>>> Infinite sums with this property must either converge to a finite number, >>>>> or >>>>> go to infinity. However, we already know that 1 is an upper bound for P; >>>>> so, >>>>> it cannot go to infinity. Hence, it must converge. >>>>> >>>>> --Abram >>>>> >>>>> On Mon, Jul 26, 2010 at 9:14 AM, Jim Bromer <[email protected]>wrote: >>>>> >>>>>> As far as I can tell right now, my theories that Solomonoff >>>>>> Induction is trans-infinite were wrong. Now that I realize that the >>>>>> mathematics do not support these conjectures, I have to acknowledge that >>>>>> I >>>>>> would not be able to prove or even offer a sketch of a proof of my >>>>>> theories. Although I did not use rigourous mathematics while I have >>>>>> tried >>>>>> to make an assessment of the Solomonoff method, the first principle of >>>>>> rigourous mathematics is to acknowledge that the mathematics does not >>>>>> support your supposition when they don't. >>>>>> >>>>>> Solomonoff Induction isn't a mathematical theory because the desired >>>>>> results are not computable. As I mentioned before, there are a great >>>>>> many >>>>>> functions that are not computable but which are useful and important >>>>>> because >>>>>> they tend toward a limit which can be seen in with a reasonable number of >>>>>> calculations using the methods available. Pi is one such function. (I >>>>>> am >>>>>> presuming that pi would require an infinite expansion which seems right.) >>>>>> >>>>>> I have explained, and I think it is a correct explanation, that there >>>>>> is no way that you could make an apriori computation of all possible >>>>>> combinations taken from infinite values. So you could not even come up >>>>>> with >>>>>> a theoretical construct that could take account of that level of >>>>>> complexity. It is true that you could come up with a theoretical >>>>>> computational method that could take account of any finite number of >>>>>> values, >>>>>> and that is what we are talking about when we talk about the infinite, >>>>>> but >>>>>> in this context it only points to a theoretical paradox. Your >>>>>> theoretical >>>>>> solution could not take the final step of computing a probability for a >>>>>> string until it had run through the infinite combinations and this is >>>>>> impossible. The same problem does not occur for pi because the function >>>>>> that produces it tends toward a limit. >>>>>> >>>>>> The reason I thought Solomonoff Induction was trans-infinite was >>>>>> because I thought it used the Bayesian notion to compute the >>>>>> probability using all possible programs that produced a particular >>>>>> substring >>>>>> following a given prefix. Now I understand that the desired function is >>>>>> the >>>>>> computation of only the probability of a particular string (not all >>>>>> possible >>>>>> substrings that are identical to the string) following a given prefix. I >>>>>> want to study the method a little further during the next few weeks, but >>>>>> I >>>>>> just wanted to make it clear that, as far as now understand the program, >>>>>> that I do not think that it is trans-infinite. >>>>>> >>>>>> Jim Bromer >>>>>> >>>>> ------------------------------------------- agi Archives: https://www.listbox.com/member/archive/303/=now RSS Feed: https://www.listbox.com/member/archive/rss/303/ Modify Your Subscription: https://www.listbox.com/member/?member_id=8660244&id_secret=8660244-6e7fb59c Powered by Listbox: http://www.listbox.com
