Re: Perfect compression and true randomness
Paul Crowley wrote:
>This supports your main point: perfect compression is a *much* less
>realistic idea than true randomness!

Yeah. Now that you mention it, it's not entirely clear what perfect
compression even means, but it seems that it would at a minimum require the
ability to break every cryptosystem in existence. In other words, perfect
compression is apparently utterly unrealistic, unless cryptography is
impossible.

Consider a very long file which contains AES_k(0), AES_k(1), AES_k(2),
AES_k(3), ... for some random key k that is not mentioned in the file. Of
course, the optimal compression of this file is just the 128 bits of the key
k, plus a brief description of the algorithm (AES in counter mode). However,
finding k is infeasible unless AES is insecure. In other words, perfect
compression of this file requires breaking AES!

A similar example shows that if there is any secure cryptosystem at all,
then perfect compression is infeasible.
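For concreteness, here is a rough sketch of how one could generate such a
file in Python (this assumes the third-party pycryptodome package for the
AES primitive; the file name and block count are arbitrary):

# Sketch: write AES_k(0), AES_k(1), ... for a key k that never appears in
# the file. Assumes pycryptodome (pip install pycryptodome); any AES
# implementation would do.
import os
from Crypto.Cipher import AES

k = os.urandom(16)                 # 128-bit key, kept secret
cipher = AES.new(k, AES.MODE_ECB)  # encrypt each counter block directly

with open("counter_stream.bin", "wb") as f:
    for i in range(10**6):             # "a very long file" (size is arbitrary)
        block = i.to_bytes(16, "big")  # counter value as a 128-bit block
        f.write(cipher.encrypt(block))

To anyone who does not know k the file looks like noise and no generic
compressor will shrink it, yet its real information content is just the
128-bit key plus a short description of the procedure.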
Re: Perfect compression and true randomness
"Arnold G. Reinhold" <[EMAIL PROTECTED]> writes: > In any case, as I tried to point out before, perfect compression, what > ever it may be, does not prevent a know-plaintext attack. Actually it does: if the compression is perfect with respect to the document model of the attacker, and the plaintext is known, then it compresses down to zero bits so the attacker learns nothing. This supports your main point: perfect compression is a *much* less realistic idea than true randomness! -- __ \/ o\ [EMAIL PROTECTED] /\__/ http://www.cluefactory.org.uk/paul/
Re: Perfect compression and true randomness
I don't think Chaitin/Kolmogorov complexity is relevant here. In real-world
systems both parties have a lot of a priori knowledge. Your
probably_perfect_compress program is not likely to compress this sentence at
all, but PKZIP can. The probably_perfect_compress argument would work
(ignoring run time) if Alice first had to send Bob the entire PKZIP program,
but in reality she doesn't. Also, discussing "perfect compression" doesn't
make sense in the absence of a space of possible messages and a probability
distribution on that space.

I don't agree that the assumption of randomness in OTPs is on the same
footing as "perfect" compression. The laws of physics let you put a lower
bound on the entropy per bit for practical noise generators. You can then
distill the collected bits to produce fewer bits which are completely
random.

In any case, as I tried to point out before, perfect compression, whatever
it may be, does not prevent a known-plaintext attack. If Malfoy knows the
plaintext and the compression algorithm, he has everything he needs to guess
or exhaust keys. If he has a large number of plaintexts, or can choose
plaintexts, he might be able to mount more sophisticated attacks.

Arnold Reinhold

At 9:20 PM -0800 1/4/2001, Nick Szabo wrote:
>Anonymous wrote (responding to the idea of "perfect compression"):
>> ... Once you have specified
>> such a probability distribution, you can evaluate how well a particular
>> compression algorithm works. But speaking of absolute compression or
>> absolute entropy is meaningless.
>
>These ideas have on a Turing machine the same meaning as the idea of
>"truly random numbers", and for the same reason. The assumption of
>randomness used in proving that OTPs and other protocols are
>"unconditionally" secure is very similar to the assumption that a string
>is "perfectly compressed". The problem is that determining the absolute
>entropy of a string, as well as the equivalent problem of determining
>whether it is "real random", is both uncomputable and language-dependent.
>[...]
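A toy sketch of the known-plaintext point: if Malfoy knows the plaintext and
the (deterministic) compression step, every candidate key can simply be
tested against the observed ciphertext. The 16-bit XOR "cipher" below and
the use of zlib as the compressor are invented purely for illustration; only
the shape of the search matters.

# Toy known-plaintext key search. The repeating-XOR cipher with a 16-bit
# key and zlib as the "compression algorithm" are stand-ins; a real cipher
# only changes the size of the keyspace, not the structure of the attack.
import zlib

def toy_encrypt(data: bytes, key: int) -> bytes:
    ks = key.to_bytes(2, "big")
    return bytes(b ^ ks[i % 2] for i, b in enumerate(data))

plaintext  = b"attack at dawn, attack at dawn"
ciphertext = toy_encrypt(zlib.compress(plaintext), 0xBEEF)  # secret key

# Malfoy compresses the known plaintext himself, then exhausts keys.
target = zlib.compress(plaintext)
for key in range(2**16):
    if toy_encrypt(target, key) == ciphertext:
        print(f"recovered key: {key:#06x}")
        break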
Perfect compression and true randomness
Anonymous wrote (responding to the idea of "perfect compression"):
> ... Once you have specified
> such a probability distribution, you can evaluate how well a particular
> compression algorithm works. But speaking of absolute compression or
> absolute entropy is meaningless.

These ideas have on a Turing machine the same meaning as the idea of
"truly random numbers", and for the same reason. The assumption of
randomness used in proving that OTPs and other protocols are
"unconditionally" secure is very similar to the assumption that a string
is "perfectly compressed". The problem is that determining the absolute
entropy of a string, as well as the equivalent problem of determining
whether it is "real random", is both uncomputable and language-dependent.

Empirically, it seems likely that generating truly random numbers is much
more practical than perfect compression. If one has access to certain
well-observed physical phenomena, one can make highly confident, if
still mathematically unproven, assumptions of "true randomness", but
said phenomena don't help with perfect compression.

If we restrict ourselves to Turing machines, we can do something *close*
to perfect compression and tests of true randomness -- but not quite. And
*very* slowly. Even with a good physical source there is still the problem
that if we can't sufficiently test the output, how can we be so confident
it is random anyway? Such assumptions rest on the extensive and varied,
but imperfect, statistical tests physicists have done (has anybody tried
cryptanalyzing radioactive decay? :-)

We can come close to testing for true randomness and doing perfect
compression on a Turing machine. For example, here is an algorithm that,
run for a sufficiently long but finite number of steps t, will *probably*
give you the perfect compression (I believe the probability converges on
a number related to Chaitin's "Omega" halting probability as t grows, but
don't quote me -- this would make an interesting research topic).

probably_perfect_compress(data, t) {
    shortest_program = data    # fall back to the data itself if nothing shorter is found
    for all binary programs smaller than data {
        run program until it halts or it has run for time t
        if (output of program == data AND
            length(program) < length(shortest_program)) {
            shortest_program = program
        }
    }
    print "the data: ", data
    print "the (probably) perfect compression of the data: ", shortest_program
    return shortest_program
}

(We have to make some reasonable assumption about what the binary
programming language is -- see below.)

We can then use our probably-perfect compression algorithm as a statistical
test of randomness as follows:

probably_random_test(data, t) {
    if length(probably_perfect_compress(data, t)) == length(data)
        then print "data is probably random"
        else print "pattern found, data is not random"
}

We can't *prove* that we've found the perfect compression. However,
I bet we can get a good idea of the *probability* that we've found the
perfect compression by examining this algorithm in terms of the
algorithmic probability of the data and Chaitin's halting probability.

Nor is the above algorithm efficient. Similarly, you can't prove that
you've found truly random numbers, nor is it efficient to generate such
numbers on a Turing machine. (Pseudorandom numbers are another story, and
numbers derived from non-Turing physical sources are another story.)
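To make the shape of that search concrete, here is a deliberately tiny
Python toy. Its "programming language" is not remotely universal -- a
program is just a literal string and a repeat count -- so it says nothing
about Kolmogorov complexity proper, but it shows the brute-force search over
programs shorter than the data. (Because every toy program trivially halts,
no time bound t is needed here; a universal language would need one.)

# Toy version of the shortest-program search over an invented, non-universal
# program space: a program is the string "<literal>*<count>" and its output
# is the literal repeated count times.
def toy_compress(data: str) -> str:
    shortest_program = data                      # fall back to the data itself
    for count in range(2, len(data) + 1):
        if len(data) % count:
            continue
        literal = data[: len(data) // count]
        program = f"{literal}*{count}"
        if literal * count == data and len(program) < len(shortest_program):
            shortest_program = program
    return shortest_program

print(toy_compress("abababababababababab"))  # "ab*10" -- a pattern was found
print(toy_compress("q7f!kz0"))               # unchanged -- "probably random" to this toy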
We could generate (non-cryptographic) probably-random numbers as follows:

probably_random_generate(seed, t) {
    return probably_perfect_compress(seed, t)
}

For cryptographic applications there are two important ideas, one-wayness
and expanding rather than contracting the seed, that are not captured here.
Probably_random_generate is more like the idea of hashing an imperfect
entropy source to get the "core" entropy one believes exists in it. The
difference is that probably_random_generate is far more reliable, as one can
actually formally analyze the probability of having generated truly random
numbers. It is, alas, much slower than hashing. :-(

Back to the theoretical point about whether there is such a thing as
"absolute" entropy or compression. The Kolmogorov complexity (the smallest
program that, when run, produces the decompressed data) is clearly defined
and fully general for Turing machines. If we could determine the Kolmogorov
complexity, we wouldn't need to invoke any probability distribution to
determine the absolute minimum possible entropy of any data to be compressed
on a Turing machine. It is, alas, uncomputable. To find the Kolmogorov
complexity we could simply search through the space of all programs smaller
than the data. But due to the halting problem we cannot always be certain
that there does not exist a smaller program that, run for a sufficiently
long period of time, will produce the decompressed data. When we can't prove
that there is no smaller program than the data which generates the data, we
cannot be certain that we have found the perfect compression.
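In practice the closest computable stand-in is to treat the output length of
an ordinary compressor as an upper bound on the Kolmogorov complexity (up to
the fixed size of the decompressor): it can never certify randomness, but
data it fails to shrink has at least resisted that particular program. A
rough Python sketch, with zlib as an arbitrary choice of compressor:

# An ordinary compressor gives a computable *upper bound* on Kolmogorov
# complexity, never the true value; zlib is an arbitrary illustrative choice.
import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    return len(zlib.compress(data, 9))

def looks_incompressible(data: bytes) -> bool:
    # "Probably random" only in the weak sense that this compressor found
    # no pattern; it proves nothing about true randomness.
    return complexity_upper_bound(data) >= len(data)

print(looks_incompressible(b"ab" * 10000))       # False: an obvious pattern
print(looks_incompressible(os.urandom(20000)))   # almost certainly True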