On Thu, 07 Nov 2013 18:43:17 -0800, Mark Janssen wrote:

>>> I am not sure if it is just stupidness or laziness that prevent you
>>> from seeing that 4^8=65536.
>>
>> I can see that 4^8 = 65536. Now how are you going to render 65537? You
>> claimed that you could render *any* number efficiently. What you've
>> proven is that a small subset of numbers can be rendered efficiently.
>
> I think the idea would be to find the prime factorization for a given
> number, which has been proven to be available (and unique) for any and
> every number. Most numbers can compress given this technique.
Sadly, they would not. Let's suppose you wish to compress the number 143. So you find the prime factorization, 11*13 = 143. Have you compressed anything? No, you have not: the original number, 143, fits in a single byte, while it takes *two* bytes to record 11 and 13 (one byte for 11, and a second byte for 13).

We can try a different strategy: while 143 requires a full eight bits (one byte), both 11 and 13 require only four bits (one nybble) each. So by hard-coding our expectation that the result will be two nybbles, we can avoid expanding the data and end up with exactly zero compression:

143    = binary 10001111, or eight bits in total
11, 13 = binary 1011 1101, or eight bits in total

But while that trick works for 143, it doesn't work for 154 (one byte, binary 10011010), which has the prime factorization 2*7*11 (three nybbles: 0010 0111 1011). Even if you can somehow drop the leading zeroes, it requires nine bits versus eight.

Suppose that instead of bytes you use decimal digits. 11 and 13 use four digits, while 143 uses only three. Likewise, 154 has three decimal digits while 2*7*11 has four. In both cases, there is no compression.

How about recording *which* prime number it is, instead of the actual value of the prime? 154 = 2*7*11 expands to the 1st, 4th and 5th primes, so maybe we can record 1 4 5 instead of 2 7 11 and save some bits. (We'll save more bits the larger the prime.) Of course, to reverse the compression you need to keep a table of the prime numbers somewhere, and that table is going to be a lot bigger than the space you save...

Now, I accept that, possibly, with sufficient work we might be able to find some cunning mechanism by which we can compress *any* integer value. But it won't be the same mechanism for every value!
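To make the arithmetic above concrete, here is a minimal Python sketch that compares the bit-length of a number against the combined bit-length of its prime factors, and also computes the prime *indices* mentioned above. The helper names (`factorize`, `prime_index`) are mine, not anything from the thread, and trial division is used purely for illustration:

```python
def factorize(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def prime_index(p):
    """1-based position of the prime p among the primes (2 is the 1st)."""
    count = 0
    for n in range(2, p + 1):
        if factorize(n) == [n]:  # n is prime iff it is its own factorization
            count += 1
    return count

for n in (143, 154):
    factors = factorize(n)
    print(n, factors,
          n.bit_length(),                        # bits in the original
          sum(f.bit_length() for f in factors))  # bits in the "compressed" form
```

Running it shows exactly the pattern described: 143 breaks even (8 bits either way), and 154 *expands* from 8 bits to 9, even with every leading zero stripped from each factor.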
For example, we might compress (2**1000 + 1) using a run-length encoding of its binary bits, while compressing 14629839399572435720381495231 as its prime factorization (the 319th prime number, the 479th prime, and the 499th prime six times), and 10**1000 as an encoding of its decimal digits. But we still need some sort of tag to distinguish between these different compression schemes, and once you take the extra information in the tag into account, there will be cases where some numbers aren't compressed at all.

The ability to losslessly compress *everything* is sheer pseudo-mathematics, up there with finding an exact rational value for pi, or squaring the circle using only a compass and straight-edge. But the ability to losslessly compress *useful data* is real, and such algorithms operate by finding non-random patterns and redundant information in the data.

-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list