Gregory Ewing <greg.ew...@canterbury.ac.nz> writes:

> Ben Bacarisse wrote:
>> But that has to be about the process that gives rise to the data, not
>> the data themselves.
>>
>> If I say: "here is some random data..." you can't tell if it is or is
>> not from a random source. I can, as a parlour trick, compress and
>> recover this "random data" because I chose it.
>
> Indeed. Another way to say it is that you can't conclude
> anything about the source from a sample size of one.
>
> If you have a large enough sample, then you can estimate
> a probability distribution, and calculate an entropy.
>
>> I think the argument that you can't compress arbitrary data is simpler
>> ... it's obvious that it includes the results of previous
>> compressions.
>
> What? I don't see how "results of previous compressions" comes
> into it. The source has an entropy even if you're not doing
> compression at all.
Maybe we are talking at cross purposes. A claim to be able to compress
arbitrary data leads immediately to the problem that iterating the
compression will yield zero-size results. That, to me, is a simpler
argument than talking about data from a random source.

--
Ben.
--
https://mail.python.org/mailman/listinfo/python-list
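[A quick illustration of the iteration argument, not part of the original
post: if a compressor could shrink *every* input, applying it repeatedly
would drive any file toward zero bytes. Running a real compressor (zlib
here) on incompressible input shows what actually happens instead.]

```python
# Sketch: feed incompressible (random) data through zlib repeatedly.
# If arbitrary data were compressible, the sizes would march toward
# zero; in fact they never drop below the original, because the
# compressor must add header/checksum overhead to data it cannot shrink.
import os
import zlib

data = os.urandom(4096)   # incompressible with overwhelming probability
sizes = [len(data)]
for _ in range(5):
    data = zlib.compress(data, 9)
    sizes.append(len(data))

print(sizes)
# Each "compressed" size stays at or slightly above 4096 bytes:
# random data cannot be shrunk, only wrapped in a little overhead.
```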