On Fri, 1 Feb 2002, Kossmann, Bill wrote:

>The article below may be of interest to members of this list.
[An article on categorizing textual strings by appending them to
reference documents and measuring aggregate compressibility snipped.]

This shouldn't be a big surprise, considering how close current
compression algorithms get to the estimated entropy of various sources.
In essence, compressors are statistical learners, and classification
problems can be formulated as partitionings based on statistical
similarity. I just wonder whether doing a significant number of
compression runs against known sources isn't a bit expensive compared to
current methods of identification.
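
To make the idea concrete, here is a rough sketch of how I read the
scheme: append the sample to each reference text, compress, and pick the
reference for which the sample adds the fewest bytes. It is not the
article's code. zlib stands in for whatever compressor one would really
use, and the names compressed_size, extra_cost and classify are made up
for illustration.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Size in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def extra_cost(reference: bytes, sample: bytes) -> int:
    """Bytes the sample adds when compressed right after the reference.

    A small value suggests the sample is statistically similar to the
    reference, since the compressor can reuse what it saw there.
    """
    return compressed_size(reference + sample) - compressed_size(reference)


def classify(sample: bytes, references: Mapping[str, bytes]) -> str:
    """Return the label of the reference that makes the sample cheapest.

    Assumption: references maps a label to one representative text.
    """
    return min(references, key=lambda label: extra_cost(references[label], sample))
```

Note that this costs two compression runs per reference for every sample,
which is the overhead in question. Caching compressed_size(reference)
would halve that, but each sample still needs one full run per known
source. zlib also only looks back 32 KB, so with this stand-in only the
tail of a long reference would actually influence the result.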

Sampo Syreeni, aka decoy - mailto:[EMAIL PROTECTED], tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2




---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
