Re: ?splints for broken hash functions
I wrote:
>> the Bi are the input blocks:
>> (IV) -> B1 -> B2 -> B3 -> ... Bk -> H1
>> (IV) -> B2 -> B3 -> ... Bk -> B1 -> H2
>> then we combine H1 and H2 nonlinearly.

(Note that I have since proposed a couple of improvements, but I don't think they are relevant to the present remarks.)

David Wagner wrote:
> This does not add any strength against Joux's attack. One can find
> collisions for this in 80*2^80 time with Joux's attack. First, generate
> 2^80 collisions for the top line. Find B1,B1* that produce a collision,
> i.e., C(IV,B1)=C(IV,B1*)=V2. Then, find B2,B2* that produce a collision,
> i.e., C(V2,B2)=C(V2,B2*)=V3. Continue to find B3,B3*, ..., Bk,Bk*. Note
> that we can combine these in any way we like (e.g., B1, B2*, B3*, B4,
> ..., Bk) to get 2^80 different messages that all produce the same output
> in the top line (same H1).

OK so far. I think this assumes a sufficiently long input string, but I'm willing to make that assumption.

> Next, look at the bottom line. For each of the 2^80 ways to combine the
> above blocks, compute what output you get in the bottom line.

OK so far.

> By the birthday paradox,

Whoa, lost me there. I thought H1 was fixed, namely the ordinary hash of the original message. Two questions:

1) If H1 is fixed, I don't understand how birthday arguments apply. Why will trying the bottom line 2^80 times find me an H2 equal to a particular fixed H1? There are a whole lot more than 2^80 possibilities.

2) If H1 is not fixed, then this seems to be an unnecessarily difficult way of saying that a 160-bit hash can be broken using 2^80 work. But we knew that already. We didn't need Joux or anybody else to tell us that. A proposal that uses 80*2^80 work is particularly confusing, if H1 is not fixed.

> you will find some pair that produce a collision in the bottom line
> (same H2). But that pair also produces a collision in the top line
> (since all pairs collide in the top line), so you have a collision for
> the whole hash (same H1,H2).

Still lost, for the same reasons.
Please explain. Also if possible please address the improved version, with different IVs and long-range permutation of a partial block. - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
RE: "Approximate" hashes
| nilsimsa | Computes nilsimsa codes of messages and compares the codes and finds | clusters of similar messages so as to trash spam. | | What's a nilsimsa code? | | A nilsimsa code is something like a hash, but unlike hashes, a small change | in the message results in a small change in the nilsimsa code. | | http://lexx.shinn.net/cmeclax/nilsimsa.html I had a look at the code (which isn't easy to follow). This appears to be a new application of Bloom filters. -- Jerry
?splints for broken hash functions
Hal Finney writes:
>[John Denker proposes:] the Bi are the input blocks:
> (IV) -> B1 -> B2 -> B3 -> ... Bk -> H1
> (IV) -> B2 -> B3 -> ... Bk -> B1 -> H2
>then we combine H1 and H2 nonlinearly.

This does not add any strength against Joux's attack. One can find collisions for this in 80*2^80 time with Joux's attack. First, generate 2^80 collisions for the top line. Find B1,B1* that produce a collision, i.e., C(IV,B1)=C(IV,B1*)=V2. Then, find B2,B2* that produce a collision, i.e., C(V2,B2)=C(V2,B2*)=V3. Continue to find B3,B3*, ..., Bk,Bk*. Note that we can combine this in any way we like (e.g., B1, B2*, B3*, B4, .., Bk) to get 2^80 different messages that all produce the same output in the top line (same H1).

Next, look at the bottom line. For each of the 2^80 ways to combine the above blocks, compute what output you get in the bottom line. By the birthday paradox, you will find some pair that produce a collision in the bottom line (same H2). But that pair also produces a collision in the top line (since all pairs collide in the top line), so you have a collision for the whole hash (same H1,H2).

>[...] how about this simpler construction?
> (IV1) -> B1 -> B2 -> B3 -> ... Bk -> H1
> (IV2) -> B1 -> B2 -> B3 -> ... Bk -> H2
>and again combine H1 and H2.

The same attack applies. This construction is not secure against Joux's attack, either.
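Wagner's stage-by-stage collision search can be sketched in miniature. The sketch below uses a toy compression function (SHA-256 truncated to a 16-bit chaining value, so collisions are cheap to find by brute force); all names here are illustrative, not from any real library, and the 2^3-way multicollision stands in for the 2^80-way one in the attack.

```python
# Toy Merkle-Damgard hash with a 16-bit chaining value, built on SHA-256.
import hashlib
from itertools import product

def C(state: bytes, block: bytes) -> bytes:
    """Toy compression function: 2-byte state, truncated SHA-256."""
    return hashlib.sha256(state + block).digest()[:2]

def md_hash(blocks, iv=b'\x00\x00'):
    state = iv
    for b in blocks:
        state = C(state, b)
    return state

def find_stage_collision(state):
    """Birthday-search two distinct blocks colliding under C(state, .)."""
    seen = {}
    i = 0
    while True:
        b = i.to_bytes(4, 'big')
        out = C(state, b)
        if out in seen and seen[out] != b:
            return seen[out], b, out
        seen[out] = b
        i += 1

# Build a 2^3 = 8-way multicollision from 3 successive stage collisions,
# exactly as in the attack: each stage offers a choice of Bi or Bi*.
state = b'\x00\x00'
stages = []
for _ in range(3):
    b0, b1, state = find_stage_collision(state)
    stages.append((b0, b1))

msgs = [[pair[bit] for pair, bit in zip(stages, choice)]
        for choice in product((0, 1), repeat=3)]
digests = {md_hash(m) for m in msgs}
print(len(msgs), len(digests))  # 8 distinct messages, 1 distinct digest
```

Any of the 8 block sequences may then be fed to a second ("bottom") line, which is where the birthday step of the attack happens.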
RE: "Approximate" hashes
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Marcel Popescu > Sent: Wednesday, September 01, 2004 9:56 AM > To: [EMAIL PROTECTED] > Subject: "Approximate" hashes > > I am trying to build a Windows anti-spam thingy; it's > supposed to "sit" in > between the mail client and the outer world, and indicate through mail > headers whether the incoming mail has a valid hashcash > http://www.hashcash.org/ "coin" (and, of course, to automatically add > hashcash to outgoing emails). > > My problem is that I don't know what happens with the email in transit > (this, I believe, is an observation in the hashcash FAQ). I > am worried that > some mail server might dislike ASCII characters with the high > bit set, or > that a client uses some encoding which for some reason > doesn't make it to > the destination unchanged. > > Hence my question: is there some "approximate" hash function > (which I could > use instead of SHA-1) which can verify that a text hashes > "very close" to a > value? So that if I change, say, tabs into spaces, I won't > get exactly the > same value, but I would get a "good enough"? > > I don't know if this is possible. But if it is, I though this > would be a > good place to find out about it. nilsimsa Computes nilsimsa codes of messages and compares the codes and finds clusters of similar messages so as to trash spam. What's a nilsimsa code? A nilsimsa code is something like a hash, but unlike hashes, a small change in the message results in a small change in the nilsimsa code. http://lexx.shinn.net/cmeclax/nilsimsa.html -- Keith Ray <[EMAIL PROTECTED]> -- OpenPGP Key: 0x79269A12 - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Kerberos Design
Hi, I'm currently looking into implementing a single sign-on solution for distributed services. The requirement profile seems to somewhat fit Kerberos, but I'm not entirely convinced that I can use it in my scenario - which can't simply run an off-the-shelf KDC and use UDP for communication with it. However, years of reading various crypto resources have strongly conditioned me against simple-minded attempts to "roll my own" as a crypto dilettante.

I've been trying to study Kerberos' design history in the recent past and have failed to come up with a good resource that explains why things are built the way they are. Since I'm already using OpenSSL for various SSL/x.509 related things, I'm most astonished by the almost total absence of public key cryptography in Kerberos, and I haven't been able to find out why this design choice was made - performance reasons, given that at its inception public key operation cost was probably much more prohibitive?

So, I'd like to ask the audience:

- Is there a good web/book/whatever resource regarding the design of Kerberos? Amazon offers the O'Reilly book, which, from the abstract, seems to take the cryptographic design of Kerberos as a given and concentrates on its usage, and another one that also doesn't seem to give much detail on the issue. Something in the direction of EKR's SSL/TLS book would be very much appreciated.

- Is Kerberos a sane choice to adapt for such solutions today? Is there anything more recent that I should be aware of?

thanks,
-- 
[*Thomas Themel*] [extended contact] But let your communication be, Yea, yea; Nay, nay: [info provided in] for whatsoever is more than these cometh of evil. [*message header*] - Matthew 5:37
Re: "Approximate" hashes
At 06:02 PM 9/1/04 +0300, Marcel Popescu wrote: >From: "Marcel Popescu" <[EMAIL PROTECTED]> > >> Hence my question: is there some "approximate" hash function (which I >could >> use instead of SHA-1) which can verify that a text hashes "very close" to >a >> value? So that if I change, say, tabs into spaces, I won't get exactly the >> same value, but I would get a "good enough"? This is completely what secure hashes are supposed to prevent. *Any* change in the input will flip half the hash-bits on average. Just like a block cipher. There is no input "nearness" preservation. That's part of the point of the algorithm. >I just had an idea. Would this work? > >- let S be the input string, whose hash I want to verify >- make S uppercase >- remove everything but A-Z, 0-9, and common punctuation (!;:'",.?) >- calculate the SHA1 hash of the result > >This should keep any insignificant changes out of the final result. You can encode your message in some format which is not subject to mangling, and use a hash of that encoding. You can then decode the encoding back into unicode or whatever funky but net-fragile character set you like. This is somewhat like ascii-armoring of PGP-encrypted messages. = 36 Laurelwood Dr Irvine CA 92620-1299 VOX: (714) 544-9727 (home) mnemonic: P1G JIG WRAP ICBM: -117.7621, 33.7275 HTTP: http://68.5.216.23:81 (back up, but not 99.999% reliable) PGP PUBLIC KEY: by arrangement Send plain ASCII text not HTML lest ye be misquoted -- "Don't 'sir' me, young man, you have no idea who you're dealing with" Tommy Lee Jones, MIB No, you're not 'tripping', that is an emu ---Hank R. Hill - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
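The "flips half the bits on average" claim is easy to check directly: hash two inputs differing only in a tab-versus-space and count the differing bits. This sketch is mine, not from the original post; SHA-1 outputs 160 bits, so the Hamming distance should land near 80.

```python
# Avalanche check: a one-character change flips about half of SHA-1's bits.
import hashlib

def hamming(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count('1') for x, y in zip(a, b))

h1 = hashlib.sha1(b"hello\tworld").digest()
h2 = hashlib.sha1(b"hello world").digest()
d = hamming(h1, h2)
print(d)  # typically within a few dozen bits of 80
```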
Re: "Approximate" hashes
From: ""Hal Finney"" <[EMAIL PROTECTED]> > As you are probably aware, existing hashcash implementations do not base > the stamp on the message content. Instead they only lock the stamp to > the receiver's email address. Then the receiver keeps a list of the > hashcash stamps he has seen recently, to prevent reuse. I'm not sure > what you hope to gain by locking the stamp to the message content. Me dumb, sorry. I was actually thinking of some other thing I wanted to do about spam (which involved the whole content), and haven't re-read the hashcash site in about a week. You are perfectly right, of course. Windows spam victims, here I come! Thanks, Marcel
Re: Implementation choices in light of recent attacks?
On Wed, 1 Sep 2004, Jim McCoy wrote:
>After digesting the various bits of information and speculation on the
>recent breaks and partial attacks on various popular hash functions I
>am wondering if anyone has suggestions for implementation choices for
>someone needing a (hopefully) strong hash today, but who needs to keep
>the hash output size in the 128-192 bit range. A cursory examination
>of Tiger seems to indicate that it uses a different methodology than
>the MDx & SHAx lines, does this mean that it does not suffer from the
>recent hash attacks? Would a SHA256 that has its output chopped be
>sufficient?
>
>Any suggestions would be appreciated.

I believe that SHA256 with its output cut to 128 bits will be effective. The simplest construction is to just throw away half the bits.

Bear
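Bear's "throw away half the bits" construction is a one-liner; this sketch (the function name is mine) shows the simplest reading of it:

```python
# Truncated SHA-256: compute the full digest, keep only the first 128 bits.
import hashlib

def sha256_128(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()[:16]

tag = sha256_128(b"example message")
print(len(tag) * 8)  # 128
```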
Re: Compression theory reference?
On Tue, 31 Aug 2004, John Denker wrote:
> I emphasize that I am only discussing messages of length N,
> where N is some pre-chosen number. For concreteness, let's
> choose N=10.
> I repeat my assertion that _if_ XX can compress any string,
> shortening it by one bit, and _if_ you know that the original
> messages each have exactly 10 bits, _then_ any 10-bit message
> can be compressed down to a single bit.

Actually you don't need to take it all the way to the reductio ad absurdum here. There are 1024 10-bit messages. There are 512 9-bit messages. You need to point out that since a one-to-one mapping between sets of different cardinality cannot exist, it is not possible to create something that will compress every ten-bit message by one bit.

Or, slightly more formally, assume that a function C can reversibly compress any ten-bit message by one bit. Since there are 1024 distinct ten-bit messages, there must be at least 1024 distinct nine-bit messages which, when the reversal is applied, result in these 1024 messages. There are exactly 512 distinct nine-bit messages. Therefore 512 >= 1024, a contradiction.

Bear
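The counting argument can be checked exhaustively for N = 10. The sketch below uses the simplest candidate "compressor" (dropping the last bit) to show the pigeonhole collapse: 1024 inputs land on only 512 outputs, so no reversible one-bit-shorter map can exist.

```python
# Pigeonhole check: 10-bit strings cannot map one-to-one into 9-bit strings.
n_ten, n_nine = 2 ** 10, 2 ** 9
outputs = {m >> 1 for m in range(n_ten)}  # "compress" by dropping the low bit
print(n_ten, n_nine, len(outputs))  # 1024 512 512
```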
Re: Compression theory reference?
On Wed, 1 Sep 2004, Hadmut Danisch wrote: > >I have often heard that example and I used it myself several times. I >do not use it anymore, because it is not that easy. There's a major >flaw in this example, and therefore it is not really a >counterexample. If the faculty found that flaw I'd be in a bad >position. > >You could define some reversible bizarre function f that does exactly >that job, i.e. for a given message m you need to apply f again and >again and after some finite number of calculations you'll find that >f(...f(m)...) = x where x = 0 or 1 For example, loading the message into a Linear Feedback Shift Register and iterating until it produces one of two predetermined messages 0 or 1? For a nontrivial message, the last star will burn out before you get that many iterations. Also, as you say, in order to decode the message you have to know how many iterations you made - a number which will, on the average, be the same length as the message. It hardly seems a worthwhile example. >They say LZW and MTF. I have already given an example for LZW. >They don't care. I've told them to take any random string taken >from /dev/random under Linux. They don't care. The german principle is >that a faculty is always right by definition. That is inconsistent with the advancement of knowledge. Any university relying on such a principle has abandoned its duty. Bear
Re: ?splints for broken hash functions
John Kelsey critiques the proposal from Practical Cryptography: > >"We do not know of any literature about how to fix the hash functions, > >but here is what we came up with when writing this book. ... Let h be > >one of the hash functions mentioned above. Instead of m->h(m), we use > >m->h(h(m) || m) as hash function. > > I believe this falls to a generalization of the Joux attack, as well. (Someone may > have already noticed this.) > > a. I build a 2^{80} multicollision on h(m) using Joux' attack, requiring 80*2^{80} > work. > > b. I now have 2^{80} different messages which are being hashed with the same IV. I > expect one pair of them to give me a collision. You did 80*2^80 work to create a collision. But you could create a collision without all this effort in only 2^80 work, by a conventional birthday attack. Just ignore the internal structure and treat it as a 160 bit hash function and you can collide in 2^80 work. You worked harder than you needed to, so this is not a break. Hal
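Hal's baseline, a conventional birthday attack costing about 2^(n/2) on an n-bit hash, can be demonstrated in miniature. This sketch (mine, with n = 32 via a truncated SHA-1) finds a collision in roughly 2^16 trials rather than the 2^32 that brute-force preimage search would need.

```python
# Birthday attack on a 32-bit hash: a collision appears after ~2^16 trials.
import hashlib

def h32(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()[:4]

seen = {}
i = 0
while True:
    m = i.to_bytes(8, 'big')
    d = h32(m)
    if d in seen:
        pair = (seen[d], m)   # two distinct messages, same 32-bit digest
        break
    seen[d] = m
    i += 1
print(i < 2 ** 20)  # True: far fewer trials than 2^32
```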
Re: "Approximate" hashes
On Wed, 1 Sep 2004, Marcel Popescu wrote: > My problem is that I don't know what happens with the email in transit > some mail server might dislike ASCII characters with the high bit set, or > Hence my question: is there some "approximate" hash function (which I could PGP has this issue with 'clear-signed' output. An excerpt from the pgp man page: If CLEARSIG is enabled, then when signing and ASCII-armoring a text file, PGP uses a different format that includes the plaintext in human-readable form. Lines beginning with "-" are quoted with "- ". To cope with some of the stupider mailers in the world, lines beginning with "From" are also quoted, and trailing whitespace on lines is stripped. PGP will remove the quoting if you use it to decrypt the message, but the trailing whitespace is not recovered. This is still useful enough to be enabled by default. You might find more details about typical mailer-munging from the PGP docs or source. --scott GRALLSPICE Moscow MKSEARCH affinity group KUDESK Nazi operative Clinton Register to vote! http://www.yourvotematters.org/VerifiedVoting ( http://cscott.net/ )
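The quoting rules from the man page excerpt can be sketched as below. This is a rough approximation of the CLEARSIG canonicalisation, not the actual PGP source, and the function name is mine:

```python
# Sketch of PGP clear-signing line canonicalisation: dash-escape lines
# starting with "-" or "From", strip trailing whitespace.
def clearsig_escape(text: str) -> str:
    out = []
    for line in text.split('\n'):
        line = line.rstrip()                 # trailing whitespace is stripped
        if line.startswith('-') or line.startswith('From'):
            line = '- ' + line               # quote with "- "
        out.append(line)
    return '\n'.join(out)

print(clearsig_escape('-dash\nFrom me\nhello  '))
```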
Implementation choices in light of recent attacks?
After digesting the various bits of information and speculation on the recent breaks and partial attacks on various popular hash functions I am wondering if anyone has suggestions for implementation choices for someone needing a (hopefully) strong hash today, but who needs to keep the hash output size in the 128-192 bit range. A cursory examination of Tiger seems to indicate that it uses a different methodology than the MDx & SHAx lines, does this mean that it does not suffer from the recent hash attacks? Would a SHA256 that has its output chopped be sufficient? Any suggestions would be appreciated. Jim - The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
Re: "Approximate" hashes
From: "Marcel Popescu" <[EMAIL PROTECTED]> > Hence my question: is there some "approximate" hash function (which I could > use instead of SHA-1) which can verify that a text hashes "very close" to a > value? So that if I change, say, tabs into spaces, I won't get exactly the > same value, but I would get a "good enough"? I just had an idea. Would this work? - let S be the input string, whose hash I want to verify - make S uppercase - remove everything but A-Z, 0-9, and common punctuation (!;:'",.?) - calculate the SHA1 hash of the result This should keep any insignificant changes out of the final result. Does anyone know of a mail transformation which could upset it? Can anyone see a way to "attack" this by letting a significantly different message collide on the same hash? (I'm ignoring the recent discoveries - they're not that practical, I'm only trying to fight spam, not the government.) Thanks, Marcel
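Marcel's canonicalise-then-hash proposal, written out as a sketch (the names and the exact regular expression are mine; the punctuation set is his):

```python
# Canonicalise a message (uppercase, drop everything outside the kept set),
# then take SHA-1 of the result.
import hashlib
import re

# Keep A-Z, 0-9, and Marcel's "common punctuation"; remove everything else.
DROP = re.compile(r"[^A-Z0-9!;:'\",.?]")

def canonical_sha1(text: str) -> bytes:
    return hashlib.sha1(DROP.sub('', text.upper()).encode()).digest()

# Tabs vs spaces (and case) no longer matter:
a = canonical_sha1("Hello,\tworld!")
b = canonical_sha1("HELLO, WORLD!")
print(a == b)  # True
```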
"Approximate" hashes
I am trying to build a Windows anti-spam thingy; it's supposed to "sit" in between the mail client and the outer world, and indicate through mail headers whether the incoming mail has a valid hashcash http://www.hashcash.org/ "coin" (and, of course, to automatically add hashcash to outgoing emails).

My problem is that I don't know what happens with the email in transit (this, I believe, is an observation in the hashcash FAQ). I am worried that some mail server might dislike ASCII characters with the high bit set, or that a client uses some encoding which for some reason doesn't make it to the destination unchanged.

Hence my question: is there some "approximate" hash function (which I could use instead of SHA-1) which can verify that a text hashes "very close" to a value? So that if I change, say, tabs into spaces, I won't get exactly the same value, but I would get a "good enough" one?

I don't know if this is possible. But if it is, I thought this would be a good place to find out about it.

Thanks, Marcel
Re: ?splints for broken hash functions
>From: Ivan Krstic <[EMAIL PROTECTED]> >Sent: Aug 29, 2004 8:40 AM >To: Metzdowd Crypto <[EMAIL PROTECTED]> >Subject: Re: ?splints for broken hash functions >This is Schneier's and Ferguson's solution to then-known hash function >weaknesses in Practical Cryptography, Wiley Publishing, 2003: >"We do not know of any literature about how to fix the hash functions, >but here is what we came up with when writing this book. ... Let h be >one of the hash functions mentioned above. Instead of m->h(m), we use >m->h(h(m) || m) as hash function. Effectively we put h(m) before the >message we are hashing. This ensures that the iterative hash >computations immediately depend on all the bits of the message, and no >partial-message or length extension attacks can work. ... I believe this falls to a generalization of the Joux attack, as well. (Someone may have already noticed this.) a. I build a 2^{80} multicollision on h(m) using Joux' attack, requiring 80*2^{80} work. b. I now have 2^{80} different messages which are being hashed with the same IV. I expect one pair of them to give me a collision. >Cheers, >Ivan. Comments? --John
RE: Compression theory reference?
On Tue, Aug 31, 2004 at 02:48:00PM +0200, Hadmut Danisch wrote: > It can be easily shown that there is no lossless > compression method which can effectively compress every possible > input. Even more simply, if such a method existed, you could recursively apply it to its output and compress every message as one bit.
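The recursion point can also be seen empirically with a real compressor standing in for the hypothetical one (a sketch of mine; zlib is not the method under dispute): the first pass shrinks the redundant input, but iterating on the already-compressed output makes it grow, because no lossless method compresses its own high-entropy output.

```python
# Iterating a real compressor: it shrinks once, then expands.
import zlib

data = b'aaaa' * 1000        # 4000 highly redundant bytes
sizes = []
for _ in range(5):
    data = zlib.compress(data)
    sizes.append(len(data))
print(sizes)  # first size is tiny; later sizes creep upward
```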
Re: Compression theory reference?
On Wed, Sep 01, 2004 at 04:02:02PM +1200, Peter Gutmann wrote: > > comp.compression FAQ, probably question #1 given the number of times this > comes up in the newsgroup. > > (I've just checked, it's question #9 in part 1. Question #73 in part 2 may > also be useful). Thanks, that's a pretty good hint, especially because it contains an explicit statement, and it's an FAQ, making it easy to show, that the university's claim is not just wrong, but silly. :-) regards Hadmut
More problems with hash functions
Jerrold Leichter wrote:
>Joux's attack says: Find single block messages M1 and M1' that collide on
>the "blank initial state". Now find messages M2 and M2' that collide with
>the (common) final state from M1 and M1'. Then you have four 2-block
>collisions for the cost of two: M1|M2, M1'|M2, and so on.
>
>But even a simple XOR breaks this. M1 and M1' have the same hash H, but the
>state being passed is now very different in the two cases. So M1|M2 and
>M1'|M2 are completely different: Both started the second step with the
>compression function in state H, but the first compressed M1 XOR M2, and
>the second M1' XOR M2.

Doesn't work, alas. A trivial adjustment to Joux's attack also defeats your proposal. Suppose M1 and M1' collide on the "blank initial state". Let M2 be arbitrary. Then M1|M2 and M1'|M2 collide, and the final state after processing them is the same in both cases. Now find messages M3 and M3' that collide if processed starting from that common state. Then you have four 3-block collisions for the cost of two: M1|M2|M3, M1'|M2|M3, etc. With this small tweak, Joux's attack will go through. So, your scheme offers little or no additional resistance to Joux's attack.
Re: Compression theory reference?
Hadmut Danisch <[EMAIL PROTECTED]> writes: >I need a literature reference for a simple problem of encoding/compression >theory: comp.compression FAQ, probably question #1 given the number of times this comes up in the newsgroup. (I've just checked, it's question #9 in part 1. Question #73 in part 2 may also be useful). Peter.
Hashes, splints, and PRNGs
I'm really enjoying the current discussion about hash constructions and splints for current algorithms. I will make one observation in that discussion, which is that the proposal for a Hnew (2n -> n) seems a little beyond the scope of a "field splint" that can be done using existing tools and implementations, and so I prefer some of the other proposals that don't require this new element.

The discussion also gives me the excuse to ask a question I've been wanting to pose to this list for a long time, regarding the techniques used by the NetBSD /dev/random pseudo-device. Although designed and used for different purposes, the discussion here in the past few days has similarities to that construction. I hope that by raising it, I can both add something to the discussion, and also get feedback from this esteemed company on its general worthiness for its original purposes.

I've attached the source code for the core part of the accumulator and output function, which is reasonably well commented. (I hope; the comments are largely mine, the code is very largely not). The general construction is an entropy "pool" (equivalent to the state passed from block to block in the diagrams we've been using to talk about Joux' attack) into which data (input blocks of 32-bit words) are mixed. The first observation here is that the state path between blocks is very much wider than the blocksize of either input or output, as desired.

The input mixing is done by modelling the state array as 32 parallel LFSR PRNGs, one per bit-plane, in combination with a rotation of the input and output around each of the LFSRs (so that none of them runs for many cycles and becomes vulnerable to the well-known limitations of LFSRs run too long).

In the normal usage of this device, the input data is timing information from various device drivers (disks, keyboards, etc). It can also include user-supplied data written to /dev/random, and so collision-resistance does become at least potentially an issue.
Output is done by taking a SHA-1 hash of the whole state array, and then folding it in half (XOR) before giving it to the user. (The unfolded SHA-1 result is also mixed back into the pool as above, so that successive calls produce different results as a PRNG.) Again, this is reminiscent of using an n-bit hash to claim only n/2 bits of security, though obviously this usage originated out of different concerns.

So, looking at this in the context of present discussions, and pretending this construct is used purely as a (hybrid) hash function, it has the following characteristics:

- 32-bit input block
- 80-bit output block
- wide path (4096 bits in default configuration) between blocks
- Hnew = TT ( H2 (H1 (M) ))
- probably only effective for messages longer than some multiple of the internal state size

Joux' arguments certainly apply to this construction (which was in any case designed to preserve entropy rather than for collision resistance against wholly-known different inputs). If collisions can be found for H1, then H2 adds nothing (its purpose is whitening of the internal state, in the original usage).

Anyway, some questions for the floor:

- How much relevance and similarity is there between PRNG-style constructs used this way and hash compression functions? Normally, a PRNG is designed to produce many bits from few, and a hash the reverse. However, using the PRNG this way tries to produce many state bits to carry forward, rather than output bits.

- If contemplating a construct such as this as a hash-like function, any hope of Joux-resistance relies on the large internal state. How could this be made stronger? One thought would be to clock the output function (a SHA-1 of the pool) and mix that back in every N inputs - or is this just needless work?

- Is the whole thing pointless (for this purpose) and does it just amount to a SHA-1 of TT(M) if M is wholly-known?
- Finally (and of most interest to me practically), can anyone see a problem with the construct for its intended usage as a PRNG. In particular, can anyone see a way to feed/flood it with known input such that the output becomes weaker (if M is partially known or partially chosen)? -- Dan. PS: if looking at the code, please ignore the "entropy count" parts; they work with an estimator that implements the (non-)blocking nature of /dev/(u)random, but this is known to be highly arbitrary and (hopefully) overly conservative. /* $NetBSD: rndpool.c,v 1.17 2002/11/10 03:29:00 thorpej Exp $*/ /*- * Copyright (c) 1997 The NetBSD Foundation, Inc. * All rights reserved. * * This code is derived from software contributed to The NetBSD Foundation * by Michael Graff <[EMAIL PROTECTED]>. This code uses ideas and * algorithms from the Linux driver written by Ted Ts'o. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * a
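The fold-and-output step described earlier (SHA-1 the pool, XOR the two halves, return 80 bits) can be sketched as follows. The function name is mine, and the 512-byte pool matches the 4096-bit default configuration mentioned above:

```python
# Fold a SHA-1 of the pool in half before releasing it to the caller.
import hashlib

def folded_output(pool: bytes) -> bytes:
    d = hashlib.sha1(pool).digest()                    # 20 bytes
    return bytes(x ^ y for x, y in zip(d[:10], d[10:]))  # XOR halves -> 10 bytes

pool = b'\x00' * 512   # 4096-bit pool, all zeros for illustration
out = folded_output(pool)
print(len(out) * 8)  # 80
```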
Re: Compression theory reference?
I wrote: 4) Don't forget the _recursion_ argument. Take their favorite algorithm (call it XX). If their claims are correct, XX should be able to compress _anything_. That is, the output of XX should _always_ be at least one bit shorter than the input. Then the compound operation XX(XX(...)) should produce something two bits shorter than the original input. If you start with a N-bit message and apply the XX function N-1 times, you should be able to compress each and every message down to a single bit. Matt Crawford wrote: Plus a string of log(N) bits telling you how many times to apply the decompression function! Uh-oh, now goes over the judge's head ... Actually you don't need to adjoin log(N) bits. But perhaps my assertion would benefit from some clarification. I emphasize that I am only discussing messages of length N, where N is some pre-chosen number. For concreteness, let's choose N=10. I repeat my assertion that _if_ XX can compress any string, shortening it by one bit, and _if_ you know that the original messages each have exactly 10 bits, _then_ any 10-bit message can be compressed down to a single bit. I have proved that XX is ridiculous in this one case. My function YY := XX^9 is less general than XX. XX works on any input, whereas YY by its definition only applies to 10-bit messages. The fact remains that we have a proof by contradiction. We assume by way of hypothesis that the bad-guys are right, namely that XX exists and has the properties they assign to it. Then I can construct YY. But YY is ridiculous, through no fault of mine. Ergo the bad guys are wrong, i.e. no such XX can exist. With a little more work I could construct a more powerful and/or more general version of YY ... but that would be doing more work than is required. Their XX stuck its neck out; it is not required for my YY to stick its neck out in the same way. The requirement, as I understood it, was to prove the bad guys wrong. Well, the bad guys have been proved wrong. 
If something more is required, please explain the requirements in more detail. (BTW I suppose it would be better to call this the 'iterated composition' argument rather than the recursion argument.) Hadmut wrote: I found some outside Germany. But they didn't give me a paper with signature, just e-mails. Will see whether the court will accept that. Ask your legal advisor. In the US, getting such emails admitted as evidence would be problematic. Each jurisdiction (I assume) has its own standards for how affidavits should be prepared. Figure out the rules, and play by the rules.
Re: Compression theory reference?
On Tue, Aug 31, 2004 at 04:56:25PM -0400, John Denker wrote:
> 4) Don't forget the _recursion_ argument. Take their favorite
> algorithm (call it XX). If their claims are correct, XX should
> be able to compress _anything_. That is, the output of XX
> should _always_ be at least one bit shorter than the input.
> Then the compound operation XX(XX(...)) should produce something
> two bits shorter than the original input. If you start with a
> N-bit message and apply the XX function N-1 times, you should be
> able to compress each and every message down to a single bit.

This proof as written (the theorem is still true of course) relies on the algorithm always compressing, rather than never expanding. It is much simpler as a result; is there an equally simple argument to prove that all non-expanding codes never compress?

Note that it is possible to turn any compressor into one whose expansion is at most one extra bit: if F(x) is shorter than x by at least one bit, output 0|F(x); if F(x) is the same length as x or longer, output 1|x. So we can lose 1 bit of efficiency in compressed strings to gain at most 1 bit of overhead in uncompressed strings.

-- 
Victor Duchovni
IT Security, Morgan Stanley

NOTICE: If received in error, please destroy and notify sender. Sender does not waive confidentiality or privilege, and use is prohibited.
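The escape-bit construction above can be sketched directly. Bit strings are modelled as Python text of '0'/'1' characters, and F here is a toy stand-in (it only shrinks strings that start with '0'), since any real compressor would do; the point is that the wrapper expands by at most one bit and stays reversible.

```python
# Escape-bit wrapper: prefix '0' when F shrinks the input, else '1' + input.
def F(bits: str) -> str:
    # Toy "compressor": only strings starting with '0' get shorter.
    return bits[1:] if bits.startswith('0') else bits

def wrap(bits: str) -> str:
    c = F(bits)
    return '0' + c if len(c) < len(bits) else '1' + bits

def unwrap(coded: str) -> str:
    if coded[0] == '0':
        return '0' + coded[1:]   # invert the toy F
    return coded[1:]             # raw input follows the '1' escape bit

for m in ('0110', '1011', '0000', '1'):
    assert unwrap(wrap(m)) == m
    assert len(wrap(m)) <= len(m) + 1
print('ok')
```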
Re: Compression theory reference?
On Tue, Aug 31, 2004 at 05:07:30PM -0500, Matt Crawford wrote:

> Plus a string of log(N) bits telling you how many times to apply the
> decompression function!
> Uh-oh, now goes over the judge's head ...

Yeah, I just posted a lengthy description of why I think that this
counterexample is not a counterexample.

The problem is that if you ask for a string of log(N) bits, then someone
else could take this as a proof that the scheme actually works, because
a string of log(N) bits is obviously shorter than the message of N bits,
thus the compression scheme is working. Hooray!

The problem is that the number of iterations is not on the order of N,
but on the order of 2^N, so it takes log2(about 2^N) = about N bits to
store the number of iterations. The recursion converts a message of N
bits into a message of one or zero bits (to your taste), *and* a number
which takes about N bits to store. Nothing is won. But try to prove
that. This recursion game is far more complicated than it appears to be.

Note also that storing a number in reality takes more than log(N) bits.
Why? Because you don't know N in advance. We don't have any limit on the
message length, so your counting register theoretically needs infinitely
many bits. When you're finished you know how many bits your number took,
but storing the number then needs an end symbol or a tristate bit
(0, 1, void).

That's a common mistake. When determining the compression rate for a
file, people often forget that some information is not stored in the
file itself, but in the file system, e.g. the file length (telling you
where the compressed data stops) and the file name (telling you that the
file was compressed). That's basically the same problem.

thanks and regards
Hadmut
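The claim that the iteration count needs about N bits, not log(N), can be checked with a little arithmetic. Suppose the one-step "compressor" walks backward through all nonempty bitstrings in shortlex order (0, 1, 00, 01, 10, 11, 000, ...); then the number of steps from an N-bit message down to '0' is just the message's index in that enumeration, which is on the order of 2^N. The `rank` helper below is a hypothetical illustration, not something from the thread.

```python
def rank(s):
    # shortlex index of a nonempty bitstring, with '0' at index 0.
    # Strings shorter than len(s) occupy 2 + 4 + ... + 2**(len(s)-1)
    # = 2**len(s) - 2 positions; int(s, 2) strings of the same length precede s.
    return (2 ** len(s) - 2) + int(s, 2)

msg = '1' * 20                    # a 20-bit message
steps = rank(msg)                 # iterations needed to "compress" msg to '0'
bits_for_count = steps.bit_length()   # ~21 bits: about as long as msg itself
```

So the step count for a 20-bit message is 2^21 - 3, and writing it down takes 21 bits; the "compressed" single bit plus the count is no shorter than the original message, exactly as argued above.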
Re: Compression theory reference?
On Tue, Aug 31, 2004 at 04:56:25PM -0400, John Denker wrote:

> 4) Don't forget the _recursion_ argument. Take their favorite
> algorithm (call it XX). If their claims are correct, XX should
> be able to compress _anything_. That is, the output of XX
> should _always_ be at least one bit shorter than the input.
> Then the compound operation XX(XX(...)) should produce something
> two bits shorter than the original input. If you start with a
> N-bit message and apply the XX function N-1 times, you should be
> able to compress each and every message down to a single bit.

I have often heard that example and I used it myself several times. I do
not use it anymore, because it is not that easy. There's a major flaw in
this example, and therefore it is not really a counterexample. If the
faculty found that flaw, I'd be in a bad position.

You could define some reversible but bizarre function f that does
exactly that job, i.e. for a given message m you apply f again and
again, and after some finite number of iterations you'll find that

  f(...f(m)...) = x   where x = 0 or 1

So this is possible. Since the function is reversible, call the inverse
function f', and you'll find

  m = f'(...f'(x)...)   where x is still 0 or 1

Oops. What happened? Why does this work? Because the commonly used
counterexample has a flaw: you can invert f(...f(m)...) only if you
count the number of times you applied f. You need to know that number in
order to revert (= decompress) it, because you must apply f' exactly
that many times; you don't have any other stop condition. Applying f' is
not a proper recursion, it's an iteration. So your information is
actually stored in this number, not in the 0 or 1. The output of the
compression function is not 0 or 1, it is that hidden number telling how
often you need to apply f to reach 0 or 1.
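Such a reversible f is easy to exhibit. As a hypothetical sketch (the `pred`/`succ` construction below is an illustration, not from the original thread): let f step backward through all nonempty bitstrings in shortlex order, so its inverse f' steps forward. Iterating f shrinks any message to '0', and iterating f' exactly as many times recovers it, so the information really lives in the iteration count.

```python
def pred(s):
    # f: shortlex predecessor, stepping backward through 0, 1, 00, 01, 10, 11, ...
    v = int(s, 2)
    if v > 0:
        return format(v - 1, '0%db' % len(s))
    if len(s) == 1:
        return s                      # '0' is the fixed point
    return '1' * (len(s) - 1)         # e.g. '0000' -> '111'

def succ(s):
    # f': shortlex successor, the exact inverse of pred
    v = int(s, 2)
    if v < 2 ** len(s) - 1:
        return format(v + 1, '0%db' % len(s))
    return '0' * (len(s) + 1)         # e.g. '11' -> '000'

m = '1011'
count, s = 0, m
while s != '0':                       # "compress" m down to a single bit
    s = pred(s)
    count += 1

# decompression: apply succ exactly `count` times -- without knowing
# `count` there is no stop condition, which is precisely the flaw
t = '0'
for _ in range(count):
    t = succ(t)
```

Here `t == m` again after exactly `count` (= 25, for this m) applications of `succ`, and 25 takes 5 bits to write down, more than the 1 bit the "compressed" output pretends to be.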
So simply offering the contradiction that there cannot be such a
function because it could be recursively applied to yield a single bit
is insufficient; it is not a contradiction. You need to consider the
recursion depth and the inversion. But then it gets so complicated that
most people don't understand it anymore. And the argument that reaching
a single bit recursively is a contradiction is gone: you need to store a
number. So what? Who says that this number isn't shorter than the plain
message? And suddenly you're back deep in theory.

That's why I don't like that example. It's convincing at first glance,
but I don't believe that it is correct.

> 1) Get a few "expert witnesses" to certify that your position is
> certainly not a personal fantasy. (I assume German jurisprudence
> has something akin to the US notion of expert witnesses, right?)

I did. Unfortunately I didn't find a German one, because it is very
difficult to find a German professor willing to testify against any
other. It's a tight community. I found some outside Germany, but they
didn't give me a paper with a signature, just e-mails. We'll see whether
the court will accept that.

I've sent those e-mails to the dean of the faculty of computer science
to convince him that the faculty is wrong. As a result, he configured
the mail relay of the faculty to drop any e-mail containing my last name
anywhere in the header. It's ridiculous, and I would laugh if it weren't
exactly the faculty that's said to be the best German faculty of
computer science.

> 2) Try to get the burden-of-proof reversed.

Very difficult. I have meanwhile become an expert in German examination
law, and it usually requires the examinee to prove that the examiner's
opinion is wrong. But since I have already proven several times that the
university was lying intentionally to the court, they might take that
into consideration. After all, I have brought this forward, and I have
done my duty. Now it should be up to the university to respond.
They haven't commented for more than four years now.

> The opposition are claiming that known algorithms do the job. Get
> them to be specific about which algorithms they are referring to.
> Then file with the court some disks, each containing 1.44 megabytes
> of data.

They say LZW and MTF. I have already given an example for LZW. They
don't care. I've told them to take any random string from /dev/random
under Linux. They don't care. The German principle is that a faculty is
always right by definition.

> 3) Diagram the pigeon-hole argument for the judge. See
> diagrams below.

I'll try that. Thanks.

regards
Hadmut
Re: Compression theory reference?
On Aug 31, 2004, at 15:56, John Denker wrote:

> 4) Don't forget the _recursion_ argument. Take their favorite
> algorithm (call it XX). If their claims are correct, XX should
> be able to compress _anything_. That is, the output of XX
> should _always_ be at least one bit shorter than the input.
> Then the compound operation XX(XX(...)) should produce something
> two bits shorter than the original input. If you start with a
> N-bit message and apply the XX function N-1 times, you should be
> able to compress each and every message down to a single bit.

Plus a string of log(N) bits telling you how many times to apply the
decompression function!

Uh-oh, now it goes over the judge's head ...
Re: Compression theory reference?
On Aug 31, 2004, at 14:50, Victor Duchovni wrote:

> This is a question in elementary finite set theory, not computer
> science (whatever that is :-). All proofs will involve some
> mathematics. The above is I think simpler than your original argument
> and is the simplest I can come up with...

I think Hadmut was looking for an authority that would be respected by
the CS department he is dealing with. It's a sad state of affairs when
they will accept authority over proof.

However, I can give what I think is a simpler proof, using only high
school math. Assume that some invertible function f() turns no input
message into a longer output message. We can prove that it also does
not make any message *shorter*, and hence is not a "compression"
function after all.

In particular, f() turns every one-bit message into a one-bit message.
Suppose f() preserves the length of all n-bit messages, for 1 <= n <= N.
(This is already the case for N = 1.) What does f() do to a message M of
length N+1 bits? By assumption, f(M) is not N+2 bits or longer. But all
output messages of N bits or less are the result of some input of N bits
or less, and hence cannot be f(M). So by elimination, f(M) is N+1 bits
long. By mathematical induction, f() preserves the length of every
message.
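The induction above can also be checked exhaustively for a small universe. The brute-force sketch below (my illustration, not from the thread) enumerates every function on the six bitstrings of length one or two, keeps only those that are invertible and never lengthen a string, and confirms that every survivor preserves every length exactly, just as the proof predicts.

```python
from itertools import product

short = ['0', '1']
long_ = ['00', '01', '10', '11']
universe = short + long_

length_preserving = 0
for images in product(universe, repeat=len(universe)):
    f = dict(zip(universe, images))
    if len(set(images)) != len(universe):          # f must be injective (invertible)
        continue
    if any(len(f[x]) > len(x) for x in universe):  # f must never expand
        continue
    # every f passing both filters turns out to preserve each length exactly
    assert all(len(f[x]) == len(x) for x in universe)
    length_preserving += 1
```

The count comes out to 48: a permutation of the two one-bit strings (2 choices) paired with a permutation of the four two-bit strings (24 choices), and not a single genuinely "compressing" map among them.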
Re: Compression theory reference?
From: Hadmut Danisch <[EMAIL PROTECTED]>

> It can be easily shown that there is no lossless compression method
> which can effectively compress every possible input.
>
> Therefore, I need a book about computer science or encoding theory
> which explicitly says that this is impossible, in a way that a person
> inexperienced in computer science (judges at a court) can read and at
> least see that this is not just my personal fantasy but at least
> somehow reasonable.

Section 20.4 of:

  Press, Teukolsky, Vetterling & Flannery,
  "Numerical Recipes in Fortran", 2nd ed. (1992),
  Cambridge University Press, ISBN 0-521-43064-X

makes simple reading to this reader, whose minimal computer science
education was incidental to a physics course. There are related
"Numerical Recipes" books using other computer languages.

The first paragraph on compression is (with *italics*):

  "A lossless data compression algorithm takes a string of symbols
  (typically ASCII characters or bytes) and translates it *reversibly*
  into another string, one that is *on the average* of shorter length.
  The words 'on the average' are crucial; it is obvious that no
  reversible algorithm can make all strings shorter - there just aren't
  enough short strings to be in one-to-one correspondence with longer
  strings. Compression algorithms are possible only when, on the input
  side, some strings, or some input symbols, are more common than
  others. These can then be encoded in fewer bits than rarer input
  strings or symbols, giving a net average gain."

There are 10.5 pages on compression and some further references I have
not read.