severity 184057 serious severity 187417 serious merge 187417 184057 thanks On Thu, Apr 03, 2003 at 11:07:16AM +0200, Falk Hueffner wrote: > > Ah, the gzip killer bug. Works with -3, fails with -4 - where the > > --rsyncable patch was introduced. However, StevenK claimed he couldn't > > reproduce it, so I didn't file a bug about it. > > Bug needs to be filed on gzip about the 'gzip killer' .bdf. > Already reported 3 weeks ago as #184057.
So, here's the deal. On alpha, this bug is reproducible when compiled with gcc-3.2 at any optimisation, but not reproducible with gcc-2.95 at -O2. When -DDEBUG is enabled, the assertion is triggered on alpha with gcc-2.95 and gcc-3.2. With -DDEBUG enabled, the assertion is also triggered on powerpc. The problem appears to be that the checks in deflate_fast() and deflate(), namely: if (hash_head != NIL && strstart - hash_head <= MAX_DIST) { and if (hash_head != NIL && prev_length < max_lazy_match && strstart - hash_head <= MAX_DIST) { preceeding calls to longest_match() do not actually ensure that the assertion: Assert(strstart <= window_size-MIN_LOOKAHEAD, "insufficient lookahead"); in longest_match() actually passes. I don't really understand what's going on exactly, but the thoughtless solution of adding the extra check from the assertion explicitly seems to work (and, afaics, should work). The two tests (strstart <= window_size - MIN_LOOKAHEAD, and strstart - hash_head <= MAX_DIST) are equivalent when hash_head > WSIZE, but there's no particular reason for that to be true, that I can see. I don't think this can result in corrupted data, and while a buffer is overflown I think it's only by reading, so apart from the segfault I don't _think_ there are any problems caused by this bug. I'm not really sure though. The patch looks like: --- gzip-1.3.5/deflate.c 2003-04-03 21:51:36.000000000 +1000 +++ gzip-1.3.5-aj/deflate.c 2003-04-03 21:56:38.000000000 +1000 @@ -643,7 +643,8 @@ /* Find the longest match, discarding those <= prev_length. * At this point we have always match_length < MIN_MATCH */ - if (hash_head != NIL && strstart - hash_head <= MAX_DIST) { + if (hash_head != NIL && strstart - hash_head <= MAX_DIST && + strstart <= window_size - MIN_LOOKAHEAD) { /* To simplify the code, we prevent matches with the string * of window index 0 (in particular we have to avoid a match * of the string with itself at the start of the input file). @@ -737,7 +738,8 @@ match_length = MIN_MATCH-1; if (hash_head != NIL && prev_length < max_lazy_match && - strstart - hash_head <= MAX_DIST) { + strstart - hash_head <= MAX_DIST && + strstart <= window_size - MIN_LOOKAHEAD) { /* To simplify the code, we prevent matches with the string * of window index 0 (in particular we have to avoid a match * of the string with itself at the start of the input file). As it stands, this patch decreases the effectiveness of gzip's deflate implementation by, I guess, up to 258 bytes per file. For comparison: $ echo `gzip <X | wc -c` `gzip <X | md5sum | cut -d\ -f1` 11912 4d02d6c8ee27d64a2cb773ad5cf9a086 $ echo `./gzip <X | wc -c` `./gzip <X | md5sum | cut -d\ -f1` 11981 499f30e71926490dd5404f9d38efeacd Both files decompresses correctly, of course. For files not affected by this bug, the output is exactly the same: $ echo `gzip </etc/motd | wc -c` `gzip </etc/motd | md5sum | cut -d\ -f1` 277 e73b4f30720cf2a29c2b774078237ea4 $ echo `./gzip </etc/motd | wc -c` `./gzip </etc/motd | md5sum | cut -d\ -f1` 277 e73b4f30720cf2a29c2b774078237ea4 Note that i386 has an assembly implementation of longest_match() which may or may not have the bug. Cheers, aj -- Anthony Towns <[EMAIL PROTECTED]> <http://azure.humbug.org.au/~aj/> I don't speak for anyone save myself. GPG signed mail preferred. ``Dear Anthony Towns: [...] Congratulations -- you are now certified as a Red Hat Certified Engineer!''