> This is the first detailed description of the problem I've seen. I've heard
> it mentioned several times before, and thought that the md4 code in librsync
> was the same as in rsync. I've looked and tweaked the md4 code in librsync
> and could never see the bug so I thought it was a myth. I also thought that
> samba used this code.... I wonder what variant it is using :-)

Samba looks right to me. Anyhow, I looked at the archives and found this message, so I have simply rediscovered the same bug as Tridge:

http://www.mail-archive.com/rsync@lists.samba.org/msg03919.html

> > > The fix is easy: a couple of ">" checks should be ">=". I can send
> > > diffs if you want. But of course this can't be rolled in unless it
> > > is coupled with a bump in the protocol version.
> >
> > Another bump in the protocol version is no problem. Please submit a patch.
>
> I can submit patches if required for the md4code as tweaked/fixed for
> librsync. The fixed code is faster as well as correct :-)

Sure, that would be great. Otherwise, I would be happy to recreate and test a patch.

> > > email about fixing MD4 to handle files >= 512MB (I presume this
> > > relates to the 64-bit bit count in the final block). Perhaps this
> > > change can be made at the same time?
> >
> > Could you please post a reference to that email? It isn't familiar to me
> > and I didn't find it through google. There have been other problems we've
> > been seeing with the end of large files and zlib compression, though.
> > I wonder if it can somehow be related.
>
> It may not have been on the rsync list, but on the librsync list... Please
> note that there are several variants of the md4 patch floating around. I've
> been meaning to separate the latest md4 patch from my bigger librsync "delta
> refactor patch" for some time.

I must be spacing. I can't find the earlier post either. And I also can't find my original post in the archives...

Anyhow, the bug occurs in the file MD4 digest for file lengths >= 512MB. Step 2 in the RFC for the MD4 algorithm specifies that the lower 64 bits (not 32 bits) of the data's bit length are embedded in the tail buffer; see:

http://www.faqs.org/rfcs/rfc1186.html

Both librsync and rsync use a 32 bit unsigned int for counting the number of bytes processed.
This count is then multiplied by 8 (to get bits), and the result is embedded in the tail buffer when MD4 finishes up. So for files of 512MB or more (2^32 bits, i.e. 4 Gbits) the 32 bit unsigned int overflows. Again, a benign bug but a little disconcerting if you are using another program to check MD4 digests of large files.

Craig
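
For anyone who wants to see the arithmetic, here is a minimal sketch of the overflow described above. It is not the rsync or librsync code: the function names are hypothetical, and it assumes the buggy variant leaves the upper 32 bits of the length field at zero. It only models the 8-byte length field that MD4 writes at the end of the tail buffer (low-order word first, per the RFC).

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
MB = 1024 * 1024


def length_field_buggy(nbytes):
    """Hypothetical model of the bug: the byte count lives in a 32-bit
    unsigned int, is multiplied by 8 in 32-bit arithmetic, and the upper
    32 bits of the field are assumed to be left at zero."""
    count32 = nbytes & MASK32          # 32-bit byte counter
    bits32 = (count32 * 8) & MASK32    # the multiply wraps at 2**32 bits
    return struct.pack("<II", bits32, 0)


def length_field_rfc(nbytes):
    """Length field as the RFC describes it: the low 64 bits of the bit
    length, stored low-order word first."""
    bits64 = (nbytes * 8) & MASK64
    return struct.pack("<Q", bits64)


# The two fields agree below 512MB and differ from 512MB upward.
for size in (512 * MB - 1, 512 * MB, 600 * MB):
    print(size, length_field_buggy(size) == length_field_rfc(size))
```

Since the length field is part of the final block that gets hashed, any difference there changes the whole digest, which is why another MD4 program disagrees only on large files.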