I think you're missing a point here. Two different checksum algorithms are used in concert: the Adler-based one and the MD5 one. I SSE-optimized the Adler-based one, not MD5. The Adler-based hash is used to _find_ blocks that might have shifted, while the MD5 hash is a strong cryptographic hash used to _verify_ blocks and files. You wouldn't want to replace the MD5 hash with the Adler-based hash; they are of a different class. If you were to replace the MD5 hash with a different one, you'd replace it with one of the SHAs or even xxHash.
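
To illustrate the division of labour, here is a minimal Python sketch of the find-then-verify idea. It is not rsync's code: weak_checksum(), roll() and find_block() are hypothetical names, and the weak checksum is a simplified Adler-style sum rather than the exact arithmetic of get_checksum1(). The cheap checksum slides along the data one byte at a time, and MD5 is only computed when the cheap checksum matches.

```python
import hashlib


def weak_checksum(block: bytes) -> int:
    """Simplified Adler-style checksum: two 16-bit sums packed into 32 bits.

    Assumption: this only mimics the structure of a rolling checksum,
    it is not bit-compatible with rsync's get_checksum1().
    """
    n = len(block)
    s1 = sum(block) & 0xFFFF
    # First byte gets weight n, last byte gets weight 1.
    s2 = sum((n - i) * b for i, b in enumerate(block)) & 0xFFFF
    return (s2 << 16) | s1


def roll(checksum: int, old: int, new: int, block_len: int) -> int:
    """Slide the window one byte: drop `old`, append `new`, in O(1)."""
    s1 = checksum & 0xFFFF
    s2 = checksum >> 16
    s1 = (s1 - old + new) & 0xFFFF
    s2 = (s2 - block_len * old + s1) & 0xFFFF
    return (s2 << 16) | s1


def find_block(target: bytes, data: bytes) -> int:
    """Return the offset of `target` inside `data`, or -1 if absent."""
    n = len(target)
    if n == 0 or len(data) < n:
        return -1
    want_weak = weak_checksum(target)
    want_strong = hashlib.md5(target).digest()
    weak = weak_checksum(data[:n])
    for offset in range(len(data) - n + 1):
        # Cheap test first: the weak checksum only says "might match".
        if weak == want_weak:
            # Expensive test second: the strong hash confirms the match.
            if hashlib.md5(data[offset:offset + n]).digest() == want_strong:
                return offset
        if offset + n < len(data):
            weak = roll(weak, data[offset], data[offset + n], n)
    return -1


block = b"the quick brown fox"
shifted = b"XYZ" + block + b"!!"
print(find_block(block, shifted))
```

The weak checksum can collide, which is why it cannot stand in for MD5; conversely, MD5 has no cheap way to slide by one byte, which is why it is not used for the search.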

On Mon, May 18, 2020 at 6:21 PM Ben RUBSON via rsync <rsync@lists.samba.org> wrote:
>
> Thank you Jorrit for your detailed answer.
>
> > On 18 May 2020, at 17:58, Jorrit Jongma via rsync <rsync@lists.samba.org>
> > wrote:
> >
> > Well, don't get too excited, get_checksum1() (the function optimized
> > here) is not the great performance limiter in this case, it's
> > get_checksum2() and sum_update(), which will be using MD5.
>
> Certainly that all other functions using MD5 could be updated to use your
> SSE-optimized function.
> So that we have a full SSE MD5 support, wherever rsync is using it (basis
> file checksum, rolling checksum etc...).
>
> I think one nice performance improvement could be when the receiver checksums
> the (big/huge) basis file, because here the sender is then simply waiting...
>
> > Unfortunately, single stream MD5 cannot be effectively optimized with
> > SSE, at least I've not seen an SSE version faster than pure C
>
> I was about to tell you that we successfully implemented it into FreeBSD a
> few years ago, but it's CRC32, not MD5...
> https://github.com/freebsd/freebsd/commit/c4b27423f57c30068aff3f234c912ae8d9ff1b6a
> https://github.com/freebsd/freebsd/commit/5a798b035b4858923878c014a5faa48b2f9aa6e7
> At least sounds like the algorithm author / inspiration, Mark Adler, is the
> same :)
>
> Anyway, this is a first interesting SSE MD5 support.
> --
> Please use reply-all for most replies to avoid omitting the mailing list.
> To unsubscribe or change options:
> https://lists.samba.org/mailman/listinfo/rsync
> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html