General question...

blob_exists does this - "SELECT id FROM dbmail_mimeparts WHERE hash=? AND
size=? AND blob=?"

Any reason why " AND blob=?" is also included, even though we're querying
based on hash and size? Is it to absolutely, 100% confirm that there's no
potential collision?

If you're running the database over the network, sending 10 MB over the wire
for a single part (most likely twice: once for the select and once for the
insert) seems to add quite a bit of unnecessary overhead. On top of that,
you're forcing the DBMS to load the entire blob from disk (pulling it into
cache) and compare the entire string.

Given that there are no known SHA-1 collisions (at the standard 80 rounds
anyway), nor any for SHA-2 (256/512), I think we'd be pretty safe to stop
sending the blob for comparison as well? I'm *slightly* more concerned about
MD5 and the other hashing algorithms, which I'm not familiar with.
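
To make the idea concrete, here is a rough sketch of the lookup without
" AND blob=?". It is only an illustration: Python with sqlite3 stands in for
the real C code and database, the four-column table is a simplified stand-in
for dbmail_mimeparts, SHA-256 is an assumed choice of hash, and make_demo_db,
store_blob and the verify flag are hypothetical names that do not exist in
DBMail.

```python
import hashlib
import sqlite3


def make_demo_db():
    # Simplified stand-in for dbmail_mimeparts (assumed layout, not the real schema).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dbmail_mimeparts ("
        "id INTEGER PRIMARY KEY, hash TEXT, size INTEGER, blob BLOB)"
    )
    return conn


def blob_exists(conn, blob, verify=False):
    """Return the id of a stored part matching `blob`, or None.

    Only the digest and the length are sent with the lookup. With
    verify=True (hypothetical option), the stored blob is fetched and
    compared client-side, which only costs a transfer on a candidate hit.
    """
    digest = hashlib.sha256(blob).hexdigest()  # assumed hash algorithm
    rows = conn.execute(
        "SELECT id FROM dbmail_mimeparts WHERE hash=? AND size=?",
        (digest, len(blob)),
    ).fetchall()
    for (part_id,) in rows:
        if not verify:
            return part_id
        stored = conn.execute(
            "SELECT blob FROM dbmail_mimeparts WHERE id=?", (part_id,)
        ).fetchone()[0]
        if stored == blob:
            return part_id
    return None


def store_blob(conn, blob):
    """Insert `blob` unless an identical part is already stored; return its id."""
    existing = blob_exists(conn, blob)
    if existing is not None:
        return existing
    digest = hashlib.sha256(blob).hexdigest()
    cur = conn.execute(
        "INSERT INTO dbmail_mimeparts (hash, size, blob) VALUES (?, ?, ?)",
        (digest, len(blob), blob),
    )
    return cur.lastrowid
```

With this shape the blob crosses the wire once, on insert, and never for the
existence check. Anyone who doesn't trust the configured hash could switch on
the client-side comparison, which would still avoid the transfer whenever
there is no candidate row.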

Thoughts appreciated.

Regards,

Chris Boulton
Lead Engineer
BigCommerce / Interspire

Web: http://www.bigcommerce.com
Web: http://www.interspire.com