Re: [Dovecot] Single instance storage - testing please
On 27.8.2010, at 2.52, Michael Orlitzky wrote:

>>> Won't files hashed with the old function begin to dupe though?
>>
>> You mean new hash would become a duplicate of the old? Well ..
>>
>> 1) It's highly unlikely to happen, especially because with the new
>> hash function there again shouldn't be a way to create any specific
>> hash.
>
> Oh, no, that's not what I meant.
>
> I mean, my friend sends me a video of two cats cuddling, and it gets MD5
> hashed and stored somewhere (it's the first instance of that file in my SIS).
> Tomorrow, I read the newspaper from 2005 explaining how MD5 is broken, and
> decide to switch my hash function to MD4 for safety reasons. A week later,
> another friend sends me the same video (it's REALLY cute). Doesn't the video
> get stored again?

Oh. Yeah, it gets duplicated once. I don't think that's a big deal.
Re: [Dovecot] Single instance storage - testing please
On 08/26/2010 09:38 PM, Timo Sirainen wrote:
> On 27.8.2010, at 2.24, Michael Orlitzky wrote:
>
>>>> Have a utility that updates all (or a subset) of them.
>>>
>>> That won't be necessary. Once the hash changes, the new files are
>>> created with new hash function and it doesn't matter if the old hash
>>> is broken because you can't generate new files with it anymore
>>> anyway.
>>
>> Won't files hashed with the old function begin to dupe though?
>
> You mean new hash would become a duplicate of the old? Well ..
>
> 1) It's highly unlikely to happen, especially because with the new
> hash function there again shouldn't be a way to create any specific
> hash.

Oh, no, that's not what I meant.

I mean, my friend sends me a video of two cats cuddling, and it gets MD5 hashed and stored somewhere (it's the first instance of that file in my SIS). Tomorrow, I read the newspaper from 2005 explaining how MD5 is broken, and decide to switch my hash function to MD4 for safety reasons. A week later, another friend sends me the same video (it's REALLY cute). Doesn't the video get stored again?
Re: [Dovecot] Single instance storage - testing please
On 27.8.2010, at 2.24, Michael Orlitzky wrote:

>>> Have a utility that updates all (or a subset) of them.
>>
>> That won't be necessary. Once the hash changes, the new files are
>> created with new hash function and it doesn't matter if the old hash
>> is broken because you can't generate new files with it anymore
>> anyway.
>>
>
> Won't files hashed with the old function begin to dupe though?

You mean new hash would become a duplicate of the old? Well ..

1) It's highly unlikely to happen, especially because with the new hash function there again shouldn't be a way to create any specific hash.

2) As long as byte-by-byte comparison is always done, collisions don't matter much anyway (if you can reliably reproduce them, that could lead to some kind of DoS by filling the filesystem, but again once hash function is changed this couldn't be done anymore).

3) The filename can be made different, making the collision impossible. Either because of different hash length or by manually adding some specific character there.
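To make points 2) and 3) concrete, here is a minimal Python sketch of hash-named storage with a byte-by-byte check. It is not Dovecot's code: the function name `store_attachment`, the `<algo>-<hexdigest>` file naming, and the numeric suffix used on a collision are all assumptions made for illustration.

```python
import hashlib
import os


def store_attachment(root, data, algo="sha256"):
    """Store data under root and return the path that holds it.

    Hypothetical layout: <root>/<algo>-<hexdigest>[-<n>]. Putting the
    algorithm name in the filename keeps names from different hash
    functions from ever clashing.
    """
    digest = hashlib.new(algo, data).hexdigest()
    base = f"{algo}-{digest}"
    n = 0
    while True:
        name = base if n == 0 else f"{base}-{n}"
        path = os.path.join(root, name)
        try:
            # O_EXCL makes creation fail instead of overwriting a file.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            with open(path, "rb") as existing:
                if existing.read() == data:
                    return path  # identical bytes: reuse the stored copy
            n += 1  # same hash, different bytes: try the next suffix
            continue
        # A real implementation would write to a temporary name first so
        # that readers never see a partially written file.
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        return path
```

Storing the same bytes twice with one algorithm returns the same path. Storing them again after switching `algo` creates exactly one more file, which is the "duplicated once" case discussed in this thread.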
Re: [Dovecot] Single instance storage - testing please
On 08/26/2010 09:00 PM, Timo Sirainen wrote:
> On 27.8.2010, at 1.52, Michael Orlitzky wrote:
>
>> On 08/26/2010 04:41 PM, Mike Abbott wrote:
>>>> 1. What hash algorithm to use?
>>>>
>>>> 2. Should I add support for trusting hash uniqueness
>>>
>>> Use two hash functions and concatenate the hashes. While both hash
>>> systems may eventually be hacked it is unlikely that hacking them
>>> will result in a targeted alias.
>>
>> Just make it possible to change the hash in the future.
>
> I'm thinking about mail_attachment_hash setting where you can
> configure it pretty much any way you want.
>
>> Have a utility that updates all (or a subset) of them.
>
> That won't be necessary. Once the hash changes, the new files are
> created with new hash function and it doesn't matter if the old hash
> is broken because you can't generate new files with it anymore
> anyway.

Won't files hashed with the old function begin to dupe though?
Re: [Dovecot] Single instance storage - testing please
On 27.8.2010, at 1.52, Michael Orlitzky wrote:

> On 08/26/2010 04:41 PM, Mike Abbott wrote:
>>> 1. What hash algorithm to use?
>>
>>> 2. Should I add support for trusting hash uniqueness
>>
>> Use two hash functions and concatenate the hashes. While both hash
>> systems may eventually be hacked it is unlikely that hacking them
>> will result in a targeted alias.
>
> Just make it possible to change the hash in the future.

I'm thinking about mail_attachment_hash setting where you can configure it pretty much any way you want.

> Have a utility that updates all (or a subset) of them.

That won't be necessary. Once the hash changes, the new files are created with new hash function and it doesn't matter if the old hash is broken because you can't generate new files with it anymore anyway.
Re: [Dovecot] Single instance storage - testing please
On 08/26/2010 04:41 PM, Mike Abbott wrote:
>> 1. What hash algorithm to use?
>> 2. Should I add support for trusting hash uniqueness
>
> Use two hash functions and concatenate the hashes. While both hash
> systems may eventually be hacked it is unlikely that hacking them
> will result in a targeted alias.

Just make it possible to change the hash in the future. Have a utility that updates all (or a subset) of them. If e.g. SHA256 is truly broken in the future, the utility can run overnight while I fix the million other emergencies that are about to exist in the morning.
Re: [Dovecot] Single instance storage - testing please
> 1. What hash algorithm to use?
> 2. Should I add support for trusting hash uniqueness

Use two hash functions and concatenate the hashes. While both hash systems may eventually be hacked it is unlikely that hacking them will result in a targeted alias.
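The concatenation idea can be sketched in a few lines of Python. The message does not name the two functions, so the SHA-1 plus SHA-256 pairing and the helper name `combined_digest` are assumptions chosen only for illustration.

```python
import hashlib


def combined_digest(data, algos=("sha1", "sha256")):
    """Return the hex digests of data under each algorithm, concatenated.

    An attacker would need one input that collides under every listed
    function at the same time to produce a matching combined value.
    """
    return "".join(hashlib.new(algo, data).hexdigest() for algo in algos)
```

With the default pair the result is 104 hex characters long (40 from SHA-1 followed by 64 from SHA-256).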
Re: [Dovecot] Single instance storage - testing please
On Thu, 2010-08-26 at 20:32 +0100, Timo Sirainen wrote:

> http://hg.dovecot.org/dovecot-2.0-sis contains the code for it.
> Otherwise it's the latest (as of writing this) dovecot-2.0 hg tree.
> Please test if you're interested in SIS. :)

One more point that I have to remember to mention once I write its wiki page: The attachment handling code is NFS safe, because it never modifies existing files. So there won't be problems with using director to distribute users to different servers and still all servers accessing the common attachment storage.

Another thing I just remembered: The code currently uses 0600 / 0700 permissions for everything. I guess it should take the permissions from /attachments directory and preserve them for all the subdirs/files.
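As an illustration of the two points above, the following Python sketch creates a file without ever modifying an existing one and derives the file mode from the containing directory. It assumes a POSIX filesystem with hard-link support; the function name `create_attachment`, the `.tmp-` prefix, and the "directory mode minus execute bits" rule are hypothetical choices, not a description of what Dovecot does.

```python
import os
import stat
import tempfile


def create_attachment(root, name, data):
    """Create root/name once; return False if it already exists.

    The data is written to a temporary file and then hard-linked to its
    final name. link() fails if the target exists, so a finished file is
    never overwritten or appended to.
    """
    dir_mode = stat.S_IMODE(os.stat(root).st_mode)
    file_mode = dir_mode & 0o666  # assumption: inherit rw bits, drop the rest
    final = os.path.join(root, name)
    fd, tmp = tempfile.mkstemp(dir=root, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
            out.flush()
            os.fsync(out.fileno())
        os.chmod(tmp, file_mode)
        try:
            os.link(tmp, final)  # atomic: fails if final already exists
            return True
        except FileExistsError:
            return False
    finally:
        os.unlink(tmp)  # the temporary name is always removed
```

For a directory with mode 0750, a newly created file ends up with mode 0640, and a second call with the same name leaves the first file's contents untouched.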