Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Timo Sirainen
On 27.8.2010, at 2.52, Michael Orlitzky wrote:

>>> Won't files hashed with the old function begin to dupe though?
>> 
>> You mean new hash would become a duplicate of the old? Well ..
>> 
>> 1) It's highly unlikely to happen, especially because with the new
>> hash function there again shouldn't be a way to create any specific
>> hash.
> 
> Oh, no, that's not what I meant.
> 
> I mean, my friend sends me a video of two cats cuddling, and it gets MD5 
> hashed and stored somewhere (it's the first instance of that file in my SIS). 
> Tomorrow, I read the newspaper from 2005 explaining how MD5 is broken, and 
> decide to switch my hash function to MD4 for safety reasons. A week later, 
> another friend sends me the same video (it's REALLY cute). Doesn't the video 
> get stored again?

Oh. Yeah, it gets duplicated once. I don't think that's a big deal.
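
To illustrate the "duplicated once" behaviour, here is a minimal sketch, not Dovecot's actual code. It assumes attachments are stored under a name built from the hash algorithm and the digest. The in-memory `store` dict and the helper names are hypothetical, and MD5/SHA-1 stand in for the old and new hash functions.

```python
import hashlib

def attachment_name(data: bytes, algo: str) -> str:
    """Build a storage name from the algorithm and the content digest."""
    return f"{algo}-{hashlib.new(algo, data).hexdigest()}"

store = {}  # hypothetical stand-in for the attachment directory: name -> content

def save(data: bytes, algo: str) -> str:
    """Store data under its hash-derived name unless that name already exists."""
    name = attachment_name(data, algo)
    store.setdefault(name, data)
    return name

video = b"two cats cuddling"
save(video, "md5")    # first delivery, stored under the old hash
save(video, "sha1")   # after the switch: new name, so one more copy is stored
save(video, "sha1")   # later deliveries are deduplicated again

print(len(store))     # 2: the content is duplicated exactly once
```

Every delivery after the switch finds the copy stored under the new name, so the cost is bounded at one extra copy per distinct attachment.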



Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Michael Orlitzky

On 08/26/2010 09:38 PM, Timo Sirainen wrote:
> On 27.8.2010, at 2.24, Michael Orlitzky wrote:
>
>>>> Have a utility that updates all (or a subset) of them.
>>>
>>> That won't be necessary. Once the hash changes, the new files
>>> are created with new hash function and it doesn't matter if the
>>> old hash is broken because you can't generate new files with it
>>> anymore anyway.
>>
>> Won't files hashed with the old function begin to dupe though?
>
> You mean new hash would become a duplicate of the old? Well ..
>
> 1) It's highly unlikely to happen, especially because with the new
> hash function there again shouldn't be a way to create any specific
> hash.

Oh, no, that's not what I meant.

I mean, my friend sends me a video of two cats cuddling, and it gets MD5 
hashed and stored somewhere (it's the first instance of that file in my 
SIS). Tomorrow, I read the newspaper from 2005 explaining how MD5 is 
broken, and decide to switch my hash function to MD4 for safety reasons. 
A week later, another friend sends me the same video (it's REALLY cute). 
Doesn't the video get stored again?


Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Timo Sirainen
On 27.8.2010, at 2.24, Michael Orlitzky wrote:

>>> Have a utility that updates all (or a subset) of them.
>> 
>> That won't be necessary. Once the hash changes, the new files are
>> created with new hash function and it doesn't matter if the old hash
>> is broken because you can't generate new files with it anymore
>> anyway.
>> 
> 
> Won't files hashed with the old function begin to dupe though?

You mean new hash would become a duplicate of the old? Well ..

1) It's highly unlikely to happen, especially because with the new hash 
function there again shouldn't be a way to create any specific hash.

2) As long as byte-by-byte comparison is always done, collisions don't matter 
much anyway (if you can reliably reproduce them, that could lead to some kind 
of DoS by filling the filesystem, but again once hash function is changed this 
couldn't be done anymore).

3) The filename can be made different, making the collision impossible. Either 
because of different hash length or by manually adding some specific character 
there.
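
Points 2 and 3 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the dovecot-2.0-sis tree: the function name `store_attachment`, the `<algo>-<digest>-<n>` filename layout and the numeric suffix for colliding content are all hypothetical.

```python
import hashlib
import os

def store_attachment(root: str, data: bytes, algo: str = "sha1") -> str:
    """Store data once under root and return the path that holds it."""
    digest = hashlib.new(algo, data).hexdigest()
    suffix = 0
    while True:
        # Point 3: the algorithm name is part of the filename, so names
        # produced by different hash functions can never clash.
        path = os.path.join(root, f"{algo}-{digest}-{suffix}")
        try:
            with open(path, "rb") as f:
                # Point 2: never trust the hash alone, compare the bytes.
                # (A real implementation would compare in chunks.)
                if f.read() == data:
                    return path
        except FileNotFoundError:
            # Mode "x" refuses to overwrite, so existing files are never modified.
            with open(path, "xb") as f:
                f.write(data)
            return path
        # Same hash but different content: try the next suffix.
        suffix += 1
```

With the byte comparison in place, a deliberately crafted collision costs only an extra file, which matches the "DoS by filling the filesystem" worst case described above.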


Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Michael Orlitzky

On 08/26/2010 09:00 PM, Timo Sirainen wrote:
> On 27.8.2010, at 1.52, Michael Orlitzky wrote:
>
>> On 08/26/2010 04:41 PM, Mike Abbott wrote:
>>>> 1. What hash algorithm to use?
>>>
>>>> 2. Should I add support for trusting hash uniqueness
>>>
>>> Use two hash functions and concatenate the hashes.  While both
>>> hash systems may eventually be hacked it is unlikely that hacking
>>> them will result in a targeted alias.
>>
>> Just make it possible to change the hash in the future.
>
> I'm thinking about mail_attachment_hash setting where you can
> configure it pretty much any way you want.
>
>> Have a utility that updates all (or a subset) of them.
>
> That won't be necessary. Once the hash changes, the new files are
> created with new hash function and it doesn't matter if the old hash
> is broken because you can't generate new files with it anymore
> anyway.

Won't files hashed with the old function begin to dupe though?


Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Timo Sirainen
On 27.8.2010, at 1.52, Michael Orlitzky wrote:

> On 08/26/2010 04:41 PM, Mike Abbott wrote:
>>> 1. What hash algorithm to use?
>> 
>>> 2. Should I add support for trusting hash uniqueness
>> 
>> Use two hash functions and concatenate the hashes.  While both hash
>> systems may eventually be hacked it is unlikely that hacking them
>> will result in a targeted alias.
> 
> Just make it possible to change the hash in the future.

I'm thinking about mail_attachment_hash setting where you can configure it 
pretty much any way you want.

> Have a utility that updates all (or a subset) of them.

That won't be necessary. Once the hash changes, the new files are created with 
new hash function and it doesn't matter if the old hash is broken because you 
can't generate new files with it anymore anyway.



Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Michael Orlitzky

On 08/26/2010 04:41 PM, Mike Abbott wrote:
>> 1. What hash algorithm to use?
>
>> 2. Should I add support for trusting hash uniqueness
>
> Use two hash functions and concatenate the hashes.  While both hash
> systems may eventually be hacked it is unlikely that hacking them
> will result in a targeted alias.

Just make it possible to change the hash in the future. Have a utility 
that updates all (or a subset) of them.


If e.g. SHA256 is truly broken in the future, the utility can run 
overnight while I fix the million other emergencies that are about to 
exist in the morning.


Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Mike Abbott
> 1. What hash algorithm to use?

> 2. Should I add support for trusting hash uniqueness

Use two hash functions and concatenate the hashes.  While both hash systems may 
eventually be hacked it is unlikely that hacking them will result in a targeted 
alias.
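
A minimal sketch of the concatenation idea, assuming SHA-1 and SHA-256 as the two functions (the thread does not name a specific pair, and `combined_digest` is a hypothetical helper):

```python
import hashlib

def combined_digest(data: bytes) -> str:
    """Concatenate the hex digests of two different hash functions."""
    first = hashlib.sha1(data).hexdigest()     # 40 hex characters
    second = hashlib.sha256(data).hexdigest()  # 64 hex characters
    return first + second

print(len(combined_digest(b"two cats cuddling")))  # 104
```

To produce a targeted alias, an attacker would need one input that collides under both functions at once.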

Re: [Dovecot] Single instance storage - testing please

2010-08-26 Thread Timo Sirainen
On Thu, 2010-08-26 at 20:32 +0100, Timo Sirainen wrote:
> http://hg.dovecot.org/dovecot-2.0-sis contains the code for it.
> Otherwise it's the latest (as of writing this) dovecot-2.0 hg tree.
> Please test if you're interested in SIS. :)

One more point that I have to remember to mention once I write its wiki
page: The attachment handling code is NFS safe, because it never
modifies existing files. So there won't be any problems with using director
to distribute users to different servers while all servers still access
the common attachment storage.

Another thing I just remembered: The code currently uses 0600 / 0700
permissions for everything. I guess it should take the permissions
from the /attachments directory and preserve them for all the subdirs/files.
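
One way the permission inheritance could look, as a rough sketch only: the helper names and the one-level `subdir` layout are assumptions for illustration, not what the SIS code does. The mode of the attachment root is read once and applied to every subdirectory and file created below it.

```python
import os
import stat

def inherited_modes(root: str):
    """Return (dir_mode, file_mode) derived from the attachment root."""
    dir_mode = stat.S_IMODE(os.stat(root).st_mode)
    file_mode = dir_mode & 0o666  # files do not need the execute/search bits
    return dir_mode, file_mode

def create_attachment(root: str, subdir: str, name: str, data: bytes) -> str:
    """Create root/subdir/name with permissions copied from root."""
    dir_mode, file_mode = inherited_modes(root)
    dirpath = os.path.join(root, subdir)
    if not os.path.isdir(dirpath):
        os.mkdir(dirpath)
        os.chmod(dirpath, dir_mode)   # chmod is not filtered by the umask
    path = os.path.join(dirpath, name)
    # O_EXCL: fail rather than touch an existing file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, file_mode)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.chmod(path, file_mode)         # override whatever the umask removed
    return path
```

With a root directory set to 0750, for example, this would give 0750 subdirectories and 0640 files instead of the fixed 0700 / 0600.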