On 04/30/2013 08:05 AM, Angel L. Mateo wrote:
On 30/04/13 03:28, Tim Groeneveld wrote:
Hi Guys,
I am wondering about mail deduplication. I am looking into the possibility
of separating out all of the message bodies with multiple parts inside mail
that is received from `dovecot` and hashing them all.
The idea is that by hashing all of the parts inside the email, I will be
able to ensure that each part of the email will only be saved once.
This means that attachments & common parts of the body will only be
saved once inside the storage.
How achievable would this be with the current state of dovecot? Would it
even be worth doing?
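
The idea described above can be sketched in a few lines of Python. This is only an illustration of per-part hashing, not how dovecot implements it: the `store` dict stands in for the storage backend, `build_message` is a hypothetical helper used to create demo mail, and SHA-1 is an assumed choice of hash.

```python
import hashlib
from email import message_from_bytes
from email.message import EmailMessage


def store_parts(raw_message: bytes, store: dict) -> list:
    """Hash every leaf MIME part and keep one copy per digest.

    Returns the digests that make up the message, in order.
    """
    msg = message_from_bytes(raw_message)
    digests = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        digest = hashlib.sha1(payload).hexdigest()
        store.setdefault(digest, payload)  # saved only the first time it is seen
        digests.append(digest)
    return digests


def build_message(body: str, attachment: bytes) -> bytes:
    """Hypothetical helper: build a body-plus-attachment message for the demo."""
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content(body)
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename="report.bin")
    return msg.as_bytes()


store = {}  # stand-in for the real storage: digest -> part content
shared = b"\x00\x01" * 1024
first = store_parts(build_message("Hello Alice", shared), store)
second = store_parts(build_message("Hello Bob", shared), store)
```

Both messages carry the same attachment, so its digest appears in `first` and `second` while `store` holds the content only once; the two different bodies are stored separately.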
I asked the same question recently. As Timo responded at
http://kevat.dovecot.org/list/dovecot/2013-March/089072.html, it seems
that this feature is production-stable in recent versions of dovecot.
And I think it is worth it. My estimates (based on only about 10 users
in my organization, so they are not accurate) are that you can save more
than 30% of total mail storage.
To configure it, you need to set these options:
* mail_attachment_dir
* mail_attachment_min_size
* mail_attachment_fs
* mail_attachment_hash
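
As a rough illustration, the four options might be set along these lines. The directory path is an assumption made up for this example, and the other three values are what I believe the commented defaults in the dovecot 2.x example configuration look like, so check them against the documentation for your version before relying on them.

```
# Assumed location; any directory writable by the mail processes should do.
mail_attachment_dir = /var/vmail/attachments

# Attachments smaller than this are left inside the message itself.
mail_attachment_min_size = 128k

# "sis posix" enables single-instance storage on a POSIX filesystem.
mail_attachment_fs = sis posix

# Hash used to name (and therefore deduplicate) stored attachments.
mail_attachment_hash = %{sha1}
```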
Hello,
Is it just working, or is it working in an optimal way? Back in October
2011 we noticed that the deduplication wasn't working as well as we
expected, as some files weren't properly deduplicated
(http://markmail.org/message/ymfdwng7un2mj26z). Timo, did you ever hit
that bug, and did you fix it if there was anything to fix on your side?
Since we are very interested in this feature, I am very eager to hear
from admins using it at a similar scale (around 80,000 mailboxes).
Thanks,
Arnaud
--
Arnaud Abélard (jabber: arnaud.abel...@univ-nantes.fr)
System Administrator - Web Services Manager
Information Systems Department
Université de Nantes
-
do not use: trapem...@univ-nantes.fr