On Mon, 2010-07-19 at 09:01 -0700, Daniel L. Miller wrote:
> > The idea is to have dbox and mdbox support saving attachments (or MIME
> > parts in general) to separate files, which with some magic gives a
> > possibility to do single instance attachment storage. Comments welcome.
> >
> >
> YAAAY!!! Timo's gonna give us SIS!!!
>
> Is it done yet :) ?

Well, there was a "code status" at the bottom of the mail :)

> 1. You've already identified that enabling this feature needs to avoid
> introducing problems - including treating different-but-similar
> attachments as identical. In your hashing choices, you only mentioned
> attachment body. What about including size and date in the hash?

Attachments don't have dates. Size could be included as part of the
filename I guess.. Maybe it would even be a good idea..

> 2. You didn't explicitly define if SIS would be per-mailbox or
> system-wide. Speaking for myself, and probably a few others, I'll take
> whatever implementation I can get - but I'd love to see it system-wide.

System-wide. Of course permissions need to be properly set so all users
can access them.

> 3. Are you envisioning this as being handled totally within deliver, or
> would there be a server process for consolidating the messages? I'm
> wondering about the impact to high-traffic sites (which mine is
> thankfully NOT) - if deliver needs to crunch on large messages, could
> this lead to time-out issues from the MTA's?
>
> A possible alternative, have deliver write the message out as normal -
> but flag it for attachment processing. Then have a secondary process
> awakened to check for attachments and perform accordingly. So any SIS
> overhead becomes invisible to the MTA - other than needing available
> system resources for processing (and the attachment processing could be
> done at a lower priority).

Yeah, something like that would be possible. Or the attachment could
still be stored to the attachment storage using the
<hash>-<guid>[-<size>?] name and the daemon could then do the
deduplication by finding any new files and seeing if they could be
replaced with links to other existing files.
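
To make the daemon idea a bit more concrete, here is a rough Python sketch of such a deduplication pass. This is not Dovecot code, just an illustration under some assumptions: all attachment files sit in one flat directory, the part of the filename before the first "-" is the hash, and "links" means hard links. The function name dedup_attachment_dir is made up for the example. Files with the same hash prefix are compared byte by byte before linking, so different-but-similar attachments (or a hash collision) are never merged.

```python
import filecmp
import os


def dedup_attachment_dir(root):
    """Hard-link byte-identical files that share a hash prefix.

    Assumes names look like <hash>-<guid>[-<size>] and that the hash
    itself contains no "-". Returns how many files were replaced.
    """
    keepers = {}  # hash prefix -> paths already kept as distinct contents
    replaced = 0
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isfile(path) or "-" not in name:
            continue
        hash_part = name.split("-", 1)[0]
        for keeper in keepers.setdefault(hash_part, []):
            if os.path.samefile(keeper, path):
                break  # already a link to the same file, nothing to do
            if filecmp.cmp(keeper, path, shallow=False):
                # Link under a temporary name, then rename over the
                # duplicate, so the attachment name never goes missing.
                tmp = path + ".tmp"
                os.link(keeper, tmp)
                os.replace(tmp, path)
                replaced += 1
                break
        else:
            # No existing file has this content: keep it as a new original.
            keepers[hash_part].append(path)
    return replaced
```

Since every message keeps its own <hash>-<guid> name and only the underlying file is shared, deleting one message's attachment would just drop one link. A real daemon would also need locking against concurrent deliveries and would have to get the file permissions right, which this sketch ignores.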