2013/2/12 Bron Gondwana <[email protected]>:
> One of the perennial topics on #cyrus is "what about a more configurable set 
> of cached headers".
>

Indeed.

> As you can see, there are some normalised things from some headers.  The same 
> information normalised in a DIFFERENT way in the ENVELOPE and then a 
> BODYSTRUCTURE and a BODY response.

Yes, it's redundant.

>
> 1) keep the BODYSTRUCTURE, it's the result of parsing the entire message, and 
> can't be calculated cheaply again
> 2) keep the SECTION data (possibly along with the bodystructure) - it's the 
> offsets for the various parts of the message, same issue
> 3) add a list of "SUPPRESSED HEADERS".  This would list any header which is 
> present in the file, but NOT in the cache.
> 4) cache every other header, including all the To:, From:, Subject:, etc - in 
> as close to raw form as possible.
>
> The entire list of headers to suppress would initially be:
>
> received
> dkim-signature
> domainkey-signature
> domainkey-x509
>
> But it would be configurable as an imapd.conf option.
>
> NOTE: you can still infer the presence or absence just by querying the 
> suppressed list - so for many messages the entire suppressed list would 
> just be 'received'.
>
> This should take fairly similar space to what we have now, be more flexible, 
> and be more future-proof.

However, I think the cache file is already big today. It causes extra disk I/O.
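
To get a rough feel for the split in points 3) and 4) and what it would do
to the record size, here is a minimal sketch in Python. It is not Cyrus
code: the names split_headers, cached_size and DEFAULT_SUPPRESSED are
hypothetical, and the size estimate simply counts "Name: value" plus CRLF
rather than the real cache record layout.

```python
from email import message_from_string

# Initial suppressed list from the proposal. In the proposal this would be
# read from an imapd.conf option; here it is just a constant (assumption).
DEFAULT_SUPPRESSED = frozenset(
    ["received", "dkim-signature", "domainkey-signature", "domainkey-x509"]
)


def split_headers(raw_message, suppressed=DEFAULT_SUPPRESSED):
    """Split one message's headers into cached and suppressed parts.

    Returns (cached, suppressed_present):
      cached             -- list of (name, raw_value) for every header that
                            is not on the suppressed list, in file order
      suppressed_present -- sorted lowercase names of suppressed headers
                            that actually occur in this message
    """
    msg = message_from_string(raw_message)
    cached = []
    present = set()
    for name, value in msg.items():
        if name.lower() in suppressed:
            present.add(name.lower())
        else:
            cached.append((name, value))
    return cached, sorted(present)


def cached_size(cached):
    """Rough size in characters of the cached headers: 'Name: value' + CRLF."""
    return sum(len(name) + 2 + len(value) + 2 for name, value in cached)
```

Running split_headers over a sample of real messages and summing
cached_size would give a first estimate of how much bigger or smaller the
records get with a given suppressed list.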

> No matter how you want to parse the fields, the original values are what 
> you've got!  Even if you change the list of headers you suppress, each cache 
> record is complete in itself, so there's no loss of fidelity.
>
> It means a little more CPU to calculate the ENVELOPE, but seriously... I 
> don't think it's a worry in the current world, and it's not so commonly 
> requested anyway.

Completely agree.

> =====
>
> Thoughts?

Your proposal sounds good. It is quite close to Dovecot's current
behavior, according to its documentation:

>Cache file may contain the following information for messages:
>
>    Message headers (some, not all)
>    Sent date (parsed Date: header)
>    Received date (IMAP's INTERNALDATE field)
>    Physical and virtual message sizes
>    Message's parsed MIME structure, allowing to quickly read only a specific 
> MIME part (IMAP's FETCH BODY[1.2.3] command)
>    IMAP's BODY and BODYSTRUCTURE fields
>        If both are used, only BODYSTRUCTURE is saved, since BODY can be 
> generated from it
>    IMAP's ENVELOPE isn't cached currently. Instead the headers used to build 
> it are cached directly.

I also like the possibility of dropping old cached data that is no
longer needed, and the adaptive behavior that depends on how the IMAP
clients actually work:
http://wiki2.dovecot.org/IndexFiles
http://wiki2.dovecot.org/Design/Indexes/Cache

However, I wonder what happens when a webmail user asks to sort
the mails by sender if the From headers are not all cached!
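
With a per-record suppressed list as in your proposal, I imagine the
lookup would go roughly as below. This is only a sketch in Python under my
own assumptions: lookup_header, sort_by_sender and the read_from_spool
callback are hypothetical names, and the sort key is the raw From value,
not the real IMAP SORT collation.

```python
def lookup_header(name, cached, suppressed_present, read_from_spool):
    """Return all raw values of header `name` for one message.

    cached             -- list of (name, raw_value) stored in the cache record
    suppressed_present -- lowercase names present in the file but not cached
    read_from_spool    -- callable(name) -> list of values; the expensive
                          path that opens and parses the message file
    """
    key = name.lower()
    if key in suppressed_present:
        # The header exists in the file but was deliberately left out of
        # the cache, so the message file has to be read.
        return read_from_spool(key)
    # Not suppressed: the cache is authoritative, even when the answer is
    # "this header is absent".
    return [value for n, value in cached if n.lower() == key]


def sort_by_sender(records, read_from_spool):
    """Sort (uid, cached, suppressed_present) records by their first From.

    read_from_spool is callable(uid, name) -> list of values. Messages
    without a From header sort first; ties are broken by uid.
    """
    def key(record):
        uid, cached, suppressed_present = record
        values = lookup_header(
            "From", cached, suppressed_present,
            lambda n: read_from_spool(uid, n),
        )
        return (values[0].lower() if values else "", uid)

    return [record[0] for record in sorted(records, key=key)]
```

As long as From is not on the suppressed list, the sort never touches the
message files. If an administrator did suppress it, every message would
take the slow path.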

Regards,
Sébastien
