DBMAIL Developers Mailinglist <dbmail-dev@dbmail.org> on Monday, August
23, 2004 at 15:16 +0100 wrote:
>
>
>Ilja Booij wrote:
>> Doesn't it make sense to parse the message only the first time it's 
>> fetched? Messages read using
>> IMAP will only have to be parsed once. Messages read using POP will not 
>> have to be parsed.
>
>No, that doesn't make any sense at all. Any design that can't handle
>multiple messages in single queries (search, retrieve, sort, etc.) is
>flawed. Parsing is relatively cheap and fast (assuming we use a decent
>parser). Talking to the backend is relatively expensive in terms of IO
>latency and backend resources.
>
>Also, we want to keep dbmail small and simple, not add new levels of
>complexity that don't offer much added value.
>
>The email storage should be consistent.
>The email storage should be highly optimized.
>
>Neither of these qualifications applies to this
>to-parse-or-not-to-parse thread.
>
>If we go for parsed storage, all of the storage should be converted.
>
>Most messages that are retrieved are retrieved relatively few times.
>That is: I want to read my messages once, but I want to read my
>messages fast, not wait for the backend to store its parsed
>equivalent. Especially if all I do is delete the mail, or read it maybe
>once or twice some time later.

OK. After reading this through a few times I have to agree with you.
>
>
>> It would make the fetching a bit more cumbersome.. 
>
>Fetching is already too cumbersome as it is, no?

Is it? ;) It's veeeerrryy cumbersome. Every time I see _ic_fetch() I get
nightmares... (BTW, how's the rewrite going?)
>
>
>> But it could work 
>> like this:
>> 
>> Message already parsed:
>> 1. Fetch parsed message
>> 2. parsed message returned to client
>> 
>> Message not yet parsed:
>> 1. Fetch parsed message
>> 2. no message returned
>> 3. fetch raw message
>> 4. parse message
>> 5. store parsed message
>> 6. return parsed message
>
>So instead of improving dbmail performance by using cached information
>you'd be slowing dbmail down significantly without any advantages.
>Just think of a scenario where someone does a fetch 1:*. Instead of a
>single query you'd get two queries for each message in the folder, not
>to mention the cascade in network traffic this would trigger.

Agreed.
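
To put rough numbers on the fetch 1:* case, here is a minimal Python
sketch. It is not dbmail code: CountingBackend, fetch_lazy, fetch_batch
and make_folder are hypothetical names, the "database" is an in-memory
dict that only counts round trips, and the batch path assumes the
backend can return a whole range of raw messages in one query.

```python
from email import message_from_string


class CountingBackend:
    """Hypothetical in-memory stand-in for the database backend."""

    def __init__(self, raw_messages):
        self.raw = dict(raw_messages)  # message id -> raw message text
        self.parsed = {}               # message id -> cached parsed form
        self.reads = 0                 # read round trips to the backend
        self.writes = 0                # write round trips to the backend

    def get_parsed(self, msg_id):
        self.reads += 1
        return self.parsed.get(msg_id)

    def get_raw(self, msg_id):
        self.reads += 1
        return self.raw[msg_id]

    def put_parsed(self, msg_id, parsed):
        self.writes += 1
        self.parsed[msg_id] = parsed

    def get_raw_range(self, msg_ids):
        # Assumption: one query can return many raw messages at once.
        self.reads += 1
        return [self.raw[i] for i in msg_ids]


def parse(raw):
    """Parse in the frontend; no backend access needed."""
    msg = message_from_string(raw)
    return {"subject": msg["Subject"], "body": msg.get_payload()}


def fetch_lazy(backend, msg_ids):
    """Parse-on-first-fetch, following the six steps quoted above."""
    results = []
    for msg_id in msg_ids:
        parsed = backend.get_parsed(msg_id)          # steps 1-2
        if parsed is None:
            parsed = parse(backend.get_raw(msg_id))  # steps 3-4
            backend.put_parsed(msg_id, parsed)       # step 5
        results.append(parsed)                       # step 6
    return results


def fetch_batch(backend, msg_ids):
    """One query for the whole range, then parse everything locally."""
    return [parse(raw) for raw in backend.get_raw_range(msg_ids)]


def make_folder(n):
    return {i: f"Subject: msg {i}\n\nbody {i}\n" for i in range(1, n + 1)}


lazy_backend = CountingBackend(make_folder(100))
fetch_lazy(lazy_backend, range(1, 101))
print("lazy: ", lazy_backend.reads, "reads,", lazy_backend.writes, "writes")

batch_backend = CountingBackend(make_folder(100))
fetch_batch(batch_backend, range(1, 101))
print("batch:", batch_backend.reads, "reads,", batch_backend.writes, "writes")
```

For a folder of 100 unparsed messages the lazy path makes two reads
plus one write per message, while the batch path makes a single read
and does all the parsing in the frontend.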
>
>
>This idea only makes sense as part of a migration tool when we move to
>mime-caching in the database. As part of the retrieval chain it sucks,
>if you'll pardon my French.
>
>Finally, performance of the delivery chain is much less visible to
>users, and therefore less critical. If users have to wait a second
>longer for a new message to be inserted, they won't complain. But make
>them wait a second longer for each message they want to retrieve, and
>they will start calling the support desk. Meaning *you*.
>
>So please, please, if we move to parsed storage, just *go* for it. No
>compromises.

Yep, you're right. I was thinking of speed as in: parsing every message
costs time. It costs performance. But I forgot about the difference
between performance and *perceived* performance. The speed of IMAP
fetching is much more important for the perceived performance than the
speed of insertion.

Well, I'll go back into my cave now and continue fixing stuff. 

Ilja

--
Ilja Booij
IC&S B.V.

Stadhouderslaan 57
3583 JD  Utrecht
www.ic-s.nl

T algemeen: 030 6355730
T direct: 030 6355739
F: 030 6355731
E: [EMAIL PROTECTED]
