Ilja Booij wrote:
I'm not sure what the best way to handle this is, though. My thinking has
always revolved around handling huge messages without causing resource
starvation. So a gigabyte email should be parsed in pieces and not all
allocated into memory at once. But a four megabyte email might as well go into
memory. You'd have to get a heck of a lot of people each reading a four meg
email at once for that to be a major problem. But, since we want DBMail to be
properly scalable to really large installations, it is a distinct possibility
to have that many people each reading an email that large (scenario: the CEO
sounds out his latest crazy plan in a four meg PowerPoint, and everyone in the
whole company starts pounding on the mail server to retrieve their copy.)
OTOH, a message which consists of multiple blocks will (almost) always be
fetched from the database completely anyway. With that in mind, it
wouldn't matter if the message data is in 2 blocks (header and body)
instead of > 2 blocks.
I can remember something about the maximum size for a TEXT field in
MySQL being the original reason for the choice of splitting the message
into parts. I'll ask Eelco and Roel; they should know this.
I've asked Eelco about this:
MySQL used to have a limit on the client-server communication that
forced us to limit the size of blocks being transferred. Nowadays that
limit is much higher.
Setting the max_allowed_packet variable to a high number (a few MB for
instance) on both client and server will allow for sending big blocks
between client and server.
The TEXT field itself has no limit.
In PostgreSQL, there's a 1GB limit on the TEXT field.
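
For reference, the packet limit can also be raised at runtime, assuming a
MySQL version where the variable can be set dynamically. Here is a minimal
sketch in Python using the MySQLdb driver; the credentials and the 16 MB
figure are just placeholders, and SET GLOBAL needs the SUPER privilege:

import MySQLdb

# Placeholder connection parameters; substitute the real DBMail database account.
conn = MySQLdb.connect(host="localhost", user="dbmail", passwd="secret", db="dbmail")
cur = conn.cursor()

# Raise the limit to 16 MB on the running server; new connections pick it up.
# The permanent equivalent is max_allowed_packet=16M in my.cnf, in the
# [mysqld] section (and in [mysql] for the command-line client).
cur.execute("SET GLOBAL max_allowed_packet = %d" % (16 * 1024 * 1024))

# Check what the server now reports.
cur.execute("SHOW VARIABLES LIKE 'max_allowed_packet'")
print(cur.fetchone())
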
I think we can safely move to a strategy where we put the message in 2
blocks, 1 for the header and 1 for the body. However, if we make our
parsing code handle messages only in that way, we need to produce a
script for migrating from a database with split messages. This shouldn't
be too much of a problem though.
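
A rough sketch of what such a migration script could do, again in Python
with the MySQLdb driver. The table and column names (messageblks,
message_idnr, messageblk_idnr, messageblk, blocksize) are assumptions about
the schema and will likely need adjusting for the real one; it should only
ever be tried on a copy of the database first:

import MySQLdb

BLANK = b"\r\n\r\n"  # header/body separator; some stored messages may use b"\n\n"

conn = MySQLdb.connect(host="localhost", user="dbmail", passwd="secret", db="dbmail")
cur = conn.cursor()

# Find every message that is currently stored in more than two blocks.
cur.execute("SELECT message_idnr FROM messageblks "
            "GROUP BY message_idnr HAVING COUNT(*) > 2")
ids = [row[0] for row in cur.fetchall()]

for msg_id in ids:
    # Reassemble the full message from its blocks, in storage order
    # (assuming the block id reflects insertion order and the driver
    # returns the block data as bytes).
    cur.execute("SELECT messageblk FROM messageblks "
                "WHERE message_idnr = %s ORDER BY messageblk_idnr", (msg_id,))
    full = b"".join(row[0] for row in cur.fetchall())

    # Split into header and body at the first blank line.
    sep = full.find(BLANK)
    if sep < 0:
        header, body = full, b""
    else:
        header, body = full[:sep + len(BLANK)], full[sep + len(BLANK):]

    # Replace the old blocks with exactly two: header first, then body.
    cur.execute("DELETE FROM messageblks WHERE message_idnr = %s", (msg_id,))
    for blk in (header, body):
        cur.execute("INSERT INTO messageblks (message_idnr, messageblk, blocksize) "
                    "VALUES (%s, %s, %s)", (msg_id, blk, len(blk)))

conn.commit()
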
Ilja