On Sat, Jun 10, 2006 at 01:23:35PM -0600, qqqq wrote:
> >I would defer to the smart people to figure out the details. However I do
> >wonder if the actual body content of the message would be best stored in a
> >file and the SQL used to store anything and everything you would want to
> >index. That would keep the SQL file size down if that's an issue. However,
> >SQL databases might have to be changed to accommodate the needs to store
> >email.
>
> I think this is what I was getting at early in the thread. I would think
> that a 5 MB body would do better on file but I don't know enough in regards
> to DBs to even make a call.

A good rule of thumb about storing something in the database is: are
you going to search that data? If you're going to search the text of an
email body, that makes it a more likely candidate for storing in the
database (though there are ways to do this searching while storing the
file externally).

Another consideration is that storing everything in the database is
substantially easier than splitting between a database and the
filesystem. If you think this is a non-issue, consider how to deal with
all the error conditions where either the database or the filesystem is
updated, but not both.

Of course, storing anything in a database is going to have more
overhead than storing it as raw bytes on the filesystem, and there's
not really a way around that. Different databases will impose different
amounts of overhead.

As for all the arguments about how databases won't scale, or how
they're a single point of failure... what exactly do you think a single
mail server is? Answer: not scalable and a single point of failure. Of
course there are ways to work around that, and those methods apply just
as well to databases (though the implementation can be different). Most
databases support at least some form of replication, and many support
clustering. And of course you don't have to try and cram all your users
into a single database.

Having said all that, it's nearly impossible to get a general-purpose
RDBMS to outperform an optimized storage format (if you find an example
where it is possible, I'd wager that's only true because the original
format wasn't very well thought-out). It's essentially a given that the
same hardware will be able to handle a higher load of storing and
retrieving emails using maildir rather than a database (unless you get
enough messages in a directory that it starts choking the filesystem).

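As an aside on the split-storage point above, here is a minimal sketch
of what "database or filesystem updated, but not both" looks like in
practice. Everything in it is an assumption made for illustration:
SQLite stands in for whatever database you'd actually use, and the
store_split function, the messages table, the spool_dir argument and
the ".eml" naming are all hypothetical. It only handles the simplest
failure (the INSERT is rejected after the file is written).

```python
import os
import sqlite3
import tempfile


def store_split(conn, spool_dir, msg_id, subject, body):
    """Write the body to a file and the metadata to SQL (hypothetical layout)."""
    path = os.path.join(spool_dir, msg_id + ".eml")
    with open(path, "wb") as f:
        f.write(body)
    try:
        # "with conn" commits on success and rolls back on an exception.
        with conn:
            conn.execute(
                "INSERT INTO messages (msg_id, subject, body_path) "
                "VALUES (?, ?, ?)",
                (msg_id, subject, path),
            )
    except sqlite3.Error:
        # The row was rejected, so remove the now-orphaned body file.
        # A crash between the file write and this line would still
        # leave an orphan behind; this sketch does not handle that.
        os.remove(path)
        raise
    return path


def demo():
    spool_dir = tempfile.mkdtemp()
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "msg_id TEXT PRIMARY KEY, "
        "subject TEXT NOT NULL, "
        "body_path TEXT NOT NULL)"
    )
    store_split(conn, spool_dir, "m1", "hello", b"first body")
    try:
        # subject=None violates NOT NULL, so the INSERT fails
        # after the body file has already been written.
        store_split(conn, spool_dir, "m2", None, b"second body")
    except sqlite3.IntegrityError:
        pass
    rows = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
    return rows, sorted(os.listdir(spool_dir))


if __name__ == "__main__":
    print(demo())
```

demo() should come back with one row and one file (m1.eml), because the
failed insert is followed by a manual cleanup. With everything in the
database, a single rollback would cover both halves, which is the
"substantially easier" part.
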
But if you want to do something like search for specific emails,
there's a much better chance that a database will outperform maildir,
especially if you're searching the message body. And there are other
potential applications where a database would outperform maildir as
well.

So, in a nutshell, if you're not going to try doing something more
advanced than just storing and retrieving email, it's unlikely that
you'll be happy with storing that email in a database. The further off
that 'beaten path' you get, the more likely you are to see benefit from
using a database.

-- 
Jim C. Nasby, Database Architect                [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"