On Thu, Mar 11, 2004 at 12:51:28AM -0600, Michael Parker wrote:
> On Thu, Mar 11, 2004 at 07:36:07PM +1300, Sidney Markowitz wrote:
> > Michael Parker wrote:
> > >Is there a particular problem you are trying to solve?
> > 
> > Yes, I'm trying to figure out why Kelsey sees the very high I/O 
> > requirements that he does that blocks him from scaling up to the 
> > multiple tens of thousands of users, while DSPAM claims to be running on 
> > a site with 125,000 users.
> 
> Kelsey will have to correct me if I'm wrong, but he's not seeing the
> high I/O with the MySQL bayes storage, he's seeing it with the DB_File
> solution.

That is correct.  But we're having trouble grokking how MySQL could resolve
the problems.  Granted, with a small table and queries being served from
cache, MySQL should scream.  However, with tables approaching 1TB, which
obviously can't be cached effectively, the SELECTs are going to have to
hit the disks.  Our estimate of the load associated with a message is as
follows:

On average a message is broken down into 262 tokens (this is based on
Sidney's mail flow).  Our target for deployment is ~2-3k msg/min
capacity.  For DB_File, this results in a worst case of 25-52 MB/sec of
read I/O (4k read blocks * msg/sec * #tokens).  Our benchmarking is pretty
much in line with the theoretical numbers.  This doesn't take database
updates into account.
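
As a rough sketch of that estimate (Python; the function and constant
names are made up for illustration, and it assumes every token lookup
costs one uncached 4k block read, with no updates counted):

```python
# Worst-case read I/O estimate: block size * msg/sec * tokens per message.
# Assumption: each token lookup misses cache and reads one full 4k block.
BLOCK_BYTES = 4 * 1024      # 4k read blocks
TOKENS_PER_MSG = 262        # average tokens per message, from the thread


def worst_case_read_rate(msgs_per_min, tokens_per_msg=TOKENS_PER_MSG,
                         block_bytes=BLOCK_BYTES):
    """Return the worst-case read rate in MiB/sec for a given message rate."""
    msgs_per_sec = msgs_per_min / 60
    bytes_per_sec = block_bytes * msgs_per_sec * tokens_per_msg
    return bytes_per_sec / (1024 * 1024)


if __name__ == "__main__":
    for rate in (2000, 3000):
        print(f"{rate} msg/min -> {worst_case_read_rate(rate):.1f} MiB/sec")
```

Under these assumptions 3k msg/min works out to roughly 51 MiB/sec and
2k msg/min to roughly 34 MiB/sec.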

Now, I'll be honest, I don't have a good understanding of how this would
translate to the SQL backend.  I'm installing the 3.0 snap now, and will
start to play around.  Hopefully I'll have results shortly.


-- 
Kelsey Cummings - [EMAIL PROTECTED]           sonic.net, inc.
System Administrator                      2260 Apollo Way
707.522.1000 (Voice)                      Santa Rosa, CA 95407
707.547.2199 (Fax)                        http://www.sonic.net/
Fingerprint = D5F9 667F 5D32 7347 0B79  8DB7 2B42 86B6 4E2C 3896
