> > 
> > I guess the relevant point for this thread is that I don't necessarily
> > think that this is the silver bullet as implied.  Even if you use a
> > high-availability clustering technology that can mirror writes and
> > reads, you are STILL dealing with the possibility of a database that
> > is just massive.  Processing a database of that size will still be
> > disk-bound unless you have an unheard-of amount of memory; I don't
> > think there's any reason to believe that clustering the problem will
> > make it go away.
> > 
> > So I still wonder if anyone has any musings on my earlier questions?
> 
> A few SpamAssassin hacks could help.
> 1. Run multiple MySQL servers: split your users into ranges (A-J, K-S, 
> T-Z, or smaller units) and distribute them across the servers, with 
> some HA / failover mechanism (possibly DRBD).
> 2. Keep two levels of Bayes, one large global database and a smaller 
> per-user one, if that's possible. Of course SA would need to be changed 
> to consult both. That way you could have two large servers for the 
> global Bayes DB and two for the per-user Bayes DBs.
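
Interjecting on (1): the range-to-shard mapping could be about as simple
as the sketch below (Python rather than SA's Perl, just to show the shape;
the ranges and DSNs are made up for illustration, and real failover would
obviously need more than a string in a dict):

    # Hypothetical shard map: first letter of username -> DSN pair.
    SHARDS = [
        ("a", "j", {"primary": "mysql://db1a/bayes", "failover": "mysql://db1b/bayes"}),
        ("k", "s", {"primary": "mysql://db2a/bayes", "failover": "mysql://db2b/bayes"}),
        ("t", "z", {"primary": "mysql://db3a/bayes", "failover": "mysql://db3b/bayes"}),
    ]

    def shard_for(username):
        """Pick the DSN pair for a user; other names fall to the last shard."""
        first = username[0].lower()
        for lo, hi, dsns in SHARDS:
            if lo <= first <= hi:
                return dsns
        return SHARDS[-1][2]

Each user's reads and writes then only ever touch one shard, so each
server carries a fraction of the data and its working set has a better
chance of fitting in memory.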
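And on (2), the interesting part is how the two databases get combined.
One plausible scheme (my guess, not anything SA does today) would be to
treat the global DB as a prior that a user's own data gradually
overrides, Robinson-style:

    def blended_prob(token, user_db, global_db, s=5.0):
        """Blend per-user and global token spam probabilities.

        user_db maps token -> (count, prob); global_db maps token -> prob.
        Both mappings and the prior strength `s` are illustrative
        assumptions.  With little per-user data (small count) the global
        value dominates; as the user's corpus grows, their own
        statistics take over.
        """
        n, p_user = user_db.get(token, (0, 0.5))
        p_global = global_db.get(token, 0.5)
        return (s * p_global + n * p_user) / (s + n)

With s around 5, a token the user has seen a handful of times already
outweighs the global estimate, which seems like roughly the behavior
you'd want from a per-user override.
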
> 
> Also see if this SQL failover patch can help you in any way.
> http://issues.apache.org/SpamAssassin/show_bug.cgi?id=2197

Thanks for the good thoughts.  Sounds like the ultimate answer is that not
many people are using per-user Bayes, at least at this scale, and that any
"solutions" have yet to be realized in practice.  I don't think we've got
the resources or time to contribute any SA patches, but the food for
thought is very much appreciated!
 
> Finally, to speed up the database, have a look at this; the people at 
> Wikimedia / LiveJournal seem to be happy using it.
> http://www.danga.com/memcached/

That's very cool.  I'll *definitely* be keeping this one in mind.
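
If we ever get to try it, I'd guess the usual cache-aside pattern is the
fit here: check memcached before hitting MySQL, and populate the cache on
a miss.  A rough Python sketch, assuming the python-memcached client and
a hypothetical fetch_token() database accessor:

    import hashlib
    import memcache  # python-memcached

    mc = memcache.Client(["127.0.0.1:11211"])

    def get_bayes_row(token, db):
        """Cache-aside: try memcached first, fall back to MySQL on a miss."""
        # Hash the token so the key is always a valid memcached key
        # (no whitespace, bounded length).
        key = "bayes:" + hashlib.sha1(token.encode("utf-8")).hexdigest()
        row = mc.get(key)
        if row is None:
            row = db.fetch_token(token)  # hypothetical accessor
            mc.set(key, row, time=300)   # expire after five minutes
        return row

The short TTL also sidesteps invalidation: after Bayes learns, stale
token counts just age out within a few minutes.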
