On Thu, Mar 11, 2004 at 05:37:36PM +1300, Sidney Markowitz wrote:
> I started looking over the details of what we do with Bayes and MySQL,
> and I have some questions.
>
> The tables defined in sql/bayes_mysql.sql all have a username field that
> is varchar(200).
>
> Why do we need a long username string in every record? Why is it 200
> when the username field in the MySQL userpref table is varchar(100). If
> there really has to be a user id in each record why not use the integer
> prefid field from the userpref table?

200 was mostly arbitrary. It's a varchar, so it will only take up as
much space as it needs.

The bayes sql storage is separate from the ConfSourceSQL stuff. I don't
currently use sql prefs, so it should not be tied to the userpref table.
Is there a particular problem you are trying to solve?

> In the tables that do have a username field, that field is declared as
> either the key or is the prefix of the key. With MySQL is a selection
> based on the username just as fast as somehow splitting the data for
> each user into some separate location, whether that is a separate table
> per user or database? Does MySQL optimize the storage by not storing the
> actual key with the record? I guess I'm asking if there is some MySQL
> optimization that isn't apparent to me that makes sense out of having
> the username in every record.
>
> Does MySQL automatically optimize things so that when SpamAssassin
> queries the database for each token in a message, since they are all for
> the same user records for that user, or at least index entries for
> records for that user, will end up getting cached on the first query and
> then read again from memory on the subsequent queries? If not, then
> shouldn't the database be structured so as to keep each user's data
> together?

A properly configured MySQL will handle all of this. I'm currently
getting 99.9% key efficiency on my bayes database, so most of my selects
are getting served out of memory.

Michael
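
A note on the "key efficiency" figure above: assuming it refers to the
MyISAM key cache hit rate, it can be estimated from two counters reported
by SHOW STATUS, Key_read_requests (index block reads requested) and
Key_reads (requests that had to go to disk). The sketch below is only an
illustration of that arithmetic; the function name and the sample counter
values are hypothetical and not taken from the server described here.

```python
def key_efficiency(key_reads: int, key_read_requests: int) -> float:
    """Return the key cache hit rate as a percentage.

    key_reads:         index block reads that missed the cache (went to disk)
    key_read_requests: total index block read requests
    """
    if key_read_requests == 0:
        # Assumption: with no requests yet, nothing has missed the cache.
        return 100.0
    return 100.0 * (1.0 - key_reads / key_read_requests)


# Hypothetical counter values, chosen only to illustrate the calculation.
print(f"{key_efficiency(1_000, 1_000_000):.1f}%")  # prints 99.9%
```

A value close to 100% means nearly all index lookups are answered from
memory, which is consistent with the point that the per-user rows do not
need to be split into separate tables for the lookups to be fast.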
