> My question is: how can I (and should I, even) share the training data
> between the two locations? Ideally I'd like to maintain just one
> database, since this will all be a single set of mail; otherwise I'd
> have to repeat the same training at home and at work.

Firstly, I agree with Jesse that it's really not worth sharing the
databases - just train each one separately and you should have good
results.

> 1. Manually copy the database file(s) from one location to the other.
> There are several variations on this. I could just do the initial
> training in one location and copy to the other, then maintain each
> database separately thereafter, expecting the follow-on training to
> take much less work.

Note that we don't recommend doing any initial training. We recommend
training only on mistakes and unsures.

> 2. Maintain the database file(s) on a server somewhere.

If you do want to do this, then a good choice would be to use ZEO, for
which there is already a (relatively untested) storage system written
for SpamBayes. There are also MySQL and PostgreSQL options.

> 3. Carry a little USB drive around with me, and keep the DB on that.
> (Does the DB get too big for this to be practical?)

Depends how big the USB drive is <wink>. This should work fine.

> 4. Super Crazy Ninja Trick?: enhance the SpamBayes IMAP proxy with the
> ability to maintain a DB in a folder on the IMAP server, download it
> before beginning filtering, and upload it whenever it is modified. If
> this seems productive, and the feature doesn't yet exist, I'd be happy
> to add it if I can find the time.

Storing the database remotely in some other way (e.g. ZEO) would be a
more sensible method, IMO.

=Tony.Meyer
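
For anyone who does go with option 1 or 3 (copying the file by hand or
via a USB drive), here is a rough sketch of a "newest copy wins" sync
in Python. It is not part of SpamBayes: the function names and the file
paths are made up for illustration, and it assumes the database is a
single file and that SpamBayes is not running on either side while the
copy happens. Note that it overwrites the older file wholesale, so any
training done only on the older side is lost rather than merged.

```python
import os
import shutil
import tempfile


def copy_db(src, dst):
    """Copy src over dst via a temporary file, so dst is never half-written."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp = tempfile.mkstemp(dir=dst_dir, suffix=".tmp")
    os.close(fd)
    try:
        shutil.copy2(src, tmp)  # copy2 also preserves the modification time
        os.replace(tmp, dst)    # rename within one directory replaces dst
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def sync_newest(path_a, path_b):
    """Overwrite the older of two database files with the newer one.

    Returns the path used as the source, or None if both files already
    have the same modification time and nothing was copied.
    """
    exists_a = os.path.exists(path_a)
    exists_b = os.path.exists(path_b)
    if not exists_a and not exists_b:
        raise FileNotFoundError("neither database file exists")
    if not exists_b:
        copy_db(path_a, path_b)
        return path_a
    if not exists_a:
        copy_db(path_b, path_a)
        return path_b

    mtime_a = os.path.getmtime(path_a)
    mtime_b = os.path.getmtime(path_b)
    if mtime_a == mtime_b:
        return None
    if mtime_a > mtime_b:
        copy_db(path_a, path_b)
        return path_a
    copy_db(path_b, path_a)
    return path_b
```

Usage would be something like `sync_newest("/home/me/hammie.db",
"/media/usb/hammie.db")` (hypothetical paths) before starting SpamBayes
and again after shutting it down. Some USB filesystems store
modification times with coarse precision, so the "same time" check may
occasionally trigger a harmless extra copy.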
