On Tue, 2011-11-22 at 01:47 +0100, Jesper Wallin wrote:
> On 11/22/2011 12:35 AM, Karsten Bräckelmann wrote:

> > > I also noticed that my old database only had 11k tokens while the new
> > > one got about 60k (both the old and new servers have hapaxes enabled and
> > > were trained using a corpus of about 600 spam and 200 ham).
> > 
> > Is that "old" database the original one from the previous system, or old
> > as in "before learning from scratch", but *after* migrating the db?
> >
> > I'd guess the latter. 11k tokens is terribly low, and as you just
> > noticed, even fewer than you get from learning a handful from scratch.
> I meant the original database, created by SA 3.3.2. It has about 11k
> tokens. Also, it runs MySQL 5.5.17 (as that machine runs Arch Linux), and
> I'm not sure about the last comment on the MySQL bug page; it doesn't
> really say whether it's fixed in 5.5.16.

Your Ubuntu system uses 5.1, though.

Anyway, to find out whether this is the issue, I guess Mark or someone
else will need to come up with some funky idea.

And regardless, 11k tokens is terribly low.

