Hi Jason,

I think the tokens for non-HTML content are the same, so you should see minimal effect on accuracy there. Since HTML is processed differently, your accuracy could change depending on which tokens are currently being used to categorize a message. You could run a comparison by using the same tokens with the new version over already processed messages and see how the results compare. I would want to validate that before going live.
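As a rough illustration of that comparison, here is a minimal Python sketch. It assumes you have already collected the classification of each message from the current version and from the new version into two dictionaries keyed by message ID; how you gather those results is up to you and is not shown. The function name, the message IDs, and the "spam"/"innocent" labels are placeholders for the example, not anything DSPAM provides.

```python
from collections import Counter


def compare_classifications(old, new):
    """Compare two {message_id: label} mappings (hypothetical helper).

    Returns the agreement rate over messages present in both mappings,
    the list of messages whose label changed, and a count of each
    (old_label, new_label) transition.
    """
    shared = sorted(set(old) & set(new))
    changed = [(m, old[m], new[m]) for m in shared if old[m] != new[m]]
    transitions = Counter((before, after) for _, before, after in changed)
    agreement = 1.0 - len(changed) / len(shared) if shared else 0.0
    return agreement, changed, transitions


# Illustrative data only: results from the old and new versions
# for the same already processed messages.
old_results = {
    "msg-001": "spam",
    "msg-002": "innocent",
    "msg-003": "spam",
    "msg-004": "innocent",
}
new_results = {
    "msg-001": "spam",
    "msg-002": "innocent",
    "msg-003": "innocent",
    "msg-004": "innocent",
}

agreement, changed, transitions = compare_classifications(old_results, new_results)
print(f"agreement: {agreement:.0%}")
for msg_id, before, after in changed:
    print(f"{msg_id}: {before} -> {after}")
```

If most of the changed messages turn out to be HTML mail, that would be consistent with the difference in HTML processing; a large number of changes in plain-text mail would be worth a closer look before going live.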
Regards,
Ken

On Thu, Aug 21, 2014 at 01:09:25PM -0700, Jason J. W. Williams wrote:
> Hi Ken,
>
> That's not a bad idea, but for the time being we just want to upgrade
> to fix the period escaping issue. Can 3.10.2 use the existing database
> without any changes in accuracy or classification?
>
> -J

_______________________________________________
Dspam-user mailing list
https://lists.sourceforge.net/lists/listinfo/dspam-user
