Hi Jason,

I think the tokens for non-HTML messages are the same, so you should
see minimal effect on accuracy there. Since HTML is processed differently,
your accuracy could change depending on which tokens are currently
being used to categorize a message. You could run a comparison
by reclassifying already processed messages with the new version,
using the same tokens, and seeing how the results compare. I would
want to validate that before going live.
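
A rough sketch of that comparison, assuming you have already exported the
per-message results from both versions into two mappings of message ID to
label. The function name, message IDs, and label strings are made up for
illustration, and nothing here calls DSPAM itself:

```python
def compare_classifications(old, new):
    """Compare two {message_id: label} mappings (hypothetical export format).

    Returns the fraction of shared messages whose label is unchanged,
    plus a list of (message_id, old_label, new_label) for those that differ.
    """
    # Only messages classified by both versions can be compared.
    shared = sorted(set(old) & set(new))
    changed = [(m, old[m], new[m]) for m in shared if old[m] != new[m]]
    agreement = 1.0 - len(changed) / len(shared) if shared else 1.0
    return agreement, changed


if __name__ == "__main__":
    # Illustrative data only, not real results.
    old_results = {"m1": "spam", "m2": "innocent", "m3": "spam", "m4": "innocent"}
    new_results = {"m1": "spam", "m2": "innocent", "m3": "innocent", "m4": "innocent"}
    agreement, changed = compare_classifications(old_results, new_results)
    print(f"agreement: {agreement:.2%}")
    for msg_id, before, after in changed:
        print(f"{msg_id}: {before} -> {after}")
```

Any message in the changed list is worth a look, especially HTML mail,
since that is where the processing differs.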

Regards,
Ken

On Thu, Aug 21, 2014 at 01:09:25PM -0700, Jason J. W. Williams wrote:
> Hi Ken,
> 
> That's not a bad idea, but for the time being we just want to upgrade
> to fix the period escaping issue. Can 3.10.2 use the existing database
> without any changes in accuracy or classification?
> 
> -J
> 

_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
