On Mon, 22 Aug 2016 07:34:00 -0700 Marc Perkel wrote:

> On 08/22/16 07:28, Dianne Skoll wrote:
> > The other two possibilities (no tokens in either or some tokens in
> > both) are undecidable.
>
> Exactly!

In the past you've said that when there are tokens in both you compare
the counts.

On Wed, 17 Aug 2016 11:02:38 -0700 Marc Perkel wrote:

> Here's the actual formula.
>
> card(Test_message intersect Spam diff Ham) minus card(Test_message
> intersect Ham diff Spam)

On Wed, 20 Jan 2016 08:52:05 -0800 Marc Perkel wrote:

> Then you do a set
> diff both ways (ham - spam) (spam - ham) and whichever side is bigger
> wins. Generally it will match on only one side or very predominately
> on one side.
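
To make the quoted formula concrete, here is a minimal Python sketch of
how it could be read, assuming tokens are plain strings held in sets.
The names set_diff_score and classify, the sample tokens, and the
"undecided" label for a zero score are illustrative assumptions, not
anything taken from the thread or from an actual implementation.

```python
def set_diff_score(message_tokens, spam_tokens, ham_tokens):
    """card(msg & (spam - ham)) minus card(msg & (ham - spam))."""
    spam_only = spam_tokens - ham_tokens   # tokens seen only in spam
    ham_only = ham_tokens - spam_tokens    # tokens seen only in ham
    return len(message_tokens & spam_only) - len(message_tokens & ham_only)


def classify(score):
    """'Whichever side is bigger wins'; a tie is left undecided (assumption)."""
    if score > 0:
        return "spam"
    if score < 0:
        return "ham"
    return "undecided"


# Hypothetical corpora and test message, for illustration only.
spam = {"viagra", "free", "hello"}
ham = {"meeting", "agenda", "hello"}
message = {"free", "viagra", "meeting", "hello"}

score = set_diff_score(message, spam, ham)
print(score, classify(score))  # 1 spam
```

In this reading, a message with tokens on both sides is decided by
comparing the two counts, and the score is zero both when the message
matches nothing on either side and when the two counts are equal.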