Hello Daniel,

<snipped a bit>
> So my real question, after all this, is what have you found to be the
> most appropriate bucket setup? Should I artificially create more
> buckets, or leave it at the minimum... or am I missing the point
> entirely? or.........

I don't really know what *is* better, but I'll tell you what I did.
For some three days I had 4 buckets and the overall accuracy didn't go
much higher than 80%. Then I thought that the more buckets there are,
the longer the training period probably needs to be and, what the
heck!, I'm only really interested in telling spam from non-spam. So,
as two of the buckets were actually subsets of a third one, I deleted
the two subset buckets, reset the statistics and, after one more day,
accuracy was over 90%.

Aside from that, I see no sense in using a Bayesian text classifier
(which is what POPFile actually is) to tell me whether a message is
coming from John Doe or from the TBUDL list; both are legitimate mail,
and that kind of sorting is better and more easily done by TB's
*Sorting* Office.
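
To make the "Bayesian text classifier with buckets" idea concrete, here
is a toy sketch in Python. It is only an illustration of the general
technique (naive Bayes with add-one smoothing), not POPFile's actual
code; the class name TinyBayes, the bucket names and the training
messages are all made up for the example.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes classifier: one word-count table per bucket.

    Hypothetical illustration only; not how POPFile is implemented.
    """

    def __init__(self, buckets):
        self.words = {b: Counter() for b in buckets}  # word counts per bucket
        self.msgs = Counter()                         # messages trained per bucket

    def train(self, bucket, text):
        self.words[bucket].update(text.lower().split())
        self.msgs[bucket] += 1

    def classify(self, text):
        vocab = set().union(*self.words.values())
        total_msgs = sum(self.msgs.values())
        scores = {}
        for bucket, counts in self.words.items():
            # Log prior, smoothed so an untrained bucket never yields log(0).
            score = math.log((self.msgs[bucket] + 1) / (total_msgs + len(self.words)))
            # Add-one smoothing over the known vocabulary plus one unseen word.
            denom = sum(counts.values()) + len(vocab) + 1
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)
            scores[bucket] = score
        # The bucket with the highest log score wins.
        return max(scores, key=scores.get)


# Made-up training data, two buckets only.
clf = TinyBayes(["spam", "non-spam"])
clf.train("spam", "cheap pills buy now")
clf.train("spam", "buy cheap watches now")
clf.train("non-spam", "meeting notes for the list")
clf.train("non-spam", "question about the bat filters")
result = clf.classify("buy cheap pills")
```

Every extra bucket is one more word table that has to be trained with
its own messages before its scores mean much, which fits with what I
saw when I dropped from four buckets to two.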

> Also, am I just lucky, or over the last, say, month or so, has spam
> really slowed down? The last spam I received through my main account
> must have been early last week, and I haven't received any spam in
> either of my hotmail accounts for weeks either... Just lucky, or are
> other people noticing a similar occurrence...

I wouldn't know. I get spam every day, especially in two of my
accounts, and even just one spam message seems too much for me :-(

-- 
Best regards,

Miguel A. Urech (El Escorial - Spain)
Using The Bat! v1.61


________________________________________________
Current version is 1.62 | "Using TBUDL" information:
http://www.silverstones.com/thebat/TBUDLInfo.html
