SpamBayes doesn't follow links (see 
http://spambayes.sourceforge.net/faq.html#will-show-spam-clues-notify-a-spammer-that-i-opened-their-message
 for a tangentially related discussion), but it does process message headers.  
The headers hold a lot of good information, and clues drawn from them can 
look as though they came from a Web site.
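
To make the header point concrete, here is a rough sketch using Python's 
standard email module. The message, the host names, and the X-Mailer value are 
all made up for illustration, and the whitespace split is not how SpamBayes 
tokenizes; it only shows that headers carry plenty of words that never appear 
in the visible body.

```python
from email import message_from_string

# A made-up message; every header value here is hypothetical.
RAW = """\
Received: from mail.example.net (mail.example.net [192.0.2.10])
\tby mx.example.org; Thu, 4 Feb 2010 08:50:00 -0500
From: "Sender" <sender@example.net>
X-Mailer: ExampleMailer 1.0
Subject: hello

Only this line is visible in the body.
"""

msg = message_from_string(RAW)

# Collect whitespace-separated words from every header value.
header_words = []
for name, value in msg.items():
    header_words.extend(value.split())

# Words the reader actually sees in the body.
body_words = msg.get_payload().split()

# Header words that never occur in the body text.
hidden = [w for w in header_words if w not in body_words]
print(hidden)
```

A classifier that looks at headers can turn any of those hidden words into 
clues, without fetching anything from the network.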

Unless you're willing to dive into the code and the math, I'd caution 
against trying to second-guess SpamBayes.  You're going to want it to behave 
rationally, and it doesn't (at least at the level you're looking at); it 
behaves statistically.  That's why the FAQ 
(http://spambayes.svn.sourceforge.net/viewvc/spambayes/trunk/spambayes/Outlook2000/docs/troubleshooting.html#Messages_have_incorrect_or_unexpected)
 suggests sending all the Spam clues to the list when trying to understand why 
a given message isn't classified as expected.
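
As a rough illustration of why the drug names might not show up as separate 
clues: in Python regular expressions, \w matches the underscore, so a 
word-based split keeps underscore-joined words together. The sketch below is 
not SpamBayes's tokenizer, and the two helper functions are hypothetical; it 
only assumes a tokenizer built on \w and \W runs.

```python
import re

SUBJECT = "***Discount_Viagra_VXPL_Percocet*_Adderall****"

def word_tokens(text):
    # Runs of word characters; in Python's re module, \w includes "_".
    return re.findall(r"\w+", text)

def punctuation_runs(text):
    # Runs of non-word characters, such as the asterisks in the subject.
    return re.findall(r"\W+", text)

words = word_tokens(SUBJECT)
runs = punctuation_runs(SUBJECT)

print(words)  # ['Discount_Viagra_VXPL_Percocet', '_Adderall']
print(runs)   # ['***', '*', '****']

# None of the individual drug names comes out as a token of its own.
print("viagra" in [w.lower() for w in words])  # False
```

Under that assumption, a clue built from a run of asterisks is exactly what 
you would expect to see, and "viagra" on its own never appears.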


-----Original Message-----
From: [email protected] on behalf of Ocean
Sent: Thu 2/4/2010 8:58 AM
To: [email protected]
Subject: [Spambayes] Problems with classifying as spam
 

        In addition to the startup problems, Spambayes is having problems
marking messages as spam.  



As an example, I received this email:

------------------------------

Subject: ***Discount_Viagra_VXPL_Percocet*_Adderall****

Body:

<URL Link>***Discount_Viagra_VXPL_Percocet*_Adderall****!
<Links to:> http://kashertqdum17.com/

------------------------------


        That's it.  The only text in the body of the message is that URL
link.  


There are two issues I see showing up:


1.  The subject and link text aren't being parsed properly.  Nowhere in the
spam clues are the words "viagra", "percocet", or "adderall" showing up.
The spam token involving the subject is "'subject:****'".  So, not only is
SpamBayes not treating the underscores as word separators, but it's not even
getting to the words, because it looks like it's getting choked up on the
asterisks.

2.  I've got a *lot* of tokens showing up in the Spam Clues that are nowhere
in the email itself.  I'm guessing that Spambayes is actually going to that
link and processing what's on the page, but if so, that's a big problem.
First of all, it gives the spammers more flexibility in trying to bypass
spambayes.  And second, if it's following links, then it's confirming to the
spammers that my email address is valid.  That's a huge no-no.  Spambayes
should not be following links at all, but should only look in the message
itself.



_______________________________________________
[email protected]
http://mail.python.org/mailman/listinfo/spambayes
Info/Unsubscribe: http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html
