On 07/03/18 09:08, Daniele Duca wrote:
On 07/03/2018 09:52, Sebastian Arcus wrote:
I have this one email account receiving, for more than a year, a very
specific type of spam which I find very difficult to block:
1. The messages are all kept very short, generally below 20 words - I
assume so that Bayes is less efficient at classifying them?
2. Although they are all invitations to sex or to making money, they
are phrased differently every time and use different words, so Bayes
scores are consistently low.
<snip>
Hi Sebastian,
I know exactly what type of email you are talking about. I've seen
them written in at least Italian, English and Spanish. If you click the
link you are redirected to shady dating websites or
bitcoin/investment scam sites (at least in my experience).
Since I get the majority of these emails in italian, I've written a meta
rule that takes into account:
- Commonly misspelled words/phrases
- Body lines must be < 5
- The common pattern in all the URLs. Take a close look at them: there
IS a pattern, which I'm not writing here for obvious reasons :)
Thank you so much for that! The emails I see don't usually have spelling
mistakes, but you are right, it seems that the URL is the way to go.
I've been looking for patterns in the headers and source servers all
along - it never crossed my mind to check the body! Thanks again.
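
For anyone following along, here is a rough Python sketch of how the three
conditions in the meta rule described above might be combined. It is not
the actual rule from this thread and not SpamAssassin syntax. The
misspelled phrases and the URL pattern below are hypothetical placeholders,
since the real ones were deliberately not posted. The
require_misspelling flag is also an assumption, added because some of
these messages apparently contain no spelling mistakes.

```python
import re

# Hypothetical placeholders: the thread does not disclose the real phrases
# or the real URL pattern, so these values are invented for illustration.
MISSPELLED_PHRASES = ["recieve", "oportunity"]
URL_PATTERN = re.compile(r"https?://[^\s/]+/example-placeholder/\d+")
MAX_BODY_LINES = 5  # "Body lines must be < 5"


def count_body_lines(body: str) -> int:
    """Count the non-empty lines in the message body."""
    return sum(1 for line in body.splitlines() if line.strip())


def looks_like_short_link_spam(body: str, require_misspelling: bool = True) -> bool:
    """Return True only when all of the enabled conditions match together."""
    is_short = count_body_lines(body) < MAX_BODY_LINES
    has_url_pattern = URL_PATTERN.search(body) is not None
    lowered = body.lower()
    has_misspelling = any(phrase in lowered for phrase in MISSPELLED_PHRASES)

    if require_misspelling:
        return is_short and has_url_pattern and has_misspelling
    # Variant for messages without spelling mistakes: length and URL only.
    return is_short and has_url_pattern
```

Requiring every condition at once is what keeps false positives down: a
short message or a matching URL on its own is weak evidence, so the sketch
only flags a message when the enabled checks all agree.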