Re: phish/bayes
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

(Note: CC: changed to users@spamassassin.apache.org - @incubator.apache.org address is deprecated).

Sander Holthaus - Orange XL wrote:
[snip]
| But couldn't some 'simple' rules fix this? One metafilter that looks for
| valid links (images, href's, email-addresses) to ebay, amazon, banks,
| etc. and another meta-rule that looks for links that point to non-ebay,
| non-amazon, non-bank links. A phisers will always need to point the
| users to a site that is under his control and it shouldn't be too hard
| to recognize this site.

I've been using the attached for a while to catch paypal phishing scams, and am in the process of modifying it to catch ebay account scams too. Caveat: It's never FPd for me but there is plenty of potential there.

Anyway, feel free to use/adapt/whatever to suit.

Kind Regards,
Craig.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDEvKjMDDagS2VwJ4RArUWAKDU1UZss3lF3joOxT+CZg1o2izfXQCglmt7
9owI38Yw6sPtLuhj9Cw/5Rs=
=W+hS
-END PGP SIGNATURE-

#
# Rules to catch PayPal phishing attempts.
#
# Checks for common paypal "update your account" phrases, or "unauthorised
# access" phrases. Confirms that the mail came from @paypal and contains
# only paypal.com links, otherwise throws scores.
#
# Craig McLean - 2005/05/22

header __LOCAL_PP_ISFROMPP From:addr =~ /[EMAIL PROTECTED]/i
header __LOCAL_PP_S_UPD Subject: =~ m'(?:confirm|update) (?:your|the) (?:billing)?(?:records?|information|account)'i
header __LOCAL_PP_S_AUT Subject: =~ m'unauthori[sz]ed access'i
body __LOCAL_PP_B_UPD m'(?:confirm|updated?|verify|restore) (?:your|the) (?:account|current|billing|personal)? ?(?:records?|information|account|identity|access|data)'i
body __LOCAL_PP_B_ATT m'one or more attempts'i
body __LOCAL_PP_B_ACT m'unusual activity'i
uri __LOCAL_PP_PPCGIURL m'https?://www\.paypal\.com/([A-Za-z0-9-_]+/)?cgi-bin/webscr\?'i
uri __LOCAL_PP_NONPPURL m'https?://(?:[A-Za-z0-9-_]+)\.(?!(paypal)\.com)(?:[A-Za-z0-9-_\.]+)'i

meta LOCAL_PP_UPD_BADURL (__LOCAL_PP_ISFROMPP && ((__LOCAL_PP_S_AUT || __LOCAL_PP_B_ATT || __LOCAL_PP_B_ACT || __LOCAL_PP_B_UPD || __LOCAL_PP_S_UPD) || __LOCAL_PP_PPCGIURL) && __LOCAL_PP_NONPPURL)
meta LOCAL_PP_UPD_BADADDR (!__LOCAL_PP_ISFROMPP && ((__LOCAL_PP_S_AUT || __LOCAL_PP_B_ATT || __LOCAL_PP_B_ACT || __LOCAL_PP_B_UPD || __LOCAL_PP_S_UPD) && __LOCAL_PP_PPCGIURL))

describe LOCAL_PP_UPD_BADURL paypal/ebay account update, but has bad URL
describe LOCAL_PP_UPD_BADADDR paypal/ebay account update, but from bad email

score LOCAL_PP_UPD_BADURL 4
score LOCAL_PP_UPD_BADADDR 4
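For readers who want to experiment with the idea behind the BADURL rule outside SpamAssassin, here is a minimal Python sketch. It is not a translation of the rules above: instead of a regex over the URL text it parses each link and compares the hostname against a trusted list. The sender check against "@paypal.com" is an assumption (the address in the posted rule is redacted by the archive), and the names TRUSTED_DOMAINS, untrusted_links and flag_bad_url are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: the only domain treated as genuine in this sketch.
TRUSTED_DOMAINS = ("paypal.com",)

# Rough URL matcher for plain-text bodies; real mail needs proper HTML/MIME parsing.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def is_trusted_host(host):
    """True if host is a trusted domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def untrusted_links(body):
    """Return every http(s) URL in body whose hostname is not trusted."""
    found = []
    for url in URL_RE.findall(body):
        try:
            host = urlparse(url).hostname or ""
        except ValueError:
            host = ""  # unparsable URL: treat as untrusted
        if not is_trusted_host(host):
            found.append(url)
    return found


def flag_bad_url(from_addr, body):
    """Mail claims to be from PayPal but links somewhere else."""
    claims_paypal = from_addr.lower().endswith("@paypal.com")
    return claims_paypal and bool(untrusted_links(body))


if __name__ == "__main__":
    sample = ("Please verify your account at "
              "http://paypal.com.evil.example/login")
    print(flag_bad_url("service@paypal.com", sample))
```

Comparing parsed hostnames means a link such as http://www.paypal.com@203.0.113.5/ is judged by the host it really points at, not by the text in front of the "@". As with the rules above, there is plenty of FP potential: legitimate mail may link to partner or tracking domains.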
RE: phish/bayes
I wouldn't count too much on ClamAV to protect you from phishing. I supplied them with various phishing samples, but only a select few have been added to the database. Besides that, I wonder how well suited ClamAV is for this job.

But couldn't some 'simple' rules fix this? One metafilter that looks for valid links (images, hrefs, email addresses) to ebay, amazon, banks, etc. and another meta-rule that looks for links that point to non-ebay, non-amazon, non-bank sites. A phisher will always need to point the users to a site that is under his control, and it shouldn't be too hard to recognize this site.

Kind Regards,
Sander Holthaus

From: Greg Allen [mailto:[EMAIL PROTECTED]
Sent: Sunday, August 28, 2005 12:19 PM
To: satalk; users@spamassassin.apache.org
Subject: RE: phish/bayes

I wouldn't worry about it. You can whitelist the real eBay servers with SA.

Also, if you want to catch more of the phish messages you can install the ClamAV plugin for SA; it does very well at finding phishies. You have to also install ClamAV, but it is a fairly simple thing to install.

On a side note, eBay is not too smart IMO. Their real emails sometimes look a lot like phish, which must confuse the heck out of their customers. I am sure the bad guys like it though.

-Original Message-
From: satalk (sent by Nabble.com) [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 25, 2005 6:49 PM
To: users@spamassassin.apache.org
Subject: phish/bayes

I could not find any email in this forum addressing this issue - it does not mean there is not one - I just couldn't find it :)

MY question is as follows: Given that so many valid tokens from ebay/paypal sites exist in phish emails, am I correct in saying that it is imperative to avoid phish emails entering the bayes database?

Anthony

Sent from the SpamAssassin - Users forum at Nabble.com.
RE: phish/bayes
I wouldn't worry about it. You can whitelist the real eBay servers with SA.

Also, if you want to catch more of the phish messages you can install the ClamAV plugin for SA; it does very well at finding phishies. You have to also install ClamAV, but it is a fairly simple thing to install.

On a side note, eBay is not too smart IMO. Their real emails sometimes look a lot like phish, which must confuse the heck out of their customers. I am sure the bad guys like it though.

-Original Message-
From: satalk (sent by Nabble.com) [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 25, 2005 6:49 PM
To: users@spamassassin.apache.org
Subject: phish/bayes

I could not find any email in this forum addressing this issue - it does not mean there is not one - I just couldn't find it :)

MY question is as follows: Given that so many valid tokens from ebay/paypal sites exist in phish emails, am I correct in saying that it is imperative to avoid phish emails entering the bayes database?

Anthony

Sent from the SpamAssassin - Users forum at Nabble.com.
Re: phish/bayes
From: "Matt Kettler" <[EMAIL PROTECTED]>

> At 06:49 PM 8/25/2005, satalk (sent by Nabble.com) wrote:
> > MY question is as follows: Given that so many valid tokens from ebay/paypal sites exist in phish emails, am I correct in saying that it is imperative to avoid phish emails entering the bayes database?
>
> I would say it's imperative NOT to avoid training phish mails. To avoid training them is to intentionally poison your database.
>
> Don't ever avoid training a spam because it's got "ham like" content. This includes phish mails, "bayes poison" etc. Train them all. If it is spam, train it as spam. Period.
>
> Remember, your bayes DB can only be as accurate as your training is. If your training isn't realistic, your bayes db won't work well on realistic email.
>
> It's a common misconception that training ham-like spam will poison your bayes db. This problem might exist in very crude bayes implementations, but most bayes implementations, including SA, are largely immune to this.
>
> SA's use of chi-squared combining makes it very resistant to being "poisoned" into creating FPs by training nonspam text inside spam. Most tokens that are seen in both spam and ham are given very little weight by the chi-squared combining.
>
> On the other hand, failing to train those same messages makes SA very weak to having them FN in the future. If a token is only ever seen in ham it's given a very strong weight in the chi-squared combining.

I modify that a little. I see no huge benefit and potential bad side effects from indiscriminately training on every spam that comes through. Instead I look at the low scoring spams, the ones that just barely were caught. If they are not BAYES_99 already and have anything to train on other than a single URL I train on them. I also train on missed spam that has anything within it that can distinguish it. (That's anything beyond a single URL.)

I figure single URL emails are best caught on their second time around by the BLs in use. So far they always are. (Of course, most of them are caught by the specific geocities rule, anyway.)

SARE and Bayes are highly synergistic in making a reliable SpamAssassin, I find. (So is no automatic anything in the local.cf settings. Manual training uber alles.)

{^_-}
Re: phish/bayes
At 06:49 PM 8/25/2005, satalk (sent by Nabble.com) wrote:
> MY question is as follows: Given that so many valid tokens from ebay/paypal sites exist in phish emails, am I correct in saying that it is imperative to avoid phish emails entering the bayes database?

I would say it's imperative NOT to avoid training phish mails. To avoid training them is to intentionally poison your database.

Don't ever avoid training a spam because it's got "ham like" content. This includes phish mails, "bayes poison" etc. Train them all. If it is spam, train it as spam. Period.

Remember, your bayes DB can only be as accurate as your training is. If your training isn't realistic, your bayes db won't work well on realistic email.

It's a common misconception that training ham-like spam will poison your bayes db. This problem might exist in very crude bayes implementations, but most bayes implementations, including SA, are largely immune to this.

SA's use of chi-squared combining makes it very resistant to being "poisoned" into creating FPs by training nonspam text inside spam. Most tokens that are seen in both spam and ham are given very little weight by the chi-squared combining.

On the other hand, failing to train those same messages makes SA very weak to having them FN in the future. If a token is only ever seen in ham it's given a very strong weight in the chi-squared combining.
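The effect described above can be illustrated with a small sketch. This is not SpamAssassin's code (SA is written in Perl and its tokenizer and constants differ); it is a minimal Python approximation of Robinson-style token probabilities combined with Fisher's chi-squared method. The token counts, the smoothing constants s and x, and all function names are assumptions made up for the example.

```python
import math


def chi2q(x2, v):
    """Upper-tail probability of a chi-squared value x2 with even v degrees of freedom."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def token_prob(spam_hits, ham_hits, nspam, nham, s=1.0, x=0.5):
    """Spam probability of one token, pulled toward x when evidence is thin."""
    n = spam_hits + ham_hits
    if n == 0:
        return x  # never-seen token carries no evidence
    spam_ratio = spam_hits / nspam
    ham_ratio = ham_hits / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    return (s * x + n * p) / (s + n)


def combine(probs):
    """Chi-squared combining: 0.0 is hammy, 1.0 is spammy, 0.5 is undecided."""
    n = len(probs)
    if n == 0:
        return 0.5
    spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spamminess - hamminess + 1.0) / 2.0


def score_message(tokens, counts, nspam, nham):
    """counts maps token -> (spam_hits, ham_hits); unknown tokens count as (0, 0)."""
    probs = [token_prob(*counts.get(t, (0, 0)), nspam, nham) for t in tokens]
    return combine(probs)


# Hypothetical databases of 100 spam and 100 ham messages each.
untrained = {"paypal": (0, 20), "account": (0, 20), "verify": (0, 0)}
trained = {"paypal": (20, 20), "account": (20, 20), "verify": (20, 0)}
phish = ["paypal", "account", "verify"]

if __name__ == "__main__":
    print("phish never trained:", score_message(phish, untrained, 100, 100))
    print("phish trained:      ", score_message(phish, trained, 100, 100))
```

With these made-up counts, the shared tokens sit at 0.5 once phish has been trained and contribute almost nothing, so they do not drag real PayPal mail toward spam. When phish is never trained, the same tokens are ham-only, look strongly hammy, and pull the phish toward the ham end of the scale.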
Re: phish/bayes
>MY question is as follows: Given that so many valid tokens from ebay/paypal sites exist in phish emails, am I correct in saying that it is imperative to avoid phish emails entering the bayes database?

Probably not. A lot of them use links from ebay/paypal/whoever, but a lot of them pick up the links from Geocities or the like. Some of the better ones do a good job of getting the text right, and that could be a problem. But the vast majority are written by non-English speakers, and the results are close to butchered jabber. Ought to make some really nice bayes tokens only associated with spam and maybe the leet-speak crowd.

Ok, I just looked at some real Paypal mails. They all get BAYES_00. Looking at three recent paypal phish, they are all getting BAYES_50 to BAYES_60. Of course, I don't auto-train, and I don't know that I've ever bothered feeding paypal phish to bayes specifically, although it has likely seen the occasional one in a batch of spam.

Loren
RE: phish/bayes
> From: Thomas Cameron [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 25, 2005 6:03 PM
> To: users@spamassassin.apache.org
> Subject: Re: phish/bayes
>
> On Thu, 2005-08-25 at 15:49 -0700, satalk (sent by Nabble.com) wrote:
> > I could not find any email in this forum addressing this issue - it
> > does not mean there is not one - I just couldn't find it :)
> >
> > MY question is as follows:
> > Given that so many valid tokens from ebay/paypal sites exist in phish
> > emails, am I correct in saying that it is imperative to avoid phish
> > emails entering the bayes database?
>
> It has been my experience that the more of them I teach
> Bayes, the less get through. None of my legit eBay/PayPal
> e-mail has been tagged.

Mine too -- and we likely need to remind the original poster that it is VERY important to also train some VALID emails from the real source that such phishes are targeting.

This puts the real mail's words in as tokens and means that the words found in both types will not be strong indicators of spam (or ham), so other differences will be used to make the estimate.

--
Herb Martin
Re: phish/bayes
On Thu, 2005-08-25 at 15:49 -0700, satalk (sent by Nabble.com) wrote:
> I could not find any email in this forum addressing this issue - it does not
> mean there is not one - I just couldn't find it :)
>
> MY question is as follows:
> Given that so many valid tokens from ebay/paypal sites
> exist in phish emails, am I correct in saying that it is
> imperative to avoid phish emails entering the bayes database?

It has been my experience that the more of them I teach Bayes, the less get through. None of my legit eBay/PayPal e-mail has been tagged.

Thomas