Re: sudden deluge of university spams

2006-06-23 Thread Loren Wilton
 Wow SA is doing a lot of work already. Can I also have a collapsed body
 string with all whitespaces removed

You would have to write a plugin for this.  Keep in mind that this technique
can lead to unexpected FPs pretty easily.  Once you eliminate all the spaces
(and possibly other punctuation) you only have a string of letters and you
have to guess where the words are.  It is really easy to guess wrong and
come up with no-no words that didn't exist in the source, even if you don't
check for obfu cases.

I'm not saying 'don't do this', although others probably will.  I will say
'don't check for single words' if you do this.  I wouldn't want to try it
checking for anything less than about 15 letters in sequence, and probably
20-25 if I include obfu techniques.

This would probably be a real nice area of study for someone in need of a
thesis.  If you eliminate all spaces/punctuation and only leave the letters,
what are the statistics relating to pulling other nonexistent words out of
the resulting letter stream?  How long on average?  How frequent?  Etc.

Loren



Re: sudden deluge of university spams

2006-06-23 Thread Ron Johnson
Ramprasad writes:
 
   I am doing regex match something like
   /1 *- *2 *2 *- *3 *3 */
  
   Any inputs ?
  
  Yes, as SA collapses multiple spaces down to a single space (in 'body'
  tests), you only need to look for a single instance of the space,
  not an unlimited number. Also you can omit that final ' *' as it's
  an optional tail match, thus the rule will work without it.
  
  IE:
/1 ?- ?2 ?2 ?- ?3/
 
 Wow SA is doing a lot of work already. Can I also have a collapsed body
 string with all whitespaces removed
 so I could do 
 
 collapsedbody BADNUMBER /1-22-33/ 
 score BADNUMBER 10
 
 I think this will also help get rid of the 
 genu ine   uni versity  degre es
 
 
With the side issue of "The pen is mightier than the sword"

and many other potential accidents. IOW handle with care.
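
To make the accident concrete, here is a minimal Python sketch. It assumes a hypothetical single-word rule and imitates the proposed 'collapsedbody' by deleting every whitespace character; it is not how SpamAssassin itself processes a body.

```python
import re

def collapse_all_whitespace(text: str) -> str:
    """Remove every whitespace character, as the proposed 'collapsedbody' would."""
    return re.sub(r"\s+", "", text)

# Hypothetical single-word rule, used only for this illustration.
WORD_RULE = re.compile(r"penis", re.IGNORECASE)

sentence = "The pen is mightier than the sword"
collapsed = collapse_all_whitespace(sentence)

# The ordinary sentence is clean; the collapsed one produces a spurious hit.
hit_before = WORD_RULE.search(sentence) is not None
hit_after = WORD_RULE.search(collapsed) is not None
print(collapsed, hit_before, hit_after)
```

The word only exists once the spaces are gone, which is the kind of FP the replies above warn about.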




RE: sudden deluge of university spams

2006-06-23 Thread David B Funk
On Fri, 23 Jun 2006, Ramprasad wrote:

  Yes, as SA collapses multiple spaces down to a single space (in 'body'
  tests), you only need to look for a single instance of the space,
  not an unlimited number. Also you can omit that final ' *' as it's
  an optional tail match, thus the rule will work without it.
 
  IE:
/1 ?- ?2 ?2 ?- ?3/

 Wow SA is doing a lot of work already. Can I also have a collapsed body
 string with all whitespaces removed
 so I could do

 collapsedbody BADNUMBER /1-22-33/
 score BADNUMBER 10

 I think this will also help get rid of the
 genu ine   uni versity  degre es

You do -NOT- want this. As others have already pointed out, you
can no longer determine word boundaries, which increases FP rates.
But the real reason is that you would be throwing away the one
gift that the spammers have handed you: a good indication for
separating spam from ham.
The bane of spam-fighters is FPs. Any good clues should be treasured,
not discarded.

In our environment, we have discussions of 'degrees' often.
However I've never seen a legit discussion of 'degr ees' nor 'deg rees'
etc. So a simple negative-lookahead rule makes it easy to
whack the borked version but not FP on the correct one, e.g.:

  body FAKE_DEGREE1 /\b(?!degrees)d ?e ?g ?r ?e ?e ?s/i

will match any variant of 'degrees' with spaces inserted between the
letters but won't hit the word 'degrees' itself.
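
The behaviour of that lookahead can be checked outside SpamAssassin. The following is a small Python sketch that reuses the same pattern, with re.IGNORECASE standing in for the /i flag; the sample strings are made up for illustration.

```python
import re

# Same pattern as the FAKE_DEGREE1 rule; re.IGNORECASE stands in for /i.
FAKE_DEGREE = re.compile(r"\b(?!degrees)d ?e ?g ?r ?e ?e ?s", re.IGNORECASE)

def is_fake_degree(text: str) -> bool:
    """True if the text contains a spaced-out 'degrees' but not the plain word."""
    return FAKE_DEGREE.search(text) is not None

# Made-up sample lines: one legitimate, three obfuscated.
samples = [
    "get degrees fast",
    "get degr ees fast",
    "get deg rees fast",
    "d e g r e e s",
]
for s in samples:
    print(repr(s), is_fake_degree(s))
```

The negative lookahead rejects the position where the intact word starts, so only the broken spellings get through to the rest of the pattern.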

A well-fed Bayes plus a few rules of this type made this particular
spam a non-issue here. ;)

Bottom line: when spammers obfuscate words they usually make it -easier-
to catch, not harder. ;)

Dave

-- 
Dave Funk                                University of Iowa
dbfunk (at) engineering.uiowa.edu        College of Engineering
319/335-5751   FAX: 319/384-0549         1256 Seamans Center
Sys_admin/Postmaster/cell_admin          Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{


RE: sudden deluge of university spams

2006-06-22 Thread Chris Santerre

 -Original Message-
 From: Ramprasad [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, June 22, 2006 2:39 AM
 To: users@spamassassin.apache.org
 Subject: sudden deluge of university spams
 
 
 Hi,
 My servers are suddenly facing a deluge of university spams, all of the
 "get gen uine de grees from pr estigious univers ities" type.
 
 These mails have no URLs or email addresses, just some phone numbers to
 call back. And the spammers are using some virgin routes, so they don't
 hit the RCVD_IN_* rules either.
 
 For now I have written my own rulesets to catch these mangled words,
 but I am surprised there aren't rules in SARE etc. to catch such words
 already.


There's a reason. The amount of permutations is ridiculous. But SARE has Evilnumbers which catches these. 


--Chris





RE: sudden deluge of university spams

2006-06-22 Thread Craig Baird

Quoting Chris Santerre [EMAIL PROTECTED]:


There's a reason. The amount of permutations is ridiculous. But SARE has
Evilnumbers which catches these.


Except that evilnumbers hasn't been updated in over a year   :-)

I've been writing custom rules to block the phone numbers used in these.  You
could write rules for the wording, but like Chris said, it changes so often
that it's a very fast-moving target.  It's probably much more difficult for
the spammer to change their phone number than to change the text of their
e-mails, so write a rule for the phone number, and then score it through the
roof.

I've noticed that it usually takes a handful of phone number rules to stop
these spams for a while, then the spammer changes numbers, and you have to do
it all over again.  Modifications in the phone number format are also a small
challenge.  For example 555.555., 555-555-, 555 555 , 555-
555-, (555)555., etc etc.  So you have to write your rules to take
that into account.
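
One way to cover those format changes is to build the pattern from the bare digits and allow a short run of separator characters between them. The Python sketch below shows the idea; the helper name phone_pattern and the number 555-555-0100 are invented for this illustration, and the result would still have to be transcribed into a body rule by hand.

```python
import re

def phone_pattern(digits: str) -> re.Pattern:
    """Build a regex matching the digits with optional separators between them."""
    sep = r"[\s.()-]{0,2}"  # up to two separator characters between digits
    body = sep.join(re.escape(d) for d in digits)
    # Refuse to match inside a longer run of digits; allow a leading "(".
    return re.compile(r"(?<!\d)\(?" + body + r"(?!\d)")

# 555-555-0100 is a made-up number, not one taken from any spam.
rule = phone_pattern("5555550100")

variants = [
    "555.555.0100",
    "555-555-0100",
    "555 555 0100",
    "(555)555.0100",
    "(555) 555-0100",
]
for v in variants:
    print(v, bool(rule.search(v)))
```

Capping the separator run at two characters keeps the rule from matching digits that are scattered widely through unrelated text.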

Craig




RE: sudden deluge of university spams

2006-06-22 Thread Ramprasad

 There's a reason. The amount of permutations is ridiculous. But SARE
 has Evilnumbers which catches these. 

Isn't the Evilnumbers ruleset too heavy?

But the numbers are also mangled, e.g. 1-22-33 could be written in
numerous ways just by adding spaces in between randomly.
I am doing a regex match something like
/1 *- *2 *2 *- *3 *3 */

Any inputs ? 

Thanks
Ram
 





RE: sudden deluge of university spams

2006-06-22 Thread Chris Santerre


 -Original Message-
 From: Craig Baird [mailto:[EMAIL PROTECTED]]
 Sent: Thursday, June 22, 2006 11:46 AM
 To: users@spamassassin.apache.org
 Subject: RE: sudden deluge of university spams
 
 
 Quoting Chris Santerre [EMAIL PROTECTED]:
 
  There's a reason. The amount of permutations is ridiculous. 
 But SARE has
  Evilnumbers which catches these.
 
 Except that evilnumbers hasn't been updated in over a year :-)
 
People used to post new numbers to this list for SARE to add. They stopped. We only have so many spam traps :) 


--Chris 





Re: sudden deluge of university spams

2006-06-22 Thread qqqq


   There's a reason. The amount of permutations is ridiculous. But SARE has
   Evilnumbers which catches these.

  Except that evilnumbers hasn't been updated in over a year :-)

 People used to post new numbers to this list for SARE to add. They stopped.
 We only have so many spam traps :)

Here's my contribution..


body BRIAN_PHONE_NUMBERS /8.?1.?6.?8.?1.?7.?0.?9.?1.?7|2.?0.?6.?3.?5.?0.?3.?7.?3.?7|2.?0.?6.?9.?8.?4.?2.?3.?2.?7|2.?0.?6.?3.?3.?3.?0.?0.?5.?1|2.?0.?6.?9.?8.?4.?0.?1.?0.?6|3.?3.?8.?3.?5.?7.?9|2.?0.?6.?3.?3.?8.?6.?0.?6.?1|2.?0.?6.?2.?0.?2.?2.?0.?3.?3|2.?0.?6.?3.?3.?7.?1.?8.?8.?3|2.?0.?6.?3.?3.?8.?3.?5.?7.?9|9.?2.?8.?.?4.?9.?6.?2.?8.?0.?5|2.?0.?6.?3.?0.?9.?0.?6.?7.?3|2.?0.?6.?3.?5.?0.?4.?7.?8.?5|2.?0.?6.?3.?5.?0.?6.?4.?0.?4|2.?0.?6.?9.?8.?4.?1.?7.?0.?5|2.?0.?6.?3.?3.?7.?1.?9.?6.?8|2.?0.?6.?9.?8.?4.?0.?8.?3.?3|2.?0.?6.?2.?0.?2.?1.?6.?4.?1|2.?0.?6.?6.?6.?6.?5.?5.?1.?0|9.?0.?4.?2.?1.?2.?0.?0.?9.?3|2.?0.?6.?3.?3.?9.?6.?2.?8.?5|2.?0.?8.?3.?3.?0.?0.?0.?9.?3|2.?0.?8.?4.?7.?4.?3.?6.?0.?3|3.?0.?9.?4.?0.?4.?.?7.?7.?8.?1|4.?8.?4.?6.?9.?3.?8.?8.?6.?1|2.?0.?6.?6.?0.?0.?7.?9.?0.?4|2.?1.?5.?6.?8.?9.?7.?3.?7.?9/
describe BRIAN_PHONE_NUMBERS Phone number or address pulled from spam
score BRIAN_PHONE_NUMBERS 3.5


Re: sudden deluge of university spams

2006-06-22 Thread Sandy S

- Original Message - 
From: Ramprasad [EMAIL PROTECTED]
To: Chris Santerre [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Thursday, June 22, 2006 10:46 AM
Subject: RE: sudden deluge of university spams



  There's a reason. The amount of permutations is ridiculous. But SARE
  has Evilnumbers which catches these.

 Is the Evilnumbers ruleset not too heavy

 But the numbers are also mangled
 eg
 1-22-33 could be written in numerous ways just adding  spaces in between
 randomly
 I am doing regex match something like
 /1 *- *2 *2 *- *3 *3 */

 Any inputs ?

 Thanks
 Ram




Except now I'm starting to see numbers like +1 - EIGHT 3 1 - THREE ZERO
ZERO - SIX SIX FOUR THREE.
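
Spelled-out digits like these defeat a digits-only regex. As a rough illustration of what normalising them would involve, here is a Python sketch; the function extract_digits is hypothetical, it is not part of SpamAssassin, and doing this inside SA would presumably need a plugin or a much longer regex with alternations such as (?:8|eight).

```python
import re

# Spelled-out digit words mapped to their digit characters.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def extract_digits(text: str) -> str:
    """Return the digits in text, reading spelled-out digit words as digits."""
    out = []
    # Tokens are either a run of letters or a single digit; the rest is skipped.
    for tok in re.findall(r"[A-Za-z]+|\d", text):
        if tok.isdigit():
            out.append(tok)
        elif tok.lower() in DIGIT_WORDS:
            out.append(DIGIT_WORDS[tok.lower()])
    return "".join(out)

sample = "+1 - EIGHT 3 1 - THREE ZERO ZERO - SIX SIX FOUR THREE"
print(extract_digits(sample))
```

Note that ordinary prose containing words like "one" or "two" would also yield digits, so this carries the same FP risk discussed elsewhere in the thread and would need to be paired with a specific number list.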

Sandy




Re: sudden deluge of university spams

2006-06-22 Thread Kelson

Ramprasad wrote:

Hi,
  My servers are suddenly facing a deluge of university spams, all of the
"get gen uine de grees from pr estigious univers ities" type.


These mails have no URLs or email addresses, just some phone numbers to
call back. And the spammers are using some virgin routes, so they don't
hit the RCVD_IN_* rules either.


I've been seeing them too, but they're all being caught.  The main rules 
that they seem to hit are Bayes, Razor, SARE_SPEC_DIPLOMA and 
TVD_FUZZY_DEGREE BODY (which I think is one of the rules you get by 
running sa-update).


So my recommendations would be (assuming you haven't done these already):
Run sa-update
Turn on Razor2 and Bayes
Grab the sare_specific ruleset
Run sa-learn on the messages.

--
Kelson Vibber
SpeedGate Communications www.speed.net


Re: sudden deluge of university spams

2006-06-22 Thread jdow

From: Sandy S [EMAIL PROTECTED]

From: Ramprasad [EMAIL PROTECTED]


 There's a reason. The amount of permutations is ridiculous. But SARE
 has Evilnumbers which catches these.

Is the Evilnumbers ruleset not too heavy

But the numbers are also mangled
eg
1-22-33 could be written in numerous ways just adding  spaces in between
randomly
I am doing regex match something like
/1 *- *2 *2 *- *3 *3 */

Any inputs ?

Thanks
Ram





Except now I'm starting to see numbers like +1 - EIGHT 3 1 - THREE ZERO
ZERO - SIX SIX FOUR THREE.


Either I have not received any of these messages or BLs and BAYES have
conspired to eliminate them. Of course, with individual BAYES it is
effective to raise the score to say 5.001 or the like. I've not seen
any of this spam in ages.

{^_^}


RE: sudden deluge of university spams

2006-06-22 Thread David B Funk
On Thu, 22 Jun 2006, Ramprasad wrote:

 Is the Evilnumbers ruleset not too heavy

 But the numbers are also mangled
 eg
 1-22-33 could be written in numerous ways just adding  spaces in between
 randomly
 I am doing regex match something like
 /1 *- *2 *2 *- *3 *3 */

 Any inputs ?

Yes, as SA collapses multiple spaces down to a single space (in 'body'
tests), you only need to look for a single instance of the space,
not an unlimited number. Also you can omit that final ' *' as it's
an optional tail match, thus the rule will work without it.

IE:
  /1 ?- ?2 ?2 ?- ?3/
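
The difference between the two patterns can be tried out in a few lines of Python. This sketch only approximates the normalisation described above by turning every run of whitespace into one space; treating tabs the same way as spaces is an assumption here, not something confirmed in this thread.

```python
import re

def collapse_spaces(text: str) -> str:
    """Approximate the normalisation described above: whitespace runs -> one space."""
    return re.sub(r"\s+", " ", text)

LOOSE = re.compile(r"1 *- *2 *2 *- *3 *3 *")  # the original pattern
TIGHT = re.compile(r"1 ?- ?2 ?2 ?- ?3")       # the pattern suggested above

# Made-up sample text with irregular spacing.
raw = "call 1   -  2 2 -   33 now"
body = collapse_spaces(raw)

print(repr(body))
print("tight on collapsed body:", bool(TIGHT.search(body)))
print("tight on raw text:      ", bool(TIGHT.search(raw)))
print("loose on raw text:      ", bool(LOOSE.search(raw)))
```

The tighter pattern relies entirely on the body having been normalised first; against raw text with repeated spaces it would miss.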


-- 
Dave Funk                                University of Iowa
dbfunk (at) engineering.uiowa.edu        College of Engineering
319/335-5751   FAX: 319/384-0549         1256 Seamans Center
Sys_admin/Postmaster/cell_admin          Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{


Re: sudden deluge of university spams

2006-06-22 Thread jdow

From: David B Funk [EMAIL PROTECTED]


On Thu, 22 Jun 2006, Ramprasad wrote:


Is the Evilnumbers ruleset not too heavy

But the numbers are also mangled
eg
1-22-33 could be written in numerous ways just adding  spaces in between
randomly
I am doing regex match something like
/1 *- *2 *2 *- *3 *3 */

Any inputs ?


Yes, as SA collapses multiple spaces down to a single space (in 'body'
tests), you only need to look for a single instance of the space,
not an unlimited number. Also you can omit that final ' *' as it's
an optional tail match, thus the rule will work without it.


Out of curiosity what would SA do with <tab><space><tab><space>?
One hopes it is smart enough to collapse that, now. Indications in
the past from writing rules files indicate that this is not a normal
parsing feature of perl.

(body<tab><space><space>RuleName<space><tab>{stuff} has given me
trouble in the past. These days I am careful not to mix spaces and
tabs so I don't know for sure if this has been solved.)

{^_^}


RE: sudden deluge of university spams

2006-06-22 Thread Ramprasad
  I am doing regex match something like
  /1 *- *2 *2 *- *3 *3 */
 
  Any inputs ?
 
 Yes, as SA collapses multiple spaces down to a single space (in 'body'
 tests), you only need to look for a single instance of the space,
 not an unlimited number. Also you can omit that final ' *' as it's
 an optional tail match, thus the rule will work without it.
 
 IE:
   /1 ?- ?2 ?2 ?- ?3/

Wow SA is doing a lot of work already. Can I also have a collapsed body
string with all whitespaces removed
so I could do 

collapsedbody BADNUMBER /1-22-33/ 
score BADNUMBER 10

I think this will also help get rid of the 
genu ine   uni versity  degre es


Thanks
Ram