It's one of Declude's undocumented tests.  I found a bunch of them in the release notes on his site (link at the bottom of the manual page) and then searched the archives for comments about them.  I also found a few simply by reading people's config files on this board.

This test, a.k.a. SUBJECTSPACES, simply counts the number of spaces in a subject line.  Spammers will often show a subject, then a bunch of spaces, and then some gibberish.  It will also score on some very long subjects, which are not common in real E-mail.  The scoring is additive as higher levels are hit, and you can customize those levels.
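
As a rough illustration of the additive scoring (not Declude's actual implementation), here is a minimal Python sketch.  The function name, the LEVELS table and the "greater than or equal" comparison are all assumptions made for the example; the thresholds and weights just mirror the 10/20/30 levels used in the config lines quoted further down this thread.

```python
# Hypothetical levels: (minimum number of spaces, points added).
# These mirror the three SUBSPACE-* lines quoted in this thread.
LEVELS = [(10, 1), (20, 2), (30, 3)]


def subject_space_score(subject, levels=LEVELS):
    """Add up the points of every level that the space count reaches."""
    spaces = subject.count(" ")
    # Assumption: a level fires when the count is >= its threshold.
    return sum(points for threshold, points in levels if spaces >= threshold)


if __name__ == "__main__":
    samples = [
        "Re: new mail REgnfqnKQT",
        "drive luxury cars and get paid" + " " * 10 + "L0z[7J4aYq!F7P1",
        "new mail" + " " * 40 + "REgnfqnKQT",
    ]
    for subject in samples:
        print(subject_space_score(subject), repr(subject))
```

Because every level that is reached adds its own points, a subject padded with 40 spaces scores on all three levels, while an ordinary subject scores nothing.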

Matt


Marc Catuogno wrote:
I'm not familiar with this test?

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Matthew Bramble
Sent: Wednesday, September 10, 2003 10:27 PM
To: [EMAIL PROTECTED]
Subject: Re: [Declude.JunkMail] Strange Subject

Add the following tests and it gets even better :)

SUBSPACE-10        subjectspaces    10    x    1    0
SUBSPACE-20        subjectspaces    20    x    2    0
SUBSPACE-30        subjectspaces    30    x    3    0

Matt


Dan Patnode wrote:

  
I did a scan of all uncaught spam from the last week, found all the
ones with Q, removed the QU's and ended up with this list.  All of
these would have been seen by Matt's new config:
  
Subject: Block those unwanted Popups yqvqk
Subject: drive luxury cars and get paid          9xP%oY5NzPG\q2G
Subject: drive luxury cars and get paid          L0z[7J4aYq!F7P1
Subject: drive luxury cars and get paid 9xP%oY5NzPG\q2G
Subject: drive luxury cars and get paid L0z[7J4aYq!F7P1
Subject: FW: Block those unwanted Popups yqvqk
Subject: FW: drive luxury cars and get paid          9xP%oY5NzPG\q2G
Subject: FW: drive luxury cars and get paid          L0z[7J4aYq!F7P1
Subject: FW: get that extra boost in the bed uvqtc qqyixu 
Subject: FW: new mail                                        REgnfqnKQT
Subject: Fw: :( would u mind if i .. jqvmoiqfkzkokdwns u
  
Subject: get that extra boost in the bed uvqtc qqyixu
Subject: get that extra boost in the bed uvqtc qqyixu
Subject: Re: new mail                                        REgnfqnKQT
Subject: Re: new mail REgnfqnKQT
Subject: Stop messages SPAM po p  vyoaejswayqo
Subject: [Fwd: =?GB2312?B?0OnE4r/VvOS089PFu92jrDE5OdSqv8nS1L2o0ru49s341b6jrA==?==?GB2312?B?uM+/7LW9d3d3LjA3NTVzei5jb23J6sfrsMld?=
  
Dan




On Wednesday, September 10, 2003 17:45, Matthew Bramble <[EMAIL PROTECTED]> wrote:
  
 

    
How about 4 different super tests?  I fail automatically on
=?ISO-8859-1?B?, and that accounts for more than 1% of the
E-mail coming in to my server, but only a handful of additional
catches in what was being missed...no false positives.  I think
I've mentioned often enough the other tests that I would like
to have: a BODYTEXT filter that searches just a decoded
non-HTML body, a NOTEXT test for nothing but spaces, returns
and attachments (that's a key) after decoding and
de-HTMLifying, and a TEXTCOUNT marquee test that would allow
you to score on the amount of non-HTML decoded body text just
like SUBJECTSPACES and BCC, but in reverse (the less there
is, the higher the score).  I could catch so much crap with
those 40 or so two character gibberish strings, in fact I think
it was properly tagging around 10% to 20% of all unique
incoming messages today if not more.  That gibberish subject
filter is tagging over 5% by itself, and with perfect accuracy
so far.  A functional gibberish body filter though would have a
reasonable number of false positives (was tagging buy.com links
that were shown in displayable text, for instance).  Of course,
I don't expect Scott to rush to my aid here.
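
To make the TEXTCOUNT idea concrete, here is a minimal Python sketch of "the less displayable text, the higher the score".  Nothing like this exists in Declude as far as this thread says; the function names, the LEVELS table and the character limits are all hypothetical, and the sketch only strips tags and comments (it does not decode MIME parts or skip script/style content).

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes only; tags and HTML comments are dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text_length(html_body):
    """Count non-whitespace characters left after removing the markup."""
    parser = _TextOnly()
    parser.feed(html_body)
    parser.close()
    return len("".join("".join(parser.chunks).split()))


# Hypothetical levels: (maximum visible characters, points added).
LEVELS = [(0, 3), (20, 2), (100, 1)]


def text_count_score(html_body, levels=LEVELS):
    """Reverse additive scoring: every limit the body stays under adds points."""
    length = visible_text_length(html_body)
    return sum(points for limit, points in levels if length <= limit)
```

A body that is nothing but a linked image has no text nodes at all, so it would hit every level, while a normal message with a few sentences of text would score nothing.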

I have managed, though, to add tests for SUBJECTSPACES (very
effective), COMMENTS (effective) and BCC (just ok), along with
some small key word/phrase filters for the body, subject and
sender with very good success.  I only saw about 5 definitive
false positives today out of around 3000 unique messages, but
approximately 150 pieces of spam got through.  I think that
could be reduced by as much as half without a measurable impact
on the false positives.  If that doesn't work, I'm buying a gun
:)

BTW, on Linux, my guru buddy recommends Postfix as the SMTP
client and Webmin as the interface.  I don't dispute Sandy's
faith in MS SMTP, though, and it can be run on the same box as
IMail.

Matt




Dan Patnode wrote:

FYI, I pulled this test 3 weeks ago after an email from France
came through (or rather didn't) with this subject:

Subject: =?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?=

There definitely is a correlation here among spammers, ?B?
encoded subjects, disposable domain names, and nothing else in
the body of the message.  There has to be a way to bring the 2
or 3 variables together as a super test.


Dan


On Monday, September 8, 2003 19:05, Matthew Bramble <[EMAIL PROTECTED]> wrote:
  
Use a text filter and add something like:

SUBJECT 40 CONTAINS =?ISO-8859-1?b?

to it.

I tried this all the way down to just ?b? and a SUBJECT filter
didn't catch it.  The SUBJECT filter also doesn't catch the
decoded text.

I found, though, that the HEADERS filter will catch this.
Customize to suit: this will only catch Latin-1 that is base64
encoded.  I can't think of why that would be necessary, since I
would think that only other character sets could need it:

   HEADERS        10    CONTAINS    ISO-8859-1?B?

Neither the HEADERS filter nor the SUBJECT filter is catching
the decoded form of the text.  The BASE64 test is also not
catching this if it's only in the Subject of the message (I
assume it only does the body/attachments).
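
For anyone who wants to see what such a subject actually says, here is a small Python sketch using the standard library's email.header module.  It is only an illustration of the idea, not of how Declude's filters work internally; the helper names and the regular expression are made up for the example.

```python
import re
from email.header import decode_header, make_header

# Matches the start of an RFC 2047 encoded-word: =?charset?encoding?
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?")


def has_base64_latin1(raw_subject):
    """True if the raw subject holds a base64 encoded-word marked ISO-8859-1."""
    return any(
        charset.lower() == "iso-8859-1" and encoding.lower() == "b"
        for charset, encoding in ENCODED_WORD.findall(raw_subject)
    )


def decoded_subject(raw_subject):
    """Return the subject as readable text, decoding any encoded-words."""
    return str(make_header(decode_header(raw_subject)))


if __name__ == "__main__":
    raw = "=?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?="
    print(has_base64_latin1(raw))
    print(decoded_subject(raw))
```

The check on the raw header is the part that corresponds to a HEADERS CONTAINS match; the decoding step is what a filter would need in order to match on the readable text instead.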

The not so funny thing is that I'm getting this now as a part
of those E-mails containing no displayable text.  This guy is
real good at getting through my settings unless he chooses a
bad IP to send from.  I think a few days ago, another person on
this list commented about this same spammer, bringing up the
domains that he is using (common words followed by numbers). 
The only pattern this guy leaves, apart from having no text in
the body, is having different countries' TLDs listed in the
Received line, the sender, and the reverse DNS.  Here's a copy
of what I just received using this technique (with links
modified):


  

>From - Mon Sep 08 17:36:44 2003


X-UIDL: 314612976
X-Mozilla-Status: 0011
X-Mozilla-Status2: 00000000
Received: from gjr.paknet.com.pk [81.128.130.33] by igaia.com with ESMTP
  (SMTPD32-7.13) id A6244F101D8; Mon, 08 Sep 2003 17:35:32 -0400
Date: Mon, 08 Sep 2003 21:35:35 +0000
Message-ID: <[EMAIL PROTECTED]>
X-Mailer: Windows Eudora Pro Version 2.2 (32)
To: [EMAIL PROTECTED]
Subject: =?ISO-8859-1?B?UmU6T3JkZXIgU2lsZGVuYWZpbCBDaXRyYXRlICBmcm9tIGhvbWUgLSBubyBkb2N0b3IgcmVxdWlyZWQu?=
MIME-Version: 1.0
From: "Shirley Dalton" <[EMAIL PROTECTED]>
Content-Type: text/html
Content-Transfer-Encoding: 8bit
X-Declude-Sender: [EMAIL PROTECTED] [81.128.130.33]
X-Declude-Spoolname: Df62404f101d89e2c.SMD
X-Note: This E-mail was scanned by iGaia Incorporated's E-mail
service (www.igaia.com) for spam.
X-Note: This E-mail was sent from
host81-128-130-33.in-addr.btopenworld.com ([81.128.130.33]).
X-Spam-Tests-Failed: DSN, IPNOTINMX, NOLEGITCONTENT [1]
X-RCPT-TO: <[EMAIL PROTECTED]>
Status: U
X-UIDL: 314612976

<html><body>
<center><!--lfoln42j66--><a
href="http://www-dot-payment33dd-dot-com/host/default.asp?ID=omni"><img
src="http://discountrate2-dot-com/pics/gv1.gif" height="270"
width="405"></a></center>
  
</html></body>
   

      
