Re: Multiple subject headers - most blank

2014-12-05 Thread Joe Quinn

On 12/5/2014 1:19 PM, Gibbs, David wrote:
> On 12/5/2014 11:25 AM, John Hardin wrote:
>>> FWIW: here's the rule I came up with ... seems to work adequately.
>>>
>>> header __COUNT_SUBJ Subject =~ /.*/
>>
>> You might want to be a little bit more paranoid and explicitly anchor
>> that:
>>
>>    header __COUNT_SUBJ Subject =~ /^.*$/
>>
>> I know .* is greedy and shouldn't overlap on multiple matches, but
>> this helps make sure.
>
> I tried that originally, but it didn't end up matching.
>
> Oddly, when I put the original rule "/.*/" in place, and ran a message
> with multiple subject lines through in debug ... I got the following
> relevant output:
>
> Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: " The Hottest Smartphones - Details Inside "
> Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
> Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
> Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
> Dec  5 12:09:52.033 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "The Hottest Smartphones - Details Inside"
> Dec  5 12:09:52.033 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
>
> I'm assuming "negative match" means that the rule didn't match.
>
> The message in question has 4 subject lines, the first appears to be
> encoded, 2 more that are blank, the 4th one is plain text.
>
> Example: http://code.midrange.com/4c731ced97.html
>
> Not sure why the rule is being applied 6 times.
>
> david

It most likely has to do with .* accepting a zero-width match. The regex
engine effectively considers every string to have a zero-width /thing/
between every pair of characters, and at both ends. The first match
consumes the whole string, leaving the cursor at the end. The next match
is that zero-width magic at the end of the text. To see this in action,
compare these two Perl one-liners:


perl -e '$x = "abc"; while ($x =~ //cg) {print "match\n";}'
# matches the empty spaces before 'a', and after 'a', 'b', and 'c' - 4 matches total

perl -e '$x = "abc"; while ($x =~ /./cg) {print "match\n";}'
# matches the characters 'a', 'b', and 'c' as you would expect
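
The same effect can be replayed against the Subject values from the debug output. This is only a rough sketch run outside SpamAssassin: it assumes the four Subject values reach the regex as a single string joined by newlines (an assumption about SA internals that nothing in this thread confirms), and count_matches is a helper name made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): count the matches of $re
# in $text under /g, roughly what "tflags multiple" would be counting.
sub count_matches {
    my ($text, $re) = @_;
    my $n = 0;
    $n++ while $text =~ /$re/g;
    return $n;
}

# ASSUMPTION: the four Subject values arrive as one newline-joined string.
my $subjects = join "\n",
    " The Hottest Smartphones - Details Inside ",
    "",
    "",
    "The Hottest Smartphones - Details Inside";

# /.*/ hits each non-empty line twice (text, then an empty match at its end)
# and each blank line once: 2 + 1 + 1 + 2 = 6.
printf "%-10s %d\n", '/.*/',    count_matches($subjects, qr/.*/);     # 6

# Without /m, ^ and $ only anchor to the whole string, and . cannot cross
# a newline, so nothing matches.
printf "%-10s %d\n", '/^.*$/',  count_matches($subjects, qr/^.*$/);   # 0

# With /m the anchors apply per line: exactly one match per line.
printf "%-10s %d\n", '/^.*$/m', count_matches($subjects, qr/^.*$/m);  # 4
```

Under that assumption the six hits line up with the debug output (text, three empty hits, text, one empty hit), and the zero count for the anchored pattern would be consistent with the anchored rule not matching. Whether /m gives the intended count inside an actual SA rule has not been tested here.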


Re: Multiple subject headers - most blank

2014-12-05 Thread Gibbs, David

On 12/5/2014 11:25 AM, John Hardin wrote:
>> FWIW: here's the rule I came up with ... seems to work adequately.
>>
>> header __COUNT_SUBJ Subject =~ /.*/
>
> You might want to be a little bit more paranoid and explicitly anchor that:
>
>    header __COUNT_SUBJ Subject =~ /^.*$/
>
> I know .* is greedy and shouldn't overlap on multiple matches, but this helps
> make sure.


I tried that originally, but it didn't end up matching.

Oddly, when I put the original rule "/.*/" in place, and ran a message with 
multiple subject lines through in debug ... I got the following relevant output:

Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: " The Hottest Smartphones - Details Inside "
Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
Dec  5 12:09:52.032 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"
Dec  5 12:09:52.033 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "The Hottest Smartphones - Details Inside"
Dec  5 12:09:52.033 [2459] dbg: rules: ran header rule __COUNT_SUBJ ==> got hit: "negative match"

I'm assuming "negative match" means that the rule didn't match.

The message in question has 4 subject lines, the first appears to be encoded, 2 
more that are blank, the 4th one is plain text.

Example: http://code.midrange.com/4c731ced97.html

Not sure why the rule is being applied 6 times.

david

--
IBM i on Power Systems: For when you can't afford to be out of business!

I'm riding a metric century (100 km / 62 miles) in the 2015 American Diabetes 
Association's Tour de Cure to raise money for diabetes research, education, 
advocacy, and awareness.  You can make a tax deductible donation to my ride by 
visiting http://email.diabetessucks.net.  My goal is $5500 but any amount is 
appreciated.

See where I get my donations from ... visit 
http://email.diabetessucks.net/mapdonations.php for an interactive map (it's a 
geeky thing).



Re: Multiple subject headers - most blank

2014-12-05 Thread John Hardin

On Fri, 5 Dec 2014, Gibbs, David wrote:
> On 12/4/2014 10:22 AM, Gibbs, David wrote:
>> I've seen a number of spam messages come through with multiple header
>> lines ... some of them are blank.
>>
>> Any suggestions for a rule to trap this?
>
> FWIW: here's the rule I came up with ... seems to work adequately.
>
> header __COUNT_SUBJ Subject =~ /.*/


You might want to be a little bit more paranoid and explicitly anchor 
that:


  header __COUNT_SUBJ Subject =~ /^.*$/

I know .* is greedy and shouldn't overlap on multiple matches, but this 
helps make sure.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Our government wants to do everything it can "for the children,"
  except sparing them crushing tax burdens.
---
 10 days until Bill of Rights day


Re: Multiple subject headers - most blank

2014-12-05 Thread Gibbs, David

On 12/4/2014 10:22 AM, Gibbs, David wrote:
> I've seen a number of spam messages come through with multiple header
> lines ... some of them are blank.
>
> Any suggestions for a rule to trap this?


FWIW: here's the rule I came up with ... seems to work adequately.

header __COUNT_SUBJ Subject =~ /.*/
tflags __COUNT_SUBJ multiple

meta DMG_MULT_SUBJ (__COUNT_SUBJ > 2)
score DMG_MULT_SUBJ 1.0
describe DMG_MULT_SUBJ Message has more than one subject header

The __COUNT_SUBJ rule isn't behaving as I would expect, though.  If the meta rule 
says > 1 and the message has only one subject, the rule still fires.  That's why 
I have it set to > 2 instead.
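
One possible explanation for the off-by-one, sketched outside SpamAssassin and not verified against it: when a pattern is applied repeatedly, /.*/ matches a non-empty string twice, once for the text and once more as an empty string at its end. The subject text below is made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical single-Subject message; the text is invented for this sketch.
my $subject = "Just one subject line";

# In list context, /g returns every match. /.*/ first takes the whole text,
# then matches again as an empty string at the very end.
my @hits = $subject =~ /.*/g;

printf "%d matches\n", scalar @hits;   # prints "2 matches"
```

If the rule counts hits the same way, a message with a single Subject would already yield a count of 2, which would make a "> 1" test fire.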

david




Re: Multiple subject headers - most blank

2014-12-04 Thread Bertrand Caplet
> I've seen a number of spam messages come through with multiple header
> lines ... some of them are blank.
> 
> Subject:  
>   
> =?ISO-8859-1?Q?=20The=20Hotte?==?ISO-8859-1?Q?st=20Sm?==?ISO-8859-1?Q?ar?==?ISO-8859-1?Q?tpho?==?ISO-8859-1?Q?nes=20?==?ISO-8859-1?Q?-=20Det?==?ISO-8859-1?Q?ails=20In?==?ISO-8859-1?Q?side=20?=
> 
> Subject:
> Subject:
> Subject:   The Hottest Smartphones - Details Inside
> 
> Any suggestions for a rule to trap this?

Hey David,
What about the other headers (From, Return-Path, Received)? Are they the
same across multiple mails?
You could also simply train SA to learn that this is spam; I think that
would be OK.

Regards
-- 
CHUNKZ.NET - script kiddie and computer technician
Bertrand Caplet, Flers (FR)
Feel free to send encrypted/signed messages
Key ID: FF395BD9
GPG FP: DE10 73FD 17EB 5544 A491 B385 1EDA 35DC FF39 5BD9





Multiple subject headers - most blank

2014-12-04 Thread Gibbs, David

Folks:

I've seen a number of spam messages come through with multiple header lines ... 
some of them are blank.

Subject:

=?ISO-8859-1?Q?=20The=20Hotte?==?ISO-8859-1?Q?st=20Sm?==?ISO-8859-1?Q?ar?==?ISO-8859-1?Q?tpho?==?ISO-8859-1?Q?nes=20?==?ISO-8859-1?Q?-=20Det?==?ISO-8859-1?Q?ails=20In?==?ISO-8859-1?Q?side=20?=
Subject:
Subject:
Subject:   The Hottest Smartphones - Details Inside

Any suggestions for a rule to trap this?

Thanks!

david
