Re: Bogus authorize.net statements

2012-08-15 Thread Kevin A. McGrail

On 8/15/2012 11:06 AM, Jim Schueler wrote:
Upon Kevin's recommendation, I upgraded.  Big difference.  'Though 
there's a bit of a retuning penalty.

Woohoo, I was right!  All I did was flip a coin, though ;-)
I get quite a few authorize.net notifications 
on behalf of various ecommerce clients, and this morning I started 
seeing scam/spam similar to the attached.  All share a common marker 
of embedding a text url within an HTML a tag containing a different 
URL.  This seems like an obvious marker for spam, I wonder why there 
isn't a rule for it.


There are many patterns that show up in spam that unfortunately show up 
in ham as well.  If my memory serves me correctly, this just isn't 
indicative of either spam or ham.


HOWEVER, some mail systems with good glue like MIMEDefang can do things 
like disable links that do this or redirect them to a CGI that gives the 
end-user some warning, etc.
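
A rough sketch of the redirect idea, written in Python rather than as actual MIMEDefang filter code. The warning endpoint (https://mail.example/cgi-bin/warn) and the function name are hypothetical placeholders, not part of any real installation:

```python
from urllib.parse import quote

# Hypothetical warning page; a real deployment would point at its own CGI.
WARN_ENDPOINT = "https://mail.example/cgi-bin/warn"

def warn_link(href):
    """Wrap a suspicious href so the click lands on a warning page first."""
    # Percent-encode the whole original URL so it survives as one query value.
    return WARN_ENDPOINT + "?url=" + quote(href, safe="")
```

The warning page would then show the real destination and let the end-user decide whether to continue.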


Regards,
KAM


Re: Bogus authorize.net statements

2012-08-15 Thread Jim Schueler
Is there such a rule?  Can I write one (I consider myself a bit of a Perl 
wonk)?


I understand that there are few, if any, markers that definitively define 
spam; and that's the beauty of the SpamAssassin architecture.


 -Jim

On Wed, 15 Aug 2012, Kevin A. McGrail wrote:


On 8/15/2012 11:06 AM, Jim Schueler wrote:
  Upon Kevin's recommendation, I upgraded.  Big difference.
   'Though there's a bit of a retuning penalty.

Woohoo, I was right!  All I did was flip a coin, though ;-)
  I get quite a few authorize.net notifications on behalf of
  various ecommerce clients, and this morning I started seeing
  scam/spam similar to the attached.  All share a common marker of
  embedding a text url within an HTML a tag containing a
  different URL.  This seems like an obvious marker for spam, I
  wonder why there isn't a rule for it.

There are many patterns that show up in spam that unfortunately show up in
ham as well.  If my memory serves me correctly, this just isn't indicative of
either spam or ham.

HOWEVER, some mail systems with good glue like MIMEDefang can do things like
disable links that do this or redirect them to a CGI that gives the end-user
some warning, etc.

Regards,
KAM



Re: Bogus authorize.net statements

2012-08-15 Thread John Hardin

On Wed, 15 Aug 2012, Jim Schueler wrote:


Is there such a rule?


No, not at present.


Can I write one (I consider myself a bit of a Perl wonk)?


Sure. Post it here and one of the rule committers can add it to their 
sandbox for testing against the masscheck corpora.


The problem with what you suggest is that having a different description 
in the displayed text for a link is extremely common.


If you can manage to write a regex that detects a link tag where the 
displayed text differs from the href _AND_ the displayed text is a URL, 
then it might be useful. Just triggering on displayed text != href is not 
useful.
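
To make the condition concrete, here is a minimal sketch in Python (not a SpamAssassin rule). It assumes well-behaved HTML and uses rough regexes rather than a real parser; the function name and patterns are illustrative only:

```python
import re

# Rough match for <a ... href="...">text</a>; not a full HTML parser.
ANCHOR_RE = re.compile(
    r'<a\b[^>]*?\bhref\s*=\s*["\']?([^"\'\s>]+)[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
# The displayed text counts as a URL if it starts with a scheme or "www.".
URLISH_RE = re.compile(r'^(?:https?://|www\.)\S+$', re.IGNORECASE)
TAG_RE = re.compile(r'<[^>]+>')

def mismatched_url_links(html):
    """Return (href, text) pairs where the text is itself a URL that differs from the href."""
    hits = []
    for href, inner in ANCHOR_RE.findall(html):
        text = TAG_RE.sub('', inner).strip()   # drop nested tags such as <b>
        if not URLISH_RE.match(text):
            continue                           # ordinary descriptive text: ignore
        if text.rstrip('/').lower() != href.rstrip('/').lower():
            hits.append((href, text))
    return hits
```

A link whose text is "Click here" is skipped; only links whose text looks like a URL and differs from the href are reported.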


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Maxim I: Pillage, _then_ burn.
---
 Today: the 67th anniversary of the end of World War II


Re: Bogus authorize.net statements

2012-08-15 Thread Kevin A. McGrail

On 8/15/2012 11:35 AM, John Hardin wrote:

On Wed, 15 Aug 2012, Jim Schueler wrote:


Is there such a rule?


No, not at present.


Can I write one (I consider myself a bit of a Perl wonk)?


Sure. Post it here and one of the rule committers can add it to their 
sandbox for testing against the masscheck corpora.


The problem with what you suggest is that having a different 
description in the displayed text for a link is extremely common.


If you can manage to write a regex that detects a link tag where the 
displayed text differs from the href _AND_ the displayed text is a 
URL, then it might be useful. Just triggering on displayed text != 
href is not useful.


I am 99.9% sure I've personally done research on this and it was no 
indication of SPAM or HAM.  It is equally used in both and anecdotal 
checks yesterday confirmed it.


IMO, this is a waste of time, which you can confirm simply by checking a 
couple of legit email newsletters, for example.


Regards,
KAM


Re: Bogus authorize.net statements

2012-08-15 Thread Axb

On 08/15/2012 06:01 PM, Kevin A. McGrail wrote:

On 8/15/2012 11:35 AM, John Hardin wrote:

On Wed, 15 Aug 2012, Jim Schueler wrote:


Is there such a rule?


No, not at present.


Can I write one (I consider myself a bit of a Perl wonk)?


Sure. Post it here and one of the rule committers can add it to their
sandbox for testing against the masscheck corpora.


test rule on its way



Re: Bogus authorize.net statements

2012-08-15 Thread John Hardin

On Wed, 15 Aug 2012, Kevin A. McGrail wrote:


On 8/15/2012 11:35 AM, John Hardin wrote:

 On Wed, 15 Aug 2012, Jim Schueler wrote:

  Is there such a rule?

 No, not at present.

  Can I write one (I consider myself a bit of a Perl wonk)?

 Sure. Post it here and one of the rule committers can add it to their
 sandbox for testing against the masscheck corpora.

 The problem with what you suggest is that having a different description
 in the displayed text for a link is extremely common.

 If you can manage to write a regex that detects a link tag where the
 displayed text differs from the href _AND_ the displayed text is a URL,
 then it might be useful. Just triggering on displayed text != href is not
 useful.


I am 99.9% sure I've personally done research on this and it was no 
indication of SPAM or HAM.  It is equally used in both and anecdotal checks 
yesterday confirmed it.


IMO, this is a waste of time, which you can confirm simply by checking a 
couple of legit email newsletters, for example.


Okay, let me modify my suggestion, then: if you can detect where the 
displayed text for a link is a URL, and the domain name in that URL does 
not match the domain name in the href, then it might be useful.


Does that seem more possible?
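
A minimal sketch of that check in Python (not a SpamAssassin rule). It assumes that comparing the last two labels of each hostname is a good enough stand-in for a proper public-suffix lookup, which it is not for names like example.co.uk; all names here are illustrative:

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

def base_domain(url):
    """Last two labels of the URL's hostname (crude; no public-suffix list)."""
    url = url.strip()
    if '://' not in url:
        url = 'http://' + url          # let urlsplit see bare "www.example.com"
    try:
        host = urlsplit(url).hostname or ''
    except ValueError:
        return ''
    return '.'.join(host.lower().split('.')[-2:])

class LinkCollector(HTMLParser):
    """Collect (href, visible text) for every <a href=...> element."""
    def __init__(self):
        super().__init__()
        self.links, self._href, self._text = [], None, []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href, self._text = dict(attrs).get('href'), []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None

def domain_mismatches(html):
    """Return (href domain, displayed domain) for links whose text is a URL on another domain."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    hits = []
    for href, text in parser.links:
        is_url = ' ' not in text and text.lower().startswith(('http://', 'https://', 'www.'))
        if not is_url:
            continue
        href_dom, text_dom = base_domain(href), base_domain(text)
        if href_dom and text_dom and href_dom != text_dom:
            hits.append((href_dom, text_dom))
    return hits
```

Links between subdomains of the same site (account.authorize.net versus www.authorize.net, say) are not reported; only a change of base domain is.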

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Riff: Torg, you traded our magic beans for a _cow_?
  Torg: It's a _magic_ cow! It's full of steaks!
  Riff: Whoa!                              -- Sluggy 04/28/2002
---
 Today: the 67th anniversary of the end of World War II


Re: Bogus authorize.net statements

2012-08-15 Thread Axb

On 08/15/2012 06:09 PM, John Hardin wrote:

On Wed, 15 Aug 2012, Kevin A. McGrail wrote:


On 8/15/2012 11:35 AM, John Hardin wrote:

 On Wed, 15 Aug 2012, Jim Schueler wrote:

  Is there such a rule?

 No, not at present.

  Can I write one (I consider myself a bit of a Perl wonk)?

 Sure. Post it here and one of the rule committers can add it to their
 sandbox for testing against the masscheck corpora.

 The problem with what you suggest is that having a different description
 in the displayed text for a link is extremely common.

 If you can manage to write a regex that detects a link tag where the
 displayed text differs from the href _AND_ the displayed text is a URL,
 then it might be useful. Just triggering on displayed text != href is not
 useful.


I am 99.9% sure I've personally done research on this and it was no
indication of SPAM or HAM.  It is equally used in both and anecdotal
checks yesterday confirmed it.

IMO, this is a waste of time, which you can confirm simply by checking a
couple of legit email newsletters, for example.


Okay, let me modify my suggestion, then: if you can detect where the
displayed text for a link is a URL, and the domain name in that URL does
not match the domain name in the href, then it might be useful.

Does that seem more possible?


Wouldn't URIDetail do this?





Re: Bogus authorize.net statements

2012-08-15 Thread Kevin A. McGrail


Okay, let me modify my suggestion, then: if you can detect where the 
displayed text for a link is a URL, and the domain name in that URL 
does not match the domain name in the href, then it might be useful.


Does that seem more possible?


Nope.  Just look at millions of things sent by constantcontact.com where 
they add their tracking links to the newsletter content.


Sorry to be negative but I really don't think you are going to find this 
to be an indication of spam or ham.
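
To illustrate the objection, a small Python sketch. Both links and all hostnames below are hypothetical placeholders (not real tracker or phishing domains). A host comparison flags the phish and the tracked newsletter link alike:

```python
from urllib.parse import urlsplit

def hosts_differ(href, shown_url):
    """True when the href and the URL shown to the reader point at different hosts."""
    return urlsplit(href).hostname != urlsplit(shown_url).hostname

# (href, displayed text) pairs; hypothetical examples only.
phish = ("http://evil.example/login", "https://www.authorize.net/")
newsletter = ("http://tracker.example/r?u=https%3A%2F%2Fshop.example%2Fsale",
              "https://shop.example/sale")

# Both entries come out True: the check cannot tell the two cases apart.
flags = [hosts_differ(href, shown) for href, shown in (phish, newsletter)]
```

Since the legitimate tracking redirect trips the check just as the phish does, the mismatch alone does not separate spam from ham.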


Re: Bogus authorize.net statements

2012-08-15 Thread David F. Skoll
Somewhat OT, but I'm getting SPF fail on all the bogus authorize.net
spams I've seen.  That should be enough to whack 'em.

Regards,

David.


Re: Bogus authorize.net statements

2012-08-15 Thread darxus
On 08/15, Jim Schueler wrote:
the attached.  All share a common marker of embedding a text url within an
HTML a tag containing a different URL.  This seems like an obvious
marker for spam, I wonder why there isn't a rule for it.

There is a rule.  It hits 10x as much non-spam as spam:

ruleqa.spamassassin.org/?rule=%2Fspoofed_url

There was some work on improving it:
http://osdir.com/ml/users-spamassassin/2011-10/msg00237.html

It didn't work out:
http://osdir.com/ml/users-spamassassin/2011-10/msg00304.html

Feel free to try to do better.

-- 
Just because you're offended, doesn't mean you're right. - Ricky Gervais
http://www.ChaosReigns.com


Re: Bogus authorize.net statements

2012-08-15 Thread Kevin A. McGrail

On 8/15/2012 12:57 PM, dar...@chaosreigns.com wrote:

On 08/15, Jim Schueler wrote:

the attached.  All share a common marker of embedding a text url within an
HTML a tag containing a different URL.  This seems like an obvious
marker for spam, I wonder why there isn't a rule for it.

There is a rule.  It hits 10x as much non-spam as spam:

ruleqa.spamassassin.org/?rule=%2Fspoofed_url

There was some work on improving it:
http://osdir.com/ml/users-spamassassin/2011-10/msg00237.html

It didn't work out:
http://osdir.com/ml/users-spamassassin/2011-10/msg00304.html

Feel free to try to do better.

Thanks for finding this.  I also have some analysis somewhere on my 
corpus, though I doubt the results would differ, except that your corpus 
likely doesn't include emails with images, so it is skewed a bit in the 
other direction, as that likely excludes the advertising tracker companies.


Regards,
KAM