Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available

2014-02-12 Thread Kevin A. McGrail

On 2/12/2014 11:02 AM, Mark Martinec wrote:

 https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7007#c1

For the archive: it was an installation error, having two versions
of SpamAssassin installed at the same time in different locations.

   Mark

Good catch!


Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available

2014-02-12 Thread jpff
I got the same error installing via CPAN.  Trouble is I do not know where
the other installation is or how to find it.  There are a number of Util.pm
files all over the place.

  This is on a Debian system but not really using their package.
==John ff


Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available

2014-02-12 Thread Kevin A. McGrail

On 2/12/2014 11:53 AM, jpff wrote:
I got the same error installing via CPAN.  Trouble is I do not know
where the other installation is or how to find it.  There are a number of
Util.pm files all over the place.

  This is on a Debian system but not really using their package.
==John ff

Please reopen the ticket and give as much info as you can.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7007

For example, did you run cpan and then install Mail::SpamAssassin?

Was the previous version installed with CPAN?

Regards,
KAM


Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available

2014-02-12 Thread Mark Martinec
On 2/12/2014 11:53 AM, jpff wrote:
 I got the same error installing via CPAN.  Trouble is I do not know
 where the other installation is or how to find it.  There are a number of
 Util.pm files all over the place.
   This is on a Debian system but not really using their package.

This should get rid of the old installation:

  apt-get remove spamassassin

To make sure:

  find /usr -name 'SpamAssassin*' -o -name sa-update \
 -o -name 'spam[cd]' -o -name spamassassin

(there are some additional man pages, documentation and start scripts
beyond the above)

This is how the installation looks the Debian way:

/usr/sbin/spamd
/usr/bin/spamassassin
/usr/bin/sa-update
/usr/share/perl5/Mail/SpamAssassin
/usr/share/perl5/Mail/SpamAssassin.pm
/usr/share/doc/spamassassin
/usr/share/spamassassin

and this is what it looks like the CPAN way:

/usr/local/bin/spamc
/usr/local/bin/spamd
/usr/local/bin/spamassassin
/usr/local/bin/sa-update
/usr/local/lib/perl/5.14.2/auto/Mail/SpamAssassin
/usr/local/share/perl/5.14.2/Mail/SpamAssassin
/usr/local/share/perl/5.14.2/Mail/SpamAssassin.pm
/usr/local/share/spamassassin

You can have one or the other, but not both.

  Mark
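
As a rough cross-check on the find command above, the duplicate-install
condition can also be detected by looking for more than one copy of
Mail/SpamAssassin.pm. The following Python sketch is not from the thread;
the function names are made up for illustration, and it only inspects the
directory roots passed to it.

```python
import os


def find_spamassassin_modules(roots):
    """Return sorted paths of every Mail/SpamAssassin.pm under the given roots."""
    hits = []
    for root in roots:
        # os.walk silently yields nothing for a root that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            in_mail_dir = os.path.basename(dirpath) == "Mail"
            if in_mail_dir and "SpamAssassin.pm" in filenames:
                hits.append(os.path.join(dirpath, "SpamAssassin.pm"))
    return sorted(hits)


def has_conflicting_installs(roots):
    """True when more than one copy of the module is present."""
    return len(find_spamassassin_modules(roots)) > 1
```

Calling has_conflicting_installs(["/usr"]) on the affected machine would be
the equivalent check; a True result suggests that both the Debian and the
CPAN layouts listed above are present.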


Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available

2014-02-12 Thread jpff

Thank you -- that is what I needed





Spam Pattern

2014-02-12 Thread Joe Quinn
This pattern has been showing up in a good 80% of spam I have looked at 
in the past month.


Spammers take a few paragraphs out of a large body of text and put it at 
the end of their email. My favorite is one that had the scene where 
Daisy first meets Jay Gatsby.


Sometimes they add some munging, or like in this example they insert 
base64-encoded hashes. We have a rule for the plaintext hashes, but does 
anyone on the list have a good way of detecting this?


Example: http://pastebin.com/zCStErch

Regards,
JMQ


Re: Spam Pattern

2014-02-12 Thread John Hardin

On Wed, 12 Feb 2014, Joe Quinn wrote:

This pattern has been showing up in a good 80% of spam I have looked at in 
the past month.


Spammers take a few paragraphs out of a large body of text and put it at the 
end of their email. My favorite is one that had the scene where Daisy first 
meets Jay Gatsby.


Sometimes they add some munging, or like in this example they insert 
base64-encoded hashes. We have a rule for the plaintext hashes, but does 
anyone on the list have a good way of detecting this?


Bayes.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Mine eyes have seen the horror of the voting of the horde;
  They've looted the fromagerie where guv'ment cheese is stored;
  If war's not won before the break they grow so quickly bored;
  Their vote counts as much as yours.  -- Tam
---
 Today: Abraham Lincoln's and Charles Darwin's 205th Birthdays


Re: Spam Pattern

2014-02-12 Thread Joe Quinn

On 2/12/2014 3:15 PM, John Hardin wrote:

On Wed, 12 Feb 2014, Joe Quinn wrote:

This pattern has been showing up in a good 80% of spam I have looked 
at in the past month.


Spammers take a few paragraphs out of a large body of text and put it 
at the end of their email. My favorite is one that had the scene 
where Daisy first meets Jay Gatsby.


Sometimes they add some munging, or like in this example they insert 
base64-encoded hashes. We have a rule for the plaintext hashes, but 
does anyone on the list have a good way of detecting this?


Bayes.

Any ideas outside of Bayes? We don't currently have it configured, and 
the setup involved is more than we would like to do for just one rule, 
if at all possible.


Re: Spam Pattern

2014-02-12 Thread RW
On Wed, 12 Feb 2014 15:02:20 -0500
Joe Quinn wrote:

 This pattern has been showing up in a good 80% of spam I have looked
 at in the past month.
 
 Spammers take a few paragraphs out of a large body of text and put it
 at the end of their email. My favorite is one that had the scene
 where Daisy first meets Jay Gatsby.
 
 Sometimes they add some munging, or like in this example they insert 
 base64-encoded hashes.

It's not base64, it's just hexadecimal. 

I don't see any particular reason to think they are hashes.

  We have a rule for the plaintext hashes,

I presume you've mixed up your examples and given the plaintext
version; base64 should be just as easy to spot because of the way
it's padded out.

 
 Example: http://pastebin.com/zCStErch
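
For what it's worth, the difference between the two encodings is easy to
see on a known digest. This Python sketch is an illustration only: the input
string is arbitrary, and nothing here establishes that the strings in the
sample are hashes. It renders one MD5 digest both ways. The hex form uses
only 0-9 and a-f, while the base64 form of a 16-byte digest is 24 characters
long and ends in "==" padding.

```python
import base64
import hashlib
import re

# Matches strings made up solely of lowercase hexadecimal digits.
HEX_ONLY = re.compile(r"^[0-9a-f]+$")


def encodings_of_md5(data):
    """Return (hex, base64) text renderings of the MD5 digest of data."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes
    return digest.hex(), base64.b64encode(digest).decode("ascii")


hex_form, b64_form = encodings_of_md5(b"arbitrary sample input")
print(hex_form)   # 32 characters, all in 0-9a-f
print(b64_form)   # 24 characters, ending in "=="
```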


Re: Spam Pattern

2014-02-12 Thread Amir Caspi
On Feb 12, 2014, at 1:15 PM, John Hardin jhar...@impsec.org wrote:

 Bayes.

Well, yes and no.  Bayes isn't very good about detecting this kind of thing per 
se because it's full of random crap... in fact, they specifically pull text 
from innocuous things like web reviews, movie reviews, news articles, etc. in 
the hopes that it contains a lot of hammy tokens that will negate the spammy 
ones.  On the other hand, there's no real good way of detecting lots of 
garbage filler text without a natural language algorithm that could 
heuristically determine whether the primary content (as determined by subject, 
etc.) is related to the filler... and I don't think any such algorithms exist.  
Bayes provides a way of distilling the garbage into tokens and sifting through 
it objectively, so it's the best option, but I wouldn't say it's a method of 
detecting this kind of thing.

That said, this particular spam template is interspersed with some sort of 
hashcode which is repeated a number of times.  It could be possible to write a 
rule that matches a long (20-30 chars) alphanumeric string and count 
repetitions; if the same long string is repeated more than (say) 10 times, 
there's a good bet it's an embedded spammy hashcode.

I'd write an example rule but I don't know how to store regexp matches from one 
test to see if they match another test... that is, writing a regexp and using 
tflags multiple on it would be fine if we wanted it to hit on 10 or more long 
strings even if those strings don't match, but if we want to see if there are 
10 or more repeated long strings that are identical, we have to store it 
somehow, and I don't know how to do that with SA.

If SA allows backreferences (since Perl does) then something like the following 
MIGHT work, though I suspect it would be a horrible CPU hog:

rawbody AC_REPEATED_HASHCODE /(\s[A-Za-z0-9]{25,}\s)(?:(?:\s*\w+)+\1){10}/

This will look for an alphanumeric string of 25 or more characters, and then
look for 10 more repetitions of that string, each preceded by an arbitrary
number of words.  This is untested so I don't know if it'll work for sure, and
I suspect it wouldn't be very friendly to the CPU.  The approach described
above, of matching a string, storing it, and looking for repetitions of that
string, would be preferable, but I don't know how to do that with SA.

--- Amir
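
Outside of SA, the "match a string, store it, count repetitions" idea is
straightforward. The following is a minimal Python sketch, not an SA rule and
not tested against real spam; the function names and the default threshold
are assumptions taken from the "(say) 10 times" figure above.

```python
import re
from collections import Counter

# A run of 25 or more alphanumerics that is not part of a longer run.
LONG_TOKEN = re.compile(r"(?<![A-Za-z0-9])[A-Za-z0-9]{25,}(?![A-Za-z0-9])")


def most_repeated_token(text):
    """Return (token, count) for the most frequent long token, or (None, 0)."""
    counts = Counter(LONG_TOKEN.findall(text))
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def looks_like_repeated_hashcode(text, min_repeats=10):
    """True when one identical long token occurs more than min_repeats times."""
    _token, count = most_repeated_token(text)
    return count > min_repeats
```

Because each token is counted in a single pass, this avoids the nested
quantifiers and the backreference that make the regex version above a likely
CPU hog.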




Re: Spam Pattern

2014-02-12 Thread John Hardin

On Wed, 12 Feb 2014, Joe Quinn wrote:


On 2/12/2014 3:15 PM, John Hardin wrote:

 On Wed, 12 Feb 2014, Joe Quinn wrote:

  This pattern has been showing up in a good 80% of spam I have looked at 
  in the past month.
 
  Spammers take a few paragraphs out of a large body of text and put it at 
  the end of their email. My favorite is one that had the scene where 
  Daisy first meets Jay Gatsby.
 
  Sometimes they add some munging, or like in this example they insert 
  base64-encoded hashes. We have a rule for the plaintext hashes, but does 
  anyone on the list have a good way of detecting this?


 Bayes.


Any ideas outside of Bayes? We don't currently have it configured, and the 
setup involved is more than we would like to do for just one rule, if at all 
possible.


Bayes is very useful, you should reconsider.

Perhaps something like this:

body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
tflags    __HEXHASHWORD   multiple maxhits=5
meta      HEXHASH_WORD    __HEXHASHWORD > 4
describe  HEXHASH_WORD    Hexadecimal hash followed by a word

Added to my sandbox, just in case.
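
A rough way to try the pattern outside SpamAssassin is sketched below in
Python. It reads the meta line as requiring more than 4 hits, which is an
assumption (the comparison operator did not survive in the archived text),
and it ignores how SA renders and splits the message body before body rules
run. The function name is made up for illustration.

```python
import re

# Same pattern as the __HEXHASHWORD subrule: a hex run of 30 or more
# characters, one whitespace character, then a short lowercase word.
HEXHASHWORD = re.compile(r"\b[0-9a-f]{30,}\s[a-z]{1,10}\b")


def hexhash_word_fires(body, maxhits=5, threshold=4):
    """Count matches up to maxhits, then apply the 'more than threshold' test."""
    hits = 0
    for _match in HEXHASHWORD.finditer(body):
        hits += 1
        if hits >= maxhits:  # stop counting early, like maxhits=5
            break
    return hits > threshold
```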



Re: Spam Pattern

2014-02-12 Thread John Hardin

On Wed, 12 Feb 2014, Amir Caspi wrote:


On Feb 12, 2014, at 1:15 PM, John Hardin jhar...@impsec.org wrote:


Bayes.


Well, yes and no.  Bayes isn't very good about detecting this kind of 
thing per se because it's full of random crap... in fact, they 
specifically pull text from innocuous things like web reviews, movie 
reviews, news articles, etc. in the hopes that it contains a lot of 
hammy tokens that will negate the spammy ones.


That only works if your hammy mail stream contains text that looks like 
the random garbage they put in to try to spoof bayes.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  WSJ on the Financial Stimulus package: ...today there are 700,000
  fewer jobs than [the administration] predicted we would have if we
  had done nothing at all.
---
 Today: Abraham Lincoln's and Charles Darwin's 205th Birthdays


Re: Spam Pattern

2014-02-12 Thread Axb

On 02/12/2014 10:06 PM, John Hardin wrote:

On Wed, 12 Feb 2014, Joe Quinn wrote:


On 2/12/2014 3:15 PM, John Hardin wrote:

 On Wed, 12 Feb 2014, Joe Quinn wrote:

  This pattern has been showing up in a good 80% of spam I have looked at
  in the past month.

  Spammers take a few paragraphs out of a large body of text and put it at
  the end of their email. My favorite is one that had the scene where
  Daisy first meets Jay Gatsby.

  Sometimes they add some munging, or like in this example they insert
  base64-encoded hashes. We have a rule for the plaintext hashes, but does
  anyone on the list have a good way of detecting this?

 Bayes.


Any ideas outside of Bayes? We don't currently have it configured, and
the setup involved is more than we would like to do for just one rule,
if at all possible.


Bayes is very useful, you should reconsider.

Perhaps something like this:

body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
tflags    __HEXHASHWORD   multiple maxhits=5
meta      HEXHASH_WORD    __HEXHASHWORD > 4
describe  HEXHASH_WORD    Hexadecimal hash followed by a word

Added to my sandbox, just in case.


John,

Isn't {30,} (without a limit) dangerously expensive?




Effectiveness of Bayes poisoning (was Re: Spam Pattern)

2014-02-12 Thread David F. Skoll
On Wed, 12 Feb 2014 13:11:19 -0800 (PST)
John Hardin jhar...@impsec.org wrote:

 That only works if your hammy mail stream contains text that looks
 like the random garbage they put in to try to spoof bayes.

Indeed.  Just for kicks, I ran the OP's pastebin example through our
Bayes database and it scored 99.99% likelihood of spam.  The word
Wopsle, for example, was a dead giveaway... that never appears in
our ham stream, but has appeared in 93 spams in our database.

Bayes poisoning, in our experience, is only occasionally effective.

Regards,

David.



Re: Spam Pattern

2014-02-12 Thread John Hardin

On Wed, 12 Feb 2014, Axb wrote:


On 02/12/2014 10:06 PM, John Hardin wrote:


 Perhaps something like this:

 body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
 tflags    __HEXHASHWORD   multiple maxhits=5
 meta      HEXHASH_WORD    __HEXHASHWORD > 4
 describe  HEXHASH_WORD    Hexadecimal hash followed by a word

 Added to my sandbox, just in case.


John,

Isn't {30,} (without a limit) dangerously expensive?


Potentially expensive; the character class and the fact that the following 
atom is not in that class limits the risk - backtracking isn't a 
possibility. However, point taken - recommend {30,64} instead.
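
One side effect of the bounded form may be worth checking before adopting
it. In this Python sketch (an illustration, not SA itself; the helper name is
made up), a hex run longer than 64 characters no longer matches at all,
because the leading \b only allows a match to start at the beginning of the
run and the bounded quantifier cannot reach the whitespace that follows.

```python
import re

UNBOUNDED = re.compile(r"\b[0-9a-f]{30,}\s[a-z]{1,10}\b")
BOUNDED = re.compile(r"\b[0-9a-f]{30,64}\s[a-z]{1,10}\b")


def compare(text):
    """Return (unbounded_matches, bounded_matches) for the given text."""
    return bool(UNBOUNDED.search(text)), bool(BOUNDED.search(text))


print(compare("a" * 40 + " word"))    # 40-character hex run
print(compare("a" * 100 + " word"))   # 100-character hex run
```

Whether skipping such long runs matters depends on what the spam samples
actually contain.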




Re: Spam Pattern

2014-02-12 Thread Axb

On 02/12/2014 10:46 PM, John Hardin wrote:

On Wed, 12 Feb 2014, Axb wrote:


On 02/12/2014 10:06 PM, John Hardin wrote:


 Perhaps something like this:

 body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
 tflags    __HEXHASHWORD   multiple maxhits=5
 meta      HEXHASH_WORD    __HEXHASHWORD > 4
 describe  HEXHASH_WORD    Hexadecimal hash followed by a word

 Added to my sandbox, just in case.


John,

Isn't {30,} (without a limit) dangerously expensive?


Potentially expensive; the character class and the fact that the
following atom is not in that class limits the risk - backtracking isn't
a possibility. However, point taken - recommend {30,64} instead.


IMO, you don't even need to count that much - I'd stop at sweet 16;
anything above is pink noise, and I wouldn't waste time chasing spaces & co.






spamassassin 3.4.0 spec file for rhel4 rhel5 rhel6 and compatible os's

2014-02-12 Thread Email Lists07
Greetings

Thank you SpamAssassin team and other contrib people for all your hard work
in getting out 3.4.0 and more !

I have been playing with the spec file for spamassassin 3.3.1 and have been
able to get 3.4.0 to build into RPMs.

When I then do a yum localinstall test, it looks like I will have to deal with
some Perl dependencies, or so it appears.

I am working on it, although I am quite rusty on spec file study, comparison,
and editing.

Yet since I don't know everything about 3.4.0...

Would someone with a known good working spamassassin spec file that works
well with rhel4 - rhel6 machines please share?

Thank you in advance...

 - rh