Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available
On 2/12/2014 11:02 AM, Mark Martinec wrote:
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7007#c1
>
> For the archive: it was an installation error, having two versions
> of SpamAssassin installed at the same time in different locations.
>
> Mark

Good catch!
Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available
I got the same error installing via CPAN. The trouble is that I do not know where the other installation is or how to find it. There are a number of Util.pm files all over the place. This is on a Debian system, but I am not really using their package.

==John ff
Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available
On 2/12/2014 11:53 AM, jpff wrote:
> I got the same error installing via cpan. Trouble is I do not know
> where the other system is or how to find it.

Please reopen the ticket and give as much info as you can:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7007

For example, did you run cpan and then install Mail::SpamAssassin? Was the previous version installed with CPAN?

Regards,
KAM
Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available
On 2/12/2014 11:53 AM, jpff wrote:
> I got the same error installing via cpan. Trouble is I do not know
> where the other system is or how to find it. There are a number of
> Util.pm files all over the place. This is on a debian system but
> not really using their package

This should get rid of the old installation:

  apt-get remove spamassassin

To make sure:

  find /usr -name 'SpamAssassin*' -o -name sa-update \
    -o -name 'spam[cd]' -o -name spamassassin

(There are some additional man pages, documentation and start scripts beyond the above.)

This is how the installation looks the Debian way:

  /usr/sbin/spamd
  /usr/bin/spamassassin
  /usr/bin/sa-update
  /usr/share/perl5/Mail/SpamAssassin
  /usr/share/perl5/Mail/SpamAssassin.pm
  /usr/share/doc/spamassassin
  /usr/share/spamassassin

and this is how it looks the CPAN way:

  /usr/local/bin/spamc
  /usr/local/bin/spamd
  /usr/local/bin/spamassassin
  /usr/local/bin/sa-update
  /usr/local/lib/perl/5.14.2/auto/Mail/SpamAssassin
  /usr/local/share/perl/5.14.2/Mail/SpamAssassin
  /usr/local/share/perl/5.14.2/Mail/SpamAssassin.pm
  /usr/local/share/spamassassin

You can have one or the other, but not both.

Mark
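For anyone who would rather script the check than read find output, here is a minimal sketch in Python. It only looks for the top-level Mail/SpamAssassin.pm file, on the assumption that each installed copy has exactly one; the function name find_spamassassin_modules is made up for this example and is not part of SpamAssassin.

```python
import os

def find_spamassassin_modules(roots=("/usr",)):
    """Return sorted paths of every Mail/SpamAssassin.pm found under roots."""
    hits = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            # Only the top-level module file marks one installed copy;
            # files such as Mail/SpamAssassin/Util.pm are ignored.
            if "SpamAssassin.pm" in filenames and os.path.basename(dirpath) == "Mail":
                hits.append(os.path.join(dirpath, "SpamAssassin.pm"))
    return sorted(hits)
```

If the returned list has more than one entry, for example one path under /usr/share/perl5 and one under /usr/local/share/perl, that would suggest the two-installation situation described above.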
Re: ANNOUNCE: Apache SpamAssassin 3.4.0 available
Thank you -- that is what I needed.

On Wed, 12 Feb 2014, Mark Martinec wrote:
> You can have one or the other, but not both.
Spam Pattern
This pattern has been showing up in a good 80% of the spam I have looked at in the past month. Spammers take a few paragraphs out of a large body of text and put them at the end of their email. My favorite is one that had the scene where Daisy first meets Jay Gatsby.

Sometimes they add some munging or, as in this example, they insert base64-encoded hashes. We have a rule for the plaintext hashes, but does anyone on the list have a good way of detecting this?

Example: http://pastebin.com/zCStErch

Regards,
JMQ
Re: Spam Pattern
On Wed, 12 Feb 2014, Joe Quinn wrote:
> Sometimes they add some munging, or like in this example they insert
> base64-encoded hashes. We have a rule for the plaintext hashes, but
> does anyone on the list have a good way of detecting this?

Bayes.

--
 John Hardin KA7OHZ  http://www.impsec.org/~jhardin/
 jhar...@impsec.org  FALaholic #11174  pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 Mine eyes have seen the horror of the voting of the horde;
 They've looted the fromagerie where guv'ment cheese is stored;
 If war's not won before the break they grow so quickly bored;
 Their vote counts as much as yours.
   -- Tam
---
 Today: Abraham Lincoln's and Charles Darwin's 205th Birthdays
Re: Spam Pattern
On 2/12/2014 3:15 PM, John Hardin wrote:
> Bayes.

Any ideas outside of Bayes? We don't currently have it configured, and the setup involved is more than we would like to do for just one rule, if at all possible.
Re: Spam Pattern
On Wed, 12 Feb 2014 15:02:20 -0500 Joe Quinn wrote:
> Sometimes they add some munging, or like in this example they insert
> base64-encoded hashes.

It's not base64, it's just hexadecimal. I don't see any particular reason to think they are hashes.

> We have a rule for the plaintext hashes,

I presume you've mixed up your examples and given the plaintext version. Base64 should be just as easy to spot because of the way it's padded out.

> Example: http://pastebin.com/zCStErch
Re: Spam Pattern
On Feb 12, 2014, at 1:15 PM, John Hardin jhar...@impsec.org wrote:
> Bayes.

Well, yes and no. Bayes isn't very good at detecting this kind of thing per se, because the message is full of random crap. In fact, spammers specifically pull text from innocuous things like web reviews, movie reviews, news articles, etc. in the hopes that it contains a lot of hammy tokens that will negate the spammy ones.

On the other hand, there's no really good way of detecting lots of garbage filler text without a natural language algorithm that could heuristically determine whether the primary content (as determined by subject, etc.) is related to the filler, and I don't think any such algorithms exist. Bayes provides a way of distilling the garbage into tokens and sifting through it objectively, so it's the best option, but I wouldn't say it's a method of detecting this kind of thing.

That said, this particular spam template is interspersed with some sort of hashcode which is repeated a number of times. It could be possible to write a rule that matches a long (20-30 chars) alphanumeric string and counts repetitions; if the same long string is repeated more than (say) 10 times, there's a good bet it's an embedded spammy hashcode.

I'd write an example rule, but I don't know how to store regexp matches from one test to see if they match another test. That is, writing a regexp and using tflags multiple on it would be fine if we wanted it to hit on 10 or more long strings even if those strings differ from each other. If we want to see whether there are 10 or more long strings that are identical, we have to store the match somehow, and I don't know how to do that with SA.

If SA allows backreferences (since Perl does), then something like the following MIGHT work, though I suspect it would be a horrible CPU hog:

  rawbody AC_REPEATED_HASHCODE /(\s[A-Za-z0-9]{25,}\s)(?:(?:\s*\w+)+\1){10}/

This will look for a string of 25 or more alphanumeric characters, and then for 10 more repetitions of that string, each preceded by an arbitrary number of words. This is untested, so I don't know for sure whether it will work. The previous method of matching a string, storing it, and looking for repetitions of that string would be preferable, but I don't know how to do that with SA.

--- Amir
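To make the "match, store, count repetitions" idea concrete outside of SA, here is a minimal sketch in Python. It is not an SA rule and says nothing about how this could be wired into SA; the names LONG_TOKEN and repeated_long_tokens are made up for the example, and the 25-character and 10-repetition thresholds are simply the ones floated above.

```python
import re
from collections import Counter

# Hypothetical pattern: a run of 25 or more alphanumeric characters.
LONG_TOKEN = re.compile(r"\b[A-Za-z0-9]{25,}\b")

def repeated_long_tokens(text, more_than=10):
    """Return {token: count} for long tokens seen more than `more_than` times."""
    # One linear pass collects every long token; Counter stores the matches,
    # so no backreference (and none of its backtracking cost) is needed.
    counts = Counter(LONG_TOKEN.findall(text))
    return {tok: n for tok, n in counts.items() if n > more_than}
```

Because the tokens are collected in a single pass and compared through a dictionary, the cost grows with the size of the message rather than with the number of ways a backreference could be retried.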
Re: Spam Pattern
On Wed, 12 Feb 2014, Joe Quinn wrote:
> Any ideas outside of Bayes? We don't currently have it configured,
> and the setup involved is more than we would like to do for just one
> rule, if at all possible.

Bayes is very useful; you should reconsider.

Perhaps something like this:

  body      __HEXHASHWORD  /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
  tflags    __HEXHASHWORD  multiple maxhits=5
  meta      HEXHASH_WORD   __HEXHASHWORD > 4
  describe  HEXHASH_WORD   Hexadecimal hash followed by a word

Added to my sandbox, just in case.

-- John Hardin
Re: Spam Pattern
On Wed, 12 Feb 2014, Amir Caspi wrote:
> Well, yes and no. Bayes isn't very good about detecting this kind of
> thing per se because it's full of random crap... in fact, they
> specifically pull text from innocuous things like web reviews, movie
> reviews, news articles, etc. in the hopes that it contains a lot of
> hammy tokens that will negate the spammy ones.

That only works if your hammy mail stream contains text that looks like the random garbage they put in to try to spoof Bayes.

-- John Hardin
Re: Spam Pattern
On 02/12/2014 10:06 PM, John Hardin wrote:
> Perhaps something like this:
>
>   body __HEXHASHWORD /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
>
> Added to my sandbox, just in case.

John,

Isn't {30,} (without an upper limit) dangerously expensive?
Effectiveness of Bayes poisoning (was Re: Spam Pattern)
On Wed, 12 Feb 2014 13:11:19 -0800 (PST) John Hardin jhar...@impsec.org wrote:
> That only works if your hammy mail stream contains text that looks
> like the random garbage they put in to try to spoof bayes.

Indeed. Just for kicks, I ran the OP's pastebin example through our Bayes database and it scored a 99.99% likelihood of spam. The word "Wopsle", for example, was a dead giveaway: it never appears in our ham stream, but it has appeared in 93 spams in our database.

Bayes poisoning, in our experience, is only occasionally effective.

Regards,

David.
Re: Spam Pattern
On Wed, 12 Feb 2014, Axb wrote:
> John,
>
> Isn't {30,} (without a limit) dangerously expensive?

Potentially expensive; the character class and the fact that the following atom is not in that class limit the risk, so backtracking isn't a possibility. However, point taken: I recommend {30,64} instead.

-- John Hardin
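As a rough illustration of the bounded version, here is a sketch in Python that mimics the rule's logic with {30,64}. It assumes the meta condition is "more than 4 hits" (the comparison operator is not legible in the archived copy of the rule) and it runs over a single string, which is not how SA feeds body text to rules. The names HEXHASHWORD, hexhash_word_hits and hexhash_word are made up for the example.

```python
import re

# Same pattern as the proposed body rule, with the bounded {30,64} quantifier.
HEXHASHWORD = re.compile(r"\b[0-9a-f]{30,64}\s[a-z]{1,10}\b")

def hexhash_word_hits(body, maxhits=5):
    """Count matches, stopping once maxhits is reached (like tflags maxhits)."""
    hits = 0
    for _match in HEXHASHWORD.finditer(body):
        hits += 1
        if hits >= maxhits:
            break
    return hits

def hexhash_word(body):
    """Assumed meta condition: fire when the sub-rule hits more than 4 times."""
    return hexhash_word_hits(body) > 4
```

With the upper bound in place, a hex run longer than 64 characters no longer matches at all, since the pattern needs whitespace right after at most 64 hex digits.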
Re: Spam Pattern
On 02/12/2014 10:46 PM, John Hardin wrote:
> However, point taken - recommend {30,64} instead.

IMO, you don't even need to count that much. I'd stop at sweet 16; anything above is pink noise, and I would not waste time chasing spaces & co.
spamassassin 3.4.0 spec file for rhel4 rhel5 rhel6 and compatible os's
Greetings,

Thank you, SpamAssassin team and other contributors, for all your hard work in getting out 3.4.0 and more!

I have been playing with the spec file for spamassassin 3.3.1 and have been able to get 3.4.0 to build into RPMs. When I then run a yum localinstall test, it looks like I will have to deal with some Perl dependencies, or so it appears. I am working on it, although I am quite rusty at spec file study, comparison, editing, etc., and I don't know everything about 3.4.0.

Would someone with a known good spamassassin spec file that works well with RHEL4 through RHEL6 machines please share?

Thank you in advance...

- rh