Re: How do I filter out phishing email?

2010-04-16 Thread Benny Pedersen

On Wed 14 Apr 2010 23:28:38 CEST, John Hardin wrote:
>> Please do not post spammy mail to the list (it poisons our Bayes
>> with spammy tokens with hammy score).
>
> If you're running SA list emails through SA you deserve what you get. :)


For SA 3.3.2, how about a bayes_ignore_on_dkim_valid option? :)

bayes_ignore_to can be forged, and so can DKIM. OK, I lose :=)


--
xpoint http://www.unicom.com/pw/reply-to-harmful.html



SOUGHT ruleset FP

2010-04-16 Thread Matthew Newton
Hi,

We had a legitimate e-mail hit the JM_SOUGHT_3 yesterday. It also
hit a few other rules that pushed it over our reject threshold of
10, and easily over the 'junk mail folder' level of 5.

I managed to get them to send me the message, and it hits rule
__SEEK_5ID3LI Conti  nuum Intern ational Publishing (spaces
added!) which is the name of their company.

I know SOUGHT is an auto-generated ruleset; just wondering if
there is any way to remove false positives before the set is
generated? Otherwise I'll add local rules to compensate for
this one.

Thanks,

Matthew


-- 
Matthew Newton, Ph.D. m...@le.ac.uk

Systems Architect (UNIX and Networks), Network Services,
I.T. Services, University of Leicester, Leicester LE1 7RH, United Kingdom

For IT help contact helpdesk extn. 2253, ith...@le.ac.uk


Re: SOUGHT ruleset FP

2010-04-16 Thread Karsten Bräckelmann
On Fri, 2010-04-16 at 12:20 +0100, Matthew Newton wrote:
> We had a legitimate e-mail hit the JM_SOUGHT_3 yesterday. It also
> hit a few other rules that pushed it over our reject threshold of
> 10, and easily over the 'junk mail folder' level of 5.
>
> I managed to get them to send me the message, and it hits rule
> __SEEK_5ID3LI Conti  nuum Intern ational Publishing (spaces
> added!) which is the name of their company.

Makes one wonder how that string ends up quite massively in spam traps.

> I know SOUGHT is an auto-generated ruleset; just wondering if
> there is any way to remove false positives before the set is

Yes. The Seek bits are cross-checked against a ham corpus, so the
easiest way is to inject an artificial ham message with the string in
question to get it off of the next run.

> generated? Otherwise I'll add local rules to compensate for
> this one.

meta __SEEK_5ID3LI  (0)

The Seek ID is constant, and will be the same even with later Sought
runs, for a given string.
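
A minimal sketch of applying that local override, assuming the site
configuration lives in a file such as /etc/mail/spamassassin/local.cf
(the path varies by distribution). The helper name disable_rule is
hypothetical; it only appends the meta line shown above if it is not
already present.

```shell
#!/bin/sh
# Sketch: neutralise one Sought sub-rule locally until a later run drops it.
# disable_rule is a hypothetical helper, not part of SpamAssassin.

# Append "meta RULE  (0)" to the given config file unless it is already there.
disable_rule() {
    rule="$1"
    cf="$2"
    if grep -q "^meta[[:space:]]\{1,\}${rule}[[:space:]]" "$cf" 2>/dev/null; then
        echo "$rule already overridden in $cf"
    else
        printf 'meta %s  (0)\n' "$rule" >> "$cf"
        echo "added override for $rule to $cf"
    fi
}
```

Typical use would be something like
disable_rule __SEEK_5ID3LI /etc/mail/spamassassin/local.cf
followed by spamassassin --lint to check that the configuration still
parses, and then a restart of spamd if it is in use.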

  guenther





Re: dcc: [26896] terminated: exit 241

2010-04-16 Thread Bowie Bailey
Micah Anderson wrote:
> This is version '1.2.74-4' from Debian... but now looking closer, it
> seems as if dcc was removed after Debian Etch. It seems that it was
> removed because the upstream authors changed its license to non-free
> (according to Debian's DFSG) in version 1.30. This also means that it
> has not been available in Ubuntu either since Dapper.
>
> The Distributed Checksum Clearinghouse source carries a license that is
> free to organizations that do not sell filtering devices or services
> except to their own users and that participate in the global DCC
> network. . . you may not redistribute modified, fixed, or improved
> versions of the source or binaries. You also can't call it your own or
> blame anyone for the results of using it.
>
> So I guess I just will remove dcc, that is a shame, it seems like a good
> service.

According to the quote above, the service is free to almost everyone
(even ISPs).  Why can't you continue to use it?

-- 
Bowie


Re: SOUGHT ruleset FP

2010-04-16 Thread Justin Mason
Yep, feel free to send me copies of FP messages (or strings
that match them).





Re: How do I filter out phishing email?

2010-04-16 Thread John Hardin

On Fri, 16 Apr 2010, Benny Pedersen wrote:

> On Wed 14 Apr 2010 23:28:38 CEST, John Hardin wrote:
>>> Please do not post spammy mail to the list (it poisons our Bayes with
>>> spammy tokens with hammy score).
>>
>> If you're running SA list emails through SA you deserve what you get. :)
>
> For SA 3.3.2, how about a bayes_ignore_on_dkim_valid option? :)
>
> bayes_ignore_to can be forged, and so can DKIM. OK, I lose :=)

Fix your glue to bypass SA based on the List-Id and Received headers.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Our government should bear in mind the fact that the American
  Revolution was touched off by the then-current government
  attempting to confiscate firearms from the people.
---
 3 days until the 235th anniversary of The Shot Heard 'Round The World


Re: dcc: [26896] terminated: exit 241

2010-04-16 Thread Michael Scheidell

On 4/15/10 5:35 PM, Micah Anderson wrote:
> The Distributed Checksum Clearinghouse source carries a license that is
> free to organizations that do not sell filtering devices or services
> except to their own users and that participate in the global DCC
> network. . . you may not redistribute modified, fixed, or improved
> versions of the source or binaries. You also can't call it your own or
> blame anyone for the results of using it.

It seems silly for Debian to remove it, since many of the blacklists
enabled by default in SA are licensed similarly (free for non-commercial
use, paid above xxx queries). Maybe Debian should look through and remove
ALL 'dual licensed' software, and, when you install SA from the RPMs,
disable the dual-licensed RBLs.

Or, hey, let's pretend the people installing Debian are smart enough to
make up their own minds about whether they fit the free license model.

> So I guess I just will remove dcc, that is a shame, it seems like a good
> service.


   
It IS a good service, and SA 3.3.x now supports the reputation query
directly with the commercial license.

Some things to understand (normal language vs. legal talk):

   * If you are doing more than 100,000 queries a day (100,000 emails a
     day hit SA, and thus DCC), it's a lot better and faster to use a
     local DCC server.
   * If you are using the public servers, there is a built-in 1000ms
     delay (so if you are doing more than 100K queries a day, it's
     faster to use the commercial service).
   * Public servers don't have the reputation scores (see the new scores
     for DCC); reputations double the accuracy.
   * If you are an ISP just using it for your customers, you don't
     need to pay for the commercial license (no reputation, still
     1000ms delays to public servers).
   * But you still might want to. It's CHEAP, faster by 1000ms per
     query, and, with DCC reputations, more accurate.

DCC reputations allow SA to score not only on the fuzzy checksums of
the emails, but also on the 'bulk vs. non-bulk' reputation of the
sending IP.

Zero-day spam (bulk!) from known bulk sources can be picked up immediately.
Zero-day zombots sending known spam (bulk) from a new IP can be picked up
immediately with old scores.

The combination makes it very accurate, both at catching new
bulk providers and at cutting down on FPs.

Did I say it's CHEAP, and that if you are an ISP using it for your own
customers you don't need a license?

If you aren't an appliance vendor, you owe it to yourself to at least
ASK Vernon how much.

(Disclaimer: I don't sell DCC, and I don't know why I am advising
competitors to use DCC since it is one of our advantages, but I like the
product, the service, and Vernon.)



>> what did you upgrade?
>
> Sorry, I upgraded from Debian etch to Debian Lenny, along with that came
> an upgrade to spamassassin.
>
> micah


   




--
Michael Scheidell, CTO
Phone: 561-999-5000, x 1259
SECNAP Network Security Corporation



Re: SOUGHT ruleset FP

2010-04-16 Thread Matthew Newton
Hi,

On Fri, Apr 16, 2010 at 01:53:55PM +0200, Karsten Bräckelmann wrote:
> On Fri, 2010-04-16 at 12:20 +0100, Matthew Newton wrote:
>> We had a legitimate e-mail hit the JM_SOUGHT_3 yesterday. It also
>> hit a few other rules that pushed it over our reject threshold of
>> 10, and easily over the 'junk mail folder' level of 5.
>>
>> I managed to get them to send me the message, and it hits rule
>> __SEEK_5ID3LI Conti  nuum Intern ational Publishing (spaces
>> added!) which is the name of their company.
>
> Makes one wonder how that string ends up quite massively in spam traps.

I did consider that. Without seeing the spam, of course, I can't
say whether they are spamming or whether their name is being
abused. All I have is a legitimate mail from them and a report
that it is blocked.

>> I know SOUGHT is an auto-generated ruleset; just wondering if
>> there is any way to remove false positives before the set is
>
> Yes. The Seek bits are cross-checked against a ham corpus, so the
> easiest way is to inject an artificial ham message with the string in
> question to get it off of the next run.

OK, understood.

>> generated? Otherwise I'll add local rules to compensate for
>> this one.
>
> meta __SEEK_5ID3LI  (0)
>
> The Seek ID is constant, and will be the same even with later Sought
> runs, for a given string.

Thought that might be the case from the codes - thanks for the
confirmation.

Cheers,

Matthew


-- 
Matthew Newton, Ph.D. m...@le.ac.uk

Systems Architect (UNIX and Networks), Network Services,
I.T. Services, University of Leicester, Leicester LE1 7RH, United Kingdom

For IT help contact helpdesk extn. 2253, ith...@le.ac.uk


multiple instances

2010-04-16 Thread Gary Smith
I need to run several different instances of SA on a single box (in
development).  In production, we have 3 different SA environments (with 2+
servers each) that have different rule sets, and specific routing rules
determine which instance a message gets sent to.  We need to mimic this in
development.

Ideally I would like to create all 3 instances (*2 to mimic load balancing)
on a single development box.  We're not worried about the performance or
memory aspect.

Is this possible, and if so, is there an easy way to do it?  I was thinking
that I could create separate chroot environments for each one if necessary
and either bind each instance to an IP (which I'm not sure is possible) or
at least to a different port.

Any advice (or some sample scripts for doing this) would be greatly
appreciated.

Gary Smith


Re: multiple instances

2010-04-16 Thread Dennis B. Hopp

On Fri, 2010-04-16 at 10:08 -0700, Gary Smith wrote:
> I need to run several different instances of SA on a single box (in
> development).  In production, we have 3 different SA environments (with 2+
> servers each) that have different rule sets, and specific routing rules
> determine which instance a message gets sent to.  We need to mimic this in
> development.
>
> Ideally I would like to create all 3 instances (*2 to mimic load balancing)
> on a single development box.  We're not worried about the performance or
> memory aspect.
>
> Is this possible, and if so, is there an easy way to do it?  I was thinking
> that I could create separate chroot environments for each one if necessary
> and either bind each instance to an IP (which I'm not sure is possible) or
> at least to a different port.
>
> Any advice (or some sample scripts for doing this) would be greatly
> appreciated.
 

I'm sure it's possible, but rather than going through all the work of
trying to script and set up chroot environments, why not use VMs?  You
can then quite literally match the production setup.

Since you are not worried about performance or memory you could give
each VM 128 MB of RAM and only be using 1 GB or so total...

--Dennis



RE: multiple instances

2010-04-16 Thread Gary Smith
 
> I'm sure it's possible, but rather than going through all the work of
> trying to script and set up chroot environments, why not use VMs?  You
> can then quite literally match the production setup.
>
> Since you are not worried about performance or memory you could give
> each VM 128 MB of RAM and only be using 1 GB or so total...

Dennis,

I had thought about that, but the target is a mobile laptop.  For our
in-house development we do use VMs for almost everything, for just this
purpose.

Looking into spamd, I think I will just copy the config folder for each
instance type and then run the daemon via a bash script to start it on 9
local IPs.

I know that for things like MySQL some people already have multi-instance
scripts lying around.  Anyway, I think this will suffice for now.

Gary 


multiple instances, simplification

2010-04-16 Thread Gary Smith
Background:

I've been using SA for a long time, and for a variety of reasons we run
different servers to support some minor differences in rules.  While
trying to set up a multi-instance version on my laptop, I copied these
rules over into different directories, set up the startup/shutdown script
and ran my tests.  Everything worked fine until I found that I hadn't
created the user "filter" that I run everything as (for SA).  So I created
filter1, filter2, etc., one for each instance that I want to run.  I
noticed that the log still complained that filter didn't exist.  Looking
into it, it appears that filter is the value being passed in via the spamc
call.  Because SA always works, I generally don't touch some of these
little things, so I tend to forget details such as the fact that the
calling user passed by spamc must exist on the remote spamd server, as I
never really need to change anything.

Question:

Instead of running multiple SA servers, is it possible to run a single
consolidated SA server where only the user_prefs are different for each
spamc caller (given that the local config will override the global config)
AND still use a single Bayes DB?  We use a clustered MySQL instance for
Bayes, and I don't want to have to worry about a Bayes DB per user.

The big differences between the instances are mostly the required_score
threshold, a few score overrides and a few custom rules.

Any recommendations on how to handle this?  It would be really nice to use
a single config for all SA instances, with the only difference being the
user config.

Gary




Re: multiple instances, simplification

2010-04-16 Thread Kris Deugau

Gary Smith wrote:
> Instead of running multiple SA servers, is it possible to run a single
> consolidated SA server where only the user_prefs are different for each
> spamc caller (given that the local config will override the global
> config) AND still use a single Bayes DB?  We use a clustered MySQL
> instance for Bayes, and I don't want to have to worry about a Bayes DB
> per user.
>
> The big differences between the instances are mostly the required_score
> threshold, a few score overrides and a few custom rules.
>
> Any recommendations on how to handle this?  It would be really nice to
> use a single config for all SA instances, with the only difference
> being the user config.


If all of the differences are in required_score, custom scores on a few 
rules, a few fairly trivial rules, etc, then yes, you should be able to 
do this.


Either create real system users filter1, filter2, etc or read up on 
spamd's virtual user support.  A quick read of spamd's man page shows a 
little clearer and more coherent set of options than I recall from ~2.x.


-x and --virtual-config-dir are probably good places to start.
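
A rough sketch of what the virtual-user route could look like, with one
shared spamd and a separate user_prefs file per caller. The directory
layout under PREFS_ROOT and the unprivileged account name "spamd" are
assumptions, and the two helper functions are hypothetical; they only
print command lines so they can be reviewed before anything is started.
Custom rules in user_prefs are normally ignored unless allow_user_rules
is enabled in the site config, so check the documentation for your
version before relying on them.

```shell
#!/bin/sh
# Sketch: one spamd serving several "virtual" users, each with its own prefs.
# PREFS_ROOT and the account name "spamd" are assumptions for illustration.

PREFS_ROOT="${PREFS_ROOT:-/var/spool/spamassassin/users}"

# Command line for the single shared spamd:
#   -x                    do not use per-user config from home directories
#   -u spamd              run as an unprivileged account
#   --virtual-config-dir  where each caller's prefs live (%u = name sent by spamc)
virtual_spamd_cmd() {
    echo "spamd -d -x -u spamd --virtual-config-dir=$PREFS_ROOT/%u"
}

# Command line a client would use to be scanned as a given virtual user.
spamc_cmd() {
    echo "spamc -u $1"
}

virtual_spamd_cmd
for user in filter1 filter2 filter3; do
    spamc_cmd "$user"
done
```

With this layout, the prefs for filter1 would be expected in
$PREFS_ROOT/filter1/user_prefs, and so on for the other names.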

-kgd


Re: multiple instances, simplification

2010-04-16 Thread Jorge Valdes

Kris Deugau wrote:
> Gary Smith wrote:
>> Instead of running multiple SA servers, is it possible to run a single
>> consolidated SA server where only the user_prefs are different for each
>> spamc caller (given that the local config will override the global
>> config) AND still use a single Bayes DB?  We use a clustered MySQL
>> instance for Bayes, and I don't want to have to worry about a Bayes DB
>> per user.
>>
>> The big differences between the instances are mostly the required_score
>> threshold, a few score overrides and a few custom rules.
>>
>> Any recommendations on how to handle this?  It would be really nice to
>> use a single config for all SA instances, with the only difference
>> being the user config.
>
> If all of the differences are in required_score, custom scores on a few
> rules, a few fairly trivial rules, etc, then yes, you should be able to
> do this.
>
> Either create real system users filter1, filter2, etc or read up on
> spamd's virtual user support.  A quick read of spamd's man page shows a
> little clearer and more coherent set of options than I recall from ~2.x.
>
> -x and --virtual-config-dir are probably good places to start.
>
> -kgd

Why don't you just run 3 instances of spamd, each listening on different
ports/sockets and each with their own configuration:

spamd --siteconfigpath=/etc/spam1 --socketpath=/tmp/spam1.sock --port=783
spamd --siteconfigpath=/etc/spam2 --socketpath=/tmp/spam2.sock --port=784
spamd --siteconfigpath=/etc/spam3 --socketpath=/tmp/spam3.sock --port=785

This way you can enable/disable different plugins for each config as
well as having totally different configurations in each instance.
Afterwards it's just a matter of calling the right instance from your
MDA by choosing the proper socket or tcp-port.

Since you use MySQL for Bayes, you can configure each instance with the
same configuration so that they all access the same database. And
because it's just for testing, don't forget to add --min-children=1
--max-children=1 so that each instance only runs one scanner child,
thus conserving RAM.

--
Jorge Valdes
jval...@intercom.com.sv




RE: multiple instances, simplification

2010-04-16 Thread Gary Smith
> Why don't you just run 3 instances of spamd, each listening on different
> ports/sockets and each with their own configuration:
>
> spamd --siteconfigpath=/etc/spam1 --socketpath=/tmp/spam1.sock --port=783
> spamd --siteconfigpath=/etc/spam2 --socketpath=/tmp/spam2.sock --port=784
> spamd --siteconfigpath=/etc/spam3 --socketpath=/tmp/spam3.sock --port=785
>
> This way you can enable/disable different plugins for each config as
> well as having totally different configurations in each instance.
> Afterwards it's just a matter of calling the right instance from your
> MDA by choosing the proper socket or tcp-port.
>
> Since you use MySQL for Bayes, you can configure each instance with the
> same configuration so that they all access the same database. And
> because it's just for testing, don't forget to add --min-children=1
> --max-children=1 so that each instance only runs one scanner child,
> thus conserving RAM.

Jorge,

This is all just a thought, based on my trying to create a development
environment on a laptop, which led to thinking about possible configuration
changes to the production environment.

We currently have 6+ servers running these: 3 sets of load-balanced SA
servers.  These servers are roughly 70% idle most of the time.  Running
them with user preferences, instead of different instances, would allow us
to remove 50% of the hardware.  Running them as multiple instances on the
same box means we would still need to balance across the same number of
servers.

I think the virtual user angle might work; I was just thinking of a way to
use a single consolidated MySQL instance where it doesn't care about
user_name.  If I can't resolve this elegantly, I could always just patch
the source to use a hard-coded user name in the SQL statement to ensure
that Bayes stays consistent.

Bayes is the only real concern here, as I know I can run multiple copies
(and had forgotten that I could run a single copy with user_prefs).  So I
think this will work either way.  I just needed to put a little thought
into it and bounce it off people who might have already done something
like this.

Gary Smith


Re: multiple instances, simplification

2010-04-16 Thread Kris Deugau

Gary Smith wrote:
> Bayes is the only real concern here, as I know I can run multiple copies
> (and had forgot that I could run a single copy with user_prefs).  So I
> think this will work either way.  I just needed to put a little thought
> into it and bounce off of people who might have already done something
> like this.


If you're just trying to keep your Bayes table from exploding due to 
multiple users, use the bayes_sql_override_username option.
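
As an illustration, the shared-Bayes settings could be kept in one small
fragment that the site config picks up. The sketch below just writes
such a file; the output path, DSN, credentials and the name "sitewide"
are all placeholders to be replaced with your own values.

```shell
#!/bin/sh
# Sketch: write a shared-Bayes config fragment for the site configuration.
# OUT, the DSN, the credentials and "sitewide" are placeholders.

OUT="${OUT:-./bayes-shared.cf}"

cat > "$OUT" <<'EOF'
# Store Bayes data in SQL instead of per-user DB files.
bayes_store_module           Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn                DBI:mysql:sa_bayes:db.example.internal
bayes_sql_username           sa_user
bayes_sql_password           changeme
# Read and write all tokens under one name, whatever user spamc passes in.
bayes_sql_override_username  sitewide
EOF

echo "wrote $OUT"
```

Copy the resulting file into the site config directory and run
spamassassin --lint before restarting spamd.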


-kgd


RE: multiple instances, simplification

2010-04-16 Thread Gary Smith
> If you're just trying to keep your Bayes table from exploding due to
> multiple users, use the bayes_sql_override_username option.

I'm not worried about it exploding as we don't allow user_prefs.  The machines 
are processed via relays.  I believe the bayes_sql_override_username will solve 
the last piece of the puzzle.  I think I will test this out this weekend on the 
laptop, then our test environment.  

Thanks for all of the information.

Gary Smith



Re: multiple instances, simplification

2010-04-16 Thread Kris Deugau

Gary Smith wrote:
> I'm not worried about it exploding as we don't allow user_prefs.  The
> machines are processed via relays.  I believe the
> bayes_sql_override_username will solve the last piece of the puzzle.  I
> think I will test this out this weekend on the laptop, then our test
> environment.


'Exploding' as in 'sucking up far more disk than expected causing I/O 
saturation and grotty spam detection rates' (because by default, each 
calling user as provided by spamc will get its own Bayes records in 
SQL).  In switching from an environment with a sitewide BDB file-based 
Bayes to sitewide SQL I ended up learning about that one the hard way.  :/


There's no direct relation between user_prefs and a sitewide Bayes DB.

-kgd


Looking for *heavy* Perl use-cases

2010-04-16 Thread Steffen Schwigon
Hi SpamAssassin folks!

I am currently looking for “corporate style” or “enterprise class”
Perl use-cases; and strangely it's quite difficult to find a central
resource for that.

Such use-cases usually involve applications that are so important, and
run under such heavy load, that the users need to think about how best
to spend their money on tweaking hardware, OS or Perl.

Do you know of companies that use SpamAssassin on lots of machines as
a primary use-case?

(And if you know of other Perl use-cases, please give me hints,
too. Some blogging providers and BioPerl come to mind, but I still
lack particular examples.)

Thanks and regards,
Steffen 
-- 
Steffen Schwigon s...@renormalist.net
Dresden Perl Mongers http://dresden-pm.org/