whitelist_from_rcvd to train bayesdb ?

2007-04-27 Thread kshatriyak

Hi,

Although I have some negative-score rules, my ham mail never scores much
below zero. I've set the auto-learn threshold for ham to -12 to be sure spam
never gets learned as ham and my Bayes database doesn't get polluted. I
think it's quite bad if ham mail gets autolearned as spam (I guess much
worse than the other way around).


Anyway, I've been thinking of using whitelist_from_rcvd to mark mail from
certain providers (from which I have never seen spam when it came through
the right mail server) with a low score, so that my database also gets
trained with more ham.


So for example:

whitelist_from_rcvd  [EMAIL PROTECTED]  isp-sending-domain

Is this a good idea, or am I abusing the whitelist_from_rcvd rule and
missing something that will have a bad impact in the end?


Thanks!
K.



Spamd/spamc and queues

2007-04-27 Thread Egoitz Aurrekoetxea Aurre
Hi all,

I'm using spamd/spamc for mail scanning on my mail server. I'm running
SpamAssassin from Debian sarge (3.1 r1 DVD; I think it's SpamAssassin
3.0.3). My problem is that I have a mail firewall with 25 spamd processes,
and I have read that once this limit is reached, spamd queues connections
up to the value printed by perl -MSocket -e 'print SOMAXCONN'. This number
is too small (128). How can I increase it? Is this number shared with the
connection queues of other applications, such as Apache? Because in that
case the number available would be even lower. What could happen if I
increase it? Should I change some /proc value too? Please help, I'm quite
desperate with this. What else could I do here?

Thanks a lot everybody
Bye!


Egoitz Aurrekoetxea Aurre
Dpto sistemas y empresas
Infobiok C.B.
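
For reference, the compile-time constant that the Perl one-liner prints can be compared with the kernel's runtime limit using a small script. This is only a sketch, assuming Linux, where the cap on a listening socket's backlog is exposed at /proc/sys/net/core/somaxconn; the helper names are made up for illustration.

```python
import socket
from pathlib import Path

# Assumed Linux location of the runtime backlog limit.
PROC_SOMAXCONN = Path("/proc/sys/net/core/somaxconn")

def kernel_somaxconn(path=PROC_SOMAXCONN):
    """Return the kernel's runtime backlog cap, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

def effective_backlog(requested, kernel_limit):
    """The kernel caps the backlog a program passes to listen()."""
    if kernel_limit is None:
        return requested
    return min(requested, kernel_limit)

print("compile-time SOMAXCONN:", socket.SOMAXCONN)
print("kernel limit:", kernel_somaxconn())
```

As far as I know the backlog belongs to each listening socket, so spamd's queue is not shared with Apache's. Raising the kernel limit (sysctl net.core.somaxconn) only helps if the program also asks for a larger backlog when it calls listen().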


Re: Score Generation for Apache SpamAssassin

2007-04-27 Thread Justin Mason

Duncan Findlay writes:
> On Thu, Apr 26, 2007 at 12:15:52PM +0100, Justin Mason wrote:
> > thanks Duncan -- a great read, and looks promising!
> >
> > Would it help btw if we came up with a spec for what a score-generation
> > tool needs to generate, in terms of score ranges and so on?
> > This would also be useful for the future (I'm sure there'll be
> > more... ;)
>
> Probably not to me, but it might be useful to others. (I think I
> already know what needs to be done.) Also, it might limit creativity
> in possible solutions. We need a score ranges mechanism, we don't need
> the specific one we have now.

Sure.  However right now it takes a good bit of reverse engineering to
figure out what the rules are, they're not documented, and we're not going
to use code that doesn't use at least some of them -- and it's not clear
which subset are real rules and which subset are just workarounds for
undesirable behaviour by the GA or perceptron.  

That's not exactly encouraging contributions. We should document it.

--j.


Re: RBL tests on MTA vs. RBL rules on SA

2007-04-27 Thread Oenus Tech Services
After much testing, we have decided to put the RBLs on Postfix for
performance reasons. Before checking those RBLs, our system also does
EHLO checks against a known-spammer blacklist database to filter
the most obvious cases. Then we use zen.spamhaus.org,
safe.dnsbl.sorbs.net, and bl.spamcop.net, in this order. Next we do
greylisting with postgrey. Then amavisd-new+clamav+sa+sare+fuzzyocr take
care of the rest (our logs show that approx. 98% of all spam/virus
mail has been blocked before this point). We stopped using Bayes
altogether since 1.- many of our customers get their mail through POP3,
and 2.- those with IMAP accounts would not bother training spam and ham.
We've had some (very few) problems in the past with SpamAssassin giving
false positives for some ham (though some would say it was spam), but
modifying some scores did the trick without affecting our ability to
filter spam, since most of it was filtered before it reached
SpamAssassin. The result: a mail system that hosts email accounts for
more than 100 companies with no spam at all.

Is there a possibility that we might be blocking sources of legitimate
mail by being so aggressive? My experience tells me that if a server
is on any of the three RBLs we use, it is because 1.- it is
misconfigured (open relay and such), 2.- it is on a residential
[dynamic] IP segment, 3.- its operators permit spam coming from their
servers, or 4.- if it was listed by mistake, their IT people are not
professional enough to get it delisted immediately.

Ignasi
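
To make the lookup order above concrete, here is a minimal sketch of how a DNSBL query name is formed (the IPv4 octets reversed, followed by the zone) and how the zones could be tried in the order given. The is_listed callback is hypothetical and stands in for a real DNS lookup; 192.0.2.1 is a documentation address, not a real sender.

```python
import ipaddress

# Zones in the order given in the message above.
ZONES = ["zen.spamhaus.org", "safe.dnsbl.sorbs.net", "bl.spamcop.net"]

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def first_listing(ip, zones, is_listed):
    """Return the first zone that lists ip, or None.

    is_listed is a caller-supplied function (hypothetical here) that takes
    the query name and returns True when the name resolves.
    """
    for zone in zones:
        if is_listed(dnsbl_query_name(ip, zone)):
            return zone
    return None

print(dnsbl_query_name("192.0.2.1", ZONES[0]))
```

Because the first hit ends the search, putting the list that catches the most connections first keeps the number of DNS queries per connection low.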

Luis Hernán Otegui escribió:
> Hi, list, I know this is one of those egg and chicken kind of
> questions, but having now the possibility of checking the impact of
> various setups, I was wondering if it is more convenient to let the MTA
> perform the RBL checks, or disable them and let SA do this job.
> Currently I am using zen.spamhaus.org as my
> primary (and only) RBL tester on Postfix, and I am kinda surprised. The
> daily statistics show that my server is rejecting almost 22000
> connections a day, and accepting only 2500-3000 emails. The major
> drawback is bayes. It seems to lack the necessary amount of data to
> catch up as the spam evolves, so I'm continuously getting new kinds of
> spam (meaning that I can't figure out a tendency to draw a rule from).
> So I'm asking if anyone has a solution for this, or how do you deal with
> this (to me) delicate balance.
>
> Thanks in advance,
>
>
> Luis
>
> --
> GNU-GPL: May The Source Be With You...



RE: RBL tests on MTA vs. RBL rules on SA

2007-04-27 Thread Michael Scheidell

> -----Original Message-----
> From: Oenus Tech Services [mailto:[EMAIL PROTECTED]
> Sent: Friday, April 27, 2007 6:33 AM
> To: Luis Hernán Otegui
> Cc: users@spamassassin.apache.org
> Subject: Re: RBL tests on MTA vs. RBL rules on SA
>
> After much testing, we have decided to put the RBLs on Postfix for
> performance reasons. Before checking with those RBLs, our system does
> EHLO checks against a known-spammer blacklist database as well to filter
> the most obvious cases. Then we use zen.spamhaus.org,
> safe.dnsbl.sorbs.net, and bl.spamcop.net, in this order. Next we do

We use (and like) SpamCop, and PERSONALLY, in our office, we use it in Postfix.

HOWEVER, it is TOO easy to get ON the SpamCop BL, and TOO hard to get off.

Now, I don't want to get into a debate about AOL, Yahoo, Gmail, etc., but if a
spammer uses one of those accounts to send a lot of email, then you WILL
bounce legit AOL email because 'that one' (or two or three) AOL SMTP server is
blacklisted.

Try using the BLs in graylisting instead.


-- 
Michael Scheidell, CTO
Join SECNAP at SecureWorld Atlanta, May 1-2, May 16-17 in Philadelphia
http://www.secnap.com/events for free and discounted seminar tickets
_
This email has been scanned and certified safe by SpammerTrap(tm).
For Information please see http://www.spammertrap.com
_


Re: whitelist_from_rcvd to train bayesdb ?

2007-04-27 Thread Matt Kettler
[EMAIL PROTECTED] wrote:
> Hi,
>
> Although I have some negative-score rules, my ham mails never score
> too much below zero. I've set auto learning for ham to -12 to be sure
> spam never gets marked as ham and my bayes database doesn't get
> polluted- i think it's quite bad if ham mail would be autolearned as
> spam (i guess much more worse than the other way around).
>
> Anyway, i've been thinking to use whitelist_from_rcvd to mark mail
> from certain providers (which i never saw spam from if it came from
> the right mailserver) with a low score so that my database also gets
> trained with more ham.

userconf rules are not used when determining the learning score. This
includes all whitelist_* rules.
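
A rough illustration of that point, not SpamAssassin's actual code: the score used for the auto-learn decision is worked out separately from the message score and leaves out rules that come from user configuration. The hit list, the scores, and the userconf set below are made up for the example; -12 is the ham threshold from the question.

```python
HAM_THRESHOLD = -12.0  # threshold from the original question

def learning_score(hits, userconf_rules):
    """Sum rule scores, skipping rules that come from user configuration."""
    return sum(score for rule, score in hits.items()
               if rule not in userconf_rules)

def would_autolearn_ham(hits, userconf_rules, threshold=HAM_THRESHOLD):
    """True when the learning score is at or below the ham threshold."""
    return learning_score(hits, userconf_rules) <= threshold

# Hypothetical hits for one whitelisted message.
hits = {"USER_IN_WHITELIST": -100.0, "SOME_HAM_RULE": -1.5, "SOME_SPAM_RULE": 0.3}
userconf = {"USER_IN_WHITELIST"}

print(learning_score(hits, userconf))       # whitelist hit is ignored
print(would_autolearn_ham(hits, userconf))
```

Under this model the whitelist entry lowers the score the user sees but not the learning score, so it would not push more mail below a -12 ham threshold.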




RE: Mail Lost? How can this happen?

2007-04-27 Thread Michael Scheidell

> -----Original Message-----
> From: dbsanders [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 26, 2007 8:15 PM
> To: users@spamassassin.apache.org
> Subject: Mail Lost? How can this happen?
>
> Not sure this is an SA problem at all, but maybe you can give
> me a clue. I seem to be losing messages. They are received by
> my mail system:
>
> Apr 26 10:28:45 heckle sendmail[9295]: [ID 801593 mail.info]
> l3QHShh9009295: from=[EMAIL PROTECTED], size=78591,
> class=0, nrcpts=1,
> msgid=!!AAAYAKUg0ihnLChGuixmNt9Cx7vCgAAAEAAA
> [EMAIL PROTECTED],

Did you cut off the logs here? I don't see a closing > in the message id.
procmail.log shows:

> From [EMAIL PROTECTED]  Thu Apr 26 10:28:46 2007
>  Subject: Northwood Connect - Qlogic  #SANQ4747
>   Folder: SpamBox

Procmail problem, join procmail list.

If SA was done with it and sent it to procmail, then procmail dropped
it.


-- 
Michael Scheidell, CTO


RE: RBL tests on MTA vs. RBL rules on SA

2007-04-27 Thread Michael Scheidell
> spam/virus mail had been blocking before this). We stopped
> using bayesian at all since 1.-Many of our customers get

Ps, bayesian isn't just for manual training.  Maybe set a high/low score in 
auto learning, but it does help.


FUN: Help Rob McEwen test his new anti-spam tools!

2007-04-27 Thread Rob McEwen
FUN PROJECT:

Help Rob McEwen test his new anti-spam tools!

As many already know... I'm one of a **small** handful of organizations with
authority to blacklist and whitelist at will on SURBL and I've provided
much administrative assistance to SURBL for years, particularly in
preventing false positives. Of course, my efforts there are minuscule
compared to Jeff Chan's great work! Still, Jeff has thanked me countless
times for my assistance.

Most importantly, I have an insider's view and **uncommon expertise** into
what it takes to make a world class blacklist and, within the next few
business days, I will be officially releasing my 2 new Invaluement Spam
Blocklists:

(1) The Invaluement-URI blocklist
(much like SURBL & URIBL)

..AND..

(2) The Invaluement-SIP blocklist, a Sender's IP blocklist
(a.k.a. an RBL, like DSBL, SBL, etc.).

SIP = Sender's IP

Proverbs 15:22 says, "Without counsel plans fail, but with many advisers
they succeed." NOT that these two lists will be built by committee... but,
along these lines, I sure could use some feedback!

You may be asking:

--WHY SHOULD WE USE THESE LISTS?

--HOW ARE THEY HELPFUL?

--WHAT ARE THESE?

First, if you are already using SURBL & URIBL, continue to do so!

Invaluement-URI will NOT replace SURBL & URIBL as those lists WILL catch
things that Invaluement-URI will miss or not catch as quickly.

However...

**
REGARDING: Invaluement-URI blocklist
**

(A) The Invaluement-URI blocklist is catching over 1,000 URIs (per week)
minutes, hours, and even days BEFORE surbl or uribl or even uribl-red!

Did you catch that? Let me repeat:

Invaluement-URI is listing over 1,000 URIs (per week) minutes, hours,
and even days BEFORE surbl or uribl or even uribl-red!

(If a URI showed up on ANY 1 of these lists, I didn't count it towards that
tally. I ONLY counted items which were not on ANY of those other lists!)

Q: Why? How?

A: Mostly because Invaluement-URI is a fast reacting list! Often even
faster than URIBL-RED!!

Q: Why is this important?

A: Because many new series of spams are listed on Invaluement-URI lightning
fast and this will help you block much spam that would otherwise pass
through your spam filtering during those minutes/hours BEFORE the URI is
listed on SURBL or URIBL.

(B) The False Positive Rate for Invaluement-URI is extremely low -- and
might even be better than SURBL's already very low FP rate! I have yet to
spot a single egregious FP... and the **few** that I have spotted (and
removed) were VERY questionable to begin with!

NOTE: Being aggressive and fast is easy... but doing such **without** the
FPs is incredibly difficult. Years of programming and analysis went into the
development of these two lists!

(C) Additionally, Invaluement-URI is catching many URIs, particularly
phishes, that **might** NEVER be getting in SURBL or URIBL... or at least
that seems to be the case as several days have gone by without them being
listed.

NOTE: You might ask, "Rob, why haven't **you** placed these into SURBL or
requested them be listed in URIBL?" The answer is simple. In recent weeks,
finishing touches on these new lists have consumed most of my time and
energies. But I do plan to use this knowledge/data to do more submissions to
SURBL & URIBL. However, even then, for various reasons, such submissions
will have to be hand-submitted and hand-checked. Therefore,
Invaluement-URI will STILL have the upper hand in being a fast-reaction
list.


**
REGARDING: Invaluement-SIP blocklist
**

I find that many Sender's IP blocklists (a.k.a. RBLs):

(1) tend to catch much spam without FPs, but also seem to have diminishing
returns... sort of an upper limit in their effectiveness... a glass
ceiling

...OR...

(2) block much legit mail and/or very credible sources... or even purposely
punish sources of legit mail for those ISP's/ESP's who are lacking in
their prevention of spams sent from their network.

So you are stuck with one type of Sender's IP blocklist being helpful, but
very limited... and the other type too aggressive to be used, requiring that
you score it very, very low in your filtering to prevent FPs... thus
minimizing its effectiveness!

IN CONTRAST... you'll find Invaluement-SIP to be a best of both worlds
Sender's IP Blocklist. It is as aggressive and fast reacting as many of
the best... listing MANY IPs that are not yet on other RBLs... but NOT
having the high FP rate found on many other aggressive IP blacklists.


**
REGARDING: BOTH blocklists
**

LOW MEMORY FOOTPRINT:

While both are quick reacting... both are also quick expire time lists.
If a spam hasn't been seen containing that URI or from that Sender's IP for
more than a few weeks, it gets expired and removed. This keeps the memory
footprint very low... and this opens up multiple possibilities... like
possible 

BOTNET is great but...

2007-04-27 Thread Andy Spiegl
...I wonder how to deal with the cases where there is a legitimate
internal mailserver behind dialup-IPs.  There are quite a few small
companies that have a small home office network behind a dialup DSL
and run an internal mailserver which relays external mail to the mailserver
of their provider which then delivers to the destination.

That seems perfectly okay to me and very distinct from the botnet case
where mails from dialup-IPs are sent _directly_ to the destination MX.
But the BOTNET rules don't differentiate these two cases.

How do you think this should be handled?  How do YOU deal with it?
I'd really hate to lower the BOTNET scores but otoh if it hits
legit mailservers too?

Thanks,
 Andy.

PS: Shouldn't the BOTNET_SOHO rule avoid a high BOTNET score in these cases?
Or do I have to set the score for BOTNET_SOHO manually???
-- 
 Warning: This email, when printed on paper, has sharp edges.
 Handle with care or serious injury may result.


RE: Mail Lost? How can this happen?

2007-04-27 Thread dbsanders

Strange, I checked the log and there is no closing bracket in the message id.
Maybe this screwed with something in the SA/procmail process.







'From' containing comma + french accent is getting stripped

2007-04-27 Thread Eric Beaurivage
Dear SpamAssassin Users,

X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) Exchange SpamAssassin 
Sink (www.christopherlewis.com) 1.2.76

I'm using the above versions under Windows 2000 Server, and as soon as someone
sends us an e-mail with a name containing a comma + a French accent in the
'From' field, the name and e-mail address get stripped.  Examples:

From: Bournival, Jean-Sébastien

Header =

From: Bournival,
=?iso-8859-1?Q?Jean-S=E9bastien?= [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

=

From: Bournival

*

From: Beaurivage, Éric

Header =

From: Beaurivage,
=?iso-8859-1?B?yXJpYw==?= [EMAIL PROTECTED]
To: Eric Beaurivage [EMAIL PROTECTED]

=

From: Beaurivage

*

This happens only with the combination of a comma + a French accent: if the
message is sent by 'Éric Beaurivage' or 'Beaurivage, Eric', nothing gets
stripped.

Please, let me know if you have any ideas.  I also tried disabling all the 
plugins and it hasn't fixed our problem.  Thanks in advance!

-é



Re: 'From' containing comma + french accent is getting stripped

2007-04-27 Thread Daryl C. W. O'Shea

Eric Beaurivage wrote:

> Dear SpamAssassin Users,
>
> X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) Exchange SpamAssassin
> Sink (www.christopherlewis.com) 1.2.76
>
> I'm using the above versions under Windows 2000 Server and as soon as someone
> is sending us an e-mail with a name containing a comma + a french accent in the
> 'From' field, the name and e-mail address are getting stripped.  Examples:
>
> [...]
>
> This is happening as soon as there is a combination of a comma + a french
> accent because if the message would be sent by 'Éric Beaurivage' or
> 'Beaurivage, Eric' it won't get stripped out.
>
> Please, let me know if you have any ideas.  I also tried disabling all the
> plugins and it hasn't fixed our problem.  Thanks in advance!


Since SA doesn't re-write the From: field I'd look into a problem with 
the exchange sink.


Daryl



Re: 'From' containing comma + french accent is getting stripped

2007-04-27 Thread Daryl C. W. O'Shea

Daryl C. W. O'Shea wrote:

> Since SA doesn't re-write the From: field I'd look into a problem with
> the exchange sink.
>
> Daryl


...unless you're using report_safe, then I could see that maybe there's 
a problem with SA.  In any case, try running a message that should be 
affected by the SA+exchange sink combo through just spamassassin to see 
if the problem occurs.


Daryl
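
One thing that may be worth checking when testing, offered purely as a guess and not confirmed anywhere in this thread: in the headers as they appear in this archive, the display name contains an unquoted comma, and an address parser that follows the usual header syntax treats a bare comma as the separator between addresses. The sketch below uses Python's standard library only to show that behaviour; js@example.com is a placeholder because the real address is masked here.

```python
from email.utils import getaddresses

# Display name with an unquoted comma, as in the header shown earlier.
unquoted = "Bournival, =?iso-8859-1?Q?Jean-S=E9bastien?= <js@example.com>"
# Same name with the display name quoted (accent dropped for simplicity).
quoted = '"Bournival, Jean-Sebastien" <js@example.com>'

split_result = getaddresses([unquoted])
whole_result = getaddresses([quoted])

print(len(split_result), split_result)
print(len(whole_result), whole_result)
```

The unquoted form parses as two entries, the first being just 'Bournival', which resembles the stripped result reported above. Whether the Exchange sink parses the header this way is an assumption.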


Re: RBL tests on MTA vs. RBL rules on SA

2007-04-27 Thread Brian Godette
Oenus Tech Services wrote:
> After much testing, we have decided to put the RBLs on Postfix for
> performance reasons. Before checking with those RBLs, our system does
> EHLO checks against a known-spammer blacklist database as well to filter
> the most obvious cases. Then we use zen.spamhaus.org,
> safe.dnsbl.sorbs.net, and bl.spamcop.net, in this order. Next we do

safe.dnsbl.sorbs.net includes new.spam.dnsbl.sorbs.net which is not very
safe at all. bl.spamcop.net isn't all that safe either. Both will
routinely hit on the free email providers' and major ISPs' outgoing MTAs.
This is because both have automatic systems generating them.

It's fairly hard for any sizable ISP or mail provider not to be constantly
going on and off the new.spam and spamcop lists, given harvested/weak
passwords and the newer bots that use the MTA configured in the
default mail client of the zombied system, including being able to do
SMTP-AUTH.

safe sorbs would be something along the lines of:
dul.dnsbl.sorbs.net + relays.dnsbl.sorbs.net + zombie.dnsbl.sorbs.net



Re: Mail Lost? How can this happen?

2007-04-27 Thread Nigel Frankcom
It won't be SA doing the deleting. SA does nothing with email except
scan and add the headers (if so set). 

What happens to the mail before and after is entirely down to your
mail prog. You can't bounce a message from SA, nor can you delete a
message from SA; all you can do is scan it and add some headers to say
if SA thinks it's ham or spam. 

If you are missing mail it may be worth seeing if you have rbl list
based refusal at your MTA.

Bottom line ... SA doesn't delete emails... it may mis-label them but
it doesn't delete.

If mails are being lost then the screw up is at the MTA; SA doesn't
delete emails (I think I may have said that already :-D)

Hope that helps.

KR

Nigel





Re: BOTNET is great but...

2007-04-27 Thread John Rudd

Andy Spiegl wrote:

> ...I wonder how to deal with the cases where there is a legitimate
> internal mailserver behind dialup-IPs.  There are quite a few small
> companies that have a small home office network behind a dialup DSL
> and run an internal mailserver which relays external mail to the mailserver
> of their provider which then delivers to the destination.
>
> That seems perfectly okay to me and very distinct from the botnet case
> where mails from dialup-IPs are sent _directly_ to the destination MX.
> But the BOTNET rules don't differentiate these two cases.
>
> How do you think this should be handled?  How do YOU deal with it?
> I'd really hate to lower the BOTNET scores but otoh if it hits
> legit mailservers too?
>
> Thanks,
>  Andy.
>
> PS: Shouldn't the BOTNET_SOHO rule avoid a high BOTNET score in these cases?
> Or do I have to set the score for BOTNET_SOHO manually???


The situation you're talking about is exactly what BOTNET_SOHO is meant 
to handle.  Those soho type mail servers that _cannot_ get their ISP to 
give them a static IP address with proper DNS for their mail domain.



When you're just using the BOTNET rule directly, not as a meta-rule, the
BOTNET_SOHO code is called internally, so it should automatically kick
in and exempt a host from BOTNET if it appears to be a soho type mail
server.  But it's difficult to detect that.



You only need to set a score for BOTNET_SOHO if you're using BOTNET as a 
metarule.


hello, is there anybody out there?

2007-04-27 Thread Anton Melser

Did the filters get me as spam, or are my questions too stupid to even
think about?
Cheers
Anton
ps. If there are docs out there that answer my questions, then please
stoop to providing a link or two...


R: parsing the summary

2007-04-27 Thread Giampaolo Tomassoni
> -----Original Message-----
> From: Anton Melser [mailto:[EMAIL PROTECTED]
>
> Hi,
> I am writing a programme which needs to parse the summary (_SUMMARY_)
> returned by SA, and after combing the docs couldn't find relevant
> specs. It appears that the lines are a fixed length, but I couldn't be
> sure... is there anywhere I can get the specs so my parser doesn't do
> silly things?

The only spec I can see is this:

%s %-22s %s%s\n%s

(From PerMsgStatus.pm line no.2784 as per v.3.1.8)


> Cheers
> Anton

Cheers,

Giampaolo

PS: No answer often means: What is he asking about?
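
One observation about the format string quoted above, sketched in Python because its % operator accepts the same conversion: %-22s pads the second field to at least 22 characters but does not truncate it, so the lines are not strictly fixed-length. The meaning of each argument is a guess on my part (score, rule name, description, and two slots left empty), and the long rule name is invented.

```python
FMT = "%s %-22s %s%s\n%s"  # format string quoted above

def render(score, rule, desc):
    # Argument meanings are guesses; the last two slots are left empty.
    return FMT % (score, rule, desc, "", "")

short = render(" 3.5", "BOTNET", "Relay might be a spambot or virusbot")
long_ = render(" 0.1", "A_RULE_NAME_LONGER_THAN_22_CHARS", "hypothetical rule")

print(short, end="")
print(long_, end="")
print(short.index("Relay"), long_.index("hypothetical"))
```

A parser that slices at fixed column offsets would therefore break on long rule names; splitting on whitespace after the rule name is safer.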



Re: BOTNET is great but...

2007-04-27 Thread Andy Spiegl
John Rudd wrote:

> When you're just using the BOTNET rule directly, not as a meta-rule, the
> BOTNET_SOHO code is called internally, so it should automatically kick in
> and exempt a host from BOTNET if it appears to be a soho type mail server.

I'm not sure I understand what you mean by using as a meta-rule.
Do you mean it should work if I just write:
  describe        BOTNET  Relay might be a spambot or virusbot
  header          BOTNET  eval:botnet()
  score           BOTNET  3.5

(That's the default in Botnet.cf)

If so, I don't understand why for example my own mails get scored like
this:  (I've got a soho mailserver too)

 X-Spam-Checker-Version: SpamAssassin 3.1.7-deb (2006-10-05) on
        condor.int.spiegl.de
 X-Spam-Scores: AWL=-1.933,BAYES_00=-2.599,BOTNET=3.5,FORGED_RCVD_HELO=0.135

These are the corresponding header lines:
 Received: from pop..de [80.237.184.21]
        by condor.int.spiegl.de with POP3 (fetchmail-6.3.8)
        for [EMAIL PROTECTED] (single-drop); Tue, 24 Apr 2007 20:48:13 +0200 (CEST)
 Received: from condor.int.spiegl.de (p57988fca.dip.t-dialin.net [87.152.143.202])
        by sienna..de  via kasmail (3.1)
        id 1IgQ30-4tK-1-sienna; Tue, 24 Apr 2007 18:47:30 GMT
 Received: from condor.int.spiegl.de ([EMAIL PROTECTED] [127.0.0.1])
        by condor.int.spiegl.de (8.13.8/8.13.8/Debian-3) with ESMTP id l3OIlTIb032652
        (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
        Tue, 24 Apr 2007 20:47:29 +0200
 Received: (from [EMAIL PROTECTED])
        by condor.int.spiegl.de (8.13.8/8.13.8/Submit) id l3OIlTTk032647;
        Tue, 24 Apr 2007 20:47:29 +0200

My internal mailserver (condor.int.spiegl.de, 87.152.143.202) delivered the
mail via SMTP AUTH to the mailserver of my provider, and then a bit later I
fetched the mail from the popserver and ran SpamAssassin.
If Botnet checks whether the provider's mailserver is an MX of spiegl.de,
that's the case:
 spiegl.de mail is handled by 10 mx1.spiegl.de. (82.165.28.56)
 spiegl.de mail is handled by 10 mx2.spiegl.de. (80.237.158.92)
 spiegl.de mail is handled by 10 mx3.spiegl.de. (80.237.206.21)
 spiegl.de mail is handled by 10 mx4.spiegl.de. (80.237.184.21)

sienna..de has address 80.237.184.21  (- mx4.spiegl.de)
What else could be wrong?

And I can't get rid of the FORGED_RCVD_HELO either. :-(
condor.int.spiegl.de resolves to the dynamic IP, as it should.
What else is necessary?

Thanks,
 Andy.

-- 
 2 is not equal to 3  -- not even for large values of 2.


Re: SUBJECT_ENCODED_TWICE really wrong?

2007-04-27 Thread alan premselaar

On 4/25/07 11:15 PM, John Wilcock wrote:
> Andy Spiegl wrote:
>
>> But the score for SUBJECT_ENCODED_TWICE is pretty high:
>>  1.723
>> How does that justify?
>
> No doubt it is justified by the fact that the corpora used to
> determine SpamAssassin scores don't contain enough non-English-language
> content.
>
> You'll almost certainly find that you want to lower the score for this
> rule (and other rules such as SUBJ_ILLEGAL_CHARS which tend to cause FPs
> on genuine non-English mail).
>
> John.

I've had to reduce the SUBJECT_ENCODED_TWICE score (to .001 so I know it
hits but so it doesn't have any impact) because it's basically required
to handle long 2-byte subject encoding.

I've left SUBJ_ILLEGAL_CHARS as is because the subject really shouldn't
contain raw non-ascii characters, it should be encoded.

So far I haven't had any problems with this combination.

just my 2 yen worth.

Alan
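
To illustrate why long non-ASCII subjects end up with more than one encoded part, here is a small sketch using Python's standard email.header module (not SpamAssassin). The subject text and the choice of UTF-8 are arbitrary, and whether SUBJECT_ENCODED_TWICE matches exactly this pattern is an assumption on my part.

```python
from email.header import Header, decode_header, make_header

# A long non-ASCII subject (60 Japanese characters, chosen arbitrarily).
subject = "テスト" * 20

# Encode it the way a mail client would for the Subject: header.
encoded = Header(subject, charset="utf-8").encode()
print(encoded)

# Count the encoded-words the library needed for this one subject.
words = encoded.count("=?utf-8?b?")
print(words)

# Decoding joins the pieces back into the original text.
decoded = str(make_header(decode_header(encoded)))
```

A single encoded-word is limited in length, so a legitimate client has to split a long subject into several of them; a rule that fires on more than one encoded-word will hit such mail.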


Re: BOTNET is great but...

2007-04-27 Thread John Rudd

Andy Spiegl wrote:

> John Rudd wrote:
>
>> When you're just using the BOTNET rule directly, not as a meta-rule, the
>> BOTNET_SOHO code is called internally, so it should automatically kick in
>> and exempt a host from BOTNET if it appears to be a soho type mail server.
>
> I'm not sure I understand what you mean by using as a meta-rule.
> Do you mean it should work if I just write:
>   describe        BOTNET  Relay might be a spambot or virusbot
>   header          BOTNET  eval:botnet()
>   score           BOTNET  3.5


Yes, that is using the BOTNET rule directly, and not as a meta-rule. 
So it will call the BOTNET_SOHO code automatically.  If your config 
qualifies for the soho exemption, this will make it happen.




> (That's the default in Botnet.cf)
>
> If so, I don't understand why for example my own mails get scored like
> this:  (I've got a soho mailserver too)
>
>  X-Spam-Checker-Version: SpamAssassin 3.1.7-deb (2006-10-05) on
>         condor.int.spiegl.de
>  X-Spam-Scores: AWL=-1.933,BAYES_00=-2.599,BOTNET=3.5,FORGED_RCVD_HELO=0.135
>
> These are the corresponding header lines:
>  Received: from pop..de [80.237.184.21]
>         by condor.int.spiegl.de with POP3 (fetchmail-6.3.8)
>         for [EMAIL PROTECTED] (single-drop); Tue, 24 Apr 2007 20:48:13 +0200 (CEST)
>  Received: from condor.int.spiegl.de (p57988fca.dip.t-dialin.net [87.152.143.202])
>         by sienna..de  via kasmail (3.1)
>         id 1IgQ30-4tK-1-sienna; Tue, 24 Apr 2007 18:47:30 GMT
>  Received: from condor.int.spiegl.de ([EMAIL PROTECTED] [127.0.0.1])
>         by condor.int.spiegl.de (8.13.8/8.13.8/Debian-3) with ESMTP id l3OIlTIb032652
>         (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT);
>         Tue, 24 Apr 2007 20:47:29 +0200
>  Received: (from [EMAIL PROTECTED])
>         by condor.int.spiegl.de (8.13.8/8.13.8/Submit) id l3OIlTTk032647;
>         Tue, 24 Apr 2007 20:47:29 +0200
>
> My internal mailserver (condor.int.spiegl.de, 87.152.143.202) delivered the
> mail via SMTP AUTH to the mailserver of my provider, and then a bit later I
> fetched the mail from the popserver and ran SpamAssassin.
> If Botnet checks whether the provider's mailserver is an MX of spiegl.de,
> that's the case:
>  spiegl.de mail is handled by 10 mx1.spiegl.de. (82.165.28.56)
>  spiegl.de mail is handled by 10 mx2.spiegl.de. (80.237.158.92)
>  spiegl.de mail is handled by 10 mx3.spiegl.de. (80.237.206.21)
>  spiegl.de mail is handled by 10 mx4.spiegl.de. (80.237.184.21)
>
> sienna..de has address 80.237.184.21  (- mx4.spiegl.de)
> What else could be wrong?


Assuming that the sender address on this message was 
(something)@spiegl.de , then in order to get the BOTNET_SOHO code to 
trigger, either:


a) spiegl.de has 1-5 A records, and one of them resolves to the 
submitting relay (87.152.143.202).


b) spiegl.de has 1-5 MX records, and one of them has 1-5 A records, one 
of which resolves to the submitting relay (87.152.143.202).


Neither of these conditions is true: the lone A record for spiegl.de 
resolves to 80.237.211.99; and you showed the MX records for spiegl.de 
already and they don't point back to the submitting relay either. 
Therefore, the IP address submitting the message doesn't appear to be 
the soho mail relay for spiegl.de (according to the code used by BOTNET 
for detecting soho mail relays).


So, you aren't getting the soho exemption.
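
Conditions (a) and (b) above can be sketched as follows. This is a paraphrase of the description in this message, not the plugin's source; the function name is made up, and plain dicts filled with the records quoted in this thread stand in for real DNS lookups.

```python
def is_soho_relay(domain, relay_ip, a_records, mx_records, limit=5):
    """Apply conditions (a) and (b) as described in the message above.

    a_records maps a hostname to its list of IPv4 addresses and mx_records
    maps a domain to its list of MX hostnames.
    """
    def a_match(host):
        ips = a_records.get(host, [])
        return 1 <= len(ips) <= limit and relay_ip in ips

    # (a) the sender domain has 1-5 A records and one is the relay
    if a_match(domain):
        return True
    # (b) the domain has 1-5 MX records and one MX host passes the same test
    mx_hosts = mx_records.get(domain, [])
    return 1 <= len(mx_hosts) <= limit and any(a_match(h) for h in mx_hosts)

# Records as quoted in this thread.
A = {
    "spiegl.de": ["80.237.211.99"],
    "mx1.spiegl.de": ["82.165.28.56"],
    "mx2.spiegl.de": ["80.237.158.92"],
    "mx3.spiegl.de": ["80.237.206.21"],
    "mx4.spiegl.de": ["80.237.184.21"],
}
MX = {"spiegl.de": ["mx1.spiegl.de", "mx2.spiegl.de", "mx3.spiegl.de", "mx4.spiegl.de"]}

print(is_soho_relay("spiegl.de", "87.152.143.202", A, MX))
```

With these records the check fails for 87.152.143.202, which matches the conclusion above; it would pass only if the dynamic IP itself appeared among the domain's A or MX addresses.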



Re: parsing the summary

2007-04-27 Thread Matt Kettler
Anton Melser wrote:
> Hi,
> I am writing a programme which needs to parse the summary (_SUMMARY_)
> returned by SA, and after combing the docs couldn't find relevant
> specs. It appears that the lines are a fixed length, but I couldn't be
> sure... is there anywhere I can get the specs so my parser doesn't do
> silly things?
> Cheers
> Anton

AFAIK, there's no set-in-stone format for _SUMMARY_.

This field is intended to be human readable, not machine readable.

Most folks who need to parse SA's results parse the spamd logs or the
regular X-Spam-Status header.
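
A minimal sketch of the header route, assuming the common X-Spam-Status layout of a Yes/No verdict followed by key=value pairs, with the tests list possibly folded across lines. The sample value is constructed from the rule names earlier in this digest rather than copied from a real message, and older versions may label the score differently.

```python
import re

def parse_spam_status(value):
    """Parse an X-Spam-Status header value into a dict.

    Assumes 'Yes|No, key=value key=value ...' and a comma-separated
    tests list that may be folded onto continuation lines.
    """
    value = re.sub(r"\s*\n\s+", "", value.strip())  # unfold continuation lines
    verdict, _, rest = value.partition(",")
    fields = dict(re.findall(r"(\w+)=(\S*)", rest))
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": [t for t in fields.get("tests", "").split(",")
                  if t and t != "none"],
    }

# Constructed sample, not taken from a real message.
sample = ("No, score=-0.9 required=5.0 tests=AWL,BAYES_00,BOTNET,\n"
          "\tFORGED_RCVD_HELO autolearn=no version=3.1.7")

result = parse_spam_status(sample)
print(result)
```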