Re: Single word mails .

2007-04-26 Thread Tim B.

Matt Kettler wrote:

ram wrote:
  
Are the spammers testing some new spam tool?
I am getting mails with just a single word, like "gushes", "using", etc.

What is this about now?
  


Read the archives for more details, however the general consensus is
it's due to:

1) a mass run of short emails to a broad range of randomly generated
addresses in an attempt to discover new ones (aka a Rumpelstiltskin
attack)

- OR -

2) some spammer screwed up their template when they last pushed one out
to their botnet, and as a result the bots are generating emails with no
useful payload.

Both are quite plausible.


  
Is there a good test for these? 

The first day's run seemed to be related to the German stock market, but
for today's run I cannot find a common thread so far, as the subject
just carries the same single word as the body.
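
The pattern described here (a body of exactly one word, repeated in the subject) can be written down as a quick check. This is only a rough Python sketch for triaging a saved message, not a SpamAssassin rule and not something anyone in this thread posted. The function name is_single_word_mail is made up for illustration, and the sketch assumes a plain, non-multipart message with no transfer encoding.

```python
from email import message_from_string


def is_single_word_mail(raw):
    """Return True if the body is one word that also equals the subject.

    Hypothetical helper for illustration only; it ignores multipart
    messages and encoded bodies.
    """
    msg = message_from_string(raw)
    if msg.is_multipart():
        return False
    words = msg.get_payload().split()
    subject = (msg.get("Subject") or "").strip()
    # Exactly one word in the body, and the subject repeats it.
    return len(words) == 1 and words[0].lower() == subject.lower()
```

A check this narrow would miss runs where the subject differs from the body, so it is at best a way to count how many of these arrived, not a filter.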




Re: Score Generation for Apache SpamAssassin

2007-04-26 Thread Justin Mason

Duncan Findlay writes:
 Hi everybody,
 
 As you may already know, Steven Birk and I have been working on our
 4th year undergraduate project in Math and Engineering at Queen's
 University.
 
 The goal of our project was to examine the use of logistic regression
 as a potential replacement for the Perceptron/GA currently used by the
 SpamAssassin project.
 
 It's now done, and it's available here:
 http://people.apache.org/~duncf/FindlayBirkThesis.pdf
 
 Basically, we've found a technique that shows promise as a possible
 replacement, but requires some modifications in order to handle some
 of the restrictions the SpamAssassin project puts on scores.
 
 I hope to try to make those modifications in the next month or so, but
 I have no idea how well it will turn out, or how easy it will be.
 
 The paper may be an interesting read for people not too familiar with
 the way the scoring process works now, as it discusses many of the
 issues that differentiate the scoring process from most other machine
 learning problems. (Then again, it might just be boring.)

thanks Duncan -- a great read, and looks promising!

Would it help btw if we came up with a spec for what a score-generation
tool needs to generate, in terms of score ranges and so on?
This would also be useful for the future (I'm sure there'll be
more... ;)

that'd be related to
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5376 ...

--j.
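
For readers unfamiliar with the idea discussed in this thread, here is a minimal Python sketch of how logistic regression can turn rule hits into per-rule scores. It is not the method from the thesis and not SpamAssassin's score generator. The function names, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration, and the sketch does not handle the score-range restrictions mentioned above.

```python
import math


def sigmoid(z):
    """Logistic function mapping a summed score to a spam probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_scores(hits, labels, n_rules, lr=0.1, epochs=200):
    """Fit one weight ("score") per rule with plain stochastic gradient descent.

    hits   -- list of sets; each set holds the indices of rules a message hit
    labels -- list of 1 (spam) or 0 (nonspam), same length as hits
    """
    weights = [0.0] * n_rules
    bias = 0.0
    for _ in range(epochs):
        for rules, label in zip(hits, labels):
            z = bias + sum(weights[i] for i in rules)
            err = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            bias -= lr * err
            for i in rules:
                weights[i] -= lr * err
    return weights, bias


# Toy corpus: rule 0 fires only on spam, rule 2 only on nonspam,
# rule 1 fires on both.
hits = [{0}, {0, 1}, {2}, {1, 2}]
labels = [1, 1, 0, 0]
weights, bias = fit_scores(hits, labels, n_rules=3)
```

In this toy run the rule seen only on spam ends up with a positive weight and the rule seen only on nonspam with a negative one, which is the behaviour a score generator needs before any range limits are applied.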


Re: IP - Responsible Person

2007-04-26 Thread Marc Perkel



Matt Kettler wrote:

You imply Comcast has sufficient technical know-how to manage a network.

For a while their own outbound mailserver wasn't even generating a HELO
or EHLO.

  


Is that what it was? I had been getting a lot of complaints that random 
Comcast users couldn't email anyone on our servers, and I never did 
figure out what the problem was. I thought they were using some 
brain-dead MTA like Qmail and couldn't set their RDNS correctly, another 
Comcast problem.




Re: Single word mails .

2007-04-26 Thread Andy Spiegl
Hi Tim,

 Is there a good test for these? 

I don't get many of them, probably because I block them at MTA via
zen.spamhaus.org.  But the ones that do get through are caught nicely
by the BOTNET rules.

Chau,
 Andy.

-- 
 If it ain't broke, improve it.


Who wants my spam? - seriously

2007-04-26 Thread Marc Perkel
As many of you know, I do front end spam filtering and I block a lot of 
spam. I have been feeding some of this blocked spam to others who can 
use it to mine for information like IPs to block, virus-infected hosts, 
URIBL data, etc. I also have several feeds, depending on what kind of 
spam you want, and I add headers storing the IP address and host name 
of the server that sent it to me.


If you are running a service that provides free services to the world I 
will give you this data for free. If you are a commercial filtering 
business that doesn't provide free services to the world then I have 
these feeds for sale. Several of the blacklists some of you already use 
get spam from me. I hate spam and want to help fight it.


So - if anyone is interested in my feeds let me know and I can set you up.

Marc Perkel
http://www.junkemailfilter.com



Re: RDJ handling question

2007-04-26 Thread Chris Thielen
Bowie Bailey wrote:
 RDJ is supposed to download to the RulesDuJour directory.  After it
 downloads the files there, it moves them from ${TMPDIR} to ${SA_DIR}.
 ${TMPDIR} is RDJ's working directory.  You don't want SA reading its
 rules from there.  RDJ may have multiple copies of each rule file
 stored there.

   

This is correct. 

To the OP, try deleting all the SARE rules from /etc/spamassassin then
run RDJ once more.  All the latest versions of the SARE rules should
then appear in /etc/spamassassin.



Re: Single word mails .

2007-04-26 Thread Steven W. Orr
On Thursday, Apr 26th 2007 at 01:45 -0400, quoth Matt Kettler:

=ram wrote:
= Are the spammers testing some new spam tool?
= I am getting mails with just a single word, like "gushes", "using", etc.
=
= What is this about now?
=   
=Read the archives for more details, however the general consensus is
=it's due to:
=
=1) a mass run of short emails to a broad range of randomly generated
=addresses in an attempt to discover new ones (aka a Rumpelstiltskin
=attack)
=
=- OR -
=
=2) some spammer screwed up their template when they last pushed one out
=to their botnet, and as a result the bots are generating emails with no
=useful payload.
=
=Both are quite plausible.

Ok. I have questions:

1. Should I run these through sa-learn --spam or are these not to be 
considered as spam?

2. And also, maybe OT, should these messages be reported to SpamCop?
We all know they're spam, but to be fair, they're not trying to *sell* us 
anything, thus providing a basis for not calling them spam.


-- 
Time flies like the wind. Fruit flies like a banana. Stranger things have  .0.
happened but none stranger than this. Does your driver's license say Organ ..0
Donor?Black holes are where God divided by zero. Listen to me! We are all- 000
individuals! What if this weren't a hypothetical question?
steveo at syslang.net


whitelist_from_rcvd problem

2007-04-26 Thread Bret Miller
One of my users is supposed to get messages from this person, but they
often get marked as spam. So I want to whitelist, and I can use
whitelist_from, but I want to use whitelist_from_rcvd. BUT, it doesn't
work for me.

I said:
whitelist_from_rcvd [EMAIL PROTECTED] sbc.com

Which I think means that as long as his e-mail comes from any host in
any subdomain of sbc.com, it should be whitelisted. But the message
didn't hit the whitelist. (Headers below.)

Before I opened a bug ticket, I just wanted to make sure my reasoning
was sound in thinking that this should have been whitelisted by the
above configuration entry. (I've had to report bugs previously with
whitelist_spf not parsing the received headers from CommuniGate Pro, so
perhaps this is related. I wonder if the header-parsing code is a
central routine or if each plugin has its own way of doing it...)

Thanks,
Bret



X-Spam-Tests: tests=AWL=4.115,BAYES_50=0.001,DKIM_POLICY_SIGNSOME=0.001,
FH_RELAY_NODNS=1.451,HTML_MESSAGE=0.001,RCVD_IN_MXRATE_WL=-1,
RDNS_NONE=0.1;autolearn=no
X-Spam-Score: 4.7
X-Spam-Checker-Version: SpamAssassin 3.2.0-rc2 (2007-04-13) on
mail.hq.wcg.org
X-Spam-Level: 
X-TFF-CGPSA-Version: 1.6a5
X-WCG-CGPSA-Filter: Scanned
X-SPAM-FLAG: Yes
Return-Path: [EMAIL PROTECTED]
Received: from nlpi029.sbcis.sbc.com ([207.115.36.58] verified)
  by mail.wcg.org (CommuniGate Pro SMTP 5.1.8)
  with ESMTP id 21043544 for [EMAIL PROTECTED]; Thu, 26 Apr 2007
11:37:26 -0700
Received-SPF: none
 receiver=mail.wcg.org; client-ip=207.115.36.58;
[EMAIL PROTECTED]
X-ORBL: [63.198.171.170]
Received: from JBROD (adsl-63-198-171-170.dsl.lsan03.pacbell.net
[63.198.171.170])
by nlpi029.sbcis.sbc.com (8.13.8 out.dk.spool/8.13.8) with ESMTP
id l3QIUgM5027947
for [EMAIL PROTECTED]; Thu, 26 Apr 2007 13:31:11 -0500
From: Jon Brod [EMAIL PROTECTED]
To: 'Bernie Schnippert' [EMAIL PROTECTED]
Subject: RE: California/Ontario Estate Matter
Date: Thu, 26 Apr 2007 11:30:09 -0700
Message-ID: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary==_NextPart_000_0010_01C787F6.4582C0D0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.6626
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028
In-Reply-To: [EMAIL PROTECTED]





RE: whitelist_from_rcvd problem

2007-04-26 Thread Bret Miller
 One of my users is supposed to get messages from this person, but they
 often get marked as spam. So I want to whitelist, and I can use
 whitelist_from, but I want to use whitelist_from_rcvd. BUT, it doesn't
 work for me.

 I said:
 whitelist_from_rcvd [EMAIL PROTECTED] sbc.com

 Which I think means that as long as his e-mail comes from any host in
 any subdomain of sbc.com, it should be whitelisted. But the message
 didn't hit the whitelist. (Headers below.)

OK, never mind. Upgrading to rc3 (or something in the update process)
fixed this.

Bret




Re: whitelist_from_rcvd problem

2007-04-26 Thread John D. Hardin
On Thu, 26 Apr 2007, Bret Miller wrote:

 I said:
 whitelist_from_rcvd [EMAIL PROTECTED] sbc.com

try:

  whitelist_from_rcvd [EMAIL PROTECTED] *.sbc.com

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174     pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Judicial Activism (n): interpreting the Constitution to grant the
  government powers that are popularly felt to be needed but that
  are not explicitly provided for therein (common definition);
  interpreting the Constitution as it is written (Brady definition)
---
 558 days until the Presidential Election



Re: whitelist_from_rcvd problem

2007-04-26 Thread Duane Hill

On Thu, 26 Apr 2007, John D. Hardin wrote:


On Thu, 26 Apr 2007, Bret Miller wrote:


I said:
whitelist_from_rcvd [EMAIL PROTECTED] sbc.com


try:

 whitelist_from_rcvd [EMAIL PROTECTED] *.sbc.com


If that does work, it goes against what is documented. I haven't had any 
problem with whitelist_from_rcvd in the way Bret has illustrated.


The first parameter is the address to whitelist, and the second is a
 string to match the relay's rDNS.

 This string is matched against the reverse DNS lookup used during the
 handover from the internet to your internal network's mail exchangers.
 It can either be the full hostname, or the domain component of that
 hostname.
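
The matching rule quoted above (the second parameter is either the full rDNS hostname or its domain component) can be modelled in a few lines. This Python sketch is a simplified model of that documented behaviour only, not SpamAssassin's actual code. The function name rdns_matches is invented for illustration, and anything beyond the two cases in the quoted text is outside this model.

```python
def rdns_matches(relay_rdns, pattern):
    """Simplified model of the rDNS test described in the quoted documentation.

    relay_rdns -- reverse DNS name recorded for the relay that handed the
                  message to your mail exchanger (may be empty)
    pattern    -- second parameter of whitelist_from_rcvd
    """
    relay = relay_rdns.lower().rstrip(".")
    pat = pattern.lower().rstrip(".")
    if not relay or not pat:
        # With no rDNS recorded there is nothing to compare against.
        return False
    # Either the full hostname, or a domain component of that hostname.
    return relay == pat or relay.endswith("." + pat)
```

Under this model "sbc.com" already covers nlpi029.sbcis.sbc.com without a leading "*.", while a relay with no recorded rDNS cannot match at all.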


Mail Lost? How can this happen?

2007-04-26 Thread dbsanders

Not sure this is an SA problem at all, but maybe you can give me a clue.
I seem to be losing messages. They are received by my mail system:

Apr 26 10:28:45 heckle sendmail[9295]: [ID 801593 mail.info] l3QHShh9009295:
from=[EMAIL PROTECTED], size=78591, class=0, nrcpts=1,
msgid=![EMAIL PROTECTED],
proto=ESMTP, daemon=MTA, relay=ds2085.appliedi.net [216.82.71.202]
Apr 26 10:28:46 heckle sendmail[9297]: [ID 801593 mail.info] l3QHShh9009295:
to=[EMAIL PROTECTED], delay=00:00:02, xdelay=00:00:00,
mailer=local, pri=108821, dsn=2.0.0, stat=Sent

(no mention of spamd or any spam processing for some reason in the system
mail log for this message)

procmail.log shows:

From [EMAIL PROTECTED]  Thu Apr 26 10:28:46 2007
 Subject: Northwood Connect - Qlogic  #SANQ4747
  Folder: SpamBox
78896

So it looks like this was processed by Spamd and classified as spam.
However, the message is NOT in SpamBox. I've looked. Many times.

I use sendmail with procmail for local delivery, on Solaris 10 SPARC.

--- procmailrc ---
DROPPRIVS=yes
:0fw
* < 256000
| spamc

:0:
* ^X-Spam-Status: Yes
SpamBox




Re: Score Generation for Apache SpamAssassin

2007-04-26 Thread Duncan Findlay
On Thu, Apr 26, 2007 at 12:15:52PM +0100, Justin Mason wrote:
 thanks Duncan -- a great read, and looks promising!

 Would it help btw if we came up with a spec for what a score-generation
 tool needs to generate, in terms of score ranges and so on?
 This would also be useful for the future (I'm sure there'll be
 more... ;)

Probably not to me, but it might be useful to others. (I think I
already know what needs to be done.) Also, it might limit creativity
in possible solutions. We need a score-ranges mechanism, but we don't
need the specific one we have now.


-- 
Duncan Findlay




Re: Single word mails .

2007-04-26 Thread Matt Kettler
Steven W. Orr wrote:
 On Thursday, Apr 26th 2007 at 01:45 -0400, quoth Matt Kettler:

 =ram wrote:
 = Are the spammers testing some new spam tool?
 = I am getting mails with just a single word, like "gushes", "using", etc.
 =
 = What is this about now?
 =   
 =Read the archives for more details, however the general consensus is
 =it's due to:
 =
 =1) a mass run of short emails to a broad range of randomly generated
 =addresses in an attempt to discover new ones (aka a Rumpelstiltskin
 =attack)
 =
 =- OR -
 =
 =2) some spammer screwed up their template when they last pushed one out
 =to their botnet, and as a result the bots are generating emails with no
 =useful payload.
 =
 =Both are quite plausible.

 Ok. I have questions:

 1. Should I run these through sa-learn --spam or are these not to be 
 considered as spam?
   
Why wouldn't you?

Don't over-think your bayes training. If it's undesirable to you, train
it as spam. If it's desirable to you, train it as nonspam.

A lot of folks get caught in the trap of trying to avoid training on
bayes poison or borderline spam, for fear that these less obvious
cases will confuse SA when it gets nonspam mail. Don't worry too much
about that. The chi-squared combining algorithm is really quite good at
not being fooled by tokens that appear in both kinds of mail.
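
To illustrate why tokens seen in both kinds of mail do little harm, here is a small Python sketch of chi-squared (Fisher-style) combining as it is commonly described for Bayesian spam filters. It is an illustration under stated assumptions, not SpamAssassin's own implementation: the function names, the clamping bounds, and the sample probabilities are all chosen for the example.

```python
import math


def chi2q(x2, v):
    """Probability that a chi-squared variable with v (even) degrees of
    freedom is at least x2."""
    assert v % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities into one value in [0, 1]."""
    # Clamp to avoid log(0); the bounds are an assumption for this sketch.
    probs = [min(max(p, 1e-6), 1.0 - 1e-6) for p in probs]
    n = len(probs)
    if n == 0:
        return 0.5
    # S is near 1 when the evidence looks spammy, H when it looks hammy.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0


spammy = combine([0.99] * 5)         # strong spam evidence
hammy = combine([0.01] * 5)          # strong nonspam evidence
mixed = combine([0.99] * 5 + [0.5])  # one neutral token added
```

A token that shows up equally in spam and nonspam sits near 0.5 and barely moves the result, and evenly balanced evidence in both directions lands close to 0.5 rather than at either extreme.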

 2. And also, maybe OT, should these messages be reported to SpamCop?
 We all know they're spam, but to be fair, they're not trying to *sell* us 
 anything, thus providing a basis for not calling them spam.

   
Spamcop uses UBE as their definition of spam. That's Unsolicited Bulk
Email, as opposed to Unsolicited Commercial Email.

http://www.spamcop.net/fom-serve/cache/125.html

So, while these messages are not selling anything, thus not commercial,
they are still unsolicited bulk mail.

Based on that, I'd say they fit the criteria, so go ahead and report
them if you like.