mailbox-list in sender: header?

2009-07-09 Thread McDonald, Dan
I recently received a spam with a mailbox-list in both the From: and Sender:
headers:

From: Inversiones inversiones.fo...@live.com,
 i...@lasinversionesforex.com
Sender: Inversiones inversiones.fo...@live.com,
 i...@lasinversionesforex.com

Since I had not seen a mailbox-list in a From: header before, I went to
read RFC 5322:
3.6.2.  Originator Fields

   The originator fields of a message consist of the from field, the
   sender field (when applicable), and optionally the reply-to field.
   The from field consists of the field name From and a comma-
   separated list of one or more mailbox specifications.  If the from
   field contains more than one mailbox specification in the mailbox-
   list, then the sender field, containing the field name Sender and a
   single mailbox specification, MUST appear in the message.  In either
   case, an optional reply-to field MAY also be included, which contains
   the field name Reply-To and a comma-separated list of one or more
   addresses.

Clearly, this message violates that section: its Sender: field holds two
mailboxes where only one is allowed.  Would multiple addresses in either the
From: or Sender: header be a useful spam rule?  Is that construct used often
somewhere that I'm not familiar with?
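
The RFC 5322 requirement quoted above can be checked mechanically. What
follows is only a rough shell/awk sketch under stated assumptions, not a
SpamAssassin rule: the function names count_mailboxes and check_originators
are invented for this illustration, the input is a single message on stdin,
and the comma counting is a heuristic that skips commas inside quoted
strings, angle brackets and comments but does not understand group syntax.

```shell
#!/bin/sh
# count_mailboxes FIELD  (hypothetical helper, not part of any mail tool)
# Reads one message on stdin and prints how many comma-separated entries the
# first FIELD header holds (0 if the field is absent or empty).
count_mailboxes() {
    awk -v field="$1" '
        BEGIN    { want = tolower(field) ":" }
        body     { next }                        # ignore everything after the headers
        /^$/     { body = 1; next }              # first blank line ends the header block
        /^[ \t]/ { if (collecting) value = value $0; next }   # folded continuation line
        {
            collecting = 0
            if (!found && tolower(substr($0, 1, length(want))) == want) {
                found = 1
                collecting = 1
                value = substr($0, length(want) + 1)
            }
        }
        END {
            if (!found || value ~ /^[ \t]*$/) { print 0; exit }
            n = 1
            for (i = 1; i <= length(value); i++) {
                c = substr(value, i, 1)
                if (quoted) {
                    if (c == "\\") i++           # skip the escaped character
                    else if (c == "\"") quoted = 0
                } else if (c == "\"") quoted = 1
                else if (c == "(") paren++
                else if (c == ")" && paren > 0) paren--
                else if (c == "<") angle = 1
                else if (c == ">") angle = 0
                else if (c == "," && !paren && !angle) n++
            }
            print n
        }
    '
}

# check_originators  (hypothetical helper)
# Returns 0 if the From:/Sender: combination is consistent with RFC 5322
# section 3.6.2, otherwise prints the reason and returns 1.
check_originators() {
    _msg=$(cat)
    _from=$(printf '%s\n' "$_msg" | count_mailboxes From)
    _sender=$(printf '%s\n' "$_msg" | count_mailboxes Sender)
    if [ "$_sender" -gt 1 ]; then
        echo "Sender: holds $_sender mailboxes, only one is allowed"
        return 1
    fi
    if [ "$_from" -gt 1 ] && [ "$_sender" -ne 1 ]; then
        echo "From: holds $_from mailboxes but there is no Sender: field"
        return 1
    fi
    return 0
}
```

Fed a message shaped like the one above (for example with
check_originators < message.eml, where message.eml is a placeholder file
name), the check fails on the two-mailbox Sender: field. A message with
several From: mailboxes and exactly one Sender: mailbox passes, which is why
a rule keyed on From: alone would also hit legitimate, if rare, mail.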


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




Plugging dspam into SA

2009-07-09 Thread Frank DeChellis
Hi,

We use NetBSD 4 and SA 3.2.5, and have just installed dspam 3.8.0.

Our SMTP server is exim.

We run SA on a different server.  Exim receives the mail and sends it out
for checking.  I configured this all myself, but I don't consider myself 100%
confident and well versed, so I am writing to see if there is an easy
answer to my question.

We call SA into action from exim by specifying spamd_address to the SA
server and port number.

Is there a simple way to integrate dspam into SA like we integrate Bayes
and/or Razor?  Am I looking for too simple a solution?

My hope is that dspam can be used as a simple plugin to SpamAssassin.

Right now, this is the easiest thing I have found:

http://eric.lubow.org/projects/dspam-spamassassin-plugin/

If anybody has any advice or ideas, please let me know.

THANKS.

Frank


Frank DeChellis
President, Internet Access Worldwide
Welland, Ontario, Canada
www.iaw.com




Am I fscking up my bayes db?

2009-07-09 Thread Steve Bertrand
Hi everyone,

I aggregate my work and personal email accounts within the same email
client. All accounts are IMAP-based.

My $work employs a Barracuda cluster, and of course my box runs SA.

From time-to-time, I'll get a SPAM message come through the 'cuda's.

From there, I move the message from one IMAP folder in my MUA into
another SPAM folder, which essentially is a transfer from a work storage
server onto my server.

Every few days, I run sa-learn against the collected SPAM messages.

My question is, given that the messages have already been processed by
the 'cuda's (with their header stamps in place), am I damaging, or at
risk of confusing the learning process of SA when I classify these
messages as SPAM?

Are there any negative consequences by doing this?

Steve





Re: Am I fscking up my bayes db?

2009-07-09 Thread Mike Cardwell

Steve Bertrand wrote:

> Hi everyone,
>
> I aggregate my work and personal email accounts within the same email
> client. All accounts are IMAP-based.
>
> My $work employs a Barracuda cluster, and of course my box runs SA.
>
> From time-to-time, I'll get a SPAM message come through the 'cuda's.
>
> From there, I move the message from one IMAP folder in my MUA into
> another SPAM folder, which essentially is a transfer from a work storage
> server onto my server.
>
> Every few days, I run sa-learn against the collected SPAM messages.
>
> My question is, given that the messages have already been processed by
> the 'cuda's (with their header stamps in place), am I damaging, or at
> risk of confusing the learning process of SA when I classify these
> messages as SPAM?
>
> Are there any negative consequences by doing this?

You should configure bayes to ignore those headers. In your local.cf, 
list each of the cuda headers like this:


bayes_ignore_header X-CudaHeader1
bayes_ignore_header X-CudaHeader2
bayes_ignore_header X-CudaHeader3

--
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/


Re: Am I fscking up my bayes db?

2009-07-09 Thread Daniel Schaefer

Mike Cardwell wrote:
> Steve Bertrand wrote:
>
>> Hi everyone,
>>
>> I aggregate my work and personal email accounts within the same email
>> client. All accounts are IMAP-based.
>>
>> My $work employs a Barracuda cluster, and of course my box runs SA.
>>
>> From time-to-time, I'll get a SPAM message come through the 'cuda's.
>>
>> From there, I move the message from one IMAP folder in my MUA into
>> another SPAM folder, which essentially is a transfer from a work storage
>> server onto my server.
>>
>> Every few days, I run sa-learn against the collected SPAM messages.
>>
>> My question is, given that the messages have already been processed by
>> the 'cuda's (with their header stamps in place), am I damaging, or at
>> risk of confusing the learning process of SA when I classify these
>> messages as SPAM?
>>
>> Are there any negative consequences by doing this?
>
> You should configure bayes to ignore those headers. In your local.cf,
> list each of the cuda headers like this:
>
> bayes_ignore_header X-CudaHeader1
> bayes_ignore_header X-CudaHeader2
> bayes_ignore_header X-CudaHeader3

I have a similar setup. If a Spam message makes it to my inbox with less 
than the required_score, I put it into a SPAM folder and run sa-learn on 
the folder. Should I also implement the following ignore rules?


bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Level
bayes_ignore_header X-Spam-Status
bayes_ignore_header X-Spam...etc.

--
Dan Schaefer



Re: Am I fscking up my bayes db?

2009-07-09 Thread Steve Bertrand
Mike Cardwell wrote:
> Steve Bertrand wrote:
>
>> My question is, given that the messages have already been processed by
>> the 'cuda's (with their header stamps in place), am I damaging, or at
>> risk of confusing the learning process of SA when I classify these
>> messages as SPAM?
>>
>> Are there any negative consequences by doing this?
>
> You should configure bayes to ignore those headers. In your local.cf,
> list each of the cuda headers like this:
>
> bayes_ignore_header X-CudaHeader1
> bayes_ignore_header X-CudaHeader2
> bayes_ignore_header X-CudaHeader3

Thanks Mike.

It's extremely infrequent how often I have to touch my email setup, but
I've always been curious about this.

Given your recommendation, would you say that a reset on the db should
be performed?

Essentially, is it fair to say that what I've done has possibly caused
damage?

Steve

ps. fwiw, I feel that my SA setup is not under-performing in any way at
this time.




Re: Am I fscking up my bayes db?

2009-07-09 Thread Martin Gregorie
On Thu, 2009-07-09 at 08:50 -0400, Steve Bertrand wrote:
> My question is, given that the messages have already been processed by
> the 'cuda's (with their header stamps in place), am I damaging, or at
> risk of confusing the learning process of SA when I classify these
> messages as SPAM?
 
Not really answering your question, but I find it's helpful to strip SA
headers out of the message collection I use for testing private rules.
Here's a simple bash shell script fragment that does the job and does it
fairly fast:


for f in data/*.txt
do
    echo "Cleaning $f"
    gawk '
        BEGIN       { act = "copy" }   # copy lines until told otherwise
        /^X-Spam/   { act = "skip" }   # start dropping at an X-Spam header
        /^[A-WYZ]/  { act = "copy" }   # resume at a line starting with a capital other than X
        {
            if (act == "copy")
                { print }
        }
    ' "$f" > temp.txt
    mv temp.txt "$f"
done



Martin




Re: Am I fscking up my bayes db?

2009-07-09 Thread Chr. von Stuckrad
On Thu, 09 Jul 2009, Martin Gregorie wrote:

> Here's a simple bash shell script fragment that does the job and does it
> fairly fast:
>
> for f in data/*.txt
> ...
>  gawk '
> ...
> done
 

Since there are also non-Linux users on the list, you might have explained
that THIS script needs 'gawk' (plain old awk would be enough) and
works on all the files in one directory whose names
end in '.txt' :-) E.g. my mail-collection files mostly end in
'.box' or '.eml', and my old Solaris never had any 'gawk'.
The trick of deleting all runs of 'X' headers from 'X-Spam' onward
is a good idea (except e.g. if the next header is 'X-remote-IP'
and you want to check for internal mail :-).

Stucki
-- 
Christoph von Stuckrad  * * |nickname |Mail stu...@mi.fu-berlin.de \
Freie Universitaet Berlin   |/_*|'stucki' |Tel(Mo.,Do.):+49 30 838-75 459|
Mathematik  Informatik EDV |\ *|if online|  (Di,Mi,Fr):+49 30 77 39 6600|
Takustr. 9 / 14195 Berlin   * * |on IRCnet|Fax(home):   +49 30 77 39 6601/


Re: Am I fscking up my bayes db?

2009-07-09 Thread John Hardin

On Thu, 9 Jul 2009, Martin Gregorie wrote:

> On Thu, 2009-07-09 at 08:50 -0400, Steve Bertrand wrote:
>
>> My question is, given that the messages have already been processed by
>> the 'cuda's (with their header stamps in place), am I damaging, or at
>> risk of confusing the learning process of SA when I classify these
>> messages as SPAM?
>
> Not really answering your question, but I find its helpful to strip SA
> headers out of the message collection I use for testing private rules.
> Here's a simple bash shell script fragment that does the job and does it
> fairly fast:
>
> for f in data/*.txt
> do
>    echo "Cleaning $f"
>    gawk '
>    BEGIN       { act = "copy" }
>    /^X-Spam/   { act = "skip" }
>    /^[A-WYZ]/  { act = "copy" }
>    {
>      if (act == "copy")
>        { print }
>    }
>    ' "$f" > temp.txt
>    mv temp.txt "$f"
> done



...wouldn't that mangle wrapped X-Spam headers?
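
The concerns raised in this thread (wrapped headers, unrelated X- headers
that follow an X-Spam header, and the dependence on gawk) can all be handled
by tracking header state explicitly. The following is only a sketch under
stated assumptions, not Martin's script: the function name strip_sa_headers
is made up for illustration, each input is assumed to hold exactly one
message (not an mbox), and the first blank line is taken as the end of the
header block.

```shell
#!/bin/sh
# strip_sa_headers  (hypothetical helper)
# Copies one message from stdin to stdout, dropping X-Spam-* header fields
# together with their folded continuation lines. Other X- headers and the
# message body are copied unchanged. Uses only plain awk features.
strip_sa_headers() {
    awk '
        BEGIN     { in_headers = 1; skipping = 0 }
        !in_headers { print; next }                       # body: copy untouched
        /^$/        { in_headers = 0; print; next }       # blank line ends the headers
        /^[ \t]/    { if (!skipping) print; next }        # continuation follows its field
        /^[Xx]-[Ss][Pp][Aa][Mm]-/ { skipping = 1; next }  # start of an X-Spam field
                    { skipping = 0; print }               # any other field: keep it
    '
}

# Possible use on a directory of single-message files (paths are examples):
#   for f in data/*.txt; do
#       strip_sa_headers < "$f" > temp.txt && mv temp.txt "$f"
#   done
```

Because a continuation line simply inherits the decision made for the field
it belongs to, a folded X-Spam-Status header is removed as a whole, while a
following header such as X-remote-IP is kept.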

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  North Korea: the only country in the world where people would risk
  execution to flee to communist China.  -- Ride Fast
---
 11 days until the 40th anniversary of Apollo 11 landing on the Moon


Re: Plugging dspam into SA

2009-07-09 Thread RW
On Thu, 09 Jul 2009 08:07:09 -0400
Frank DeChellis fra...@iaw.on.ca wrote:

> Hi,
>
> We use NetBSD 4 and SA 3.2.5, and have just installed dspam 3.8.0.
>
> Our SMTP server is exim.
>
> We run SA on a different server.  Exim receives the mail and sends it
> out for checking.  I configured this all myself, but I don't consider
> myself 100% confident and well versed, so I am writing to see if
> there is an easy answer to my question.
>
> We call SA into action from exim by specifying spamd_address to the SA
> server and port number.
>
> Is there a simple way to integrate dspam into SA like we integrate
> Bayes and/or Razor?  Am I looking for too simple a solution?
>
> My hope is that dspam can be used as a simple plugin to SpamAssassin.
>
> Right now, this is the easiest thing I have found:
>
> http://eric.lubow.org/projects/dspam-spamassassin-plugin/

That's the only plugin I'm aware of. I don't really see the point of a
plugin though; I just pass the mail through DSPAM and then use:

header   DS_HAM    X-DSPAM-Result =~ /^(Innocent|Whitelisted)/
header   DS_SPAM   X-DSPAM-Result =~ /^Spam/


Re: twitter spam why RCVD_IN_DNSWL?

2009-07-09 Thread Bob Proulx
Michael Scheidell wrote:
> Obviously, they are letting automated processes in.

I just wanted to confirm that I am seeing twitter invite spam that
appears AFAICT to be from twitter.com to addresses that are not and
never have been associated with Twitter.  Mostly moderated mailing
lists.  It looks to me like there is some type of interface at Twitter
that allows a user to upload a list of email addresses and invite them
to use Twitter.  Probably because addresses exist in a user's mailbox
they get spammed by Twitter with an invite.

Bob


Re: Plugging dspam into SA

2009-07-09 Thread Michael Parker


On Jul 9, 2009, at 7:07 AM, Frank DeChellis wrote:

> If anybody has any advice or ideas, please let me know.

This is probably way beyond what you wanted to get into, but the Bayes
subsystem has plugin hooks, so you could write your own dspam plugin to
use.

I'm not aware of anyone trying it, so the plugin interface might need
some tweaking, but I'd certainly be interested in the results.


Michael



Re: Perl Error: CHARSETS_LIKELY_TO_FP_AS_CAPS on SA

2009-07-09 Thread Terry Carmen

> On Wed, 8 Jul 2009, Terry Carmen wrote:
>
>> I'm running:
>>
>> #spamassassin --version
>> SpamAssassin version 3.1.9
>>   running on Perl version 5.8.8
>>
>> and would greatly appreciate help in troubleshooting this problem.
>>
>> I'm getting the error messages below from spamassassin --lint, but it seems
>> to be bogus, since CHARSETS_LIKELY_TO_FP_AS_CAPS is defined in
>> /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin:
>>
>> Any ideas?
>>
>> [30813] warn: plugin: failed to create instance of plugin
>> Mail::SpamAssassin::Plugin::HeaderEval: Can't locate object method new via
>> package Mail::SpamAssassin::Plugin::HeaderEval at
>> /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Plugin/HeaderEval.pm line
>> 39.
>>
>> [30813] warn: plugin: failed to parse plugin (from @INC):
>> CHARSETS_LIKELY_TO_FP_AS_CAPS is not exported by the
>> Mail::SpamAssassin::Constants module
>>
>> ...etc.
>
> It looks like you have multiple different versions partially installed.
> The HeaderEval plugin does not exist in 3.1.9, and
> CHARSETS_LIKELY_TO_FP_AS_CAPS is not defined in the 3.1.9 Constants.pm
> file.
>
> I'd suggest completely uninstalling SA and reinstalling 3.2.5 from
> scratch.
>
> Note that you need to install SA upgrades using the same method every
> time; you can't mix CPAN and distro packages and tarball, things will get
> confused. I suspect that's what happened here.

That's probably a good guess. There are a ton of SA dependencies and I believe
that's exactly what happened.

It seems to be running OK, so I'll just wait until the new server is ready,
and reinstall it.

Thanks for the reply!

Terry




Re: Perl Error: CHARSETS_LIKELY_TO_FP_AS_CAPS on SA

2009-07-09 Thread John Hardin

On Thu, 9 Jul 2009, Terry Carmen wrote:

>> Note that you need to install SA upgrades using the same method every
>> time; you can't mix CPAN and distro packages and tarball, things will
>> get confused. I suspect that's what happened here.
>
> That's probably a good guess. There are a ton of SA dependencies and I
> believe that's exactly what happened.

That comment only applies to SA itself, not the various CPAN libraries
(e.g. Net::DNS) that it depends on. I personally have not had problems
using CPAN to install non-SA modules with a distro-installed SA package.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
 11 days until the 40th anniversary of Apollo 11 landing on the Moon


Re: twitter spam why RCVD_IN_DNSWL?

2009-07-09 Thread Michelle Konzack
On 2009-07-09 09:31:24, Bob Proulx wrote:
> Michael Scheidell wrote:
>> Obviously, they are letting automated processes in.
>
> I just wanted to confirm that I am seeing twitter invite spam that
> appears AFAICT to be from twitter.com to addresses that are not and
> never have been associated with Twitter.  Mostly moderated mailing
> lists.  It looks to me like there is some type of interface at Twitter
> that allows a user to upload a list of email addresses and invite them
> to use Twitter.  Probably because addresses exist in a user's mailbox
> they get spammed by Twitter with an invite.
>
> Bob

[ '/etc/courier/bofh' ]-
badfrom @twitter.com


Thanks, Greetings and nice Day/Evening
Michelle Konzack
Systemadministrator
Tamay Dogan Network
Debian GNU/Linux Consultant


-- 
Linux-User #280138 with the Linux Counter, http://counter.li.org/
# Debian GNU/Linux Consultant #
Michelle Konzack   c/o Shared Office KabelBW  ICQ #328449886
+49/177/9351947Blumenstasse 2 MSN LinuxMichi
+33/6/61925193 77694 Kehl/Germany IRC #Debian (irc.icq.com)




Re: URI-DNSBL problem with spamassassin 3.2.5

2009-07-09 Thread Eddy Beliveau



>> Is there some way to find the culprit rule,
>> other than removing all rules and adding them back one at a time?
>
> Perhaps the best timing tool for rules is the HitFreqsRuleTiming
> plugin, which can be found in masses/plugins/HitFreqsRuleTiming.pm
> in the distribution. Should work with 3.2.5 and with 3.3.0.
> It is quite primitive in that it does not have any configurables,
> but just dumps its results to a file 'timing.log' in the current
> working directory (make sure it is writable for the UID under
> which SA is running, no error is issued if it can not write there).
>
> To activate it, copy it to some place, then add a loadplugin
> command to one of your .pre files, such as a local.pre, providing
> the path to the .pm file, e.g.:
>
> loadplugin HitFreqsRuleTiming /etc/mail/spamassassin/HitFreqsRuleTiming.pm
>
> Then run a command line spamassassin giving it a sample message, e.g.:
>
> $ spamassassin -t test.msg
>
> and after it finishes, you should have a sorted timing report
> in file 'timing.log' for all the rules, e.g.:
>
> T  DCC_REPUT_13_19             1.724  1.724  1
> T  RAZOR2_CF_RANGE_E8_51_100   0.525  0.525  1
  

Hi! Mark,

Many thanks for your reply.

I'm using SpamAssassin version 3.2.5 running on Perl version 5.8.5

I extracted HitFreqsRuleTiming.pm from spamassassin_20090708151200.tar.gz,
moved it to /etc/mail/spamassassin,
then created the /etc/mail/spamassassin/local.pre file with the following
line:

loadplugin HitFreqsRuleTiming /etc/mail/spamassassin/HitFreqsRuleTiming.pm

Now, in the /tmp directory, I ran spamassassin --lint -t -D, which
correctly reported:

...cut...
[24936] dbg: plugin: loading HitFreqsRuleTiming from /etc/mail/spamassassin/HitFreqsRuleTiming.pm

...cut...
[27955] dbg: plugin: HitFreqsRuleTiming=HASH(0x114a8588) implements 'start_rules', priority 0

[27955] dbg: rules: compiled one_line_body tests
[27955] dbg: plugin: Mail::SpamAssassin::Plugin::Rule2XSBody=HASH(0x1197b19c) implements 'run_body_fast_scan', priority 0

[27955] dbg: rules: running head tests; score so far=0
[27955] dbg: rules: compiled head tests
[27955] dbg: plugin: HitFreqsRuleTiming=HASH(0x114a8588) implements 'ran_rule', priority 0

...cut...
[27955] dbg: check: is spam? score=4.205 required=5
[27955] dbg: check: tests=MISSING_DATE,MISSING_HEADERS,MISSING_SUBJECT,NO_RECEIVED,NO_RELAYS
[27955] dbg: check: subtests=__BOTNET_NOTRUST,__HAS_MSGID,__HAVE_BOUNCE_RELAYS,__MISSING_REF,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__MSOE_MID_WRONG_CASE,__NONEMPTY_BODY,__SANE_MSGID,__SARE_WHITELIST_FLAG,__TVD_BODY,__UNUSABLE_MSGID


but I do not find any timing.log file in my current directory or anywhere
on my system!!


Did I miss something?

Thanks,
Eddy






Re: Am I fscking up my bayes db?

2009-07-09 Thread RW
On Thu, 09 Jul 2009 09:30:37 -0400
Steve Bertrand st...@ibctech.ca wrote:

> It's extremely infrequent how often I have to touch my email setup,
> but I've always been curious about this.
>
> Given your recommendation, would you say that a reset on the db should
> be performed?
>
> Essentially, is it fair to say that what I've done has possibly caused
> damage?

The Barracuda headers don't matter much unless you get similar headers
in your legitimate incoming mail, in which case just tell bayes to
ignore them. The irrelevant tokens will eventually age out of the
database.

The Received headers are a bit more of a problem because you're
weighting bayes against your work domain, IP addresses, etc. You could
try sending yourself a mail from work and see if it looks spammy.

 


Re: URI-DNSBL problem with spamassassin 3.2.5

2009-07-09 Thread Michael Parker


On Jul 9, 2009, at 1:40 PM, Eddy Beliveau wrote:

> but I do not find any timing.log file in my current directory or
> anywhere on my system!!
>
> Did I miss something?

I doubt all the necessary hooks are in place for that plugin to work
in 3.2.5; you'd need to run 3.3 to make use of it.


Michael



Re: Bayes expiration logic

2009-07-09 Thread RW
On Mon, 06 Jul 2009 16:13:17 -0400
Rosenbaum, Larry M. rosenbau...@ornl.gov wrote:

> Has anybody considered revising the Bayes expiration logic?  Maybe
> it's just our data that's weird, but the built-in expiration logic
> doesn't seem to work very well for us.  Here are my observations:
>
> There's no point in checking anything older than oldest_atime.  For
> this value and older, zero tokens will be expired.  The current
> estimation pass logic goes back 256 days, even if the oldest atime is
> one week and the calculations have already started returning zeroes.

And there's another problem there. If deleting tokens older than 256
days would delete more than the target number, then no tokens at all
are deleted. If the database was trained from historic corpora, then
most of the tokens could be older, and in the worst case, the database
could grow to 175% of its configured maximum.


Re: Bayes expiration logic

2009-07-09 Thread RW
On Fri, 10 Jul 2009 02:28:30 +0100
RW rwmailli...@googlemail.com wrote:

> On Mon, 06 Jul 2009 16:13:17 -0400
> Rosenbaum, Larry M. rosenbau...@ornl.gov wrote:
>
>> Has anybody considered revising the Bayes expiration logic?  Maybe
>> it's just our data that's weird, but the built-in expiration logic
>> doesn't seem to work very well for us.  Here are my observations:
>>
>> There's no point in checking anything older than oldest_atime.  For
>> this value and older, zero tokens will be expired.  The current
>> estimation pass logic goes back 256 days, even if the oldest atime
>> is one week and the calculations have already started returning
>> zeroes.
>
> And there's another problem there. If deleting tokens older than 256
> days would delete more than the target number, then no tokens at all
> are deleted. If the database was trained from historic corpora, then
> most of the tokens could be older, and in the worst case, the database
> could grow to 175% of its configured maximum.

On reflection that should be 175% of the unique tokens in the corpora,
which means that the database could grow to an unlimited size.


Hostkarma Blacklist Climbing the Charts

2009-07-09 Thread Marc Perkel
For what it's worth I'm now ahead of Barracuda on Jeff Makey's blacklist 
comparison chart. Not a scientific comparison but it's about all there 
is to compare blacklists. Now only abuseat.org and spamhaus have me 
beat. (apews doesn't count because they blacklist everything)


http://www.sdsc.edu/~jeff/spam/cbc.html




Re: Short URL provider list?

2009-07-09 Thread Marc Perkel
Thanks for the lists. I'm not sure what I'm going to do with it but I'm 
going to see if I can find a way to use it.




unsubscribe

2009-07-09 Thread Kevin Turner



Re: unsubscribe

2009-07-09 Thread Evan Platt

As the headers of every message state:

list-unsubscribe: mailto:users-unsubscr...@spamassassin.apache.org


At 07:39 PM 7/9/2009, you wrote: