[SA 3.4.1] Are all Perl module dependencies necessary?

2018-12-10 Thread Matteo Dessalvi
Hi all.
I am running SA version 3.4.1 on Debian Jessie (from the backports archives). SA 
runs through Amavis and so far I have not had any problems, either with the 
classification or with the updates through sa-update. Following the advice on the SA 
wiki (https://wiki.apache.org/spamassassin/ImproveAccuracy), I just ran: 
"spamassassin -D --lint 2>&1 | grep -i failed"
and I got the following output:
[...]
Dec 10 11:47:20.731 [6060] dbg: diag: [...] module not installed: Digest::SHA1 ('require' failed)
Dec 10 11:47:20.731 [6060] dbg: diag: [...] module not installed: Geo::IP ('require' failed)
Dec 10 11:47:20.731 [6060] dbg: diag: [...] module not installed: Net::CIDR::Lite ('require' failed)
Dec 10 11:47:20.732 [6060] dbg: diag: [...] module not installed: LWP::UserAgent ('require' failed)
Dec 10 11:47:20.732 [6060] dbg: diag: [...] module not installed: Encode::Detect::Detector ('require' failed)
Dec 10 11:47:20.732 [6060] dbg: diag: [...] module not installed: Net::Patricia ('require' failed)
Despite the 'require failed' messages, how many of these modules are actually 
used by SA? Are the missing modules preventing some of the rules from 
triggering or affecting the scoring process in any noticeable way? Or are they 
basically bypassed because SA is invoked through Amavis via the 
'Mail::SpamAssassin' module?

I went ahead and installed all the missing Perl modules through additional 
packages, but it would be nice to know why these missing requirements did not 
seem to affect SA. 
Can anybody explain it to me or point me to the docs, code commits, bug 
reports or mailing list threads related to this?

Thanks in advance!
Best regards,
Matteo
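
For comparing the list of missing modules across several hosts, the debug output can be reduced to bare module names. This is a minimal Python sketch, assuming the diagnostics keep the "module not installed: NAME ('require' failed)" wording shown in this message; the function name `missing_modules` is made up for illustration and is not part of SpamAssassin.

```python
import re

# Matches the diagnostic wording seen in the 'spamassassin -D --lint' output.
MISSING_RE = re.compile(r"module not installed: ([\w:]+) \('require' failed\)")


def missing_modules(debug_output):
    """Return the reported Perl module names, in order, without duplicates."""
    # Mail clients may wrap long log lines, so collapse all whitespace first.
    flat = " ".join(debug_output.split())
    names = []
    for name in MISSING_RE.findall(flat):
        if name not in names:
            names.append(name)
    return names


sample = """\
Dec 10 11:47:20.731 [6060] dbg: diag: [...] module not installed: Digest::SHA1 ('require' failed)
Dec 10 11:47:20.731 [6060] dbg: diag: [...] module not installed: Geo::IP ('require' failed)
Dec 10 11:47:20.732 [6060] dbg: diag: [...] module not installed:
Net::Patricia ('require' failed)
"""

for module in missing_modules(sample):
    print(module)
```

The sketch only lists what the lint run reports; whether a given module matters depends on which plugins and rules are enabled.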

 


Re: Classifying mail as unsolicited

2015-07-07 Thread Matteo Dessalvi

Hi.

Why do you think Bayes has problems with such email?
Considering that the example you put on Pastebin is triggering
BAYES_00=-1.9, I believe that if you are able to collect
a sufficient number of these messages and feed them to SA
through sa-learn, you should start to trigger more BAYES_5X
or even BAYES_9X rules.

Since you are operating a forwarding server for other users
you can think about creating a fake account just to collect
samples such as the one you've posted.

Regards,
Matteo

On 07.07.2015 05:21, Alex wrote:

Hi,

We have a system with a few hundred users, many of which forward their
mail off the server to their gmail or yahoo account. Lately I've
started to notice quite a few messages are being tagged by gmail and
delayed being received as unsolicited. I know the KAM rules contain a
marketing rule, and razor helps too, but too many of these marketing
messages are not being tagged.

[SNIP]

Here is an example message:

http://pastebin.com/kaD3AQMz

I realize bayes may be a problem on this one, but do you have any
suggestions for blocking these more effectively before they're
forwarded on to gmail?

Thanks,
Alex



Re: Bayes expiration with Redis backend

2015-02-27 Thread Matteo Dessalvi

On 27.02.2015 13:54, Axb wrote:


Is it possible you reject so much spam that SA sees very little spam?


I believe it's the case. A combination of Postfix policies, blacklists,
ClamAV plus additional signatures, etc. greatly reduces the amount of
email sent through the filtering pipeline.



If classification makes sense, you're not getting FPs or tons of spam
with BAYES_00 you have nothing to worry about.

Sometimes you have to trust your gut feeling... don't let wild bayes
theories misguide you...watch your logs, listen to your users and build
up a feel for it.



I will certainly do that. So far the feedback from our users is
quite good and so I can say I am happy with our current setup.

Regards,
   Matteo


Re: Bayes expiration with Redis backend

2015-02-27 Thread Matteo Dessalvi

Thanks a lot for the explanation, Mark; it was very clear.
It would be a good idea to consider adding that to the
perldoc of the BayesStore/Redis.pm module.

Regards,
   Matteo

On 27.02.2015 14:55, Mark Martinec wrote:


When redis automatically expires tokens internally based on their
TTL, this operation does not affect nspam and nham counts. These
counts just grow all the time (as there is no explicit expiration
that SpamAssassin would know about), reflecting the count of
(auto)learning operations.

Don't worry about large nspam and/or nham counts when redis is
in use, all that matters is that these counts are above 200
(otherwise bayes is disabled).
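
The threshold check can be sketched in Python as follows. It assumes the counts are taken from 'sa-learn --dump magic' lines that end in "non-token data: nspam" and "non-token data: nham", with the count in the third column, and that the minimum is the stock default of 200 for both spam and ham; the function names are invented for this illustration.

```python
MIN_LEARNED = 200  # assumed default for both the spam and the ham minimum


def read_counts(magic_output):
    """Extract nspam and nham from 'sa-learn --dump magic' style lines."""
    counts = {}
    for line in magic_output.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:")
        name = name.strip()
        if name in ("nspam", "nham"):
            # Columns before the label: probability, spam, value, atime.
            counts[name] = int(fields.split()[2])
    return counts


def bayes_usable(counts, minimum=MIN_LEARNED):
    """Bayes is used only when both counts have reached the minimum."""
    return counts.get("nspam", 0) >= minimum and counts.get("nham", 0) >= minimum


sample = """\
0.000  0  3  0  non-token data: bayes db version
0.000  0   8437  0  non-token data: nspam
0.000  0 495000  0  non-token data: nham
"""

counts = read_counts(sample)
print(counts, bayes_usable(counts))
```

With Redis the counts only ever grow, so this check says nothing about how many tokens are still live in the database.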

You may get the number of tokens that are actually in the redis
database (not expired) by counting the number of lines produced
on stdout by 'sa-learn --backup' or 'sa-learn --dump data'.

The format of fields produced by --dump data is:

   probability  spam_count  ham_count  atime  token

The --backup format is similar, but does not provide
probabilities, just spam and ham counts.

To get some estimate on the number of hammy vs. spammy tokens
(not messages) currently in a database, try something like:

   sa-learn --dump data | \
 awk '$1<0.1 {h++}; $1>0.9 {s++}; END{printf("h=%d, s=%d\n",h,s)}'


(caveat: sa-learn --backup or --dump data may not work on a huge
database, as they need all the tokens (redis keys) to fit into memory)
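
The same hammy/spammy estimate can be written in Python for anyone who prefers it to awk. This is only a sketch: it assumes the five-column '--dump data' layout described above (probability first), uses the 0.1 and 0.9 cut-offs as illustrative thresholds, and the sample tokens are invented.

```python
def count_tokens(lines, ham_below=0.1, spam_above=0.9):
    """Count tokens whose probability is clearly hammy or clearly spammy."""
    hammy = spammy = 0
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # not a token line
        try:
            probability = float(parts[0])
        except ValueError:
            continue  # skip anything that does not start with a number
        if probability < ham_below:
            hammy += 1
        elif probability > spam_above:
            spammy += 1
    return hammy, spammy


# Invented sample in the layout: probability spam_count ham_count atime token
sample = [
    "0.002  0  12  0  a1b2c3d4e5",
    "0.050  1  40  0  0f9e8d7c6b",
    "0.500  3   3  0  1122334455",
    "0.998  25  0  0  5544332211",
]

h, s = count_tokens(sample)
print("h=%d, s=%d" % (h, s))
```

To run it against a real database, the lines could be read from standard input instead of the sample list, which avoids holding more than one line at a time on the Python side.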


   Mark


Bayes expiration with Redis backend

2015-02-27 Thread Matteo Dessalvi

Hi all.

I am using a centralized Redis instance to
host the bayesian data for a bunch of MTAs.

AFAICS the SA filter is working quite well
and the BAYES_* rules are triggered correctly,
with no false positives so far.

But I am concerned about the expiration of the
bayesian data. sa-learn reports the following:

0.000  0  3  0  non-token data: bayes db version
0.000  0   8437  0  non-token data: nspam
0.000  0 495000  0  non-token data: nham

As stated here:

search.cpan.org/dist/Mail-SpamAssassin/lib/Mail/SpamAssassin/BayesStore/Redis.pm

Expiry is done internally in Redis using *_ttl settings (...)
This is why --force-expire etc does nothing, and token counts
and atime values are shown as zero in statistics.

So why have the nham tokens grown so much? It looks
like they were never 'pruned'.

I am using the following configuration for the expiration:

bayes_token_ttl 21d
bayes_seen_ttl   8d
bayes_auto_expire 1

I have also left bayes_expiry_max_db_size undefined.

My other concern is about the proportion between spam
and ham tokens. Should I be worried about it?

Thanks in advance!

Regards,
   Matteo


Re: updated RegistrarBoundaries.pm

2015-02-24 Thread Matteo Dessalvi

Hello.

Sorry Axb, I don't want to be pedantic, but the latest
'svn export' you suggested gave me an error:
svn: Repository moved permanently to (other location).

Indeed, if I try:

svn export http://svn.apache.org/repos/asf//spamassassin/trunk/lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm


then I am able to get the file.

Are there any 'risks' in replacing the file that comes
with a packaged version of SA (Debian Backports) with this one?
AFAICS, the update consists mostly of some changes
to the regexps.

Anyway, thanks for the update and the 'heads up'!

Best regards,
Matteo

On 21.02.2015 12:09, Axb wrote:

DOH! - need more coffee...

  svn export
http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm




On 02/21/2015 11:29 AM, Axb wrote:



PLS EXCUSE ERROR

seems that wget command WILL *NOT* get you the .pm file.

You'll need to SVN checkout or use a browser and download

http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm?view=co




On 02/21/2015 11:17 AM, Axb wrote:

I just updated RegistrarBoundaries.pm to reflect

http://data.iana.org/TLD/tlds-alpha-by-domain.txt
# Version 2015022100, Last Updated Sat Feb 21 07:07:01 2015 UTC

NOTE: This is not updated via sa-update, only with release updates.

On CentOS/RHEL/Fedora boxes you will probably find it in

/usr/local/share/perl5/Mail/SpamAssassin/Util

use locate RegistrarBoundaries.pm to determine path

backup your old version and replace/wget

http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm











Re: Recent spate of Malicious VB attachments II

2015-02-19 Thread Matteo Dessalvi

Hello.

I am just curious, since I am using SaneSecurity
signatures too.

According to: http://sanesecurity.com/usage/signatures/
some of the lists you mentioned have been classified
with 'medium' to 'high' risk of false positives:

foxhole_*
spear / spearl

Did you not get into trouble with those ones?

Regards,
   Matteo

On 19.02.2015 15:46, Reindl Harald wrote:


On 19.02.2015 at 15:43, David F. Skoll wrote:

On Thu, 19 Feb 2015 09:34:28 -0500
Alex Regan mysqlstud...@gmail.com wrote:

[David Skoll]

spreadsheet with a macro virus in it.  ClamAV is essentially
useless at detecting viruses, so it's a real problem... any ideas?



Useless? Are you using the third-party patterns?


No, because when I tried some of them, there were an unacceptably
high number of FPs.  I tried tweaking various sets of Sane Security
signatures and they didn't work well for me


looks like you are using the wrong ones
no problems with these ones

blurl.ndb
bofhland_cracked_URL.ndb
bofhland_malware_attach.hdb
bofhland_malware_URL.ndb
bofhland_phishing_URL.ndb
crdfam.clamav.hdb
foxhole_all.cdb
foxhole_filename.cdb
foxhole_generic.cdb
malwarehash.hsb
phish.ndb
phishtank.ndb
rogue.hdb
sanesecurity.ftm
scamnailer.ndb
scam.ndb
sigwhitelist.ign2
spearl.ndb
spear.ndb
winnow.attachments.hdb
winnow_bad_cw.hdb
winnow_extended_malware.hdb
winnow_malware.hdb
winnow_malware_links.ndb
winnow_phish_complete_url.ndb
winnow_spam_complete.ndb



Re: Can't locate object method check_for_spf_helo_permerror via package Mail: [...]:SpamAssassin::PerMsgStatus

2015-02-10 Thread Matteo Dessalvi

Hello KAM.

Unfortunately I am still getting the same warning messages
with the new 25_spf.cf.

It looks like the check part:

if can(Mail::SpamAssassin::Plugin::SPF::has_check_for_spf_errors)
(...)
endif

is ignored. Could it be an effect introduced by the sa-compile step?
The output of the 'rules compilation' looked fine to me, though.

If it can help, I am using Debian Wheezy with Perl 5.14.2 and
SA version 3.004000 (from the backports archives).

Best regards,
   Matteo

On 10.02.2015 15:13, Kevin A. McGrail wrote:

Can you grab the 25_spf.cf from
http://svn.apache.org/viewvc/spamassassin/trunk/rules/25_spf.cf?view=co
and see if that works?

Then I'll hope the rule update hits tomorrow.  There is some vagueness
to my understanding of exactly how long rules take from start to finish
to go outbound.  There are some emergency rule generation procedures if
someone wants to help the project.

I would guess I missed the cutoff for yesterday's masscheck and
tomorrow's will include it.

Regards,
KAM


Re: Can't locate object method check_for_spf_helo_permerror via package Mail: [...]:SpamAssassin::PerMsgStatus

2015-02-10 Thread Matteo Dessalvi

Uh oh, sorry, my bad. I had tried the modified ruleset on
a test installation without the Rule2XSBody plugin enabled.

With the plugin enabled (and after restarting Amavis) it looks
like those warning messages are not there anymore.

Thanks a lot for the fix!

Regards,
  Matteo

On 10.02.2015 16:06, Kevin A. McGrail wrote:


Interesting idea.  Axb convinced me not to bother with rules

Regards,
KAM


Re: Can't locate object method check_for_spf_helo_permerror via package Mail: [...]:SpamAssassin::PerMsgStatus

2015-02-09 Thread Matteo Dessalvi

Hi all.

I am getting the same errors. I am running SA through Amavis but I guess
it does not matter in this case. Is there a way of fixing those errors,
apart from disabling the rules from 25_spf.cf?

SA version: 3.4.0 - Perl 5.14.2 (Debian Wheezy - 64 bit)

Log snapshot:


Feb  9 10:42:13 lxmtin2 amavis[6411]: (06411-09) _WARN: rules: failed to 
run T_SPF_HELO_PERMERROR test, skipping:\n\t(Can't locate object method 
check_for_spf_helo_permerror via package 
Mail::SpamAssassin::PerMsgStatus at (eval 1329) line 19, GEN38 line 
4061.\n)


Feb  9 10:42:13 lxmtin2 amavis[6411]: (06411-09) _WARN: rules: failed to 
run T_SPF_TEMPERROR test, skipping:\n\t(Can't locate object method 
check_for_spf_temperror via package Mail::SpamAssassin::PerMsgStatus 
at (eval 1329) line 664, GEN38 line 4061.\n)


Feb  9 10:42:13 lxmtin2 amavis[6411]: (06411-09) _WARN: rules: failed to 
run T_SPF_PERMERROR test, skipping:\n\t(Can't locate object method 
check_for_spf_permerror via package Mail::SpamAssassin::PerMsgStatus 
at (eval 1329) line 841, GEN38 line 4061.\n)


Feb  9 10:42:13 lxmtin2 amavis[6411]: (06411-09) _WARN: rules: failed to 
run T_SPF_HELO_TEMPERROR test, skipping:\n\t(Can't locate object method 
check_for_spf_helo_temperror via package 
Mail::SpamAssassin::PerMsgStatus at (eval 1329) line 1226, GEN38 
line 4061.\n)


Best regards,
 Matteo

On 09.02.2015 10:12, Reindl Harald wrote:

what is that below introduced with tonights update and get triggered now
for every single mail and why does such things not automatically get
caught before push?

score T_SPF_PERMERROR 0
score T_SPF_TEMPERROR 0
score T_SPF_HELO_PERMERROR 0
score T_SPF_HELO_TEMPERROR 0

09-Feb-2015 05:35:10: SpamAssassin: Update processed successfully


Feb  9 10:02:57 mail-gw spamd[9786]: spamd: server hit by SIGHUP,
restarting
Feb  9 10:02:57 mail-gw spamd[9786]: spamd: server socket closed, type
IO::Socket::IP
Feb  9 10:02:58 mail-gw spamd[9786]: rules: failed to run
T_SPF_HELO_PERMERROR test, skipping:
Feb  9 10:02:58 mail-gw spamd[9786]: (Can't locate object method
check_for_spf_helo_permerror via package Mail:
[...]:SpamAssassin::PerMsgStatus at (eval 1217) line 357.
Feb  9 10:02:58 mail-gw spamd[9786]: )
Feb  9 10:03:00 mail-gw spamd[9786]: rules: failed to run
T_SPF_HELO_TEMPERROR test, skipping:
Feb  9 10:03:00 mail-gw spamd[9786]: (Can't locate object method
check_for_spf_helo_temperror via package Mail:
[...]:SpamAssassin::PerMsgStatus at (eval 1217) line 690.
Feb  9 10:03:00 mail-gw spamd[9786]: )
Feb  9 10:03:00 mail-gw spamd[9786]: rules: failed to run
T_SPF_PERMERROR test, skipping:
Feb  9 10:03:00 mail-gw spamd[9786]: (Can't locate object method
check_for_spf_permerror via package Mail:
[...]:SpamAssassin::PerMsgStatus at (eval 1217) line 1231.
Feb  9 10:03:00 mail-gw spamd[9786]: )
Feb  9 10:03:00 mail-gw spamd[9786]: rules: failed to run
T_SPF_TEMPERROR test, skipping:
Feb  9 10:03:00 mail-gw spamd[9786]: (Can't locate object method
check_for_spf_temperror via package Mail:
[...]:SpamAssassin::PerMsgStatus at (eval 1217) line 2162.
Feb  9 10:03:00 mail-gw spamd[9786]: )
Feb  9 10:03:00 mail-gw spamd[9786]: spamd: server started on
IO::Socket::IP [127.0.0.1]:10028 (running version 3.4.0)
Feb  9 10:03:00 mail-gw spamd[9786]: spamd: server pid: 9786
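
As a stopgap until a fixed rule update arrives, the failing tests can be pulled out of the log and turned into 'score NAME 0' lines like the ones quoted at the top of this message. This is a hedged Python sketch that assumes the "rules: failed to run NAME test, skipping" wording seen in these logs; the helper name is made up.

```python
import re

# Wording taken from the spamd/amavis warnings quoted in this thread.
FAILED_RE = re.compile(r"rules: failed to run (\S+) test, skipping")


def zero_score_lines(log_text):
    """Build 'score NAME 0' lines for every test the log reports as failing."""
    # Syslog and mail clients wrap lines, so collapse whitespace before matching.
    flat = " ".join(log_text.split())
    names = sorted(set(FAILED_RE.findall(flat)))
    return ["score %s 0" % name for name in names]


sample = """\
Feb  9 10:02:58 mail-gw spamd[9786]: rules: failed to run
T_SPF_HELO_PERMERROR test, skipping:
Feb  9 10:03:00 mail-gw spamd[9786]: rules: failed to run
T_SPF_TEMPERROR test, skipping:
Feb  9 10:03:00 mail-gw spamd[9786]: rules: failed to run
T_SPF_TEMPERROR test, skipping:
"""

print("\n".join(zero_score_lines(sample)))
```

The generated lines would go into a local configuration file; they silence the rules rather than fix the missing eval functions.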



SA dns_server option

2014-12-02 Thread Matteo Dessalvi

Hi all.

I have a short question about the dns_server option of SA.
Is this option used when SA is called from Amavis and there
isn't any spamd process running?

To be clearer: do I also have to add the IP
address of the caching DNS server to /etc/resolv.conf,
or would the option be sufficient?

Thanks in advance.

Best regards,
Matteo


Re: SA dns_server option

2014-12-02 Thread Matteo Dessalvi

Yes, I have read the docs but I was not sure if SA,
when used through Amavis, would use that option.

Never mind, I raised the log verbosity of my DNS
caching service and it looks like SA is using it.
So, problem solved :-).

Thanks.

Best regards,
Matteo

On 02.12.2014 13:18, Axb wrote:


doh..

there it is

dns_server

dns_server ip-addr-port (default: entries provided by Net::DNS)
 Specifies an IP address of a DNS server, and optionally its port
 number. The *dns_server* directive may be specified multiple times,
 each entry adding to a list of available resolving name servers. The
 *ip-addr-port* argument can either be an IPv4 or IPv6 address,
 optionally enclosed in brackets, and optionally followed by a colon
 and a port number. In absence of a port number a standard port
 number 53 is assumed. When an IPv6 address is specified along with a
 port number, the address must be enclosed in brackets to avoid
 parsing ambiguity regarding a colon separator,

 Examples :
   dns_server 127.0.0.1
   dns_server 127.0.0.1:53
   dns_server [127.0.0.1]:53
   dns_server [::1]:53

 In absence of *dns_server* directives, the list of name servers is
 provided by Net::DNS module, which typically obtains the list from
 /etc/resolv.conf, but this may be platform dependent. Please consult
 the Net::DNS::Resolver documentation for details.


You don't need to specify one unless you need the specials in the config
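
The bracket and colon rules in that documentation excerpt can be illustrated with a small Python parser. It is a sketch of the syntax as described there, not SpamAssassin's own parsing code, and the function name `parse_dns_server` is hypothetical.

```python
import ipaddress


def parse_dns_server(arg, default_port=53):
    """Split an ip-addr-port argument into (address, port)."""
    arg = arg.strip()
    if arg.startswith("["):
        # Bracketed form: [addr] or [addr]:port
        host, closing, rest = arg[1:].partition("]")
        if not closing:
            raise ValueError("missing closing bracket: %r" % arg)
        if rest == "":
            port = default_port
        elif rest.startswith(":"):
            port = int(rest[1:])
        else:
            raise ValueError("unexpected text after bracket: %r" % arg)
    elif arg.count(":") == 1:
        # Exactly one colon can only be IPv4 followed by a port.
        host, _, port_text = arg.partition(":")
        port = int(port_text)
    else:
        # Bare IPv4, or bare IPv6 where every colon belongs to the address.
        host, port = arg, default_port
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % arg)
    address = ipaddress.ip_address(host)  # raises ValueError if malformed
    return str(address), port


for example in ("127.0.0.1", "127.0.0.1:53", "[127.0.0.1]:53", "[::1]:53"):
    print(example, "->", parse_dns_server(example))
```

The last branch shows why brackets are required for IPv6 with a port: without them there is no way to tell the final colon from the ones inside the address.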



Re: SA dns_server option

2014-12-02 Thread Matteo Dessalvi

Hi.

@Mark: thanks for the explanations about Amavis/SA.

@Reindl: thanks, I am indeed using unbound as a DNS
caching server. The 'minimal-responses' option is interesting;
I will check it out.

Regards,
   Matteo

On 02.12.2014 14:16, Mark Martinec wrote:

Matteo Dessalvi wrote:

I have a short question about the dns_server option of SA.
Is this option used when SA is called from Amavis and there
isn't any spamd process running?


Yes it is.


To be clearer: do I also have to add the IP
address of the caching DNS server to /etc/resolv.conf,
or would the option be sufficient?


The dns_server only affects SpamAssassin. If you want other
applications on that host to also use the same recursive
name server, its address needs to be in /etc/resolv.conf.
For example DKIM validation is done by amavisd calling
Net::DNS directly, which has no idea about SpamAssassin
settings. Similarly a milter or MTA.



   Mark


Re: Outlook, we do love to hate you....

2014-08-27 Thread Matteo Dessalvi

Since you are mentioning 'Out of office' messages...

Has anybody else gotten into trouble due to this 'fantastic'
auto-responder feature? Especially when these emails
are generated by an Exchange server.

Just curious, since last week one of our mail
servers (for outbound traffic) got listed in a
blacklist due to these 'out of office'|'vacation'
messages.

I am aware of RFC 3834 but not sure if our
Exchange server is following it.

-Matteo
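
One way to check whether a responder follows RFC 3834 is to look for the Auto-Submitted header on a captured reply. The Python sketch below assumes the raw message is available as text; the sample message and addresses are invented, and the helper name is hypothetical.

```python
from email import message_from_string


def is_auto_submitted(raw_message):
    """True when the Auto-Submitted field is present and is not 'no'."""
    msg = message_from_string(raw_message)
    value = msg.get("Auto-Submitted", "no")
    # The field may carry parameters after a semicolon; only the keyword matters.
    keyword = value.split(";")[0].strip().lower()
    return keyword != "no"


# Invented sample of a compliant vacation reply.
auto_reply = (
    "From: user@example.org\n"
    "To: sender@example.net\n"
    "Subject: Out of office\n"
    "Auto-Submitted: auto-replied\n"
    "\n"
    "I am away until Monday.\n"
)

print(is_auto_submitted(auto_reply))
```

A reply that lacks the header is not necessarily harmful, but it gives receiving systems less to go on when they decide whether the message is automatic.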

On 27.08.2014 17:56, Kris Deugau wrote:

*sigh*

Just got a FP report...

... about an Out of Office message...

... generated by Outlook 15...

... which, among other things, seems to go to great lengths to look like
spam, by way of the HTML formatting overkill that hits a local rule for
HTML comments over 32K long.

*headdesk*

Someone please tell me, how is it Necessary for a three-line OOO message
to include more than 32K of CSS gibberish?

-kgd
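
The local rule itself is not shown in this thread, so the following Python sketch is only a guess at the idea: measure the longest HTML comment in a message body and compare it against a 32K limit. The regular expression, the limit constant and the sample body are all assumptions made for illustration.

```python
import re

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
LIMIT = 32 * 1024  # assumed meaning of "32K"


def longest_html_comment(html):
    """Length in characters of the longest HTML comment body, or 0 if none."""
    return max((len(body) for body in COMMENT_RE.findall(html)), default=0)


# Invented body: a short notice wrapped in a very large comment block.
sample = (
    "<html><!--" + "x" * 40000 + "-->"
    "<p>I am out of the office.</p><!-- short --></html>"
)

size = longest_html_comment(sample)
print(size, size > LIMIT)
```

A check like this cannot tell legitimate style blocks from padding, which is how a three-line auto-reply ends up looking like spam.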



Re: sa-learn site-wide bayes on Redis

2014-08-21 Thread Matteo Dessalvi

I am pretty sure SA supports the Redis authentication mechanism.
For my tests I have used the following line:

bayes_sql_dsn  server=127.0.0.1:6379;password=MySecretPWD;database=2

Matteo
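
When sharing such a line on a mailing list, it helps to strip the password first. The Python sketch below assumes the DSN is a semicolon-separated list of key=value pairs, as in the line above; the function names are invented and this is not how SpamAssassin itself parses the setting.

```python
def parse_redis_dsn(dsn):
    """Split 'key=value;key=value' into a dict, keeping the original order."""
    settings = {}
    for item in dsn.split(";"):
        item = item.strip()
        if not item:
            continue
        key, equals, value = item.partition("=")
        if not equals:
            raise ValueError("expected key=value, got %r" % item)
        settings[key.strip()] = value.strip()
    return settings


def redact_dsn(dsn):
    """Return the DSN with any password value replaced by asterisks."""
    settings = parse_redis_dsn(dsn)
    if "password" in settings:
        settings["password"] = "***"
    return ";".join("%s=%s" % pair for pair in settings.items())


dsn = "server=127.0.0.1:6379;password=MySecretPWD;database=2"
print(parse_redis_dsn(dsn))
print(redact_dsn(dsn))
```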

On 21.08.2014 12:56, Marcin Mirosław wrote:


Hi!
I'm reading bayes_redis.cf and I can see:

#NOTE: We're not using authentication assuming the Redis server/port
should not be reachable form the outside
# You can add authentication once you've seen it work.


Does it mean that this example config doesn't include authentication
options, or does it mean that SA doesn't support auth for Redis?

Marcin






Re: Need help with setting up MySQL storage for SA

2014-08-21 Thread Matteo Dessalvi

On 21.08.2014 09:20, Michael wrote:


So that means, that actually I do not have to do any action on newly
created users. Once they retrain their first message, the Bayes entries
are getting created. Before that, Bayes is not used for that user. Is
that correct?



Yes, I would say that is correct. Before it starts using the Bayesian
filter, SA needs to have learned at least 200 spam and 200 ham messages;
otherwise the BAYES_* rules will not trigger.


What about the autolearn functionality? Where are these infos getting
stored? Is it also stored in the Bayes tables? What happens, if they are
not yet initialised?



About the autolearning feature you can read about that here:

http://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html

And about the info stored into the DB by SA you can take a look here:

http://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_current_release_3.4.x/sql/README.bayes

Regards,
Matteo



Quoting Matteo Dessalvi mte...@yahoo.it:


Hi.

I did test a similar configuration a while ago and had the same problem.
If you take a look at this thread on the mailing list:

http://spamassassin.1065346.n5.nabble.com/Bayes-vars-records-on-MySQL-not-created-automatically-td104615.html


you'll see it was a problem of running 'sa-learn --sync' as the user
who is running the test.

Best regards,




Re: sa-learn site-wide bayes on Redis

2014-08-21 Thread Matteo Dessalvi

Which version of Redis are you using? I had some
problems with the 2.4 version packaged by Debian and
I solved a similar problem by using a more recent
version, like 2.7 or 2.8.

Matteo

On 21.08.2014 14:45, Marcin Mirosław wrote:

On 21.08.2014 at 13:45, Matteo Dessalvi wrote:

I am pretty sure SA support the Redis authentication mechanism.
For my tests I have used the following line:

bayes_sql_dsn  server=127.0.0.1:6379;password=MySecretPWD;database=2


Thanks Matteo,
firstly I should try, then write to the ML :) So now I did my own check. It looks
like SA doesn't authenticate when it connects to Redis. It didn't work for
me with your example, nor when I used
bayes_sql_password   password

When Redis requires a password, SA throws 'bayes: Redis failed: Redis
error: ERR operation not permitted'; tcpdump also confirms that SA
doesn't do AUTH.
It's strange because in Redis.pm I can see that authentication is
supported. Now I'm wondering where I could have made a mistake in the configuration...

Thanks,
Marcin



sa-learn site-wide bayes on Redis

2014-08-20 Thread Matteo Dessalvi

Hi all.


I am managing a bunch of Linux MTAs which are placed in
front of some Exchange servers. In such a configuration
the Bayes filter is deployed site-wide.

For a new deployment of these servers I am planning
to use Redis as a centralized backend (previously
the bayes db were just files saved on the disk).

My question is: do I have to use a specific option
to tell sa-learn that the bayes db is now hosted on
Redis? Or will sa-learn use the info from the
bayes_sql_dsn directive in my local.cf?

Looking into the wiki:
http://wiki.apache.org/spamassassin/SiteWideBayesSetup

or into the sa-learn docs:
http://spamassassin.apache.org/full/3.4.x/doc/sa-learn.html

did not give me any clues.


Thanks in advance!


Best regards,
  Matteo


Re: sa-learn site-wide bayes on Redis

2014-08-20 Thread Matteo Dessalvi

No, unfortunately it does not help me.
I already have a proper config file for SA
to access Redis as backend and most of
the configurations are done automatically
through a Chef cookbook (Redis included).

In the docs you pointed me there's nothing
about the interaction between sa-learn and
Redis.

Best regards,
   Matteo

On 20.08.2014 14:42, Axb wrote:


see

http://svn.apache.org/repos/asf/spamassassin/trunk/contrib/HOWTO.Bayes-Redis/


hope that helps.
This is not an official doc, so if you see anything that needs to be
added/changed, pls let me know.



Re: sa-learn site-wide bayes on Redis

2014-08-20 Thread Matteo Dessalvi

Ok, perfect! Thanks a lot! This is what I wanted to know
and was not so sure about.

I may be wrong, but it looks to me that the fact that
tools like sa-learn can transparently access the
backends configured for SA is not exactly clear
from the docs.

It would be great if the wiki maintainers could add
a short note somewhere in the pages regarding the
SiteWide deployment or related topics.

Best regards,
 Matteo

On 20.08.2014 15:08, Axb wrote:

bayes_store_module  Mail::SpamAssassin::BayesStore::Redis

tells SA to use the Redis backend. To sa-learn this becomes transparent,
as with any other backend (DBD,SDBM,SQL)

bayes_redis.cf shows what parameters are mandatory/optional




Re: Need help with setting up MySQL storage for SA

2014-08-20 Thread Matteo Dessalvi

Hi.

I did test a similar configuration a while ago and had the same problem.
If you take a look at this thread on the mailing list:

http://spamassassin.1065346.n5.nabble.com/Bayes-vars-records-on-MySQL-not-created-automatically-td104615.html

you'll see it was a problem of running 'sa-learn --sync' as the user
who is running the test.

Best regards,
   Matteo

On 20.08.2014 16:07, Michael wrote:

Hi,
I'm using Spamassassin in a virtual user environment. To store
preferences like settings, Bayes and AWL for each user I'm trying to set
up a MySQL storage.

I created the MySQL tables according the instructions from the files
awl_mysql.sql, bayes_mysql.sql, README.awl, README.bayes, README and
userpref_mysql that came with my Spamassassin 3.4 installation on Ubuntu
14.04.

The connection to the database seems to be working.
To me the debug output looks as if Spamassassin expects there to be
some data in the tables already. Where shall I get this data from? Do I
have to manually create entries for each user? What am I missing?



When calling spamc -u t...@michi.su < testmail.txt I'm getting the
following debug output (shortened):

Aug 20 08:14:46.563 [16682] dbg: config: Conf::SQL: executing SQL:
select preference, value from userpref where username = 't...@michi.su'
or username = '@GLOBAL' order by username asc
Aug 20 08:14:46.563 [16682] dbg: config: retrieving prefs for
t...@michi.su from SQL server
Aug 20 08:14:46.564 [16682] dbg: info: user has changed
Aug 20 08:14:46.564 [16682] dbg: bayes: learner_new
self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x30fdce0),
bayes_store_module=Mail::SpamAssassin::BayesStore::MySQL
Aug 20 08:14:46.564 [16682] dbg: bayes: using username: t...@michi.su
Aug 20 08:14:46.564 [16682] dbg: bayes: learner_new: got
store=Mail::SpamAssassin::BayesStore::MySQL=HASH(0x3d1a768)
Aug 20 08:14:46.565 [16682] dbg: bayes: database connection established
Aug 20 08:14:46.566 [16682] dbg: bayes: found bayes db version 3
Aug 20 08:14:46.566 [16682] dbg: bayes: unable to initialize database
for t...@michi.su user, aborting!



The MySQL relevant options that I added are:
user_scores_dsn             DBI:mysql:spamassassin:localhost
user_scores_sql_username    spamassassin
user_scores_sql_password    pass

bayes_store_module          Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn               DBI:mysql:spamassassin:localhost
bayes_sql_username          spamassassin
bayes_sql_password          pass

auto_whitelist_factory      Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn                DBI:mysql:spamassassin:localhost
user_awl_sql_username       spamassassin
user_awl_sql_password       pass



Re: Running SA without the bayesian classifier

2014-08-12 Thread Matteo Dessalvi

Hi all.

Thanks for all the answers. I am afraid I was being naive.
I was explicitly thinking of a scenario like this: filter as
much as possible of the 'unsolicited email' sent by some (possibly)
'infected' account.

I thought that turning off the Bayesian classifier (and the
RBL checks) would still let me catch the occasional
spam email. Of course there's already a ClamAV filtering
system for all the outgoing email.

In the past week one of our outgoing SMTP servers was blacklisted
for 12 hours (just to be clear: it was not SpamHaus).
Unfortunately, looking at the logs did not give me any clues: there
were no spikes of bulk sending email to thousands of users or
anything particularly suspicious. And the black list manager did
not provide any additional information about the incident.

On 12.08.2014 08:43 Matus UHLAR wrote:

That means many of the rules that push a message over the limit will not hit.

You still should not push required_score down; I remember outgoing mail
being blocked by inherited servers for hitting 7.0...


I was thinking about using a 5.0 threshold but given your example
I guess I should push it up to 8.0.

On 11.08.2014 23:15, Karsten Bräckelmann wrote:

 Define spam.

 Running SA on your outgoing SMTP will not catch botnet generated junk,
 neither spam nor malware. This would require sniffing raw traffic. Or
 completely firewalling off outgoing port 25 connections.
 You explicitly mention your users (corporate or home?) sending mail.
 Are you talking about them possibly running bulk sending services, or
 hand crafted unsolicited mail to individual recipients?

If possible I would like to catch both, but as already said this
is going to be quite hard. I will add Pyzor/DCC to the mix and see
if it can help.

On 11.08.2014 23:15, Karsten Bräckelmann wrote:

Unless there's a 419 gang operating from your internal network, there
might not be much left for SA with stock rules to classify spam...


No 'spam gang' so far but I will keep my eyes open :-).

Best regards,
  Matteo


Running SA without the bayesian classifier

2014-08-11 Thread Matteo Dessalvi

Hi all.

This may be a very stupid question but I would like to ask you all
anyway.

I am planning to install SA on our SMTP MTAs, which deal only with
outgoing traffic generated in the internal network.
I am making the assumption that our clients are mostly sending 'clean'
email (I know, I am trusting my users *a lot*, but nevertheless).

So the question is: how effective will SA be without the Bayesian
classifier? Are all the remaining rulesets (apart from BAYES_*)
sufficient to shave off spam email?

I am considering this scenario just because it will make the deployment
a little bit easier, since I would not need a centralized Redis or MySQL
instance to keep the Bayes data in one place.

Thanks in advance.

Best regards,
   Matteo


spamassassin (cmd line) connection to Redis

2014-05-22 Thread Matteo Dessalvi

Hi all.

As stated in the subject I am just trying to test my
SpamAssassin 3.4.0 installation (I am using the Debian
Jessie package), with the usual method described here:

http://wiki.apache.org/spamassassin/TestingInstallation

In the output of the command: spamassassin -D < gTube_spam.txt
I have got the following error:

(...)
May 22 12:31:39.240 [8390] warn: plugin: eval failed: bayes: Redis
failed: Redis error: ERR operation not permitted at
/usr/share/perl5/Mail/SpamAssassin/BayesStore/Redis.pm line 233, GEN2 line 1. at
/usr/share/perl5/Mail/SpamAssassin/BayesStore/Redis.pm line 265.
(...)

In the end the test worked perfectly, because SA
correctly classified the GTUBE spam sample, but I am worried
about that Redis error.

The SA local.cf contains the following string:

bayes_sql_dsn   server=10.1.1.19:6379;password=mypass;database=2
(...)

which, I thought, should be enough for SA. Note that when
I use redis-cli from the command line, specifying
the same parameters, I do not have any connection/authorization
problems.

Looking for the line 233 stated in the error message, I found
that the error is raised inside the sub on_connect but it looks
like it's not a Redis authentication error.

Any clues about what I am doing wrong?
Thanks in advance!

Best regards,
Matteo


Re: spamassassin (cmd line) connection to Redis

2014-05-22 Thread Matteo Dessalvi

On 22.05.2014 13:10, Axb wrote:

have you included this in your local.cf ?

bayes_store_module  Mail::SpamAssassin::BayesStore::Redis


These are the relevant configuration lines for the
Redis SA module:

bayes_store_module  Mail::SpamAssassin::BayesStore::Redis
bayes_sql_dsn   server=10.1.1.19:6379;password=mypass;database=2
bayes_token_ttl 21d
bayes_seen_ttl   8d
bayes_auto_expire 1


On 22.05.2014 13:12, Axb wrote:


what happens if you don't use authentication?



It looks like the problem lies in the authentication.
When I tried with an empty 'password=' (after
disabling requirepass in redis.conf) I
got the following messages (I have included empty
lines for the sake of readability):

(...)
dbg: bayes: learner_new 
self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x3cc14c0), 
bayes_store_module=Mail::SpamAssassin::BayesStore::Redis


dbg: bayes: learner_new: got 
store=Mail::SpamAssassin::BayesStore::Redis=HASH(0x42161c0)


dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x3cc14c0) 
implements 'learner_is_scan_available', priority 0


dbg: bayes: _open_db(not yet connected)

dbg: bayes: Redis on-connect, db_id 2

dbg: bayes: CLIENT SETNAME command failed, don't worry, possibly an old 
redis version: ERR Syntax error, try CLIENT (LIST | KILL ip:port)


dbg: bayes: redis server version 2.4.14, memory used 6.8 MiB, Lua is not 
available


dbg: bayes: initialized empty database, version 3

dbg: bayes: nspam_nham_get nspam=0, nham=0

dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
(...)

Of course this is just the initial test, so I do not have enough
bayes data. The 'CLIENT SETNAME' error is probably due to my old
Redis version but other than that it looks fine.
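
As a rough cross-check of the debug lines above, the Python sketch below compares a reported server version against the Redis releases that introduced the two features mentioned in the log: Lua scripting (EVAL, available since 2.6.0) and CLIENT SETNAME (available since 2.6.9). The function and dictionary names are made up for illustration, and this is not code from the SA Redis module.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.14' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Redis releases in which each feature first became available.
FEATURE_MIN_VERSION = {
    "lua_scripting": (2, 6, 0),
    "client_setname": (2, 6, 9),
}


def supported_features(server_version):
    """Return a dict mapping feature name to True/False for this version."""
    version = parse_version(server_version)
    return {
        name: version >= minimum
        for name, minimum in FEATURE_MIN_VERSION.items()
    }


for candidate in ("2.4.14", "2.8.9"):
    print(candidate, supported_features(candidate))
```

Tuples are compared element by element, so 2.4.14 sorts before 2.6.0 even though 14 is larger than 0.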

I will try again with the authentication enabled and see if
I run into the same problem as before.

Best regards,
Matteo



Re: spamassassin (cmd line) connection to Redis

2014-05-22 Thread Matteo Dessalvi

Yes, you are definitely right: with the latest stable
Redis version (2.8.9 indeed) everything works smoothly
with the authentication.

Thanks for pointing me in the right direction!

Best regards,
Matteo

On 22.05.2014 14:10, Axb wrote:


You're using an ancient Redis version with no LUA support.

Redis 2.8.9 is the latest stable version.

I'd suggest you update Redis before you go on chasing windmills.




Re: MariaDB replacing MySQL for Bayes

2013-06-07 Thread Matteo Dessalvi
As far as I understand, only SA 3.4 (the development branch) supports
Redis as a Bayes store.

Are you using this version in production? Can you tell us, approximately, the
volume of email traffic?
Are you running a configuration with multiple SA instances connected to a
single Redis instance?

Thanks.

Matteo

- Original message -
From: Axb axb.li...@gmail.com
To: users@spamassassin.apache.org
Cc: 
Sent: Friday, 7 June 2013 15:47
Subject: Re: MariaDB replacing MySQL for Bayes

On 06/06/2013 09:03 PM, Marc Perkel wrote:
> So - after a couple of weeks it just works. I recommend getting rid of
> MySQL in favor of MariaDB. Besides bayes I'm using it on my web server
> and it just works and it's a lot more solid.
>
> My 2 centz

If you like MariaDB  Bayes speed, you should try Bayes with Redis.
that is FAST!


Re: malformed To: header blocks further parsing

2013-06-06 Thread Matteo Dessalvi
Hi Fabio.

Have you also tried SpamAssassin's 'Language options', such as those
described here:
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#language_options

Matteo


- Original message -
From: Fabio Sangiovanni sangiova...@nweb.it
To: users@spamassassin.apache.org
Cc: 
Sent: Wednesday, 5 June 2013 12:26
Subject: malformed To: header blocks further parsing

Hi everybody,

I'm using spamassassin 3.3.2, along with postfix 2.6.6 and amavisd-new 
2.8.0.
The system spamassassin is running on is used primarily for URIDNSBL checks.
Recently I had some messages classified as spam because of these rules:

X-Spam-Report:
  *  1.2 TO_MALFORMED To: has a malformed address
  *  0.5 NULL_IN_BODY FULL: Message has NUL (ASCII 0) byte in message
  *  0.1 MISSING_MID Missing Message-Id: header
  *  1.8 MISSING_SUBJECT Missing Subject: header
  *  1.4 MISSING_DATE Missing Date: header

The problem is that the body is not null at all, and the headers aren't 
missing: what happens here is that the To: header contains Chinese 
characters that are not in encoded-word format and that interfere with 
spamassassin's parsing.
This is a problem because the mail body can't be checked against other 
rules.

Is there a way to fix this, other than changing the MUA's behaviour to 
encode the message properly? How can I get spamassassin to parse the 
rest of the message in such conditions?
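
For reference, "encoding the message properly" on the sending side means writing non-ASCII display names as RFC 2047 encoded-words. The following is a minimal Python sketch using only the standard library; the name and address are invented examples, and it illustrates the sender-side encoding, not a SpamAssassin setting.

```python
from email.header import decode_header, make_header
from email.utils import formataddr, parseaddr

# Invented example values, not taken from the affected messages.
display_name = "张伟"
address = "zhang@example.com"

# formataddr() emits the display name as an RFC 2047 encoded-word when it
# contains non-ASCII characters, so the header value is pure ASCII.
to_value = formataddr((display_name, address))
print("To:", to_value)

# A receiver can split the header and decode the name again.
encoded_name, parsed_address = parseaddr(to_value)
decoded_name = str(make_header(decode_header(encoded_name)))
print("Decoded name:", decoded_name)
```

A header built this way stays within 7-bit ASCII, which is what header parsers generally expect.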

Configuration follows (please let me know if you need further information):

loaded plugins:

loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
loadplugin Mail::SpamAssassin::Plugin::Check
loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody


local.cf:

trusted_networks 192.168/16
internal_networks 192.168/16
skip_rbl_checks 1
use_learner 0
use_bayes 0
use_bayes_rules 0
bayes_auto_learn 0
score        URIBL_SBL    5
score        URIBL_DBL_SPAM    5
score        URIBL_DBL_REDIR    5
score        URIBL_DBL_ERROR    5
score        URIBL_SC_SURBL    5
score        URIBL_WS_SURBL    5
score        URIBL_PH_SURBL    5
score        URIBL_MW_SURBL    5
score        URIBL_AB_SURBL    5
score        URIBL_JP_SURBL    5
score        URIBL_BLACK    0
score        URIBL_RED    0
score        URIBL_GREY    0
score        URIBL_BLOCKED    0



Re: Bayes_vars records on MySQL not created automatically

2013-05-10 Thread Matteo Dessalvi
Thanks for your answer Michael.

Yes, you are right: when running sa-learn --sync as one of the users, SA creates 
the proper record in the bayes_vars table.
So I guess the only problem is that this user has not yet received enough 
ham/spam email.

Matteo



From: Michael Parker par...@herk.net
To: Matteo Dessalvi mte...@yahoo.it 
Cc: users@spamassassin.apache.org users@spamassassin.apache.org 
Sent: Wednesday, 8 May 2013 18:43
Subject: Re: Bayes_vars records on MySQL not created automatically



On May 8, 2013, at 8:06 AM, Matteo Dessalvi mte...@yahoo.it wrote:

 
> I always thought that SA would be able to operate autonomously and that it
> will create the proper records in all the tables of the DB. Am I missing
> something? Is this the designed behavior?
 

It's been a while since I wrote and looked at the code, but I'm pretty sure that 
the bayes_vars entry won't be created until you learn something as that user.

Try doing an sa-learn or an auto-learn for that user and see what happens.

If memory serves, the behavior was deliberate so that you wouldn't get hundreds 
of entries in bayes_vars when messages are checked for users who may not be real.

Michael


Bayes_vars records on MySQL not created automatically

2013-05-08 Thread Matteo Dessalvi
Hi all.

I do hope that this question is not outside the scope of the list and
I apologize in advance if that is the case.

Currently I am running this configuration on an email gateway server:

- Debian Linux Wheezy
- Postfix 2.9.6-2
- Spamassassin 3.3.2
- Amavisd-new 2.7.1
- Perl 5.14.2
- MySQL 5.5.30 (which holds awl/bayes_vars/bayes_token/etc.)

The filtering and the interaction between SA and Amavis work without 
problems: every mail that passes through the filter is scanned by SA and 
then reported back to the Amavis daemon, which reinserts it into the 
Postfix queue for final delivery. 

But there's one issue: every time SA tries to insert a record for one of 
the recipient users into the bayes_vars table, I get this error:

(...)
SA dbg: info: user has changed
SA dbg: bayes: learner_new 
self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x5a1b270), 
bayes_store_module=Mail::SpamAssassin::BayesStore::MySQL
SA dbg: bayes: using username: matt@test.domain
SA dbg: bayes: learner_new: got 
store=Mail::SpamAssassin::BayesStore::MySQL=HASH(0x5fef0e0)
SA dbg: bayes: database connection established
SA dbg: bayes: found bayes db version 3
SA dbg: bayes: unable to initialize database for matt@test.domain user, 
aborting!
(...)

Currently there are no records for this user, so that should be fine. But after
many tests I still get the error, while at the same time SA is able to update
the AWL table on MySQL.

The only way to make SA write the record to the bayes_vars table is to insert a
row into it with the username matt@test.domain. In that case I get what 
I am expecting:

(...)
SA dbg: info: user has changed
SA dbg: bayes: learner_new 
self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x5d3e270), 
bayes_store_module=Mail::SpamAssassin::BayesStore::MySQL
SA dbg: bayes: using username: matt@test.domain
SA dbg: bayes: learner_new: got 
store=Mail::SpamAssassin::BayesStore::MySQL=HASH(0x63120c8)
SA dbg: bayes: database connection established
SA dbg: bayes: found bayes db version 3
SA dbg: bayes: using userid: 10
SA dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
(...)

There are not many spam emails for this user yet, so that is the correct answer 
from SA.
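
The debug line above reflects a simple threshold: Bayes is only used for scanning once enough spam and ham have been learned. The Python sketch below mirrors that check, using 200 for both minimums, which is the documented default of the bayes_min_spam_num and bayes_min_ham_num options. The function name is hypothetical and this is not SA's actual code.

```python
def bayes_scan_available(nspam, nham, min_spam=200, min_ham=200):
    """Return (available, reason) for the given learned message counts."""
    if nspam < min_spam:
        return False, f"only {nspam} spam(s) in bayes DB < {min_spam}"
    if nham < min_ham:
        return False, f"only {nham} ham(s) in bayes DB < {min_ham}"
    return True, "available for scanning"


# A freshly created user has learned nothing yet.
print(bayes_scan_available(0, 0))
# Enough spam but too little ham is still not sufficient.
print(bayes_scan_available(250, 10))
# Both thresholds reached.
print(bayes_scan_available(250, 300))
```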

Searching for the reason why this is not working, I also had a look at the
 (...)/perl5/Mail/SpamAssassin/BayesStore/MySQL.pm module. 
Inside the sub _initialize_db, at line 725, I found:

(...)
# Do not create an entry for this user unless we were specifically asked to
return 0 unless ($create_entry_p);
(...)

If I comment out the 'return' line, SA will be able to create the record in the 
table even if the user is not already there.
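
The effect of that guard can be illustrated with a small self-contained Python sketch. It uses an in-memory SQLite table as a stand-in for bayes_vars (reduced to two columns), and the function name and its create_entry flag are hypothetical. It shows the general look-up-then-optionally-create pattern, not the actual MySQL.pm logic.

```python
import sqlite3


def initialize_user(conn, username, create_entry):
    """Return the user's id, creating the row only when create_entry is true."""
    row = conn.execute(
        "SELECT id FROM bayes_vars WHERE username = ?", (username,)
    ).fetchone()
    if row is not None:
        return row[0]
    # Unknown user: do not create an entry unless specifically asked to.
    if not create_entry:
        return None
    cursor = conn.execute(
        "INSERT INTO bayes_vars (username) VALUES (?)", (username,)
    )
    conn.commit()
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bayes_vars ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT UNIQUE NOT NULL)"
)

# Scanning a message for an unknown user: nothing is created.
scan_result = initialize_user(conn, "matt@test.domain", create_entry=False)
# Learning as that user: the row is created.
learn_result = initialize_user(conn, "matt@test.domain", create_entry=True)
# Later scans find the existing row.
rescan_result = initialize_user(conn, "matt@test.domain", create_entry=False)
print(scan_result, learn_result, rescan_result)
```

With this pattern, scanning mail addressed to non-existent recipients never adds rows; only an explicit learn operation does.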

I always thought that SA would be able to operate autonomously and that it will 
create the proper records in all the tables of the DB. Am I missing something? 
Is this the designed behaviour?

Thanks in advance.

Matt