Hi list!
First, this is my first submission to a project, so please excuse me if
I do something that does not follow the rules.
I'm in charge of SpamAssassin (SA) here, and after an upgrade from 3.0 to
3.1, users started to complain that SA's "improved effectiveness" was
producing more false positives. After some investigation, the problem
seems to come from the Bayesian filter. I use a global Bayesian database
(using bayes_sql_override_username), but with customers coming from all
over the world, finer-grained filtering would probably work better. That
seems to be impossible at the moment. I use the SQL Bayesian database
setup, and it would have been great if it offered the same kind of
feature as user_scores_sql_custom_query. What I want is for the Bayesian
filter to look for a database for the current user first, then for the
routing domain the user belongs to, and finally to fall back to the
global database. For instance:
[EMAIL PROTECTED] -> [EMAIL PROTECTED] -> *
This way, users can have their own personal Bayes database, and if they
do not want to create one or do not have one yet, they can still benefit
from others' databases. Grouping users per routing domain is like
grouping users by shared interest: if one of them declares a mail as
spam, there is little chance that the others consider it ham.
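To make the lookup order concrete, here is a small standalone Perl sketch. It is not part of the patch: pick_bayes_username, the hash of existing usernames and the sample addresses are all made up for illustration, and the '*@domain' and '*' naming convention is taken from the example query in the patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): choose the most specific
# Bayes database owner for a user, given the usernames that already exist.
sub pick_bayes_username {
    my ($username, $existing) = @_;

    # Everything after the first '@' is treated as the routing domain.
    my (undef, $domain) = split(/@/, $username, 2);

    # Candidates from most specific to least specific.
    my @candidates = ($username);
    push @candidates, '*@' . $domain if defined $domain && length $domain;
    push @candidates, '*';

    for my $candidate (@candidates) {
        return $candidate if $existing->{$candidate};
    }
    return;    # nothing matched; the caller applies its own default
}

# Sample data standing in for the usernames stored in bayes_vars.
my %existing = map { $_ => 1 } ('*', '*@example.org', 'alice@example.org');

print pick_bayes_username('alice@example.org', \%existing), "\n";  # personal database
print pick_bayes_username('bob@example.org',   \%existing), "\n";  # domain database
print pick_bayes_username('carol@example.net', \%existing), "\n";  # global database
```

The patch does the same selection in SQL instead: the query returns every matching row sorted by username, and the code keeps the last one.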
Since this feature was not implemented yet, I have started writing it
myself. I warn you that I had never written a single line of Perl before,
and "my" code is made of quotes from the docs and cut & paste. I've added
the configuration option bayes_sql_custom_query to the SA configuration
file. It works the same way bayes_sql_override_username does.
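As an illustration, the option could be set in the site-wide configuration (for example local.cf) roughly like this. The DSN, database user and password are placeholders. The query is adapted from the example in the POD of the patch below, and CONCAT assumes a MySQL-style database. In the patch, bayes_sql_override_username takes precedence, so it has to be left unset.

```
# Placeholders: adjust the DSN and credentials to your own setup.
bayes_store_module      Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn           DBI:mysql:spamassassin:localhost
bayes_sql_username      sa_user
bayes_sql_password      sa_password

# New option from this patch; the query must stay on one line.
bayes_sql_custom_query  SELECT username FROM bayes_vars WHERE username = '*' OR username = CONCAT('*@',_DOMAIN_) OR username = _USERNAME_ ORDER BY username ASC
```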
Please let me know if this feature could be useful to others and/or if
it requires some rewriting.
Please find the diffs attached. They apply against SA 3.1.4, because
it is the SA version shipped with Debian testing at this time.
Best regards =)
cedric.
--- SpamAssassin/Conf.pm 2006-08-12 18:08:44.000000000 +0200
+++ Conf.pm 2006-11-13 10:37:16.000000000 +0100
@@ -2330,6 +2330,52 @@
type => $CONF_TYPE_STRING
});
+#### Start of modification (last modified on 20061109).
+=item bayes_sql_custom_query query
+
+This option gives you the ability to create a custom SQL query to
+retrieve username. In order to work correctly your query should
+return only one value, the desired username. In addition, there
+are several "variables" that you can use as part of your query,
+these variables will be substituted for the current values right
+before the query is run. The current allowed variables are:
+
+=over 2
+
+=item _USERNAME_
+
+The current user's username.
+
+=item _DOMAIN_
+
+The portion after the @ as derived from the current user's username, this
+value may be null.
+
+=back
+
+The query must be one continuous line in order to parse correctly.
+
+Here is an example query, please note that it is broken up for easy
+reading, in your config it should be one continuous line.
+
+=over 1
+
+=item Current default query:
+
+C<SELECT username FROM bayes_vars WHERE username = '*' OR Username = CONCAT('*@',_DOMAIN_) OR Username = _USERNAME_ ORDER BY username ASC>
+
+=back
+
+=cut
+
+ push (@cmds, {
+ setting => 'bayes_sql_custom_query',
+ is_admin => 1,
+ type => $CONF_TYPE_STRING
+ });
+
+#### End of modification.
+
=item bayes_sql_username_authorized ( 0 | 1 ) (default: 0)
Whether to call the services_authorized_for_username plugin hook in BayesSQL.
--- SpamAssassin/BayesStore/SQL.pm 2005-08-11 09:00:37.000000000 +0200
+++ SQL.pm.bayesstore 2006-11-13 11:15:43.000000000 +0100
@@ -85,15 +85,70 @@
if ($self->{bayes}->{conf}->{bayes_sql_override_username}) {
$self->{_username} = $self->{bayes}->{conf}->{bayes_sql_override_username};
}
+#### Start of modification (last modified on 20061113).
else {
- $self->{_username} = $self->{bayes}->{main}->{username};
+ if ($self->{bayes}->{conf}->{bayes_sql_custom_query}) {
- # Need to make sure that a username is set, so just in case there is
- # no username set in main, set one here.
- unless ($self->{_username}) {
- $self->{_username} = "GLOBALBAYES";
- }
+ # Connect to database.
+ return 0 unless ($self->_connect_db());
+
+ # Retrieve current username and play with it.
+ my $username = $self->{bayes}->{main}->{username};
+ my ($mailbox, $domain) = split('@', $username);
+
+ my $quoted_username = $self->{_dbh}->quote($username);
+ my $quoted_domain = $self->{_dbh}->quote($domain);
+
+ my $custom_query = $self->{bayes}->{conf}->{bayes_sql_custom_query};
+ $custom_query =~ s/_USERNAME_/$quoted_username/g;
+ $custom_query =~ s/_DOMAIN_/$quoted_domain/g;
+
+ dbg("bayes: new: quoted_username = ".$quoted_username);
+ dbg("bayes: new: quoted_domain = ".$quoted_domain);
+ dbg("bayes: new: custom_query = ".$custom_query);
+
+ # Prepare query.
+ my $sth = $self->{_dbh}->prepare($custom_query);
+ unless (defined($sth)) {
+ dbg("bayes: new: SQL error: ".$self->{_dbh}->errstr());
+ return 0;
+ }
+
+ # Execute query.
+ my $rc = $sth->execute();
+ unless ($rc) {
+ dbg("bayes: new: SQL error: ".$self->{_dbh}->errstr());
+ return 0;
+ }
+
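+ # Retrieve _username: last row, or undef if the query matched nothing.
+ my $ary_ref = $sth->fetchall_arrayref();
+ $self->{_username} = @$ary_ref ? $ary_ref->[-1]->[-1] : undef;
+
+ dbg("bayes: new: _username = ".(defined $self->{_username} ? $self->{_username} : "(none)"));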
+
+ # Tell database server to free buffer allocated to query.
+ $sth->finish();
+
+ # Close database connection.
+ $self->{_dbh}->disconnect();
+
+ # Set _dbh to initial state.
+ $self->{_dbh} = undef;
+
+ }
+#### End of modification.
+ else {
+ $self->{_username} = $self->{bayes}->{main}->{username};
+ }
+ }
+
+ # Need to make sure that a username is set, so just in case there is
+ # no username set in main, set one here.
+ unless ($self->{_username}) {
+ $self->{_username} = "GLOBALBAYES";
}
+
dbg("bayes: using username: ".$self->{_username});
return $self;