https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5891
Summary: Let AWL keep separate records for DKIM-signed and
unsigned mail
Product: Spamassassin
Version: unspecified
Platform: Other
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: Libraries
AssignedTo: [email protected]
ReportedBy: [EMAIL PROTECTED]
The change to be submitted shortly lets the AWL plugin keep separate records
in an SQL database for author addresses with a valid DKIM or DomainKeys
signature, and separate records for unsigned addresses. The idea is
that popular freemail domains like gmail.com and yahoo.com are frequently
abused in spam as forged author addresses, because it is much more
efficient for a spammer to send their junk directly than to go to the
trouble of submitting it through a freemail provider. And if spam or fraud
does come from a provider, at least yahoo.com is quite responsive to
legitimate/proven complaints sent to [EMAIL PROTECTED]
The resulting effect is that a message with a valid DKIM/DK signature
from such providers is far less likely to be spam than a message with
the same author address which carries no valid signature, so it makes
sense to keep separate AWL averages for each type, yielding significantly
more useful AWL scores.
For example:
  yahoo.com   not signed    avg. score = 14.8
  yahoo.com   valid sign.   avg. score = -0.7
  gmail.com   not signed    avg. score =  2.9
  gmail.com   valid sign.   avg. score = -3.3
It is implemented by adding one additional field 'signedby' to an SQL
table awl, which receives a signing identity if a message carries a
valid DKIM or DomainKeys signature, otherwise the field is set to an
empty string. The field awl.ip is ignored when selecting records with
a nonempty awl.signedby.
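
The selection rule described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the submitted
Perl code: it uses Python with an in-memory SQLite table in place of
MySQL/PostgreSQL, the column types are guesses, and find_awl_record() is
a hypothetical helper name.

```python
import sqlite3

def find_awl_record(conn, username, email, ip, signedby):
    """Return (count, totscore) for the matching AWL row, or None.

    Hypothetical helper: illustrates the rule that awl.ip is ignored
    whenever a nonempty signing identity is available.
    """
    if signedby:
        # Signed mail: match on the signing identity, ignore awl.ip.
        sql = ("SELECT count, totscore FROM awl "
               "WHERE username = ? AND email = ? AND signedby = ?")
        args = (username, email, signedby)
    else:
        # Unsigned mail: signedby is the empty string, ip still applies.
        sql = ("SELECT count, totscore FROM awl "
               "WHERE username = ? AND email = ? "
               "AND signedby = '' AND ip = ?")
        args = (username, email, ip)
    return conn.execute(sql, args).fetchone()

# Assumed table layout; only the 'signedby' field is taken from the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
             "count INTEGER, totscore REAL, "
             "signedby TEXT NOT NULL DEFAULT '')")
conn.executemany("INSERT INTO awl VALUES (?, ?, ?, ?, ?, ?)", [
    ("u", "bob@yahoo.com", "10.1", 4, 59.2, ""),           # unsigned
    ("u", "bob@yahoo.com", "none", 10, -7.0, "yahoo.com"),  # signed
])
```

With this layout the same author address yields two independent running
totals, one per signing state, which is what makes the separate averages
possible.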
For compatibility with existing SQL tables without a signedby field
the new feature needs to be explicitly enabled in local.cf:
auto_whitelist_distinguish_signed 1
Upgrading instructions are provided in README.awl. The awl_mysql.sql and
awl_pg.sql schemas already carry the new field; it does no harm
(and no good) if the field exists but the feature is not enabled.
I have tested it under MySQL 5.1 and PostgreSQL 8.2, along with the
upgrade (ALTER TABLE) procedure. The code has been in use at our site for
the last couple of months, but was lacking documentation touches.
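
As a rough idea of what such an upgrade step looks like, here is a sketch
in Python against SQLite. It is not the procedure from README.awl: the
column type, the default value and the add_signedby_column() helper are
assumptions for illustration, and MySQL or PostgreSQL would need their
own ALTER TABLE syntax and column definition.

```python
import sqlite3

def add_signedby_column(conn):
    """Add a 'signedby' column to an existing awl table if it is missing.

    Hypothetical helper. Returns True if the column was added, False if
    it was already present.
    """
    # PRAGMA table_info is SQLite-specific; row[1] is the column name.
    cols = [row[1] for row in conn.execute("PRAGMA table_info(awl)")]
    if "signedby" in cols:
        return False
    # Existing rows get '' and are therefore treated as unsigned.
    conn.execute(
        "ALTER TABLE awl ADD COLUMN signedby TEXT NOT NULL DEFAULT ''")
    return True

# An old-style table without the new field (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
             "count INTEGER, totscore REAL)")
conn.execute("INSERT INTO awl VALUES ('u', 'bob@yahoo.com', '10.1', 4, 59.2)")

first = add_signedby_column(conn)    # adds the column
second = add_signedby_column(conn)   # no-op on an upgraded table
```

Defaulting old rows to an empty string keeps them usable as the
"unsigned" records once the feature is switched on.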
As an alternative to adding a new field, it would be just as fine to
re-purpose the field awl.ip to carry either an IP address or a signer id.
As this field is very short (10 characters), tables would need to be
modified one way or another, so I chose a somewhat cleaner approach.
One side-benefit of collected data is that average scores can be obtained
for each signing domain, yielding some form of a 'reputation score'
for each domain. The new code does provide a sub get_signer_reputation()
which provides an average score for a supplied signing id, and this score
can be used to adjust the AWL result. The call to get_signer_reputation
is currently disabled to save one SQL select (an extra index on signedby
is needed to make this quick enough). It turned out that after a while
this average 'reputation score' settles to a rather constant value, so
dynamically fetching it every time is a bit of an unnecessary effort,
and a manual mechanism suffices.
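
For illustration, a per-signer average of this kind could be computed
roughly as below. This is a Python/SQLite sketch, not the Perl sub from
the patch: the table layout, the index name awl_signedby and the choice
of a count-weighted average (sum of totscore over sum of count) are
assumptions.

```python
import sqlite3

def get_signer_reputation(conn, signedby):
    """Return the average score over all records of one signing id,
    or None if there are no such records. Sketch only."""
    total, n = conn.execute(
        "SELECT SUM(totscore), SUM(count) FROM awl WHERE signedby = ?",
        (signedby,)).fetchone()
    if not n:
        # SUM() over zero rows yields NULL, which arrives here as None.
        return None
    return total / n

# Assumed table layout and sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
             "count INTEGER, totscore REAL, "
             "signedby TEXT NOT NULL DEFAULT '')")
# The extra index mentioned in the text; its name is hypothetical.
conn.execute("CREATE INDEX awl_signedby ON awl (signedby)")
conn.executemany("INSERT INTO awl VALUES (?, ?, ?, ?, ?, ?)", [
    ("u", "ann@gmail.com", "none", 3, -9.0, "gmail.com"),
    ("u", "joe@gmail.com", "none", 1, -4.2, "gmail.com"),
    ("u", "ann@gmail.com", "10.1", 2, 5.8, ""),
])

reputation = get_signer_reputation(conn, "gmail.com")
```

Because the value changes slowly, a result like this could equally be
computed offline once in a while rather than on every message.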
Since I don't use bdb databases for AWL, I did not venture into modifying
DBBasedAddrList.pm, so enabling the feature only has an effect when
the AWL database is in SQL. Improvements welcome.