https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8361
Bug ID: 8361
Summary: (offgrid scupper) Emits bizarre errors and neglects to
generate scores when running offline; docs lacking;
offline configs needed
Product: Spamassassin
Version: 4.0.1
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: spamassassin
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: Undefined
Testing was done using this workflow as an offgrid user:
- Carry a laptop into an Internet cafe to fetch email and run sa-update
- From home without Internet: process the mail (which would take too long in a
cafe)
- Also from offgrid home: periodically run sa-learn on false negatives and
reprocess
/transparency issue/
Someone approaching SA for the first time would naturally expect
sa-update to need Internet, but not SA’s scoring. The fact that SA
needs an Internet uplink to score content defies reasonable
expectations and violates the “principle of least astonishment”. The man
page and the docs in /usr/share/doc/spamassassin give no hint that
scoring requires Internet access.
When I first discovered that SA’s scoring tool was accessing the net, I
wrapped it with torsocks so as to mitigate leaking personal info to some
extent. That worked at a time when Internet was always available to me.
I should mention first that torsocks is a hack. It’s not as proper as an
app that supports proxies.
Torsocks is also a somewhat futile hack because DNS lookups are often
done using UDP. In an attempt to mitigate that risk, a tor middlebox was
tried:
$ firejail --net=vnet0 \
    --dns="$(ip address show dev vnet0 | awk '/inet\>/{gsub(/[/].*/,""); print $2 }')" \
    /usr/bin/spamassassin
But that also failed even when there was Internet and I did not keep
notes on how or why.
When “torsocks spamassassin” is executed without a WAN, it behaves
poorly. When a message is scored, the output is unfriendly nonsense
from the standpoint of an end user who does not know Perl:
===✂----------------------------------------
1767261476 PERROR torsocks[17588]: socks5 libc connect: Connection refused (in
socks5_connect() at socks5.c:202)
1767261476 PERROR torsocks[17588]: socks5 libc connect: Connection refused (in
socks5_connect() at socks5.c:202)
Use of uninitialized value in subroutine entry at
/usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP/Lite.pm line 647.
Bad arg length for NetAddr::IP::Util::mask4to6, length is 0, should be 32 at
/usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP/Lite.pm line 647.
Compilation failed in require at
/usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP.pm line 8.
BEGIN failed--compilation aborted at
/usr/lib/x86_64-linux-gnu/perl5/5.36/NetAddr/IP.pm line 8.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin/NetSet.pm
line 26.
BEGIN failed--compilation aborted at
/usr/share/perl5/Mail/SpamAssassin/NetSet.pm line 26.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin/Conf.pm
line 88.
BEGIN failed--compilation aborted at /usr/share/perl5/Mail/SpamAssassin/Conf.pm
line 88.
Compilation failed in require at /usr/share/perl5/Mail/SpamAssassin.pm line 71.
BEGIN failed--compilation aborted at /usr/share/perl5/Mail/SpamAssassin.pm line
71.
Compilation failed in require at /usr/bin/spamassassin line 78.
BEGIN failed--compilation aborted at /usr/bin/spamassassin line 78.
procmail: [17579] Thu Jan 1 10:57:56 2026
procmail: Program failure (111) of "torsocks"
procmail: Rescue of unfiltered data succeeded
procmail: [17579] Thu Jan 1 10:57:56 2026
procmail: No match on "^X-Spam-Status: Yes"
===✂----------------------------------------
It’s bizarre that compilation would be in play at this stage. The
above also took painfully long, and no score was generated.
Testing in a less messy environment yielded better results:
===✂----------------------------------------
$ firejail --net=none /usr/bin/spamassassin -t \
    < /usr/share/doc/spamassassin/examples/sample-spam.txt
…
X-Spam-Checker-Version: SpamAssassin 4.0.1 (2024-03-25) on localhost
X-Spam-Flag: YES
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=1000.0 required=5.0 tests=BAYES_40,GTUBE,NO_RECEIVED,
NO_RELAYS autolearn=no autolearn_force=no version=4.0.1
…
===✂----------------------------------------
The workaround for me will be to prefix with "firejail
--net=none". But that’s not ideal because it means the procmail
scripts must be altered between torsocks and firejail every time WAN
availability changes.
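One way to avoid editing the procmail scripts by hand might be a small wrapper that picks the prefix at run time. The following is only a sketch: the function names (have_wan, sa_prefix, run_sa) and the WAN_CHECK override are made up for illustration, and treating the presence of a default route as proof of WAN availability is an assumption that will be wrong on some networks.

```shell
#!/bin/sh
# Hypothetical wrapper: choose the sandbox prefix according to connectivity.

# Assumption: a default route implies that the WAN is reachable.
have_wan() {
    [ -n "$(ip route show default 2>/dev/null)" ]
}

# Print the command prefix to use. WAN_CHECK (hypothetical) can name
# another check command, e.g. "true" or "false", for testing.
sa_prefix() {
    if "${WAN_CHECK:-have_wan}"; then
        echo "torsocks"
    else
        echo "firejail --net=none"
    fi
}

# Run spamassassin under the chosen prefix, passing all arguments through.
run_sa() {
    # Word splitting of the prefix is intentional here.
    $(sa_prefix) /usr/bin/spamassassin "$@"
}
```

A procmail recipe could then pipe messages to a script that sources these functions and calls run_sa, instead of naming torsocks or firejail directly.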
What does the “-t” flag do? That appears in USAGE.gz but the man page
does not disclose any CLI options. What other CLI options are there?
SA can probably be configured to skip tests that need a WAN, but a
novice user is unlikely to work out an advanced configuration like
that quickly. If possible, the docs should include a sample config for
an offline mode of operation.
Or even better, add a simple CLI flag “--offline”.
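Something close to this may already exist. The usage text printed by "spamassassin --help" appears to list an -L (--local) switch described as running local tests only. Whether it avoids every network access, and how it behaves under torsocks, is untested in the setup described above. A sketch under that assumption, where the function name sa_offline and the SA_BIN override are hypothetical:

```shell
# Hypothetical helper: score a message with local tests only.
# SA_BIN (hypothetical) overrides the path to the spamassassin binary.
sa_offline() {
    "${SA_BIN:-/usr/bin/spamassassin}" -L "$@"
}
```

For example: sa_offline -t < /usr/share/doc/spamassassin/examples/sample-spam.txt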
/several bugs and enhancement requests enumerated/
1. The man page and docs should state that Internet is used for
scoring and the comprehensive docs in /usr/share/doc should state
the rationale.
2. Spamassassin should be configurable to support a proxy. Ideally,
users should have a choice of SOCKS and HTTP proxies. Note that DNS
lookups often use UDP which the Tor network will not carry. So if a
proxy is supplied SA should also take care to use TCP.
3. When a task requires a WAN and the WAN is unreachable, SA should at
the very least give useful information and terminate more
gracefully when torsocks is used.
4. Or better than 3 above, SA should continue scoring in a degraded
state. It should be able to give a score without a WAN and perhaps
add a warning header stating that the scoring was degraded by the
lack of connectivity.
5. Document all CLI options (e.g. -t) in the man page.
6. Document a sample offline configuration.
7. Add an offline mode of operation that can be switched on the CLI.
Perhaps torsocks is somehow deceiving SA about WAN availability. If SA
is not going to be smart about that scenario, then the proxy option is
needed. A proxy option is needed anyway, in fact. And the proxy option
should be prominently documented and encouraged because there are
security compromises when running with default configs over the
clearnet.