https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7977
Bug ID: 7977
Summary: sa-learn --mbox broken in trunk
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Libraries
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: Undefined
I have not yet tracked this to a specific commit, but one of the recent changes
to ArchiveIterator.pm seems to have broken sa-learn --mbox. I have tested this
under perl v5.34.0 and v5.32.1 (Debian unstable and stable, respectively).
Example of a failure:
noahm@74805e6e29ad:/tmp$ spamassassin --lint
noahm@74805e6e29ad:/tmp$ spamassassin --version
SpamAssassin version 4.0.0-r1899900
running on Perl version 5.34.0
noahm@74805e6e29ad:/tmp$ sa-learn -D --spam --mbox < spam-2022-04-20_1040 2> debug.log
Learned tokens from 0 message(s) (0 message(s) examined)
noahm@74805e6e29ad:/tmp$ echo $?
1
The debug log contains warnings about nonexistent temporary files:
Apr 21 15:44:53.887 [1396] dbg: bayes: expiry completed
Apr 21 15:44:53.888 [1396] dbg: util: secure_tmpfile created a temporary file /tmp/.spamassassin13964mXj0atmp
Apr 21 15:44:53.888 [1396] dbg: util: current PATH is: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Apr 21 15:44:53.888 [1396] dbg: util: executable for bzip2 was found at /bin/bzip2
Apr 21 15:44:53.888 [1396] dbg: util: executable for xz was found at /usr/bin/xz
Apr 21 15:44:53.888 [1396] dbg: util: executable for lzip was found at /usr/bin/lzip
Apr 21 15:44:53.888 [1396] dbg: util: executable for lzop was found at /usr/bin/lzop
Apr 21 15:44:53.888 [1396] dbg: archive-iterator: _set_default_message_selection_opts After: Scanprob[1], want_date[0], cache[0], from_regex[(?^:^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4}|.?\d-\d\d-\d{4}_\d\d:\d\d:\d\d_))]
Apr 21 15:44:53.889 [1396] dbg: archive-iterator: no access to /tmp/.spamassassin13964mXj0atmp.0: No such file or directory
Apr 21 15:44:53.889 [1396] dbg: archive-iterator: no access to /tmp/.spamassassin13964mXj0atmp.4149: No such file or directory
Apr 21 15:44:53.889 [1396] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x55dd494c44b8) implements 'learner_close', priority 0
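A note on the two "no access" paths in this log: each is the name created by secure_tmpfile followed by a numeric suffix (.0 and .4149). One possible reading, which is an assumption and has not been checked against the ArchiveIterator.pm source, is that the suffix is a byte offset into the mailbox and that the access check is applied to the combined string instead of the bare file name. The Python sketch below only illustrates that distinction; split_location and check_access are hypothetical helpers, not SpamAssassin functions.

```python
import os
import re


def split_location(location):
    """Split 'path.NNN' into (path, NNN).

    Hypothetical helper for illustration only. Returns (location, None)
    when the string does not end in a dot followed by digits.
    """
    match = re.fullmatch(r"(.+)\.(\d+)", location)
    if match is None:
        return location, None
    return match.group(1), int(match.group(2))


def check_access(location):
    """Return (combined string readable?, bare path readable?)."""
    path, _offset = split_location(location)
    return os.access(location, os.R_OK), os.access(path, os.R_OK)
```

For the path in the log, split_location("/tmp/.spamassassin13964mXj0atmp.4149") separates the temporary file name from the suffix 4149. If the assumption holds, check_access would report the combined string as unreadable while the bare path is readable for as long as the temporary file exists.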
If I revert the most recent changes to ArchiveIterator.pm (r1899848, r1899843,
and r1899836), sa-learn --mbox works again:
noahm@c30d8e8ec714:/src/spamassassin$ sa-learn -D --spam --mbox < spam-2022-04-20_1040 2> debug-revert.log
Learned tokens from 2 message(s) (2 message(s) examined)
noahm@c30d8e8ec714:/src/spamassassin$ echo $?
0
In this case, the debug output indicates that the mailbox content is being
parsed as expected:
Apr 21 15:55:49.067 [146] dbg: util: secure_tmpfile created a temporary file /tmp/.spamassassin146faHm2gtmp
Apr 21 15:55:49.067 [146] dbg: archive-iterator: _set_default_message_selection_opts After: Scanprob[1], want_date[0], cache[0], from_regex[(?^:^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4}|.?\d-\d\d-\d{4}_\d\d:\d\d:\d\d_))]
Apr 21 15:55:49.069 [146] dbg: archive-iterator: _run_mailbox /tmp/.spamassassin146faHm2gtmp, ofs 0, limit 512000
Apr 21 15:55:49.070 [146] dbg: config: time limit 300.0 s
Apr 21 15:55:49.071 [146] dbg: message: _decode_header return-path: <[email protected]>
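To check the expected message count independently of sa-learn, the separator pattern printed in the from_regex[...] debug entry can be applied to the mailbox directly. The Python sketch below is a rough cross-check under stated assumptions: the line wrap in the logged pattern is taken to be a single space before \d{4}, the function names are hypothetical, and the code counts every matching line without reproducing any other condition SpamAssassin may apply when splitting a mailbox.

```python
import re

# Pattern taken from the from_regex[...] debug entry, written as a bytes
# regex so the mailbox can be read without guessing its encoding.
# Assumption: the wrap in the logged pattern stands for one space.
FROM_RE = re.compile(
    rb"^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4}"
    rb"|.?\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)"
)


def count_separators(lines):
    """Count lines (bytes) that match the mbox 'From ' separator pattern."""
    return sum(1 for line in lines if FROM_RE.match(line))


def count_separators_in_file(path):
    """Open a mailbox file in binary mode and count separator lines."""
    with open(path, "rb") as mailbox:
        return count_separators(mailbox)
```

Calling count_separators_in_file("spam-2022-04-20_1040") on the mailbox from this report should give a number comparable to the "message(s) examined" figure that sa-learn prints.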