Re: sa-learn won't read db created via MSTOR
On Sat, 8 Jul 2017 21:55:36 +0100
RW wrote:

> On Sat, 8 Jul 2017 14:14:42 -0500
> Jerry Malcolm wrote:

> As a proof of concept try a small mbox file with
>
> mbox_format_from_regex /^From\s/

and if it works try this instead:

/^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4})/
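For readers who, like the original poster, do not read regex: one way to see which of the patterns in this thread match the separator is simply to try them against it. The sketch below is a rough check, not part of SpamAssassin. It assumes Python's re module as a stand-in for Perl (these particular patterns use only syntax common to both), and it uses the separator as it appears in this thread, with a single space before the day; the real file may pad the day with two spaces. The names PATTERNS and matching_patterns are made up for illustration.

```python
import re

# Separator line as quoted in this thread (single space before the day).
# Assumption: the real mbox file may contain two spaces there instead.
SEPARATOR = "From - Sat Jul 8 01:02:28 2017"

# Patterns copied from the thread: the default shown in the -D output,
# and the two replacements suggested for mbox_format_from_regex.
PATTERNS = {
    "default": r"^From \S+ ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)",
    "minimal": r"^From\s",
    "relaxed": r"^From \S+ ?(\S\S\S \S\S\S .?\d .?\d:\d\d:\d\d \d{4})",
}


def matching_patterns(line):
    """Return the names of the patterns that match the given line."""
    return [name for name, pattern in PATTERNS.items() if re.search(pattern, line)]


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        verdict = "match" if re.search(pattern, SEPARATOR) else "no match"
        print(f"{name:8} {verdict}")
```

The default pattern requires exactly two characters for the day (".\d"), so a separator with a single space before a one-digit day would not satisfy it, while the ".?\d" form in the relaxed pattern accepts either spacing.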
Re: sa-learn won't read db created via MSTOR
On Saturday 08 July 2017 at 22:55:36, RW wrote:

> I had a spillage and most of the punctuation characters
> on my keyboard aren't working at the moment.

Oh dear, my sympathies - but what a splendid quote on a mailing list :)

Antony.

--
Salad is what food eats.

Please reply to the list; please *don't* CC me.
Re: sa-learn won't read db created via MSTOR
On Sat, 8 Jul 2017 14:14:42 -0500
Jerry Malcolm wrote:

> Thanks for the info. Unfortunately, I don't have a clue how to
> interpret a regex expression. I couldn't find any reference to
> mbox_format_from_regex in the 3.1.x Mail::SpamAssassin::Conf that
> came up when I googled it.

I hope you aren't actually running 3.1.x because that's ten years old.

> The separators in my mbox file are:
>
> From - Sat Jul 8 01:02:28 2017

That looks to be the problem. As a proof of concept try a small mbox file with

mbox_format_from_regex /^From\s/

This is actually all you need if the mbox files are properly formatted and lines that start "From " are escaped.

I would give you a fuller replacement but I had a spillage and most of the punctuation characters on my keyboard aren't working at the moment.
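The point about escaping is what makes the short /^From\s/ pattern safe: if every body line that begins with "From " has been quoted, the only lines left that start with "From " are real separators. Here is a minimal sketch of that quoting, assuming the common mboxrd-style convention of prefixing ">" to any body line that starts with "From " (optionally already preceded by ">" characters). Whether mstor writes its files this way is not stated in this thread, and the function name escape_from_lines is hypothetical.

```python
import re

# Body lines that would be mistaken for a separator: "From " at the start
# of a line, possibly already quoted with one or more ">" characters.
_FROM_LINE = re.compile(r"^(>*From )", re.MULTILINE)


def escape_from_lines(body: str) -> str:
    """Prefix ">" to every body line that starts with optional ">"s and "From "."""
    return _FROM_LINE.sub(r">\1", body)


if __name__ == "__main__":
    sample = "Hello,\nFrom here on things get odd.\n>From an earlier reply\n"
    print(escape_from_lines(sample))
```

After this quoting, no line in the message body begins with "From ", so a reader that splits on /^From\s/ cannot break a message in the middle.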
Re: sa-learn won't read db created via MSTOR
Upon further investigation, I don't think sa-learn is even attempting to open the file. I get the exact same message whether I give it a real file or just a string of characters for a file name:

[C:\Program Files\JAM Software\SpamAssassin in a Box]sa-learn.exe --spam --mbox c:\IMAPUtil\temp\uncaughtSpam.mstor\temp
Learned tokens from 0 message(s) (0 message(s) examined)

[C:\Program Files\JAM Software\SpamAssassin in a Box]sa-learn.exe --spam --mbox
Learned tokens from 0 message(s) (0 message(s) examined)

This can't be right. How can I tell if it's really reading the file?

On 7/8/2017 2:14 PM, Jerry Malcolm wrote:
> Thanks for the info. Unfortunately, I don't have a clue how to interpret a regex expression. I couldn't find any reference to mbox_format_from_regex in the 3.1.x Mail::SpamAssassin::Conf that came up when I googled it.
>
> The separators in my mbox file are:
>
> From - Sat Jul 8 01:02:28 2017
>
> Can someone who speaks regex tell me if this syntax is my problem, and if so, point me to where I can find the correct regex that matches this that I can copy/paste?
>
> Thanks.
>
> Jerry
>
> On 7/8/2017 8:45 AM, RW wrote:
> > On Sat, 8 Jul 2017 01:57:47 -0500
> > Jerry Malcolm wrote:
> >
> > > Below is a complete log dump from the -D option on sa-learn.
> > > ...
> > > _set_default_message_selection_opts After: Scanprob[1], want_date[0], cache[0], from_regex[^From \S+ ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)]
> >
> > Check that this default regex matches your mbox separator, you may need to set mbox_format_from_regex.
> >
> > See the Mail::SpamAssassin::Conf documentation
Re: sa-learn won't read db created via MSTOR
Thanks for the info. Unfortunately, I don't have a clue how to interpret a regex expression. I couldn't find any reference to mbox_format_from_regex in the 3.1.x Mail::SpamAssassin::Conf that came up when I googled it.

The separators in my mbox file are:

From - Sat Jul 8 01:02:28 2017

Can someone who speaks regex tell me if this syntax is my problem, and if so, point me to where I can find the correct regex that matches this that I can copy/paste?

Thanks.

Jerry

On 7/8/2017 8:45 AM, RW wrote:
> On Sat, 8 Jul 2017 01:57:47 -0500
> Jerry Malcolm wrote:
>
> > Below is a complete log dump from the -D option on sa-learn.
> > ...
> > _set_default_message_selection_opts After: Scanprob[1], want_date[0], cache[0], from_regex[^From \S+ ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)]
>
> Check that this default regex matches your mbox separator, you may need to set mbox_format_from_regex.
>
> See the Mail::SpamAssassin::Conf documentation
Re: sa-learn won't read db created via MSTOR
On Sat, 8 Jul 2017 01:57:47 -0500
Jerry Malcolm wrote:

> Below is a complete log dump from the -D option on sa-learn.
...
> _set_default_message_selection_opts After: Scanprob[1], want_date[0], cache[0], from_regex[^From \S+ ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)]

Check that this default regex matches your mbox separator, you may need to set mbox_format_from_regex.

See the Mail::SpamAssassin::Conf documentation
Re: sa-learn won't read db created via MSTOR
Below is a complete log dump from the -D option on sa-learn. I am really curious that the file name I passed in is never even mentioned in the log. Is that expected? Do I have some sort of syntax error passing the mbox filename in? Here's the command:

[C:\Program Files\JAM Software\SpamAssassin in a Box]sa-learn -D --spam --showdots --mbox c:\imaputil\temp\uncaughtspam.mstor\temp

Thx,

Jerry

Jul 8 01:47:42.704 [12972] dbg: logger: adding facilities: all
Jul 8 01:47:42.704 [12972] dbg: logger: logging level is DBG
Jul 8 01:47:42.704 [12972] dbg: generic: SpamAssassin version 3.4.1
Jul 8 01:47:42.704 [12972] dbg: generic: Perl 5.022001, PREFIX=C:\Program Files\JAM Software\SpamAssassin in a Box\runtime, DEF_RULES_DIR=C:\ProgramData\JAM Software\spamdService\sa-rules, LOCAL_RULES_DIR=C:\ProgramData\JAM Software\spamdService\sa-config, LOCAL_STATE_DIR=..\share
Jul 8 01:47:42.705 [12972] dbg: config: timing enabled
Jul 8 01:47:42.706 [12972] dbg: config: score set 0 chosen.
Jul 8 01:47:42.712 [12972] dbg: util: running in taint mode? no
Jul 8 01:47:42.712 [12972] dbg: util: defining getpwuid() wrapper using 'unknown' as username
Jul 8 01:47:42.715 [12972] dbg: config: using "C:\ProgramData\JAM Software\spamdService\sa-config" for site rules pre files
Jul 8 01:47:42.715 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/init.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v310.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v312.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v320.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v330.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v340.pre
Jul 8 01:47:42.716 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/v341.pre
Jul 8 01:47:42.717 [12972] dbg: config: using "C:\ProgramData\JAM Software\spamdService\sa-rules" for sys rules pre files
Jul 8 01:47:42.717 [12972] dbg: config: using "C:\ProgramData\JAM Software\spamdService\sa-rules" for default rules dir
Jul 8 01:47:42.717 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-rules/sa_zmi_at.cf
Jul 8 01:47:42.718 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-rules/sought_rules_yerp_org.cf
Jul 8 01:47:42.718 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-rules/spamassassin_heinlein-support_de.cf
Jul 8 01:47:42.718 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-rules/updates_spamassassin_org.cf
Jul 8 01:47:42.718 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-rules/xsaupdate_jam-software_com.cf
Jul 8 01:47:42.718 [12972] dbg: config: using "C:\ProgramData\JAM Software\spamdService\sa-config" for site rules dir
Jul 8 01:47:42.719 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/20_khop_bl.cf
Jul 8 01:47:42.719 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/contact.cf
Jul 8 01:47:42.720 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/jam.cf
Jul 8 01:47:42.720 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/jam_DNSBL.cf
Jul 8 01:47:42.720 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/jam_example_rules.cf
Jul 8 01:47:42.720 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/jam_virus_bounce_rules.cf
Jul 8 01:47:42.720 [12972] dbg: config: read file C:\ProgramData\JAM Software\spamdService\sa-config/local.cf
Jul 8 01:47:42.721 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
Jul 8 01:47:42.727 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
Jul 8 01:47:42.733 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
Jul 8 01:47:42.738 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::Pyzor from @INC
Jul 8 01:47:42.740 [12972] dbg: pyzor: network tests on, attempting Pyzor
Jul 8 01:47:42.740 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::Razor2 from @INC
Jul 8 01:47:42.806 [12972] dbg: razor2: razor2 is available, version 2.84
Jul 8 01:47:42.806 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::SpamCop from @INC
Jul 8 01:47:45.307 [12972] dbg: reporter: network tests on, attempting SpamCop
Jul 8 01:47:45.307 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::AutoLearnThreshold from @INC
Jul 8 01:47:45.309 [12972] dbg: plugin: loading Mail::SpamAssassin::Plugin::TextCat from @INC
Jul 8 01:47:45.313 [12972] dbg: textcat: loading languages file
sa-learn won't read db created via MSTOR
My client mail repository is in a SQL db and is not an option for sa-learn to read directly. That's fine. I wrote a utility that reads all the mail out of the uncaught-spam folder from my db and creates an mbox folder using the mstor Java package. The mbox file gets created with no problem. When I run sa-learn, it says 0 messages were examined. The mbox folder has about 2500 spam messages in it.

I've seen lots of discussion on the forums about whether or not sa-learn will 'process' a message based on whether it's processed it before, etc. I understand that. But this is the very first time I've ever tried to run sa-learn. And this error implies that it is not even finding any messages to process.

Here's the command and response (Win server 2008):

[C:\Program Files\JAM Software\SpamAssassin in a Box]sa-learn --spam --mbox --showdots c:\imaputil\temp\uncaughtspam.mstor\temp
Learned tokens from 0 message(s) (0 message(s) examined)

I've used the mstor package before and have had zero problems with it. So I have no reason to assume it's creating a corrupted mbox folder file. The mbox folder is present and is being found (I tried renaming it and got a 'not found' error from sa-learn). I've opened it in an editor, and to the extent I can tell, it looks like an mbox file. There is about a 10-15 sec time lapse while sa-learn is 'running' before it displays the message. So it appears that it's reading the mbox file. But for some reason it thinks there are no messages inside it.

I'm at a loss right now. Is there any way to get additional information on why it thinks there are no messages in the mbox file? I can post the mbox file if necessary. If there are any debug flags that will help me figure out what is wrong, I can do debug as well.

Thanks.

Jerry
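Independently of sa-learn, the file itself can be checked for separators. The sketch below is a standalone script, not a SpamAssassin feature. It counts the lines that start with "From " and reports how many of them match the default from_regex printed in the -D output elsewhere in this thread. It assumes Python's re module behaves like Perl for this pattern, reads the file as Latin-1 so that arbitrary bytes do not raise decode errors, and takes the mbox path as its only argument. The names scan_separators and DEFAULT_FROM_REGEX are hypothetical.

```python
import re
import sys

# Default separator pattern as printed in the sa-learn -D output in this thread.
DEFAULT_FROM_REGEX = re.compile(
    r"^From \S+ ?(\S\S\S \S\S\S .\d .\d:\d\d:\d\d \d{4}|.\d-\d\d-\d{4}_\d\d:\d\d:\d\d_)"
)


def scan_separators(lines):
    """Return (candidate separator lines, number matching the default pattern)."""
    candidates = [line.rstrip("\r\n") for line in lines if line.startswith("From ")]
    matched = sum(1 for line in candidates if DEFAULT_FROM_REGEX.search(line))
    return candidates, matched


if __name__ == "__main__" and len(sys.argv) > 1:
    # newline="" keeps the original line endings; they are stripped per line above.
    with open(sys.argv[1], encoding="latin-1", newline="") as handle:
        candidates, matched = scan_separators(handle)
    print(f"lines starting with 'From ': {len(candidates)}")
    print(f"of those, matching the default pattern: {matched}")
    for line in candidates[:5]:
        print(repr(line))  # repr() makes single vs. double spaces visible
```

If the first count is close to the expected number of messages and the second is 0, the separators are present but in a form the default pattern does not accept.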