ham counter not going up?

ian douglas Thu, 07 Jun 2007 15:10:17 -0700

Hi all,

I'm using SA 3.2.0 on a shared hosting account via CPanel, with my sa-trainer.cgi Perl script calling sa-learn with various parameters (which I'll get to in a second) to scan ham and spam from some Maildir folders.

After scanning, the Perl script calls "sa-learn --dump magic" and parses out the total number of spam/ham messages (nspam and nham, respectively) that have been processed through the bayes DBs.

What's odd is that after scanning, the number of ham messages does not increment. Before running the script, the last dump count said something to the effect of:


0.000          0         23          0  non-token data: nham

And after scanning, it reports the exact same information.
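
For reference, the parsing step amounts to roughly the following. This is a minimal shell sketch, not the actual trainer script (which is Perl); the helper name get_nham is made up for illustration, and the sample line is copied from the dump above so the snippet runs without sa-learn installed.

```shell
#!/bin/sh
# Sketch only: pull the nham counter out of "sa-learn --dump magic" output.
# get_nham is a hypothetical helper name; it reads dump output on stdin.
get_nham() {
    # The counter is the third whitespace-separated column on the
    # "non-token data: nham" line.
    awk '/non-token data: nham$/ { print $3 }'
}

# Sample line copied from the dump above.
sample='0.000          0         23          0  non-token data: nham'
printf '%s\n' "$sample" | get_nham

# Real usage would be: sa-learn --dump magic | get_nham
```

Running the same helper before and after a training run and comparing the two numbers is how the unchanged count shows up.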


The command-line calls built for scanning look something like:

sa-learn -p /path/to/user_prefs --spam /path/to/spam/maildir/cur
sa-learn -p /path/to/user_prefs --use-ignores --ham \
  /path/to/non-spam/maildir/cur

Is the "use-ignores" flag causing the number of scanned messages not to go up?

I turned on some bayes debugging by adding "-D bayes" to the command line, and see this when scanning the ham messages a second time:


[16014] (I snipped out all references to FuzzyOCR)
[16014] dbg: bayes: tie-ing to DB file R/O \
  /home/mypath/.spamassassin/bayes_toks
[16014] dbg: bayes: tie-ing to DB file R/O \
  /home/mypath/.spamassassin/bayes_seen
[16014] dbg: bayes: found bayes db version 3
[16014] dbg: bayes: DB journal sync: last sync: 0
[16014] dbg: bayes: not available for scanning, only 23 ham(s) in \
  bayes DB < 200
[16014] dbg: bayes: untie-ing
[16014] dbg: learn: initializing learner
[16014] dbg: bayes: bayes journal sync starting
[16014] dbg: bayes: bayes journal sync completed
[16014] dbg: bayes: expiry starting
[16014] dbg: bayes: tie-ing to DB file R/W \
  /home/mypath/.spamassassin/bayes_toks
[16014] dbg: bayes: tie-ing to DB file R/W \
  /home/mypath/.spamassassin/bayes_seen
[16014] dbg: bayes: found bayes db version 3

[16014] dbg: bayes: DB expiry: tokens in DB: 30901, Expiry max size: \
  150000, Oldest atime: 1178647046, Newest atime: 1181075754, Last \
  expire: 0, Current time: 1181253067
[16014] dbg: bayes: expiry completed
[16014] dbg: learn: learning ham
[16014] dbg: bayes: [EMAIL PROTECTED] already learnt correctly, not learning twice
[16014] dbg: learn: learning ham
[16014] dbg: bayes: [EMAIL PROTECTED] already learnt correctly, not learning twice
[16014] dbg: learn: learning ham
[16014] dbg: bayes: [EMAIL PROTECTED] already learnt correctly, not learning twice
[16014] dbg: learn: learning ham
[16014] dbg: bayes: [EMAIL PROTECTED] already learnt correctly, not learning twice
[16014] dbg: learn: learning ham

The "learnt correctly" line is repeated for all 68 or so messages, and then the output ends with:


[16014] dbg: bayes: untie-ing
[16014] dbg: bayes: files locked, now unlocking lock
Learned tokens from 0 message(s) (68 message(s) examined)
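
To compare runs, the two numbers in that summary line can be pulled out with something like the sketch below. The helper name learned_count is hypothetical, and the sample string is the summary line quoted above; I'm assuming the line always has this exact shape.

```shell
#!/bin/sh
# Sketch only: extract "learned examined" from sa-learn's summary line.
# learned_count is a hypothetical helper name; it reads output on stdin.
learned_count() {
    # In basic regular expressions, ( and ) are literal and \( \) are groups.
    sed -n 's/^Learned tokens from \([0-9][0-9]*\) message(s) (\([0-9][0-9]*\) message(s) examined).*$/\1 \2/p'
}

# Summary line copied from the run above.
summary='Learned tokens from 0 message(s) (68 message(s) examined)'
printf '%s\n' "$summary" | learned_count
```

A first number of 0 with a non-zero second number means nothing new was learned in that run, which appears to line up with the "already learnt correctly, not learning twice" lines in the debug output.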


Then doing another "dump magic" call, I still see the '23' line:

$ sa-learn --dump magic | grep nham
0.000          0         23          0  non-token data: nham

What information can I offer up, debugging or otherwise, to determine why the number of counted ham messages is not increasing? Or is it just the --use-ignores flag that's causing this?


Thanks,
Ian
