John Hardin wrote:
> 
> On Sat, 13 Feb 2010, smfabac wrote:
> 
>> Is there a message size limit for sa-learn?
> 
> Yes, there is, and sadly sa-learn does not explicitly tell you a message 
> has been skipped because it's too large.
> 
> If there's a non-text attachment try deleting it and re-learning the 
> message.
> 
> -- 
>   John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
>   jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
>   key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
> -----------------------------------------------------------------------
>    End users want eye candy and the "ooo's and aaaahhh's" experience
>    when reading mail. To them email isn't a tool, but an entertainment
>    form.                                                 -- Steve Lake
> -----------------------------------------------------------------------
>   9 days until George Washington's 278th Birthday
> 
> 

OK, it's a size problem.

I edited the notspam file and deleted 1000 lines (lines 3000 to 4000),
saved the file and then reprocessed notspam.

sa-learn kept reporting 0 messages examined until I had deleted about
3000 lines of the message:

Message size as received:

$ wc -l notspam 
   6408 notspam  <-- sa-learn --ham failed on the notspam folder,
                     which holds one message of 6000+ lines
$ 

After deleting 3003 lines:

$ wc -l notspam
   3405 notspam
$ vi notspam

     1  ^A^A^A^A
     2  From smf  Thu Feb 11 01:30:02 2010
     3  From: Boyd Lynn Gerber <gerb...@zenez.com>
     4  To: distribut...@registry.ca
     5  Subject: Quarterly ASCII posting of SCO UnixWare 7/OpenUNIX 8/OpenServer6 FAQ
     6  Date: Thu, 11 Feb 2010 00:05:18 -0700 (MST)
     7  Message-Id: <ou8faqqt_1265871...@news.xmission.com>
....
  3395
  3396               filepriv -f setuid programfile.exe
  3397
  3398  --
  3399  Boyd Gerber <gerb...@zenez.com> 801 849-0213
  3400  ZENEZ   1042 East Fort Union #135, Midvale Utah  84047
  3401
  3402
  3403  ------------=_4B73B21B.8398EDEC--
  3404
  3405  ^A^A^A^A

$ sa-learn --showdots --ham --mbox notspam
.
Learned tokens from 1 message(s) (1 message(s) examined)
$ 
$ wc notspam
  lines: 3405  words:  18735  characters: 130876 notspam
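
For anyone who would rather check sizes up front than delete lines by
trial and error, here is a minimal sketch in Python that lists the
messages in an mbox file whose size exceeds a cutoff. The function name
oversized_messages and the constant ASSUMED_LIMIT_BYTES are made up for
illustration. The 256 KiB default is only an assumption; the session
above shows nothing more precise than that the cutoff lies somewhere
between the trimmed file (130876 characters) and the original
6408-line file, so set the value to match your own installation.

```python
import mailbox

# Assumption: this cutoff is NOT taken from the thread or confirmed for any
# particular sa-learn version. Adjust it to whatever your installation uses.
ASSUMED_LIMIT_BYTES = 256 * 1024


def oversized_messages(mbox_path, limit_bytes=ASSUMED_LIMIT_BYTES):
    """Return (position, size_in_bytes, subject) for each message whose raw
    size (headers plus body, without the "From " separator line) exceeds
    limit_bytes. Positions are 1-based, in file order."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        found = []
        for position, key in enumerate(box.iterkeys(), start=1):
            # get_bytes() returns the stored message as raw bytes.
            size = len(box.get_bytes(key))
            if size > limit_bytes:
                subject = box.get_message(key).get("Subject", "(no subject)")
                found.append((position, size, subject))
        return found
    finally:
        box.close()


# Example use (the file name is the one from the session above):
#   for position, size, subject in oversized_messages("notspam"):
#       print(position, size, subject)
```

The sketch reports bytes rather than lines because a byte count does
not depend on how long the individual lines are. Any message it lists
is a candidate for trimming before it is passed to sa-learn.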


So, does the documentation on sa-learn indicate that there is 
a size limit on the message to be processed?

-- 
View this message in context: 
http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27590620.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
