- restore the corpus
- remove the file 'normfile'
- set 'RebuildTestMode' to on
- run a rebuild
- run a rebuild a second time

Thomas




From:    Steve Moffat <st...@optimum.bm>
To:      "'assp-test@lists.sourceforge.net'" <assp-test@lists.sourceforge.net>
Date:    12.09.2012 18:51
Subject: [Assp-test] FW: RebuildSpamDB - report from assp.isp.bm



Hi, I just ran rebuildspamdb with the new release. The results are even 
worse... Before this I had a perfect corpus.

Steve

-----Original Message-----
From: assp@assp.local [mailto:assp@assp.local] 
Sent: Wednesday, September 12, 2012 1:50 PM
To: Steve Moffat
Subject: RebuildSpamDB - report from assp.isp.bm

File rebuildrun.txt follows:



Sep-12-12 13:18:25 RebuildSpamDB-thread rebuildspamdb-version 6.02 started 
in ASSP version 2.2.2(12256)

Sep-12-12 13:18:25 RebuildSpamDB will create a Hidden Markov Model!

Sep-12-12 13:18:25 RebuildSpamDB will include attachment-database-entries 
in to spamdb!

Sep-12-12 13:18:25 RebuildSpamDB will create unicode enabled databases.

Sep-12-12 13:18:25 RebuildSpamDB process all words as Sequence of UAX #29 
Grapheme Clusters.

Sep-12-12 13:18:25 RebuildSpamDB will use the ASSP_WordStem engine.

Sep-12-12 13:18:25 ---ASSP Settings---
Sep-12-12 13:18:25 Do Not Collect RedRe Messages: Enabled **Messages 
matching the RedRe will be removed from the corpus!**

Sep-12-12 13:18:25 Use Subject as Maillog Names: True
Sep-12-12 13:18:25 Maxbytes: 4000
Sep-12-12 13:18:25 RebuildFileTimeLimit: 1 5
Sep-12-12 13:18:25 RebuildFileTimeLimit: files will be moved away from the 
corpus, if there processing takes longer than 5 second(s) 

Sep-12-12 13:18:25 C:/assp/errors/spam
Sep-12-12 13:18:25 File Count:           319
Sep-12-12 13:18:25 Processing... errors/spam with 319 files
Sep-12-12 13:18:25 ignore and remove files older than Dec-17-09 12:18:25 
in folder errors/spam
Sep-12-12 13:18:33 1 attachment/image entries processed
Sep-12-12 13:18:33 Imported Files:               317
Sep-12-12 13:18:33 Finished in 8 second(s)

Sep-12-12 13:18:33 C:/assp/errors/notspam
Sep-12-12 13:18:33 File Count:           113
Sep-12-12 13:18:33 Processing... errors/notspam with 113 files
Sep-12-12 13:18:33 ignore and remove files older than Dec-17-09 12:18:33 
in folder errors/notspam
Sep-12-12 13:18:40 26 attachment/image entries processed
Sep-12-12 13:18:40 Imported Files:               111
Sep-12-12 13:18:40 Finished in 7 second(s)
Sep-12-12 13:18:40 warning: missing information for automatic corpus 
correction in file C:/assp/normfile - rerun the rebuild, if you see this 
warning the first time!

Sep-12-12 13:18:40 C:/assp/spam
Sep-12-12 13:18:40 File Count:           4,363
Sep-12-12 13:18:40 Processing... spam with 4,363 files
Sep-12-12 13:19:27 remove C:/assp/spam/Confirmation_of_changes_to_Boo--140013.eml WhiteList: 'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove C:/assp/spam/Confirmation_of_changes_to_Boo--144011.eml WhiteList: 'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove C:/assp/spam/Confirmation_of_changes_to_Boo--145936.eml WhiteList: 'ba.custs...@contact.britishairways.com'
Sep-12-12 13:19:27 remove C:/assp/spam/Confirmation_of_changes_to_Boo--172792.eml WhiteList: 'ba.custs...@contact.britishairways.com'
Sep-12-12 13:20:07 remove C:/assp/spam/FW_Time_Clarification_Walk_the--81794.eml WhiteList: 'busbysu...@hotmail.com'
Sep-12-12 13:22:50 Removed White:                5
Sep-12-12 13:22:50 481 attachment/image entries processed
Sep-12-12 13:22:50 Imported Files:               4,356
Sep-12-12 13:22:50 Finished in 250 second(s)

Sep-12-12 13:22:50 C:/assp/notspam
Sep-12-12 13:22:50 File Count:           12,640
Sep-12-12 13:22:50 Processing... notspam with 12,000 files
Sep-12-12 13:42:28 2,022 attachment/image entries processed
Sep-12-12 13:42:28 Imported Files:               12,001
Sep-12-12 13:42:28 Folder contents exceeded 'MaxFiles'(12000). 
Sep-12-12 13:42:28 Finished in 1,178 second(s)

Sep-12-12 13:42:28 Rebuild processed 11.63 files per second.

Sep-12-12 13:42:28 Generating weighted Bayesian tuplets
Sep-12-12 13:42:38 start populating Spamdb with 175,796 records - Bayesian 
check is now disabled!
Sep-12-12 13:43:45 Finished populating Spamdb with 175,796 records - 
Bayesian check is now enabled!
Sep-12-12 13:43:45 done - Generating weighted Bayesian tuplets

Sep-12-12 13:43:45 Bayesian Pairs: 175,796 now in list

Sep-12-12 13:43:45 Generating consolidated Hidden-Markov-Model database 
from 1,634,405 record model
Sep-12-12 13:45:16 HMM sequences: 800,876 now in list

Sep-12-12 13:45:16 generating Spamdb.helo records from 3,664 collected 
HELO's
Sep-12-12 13:45:16 cleaning old Spamdb.helo records
Sep-12-12 13:45:17 done - cleaning old Spamdb.helo records

Sep-12-12 13:45:17 HELO Blacklist: 3 new, 94 now in list

Sep-12-12 13:45:17 Spam Weight:             1,598,969
Sep-12-12 13:45:17 Not-Spam Weight:   4,554,517

Sep-12-12 13:45:17 Corpus norm:          0.3511 - (warning: extremely ham 
heavy)
Sep-12-12 13:45:17 Corpus confidence:            0.13526783
Sep-12-12 13:45:17 Recommendation: RebuildSpamDB will limit the number of 
used messages in your corpus. Excess files will be ingored.
Sep-12-12 13:45:17 Corpus norm should be between 0.6 and 1.4

Sep-12-12 13:45:17 Recommendation: You need more spam messages in the 
corpus.

Sep-12-12 13:45:17 starting auto correction for corpus - delete old ham 
files from notspam

Sep-12-12 13:45:22 info: starting cleanup for to much (old) files in 
folder C:/assp/notspam - will try to remove 40% of the files - will keep 
at least 4000 files - will keep files younger than 14 days
info: deleted 1646 old files from folder C:/assp/notspam

Sep-12-12 13:45:22 Recommendation: You should reduce now MaxBytes to 2500! 
 

Sep-12-12 13:45:27 Start populating Hidden Markov Model. HMM-check is 
disabled for this time!
Sep-12-12 13:45:28 start populating Hidden Markov Model with 800,876 
records!
Sep-12-12 13:49:06 Finished populating Hidden Markov Model with 800,876 
records!
Sep-12-12 13:49:06 Finished populating Hidden Markov Model. HMM-check is 
now enabled again!

Sep-12-12 13:49:06 Total processing time: 1,841 second(s)

Sep-12-12 13:49:06 Total processing data: 567.41 MByte

Sep-12-12 13:49:06 building new GripList records and bounce report
Sep-12-12 13:49:06 processing Logfile C:/assp/logs/maillog.txt

Sep-12-12 13:49:11 skipping bounce report because 'DoNotCollectBounces' is 
switched ON

Sep-12-12 13:49:12 Uploading Griplist via Direct Connection
Sep-12-12 13:49:13 Submitted 2,910 bytes: 0 IPv6 addresses, 322 IPv4 
addresses

Sep-12-12 13:49:13 Trashlist was saved to C:/assp/trashlist.db
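
The figures in the log above can be cross-checked with a short script. This is only a sketch: it assumes the "Corpus norm" is the spam weight divided by the not-spam weight, and that "files per second" is the sum of imported files divided by the sum of the per-folder run times. Both assumptions are inferred from the numbers in this report, not taken from the ASSP source. All function and variable names are made up for illustration.

```python
# Figures copied from the rebuild report above.
SPAM_WEIGHT = 1_598_969
NOT_SPAM_WEIGHT = 4_554_517

# Per-folder (imported files, seconds) pairs, also from the report.
FOLDERS = {
    "errors/spam": (317, 8),
    "errors/notspam": (111, 7),
    "spam": (4_356, 250),
    "notspam": (12_001, 1_178),
}


def corpus_norm(spam_weight, not_spam_weight):
    """Assumed definition: ratio of spam weight to not-spam weight."""
    return spam_weight / not_spam_weight


def norm_in_range(norm, low=0.6, high=1.4):
    """The report says the norm should be between 0.6 and 1.4."""
    return low <= norm <= high


def files_per_second(folders):
    """Assumed definition: total imported files / total folder run time."""
    total_files = sum(files for files, _ in folders.values())
    total_seconds = sum(seconds for _, seconds in folders.values())
    return total_files / total_seconds


norm = corpus_norm(SPAM_WEIGHT, NOT_SPAM_WEIGHT)
print(f"corpus norm:      {norm:.4f}")
print(f"within 0.6..1.4:  {norm_in_range(norm)}")
print(f"files per second: {files_per_second(FOLDERS):.2f}")
```

Under these assumptions the script gives a norm of 0.3511 and 11.63 files per second, which matches the report. A norm that far below 0.6 means the not-spam weight is nearly three times the spam weight, which is consistent with the "extremely ham heavy" warning.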
------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test






