I had maxbytes set to 8000 and am trying 4000 now, though I'm afraid of the rebuild going more spammy, since there's a TON of spam but proportionally few messages being sent each day (or received as notspam).
Subject logging is on for manual review and retrieval.

The server becomes unresponsive not while it goes through all of the messages, but between the "Resulting file" and "Bayesian Pairs" steps. See below.

Apr-18-10 20:49:55 Resulting file 'c:/assp/spamdb.rb.tmp' is 6,078,796 bytes
Apr-18-10 20:53:35 Bayesian Pairs: 253,820 in new mail, 1,750,470 now in list

What's going on between there?

Thanks for the thoughts.

On Tue, Apr 20, 2010 at 7:58 AM, Hill, Brett <[email protected]> wrote:
> K Post wrote:
>
> Ok, ok, I'm an idiot. There, I said it, but I still have questions.
>
> As it turns out, the system rebuilds the database at 00:15, 5:15am,
> AND 8:15pm. So now we know where the load is coming from.
>
> There were outages at midnight and 5am consistently too, I just wasn't
> getting text alerts overnight for outages < 15 minutes.
>
> So, new question: Any idea what could be causing the rebuild to kill
> the server while it processes, and why does my rebuild seem to take
> about 40 minutes each time?
>
> I keep 15,000 files, subject logging on. No databases in use.
>
> -------------------------------------------------------------------
> Is there a reason why you keep (UseSubjectsAsMaillogNames) enabled? Now
> that you've got 15k messages, there really is no need to have that
> enabled.
>
> Your excess NotSpam count is probably what is causing your rebuild to
> take so long. ASSP still looks through all those extra messages during
> the rebuild before it deletes the overage. If you disable
> (UseSubjectsAsMaillogNames), ASSP will maintain a consistent 15K (or
> thereabouts), thereby reducing the amount of time it takes for the
> rebuild. I can't say why your ASSP is non-responsive though. Mine's
> always responsive during a rebuild.
>
> My limit is set to 14,500 and the rebuild takes about 13-14 minutes.
> For whatever reason, I have a hard time keeping max files in my spam
> dir. I assume it's because they're deleted because of false positives.
> My server runs Win32 with a 3.4GHz Xeon Proc with 3.5GB of RAM.
>
> RebuildSpamDB 2.7.1.0 (1.0.01) started - Tue Apr 20 07:30:01 2010
>
> Running in basedirectory 'C:/ASSP'
>
> ---ASSP Settings---
> Use Subject as Maillog Names: Disabled
> Maxbytes: 4000
> Maxfiles: 14500
>
> ---Cleaning whitelist (c:/assp/whitelist)---
> whitelist entries older than 1095 days (MaxWhitelistDays) will be removed
> whitelist before: 20,008
> whitelist after: 20,008
>
> --- Cleaning NoBayesian folders ---
> entries older than 30 days will be removed
> starting cleanup old files for folder c:/assp/okmail
> folder c:/assp/okmail before: 0
> folder c:/assp/okmail after: 0
>
> starting cleanup old files for folder c:/assp/discarded
> folder c:/assp/discarded before: 376
> folder c:/assp/discarded deleted: 7
> folder c:/assp/discarded after: 369
>
> starting cleanup old files for folder c:/assp/quarantine
> folder c:/assp/quarantine before: 405
> folder c:/assp/quarantine deleted: 7
> folder c:/assp/quarantine after: 398
>
>
> --- Cleaning corrected spam/notspam folders ---
> entries older than 1000 days will be removed
> starting cleanup old files for folder c:/assp/errors/spam
> folder c:/assp/errors/spam before: 545
> folder c:/assp/errors/spam after: 545
>
> starting cleanup old files for folder c:/assp/errors/notspam
> folder c:/assp/errors/notspam before: 540
> folder c:/assp/errors/notspam after: 540
>
>
> --- Cleaning Bayesian folders ---
>
> c:/assp/errors/spam
> File Count: 545
> Processing...
> Imported Files: 545
> Finished in 4 second(s)
>
> c:/assp/errors/notspam
> File Count: 540
> Processing...
> Imported Files: 540
> Finished in 5 second(s)
>
> c:/assp/spam
> File Count: 14,035
> Processing...
> removing c:/assp/spam/9760.eml -- '[email protected]' is in Whitelist
> Removed White: 1
> Imported Files: 14,034
> Finished in 362 second(s)
>
> c:/assp/notspam
> File Count: 14,499
> Processing...
> Imported Files: 14,499
> Finished in 433 second(s)
>
> Generating weighted Bayesian tuplets...done
>
> Saving rebuilt SPAM database...done
>
> Resulting file 'spamdb' is 3,847,902 bytes
>
> HELO Blacklist: 275 HELOs
>
> Spam Weight: 4,277,804
> Not-Spam Weight: 4,551,464
>
> Corpus norm: 0.9399 (ok - balanced)
> Corpus correction settings - low:0.9 high:1.2 minimum files:10000 minimum days:14
>
> Total processing time: 820 second(s)
>
> Griplist download disabled
> Downloading C:/ASSP/files/droplist.txt via direct HTTP connection
>
> Tue Apr 20 07:43:43 2010: RebuildSpamDB 2.7.1.0 (1.0.01) ended
>
> Kind Regards,
> Brett
>
> _______________________________________________
> Assp-test mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/assp-test
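
For anyone who wants to put numbers on the two logs in this thread, here is a minimal Python sketch. It measures the gap between the "Resulting file" and "Bayesian Pairs" timestamps from my server, and adds up the per-folder import times from Brett's rebuild log. The helper name `gap_seconds` and the `folder_seconds` table are made up for illustration; nothing here is part of ASSP. It also assumes an English locale, since the month is parsed from the abbreviation "Apr".

```python
from datetime import datetime

# Timestamp layout of the maillog lines quoted in this thread,
# e.g. "Apr-18-10 20:49:55". "%b" assumes English month abbreviations.
STAMP_FORMAT = "%b-%d-%y %H:%M:%S"


def gap_seconds(earlier: str, later: str) -> int:
    """Whole seconds between two log timestamps (hypothetical helper)."""
    start = datetime.strptime(earlier, STAMP_FORMAT)
    end = datetime.strptime(later, STAMP_FORMAT)
    return int((end - start).total_seconds())


# "Finished in N second(s)" values copied by hand from Brett's rebuild log.
folder_seconds = {
    "errors/spam": 4,
    "errors/notspam": 5,
    "spam": 362,
    "notspam": 433,
}
# "Total processing time" reported by the same log.
TOTAL_SECONDS = 820

if __name__ == "__main__":
    gap = gap_seconds("Apr-18-10 20:49:55", "Apr-18-10 20:53:35")
    print(f"'Resulting file' to 'Bayesian Pairs': {gap} s")

    imported = sum(folder_seconds.values())
    print(f"Folder imports: {imported} s of {TOTAL_SECONDS} s total")
```

On these figures, the unexplained stretch on my server is a few minutes long, while on Brett's server nearly all of the rebuild time goes to importing the spam and notspam folders. The two logs come from different machines, so the comparison is only rough.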
