Also, subject logging is used so that files aren't randomly deleted, which
keeps block reporting working (still, I don't think that's the problem here;
I just wanted to answer your question more thoroughly).


On Tue, Apr 20, 2010 at 2:51 PM, K Post <[email protected]> wrote:
> I had maxbytes set to 8000 and am trying 4000 now, though I'm afraid of
> the rebuild going more spammy, since there's a TON of spam but
> proportionally few messages being sent each day (or received as
> notspam).
>
> Subject logging is on for manual review and retrieval.
>
> The server seems to become unresponsive not while it goes through all of
> the messages, but between the "Resulting file" line and the "Bayesian
> Pairs" calculation.  See below.
>
> Apr-18-10 20:49:55 Resulting file 'c:/assp/spamdb.rb.tmp' is 6,078,796 bytes
> Apr-18-10 20:53:35 Bayesian Pairs: 253,820 in new mail, 1,750,470 now in list
>
> What's going on between there?
>
> Thanks for the thoughts.
>
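To put a number on that gap, here is a minimal Python sketch that parses the
two timestamps quoted above and reports the elapsed time. The function names
are my own, and it assumes an English locale for the month abbreviation; it
only measures the pause and says nothing about what ASSP is doing during it.

```python
from datetime import datetime

# The two log lines quoted in the message above, copied verbatim.
LOG_LINES = [
    "Apr-18-10 20:49:55 Resulting file 'c:/assp/spamdb.rb.tmp' is 6,078,796 bytes",
    "Apr-18-10 20:53:35 Bayesian Pairs: 253,820 in new mail, 1,750,470 now in list",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'Mon-DD-YY HH:MM:SS' stamp of a log line."""
    stamp = " ".join(line.split()[:2])
    # %b assumes English month abbreviations (the default C locale).
    return datetime.strptime(stamp, "%b-%d-%y %H:%M:%S")


def gap_seconds(first: str, second: str) -> float:
    """Seconds elapsed between the timestamps of two log lines."""
    return (parse_timestamp(second) - parse_timestamp(first)).total_seconds()


if __name__ == "__main__":
    print(f"gap: {gap_seconds(*LOG_LINES):.0f} seconds")
```

For the two lines above this works out to 220 seconds, i.e. 3 minutes 40
seconds.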
> On Tue, Apr 20, 2010 at 7:58 AM, Hill, Brett <[email protected]> wrote:
>> K Post wrote:
>>
>> Ok, ok, I'm an idiot.  There, I said it, but I still have questions.
>>
>> As it turns out, the system rebuilds the database at 00:15, 5:15am,
>> AND 8:15pm.  So now we know where the load is coming from.
>>
>> There were outages at midnight and 5am consistently too, I just wasn't
>> getting text alerts overnight for outages < 15 minutes.
>>
>> So, new question: Any idea what could be causing the rebuild to kill
>> the server while it processes, and why does my rebuild seem to take
>> about 40 minutes each time?
>>
>> I keep 15,000 files, subject logging on.  No databases in use.
>>
>> -------------------------------------------------------------------
>> Is there a reason why you keep (UseSubjectsAsMaillogNames) enabled?  Now
>> that you've got 15k messages, there really is no need to have that
>> enabled.
>>
>> Your NotSpam count exceeding the limit is probably what is causing your
>> rebuild to take so long.  ASSP still looks through all those extra
>> messages during the rebuild before it deletes the overage.  If you
>> disable (UseSubjectsAsMaillogNames), ASSP will maintain a consistent 15K
>> (or thereabouts), thereby reducing the amount of time it takes for the
>> rebuild.  I can't say why your ASSP is non-responsive though.  Mine's
>> always responsive during a rebuild.
>>
>> My limit is set to 14,500 and the rebuild takes about 13-14 minutes.
>> For whatever reason, I have a hard time keeping max files in my spam
>> dir.  I assume it's because they're deleted because of false positives.
>> My server runs Win32 with a 3.4GHz Xeon Proc with 3.5GB of RAM.
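As a rough back-of-the-envelope check on those figures, the sketch below
derives a scan rate from the per-folder counts and times in the rebuild log
quoted in this message, then asks how many files a 40-minute rebuild would
cover at that same rate. The function names are hypothetical, and the rate
comes from this one server, so applying it to different hardware is an
assumption, not a measurement.

```python
# (files scanned, seconds) for the two large folders, copied from the
# rebuild log quoted in this message.
LOGGED = {
    "spam": (14_035, 362),
    "notspam": (14_499, 433),
}


def combined_rate(folders):
    """Files scanned per second across all listed folders."""
    total_files = sum(files for files, _ in folders.values())
    total_seconds = sum(seconds for _, seconds in folders.values())
    return total_files / total_seconds


def files_for_duration(seconds, rate):
    """Number of files a rebuild of the given length would cover at this rate."""
    return seconds * rate


if __name__ == "__main__":
    rate = combined_rate(LOGGED)
    print(f"{rate:.1f} files/s")
    # 40 minutes is the rebuild time reported earlier in the thread.
    print(f"{files_for_duration(40 * 60, rate):,.0f} files in 40 minutes")
```

On these numbers the rate is roughly 36 files per second, so a 40-minute
rebuild would correspond to far more than 15,000 files per folder, which is
at least consistent with the overage explanation above.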
>>
>> RebuildSpamDB 2.7.1.0 (1.0.01) started - Tue Apr 20 07:30:01 2010
>>
>> Running in basedirectory 'C:/ASSP'
>>
>> ---ASSP Settings---
>> Use Subject as Maillog Names: Disabled
>> Maxbytes: 4000
>> Maxfiles: 14500
>>
>> ---Cleaning whitelist (c:/assp/whitelist)---
>> whitelist entries older than 1095 days (MaxWhitelistDays) will be removed
>> whitelist before: 20,008
>> whitelist after:  20,008
>>
>> --- Cleaning NoBayesian folders ---
>> entries older than 30 days will be removed
>> starting cleanup old files for folder c:/assp/okmail
>> folder c:/assp/okmail before: 0
>> folder c:/assp/okmail after: 0
>>
>> starting cleanup old files for folder c:/assp/discarded
>> folder c:/assp/discarded before: 376
>> folder c:/assp/discarded deleted: 7
>> folder c:/assp/discarded after: 369
>>
>> starting cleanup old files for folder c:/assp/quarantine
>> folder c:/assp/quarantine before: 405
>> folder c:/assp/quarantine deleted: 7
>> folder c:/assp/quarantine after: 398
>>
>>
>> --- Cleaning corrected spam/notspam folders ---
>> entries older than 1000 days will be removed
>> starting cleanup old files for folder c:/assp/errors/spam
>> folder c:/assp/errors/spam before: 545
>> folder c:/assp/errors/spam after: 545
>>
>> starting cleanup old files for folder c:/assp/errors/notspam
>> folder c:/assp/errors/notspam before: 540
>> folder c:/assp/errors/notspam after: 540
>>
>>
>> --- Cleaning Bayesian folders ---
>>
>> c:/assp/errors/spam
>> File Count:     545
>> Processing...
>> Imported Files: 545
>> Finished in 4 second(s)
>>
>> c:/assp/errors/notspam
>> File Count:     540
>> Processing...
>> Imported Files: 540
>> Finished in 5 second(s)
>>
>> c:/assp/spam
>> File Count:     14,035
>> Processing...
>> removing c:/assp/spam/9760.eml  -- '[email protected]' is in Whitelist
>> Removed White:  1
>> Imported Files: 14,034
>> Finished in 362 second(s)
>>
>> c:/assp/notspam
>> File Count:     14,499
>> Processing...
>> Imported Files: 14,499
>> Finished in 433 second(s)
>>
>> Generating weighted Bayesian tuplets...done
>>
>> Saving rebuilt SPAM database...done
>>
>> Resulting file 'spamdb' is 3,847,902 bytes
>>
>> HELO Blacklist: 275 HELOs
>>
>> Spam Weight:       4,277,804
>> Not-Spam Weight:   4,551,464
>>
>> Corpus norm:    0.9399  (ok - balanced)
>> Corpus correction settings - low:0.9 high:1.2 minimum files:10000
>> minimum days:14
>>
>> Total processing time: 820 second(s)
>>
>> Griplist download disabled
>> Downloading C:/ASSP/files/droplist.txt via direct HTTP connection
>>
>> Tue Apr 20 07:43:43 2010: RebuildSpamDB 2.7.1.0 (1.0.01) ended
>>
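A note on the "Corpus norm" line in the log above: the reported 0.9399
matches the logged spam weight divided by the not-spam weight, rounded to
four places. The sketch below reproduces that arithmetic. Treating the norm
as this simple ratio, and treating the low/high bounds as inclusive, are
assumptions inferred from the numbers in the log, not taken from the ASSP
source.

```python
# Values copied from the rebuild log above.
SPAM_WEIGHT = 4_277_804
NOTSPAM_WEIGHT = 4_551_464
LOW, HIGH = 0.9, 1.2  # "Corpus correction settings - low:0.9 high:1.2"


def corpus_norm(spam_weight: int, notspam_weight: int) -> float:
    """Ratio of spam weight to not-spam weight (assumed definition)."""
    return spam_weight / notspam_weight


def is_balanced(norm: float, low: float = LOW, high: float = HIGH) -> bool:
    """True when the norm lies within the configured band (bounds assumed inclusive)."""
    return low <= norm <= high


if __name__ == "__main__":
    norm = corpus_norm(SPAM_WEIGHT, NOTSPAM_WEIGHT)
    print(f"corpus norm: {norm:.4f}, balanced: {is_balanced(norm)}")
```

With the logged weights this gives 0.9399, which sits inside the 0.9 to 1.2
band and agrees with the "(ok - balanced)" note in the log.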
>> Kind Regards,
>> Brett
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Assp-test mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/assp-test
>>
>
