preventBulkImport is not checked.

I've reinstalled the VM from scratch: a new OS installation, using the Perl
distribution 5.20 from
http://sourceforge.net/projects/assp/files/ASSP%20V2%20multithreading/ASSP%20V2%20module%20installation/

By "parsing the files" I mean the step that logs lines like
"Apr-28-15 02:14:20 Processing... messages/notspam with 14,759 files".

I'm worried that just parsing through the 40k files is about 65% slower
than it is on the old production box using the same corpus (copied to the
dev machine), even though the old box has less than 1/2 the processing power,
40% slower disks, and 1/4 the RAM.  That very old installation doesn't
have HMM in the code (yes, it's that old).  When rb_processfolder runs in the
latest version, is it doing more processing of each file because of the HMM
option?  I can't imagine why else it would take so much longer on the new,
faster hardware.  Are there any temporary code modifications I can make to see
what's taking so long?

Is there a spot in the code where I could also modify the bulk import of
spamdb during the rebuild?  As a test, I'd like to have it write the import
script to a file, so that I can measure how long the import itself takes.
Any other suggestions on timing this would be great.
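
For a rough baseline, the import rate can be read off the two "populating
Spamdb" log lines quoted further down. A minimal Python sketch, assuming the
timestamps use the Mon-DD-YY HH:MM:SS layout seen in those lines; the helper
name import_rate is hypothetical and not part of ASSP:

```python
from datetime import datetime

# Assumed timestamp layout, taken from the log lines quoted in this thread.
LOG_TIME_FORMAT = "%b-%d-%y %H:%M:%S"


def import_rate(start, end, records):
    """Return (elapsed seconds, records per second) between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("end timestamp must be later than start timestamp")
    return elapsed, records / elapsed


if __name__ == "__main__":
    # Start/finish lines of the Spamdb import from the Apr-27-15 rebuild.
    elapsed, rate = import_rate(
        "Apr-27-15 13:23:47", "Apr-27-15 14:07:09", 1_140_905
    )
    print(f"{elapsed / 60:.1f} min, {rate:.0f} records/s")
```

For the run quoted below this works out to about 43.4 minutes, or roughly
438 records per second, which gives a number to compare against after each
configuration change.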

I'm really struggling here, thanks for the help.


On Tue, Apr 28, 2015 at 4:19 AM, Thomas Eckardt <thomas.ecka...@thockar.com>
wrote:

> Populating the SpamDB and HMMdb is a "DB Import". Check that
> 'preventBulkImport' is disabled!
>
> Thomas
>
>
>
>
>
> From:    K Post <nntp.p...@gmail.com>
> To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
> Date:    27.04.2015 20:32
> Subject: [Assp-test] MySQL vs BerkeleyDB
>
>
>
> Hi all-
>
> I'm having a rough go getting the rebuild process to quickly rebuild
> spamdb.  The HMM db, which uses BerkeleyDB, rebuilds wonderfully, in
> under a minute.  However, spamdb, which uses MySQL, is taking over 45
> minutes.  That's no good.
>
> The real question is whether there is a downside to using BerkeleyDB for
> everything.
>
> In reality, I'd like to figure out why my installation is running so slowly
> with MySQL (and I've got another stalled thread going on that).  I
> worry about the lack of management tools with BerkeleyDB.  I'd be
> uncomfortable with the whitelist being in Berkeley.
>
>
> More info:
>
> ASSP and MySQL are running on the same Windows 2012 Hyper-V virtual
> machine: 16 GB RAM, with a 4 GB RAM disk for c:/assp/tmpDB (using the imdisk
> driver).  The VM seems to be running quickly for all other tasks.
>
> I've got a corpus of around 15k spam, 15k not spam, and 5k errors for each
> of error-spam and error-notspam (so about 40k total).  It takes about 45
> minutes to go through all of these messages, and I'm okay with that.
>
> MySQL is using the settings suggested by Thomas here:
> http://sourceforge.net/p/assp/mailman/message/29893302/
> though net_buffer_length is limited to 1M according to the documentation.
>
> Apr-27-15 13:23:47 start populating Spamdb with 1,140,905 records -
> Bayesian check is now disabled!
> Apr-27-15 14:07:09 Finished populating Spamdb with 1,140,905 records -
> Bayesian check is now enabled!
>
>
> I'd really like to stick with MySQL for spamdb and the other databases,
> and use BerkeleyDB, as recommended, for HMM.  I just can't see doing that if
> the rebuild of spamdb will be so slow.
>
> What kind of speeds is everyone else seeing for the spamdb rebuild portion
> of the rebuild?
>
> I'd love some suggestions on speeding up MySQL or anything else.  Thank you.
>
> Ken
>
> ------------------------------------------------------------------------------
> One dashboard for servers and applications across Physical-Virtual-Cloud
> Widest out-of-the-box monitoring support with 50+ applications
> Performance metrics, stats and reports that give you Actionable Insights
> Deep dive visibility with transaction tracing using APM Insight.
> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> _______________________________________________
> Assp-test mailing list
> Assp-test@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/assp-test
>
