I turned on database debugging. In the debug folder, I see a file with these contents:

INSERT IGNORE INTO spamdb VALUES INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE
[[ continues on for many lines, ends with ]]
spamdb VALUES
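To make the symptom concrete: every statement in that file stops at VALUES with no row data after it, which is why the server rejects it. The following is a minimal Python sketch, not ASSP's own code, for checking such a dump and for comparing it with what a well-formed multi-row statement looks like. The function names, the two-column example rows, and the `%s` placeholder style (the one Python MySQL drivers use) are assumptions made for illustration; the 1000-row chunk size mirrors the bulk-insert maximum mentioned in the thread.

```python
import re

# Matches "INSERT IGNORE INTO <table> VALUES" and captures whatever follows,
# up to the next INSERT statement or the end of the text.
STATEMENT = re.compile(
    r"INSERT IGNORE INTO\s+(\w+)\s+VALUES(.*?)(?=,?\s*INSERT IGNORE INTO|\Z)",
    re.DOTALL,
)


def count_empty_values(dump):
    """Return (total statements, statements with nothing after VALUES)."""
    total = 0
    empty = 0
    for match in STATEMENT.finditer(dump):
        total += 1
        # Ignore separators and whitespace; anything else counts as row data.
        if not match.group(2).strip(" ,;\r\n\t"):
            empty += 1
    return total, empty


def build_bulk_insert(table, rows, max_rows=1000):
    """Yield (sql, params) pairs, one multi-row INSERT IGNORE per chunk.

    Nothing is yielded for an empty row list, so a statement is never
    emitted without data after VALUES.
    """
    for start in range(0, len(rows), max_rows):
        chunk = rows[start:start + max_rows]
        # One "(%s,%s,...)" group per row, sized from the first row.
        row_group = "(" + ",".join(["%s"] * len(chunk[0])) + ")"
        sql = "INSERT IGNORE INTO {} VALUES {}".format(
            table, ",".join([row_group] * len(chunk))
        )
        params = [value for row in chunk for value in row]
        yield sql, params
```

Running `count_empty_values` over the debug file's text reports how many statements carry no data. If every one of them is empty, the rows are being lost before the statement is assembled, rather than the statement being truncated on its way to the server.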
This is MySQL community 5.6.24 x64 running on Windows 2012.

On Tue, Apr 28, 2015 at 2:09 PM, K Post <nntp.p...@gmail.com> wrote:

> Sorry for the seemingly incessant emails...
>
> Error: You have an error in your SQL syntax; check the manual that
> corresponds to your MySQL server version for the right syntax to use near
> 'INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE INTO spamdb VALUES ,INSERT
> IGNOR' at line 1
>
> Don't recall having seen this before. I'm now using assp_db_import.cfg
> straight from cvs, no edits. Do I need to edit this to use with mysql
> too? Or should I? looks like the maximum records for insert in bulk is
> 1000, should I change that?
>
> On Tue, Apr 28, 2015 at 10:15 AM, K Post <nntp.p...@gmail.com> wrote:
>
>> and note, looking periodically at the worker status window in the web
>> admin, I see "chkdb - finished" for quite some time after the 40k files
>> have been processed. I think this is while spamdb is being generated.
>>
>> On Tue, Apr 28, 2015 at 9:29 AM, K Post <nntp.p...@gmail.com> wrote:
>>
>>> and why would the rebuild of hmm in berkeleydb take only seconds, but
>>> the spamdb in mysql (on same box) take 45 minutes?
>>>
>>> On Tue, Apr 28, 2015 at 9:28 AM, K Post <nntp.p...@gmail.com> wrote:
>>>
>>>> preventBulkImport is not checked.
>>>>
>>>> I've reinstalled the VM from scratch. New OS installation, using the
>>>> perl distribution 5.20 from
>>>> http://sourceforge.net/projects/assp/files/ASSP%20V2%20multithreading/ASSP%20V2%20module%20installation/
>>>>
>>>> Parsing the files, I'm talking about Apr-28-15 02:14:20 Processing...
>>>> messages/notspam with 14,759 files:
>>>> I'm worried that just parsing through the 40k files is about 65% slower
>>>> than it is on the old production box using the same corpus (copied to the
>>>> dev machine) even though the old box is less than 1/2 the processing power,
>>>> has 40% slower disks, and 1/4 the RAM.
>>>> That very old installation doesn't
>>>> have HMM in the code, yes it's that old. When rb_processfolder runs in the
>>>> latest version, is it doing more processing of each file because of the HMM
>>>> option? I can't imagine why it would take so much longer on the new
>>>> faster hardware. Any temporary code modifications I can make to see what's
>>>> taking so long?
>>>>
>>>> Is there a spot in code where I could also modify bulk import of spamdb
>>>> during the rebuild? I'd like to see if I can modify that as a test to
>>>> write the import script as a file, ultimately to test how long it takes to
>>>> import. Or any suggestions on timing this would be great.
>>>>
>>>> I'm really struggling here, thanks for the help.
>>>>
>>>> On Tue, Apr 28, 2015 at 4:19 AM, Thomas Eckardt <thomas.ecka...@thockar.com> wrote:
>>>>
>>>>> Populating the SpamDB and HMMdb is a "DB Import". Check that
>>>>> 'preventBulkImport' is disabled!
>>>>>
>>>>> Thomas
>>>>>
>>>>> From: K Post <nntp.p...@gmail.com>
>>>>> To: ASSP development mailing list <assp-test@lists.sourceforge.net>
>>>>> Date: 27.04.2015 20:32
>>>>> Subject: [Assp-test] MySQL vs BerkeleyDB
>>>>>
>>>>> Hi all-
>>>>>
>>>>> I'm having a rough go getting the rebuild process to quickly rebuild
>>>>> spamdb. The HMM db, which I have using BerkeleyDB, rebuilds wonderfully,
>>>>> in under a minute. However, spamdb, which uses MySQL, is taking over 45
>>>>> minutes. That's no good.
>>>>>
>>>>> The real question is if there is a downside for using BerkeleyDB for
>>>>> everything?
>>>>>
>>>>> In reality, I'd like to figure out why my installation is so slow
>>>>> with MySQL (and I've got another stalled out thread going on that). I
>>>>> worry about the lack of management tools with BerkeleyDB. I'd be
>>>>> uncomfortable with the whitelist being in Berkeley.
>>>>>
>>>>> More info:
>>>>>
>>>>> ASSP and MySQL are running on the same Windows 2012 Hyper-V virtual
>>>>> machine. 16 GB RAM. 4 GB RAM disk for c:/assp/tmpDB (using the imdisk
>>>>> driver). The VM seems to be running quickly for all other tasks.
>>>>>
>>>>> I've got a corpus of around 15k spam, 15k not spam, and 5k errors for each
>>>>> of error-spam and error-notspam (so about 40k total). It takes about 45
>>>>> minutes to go through all of these messages and I'm okay with that.
>>>>>
>>>>> MySQL is using the setting suggested here:
>>>>> http://sourceforge.net/p/assp/mailman/message/29893302/ by Thomas,
>>>>> though net_buffer_length is limited to 1M according to the documentation.
>>>>>
>>>>> Apr-27-15 13:23:47 start populating Spamdb with 1,140,905 records -
>>>>> Bayesian check is now disabled!
>>>>> Apr-27-15 14:07:09 Finished populating Spamdb with 1,140,905 records -
>>>>> Bayesian check is now enabled!
>>>>>
>>>>> I'd really like to stick with MySQL for spamdb and the other databases, but
>>>>> BerkeleyDB as recommended for HMM. I just can't see doing that if the
>>>>> rebuild of spamdb will be so slow.
>>>>>
>>>>> What kind of speeds is everyone else seeing for the spamdb rebuild portion
>>>>> of the rebuild?
>>>>>
>>>>> I'd love some suggestions on speeding up MySQL or anything else. Thank you
>>>>>
>>>>> Ken
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>>> Performance metrics, stats and reports that give you Actionable Insights
>>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>>> _______________________________________________
>>>>> Assp-test mailing list
>>>>> Assp-test@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/assp-test