I turned on database debugging.  In the debug folder, I see a file with
this as its contents:
INSERT IGNORE INTO spamdb VALUES INSERT IGNORE INTO spamdb VALUES ,INSERT
IGNORE INTO spamdb VALUES ,INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE
  [[  continues on  for many lines, ends with  ]]  spamdb VALUES

This is MySQL community 5.6.24 x64 running on Windows 2012.
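
For comparison, here is a minimal sketch of how a well-formed multi-row INSERT IGNORE can be assembled in batches. It is not ASSP's code (ASSP is written in Perl) and it does not diagnose the failure. It only illustrates that each statement needs at least one row tuple after VALUES, which the debug output above lacks. The function name `batched_insert_statements`, the two-column row layout, and the sample rows are hypothetical. The default batch size of 1000 mirrors the bulk limit mentioned in the quoted message below. The `%s` placeholders follow the parameter style used by common Python MySQL drivers.

```python
from typing import Iterator, List, Sequence, Tuple


def batched_insert_statements(
    rows: Sequence[Tuple[str, str]],
    table: str = "spamdb",
    batch_size: int = 1000,
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (sql, params) pairs, one multi-row INSERT IGNORE per batch.

    Assumption: every row has exactly two columns. The real spamdb
    schema is not shown in this thread.
    """
    # range() produces no start offsets for an empty input, so a statement
    # with a bare "VALUES" and no row tuples is never emitted.
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(%s,%s)" group per row, joined by commas.
        placeholders = ",".join(["(%s,%s)"] * len(batch))
        sql = f"INSERT IGNORE INTO {table} VALUES {placeholders}"
        # Flatten the rows so the values line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params


# Hypothetical sample data, split into batches of two rows.
rows = [("word1", "0.99"), ("word2", "0.01"), ("word3", "0.50")]
statements = list(batched_insert_statements(rows, batch_size=2))
for sql, params in statements:
    print(sql, params)
```

Each yielded pair is meant to be passed to a driver's `cursor.execute(sql, params)`, so the values are sent as parameters rather than pasted into the SQL text.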

On Tue, Apr 28, 2015 at 2:09 PM, K Post <nntp.p...@gmail.com> wrote:

> Sorry for the seemingly incessant emails...
>
> Error: You have an error in your SQL syntax; check the manual that
> corresponds to your MySQL server version for the right syntax to use near
> 'INSERT IGNORE INTO spamdb VALUES ,INSERT IGNORE INTO spamdb VALUES ,INSERT
> IGNOR' at line 1
>
> I don't recall having seen this before.  I'm now using assp_db_import.cfg
> straight from CVS, with no edits.  Do I need to edit it for use with MySQL,
> or should I?  It looks like the maximum number of records per bulk insert
> is 1000; should I change that?
>
>
>
> On Tue, Apr 28, 2015 at 10:15 AM, K Post <nntp.p...@gmail.com> wrote:
>
>> and note, looking periodically at the worker status window in the web
>> admin, I see "chkdb - finished" for quite some time after the 40k files
>> have been processed.  I think this is while spamdb is being generated.
>>
>> On Tue, Apr 28, 2015 at 9:29 AM, K Post <nntp.p...@gmail.com> wrote:
>>
>>> and why would the rebuild of hmm in berkeleydb take only seconds, but
>>> the spamdb in mysql (on same box) take 45 minutes?
>>>
>>> On Tue, Apr 28, 2015 at 9:28 AM, K Post <nntp.p...@gmail.com> wrote:
>>>
>>>> preventBulkImport is not checked.
>>>>
>>>> I've reinstalled the VM from scratch.  New OS installation, using the
>>>> perl distribution 5.20 from
>>>> http://sourceforge.net/projects/assp/files/ASSP%20V2%20multithreading/ASSP%20V2%20module%20installation/
>>>>
>>>> By "parsing the files" I mean the step logged as "Apr-28-15 02:14:20
>>>> Processing... messages/notspam with 14,759 files".
>>>> I'm worried that just parsing through the 40k files is about 65% slower
>>>> than it is on the old production box using the same corpus (copied to the
>>>> dev machine), even though the old box has less than 1/2 the processing
>>>> power, 40% slower disks, and 1/4 the RAM.  That very old installation
>>>> doesn't have HMM in the code; yes, it's that old.  When rb_processfolder
>>>> runs in the latest version, is it doing more processing of each file
>>>> because of the HMM option?  I can't imagine why it would take so much
>>>> longer on the new, faster hardware.  Are there any temporary code
>>>> modifications I can make to see what's taking so long?
>>>>
>>>> Is there a spot in the code where I could also modify the bulk import of
>>>> spamdb during the rebuild?  As a test, I'd like to write the import script
>>>> out to a file so I can time how long the import itself takes.  Any
>>>> suggestions on timing this would be great.
>>>>
>>>> I'm really struggling here, thanks for the help.
>>>>
>>>>
>>>> On Tue, Apr 28, 2015 at 4:19 AM, Thomas Eckardt <
>>>> thomas.ecka...@thockar.com> wrote:
>>>>
>>>>> Populating the SpamDB and HMMdb is a "DB Import". Check that
>>>>> 'preventBulkImport' is disabled!
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> From:    K Post <nntp.p...@gmail.com>
>>>>> To:      ASSP development mailing list <assp-test@lists.sourceforge.net>
>>>>> Date:    27.04.2015 20:32
>>>>> Subject: [Assp-test] MySQL vs BerkeleyDB
>>>>>
>>>>>
>>>>>
>>>>> Hi all-
>>>>>
>>>>> I'm having a rough go getting the rebuild process to quickly rebuild
>>>>> spamdb.  The HMM db, which uses BerkeleyDB, rebuilds wonderfully, in
>>>>> under a minute.  However, spamdb, which uses MySQL, is taking over 45
>>>>> minutes.  That's no good.
>>>>>
>>>>> The real question is whether there is a downside to using BerkeleyDB
>>>>> for everything.
>>>>>
>>>>> In reality, I'd like to figure out why my installation is so slow with
>>>>> MySQL (and I've got another stalled-out thread going on that).  I
>>>>> worry about the lack of management tools for BerkeleyDB.  I'd be
>>>>> uncomfortable with the whitelist being in BerkeleyDB.
>>>>>
>>>>>
>>>>> More info:
>>>>>
>>>>> ASSP and MySQL are running on the same Windows 2012 Hyper-V virtual
>>>>> machine with 16 GB of RAM and a 4 GB RAM disk for c:/assp/tmpDB (using
>>>>> the ImDisk driver).  The VM seems to be running quickly for all other
>>>>> tasks.
>>>>>
>>>>> I've got a corpus of around 15k spam, 15k not spam, and 5k errors for
>>>>> each of error-spam and error-notspam (so about 40k total).  It takes
>>>>> about 45 minutes to go through all of these messages, and I'm okay
>>>>> with that.
>>>>>
>>>>> MySQL is using the setting suggested by Thomas here:
>>>>> http://sourceforge.net/p/assp/mailman/message/29893302/
>>>>> though net_buffer_length is limited to 1M according to the
>>>>> documentation.
>>>>>
>>>>> Apr-27-15 13:23:47 start populating Spamdb with 1,140,905 records -
>>>>> Bayesian check is now disabled!
>>>>> Apr-27-15 14:07:09 Finished populating Spamdb with 1,140,905 records -
>>>>> Bayesian check is now enabled!
>>>>>
>>>>>
>>>>> I'd really like to stick with MySQL for spamdb and the other databases,
>>>>> but BerkeleyDB as recommended for HMM.  I just can't see doing that if
>>>>> the rebuild of spamdb will be so slow.
>>>>>
>>>>> What kind of speeds is everyone else seeing for the spamdb portion of
>>>>> the rebuild?
>>>>>
>>>>> I'd love some suggestions on speeding up MySQL or anything else.  Thank
>>>>> you.
>>>>> Ken
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> One dashboard for servers and applications across
>>>>> Physical-Virtual-Cloud
>>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>>> Performance metrics, stats and reports that give you Actionable
>>>>> Insights
>>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>>> _______________________________________________
>>>>> Assp-test mailing list
>>>>> Assp-test@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/assp-test
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
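
The two log lines quoted above (start and finish of populating Spamdb) are enough to estimate import throughput. The sketch below is plain arithmetic on those timestamps and the logged record count. The 1000-record batch size is the bulk limit mentioned earlier in the thread, and treating the whole interval as evenly divided among batches is an assumption made only for illustration.

```python
import math
from datetime import datetime

# Timestamp layout used in the quoted ASSP log lines, e.g. "Apr-27-15 13:23:47".
# Note: %b matches English month abbreviations under the default C locale.
LOG_FORMAT = "%b-%d-%y %H:%M:%S"

start = datetime.strptime("Apr-27-15 13:23:47", LOG_FORMAT)
end = datetime.strptime("Apr-27-15 14:07:09", LOG_FORMAT)
records = 1_140_905   # record count from the log lines
batch_size = 1000     # bulk limit mentioned in the thread (assumed to apply)

elapsed_seconds = (end - start).total_seconds()
records_per_second = records / elapsed_seconds
batches = math.ceil(records / batch_size)
seconds_per_batch = elapsed_seconds / batches

print(f"elapsed: {elapsed_seconds:.0f} s")
print(f"throughput: {records_per_second:.0f} records/s")
print(f"batches: {batches}, about {seconds_per_batch:.2f} s per batch")
```

By this estimate the import ran for 2602 seconds, or roughly 440 records per second, which gives a baseline to compare against after any change to the MySQL settings or the bulk size.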
