Hello,

    I am new to the list, so pardon me if I am posting my question to the wrong one.
If so, please direct me to the right place.

    My question regards the efficiency of INSERT statements. I have gone over some of
the optimizations, and on a machine running both the client and a MySQL 4 server,
DELETEing all rows from a table and repopulating it with ~48k records via INSERT takes
about 25 seconds. I have come to the conclusion that, regardless of hardware, the delay
will be too significant.
    What I am curious to know is whether there would be a significant improvement if I
switched my method of saving from a complete memory dump to a partial memory dump.  The
overhaul involved would require a lot of code restructuring.  The situation is this:
approximately 10k of those 48k records are actually modified regularly.  If I overhaul
the code to track when a record needs to be updated or inserted, and only issue the
appropriate statement, will the performance be significantly better?
    Keep in mind it would have to search the 48k records to UPDATE the 10k modified
ones, as well as potentially INSERT new records (very few, if any).
    With 48k records, is updating 10k of them faster than simply deleting and
reinserting every record?
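    Roughly, the partial dump would boil down to statements like these (the table and
column names here are made up for illustration):

        -- for each record flagged as modified since the last dump:
        UPDATE records SET data = 'new value' WHERE id = 1234;
        -- for the rare record that is new in memory:
        INSERT INTO records (id, data) VALUES (48001, 'some value');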

    Alternatively, is there any way to speed up the INSERT time?  First, I am using
the default table type, which I believe is MyISAM.  Second, I lock the table for
writing before it is deleted and repopulated, and unlock it after all records are
inserted.  Third, I am using multi-row INSERT statements (1000 records per statement).
Whether the lists hold 100 or 1000 rows, I get about the same result of 25 seconds.  I
realize the hardware I am using is not particularly powerful, but I think there is
some optimization I could make.  Can you preallocate a definite number of rows for a
large insert operation?
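    For reference, the current full dump looks roughly like this (again with made-up
names, and the batches trimmed to three rows here):

        LOCK TABLES records WRITE;
        DELETE FROM records;
        -- repeated ~48 times, 1000 rows per statement:
        INSERT INTO records (id, data) VALUES
            (1, 'value 1'),
            (2, 'value 2'),
            (3, 'value 3');
        UNLOCK TABLES;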

Any assistance would be appreciated; this project is flexible towards new ideas to
make the dumping more efficient.  One alternative under consideration is UPDATEing
records immediately when changes are made in memory.  Transaction overhead could
become an issue, however, with 10k+ records actively changing at any given time.  With
an UPDATE method, dumping frequently may actually reduce the time required for the
process, since each pass would update fewer records.
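In that design, every in-memory change would immediately fire a single-row statement
along these lines (same illustrative names as above):

        -- issued from the setter, one statement per change:
        UPDATE records SET data = 'changed value' WHERE id = 42;

The worry is the sheer number of such statements when 10k+ records are changing.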

I would much appreciate it if anyone with some insight on this could offer some ways
to speed up the process.

Thanks,
    -Shane
