Hello, I am new to the list, so pardon me if I am on the wrong one posting my question. If so, please direct me to the right place.
My question is in regards to the efficiency of INSERT statements. I have gone over some of the optimizations, and on a machine running the client and MySQL 4 server, DELETEing a table and repopulating it with ~48k records via INSERT, I have come to the conclusion that, regardless of hardware, the delay will be too significant (on a test machine it ran about 25 seconds). What I am curious to know, is whether there is a significant increase if I switch my method of saving from a complete memory dump, to a partial memory dump. The overhaul involved would require a lot of code restructuring. The situation is this, approximately 10k of those 48k records are actually modified regularily. If I overhaul the code to indicate when a record needs to be updated, or inserted, and only call the appropriate action, is the performance going to be significantly better? Keeping in mind it would have to search the 48k records to UPDATE the 10k modified records, as well as potentially INSERT new records (very few if any). With 48k records, is updating 10k records faster than simply deleting and reinserting every record? Alternatively, is there any way speeding up the INSERT time can be achieved? First, I am using the default format, I believe is MyISAM. Second, I have used table locking to optimize writing before the table is deleted and repopulated, and unlocked after all records are inserted. Third, I am using multiple INSERT lists (of 1000 records inserted at a time). Whether the lists is with 100, or 1000, I get about the same results of 25 seconds. I realize the hardware I am using is not significantly powerful, but I think there is some optimization I could make. Can you preallocate a definite number of rows for a large insert operation? Any assistance would be appreciated, this project is flexible towards new ideas to make the dumping more efficient. The alternative has been considered to use a method of UPDATEing records immediately when changes are made in memory. Transaction overhead could become an issue however, with 10k+ records actively changing at any given time. With an UPDATE method, dumping frequently may actually reduce the time required for the process, since it would update fewer records. Anyone with some insight on this would be much appreciated if they could offer some ways to speed up the process. Thanks, -Shane