I am importing records from a text file into my DB. Each record in the text file corresponds to six tables in the DB due to normalization.
So for example, an inventory record containing a part, vendor, contactPerson, mpn etc...in ONE line of a text file, requires 6 inserts into the DB. Further more, before each insert, I need to check for redundancy. So, if an inventory line in the text file has vendor 'Scott', BEFORE I add 'Scott' to the vendor table, I check the vendor table for whether 'Scott' exists. If 'Scott' does exist, then I just pull the ID and use that in the inventory insert - if 'Scott' DOESN'T exist yet, I insert 'Scott' into the Vendor Table, get the last_insert_id() and use that in the inventory table. Each LINE in the text file can result in more than 20 Select/Insert statements of the underlying DB before I can insert the record with all the properly allocated foreign keys. Considering many of these text files have thousands of lines of inventory, as you can imagine, I have a massive performance problem. Each complete line of the text file requires about 1 full second to validate and insert into the underlying schema. I'm using InnoDB tables so alot of the Insert Optimization techniques I found from MySQL don't seem to apply too well. I'm hoping for some experienced feedback in alternative techniques for performing this sort of import. Some further info: In PHP 5, I get my connection object, and then in a loop I'm calling $connecion->query(); for each line of the text file. I'm assuming this is always using the same connection and that each query doesn't require a re-connect to the DB...if it does, that's a major bottleneck that could be avoided (with persistent connections?) ie $conn = mysqli_init(); //null connection object $conn->real_connect('host', 'user', 'pass', 'db') or die('connection'); foreach($file as $line) { buildQuery(); $result = $conn->query($sql); } I hope each time I call $conn->query($sql) it's using the SAME connection resource and not having to reconnect to the DB. Secondly, $sql involves a call to a Stored Procedure, which in turn ends up calling other stored procedures to faciliate all the transactions, validations and inserts...I assume that's not too much more innefficient than using sql insert statements directly. Any help is appreciated. Scott.