It could save you lots of processing time to pre-collect a list of the 
records you are actually going to add to the table.  I know it sounds 
counter-intuitive but this has saved me OODLES of time on several 
occasions (of course it can't work for all situations but in your case 
where you are NOT inserting so many records I think it will help 
considerably).

CREATE TABLE tmpInsertMe (keyfield int);

SELECT s.keyfield
FROM stat_in s
LEFT JOIN 321st_stat st
        on st.keyfield = s.keyfield
WHERE st.keyfield is null

This may take a while to process (depends on if stat_in is also indexed on 
its "keyfield" field) but I should take much less time than your 24 minute 
insert. Now add a key to tmpInsertMe to speed up the JOIN during the 
insert. On 1000+ records (based on your example statistics) this should 
take way under a second.

ALTER TABLE tmpInsertMe add Key(keyfield);

Then you insert to 321st_stat only those records that are new (1000+) so 
you end up with 1/77000th of the processing required to perform the actual 
insert as it doesn't need to check for the IGNORE case for all of the 
other records.

INSERT 321st_stat
SELECT s.*
FROM stat_IN s
INNER JOIN tmpInsertMe t
        ON t.keyfield = s.keyfield;

And you are through with tmpInsertMe, so drop it:

DROP TABLE tmpInsertMe;

Of course, you will need to modify the field name and type of "keyfield" 
to match your situation. As I said, it is not exactly an intuitive 
technique but I tried it once out of desperation and I was HUGELY, 
INCREDIBLY IMPRESSED  with the performance improvement it created. At the 
time I was merging about 100,000 records into a 3,000,000 record table. 
Only about 1000 were new, and I had about 20,000 updates to perform, the 
rest were just old duplicates. Doing direct joins from source table to the 
destination table was killing me so I tried what I described and it worked 
very well....

FWIW,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine



"Andrew Pattison" <[EMAIL PROTECTED]> wrote on 08/19/2004 12:24:18 
PM:

> I'm guessing that you have indexes on the 321st_stat table? If this is 
the 
> case, try dropping them before you do the insert, then rebuilding them. 
> MySQL is known to be slow at doing bulk inserts on indexed tables. Also, 

> updates are much faster than inserts since with inserts there are much 
more 
> disk IOs required.
> 
> Cheers
> 
> Andrew.
> ----- Original Message ----- 
> From: "matt ryan" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, August 19, 2004 5:06 PM
> Subject: 1 day 28 min insert
> 
> 
> >I think oracle parallel query is calling me
> >
> > 110,832,565 stat records
> >
> > 77,269,086 on weekly update, I get small daily files, but daily sql's 
dont 
> > work very well, and miss records, in this case it missed 563 records.
> >
> > mysql> update stat_in set ctasc='321ST';
> > Query OK, 77269086 rows affected (24 min 17.60 sec)
> > Rows matched: 77269086  Changed: 77269086  Warnings: 0
> >
> > mysql> insert ignore into 321st_stat select * from stat_in;
> > Query OK, 563 rows affected (1 day 28 min 35.95 sec)
> > Records: 77269086  Duplicates: 77268523  Warnings: 0
> >
> > I just cant deal with speeds this slow, an insert onto a table with a 
> > primary key that tosses out almost all records shouldnt take this long 
to 
> > do
> >
> > -- 
> > MySQL General Mailing List
> > For list archives: http://lists.mysql.com/mysql
> > To unsubscribe: 
> > http://lists.mysql.com/[EMAIL PROTECTED]
> >
> > 
> 
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]
> 

Reply via email to