On Wed, 9 Jun 2004 21:58:39 -0500, Donny Simonton <[EMAIL PROTECTED]> wrote:

Russ,
We use #2 currently, and we are actually about to switch back to the
inserting them one at a time. The reason is very simple. In our case we
have a insert statement that will insert a maximum of 600 entries at a time.
But we could have up to 25 different programs running that could possibly be
doing that its own insert of 600 records. The problem is that say the first
one takes 3 seconds to insert all 600, but 1 second after the first one
starts the second program tries to insert, well, it will need to wait for
the first one to finish. So what ends up happening, if all 25 programs try
to insert at the same time, all 25 inserts can take about 5 minutes because
they are all waiting on each other.


Now if you didn't need the auto_increment id, then you could just use an
insert delayed which would be a million times faster for the program itself,
but not necessarily for mysql.


Now if you do the one insert at a time, and each of the 25 programs started
inserting one at a time, in theory they would all finish at the same time.
Would it be faster then the massive inserts? Again it should be slower, but
we have found that it's faster in the long run.


But it could also be that our table we are inserting these records into has
252 million rows in it right now. So I would definitely benchmark it
yourself, before taking my word for it.


We have run into the same problem with INSERT ... ON DUPLICATE KEY
UPDATE..., with a small table when it's mainly doing inserts, it's super
fast. But with a table with 44 million rows and only 3 columns it takes
about 1-2 seconds to do the update part of the insert. But again, we found
this by noticing that when the table was small or it's doing inserts the
command is super fast. But as time goes on, it gets slower.


Donny


Makes sense to me! In our case, each insert block is only going to consist of at maximum about 20 rows, so we don't have the same problem that you have there. Also, the data in the table is continually being moved elsewhere and deleted, so the table maintains a fairly constant number of rows at all time (around 50000), so we won't see the problems associated with the table filling up that you have.


I've just implemented 2) and think we've been the benefit of the fewer queries issued. I think it's a case of using the right tool for the job at hand really.

Still, if 1) is guaranteed to work I could knock off yet another query. ;-) I'm surprised that nobody knows the answer on that for sure...

-----Original Message-----
From: Russ Brown [mailto:[EMAIL PROTECTED]
Sent: Wednesday, June 09, 2004 7:45 AM
To: [EMAIL PROTECTED]
Subject: Re: Multi-row INSERTs

On Wed, 9 Jun 2004 12:58:01 +0200, Jigal van Hemert <[EMAIL PROTECTED]>
wrote:

>> Anyway, my question is this. If I do a single-statement multi-line
>> insert,
>> are the auto-increment IDs of the rows inserted guaranteed to be
>> sequential? Bear in mind also that I'm using InnoDB tables here.
>>
>> Conversely, if I know for a fact that it is not guaranteed, I know that
>> I
>> need to think of something else. :-)
> Will locking the table work for you? If you lock the table for writing
no
> other process can slip a query in between your queries for certain...
>
> Regards, Jigal.
>
>


Unfortunately locking the table isn't an option as the table is being
accessed extremely regularly by other clients performing similar inserts.
The key to this is speed and overhead: at present I'm inserting the rows
individually and recording each row's ID as I go. However I want to be
able to reduce the number of queries involved, so I've though of two
possibilities:


1) If the inserted rows have sequential IDs in the same order that they
appeared in the INSERT statement, I can do them all in one go, use
LAST_INSERT_ID to get the ID of the first and derive the rest by
incrementing in the application logic.
2) Insert them all in one statement and then select them back to get each
row's ID.


Now, I know that 2) will work, and it will allow me to reduce the number
of queries per process from N (where N is on average about 9) to 2.
However, if 1) will work it will allow me to reduce the number of queries
to 1, (plus a call to LAST_INSERT_ID), and wouldn't involve having to
match up the rows from the second query in 2).


If nobody knows the answer I'll just go with 2, but I thought it was an
interesting bit of trivia in addition to being useful to my specific
circumstance.

Thanks.



--

Russ

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]






--

Russ

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to