Benjamin Pflugmann wrote:

Hello.

On Fri 2002-12-06 at 10:28:23 -0600, [EMAIL PROTECTED] wrote:

Heikki and all,
I've got a quick question which may be more of a general SQL question, but
I'm asking here since I'm using InnoDB tables specifically for the row locking.

The following process description is specifically designed to prevent
duplicates.

My current process:
select * from run where job_id = 111 and thread_id = 0 limit 10000 for
update;

update run set thread_id = 999 where job_id = 111 and thread_id = 0 and
( rec_id = x or rec_id = x1 ... ); // rec_id pulled via loop of
previous select

Why do you repeat "job_id = 111 and thread_id = 0"? If you are using a
transaction isolation level of at least REPEATABLE READ (which is the
default), InnoDB ensures that you always see the same rows within one
transaction. Regardless, FOR UPDATE locks the records, which ensures
the values have not changed in the meantime.
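For what it's worth, here is a minimal sketch of the explicit-transaction form being described, using the table and column names from the thread. The BEGIN/COMMIT wrapper is my assumption: with autocommit on, the row locks taken by FOR UPDATE would be released as soon as the SELECT finished, so the claim only holds together inside one transaction.

```sql
-- Sketch: claiming a batch of rows under an explicit transaction.
BEGIN;
SELECT * FROM run
 WHERE job_id = 111 AND thread_id = 0
 LIMIT 10000
 FOR UPDATE;                       -- matched rows are now write-locked

UPDATE run SET thread_id = 999
 WHERE job_id = 111 AND thread_id = 0
   AND rec_id IN ( /* rec_ids collected from the SELECT */ );
COMMIT;                            -- locks released here
```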

Because I'm not using transactions explicitly. And the whole point of setting the thread id is that if thread 1 selects 5k records for update, I'm not sure that thread 2 won't select the same 5k records.

I've read the docs forwards and backwards and some things just aren't clear to me.
If a table has 10k records and thread 1 selects the first 5k in update mode, what happens when thread 2 executes a microsecond after thread 1? Will it get the same 5k records? Will the two threads run concurrently and pick records from the whole 10k set, so that at the end all 10k are selected but each 5k subset is not contiguous? Or will thread 1 get the first 5k records and thread 2 get the second 5k records?

Btw, rec_id IN (x1, x2, x3) may provide better performance (OR clauses
are still problematic with MySQL v3.23).
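As a concrete example, the earlier UPDATE rewritten with IN would look like this (the rec_id values are placeholders, as in the thread):

```sql
-- Same statement with IN instead of chained ORs:
UPDATE run SET thread_id = 999
 WHERE job_id = 111 AND thread_id = 0
   AND rec_id IN (x1, x2, x3 /* , ... */);
```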

Thanks, I will give it a try.


select main.* from run,main where run.job_id = 111 and run.thread_id =
999 and run.rec_id = main.rec_id; // this is saved off

delete from run where job_id = 111 and thread_id = 999;

I am confused. Above you select only the first 10,000 rows and
process them, but at the end you delete them all, not only at most 10,000?

It's a CYA maneuver. If a process is terminated after the select, update, and second select, and is then rerun, we now have 20k rows to process.


---- end ----
As you can see, it's quite a few queries to prevent collisions. Right
now the update and delete take the longest for obvious reasons.

What I'm wondering is if I can do the following:

select main.* from run,main where run.job_id = 111 and run.thread_id = 0
and run.rec_id = main.rec_id limit 10000 for update;

delete from run where job_id = 111 and (rec_id = x or rec_id = x2 ...);

Why not? FOR UPDATE is explained at http://www.mysql.com/doc/en/SELECT.html as:
"If you are using FOR UPDATE on a storage engine with page/row
locks, the examined rows will be write locked."

But not read locked. If two SELECT ... FOR UPDATEs are executed against the same record set, will one succeed and one fail?

Although the name sounds otherwise, FOR UPDATE is not only suited for
updates. IMHO, the write lock should work for the DELETE as well as for the
UPDATE. Or in other words: it is either good enough for both of your
variants or for neither.

The only thing you should consider is the fact that the second variant
will lock records from table "main", which the first did not.
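To answer the "will one succeed and one fail" question concretely, here is a sketch of what I would expect two concurrent sessions to see under InnoDB's row-level locking, assuming both run the claim inside a transaction. The session labels and the exact interleaving are mine, not something tested in this thread:

```sql
-- Session A:
BEGIN;
SELECT rec_id FROM run
 WHERE job_id = 111 AND thread_id = 0
 LIMIT 10000 FOR UPDATE;     -- write-locks the matched rows

-- Session B (a moment later):
BEGIN;
SELECT rec_id FROM run
 WHERE job_id = 111 AND thread_id = 0
 LIMIT 10000 FOR UPDATE;     -- does not fail; it BLOCKS on A's row locks

-- Session A:
UPDATE run SET thread_id = 999
 WHERE job_id = 111 AND thread_id = 0;
COMMIT;                      -- releases the locks

-- Session B's SELECT now returns. Because FOR UPDATE is a locking
-- read, it sees the committed thread_id = 999 values, skips those
-- rows, and picks up the next batch instead of the same one.
```

That is, the second reader waits rather than failing, and the two threads end up with disjoint batches.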

HTH,

Benjamin.

Thanks for your reply and information. The problem is that I'm conservative as hell when it comes to coding and must be absolutely sure things will work the way they should. Otherwise, I'll choose the inefficient route just because I know it's safe.

Mike...



---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
