In the last episode (Jun 24), Tim Gustafson said: > I'm working on a compute job that queries mySQL for work packets, > processes the work packet and then updates the row in mySQL with the > results of the job. > > Currently, from each compute instance I'm doing something like: > > select * from work_packets where work_data is null order by rand() limit 1; > > That more or less works, and occasionally two jobs will process the same > row and I'll waste a little bit of CPU time, but as a higher percentage of > jobs get completed, duplicate work happens more and more. [..] > I also thought about doing something like: > > select * from work_packets where work_data is null and (select > get_lock(work_packets.id, 0) = 1 order by rand() limit 1; > > but that seems to significantly increase the time it takes to run this > query from about 2 seconds to about 13 seconds. Since the jobs only take > about 10 seconds of CPU time each, this represents in an actual slow-down > as opposed to a speed-up.
That query won't execute (the subquery expression is incomplete); if you really are using a subquery there, try it plain, with no order by rand(): select * from work_packets where work_data is null and get_lock(id,0)=1 limit 1; My guess is your initial reason for ORDER BY RAND() was to minimize duplicate work, but if you're using locks it'll never happen, so you might as well just grab the first available packet. That query shouldn't even take 2 seconds, unless you have 50k rows in the table and no index on work_data... -- Dan Nelson dnel...@allantgroup.com -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql?unsub=arch...@jab.org