implementation of SELECT ... ORDER BY RAND() LIMIT 1

2007-02-08 Thread Jan Pieter Kunst

2007/2/7, Jos Elkink <[EMAIL PROTECTED]>:

Hi all,

I have a question about the combination of RAND and LIMIT 1. If I have
a query like:

SELECT  ... ORDER BY RAND() LIMIT 1

with the ... replaced with a normal query on one table. How is this
implemented? Is this optimized for the fact that it only needs one
entry?

And what about when there is a combination of tables

SELECT a.a, b.b FROM a,b WHERE a.b = b.id ORDER BY RAND() LIMIT 1

And in the case of

SELECT a.a, b.b FROM a LEFT JOIN b ON a.b = b.id ORDER BY RAND() LIMIT 1

Some say that especially in the last two cases, it is faster to just
retrieve the entire list and then select randomly.

And what if the case is that the limit is larger than 1, but smaller
than the entire table?

I am asking because we have various of these queries in our code and
serious issues with speed, and I was wondering whether I am assuming
optimization in the mysql code where they don't actually exist.

Any help on this would be much appreciated.


I just  dealt with this problem myself. The problem as far as I
understand it is that ORDER BY RAND() LIMIT 1 does a full table scan.

I found the solution and a serious speedup in the comments on this page:

http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/

HTH,
Jan Pieter

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: implementation of SELECT ... ORDER BY RAND() LIMIT 1

2007-02-07 Thread Philip Hallstrom

I have a question about the combination of RAND and LIMIT 1. If I have
a query like:

SELECT  ... ORDER BY RAND() LIMIT 1

with the ... replaced with a normal query on one table. How is this
implemented? Is this optimized for the fact that it only needs one
entry?


Try prefixing your query with "EXPLAIN" and see what it says it's going to 
do.  Pretty sure it's going to look at *every* row in the table, compute a 
random value, sort it, then return the first one.


So, for a table with a good number of rows, the above is going to be 
horrificly inefficient.  It would be a lot faster to do something like:


rowcount = select count(*) from table
random_value = something between 0 and rowcount - 1
select ... LIMIT 1 OFFSET random_value

-philip



And what about when there is a combination of tables

SELECT a.a, b.b FROM a,b WHERE a.b = b.id ORDER BY RAND() LIMIT 1

And in the case of

SELECT a.a, b.b FROM a LEFT JOIN b ON a.b = b.id ORDER BY RAND() LIMIT 1

Some say that especially in the last two cases, it is faster to just
retrieve the entire list and then select randomly.

And what if the case is that the limit is larger than 1, but smaller
than the entire table?

I am asking because we have various of these queries in our code and
serious issues with speed, and I was wondering whether I am assuming
optimization in the mysql code where they don't actually exist.

Any help on this would be much appreciated.

Regards,

Jos
http://www.cantr.net

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



implementation of SELECT ... ORDER BY RAND() LIMIT 1

2007-02-07 Thread Jos Elkink

Hi all,

I have a question about the combination of RAND and LIMIT 1. If I have
a query like:

SELECT  ... ORDER BY RAND() LIMIT 1

with the ... replaced with a normal query on one table. How is this
implemented? Is this optimized for the fact that it only needs one
entry?

And what about when there is a combination of tables

SELECT a.a, b.b FROM a,b WHERE a.b = b.id ORDER BY RAND() LIMIT 1

And in the case of

SELECT a.a, b.b FROM a LEFT JOIN b ON a.b = b.id ORDER BY RAND() LIMIT 1

Some say that especially in the last two cases, it is faster to just
retrieve the entire list and then select randomly.

And what if the case is that the limit is larger than 1, but smaller
than the entire table?

I am asking because we have various of these queries in our code and
serious issues with speed, and I was wondering whether I am assuming
optimization in the mysql code where they don't actually exist.

Any help on this would be much appreciated.

Regards,

Jos
http://www.cantr.net

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]