Re: ORDER BY RAND() Too Slow! Alternatives?
At 11:39 PM 2/10/2001 -0800, Stephen Waits wrote:
> Never mind on the "it doesn't work on my system", more like it didn't work on my brain :) Works fine.

Oh, phew.

> Theoretically it could be as fast as Carsten's method, couldn't it? If it hit a record on the first shot? Otherwise it's pounding through an index O(random-nearest_id) where his does it O(1). And could it potentially loop infinitely?

Based on my admittedly pathetic understanding of B-trees and database indexes, I *think* Carsten's approach is O(log n) in the number of rows. My approach is O(M*n), where M is the cost of a pretty lightweight access to nab the key. The "LIMIT $rand, 1" approach averages out to O(D*n/2) over time, but D is a nasty I/O hit to slurp the whole row into the result set.

The only case where Carsten's approach and mine would converge would be if you were using a query where no index could be applied. Then they'd both be stuck at O(n) in the number of rows.

I am curious whether "(@rand:=@rand-1)+id=id" can be optimized to remove the table reference (id) without having the query optimizer decide it only needs to run once. That might shave a good bit off of M.

In a case like this, it would be handy to have a ROW() function that tracks the running counter being used to generate the "X rows in set." statistic. But such a thing would probably be of limited utility.

At 11:28 PM 2/10/2001 -0800, Stephen Waits wrote:
> Carsten's approach is one of those "duh" things. I don't understand why I hadn't thought of it.

Likewise. It's a good reminder that clever solutions don't always come from linear thinking.

Thanks Carsten!

Jeff

--
Before posting, please check:
    http://www.mysql.com/manual.php (the manual)
    http://lists.mysql.com/ (the list archive)

To request this thread, e-mail [EMAIL PROTECTED]
To unsubscribe, e-mail [EMAIL PROTECTED]
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
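To put rough numbers on the estimates in this message, here is a small Python sketch. It is a back-of-the-envelope model, not a measurement: it assumes the index behaves like a binary search (a real B-tree has a much larger fan-out, so fewer levels), it ignores the per-row costs M and D, and the function name and dictionary keys are made up for illustration.

```python
import math


def expected_rows_touched(n):
    """Rough count of rows (or index steps) each approach touches for n rows."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        # Carsten's "id >= r ORDER BY id LIMIT 1": one index descent,
        # modelled here as a binary search over n keys.
        "index_seek": max(1, math.ceil(math.log2(n))),
        # The (@rand:=@rand-1)+id=id countdown: the WHERE clause is
        # evaluated once for every row.
        "countdown_scan": n,
        # "LIMIT r, 1" with r uniform in 0..n-1 reads r + 1 rows,
        # which averages (n + 1) / 2.
        "limit_offset": (n + 1) / 2,
    }


estimate = expected_rows_touched(1024)
```

For 1024 rows the model gives 10 index steps, 1024 lightweight row checks, and on average 512.5 full rows read.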
Re: ORDER BY RAND() Too Slow! Alternatives?
Could you do something like:

    CREATE TEMPORARY TABLE temptable (
        pk INTEGER,
        rand DOUBLE  -- RAND() returns a float in [0, 1); an INTEGER column would round every value to 0 or 1
    );
    INSERT INTO temptable SELECT yourpk, RAND() FROM yourtable;
    SELECT yourtable.* FROM yourtable, temptable WHERE pk = yourpk ORDER BY rand;
    DROP TABLE temptable;

That might be quicker than your current approach.

Jeff

At 12:12 PM 2/10/2001 -0800, Stephen Waits wrote:
> Hi there,
>
> In the quest to get a random row from a table, "order by rand()" has proven too inefficient and slow. It's slow because MySQL apparently selects ALL rows into memory, then randomly shuffles ALL of them, then gives you the first one - very inefficient.
>
> There are a few other ways I've thrown around but none are "elegant". One is, if a table has an id # column, like "id int unsigned not null auto_increment", I could do this:
>
>     select max(id) from table;
>     $random_number = ...
>     select * from table where id=$random_number;
>
> This is very fast (assuming the id field is a unique index). But it has the problem that if records have been deleted I might get a 0-row response. It also does not work if I want to limit to a particular category, for instance "where category='women'" or something.
>
> I could do this too:
>
>     select count(*) from table;
>     $random_number = ...
>     select * from table limit $random_number,1;
>
> This has the benefit of always working, but the speed, though faster than the "order by rand()" method, remains unacceptable. The speed seems linear with regard to the size of $random_number, which is probably obvious to you.
>
> So I've experimented with several other things:
>
>     select * from table where limit rand(),1;
>     select * from table where id=(mod(floor(rand()*4294967296),count(*))+1);
>
> ... and it only gets uglier from there -- these are all not accepted by MySQL. MySQL does not allow for subqueries, which is another way it could possibly be accomplished.
>
> In the end, I'll just use what works, no matter the speed. BUT, I'd love to hear what other people have done to solve this problem!
>
> Thanks,
> Steve
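A runnable sketch of the temporary-table idea, using Python's sqlite3 module as a stand-in for MySQL. That substitution is an assumption made so the example is self-contained: SQLite spells the function random() rather than RAND(), and it returns a large integer instead of a float. The id and name columns, the sample rows, and the LIMIT 1 (for the single-row case) are additions for illustration; the table names follow the message above.

```python
import sqlite3


def random_row_via_temp_table(conn):
    """Return one random row of yourtable by sorting on a per-row random key."""
    cur = conn.cursor()
    cur.execute("CREATE TEMP TABLE temptable (pk INTEGER, rand REAL)")
    try:
        # One random sort key per primary key; only the narrow key column
        # is copied, not the full rows.
        cur.execute("INSERT INTO temptable SELECT id, random() FROM yourtable")
        cur.execute(
            "SELECT yourtable.* FROM yourtable "
            "JOIN temptable ON temptable.pk = yourtable.id "
            "ORDER BY temptable.rand LIMIT 1"
        )
        rows = cur.fetchall()
        return rows[0] if rows else None
    finally:
        # Always clean up so the function can be called again.
        cur.execute("DROP TABLE temptable")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO yourtable (id, name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (5, "c")],
)
picked = random_row_via_temp_table(conn)
```

This still generates and sorts one key per row, so it is a full pass over the table; the possible saving is that the sort handles small (pk, rand) pairs rather than whole rows.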
RE: ORDER BY RAND() Too Slow! Alternatives?
> Hi there,
>
> In the quest to get a random row from a table, "order by rand()" has proven too inefficient and slow. It's slow because MySQL apparently selects ALL rows into memory, then randomly shuffles ALL of them, then gives you the first one - very inefficient.
>
> There are a few other ways I've thrown around but none are "elegant". One is, if a table has an id # column, like "id int unsigned not null auto_increment", I could do this:
>
>     select max(id) from table;
>     $random_number = ...
>     select * from table where id=$random_number;

How about

    select * from table where id>=$random_number order by id limit 1;

(note that I'm using '>=' rather than '='). This should always work, and be pretty fast.

There is a caveat, tho': this won't work if you need "exact randomness". A record that follows a run of deleted ids has a better chance of being selected than the others, and the larger the "holes" left by deleted ids, the worse the skew gets.

/ Carsten
--
Carsten H. Pedersen
keeper and maintainer of the bitbybit.dk MySQL FAQ
http://www.bitbybit.dk/mysqlfaq
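The skew Carsten mentions can be made concrete with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption, chosen so the example runs on its own), and the items table, its ids, and the helper names are hypothetical. Instead of drawing random numbers, it tries every possible value of $random_number once and counts which row each value lands on.

```python
import sqlite3
from collections import Counter


def pick_at_or_after(conn, r):
    """Carsten's query: the first row whose id is >= r, found via the index."""
    row = conn.execute(
        "SELECT id FROM items WHERE id >= ? ORDER BY id LIMIT 1", (r,)
    ).fetchone()
    return row[0] if row else None


def selection_counts(conn):
    """Try every possible r in 1..max(id) and count which id each one selects."""
    (max_id,) = conn.execute("SELECT max(id) FROM items").fetchone()
    return Counter(pick_at_or_after(conn, r) for r in range(1, max_id + 1))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
# ids 4, 5 and 6 have been "deleted", leaving a hole before 7.
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in (1, 2, 3, 7, 8)])
counts = selection_counts(conn)
```

Here id 7 is selected for r = 4, 5, 6 and 7, so it comes up four times as often as any other row. Drawing r from 1..max(id) also guarantees a result, since the row with the largest id always satisfies the condition.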
RE: ORDER BY RAND() Too Slow! Alternatives?
<?php
// Pick one row at random and print two of its columns.
$query = "SELECT col1, col2 FROM the_table ORDER BY RAND() LIMIT 1";
$result = mysql_query($query) or die("could not query");
$row = mysql_fetch_array($result);
print $row['col1'];  // quote the key: a bare col1 is read as an undefined constant
print "<P>";
print $row['col2'];
?>

Robert B. Barrington

GetMart Commercial Ecom: Web Administrator
http://weddinginlasvegas.com/ http://getmart.com/
[EMAIL PROTECTED]
Vegas Vista Productions
3172 North Rainbow Boulevard Suite 326
Las Vegas, Nevada 89108-4534
Telephone: (702)656-1027
Facsimile: (702)656-1608

-----Original Message-----
From: Stephen Waits [mailto:[EMAIL PROTECTED]]
Sent: Saturday, February 10, 2001 12:13 PM
To: [EMAIL PROTECTED]
Subject: ORDER BY RAND() Too Slow! Alternatives?
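For comparison with ORDER BY RAND(), the count-then-offset variant from the original post can be sketched as follows. This again uses Python's sqlite3 module as a stand-in for MySQL (an assumption for the sake of a runnable example), with a hypothetical items table; "LIMIT 1 OFFSET ?" is SQLite's spelling of MySQL's "LIMIT $random_number,1".

```python
import random
import sqlite3


def random_row_by_offset(conn, rng=random):
    """Pick a random offset in 0..count-1 and fetch the row at that position."""
    (count,) = conn.execute("SELECT count(*) FROM items").fetchone()
    if count == 0:
        return None
    offset = rng.randrange(count)
    # The engine still has to step past `offset` rows before it can
    # return one, which is why the cost grows with the offset.
    return conn.execute(
        "SELECT * FROM items ORDER BY id LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
# Gaps in the ids do not matter here: the offset counts rows, not id values.
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in (2, 3, 10, 11)])
picked = random_row_by_offset(conn, random.Random(0))
```

Because the offset counts rows rather than id values, every row is equally likely and deleted ids never produce an empty result. A WHERE clause could be added to both queries to restrict the pick to one category.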
Re: ORDER BY RAND() Too Slow! Alternatives?
"Jeffrey D. Wheelhouse" wrote: SELECT @lines:=COUNT(id) FROM table; SET @rand=CEILING(RAND()*@lines); SELECT * FROM table WHERE (@rand:=@rand-1)+id=id; Never mind on the "it doesn't work on my system" more like it didn't work on my brain :) Works fine. And now that I ponder it a bit more and I think I understand what it's doing I see the performance implications. Theoretically it could be as fast as Carsten's method couldn't it? If it hit a record on the first shot? Otherwise it's pounding through an index O(random-nearest_id) where his does it O(1). And could it potentially loop infinitely? --Steve - Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail [EMAIL PROTECTED] To unsubscribe, e-mail [EMAIL PROTECTED] Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php