As it happens, I found this post useful tonight for an almost identical situation.

I have a table with exactly 200,000 rows, the first column is an autoincrement ID field. I am confident that all IDs are consecutive and there are no holes.

When I do:

Select ID from history where id = CAST( rand() * 199999 + 1 as UNSIGNED) limit 1;

shouldn't it always return a single record? The CAST id should always be in the range from 1 to 200000.

About 50% of the time I get an empty recordset and I'm puzzled. (could be lack of sleep though)


Gary
----- Original Message ----- From: "Eric Bergen" <[EMAIL PROTECTED]>
To: "Dan Bolser" <[EMAIL PROTECTED]>
Cc: "Philip Hallstrom" <[EMAIL PROTECTED]>; "Brian Dunning" <[EMAIL PROTECTED]>; <mysql@lists.mysql.com>
Sent: Friday, May 13, 2005 7:48 PM
Subject: Re: Efficiently finding a random record



in() can take millions of arguments. Up to max packet size. Try it :)

Dan Bolser wrote:

On Fri, 13 May 2005, Eric Bergen wrote:


Even better is if you have an integer primary key (think auto_increment) and use in()

So if you want 10 random rows in your app generate 20 or so random numbers and do something like
select col1, from t where x in (1, 5, 3, 2...) limit 10


check num rows and if you don't get enough generate more random numbers and try again.

in() is blazing fast even with thousands of numbers so don't be afraid to kick a few extra in.


I heard about a 255 'in' limit. When you say 'thousands of numbers' do you mean in the IN or in the column?



-Eric


Philip Hallstrom wrote:


I have a db of about 300,000 records and when I try to find one random record like this:

select * from table order by rand() limit 1;

it can take several minutes. My Sherlock Holmes instincts tell me that what I'm doing is somehow inefficient. What is the primary culprit here?

The culprit is that mysql has to assign a random value (via rand()) to each of the 300,000 rows, then order all 300,000 rows by that random value and return the first one.

So even though you're just retrieving one value, you're processing 300,000 rows.

You'd be better off doing something like this in your application..

row_count = get row count;
random_row = get random number from 0 to row_count - 1;
result = db query LIMIT 1 OFFSET random_row

or something like that...

-philip









--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]






-- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Reply via email to