At 05:31 PM 10/24/2011, Reindl Harald wrote:

Am 24.10.2011 23:31, schrieb mos:
> At 11:32 AM 10/24/2011, Reindl Harald wrote:
>
>
>> Am 24.10.2011 18:02, schrieb mos:
>> > At 10:34 AM 10/24/2011, you wrote:
>> >> select id from table order by rand() limit 1;
>> >> is doing as example a dumb temporary table with the full size
>> >
>> > Because it has to sort the entire table, then it returns the one row. This of course is extremely inefficient. :) >> > You need to choose a random row by using an auto-inc field. Something like:
>> >
>> > select id from table where id>=myrandomnum limit 1
>>
>> but this is TOTALLY braindead if "id" is a primary-KEY with auto-increment
>
> It all depends on how many holes you have in the sequence and how random you want the selections to be. If there > are no holes then it will work. You need of course to get the first and last id and generate "myrandomnum" within > that range. If there are a lot of holes in the sequence then build another table with the columns bin and an > autoinc column and pick one of those rows randomly. Regenerate the table once an hour or once a day.
>
> Either way it is going to be a LOT FASTER than sorting the entire table

and why in the world is with the query above the WHOLE table
copied in a temp-table while fecth the whole id-list in a
php-array and take a random one is more than 1000 times faster?

the implementation if "order by rand()" is totally braindead

It is not "braindead". You told MySQL to sort by rand() which is a non-indexed column. It needs to assign a value to each row of the result set (all ids of the table) and sort it to get the lowest random number. This is very inefficient for large tables.

That is why I recommended using:

select id from table where id>=myrandomnum limit 1

provided id is uniformly distributed (not too many holes). If there are a lot of holes in the distribution then copy the id's to a separate table with its own auto-in rcd id. This will take about 1-2 seconds. Now you can select a random row from this table using the same SQL statement above and get a random id in about 1 ms. I didn't say to load all of the ids into the separate table (or a PHP array) every time you wanted a random id. That would be "braindead". You will need to update this secondary table whenever your table gets rows inserted or deleted (or reload the contents of the secondary table every few minutes).

Now there is a 3rd way to get a random row that is easier to implement. Add an extra column to your table called RndNum Float. When a row is added to the table, assign a random number to the column. This column needs to be indexed. Now you can execute:

select id from table where RndNum >= rand() limit 1

and this will return a random id in a couple of ms. You should use a stored procedure and handle the occasions when no row is returned because rand() that was generated was larger than the largest RndNum in the table. Depending on how random you want the results, you can make the index unique and handle the occurrences when a duplicate rand() number is generated.


_______________

SELECT pri_key FROM table ORDER BY RAND() LIMIT 1;

let's say 'pri_key' as a autoincrement integer
let's say the table has some blob-fields and is 4 GB large

fetch ONE random id via mysql will explode your webserver
because 4 GB data is copied - there is no valid reason for
such crappy implementation, really no single reason

The blob field is not included in the query so it won't be copied and the server will not explode. If it worked the way you said it did, then we would have exploding web servers with so much blob material encircling every small that not even Steve McQueen could eradicate it.

What does happen is a temporary table is created with all of the pri_key values and is sorted on Rand() and a single pri_key is returned. Since pri_key is an index, it only needs to access the index file and won't even bother accessing the data in the table.

And that's the way it is.

Mike



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=arch...@jab.org

Reply via email to