Donny, what do you do? Throw all the values into an array or something on the client side, and use a random number generator to pull out the array elements?
I suppose (depending on resultset size) pulling that many rows from server to client and handing on client side could be faster... On Mon, 8 Mar 2004, Donny Simonton wrote: > ORDER BY RAND(), just sucks in my opinion. We have created our own internal > randomization system because pretty much everytime you use it will show up > in the slow query log, because of the using temporary, using filesort it > does. Splitting your data into a hundred tables will still make it "using > temporary, using filesort". > > I just did a little test, where I only had 5 entries in a table, and I using > temp using filesort. > > Will it ever be improved? Probably the same time order by DESC is improved. > > Donny > > > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > > Sent: Monday, March 08, 2004 2:05 PM > > To: Neil Gunton > > Cc: MySQL > > Subject: Re: ORDER BY RAND() performance > > > > > > If your infact (sounds like) storing the pictures meta-data (name, size, > > owner, etc) and the data (blob of some kind) .. I would definately break > > up the design into 2 tables. That way when dealing with the meta-data > > table (your RAND() query) there is much less data that needs to be > > traversed to get your answer, which should result in a faster query. > > > > > > > > > > On Mon, 8 Mar 2004, Neil Gunton wrote: > > > > > Hi all, > > > > > > I am using MySQL 4.0.x to run a community website which has (among other > > > things) over 19,000 pictures. There is a page that selects 30 random > > > thumbnails. I have noticed that the performance of "ORDER BY RAND()" on > > > this table has a significant impact on performace. I have all the > > > relevant indexes defined, and I have researched this issue on the Web. > > > It seems that other people have also encountered a performance hit while > > > using ORDER BY RAND(). The reason appears to be that when you do EXPLAIN > > > on a query using this, MySQL reports "Using temporary; Using filesort", > > > which is the worst possible result. Also, the number of rows reported is > > > pretty much the entire set. So, presumably, the current implementation > > > of ORDER BY RAND() means that MySQL has to traverse the entire table, > > > regardless of other indexes. > > > > > > There are, of course, other ways to get around this, but they are all > > > more complex than simply using ORDER BY RAND(). I think that selecting a > > > random number of records from a table is something that a lot of > > > websites would like to be able to do, and so as datasets get larger it > > > would be nice to see this function scale well. For anyone who has a > > > website with a large archive of data, the ability to present a random > > > selection of this data is very useful. > > > > > > I would like to know if anyone knows if the MySQL team is aware of this > > > problem, and if so whether they are planning on improving it at any > > > point. I ask mainly because if I am told that "yes, it'll be much better > > > in version X" then I can live with the couple of seconds that it takes > > > currently, knowing that this will be better down the line. However if I > > > am advised that this is a fundamentally hard problem for whatever > > > reason, then I will put the effort into reworking my tables to use an > > > alternative solution. > > > > > > The only real solution that I can see which is fast is to make another > > > table which contains just the unique IDs of the pictures that are > > > visible (there are others which are not publicly visible, and which > > > shouldn't be included in the random query, so making a separate table > > > with the appropriate subset makes sense for performance). This new table > > > will have a primary key which is a numeric "sequence" field. Every > > > record will have its own sequence number, going from 1 up to the number > > > of records. Then, instead of doing one query with ORDER BY RAND() LIMIT > > > 30, I can instead do 30 queries, each with a different random sequence > > > (generated from Perl), which will look up the unique sequence number. > > > Since this is a primary key, it will be very fast, so that doing 30 > > > queries will not have a big performance impact. However this scheme > > > requires that the sequences in the new table be kept very consistent - > > > for example, if a picture is removed from the sequence then the sequence > > > numbers above that record have to be updated. This introduces potential > > > for error, but it is a possible solution. I don't want to implement it, > > > obviously, if ORDER BY RAND() is slated for improvement. > > > > > > Thanks for any ideas or insights... > > > > > > -Neil Gunton > > > > > > -- > > > MySQL General Mailing List > > > For list archives: http://lists.mysql.com/mysql > > > To unsubscribe: > > http://lists.mysql.com/[EMAIL PROTECTED] > > > > > > > -- > > MySQL General Mailing List > > For list archives: http://lists.mysql.com/mysql > > To unsubscribe: > > http://lists.mysql.com/[EMAIL PROTECTED] > > > -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]