ORDER BY RAND(), just sucks in my opinion.  We have created our own internal
randomization system because pretty much everytime you use it will show up
in the slow query log, because of the using temporary, using filesort it
does.  Splitting your data into a hundred tables will still make it "using
temporary, using filesort".

I just did a little test, where I only had 5 entries in a table, and I using
temp using filesort.

Will it ever be improved?  Probably the same time order by DESC is improved.

Donny



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 08, 2004 2:05 PM
> To: Neil Gunton
> Cc: MySQL
> Subject: Re: ORDER BY RAND() performance
>
>
> If your infact (sounds like) storing the pictures meta-data (name, size,
> owner, etc) and the data (blob of some kind) .. I would definately break
> up the design into 2 tables.  That way when dealing with the meta-data
> table (your RAND() query) there is much less data that needs to be
> traversed to get your answer, which should result in a faster query.
>
>
>
>
> On Mon, 8 Mar 2004, Neil Gunton wrote:
>
> > Hi all,
> >
> > I am using MySQL 4.0.x to run a community website which has (among other
> > things) over 19,000 pictures. There is a page that selects 30 random
> > thumbnails. I have noticed that the performance of "ORDER BY RAND()" on
> > this table has a significant impact on performace. I have all the
> > relevant indexes defined, and I have researched this issue on the Web.
> > It seems that other people have also encountered a performance hit while
> > using ORDER BY RAND(). The reason appears to be that when you do EXPLAIN
> > on a query using this, MySQL reports "Using temporary; Using filesort",
> > which is the worst possible result. Also, the number of rows reported is
> > pretty much the entire set. So, presumably, the current implementation
> > of ORDER BY RAND() means that MySQL has to traverse the entire table,
> > regardless of other indexes.
> >
> > There are, of course, other ways to get around this, but they are all
> > more complex than simply using ORDER BY RAND(). I think that selecting a
> > random number of records from a table is something that a lot of
> > websites would like to be able to do, and so as datasets get larger it
> > would be nice to see this function scale well. For anyone who has a
> > website with a large archive of data, the ability to present a random
> > selection of this data is very useful.
> >
> > I would like to know if anyone knows if the MySQL team is aware of this
> > problem, and if so whether they are planning on improving it at any
> > point. I ask mainly because if I am told that "yes, it'll be much better
> > in version X" then I can live with the couple of seconds that it takes
> > currently, knowing that this will be better down the line. However if I
> > am advised that this is a fundamentally hard problem for whatever
> > reason, then I will put the effort into reworking my tables to use an
> > alternative solution.
> >
> > The only real solution that I can see which is fast is to make another
> > table which contains just the unique IDs of the pictures that are
> > visible (there are others which are not publicly visible, and which
> > shouldn't be included in the random query, so making a separate table
> > with the appropriate subset makes sense for performance). This new table
> > will have a primary key which is a numeric "sequence" field. Every
> > record will have its own sequence number, going from 1 up to the number
> > of records. Then, instead of doing one query with ORDER BY RAND() LIMIT
> > 30, I can instead do 30 queries, each with a different random sequence
> > (generated from Perl), which will look up the unique sequence number.
> > Since this is a primary key, it will be very fast, so that doing 30
> > queries will not have a big performance impact. However this scheme
> > requires that the sequences in the new table be kept very consistent -
> > for example, if a picture is removed from the sequence then the sequence
> > numbers above that record have to be updated. This introduces potential
> > for error, but it is a possible solution. I don't want to implement it,
> > obviously, if ORDER BY RAND() is slated for improvement.
> >
> > Thanks for any ideas or insights...
> >
> > -Neil Gunton
> >
> > --
> > MySQL General Mailing List
> > For list archives: http://lists.mysql.com/mysql
> > To unsubscribe:
> http://lists.mysql.com/[EMAIL PROTECTED]
> >
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:
> http://lists.mysql.com/[EMAIL PROTECTED]




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to