Donny,  what do you do?  Throw all the values into an array or something
on the client side, and use a random number generator to pull out the
array elements?

I suppose (depending on resultset size) pulling that many rows from server
to client and handing on client side could be faster...




On Mon, 8 Mar 2004, Donny Simonton wrote:

> ORDER BY RAND(), just sucks in my opinion.  We have created our own internal
> randomization system because pretty much everytime you use it will show up
> in the slow query log, because of the using temporary, using filesort it
> does.  Splitting your data into a hundred tables will still make it "using
> temporary, using filesort".
>
> I just did a little test, where I only had 5 entries in a table, and I using
> temp using filesort.
>
> Will it ever be improved?  Probably the same time order by DESC is improved.
>
> Donny
>
>
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Monday, March 08, 2004 2:05 PM
> > To: Neil Gunton
> > Cc: MySQL
> > Subject: Re: ORDER BY RAND() performance
> >
> >
> > If your infact (sounds like) storing the pictures meta-data (name, size,
> > owner, etc) and the data (blob of some kind) .. I would definately break
> > up the design into 2 tables.  That way when dealing with the meta-data
> > table (your RAND() query) there is much less data that needs to be
> > traversed to get your answer, which should result in a faster query.
> >
> >
> >
> >
> > On Mon, 8 Mar 2004, Neil Gunton wrote:
> >
> > > Hi all,
> > >
> > > I am using MySQL 4.0.x to run a community website which has (among other
> > > things) over 19,000 pictures. There is a page that selects 30 random
> > > thumbnails. I have noticed that the performance of "ORDER BY RAND()" on
> > > this table has a significant impact on performace. I have all the
> > > relevant indexes defined, and I have researched this issue on the Web.
> > > It seems that other people have also encountered a performance hit while
> > > using ORDER BY RAND(). The reason appears to be that when you do EXPLAIN
> > > on a query using this, MySQL reports "Using temporary; Using filesort",
> > > which is the worst possible result. Also, the number of rows reported is
> > > pretty much the entire set. So, presumably, the current implementation
> > > of ORDER BY RAND() means that MySQL has to traverse the entire table,
> > > regardless of other indexes.
> > >
> > > There are, of course, other ways to get around this, but they are all
> > > more complex than simply using ORDER BY RAND(). I think that selecting a
> > > random number of records from a table is something that a lot of
> > > websites would like to be able to do, and so as datasets get larger it
> > > would be nice to see this function scale well. For anyone who has a
> > > website with a large archive of data, the ability to present a random
> > > selection of this data is very useful.
> > >
> > > I would like to know if anyone knows if the MySQL team is aware of this
> > > problem, and if so whether they are planning on improving it at any
> > > point. I ask mainly because if I am told that "yes, it'll be much better
> > > in version X" then I can live with the couple of seconds that it takes
> > > currently, knowing that this will be better down the line. However if I
> > > am advised that this is a fundamentally hard problem for whatever
> > > reason, then I will put the effort into reworking my tables to use an
> > > alternative solution.
> > >
> > > The only real solution that I can see which is fast is to make another
> > > table which contains just the unique IDs of the pictures that are
> > > visible (there are others which are not publicly visible, and which
> > > shouldn't be included in the random query, so making a separate table
> > > with the appropriate subset makes sense for performance). This new table
> > > will have a primary key which is a numeric "sequence" field. Every
> > > record will have its own sequence number, going from 1 up to the number
> > > of records. Then, instead of doing one query with ORDER BY RAND() LIMIT
> > > 30, I can instead do 30 queries, each with a different random sequence
> > > (generated from Perl), which will look up the unique sequence number.
> > > Since this is a primary key, it will be very fast, so that doing 30
> > > queries will not have a big performance impact. However this scheme
> > > requires that the sequences in the new table be kept very consistent -
> > > for example, if a picture is removed from the sequence then the sequence
> > > numbers above that record have to be updated. This introduces potential
> > > for error, but it is a possible solution. I don't want to implement it,
> > > obviously, if ORDER BY RAND() is slated for improvement.
> > >
> > > Thanks for any ideas or insights...
> > >
> > > -Neil Gunton
> > >
> > > --
> > > MySQL General Mailing List
> > > For list archives: http://lists.mysql.com/mysql
> > > To unsubscribe:
> > http://lists.mysql.com/[EMAIL PROTECTED]
> > >
> >
> > --
> > MySQL General Mailing List
> > For list archives: http://lists.mysql.com/mysql
> > To unsubscribe:
> > http://lists.mysql.com/[EMAIL PROTECTED]
>
>
>

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to