Benoit Mortgat <mort...@gmail.com> wrote: > I have a table with millions of records. > > When I run a query with > > ORDER BY random() LIMIT N; > > the RANDOM() function is evaluated against all rows of my result set, > then sorting occurs, and as a result the query is slow. > > In this case the query could be rewritten as: > > * Generate N, random, row numbers between 1 and (SELECT COUNT(*) FROM > the_table_name). Maybe using remainder operator % and builtin ABS() > and RANDOM() functions can help (see below) > * SELECT FROM the_table WHERE rowid IN (those random numbers)
That is not equivalent. First, duplicate random numbers may be generated. Second, rowid values are not necessarily sequential. Thus, you may end up with fewer than N rows. You can implement this algorithm in your application's code, if you are so inclined. Just prepare a query lile select * from the_table where rowid=?; then run it N times, binding different random numbers to the parameter. Of course you still need to deal with the two issues I mentioned above (perhaps you know that in your particular application, rowids are always sequential). > This can however return less than N rows if by chance two random numbers > have the same remainder modulo COUNT(*) FROM my_table. Precisely. -- Igor Tandetnik _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users