It's slow because there's no way around reading through the entire
input. The optimization tmp is talking about wouldn't be relevant
because there is an ORDER BY clause, which is precisely why I said it
was a fairly narrow use case. Most people who use LIMIT want a
specific subset, even if that specific subset is random. Without the
ORDER BY the subset is entirely arbitrary but not usefully random.
Incidentally, "ORDER BY ... LIMIT" is amenable to an optimization
which avoids having to *sort* the whole input, even though it still
has to read the whole input. We implemented that in 8.3.
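The idea behind that optimization can be sketched in Python with a
bounded heap (illustrative only, not the actual executor code): keep
just the top N rows while scanning, so the cost is O(N log n) rather
than a full O(N log N) sort.

```python
import heapq
import random

def top_n(rows, n, key=lambda r: r):
    # Scan the whole input once, but keep only the n smallest rows
    # in a bounded heap -- equivalent to ORDER BY ... LIMIT n
    # without sorting everything.
    return heapq.nsmallest(n, rows, key=key)

rows = list(range(100000))
random.shuffle(rows)
print(top_n(rows, 5))  # the 5 smallest values, in order
```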
greg
On 6 Dec 2008, at 06:08 PM, Grzegorz Jaskiewicz <[EMAIL PROTECTED]>
wrote:
On 2008-12-06, at 11:29, David Lee Lambert wrote:
I use "ORDER BY random() LIMIT :some_small_number" frequently to get
a "feel" for data. That always builds the unrandomized relation and
then sorts it. I guess an alternate path for single-table queries
would be to randomly choose a block number and then a tuple number;
but that would be biased toward long rows (of which fewer can appear
in a block).
but that's going to be extremely slow, due to the speed of the
random() function.
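The bias David describes can be shown with a toy simulation (the
block layout here is hypothetical, not PostgreSQL's actual page
format): rows that pack fewer per block get picked far more often
than their share of the table.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical layout: 50 blocks of 10 short rows each (500 rows)
# and 50 blocks of 2 long rows each (100 rows).
blocks = [["short"] * 10 for _ in range(50)] + \
         [["long"] * 2 for _ in range(50)]

def naive_sample(k):
    # Pick a uniformly random block, then a uniformly random tuple
    # within it -- the scheme being discussed.
    return [random.choice(random.choice(blocks)) for _ in range(k)]

counts = Counter(naive_sample(20000))
# Long rows are only 100 of the 600 tuples (~17%), yet they occupy
# half the blocks, so they draw roughly half the samples.
print(counts)
```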
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers