On Sat, 2007-04-07 at 14:11 -0400, Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > I did some performance testing of the patch, and the results were good. > > I did this: > > > test=> CREATE TABLE test (x INTEGER); > > test=> INSERT INTO test SELECT * FROM generate_series(1, 1000000); > > test=> SET log_min_duration_statement = 0; > > test=> SELECT * FROM test ORDER BY x LIMIT 3; > > LIMIT 3 seems an awfully favorable case; if the patch can only manage a > factor of 4 speedup there, what happens at limit 10, 20, 100? Also, > you've tested only one sort size and only one (unspecified) value of > work_mem, and the usefulness of the patch would surely vary depending on > that. In particular, what happens with a LIMIT large enough to overflow > work_mem?
Yeh, this is really designed to improve the case where we retrieve a "screenfull" of data. i.e. 25, 50 or 100 records. Or worst case 10 screenfulls. The code deliberately doesn't use an insertion sort for that reason, since that is beyond the cut-off where that works best. So it should be optimised for medium numbers of rows when no index is present. The use case is important because we want to be able to populate data for screens in a reasonably bounded time, not one that gets suddenly worse should the number of possible matches exceed work_mem. [Think how well Google reacts to varying numbers of candidate matches] Whatever happens with LIMIT > work_mem doesn't fit the use case, so as long as it is no slower than what we have now, that should be fine. > Lastly, I suspect that sorting presorted input might be particularly > favorable for this patch. Please try it with random data for comparison. Agreed. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match