Bruno Wolff III <[EMAIL PROTECTED]> writes: > Specifically the interpretation I think makes sense is that > SELECT DISTINCT ON (a, b, c) * FROM tablename ORDER BY d, e, f > should be treated as equvialent to > SELECT * FROM > (SELECT DISTINCT ON (a, b, c) FROM tablename ORDER BY a, b, c, d, e, f) AS t > ORDER BY d, e, f
Right, that's the same thing Neil said elsewhere in the thread. I guess I agree that this would be cleaner, and also that it doesn't seem like a high priority problem. BTW, you could imagine implementing DISTINCT and DISTINCT ON by a hash table, if you didn't expect too many distinct rows. The hash key would be the DISTINCT columns, and in the DISTINCT ON case, when you get a new row with DISTINCT columns matching an existing entry, you compare the two rows using the ORDER BY key columns to decide which row to keep. In this formulation it's fairly obvious that the DISTINCT and ORDER BY keys don't need to overlap at all. This technique does not require presorted input (good) but also delivers unsorted output (bad). It could still be a big win if you expect the DISTINCT filtering to reduce the number of rows substantially, because you'd end up sorting a much smaller number of rows. I didn't get around to trying to implement this when I was doing hash aggregation for 7.4, but it seems like a reasonable item for the TODO list. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])