I wrote: > I did some hacking on this and came up with the attached patch, which > could use a bit more work on the comments but passes regression tests. > However, this just solves the issue of being smart about top-level > UNION ALL cases. It might be worth looking into using MergeAppend > for the sorting required for other types of set operations. That would > involve quite a different patch, and I'm not sure if it'd remove the > need for this one or not.
I spent some more time thinking about this. It seems difficult to make any real further progress without essentially throwing out the current approach to planning set-operation trees and starting over. What we'd want is to generate Paths representing the different ways a set-op could be implemented and then pick the best one. I can see the general shape of how to do that: a set-op should be thought of as generating a constrained join relation in a fashion similar to an OUTER JOIN construct, and then the different Paths are possible implementations for the join rel. We'd need new Path types for hashed or sorted UNION, INTERSECT, EXCEPT, as well as some analog of the SpecialJoinInfo struct (or extend that struct to cover these cases). It strikes me that maybe top-level DISTINCT could be married into this too, since it's basically the same as a one-input UNION operator. But not sure what to do with DISTINCT ON. Another thought here is that the current definition of the SetOp execution node type could be improved. Instead of taking a single pre-merged input tuple stream, it'd be better if it had two inputs like a JOIN node. Then we could eliminate the "flag" column, which would simplify matters for planning and reduce the amount of data that has to be passed through sorting. SetOp itself would need to borrow some of the capability of MergeAppend so that it could read two sorted streams in parallel and keep them in sync. But this all looks like a pretty substantial amount of work, and given the low level of user demand for improving the performance of set operations, it seems to belong fairly far down the to-do list. So I'm not going to tackle it now. Barring objection, I'll clean up yesterday's patch a bit more and commit it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers