Bryce Cutt <pandas...@gmail.com> writes:
> Here is the new patch.

Applied with revisions. I undid some of the "optimizations" that cluttered the code in order to save a cycle or two per tuple; as per previous discussion, that's not what the performance questions were about.

Also, I did not like the terminology "in-memory"/"IM"; it seemed confusing since the main hash table is in-memory too. I revised the code to consistently refer to the additional hash table as a "skew" hashtable and to the optimization in general as the skew optimization. Hope that seems reasonable to you; we could search-and-replace it to something else if you'd prefer.
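
To make the terminology concrete, here is a toy sketch of the skew idea in Python. It is not the executor code from the patch: the function name skew_hash_join, its parameters, and the batching scheme are all made up for illustration. It assumes the outer relation's most-common values (MCVs) are passed in directly rather than read from planner statistics, it ignores any memory limit on the skew hashtable, and it uses in-memory lists where a real multi-batch hash join would use temp files. Inner tuples whose join key is one of the MCVs go into a separate skew hashtable; outer tuples with those keys are joined right away instead of being deferred to a later batch.

```python
from collections import defaultdict


def skew_hash_join(inner, outer, outer_mcvs, num_batches=4):
    """Toy multi-batch hash join with a separate "skew" hashtable.

    inner, outer: lists of (key, payload) tuples.
    outer_mcvs:   keys assumed to be the outer relation's most-common values
                  (hypothetical input; a real planner would get these from stats).
    Returns (joined_rows, short_circuited), where short_circuited counts the
    outer tuples that were handled via the skew hashtable.
    """
    mcvs = set(outer_mcvs)

    # Build phase: inner tuples matching an outer MCV go to the skew hashtable,
    # everything else is partitioned into ordinary batches by hash value.
    skew_table = defaultdict(list)
    inner_batches = [defaultdict(list) for _ in range(num_batches)]
    for key, payload in inner:
        if key in mcvs:
            skew_table[key].append(payload)
        else:
            inner_batches[hash(key) % num_batches][key].append(payload)

    # Probe phase, first pass: outer tuples with an MCV key are joined
    # immediately (the short-circuit path); the rest are deferred to a batch.
    joined = []
    short_circuited = 0
    outer_batches = [[] for _ in range(num_batches)]
    for key, payload in outer:
        if key in mcvs:
            short_circuited += 1
            for inner_payload in skew_table.get(key, ()):
                joined.append((key, inner_payload, payload))
        else:
            outer_batches[hash(key) % num_batches].append((key, payload))

    # Remaining passes: join each deferred outer batch against its inner batch.
    for batch_no in range(num_batches):
        table = inner_batches[batch_no]
        for key, payload in outer_batches[batch_no]:
            for inner_payload in table.get(key, ()):
                joined.append((key, inner_payload, payload))

    return joined, short_circuited


# Example: key 1 dominates the outer relation, so it is treated as an MCV.
inner_rel = [(1, "a"), (2, "b"), (3, "c")]
outer_rel = [(1, "x"), (1, "y"), (1, "z"), (2, "p"), (4, "q")]
rows, fast = skew_hash_join(inner_rel, outer_rel, outer_mcvs=[1])
print(fast / len(outer_rel))  # fraction of outer tuples short-circuited: 0.6
```

In this toy run, three of the five outer tuples take the short-circuit path. The sketch counts that fraction after the fact; it does not attempt the planner-side estimate.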

For the moment, I didn't really do anything about teaching the planner to account for this optimization in its cost estimates. The initial estimate of the number of MCVs that will be specially treated seems to me to be too high (it's only accurate if the inner relation is unique), but getting a more accurate estimate seems pretty hard, and it's not clear it's worth the trouble. Without that, though, you can't tell what fraction of outer tuples will get the short-circuit treatment.

			regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers