Daniele Varrazzo <daniele.varra...@gmail.com> writes: > the attached patch improves the array selectivity estimation for = ANY > and <> ALL, hence for the IN/NOT IN operators, to avoid the > shortcoming described in > <http://archives.postgresql.org/pgsql-performance/2012-03/msg00006.php>.
In connection with Alexander Korotkov's array-estimation patch, I just committed some code into scalararraysel() that checks whether the operator is equality or inequality of the array element type. It does that by consulting the default btree or hash opclass for the element type. I did that with the thought that it could be used to attack this issue too, but I see that you've done it another way, ie check to see whether the operator uses eqsel() or neqsel() as selectivity estimator. I'm not sure offhand which way is better. It could be argued that yours is more appropriate because if the operator isn't btree equality, but acts enough like it to use eqsel() as estimator, then it's still appropriate for scalararraysel() to treat it as equality. On the other side of the coin, an operator might be equality but have reason to use some operator-specific estimator rather than eqsel(). We have real examples of the former (such as the approximate-equality geometric operators) but I think the latter case is just hypothetical. Another thing that has to be thought about is that there are numerous cross-type operators that use eqsel, such as date-vs-timestamp, and it's far from clear whether it's appropriate for scalararraysel() to use the modified stats calculation when dealing with one of these. The btree-based logic is impervious to that since it won't match any cross-type operator. Thoughts? (BTW, in any case I don't trust function pointer comparison to be portable. It'd be a lot safer to look at the function OID.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers