"Atul Deopujari" <[EMAIL PROTECTED]> writes: > Yes, letting the planner make its own decision would seem best (in > accordance with what we do for different join paths). But for large IN > lists, a substantial part of the planner is spent in estimating the > selectivity of the ScalarArrayExpr by calling scalararraysel. If we are > not eliminating this step in processing the IN list then we are not > doing any optimization. Asking the planner to do scalararraysel and also > compute cost of any other way and choose between the two is asking > planner to do more work.
So?  Better planning usually involves more work.  In any case the above
argument seems irrelevant, because making scalararraysel more
approximate and less expensive for long lists could be done
independently of anything else.

> Factors such as size of table, availability of index etc. would affect
> both the ways similarly. So, if we see a gain in the execution of the IN
> list due to an external factor then we will also see a similar gain in
> the execution of the transformed IN (VALUES(...)) clause.

Incorrect.  There is more than one way to do a join, and the above
argument only applies if the VALUES case is planned as a nestloop with
inner indexscan, which indeed is isomorphic to the scalararrayop
implementation ... except that it has higher per-tuple overhead, and
therefore will consistently lose, disregarding artifacts of planning
costs such as how hard we try to estimate the result size.  The case
where VALUES is actually a better plan is where the planner switches to
merge or hash join because there are too many values.  In the current
implementation, the planner is incapable of generating those plan shapes
from a scalararrayop, and that's what I'd like to see fixed.

regards, tom lane
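
The trade-off described above can be sketched with a toy cost model. This is only an illustration under stated assumptions: every constant and function name below is hypothetical, and none of it reflects PostgreSQL's actual cost parameters or planner code. It shows the two points of the argument: a nestloop over VALUES with an inner indexscan does the same probes as the scalararrayop plus some assumed per-tuple overhead, so it never wins, while a hash join becomes cheaper once the list is long enough.

```python
import math

# Hypothetical constants for illustration only; NOT PostgreSQL's cost parameters.
PROBE_COST_PER_LEVEL = 1.0   # cost of descending one index level
NESTLOOP_OVERHEAD = 1.2      # assumed extra per-tuple overhead of the join form
ROW_SCAN_COST = 0.01         # cost of reading one table row sequentially
HASH_BUILD_COST = 0.02       # cost of inserting one list value into a hash table
HASH_PROBE_COST = 0.005      # cost of probing the hash table for one table row


def index_depth(table_rows):
    """Rough index depth: logarithmic in the table size, at least one level."""
    return max(1.0, math.log2(table_rows))


def scalararrayop_cost(n_values, table_rows):
    """One index descent per list value."""
    return n_values * index_depth(table_rows) * PROBE_COST_PER_LEVEL


def nestloop_values_cost(n_values, table_rows):
    """Same probes as the scalararrayop, plus assumed per-tuple join overhead."""
    return NESTLOOP_OVERHEAD * scalararrayop_cost(n_values, table_rows)


def hash_join_cost(n_values, table_rows):
    """Hash the list values once, then scan the table and probe the hash."""
    build = n_values * HASH_BUILD_COST
    scan = table_rows * (ROW_SCAN_COST + HASH_PROBE_COST)
    return build + scan


def cheapest_plan(n_values, table_rows):
    """Return the name of the lowest-cost plan shape under this toy model."""
    costs = {
        "scalararrayop": scalararrayop_cost(n_values, table_rows),
        "nestloop over VALUES": nestloop_values_cost(n_values, table_rows),
        "hash join": hash_join_cost(n_values, table_rows),
    }
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, cheapest_plan(n, 1_000_000))
```

Under these made-up numbers the crossover point depends entirely on the chosen constants; the sketch is meant to show only that such a crossover exists, which is why the planner needs to be able to consider the join plan shapes at all.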