On Thu, 23 Jan 2020 at 00:00, Tom Lane <t...@sss.pgh.pa.us> wrote: > The problem is that the planner's initial analysis of the query tree > concludes that the scan of "tab" has to return "x", because it looks > through the tree for plain Vars, and "x" is what it's going to find. > This conclusion is in fact true for any plan that doesn't involve > scanning a suitable index on f(x), so it's not something we can just > dispense with.
If f(x) was actually added to the table as a virtual generated column V then the planner could perhaps reasonably find that a particular expression depends in part on V (rather than the set of real columns underlying the virtual generated column). That is, the planner now knows that it needs to return V = "f(x)" rather than "x" and so it can match the requirement to our index and decide to use an index-only scan. > To back up from that and conclude that the indexscan doesn't really > need to return "x" because every use of it is in the context "f(x)" > seems like a pretty expensive proposition, especially once you start > considering index expressions that are more complex than a single > function call. A brute-force matching search could easily be O(N^2) > or worse in the size of the query tree, which is a lot to pay when > there's a pretty high chance of failure. (IOW, we have to expect > that most queries on "tab" are in fact not going to be able to use > that index, so we can't afford to spend a lot of planning time just > because the index exists.) In the approach outlined above, the set of expressions to match on would be limited by the set of virtual generated columns defined on the table — some sort of prefix tree that allows efficient matching on a given expression syntax tree? There would be no cost to this on a table that had no virtual generated columns. The question is whether this is a reasonable take on virtual generated columns. --- regards