On 9 December 2015 at 15:14, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Dec 7, 2015 at 8:26 PM, David Rowley
> <david.row...@2ndquadrant.com> wrote:
> > The biggest frustration for me is that the way Tom always seems to
> > argue his point it's as if planning time is roughly the same or more
> > expensive than execution time, and likely that's true in many cases,
> > but I would imagine in more normal cases that execution time is
> > longer, although I've never had the guts to stand up and argue this
> > as I don't have any stats to back me up.
>
> I think this is missing the point.  It is possible to have a query
> planner optimization that is so expensive that it loses even when it
> improves the plan, but I don't think this optimization falls into that
> category.  This particular change would not have been requested as
> many times as it has been if people didn't keep rediscovering that, on
> a certain class of queries, it works *really* well.  The problem that
> I understand Tom to be concerned about is the queries where the
> optimization consumes additional planning time without delivering a
> better plan - and those where, without careful thought, it might even
> deliver a worse plan.

My point was more of a response to the general condition that we cannot
add any planner overhead unless the extra CPU cycles spent are almost
certainly going to improve the plan.

> Now, I'm not sure Tom is right about that, but he might be.  The class
> of queries we're talking about here are the ones where somebody
> constrains a column that is part of an equivalence class with multiple
> members.  Perhaps the best example is a large join, let's say 10
> tables, where the join column is the same for all tables; thus, we
> have an equivalence class with 10 members.  Suppose further we have an
> inequality condition applied to one member of that equivalence class.
>
> Currently, we'll apply that inequality to the table that the user
> actually mentioned and ignore all the others; in theory, it could be
> right to enforce that inequality condition on any nonempty subset of
> the 10 tables we've got.  It might be right to pick exactly one of
> those tables and enforce the inequality there, or it might be right to
> enforce it on some of them, or it might be right to enforce it on all
> of them.  The reason why the answer isn't automatically "all of them"
> is because, first of all, it's possible that enforcing the condition
> at a particular table costs more to filter out the rows than we save
> in execution time at higher levels of the plan tree.  For example,
> consider A JOIN B ON A.X = B.X WHERE A.X > 1000000.  It might be that
> the range of A.X is [0,1000001] but the range of B.X is
> [1000000,2000000]; so enforcing the inequality against A is very
> selective but enforcing it against B filters out basically nothing.

Hmm, well I think it's more than likely that my view on this has been
skewed by the fact that we already push equality quals down when the
eclass contains a Const, without any regard for whether that could
generate a worse plan or add needless CPU overhead for checking the
quals. If you rewrite your paragraph above with equality instead of
inequality, it still applies: A JOIN B ON A.X = B.X WHERE A.X = 1000000
will cause B.X = 1000000 to be pushed down into B, but what if all of
B's X values are 1000000? I've not seen anyone complain about us doing
that.
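Just to make that concrete, here is the sort of thing I mean. This is
only a throwaway sketch (the tables, data and constants are made up for
illustration, not a benchmark):

create table a (x int primary key);
create table b (x int not null);
insert into a select generate_series(1, 2000000);
insert into b select 1000000 from generate_series(1, 2000000);
analyze a;
analyze b;

-- the user only writes a.x = 1000000 ...
explain select * from a join b on a.x = b.x where a.x = 1000000;

-- ... but because the eclass {a.x, b.x} contains the Const 1000000,
-- the planner also derives b.x = 1000000 and enforces it when scanning
-- b. Here every row of b has x = 1000000, so the derived qual filters
-- nothing and we just pay to evaluate it two million times.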
This is the reason I limited the patch to only deal with members of a
btree opclass. I assumed that if there's some btree index helping with
the equality qual, then that same index may well also help with a qual
using the <, <=, > or >= operator, and I also thought that in most
cases these btree inequality operators will have about the same CPU
cost as equality. Would you be able to explain the difference between
the two cases? Or is it just a question of likelihood?

> Furthermore, there are some cases involving parameterized paths where
> enforcing the inequality multiple times is definitely bad: for
> example, if we've got a nested loop where the outer side is a seq scan
> that enforces the condition and the inner side is an index probe, it
> is just a waste to retest it on the inner side.  We already know that
> the outer row passes the inequality, so the inner row will necessarily
> pass also.  This doesn't apply to merge or hash joins, and it also
> doesn't apply to all nested loops: scans that aren't parameterized by
> the equivalence-class column can still benefit from separate
> enforcement of the inequality.

I guess that could be fixed by marking these pushed quals as optional
and having parameterized scans ignore optional quals. I have to admit
that I didn't give it much thought, as all of the cases that I tested
turned what used to be a nested loop with a parameterized inner index
scan into a merge join, which was faster in my test cases. Of course,
that does not mean I claim the merge join will be faster in all cases,
or that the plan will switch to a merge join in all cases.

> Considering the factors mentioned in the previous paragraph, it seems
> likely to me that a naive patch that propagates the inequalities to
> every relation where they could hypothetically be enforced will be a
> significant loss in some cases even just in execution cost, ignoring
> completely the effects on planning time.  And, on the flip side,
> figuring out what non-empty subset of relations forms the optimal set
> upon which to enforce the inequality seems like a complicated problem.
> A first cut might be to enforce the inequality against the relation
> where it's believed to be most selective, and to also enforce it in
> paths for other relations that aren't parameterized by the
> equivalence-class column mentioned in the inequality provided that the
> selectivity is thought to be above some threshold ... but I'm not sure
> this is very principled, and it's certainly not obvious what arbitrary
> threshold would be appropriate.  The threshold where multiple
> enforcement of the qual starts to pay off also depends on the cost of
> the qual: an expensive qual has to filter out more rows than a cheap
> qual to be worthwhile.  I'm not sure how to estimate all this, but I
> don't have trouble believing that if not done carefully it could
> either (a) cost a lot of planning time or (b) generate lousy plans.

Again, this is one of the reasons I limited it to btree operators only.
I don't quite know how to estimate this either, as the extra rows being
filtered out may mean that a sort no longer spills to disk, or that a
hash join no longer needs multiple batches. I'd imagine the filtering
would be extra worthwhile in such cases, so I doubt setting the
threshold to some constant value is the correct approach, and most
likely costing each path both with and without the pushed qual(s) would
be far too expensive in planning time.
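To spell out the kind of case any threshold rule would need to catch,
here is your range example as actual tables (again, a made-up sketch
rather than a benchmark):

create table a (x int);
create table b (x int);
insert into a select generate_series(0, 1000001);        -- a.x spans [0,1000001]
insert into b select generate_series(1000000, 2000000);  -- b.x spans [1000000,2000000]
analyze a;
analyze b;

-- what the user writes:
select * from a join b on a.x = b.x where a.x > 1000000;

-- what the patch would consider generating, which is the same as
-- writing the derived qual by hand:
select * from a join b on a.x = b.x where a.x > 1000000 and b.x > 1000000;

-- a.x > 1000000 keeps just one row of a, so it's very selective, but
-- the derived b.x > 1000000 removes only a single row of b, so
-- enforcing it there is roughly a million wasted comparisons.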
> Now, all that having been said, I think this is a pretty important
> optimization.  Lots of people have asked for it, and I think it would
> be worth expending some brainpower to try to figure out a way to be
> smarter than we are now, which is, in a nutshell, as dumb as possible.
> You're completely right that, on an OLAP query that's going to run for
> a minute, a few extra milliseconds of planning time could be totally
> OK even if they don't yield any benefit on most queries.  Maybe we can
> get the cost of this feature down to the point where we can turn it on
> for everybody and have that be OK.  But if not, having something we
> can turn on for queries that are going to run for a long time, or that
> we estimate are going to run for a long time, would, I think, be
> useful.  As things stand, planning time can consume a very significant
> percentage of total runtime on OLTP queries, and if you are processing
> 300k queries/second and 10% of that is planning time, and somebody
> boosts planning time by 1%, that's a 0.1% hit overall and you just
> lost several hundred queries per second.  That's not nothing,
> especially if we do 3 or 4 such "optimizations" per release cycle.
> But if the optimization can be made nearly free in cases where it
> doesn't apply, that's a different story, and of course OLAP is a whole
> different ball game.

Well, I agree with that. I've got no interest in slowing anything down.
I've been busy working with big databases for quite a while now, so
perhaps my point of view still comes from that direction.

So anyway, it's quite clear that we don't want the patch in its current
form, and I'm not volunteering any more time at this stage to improve
it, so for this reason I won't add it to the January commitfest.

Thanks for the feedback.

David

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services