On Wed, Jul 16, 2014 at 9:11 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Simon Riggs <si...@2ndquadrant.com> writes:
> > On 15 July 2014 12:58, David Rowley <dgrowle...@gmail.com> wrote:
> >> I found that the call to is_NOTANY_compatible_with_antijoin adds about
> >> 0.2% and 2.3% to total planning time. Though the 2.3% was quite an
> >> extreme case, and the 0.2% was the most simple case I could think of.
>
> > Is there a way we can only run this extra test when we have reasonable
> > idea that there is potential to save significant costs?
>
> Well, all of this cost occurs only when we find a NOT IN clause in a place
> where we could conceivably turn it into an antijoin. I think it's
> unquestionably a win to make that transformation if possible. My concern
> about it is mainly that the whole thing is a band-aid for naively written
> queries, and it seems likely to me that naively written queries aren't
> going to be amenable to making the proof we need. But we can't tell that
> for sure without doing exactly the work the patch does.
>

I do think Simon has a good point; maybe it's not something for this patch,
but perhaps for other planner patches that optimise queries which could have
been executed more efficiently had they been written another way.

Since planning time was added to EXPLAIN ANALYZE, I have noticed that for
many very simple queries the planning time is quite often longer than the
execution time, so I really do understand the concern about slowing the
planner down for these super fast queries. But on the other hand, if the
extra 1-2 microseconds that the NOT IN optimisation costs were frowned upon
and the patch rejected on that basis, then at the other end of the scale it
doesn't seem impossible for those extra 2 microseconds to translate into
shaving a few hours off the execution time of a query in an OLAP type
workload. If that were the case, then not having spent the 2 extra
microseconds would seem quite funny, but there's no real way to tell, I
guess, unless we invented something that skips the known costly
optimisations on the first pass and then re-plans the whole query with the
planning strength knob turned to the maximum if the first plan's cost
exceeds some threshold N. Of course, such an idea would make bad plans even
harder to debug, so it's far from perfect, but I'm not seeing other options
for the best of both worlds.

Regards

David Rowley
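
P.S. For anyone following along who wonders what "the proof we need" is: as
I understand it, NOT IN can only behave like an antijoin when the planner
can prove the subquery's output contains no NULLs, since a single NULL makes
the whole NOT IN predicate return no rows. A minimal sketch of the
difference, using hypothetical tables customers and orders (not from any
real test case):

    -- Hypothetical tables, for illustration only.
    CREATE TABLE customers (id int NOT NULL PRIMARY KEY);
    CREATE TABLE orders (customer_id int);  -- deliberately nullable

    -- customer_id is nullable, so the planner cannot prove the subquery
    -- is NULL-free; this NOT IN cannot safely become an antijoin, because
    -- one NULL in orders.customer_id makes the predicate return no rows:
    SELECT c.id
    FROM customers c
    WHERE c.id NOT IN (SELECT o.customer_id FROM orders o);

    -- The equivalent NOT EXISTS has no NULL hazard and already plans
    -- as an anti join:
    SELECT c.id
    FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

Declaring orders.customer_id NOT NULL is the sort of thing that would let
is_NOTANY_compatible_with_antijoin prove the first form safe to transform.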