> On 30 Dec 2015, at 10:16, David Rowley <david.row...@2ndquadrant.com> wrote:
>
> Hi,
>
> In [1] I suggested an idea to make improvements to the planner around the
> Equivalence Class code. Later, in [2], Tom raised concerns that this adds
> too many planning cycles for a situation that is perhaps not common enough.
> I don't want to discuss that particular patch here; I want to discuss more
> generally the dilemma of adding more smarts to the planner to allow it to
> generate a more optimal plan in order to save on execution time.
>
> In the case of the Equivalence Class Filters code, I quoted an example where
> pushing these filters down into the joined relation caused a significant
> performance improvement to a query. Now, I understand Tom's concerns with
> slowing down the planner: in cases where the query is short-running, or the
> optimisations don't apply, we could cause the query overall (including
> planning time) to perform worse. Nobody wants that, but on the other hand,
> if spending 5-10 extra microseconds during planning equates to 6 hours
> shaved off execution time, then nobody would begrudge that extra 5-10
> microseconds during planning.
>
> What I'd like to discuss here is what was touched on in that other thread:
> ways to get around this problem.
>
> A number of ideas were suggested on the other thread about how we might go
> about solving this problem. In [3] Simon talked about perhaps enabling extra
> optimisations when the planner sees that the plan will cost more than some
> given threshold. That's perhaps an option, but it may not work well for
> optimisations which must take place very early in planning, for example [4].
> Another idea came from Evgeniy [5]; it was more of a request not to do it
> this way, but nevertheless the idea was basically to add lots of GUCs to
> enable/disable each extra planner feature.

Well, my idea was to track planning/execution cost in something like
pg_stat_statements. That way we can track actual time, not estimated cost as
Simon proposed. This table could be combined with Tomas's proposal of plan
caching.
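As a rough sketch of what that could look like (the planning-time columns
below are hypothetical; today pg_stat_statements records execution time
only):

    -- total_plan_time and total_exec_time are invented columns. Statements
    -- whose execution time dwarfs their planning time are the ones where
    -- spending extra planner cycles is most likely to pay off.
    SELECT query, calls, total_plan_time, total_exec_time
    FROM pg_stat_statements
    WHERE total_exec_time > 100 * total_plan_time
    ORDER BY total_exec_time DESC
    LIMIT 20;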
> Another option which I've thought about previously was a planner_strength
> GUC, with various additional optimisations enabled at various predefined
> strength levels, so that databases which tend to spend a great deal more
> execution time compared to planning time can turn this up to see if that
> helps change the ratio. This idea is far from perfect though, as who's to
> say that planner feature X should "kick in" before planner feature Y? I've
> also often thought that it might be nice if the planner did not modify the
> Parse object, so that it would have the option to throw away what it has
> done so far and start planning all over again with the "planner_strength"
> knob turned up to the maximum, if the cost happened to indicate that the
> query was going to take a long time to execute.
>
> In reality we already have some planner features which are possible
> candidates for non-essential optimisations. For example, join removals
> likely don't apply in all that many cases, but when they do, this feature
> is a great win. So by having some sort of ability to enable/disable planner
> features, we also stand to actually speed the planner up for fast, simple
> queries.
>
> I do strongly believe that we need to come up with something to solve this
> problem. I already summarised my thoughts on the other thread.
>
> I wrote:
> > I believe that with parallel query on the horizon for 9.6 we're now
> > aiming to support bigger OLAP-type databases than ever before. So if we
> > ignore patches like this one, then it appears that we have some
> > conflicting goals in the community, as it seems that we're willing to
> > add the brawn, but we're not willing to add the brain. If this is the
> > case then it's a shame, as I think we can have both.

So I very much agree that we must also find a way to maintain support for,
and the high performance of, small OLTP databases.

> So here I'd very much like to kick off discussion on an acceptable way to
> solve this problem, in a realistic way which we're all happy with.
>
> Comments are of course welcome.
>
> [1] http://www.postgresql.org/message-id/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com
> [2] http://www.postgresql.org/message-id/30810.1449335...@sss.pgh.pa.us
> [3] http://www.postgresql.org/message-id/canp8+jlrprn4ynmsrkoqhyi-dw5jrodmot05qejhrayrsex...@mail.gmail.com
> [4] http://www.postgresql.org/message-id/cakjs1f_uz_mxtpot6epxsghsujoucrkuxyhlh06h072rdxs...@mail.gmail.com
> [5] http://www.postgresql.org/message-id/2f30ba8b-dab9-4907-9e4e-102d24256...@gmail.com
>
> --
> David Rowley                   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
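P.S. For anyone who hasn't followed the other thread, the Equivalence Class
Filters case David describes looks roughly like this (a minimal sketch;
table and column names are invented):

    -- The join condition puts d.id and f.d_id into one equivalence class,
    -- so the filter "d.id < 10" implies "f.d_id < 10". Currently only the
    -- former is applied; the patch would let the planner also apply the
    -- derived filter to the scan of "fact", shrinking the join input.
    SELECT *
    FROM fact f
    JOIN dim d ON d.id = f.d_id
    WHERE d.id < 10;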
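P.P.S. And the join-removal case, for reference (again with invented table
names):

    CREATE TABLE dim  (id int PRIMARY KEY);
    CREATE TABLE fact (id int PRIMARY KEY, d_id int REFERENCES dim);

    -- dim.id is unique and no column of dim is referenced, so the LEFT
    -- JOIN cannot change the result and the planner removes it: the plan
    -- is a plain scan of "fact" that never touches "dim".
    EXPLAIN SELECT f.id FROM fact f LEFT JOIN dim d ON f.d_id = d.id;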