On Wed, Sep 04, 2019 at 09:17:10PM +0200, Tomas Vondra wrote:
On Wed, Sep 04, 2019 at 11:37:48AM +0200, Rafia Sabih wrote:
On Tue, 30 Jul 2019 at 02:17, Tomas Vondra <tomas.von...@2ndquadrant.com>
wrote:

On Sun, Jul 21, 2019 at 01:34:22PM +0200, Tomas Vondra wrote:

...

I wonder if we're approaching this wrong. Maybe we should not reverse
engineer queries for the various places, but just start with a set of
queries that we want to optimize, and then identify which places in the
planner need to be modified.


I've decided to do a couple of experiments, trying to make my mind about
which modified places matter to diffrent queries. But instead of trying
to reverse engineer the queries, I've taken a different approach - I've
compiled a list of queries that I think are sensible and relevant, and
then planned them with incremental sort enabled in different places.

I don't have any clear conclusions at this point - it does show some of
the places don't change plan for any of the queries, although there may
be some additional query where it'd make a difference.

But I'm posting this mostly because it might be useful. I've initially
planned to move changes that add incremental sort paths to separate
patches, and then apply/skip different subsets of those patches. But
then I realized there's a better way to do this - I've added a bunch of
GUCs, one for each such place. This allows doing this testing without
having to rebuild repeatedly.

I'm not going to post the patch(es) with extra GUCs here, because it'd
just confuse the patch tester, but it's available here:

 https://github.com/tvondra/postgres/tree/incremental-sort-20190730

There are 10 GUCs, one for each place in planner where incremental sort
paths are constructed. By default all those are set to 'false' so no
incremental sort paths are built. If you do

 SET devel_create_ordered_paths = on;

it'll start creating the paths in non-parallel in create_ordered_paths.
Then you may enable devel_create_ordered_paths_parallel to also consider
parallel paths, etc.

The list of queries (synthetic, but hopefully sufficiently realistic)
and a couple of scripts to collect the plans is in this repository:

 https://github.com/tvondra/incremental-sort-tests-2

There's also a spreadsheet with a summary of results, with a visual
representation of which GUCs affect which queries.

Wow, that sounds like an elaborate experiment. But where is this
spreadsheet you mentioned ?


It seems I forgot to push the commit containing the spreadsheet with
results. I'll fix that tomorrow.


OK, I've pushed the commit with the spreadsheet. The single sheet lists
the synthetic queries, and hashes of plans with different flags enables
(parallel query, force incremental sort, and the new developer GUCs
mentioned before). Only a single developer flag is set to true (or none
of them).

The columns at the end simply say whether the plan differs from the plan
generated by master (no patches). TRUE means "same as master" while
FALSE means "different plan.

The "patched" column means all developer GUCs disabled, so it's expected
to produce the same plan as master (and it is). And then there's one
column for each developer GUC. If the column is just TRUE it means the
GUC does not affect any of the synthetic queries. There are 4 of them:

- devel_add_paths_to_grouping_rel_parallel
- devel_create_partial_grouping_paths
- devel_gather_grouping_paths
- devel_standard_join_search

The places controlled by those GUCs are either useless, or the query
affected by them is not included in the list of queries.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply via email to