planner.c)

Ranier Vilela Sun, 20 Sep 2020 16:12:58 -0700

Em sex., 18 de set. de 2020 às 10:37, Tomas Vondra <
[email protected]> escreveu:


> On Thu, Sep 17, 2020 at 06:31:12PM -0300, Ranier Vilela wrote:
> >Em qui., 17 de set. de 2020 às 17:09, Tomas Vondra <
> >[email protected]> escreveu:
> >
> >> On Thu, Sep 17, 2020 at 02:12:12PM -0300, Ranier Vilela wrote:
> >> >Hi,
> >> >
> >> >In case gd->any_hashable is FALSE, grouping_is_hashable is never
> called.
> >> >In this case, the planner could use HASH for groupings, but will never
> >> know.
> >> >
> >>
> >> The condition is pretty simple - if the query has grouping sets, look at
> >> grouping sets, otherwise look at groupClause. For this to be an issue
> >> the query would need to have both grouping sets and (independent) group
> >> clause - which AFAIK is not possible.
> >>
> >Uh?
> >(parse->groupClause != NIL) If different from NIL we have ((independent)
> >group clause), grouping_is_hashable should check?
> >(gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause))))
> >If gd is not NIL and gd->any_hashtable is FALSE?
> >
>
> Sorry, I'm not quite sure I understand what this is meant to say :-(
>
> Anyway, (groupClause != NIL) does not mean the groupClause is somehow
> independent (of what?). Add some debugging to create_grouping_paths and
> you'll see that e.g. this query ends up with groupClause with 3 items:
>
>      select 1 from t group by grouping sets ((a), (b), (c));
>
> and so does this one:
>
>      select 1 from t group by grouping sets ((a,c), (a,b));
>
> I'm no expert in grouping sets, but I'd bet this means we transform the
> grouping sets into a groupClause by extracting the keys. I haven't
> investigated why exactly we do this, but I'm sure there's a reason (e.g.
> it gives us SortGroupClause).
>
> You seem to believe a query can have both grouping sets and regular
> grouping at the same level - but how would such query look like? Surely
> you can't have two GROuP BY clauses. You can do
>
Translating into code:
gd is grouping sets and
parse->groupClause is regular grouping
and we cannot have both at the same time.


>      select 1 from t group by a, grouping sets ((b), (c));
>
> which is however mostly equivalent to (AFAICS)
>
>      select 1 from t group by grouping sets ((a,b), (a,c))
>
> so it's not like we have an independent groupClause either I think.
>
> The whole point of the original condition is - if there are grouping
> sets, check if at least one can be executed using hashing (i.e. all keys
> are hashable). Otherwise (without grouping sets) look at the grouping as
> a whole.
>
Or if gd is NULL check  parse->groupClause.


> I don't see how your change improves this - if there are grouping sets,
> it's futile to look at the whole groupClause if at least one grouping
> set can't be hashed.
>
> But there's a simple way to disprove this - show us a case (table and a
> query) where your code correctly allows hashing while the current one
> does not.

I'm not an expert in grouping either.
The question I have here is whether gd is populated and has gd->
any_hashable as FALSE,
Its mean no use checking parse-> groupClause, it's a waste of time, Ok.


>
> >
> >> For hashing to be worth considering, at least one grouping set has to be
> >> hashable - otherwise it's pointless.
> >>
> >> Granted, you could have something like
> >>
> >>      GROUP BY GROUPING SETS ((a), (b)), c
> >>
> >> which I think essentially says "add c to every grouping set" and that
> >> will be covered by the any_hashable check.
> >>
> >I am not going into the merit of whether or not it is worth hashing.
> >grouping_is_hashable as a last resort, must decide.
> >
>
> I don't know what this is supposed to say either. The whole point of
> this check is to simply skip construction of hash-based paths in cases
> when it's obviously futile (i.e. when some of the keys don't support
> hashing). We do this as early as possible, because the whole point is to
> save precious CPU cycles during query planning.
>

> >
> >> >Apparently gd pointer, will never be NULL there, verified with
> Assert(gd
> >> !=
> >> >NULL).
> >> >
> >>
> >> Um, what? If you add the assert right before the if condition, you won't
> >> even be able to do initdb. It's pretty clear it'll crash for any query
> >> without grouping sets.
> >>
> >Here not:
> >Assert(gd != NULL);
> >create_ordinary_grouping_paths(root, input_rel, grouped_rel,
> >  agg_costs, gd, &extra,
> >  &partially_grouped_rel);
> >
>
> I have no idea where you're adding this assert. But simply adding it to
> create_grouping_paths (right before the if condition changed by your
> patch) will trigger a failure during initdb. Simply because for queries
> without grouping sets (and with regular grouping) we pass gs=NULL.
>
> Try applying the attached patch and do "pg_ctl -D ... init" - you'll get
> a failure proving that gd=NULL.
>
Well, I did a test right now, downloaded the latest HEAD and added the
Assertive.
I compiled everything, ran the regression tests, installed the binaries,
loaded the server and installed a client database.
Everything is working.
Maybe in all these steps, there is no grouping that would trigger the code
in question, but I doubt it.


>
> >Otherwise Coverity would be right.
> >CID 1412604 (#1 of 1): Dereference after null check (FORWARD_NULL)13.
> >var_deref_model: Passing null pointer gd to
> create_ordinary_grouping_paths,
> >which dereferences it. [show details
> ><
> https://scan6.coverity.com/eventId=31389494-14&modelId=31389494-0&fileInstanceId=105740687&filePath=%2Fdll%2Fpostgres%2Fsrc%2Fbackend%2Foptimizer%2Fplan%2Fplanner.c&fileStart=4053&fileEnd=4180
> >]
> >
> >
> >Which cannot be true, gd is never NULL here.
> >
>
> Or maybe coverity simply does not realize the NULL-dereference can't
> happen, because it's guarded by other conditions, or something like
> that ... As the assert failure demonstrates, we do indeed get NULLs
> here, and it does not crash so coverity is missing something.
>
1. With Assert.
All works.

2. create_ordinary_grouping_paths is called with gd var.

3. create_ordinary_grouping_paths call get_number_of_groups

4. get_number_of_groups  dereference gd pointer (line 3705).
foreach(lc, gd->rollups)

5. line (3701:
Assert(gd); /* keep Coverity happy */

Of course, this works, gd is never NULL here.

6. grouping_is_hashable is never called here, because gd is never NULL here.
if ((parse->groupClause != NIL &&
agg_costs->numOrderedAggs == 0 &&
(gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause))))
flags |= GROUPING_CAN_USE_HASH;

The question is is it worth it or not worth calling (grouping_is_hashable),
at that point?

regards,
Ranier Vilela

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:09:19.72

C:\dll\postgres\src\tools\msvc>vcregress check
Setting up temp install

Installing version 14 for release in C:/dll/postgres/tmp_install
Copying build output 
files......................................................................................................................................
Copying config files.....
Copying Import libraries...
Copying contrib data 
files...........................................................................................................................................................................................................................................
Copying Public headers.....
Copying Libpq headers..
Copying Libpq internal headers..
Copying Internal headers...
Copying Server headers...
Copying Grammar header.
.....................
Copying PL/pgSQL header.
81 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
141 arquivo(s) copiado(s)
35 arquivo(s) copiado(s)
33 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
56 arquivo(s) copiado(s)
10 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
3 arquivo(s) copiado(s)
13 arquivo(s) copiado(s)
14 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
27 arquivo(s) copiado(s)
22 arquivo(s) copiado(s)
4 arquivo(s) copiado(s)
40 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
11 arquivo(s) copiado(s)
5 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
48 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
55 arquivo(s) copiado(s)
8 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
88 arquivo(s) copiado(s)
Copying ECPG headers................
Copying ECPG informix headers...
Copying timezone names..........
Copying timezone sets...
Copying BKI files.
Copying SQL files..
Copying Information schema data.
Copying Error code data.
Generating timezone files...
Generating tsearch script............................
Copying Stopword files...............
Copying Dictionaries sample files.........
Copying PL Extension files..
Installation complete.
============== creating temporary instance            ==============
============== initializing database system           ==============
============== starting postmaster                    ==============
psql (14devel)
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
Type "help" for help.

postgres=# \q
running on port 58080 with PID 312
============== creating database "regression"         ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries        ==============
test tablespace                   ... ok         1163 ms
parallel group (20 tests):  boolean char name varchar text int2 int4 txid uuid 
bit money float4 oid int8 float8 pg_lsn enum regproc numeric rangetypes
     boolean                      ... ok         1043 ms
     char                         ... ok         1052 ms
     name                         ... ok         1016 ms
     varchar                      ... ok          999 ms
     text                         ... ok          997 ms
     int2                         ... ok          993 ms
     int4                         ... ok          976 ms
     int8                         ... ok          939 ms
     oid                          ... ok          900 ms
     float4                       ... ok          833 ms
     float8                       ... ok          766 ms
     bit                          ... ok          715 ms
     numeric                      ... ok         1072 ms
     txid                         ... ok          619 ms
     uuid                         ... ok          568 ms
     enum                         ... ok          567 ms
     money                        ... ok          421 ms
     rangetypes                   ... ok          922 ms
     pg_lsn                       ... ok          332 ms
     regproc                      ... ok          325 ms
parallel group (20 tests):  strings numerology point lseg line macaddr interval 
path circle timetz time date macaddr8 inet xid box timestamp tstypes 
timestamptz polygon
     strings                      ... ok         1125 ms
     numerology                   ... ok         1128 ms
     point                        ... ok         1152 ms
     lseg                         ... ok         1142 ms
     line                         ... ok         1108 ms
     box                          ... ok         1272 ms
     path                         ... ok         1012 ms
     polygon                      ... ok         1290 ms
     circle                       ... ok          926 ms
     date                         ... ok          884 ms
     time                         ... ok          812 ms
     timetz                       ... ok          760 ms
     timestamp                    ... ok          818 ms
     timestamptz                  ... ok          821 ms
     interval                     ... ok          492 ms
     inet                         ... ok          468 ms
     macaddr                      ... ok          360 ms
     macaddr8                     ... ok          329 ms
     tstypes                      ... ok          417 ms
     xid                          ... ok          292 ms
parallel group (10 tests):  horology geometry comments type_sanity unicode 
expressions misc_sanity oidjoins opr_sanity regex
     geometry                     ... ok          522 ms
     horology                     ... ok          446 ms
     regex                        ... ok         1248 ms
     oidjoins                     ... ok          811 ms
     type_sanity                  ... ok          557 ms
     opr_sanity                   ... ok         1091 ms
     misc_sanity                  ... ok          523 ms
     comments                     ... ok          320 ms
     expressions                  ... ok          335 ms
     unicode                      ... ok          270 ms
test create_function_1            ... ok          152 ms
test create_type                  ... ok          201 ms
test create_table                 ... ok         1169 ms
test create_function_2            ... ok          134 ms
parallel group (5 tests):  copyselect copydml insert_conflict copy insert
     copy                         ... ok         1231 ms
     copyselect                   ... ok          275 ms
     copydml                      ... ok          254 ms
     insert                       ... ok         1428 ms
     insert_conflict              ... ok          833 ms
parallel group (3 tests):  create_operator create_procedure create_misc
     create_misc                  ... ok          394 ms
     create_operator              ... ok          286 ms
     create_procedure             ... ok          273 ms
parallel group (5 tests):  index_including create_view create_index_spgist 
index_including_gist create_index
     create_index                 ... ok         2745 ms
     create_index_spgist          ... ok         1148 ms
     create_view                  ... ok          855 ms
     index_including              ... ok          750 ms
     index_including_gist         ... ok         1114 ms
parallel group (15 tests):  create_aggregate create_function_3 create_cast 
constraints roleattributes select drop_if_exists hash_func typed_table 
create_am errors updatable_views inherit triggers vacuum
     create_aggregate             ... ok          892 ms
     create_function_3            ... ok          991 ms
     create_cast                  ... ok         1126 ms
     constraints                  ... ok         1146 ms
     triggers                     ... ok         2743 ms
     select                       ... ok          968 ms
     inherit                      ... ok         2455 ms
     typed_table                  ... ok          904 ms
     vacuum                       ... ok         2580 ms
     drop_if_exists               ... ok          789 ms
     updatable_views              ... ok         2152 ms
     roleattributes               ... ok          545 ms
     create_am                    ... ok          722 ms
     hash_func                    ... ok          486 ms
     errors                       ... ok         1464 ms
test sanity_check                 ... ok         1253 ms
parallel group (20 tests):  select_into select_distinct select_distinct_on 
select_implicit delete select_having subselect union portals random namespace 
transactions case arrays prepared_xacts hash_index aggregates join update 
btree_index
     select_into                  ... ok         1369 ms
     select_distinct              ... ok         1393 ms
     select_distinct_on           ... ok         1437 ms
     select_implicit              ... ok         1634 ms
     select_having                ... ok         1632 ms
     subselect                    ... ok         1572 ms
     union                        ... ok         1525 ms
     case                         ... ok         1582 ms
     join                         ... ok         2472 ms
     aggregates                   ... ok         2219 ms
     transactions                 ... ok         1311 ms
     random                       ... ok         1204 ms
     portals                      ... ok         1029 ms
     arrays                       ... ok         1201 ms
     btree_index                  ... ok         2654 ms
     hash_index                   ... ok         1189 ms
     update                       ... ok         2238 ms
     delete                       ... ok          585 ms
     namespace                    ... ok          690 ms
     prepared_xacts               ... ok          795 ms
parallel group (20 tests):  init_privs security_label lock replica_identity 
password drop_operator object_address tablesample spgist collate brin 
groupingsets identity matview generated rowsecurity gin gist privileges 
join_hash
     brin                         ... ok         2506 ms
     gin                          ... ok         3715 ms
     gist                         ... ok         3809 ms
     spgist                       ... ok         2212 ms
     privileges                   ... ok         4614 ms
     init_privs                   ... ok         1145 ms
     security_label               ... ok         1295 ms
     collate                      ... ok         2021 ms
     matview                      ... ok         2917 ms
     lock                         ... ok         1612 ms
     replica_identity             ... ok         1672 ms
     rowsecurity                  ... ok         2948 ms
     object_address               ... ok         1678 ms
     tablesample                  ... ok         1640 ms
     groupingsets                 ... ok         1940 ms
     drop_operator                ... ok         1468 ms
     password                     ... ok         1311 ms
     identity                     ... ok         2164 ms
     generated                    ... ok         2263 ms
     join_hash                    ... ok         7833 ms
parallel group (13 tests):  alter_operator async alter_generic dbsize tid 
collate.icu.utf8 misc tsrf tidscan incremental_sort misc_functions 
create_table_like sysviews
     create_table_like            ... ok         1269 ms
     alter_generic                ... ok          648 ms
     alter_operator               ... ok          505 ms
     misc                         ... ok          765 ms
     async                        ... ok          514 ms
     dbsize                       ... ok          666 ms
     misc_functions               ... ok         1020 ms
     sysviews                     ... ok         1172 ms
     tsrf                         ... ok          556 ms
     tid                          ... ok          516 ms
     tidscan                      ... ok          537 ms
     collate.icu.utf8             ... ok          364 ms
     incremental_sort             ... ok          618 ms
parallel group (6 tests):  psql_crosstab amutils collate.linux.utf8 psql rules 
stats_ext
     rules                        ... ok         1250 ms
     psql                         ... ok          938 ms
     psql_crosstab                ... ok          292 ms
     amutils                      ... ok          286 ms
     stats_ext                    ... ok         1963 ms
     collate.linux.utf8           ... ok          203 ms
test select_parallel              ... ok         7115 ms
test write_parallel               ... ok          539 ms
parallel group (2 tests):  subscription publication
     publication                  ... ok          382 ms
     subscription                 ... ok          211 ms
parallel group (17 tests):  select_views portals_p2 dependency advisory_lock 
guc xmlmap window bitmapops tsdicts functional_deps combocid cluster equivclass 
tsearch foreign_data foreign_key indirect_toast
     select_views                 ... ok         1121 ms
     portals_p2                   ... ok         1300 ms
     foreign_key                  ... ok         3269 ms
     cluster                      ... ok         1722 ms
     dependency                   ... ok         1447 ms
     guc                          ... ok         1498 ms
     bitmapops                    ... ok         1454 ms
     combocid                     ... ok         1371 ms
     tsearch                      ... ok         1435 ms
     tsdicts                      ... ok         1196 ms
     foreign_data                 ... ok         1844 ms
     window                       ... ok         1058 ms
     xmlmap                       ... ok          935 ms
     functional_deps              ... ok          871 ms
     advisory_lock                ... ok          784 ms
     indirect_toast               ... ok         2565 ms
     equivclass                   ... ok          734 ms
parallel group (6 tests):  json_encoding jsonpath_encoding jsonpath json 
jsonb_jsonpath jsonb
     json                         ... ok          476 ms
     jsonb                        ... ok          739 ms
     json_encoding                ... ok          210 ms
     jsonpath                     ... ok          279 ms
     jsonpath_encoding            ... ok          243 ms
     jsonb_jsonpath               ... ok          358 ms
parallel group (18 tests):  plancache limit copy2 temp domain rangefuncs 
prepare polymorphism conversion returning plpgsql sequence xml rowtypes 
truncate with largeobject alter_table
     plancache                    ... ok         1631 ms
     limit                        ... ok         1787 ms
     plpgsql                      ... ok         2166 ms
     copy2                        ... ok         1659 ms
     temp                         ... ok         1697 ms
     domain                       ... ok         1629 ms
     rangefuncs                   ... ok         1559 ms
     prepare                      ... ok         1538 ms
     conversion                   ... ok         1541 ms
     truncate                     ... ok         1817 ms
     alter_table                  ... ok         4420 ms
     sequence                     ... ok         1297 ms
     polymorphism                 ... ok          885 ms
     rowtypes                     ... ok         1152 ms
     returning                    ... ok          938 ms
     largeobject                  ... ok         1205 ms
     with                         ... ok         1117 ms
     xml                          ... ok          848 ms
parallel group (9 tests):  hash_part reloptions partition_info explain indexing 
partition_aggregate partition_join tuplesort partition_prune
     partition_join               ... ok         3267 ms
     partition_prune              ... FAILED     3859 ms
     reloptions                   ... ok          524 ms
     hash_part                    ... ok          399 ms
     indexing                     ... ok         2639 ms
     partition_aggregate          ... ok         2781 ms
     partition_info               ... ok          626 ms
     tuplesort                    ... ok         3378 ms
     explain                      ... ok          615 ms
test event_trigger                ... ok          491 ms
test fast_default                 ... ok          648 ms
test stats                        ... ok          824 ms
============== shutting down postmaster               ==============

========================
 1 of 200 tests failed.
========================

The differences that caused some tests to fail can be viewed in the
file "C:/dll/postgres/src/test/regress/regression.diffs".  A copy of the test 
summary that you see
above is saved in the file "C:/dll/postgres/src/test/regress/regression.out".


C:\dll\postgres\src\tools\msvc>install c:\postgres_debug
Installing version 14 for release in c:\postgres_debug
Copying build output 
files......................................................................................................................................
Copying config files.....
Copying Import libraries...
Copying contrib data 
files...........................................................................................................................................................................................................................................
Copying Public headers.....
Copying Libpq headers..
Copying Libpq internal headers..
Copying Internal headers...
Copying Server headers...
Copying Grammar header.
.....................
Copying PL/pgSQL header.
81 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
141 arquivo(s) copiado(s)
35 arquivo(s) copiado(s)
33 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
56 arquivo(s) copiado(s)
10 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
3 arquivo(s) copiado(s)
13 arquivo(s) copiado(s)
14 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
27 arquivo(s) copiado(s)
22 arquivo(s) copiado(s)
4 arquivo(s) copiado(s)
40 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
11 arquivo(s) copiado(s)
5 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
48 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
55 arquivo(s) copiado(s)
8 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
88 arquivo(s) copiado(s)
Copying ECPG headers................
Copying ECPG informix headers...
Copying timezone names..........
Copying timezone sets...
Copying BKI files.
Copying SQL files..
Copying Information schema data.
Copying Error code data.
Generating timezone files...
Generating tsearch script............................
Copying Stopword files...............
Copying Dictionaries sample files.........
Copying PL Extension files..
Installation complete.

C:\dll\postgres\src\tools\msvc>cd\postgres_debug

C:\postgres_debug>cd bin

C:\postgres_debug\bin>mkdb

C:\postgres_debug\bin>initdb -U postgres -E UTF8 -D c:/postgres_debug/data
The files belonging to this database system will be owned by user "ranier".
This user must also own the server process.

The database cluster will be initialized with locale "Portuguese_Brazil.1252".
The default text search configuration will be set to "portuguese".

Data page checksums are disabled.

creating directory c:/postgres_debug/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... windows
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/Araguaina
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D c:/postgres_debug/data -l logfile start


C:\postgres_debug\bin>pg_start

C:\postgres_debug\bin>pg_ctl -D c:\postgres_debug\data -l 
c:\postgres_debug\log\logfile start
waiting for server to start.... done
server started

C:\postgres_debug\bin>

gd_is_null.patch
Description: Binary data

Re: Planner, check if can use consider HASH for groupings (src/backend/optimizer/plan/planner.c)

Reply via email to