Em sex., 18 de set. de 2020 às 10:37, Tomas Vondra < [email protected]> escreveu:
> On Thu, Sep 17, 2020 at 06:31:12PM -0300, Ranier Vilela wrote: > >Em qui., 17 de set. de 2020 às 17:09, Tomas Vondra < > >[email protected]> escreveu: > > > >> On Thu, Sep 17, 2020 at 02:12:12PM -0300, Ranier Vilela wrote: > >> >Hi, > >> > > >> >In case gd->any_hashable is FALSE, grouping_is_hashable is never > called. > >> >In this case, the planner could use HASH for groupings, but will never > >> know. > >> > > >> > >> The condition is pretty simple - if the query has grouping sets, look at > >> grouping sets, otherwise look at groupClause. For this to be an issue > >> the query would need to have both grouping sets and (independent) group > >> clause - which AFAIK is not possible. > >> > >Uh? > >(parse->groupClause != NIL) If different from NIL we have ((independent) > >group clause), grouping_is_hashable should check? > >(gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause)))) > >If gd is not NIL and gd->any_hashtable is FALSE? > > > > Sorry, I'm not quite sure I understand what this is meant to say :-( > > Anyway, (groupClause != NIL) does not mean the groupClause is somehow > independent (of what?). Add some debugging to create_grouping_paths and > you'll see that e.g. this query ends up with groupClause with 3 items: > > select 1 from t group by grouping sets ((a), (b), (c)); > > and so does this one: > > select 1 from t group by grouping sets ((a,c), (a,b)); > > I'm no expert in grouping sets, but I'd bet this means we transform the > grouping sets into a groupClause by extracting the keys. I haven't > investigated why exactly we do this, but I'm sure there's a reason (e.g. > it gives us SortGroupClause). > > You seem to believe a query can have both grouping sets and regular > grouping at the same level - but how would such query look like? Surely > you can't have two GROuP BY clauses. You can do > Translating into code: gd is grouping sets and parse->groupClause is regular grouping and we cannot have both at the same time. > select 1 from t group by a, grouping sets ((b), (c)); > > which is however mostly equivalent to (AFAICS) > > select 1 from t group by grouping sets ((a,b), (a,c)) > > so it's not like we have an independent groupClause either I think. > > The whole point of the original condition is - if there are grouping > sets, check if at least one can be executed using hashing (i.e. all keys > are hashable). Otherwise (without grouping sets) look at the grouping as > a whole. > Or if gd is NULL check parse->groupClause. > I don't see how your change improves this - if there are grouping sets, > it's futile to look at the whole groupClause if at least one grouping > set can't be hashed. > > But there's a simple way to disprove this - show us a case (table and a > query) where your code correctly allows hashing while the current one > does not. I'm not an expert in grouping either. The question I have here is whether gd is populated and has gd-> any_hashable as FALSE, Its mean no use checking parse-> groupClause, it's a waste of time, Ok. > > > > >> For hashing to be worth considering, at least one grouping set has to be > >> hashable - otherwise it's pointless. > >> > >> Granted, you could have something like > >> > >> GROUP BY GROUPING SETS ((a), (b)), c > >> > >> which I think essentially says "add c to every grouping set" and that > >> will be covered by the any_hashable check. > >> > >I am not going into the merit of whether or not it is worth hashing. > >grouping_is_hashable as a last resort, must decide. > > > > I don't know what this is supposed to say either. The whole point of > this check is to simply skip construction of hash-based paths in cases > when it's obviously futile (i.e. when some of the keys don't support > hashing). We do this as early as possible, because the whole point is to > save precious CPU cycles during query planning. > > > > >> >Apparently gd pointer, will never be NULL there, verified with > Assert(gd > >> != > >> >NULL). > >> > > >> > >> Um, what? If you add the assert right before the if condition, you won't > >> even be able to do initdb. It's pretty clear it'll crash for any query > >> without grouping sets. > >> > >Here not: > >Assert(gd != NULL); > >create_ordinary_grouping_paths(root, input_rel, grouped_rel, > > agg_costs, gd, &extra, > > &partially_grouped_rel); > > > > I have no idea where you're adding this assert. But simply adding it to > create_grouping_paths (right before the if condition changed by your > patch) will trigger a failure during initdb. Simply because for queries > without grouping sets (and with regular grouping) we pass gs=NULL. > > Try applying the attached patch and do "pg_ctl -D ... init" - you'll get > a failure proving that gd=NULL. > Well, I did a test right now, downloaded the latest HEAD and added the Assertive. I compiled everything, ran the regression tests, installed the binaries, loaded the server and installed a client database. Everything is working. Maybe in all these steps, there is no grouping that would trigger the code in question, but I doubt it. > > >Otherwise Coverity would be right. > >CID 1412604 (#1 of 1): Dereference after null check (FORWARD_NULL)13. > >var_deref_model: Passing null pointer gd to > create_ordinary_grouping_paths, > >which dereferences it. [show details > >< > https://scan6.coverity.com/eventId=31389494-14&modelId=31389494-0&fileInstanceId=105740687&filePath=%2Fdll%2Fpostgres%2Fsrc%2Fbackend%2Foptimizer%2Fplan%2Fplanner.c&fileStart=4053&fileEnd=4180 > >] > > > > > >Which cannot be true, gd is never NULL here. > > > > Or maybe coverity simply does not realize the NULL-dereference can't > happen, because it's guarded by other conditions, or something like > that ... As the assert failure demonstrates, we do indeed get NULLs > here, and it does not crash so coverity is missing something. > 1. With Assert. All works. 2. create_ordinary_grouping_paths is called with gd var. 3. create_ordinary_grouping_paths call get_number_of_groups 4. get_number_of_groups dereference gd pointer (line 3705). foreach(lc, gd->rollups) 5. line (3701: Assert(gd); /* keep Coverity happy */ Of course, this works, gd is never NULL here. 6. grouping_is_hashable is never called here, because gd is never NULL here. if ((parse->groupClause != NIL && agg_costs->numOrderedAggs == 0 && (gd ? gd->any_hashable : grouping_is_hashable(parse->groupClause)))) flags |= GROUPING_CAN_USE_HASH; The question is is it worth it or not worth calling (grouping_is_hashable), at that point? regards, Ranier Vilela
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:09:19.72
C:\dll\postgres\src\tools\msvc>vcregress check
Setting up temp install
Installing version 14 for release in C:/dll/postgres/tmp_install
Copying build output
files......................................................................................................................................
Copying config files.....
Copying Import libraries...
Copying contrib data
files...........................................................................................................................................................................................................................................
Copying Public headers.....
Copying Libpq headers..
Copying Libpq internal headers..
Copying Internal headers...
Copying Server headers...
Copying Grammar header.
.....................
Copying PL/pgSQL header.
81 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
141 arquivo(s) copiado(s)
35 arquivo(s) copiado(s)
33 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
56 arquivo(s) copiado(s)
10 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
3 arquivo(s) copiado(s)
13 arquivo(s) copiado(s)
14 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
27 arquivo(s) copiado(s)
22 arquivo(s) copiado(s)
4 arquivo(s) copiado(s)
40 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
11 arquivo(s) copiado(s)
5 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
48 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
55 arquivo(s) copiado(s)
8 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
88 arquivo(s) copiado(s)
Copying ECPG headers................
Copying ECPG informix headers...
Copying timezone names..........
Copying timezone sets...
Copying BKI files.
Copying SQL files..
Copying Information schema data.
Copying Error code data.
Generating timezone files...
Generating tsearch script............................
Copying Stopword files...............
Copying Dictionaries sample files.........
Copying PL Extension files..
Installation complete.
============== creating temporary instance ==============
============== initializing database system ==============
============== starting postmaster ==============
psql (14devel)
WARNING: Console code page (437) differs from Windows code page (1252)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.
Type "help" for help.
postgres=# \q
running on port 58080 with PID 312
============== creating database "regression" ==============
CREATE DATABASE
ALTER DATABASE
============== running regression test queries ==============
test tablespace ... ok 1163 ms
parallel group (20 tests): boolean char name varchar text int2 int4 txid uuid
bit money float4 oid int8 float8 pg_lsn enum regproc numeric rangetypes
boolean ... ok 1043 ms
char ... ok 1052 ms
name ... ok 1016 ms
varchar ... ok 999 ms
text ... ok 997 ms
int2 ... ok 993 ms
int4 ... ok 976 ms
int8 ... ok 939 ms
oid ... ok 900 ms
float4 ... ok 833 ms
float8 ... ok 766 ms
bit ... ok 715 ms
numeric ... ok 1072 ms
txid ... ok 619 ms
uuid ... ok 568 ms
enum ... ok 567 ms
money ... ok 421 ms
rangetypes ... ok 922 ms
pg_lsn ... ok 332 ms
regproc ... ok 325 ms
parallel group (20 tests): strings numerology point lseg line macaddr interval
path circle timetz time date macaddr8 inet xid box timestamp tstypes
timestamptz polygon
strings ... ok 1125 ms
numerology ... ok 1128 ms
point ... ok 1152 ms
lseg ... ok 1142 ms
line ... ok 1108 ms
box ... ok 1272 ms
path ... ok 1012 ms
polygon ... ok 1290 ms
circle ... ok 926 ms
date ... ok 884 ms
time ... ok 812 ms
timetz ... ok 760 ms
timestamp ... ok 818 ms
timestamptz ... ok 821 ms
interval ... ok 492 ms
inet ... ok 468 ms
macaddr ... ok 360 ms
macaddr8 ... ok 329 ms
tstypes ... ok 417 ms
xid ... ok 292 ms
parallel group (10 tests): horology geometry comments type_sanity unicode
expressions misc_sanity oidjoins opr_sanity regex
geometry ... ok 522 ms
horology ... ok 446 ms
regex ... ok 1248 ms
oidjoins ... ok 811 ms
type_sanity ... ok 557 ms
opr_sanity ... ok 1091 ms
misc_sanity ... ok 523 ms
comments ... ok 320 ms
expressions ... ok 335 ms
unicode ... ok 270 ms
test create_function_1 ... ok 152 ms
test create_type ... ok 201 ms
test create_table ... ok 1169 ms
test create_function_2 ... ok 134 ms
parallel group (5 tests): copyselect copydml insert_conflict copy insert
copy ... ok 1231 ms
copyselect ... ok 275 ms
copydml ... ok 254 ms
insert ... ok 1428 ms
insert_conflict ... ok 833 ms
parallel group (3 tests): create_operator create_procedure create_misc
create_misc ... ok 394 ms
create_operator ... ok 286 ms
create_procedure ... ok 273 ms
parallel group (5 tests): index_including create_view create_index_spgist
index_including_gist create_index
create_index ... ok 2745 ms
create_index_spgist ... ok 1148 ms
create_view ... ok 855 ms
index_including ... ok 750 ms
index_including_gist ... ok 1114 ms
parallel group (15 tests): create_aggregate create_function_3 create_cast
constraints roleattributes select drop_if_exists hash_func typed_table
create_am errors updatable_views inherit triggers vacuum
create_aggregate ... ok 892 ms
create_function_3 ... ok 991 ms
create_cast ... ok 1126 ms
constraints ... ok 1146 ms
triggers ... ok 2743 ms
select ... ok 968 ms
inherit ... ok 2455 ms
typed_table ... ok 904 ms
vacuum ... ok 2580 ms
drop_if_exists ... ok 789 ms
updatable_views ... ok 2152 ms
roleattributes ... ok 545 ms
create_am ... ok 722 ms
hash_func ... ok 486 ms
errors ... ok 1464 ms
test sanity_check ... ok 1253 ms
parallel group (20 tests): select_into select_distinct select_distinct_on
select_implicit delete select_having subselect union portals random namespace
transactions case arrays prepared_xacts hash_index aggregates join update
btree_index
select_into ... ok 1369 ms
select_distinct ... ok 1393 ms
select_distinct_on ... ok 1437 ms
select_implicit ... ok 1634 ms
select_having ... ok 1632 ms
subselect ... ok 1572 ms
union ... ok 1525 ms
case ... ok 1582 ms
join ... ok 2472 ms
aggregates ... ok 2219 ms
transactions ... ok 1311 ms
random ... ok 1204 ms
portals ... ok 1029 ms
arrays ... ok 1201 ms
btree_index ... ok 2654 ms
hash_index ... ok 1189 ms
update ... ok 2238 ms
delete ... ok 585 ms
namespace ... ok 690 ms
prepared_xacts ... ok 795 ms
parallel group (20 tests): init_privs security_label lock replica_identity
password drop_operator object_address tablesample spgist collate brin
groupingsets identity matview generated rowsecurity gin gist privileges
join_hash
brin ... ok 2506 ms
gin ... ok 3715 ms
gist ... ok 3809 ms
spgist ... ok 2212 ms
privileges ... ok 4614 ms
init_privs ... ok 1145 ms
security_label ... ok 1295 ms
collate ... ok 2021 ms
matview ... ok 2917 ms
lock ... ok 1612 ms
replica_identity ... ok 1672 ms
rowsecurity ... ok 2948 ms
object_address ... ok 1678 ms
tablesample ... ok 1640 ms
groupingsets ... ok 1940 ms
drop_operator ... ok 1468 ms
password ... ok 1311 ms
identity ... ok 2164 ms
generated ... ok 2263 ms
join_hash ... ok 7833 ms
parallel group (13 tests): alter_operator async alter_generic dbsize tid
collate.icu.utf8 misc tsrf tidscan incremental_sort misc_functions
create_table_like sysviews
create_table_like ... ok 1269 ms
alter_generic ... ok 648 ms
alter_operator ... ok 505 ms
misc ... ok 765 ms
async ... ok 514 ms
dbsize ... ok 666 ms
misc_functions ... ok 1020 ms
sysviews ... ok 1172 ms
tsrf ... ok 556 ms
tid ... ok 516 ms
tidscan ... ok 537 ms
collate.icu.utf8 ... ok 364 ms
incremental_sort ... ok 618 ms
parallel group (6 tests): psql_crosstab amutils collate.linux.utf8 psql rules
stats_ext
rules ... ok 1250 ms
psql ... ok 938 ms
psql_crosstab ... ok 292 ms
amutils ... ok 286 ms
stats_ext ... ok 1963 ms
collate.linux.utf8 ... ok 203 ms
test select_parallel ... ok 7115 ms
test write_parallel ... ok 539 ms
parallel group (2 tests): subscription publication
publication ... ok 382 ms
subscription ... ok 211 ms
parallel group (17 tests): select_views portals_p2 dependency advisory_lock
guc xmlmap window bitmapops tsdicts functional_deps combocid cluster equivclass
tsearch foreign_data foreign_key indirect_toast
select_views ... ok 1121 ms
portals_p2 ... ok 1300 ms
foreign_key ... ok 3269 ms
cluster ... ok 1722 ms
dependency ... ok 1447 ms
guc ... ok 1498 ms
bitmapops ... ok 1454 ms
combocid ... ok 1371 ms
tsearch ... ok 1435 ms
tsdicts ... ok 1196 ms
foreign_data ... ok 1844 ms
window ... ok 1058 ms
xmlmap ... ok 935 ms
functional_deps ... ok 871 ms
advisory_lock ... ok 784 ms
indirect_toast ... ok 2565 ms
equivclass ... ok 734 ms
parallel group (6 tests): json_encoding jsonpath_encoding jsonpath json
jsonb_jsonpath jsonb
json ... ok 476 ms
jsonb ... ok 739 ms
json_encoding ... ok 210 ms
jsonpath ... ok 279 ms
jsonpath_encoding ... ok 243 ms
jsonb_jsonpath ... ok 358 ms
parallel group (18 tests): plancache limit copy2 temp domain rangefuncs
prepare polymorphism conversion returning plpgsql sequence xml rowtypes
truncate with largeobject alter_table
plancache ... ok 1631 ms
limit ... ok 1787 ms
plpgsql ... ok 2166 ms
copy2 ... ok 1659 ms
temp ... ok 1697 ms
domain ... ok 1629 ms
rangefuncs ... ok 1559 ms
prepare ... ok 1538 ms
conversion ... ok 1541 ms
truncate ... ok 1817 ms
alter_table ... ok 4420 ms
sequence ... ok 1297 ms
polymorphism ... ok 885 ms
rowtypes ... ok 1152 ms
returning ... ok 938 ms
largeobject ... ok 1205 ms
with ... ok 1117 ms
xml ... ok 848 ms
parallel group (9 tests): hash_part reloptions partition_info explain indexing
partition_aggregate partition_join tuplesort partition_prune
partition_join ... ok 3267 ms
partition_prune ... FAILED 3859 ms
reloptions ... ok 524 ms
hash_part ... ok 399 ms
indexing ... ok 2639 ms
partition_aggregate ... ok 2781 ms
partition_info ... ok 626 ms
tuplesort ... ok 3378 ms
explain ... ok 615 ms
test event_trigger ... ok 491 ms
test fast_default ... ok 648 ms
test stats ... ok 824 ms
============== shutting down postmaster ==============
========================
1 of 200 tests failed.
========================
The differences that caused some tests to fail can be viewed in the
file "C:/dll/postgres/src/test/regress/regression.diffs". A copy of the test
summary that you see
above is saved in the file "C:/dll/postgres/src/test/regress/regression.out".
C:\dll\postgres\src\tools\msvc>install c:\postgres_debug
Installing version 14 for release in c:\postgres_debug
Copying build output
files......................................................................................................................................
Copying config files.....
Copying Import libraries...
Copying contrib data
files...........................................................................................................................................................................................................................................
Copying Public headers.....
Copying Libpq headers..
Copying Libpq internal headers..
Copying Internal headers...
Copying Server headers...
Copying Grammar header.
.....................
Copying PL/pgSQL header.
81 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
141 arquivo(s) copiado(s)
35 arquivo(s) copiado(s)
33 arquivo(s) copiado(s)
1 arquivo(s) copiado(s)
56 arquivo(s) copiado(s)
10 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
3 arquivo(s) copiado(s)
13 arquivo(s) copiado(s)
14 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
27 arquivo(s) copiado(s)
22 arquivo(s) copiado(s)
4 arquivo(s) copiado(s)
40 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
11 arquivo(s) copiado(s)
5 arquivo(s) copiado(s)
20 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
48 arquivo(s) copiado(s)
2 arquivo(s) copiado(s)
55 arquivo(s) copiado(s)
8 arquivo(s) copiado(s)
7 arquivo(s) copiado(s)
88 arquivo(s) copiado(s)
Copying ECPG headers................
Copying ECPG informix headers...
Copying timezone names..........
Copying timezone sets...
Copying BKI files.
Copying SQL files..
Copying Information schema data.
Copying Error code data.
Generating timezone files...
Generating tsearch script............................
Copying Stopword files...............
Copying Dictionaries sample files.........
Copying PL Extension files..
Installation complete.
C:\dll\postgres\src\tools\msvc>cd\postgres_debug
C:\postgres_debug>cd bin
C:\postgres_debug\bin>mkdb
C:\postgres_debug\bin>initdb -U postgres -E UTF8 -D c:/postgres_debug/data
The files belonging to this database system will be owned by user "ranier".
This user must also own the server process.
The database cluster will be initialized with locale "Portuguese_Brazil.1252".
The default text search configuration will be set to "portuguese".
Data page checksums are disabled.
creating directory c:/postgres_debug/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... windows
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/Araguaina
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
Success. You can now start the database server using:
pg_ctl -D c:/postgres_debug/data -l logfile start
C:\postgres_debug\bin>pg_start
C:\postgres_debug\bin>pg_ctl -D c:\postgres_debug\data -l
c:\postgres_debug\log\logfile start
waiting for server to start.... done
server started
C:\postgres_debug\bin>
gd_is_null.patch
Description: Binary data
