Em seg., 21 de set. de 2020 às 14:24, Tomas Vondra < tomas.von...@2ndquadrant.com> escreveu:
> On Sun, Sep 20, 2020 at 01:50:34PM -0300, Ranier Vilela wrote: > >Em seg., 21 de set. de 2020 às 13:37, Tomas Vondra < > >tomas.von...@2ndquadrant.com> escreveu: > > > >> On Mon, Sep 21, 2020 at 08:32:35AM -0300, Ranier Vilela wrote: > >> >Em dom., 20 de set. de 2020 às 21:10, Tomas Vondra < > >> >tomas.von...@2ndquadrant.com> escreveu: > >> > > >> >> On Sun, Sep 20, 2020 at 08:09:56PM -0300, Ranier Vilela wrote: > >> >> >Em sex., 18 de set. de 2020 às 10:37, Tomas Vondra < > >> >> >tomas.von...@2ndquadrant.com> escreveu: > >> >> > > >> >> >> On Thu, Sep 17, 2020 at 06:31:12PM -0300, Ranier Vilela wrote: > >> >> >> >Em qui., 17 de set. de 2020 às 17:09, Tomas Vondra < > >> >> >> >tomas.von...@2ndquadrant.com> escreveu: > >> >> >> > > >> >> >> >> On Thu, Sep 17, 2020 at 02:12:12PM -0300, Ranier Vilela wrote: > >> >> >> >> >Hi, > >> >> >> >> > > >> >> >> >> >In case gd->any_hashable is FALSE, grouping_is_hashable is > never > >> >> >> called. > >> >> >> >> >In this case, the planner could use HASH for groupings, but > will > >> >> never > >> >> >> >> know. > >> >> >> >> > > >> >> >> >> > >> >> >> >> The condition is pretty simple - if the query has grouping > sets, > >> >> look at > >> >> >> >> grouping sets, otherwise look at groupClause. For this to be an > >> issue > >> >> >> >> the query would need to have both grouping sets and > (independent) > >> >> group > >> >> >> >> clause - which AFAIK is not possible. > >> >> >> >> > >> >> >> >Uh? > >> >> >> >(parse->groupClause != NIL) If different from NIL we have > >> >> ((independent) > >> >> >> >group clause), grouping_is_hashable should check? > >> >> >> >(gd ? gd->any_hashable : > grouping_is_hashable(parse->groupClause)))) > >> >> >> >If gd is not NIL and gd->any_hashtable is FALSE? > >> >> >> > > >> >> >> > >> >> >> Sorry, I'm not quite sure I understand what this is meant to say > :-( > >> >> >> > >> >> >> Anyway, (groupClause != NIL) does not mean the groupClause is > somehow > >> >> >> independent (of what?). Add some debugging to > create_grouping_paths > >> and > >> >> >> you'll see that e.g. this query ends up with groupClause with 3 > >> items: > >> >> >> > >> >> >> select 1 from t group by grouping sets ((a), (b), (c)); > >> >> >> > >> >> >> and so does this one: > >> >> >> > >> >> >> select 1 from t group by grouping sets ((a,c), (a,b)); > >> >> >> > >> >> >> I'm no expert in grouping sets, but I'd bet this means we > transform > >> the > >> >> >> grouping sets into a groupClause by extracting the keys. I haven't > >> >> >> investigated why exactly we do this, but I'm sure there's a reason > >> (e.g. > >> >> >> it gives us SortGroupClause). > >> >> >> > >> >> >> You seem to believe a query can have both grouping sets and > regular > >> >> >> grouping at the same level - but how would such query look like? > >> Surely > >> >> >> you can't have two GROuP BY clauses. You can do > >> >> >> > >> >> >Translating into code: > >> >> >gd is grouping sets and > >> >> >parse->groupClause is regular grouping > >> >> >and we cannot have both at the same time. > >> >> > > >> >> > >> >> Have you verified the claim that we can't have both at the same > time? As > >> >> I already explained, we build groupClause from grouping sets. I > suggest > >> >> you do some debugging using the queries I used as examples, and > you'll > >> >> see the claim is not true. > >> >> > >> >I think we already agreed that we cannot have both at the same time. > >> > > >> > >> No, we haven't. I think we've agreed that you can't have both a group by > >> clause with grouping sets and then another group by clause with "plain" > >> grouping. But even with grouping sets you'll have groupClause generated > >> from the grouping sets. See preprocess_grouping_sets. > >> > >> > > >> >> > >> >> > > >> >> >> select 1 from t group by a, grouping sets ((b), (c)); > >> >> >> > >> >> >> which is however mostly equivalent to (AFAICS) > >> >> >> > >> >> >> select 1 from t group by grouping sets ((a,b), (a,c)) > >> >> >> > >> >> >> so it's not like we have an independent groupClause either I > think. > >> >> >> > >> >> >> The whole point of the original condition is - if there are > grouping > >> >> >> sets, check if at least one can be executed using hashing (i.e. > all > >> keys > >> >> >> are hashable). Otherwise (without grouping sets) look at the > >> grouping as > >> >> >> a whole. > >> >> >> > >> >> >Or if gd is NULL check parse->groupClause. > >> >> > > >> >> > >> >> Which is what I'm saying. > >> >> > >> >> > > >> >> >> I don't see how your change improves this - if there are grouping > >> sets, > >> >> >> it's futile to look at the whole groupClause if at least one > grouping > >> >> >> set can't be hashed. > >> >> >> > >> >> >> But there's a simple way to disprove this - show us a case (table > >> and a > >> >> >> query) where your code correctly allows hashing while the current > one > >> >> >> does not. > >> >> > > >> >> >I'm not an expert in grouping either. > >> >> >The question I have here is whether gd is populated and has gd-> > >> >> >any_hashable as FALSE, > >> >> >Its mean no use checking parse-> groupClause, it's a waste of time, > Ok. > >> >> > > >> >> > >> >> I'm sorry, I don't follow your logic. Those are two separate cases. > If > >> >> we have grouping sets, we have to check if at least one can be > hashed. > >> >> If we don't have grouping sets, we have to check groupClause > directly. > >> >> Why would that be a waste of time is unclear to me. > >> >> > >> >This is clear to me. > >> >The problem is how it was implemented in create_grouping_paths. > >> > > >> > >> You haven't demonstrated what the problem is, though. Showing us a query > >> that fails would make it very clear. > >> > >> > > >> >> > >> >> > > >> >> >> > >> >> >> > > >> >> >> >> For hashing to be worth considering, at least one grouping set > has > >> >> to be > >> >> >> >> hashable - otherwise it's pointless. > >> >> >> >> > >> >> >> >> Granted, you could have something like > >> >> >> >> > >> >> >> >> GROUP BY GROUPING SETS ((a), (b)), c > >> >> >> >> > >> >> >> >> which I think essentially says "add c to every grouping set" > and > >> that > >> >> >> >> will be covered by the any_hashable check. > >> >> >> >> > >> >> >> >I am not going into the merit of whether or not it is worth > hashing. > >> >> >> >grouping_is_hashable as a last resort, must decide. > >> >> >> > > >> >> >> > >> >> >> I don't know what this is supposed to say either. The whole point > of > >> >> >> this check is to simply skip construction of hash-based paths in > >> cases > >> >> >> when it's obviously futile (i.e. when some of the keys don't > support > >> >> >> hashing). We do this as early as possible, because the whole point > >> is to > >> >> >> save precious CPU cycles during query planning. > >> >> >> > >> >> > > >> >> >> > > >> >> >> >> >Apparently gd pointer, will never be NULL there, verified with > >> >> >> Assert(gd > >> >> >> >> != > >> >> >> >> >NULL). > >> >> >> >> > > >> >> >> >> > >> >> >> >> Um, what? If you add the assert right before the if condition, > you > >> >> won't > >> >> >> >> even be able to do initdb. It's pretty clear it'll crash for > any > >> >> query > >> >> >> >> without grouping sets. > >> >> >> >> > >> >> >> >Here not: > >> >> >> >Assert(gd != NULL); > >> >> >> >create_ordinary_grouping_paths(root, input_rel, grouped_rel, > >> >> >> > agg_costs, gd, &extra, > >> >> >> > &partially_grouped_rel); > >> >> >> > > >> >> >> > >> >> >> I have no idea where you're adding this assert. But simply adding > it > >> to > >> >> >> create_grouping_paths (right before the if condition changed by > your > >> >> >> patch) will trigger a failure during initdb. Simply because for > >> queries > >> >> >> without grouping sets (and with regular grouping) we pass gs=NULL. > >> >> >> > >> >> >> Try applying the attached patch and do "pg_ctl -D ... init" - > you'll > >> get > >> >> >> a failure proving that gd=NULL. > >> >> >> > >> >> >Well, I did a test right now, downloaded the latest HEAD and added > the > >> >> >Assertive. > >> >> >I compiled everything, ran the regression tests, installed the > >> binaries, > >> >> >loaded the server and installed a client database. > >> >> >Everything is working. > >> >> >Maybe in all these steps, there is no grouping that would trigger > the > >> code > >> >> >in question, but I doubt it. > >> >> > > >> >> > >> >> For me doing this leads to reliable crashes when pg_regress does > initdb > >> >> (so before the actual checks): > >> >> > >> >> patch -p1 < ~/grouping-assert.patch > >> >> ./configure --enable-debug --enable-cassert > >> >> make -s clean && make -s -j4 && make check > >> >> > >> >> And the "make check" it immediately crashes like this: > >> >> > >> >> ============== creating temporary instance > ============== > >> >> ============== initializing database system > ============== > >> >> > >> >> pg_regress: initdb failed > >> >> Examine /home/user/work/postgres/src/test/regress/log/initdb.log > for > >> >> the reason. > >> >> > >> >> on the assert. So yeah, I'd guess there's something wrong with your > >> >> build. What does pg_config say? > >> >> > >> >Default. > >> >I never change anything. I simply clone the last head: > >> >git clone --single-branch https://github.com/postgres/postgres/ > >> >and have it compiled. > >> > > >> > >> Compiled how? > >> > >build Debug > > > > What does that mean? Wouldn't it be simpler to just show pg_config > output, which shows the exact flags used for configure / compile? > I'm sorry. BINDIR = C:/dll/postgres/Debug/PG_CON~1 DOCDIR = /doc HTMLDIR = /doc INCLUDEDIR = /include PKGINCLUDEDIR = /include INCLUDEDIR-SERVER = /include/server LIBDIR = /lib PKGLIBDIR = /lib LOCALEDIR = /share/locale MANDIR = /man SHAREDIR = /share SYSCONFDIR = /etc PGXS = /lib/pgxs/src/makefiles/pgxs.mk CONFIGURE = --enable-thread-safety --with-ldap --without-zlib CC = not recorded CPPFLAGS = not recorded CFLAGS = not recorded CFLAGS_SL = not recorded LDFLAGS = not recorded LDFLAGS_EX = not recorded LDFLAGS_SL = not recorded LIBS = not recorded VERSION = PostgreSQL 14devel msvc 2019 (64 bits) windows 10 (64bits) steps: cd\dll\postgres\src\tools\msvc build Debug Maybe, in Windows builds, has another file? regards, Ranier Vilela