Re: Question: test "aggregates" failed in 32-bit machine
On 10/3/22 16:05, Tom Lane wrote:
> "Jonathan S. Katz" writes:
>> Based on the above discussion, the RMT asks for a revert of db0d67db2 in
>> the v15 release. The RMT also recommends a revert in HEAD but does not
>> have the power to request that.
>
> Roger, I'll push these shortly.

Thanks for resolving this, and apologies for not noticing this thread earlier (and for the bugs in the code, ofc).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Question: test "aggregates" failed in 32-bit machine
[ Just for the archives' sake at this point, in case somebody has another go at this feature. ]

I wrote:
> ... I'm now discovering that the code I'd hoped to salvage in
> pathkeys.c (get_useful_group_keys_orderings and related) has its very own
> bugs. It's imagining that it can rearrange a PathKeys list arbitrarily
> and then rearrange the GROUP BY SortGroupClause list to match, but that's
> easier said than done, for a couple of different reasons.

It strikes me that the easy solution here is to *not* rearrange the SortGroupClause list at all. What that list would be used for later is to generate a Unique node's list of columns to compare, but since Unique only cares about equality-or-not, there's no strong reason why it has to compare the columns in the same order they're sorted in. (Indeed, if anything we should prefer to compare them in the opposite order, since the least-significant column should be the most likely to be different from the previous row.)

I'm fairly sure that the just-reverted code is buggy on its own terms, in that it might sometimes produce a clause list that's not ordered the same as the pathkeys; but there's no visible misbehavior, because that does not in fact matter.

So this'd let us simplify the APIs here; in particular, PathKeyInfo seems unnecessary, because we don't have to pass the SortGroupClause list into or out of the pathkey-reordering logic.

regards, tom lane
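Tom's observation — that a Unique node only needs equality-or-not, and that comparing columns in the reverse of the sort order should short-circuit sooner — can be checked outside the planner. The following standalone Python sketch is purely illustrative (not PostgreSQL code; `unique_count` and its column-order argument are invented here): it counts adjacent-row groups over pre-sorted rows while comparing the columns in two different orders.

```python
import random

def unique_count(rows, col_order):
    """Count adjacent-row groups (as a Unique node would), comparing the
    columns of each adjacent pair in the given order with short-circuit
    inequality.  Returns (number of groups, column comparisons made)."""
    cmps = 0
    groups = 1 if rows else 0
    for prev, cur in zip(rows, rows[1:]):
        equal = True
        for c in col_order:
            cmps += 1
            if prev[c] != cur[c]:
                equal = False
                break
        if not equal:
            groups += 1
    return groups, cmps

random.seed(42)
# Rows sorted by (col 0, col 1); col 0 has few distinct values, col 1 many.
rows = sorted((random.randrange(5), random.randrange(10**9))
              for _ in range(100000))

groups_fwd, cmps_fwd = unique_count(rows, (0, 1))   # sort-key order
groups_rev, cmps_rev = unique_count(rows, (1, 0))   # reversed order
```

Both orders find exactly the same groups, but the reversed order resolves almost every pair on its first comparison, because the least-significant sort column is the one most likely to differ between adjacent rows — which is the intuition behind preferring the opposite order.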
Re: Question: test "aggregates" failed in 32-bit machine
"Jonathan S. Katz" writes:
> Based on the above discussion, the RMT asks for a revert of db0d67db2 in
> the v15 release. The RMT also recommends a revert in HEAD but does not
> have the power to request that.

Roger, I'll push these shortly.

regards, tom lane
Re: Question: test "aggregates" failed in 32-bit machine
On 10/2/22 8:45 PM, Michael Paquier wrote:
> On Sun, Oct 02, 2022 at 02:11:12PM -0400, Tom Lane wrote:
>> "Jonathan S. Katz" writes:
>>> OK. For v15 I am heavily in favor of the least risky approach given the
>>> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
>>> but from re-reading this thread again, I would recommend to revert
>>> (i.e. the “straight up revert”).
>>
>> OK by me.
>
> I don't quite see why it would be useful to let this code live on HEAD
> if it is not ready to be merged, as there is a risk of creating side
> issues with things tied to the costing that are still to be merged, so
> I agree that the reversion done on both branches is the way to go for
> now. This could always be reworked and reproposed in the future.

[RMT-hat]

Just to follow things procedure-wise[1], while there do not seem to be any objections to reverting through regular community processes, I do think the RMT has to make this ask as Tomas (patch committer) has not commented and we are up against release deadlines.

Based on the above discussion, the RMT asks for a revert of db0d67db2 in the v15 release. The RMT also recommends a revert in HEAD but does not have the power to request that.

We do hope to see continued work and inclusion of this feature for a future release. We understand that the work on this optimization is complicated and appreciate all of the efforts on it.

Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/Release_Management_Team
Re: Question: test "aggregates" failed in 32-bit machine
On Sun, Oct 02, 2022 at 02:11:12PM -0400, Tom Lane wrote:
> "Jonathan S. Katz" writes:
>> OK. For v15 I am heavily in favor of the least risky approach given the
>> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
>> but from re-reading this thread again, I would recommend to revert
>> (i.e. the “straight up revert”).
>
> OK by me.

I don't quite see why it would be useful to let this code live on HEAD if it is not ready to be merged, as there is a risk of creating side issues with things tied to the costing that are still to be merged, so I agree that the reversion done on both branches is the way to go for now. This could always be reworked and reproposed in the future.

> I'm just about to throw up my hands and go for reversion in both branches,
> because I'm now discovering that the code I'd hoped to salvage in
> pathkeys.c (get_useful_group_keys_orderings and related) has its very own
> bugs. It's imagining that it can rearrange a PathKeys list arbitrarily
> and then rearrange the GROUP BY SortGroupClause list to match, but that's
> easier said than done, for a couple of different reasons. (I now
> understand why db0d67db2 made a cowboy hack in get_eclass_for_sort_expr ...
> but it's still a cowboy hack with difficult-to-foresee side effects.)
> There are other things in there that make it painfully obvious that
> this code wasn't very carefully reviewed, eg XXX comments that should
> have been followed up and were not, or a reference to a nonexistent
> "debug_group_by_match_order_by" flag (maybe that was a GUC at some point?).

Okay. Ugh.
--
Michael
Re: Question: test "aggregates" failed in 32-bit machine
David Rowley writes:
> As for the slight misuse of group_pathkeys, I guess since there are no
> users that require just the plain pathkeys belonging to the GROUP BY,
> then likely the best thing would be just to rename that field to
> something like groupagg_pathkeys. Maintaining two separate fields and
> concatenating them every time we want group_pathkeys does not seem
> that appealing to me. Seems like a waste of memory and effort. I don't
> want to hijack this thread to discuss that, but if you have a
> preferred course of action, then I'm happy to kick off a discussion on
> a new thread.

I don't feel any great urgency to resolve this. Let's wait and see what comes out of the other thread.

regards, tom lane
Re: Question: test "aggregates" failed in 32-bit machine
On Mon, 3 Oct 2022 at 09:59, Tom Lane wrote:
> David Rowley writes:
>> For the master version, I think it's safe just to get rid of
>> PlannerInfo.num_groupby_pathkeys now. I only added that so I could
>> strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
>> pathkeys before passing to the functions that rearranged the GROUP BY
>> clause.
>
> I was kind of unhappy with that data structure too, but from the
> other direction: I didn't like that you were folding aggregate-derived
> pathkeys into root->group_pathkeys in the first place. That seems like
> a kluge that might work all right for the moment but will cause problems
> down the road. (Despite the issues with the patch at hand, I don't
> think it's unreasonable to suppose that somebody will have a more
> successful go at optimizing GROUP BY sorting later.) If we keep the
> data structure like this, I think we absolutely need num_groupby_pathkeys,
> or some other way of recording which pathkeys came from what source.

Ok, I don't feel too strongly about removing num_groupby_pathkeys. I'm fine to leave it there. However, I'll register a slight concern that we'll likely receive sporadic submissions of cleanup patches that remove the unused field over the course of the next few years, and that dealing with those might take up more time than just removing it now and putting it back when we need it. We have been receiving quite a few patches along those lines lately.

As for the slight misuse of group_pathkeys, I guess since there are no users that require just the plain pathkeys belonging to the GROUP BY, then likely the best thing would be just to rename that field to something like groupagg_pathkeys. Maintaining two separate fields and concatenating them every time we want group_pathkeys does not seem that appealing to me. Seems like a waste of memory and effort.
I don't want to hijack this thread to discuss that, but if you have a preferred course of action, then I'm happy to kick off a discussion on a new thread.

David
Re: Question: test "aggregates" failed in 32-bit machine
David Rowley writes:
> For the master version, I think it's safe just to get rid of
> PlannerInfo.num_groupby_pathkeys now. I only added that so I could
> strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by
> pathkeys before passing to the functions that rearranged the GROUP BY
> clause.

I was kind of unhappy with that data structure too, but from the other direction: I didn't like that you were folding aggregate-derived pathkeys into root->group_pathkeys in the first place. That seems like a kluge that might work all right for the moment but will cause problems down the road. (Despite the issues with the patch at hand, I don't think it's unreasonable to suppose that somebody will have a more successful go at optimizing GROUP BY sorting later.) If we keep the data structure like this, I think we absolutely need num_groupby_pathkeys, or some other way of recording which pathkeys came from what source.

One way to manage that would be to insist that the length of root->group_clauses should indicate the number of associated grouping pathkeys. Right now they might not be the same because we might discover some of the pathkeys to be redundant --- but if we do, ISTM that the corresponding GROUP BY clauses are also redundant and could get dropped. That ties into the stuff I was worried about in [1], though. I'll keep this in mind when I get back to messing with that.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/1657885.1657647073%40sss.pgh.pa.us
Re: Question: test "aggregates" failed in 32-bit machine
On Mon, 3 Oct 2022 at 08:10, Tom Lane wrote:
> As attached.

For the master version, I think it's safe just to get rid of PlannerInfo.num_groupby_pathkeys now. I only added that so I could strip off the ORDER BY / DISTINCT aggregate PathKeys from the group by pathkeys before passing to the functions that rearranged the GROUP BY clause.

David
Re: Question: test "aggregates" failed in 32-bit machine
"Jonathan S. Katz" writes:
> OK. For v15 I am heavily in favor of the least risky approach given the
> point we are at in the release cycle. The RMT hasn’t met yet to discuss,
> but from re-reading this thread again, I would recommend to revert
> (i.e. the “straight up revert”).

OK by me.

> I’m less opinionated on the approach for what’s in HEAD, but the rewrite
> you suggest sounds promising.

I'm just about to throw up my hands and go for reversion in both branches, because I'm now discovering that the code I'd hoped to salvage in pathkeys.c (get_useful_group_keys_orderings and related) has its very own bugs. It's imagining that it can rearrange a PathKeys list arbitrarily and then rearrange the GROUP BY SortGroupClause list to match, but that's easier said than done, for a couple of different reasons. (I now understand why db0d67db2 made a cowboy hack in get_eclass_for_sort_expr ... but it's still a cowboy hack with difficult-to-foresee side effects.) There are other things in there that make it painfully obvious that this code wasn't very carefully reviewed, eg XXX comments that should have been followed up and were not, or a reference to a nonexistent "debug_group_by_match_order_by" flag (maybe that was a GUC at some point?).

On top of that, it's producing several distinct pathkey orderings for the caller to try, but it's completely unclear to me that the subsequent choice of cheapest path isn't going to largely reduce to the question of whether we can accurately estimate the relative costs of different sort-column orders. Which is exactly what we're finding we can't do. So that end of it seems to need a good deal of rethinking as well.

In short, this needs a whole lotta work, and I'm not volunteering.

regards, tom lane
Re: Question: test "aggregates" failed in 32-bit machine
> On Oct 2, 2022, at 1:12 PM, Tom Lane wrote:
>
> "Jonathan S. Katz" writes:
>> On 10/1/22 6:57 PM, Tom Lane wrote:
>>> I plan to have a look tomorrow at the idea of reverting only the cost_sort
>>> changes, and rewriting get_cheapest_group_keys_order() to just sort the
>>> keys by decreasing numgroups estimates as I suggested upthread. That
>>> might be substantially less messy, because of fewer interactions with
>>> 1349d2790.
>
>> Maybe this leads to a follow-up question of whether we continue to improve
>> what is in HEAD while reverting the code in v15 (particularly if it's
>> easier to do it that way)?
>
> No. I see no prospect that the cost_sort code currently in HEAD is going
> to become shippable in the near future. Quite aside from the plain bugs,
> I think it's based on untenable assumptions about how accurately we can
> estimate the CPU costs associated with different sort-column orders.

OK.

> Having said that, it's certainly possible that we should do something
> different in HEAD than in v15. We could do the rewrite I suggest above
> in HEAD while doing a straight-up revert in v15. I've been finding that
> 1349d2790 is sufficiently entwined with this code that the patches would
> look significantly different in any case, so that might be the most
> reliable way to proceed in v15.

OK. For v15 I am heavily in favor of the least risky approach given the point we are at in the release cycle. The RMT hasn’t met yet to discuss, but from re-reading this thread again, I would recommend to revert (i.e. the “straight up revert”).

I’m less opinionated on the approach for what’s in HEAD, but the rewrite you suggest sounds promising.

Thanks,

Jonathan
Re: Question: test "aggregates" failed in 32-bit machine
"Jonathan S. Katz" writes:
> On 10/1/22 6:57 PM, Tom Lane wrote:
>> I plan to have a look tomorrow at the idea of reverting only the cost_sort
>> changes, and rewriting get_cheapest_group_keys_order() to just sort the
>> keys by decreasing numgroups estimates as I suggested upthread. That
>> might be substantially less messy, because of fewer interactions with
>> 1349d2790.

> Maybe this leads to a follow-up question of whether we continue to improve
> what is in HEAD while reverting the code in v15 (particularly if it's
> easier to do it that way)?

No. I see no prospect that the cost_sort code currently in HEAD is going to become shippable in the near future. Quite aside from the plain bugs, I think it's based on untenable assumptions about how accurately we can estimate the CPU costs associated with different sort-column orders.

Having said that, it's certainly possible that we should do something different in HEAD than in v15. We could do the rewrite I suggest above in HEAD while doing a straight-up revert in v15. I've been finding that 1349d2790 is sufficiently entwined with this code that the patches would look significantly different in any case, so that might be the most reliable way to proceed in v15.

regards, tom lane
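For reference, the fallback Tom sketches — ordering the GROUP BY keys by decreasing number-of-groups estimates rather than by a full cost model — amounts to a one-line sort. The Python sketch below is only an illustration of that heuristic; the function name, the `(column, estimated_ndistinct)` representation, and the example estimates are all invented here, not planner code:

```python
def order_group_keys_by_ndistinct(keys):
    """keys: list of (column_name, estimated_ndistinct) pairs.
    Put the column expected to have the most groups first, so that
    tuple comparisons during the sort usually stop at the first column."""
    return sorted(keys, key=lambda k: k[1], reverse=True)

# Hypothetical estimates: "p" is nearly unique, "d" and "c" have few values.
ordered = order_group_keys_by_ndistinct([("d", 4), ("p", 100000), ("c", 2)])
```

Because `sorted` is stable, keys with equal estimates keep their original order, so the choice is deterministic rather than depending on floating-point cost ties — one reason a simple rule like this sidesteps the roundoff problems discussed earlier in the thread.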
Re: Question: test "aggregates" failed in 32-bit machine
On 10/1/22 6:57 PM, Tom Lane wrote:
> "Jonathan S. Katz" writes:
>> On 10/1/22 3:13 PM, Tom Lane wrote:
>>> I'm still of the opinion that we need to revert this code for now.
>
>> 1. How invasive would the revert be?
>
> I've just finished constructing a draft full-reversion patch. I'm not
> confident in this yet; in particular, teasing it apart from 1349d2790
> ("Improve performance of ORDER BY / DISTINCT aggregates") was fairly
> messy. I need to look through the regression test changes and make sure
> that none are surprising. But this is approximately the right scope if
> we rip it out entirely.
>
> I plan to have a look tomorrow at the idea of reverting only the cost_sort
> changes, and rewriting get_cheapest_group_keys_order() to just sort the
> keys by decreasing numgroups estimates as I suggested upthread. That
> might be substantially less messy, because of fewer interactions with
> 1349d2790.

Maybe this leads to a follow-up question of whether we continue to improve what is in HEAD while reverting the code in v15 (particularly if it's easier to do it that way)? I know we're generally not in favor of that approach, but wanted to ask.

>> 2. Are there other user-visible items that would be impacted?
>
> See above. (But note that 1349d2790 is HEAD-only, not in v15.)

With the RMT hat, I'm hyperfocused on PG15 stability. We have plenty of time to stabilize HEAD for v16 :)

>> 3. Is there a viable option of disabling the feature by default?
>
> Not one that usefully addresses my concerns. The patch did add an
> enable_group_by_reordering GUC which we could change to default-off, but
> it does nothing about the cost_sort behavioral changes. I would be a
> little inclined to rip out that GUC in either case, because I doubt that
> we need it with the more restricted change.

Understood. I'll wait for your analysis of reverting only the cost_sort changes etc. mentioned above.
Thanks,

Jonathan
Re: Question: test "aggregates" failed in 32-bit machine
"Jonathan S. Katz" writes:
> On 10/1/22 3:13 PM, Tom Lane wrote:
>> I'm still of the opinion that we need to revert this code for now.

> [RMT hat, but speaking just for me] reading through Tom's analysis, this
> seems to be the safest path forward. I have a few questions to better
> understand:

> 1. How invasive would the revert be?

I've just finished constructing a draft full-reversion patch. I'm not confident in this yet; in particular, teasing it apart from 1349d2790 ("Improve performance of ORDER BY / DISTINCT aggregates") was fairly messy. I need to look through the regression test changes and make sure that none are surprising. But this is approximately the right scope if we rip it out entirely.

I plan to have a look tomorrow at the idea of reverting only the cost_sort changes, and rewriting get_cheapest_group_keys_order() to just sort the keys by decreasing numgroups estimates as I suggested upthread. That might be substantially less messy, because of fewer interactions with 1349d2790.

> 2. Are there other user-visible items that would be impacted?

See above. (But note that 1349d2790 is HEAD-only, not in v15.)

> 3. Is there a viable option of disabling the feature by default?

Not one that usefully addresses my concerns. The patch did add an enable_group_by_reordering GUC which we could change to default-off, but it does nothing about the cost_sort behavioral changes. I would be a little inclined to rip out that GUC in either case, because I doubt that we need it with the more restricted change.
regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2e4e82a94f..cc9e39c4a5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2862,13 +2862,16 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-          QUERY PLAN
-------------------------------
- Foreign Scan
+           QUERY PLAN
+--------------------------------
+ Sort
   Output: (count(c2)), c2, 5, 7.0, 9
-   Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
-(4 rows)
+   Sort Key: ft1.c2
+   ->  Foreign Scan
+         Output: (count(c2)), c2, 5, 7.0, 9
+         Relations: Aggregate on (public.ft1)
+         Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5
+(7 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
  w  | x | y |  z
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8848bc774..d750290f13 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5062,20 +5062,6 @@ ANY num_sync (
       </para>
      </listitem>
     </varlistentry>
-     <varlistentry id="guc-enable-group-by-reordering" xreflabel="enable_group_by_reordering">
-      <term><varname>enable_group_by_reordering</varname> (<type>boolean</type>)
-      <indexterm>
-       <primary><varname>enable_group_by_reordering</varname> configuration parameter</primary>
-      </indexterm>
-      </term>
-      <listitem>
-       <para>
-        Enables or disables reordering of keys in a <literal>GROUP BY</literal>
-        clause. The default is <literal>on</literal>.
-       </para>
-      </listitem>
-     </varlistentry>
-
     <varlistentry id="guc-enable-hashagg" xreflabel="enable_hashagg">
      <term><varname>enable_hashagg</varname> (<type>boolean</type>)
      <indexterm>
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f486d42441..5ef29eea69 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1814,327 +1814,6 @@ cost_recursive_union(Path *runion, Path *nrterm, Path *rterm)
 						 rterm->pathtarget->width);
 }
 
-/*
- * is_fake_var
- *		Workaround for generate_append_tlist() which generates fake Vars with
- *		varno == 0, that will cause a fail of estimate_num_group() call
- *
- * XXX Ummm, why would estimate_num_group fail with this?
- */
-static bool
-is_fake_var(Expr *expr)
-{
-	if (IsA(expr, RelabelType))
-		expr = (Expr *) ((RelabelType *) expr)->arg;
-
-	return (IsA(expr, Var) && ((Var *) expr)->varno == 0);
-}
-
-/*
- * get_width_cost_multiplier
- *		Returns relative complexity of comparing two values based on its width.
- * The idea behind is that the comparison becomes more expensive the longer the
- * value is. Return value is in cpu_operator_cost units.
- */
-static double
-get_width_cost_multiplier(PlannerInfo *root, Expr *expr)
-{
-	double		width = -1.0;	/* fake value */
-
-	if (IsA(expr, RelabelType))
-		expr = (Expr *) ((RelabelType *) expr)->arg;
-
-	/* Try to find actual stat in corresponding relation */
-	if (IsA(expr, Var))
-	{
-		Var		   *var = (Var *) expr;
-
-		if (var->varno > 0 && var->varno < root->simple_rel_array
Re: Question: test "aggregates" failed in 32-bit machine
On 10/1/22 3:13 PM, Tom Lane wrote:
> I'm still of the opinion that we need to revert this code for now.

[RMT hat, but speaking just for me] reading through Tom's analysis, this seems to be the safest path forward. I have a few questions to better understand:

1. How invasive would the revert be?

2. Are there other user-visible items that would be impacted?

3. Is there a viable option of disabling the feature by default?

Thanks,

Jonathan
Re: Question: test "aggregates" failed in 32-bit machine
Peter Geoghegan writes: > Reminds me of the other sort testing program that you wrote when the > B&M code first went in: > https://www.postgresql.org/message-id/18732.1142967...@sss.pgh.pa.us Ha, I'd totally forgotten about that ... regards, tom lane
Re: Question: test "aggregates" failed in 32-bit machine
On Sat, Oct 1, 2022 at 12:14 PM Tom Lane wrote: > I spent some time today looking into the question of what our qsort > code actually does. I wrote a quick-n-dirty little test module > (attached) to measure the number of comparisons qsort really uses > for assorted sample inputs. Reminds me of the other sort testing program that you wrote when the B&M code first went in: https://www.postgresql.org/message-id/18732.1142967...@sss.pgh.pa.us This was notable for recreating the tests from the original B&M paper. The paper uses various types of test inputs with characteristics that were challenging to the implementation and worth specifically getting right. For example, "saw tooth" input. -- Peter Geoghegan
Re: Question: test "aggregates" failed in 32-bit machine
I wrote:
> So at this point I've lost all faith in the estimates being meaningful
> at all.

I spent some time today looking into the question of what our qsort code actually does. I wrote a quick-n-dirty little test module (attached) to measure the number of comparisons qsort really uses for assorted sample inputs. The results will move a bit from run to run because of randomization, but the average counts should be pretty stable I think. I got results like these:

regression=# create temp table data as select * from qsort_comparisons(100000);
SELECT 10
regression=# select n * log(groups)/log(2) as est, 100*(n * log(groups)/log(2) - avg_cmps)/avg_cmps as pct_err, * from data;
        est         |       pct_err       |   n    | groups | avg_cmps | min_cmps | max_cmps |          note
--------------------+---------------------+--------+--------+----------+----------+----------+------------------------
                  0 |                -100 | 100000 |      1 |    99999 |    99999 |    99999 | all values the same
 1660964.0474436812 |  -5.419880052975057 | 100000 | 100000 |  1756145 |  1722569 |  1835627 | all values distinct
             100000 |  -33.33911061041376 | 100000 |      2 |   150013 |   150008 |   150024 | 2 distinct values
             400000 |  11.075628618635713 | 100000 |     16 |   360115 |   337586 |   431376 | 16 distinct values
             600000 |   8.369757612975473 | 100000 |     64 |   553660 |   523858 |   639492 | 64 distinct values
             800000 |   4.770461016221087 | 100000 |    256 |   763574 |   733898 |   844450 | 256 distinct values
            1000000 |  1.5540821186618827 | 100000 |   1024 |   984697 |   953830 |      384 | 1024 distinct values
 1457116.0087927429 |   41.97897366170798 | 100000 |  24342 |  1026290 |   994694 |  1089503 | Zipfian, parameter 1.1
 1150828.9986140348 |  158.28880094758154 | 100000 |   2913 |   445559 |   426575 |   511214 | Zipfian, parameter 1.5
  578135.9713524659 |   327.6090378488971 | 100000 |     55 |   135202 |   132541 |   213467 | Zipfian, parameter 3.0
(10 rows)

So "N * log(NumberOfGroups)" is a pretty solid estimate for uniformly-sized groups ... except when NumberOfGroups = 1 ... but it is a significant overestimate if the groups aren't uniformly sized.
Now a factor of 2X or 3X isn't awful --- we're very happy to accept estimates only that good in other contexts --- but I still wonder whether this is reliable enough to justify the calculations being done in compute_cpu_sort_cost. I'm still very afraid that the conclusions we're drawing about the sort costs for different column orders are mostly junk.

In any case, something's got to be done about the failure at NumberOfGroups = 1. Instead of this:

    correctedNGroups = Max(1.0, ceil(correctedNGroups));
    per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);

I suggest

    if (correctedNGroups > 1.0)
        per_tuple_cost += totalFuncCost * LOG2(correctedNGroups);
    else
        /* Sorting N all-alike tuples takes only N-1 comparisons */
        per_tuple_cost += totalFuncCost;

(Note that the ceil() here is a complete waste, because all paths leading to this produced integral estimates already. Even if they didn't, I see no good argument why ceil() makes the result better.)

I'm still of the opinion that we need to revert this code for now.

regards, tom lane

/*
create function qsort_comparisons(N int) returns table
  (N int, groups int, avg_cmps int8, min_cmps int8, max_cmps int8, note text)
strict volatile language c as '/path/to/count_comparisons.so';

select * from qsort_comparisons(100000);
*/
#include "postgres.h"

#include <limits.h>

#include "common/pg_prng.h"
#include "fmgr.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "utils/tuplestore.h"

PG_MODULE_MAGIC;

/*
 * qsort_arg comparator function for integers.
 * The number of calls is tracked in *arg.
 */
static int
cmp_func(const void *a, const void *b, void *arg)
{
	int			aa = *(const int *) a;
	int			bb = *(const int *) b;
	size_t	   *count_ptr = (size_t *) arg;

	(*count_ptr)++;
	if (aa < bb)
		return -1;
	if (aa > bb)
		return 1;
	return 0;
}

/*
 * Randomly shuffle an array of integers.
 */
static void
shuffle(int *x, int n)
{
	/*
	 * Shuffle array using Fisher-Yates algorithm. Iterate array and swap head
	 * (i) with a randomly chosen same-or-later item (j) at each iteration.
	 */
	for (int i = 0; i < n; i++)
	{
		int			j = (int) pg_prng_uint64_range(&pg_global_prng_state, i, n - 1);
		int			tmp = x[i];

		x[i] = x[j];
		x[j] = tmp;
	}
}

/*
 * Test qsort for the given test case.
 * Shuffle the given array, sort it using qsort_arg,
 * and report the numbers of comparisons made.
 */
static void
qsort_test_case(int *array, int n, char *note, int trials,
				Tuplestorestate *tupstore, AttInMetadata *attinmeta)
{
	size_t		tot_count = 0;
	size_t		min_count = SIZE_MAX;
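The shape of Tom's measurements is easy to reproduce without building the C module. The sketch below uses a simple randomized three-way (Dutch-flag) quicksort — not the Bentley & McIlroy implementation PostgreSQL actually uses, so the constants differ — and counts comparator calls for all-equal, few-distinct, and all-distinct inputs:

```python
import math
import random

def qsort3(a, lo, hi, stats):
    """Randomized three-way quicksort; stats[0] counts comparator calls
    (one 3-way comparison against the pivot per element per pass)."""
    while lo < hi:
        pivot = a[random.randrange(lo, hi + 1)]
        lt, i, gt = lo, lo, hi
        while i <= gt:
            stats[0] += 1                  # one comparator call
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[gt], a[i] = a[i], a[gt]
                gt -= 1
            else:
                i += 1
        # Recurse into the smaller side, loop on the larger: O(log n) depth.
        if (lt - 1) - lo < hi - (gt + 1):
            qsort3(a, lo, lt - 1, stats)
            lo = gt + 1
        else:
            qsort3(a, gt + 1, hi, stats)
            hi = lt - 1

def count_comparisons(values):
    stats = [0]
    a = list(values)
    random.shuffle(a)
    qsort3(a, 0, len(a) - 1, stats)
    assert a == sorted(values)             # sanity: it really sorts
    return stats[0]

random.seed(1)
n = 20000
cmps_same = count_comparisons([7] * n)                   # 1 group
cmps_16 = count_comparisons([i % 16 for i in range(n)])  # 16 groups
cmps_distinct = count_comparisons(range(n))              # n groups
```

With this counting scheme the all-equal input costs exactly n comparisons (a single partitioning pass), in line with the N-1 behavior Tom cites, while the grouped inputs track N * log2(NumberOfGroups) only to within a constant factor — which is the point about how rough the estimate is.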
Re: Question: test "aggregates" failed in 32-bit machine
I wrote:
> Given the previous complaints about db0d67db2, I wonder if it's not
> most prudent to revert it. I doubt we are going to get satisfactory
> behavior out of it until there's fairly substantial improvements in
> all these underlying estimates.

After spending some more time looking at the code, I think that that is something we absolutely have to discuss. I already complained at [1] about how db0d67db2 made very significant changes in sort cost estimation behavior, which seem likely to result in significant user-visible plan changes that might or might not be for the better. But I hadn't read any of the code at that point. Now I have, and frankly it's not ready for prime time. Beyond the question of whether we have sufficiently accurate input values, I see these issues in and around compute_cpu_sort_cost():

1. The algorithm is said to be based on Sedgewick & Bentley 2002 [2]. I have the highest regard for those two gentlemen, so I'm quite prepared to believe that their estimate of the number of comparisons used by Quicksort is good. However, the expression given in our comments:

 * log(N! / (X1! * X2! * ..)) ~ sum(Xi * log(N/Xi))

doesn't look much like anything they wrote. More, what we're actually doing is

 * We assume all Xi the same because now we don't have any estimation of
 * group sizes, we have only know the estimate of number of groups (distinct
 * values). In that case, formula becomes:
 *
 *	N * log(NumberOfGroups)

That's a pretty drastic simplification. No argument is given as to why that's still reliable enough to be useful for the purposes to which this code tries to put it --- especially when you consider that real-world data is more likely to follow Zipf's law than have uniform group sizes.
If you're going to go as far as doing this:

 * For multi-column sorts we need to estimate the number of comparisons for
 * each individual column - for example with columns (c1, c2, ..., ck) we
 * can estimate that number of comparisons on ck is roughly
 * ncomparisons(c1, c2, ..., ck) / ncomparisons(c1, c2, ..., c(k-1))

you'd better pray that your number-of-comparisons estimates are pretty darn good, or what you're going to get out is going to be mostly fiction.

2. Sedgewick & Bentley analyzed a specific version of Quicksort, which is ... um ... not the version we are using. It doesn't look to me like the choice of partitioning element is the same. Maybe that doesn't matter much in the end, but there's sure no discussion of the point in this patch.

So at this point I've lost all faith in the estimates being meaningful at all. And that's assuming that the simplified algorithm is implemented accurately, which it is not:

3. totalFuncCost is started off at 1.0. Surely that should be zero? If not, at least a comment to justify it would be nice.

4. The code around the add_function_cost call evidently wants to carry the procost lookup result from one column to the next, because it skips the lookup when prev_datatype == em->em_datatype. However, the value of funcCost isn't carried across columns, because it's local to the loop. The effect of this is that anyplace where adjacent GROUP BY columns are of the same datatype, we'll use the fixed 1.0 value of funcCost instead of looking up the real procost. Admittedly, since the real procost is probably also 1.0, this might not mean much in practice. Nonetheless it's broken code. (Oh, btw: I doubt that using add_function_cost rather than raw procost is of any value whatsoever if you're just going to pass it a NULL node tree.)

5. I'm pretty dubious about the idea that we can use the rather-random first element of the EquivalenceClass to determine the datatype that will be compared, much less the average widths of the columns.
It's entirely possible for an EC to contain both int4 and int8 vars, or text vars of substantially different average widths. I think we really need to be going back to the original GroupClauses and looking at the variables named there.

6. Worse than that, we're also using the first element of the EquivalenceClass to calculate the number of groups of this sort key. This is FLAT OUT WRONG, as certainly different EC members can have very different stats.

7. The code considers that presorted-key columns do not add to the comparison costs, yet the comment about it claims the opposite:

	/*
	 * Presorted keys are not considered in the cost above, but we still
	 * do have to compare them in the qsort comparator. So make sure to
	 * factor in the cost in that case.
	 */
	if (i >= nPresortedKeys)
	{

I'm not entirely sure whether the code is broken or the comment is, but at least one of them is. I'm also pretty confused about why we still add such columns' comparison functions to the running totalFuncCost if we think they're not sorted on.

8. In the case complained of to start this thread, we're unable to perceive any sort-cost difference between "p, d
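The Zipf's-law concern can be quantified directly from the formula the patch's comment cites: sum(Xi * log2(N/Xi)) is N times the entropy of the group-size distribution, and it equals the simplified N * log2(NumberOfGroups) only when all groups are the same size. A quick Python check (the Zipf exponent and group count here are chosen arbitrarily for illustration):

```python
import math

def est_comparisons(group_sizes):
    """sum(Xi * log2(N / Xi)) -- the pre-simplification comparison estimate."""
    n = sum(group_sizes)
    return sum(x * math.log2(n / x) for x in group_sizes)

n, g = 100_000, 1024

# Uniform group sizes: the two formulas coincide exactly.
uniform_est = est_comparisons([n / g] * g)

# Zipfian group sizes: size of group i proportional to 1 / i**1.5.
weights = [1.0 / i**1.5 for i in range(1, g + 1)]
zipf_sizes = [w * n / sum(weights) for w in weights]
zipf_est = est_comparisons(zipf_sizes)

simplified = n * math.log2(g)   # the N * log2(NumberOfGroups) shortcut
```

uniform_est equals simplified, while zipf_est falls far below it: N * log2(G) systematically overestimates the work for skewed data, consistent with the large errors Tom measured on Zipfian inputs.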
Re: Question: test "aggregates" failed in 32-bit machine
I wrote:
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.

I added some debug printouts to get_cheapest_group_keys_order() and verified that in the two problematic queries, there are two different orderings that have (on my machine) exactly equal lowest cost. So the code picks the first of those and ignores the second. Different roundoff error would be enough to make it do something else.

I find this problematic because "exactly equal" costs are not going to be unusual. That's because the values that cost_sort_estimate relies on are, sadly, just about completely fictional. It's expecting that it can get a good cost estimate based on:

* procost. In case you hadn't noticed, this is going to be 1 for just about every function we might be considering here.

* column width. This is either going to be a constant (e.g. 4 for integers) or, again, largely fictional. The logic for converting widths to cost multipliers adds yet another layer of debatability.

* numdistinct estimates. Sometimes we know what we're talking about there, but often we don't.

So what I'm afraid we are dealing with here is usually going to be garbage in, garbage out. And we're expending an awful lot of code and cycles to arrive at these highly questionable choices.

Given the previous complaints about db0d67db2, I wonder if it's not most prudent to revert it. I doubt we are going to get satisfactory behavior out of it until there are fairly substantial improvements in all these underlying estimates.

			regards, tom lane
Re: Question: test "aggregates" failed in 32-bit machine
Hi,

On 2022-09-30 12:13:11 -0400, Tom Lane wrote:
> "kuroda.hay...@fujitsu.com" writes:
> > Hmm, I was not sure about additional conditions, sorry.
> > I could reproduce with the following steps:
>
> I tried this on a 32-bit VM with gcc 11.3, but couldn't reproduce.
> You said earlier
>
> >> OS: RHEL 6.10 server
> >> Arch: i686
> >> Gcc: 4.4.7
>
> That is an awfully old compiler; I fear I no longer have anything
> comparable on a working platform.
>
> The most likely theory, I think, is that that compiler is generating
> slightly different floating-point code causing different plans to
> be costed slightly differently than what the test case is expecting.
> Probably, the different orderings of the keys in this test case have
> exactly the same cost, or almost exactly, so that different roundoff
> error could be enough to change the selected plan.

Yea. I suspect that's because that compiler version doesn't have -fexcess-precision=standard:

> CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Werror=vla -Wendif-labels
> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g
> -O2

It's possible one could work around the issue with -msse -mfpmath=sse instead of -fexcess-precision=standard.

Greetings,

Andres Freund
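For anyone who does need a stable build on such an old gcc (which, if I recall correctly, predates the -fexcess-precision option), the workaround Andres describes would presumably be applied at configure time, something like:

```shell
# Untested sketch: make 32-bit gcc keep floating point in SSE registers
# rather than the 80-bit x87 stack, so intermediate results are rounded
# to double much as -fexcess-precision=standard would arrange.
./configure --enable-cassert --enable-debug \
    CFLAGS="-O2 -g -msse -mfpmath=sse"
```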
Re: Question: test "aggregates" failed in 32-bit machine
"kuroda.hay...@fujitsu.com" writes: > Hmm, I was not sure about additional conditions, sorry. > I could reproduce with followings steps: I tried this on a 32-bit VM with gcc 11.3, but couldn't reproduce. You said earlier >> OS: RHEL 6.10 server >> Arch: i686 >> Gcc: 4.4.7 That is an awfully old compiler; I fear I no longer have anything comparable on a working platform. The most likely theory, I think, is that that compiler is generating slightly different floating-point code causing different plans to be costed slightly differently than what the test case is expecting. Probably, the different orderings of the keys in this test case have exactly the same cost, or almost exactly, so that different roundoff error could be enough to change the selected plan. This probably doesn't have a lot of real-world impact, but it's still annoying on a couple of grounds. Failing regression isn't nice, and also this suggests that db0d67db2 is causing us to waste time considering multiple plans with effectively equal costs. Maybe that code needs to filter a little harder. regards, tom lane
RE: Question: test "aggregates" failed in 32-bit machine
Dear Tom,

> Hmm, we're not seeing any such failures in the buildfarm's 32-bit
> animals, so there must be some additional condition needed to make
> it happen. Can you be more specific?

Hmm, I was not sure about additional conditions, sorry.
I could reproduce with the following steps:

$ git clone https://github.com/postgres/postgres.git
$ cd postgres
$ ./configure --enable-cassert --enable-debug
$ make -j2
$ make check LANG=C
-> aggregates ... FAILED 3562 ms

The hypervisor of the virtual machine is "VMware vSphere 7.0".
I also collected some more information about the machine. Could you find something?

```
pg_config]$ ./pg_config
...
CONFIGURE = '--enable-cassert' '--enable-debug'
CC = gcc -std=gnu99
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2
CFLAGS_SL = -fPIC
LDFLAGS = -Wl,--as-needed -Wl,-rpath,'/usr/local/pgsql/lib',--enable-new-dtags
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lz -lreadline -lrt -ldl -lm
VERSION = PostgreSQL 16devel

$ locale
LANG=C
...

$ arch
i686

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
stepping        : 7
microcode       : 83898371
cpu MHz         : 2394.374
cache size      : 36608 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss nx pdpe1gb rdtscp lm constant_tsc arch_perfmon xtopology tsc_reliable nonstop_tsc unfair_spinlock eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch arat xsaveopt ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 invpcid avx512f rdseed adx avx512cd md_clear flush_l1d arch_capabilities
bogomips        : 4788.74
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz
stepping        : 7
microcode       : 83898371
cpu MHz         : 2394.374
cache size      : 36608 KB
physical id     : 2
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 2
initial apicid  : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss nx pdpe1gb rdtscp lm constant_tsc arch_perfmon xtopology tsc_reliable nonstop_tsc unfair_spinlock eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch arat xsaveopt ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 invpcid avx512f rdseed adx avx512cd md_clear flush_l1d arch_capabilities
bogomips        : 4788.74
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management:
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Re: Question: test "aggregates" failed in 32-bit machine
"kuroda.hay...@fujitsu.com" writes: > While running `make check LANC=C` with 32-bit virtual machine, > I found that it was failed at "aggregates". Hmm, we're not seeing any such failures in the buildfarm's 32-bit animals, so there must be some additional condition needed to make it happen. Can you be more specific? regards, tom lane
Question: test "aggregates" failed in 32-bit machine
Hi hackers,

While running `make check LANG=C` with a 32-bit virtual machine, I found that it failed at "aggregates". PSA the a1b3bca1_regression.diffs.

IIUC that part has been added by db0d67db. I checked out the source, tested, and got the same result. PSA the db0d67db_regression.diffs.

I'm not sure about it, but is it an expected behavior? I know that we do not have to consider "row" ordering.

The following shows the environment. Please tell me if any other information is needed.

OS: RHEL 6.10 server
Arch: i686
Gcc: 4.4.7

$ uname -a
Linux VMX 2.6.32-754.41.2.el6.i686 #1 SMP Sat Jul 10 04:21:20 EDT 2021 i686 i686 i386 GNU/Linux

Configure option: --enable-cassert --enable-debug --enable-tap-tests

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachments:
a1b3bca1_regression.diffs
db0d67db_regression.diffs