On 2/2/2024 11:06, Richard Guo wrote:
On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofengli...@gmail.com
<mailto:guofengli...@gmail.com>> wrote:
On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <t...@sss.pgh.pa.us
<mailto:t...@sss.pgh.pa.us>> wrote:
One of the test cases added by this commit has not been very
stable in the buildfarm. Latest example is here:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04>
and I've seen similar failures intermittently on other machines.
I'd suggest building this test atop a table that is more stable
than pg_class. You're just waving a red flag in front of a bull
if you expect stable statistics from that during a regression run.
Nor do I see any particular reason for pg_class to be especially
suited to the test.
Yeah, it's not a good practice to use pg_class in this place. While
looking through the test cases added by this commit, I noticed some
other minor issues that are not great. Such as
* The table 'btg' is inserted with 10000 tuples, which seems a bit
expensive for a test. I don't think we need such a big table to test
what we want.
* I don't see why we need to manipulate GUC max_parallel_workers and
max_parallel_workers_per_gather.
* I think we'd better write the tests with the keywords being all upper
or all lower. A mixed use of upper and lower is not great. Such as in
explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
* Some comments for the test queries are not easy to read.
* For this statement
CREATE INDEX idx_y_x_z ON btg(y,x,w);
I think the index name would cause confusion. It creates an index on
columns y, x and w, but the name indicates an index on y, x and z.
I'd like to write a draft patch for the fixes.
Here is the draft patch that fixes the issues I complained about in
upthread.
I passed through the patch. Looks like it doesn't break anything. Why do
you prefer to use count(*) in EXPLAIN instead of plain targetlist, like
"SELECT x,y,..."?
Also, according to the test mentioned by Tom:
1. I see, PG uses IndexScan on (x,y). So, column x will be already
sorted before the MergeJoin. Why not use Incremental Sort on (x,z,w)
instead of full sort?
2. For memo, IMO, this test shows us the future near perspective of this
feature: It is cheaper to use grouping order (w,z) instead of (z,w).
--
regards,
Andrei Lepikhov
Postgres Professional