On Fri, Feb 2, 2024 at 12:40 PM Andrei Lepikhov <a.lepik...@postgrespro.ru> wrote:
> On 2/2/2024 11:06, Richard Guo wrote: > > > > On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofengli...@gmail.com > > <mailto:guofengli...@gmail.com>> wrote: > > > > On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <t...@sss.pgh.pa.us > > <mailto:t...@sss.pgh.pa.us>> wrote: > > > > One of the test cases added by this commit has not been very > > stable in the buildfarm. Latest example is here: > > > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04 > < > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04 > > > > > > and I've seen similar failures intermittently on other machines. > > > > I'd suggest building this test atop a table that is more stable > > than pg_class. You're just waving a red flag in front of a bull > > if you expect stable statistics from that during a regression > run. > > Nor do I see any particular reason for pg_class to be especially > > suited to the test. > > > > > > Yeah, it's not a good practice to use pg_class in this place. While > > looking through the test cases added by this commit, I noticed some > > other minor issues that are not great. Such as > > > > * The table 'btg' is inserted with 10000 tuples, which seems a bit > > expensive for a test. I don't think we need such a big table to test > > what we want. > > > > * I don't see why we need to manipulate GUC max_parallel_workers and > > max_parallel_workers_per_gather. > > > > * I think we'd better write the tests with the keywords being all > upper > > or all lower. A mixed use of upper and lower is not great. Such as > in > > > > explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w; > > > > * Some comments for the test queries are not easy to read. > > > > * For this statement > > > > CREATE INDEX idx_y_x_z ON btg(y,x,w); > > > > I think the index name would cause confusion. It creates an index on > > columns y, x and w, but the name indicates an index on y, x and z. > > > > I'd like to write a draft patch for the fixes. > > > > > > Here is the draft patch that fixes the issues I complained about in > > upthread. > > I passed through the patch. Looks like it doesn't break anything. Why do > you prefer to use count(*) in EXPLAIN instead of plain targetlist, like > "SELECT x,y,..."? Nothing special. Just making the test cases consistent as much as possible. > Also, according to the test mentioned by Tom: > 1. I see, PG uses IndexScan on (x,y). So, column x will be already > sorted before the MergeJoin. Why not use Incremental Sort on (x,z,w) > instead of full sort? I think that's because the planner chooses to use (z, w, x) to perform the mergejoin. I did not delve into the details, but I guess the cost estimation decides this is cheaper. Hi Alexander, What do you think about the revisions for the test cases? Thanks Richard