Re: master check fails on Windows Server 2008
On 21-02-2018 18:51, Tom Lane wrote: Marina Polyakovawrites: On 20-02-2018 21:23, Tom Lane wrote: I continue to wonder if it's not better to just remove the option and thereby simplify our lives. What's the actual value of having it anymore? I agree with you, but I have too little experience to vote for removing this option. I've started a separate thread to propose removal of the option, at https://postgr.es/m/10862.1519228...@sss.pgh.pa.us Thank you! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
Marina Polyakovawrites: > On 20-02-2018 21:23, Tom Lane wrote: >> I continue to wonder if it's not better to just remove >> the option and thereby simplify our lives. What's the actual value of >> having it anymore? > I agree with you, but I have too little experience to vote for removing > this option. I've started a separate thread to propose removal of the option, at https://postgr.es/m/10862.1519228...@sss.pgh.pa.us regards, tom lane
Re: master check fails on Windows Server 2008
On 20-02-2018 21:23, Tom Lane wrote: Marina Polyakovawrites: On 20-02-2018 3:37, Tom Lane wrote: 4. Try to tweak the stats_ext.sql test conditions in some more refined way to get the test to pass everywhere. This'd be a lot of work with no guarantee of success, so I'm not too excited about it. Thank you for your explanations! I'll try to do something in this direction.. OK. The least painful fix might be to establish a different work_mem setting just for that one query. However, if you're intent on putting work into continued support of --disable-float8-byval, I would *strongly* suggest setting up a buildfarm member that runs that way, because otherwise we're pretty much guaranteed to break it again. Oh, thank you again! I continue to wonder if it's not better to just remove the option and thereby simplify our lives. What's the actual value of having it anymore? I agree with you, but I have too little experience to vote for removing this option. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
Marina Polyakovawrites: > On 20-02-2018 3:37, Tom Lane wrote: >> 4. Try to tweak the stats_ext.sql test conditions in some more refined >> way to get the test to pass everywhere. This'd be a lot of work with >> no guarantee of success, so I'm not too excited about it. > Thank you for your explanations! I'll try to do something in this > direction.. OK. The least painful fix might be to establish a different work_mem setting just for that one query. However, if you're intent on putting work into continued support of --disable-float8-byval, I would *strongly* suggest setting up a buildfarm member that runs that way, because otherwise we're pretty much guaranteed to break it again. I continue to wonder if it's not better to just remove the option and thereby simplify our lives. What's the actual value of having it anymore? regards, tom lane
Re: master check fails on Windows Server 2008
On 20-02-2018 3:37, Tom Lane wrote: Ah-hah. I can reproduce the described failure if I configure with --disable-float8-byval on an otherwise 64-bit machine. It appears that the minimum work_mem setting that will allow this query to use a hashagg plan on such a configuration is about 155kB (which squares with the results you show). On the other hand, in a normal 64-bit configuration, work_mem settings of 160kB or more cause other failures (due to other plans switching to hashagg), and on a 32-bit machine I see such failures with work_mem of 150kB or more. So there's basically no setting of work_mem that would make all these tests pass everywhere. I see several things we could do about this: 1. Nothing; just say "sorry, we don't promise that the regression tests pass with no plan differences on nonstandard configurations". Given that --disable-float8-byval has hardly any real-world use, there is not a lot of downside to that. 2. Decide that --disable-float8-byval, and for that matter --disable-float4-byval, have no real-world use at all and take them out. There was some point in those options back when we cared about binary compatibility with version-zero C functions, but now I'm not sure why anyone would use them. 3. Drop that one test case from stats_ext.sql; I'm not sure how much additional test value it's actually bringing. 4. Try to tweak the stats_ext.sql test conditions in some more refined way to get the test to pass everywhere. This'd be a lot of work with no guarantee of success, so I'm not too excited about it. Thank you for your explanations! I'll try to do something in this direction.. 5. Something else? regards, tom lane -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
Marina Polyakovawrites: > On 16-02-2018 19:31, Tom Lane wrote: >> Weird. AFAICS the cost estimates for those two plans should be quite >> different, so this isn't just a matter of the estimates maybe being >> a bit platform-dependent. (And that test has been there nearly a >> year without causing reported problems.) > Thank you very much! Your test showed that hash aggregation was not even > added to the possible paths (see windows_regression.diffs attached). > Exploring this, I found that not allowing float8 to pass by value in > config.pl was crucial for the size of the hash table used in this query > (see diff.patch attached): Ah-hah. I can reproduce the described failure if I configure with --disable-float8-byval on an otherwise 64-bit machine. It appears that the minimum work_mem setting that will allow this query to use a hashagg plan on such a configuration is about 155kB (which squares with the results you show). On the other hand, in a normal 64-bit configuration, work_mem settings of 160kB or more cause other failures (due to other plans switching to hashagg), and on a 32-bit machine I see such failures with work_mem of 150kB or more. So there's basically no setting of work_mem that would make all these tests pass everywhere. I see several things we could do about this: 1. Nothing; just say "sorry, we don't promise that the regression tests pass with no plan differences on nonstandard configurations". Given that --disable-float8-byval has hardly any real-world use, there is not a lot of downside to that. 2. Decide that --disable-float8-byval, and for that matter --disable-float4-byval, have no real-world use at all and take them out. There was some point in those options back when we cared about binary compatibility with version-zero C functions, but now I'm not sure why anyone would use them. 3. Drop that one test case from stats_ext.sql; I'm not sure how much additional test value it's actually bringing. 4. Try to tweak the stats_ext.sql test conditions in some more refined way to get the test to pass everywhere. This'd be a lot of work with no guarantee of success, so I'm not too excited about it. 5. Something else? regards, tom lane
Re: master check fails on Windows Server 2008
Marina Polyakovawrites: > Hello, hackers! I got a permanent failure of master (commit > 2a41507dab0f293ff241fe8ae326065998668af8) check on Windows Server 2008. > Regression output and diffs as well as config.pl are attached. Weird. AFAICS the cost estimates for those two plans should be quite different, so this isn't just a matter of the estimates maybe being a bit platform-dependent. (And that test has been there nearly a year without causing reported problems.) To dig into it a bit more, I tweaked the test case to show the costs for both plans, and got an output diff as attached. Could you try the same experiment on your Windows box? In order to force the choice in the other direction, you'd need to temporarily disable enable_sort, not enable_hashagg as I did here, but the principle is the same. regards, tom lane diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql index 46acaad..1082660 100644 *** a/src/test/regress/sql/stats_ext.sql --- b/src/test/regress/sql/stats_ext.sql *** EXPLAIN (COSTS off) *** 177,184 EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, b, c, d; ! EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; --- 177,188 EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, b, c, d; ! EXPLAIN --(COSTS off) ! SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; ! set enable_hashagg = 0; ! EXPLAIN --(COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; + reset enable_hashagg; EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; *** /home/postgres/pgsql/src/test/regress/expected/stats_ext.out Mon Feb 12 14:53:46 2018 --- /home/postgres/pgsql/src/test/regress/results/stats_ext.out Fri Feb 16 11:23:11 2018 *** *** 309,323 -> Seq Scan on ndistinct (5 rows) ! EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; ! QUERY PLAN ! - ! HashAggregate Group Key: b, c, d !-> Seq Scan on ndistinct (3 rows) EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; QUERY PLAN --- 309,336 -> Seq Scan on ndistinct (5 rows) ! EXPLAIN --(COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; ! QUERY PLAN ! -- ! HashAggregate (cost=291.00..307.32 rows=1632 width=20) Group Key: b, c, d !-> Seq Scan on ndistinct (cost=0.00..191.00 rows=1 width=12) (3 rows) + set enable_hashagg = 0; + EXPLAIN --(COSTS off) + SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; + QUERY PLAN + + GroupAggregate (cost=1026.89..1168.21 rows=1632 width=20) +Group Key: b, c, d +-> Sort (cost=1026.89..1051.89 rows=1 width=12) + Sort Key: b, c, d + -> Seq Scan on ndistinct (cost=0.00..191.00 rows=1 width=12) + (5 rows) + + reset enable_hashagg; EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; QUERY PLAN ==
master check fails on Windows Server 2008
Hello, hackers! I got a permanent failure of master (commit 2a41507dab0f293ff241fe8ae326065998668af8) check on Windows Server 2008. Regression output and diffs as well as config.pl are attached. I used the following commands: build.bat > build.txt vcregress.bat check > check.txt Binary search has shown that this failure begins with commit bed9ef5a16239d91d97a1fa2efd9309c3cbbc4b2 (Rework the stats_ext test). On the previous commit (70ec3f1f8f0b753c38a1a582280a02930d7cac5f) the check passes. I'm trying to figure out what went wrong, and any suspicions are welcome. About the system: Windows Server 2008 R2 Standard, Service Pack 1, 64-bit. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company*** C:/Users/buildfarm/mpolyakova/postgrespro_master/src/test/regress/expected/stats_ext.out Fri Feb 16 12:56:00 2018 --- C:/Users/buildfarm/mpolyakova/postgrespro_master/src/test/regress/results/stats_ext.out Fri Feb 16 13:09:47 2018 *** *** 312,322 EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; QUERY PLAN ! - ! HashAggregate Group Key: b, c, d -> Seq Scan on ndistinct ! (3 rows) EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; --- 312,324 EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d; QUERY PLAN ! --- ! GroupAggregate Group Key: b, c, d +-> Sort + Sort Key: b, c, d -> Seq Scan on ndistinct ! (5 rows) EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d; == test tablespace ... ok parallel group (20 tests): boolean char name varchar text int2 int4 txid int8 bit float8 float4 oid pg_lsn regproc rangetypes enum money uuid numeric boolean ... ok char ... ok name ... ok varchar ... ok text ... ok int2 ... ok int4 ... ok int8 ... ok oid ... ok float4 ... ok float8 ... ok bit ... ok numeric ... ok txid ... ok uuid ... ok enum ... ok money... ok rangetypes ... ok pg_lsn ... ok regproc ... ok test strings ... ok test numerology ... ok parallel group (20 tests): lseg box path line point tstypes reltime circle interval date time timetz macaddr8 tinterval abstime macaddr inet timestamptz timestamp polygon point... ok lseg ... ok line ... ok box ... ok path ... ok polygon ... ok circle ... ok date ... ok time ... ok timetz ... ok timestamp... ok timestamptz ... ok interval ... ok abstime ... ok reltime ... ok tinterval... ok inet ... ok macaddr ... ok macaddr8 ... ok tstypes ... ok parallel group (9 tests): geometry horology expressions oidjoins misc_sanity type_sanity comments regex opr_sanity geometry ... ok horology ... ok regex... ok oidjoins ... ok type_sanity ... ok opr_sanity ... ok misc_sanity ... ok comments ... ok expressions ... ok test insert ... ok test insert_conflict ... ok test create_function_1... ok test create_type ... ok test create_table ... ok test create_function_2... ok parallel group (3 tests): copy copydml copyselect copy ... ok copyselect ... ok copydml ... ok parallel group (3 tests): create_misc create_operator create_procedure create_misc ... ok create_operator ... ok create_procedure ... ok parallel group (2 tests): create_view create_index create_index ... ok create_view ... ok parallel group (15 tests): create_aggregate create_function_3 create_cast roleattributes create_am typed_table hash_func vacuum drop_if_exists constraints