After using patch, the result as below : QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1078.00..26630101.20 rows=1 width=27) (actual time=160571.005..160571.105 rows=0 loops=1) InitPlan 1 (returns $0) -> Limit (cost=78.00..78.00 rows=1 width=8) (actual time=1.065..1.066 rows=1 loops=1) -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual time=1.064..1.065 rows=1 loops=1) Sort Key: part.p_partkey Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on part (cost=0.00..68.00 rows=2000 width=8) (actual time=0.046..0.830 rows=2000 loops=1) -> Gather (cost=1000.00..26630023.20 rows=1 width=27) (actual time=160571.003..160571.102 rows=0 loops=1) Workers Planned: 1 Params Evaluated: $0 Workers Launched: 1 -> Nested Loop Left Join (cost=0.00..26629023.10 rows=1 width=27) (actual time=160549.257..160549.258 rows=0 loops=2) Join Filter: ($0 IS NOT NULL) Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode IS NULL)) Rows Removed by Filter: 1810515312 -> Parallel Seq Scan on lineitem ref_0 (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.010..3.393 rows=30088 loops=2) -> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual time=0.000..2.839 rows=60175 loops=60175) -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 rows=60175 width=27) (actual time=0.008..11.381 rows=60175 loops=2) Planning Time: 0.174 ms Execution Time: 160571.476 ms (20 rows)
tender wang <tndrw...@gmail.com> 于2023年9月5日周二 16:52写道: > Hi all, > > I recently run benchmark[1] on master, but I found performance problem > as below: > > explain analyze select > subq_0.c0 as c0, > subq_0.c1 as c1, > subq_0.c2 as c2 > from > (select > ref_0.l_shipmode as c0, > sample_0.l_orderkey as c1, > sample_0.l_quantity as c2, > ref_0.l_orderkey as c3, > sample_0.l_shipmode as c5, > ref_0.l_shipinstruct as c6 > from > public.lineitem as ref_0 > left join public.lineitem as sample_0 > on ((select p_partkey from public.part order by p_partkey limit > 1) > is not NULL) > where sample_0.l_orderkey is NULL) as subq_0 > where subq_0.c5 is NULL > limit 1; > QUERY PLAN > > > ----------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=78.00..45267050.75 rows=1 width=27) (actual > time=299695.097..299695.099 rows=0 loops=1) > InitPlan 1 (returns $0) > -> Limit (cost=78.00..78.00 rows=1 width=8) (actual > time=0.651..0.652 rows=1 loops=1) > -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual > time=0.650..0.651 rows=1 loops=1) > Sort Key: part.p_partkey > Sort Method: top-N heapsort Memory: 25kB > -> Seq Scan on part (cost=0.00..68.00 rows=2000 > width=8) (actual time=0.013..0.428 rows=2000 loops=1) > -> Nested Loop Left Join (cost=0.00..45266972.75 rows=1 width=27) > (actual time=299695.096..299695.096 rows=0 loops=1) > Join Filter: ($0 IS NOT NULL) > Filter: ((sample_0.l_orderkey IS NULL) AND (sample_0.l_shipmode > IS NULL)) > Rows Removed by Filter: 3621030625 > -> Seq Scan on lineitem ref_0 (cost=0.00..1969.75 rows=60175 > width=11) (actual time=0.026..6.225 rows=60175 loops=1) > -> Materialize (cost=0.00..2270.62 rows=60175 width=27) (actual > time=0.000..2.554 rows=60175 loops=60175) > -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 > rows=60175 width=27) (actual time=0.004..8.169 rows=60175 loops=1) > Planning Time: 0.172 ms > Execution Time: 299695.501 ms > (16 rows) > > After I set enable_material to off, the same query run faster, as below: > set enable_material = off; > explain analyze select > subq_0.c0 as c0, > subq_0.c1 as c1, > subq_0.c2 as c2 > from > (select > ref_0.l_shipmode as c0, > sample_0.l_orderkey as c1, > sample_0.l_quantity as c2, > ref_0.l_orderkey as c3, > sample_0.l_shipmode as c5, > ref_0.l_shipinstruct as c6 > from > public.lineitem as ref_0 > left join public.lineitem as sample_0 > on ((select p_partkey from public.part order by p_partkey limit > 1) > is not NULL) > where sample_0.l_orderkey is NULL) as subq_0 > where subq_0.c5 is NULL > limit 1; > QUERY > PLAN > > ----------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=1078.00..91026185.57 rows=1 width=27) (actual > time=192669.605..192670.425 rows=0 loops=1) > InitPlan 1 (returns $0) > -> Limit (cost=78.00..78.00 rows=1 width=8) (actual > time=0.662..0.663 rows=1 loops=1) > -> Sort (cost=78.00..83.00 rows=2000 width=8) (actual > time=0.661..0.662 rows=1 loops=1) > Sort Key: part.p_partkey > Sort Method: top-N heapsort Memory: 25kB > -> Seq Scan on part (cost=0.00..68.00 rows=2000 > width=8) (actual time=0.017..0.430 rows=2000 loops=1) > -> Gather (cost=1000.00..91026107.57 rows=1 width=27) (actual > time=192669.604..192670.422 rows=0 loops=1) > Workers Planned: 1 > Params Evaluated: $0 > Workers Launched: 1 > -> Nested Loop Left Join (cost=0.00..91025107.47 rows=1 > width=27) (actual time=192588.143..192588.144 rows=0 loops=2) > Join Filter: ($0 IS NOT NULL) > Filter: ((sample_0.l_orderkey IS NULL) AND > (sample_0.l_shipmode IS NULL)) > Rows Removed by Filter: 1810515312 > -> Parallel Seq Scan on lineitem ref_0 > (cost=0.00..1721.97 rows=35397 width=11) (actual time=0.007..3.797 > rows=30088 loops=2) > -> Seq Scan on lineitem sample_0 (cost=0.00..1969.75 > rows=60175 width=27) (actual time=0.000..2.637 rows=60175 loops=60175) > Planning Time: 0.174 ms > Execution Time: 192670.458 ms > (19 rows) > > I debug the code and find consider_parallel_nestloop() doesn't consider > materialized form of the cheapest inner path. > When enable_material = true, we can see Material path won in first plan, > but Parallel Seq Scan node doesn't add as outer path, which because > in try_partial_nestloop_path() , the cost of nestloop wat computed using > seq scan path not material path. > > [1] include test table schema and data, you can repeat above problem. > > I try fix this problem in attached patch, and I found pg12.12 also had > this issue. Please review my patch, thanks! > > [1] https://github.com/tenderwg/tpch_test >