On Wed, Apr 24, 2024 at 4:46 PM Melanie Plageman
<melanieplage...@gmail.com> wrote:
>
> On Tue, Apr 23, 2024 at 6:43 PM Tomas Vondra
> <tomas.von...@enterprisedb.com> wrote:
> >
> > On 4/23/24 18:05, Melanie Plageman wrote:
> > > One other note: there is some concurrency effect in the parallel
> > > schedule group containing "join" where you won't trip the assert if
> > > all the tests in that group in the parallel schedule are run. But, if
> > > you would like to verify that the test exercises the correct code,
> > > just reduce the group containing "join".
> > >
> >
> > That is ... interesting. Doesn't that mean that most test runs won't
> > actually detect the problem? That would make the test a bit useless.
>
> Yes, I should really have thought about it more. After further
> investigation, I found that the reason it doesn't trip the assert when
> the join test is run concurrently with other tests is that the SELECT
> query doesn't use the skip fetch optimization because the VACUUM
> doesn't set the pages all visible in the VM. In this case, it's
> because the tuples' xmins are not before VacuumCutoffs->OldestXmin
> (which is derived from GetOldestNonRemovableTransactionId()).
>
> After thinking about it more, I suppose we can't add a test that
> relies on the relation being all visible in the VM in a group in the
> parallel schedule. I'm not sure this edge case is important enough to
> merit its own group or an isolation test. What do you think?

Andres rightly pointed out to me off-list that if I just used a temp
table, the table would only be visible to the testing backend anyway.
I've done that in the attached v2. Now the test is deterministic.

- Melanie
From c5f28d12f75f5af84ede0db563fc5c0b53295c65 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplage...@gmail.com>
Date: Thu, 25 Apr 2024 18:50:14 -0400
Subject: [PATCH v2] BitmapHeapScan: Remove incorrect assert

04e72ed617be pushed the skip fetch optimization (allowing bitmap heap
scans to operate like index-only scans if none of the underlying data is
needed) into heap AM-specific bitmap heap scan code.

04e72ed617be added an assert that all tuples in blocks eligible for the
optimization had been NULL-filled and emitted by the end of the scan.
This assert is incorrect when not all tuples need be scanned to execute
the query; for example: a join in which not all inner tuples need to be
scanned before skipping to the next outer tuple.

Author: Melanie Plageman
Reviewed-by: Richard Guo, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs48orzZVXa7-vP9Nt7vQWLTE04Qy4PePaLQYsVNQgo6qRg%40mail.gmail.com
---
 src/backend/access/heap/heapam.c   |  4 ++--
 src/test/regress/expected/join.out | 37 ++++++++++++++++++++++++++++++
 src/test/regress/sql/join.sql      | 25 ++++++++++++++++++++
 3 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 4a4cf76269d..8906f161320 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1184,7 +1184,7 @@ heap_rescan(TableScanDesc sscan, ScanKey key, bool set_params,
 		scan->rs_vmbuffer = InvalidBuffer;
 	}
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * The read stream is reset on rescan. This must be done before
@@ -1216,7 +1216,7 @@ heap_endscan(TableScanDesc sscan)
 	if (BufferIsValid(scan->rs_vmbuffer))
 		ReleaseBuffer(scan->rs_vmbuffer);
 
-	Assert(scan->rs_empty_tuples_pending == 0);
+	scan->rs_empty_tuples_pending = 0;
 
 	/*
 	 * Must free the read stream before freeing the BufferAccessStrategy.
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 8b640c2fc2f..4f0292c7285 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -8956,3 +8956,40 @@ where exists (select 1 from j3
 (13 rows)
 
 drop table j3;
+-- Check the case when:
+--   1. A join doesn't require all inner tuples to be scanned for each outer
+--      tuple, and
+--   2. The inner side is scanned using a bitmap heap scan, and
+--   3. The bitmap heap scan is eligible for the "skip fetch" optimization.
+--      This optimization is usable when no data from the underlying table is
+--      needed. Use a temp table so it is only visible to this backend and
+--      vacuum may reliably mark all blocks in the table all visible in the
+--      visibility map.
+CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10);
+INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i;
+CREATE INDEX ON skip_fetch(a);
+VACUUM (ANALYZE) skip_fetch;
+SET enable_indexonlyscan = off;
+set enable_indexscan = off;
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF)
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Nested Loop Anti Join
+   ->  Seq Scan on skip_fetch t1
+   ->  Materialize
+         ->  Bitmap Heap Scan on skip_fetch t2
+               Recheck Cond: (a = 1)
+               ->  Bitmap Index Scan on skip_fetch_a_idx
+                     Index Cond: (a = 1)
+(7 rows)
+
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+ a 
+---
+(0 rows)
+
+RESET enable_indexonlyscan;
+RESET enable_indexscan;
+RESET enable_seqscan;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index c4c6c7b8ba2..25743ec972a 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -3374,3 +3374,28 @@ where exists (select 1 from j3
       and t1.unique1 < 1;
 
 drop table j3;
+
+-- Check the case when:
+--   1. A join doesn't require all inner tuples to be scanned for each outer
+--      tuple, and
+--   2. The inner side is scanned using a bitmap heap scan, and
+--   3. The bitmap heap scan is eligible for the "skip fetch" optimization.
+--      This optimization is usable when no data from the underlying table is
+--      needed. Use a temp table so it is only visible to this backend and
+--      vacuum may reliably mark all blocks in the table all visible in the
+--      visibility map.
+CREATE TEMP TABLE skip_fetch (a INT, b INT) WITH (fillfactor=10);
+INSERT INTO skip_fetch SELECT i % 3, i FROM generate_series(0,30) i;
+CREATE INDEX ON skip_fetch(a);
+VACUUM (ANALYZE) skip_fetch;
+
+SET enable_indexonlyscan = off;
+set enable_indexscan = off;
+SET enable_seqscan = off;
+EXPLAIN (COSTS OFF)
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+SELECT t1.a FROM skip_fetch t1 LEFT JOIN skip_fetch t2 ON t2.a = 1 WHERE t2.a IS NULL;
+
+RESET enable_indexonlyscan;
+RESET enable_indexscan;
+RESET enable_seqscan;
-- 
2.40.1

Reply via email to