Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-11-13 Thread Alexander Kuzmenkov

I've pushed the executor part of this, but mostly not the planner part,
because I didn't think the latter was anywhere near ready for prime
time: the regression test changes it was causing were entirely bogus.


Hi Tom,

Thanks for the commit and the explanation. I'll try to address your 
comments if I continue working on the planner part.


--

Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-11-01 Thread Tom Lane
Alexander Kuzmenkov  writes:
> On 04.09.2017 15:17, Alexey Chernyshov wrote:
>> make check-world fails on contrib/postgres_fdw because of Subquery Scan on 
>> ... Probably, query plan was changed.

> Thanks for testing! This is the same problem as the one in 
> 'select_distinct' I mentioned before. I changed the test, the updated 
> patch is attached.

I've pushed the executor part of this, but mostly not the planner part,
because I didn't think the latter was anywhere near ready for prime
time: the regression test changes it was causing were entirely bogus.

You had basically two categories of plan changes showing up in the
regression tests.  One was insertion of Subquery Scan nodes where
they hadn't been before; that was because the use_physical_tlist
change broke the optimization that removed no-op Subquery Scans.
I fixed that by narrowing the use_physical_tlist change to apply
only to BitmapHeapPath nodes, which is the only case where there
would be any benefit anyway.  The remaining plan diffs after making
that change all amounted to replacing regular index-only scan plans
with bitmap scans, which seems to me to be silly on its face: if we
can use an IOS then it is unlikely that a bitmap scan will be better.
Those changes indicate that the cost adjustment you'd inserted in
cost_bitmap_heap_scan was way too optimistic, which is hardly
surprising.  I think a realistic adjustment would have to account
for all or most of these factors:

* Whether the index AM is ever going to return recheck = false.
The planner has no way to know that at present, but since there are
lots of cases where it simply won't ever happen, I think assuming
that it always will is not acceptable.  Perhaps we could extend the
AM API so that we could find out whether recheck would happen always,
never, or sometimes.  (Doing better than "sometimes" might be too hard,
but I think most opclasses are going to be "always" or "never" anyway.)

* Whether the bitmap will become lossy, causing us to have to make
rechecks anyway.  We could probably estimate that pretty well based
on comparing the initial number-of-pages estimate to work_mem.

* Whether the plan will need to fetch heap tuples to make filter-qual
checks.  In principle the planner ought to know that.  In practice,
right now it doesn't bother to figure out whether the qual will be
empty until createplan.c time, because that's rather a significant
amount of work and it's undesirable to expend it for paths we might
not end up using.  We might be able to approximate it better than
we do now, though.  It's a live problem for regular indexscan costing
as well as bitmap scans, IIRC.

* What fraction of the table is actually all-visible.  You'd effectively
hard-wired that at 50%, but it's easy to do better --- the regular
IOS code does

if (indexonly)
pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));

and it would be appropriate to do the same here if we conclude that
the other gating conditions apply.

Without a patch that deals more realistically with these concerns,
I think we're better off not changing the cost estimate at all.

As far as the executor side goes, I made several cosmetic changes
and one not so cosmetic one: I changed the decision about whether
to prefetch so that it looks at whether the potential prefetch
page is all-visible.  This still relies on the same assumption you
had that the recheck flag will stay the same from one page to the
next, but at least it's not assuming that the all-visible state
will stay the same.

I'm going to mark the CF entry closed, but if you feel like working
on the cost estimate business, feel free to submit another patch
for that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-07 Thread Alexey Chernyshov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

One thing I have noticed is a trailing whitespace after "bogus one":

123 +* If we don't have to fetch the tuple, just return a
124 +* bogus one 
125 +*/

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-04 Thread Alexander Kuzmenkov

On 04.09.2017 15:17, Alexey Chernyshov wrote:

make check-world fails on contrib/postgres_fdw because of Subquery Scan on ... 
Probably, query plan was changed.


Hi Alexey,

Thanks for testing! This is the same problem as the one in 
'select_distinct' I mentioned before. I changed the test, the updated 
patch is attached.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c19b3318c7..5f5f65d60c 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2203,22 +2203,23 @@ SELECT t1c1, avg(t1c1 + t2c1) FROM (SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2
 -- join with lateral reference
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
- QUERY PLAN 
-
+QUERY PLAN
+--
  Limit
Output: t1."C 1"
->  Nested Loop
  Output: t1."C 1"
  ->  Index Scan using t1_pkey on "S 1"."T 1" t1
Output: t1."C 1", t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
- ->  HashAggregate
-   Output: t2.c1, t3.c1
-   Group Key: t2.c1, t3.c1
-   ->  Foreign Scan
+ ->  Subquery Scan on q
+   ->  HashAggregate
  Output: t2.c1, t3.c1
- Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
- Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")) AND ((r1.c2 = $1::integer
-(13 rows)
+ Group Key: t2.c1, t3.c1
+ ->  Foreign Scan
+   Output: t2.c1, t3.c1
+   Relations: (public.ft1 t2) INNER JOIN (public.ft2 t3)
+   Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")) AND ((r1.c2 = $1::integer
+(14 rows)
 
 SELECT t1."C 1" FROM "S 1"."T 1" t1, LATERAL (SELECT DISTINCT t2.c1, t3.c1 FROM ft1 t2, ft2 t3 WHERE t2.c1 = t3.c1 AND t2.c2 = t1.c2) q ORDER BY t1."C 1" OFFSET 10 LIMIT 10;
  C 1 
@@ -2672,16 +2673,17 @@ select c2, sum(c1) from ft2 group by c2 having avg(c1) < 500 and sum(c1) < 49800
 -- Unshippable HAVING clause will be evaluated locally, and other qual in HAVING clause is pushed down
 explain (verbose, costs off)
 select count(*) from (select c5, count(c1) from ft1 group by c5, sqrt(c2) having (avg(c1) / avg(c1)) * random() <= 1 and avg(c1) < 500) x;
-   QUERY PLAN
--
+  QUERY PLAN   
+---
  Aggregate
Output: count(*)
-   ->  Foreign Scan
- Output: ft1.c5, NULL::bigint, (sqrt((ft1.c2)::double precision))
- Filter: (avg(ft1.c1)) / (avg(ft1.c1::double precision * random()) <= '1'::double precision)
- Relations: Aggregate on (public.ft1)
- Remote SQL: SELECT c5, NULL::bigint, sqrt(c2), avg("C 1") FROM "S 1"."T 1" GROUP BY c5, (sqrt(c2)) HAVING ((avg("C 1") < 500::numeric))
-(7 rows)
+   ->  Subquery Scan on x
+ ->  Foreign Scan
+   Output: ft1.c5, NULL::bigint, (sqrt((ft1.c2)::double precision))
+   Filter: (avg(ft1.c1)) / (avg(ft1.c1::double precision * random()) <= '1'::double precision)
+   Relations: Aggregate on (public.ft1)
+   Remote SQL: SELECT c5, NULL::bigint, sqrt(c2), avg("C 1") FROM "S 1"."T 1" GROUP BY c5, (sqrt(c2)) HAVING ((avg("C 1") < 500::numeric))
+(8 rows)
 
 select count(*) from (select c5, count(c1) from ft1 group by c5, sqrt(c2) having (avg(c1) / avg(c1)) * random() <

Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-04 Thread Alexey Chernyshov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hi Alexander,

make check-world fails on contrib/postgres_fdw because of Subquery Scan on ... 
Probably, query plan was changed.

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-08-21 Thread Alexander Kumenkov

|Hello everyone,

I made a new patch according to the previous comments. It is simpler 
now, only adding a few checks to the bitmap heap scan node. When the 
target list for the bitmap heap scan is empty, and there is no filter, 
and the bitmap page generated by the index scan is exact, and the 
corresponding heap page is visible to all transaction, we don't fetch it.


The performance is better than with the previous patch, because now it 
can leverage the parallel heap scan logic. A simple benchmark is 
attached: this patch is more than ten times faster on a frequent search 
term, and two times faster on an infrequent one.


Still, there is one thing that is bothering me. I use empty targetlist 
as the marker of that I should not fetch tuples. Because of that, I have 
to make sure that use_physical_tlist() doesn't touch empty tlists. 
Consequently, if count(*) sits on top of a subquery, this subquery has 
to project and cannot be deleted (see trivial_subqueryscan). There is 
such a query in the regression test select_distinct: "select count(*) 
from (select distinct two, four, two from tenk1);". For that particular 
query it shouldn't matter much, so I changed the test, but the broader 
implications of this escape me at the moment.


The cost estimation is very simplified now: I just halve the number of 
pages to be fetched. The most important missing part is checking whether 
we have any quals that are not checked by the index: if we do, we'll 
have to fetch all the tuples. Finding nonindex qualifications is 
somewhat convoluted for the bitmap index scan tree involving multiple 
indexes, so I didn't implement it for now. We could also consider 
estimating the number of lossy pages in the tid bitmap given current 
work_mem size.


I'll be glad to hear your thoughts on this.|
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 79f534e4e9..d7ea6f6929 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -39,6 +39,7 @@
 
 #include "access/relscan.h"
 #include "access/transam.h"
+#include "access/visibilitymap.h"
 #include "executor/execdebug.h"
 #include "executor/nodeBitmapHeapscan.h"
 #include "miscadmin.h"
@@ -225,9 +226,25 @@ BitmapHeapNext(BitmapHeapScanState *node)
 			}
 
 			/*
-			 * Fetch the current heap page and identify candidate tuples.
+			 * If we don't need the tuple contents and are only counting them,
+			 * we can skip fetching the page if the bitmap doesn't need rechecking
+			 * and all tuples on the page are visible to our transaction
 			 */
-			bitgetpage(scan, tbmres);
+			node->bhs_nofetch = !tbmres->recheck
+&& node->ss.ps.qual == NULL
+&& node->ss.ps.plan->targetlist == NIL
+&& VM_ALL_VISIBLE(node->ss.ss_currentRelation, tbmres->blockno,
+  &node->bhs_vmbuffer);
+
+			if (node->bhs_nofetch)
+scan->rs_ntuples = tbmres->ntuples;
+			else
+			{
+/*
+ * Fetch the current heap page and identify candidate tuples.
+ */
+bitgetpage(scan, tbmres);
+			}
 
 			if (tbmres->ntuples >= 0)
 node->exact_pages++;
@@ -289,45 +306,58 @@ BitmapHeapNext(BitmapHeapScanState *node)
 		 */
 		BitmapPrefetch(node, scan);
 
-		/*
-		 * Okay to fetch the tuple
-		 */
-		targoffset = scan->rs_vistuples[scan->rs_cindex];
-		dp = (Page) BufferGetPage(scan->rs_cbuf);
-		lp = PageGetItemId(dp, targoffset);
-		Assert(ItemIdIsNormal(lp));
 
-		scan->rs_ctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lp);
-		scan->rs_ctup.t_len = ItemIdGetLength(lp);
-		scan->rs_ctup.t_tableOid = scan->rs_rd->rd_id;
-		ItemPointerSet(&scan->rs_ctup.t_self, tbmres->blockno, targoffset);
+		if (node->bhs_nofetch)
+		{
+			/*
+			 * If we don't have to fetch the tuple, just return a
+			 * bogus one 
+			 */
+			slot->tts_isempty = false;
+			slot->tts_nvalid = 0;
+		}
+		else
+		{
+			/*
+			 * Okay to fetch the tuple
+			 */
+			targoffset = scan->rs_vistuples[scan->rs_cindex];
+			dp = (Page) BufferGetPage(scan->rs_cbuf);
+			lp = PageGetItemId(dp, targoffset);
+			Assert(ItemIdIsNormal(lp));
 
-		pgstat_count_heap_fetch(scan->rs_rd);
+			scan->rs_ctup.t_data = (HeapTupleHeader) PageGetItem((Page) dp, lp);
+			scan->rs_ctup.t_len = ItemIdGetLength(lp);
+			scan->rs_ctup.t_tableOid = scan->rs_rd->rd_id;
+			ItemPointerSet(&scan->rs_ctup.t_self, tbmres->blockno, targoffset);
 
-		/*
-		 * Set up the result slot to point to this tuple. Note that the slot
-		 * acquires a pin on the buffer.
-		 */
-		ExecStoreTuple(&scan->rs_ctup,
-	   slot,
-	   scan->rs_cbuf,
-	   false);
+			pgstat_count_heap_fetch(scan->rs_rd);
 
-		/*
-		 * If we are using lossy info, we have to recheck the qual conditions
-		 * at every tuple.
-		 */
-		if (tbmres->recheck)
-		{
-			econtext->ecxt_scantuple = slot;
-			ResetExprContext(econtext);
+			/*
+			 * Set up the result slot to point to this tuple. Note that the slot
+			 * acquires a pin on the buffer.
+			 */
+			ExecStoreTuple(&scan-

Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov


On 12.04.2017 12:29, Alexander Korotkov wrote:

That's a cool feature for FTS users! Please, register this patch to 
the next commitfest.
I've added this to the 2017-07 commitfest: 
https://commitfest.postgresql.org/14/1117/



Also, what is planning overhead of this patch? That's worth
testing too.

The planning overhead is about 0.1 - 0.07 ms on the samples from my 
first letter.


Another thing catch my eye.  The estimated number of rows in Bitmap 
Count node is the same as in Bitmap Index Scan node.  Should it be 1 
because it always returns single row?

You're right, I'll fix this in the next version of the patch.


Does this limitation cause a performance drawback?  When bitmap
index scan returns all rechecks, alternative to Bitmap Count is
still Aggregate + Bitmap Heap Scan.  Thus, I think Bitmap Count
would have the same performance or even slightly faster.  That's
worth testing.

Bitmap heap scan can indeed be faster, because it prefetches heap pages, 
and can be run in parallel. When the table data is not cached, the 
difference is not big on my machine. It could probably be significant if 
I used a storage that supported parallel reads. When the data is cached 
in memory, the parallel bitmap heap scan can be significantly faster.
We could use the standard bitmap heap scan node with some tweaks, as 
discussed in the other subthread, to avoid this regression.


Here are some test queries:

--- not cached 
---
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );

  QUERY PLAN
--
 Bitmap Count on pglist  (cost=542.65..1087.68 rows=54503 width=8) 
(actual time=30264.174..30264.177 rows=1 loops=1)

   Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
   Rows Removed by Index Recheck: 270853
   Heap Fetches: 66138
   Heap Blocks: exact=39854 lossy=66138
   ->  Bitmap Index Scan on idx_pglist_fts  (cost=0.00..529.02 
rows=54503 width=0) (actual time=525.341..525.341 rows=222813 loops=1)

 Index Cond: (fts @@ to_tsquery('tom & lane'::text))
 Planning time: 128.238 ms
 Execution time: 30264.299 ms
(9 rows)

test1=# set enable_bitmapcount to off;
SET
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );

QUERY PLAN

 Finalize Aggregate  (cost=119989.73..119989.74 rows=1 width=8) (actual 
time=31699.829..31699.830 rows=1 loops=1)
   ->  Gather  (cost=119989.52..119989.73 rows=2 width=8) (actual 
time=31698.699..31699.819 rows=3 loops=1)

 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=118989.52..118989.53 rows=1 
width=8) (actual time=31689.289..31689.290 rows=1 loops=3)
   ->  Parallel Bitmap Heap Scan on pglist 
(cost=542.65..118932.74 rows=22710 width=0) (actual 
time=608.532..31634.692 rows=74271 loops=3)

 Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
 Rows Removed by Index Recheck: 90284
 Heap Blocks: exact=13242 lossy=21960
 ->  Bitmap Index Scan on idx_pglist_fts 
(cost=0.00..529.02 rows=54503 width=0) (actual time=552.136..552.136 
rows=222813 loops=1)
   Index Cond: (fts @@ to_tsquery('tom & 
lane'::text))

 Planning time: 160.055 ms
 Execution time: 31724.468 ms
(13 rows)


- cached 
-
test1=# explain analyze select count(*) from pglist where fts @@ 
to_tsquery( 'tom & lane' );

QUERY PLAN
--
 Finalize Aggregate  (cost=119989.73..119989.74 rows=1 width=8) (actual 
time=1250.973..1250.973 rows=1 loops=1)
   ->  Gather  (cost=119989.52..119989.73 rows=2 width=8) (actual 
time=1250.588..1250.964 rows=3 loops=1)

 Workers Planned: 2
 Workers Launched: 2
 ->  Partial Aggregate  (cost=118989.52..118989.53 rows=1 
width=8) (actual time=1246.144..1246.144 rows=1 loops=3)
   ->  Parallel Bitmap Heap Scan on pglist 
(cost=542.65..118932.74 rows=22710 width=0) (actual 
time=82.781..1237.585 rows=74271 loops=3)

 Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
 Rows Removed by Index Recheck: 90284
 Heap Blocks: exact=13221 lossy=22217
 ->  Bitmap Index Scan on idx_pglist_fts

Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov

On 12.04.2017 17:24, Tom Lane wrote:

TBH, I'm not sure you need to do any of that work.  Have you got evidence
that the planner will fail to choose the right plan regardless? I'm
particularly unconvinced that choose_bitmap_and is a critical problem,
because once you're into having to AND multiple indexes, you're talking
about an expensive query anyhow.
The most expensive part would probably be accessing the heap in the 
bitmap heap scan. It may be worth trying to choose an index path that 
checks all the restriction and therefore allows us to skip this heap 
access. This path might not be the cheapest one, though. The standard 
AND selection procedure would have discarded it based on cost.
I've seen this happen on the regression database. Somehow I can't seem 
to reproduce it on my earlier full-text search example.


An example:

regression=# explain select count(*) from tenk1 where hundred < 90 and 
thousand < 31;

QUERY PLAN
---
 Bitmap Count on tenk1  (cost=182.69..185.56 rows=1 width=8)
   Recheck Cond: ((thousand < 31) AND (hundred < 90))
   ->  BitmapAnd  (cost=182.69..182.69 rows=287 width=0)
 ->  Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..6.68 
rows=319 width=0)

   Index Cond: (thousand < 31)
 ->  Bitmap Index Scan on tenk1_hundred (cost=0.00..175.62 
rows=8978 width=0)

   Index Cond: (hundred < 90)
(7 rows)

regression=# set enable_bitmapcount to off;
SET
regression=# explain select count(*) from tenk1 where hundred < 90 and 
thousand < 31;

QUERY PLAN
---
 Aggregate  (cost=375.34..375.35 rows=1 width=8)
   ->  Bitmap Heap Scan on tenk1  (cost=6.75..374.62 rows=287 width=0)
 Recheck Cond: (thousand < 31)
 Filter: (hundred < 90)
 ->  Bitmap Index Scan on tenk1_thous_tenthous (cost=0.00..6.68 
rows=319 width=0)

   Index Cond: (thousand < 31)
(6 rows)

--

Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Tom Lane
Alexander Kuzmenkov  writes:
> With planner, the changes are more complex. Two things must be done 
> there. First, when the tlist is empty, we must use a different cost 
> function for bitmap heap scan, because the heap access pattern is 
> different. Second, choose_bitmap_and() must use a different algorithm 
> for choosing the right combination of paths. A standard algorithm 
> chooses the combination based on cost. For count(*) purposes, the 
> decisive factor is that the path has to check all the restrictions, or 
> else we'll need heap access to recheck some of them, which defeats the 
> purpose of having this optimization. The planner code that builds and 
> costs the index path is fairly complex, and integrating this additional 
> behavior into it didn't look good to me.

TBH, I'm not sure you need to do any of that work.  Have you got evidence
that the planner will fail to choose the right plan regardless?  I'm
particularly unconvinced that choose_bitmap_and is a critical problem,
because once you're into having to AND multiple indexes, you're talking
about an expensive query anyhow.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Kuzmenkov

On 12.04.2017 15:04, Tom Lane wrote:

Andrew Gierth  writes:

"Alexander" == Alexander Kuzmenkov  writes:
  Alexander> Structurally, the patch consists of two major parts: a
  Alexander> specialized executor node
Why?
It strikes me that the significant fact here is not that we're doing
count(*), but that we don't need any columns from the bitmap heap scan
result.  Rather than creating a whole new node, can't the existing
bitmap heapscan be taught to skip fetching the actual table page in
cases where it's all-visible, not lossy, and no columns are needed?

+1 ... while I hadn't actually looked at the code, it seemed to me that
anything like the optimization-as-described would be impossibly klugy
from the planner's standpoint.  Your formulation sounds lots nicer.

Detecting that no columns are needed in the executor might be a bit tricky
because of the planner's habit of inserting a "physical tlist" to avoid a
projection step.  (See also nearby discussion about custom scan planning.)
But we could fix that.  I think one rule that would make sense is to
just disable the physical-tlist substitution if the relation's targetlist
is empty --- it wouldn't be buying much in such a case anyhow.  Then the
runtime tlist for the scan node would also be empty, and away you go.
When making an early prototype of this optimization, I did what you are 
describing with the bitmap heap scan executor node. In an internal 
review, it was said that the bitmap heap scan node is already 
complicated enough and shouldn't have more logic added to it, so I 
rewrote it a separate node. To me, your approach looks good too, so if 
the community is generally in favor of this, I could rewrite the 
executor as such.


With planner, the changes are more complex. Two things must be done 
there. First, when the tlist is empty, we must use a different cost 
function for bitmap heap scan, because the heap access pattern is 
different. Second, choose_bitmap_and() must use a different algorithm 
for choosing the right combination of paths. A standard algorithm 
chooses the combination based on cost. For count(*) purposes, the 
decisive factor is that the path has to check all the restrictions, or 
else we'll need heap access to recheck some of them, which defeats the 
purpose of having this optimization. The planner code that builds and 
costs the index path is fairly complex, and integrating this additional 
behavior into it didn't look good to me. Instead, I created a 
specialized path node and isolated the logic that handles it. The 
resulting implementation adds several functions, but it is mostly 
self-contained and has a single entry point in grouping_planner(). It 
handles the specific case of bitmap count plans and doesn't complicate 
the existing code any further.


The planner part is to some extent independent of whether we use a 
separate executor node or not. If we choose not to, the same count(*) 
optimization code I proposed could create plain bitmap heap scan paths.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Tom Lane
Andrew Gierth  writes:
> "Alexander" == Alexander Kuzmenkov  writes:
>  Alexander> Structurally, the patch consists of two major parts: a
>  Alexander> specialized executor node

> Why?

> It strikes me that the significant fact here is not that we're doing
> count(*), but that we don't need any columns from the bitmap heap scan
> result.  Rather than creating a whole new node, can't the existing
> bitmap heapscan be taught to skip fetching the actual table page in
> cases where it's all-visible, not lossy, and no columns are needed?

+1 ... while I hadn't actually looked at the code, it seemed to me that
anything like the optimization-as-described would be impossibly klugy
from the planner's standpoint.  Your formulation sounds lots nicer.

Detecting that no columns are needed in the executor might be a bit tricky
because of the planner's habit of inserting a "physical tlist" to avoid a
projection step.  (See also nearby discussion about custom scan planning.)
But we could fix that.  I think one rule that would make sense is to
just disable the physical-tlist substitution if the relation's targetlist
is empty --- it wouldn't be buying much in such a case anyhow.  Then the
runtime tlist for the scan node would also be empty, and away you go.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Andrew Gierth
> "Alexander" == Alexander Kuzmenkov  writes:

 Alexander> Structurally, the patch consists of two major parts: a
 Alexander> specialized executor node

Why?

It strikes me that the significant fact here is not that we're doing
count(*), but that we don't need any columns from the bitmap heap scan
result.  Rather than creating a whole new node, can't the existing
bitmap heapscan be taught to skip fetching the actual table page in
cases where it's all-visible, not lossy, and no columns are needed?

(this would also have the advantage of getting parallelism for free)

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-12 Thread Alexander Korotkov
On Wed, Apr 12, 2017 at 12:40 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Tue, Apr 11, 2017 at 8:24 PM, Alexander Kuzmenkov <
> a.kuzmen...@postgrespro.ru> wrote:
>
>> I would like to propose a patch that speeds up the queries of the form
>> 'select
>> count(*) ... where ...',  where the restriction clauses can be satisfied
>> by some
>> indexes. At the moment, such queries use index-only scans followed by
>> aggregation. Index-only scans are only possible for indexes that are
>> capable of
>> returning indexed tuples, that is, support the 'amgettuple' access
>> method. They
>> are not applicable to indexes such as GIN and RUM. However, it is
>> possible to
>> improve count(*) performance for indexes that support bitmap scans. Having
>> performed a bitmap index scan or a combination of such, the bits in
>> bitmap can
>> be counted to obtain the final result. Of course, the bitmap pages that
>> are
>> lossy or not visible to all existing transactions will still require heap
>> access.
>>
>
> That's a cool feature for FTS users! Please, register this patch to the
> next commitfest.
>
> This patch has some important limitations:
>> * It only supports targetlist consisting of a single expression that can
>> be
>> projected from count(*).
>> * count(expr) is not supported. We could support it for cases where the
>> "expr is not null" restriction can be satisfied with an index.
>> * The current implementation does not support parallel execution. It
>> could be
>> implemented during the PostgreSQL 11 release cycle.
>> * For some indexes, the bitmap index scan will always require rechecking
>> all
>> the tuples. Bitmap count plans should not be used in such cases. For now,
>> this
>> check is not implemented.
>>
>
> Does this limitation cause a performance drawback?  When bitmap index scan
> returns all rechecks, alternative to Bitmap Count is still Aggregate +
> Bitmap Heap Scan.  Thus, I think Bitmap Count would have the same
> performance or even slightly faster.  That's worth testing.
>
> Also, what is planning overhead of this patch?  That's worth testing too.
>

Another thing catch my eye.  The estimated number of rows in Bitmap Count
node is the same as in Bitmap Index Scan node.  Should it be 1 because it
always returns single row?

test1=# explain analyze select count(*) from pglist where fts @@
> to_tsquery( 'tom & lane' );
>  QUERY
> PLAN
>
> 
>  Bitmap Count on pglist  (cost=550.65..1095.68 rows=54503 width=8) (actual
> time=1120.281..1120.281 rows=1 loops=1)
>Recheck Cond: (fts @@ to_tsquery('tom & lane'::text))
>Heap Fetches: 0
>Heap Blocks: exact=105992
>->  Bitmap Index Scan on idx_pglist_rum_fts  (cost=0.00..537.02
> rows=54503 width=0) (actual time=1056.060..1056.060 rows=222813 loops=1)
>  Index Cond: (fts @@ to_tsquery('tom & lane'::text))
>  Planning time: 119.568 ms
>  Execution time: 1121.409 ms
> (8 rows)


--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-11 Thread Alexander Korotkov
On Tue, Apr 11, 2017 at 8:24 PM, Alexander Kuzmenkov <
a.kuzmen...@postgrespro.ru> wrote:

> I would like to propose a patch that speeds up the queries of the form
> 'select
> count(*) ... where ...',  where the restriction clauses can be satisfied
> by some
> indexes. At the moment, such queries use index-only scans followed by
> aggregation. Index-only scans are only possible for indexes that are
> capable of
> returning indexed tuples, that is, support the 'amgettuple' access method.
> They
> are not applicable to indexes such as GIN and RUM. However, it is possible
> to
> improve count(*) performance for indexes that support bitmap scans. Having
> performed a bitmap index scan or a combination of such, the bits in bitmap
> can
> be counted to obtain the final result. Of course, the bitmap pages that are
> lossy or not visible to all existing transactions will still require heap
> access.
>

That's a cool feature for FTS users! Please, register this patch to the
next commitfest.

This patch has some important limitations:
> * It only supports targetlist consisting of a single expression that can be
> projected from count(*).
> * count(expr) is not supported. We could support it for cases where the
> "expr is not null" restriction can be satisfied with an index.
> * The current implementation does not support parallel execution. It could
> be
> implemented during the PostgreSQL 11 release cycle.
> * For some indexes, the bitmap index scan will always require rechecking
> all
> the tuples. Bitmap count plans should not be used in such cases. For now,
> this
> check is not implemented.
>

Does this limitation cause a performance drawback?  When bitmap index scan
returns all rechecks, alternative to Bitmap Count is still Aggregate +
Bitmap Heap Scan.  Thus, I think Bitmap Count would have the same
performance or even slightly faster.  That's worth testing.

Also, what is planning overhead of this patch?  That's worth testing too.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


[HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-04-11 Thread Alexander Kuzmenkov

Hi,

I would like to propose a patch that speeds up the queries of the form 
'select
count(*) ... where ...',  where the restriction clauses can be satisfied 
by some

indexes. At the moment, such queries use index-only scans followed by
aggregation. Index-only scans are only possible for indexes that are 
capable of
returning indexed tuples, that is, support the 'amgettuple' access 
method. They
are not applicable to indexes such as GIN and RUM. However, it is 
possible to

improve count(*) performance for indexes that support bitmap scans. Having
performed a bitmap index scan or a combination of such, the bits in 
bitmap can

be counted to obtain the final result. Of course, the bitmap pages that are
lossy or not visible to all existing transactions will still require heap
access.

One kind of applications that can benefit from this change is the full-text
search with pagination. To show a search results page, the application 
has to
know the results that go to current page, and the total number of the 
results.
Getting one page is fast, when the desired sorting order can be provided 
by an
index. For example, results can be sorted by date with a separate btree 
index,
or by relevance with RUM index. However, getting the total number of 
results is
more difficult. With text search indexes, it requires a bitmap heap 
scan, which
can be rather slow due to obligatory heap access. A well-known hack for 
this is
using the approximate data from 'explain' results. The proposed change 
allows

the user to obtain the precise number of the results in an efficient and
idiomatic manner.

The performance of this approach was tested on an archive of pgsql-hackers
mailing list. The detailed results for two sample queries can be found 
in the

attached file 'benchmark.txt'. The first test demonstrates full-text search
with RUM index, ordering the results by rank. For a query with low 
selectivity,
getting the top results is much faster than counting them all with a 
bitmap heap
scan. With bitmap count execution plan, the results can be counted much 
faster.
A similar test is done with a GIN index, where the results are 
restricted and
ordered by date using another btree index. Again, it shows a significant 
speedup

for count(*) query for bitmap count scan compared to bitmap heap scan. These
results demonstrate that the bitmap count plan can indeed be useful for 
full-

text search scenarios.

Structurally, the patch consists of two major parts: a specialized executor
node and the generation of corresponding paths and plans. The optimizer 
behavior
here is similar to that of the min/max aggregate optimization. The main 
entry

point is preprocess_count_aggs(); it is called by grouping_planner(). It
searches for count(*) expression in the target list, and, if possible, 
generates

a special path for it (struct BitmapCountPath). This path is then added to
UPPERREL_GROUP_AGG upperrel, and competes with other paths at the 
further stages

of planning.

The executor node (nodeBitmapCount.c) is similar to the bitmap heap scan 
node,
with the main difference being that it does not access heap for tuples 
that are

known to satisfy the restriction and to be visible to all transactions.

This patch has some important limitations:
* It only supports targetlist consisting of a single expression that can be
projected from count(*).
* count(expr) is not supported. We could support it for cases where the
"expr is not null" restriction can be satisfied with an index.
* The current implementation does not support parallel execution. It 
could be

implemented during the PostgreSQL 11 release cycle.
* For some indexes, the bitmap index scan will always require rechecking 
all
the tuples. Bitmap count plans should not be used in such cases. For 
now, this

check is not implemented.

I would be glad to hear your comments on this patch.

Regards,
Alexander Kuzmenkov

===
= The data

test1=# \d pglist
   Table "public.pglist"
   Column   |Type | Collation | Nullable | Default 
+-+---+--+-
 id | integer |   |  | 
 sent   | timestamp without time zone |   |  | 
 subject| text|   |  | 
 author | text|   |  | 
 body_plain | text|   |  | 
 fts| tsvector|   |  | 
Indexes:
"idx_pglist_rum_fts" rum (fts)
"idx_pglist_fts" gin (fts)
"idx_pglist_sent" btree (sent)


test1=# select min(sent), max(sent), count(*) from pglist;
 min | max |  count  
-+-+-
 1997-06-24 11:31:09 | 2016-04-27 23:46:29 | 1013770
(1 row)