[HACKERS] [PATCH] Overestimated filter cost and its mitigation

2017-09-11 Thread Yuto Hayamizu
Hi hackers,

Currently, the cost of a filter with multiple clauses is estimated by
summing up the estimated cost of each clause.  As long as a filter
consists of simple clauses and its cost is fairly small, this works
fine.  However, when the filter contains some heavy clauses (such as a
subquery or a user-defined function) and most tuples are filtered out
by the other, simpler clauses, the total cost is likely to be
overestimated.  The attached patch mitigates this overestimation by
using selectivity estimates of subsets of the clauses in a filter.

Suppose that there are three qual clauses A, B and C in a scan node.
Current PostgreSQL estimates the per-tuple cost of the filter to be:
   cost(A) + cost(B) + cost(C)

The basic idea of the attached patch is to weight each clause's cost by
the selectivity of the clauses evaluated before it:
   cost(A) + clauselist_selectivity({A}) * cost(B)
           + clauselist_selectivity({A, B}) * cost(C)
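
Just to make the idea concrete, here is a self-contained toy sketch of this
cost model (not code from the patch; the clause costs and selectivities are
made-up numbers, and clauselist_selectivity() is approximated by a simple
running product that ignores correlation between clauses):

#include <stdio.h>

/* One filter clause with its (hypothetical) per-tuple cost and selectivity. */
typedef struct
{
	const char *name;
	double		cost;			/* per-tuple evaluation cost */
	double		selectivity;	/* fraction of tuples passing this clause */
} Clause;

/*
 * Proposed model: weight each clause's cost by the estimated fraction of
 * tuples that survive all preceding clauses.  A plain product stands in
 * for clauselist_selectivity() of the already-applied subset.
 */
static double
filter_cost_per_tuple(const Clause *clauses, int nclauses)
{
	double		total = 0.0;
	double		passed = 1.0;	/* selectivity of clauses applied so far */
	int			i;

	for (i = 0; i < nclauses; i++)
	{
		total += passed * clauses[i].cost;
		passed *= clauses[i].selectivity;
	}
	return total;
}

int
main(void)
{
	/* Numbers loosely modeled on the pgbench example below; all invented. */
	Clause		clauses[] = {
		{"tid BETWEEN 1 AND 1000", 0.0025, 0.1},
		{"aid BETWEEN 1 AND 10", 0.0025, 0.001},
		{"(SubPlan 1) > 100", 21393.50, 0.3},
	};

	printf("current model:  %.2f per tuple\n",
		   clauses[0].cost + clauses[1].cost + clauses[2].cost);
	printf("proposed model: %.2f per tuple\n",
		   filter_cost_per_tuple(clauses, 3));
	return 0;
}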


We first noticed this overestimation during performance analysis of
our real applications. In order to reproduce the situation, we created
a similar query with pgbench data.

The database was prepared with the following commands:
pgbench --scale=1000 -i pgb5
pgbench -c 10 -t 10 pgb5

and the query is:
select e.tid,
       e.aid,
       e.bid,
       e.mtime
  from pgbench_history e
 where e.tid between 1 and 1000
   and (select count(*)
          from pgbench_history f
         where f.mtime < e.mtime
           and f.bid = e.bid
         group by f.bid) > 100
   and e.aid between 1 and 10;


== Query plan with current upstream ==

Seq Scan on pgbench_history e  (cost=0.00..21393523889.00 rows=28 width=20) (actual time=2391.683..21775.191 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 10) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 15
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..21393.50 rows=283 width=12) (actual time=229.050..229.050 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..21389.00 rows=333 width=4) (actual time=5.036..228.851 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471

Most of the total cost of 21,393,523,889 comes from:
  (cost of SubPlan 1) * (# of rows in pgbench_history)
    = 21,393.50 * 1,000,000 = 21,393,000,000

but in the actual run, SubPlan 1 was executed only 94 times, so its cost
was overestimated by a factor of roughly 10,000 (1,000,000 assumed
executions vs. 94 actual).


== Query plan with patched upstream ==

Seq Scan on pgbench_history e  (cost=0.00..1687169.88 rows=28 width=20) (actual time=2388.802..21721.498 rows=85 loops=1)
  Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 10) AND ((SubPlan 1) > 100))
  Rows Removed by Filter: 15
  SubPlan 1
    ->  GroupAggregate  (cost=0.00..19726.83 rows=283 width=12) (actual time=228.507..228.507 rows=1 loops=94)
          Group Key: f.bid
          ->  Seq Scan on pgbench_history f  (cost=0.00..19722.33 rows=333 width=4) (actual time=5.378..228.316 rows=529 loops=94)
                Filter: ((mtime < e.mtime) AND (bid = e.bid))
                Rows Removed by Filter: 999471

With the patched upstream, the total cost decreases significantly.  In
the cost estimation, only 84.4 tuples of pgbench_history are expected
to be evaluated against SubPlan 1.  That is reasonably close to the
actual number of 94, and the total cost seems much more reasonable than
with current upstream.
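
(Back-of-the-envelope check of the patched estimate: 84.4 * 19,726.83 ≈
1,664,944, which accounts for most of the 1,687,169.88 total cost; the
remaining ~22,000 presumably corresponds to scanning all 1,000,000 heap
tuples and evaluating the cheap tid/aid clauses.)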

Any thoughts?

Regards,
Yuto Hayamizu


Mitigate-filter-cost-overestimation.patch
Description: Binary data



[HACKERS] HEAD crashes with assertion and LWLOCK_STATS enabled

2014-05-20 Thread Yuto HAYAMIZU
Hi hackers,

I found a bug that causes a crash when assertions are enabled and LWLOCK_STATS
is defined.
I've tested with Debian 7.5 (3.2.0-4-amd64) on VMware Fusion 6, but this bug
seems to be platform-independent and should reproduce in other environments.
A patch to fix the bug is also attached.

## Reproducing a crash

You can reproduce the crash as follows:

git checkout a0841ecd2518d4505b96132b764b918ab5d21ad4
git clean -dfx
./configure --enable-cassert CFLAGS='-DLWLOCK_STATS'
make check

In my environment, the following messages appeared.

( omit... )
../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
== creating temporary installation ==
== initializing database system   ==

pg_regress: initdb failed

and initdb.log contained the following messages.

creating directory /tmp/pghead/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in 
/tmp/pghead/src/test/regress/./tmp_check/data/base/1 ... PID 48239 lwlock main 
142: shacq 0 exacq 1 blk 0 spindelay 0
( omit... )
PID 48247 lwlock main 33058: shacq 0 exacq 1 blk 0 spindelay 0
PID 48247 lwlock main 33005: shacq 0 exacq 48 blk 0 spindelay 0
ok
loading system objects' descriptions ... TRAP: FailedAssertion("!(CritSectionCount == 0 || (context) == ErrorContext || 
(MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 594)
Aborted (core dumped)
child process exited with exit code 134
initdb: data directory /tmp/pghead/src/test/regress/./tmp_check/data not 
removed at user's request

## The cause of crash

The failing assertion prohibits memory allocation in a critical section and
was introduced by commit 4a170ee9 on 2014-04-04.

In my understanding, the root cause of the assertion failure is the on-demand
allocation of lwlock_stats entries.  For each LWLock, an lwlock_stats entry is
created with MemoryContextAlloc at the first invocation of LWLockAcquire on
that lock.  If that first invocation happens inside a critical section, the
assertion fails.

For the 'initdb' case mentioned above, the problem was acquiring WALWriteLock
in the XLogFlush function.
I also confirmed the assertion failure when starting postgres on a correctly
initialized database; in that case, the problem was acquiring
CheckpointerCommLock in the AbsorbFsyncRequests function.
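
To make the failure mode concrete, here is a self-contained toy illustration
(not PostgreSQL code; CritSectionCount, lwlock_stats and the helper names
below only mimic the real structures):

#include <assert.h>
#include <stdlib.h>

static int	CritSectionCount = 0;	/* bumped by START_CRIT_SECTION() */

typedef struct lwlock_stats
{
	long		ex_acquire_count;
} lwlock_stats;

/* Lazily populated stats table, one slot per lock id. */
static lwlock_stats *stats_table[128];

static lwlock_stats *
get_stats_entry(int lockid)
{
	if (stats_table[lockid] == NULL)
	{
		/*
		 * In PostgreSQL this allocation goes through MemoryContextAlloc,
		 * and the assertion added by commit 4a170ee9 forbids allocating
		 * while CritSectionCount > 0.
		 */
		assert(CritSectionCount == 0);
		stats_table[lockid] = calloc(1, sizeof(lwlock_stats));
	}
	return stats_table[lockid];
}

static void
lock_acquire(int lockid)
{
	get_stats_entry(lockid)->ex_acquire_count++;
}

int
main(void)
{
	CritSectionCount++;			/* e.g. XLogFlush enters a critical section */
	lock_acquire(7);			/* first-ever acquisition of this lock: the
								 * lazy allocation fires inside the critical
								 * section and the assertion trips */
	CritSectionCount--;
	return 0;
}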

## A solution

In order to avoid memory allocation during critical sections, the lwlock_stats
hash table should be populated at the initialization of each process.
The attached patch populates the lwlock_stats entries of MainLWLockArray at the
end of CreateLWLocks, InitProcess and InitAuxiliaryProcess.

With this patch, all regression tests pass so far, but I think the patch is not
perfect because it does not cover LWLocks outside of MainLWLockArray.  I'm not
sure where the right place is to initialize lwlock_stats entries for those
locks, so it probably needs some refinement by you hackers.
From f96708d14ab0073abd95c463eaf8d60db42f411d Mon Sep 17 00:00:00 2001
From: Yuto Hayamizu y.hayam...@gmail.com
Date: Tue, 20 May 2014 16:19:56 +0900
Subject: [PATCH] pre-populating lwlock_stats entries

---
 src/backend/storage/lmgr/lwlock.c |   21 +
 src/backend/storage/lmgr/proc.c   |8 
 src/include/storage/lwlock.h  |4 
 3 files changed, 33 insertions(+)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index d23ac62..fc97ae0 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -359,8 +359,29 @@ CreateLWLocks(void)
 	MainLWLockTranche.array_base = MainLWLockArray;
 	MainLWLockTranche.array_stride = sizeof(LWLockPadded);
 	LWLockRegisterTranche(0, &MainLWLockTranche);
+
+#ifdef LWLOCK_STATS
+	InitializeLWLockStats();
+#endif
 }
 
+#ifdef LWLOCK_STATS
+void
+InitializeLWLockStats(void)
+{
+	int			numLocks = NumLWLocks();
+	int			id;
+	LWLockPadded *lock;
+
+	if (MyProc == NULL) return;
+
+	/* Initialize all lwlock_stats entries of MainLWLockArray */
+	for (id = 0, lock = MainLWLockArray; id < numLocks; id++, lock++)
+	{
+		get_lwlock_stats_entry(&lock->lock);
+	}
+}
+#endif
 
 /*
  * LWLockAssign - assign a dynamically-allocated LWLock number
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 266b0da..e09cbf8 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -415,6 +415,10 @@ InitProcess(void)
 * the deadlock checker.
 */
InitDeadLockChecking