[HACKERS] [PATCH] Overestimated filter cost and its mitigation
Hi hackers,

Currently, the cost of a filter with multiple clauses is estimated by summing up the estimated cost of each clause. As long as a filter consists of simple clauses and its cost is fairly small, this works fine. However, when a filter contains some heavy clauses (such as subqueries or user-defined functions) and most tuples are filtered out by the other, simpler clauses, the total cost is likely to be overestimated.

The attached patch mitigates this overestimation by using selectivity estimates of subsets of the clauses in a filter. Suppose that a scan node has three qual clauses A, B and C. Current PostgreSQL estimates the per-tuple cost of the filter to be:

  cost(A) + cost(B) + cost(C)

The basic idea of the attached patch is to estimate instead:

  cost(A)
  + clauselist_selectivity({A}) * cost(B)
  + clauselist_selectivity({A, B}) * cost(C)

because B is evaluated only for tuples that pass A, and C only for tuples that pass both A and B. (A toy sketch of this formula is included at the end of this message.)

We first noticed this overestimation during performance analysis of our real applications. To reproduce the situation, we created a similar query with pgbench data. The database was prepared by the following commands:

  pgbench --scale=1000 -i pgb5
  pgbench -c 10 -t 10 pgb5

and the query is:

  select e.tid, e.aid, e.bid, e.mtime
  from pgbench_history e
  where e.tid between 1 and 1000
    and (select count(*) from pgbench_history f
         where f.mtime < e.mtime and f.bid = e.bid
         group by f.bid) > 100
    and e.aid between 1 and 10;

== Query plan with current upstream ==

 Seq Scan on pgbench_history e  (cost=0.00..21393523889.00 rows=28 width=20) (actual time=2391.683..21775.191 rows=85 loops=1)
   Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 10) AND ((SubPlan 1) > 100))
   Rows Removed by Filter: 999915
   SubPlan 1
     ->  GroupAggregate  (cost=0.00..21393.50 rows=283 width=12) (actual time=229.050..229.050 rows=1 loops=94)
           Group Key: f.bid
           ->  Seq Scan on pgbench_history f  (cost=0.00..21389.00 rows=333 width=4) (actual time=5.036..228.851 rows=529 loops=94)
                 Filter: ((mtime < e.mtime) AND (bid = e.bid))
                 Rows Removed by Filter: 999471

Most of the total cost of 21,393,523,889 comes from:

  (cost of SubPlan 1) * (# of rows in pgbench_history)
    = 21,393.50 * 1,000,000 = 21,393,000,000

but in the actual run, SubPlan 1 was executed only 94 times, so the total cost was overestimated by a factor of more than 10,000.

== Query plan with patched upstream ==

 Seq Scan on pgbench_history e  (cost=0.00..1687169.88 rows=28 width=20) (actual time=2388.802..21721.498 rows=85 loops=1)
   Filter: ((tid >= 1) AND (tid <= 1000) AND (aid >= 1) AND (aid <= 10) AND ((SubPlan 1) > 100))
   Rows Removed by Filter: 999915
   SubPlan 1
     ->  GroupAggregate  (cost=0.00..19726.83 rows=283 width=12) (actual time=228.507..228.507 rows=1 loops=94)
           Group Key: f.bid
           ->  Seq Scan on pgbench_history f  (cost=0.00..19722.33 rows=333 width=4) (actual time=5.378..228.316 rows=529 loops=94)
                 Filter: ((mtime < e.mtime) AND (bid = e.bid))
                 Rows Removed by Filter: 999471

With the patched upstream, the total cost is greatly decreased. In the cost estimation, only 84.4 tuples of pgbench_history are expected to be evaluated with SubPlan 1. That is reasonably close to the actual count of 94 executions, and the total cost looks much more reasonable than the cost estimated by current upstream.

Any thoughts?

Regards,
Yuto Hayamizu

Attachment: Mitigate-filter-cost-overestimation.patch
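P.S. To make the proposed formula concrete, here is a self-contained toy sketch in C. This is illustration only, not the patch: the function name and all numbers are made up, and it multiplies individual clause selectivities where the patch calls clauselist_selectivity() on each clause prefix (which can also account for correlation between clauses).

#include <stdio.h>

/*
 * Toy version of the proposed per-tuple filter costing.  costs[i] is the
 * per-tuple evaluation cost of clause i, sels[i] its estimated selectivity;
 * clauses are evaluated in order.
 */
static double
filter_cost_per_tuple(const double *costs, const double *sels, int n)
{
	double		total = 0.0;
	double		pass_frac = 1.0;	/* est. fraction of tuples reaching clause i */
	int			i;

	for (i = 0; i < n; i++)
	{
		/* clause i is charged only for tuples surviving clauses 0..i-1 */
		total += pass_frac * costs[i];
		pass_frac *= sels[i];
	}
	return total;
}

int
main(void)
{
	/* made-up numbers: two cheap range checks guarding one expensive SubPlan */
	double		costs[] = {0.005, 0.005, 21393.5};
	double		sels[] = {0.001, 0.1, 0.5};

	printf("summed estimate:   %f\n", costs[0] + costs[1] + costs[2]);
	printf("weighted estimate: %f\n", filter_cost_per_tuple(costs, sels, 3));
	return 0;
}

With numbers like these, the expensive third clause is charged only for the small fraction of tuples expected to survive the cheap clauses, which is why the patched plan's total cost drops from roughly 21.4 billion to roughly 1.69 million in the example above.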
[HACKERS] HEAD crashes with assertion and LWLOCK_STATS enabled
Hi hackers,

I found a bug that causes a crash when assertions are enabled and LWLOCK_STATS is defined. I've tested on Debian 7.5 (3.2.0-4-amd64) on VMware Fusion 6, but the bug seems to be platform-independent and should reproduce in other environments. A patch to fix the bug is also attached.

## Reproducing the crash

You can reproduce the crash this way:

  git checkout a0841ecd2518d4505b96132b764b918ab5d21ad4
  git clean -dfx
  ./configure --enable-cassert CFLAGS='-DLWLOCK_STATS'
  make check

In my environment, the following messages appeared:

  ( omit... )
  ../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
  == creating temporary installation ==
  == initializing database system ==
  pg_regress: initdb failed

and initdb.log contained the following messages:

  creating directory /tmp/pghead/src/test/regress/./tmp_check/data ... ok
  creating subdirectories ... ok
  selecting default max_connections ... 100
  selecting default shared_buffers ... 128MB
  selecting dynamic shared memory implementation ... posix
  creating configuration files ... ok
  creating template1 database in /tmp/pghead/src/test/regress/./tmp_check/data/base/1 ... PID 48239 lwlock main 142: shacq 0 exacq 1 blk 0 spindelay 0
  ( omit... )
  PID 48247 lwlock main 33058: shacq 0 exacq 1 blk 0 spindelay 0
  PID 48247 lwlock main 33005: shacq 0 exacq 48 blk 0 spindelay 0
  ok
  loading system objects' descriptions ... TRAP: FailedAssertion("!(CritSectionCount == 0 || (context) == ErrorContext || (MyAuxProcType == CheckpointerProcess))", File: "mcxt.c", Line: 594)
  Aborted (core dumped)
  child process exited with exit code 134
  initdb: data directory "/tmp/pghead/src/test/regress/./tmp_check/data" not removed at user's request

## The cause of the crash

The failing assertion prohibits memory allocation inside a critical section; it was introduced by commit 4a170ee9 on 2014-04-04. In my understanding, the root cause of the assertion failure is the on-demand allocation of lwlock_stats entries. For each LWLock, a lwlock_stats entry is created at the first invocation of LWLockAcquire() on that lock, using MemoryContextAlloc(). If that first invocation happens inside a critical section, the assertion fails. In the 'initdb' case above, the problem was acquiring WALWriteLock in XLogFlush(). I also confirmed the assertion failure when starting postgres on a correctly initialized database; in that case, the problem was acquiring CheckpointerCommLock in AbsorbFsyncRequests(). (A toy model illustrating this failure mode is included below, just before the patch.)

## A solution

To avoid memory allocation during critical sections, the lwlock_stats hash table should be populated at the initialization of each process. The attached patch populates the lwlock_stats entries for MainLWLockArray at the end of CreateLWLocks(), InitProcess() and InitAuxiliaryProcess(). With this patch, all regression tests pass so far, but I think the patch is not perfect because it does not cover LWLocks outside of MainLWLockArray, and I'm not sure where the right place is to initialize lwlock_stats entries for those locks. So it probably needs some refinement by you hackers.
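To make the failure mode concrete, here is a self-contained toy model. Every name in it is a hypothetical stand-in, except that the assertion mirrors the CritSectionCount check added by commit 4a170ee9; the real code allocates via hash_search() on the lwlock_stats hash table rather than a single static pointer.

#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int	CritSectionCount = 0;	/* nesting depth of critical sections */

/* toy allocator enforcing the "no allocation in a critical section" rule */
static void *
toy_MemoryContextAlloc(size_t size)
{
	assert(CritSectionCount == 0);
	return malloc(size);
}

/* toy per-lock statistics, created lazily like the real lwlock_stats entry */
typedef struct toy_lwlock_stats
{
	int			sh_acquire_count;
	int			ex_acquire_count;
} toy_lwlock_stats;

static toy_lwlock_stats *stats_entry = NULL;

static toy_lwlock_stats *
toy_get_lwlock_stats_entry(void)
{
	/* on-demand allocation: the first acquisition of the lock allocates */
	if (stats_entry == NULL)
	{
		stats_entry = toy_MemoryContextAlloc(sizeof(toy_lwlock_stats));
		stats_entry->sh_acquire_count = 0;
		stats_entry->ex_acquire_count = 0;
	}
	return stats_entry;
}

int
main(void)
{
	CritSectionCount++;				/* e.g. XLogFlush() enters a critical section */
	toy_get_lwlock_stats_entry();	/* first touch of the entry: assertion fires */
	return 0;
}

Pre-populating the entry before any critical section is entered, as the patch below does for MainLWLockArray, makes all later lookups allocation-free.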
From f96708d14ab0073abd95c463eaf8d60db42f411d Mon Sep 17 00:00:00 2001
From: Yuto Hayamizu <y.hayam...@gmail.com>
Date: Tue, 20 May 2014 16:19:56 +0900
Subject: [PATCH] pre-populating lwlock_stats entries

---
 src/backend/storage/lmgr/lwlock.c | 21 +++++++++++++++++++++
 src/backend/storage/lmgr/proc.c   |  8 ++++++++
 src/include/storage/lwlock.h      |  4 ++++
 3 files changed, 33 insertions(+)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index d23ac62..fc97ae0 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -359,8 +359,29 @@ CreateLWLocks(void)
 	MainLWLockTranche.array_base = MainLWLockArray;
 	MainLWLockTranche.array_stride = sizeof(LWLockPadded);
 	LWLockRegisterTranche(0, &MainLWLockTranche);
+
+#ifdef LWLOCK_STATS
+	InitializeLWLockStats();
+#endif
 }
 
+#ifdef LWLOCK_STATS
+void
+InitializeLWLockStats(void)
+{
+	int			numLocks = NumLWLocks();
+	int			id;
+	LWLockPadded *lock;
+
+	if (MyProc == NULL) return;
+
+	/* Initialize all lwlock_stats entries of MainLWLockArray */
+	for (id = 0, lock = MainLWLockArray; id < numLocks; id++, lock++)
+	{
+		get_lwlock_stats_entry(&lock->lock);
+	}
+}
+#endif
 
 /*
  * LWLockAssign - assign a dynamically-allocated LWLock number
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 266b0da..e09cbf8 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -415,6 +415,10 @@ InitProcess(void)
 	 * the deadlock checker.
 	 */
 	InitDeadLockChecking