Re: [PERFORM] autovacuum vacuum creates bad statistics for planner when it log index scans: 0

2014-05-19 Thread Tom Lane
tim_wilson  writes:
> On a 9.3.1 server , I have a key busy_table in that is hit by most
> transactions running on our system. One DB's copy of this table has 60K rows
> and 1/3 of that tables rows can updated every minute.

> Autovacuum autovacuum_analyze_scale_factor is set 0.02, so that analyse runs
> nearly every minute. But when autovacuum vacuum runs I sometimes see the
> following message in logs:

> LOG:  automatic vacuum of table "busy_table":* index scans: 0*
> pages: 0 removed, 22152 remain
> tuples: 0 removed, 196927 remain
> buffer usage: 46241 hits, 478 misses, 715 dirtied
> avg read rate: 0.561 MB/s, avg write rate: 0.839 MB/s
> system usage: CPU 0.07s/0.06u sec elapsed 6.66 sec

> and the tuples remaining is then overestimated by a factor >3 , and have
> seen this over estimate as large at >20 times IE 5M

FWIW, I tried to reproduce this without success.

There's some code in there that attempts to extrapolate the total number
of live tuples when VACUUM has not scanned the entire table.  It's surely
plausible that that logic went off the rails ... but without a test case
or at least a more specific description of the problem scenario, it's
hard to know what's wrong exactly.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query plan good in 8.4, bad in 9.2 and better in 9.3

2014-05-19 Thread Scott Marlowe
On Thu, May 15, 2014 at 10:52 AM, Tom Lane  wrote:
> Scott Marlowe  writes:
>> OK so we have a query that does OK in 8.4, goes to absolute crap in
>> 9.2 and then works great in 9.3. Thing is we've spent several months
>> regression testing 9.2 and no time testing 9.3, so we can't just "go
>> to 9.3" in an afternoon. But we might have to. 9.2 seems hopelessly
>> broken here.
>
>> The query looks something like this:
>
>> SELECT COUNT(*) FROM u, ug
>> WHERE u.ugid = ug.id
>> AND NOT u.d
>> AND ug.somefield IN  (SELECT somefunction(12345));
>
> You really should show us somefunction's definition if you want
> useful comments.  I gather however that it returns a set.  8.4
> seems to be planning on the assumption that the set contains
> only one row, which is completely unjustified in general though
> it happens to be true in your example.  9.2 is assuming 1000 rows
> in the set, and getting a sucky plan because that's wrong.  9.3
> is still assuming that; and I rather doubt that you are really
> testing 9.3 on the same data, because 9.2 is finding millions of
> rows in a seqscan of u while 9.3 is finding none in the exact
> same seqscan.
>
> I'd suggest affixing a ROWS estimate to somefunction, or better
> declaring it to return singleton not set if that's actually
> always the case.

Well great, now I look like an idiot. Last time I trust someone else
to set up my test servers.

Anyway, yeah, affixing a rows estimate fixes this for us 100%. So thanks!


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] same query different execution plan (hash join vs. semi-hash join)

2014-05-19 Thread Tom Lane
"Huang, Suya"  writes:
> Thank you Tom. But the time spent on scanning table test1 is less than 1 
> second (91.738 compares to 87.869), so I guess this shouldn't be the issue?

No, the point is that the bad rowcount estimate (and, possibly, lack of
stats about join column contents) causes the planner to pick a join method
that's not ideal for this query.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] View has different query plan than select statement

2014-05-19 Thread David Rowley
On Mon, May 19, 2014 at 4:47 PM, Geoff Hull wrote:

> I am sending this on behalf of my colleague who tried to post to this list
> last year but without success, then also tried
> pgsql-performance-ow...@postgresql.org but without getting a reply.
>
> I have recently re-tested this in P/G version 9.3.4 with the same results:
>
> Hi,
>
> I have created a table 'test_table' and index 'idx_test_table' with a view
> 'v_test_table'. However the query plan used by the view does not use the
> index but when running the select statement itself it does use the index.
> Given that query specific hints are not available in Postgres 9.1 how can I
> persuade the view to use the same query plan as the select statement?
>
> Thanks,
>
> Tim
>
>
> --DROP table test_table CASCADE;
>
> -- create test table
> CREATE TABLE test_table (
> history_id SERIAL,
> id character varying(50) NOT NULL ,
> name character varying(50),
> CONSTRAINT test_table_pkey PRIMARY KEY (history_id)
> );
>
> -- create index on test table
> CREATE INDEX idx_test_table ON test_table (id);
>
> -- populate test table
> INSERT INTO test_table (id, name) SELECT *, 'Danger Mouse' FROM (SELECT
> md5(random()::text) from generate_series(1,1)) q;
>
> -- collect stats
> ANALYZE test_table;
>
>
> EXPLAIN (ANALYZE, BUFFERS)
> SELECT *
> FROM test_table
> WHERE id = '02b304b1c54542570d9f7bd39361f5b4';
>
> "Index Scan using idx_test_table on test_table (cost=0.00..8.27 rows=1
> width=50) (actual time=0.021..0.022 rows=1 loops=1)"
> " Index Cond: ((id)::text = '02b304b1c54542570d9f7bd39361f5b4'::text)"
> " Buffers: shared hit=3"
> "Total runtime: 0.051 ms"
>
>
> -- select statement with good plan
>
> EXPLAIN (ANALYZE, BUFFERS)
> SELECT id,
> CASE WHEN COALESCE(LAG(name) OVER (PARTITION BY id ORDER BY history_id),
> name || 'x') <> name
> then name
> end as name
> FROM test_table
> WHERE id = '02b304b1c54542570d9f7bd39361f5b4';
>
> "WindowAgg (cost=8.28..8.31 rows=1 width=50) (actual time=0.050..0.051
> rows=1 loops=1)"
> " Buffers: shared hit=3"
> " -> Sort (cost=8.28..8.29 rows=1 width=50) (actual time=0.039..0.039
> rows=1 loops=1)"
> " Sort Key: history_id"
> " Sort Method: quicksort Memory: 25kB"
> " Buffers: shared hit=3"
> " -> Index Scan using idx_test_table on test_table (cost=0.00..8.27 rows=1
> width=50) (actual time=0.030..0.031 rows=1 loops=1)"
> " Index Cond: ((id)::text = '02b304b1c54542570d9f7bd39361f5b4'::text)"
> " Buffers: shared hit=3"
> "Total runtime: 0.102 ms"
>
>
> --DROP VIEW v_test_table;
>
> CREATE OR REPLACE VIEW v_test_table AS
> SELECT id,
> CASE WHEN COALESCE(LAG(name) OVER (PARTITION BY id ORDER BY history_id),
> name || 'x') <> name
> then name
> end as name
> FROM test_table;
>
>
> -- Query via view with bad plan
>
> EXPLAIN (ANALYZE, BUFFERS)
> SELECT *
> FROM v_test_table
> WHERE id = '02b304b1c54542570d9f7bd39361f5b4';
>
> "Subquery Scan on v_test_table (cost=868.39..1243.39 rows=50 width=65)
> (actual time=26.115..33.327 rows=1 loops=1)"
> " Filter: ((v_test_table.id)::text =
> '02b304b1c54542570d9f7bd39361f5b4'::text)"
> " Buffers: shared hit=104, temp read=77 written=77"
> " -> WindowAgg (cost=868.39..1118.39 rows=1 width=50) (actual time=
> 26.022..32.519 rows=1 loops=1)"
> " Buffers: shared hit=104, temp read=77 written=77"
> " -> Sort (cost=868.39..893.39 rows=1 width=50) (actual
> time=26.013..27.796 rows=1 loops=1)"
> " Sort Key: test_table.id, test_table.history_id"
> " Sort Method: external merge Disk: 608kB"
> " Buffers: shared hit=104, temp read=77 written=77"
> " -> Seq Scan on test_table (cost=0.00..204.00 rows=1 width=50)
> (actual time=0.010..1.804 rows=1 loops=1)"
> " Buffers: shared hit=104"
> "Total runtime: 33.491 ms"
>
>
> How can I get the view to use the same query plan as the select statement?
>
>
Hi Geoff,

Unfortunately the view is not making use of the index due to the presence
of the windowing function in the view. I think you would find that if that
was removed then the view would more than likely use the index again.

The reason for this is that currently the WHERE clause of the outer query
is not pushed down into the view due to some overly strict code which
completely disallows pushdowns of where clauses into sub queries that
contain windowing functions...

In your case, because you have this id in your partition by clause, then
technically it is possible to push the where clause down into the sub
query. I wrote a patch a while back which lifts this restriction. it
unfortunately missed the boat for 9.4, but with any luck it will make it
into 9.5. If you're up for compiling postgres from source, then you can
test the patch out:

http://www.postgresql.org/message-id/cahoyfk9ihosarntwc-nj5tphko4wcausd-1c_0wecogi9ue...@mail.gmail.com

It should apply to current HEAD without too much trouble.

Regards

David Rowley