Re: [PERFORM] View with and without ::text casting performs differently.

2013-09-18 Thread Brian Fehrle

On 09/06/2013 12:35 PM, Tom Lane wrote:

Brian Fehrle  writes:

On 09/05/2013 05:50 PM, Tom Lane wrote:

I rather doubt that the now-explicit-instead-of-implicit casts have much
to do with that.  It seems more likely that you forgot to re-ANALYZE in
the new database, or there are some different planner settings, or
something along that line.

I have two versions of the view in place on the same server, one with
the typecasting and one without, and this is where I see the difference
(without ::text it runs in 0.5 ms; with ::text it runs in 13 or so minutes
with a nested loop), so both are running off the same statistics on the data.

Hm.  Can you provide a self-contained example?


I'll see what I can do to recreate this with bogus data. The data is 
sensitive, and it may just be some anomaly in the data distribution 
that is causing this.
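
For reference, a minimal self-contained starting point (all table, column, 
and view names hypothetical) could show the rewrite itself, since pg_dump 
emits whatever pg_get_viewdef() returns:

```sql
-- Hypothetical setup: a view over varchar(20) columns.
CREATE TABLE orders (order_no varchar(20) PRIMARY KEY, status varchar(20));

CREATE VIEW v_orders AS
  SELECT order_no, status FROM orders WHERE status = 'OPEN';

-- pg_dump reproduces the stored (rewritten) definition, explicit casts included:
SELECT pg_get_viewdef('v_orders'::regclass);
-- the WHERE clause comes back in the form ((status)::text = 'OPEN'::text)
```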


- Brian F


regards, tom lane




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] View with and without ::text casting performs differently.

2013-09-06 Thread Brian Fehrle

On 09/05/2013 05:50 PM, Tom Lane wrote:

Brian Fehrle  writes:

I have a view that, when created with our CREATE statement, works
wonderfully: a query on the view with a standard WHERE clause that
narrows the result to a single row completes in under a millisecond.
However, when we export this view and re-import it (a dump and restore of
the database, which happens often), the exported version of the view has
been modified by Postgres to include explicit ::text casts on some
columns.

This is normal (varchar doesn't actually have any operations of its own).


All of the columns it casts to text are varchar(20), so there is
nothing wrong with what it's doing there. However, with the view
definition including the ::text casts, the query plan changes:
it goes into a nested loop, taking the query from <1 ms to over ten minutes.

I rather doubt that the now-explicit-instead-of-implicit casts have much
to do with that.  It seems more likely that you forgot to re-ANALYZE in
the new database, or there are some different planner settings, or
something along that line.
I have two versions of the view in place on the same server, one with 
the typecasting and one without, and this is where I see the difference 
(without ::text it runs in 0.5 ms; with ::text it runs in 13 or so minutes 
with a nested loop), so both are running off the same statistics on the data.


Running an ANALYZE on all tables involved did not change the query plan 
for the 'bad' version of the view (default_statistics_target = 400)


- Brian F


regards, tom lane






Re: [PERFORM] View with and without ::text casting performs differently.

2013-09-05 Thread Brian Fehrle

Apologies, forgot to include the Postgres version: 9.1.9

- Brian F
On 09/05/2013 04:45 PM, Brian Fehrle wrote:

Hi All,

I have a view that, when created with our CREATE statement, works 
wonderfully: a query on the view with a standard WHERE clause that 
narrows the result to a single row completes in under a millisecond. 
However, when we export this view and re-import it (a dump and restore 
of the database, which happens often), the exported version of the 
view has been modified by Postgres to include explicit ::text casts on 
some columns.


All of the columns it casts to text are varchar(20), so there is 
nothing wrong with what it's doing there. However, with the view 
definition including the ::text casts, the query plan changes: 
it goes into a nested loop, taking the query from <1 ms to over ten minutes.


*NOTE: *When I execute the queries with and without ::text myself, 
outside of the view, there are no issues: the query plan is correct and 
I get my result fast. It's only when I put the query in the view and 
then execute it that it does the nested loop and takes forever.




--- Query plan for view defined without ::text ---
Nested Loop  (cost=17.440..272.590 rows=1 width=1810) (actual 
time=0.515..0.527 rows=1 loops=1)
  ->  Nested Loop  (cost=17.440..265.480 rows=1 width=1275) (actual 
time=0.471..0.482 rows=1 loops=1)
->  Nested Loop  (cost=17.440..265.190 rows=1 width=761) 
(actual time=0.460..0.471 rows=1 loops=1)
  ->  Nested Loop  (cost=17.440..258.910 rows=1 width=186) 
(actual time=0.437..0.447 rows=1 loops=1)
->  Nested Loop  (cost=17.440..258.590 rows=1 
width=154) (actual time=0.417..0.425 rows=1 loops=1)
  ->  Nested Loop (cost=17.440..252.240 rows=1 
width=160) (actual time=0.388..0.395 rows=1 loops=1)
  Join Filter: 
((alpha_yankee.bravo_papa)::text = (six_zulu.kilo_uniform)::text)
->  Nested Loop (cost=0.000..108.990 
rows=7 width=137) (actual time=0.107..0.109 rows=1 loops=1)
  ->  Nested Loop 
(cost=0.000..102.780 rows=10 width=124) (actual time=0.077..0.078 
rows=1 loops=1)
->  Index Scan using 
juliet_yankee on alpha_yankee  (cost=0.000..18.240 rows=13 width=84) 
(actual time=0.043..0.044 rows=1 loops=1)
Index Cond: 
((bravo_charlie)::text = 'tango'::text)
->  Index Scan using 
charlie_quebec on juliet_three  (cost=0.000..6.490 rows=1 width=40) 
(actual time=0.029..0.029 rows=1 loops=1)
Index Cond: 
((echo)::text = (alpha_yankee.kilo_yankee)::text)
Filter: 
((four)::text = 'alpha_whiskey'::text)
  ->  Index Scan using 
charlie_yankee on romeo  (cost=0.000..0.610 rows=1 width=33) (actual 
time=0.027..0.027 rows=1 loops=1)
  Index Cond: 
((echo)::text = (juliet_three.lima_victor)::text)
  Filter: 
((lima_bravo)::text = 'alpha_whiskey'::text)
->  Bitmap Heap Scan on papa six_zulu  
(cost=17.440..20.450 rows=1 width=64) (actual time=0.268..0.276 
rows=21 loops=1)
Recheck Cond: 
(((charlie_victor)::text = (juliet_three.echo)::text) AND 
((bravo_india)::text = (alpha_yankee.juliet_whiskey)::text))
  ->  BitmapAnd 
(cost=17.440..17.440 rows=1 width=0) (actual time=0.255..0.255 rows=0 
loops=1)
->  Bitmap Index Scan on 
alpha_foxtrot  (cost=0.000..6.710 rows=28 width=0) (actual 
time=0.048..0.048 rows=21 loops=1)
Index Cond: 
((charlie_victor)::text = (juliet_three.echo)::text)
->  Bitmap Index Scan on 
delta  (cost=0.000..10.440 rows=108 width=0) (actual time=0.202..0.202 
rows=757 loops=1)
Index Cond: 
((bravo_india)::text = (alpha_yankee.juliet_whiskey)::text)
  ->  Index Scan using whiskey_india on 
alpha_lima  (cost=0.000..6.340 rows=1 width=57) (actual 
time=0.026..0.027 rows=1 loops=1)
  Index Cond: ((echo)::text = 
(six_zulu.bravo_india)::text)
->  Index Scan using golf on whiskey_oscar  
(cost=0.000..0.310 rows=1 width=43) (actual time=0.017..0.018 rows=1 
loops=1)
Index Cond: ((echo)::text = 
(alpha_lima.whiskey_six)::text)
  ->  Index Scan using bravo_foxtrot on mike_mike 
lima_charlie  (cost=0.000..6.270 rows=1 width=6

[PERFORM] View with and without ::text casting performs differently.

2013-09-05 Thread Brian Fehrle

Hi All,

I have a view that, when created with our CREATE statement, works 
wonderfully: a query on the view with a standard WHERE clause that 
narrows the result to a single row completes in under a millisecond. 
However, when we export this view and re-import it (a dump and restore of 
the database, which happens often), the exported version of the view has 
been modified by Postgres to include explicit ::text casts on some 
columns.


All of the columns it casts to text are varchar(20), so there is 
nothing wrong with what it's doing there. However, with the view 
definition including the ::text casts, the query plan changes: 
it goes into a nested loop, taking the query from <1 ms to over ten minutes.


*NOTE: *When I execute the queries with and without ::text myself, 
outside of the view, there are no issues: the query plan is correct and I 
get my result fast. It's only when I put the query in the view and 
then execute it that it does the nested loop and takes forever.
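
A hedged side-by-side comparison on the same data could look like this 
(all names hypothetical): the same defining query, once as written and 
once in the form pg_dump re-emits it.

```sql
-- Hypothetical tables t1/t2 with varchar(20) join columns.
CREATE VIEW v_plain AS
  SELECT a.id, a.code, b.payload
  FROM t1 a JOIN t2 b ON b.ref = a.code;

CREATE VIEW v_cast AS
  SELECT a.id, a.code, b.payload
  FROM t1 a JOIN t2 b ON (b.ref)::text = (a.code)::text;

-- Compare the plans the planner chooses for each form:
EXPLAIN ANALYZE SELECT * FROM v_plain WHERE code = 'tango';
EXPLAIN ANALYZE SELECT * FROM v_cast  WHERE code = 'tango';
```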




--- Query plan for view defined without ::text ---
Nested Loop  (cost=17.440..272.590 rows=1 width=1810) (actual 
time=0.515..0.527 rows=1 loops=1)
  ->  Nested Loop  (cost=17.440..265.480 rows=1 width=1275) (actual 
time=0.471..0.482 rows=1 loops=1)
->  Nested Loop  (cost=17.440..265.190 rows=1 width=761) 
(actual time=0.460..0.471 rows=1 loops=1)
  ->  Nested Loop  (cost=17.440..258.910 rows=1 width=186) 
(actual time=0.437..0.447 rows=1 loops=1)
->  Nested Loop  (cost=17.440..258.590 rows=1 
width=154) (actual time=0.417..0.425 rows=1 loops=1)
  ->  Nested Loop (cost=17.440..252.240 rows=1 
width=160) (actual time=0.388..0.395 rows=1 loops=1)
  Join Filter: 
((alpha_yankee.bravo_papa)::text = (six_zulu.kilo_uniform)::text)
->  Nested Loop (cost=0.000..108.990 
rows=7 width=137) (actual time=0.107..0.109 rows=1 loops=1)
  ->  Nested Loop 
(cost=0.000..102.780 rows=10 width=124) (actual time=0.077..0.078 rows=1 
loops=1)
->  Index Scan using 
juliet_yankee on alpha_yankee  (cost=0.000..18.240 rows=13 width=84) 
(actual time=0.043..0.044 rows=1 loops=1)
Index Cond: 
((bravo_charlie)::text = 'tango'::text)
->  Index Scan using 
charlie_quebec on juliet_three  (cost=0.000..6.490 rows=1 width=40) 
(actual time=0.029..0.029 rows=1 loops=1)
Index Cond: 
((echo)::text = (alpha_yankee.kilo_yankee)::text)
Filter: 
((four)::text = 'alpha_whiskey'::text)
  ->  Index Scan using 
charlie_yankee on romeo  (cost=0.000..0.610 rows=1 width=33) (actual 
time=0.027..0.027 rows=1 loops=1)
  Index Cond: ((echo)::text 
= (juliet_three.lima_victor)::text)
  Filter: 
((lima_bravo)::text = 'alpha_whiskey'::text)
->  Bitmap Heap Scan on papa six_zulu  
(cost=17.440..20.450 rows=1 width=64) (actual time=0.268..0.276 rows=21 
loops=1)
Recheck Cond: 
(((charlie_victor)::text = (juliet_three.echo)::text) AND 
((bravo_india)::text = (alpha_yankee.juliet_whiskey)::text))
  ->  BitmapAnd 
(cost=17.440..17.440 rows=1 width=0) (actual time=0.255..0.255 rows=0 
loops=1)
->  Bitmap Index Scan on 
alpha_foxtrot  (cost=0.000..6.710 rows=28 width=0) (actual 
time=0.048..0.048 rows=21 loops=1)
Index Cond: 
((charlie_victor)::text = (juliet_three.echo)::text)
->  Bitmap Index Scan on 
delta  (cost=0.000..10.440 rows=108 width=0) (actual time=0.202..0.202 
rows=757 loops=1)
Index Cond: 
((bravo_india)::text = (alpha_yankee.juliet_whiskey)::text)
  ->  Index Scan using whiskey_india on 
alpha_lima  (cost=0.000..6.340 rows=1 width=57) (actual 
time=0.026..0.027 rows=1 loops=1)
  Index Cond: ((echo)::text = 
(six_zulu.bravo_india)::text)
->  Index Scan using golf on whiskey_oscar 
(cost=0.000..0.310 rows=1 width=43) (actual time=0.017..0.018 rows=1 
loops=1)
Index Cond: ((echo)::text = 
(alpha_lima.whiskey_six)::text)
  ->  Index Scan using bravo_foxtrot on mike_mike 
lima_charlie  (cost=0.000..6.270 rows=1 width=617) (actual 
time=0.020..0.020 rows=1 loops=1)
  Index Cond: ((echo)::text = 
(six_zulu.kilo_uniform)::text)
->  Index Scan using charlie_papa on mike_oscar 
(cost=0.000..0.270 rows

[PERFORM] General key issues when comparing performance between PostgreSQL and oracle

2013-07-16 Thread Brian Fehrle

Hi all (Hopefully this is the correct mailing list for this).

I'm working on performance tuning a host of queries on PostgreSQL 9.2 
from an application, each query having its own issues and fixes. However, 
from what I understand, this application runs the exact same queries on 
the exact same data in half the time on Oracle and SQL Server.


Are there any known differences between the database systems, in terms of 
query planners or general operations (sorting, unions), that are notably 
different and would make Postgres slow down when executing the exact 
same queries?


It's worth noting that the queries are not that good: they have issues 
with bad sub-selects, Cartesian products, and what looks like bad query 
design in general. So the blame isn't completely on the database being 
slow, but I wonder what makes Oracle perform better when given 
not-so-great queries?


I know this is rather general and high level, but any tips or experience 
anyone has would be appreciated.



Thanks,
- Brian F





Re: [PERFORM] Trying to eliminate union and sort

2013-07-15 Thread Brian Fehrle

On 07/12/2013 04:43 PM, Josh Berkus wrote:

As for the counts on the tables:
table1  3,653,472
table2  2,191,314
table325,676,589

I think it's safe to assume right now that any resulting joins are not
one-to-one

Hmmm?  How is doing a subselect in the SELECT clause even working, then?

Oh my, this is sad. The query in all returns 9,955,729 rows, so the 
subqueries are run for each of these resulting rows. However, across the 
entire result set, subquery 1 returns 16 distinct rows and subquery 2 
returns 63, yet those subqueries are run over 10 million times to 
produce those few distinct rows. So they run many times, but return 
the same small set of data over and over again.
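
Given that each scalar subquery collapses to a handful of distinct rows, 
one sketch (using the generic names from earlier in this thread) is to let 
the planner hash the small table3 lookups once instead of re-running them 
per output row; whether this preserves exact row multiplicity depends on 
table3.id being unique, as the cardinality discussion above shows:

```sql
-- Rewrite the per-row scalar subqueries as joins against table3,
-- assuming table3.id is unique so each join matches at most one row:
SELECT t.id,
       t.mycolumn1,
       t3a.otherid AS otherid1,
       t3b.otherid AS otherid2,
       t.mycolumn2
FROM table1 t
LEFT OUTER JOIN table2 t2  ON t2.real_id = t.id
LEFT OUTER JOIN table3 t3a ON t3a.id = t.typeid
LEFT OUTER JOIN table3 t3b ON t3b.id = t2.third_id
WHERE t.external_id IN ('6544', '2234', '2', '4536');
```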


- Brian F






Re: [PERFORM] Trying to eliminate union and sort

2013-07-12 Thread Brian Fehrle

On 07/11/2013 06:46 PM, Josh Berkus wrote:

Brian,


3. I'm trying to eliminate the union, however I have two problems.
A) I can't figure out how to have an 'or' clause in a single join that
would fetch all the correct rows. If I just do:
LEFT OUTER JOIN table2 t2 ON (t2.real_id = t.id OR t2.real_id =
t.backup_id), I end up with many less rows than the original query. B.

I believe the issue with this is a row could have one of three
possibilities:
* part of the first query but not the second -> results in 1 row after
the union
* part of the second query but not the first -> results in 1 row after
the union
* part of the first query and the second -> results in 2 rows after the
union (see 'B)' for why)

B) the third and fourth column in the SELECT will need to be different
depending on what column the row is joined on in the LEFT OUTER JOIN to
table2, so I may need some expensive case when logic to filter what is
put there based on whether that row came from the first join clause, or
the second.

No, it doesn't:

SELECT t.id,
t.mycolumn1,
table3.otherid as otherid1,
table3a.otherid as otherid2,
t.mycolumn2
FROM t
LEFT OUTER JOIN table2 t2
   ON ( t.id = t2.real_id OR t.backup_id = t2.real_id )
LEFT OUTER JOIN table3
   ON ( t.typeid = table3.id )
LEFT OUTER JOIN table3 AS table3a
   ON ( t2.third_id = table3a.id )
WHERE t.external_id IN ( ... )
ORDER BY t.mycolumn2, t.id

I tried this originally, however my resulting rowcount is different.

The original query returns 9,955,729 rows
This above one returns 7,213,906

As for the counts on the tables:
table1  3,653,472
table2  2,191,314
table325,676,589

I think it's safe to assume right now that any resulting joins are not 
one-to-one


- Brian F






[PERFORM] Trying to eliminate union and sort

2013-07-11 Thread Brian Fehrle

Hi All,

(basic info)
PostgreSQL 9.2.4
64 bit Linux host
4GB shared_buffers with 14GB system memory, dedicated database VM
10MB work_mem


I have a query that takes over 6 minutes to complete, due mainly 
to the two sorting operations being done in this query. The data 
it returns is itself quite large, and with the two sorting 
operations it does (one because of the union, one because of the order 
by), it ends up using about *2.1GB* of temporary file space (and there 
is no way I can increase work_mem enough to hold this).


--[Query]
SELECT
t.id,
t.mycolumn1,
(SELECT table3.otherid FROM table3 WHERE table3.id = t.typeid),
(SELECT table3.otherid FROM table3 WHERE table3.id = t2.third_id),
t.mycolumn2 AS mycolumn2

FROM table1 t

LEFT OUTER JOIN table2 t2 ON t2.real_id = t.id

WHERE t.external_id IN ('6544', '2234', '2', '4536')

UNION

SELECT
t.id,
t.mycolumn1,
(SELECT table3.otherid FROM table3 WHERE table3.id = t.typeid),
(SELECT table3.otherid FROM table3 WHERE table3.id = t2.third_id),
t.mycolumn2 AS mycolumn2

FROM table1 t

LEFT OUTER JOIN table2 t2 ON t2.real_id = t.backup_id

WHERE t.external_id IN ('6544', '2234', '2', '4536')

ORDER BY t.mycolumn2, t.id;
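
As an aside on the union-induced sort specifically: UNION deduplicates, 
which is what forces the inner Sort + Unique step in the plan below. If 
the two branches cannot produce identical rows (or duplicates are 
acceptable), a hedged sketch is to switch to UNION ALL, which skips that 
step entirely; note this changes results whenever the branches overlap:

```sql
-- Sketch only: UNION ALL avoids the duplicate-elimination sort of UNION.
-- In a set operation, ORDER BY must reference output column names.
SELECT t.id, t.mycolumn1, t.mycolumn2
FROM table1 t
LEFT OUTER JOIN table2 t2 ON t2.real_id = t.id
WHERE t.external_id IN ('6544', '2234', '2', '4536')
UNION ALL
SELECT t.id, t.mycolumn1, t.mycolumn2
FROM table1 t
LEFT OUTER JOIN table2 t2 ON t2.real_id = t.backup_id
WHERE t.external_id IN ('6544', '2234', '2', '4536')
ORDER BY mycolumn2, id;
```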


--[Explain Analyze (sorry for the anonymizing)]
Sort  (cost=133824460.450..133843965.150 rows=7801882 width=56) (actual 
time=849405.656..894724.453 rows=9955729 loops=1)

Sort Key: romeo.three, romeo.quebec_seven
*Sort Method: external merge  Disk: 942856kB*
  ->  Unique  (cost=132585723.530..132702751.760 rows=7801882 width=56) 
(actual time=535267.196..668982.694 rows=9955729 loops=1)
->  Sort  (cost=132585723.530..132605228.240 rows=7801882 
width=56) (actual time=535267.194..649304.631 rows=10792011 loops=1)
Sort Key: romeo.quebec_seven, romeo.golf, ((delta_four 
3)), ((delta_four 4)), romeo.three

*Sort Method: external merge  Disk: 1008216kB*
  ->  Append  (cost=0.000..131464014.850 rows=7801882 
width=56) (actual time=0.798..291477.595 rows=10792011 loops=1)
->  Merge Left Join (cost=0.000..46340412.140 
rows=2748445 width=56) (actual time=0.797..70748.213 rows=3690431 loops=1)
Merge Cond: (romeo.quebec_seven = 
alpha_bravo.six_kilo)
  ->  Index Scan using juliet on zulu romeo  
(cost=0.000..163561.710 rows=2748445 width=52) (actual 
time=0.019..11056.883 rows=3653472 loops=1)
  Filter: (delta_uniform = ANY 
('two'::integer[]))
  ->  Index Only Scan using sierra on 
quebec_juliet alpha_bravo  (cost=0.000..86001.860 rows=2191314 width=8) 
(actual time=0.047..3996.543 rows=2191314 loops=1)

  Heap Fetches: 2191314
  SubPlan
->  Index Scan using oscar on six_delta  
(cost=0.000..8.380 rows=1 width=23) (actual time=0.009..0.009 rows=1 
loops=3690431)

Index Cond: (quebec_seven = romeo.lima)
  SubPlan
->  Index Scan using oscar on six_delta  
(cost=0.000..8.380 rows=1 width=23) (actual time=0.001..0.001 rows=0 
loops=3690431)
Index Cond: (quebec_seven = 
alpha_bravo.delta_november)
->  Merge Right Join (cost=0.000..85045583.890 
rows=5053437 width=56) (actual time=0.843..213450.477 rows=7101580 loops=1)
Merge Cond: (alpha_bravo.six_kilo = 
romeo.india)
  ->  Index Only Scan using sierra on 
quebec_juliet alpha_bravo  (cost=0.000..86001.860 rows=2191314 width=8) 
(actual time=0.666..6165.870 rows=2191314 loops=1)

  Heap Fetches: 2191314
  ->  Materialize (cost=0.000..193106.580 
rows=2748445 width=56) (actual time=0.134..25852.353 rows=7101580 loops=1)
->  Index Scan using alpha_seven on 
zulu romeo  (cost=0.000..186235.470 rows=2748445 width=56) (actual 
time=0.108..18439.857 rows=3653472 loops=1)
Filter: (delta_uniform = ANY 
('two'::integer[]))

  SubPlan
->  Index Scan using oscar on six_delta  
(cost=0.000..8.380 rows=1 width=23) (actual time=0.009..0.010 rows=1 
loops=7101580)

Index Cond: (quebec_seven = romeo.lima)
  SubPlan
->  Index Scan using oscar on six_delta  
(cost=0.000..8.380 rows=1 width=23) (actual time=0.007..0.008 rows=1 
loops=7101580)
Index Cond: (quebec_seven = 
alpha_bravo.delta_november)

--[end]

*My attempts:*
1. I tried to get an index set up on table1 that orders t.mycolumn2 and 
t.id so that the sorting operation might be skipped, however whe

Re: [PERFORM] Random performance hit, unknown cause.

2012-04-12 Thread Brian Fehrle

Interesting, that is very likely.

In this system I have a table that is extremely active. On a 'normal' 
day, the autovacuum process takes about 7 hours to complete on this 
table, and once it's complete, the system performs an autoanalyze on the 
table, finding that we have millions of new dead rows. Once this 
happens, it kicks off the autovacuum again, so we basically always have 
a vacuum running on this table at any given time.


If I were to tweak the autovacuum_vacuum_cost_delay parameter, what 
would it be doing? Would it be limiting what the current autovacuum is 
allowed to do, or does it simply space out the time between autovacuum 
runs? In my case, with 7-hour-long autovacuums (sometimes 14 hours), a 
few milliseconds between each run wouldn't mean anything to me.


If that parameter does limit the amount of work autovacuum can do, it 
may make the system perform better at the time, but it would prolong 
the autovacuum, right? That's an issue I'm already struggling with, 
and I wouldn't want to make the autovacuum any longer if I don't 
need to.
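
For reference: cost-based delay throttles work *within* a single 
autovacuum run, not between runs. The worker sleeps for 
autovacuum_vacuum_cost_delay milliseconds each time it accumulates 
autovacuum_vacuum_cost_limit worth of page-access cost, so raising the 
delay lengthens the vacuum while reducing its I/O impact, and lowering 
it does the opposite. These can also be overridden per table; a sketch 
(table name hypothetical):

```sql
-- Vacuum the hot table harder: no sleep between cost batches,
-- and more work allowed per batch. Trades I/O pressure for duration.
ALTER TABLE busy_table SET (
    autovacuum_vacuum_cost_delay = 0,
    autovacuum_vacuum_cost_limit = 2000
);
```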


- Brian F


On 04/12/2012 01:52 PM, Kevin Grittner wrote:

Claudio Freire  wrote:

On Thu, Apr 12, 2012 at 3:41 PM, Brian Fehrle
  wrote:

This morning, during our nightly backup process (where we grab a
copy of the data directory), we started having this same issue.
The main thing that I see in all of these is a high disk wait on
the system. When we are performing 'well', the %wa from top is
usually around 30%, and our load is around 12 - 15. This morning
we saw a load of 21 - 23, and a %wa jumping between 60% and 75%.

The top process pretty much at all times is the WAL Sender
Process, is this normal?

Sounds like vacuum to me.


More particularly, it seems consistent with autovacuum finding a
large number of tuples which had reached their freeze threshold.
Rewriting the tuple in place with a frozen xmin is a WAL-logged
operation.

-Kevin





[PERFORM] Random performance hit, unknown cause.

2012-04-12 Thread Brian Fehrle

Hi all,

OS: Linux 64 bit 2.6.32
PostgreSQL 9.0.5 installed from Ubuntu packages.
8 CPU cores
64 GB system memory
Database cluster is on raid 10 direct attached drive, using a HP p800 
controller card.



I have a system that has been having occasional performance hits, where 
the load on the system skyrockets, all queries take longer to execute 
and a hot standby slave I have set up via streaming replication starts 
to get behind. I'm having trouble pinpointing where the exact issue is.


This morning, during our nightly backup process (where we grab a copy of 
the data directory), we started having this same issue. The main thing 
that I see in all of these is a high disk wait on the system. When we 
are performing 'well', the %wa from top is usually around 30%, and our 
load is around 12 - 15. This morning we saw a load of 21 - 23, and a %wa 
jumping between 60% and 75%.


The top process pretty much at all times is the WAL Sender Process, is 
this normal?


From what I can tell, my access patterns on the database have not 
changed: the same average number of inserts, updates, and deletes, and 
nothing on the system has changed in any way. There are no abnormal 
autovacuum processes beyond those normally already running.


So what can I do to track down the issue? Currently the system has 
returned to a 'good' state and performance looks great, but I would 
like to know how to prevent this, as well as be able to grab good 
stats if it does happen again in the future.
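
For grabbing stats the next time it happens, a few standard catalog 
queries are worth scripting ahead of time; a sketch using the 9.0-era 
column names:

```sql
-- What each backend is doing and for how long
-- (on 9.0, the text is in current_query and the pid is procpid):
SELECT procpid, now() - query_start AS runtime, waiting, current_query
FROM pg_stat_activity
ORDER BY query_start;

-- Checkpoint and background-writer counters, to correlate with I/O spikes
-- (sample before and after an incident and diff the numbers):
SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
FROM pg_stat_bgwriter;
```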


Has anyone had any issues with the HP P800 controller card in a Postgres 
environment? Is there anything that can help us maximize performance to 
disk in this case, as it seems to be one of our major bottlenecks? I do 
plan on moving pg_xlog to a separate drive down the road; the cluster 
is extremely active, so that will help out a ton.


some IO stats:

$ iostat -d -x 5 3
Device:  rrqm/s  wrqm/s    r/s     w/s    rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
dev1       1.99   75.24  651.06  438.04 41668.57  8848.18    46.38     0.60   3.68   0.70  76.36
dev2       0.00    0.00  653.05  513.43 41668.57  8848.18    43.31     2.18   4.78   0.65  76.35


Device:  rrqm/s  wrqm/s    r/s     w/s    rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
dev1       0.00   35.20  676.20  292.00 35105.60  5688.00    42.13    67.76  70.73   1.03 100.00
dev2       0.00    0.00  671.80  295.40 35273.60  4843.20    41.48    73.41  76.62   1.03 100.00


Device:  rrqm/s  wrqm/s    r/s     w/s    rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
dev1       1.20   40.80  865.40  424.80 51355.20  8231.00    46.18    37.87  29.22   0.77  99.80
dev2       0.00    0.00  867.40  465.60 51041.60  8231.00    44.47    38.28  28.58   0.75  99.80


Thanks in advance,
Brian F


Re: [PERFORM] two table join just not fast enough.

2011-11-02 Thread Brian Fehrle

Thanks Tom,
And it looks like I pasted an older explain plan, which is almost exactly 
the same as the one with 50MB work_mem, except for the hash join 
'buckets' part, which used more memory and only one 'bucket', so to speak.


When running with 50MB work_mem instead of 1MB, the query went from an 
average of 190 ms to 169 ms, so it did help some, but it wasn't a game 
changer (I found that for this specific query, 6MB of work_mem was the 
most that would actually help me).


I have other plans to try to get this thing running faster, I'll be 
exploring them tomorrow, as well as looking at using an enum type.
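
For reference, the enum conversion Tom suggests could look roughly like 
this (the enum type name is hypothetical; the table, column, and values 
are from the query later in this thread). Enum comparisons are cheap 
fixed-size compares rather than locale-aware string compares:

```sql
-- Sketch: replace the varchar 'type' column with a native enum.
CREATE TYPE zone_type AS ENUM ('string1', 'string2', 'string3', 'string4');

-- Rewrite the column in place; go through text since there is no
-- registered varchar-to-enum cast:
ALTER TABLE hotel_zulu
    ALTER COLUMN type TYPE zone_type USING type::text::zone_type;
```

Note the rewrite takes an exclusive lock and rewrites the table, so it 
needs a maintenance window on a busy system, and existing indexes on the 
column are rebuilt.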


- Brian F

On 11/02/2011 05:53 PM, Tom Lane wrote:

Brian Fehrle  writes:

I've got a query that I need to squeeze as much speed out of as I can.

Hmm ... are you really sure this is being run with work_mem = 50MB?
The hash join is getting "batched", which means the executor thinks it's
working under a memory constraint significantly less than the size of
the filtered inner relation, which should be no more than a couple
megabytes according to this.

I'm not sure how much that will save, since the hashjoin seems to be
reasonably speedy anyway, but there's not much other fat to trim here.

One minor suggestion is to think whether you really need string
comparisons here or could convert that to use of an enum type.
String compares ain't cheap, especially not in non-C locales.

regards, tom lane





[PERFORM] two table join just not fast enough.

2011-11-02 Thread Brian Fehrle

Hi all,

I've got a query that I need to squeeze as much speed out of as I can.

When I execute this query, the average time it takes is about 190 ms. I 
increased my work_mem from 1MB to 50MB, which decreased the timing to an 
average of 170 ms, but that's still not fast enough. This query is 
executed extremely frequently, so much of it should be easily cached.


Some settings
work_mem = 50MB
shared_buffers = 5GB

I've made sure that the two tables are recently analyzed (with 
default_statistics_target at 100, 400, and even 1500), and table bloat 
is low (the 150 MB table has 7 MB of bloat).


Here's the query:
SELECT yankee.my_id
FROM yankee
INNER JOIN hotel_zulu
ON hotel_zulu.my_id = yankee.zulu_id
AND hotel_zulu.type IN ('string1', 'string2', 'string3', 'string4')
WHERE yankee.your_id=402513;

And here is a query plan.

Hash Join  (cost=17516.470..26386.660 rows=27624 width=4) (actual 
time=309.194..395.135 rows=12384 loops=1)
Hash Cond: (yankee.alpha = hotel_zulu.quebec)
  ->   Bitmap Heap Scan on yankee  (cost=1066.470..8605.770 rows=27624 
width=20) (actual time=5.178..34.693 rows=26963 loops=1)
  Recheck Cond: (mike = 402513)
->   Bitmap Index Scan on hotel_alpha  (cost=0.000..1059.570 rows=27624 
width=0) (actual time=4.770..4.770 rows=26967 loops=1)
Index Cond: (mike = 402513)
  ->   Hash  (cost=14465.080..14465.080 rows=114154 width=16) (actual 
time=303.717..303.717 rows=129908 loops=1)
  Buckets: 4096  Batches: 8  Memory Usage: 784kB
->   Bitmap Heap Scan on hotel_zulu  (cost=2461.770..14465.080 
rows=114154 width=16) (actual time=25.642..185.253 rows=129908 loops=1)
Recheck Cond: ((two)::golf = ANY ('xray'::golf[]))
  ->   Bitmap Index Scan on kilo  (cost=0.000..2433.230 rows=114154 
width=0) (actual time=23.887..23.887 rows=130292 loops=1)
  Index Cond: ((two)::golf = ANY ('xray'::golf[]))



One thing I notice is that the estimated row count is 27,624 while the 
actual number of rows returned is 12,384. Quite a bit different.


Table 2 (known here as hotel_zulu) is joined on zulu_id to the first 
table, and then there is a WHERE clause on the column 'type'. There are 
single-column indexes on each of these columns, and any multi-column 
index I put on them is just ignored by the planner.


Any thoughts on ways to tweak this?

- Brian F
