Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Andrus

Just a question, what are you doing with the 20.000 result rows ?


Those rows represent monthly sales data of one item.
They are used as following:

1. Detailed sales report for month. This report can browsed in screen for 
montly sales and ordering analysis.


2. Total reports. In those reports, sum( sales), sum(quantity) is used to 
get total sales in day, week, month, time for item and resulting rows are 
summed.


Andrus. 



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Richard Huxton
Andrus wrote:
 Query below seems to use indexes everywhere in most optimal way.
 dokumnr column is of type int
 
 Speed of this query varies rapidly:
 
 In live db fastest response I have got is 8 seconds.
 Re-running same query  after 10 seconds may take 60 seconds.
 Re-running it again after 10 seconds may take 114 seconds.
 
 Any idea how to speed it up ?
 
 Is it possible to optimize it, will upgrading to 8.3.5 help or should I
 require to add more RAM, disk or CPU speed ?

At a quick glance, the plans look the same to me. The overall costs are
certainly identical. That means whatever is affecting the query times it
isn't the query plan.

 Aggregate  (cost=234278.53..234278.54 rows=1 width=0) (actual
 time=62164.496..62164.500 rows=1 loops=1)
 Total runtime: 62164.789 ms

 Aggregate  (cost=234278.53..234278.54 rows=1 width=0) (actual
 time=40185.499..40185.503 rows=1 loops=1)
 Total runtime: 40186.102 ms

 Aggregate  (cost=234278.53..234278.54 rows=1 width=0) (actual
 time=29650.398..29650.402 rows=1 loops=1)
 Total runtime: 29650.696 ms

 Aggregate  (cost=234278.53..234278.54 rows=1 width=0) (actual
 time=11131.392..11131.396 rows=1 loops=1)
 Total runtime: 11131.694 ms

So - what other activity is happening on this machine? Either other
queries are taking up noticeable resources, or some other process is (it
might be disk activity from checkpointing, logging some other application).

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Andrus

Richard,


At a quick glance, the plans look the same to me. The overall costs are
certainly identical. That means whatever is affecting the query times it
isn't the query plan.

So - what other activity is happening on this machine? Either other
queries are taking up noticeable resources, or some other process is (it
might be disk activity from checkpointing, logging some other 
application).


Thank you.
This is dedicated server running only PostgreSql which serves approx 6 point 
of sales at this time.


Maybe those other clients make queries which invalidate lot of data loaded 
into server cache.
In next time server must read it again from disk which causes those 
perfomance differences.


top output is currently:

top - 13:13:10 up 22 days, 18:25,  1 user,  load average: 0.19, 0.12, 0.19
Tasks:  53 total,   2 running,  51 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.7% us,  2.0% sy,  0.0% ni, 78.3% id,  6.0% wa,  0.0% hi,  0.0% si
Mem:   2075828k total,  2022808k used,53020k free,0k buffers
Swap:  3911816k total,   88k used,  3911728k free,  1908536k cached

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
5382 postgres  15   0  144m  43m  40m S 15.0  2.2   0:00.45 postmaster
5358 postgres  15   0  152m  87m  75m S  0.3  4.3   0:00.97 postmaster
   1 root  16   0  1480  508  444 S  0.0  0.0   0:01.35 init
   2 root  34  19 000 S  0.0  0.0   0:00.01 ksoftirqd/0
   3 root  10  -5 000 S  0.0  0.0   0:00.00 events/0
   4 root  10  -5 000 S  0.0  0.0   0:00.42 khelper
   5 root  10  -5 000 S  0.0  0.0   0:00.00 kthread
   7 root  10  -5 000 S  0.0  0.0   2:03.91 kblockd/0
   8 root  20  -5 000 S  0.0  0.0   0:00.00 kacpid
 115 root  13  -5 000 S  0.0  0.0   0:00.00 aio/0
 114 root  15   0 000 S  0.0  0.0   8:49.67 kswapd0
 116 root  10  -5 000 S  0.0  0.0   0:10.32 xfslogd/0
 117 root  10  -5 000 S  0.0  0.0   0:39.96 xfsdatad/0
 706 root  10  -5 000 S  0.0  0.0   0:00.00 kseriod
 723 root  13  -5 000 S  0.0  0.0   0:00.00 kpsmoused
 738 root  11  -5 000 S  0.0  0.0   0:00.00 ata/0
 740 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_0
 741 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_1
 742 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_2
 743 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_3
 762 root  10  -5 000 S  0.0  0.0   0:17.54 xfsbufd
 763 root  10  -5 000 S  0.0  0.0   0:00.68 xfssyncd
 963 root  16  -4  1712  528  336 S  0.0  0.0   0:00.24 udevd
6677 root  15   0  1728  572  400 S  0.0  0.0   0:04.99 syslog-ng
7128 postgres  16   0  140m  10m 9900 S  0.0  0.5   0:05.60 postmaster

in few seconds later:

top - 13:14:01 up 22 days, 18:26,  1 user,  load average: 1.72, 0.53, 0.32
Tasks:  52 total,   2 running,  50 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.3% us,  3.0% sy,  0.0% ni,  0.0% id, 91.0% wa,  0.0% hi,  0.7% si
Mem:   2075828k total,  2022692k used,53136k free,0k buffers
Swap:  3911816k total,   88k used,  3911728k free,  1905028k cached

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
1179 postgres  18   0  155m 136m 122m D  6.7  6.7   1:32.52 postmaster
4748 postgres  15   0  145m 126m 122m D  1.3  6.2   0:14.38 postmaster
5358 postgres  16   0  160m  98m  81m D  0.7  4.9   0:01.21 postmaster
   1 root  16   0  1480  508  444 S  0.0  0.0   0:01.35 init
   2 root  34  19 000 S  0.0  0.0   0:00.01 ksoftirqd/0
   3 root  10  -5 000 S  0.0  0.0   0:00.00 events/0
   4 root  10  -5 000 S  0.0  0.0   0:00.42 khelper
   5 root  10  -5 000 S  0.0  0.0   0:00.00 kthread
   7 root  10  -5 000 S  0.0  0.0   2:03.97 kblockd/0
   8 root  20  -5 000 S  0.0  0.0   0:00.00 kacpid
 115 root  13  -5 000 S  0.0  0.0   0:00.00 aio/0
 114 root  15   0 000 S  0.0  0.0   8:49.79 kswapd0
 116 root  10  -5 000 S  0.0  0.0   0:10.32 xfslogd/0
 117 root  10  -5 000 S  0.0  0.0   0:39.96 xfsdatad/0
 706 root  10  -5 000 S  0.0  0.0   0:00.00 kseriod
 723 root  13  -5 000 S  0.0  0.0   0:00.00 kpsmoused
 738 root  11  -5 000 S  0.0  0.0   0:00.00 ata/0
 740 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_0
 741 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_1
 742 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_2
 743 root  11  -5 000 S  0.0  0.0   0:00.00 scsi_eh_3
 762 root  10  -5 000 S  0.0  0.0   0:17.54 xfsbufd
 763 root  10  -5 000 S  0.0  0.0   0:00.68 xfssyncd
 963 root  16  -4  1712  528  336 S  0.0  0.0   0:00.24 

Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread Richard Huxton
Andrus wrote:
 Richard,
 
 At a quick glance, the plans look the same to me. The overall costs are
 certainly identical. That means whatever is affecting the query times it
 isn't the query plan.

 So - what other activity is happening on this machine? Either other
 queries are taking up noticeable resources, or some other process is (it
 might be disk activity from checkpointing, logging some other
 application).
 
 Thank you.
 This is dedicated server running only PostgreSql which serves approx 6
 point of sales at this time.
 
 Maybe those other clients make queries which invalidate lot of data
 loaded into server cache.
 In next time server must read it again from disk which causes those
 perfomance differences.

In addition to top below, you'll probably find vmstat 5 useful.

 top output is currently:
 
 top - 13:13:10 up 22 days, 18:25,  1 user,  load average: 0.19, 0.12, 0.19
 Tasks:  53 total,   2 running,  51 sleeping,   0 stopped,   0 zombie
 Cpu(s): 13.7% us,  2.0% sy,  0.0% ni, 78.3% id,  6.0% wa,  0.0% hi, 
 0.0% si
 Mem:   2075828k total,  2022808k used,53020k free,0k buffers
 Swap:  3911816k total,   88k used,  3911728k free,  1908536k cached
 
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 5382 postgres  15   0  144m  43m  40m S 15.0  2.2   0:00.45 postmaster
 5358 postgres  15   0  152m  87m  75m S  0.3  4.3   0:00.97 postmaster
1 root  16   0  1480  508  444 S  0.0  0.0   0:01.35 init

Looks pretty quiet.

 in few seconds later:
 
 top - 13:14:01 up 22 days, 18:26,  1 user,  load average: 1.72, 0.53, 0.32
 Tasks:  52 total,   2 running,  50 sleeping,   0 stopped,   0 zombie
 Cpu(s):  5.3% us,  3.0% sy,  0.0% ni,  0.0% id, 91.0% wa,  0.0% hi, 
 0.7% si
 Mem:   2075828k total,  2022692k used,53136k free,0k buffers
 Swap:  3911816k total,   88k used,  3911728k free,  1905028k cached
 
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 1179 postgres  18   0  155m 136m 122m D  6.7  6.7   1:32.52 postmaster
 4748 postgres  15   0  145m 126m 122m D  1.3  6.2   0:14.38 postmaster
 5358 postgres  16   0  160m  98m  81m D  0.7  4.9   0:01.21 postmaster
1 root  16   0  1480  508  444 S  0.0  0.0   0:01.35 init

Here you're stuck waiting for disks (91.0% wa). Check out vmstat and
iostat to see what's happening.

-- 
  Richard Huxton
  Archonet Ltd

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Hash join on int takes 8..114 seconds

2008-11-20 Thread PFC


	OK so vmstat says you are IO-bound, this seems logical if the same plan  
has widely varying timings...


Let's look at the usual suspects :

	- how many dead rows in your tables ? are your tables data, or bloat ?  
(check vacuum verbose, etc)

- what's the size of the dataset relative to the RAM ?

Now let's look more closely at the query :

explain analyze
SELECT sum(1)
  FROM dok JOIN rid USING (dokumnr)
  JOIN toode USING (toode)
  LEFT JOIN artliik using(grupp,liik)
  WHERE rid.toode='X05' AND dok.kuupaev='2008-09-01'


OK, so artliik is a very small table (84 rows) :

Seq Scan on artliik  (cost=0.00..6.84 rows=84 width=19)
(actual time=20.104..29.845 rows=84 loops=1)

	I presume doing the query without artliik changes nothing to the runtime,  
yes ?

Let's look at the main part of the query :

  FROM dok JOIN rid USING (dokumnr) JOIN toode USING (toode)
  WHERE rid.toode='X05' AND dok.kuupaev='2008-09-01'

	Postgres's plan is logical. It starts by joining rid and dok since your  
WHERE is on those :


-  Hash Join  (cost=52103.94..233488.08 rows=24126 width=24) (actual  
time=100386.921..114037.986 rows=20588 loops=1)

Hash Cond: (outer.dokumnr = inner.dokumnr)
	-  Bitmap Heap Scan on rid  (cost=4127.51..175020.84 rows=317003  
width=28) (actual time=9.932..76225.918 rows=277294 loops=1)

  Recheck Cond: (toode = 'X05'::bpchar)
		  -  Bitmap Index Scan on rid_toode_idx (cost=0.00..4127.51 rows=317003  
width=0) (actual time=11105.807..11105.807 rows=280599 loops=1)

Index Cond: (toode = 'X05'::bpchar)
	-  Hash  (cost=47376.82..47376.82 rows=93444 width=4) (actual  
time=35082.427..35082.427 rows=105202 loops=1)
		  -  Index Scan using dok_kuupaev_idx on dok (cost=0.00..47376.82  
rows=93444 width=4) (actual time=42.110..34586.331 rows=105202 loops=1)

Index Cond: (kuupaev = '2008-09-01'::date)

Your problem here is that, no matter what, postgres will have to examine
- all rows where dok.kuupaev='2008-09-01',
- and all rows where rid.toode = 'X05'.
	If you use dok.kuupaev='2007-09-01' (note : 2007) it will probably have  
to scan many, many more rows.


	If you perform this query often you could CLUSTER rid on (toode) and dok  
on (kuupaev), but this can screw other queries.


What is the meaning of the columns ?

To make this type of query faster I would tend to think about :

- materialized views
	- denormalization (ie adding a column in one of your tables and a  
multicolumn index)
	- materialized summary tables (ie. summary of sales for last month, for  
instance)



Aggregate  (cost=234278.53..234278.54 rows=1 width=0) (actual  
time=114479.933..114479.936 rows=1 loops=1)
  -  Hash Left Join  (cost=52111.20..234218.21 rows=24126 width=0)  
(actual time=100435.523..114403.293 rows=20588 loops=1)
Hash Cond: ((outer.grupp = inner.grupp) AND (outer.liik =  
inner.liik))
-  Nested Loop  (cost=52103.94..233735.35 rows=24126 width=19)  
(actual time=100405.258..114207.387 rows=20588 loops=1)
  -  Index Scan using toode_pkey on toode  (cost=0.00..6.01  
rows=1 width=43) (actual time=18.312..18.325 rows=1 loops=1)

Index Cond: ('X05'::bpchar = toode)
  -  Hash Join  (cost=52103.94..233488.08 rows=24126  
width=24) (actual time=100386.921..114037.986 rows=20588 loops=1)

Hash Cond: (outer.dokumnr = inner.dokumnr)
-  Bitmap Heap Scan on rid  (cost=4127.51..175020.84  
rows=317003 width=28) (actual time=9.932..76225.918 rows=277294  
loops=1)

  Recheck Cond: (toode = 'X05'::bpchar)
  -  Bitmap Index Scan on rid_toode_idx  
(cost=0.00..4127.51 rows=317003 width=0) (actual time=11105.807..11105.807  
rows=280599 loops=1)

Index Cond: (toode = 'X05'::bpchar)
-  Hash  (cost=47376.82..47376.82 rows=93444  
width=4) (actual time=35082.427..35082.427 rows=105202 loops=1)
  -  Index Scan using dok_kuupaev_idx on dok  
(cost=0.00..47376.82 rows=93444 width=4) (actual time=42.110..34586.331  
rows=105202 loops=1)
Index Cond: (kuupaev =  
'2008-09-01'::date)

-  Hash  (cost=6.84..6.84 rows=84 width=19) (actual
time=30.220..30.220 rows=84 loops=1)
  -  Seq Scan on artliik  (cost=0.00..6.84 rows=84 width=19)
(actual time=20.104..29.845 rows=84 loops=1)
Total runtime: 114480.373 ms

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Performance and IN clauses

2008-11-20 Thread Tomas Vondra

Mark Roberts napsal(a):

On Tue, 2008-11-18 at 17:38 +0100, [EMAIL PROTECTED] wrote:

I bet there is no 'critical' length - this is just another case of
index
scan vs. seqscan. The efficiency depends on the size of the table /
row,
amount of data in the table, variability of the column used in the IN
clause, etc.

Splitting the query with 1000 items into 10 separate queries, the
smaller
queries may be faster but the total time consumed may be actually
higher.
Something like

10 * (time of small query) + (time to combine them)  (time of large
query)

If the performance of the 'split' solution is actually better than the
original query, it just means that the planner does not use index scan
when it actually should. That means that either

(a) the planner is not smart enough
(b) it has not current statistics of the table (run ANALYZE on the
table)
(c) the statistics are not detailed enough (ALTER TABLE ... SET
STATICTICS)
(d) the cost variables are not set properly (do not match the hardware
-
decreate index scan cost / increase seq scan cost)

regards
Tomas


I know that it's much faster (for us) to run many smaller queries than
one large query, and I think that it's primarily because of your reason
a.  Most of our problems come from Pg misunderstanding the results of a
join and making a bad plan decision.  Batching dramatically reduces the
liklihood of this.


As I already said - even the smartest planner won't work without correct 
input data. Have you tried fixing the points (b), (c) and (d)?


Fixing them might improve the planner performance so that you don't need 
the batchning at all.


regards
Tomas

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance