Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread david

On Tue, 8 May 2007, Greg Smith wrote:


On Tue, 8 May 2007, Luke Lonergan wrote:


 From discussions with the developers, the biggest issue is a technical
 one: the Linux VFS layer makes the [ZFS] port difficult.


Difficult on two levels.  First you'd have to figure out how to make it work 
at all; then you'd have to reshape it into a form that would be acceptable to 
the Linux kernel developers, who haven't seemed real keen on the idea so 
far.


given that RAID, snapshots, etc. are already in the linux kernel, I suspect 
that what will need to happen is for the filesystem to be ported without 
those features, and then for the userspace tools (that manipulate the volumes) 
to be ported to use the things already in the kernel.


The standard article I'm sure you've already seen this week on this topic is Jeff 
Bonwick's at http://blogs.sun.com/bonwick/entry/rampant_layering_violation


yep, that sounds like what I've been hearing.

what the ZFS (and reiserfs4) folks haven't been wanting to hear from the 
linux kernel devs is that the kernel devs are interested in having all these 
neat features available for use with all filesystems (and the linux kernel has 
a _lot_ of filesystems available). With Solaris you basically have UFS and 
ZFS, so it's not as big a deal there.


What really bugged me was his earlier article linked to there where he talks 
about how ZFS eliminates the need for hardware RAID controllers:

http://blogs.sun.com/bonwick/entry/raid_z

While there may be merit to that idea for some applications, like situations 
where you have a pig of a RAID5 volume, that's just hype for database writes. 
"We issue the SYNCHRONIZE CACHE command to the disks after pushing all data 
in a transaction group"--see, that would be the part the hardware controller 
is needed to accelerate.  If you really care about whether your data hit 
disk, there is no way to break the RPM barrier without hardware support.  The 
fact that he misunderstands such a fundamental point makes me wonder what 
other gigantic mistakes might be buried in his analysis.


I've seen similar comments from some of the linux kernel devs; they've 
used low-end raid controllers with small processors on them and think that 
a second core/socket in the main system to run software raid on is better.


David Lang



Re: [PERFORM] estimating the need for VACUUM FULL and REINDEX

2007-05-09 Thread Guillaume Cottenceau
Heikki Linnakangas heikki 'at' enterprisedb.com writes:

 Guillaume Cottenceau wrote:
  According to documentation[1], VACUUM FULL's only benefit is
  returning unused disk space to the operating system; am I correct
  in assuming there's also the benefit of optimizing the
  performance of scans, because rows are physically compacted on
  the disk?
 
 That's right.

Ok. Then I think the documentation should probably be updated? It
seems to totally miss this benefit.

We've been hit by degrading performance, probably because VACUUM
ANALYZE was run too seldom, and in this situation it seems that the
two solutions are either VACUUM FULL or dumping and recreating
the database. Maybe this situation should be described in the
documentation. On this list, everyone always says you should
VACUUM ANALYZE frequently, but little is done to consider the
case where we have to deal with an existing database on which this
hasn't been done properly.
 
  With that in mind, I've tried to estimate how much benefit would
  be brought by running VACUUM FULL, with the output of VACUUM
  VERBOSE. However, it seems that for example the removable rows
  reported by each VACUUM VERBOSE run is actually reused by VACUUM,
  so is not what I'm looking for.
 
 Take a look at contrib/pgstattuple. If a table has high percentage of
 free space, VACUUM FULL will compact that out.

Thanks a lot. I've followed this path and I think it should be
said that free_space must also be large compared to 8K -
free_percent can be large for tables with very few tuples even on
already compacted tables.
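
For reference, a minimal pgstattuple check (assuming the contrib module is
installed, and using a made-up table name) would look something like this:

  SELECT table_len, tuple_percent, dead_tuple_percent, free_space, free_percent
  FROM pgstattuple('mytable');

If free_percent is high and free_space is also large compared to the 8 kB page
size, VACUUM FULL should be able to compact the table noticeably.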
 
  Then according to documentation[2], REINDEX has some benefit when
  all but a few index keys on a page have been deleted, because the
  page remains allocated (thus, I assume it improves index scan
  performance, am I correct?). However, again I'm unable to
  estimate the expected benefit. With a slightly modified version
  of a query found in documentation[3] to see the pages used by a
  relation[4], I'm able to see that the index data from a given
  table...
 
 See pgstatindex, in the same contrib-module. The number you're looking
 for is avg_leaf_density. REINDEX will bring that to 90% (with default
 fill factor), so if it's much lower than that REINDEX will help.

Whoops, it seems that this was not available in the pgstattuple of pg 7.4 :/
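
(For reference, on releases that do ship pgstatindex, the check would be
something along these lines, with the index name being only an example:

  SELECT avg_leaf_density, leaf_fragmentation
  FROM pgstatindex('some_index_name');

An avg_leaf_density well below the ~90% that REINDEX produces suggests a
REINDEX is worthwhile.)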

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread Steinar H. Gunderson
On Wed, May 09, 2007 at 01:57:51AM -0700, [EMAIL PROTECTED] wrote:
 given that RAID, snapshots, etc are already in the linux kernel, I suspect 
 that what will need to happen is for the filesystem to be ported without 
 those features and then the userspace tools (that manipulate the volumes ) 
 be ported to use the things already in the kernel.

Well, part of the idea behind ZFS is that these parts are _not_ separated in
layers -- for instance, the filesystem can push data down to the RAID level
to determine the stripe size used.

Whether this is a good idea is of course hotly debated, but I don't think you
can port just the filesystem part and call it a day.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread david

On Wed, 9 May 2007, Steinar H. Gunderson wrote:


On Wed, May 09, 2007 at 01:57:51AM -0700, [EMAIL PROTECTED] wrote:

given that RAID, snapshots, etc are already in the linux kernel, I suspect
that what will need to happen is for the filesystem to be ported without
those features and then the userspace tools (that manipulate the volumes )
be ported to use the things already in the kernel.


Well, part of the idea behind ZFS is that these parts are _not_ separated in
layers -- for instance, the filesystem can push data down to the RAID level
to determine the stripe size used.


there's nothing preventing this from happening if they are separate layers 
either.


there are some performance implications of the separate layers, but until 
someone has the ability to do head-to-head comparisons it's hard to say 
which approach will win (in theory the lack of layers makes for faster 
code, but in practice the fact that each layer is gone over by experts 
looking for ways to optimize it may overwhelm the layering overhead)



Whether this is a good idea is of course hotly debated, but I don't think you
can port just the filesystem part and call it a day.


Oh, I'm absolutely sure that doing so won't satisfy people (and would 
generate howls of outrage from some parts), but having watched other 
groups try and get things into the kernel that the kernel devs felt were 
layering violations, I think that it's what will ultimately happen.


David Lang



Re: [PERFORM] What's The Difference Between VACUUM and VACUUM ANALYZE?

2007-05-09 Thread Gregory Stark
Alvaro Herrera [EMAIL PROTECTED] writes:

 Steinar H. Gunderson wrote:

 Shoot me if I'm wrong here, but doesn't VACUUM ANALYZE check _all_ tuples,
 as compared to the random selection employed by ANALYZE?

 You are wrong, but it won't be me the one to shoot you.

 There have been noises towards making the ANALYZE portion use the same
 scan that VACUUM already does, but nobody has written the code (it would
 be useful for some kinds of stats).

I think it does for the count of total records in the table. 
But not for the rest of the stats.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: FW: [PERFORM]

2007-05-09 Thread Gregory Stark
[EMAIL PROTECTED] writes:

 with a standard 7200 rpm drive ~150 transactions/sec sounds about right

 to really speed things up you want to get a disk controller with a battery
 backed cache so that the writes don't need to hit the disk to be safe.

Note that this is only if you're counting transactions/sec in a single
session. You can get much more if you have many sessions since they can all
commit together in a single disk i/o.
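
As a rough illustration (database name, scale factor and transaction counts
here are arbitrary), pgbench shows the effect when you vary the client count:

  pgbench -i -s 10 bench        # initialize a small test database
  pgbench -c 1 -t 2000 bench    # one session: every commit waits for its own WAL fsync
  pgbench -c 20 -t 100 bench    # many sessions: concurrent commits can share one WAL flush

On a single 7200 rpm drive with no write cache, the single-session run tends to
be bounded by rotational latency (roughly the figure mentioned above), while
the concurrent run can go considerably higher.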

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
Am Dienstag, 8. Mai 2007 17:53 schrieb Tom Lane:
 Hmm, I'd have expected it to discount the repeated indexscans a lot more
 than it seems to be doing for you.  As an example in the regression
 database, note what happens to the inner indexscan cost estimate when
 the number of outer tuples grows:

I can reproduce your results in the regression test database. 8.2.1 and 8.2.4 
behave the same.

I checked the code around cost_index(), and the assumptions appear to be 
correct (at least this query doesn't produce wildly unusual data).  
Apparently, however, the caching effects are much more significant than the 
model takes into account.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[PERFORM] Apparently useless bitmap scans

2007-05-09 Thread Peter Eisentraut
There's another odd thing about this plan from yesterday.

Query:

SELECT
eh_subj.header_body AS subject,
count(distinct eh_from.header_body)
FROM
email JOIN mime_part USING (email_id)
JOIN email_header eh_subj USING (email_id, mime_part_id)
JOIN email_header eh_from USING (email_id, mime_part_id)
WHERE
eh_subj.header_name = 'subject'
AND eh_from.header_name = 'from'
AND mime_part_id = 0
AND (time >= timestamp '2007-05-05 17:01:59' AND time <
timestamp '2007-05-05 17:01:59' + interval '60 min')
GROUP BY
eh_subj.header_body;

Plan:


QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1920309.81..1920534.21 rows=11220 width=184) (actual time=5349.493..5587.536 rows=35000 loops=1)
   ->  Sort  (cost=1920309.81..1920337.86 rows=11220 width=184) (actual time=5349.427..5392.110 rows=35000 loops=1)
         Sort Key: eh_subj.header_body
         ->  Nested Loop  (cost=15576.58..1919555.05 rows=11220 width=184) (actual time=537.938..5094.377 rows=35000 loops=1)
               ->  Nested Loop  (cost=15576.58..475387.23 rows=11020 width=120) (actual time=537.858..4404.330 rows=35000 loops=1)
                     ->  Nested Loop  (cost=15576.58..430265.44 rows=11092 width=112) (actual time=537.768..4024.184 rows=35000 loops=1)
                           ->  Bitmap Heap Scan on email_header eh_from  (cost=15576.58..16041.55 rows=107156 width=104) (actual time=537.621..1801.032 rows=280990 loops=1)
                                 Recheck Cond: ((mime_part_id = 0) AND (header_name = 'from'::text))
                                 ->  BitmapAnd  (cost=15576.58..15576.58 rows=160 width=0) (actual time=500.006..500.006 rows=0 loops=1)
                                       ->  Bitmap Index Scan on dummy_index  (cost=0.00..3724.22 rows=107156 width=0) (actual time=85.025..85.025 rows=280990 loops=1)
                                       ->  Bitmap Index Scan on idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual time=173.006..173.006 rows=280990 loops=1)
                                       ->  Bitmap Index Scan on dummy2_index  (cost=0.00..5992.25 rows=107156 width=0) (actual time=174.463..174.463 rows=280990 loops=1)
                           ->  Index Scan using email_pkey on email  (cost=0.00..3.85 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
                                 Index Cond: (email.email_id = eh_from.email_id)
                                 Filter: ((time >= '2007-05-05 17:01:59'::timestamp without time zone) AND (time < '2007-05-05 18:01:59'::timestamp without time zone))
                     ->  Index Scan using mime_part_pkey on mime_part  (cost=0.00..4.06 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000)
                           Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0))
               ->  Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj  (cost=0.00..130.89 rows=13 width=104) (actual time=0.009..0.015 rows=1 loops=35000)
                     Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id))
                     Filter: (header_name = 'subject'::text)
 Total runtime: 5625.024 ms


I'm wondering what it wants to achieve with these three index scans:

->  Bitmap Index Scan on dummy_index  (cost=0.00..3724.22 rows=107156 width=0) (actual time=85.025..85.025 rows=280990 loops=1)
->  Bitmap Index Scan on idx__email_header__from_local  (cost=0.00..5779.24 rows=107156 width=0) (actual time=173.006..173.006 rows=280990 loops=1)
->  Bitmap Index Scan on dummy2_index  (cost=0.00..5992.25 rows=107156 width=0) (actual time=174.463..174.463 rows=280990 loops=1)

The indexes in question are:

CREATE INDEX dummy_index ON email_header ((555)) WHERE mime_part_id = 0 AND 
header_name = 'from';
CREATE INDEX dummy2_index ON email_header (substr(header_body,5)) WHERE 
mime_part_id = 0 AND header_name = 'from';
CREATE INDEX idx__email_header__from_local ON email_header 
(get_localpart(header_body)) WHERE mime_part_id = 0 AND header_name = 'from';

It appears to want to use these indexes to get the restriction

AND eh_from.header_name = 'from'
AND mime_part_id = 0

from the query, but why does it need three of them to do it, when all
of them have the same predicate and none of them has an indexed
expression that appears in the query?

There are more partial indexes with the 

[PERFORM] Cannot make GIN intarray index be used by the planner

2007-05-09 Thread Valentine Gogichashvili

Hello all,

I am trying to move from GiST intarray index to GIN intarray index, but my
GIN index is not being used by the planner.

The normal query is like that

select *
 from sourcetablewith_int4
where ARRAY[myint] @ myint_array
  and some_other_filters

(with GiST index everything works fine, but GIN index is not being used)

If I create the same table populating it with text[] data like

select myint_array::text[] as myint_array_as_textarray
 into newtablewith_text
 from sourcetablewith_int4

and then create a GIN index using this new text[] column

the planner starts to use the index and queries run with great speed when
the query looks like this:

select *
 from newtablewith_text
where ARRAY['myint'] @ myint_array_as_textarray
  and some_other_filters

Where can the problem be with the _int4 GIN index in this constellation?

For now, enable_seqscan is set to off in the configuration.

With best regards,

-- Valentine Gogichashvili


[PERFORM] Poor performance with queries using clause: sth IN (...)

2007-05-09 Thread Andrzej Zawadzki
That's an email from my friend.
Any hints?

 Original Message 
Subject: bug
Date: Wed, 09 May 2007 15:03:00 +0200
From: Michal Postupalski
To: Andrzej Zawadzki

We've just changed our database from 8.1 to 8.2 and we are
grief-stricken about very poor performance with queries using the clause:
sth IN (...). As we can see, any such query is translated to sth = ANY
('{}'::bpchar[]) and it takes much more time because it doesn't
use indexes. Why? How can we speed up these queries? I've just read
Performance of IN (...) vs. = ANY array[...] on the pgsql-performance
mailing list and I didn't find any solutions. Can anybody tell me what
I can do with postgres to force it to use indexes? If there isn't any
solution I'm afraid that we will have to downgrade to the previous
version 8.1.

example:
SELECT count(*)
FROM kredytob b, kredyty k
WHERE true
AND b.kredytid = k.id
AND '' IN ('', upper(b.nazwisko))
AND '' IN ('', upper(b.imie))
AND '7804485' IN ('', b.pesel)
AND '' IN ('', upper(trim(b.dowseria))) AND '' IN ('', b.dowosnr) AND 0
IN (0, b.typkred) AND k.datazwrot IS NULL;

regards...
Michał Postupalski




Re: [PERFORM] Cannot make GIN intarray index be used by the planner

2007-05-09 Thread Oleg Bartunov

Do you have both indexes (GiST, GIN) on the same table ?

On Wed, 9 May 2007, Valentine Gogichashvili wrote:


Hello all,

I am trying to move from GiST intarray index to GIN intarray index, but my
GIN index is not being used by the planner.

The normal query is like that

select *
from sourcetablewith_int4
where ARRAY[myint] @ myint_array
 and some_other_filters

(with GiST index everything works fine, but GIN index is not being used)

If I create the same table populating it with text[] data like

select myint_array::text[] as myint_array_as_textarray
into newtablewith_text
from sourcetablewith_int4

and then create a GIN index using this new text[] column

the planner starts to use the index and queries run with grate speed when
the query looks like that:

select *
from newtablewith_text
where ARRAY['myint'] @ myint_array_as_textarray
 and some_other_filters

Where the problem can be with _int4 GIN index in this constellation?

by now the enable_seqscan is set to off in the configuration.

With best regards,

-- Valentine Gogichashvili



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [PERFORM] Cannot make GIN intarray index be used by the planner

2007-05-09 Thread Valentine Gogichashvili

I have experimented quite a lot. The first thing I did when starting the attempt to
move from GiST to GIN was to drop the GiST index and create a brand new GIN
index... after that did not bring results, I started to create all these
tables with different sets of indexes and so on...

So the answer to the question is: no, there is only a GIN index on the table.

Thank you in advance,

Valentine

On 5/9/07, Oleg Bartunov [EMAIL PROTECTED] wrote:


Do you have both indexes (GiST, GIN) on the same table ?

On Wed, 9 May 2007, Valentine Gogichashvili wrote:

 Hello all,

 I am trying to move from GiST intarray index to GIN intarray index, but
my
 GIN index is not being used by the planner.

 The normal query is like that

 select *
 from sourcetablewith_int4
 where ARRAY[myint] @ myint_array
  and some_other_filters

 (with GiST index everything works fine, but GIN index is not being used)

 If I create the same table populating it with text[] data like

 select myint_array::text[] as myint_array_as_textarray
 into newtablewith_text
 from sourcetablewith_int4

 and then create a GIN index using this new text[] column

 the planner starts to use the index and queries run with grate speed
when
 the query looks like that:

 select *
 from newtablewith_text
 where ARRAY['myint'] @ myint_array_as_textarray
  and some_other_filters

 Where the problem can be with _int4 GIN index in this constellation?

 by now the enable_seqscan is set to off in the configuration.

 With best regards,

 -- Valentine Gogichashvili


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83





--
ვალენტინ გოგიჩაშვილი
Valentine Gogichashvili


Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Daniel Cristian Cruz

I'm having something weird too...

Look:

Nested Loop Left Join  (cost=93.38..7276.26 rows=93 width=58) (actual time=99.211..4804.525 rows=2108 loops=1)
  ->  Hash Join  (cost=93.38..3748.18 rows=93 width=4) (actual time=0.686..20.632 rows=45 loops=1)
        Hash Cond: ((u.i)::text = (m.i)::text)
        ->  Seq Scan on u  (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.010..7.813 rows=10291 loops=1)
        ->  Hash  (cost=87.30..87.30 rows=30 width=7) (actual time=0.445..0.445 rows=45 loops=1)
              ->  Index Scan using m_pkey on m  (cost=0.00..87.30 rows=30 width=7) (actual time=0.046..0.371 rows=45 loops=1)
                    Index Cond: (t = 1613)
                    Filter: ((a)::text = 'Y'::text)
  ->  Index Scan using s_pkey on s  (cost=0.00..37.33 rows=3 width=58) (actual time=19.864..106.198 rows=47 loops=45)
        Index Cond: ((u.i = s.u) AND ((s.p)::text = '4'::text) AND (s.t = 1613) AND ((s.c)::text = 'cmi.core.total_time'::text))
Total runtime: 4805.975 ms

And disabling all the joins Tom said:

Nested Loop Left Join  (cost=0.00..16117.12 rows=93 width=58) (actual time=2.706..168.556 rows=2799 loops=1)
  ->  Nested Loop  (cost=0.00..13187.94 rows=93 width=4) (actual time=2.622..125.739 rows=50 loops=1)
        ->  Seq Scan on u  (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.012..9.863 rows=10291 loops=1)
        ->  Index Scan using m_pkey on m  (cost=0.00..0.80 rows=1 width=7) (actual time=0.009..0.009 rows=0 loops=10291)
              Index Cond: ((m.t = 1615) AND ((u.i)::text = (m.i)::text))
              Filter: ((a)::text = 'Y'::text)
  ->  Index Scan using s_pkey on s  (cost=0.00..31.09 rows=2 width=58) (actual time=0.047..0.778 rows=56 loops=50)
        Index Cond: ((u.i = s.u) AND ((s.p)::text = '4'::text) AND (s.t = 1615) AND ((s.c)::text = 'cmi.core.total_time'::text))
Total runtime: 169.836 ms

This is PostgreSQL 8.2.3 on x86_64-redhat-linux-gnu, with shared_buffers at
1640MB, effective_cache_size at 5400MB and 8GB of RAM, where all
shared_buffers blocks are used (pg_buffercache, relfilenode IS NOT
NULL).

Note that even when I set default_statistics_target to 500 and run
ANALYZE s;, the estimated row count for the index scan on s does not come
close to the actual number of rows.

Could it be related?

2007/5/9, Peter Eisentraut [EMAIL PROTECTED]:

Am Dienstag, 8. Mai 2007 17:53 schrieb Tom Lane:
 Hmm, I'd have expected it to discount the repeated indexscans a lot more
 than it seems to be doing for you.  As an example in the regression
 database, note what happens to the inner indexscan cost estimate when
 the number of outer tuples grows:

I can reproduce your results in the regression test database. 8.2.1 and 8.2.4
behave the same.

I checked the code around cost_index(), and the assumptions appear to be
correct (at least this query doesn't produce wildly unusual data).
Apparently, however, the caching effects are much more significant than the
model takes into account.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/






--
Daniel Cristian Cruz



Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Gregory Stark

Daniel Cristian Cruz [EMAIL PROTECTED] writes:

   ->  Nested Loop  (cost=0.00..13187.94 rows=93 width=4) (actual time=2.622..125.739 rows=50 loops=1)
         ->  Seq Scan on u  (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.012..9.863 rows=10291 loops=1)
         ->  Index Scan using m_pkey on m  (cost=0.00..0.80 rows=1 width=7) (actual time=0.009..0.009 rows=0 loops=10291)

That's not discounting the nested loop for cache effect at all!

What is your effective_cache_size for this?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Daniel Cristian Cruz

2007/5/9, Gregory Stark [EMAIL PROTECTED]:


Daniel Cristian Cruz [EMAIL PROTECTED] writes:

   ->  Nested Loop  (cost=0.00..13187.94 rows=93 width=4) (actual time=2.622..125.739 rows=50 loops=1)
         ->  Seq Scan on u  (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.012..9.863 rows=10291 loops=1)
         ->  Index Scan using m_pkey on m  (cost=0.00..0.80 rows=1 width=7) (actual time=0.009..0.009 rows=0 loops=10291)

That's not discounting the nested loop for cache effect at all!

What is your effective_cache_size for this?


effective_cache_size is 5400MB.

I forgot to mention a modification to the costs:
cpu_tuple_cost = 0.2
which forced the use of indexes.

I set it to 0.01 and the plan has an index scan on m before the hash on
u, and is about 15% slower:

        Hash Cond: ((u.i)::text = (m.i)::text)
        ->  Seq Scan on u  (cost=0.00..2838.80 rows=10289 width=4) (actual time=0.007..6.138 rows=10292 loops=1)
        ->  Hash  (cost=87.30..87.30 rows=30 width=7) (actual time=0.185..0.185 rows=50 loops=1)
              ->  Index Scan using m_pkey on m  (cost=0.00..87.30 rows=30 width=7) (actual time=0.021..0.144 rows=50 loops=1)
                    Index Cond: (t = 1615)
                    Filter: ((a)::text = 'Y'::text)

I'm still confused since I didn't understand what "That's not
discounting the nested loop for cache effect at all!" could mean...

Thanks for the help.
--
Daniel Cristian Cruz



Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Peter Eisentraut
Am Mittwoch, 9. Mai 2007 16:11 schrieb Tom Lane:
 Well, there's something funny going on here.  You've got for instance

->  Index Scan using email_pkey on email  (cost=0.00..3.85 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
      Index Cond: (email.email_id = eh_from.email_id)
      Filter: ((time >= '2007-05-05 17:01:59'::timestamp without time zone) AND (time < '2007-05-05 18:01:59'::timestamp without time zone))

 on the inside of a nestloop whose outer side is predicted to return
 107156 rows.  That should've been discounted to *way* less than 3.85
 cost units per iteration.

This is the new plan with 8.2.4.  It's still got the same problem, though.


QUERY PLAN  
   
---
 GroupAggregate  (cost=5627064.21..5627718.73 rows=32726 width=184) (actual 
time=4904.834..5124.585 rows=35000 loops=1)
   -  Sort  (cost=5627064.21..5627146.03 rows=32726 width=184) (actual 
time=4904.771..4947.892 rows=35000 loops=1)
 Sort Key: eh_subj.header_body
 -  Nested Loop  (cost=0.00..5624610.06 rows=32726 width=184) (actual 
time=0.397..4628.141 rows=35000 loops=1)
   -  Nested Loop  (cost=0.00..1193387.12 rows=28461 width=120) 
(actual time=0.322..3960.360 rows=35000 loops=1)
 -  Nested Loop  (cost=0.00..1081957.26 rows=28648 
width=112) (actual time=0.238..3572.023 rows=35000 loops=1)
   -  Index Scan using dummy_index on email_header 
eh_from  (cost=0.00..13389.15 rows=280662 width=104) (actual 
time=0.133..1310.248 rows=280990 loops=1)
   -  Index Scan using email_pkey on email  
(cost=0.00..3.79 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
 Index Cond: (email.email_id = eh_from.email_id)
 Filter: ((time = '2007-05-05 
17:01:59'::timestamp without time zone) AND (time  '2007-05-05 
18:01:59'::timestamp without time zone))
 -  Index Scan using mime_part_pkey on mime_part  
(cost=0.00..3.88 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000)
   Index Cond: ((email.email_id = mime_part.email_id) 
AND (mime_part.mime_part_id = 0))
   -  Index Scan using idx__email_header__email_id__mime_part_id 
on email_header eh_subj  (cost=0.00..155.47 rows=18 width=104) (actual 
time=0.009..0.014 rows=1 loops=35000)
 Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = 
eh_subj.mime_part_id))
 Filter: (header_name = 'subject'::text)
 Total runtime: 5161.390 ms

 Are you using any nondefault planner settings?

random_page_cost = 3
effective_cache_size = 384MB

 How big are these tables, anyway?

email   35 MB
email_header421 MB
mime_part   37 MB

Everything is analyzed, vacuumed, and reindexed.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



ZFS and Postgresql - WAS Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread Jignesh Shah

Hello Ian,

I have done some testing with postgresql and ZFS on Solaris 10 11/06.
While I work for Sun, I don't claim to be a ZFS expert (nor, for that matter, 
a Solaris or PostgreSQL expert).


Let's first look at the scenarios for how postgresql can be deployed on 
Solaris.

First, the Solaris options:
1. UFS with the default setup (which is a buffered file system)
2. UFS with the forcedirectio option (an unbuffered file system)
3. ZFS with defaults (128K recordsize, checksums on, no compression)
4. ZFS with compression (default compression using an LZ* algorithm .. now 
even a gzip algorithm is supported)


(For simplicity I am not considering RAID levels here since that 
increases the number of scenarios quite a bit and also skipping Solaris 
Volume Manager - legacy volume management capabilities in Solaris)
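
For illustration, scenarios 3 and 4 map onto zfs properties roughly like
this (the pool/dataset name is only an example):

  zfs set compression=on tank/pgdata     # scenario 4: the default LZ-style compression
  zfs set compression=gzip tank/pgdata   # on releases that support the gzip algorithm
  zfs set checksum=off tank/pgdata       # checksumming off; only affects newly allocated blocks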


Now for the postgresql.conf options
a. wal_sync_method set to the default - maps to open_datasync
b. wal_sync_method set to fdatasync

(assuming checkpoint_segments and wal_buffers are high already)
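
Concretely, the two WAL configurations correspond to postgresql.conf lines
like the following (the checkpoint_segments and wal_buffers values are just
placeholders for "already high"):

  wal_sync_method = open_datasync   # option a: the platform default described above
  # wal_sync_method = fdatasync     # option b
  checkpoint_segments = 128
  wal_buffers = 1024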

(These are my test results based on the way I used the workload; your 
mileage will vary)

So with these types of configurations I found the following:
1a. Default UFS with the default wal_sync_method - sucked for me, mostly 
tested using pgbench or EAStress type workloads
1b. Default UFS with fdatasync - works well, especially when increasing 
segmapsize from the default 12% to higher values
2a. forcedirectio with the default wal_sync_method - works well but is then 
limited by raw disk performance
(In a way it is good to have a RAID controller with a big write cache for 
it.. One advantage is lower system cpu utilization)

2b. Didn't see a huge difference from 2a in this case
3a. It was better than 1a but still limited
3b. It was better even than 3a and 1b, but cpu utilization seemed higher
4a. Didn't test this out
4b. Hard to say, since in my case I wasn't disk bound (per se) but 
CPU bound. The compression helps when the number of IOs to the disk is high; 
it cuts that down at the cost of CPU cycles



Overall, ZFS seems to improve performance with PostgreSQL on Solaris 10, 
with somewhat higher system time compared to UFS.
(So the final result depends on the metrics by which you measure 
performance :-) ) (ZFS engineers are constantly improving the 
performance, and I have seen improvements from the Solaris 10 1/06 
release to my current setup)


Of course I haven't compared against any other OS.. If someone has 
already done that I would be interested in knowing the results.


Now comes the thing that I am still exploring:
* Do we checksum the WAL? I guess we do .. Which means that we are 
now doing double checksumming on the data: once in ZFS and once in 
postgresql. ZFS does allow checksumming to be turned off (but only on newly 
allocated blocks). But of course the philosophical question is where it should be 
done (ZFS or PostgreSQL). ZFS checksumming gives the ability to correct the 
data on a bad checksum if you use mirrored devices. PostgreSQL doesn't 
give that ability and in case of an error would fail. (I don't know the 
exact behavior of postgresql when it encounters a failed checksum)


Hope this helps.


Regards,
Jignesh



李彦 Ian Li wrote:
In #postgresql on freenode, somebody once mentioned that ZFS on 
Solaris helps pgsql performance a lot, so does anyone have 
information about that?


Steve Atkins wrote:


On May 7, 2007, at 2:55 PM, David Levy wrote:


Hi,

I am about to order a new server for my Postgres cluster. I will
probably get a Dual Xeon Quad Core instead of my current Dual Xeon.
Which OS would you recommend to optimize Postgres behaviour (i/o
access, multithreading, etc) ?

I am hesitating between Fedora Core 6, CentOS and Debian. Can anyone
help with this ?


Well, all three you mention are much the same, just with a different
badge on the box, as far as performance is concerned. They're all
going to be a moderately recent Linux kernel, with your choice
of filesystems, so any choice between them is going to be driven
more by available staff and support or personal preference.

I'd probably go CentOS 5 over Fedora  just because Fedora doesn't
get supported for very long - more of an issue with a dedicated
database box with a long lifespan than your typical desktop or
interchangeable webserver.

I might also look at Solaris 10, though. I've yet to play with it 
much, but it

seems nice, and I suspect it might manage 8 cores better than current
Linux setups.

Cheers,
  Steve






Regards

Ian





Re: ZFS and Postgresql - WAS Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread Alvaro Herrera
Jignesh Shah escribió:

 Now comes the thing that I am still exploring
 * Do we do checksum in WAL ? I guess we do .. Which means that we are 
 now doing double checksumming on the data. One in ZFS and one in 
 postgresql. ZFS does allow checksumming to be turned off (but on new 
 blocks allocated). But of course the philosophy is where should it be 
 done (ZFS or PostgreSQL).

Checksums on WAL are not optional in Postgres, because AFAIR they are
used to determine when it should stop recovering.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [PERFORM] Best OS for Postgres 8.2

2007-05-09 Thread Jim Nasby

On May 8, 2007, at 2:59 AM, [EMAIL PROTECTED] wrote:
one issue with journaling filesystems: if you journal the data as  
well as the metadata you end up with a very reliable setup, however  
it means that all your data needs to be written twice, once to the  
journal, and once to the final location. the write to the journal  
can be slightly faster than a normal write to the final location  
(the journal is a sequential write to an existing file), however  
the need to write twice can effectively cut your disk I/O bandwidth  
in half when doing heavy writes. worse, when you end up writing more  
than will fit in the journal (128M is the max for ext3) the entire  
system then needs to stall while the journal gets cleared to make  
space for the additional writes.


That's why you want to mount ext3 partitions used with PostgreSQL  
with data=writeback.


Some folks will also use a small filesystem for pg_xlog and mount  
that as ext2.
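
For illustration only, an fstab arrangement along those lines might look like
this (device names and mount points are just examples):

  /dev/sdb1  /var/lib/pgsql/data          ext3  noatime,data=writeback  0 2
  /dev/sdc1  /var/lib/pgsql/data/pg_xlog  ext2  noatime                 0 2

data=writeback only relaxes journaling of file contents, not metadata;
PostgreSQL still fsyncs its own WAL, which is why this is generally considered
acceptable for the data directory.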

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question

2007-05-09 Thread Jim Nasby

On May 7, 2007, at 11:10 PM, Yudhvir Singh Sidhu wrote:

Jim Nasby wrote:

On May 5, 2007, at 5:57 PM, Yudhvir Singh Sidhu wrote:
Problem:  Vacuum times jump up from 45 minutes, or 1:30 minutes  
to 6+ hours overnight, once every 1 to 3 months.
Solutions tried:  db truncate - brings vacuum times down.  
Reindexing brings vacuum times down.


Does it jump up to 6+ hours just once and then come back down? Or  
once at 6+ hours does it stay there?


Getting that kind of change in vacuum time sounds a lot like you  
suddenly didn't have enough maintenance_work_mem to remember all  
the dead tuples in one pass; increasing that setting might bring  
things back in line (you can increase it on a per-session basis,  
too).


Also, have you considered vacuuming during the day, perhaps via  
autovacuum? If you can vacuum more often you'll probably get less  
bloat. You'll probably want to experiment with the  
vacuum_cost_delay settings to reduce the impact of vacuuming  
during the day (try setting vacuum_cost_delay to 20 as a starting  
point).
It ramps up and I have to run a db truncate to bring it back down.  
On some machines it creeps up, on others it spikes. I have seen it  
climb from 6 to 12 to 21 in 3 consecutive days. Well, what's one to  
do? I have maintenance_work_mem set to 32768 - is that enough?


Depends on how many dead rows there are to be vacuumed. If there's a  
lot, you could certainly be exceeding maintenance_work_mem. If you  
look closely at the output of VACUUM VERBOSE you'll see the indexes  
for a particular table being scanned more than once if all the dead  
rows can't fit into maintenance_work_mem.
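
As a hedged illustration, the per-session override mentioned above would look
something like this (the value and table name are only examples):

  SET maintenance_work_mem = 262144;   -- in kB, i.e. 256 MB, for just this session
  VACUUM VERBOSE ANALYZE some_big_table;

If the verbose output then shows each index being scanned only once, the
setting is large enough for that table.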



I vacuum daily.


If you've got high update rates, that very likely might not be often  
enough.


I just turned vacuum verbose on on one of the systems and will find  
out tomorrow what it shows me. I plan on playing with Max_fsm_  
settings tomorrow. And I'll keep you guys up to date.


The tail end of vacuumdb -av will tell you exactly how much room is  
needed in the FSM.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)





Re: [PERFORM] Nested loops overpriced

2007-05-09 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Are you using any nondefault planner settings?

 random_page_cost = 3
 effective_cache_size = 384MB

 How big are these tables, anyway?

 email 35 MB
 email_header  421 MB
 mime_part 37 MB

Hmmm ... I see at least part of the problem, which is that email_header
is joined twice in this query, which means that it's counted twice in
figuring the total volume of pages competing for cache space.  So the
thing thinks cache space is oversubscribed nearly 3X when in reality
the database is fully cached.  I remember having dithered about whether
to try to avoid counting the same physical relation more than once in
total_table_pages, but this example certainly suggests that we
shouldn't.  Meanwhile, do the estimates get better if you set
effective_cache_size to 1GB or so?

To return to your original comment: if you're trying to model a
situation with a fully cached database, I think it's sensible
to set random_page_cost = seq_page_cost = 0.1 or so.  You had
mentioned having to decrease them to 0.02, which seems unreasonably
small to me too, but maybe with the larger effective_cache_size
you won't have to go that far.
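
In postgresql.conf terms the suggestion would be something along these lines
(the values are only starting points for a fully cached database):

  effective_cache_size = 1GB   # or whatever actually reflects the available cache
  seq_page_cost = 0.1          # seq_page_cost exists as of 8.2
  random_page_cost = 0.1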

regards, tom lane



Re: [PERFORM] Cannot make GIN intarray index be used by the planner

2007-05-09 Thread Tom Lane
[cc'ing to pgsql-hackers since this is looking like a contrib/intarray bug]

Valentine Gogichashvili [EMAIL PROTECTED] writes:
 here is the DT

That works fine for me in 8.2:

regression=#  explain SELECT id, (myintarray_int4)
  FROM myintarray_table_nonulls
 WHERE ARRAY[8] @ myintarray_int4;
QUERY PLAN  
  
--
 Index Scan using idx_nonnulls_myintarray_int4_gin on myintarray_table_nonulls  
(cost=0.00..8.27 rows=1 width=36)
   Index Cond: ('{8}'::integer[] @ myintarray_int4)
(2 rows)

What I am betting is that you've installed contrib/intarray in this
database and that's bollixed things up somehow.  In particular, intarray
tries to take over the position of default gin opclass for int4[],
and the opclass that it installs as default has operators named just
like the built-in ones.  If somehow your query is using pg_catalog.@
instead of intarray's public.@, then the planner wouldn't think the
index is relevant.

In a quick test your example still works with intarray installed, because
what it's really created is public.@ (integer[], integer[]) which is
an exact match and therefore takes precedence over the built-in
pg_catalog.@ (anyarray, anyarray).  But if for example you don't have
public in your search_path then the wrong operator would be chosen.

Please look at the pg_index entry for your index, eg

select * from pg_index where indexrelid =
'versionA.idx_nonnulls_myintarray_int4_gin'::regclass;

and see whether the index opclass is the built-in one or not.
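
A slightly more direct way to see which opclass the index was built with
(reusing the same index name) would be something like:

  SELECT opc.opcname, ns.nspname
  FROM pg_index i
  JOIN pg_opclass opc ON opc.oid = i.indclass[0]
  JOIN pg_namespace ns ON ns.oid = opc.opcnamespace
  WHERE i.indexrelid = 'versionA.idx_nonnulls_myintarray_int4_gin'::regclass;

If nspname is not pg_catalog, the index is using intarray's opclass rather
than the built-in one.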

Note to hackers: we've already discussed that intarray shouldn't be
trying to take over the default gin opclass, but I am beginning to
wonder if it still has a reason to live at all.  We should at least
consider removing the redundant operators to avoid risks like this one.

regards, tom lane



[PERFORM] Vacuum Times - Verbose and maintenance_work_mem

2007-05-09 Thread Y Sidhu

How do you specify a log file for vacuum verbose to send info to? I have
verbose turned on but cannot see any log messages.

I have upped the maintenance_work_mem setting from 32768 to 98304. This is on a
4 GB, 3.2 GHz Xeon, dual-core, dual-cpu machine with HTT turned on. I hope that
helps with vacuum times. PG 8.0.9 on UFS2, FreeBSD 6.2 prerelease.

--
Yudhvir Singh Sidhu
408 375 3134 cell


[PERFORM] Performance Woes

2007-05-09 Thread Ralph Mason
Hi,

 

I have several databases.  They are each about 35gb in size and have about
10.5K relations (count from pg_stat_all_tables) in them.  pg_class has about
26k rows and the data directory contains about 70k files.  These are busy
machines; they run about 50 transactions per second (approx. insert / update /
delete of about 500 rows per second).

 

We started getting errors about the number of open file descriptors

 

: 2007-05-09 03:07:50.083 GMT 1146975740: LOG:  53000: out of file
descriptors: Too many open files; release and retry

2007-05-09 03:07:50.083 GMT 1146975740: CONTEXT:  SQL statement insert …..


PL/pgSQL function trigfunc_whatever line 50 at execute statement

2007-05-09 03:07:50.083 GMT 1146975740: LOCATION:  BasicOpenFile, fd.c:471

2007-05-09 03:07:50.091 GMT 0: LOG:  0: duration: 12.362 ms

2007-05-09 03:07:50.091 GMT 0: LOCATION:  exec_simple_query, postgres.c:1090

 

So we decreased max_files_per_process to 800.  This took care of the
error *BUT* about quadrupled the IO wait that is happening on the machine.
It went from a peak of about 50% to peaks of over 200% (4 processor
machines, 4 gigs ram, raid).  The load on the machine remained constant.

 

I am really trying to get an understanding of exactly what this setting is,
what exactly is 'out of file descriptors', and how I can fix that.  I need to
bring that IO back down.

 

 

Thanks for any help.

Ralph

 

 

 


 


Re: [PERFORM] Performance Woes

2007-05-09 Thread CAJ CAJ

I have several databases.  They are each about 35gb in size and have about
10.5K relations (count from pg_stat_all_tables) in them.  Pg_class is
about 26k rows and the data directory contains about 70k files.  These are
busy machines, they run about 50 xactions per second, ( aproxx insert /
update / delete about 500 rows per second).

We started getting errors about the number of open file descriptors

: 2007-05-09 03:07:50.083 GMT 1146975740: LOG:  53000: out of file
descriptors: Too many open files; release and retry

2007-05-09 03:07:50.083 GMT 1146975740: CONTEXT:  SQL statement insert
….. 

PL/pgSQL function trigfunc_whatever line 50 at execute statement

2007-05-09 03:07:50.083 GMT 1146975740: LOCATION:  BasicOpenFile, fd.c:471

2007-05-09 03:07:50.091 GMT 0: LOG:  0: duration: 12.362 ms

2007-05-09 03:07:50.091 GMT 0: LOCATION:  exec_simple_query, postgres.c
:1090



So  we decreased the max_files_per_process to 800.  This took care of the
error **BUT** about quadrupled  the IO wait that is happening on the
machine. It went from a peek of about 50% to peeks of over 200% (4 processor
machines, 4 gigs ram, raid).  The load on the machine remained constant.




What version of Pg/OS? What is your hardware config?

I have seen these errors with earlier versions of Pg 7.4.x; they were fixed in
later releases according to the changelogs.


Re: [PERFORM] Performance Woes

2007-05-09 Thread Joshua D. Drake



2007-05-09 03:07:50.083 GMT 1146975740: LOCATION:  BasicOpenFile,
fd.c:471

2007-05-09 03:07:50.091 GMT 0: LOG:  0: duration: 12.362 ms

2007-05-09 03:07:50.091 GMT 0: LOCATION:  exec_simple_query,
postgres.c:1090

 


So  we decreased the max_files_per_process to 800.  This took care
of the error **BUT** about quadrupled  the IO wait that is happening
on the machine. It went from a peek of about 50% to peeks of over
200% (4 processor machines, 4 gigs ram, raid).  The load on the
machine remained constant.



Sounds to me like you just need to up the total amount of open files 
allowed by the operating system.


Joshua D. Drake

--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] Performance Woes

2007-05-09 Thread Joshua D. Drake

Hello,

You likely need to increase your file-max parameters using sysctl.conf.
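
On Linux that typically means raising fs.file-max (on FreeBSD the equivalent
is kern.maxfiles); the value below is only an example and should reflect your
actual workload:

  # /etc/sysctl.conf
  fs.file-max = 262144

  # apply on Linux without a reboot
  sysctl -p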

Sincerely,

Joshua D. Drake
--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [PERFORM] Performance Woes

2007-05-09 Thread Jeff Davis
On Wed, 2007-05-09 at 17:29 -0700, Joshua D. Drake wrote:
  2007-05-09 03:07:50.083 GMT 1146975740: LOCATION:  BasicOpenFile,
  fd.c:471
  
  2007-05-09 03:07:50.091 GMT 0: LOG:  0: duration: 12.362 ms
  
  2007-05-09 03:07:50.091 GMT 0: LOCATION:  exec_simple_query,
  postgres.c:1090
  
   
  
  So  we decreased the max_files_per_process to 800.  This took care
  of the error **BUT** about quadrupled  the IO wait that is happening
  on the machine. It went from a peek of about 50% to peeks of over
  200% (4 processor machines, 4 gigs ram, raid).  The load on the
  machine remained constant.
  
 
 Sounds to me like you just need to up the total amount of open files 
 allowed by the operating system.

It looks more like the opposite, here's the docs for
max_files_per_process:

Sets the maximum number of simultaneously open files allowed to each
server subprocess. The default is one thousand files. If the kernel is
enforcing a safe per-process limit, you don't need to worry about this
setting. But on some platforms (notably, most BSD systems), the kernel
will allow individual processes to open many more files than the system
can really support when a large number of processes all try to open that
many files. If you find yourself seeing Too many open files failures,
try reducing this setting. This parameter can only be set at server
start.

To me, that means that his machine is allowing the new FD to be created,
but then can't really support that many so it gives an error.

Ralph, how many connections do you have open at once? It seems like the
machine perhaps just can't handle that many FDs in all of those
processes at once.

That is a lot of tables. Maybe a different OS will handle it better?
Maybe there's some way that you can use fewer connections and then the
OS could still handle it?

Regards,
Jeff Davis




Re: [PERFORM] Performance Woes

2007-05-09 Thread Alvaro Herrera
Ralph Mason wrote:

 I have several databases.  They are each about 35gb in size and have about
 10.5K relations (count from pg_stat_all_tables) in them.  Pg_class is about
 26k rows and the data directory contains about 70k files.  These are busy
 machines, they run about 50 xactions per second, ( aproxx insert / update /
 delete about 500 rows per second).

Is the problematic one always the same trigger?  Is it just PL/pgSQL,
or do you have something else?  Something that may be trying to open
additional files, for example?  Something that may be trying to open
files behind your back?  PL/Perl with funky operators or temp files?

Also, what PG version is this?

 So  we decreased the max_files_per_process to 800.  This took care of the
 error *BUT* about quadrupled  the IO wait that is happening on the machine.
 It went from a peek of about 50% to peeks of over 200% (4 processor
 machines, 4 gigs ram, raid).  The load on the machine remained constant.

The max_files_per_process setting controls how many actual file
descriptors each process is allowed to have.  Postgres internally uses
virtual file descriptors (VFDs), each of which normally has one kernel file
descriptor open.  However, if your transactions need to access lots of files,
the VFDs will close their kernel FDs to allow other VFDs to open theirs.

So it sounds like your transaction has more than 800 files open.  The
extra IO wait could be caused by the additional system calls to open and
close those files as needed.  I would actually expect it to cause extra
system load (as opposed to user) rather than IO, but I'm not sure.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [PERFORM] Performance Woes

2007-05-09 Thread Tom Lane
Jeff Davis [EMAIL PROTECTED] writes:
 On Wed, 2007-05-09 at 17:29 -0700, Joshua D. Drake wrote:
 Sounds to me like you just need to up the total amount of open files 
 allowed by the operating system.

 It looks more like the opposite, here's the docs for
 max_files_per_process:

I think Josh has got the right advice.  The manual is just saying that
you can reduce max_files_per_process to avoid the failure, but it's not
making any promises about the performance penalty for doing that.
Apparently Ralph's app needs a working set of between 800 and 1000 open
files to have reasonable performance.

 That is a lot of tables. Maybe a different OS will handle it better?
 Maybe there's some way that you can use fewer connections and then the
 OS could still handle it?

Also, it might be worth rethinking the database structure to reduce the
number of tables.  But for a quick-fix, increasing the kernel limit
seems like the easiest answer.

regards, tom lane



[PERFORM] Background vacuum

2007-05-09 Thread Daniel Haensse
Dear list,

I'm running postgres on a tomcat server. Vacuum is run every hour
(cronjob), which leads to a performance drop in the tomcat applications.
I played around with the renice command and I think it is possible to reduce
this effect with a renice. The problem is: how can I figure out the PID
of the postmaster process performing the vacuum (automated)? Does anybody have
a nice solution to change the process priority? A shell script, maybe even for Java?

best regards

Dani





Re: [PERFORM] Background vacuum

2007-05-09 Thread Dan Harris

Daniel Haensse wrote:

Dear list,

I'm running postgres on a tomcat server. The vacuum is run every hour
(cronjob) which leads to a performance drop of the tomcat applications.
I played around with renice command and I think it is possible to reduce
this effect which a renice. The problem is how can I figure out the PID
of the postmaster performing the vacuum(automated)? Has anybody a nice
solution to change process priority? A shell script, maybe even for java?



While this may technically work, I think it misses a key point.  'nice' (at 
least the versions I'm familiar with) does not adjust I/O priority.  VACUUM is 
bogging things down because of the extra strain on I/O.  CPU usage shouldn't 
really be much of a factor.


Instead, I would recommend looking at vacuum_cost_delay and the related settings 
to make vacuum lower priority than the queries you care about.  This should be a 
cleaner solution for you.
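
A hedged starting point in postgresql.conf, in line with the value of 20
suggested elsewhere in this thread, might be:

  vacuum_cost_delay = 20    # ms to sleep each time the cost limit is reached (default 0, i.e. disabled)
  vacuum_cost_limit = 200   # the default; lower it to make vacuum yield more often

Both can also be SET per session before running a manual VACUUM.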


-Dan
