Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Manfred Koizar
On Mon, 15 Sep 2003 22:26:45 -0400, Christopher Browne
[EMAIL PROTECTED] wrote:
>  select count (*) from table;
> The only possible plan for THAT query will involve a seq scan of the
> whole table.  If the postmaster already has the data in cache, it
> makes sense for it to run in 1 second.  If it has to read it from
> disk, 12 seconds makes a lot of sense.

Yes.  And note that the main difference is between having the data in
memory and having to fetch it from disk.  I don't believe that this
difference can be explained by 9000 read calls hitting the operating
system's cache.

> You might want to increase the shared_buffers parameter in
> postgresql.conf; that should lead to increased stability of times as
> it should be more likely that the data in table will remain in
> cache.

Let's not jump to this conclusion before we know what's going on.

Joseph Bove [EMAIL PROTECTED] wrote in another message above:
| I did have shared_buffers and sort_mem both set higher originally (15000, 
| 32168)

As I read this I think he meant ... and had the same performance
problem.

Joseph, what do you get, if you run that
 EXPLAIN ANALYSE SELECT count(*) ...
several times?  What do vmstat and top show while the query is
running?  Are there other processes active during or between the runs?
What kind of processes?  Postgres backends?  Web server? ...
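
For instance, something along these lines (table and database names here
are hypothetical):

  # terminal 1: one line of I/O and memory statistics per second
  vmstat 1
  # terminal 2: watch which processes are busy
  top
  # terminal 3: run the query a few times in a row
  for i in 1 2 3; do
    psql -c "EXPLAIN ANALYSE SELECT count(*) FROM yourtable;" yourdb
  done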

Servus
 Manfred



Re: [PERFORM] Attempt at work around of int4 query won't touch int8 index ...

2003-09-16 Thread Paul Thomas
On 15/09/2003 08:42 Shridhar Daithankar wrote:
> Is it possible to follow data type upgrade model in planner?  Something
> like in C/C++ where data types are promoted upwards to find out better
> plan?  int2->int4->int8->float4->float8 types.
>
> That could be a clean solution..
>
> just a thought..

Interestingly, float8 indexes do work OK (float8col = 99). I spent a large
part of yesterday grepping through the sources to try and find out why
this should be so. No luck so far but I'm going to keep on trying!

--
Paul Thomas
+------------------------------+---------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for the Smaller Business |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk   |
+------------------------------+---------------------------------------------+



Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Matt Clark
...
> #effective_cache_size = 1000    # typically 8KB each

That's horribly wrong.  It's telling PG that your OS is only likely to cache
8MB of the DB in RAM.  If you've got 1GB of memory it should be between
64000 and 96000.
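
For reference, a back-of-the-envelope sketch of the arithmetic, assuming
the default 8 KB page size:

  # effective_cache_size is counted in 8 KB pages:
  #    1000 pages * 8 KB =   ~8 MB   (the commented-out default above)
  #   64000 pages * 8 KB = ~500 MB
  #   96000 pages * 8 KB = ~750 MB
  # e.g. on a 1 GB machine mostly dedicated to PostgreSQL:
  effective_cache_size = 64000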






Re: [PERFORM] restore time: sort_mem vs. checkpoing_segments

2003-09-16 Thread Vivek Khera
>>>>> "TL" == Tom Lane [EMAIL PROTECTED] writes:

TL I was just bugging Marc for some useful data, so I'll ask you too:
TL could you provide a trace of the pg_restore execution?  log_statement
TL plus log_duration output would do it.  I am curious to understand
TL exactly which steps in the restore are significant time sinks.

Sure... machine isn't gonna do much of anything until 7.4 is released
(or I hear a promise of no more dump/reload).

 I notice during the restore that the disk throughput triples during
 the checkpoint.

TL Hm, better make sure the log includes some indication of when
TL checkpoints happen.

That it does.

I'll post the results in the next couple of days, as each run takes
about 4 hours ;-)
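
For reference, a sketch of the postgresql.conf settings that produce the
trace Tom asked for in the 7.3/7.4 series (the parameter forms changed in
later releases):

  log_statement = true     # log every statement as it is executed
  log_duration  = true     # log the elapsed time of each statement
  log_timestamp = true     # timestamps make it easier to line up checkpoints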



Re: [PERFORM] Attempt at work around of int4 query won't touch int8 index ...

2003-09-16 Thread Tom Lane
Paul Thomas [EMAIL PROTECTED] writes:
> On 15/09/2003 08:42 Shridhar Daithankar wrote:
>> Is it possible to follow data type upgrade model in planner?

We have one, more or less.  It's not explicitly coded, it emerges from
the fact that certain casts are implicit and others are not.  For
instance, int4->float8 is implicit but float8->int4 is not.

> Interestingly, float8 indexes do work OK (float8col = 99). I spent a large
> part of yesterday grepping through the sources to try and find out why
> this should be so. No luck so far but I'm going to keep on trying!

The reason that case works is that there is no float8 = int4 operator.
The parser can find no other interpretation than promoting the int4 to
float8 and using float8 = float8.  (The dual possibility, coerce float8
to int4 and use int4 = int4, is not considered because that coercion
direction is not implicit.)  So you end up with an operator that matches
the float8 index, and all is well.

The int8 case fails because there is a cross-type operator int8 = int4,
and the parser prefers that since it's an exact match to the initial
data types.  But it doesn't match the int8 index.
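
(This is also why the usual workaround is to force the constant to the
column's type; a sketch against a hypothetical table with an int8 primary
key:)

  -- explicit cast: the comparison becomes int8 = int8 and matches the index
  SELECT * FROM orders WHERE order_id = 99::int8;
  -- a quoted literal works too: the untyped constant resolves directly to int8
  SELECT * FROM orders WHERE order_id = '99';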

We've floated various proposals for solving this, such as getting rid of
cross-type operators, but none so far have passed the test of not having
bad side-effects.  See the pg_hackers archives for details (and *please*
don't waste this list's bandwidth speculating about solutions until
you've absorbed some of the history.  This topic has come up before ;-).)

regards, tom lane



Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Joseph Bove
Dear list,

First and foremost, thanks to the great number of people who have responded 
with various tips and suggestions. I am now starting to fully appreciate 
the various cache settings and what they can do for performance.

I just want to restate the problem based on what I now know.

In my example, I am purposely forcing a full table scan: select count (*)
from table. The table has only 90,000 rows, and each row consists of about
300 bytes of data.

When the table was not cached, I was seeing response times of 5 to 18
seconds to read it. When it was cached, the response time dropped to under
a second.

Obviously, I can tune the caching to make sure that as much data as can
reasonably be cached is cached. However, I don't think that a hit of even
5 seconds to read a table of 90,000 rows is acceptable.

One thing that has been tried with some success is dumping and recreating
the table. After this exercise, selecting all rows from the table when it
is not in cache takes about 3 seconds. (Of course, when it is in cache, the
same sub-second response time is seen.)

I still think that 3 seconds is not acceptable. However, I reserve the 
right to be wrong. Does it sound unrealistic to expect PostgreSQL to be 
able to read 90,000 rows with 300 bytes per row in under a second?
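
(For what it's worth, 90,000 rows at roughly 300 bytes each is only about
27 MB before any dead-tuple bloat. The actual on-disk size can be checked
with something like the following; the table name is hypothetical:)

  -- run VACUUM ANALYZE first so the figures are current
  SELECT relpages,                         -- 8 KB pages on disk
         reltuples,                        -- estimated row count
         relpages * 8 / 1024 AS approx_mb  -- rough size in megabytes
  FROM pg_class
  WHERE relname = 'yourtable';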

Based on suggestions from the list, I am also thinking of making the 
following tuning changes:

shared_buffers = 15000
sort_mem = 32168
effective_cache_size = 64000
This is based on one gig of memory.
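
Roughly what those values translate to, assuming the default 8 KB page
size and sort_mem counted in kilobytes:

  shared_buffers       = 15000   # 15000 * 8 KB = ~117 MB of shared cache
  sort_mem             = 32168   # KB per sort, so ~31 MB per sort in progress
  effective_cache_size = 64000   # 64000 * 8 KB = ~500 MB assumed in OS cache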

Does anyone have any feedback on these values? Also, realizing that no two
databases are the same, etc., etc... does anyone have a good formula for
setting these values?

Thanks in advance,

Joseph





Re: [PERFORM] Inconsistent performance

2003-09-16 Thread Jeff
On Tue, 16 Sep 2003, Joseph Bove wrote:

> I still think that 3 seconds is not acceptable. However, I reserve the
> right to be wrong. Does it sound unrealistic to expect PostgreSQL to be
> able to read 90,000 rows with 300 bytes per row in under a second?

first, check what your max disk throughput is using a benchmark such as
Bonnie (making sure to use a size LARGER than physical memory; 2x physical
is a good choice).

next, run your query again with a vmstat 1 running in another terminal.

See how closely the vmstat bi numbers correspond to your max according to
bonnie.  You could have an IO bottleneck.  (I once went running around
trying to track down a slowdown and then discovered the issue was IO.)
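
For illustration, a rough sketch of that procedure on a box with 1 GB of
RAM (bonnie++ shown here; classic Bonnie takes a similar -s size-in-MB
switch):

  # benchmark sequential disk throughput with a 2 GB test file (2x physical RAM)
  bonnie++ -d /tmp -s 2048

  # then, while re-running the count(*) query, watch the bi (blocks in) column
  vmstat 1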



--
Jeff Trout [EMAIL PROTECTED]
http://www.jefftrout.com/
http://www.stuarthamm.net/





Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-16 Thread Tom Lane
Mary Edie Meredith [EMAIL PROTECTED] writes:
> Is there any option to remove the cost numbers from the plan so we can
> just use diff to automate the plan comparisons?

No, but a few moments with sed or perl should get the job done for you.
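
For instance, something along these lines should do it (a sketch only, not
tested against every plan shape; the database name is hypothetical):

  # strip the (cost=...) annotations so plans can be compared with diff
  psql -c "EXPLAIN SELECT ..." mydb | sed 's/(cost=[^)]*)//g'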

regards, tom lane
