Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Grzegorz Jaśkiewicz
The card is configured in 1+0 with a 128k stripe, afaik (I'm a
developer, we don't have hardware guys here).
Are you sure about the lack of cache by default on the card? I
thought the difference is that the M5014 has 256MB and the M5015 has
512MB of RAM already on it.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Mark Kirkwood

On 26/05/11 20:11, Grzegorz Jaśkiewicz wrote:

The card is configured in 1+0 with a 128k stripe, afaik (I'm a
developer, we don't have hardware guys here).
Are you sure about the lack of cache by default on the card? I
thought the difference is that the M5014 has 256MB and the M5015 has
512MB of RAM already on it.


No, I'm not sure about what the default is for the M5014 - I'd recommend 
checking this with your supplier (or looking at the invoice if you can 
get it). My *feeling* is that you may have 256M cache but no battery kit 
- as this is an optional part - so the card will not go into 
writeback mode if that is the case.


FWIW - we got our best (pgbench) results with a 256K stripe, no (card) 
readahead, and hyperthreading off on the host.


You can interrogate the config of the card and the raid 10 array using 
the megaraid cli package - you need to read the (frankly terrible) 
manual to discover which switches to use to determine battery and cache 
status etc. If you email me privately I'll get you a link to the 
relevant docs!


Cheers

Mark



Re: [PERFORM] Speeding up loops in pl/pgsql function

2011-05-26 Thread Merlin Moncure
On Wed, May 25, 2011 at 9:20 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
 scrawf...@pinpointresearch.com wrote:
 On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:

 Hi, Alex.  You wrote:

 Have you tried something like:
 SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
 'g')::bytea, 'escape');

 Hmm, forgot about regexp_replace.  It might do the trick, but without a
 full-blown eval that I can run on the replacement side, it'll be a bit more
 challenging.  But that's a good direction to consider, for sure.

 The function given didn't work exactly as written for me but it is on the
 right track. See if this works for you (input validation is left as an
 exercise for the reader...:)):

 create or replace function octal_string_to_text(someoctal text) returns text
 as $$
 declare
    binstring text;
 begin
    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
 || '''' into binstring;
 return binstring;
 end
 $$ language plpgsql;

 four points (minor suggestions btw):
 1. if you are dealing with strings that have backslashes in them,
 don't escape, but dollar quote.  Also try not to use dollar parameter
 notation if you can help it:
 ($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal, $q$(\d{3})$q$,
 $q$\\\1$q$, 'g')

 this is particularly true when feeding strings to regex: that way you
 can use the same string in pg as in various validators.

 2. there is no need for execute here.
 execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
 becomes:
 binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
 'g')  /* I *think* I got this right */

 3. if your function does not scribble on tables, has no side effects,
 and is not influenced by any, mark it as IMMUTABLE. always.
 $$ language plpgsql IMMUTABLE;

 4. since all we are doing is generating a variable, prefer sql
 function vs plpgsql. this is particularly true in pre 8.4 postgres
 (IIRC) where you can call the function much more flexibly (select
 func(); vs select * from func();) if that's the case.  Putting it all
 together,

 create or replace function octal_string_to_text(someoctal text)
 returns text as $$
   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
 $$ language sql immutable;

 Note I didn't actually check to see what your regex is doing (I'm
 assuming it's correct)...

hm, I slept on this and had the vague unsettling feeling I had said
something stupid -- and I did.  Double +1 to you for being cleverer
than me -- you are using 'execute' to eval the string back into the
string.  Only plpgsql can do that, so point 4 is also moot.  Still,
the above points hold in principle, so if a way could be figured out
to do this without execute, that would be nice.

merlin



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Hi there,


Kevin Grittner wrote:
 
 Is there a way to determine the values actually used?
 The pg_settings view.  Try the query shown here:
 http://wiki.postgresql.org/wiki/Server_Configuration
 
Thanks Kevin, very useful. Here is the output:

version;PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit
bytea_output;escape
client_encoding;UNICODE
effective_cache_size;4GB
lc_collate;German_Germany.1252
lc_ctype;German_Germany.1252
listen_addresses;*
log_destination;stderr
log_line_prefix;%t 
logging_collector;on
max_connections;100
max_stack_depth;2MB
port;5432
server_encoding;UTF8
shared_buffers;1GB
temp_buffers;4096
TimeZone;CET
work_mem;1GB


Craig Ringer wrote:
 
 On 05/26/2011 12:42 AM, panam wrote:
 It's a bit beyond me, but I suspect that it'd be best if you could hang 
 onto the dump file in case someone has the time and enthusiasm to 
 investigate it. I take it you can't distribute the dump file, even 
 privately?
 
Fortunately, I managed to reduce it to the absolute minimum (i.e. only
meaningless ids), and the issue is still observable.
You can download it from here:
http://www.zumodrive.com/file/460997770?key=cIdeODVlNz

Some things to try:
* tune your psql settings if you want
* reindex, vacuum analyze if you want

Pathological query:

select
b.id,
(SELECT
m1.id 
FROM
message m1 
LEFT JOIN
message m2 
ON (
m1.box_id = m2.box_id 
AND m1.id < m2.id 
) 
WHERE
m2.id IS NULL 
AND m1.box_id = b.id)
from
box b

=> takes almost forever (~600 seconds on my system)

Try

delete from message where id > 255;

=> deletes 78404 rows
Do the pathological query again
=> speed is back (~4 seconds on my system)

Replay the dump
Try

delete from message where id > 100;

=> deletes 835844 rows (about 10 times more than before). Maybe you can delete many
more, I haven't tested this systematically.
Do the pathological query again
=> takes almost forever (didn't wait...)

Replay the dump
Cluster:

cluster message_pkey on message;

Do the pathological query again
=> speed is back (~3 seconds on my system)
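For anyone skimming, the pathological subquery is just a greatest-id-per-box computation. A toy sketch of what it computes (Python, with invented data, not taken from the dump): the NULL-extended self-join picks the message with no larger id in the same box, which is exactly the per-box max.

```python
# messages: (id, box_id); box 30 has no messages at all
messages = [(1, 10), (5, 10), (3, 20)]
boxes = [10, 20, 30]

def last_message_antijoin(box_id):
    # m1 survives iff no m2 exists in the same box with a larger id --
    # the LEFT JOIN ... WHERE m2.id IS NULL pattern from the post.
    cands = [m1 for m1 in messages
             if m1[1] == box_id
             and not any(m2[1] == box_id and m2[0] > m1[0] for m2 in messages)]
    return cands[0][0] if cands else None

def last_message_max(box_id):
    # Equivalent aggregate formulation: max(id) per box.
    ids = [m[0] for m in messages if m[1] == box_id]
    return max(ids) if ids else None

for b in boxes:
    assert last_message_antijoin(b) == last_message_max(b)
print([last_message_max(b) for b in boxes])  # -> [5, 3, None]
```

Both formulations agree on every box, including the empty one, which is why the aggregate rewrites suggested later in the thread return the same rows.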

Any third party confirmation?

Thanks
panam

--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4428435.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.



Re: [PERFORM] Speeding up loops in pl/pgsql function

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 8:11 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, May 25, 2011 at 9:20 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, May 25, 2011 at 8:03 PM, Steve Crawford
 scrawf...@pinpointresearch.com wrote:
 On 05/25/2011 11:45 AM, Reuven M. Lerner wrote:

 Hi, Alex.  You wrote:

 Have you tried something like:
 SELECT  encode(regexp_replace('141142143', '(\d{3})', '\\\1',
 'g')::bytea, 'escape');

 Hmm, forgot about regexp_replace.  It might do the trick, but without a
 full-blown eval that I can run on the replacement side, it'll be a bit more
 challenging.  But that's a good direction to consider, for sure.

 The function given didn't work exactly as written for me but it is on the
 right track. See if this works for you (input validation is left as an
 exercise for the reader...:)):

 create or replace function octal_string_to_text(someoctal text) returns text
 as $$
 declare
    binstring text;
 begin
    execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
 || '''' into binstring;
 return binstring;
 end
 $$ language plpgsql;

 four points (minor suggestions btw):
 1. if you are dealing with strings that have backslashes in them,
 don't escape, but dollar quote.  Also try not to use dollar parameter
 notation if you can help it:
 ($1, E'(\\d{3})', E'\\\\\\1', 'g') -> (someoctal, $q$(\d{3})$q$,
 $q$\\\1$q$, 'g')

 this is particularly true when feeding strings to regex: that way you
 can use the same string in pg as in various validators.

 2. there is no need for execute here.
 execute 'select E''' || regexp_replace($1, E'(\\d{3})', E'\\\\\\1', 'g')
 becomes:
 binstring := 'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$,
 'g')  /* I *think* I got this right */

 3. if your function does not scribble on tables, has no side effects,
 and is not influenced by any, mark it as IMMUTABLE. always.
 $$ language plpgsql IMMUTABLE;

 4. since all we are doing is generating a variable, prefer sql
 function vs plpgsql. this is particularly true in pre 8.4 postgres
 (IIRC) where you can call the function much more flexibly (select
 func(); vs select * from func();) if that's the case.  Putting it all
 together,

 create or replace function octal_string_to_text(someoctal text)
 returns text as $$
   SELECT  'E''' || regexp_replace($1, $q$(\d{3})$q$, $q$\\\1$q$, 'g');
 $$ language sql immutable;

 Note I didn't actually check to see what your regex is doing (I'm
 assuming it's correct)...

 hm, I slept on this and had the vague unsettling feeling I had said
 something stupid -- and I did.  Double +1 to you for being cleverer
 than me -- you are using 'execute' to eval the string back into the
 string.  Only plpgsql can do that, so point 4 is also moot.  Still,
 the above points hold in principle, so if a way could be figured out
 to do this without execute, that would be nice.

got it:
select decode(regexp_replace('141142143', '([0-9][0-9][0-9])',
$q$\\\1$q$ , 'g'), 'escape');
 decode
--------
 abc
(1 row)

merlin
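For reference, the transformation Merlin's decode/regexp_replace one-liner performs can be sanity-checked outside the database. A purely illustrative Python sketch (not from the thread) of the same octal-triplet decoding:

```python
import re

def octal_escape_decode(s: str) -> str:
    # Interpret each run of exactly three digits as one octal byte value,
    # mirroring decode(regexp_replace(..., 'escape')) from the post.
    return re.sub(r'[0-9]{3}', lambda m: chr(int(m.group(0), 8)), s)

print(octal_escape_decode('141142143'))  # -> abc
```

Here `141`, `142`, `143` are the octal codes for `a`, `b`, `c`, matching the `(1 row)` result above.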



Re: [PERFORM] Speeding up loops in pl/pgsql function

2011-05-26 Thread Reuven M. Lerner

Wow.

Color me impressed and grateful.  I've been working on a different 
project today, but I'll test these tonight.


I'll never underestimate the regexp functionality in PostgreSQL again!

Reuven



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam pa...@gmx.net:
 Hi there,


 Kevin Grittner wrote:

 Is there a way to determine the values actually used?
 The pg_settings view.  Try the query shown here:
 http://wiki.postgresql.org/wiki/Server_Configuration

 Thanks Kevin, very useful. Here is the output:

 version;PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit
 bytea_output;escape
 client_encoding;UNICODE
 effective_cache_size;4GB
 lc_collate;German_Germany.1252
 lc_ctype;German_Germany.1252
 listen_addresses;*
 log_destination;stderr
 log_line_prefix;%t 
 logging_collector;on
 max_connections;100
 max_stack_depth;2MB
 port;5432
 server_encoding;UTF8
 shared_buffers;1GB
 temp_buffers;4096
 TimeZone;CET
 work_mem;1GB


 Craig Ringer wrote:

 On 05/26/2011 12:42 AM, panam wrote:
 It's a bit beyond me, but I suspect that it'd be best if you could hang
 onto the dump file in case someone has the time and enthusiasm to
 investigate it. I take it you can't distribute the dump file, even
 privately?

 Fortunately, I managed to reduce it to the absolute minimum (i.e. only
 meaningless ids), and the issue is still observable.
 You can download it from here:
 http://www.zumodrive.com/file/460997770?key=cIdeODVlNz

 Some things to try:
 * tune your psql settings if you want
 * reindex, vacuum analyze if you want

 Pathological query:

 select
        b.id,
        (SELECT
                m1.id
        FROM
                message m1
        LEFT JOIN
                message m2
                        ON (
                                m1.box_id = m2.box_id
                                AND m1.id < m2.id
                        )
        WHERE
                m2.id IS NULL
                AND m1.box_id = b.id)
 from
        box b

 => takes almost forever (~600 seconds on my system)

 Try

 delete from message where id > 255;

 => deletes 78404 rows
 Do the pathological query again
 => speed is back (~4 seconds on my system)

 Replay the dump
 Try

 delete from message where id > 100;

 => deletes 835844 rows (about 10 times more than before). Maybe you can delete many
 more, I haven't tested this systematically.
 Do the pathological query again
 => takes almost forever (didn't wait...)

 Replay the dump
 Cluster:

 cluster message_pkey on message;

 Do the pathological query again
 => speed is back (~3 seconds on my system)

 Any third party confirmation?

without explaining further why the antijoin has bad performance
without cluster, I wonder why you don't use this query:

SELECT  b.id,
  max(m.id)
FROM box b, message m
WHERE m.box_id = b.id
GROUP BY b.id;

looks similar and is faster.


 Thanks
 panam





-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



[PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
Hello performers, I've long been unhappy with the standard advice
given for setting shared buffers.  This includes the stupendously
vague comments in the standard documentation, which suggest certain
settings in order to get 'good performance'.  Performance of what?
Connection negotiation speed?  Not that it's wrong necessarily, but
ISTM too much based on speculative or anecdotal information.  I'd like
to see the lore around this setting clarified, especially so we can
refine advice to: 'if you are seeing symptoms x,y,z set shared_buffers
from a to b to get symptom reduction of k'.  I've never seen a
database blow up from setting them too low, but over the years I've
helped several people with bad i/o situations or outright OOM
conditions from setting them too high.

My general understanding of shared_buffers is that they are a little
bit faster than filesystem buffering (everything these days is
ultimately based on mmap AIUI, so there's no reason to suspect
anything else).  Where they are most helpful is for masking of i/o if
a page gets dirtied more than once before it's written out to the heap, but
seeing any benefit from that at all is going to be very workload
dependent.  There are also downsides to using them instead of the heap
as well, and the amount of buffers you have influences checkpoint
behavior.  So things are complex.

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains > 20%.  Double bonus points for
cases that show gains > 50%.  No points given for anecdotal or
unverifiable data. Not only will this help raise the body of knowledge
regarding the setting, but it will help produce benchmarking metrics
against which we can measure multiple interesting buffer related
patches in the pipeline.  Anybody up for it?

merlin



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
Cédric Villemain <cedric.villemain.deb...@gmail.com> wrote:
 2011/5/26 panam pa...@gmx.net:
 
 max_connections;100
 
 work_mem;1GB
 
Each connection can allocate work_mem, potentially several times. 
On a machine without hundreds of GB of RAM, that pair of settings
could cause severe swapping.
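To put rough numbers on that warning (illustrative arithmetic only; the nodes-per-query figure is an assumption, not something from the posted config):

```python
max_connections = 100          # from the posted config
work_mem_mb = 1024             # work_mem = 1GB, from the posted config
sort_hash_nodes_per_query = 2  # assumed: two work_mem consumers per plan

# Each backend may allocate work_mem once per sort/hash node, so the
# theoretical worst case scales multiplicatively with connections and nodes.
worst_case_mb = max_connections * work_mem_mb * sort_hash_nodes_per_query
print(worst_case_mb // 1024, "GB")  # -> 200 GB
```

Even a fraction of that worst case would push most machines into swap, which is why work_mem this large is only safe with very few concurrent sorts.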
 
 Patholgical query:

 select
b.id,
(SELECT
m1.id
FROM
message m1
LEFT JOIN
message m2
ON (
m1.box_id = m2.box_id
                                AND m1.id < m2.id
)
WHERE
m2.id IS NULL
AND m1.box_id = b.id)
 from
box b
 
 without explaining further why the antijoin has bad performance
 without cluster, I wonder why you don't use this query :
 
 SELECT  b.id,
   max(m.id)
 FROM box b, message m
 WHERE m.box_id = b.id
 GROUP BY b.id;
 
 looks similar and fastest.
 
I think you would need a left join to actually get identical
results:
 
SELECT  b.id, max(m.id)
  FROM box b
  LEFT JOIN message m ON m.box_id = b.id
  GROUP BY b.id;
 
But yeah, I would expect this approach to be much faster.  Rather
easier to understand and harder to get wrong, too.
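The difference Kevin points out can be sketched with toy data (Python, illustrative only): an inner join drops boxes with no messages, while the LEFT JOIN keeps them with a NULL max, matching the original subquery's behavior.

```python
boxes = [10, 20, 30]                    # box 30 has no messages
messages = [(1, 10), (5, 10), (3, 20)]  # (id, box_id)

# Inner-join GROUP BY: only boxes with at least one message survive.
inner = {b: max(m[0] for m in messages if m[1] == b)
         for b in boxes if any(m[1] == b for m in messages)}

# LEFT JOIN GROUP BY: every box appears; max over no rows is NULL (None).
left = {b: max((m[0] for m in messages if m[1] == b), default=None)
        for b in boxes}

print(inner)  # -> {10: 5, 20: 3}
print(left)   # -> {10: 5, 20: 3, 30: None}
```

The LEFT JOIN version is the one that returns a row for every box, like the original correlated subquery does.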
 
-Kevin



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 
 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains > 20%.  Double bonus
 points for cases that show gains > 50%.
 
Are you talking throughput, maximum latency, or some other metric?
 
In our shop the metric we tuned for in reducing shared_buffers was
getting the number of fast queries (which normally run in under a
millisecond) which would occasionally, in clusters, take over 20
seconds (and thus be canceled by our web app and present as errors
to the public) down to zero.  While I know there are those who care
primarily about throughput numbers, that's worthless to me without
maximum latency information under prolonged load.  I'm not talking
90th percentile latency numbers, either -- if 10% of our web
requests were timing out the villagers would be coming after us with
pitchforks and torches.
 
-Kevin



Re: [PERFORM] Speeding up loops in pl/pgsql function

2011-05-26 Thread Steve Crawford

On 05/26/2011 05:36 AM, Merlin Moncure wrote:

...
got it:
select decode(regexp_replace('141142143', '([0-9][0-9][0-9])',
$q$\\\1$q$ , 'g'), 'escape');
  decode
--------
  abc
(1 row)

merlin


Nice. A word of warning, in 9.0 this returns a hex string:

select decode(regexp_replace('141142143', '([0-9][0-9][0-9])', 
$q$\\\1$q$ , 'g'), 'escape');

  decode
----------
 \x616263

See http://www.postgresql.org/docs/9.0/static/release-9-0.html:

E.5.2.3. Data Types
bytea output now appears in hex format by default (Peter Eisentraut)
The server parameter bytea_output can be used to select the 
traditional output format if needed for compatibility.


Another wrinkle: the function I wrote sort of ignored the bytea issue by 
using text. But text is subject to character encoding (for both good and 
bad) while bytea is not, so the ultimate solution will depend on whether 
the input string is the octal representation of an un-encoded sequence 
of bytes or represents a string of ASCII/UTF-8/whatever-encoded text.
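The 9.0 hex output is easy to sanity-check outside the database: the digits after `\x` are just the byte values of the decoded string. A small illustrative Python sketch (not from the thread):

```python
data = 'abc'.encode('ascii')  # the bytes the decode() example produced

# bytea_output = 'hex' prints \x followed by two hex digits per byte:
# 0x61='a', 0x62='b', 0x63='c'.
print('\\x' + data.hex())  # -> \x616263
```

So `\x616263` and the escape-format `abc` are the same bytea value, rendered differently.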


Cheers,
Steve




Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 10:10 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:

 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains > 20%.  Double bonus
 points for cases that show gains > 50%.

 Are you talking throughput, maximum latency, or some other metric?

I am talking about *any* metric. If you've got something, let's see it.
But it's got to be verifiable, or no points are scored.

See my note above about symptoms -- if your symptom of note happens to
be unpredictable spikes in fast query times under load, then I'd like
to scribble that advice directly into the docs along with (hopefully)
some reasoning of exactly why more database managed buffers are
helping.   As noted, I'm particularly interested in things we can test
outside of production environments, since I'm pretty skeptical the
Wisconsin Court System is going to allow the internet to log in and
repeat and verify test methodologies.  Point being: cranking buffers
may have been the bee's knees with, say, the 8.2 buffer manager, but
present and future improvements may have rendered that change moot or
even counterproductive.  I doubt it's really changed much, but we
really need to do better on this -- all else being equal, the lowest
shared_buffers setting possible without sacrificing performance is
best because it releases more memory to the o/s to be used for other
things -- so 'everything's bigger in Texas' type approaches to
postgresql.conf manipulation (not that I see that here of course) are
not necessarily better :-).

merlin



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Claudio Freire
On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Point being: cranking buffers
 may have been the bee's knees with, say, the 8.2 buffer manager, but
 present and future improvements may have render that change moot or
 even counter productive.

I suggest you read the docs on how shared buffers work because,
reasonably, it would be the other way around.

Recent improvements in how postgres manages its shared buffer pool
make them better than the OS cache, so there should be more incentive
to increase them, rather than decrease them.

Workload conditions may make those improvements worthless, hinting
that you should decrease them.

But you have to know your workload and you have to know how the shared
buffers work.



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 10:45 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure mmonc...@gmail.com wrote:
 Point being: cranking buffers
 may have been the bee's knees with, say, the 8.2 buffer manager, but
 present and future improvements may have rendered that change moot or
 even counterproductive.

 I suggest you read the docs on how shared buffers work because,
 reasonably, it would be the other way around.

 Recent improvements in how postgres manages its shared buffer pool
 make them better than the OS cache, so there should be more incentive
 to increase them, rather than decrease them.

 Workload conditions may make those improvements worthless, hinting
 that you should decrease them.

 But you have to know your workload and you have to know how the shared
 buffers work.

I am not denying that any of those things are the case, although your
assumption that I haven't read the documentation was obviously not
grounded upon research.  What you and I know/don't know is not the
point.  The point is what we can prove, because going through the
motions of doing that is useful.  You are also totally missing my
other thrust, which is that future changes to how things work could
change the dynamics of .conf configuration -- btw not for the first
time in the history of the project.

merlin



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Claudio Freire
On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The point is what we can prove, because going through the
 motions of doing that is useful.

Exactly, and whatever you can prove will be workload-dependent.
So you can't prove anything generally, since no single setting is
best for all.

 You are also totally missing my
 other thrust, which is that future changes to how things work could
 change the dynamics of .conf configuration

Nope, I'm not missing it, simply not commenting on it.



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Merlin Moncure
On Thu, May 26, 2011 at 11:37 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The point is what we can prove, because going through the
 motions of doing that is useful.

 Exactly, and whatever you can prove will be workload-dependent.
 So you can't prove anything generally, since no single setting is
 best for all.

Then we should stop telling people to adjust it unless we can match
the workload to the improvement.  There are some people here who can
do that as if by magic, but that's not the issue.  I'm trying to
understand the why it works better for some than for others.  What's
frustrating is simply believing something is the case, without trying
to understand why.  How about, instead of arguing with me, coming up
with something for the challenge?

merlin



Re: [PERFORM] LIMIT and UNION ALL

2011-05-26 Thread Dave Johansen
On Wed, May 18, 2011 at 8:54 AM, Robert Klemme
shortcut...@googlemail.com wrote:

 On Wed, May 18, 2011 at 5:26 PM, Dave Johansen davejohan...@gmail.com
 wrote:
  I am using Postgres 8.3.3 and I have a VIEW which is a UNION ALL of two
  tables but when I do a select on the view using a LIMIT, it scans the
 entire
  tables and takes significantly longer than writing out the query with the
  LIMITs in the sub-queries themselves. Is there a solution to get the view
 to
  perform like the query with the LIMIT explicitly placed in the
 sub-queries?

 Can you show DDL and queries?

 The query with the LIMIT on the subqueries and the one with the LIMIT
 on the overall query are not semantically equivalent.  Since you can
 have an ORDER BY before the LIMIT on the query with the limit on the
 view the database must have all the rows before it can apply the
 ordering and properly determine the limit.  Although it might be
 possible to determine under particular circumstances that only one of
 the tables needs to be queried or tables need only be queried
 partially I deem that quite complex. I do not know whether Postgres
 can do such optimizations but for that we would certainly need to see
 the concrete example (including constraint and indexes).

 Kind regards

 robert

 --
 remember.guy do |as, often| as.you_can - without end
 http://blog.rubybestpractices.com/


Yes, there is an order by an index involved. Here's a simplified version of
the schema and queries that demonstrates the same behaviour.

                            Table "public.message1"
 Column |           Type           |                       Modifiers
--------+--------------------------+--------------------------------------------------------
 rid    | integer                  | not null default nextval('message1_rid_seq'::regclass)
 data   | integer                  |
 tlocal | timestamp with time zone |
Indexes:
    "message1_pkey" PRIMARY KEY, btree (rid)
Referenced by:
    TABLE "parsed1" CONSTRAINT "parsed1_msgid_fkey" FOREIGN KEY (msgid)
REFERENCES message1(rid)

                            Table "public.parsed1"
 Column |           Type           |                      Modifiers
--------+--------------------------+-------------------------------------------------------
 rid    | integer                  | not null default nextval('parsed1_rid_seq'::regclass)
 msgid  | integer                  |
 data   | integer                  |
 tlocal | timestamp with time zone |
Indexes:
    "parsed1_pkey" PRIMARY KEY, btree (rid)
Foreign-key constraints:
    "parsed1_msgid_fkey" FOREIGN KEY (msgid) REFERENCES message1(rid) ON
DELETE CASCADE

For this example, message2 has the same structure/definition as message1
and parsed2 has the same structure/definition as parsed1.

        View "public.parsed_all"
 Column |           Type           | Modifiers
--------+--------------------------+-----------
 rid| integer  |
 msgid  | integer  |
 data   | integer  |
 tlocal | timestamp with time zone |
View definition:
 SELECT parsed1.rid, parsed1.msgid, parsed1.data, parsed1.tlocal
   FROM parsed1
UNION ALL
 SELECT parsed2.rid, parsed2.msgid, parsed2.data, parsed2.tlocal
   FROM parsed2;




Slow version using the view:

EXPLAIN ANALYZE SELECT * FROM parsed_all ORDER BY tlocal DESC LIMIT 10;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=74985.28..74985.31 rows=10 width=20) (actual
time=6224.229..6224.244 rows=10 loops=1)
   ->  Sort  (cost=74985.28..79985.28 rows=200 width=20) (actual
time=6224.226..6224.230 rows=10 loops=1)
 Sort Key: parsed1.tlocal
 Sort Method:  top-N heapsort  Memory: 17kB
 ->  Result  (cost=0.00..31766.00 rows=200 width=20) (actual
time=0.026..4933.210 rows=200 loops=1)
   ->  Append  (cost=0.00..31766.00 rows=200 width=20)
(actual time=0.024..2880.868 rows=200 loops=1)
 ->  Seq Scan on parsed1  (cost=0.00..15883.00
rows=100 width=20) (actual time=0.023..551.870 rows=100 loops=1)
 ->  Seq Scan on parsed2  (cost=0.00..15883.00
rows=100 width=20) (actual time=0.027..549.465 rows=100 loops=1)
 Total runtime: 6224.337 ms
(9 rows)

Fast version using a direct query with limits in the sub-queries:

EXPLAIN ANALYZE SELECT * FROM (SELECT * FROM (SELECT * FROM parsed1 ORDER BY
tlocal DESC LIMIT 10) AS a UNION ALL SELECT * FROM (SELECT * FROM parsed2
ORDER BY tlocal DESC LIMIT 10) AS b) AS c ORDER BY tlocal DESC LIMIT 10;

                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.33..1.35 rows=10 width=20) (actual time=0.131..0.145 rows=10
loops=1)
   ->  Sort  (cost=1.33..1.38 rows=20 
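The reason the rewritten query is safe can be sketched with plain lists (Python, toy data, purely illustrative): the overall top N of a UNION ALL is always contained in the union of each branch's top N, so limiting the branches first gives the same answer as sorting the whole union.

```python
parsed1 = [i * 2 for i in range(1000)]      # pretend tlocal values, branch 1
parsed2 = [i * 2 + 1 for i in range(1000)]  # branch 2
limit = 10

# "Slow" plan: materialize the whole union, sort it, then limit.
slow = sorted(parsed1 + parsed2, reverse=True)[:limit]

# "Fast" plan: limit each branch first (cheap with an index on tlocal),
# then merge the two small results and limit again.
top1 = sorted(parsed1, reverse=True)[:limit]
top2 = sorted(parsed2, reverse=True)[:limit]
fast = sorted(top1 + top2, reverse=True)[:limit]

assert fast == slow
print(fast[:3])  # -> [1999, 1998, 1997]
```

This is essentially the merge-append optimization that, as Tom notes below in the thread, the planner learned to do on its own in 9.1.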

Re: [PERFORM] LIMIT and UNION ALL

2011-05-26 Thread Tom Lane
Dave Johansen davejohan...@gmail.com writes:
 ...  So is there a way to make the
 planner perform the same sort of operation and push those same constraints
 into the sub-queries on its own?

No.  As was mentioned upthread, there is a solution for this in 9.1,
although it doesn't work in exactly the way you suggest.

regards, tom lane



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
panam pa...@gmx.net wrote:
 
 I cannot use it because of the way that query is generated
 (by hibernate).
 
 The (simplified) base query is just
 
 SELECT b.id from box
 
 the subquery
 
 (SELECT  m1.id FROM message m1 
LEFT JOIN message m2 
    ON (m1.box_id = m2.box_id  AND m1.id < m2.id ) 
WHERE m2.id IS NULL AND m1.box_id = b.id) as lastMessageId
 
 is due to a hibernate formula (containing more or less plain SQL)
 to determine the last message id for that box. It ought to return
 just one row, not multiple. So I am constrained to the subquery in
 all optimization attempts (I cannot combine them as you did), at
 least I do not see how. If you have an idea for a more performant
 subquery though, let me know, as this can easily be replaced.
 
Maybe:
 
(SELECT max(m1.id) FROM message m1 WHERE m1.box_id = b.id)
 
-Kevin



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:

 So, the challenge is this: I'd like to see repeatable test cases
 that demonstrate regular performance gains > 20%.  Double bonus
 points for cases that show gains > 50%.

 Are you talking throughput, maximum latency, or some other
 metric?
 
 I am talking about *any* metric..you've got something, let's see
 it.  But it's got to be verifiable, so no points scored.
 
Oh, that wasn't to score points; just advocating for more than a
one-dimensional view of performance.  I'm adding to your demands,
not attempting to satisfy them.  :-)
 
 See my note above about symptoms -- if your symptom of note
 happens to be unpredictable spikes in fast query times under load,
 then I'd like to scribble that advice directly into the docs along
 with (hopefully) some reasoning of exactly why more database
 managed buffers are helping.
 
In our case it was *fewer* shared_buffers which helped.
 
 As noted, I'm particularly interested in things we can test
 outside of production environments, since I'm pretty skeptical the
 Wisconsin Court System is going to allow the internet to log in
 and repeat and verify test methodologies.
 
Right, while it was a fairly scientific and methodical test, it was
against a live production environment.  We adjusted parameters
incrementally, a little each day, from where they had been toward
values which were calculated in advance to be better based on our
theory of the problem (aided in no small part by analysis and advice
from Greg Smith), and saw a small improvement each day with the
problem disappearing entirely right at the target values we had
calculated in advance.  :-)
 
 Point being: cranking buffers may have been the bee's knees with,
 say, the 8.2 buffer manager, but present and future improvements
 may have rendered that change moot or even counterproductive.
 
We did find that in 8.3 and later we can support a larger
shared_buffers setting without the problem than in 8.2 and earlier.
We still need to stay on the low side of what is often advocated to
keep the failure rate from this issue at zero.
 
 all else being equal, the lowest shared_buffers setting possible
 without sacrificing performance is best because it releases more
 memory to the o/s to be used for other things
 
I absolutely agree with this.
 
I think the problem is that it is very tedious and time-consuming to
construct artificial tests for these things.  Greg Smith has spent a
lot of time and done a lot of research investigating the dynamics of
these issues, and recommends a process of incremental adjustments
for tuning the relevant settings which, in my opinion, is going to
be better than any generalized advice on settings.
 
Don't get me wrong, I would love to see numbers which earned
points under the criteria you outline.  I would especially love it
if they could be part of the suite of tests in our performance farm.
I just think that the wealth of anecdotal evidence and the dearth of
repeatable benchmarks in this area is due to the relatively low-cost
techniques available to tune production systems to solve pressing
needs versus the relatively high cost of creating repeatable test
cases (without, by the way, solving an immediate need).
 
-Kevin



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Sorry,

SELECT MAX(e.id) FROM event_message e WHERE e.box_id = id

as posted previously should actually read

SELECT max(m1.id) FROM message m1 WHERE m1.box_id = b.id)

so I tried this already.
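
For anyone comparing the two forms in this thread, their equivalence for a given box can be checked with a tiny self-contained sketch (SQLite is used here purely for portability, and the sample ids are made up; only the message(id, box_id) layout follows the thread):

```python
import sqlite3

# In-memory table mirroring the thread's message(id, box_id) schema.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, box_id INTEGER)")
cur.executemany("INSERT INTO message (id, box_id) VALUES (?, ?)",
                [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20)])

box_id = 10

# Anti-join form from the thread: keep the row that has no row with a
# larger id in the same box.
anti = cur.execute("""
    SELECT m1.id
      FROM message m1
      LEFT JOIN message m2
        ON m1.box_id = m2.box_id AND m1.id < m2.id
     WHERE m2.id IS NULL AND m1.box_id = ?
""", (box_id,)).fetchone()[0]

# Aggregate form suggested earlier in the thread.
agg = cur.execute("SELECT max(id) FROM message WHERE box_id = ?",
                  (box_id,)).fetchone()[0]

print(anti, agg)  # both 3 for box 10
```

Both forms pick the same "last message" row; they only differ in how the planner gets there.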

Regards,
panam




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4429475.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.



Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam pa...@gmx.net:
 Hi all,


 Cédric Villemain-3 wrote:

 without explaining further why the antijoin has bad performance
 without cluster, I wonder why you don't use this query :

 SELECT  b.id,
                   max(m.id)
 FROM box b, message m
 WHERE m.box_id = b.id
 GROUP BY b.id;

 looks similar and fastest.

 I actually did use a similar strategy in the meantime (during my problem
 with the left join query we are talking about here all the time) for
 mitigation.
 It was
 SELECT MAX(e.id) FROM event_message e WHERE e.box_id = id
 and it performed worse in comparison to the left join query in the general
 case (i.e. before my problems began).
 At the end of this post is an explanation why I think I cannot use the
 solution you suggested above.


 Kevin Grittner wrote:

  Each connection can allocate work_mem, potentially several times.
 On a machines without hundreds of GB of RAM, that pair of settings
 could cause severe swapping.

 Indeed, thanks for the warning. These settings are not for production but to
 exclude a performance degradation because of small cache sizes.


 Kevin Grittner wrote:

 I think you would need a left join to actually get identical
 results:

 SELECT  b.id, max(m.id)
   FROM box b
   LEFT JOIN message m ON m.box_id = b.id
   GROUP BY b.id;

 But yeah, I would expect this approach to be much faster.  Rather
 easier to understand and harder to get wrong, too.


 Correct, it is much faster, even with unclustered ids.
 However, I think I cannot use it because of the way that query is generated
 (by hibernate).
 The (simplified) base query is just

 SELECT b.id from box

 the subquery

 (SELECT  m1.id FROM message m1
   LEFT JOIN message m2
      ON (m1.box_id = m2.box_id AND m1.id < m2.id)
   WHERE m2.id IS NULL AND m1.box_id = b.id) as lastMessageId

 is due to a hibernate formula (containing more or less plain SQL) to
 determine the last message id for that box. It ought to return just one row,
 not multiple. So I am constrained to the subquery in all optimization
 attempts (I cannot combine them as you did), at least I do not see how. If
 you have an idea for a more performant subquery though, let me know, as this
 can easily be replaced.

In production, if you have a decent IO system, you can lower
random_page_cost and it may be faster using an index (by default, with
the use case you provided, it chooses a seqscan). It can be a bit
tricky if you have to lower random_page_cost so much that it destroys
other query plans while improving the current one; if that happens,
post again :) (sometimes you also need to change other cost
parameters, but that needs to be handled with care)
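
One low-risk way to experiment with this is to change the setting for a single transaction only, so other sessions are unaffected (standard PostgreSQL syntax; the 1.5 value below is purely illustrative):

```sql
BEGIN;
SET LOCAL random_page_cost = 1.5;  -- illustrative value; the default is 4.0
EXPLAIN ANALYZE SELECT ...;        -- the query under test goes here
ROLLBACK;                          -- SET LOCAL reverts with the transaction
```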

I am not a Hibernate expert, but I'd be surprised if you cannot drive
Hibernate to do what you want.


 Thanks for your help and suggestions
 panam

 --
 View this message in context: 
 http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4429125.html
 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.





-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support



[PERFORM] Performance block size.

2011-05-26 Thread Tory M Blue
Working on some optimization as well as finally getting off my
backside and moving us to 64bit (32gb+memory).

I was reading that at some point the Postgres block size on FreeBSD
was apparently upped to 16kb from 8kb, while on my Fedora systems I
believe the default build uses 8kb.

When we were using ext2/ext3 etc. this made little difference as far
as I can tell, since the filesystem was limited to 4kb blocks (so 2
disk accesses for every Postgres write/read??).

Now with ext4 we can set the block size, so would it make sense, for
larger data sets that end up loading an entire 5-million-row table
into memory, to use a larger block size, or to match the block size
between Postgres and the filesystem (given that RAID is configured to
optimize writes over all the spindles in your storage)? (Leaving that
piece alone.)

I want to focus on the relationship between the Postgres default block
size and the various issues/gains from upping it, and at the same time
on matching it (or doing some storage magic) to find the optimum
between Postgres and the filesystem.
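
As a starting point for that comparison, the block size the filesystem reports can be read portably; the Postgres side is visible with `SHOW block_size;` (commonly 8192). The mount point below is an assumption, so substitute the one holding your data directory:

```python
import os

# Ask the filesystem what block sizes it advertises for this mount point.
# Replace "/" with the mount point that holds $PGDATA.
st = os.statvfs("/")

print("preferred I/O block size (f_bsize):  ", st.f_bsize, "bytes")
print("fundamental block size  (f_frsize):  ", st.f_frsize, "bytes")
```

Comparing these against Postgres' compiled-in page size tells you how many filesystem blocks each Postgres page read/write touches.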

Trying to gain some performance and wondered if any of this tuning
even is something I should bother with.

Fedora f12 (looking at CentOS)
postgres 8.4.4 (probably time to start getting to 9.x)
slon 1.2.20 (same, needs an update)

But system tuning, 64 bit first..

Thanks
Tory



Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Mark Kirkwood

On 26/05/11 20:31, Grzegorz Jaśkiewicz wrote:

Would HT have any impact on I/O performance (postgresql, and the fs in
general)?



There have been previous discussions on this list about HT on vs off 
(I can't recall what the consensus was, if any, about the cause of any 
performance difference). In our case HT off gave us much better 
results for what we think the typical number of clients will be - see 
attached (server learn-db1 is set up with trivial hardware RAID and 
then software RAIDed via md; learn-db2 has its RAID all in hardware. 
We've ended up going with the latter setup).


Note that the highest tps on the graph is about 2100 - we got this up 
to just over 2300 by changing from ext4 to xfs in later tests, and 
managed to push the tps for 100 clients up a little by setting no 
read-ahead (NORA) for the arrays.


Cheers

Mark
attachment: runhtoff.png
attachment: runhton.png


Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Greg Smith

Merlin Moncure wrote:

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains > 20%.  Double bonus points for
cases that show gains > 50%.


Do I run around challenging your suggestions and giving you homework?  
You have no idea how much eye rolling this whole message provoked from me.


OK, so the key thing to do is create a table such that shared_buffers is 
smaller than the primary key index on a table, then UPDATE that table 
furiously.  This will page constantly out of the buffer cache to the OS 
one, doing work that could be avoided.  Increase shared_buffers to where 
it fits instead, and all the index writes are buffered to write only 
once per checkpoint.  Server settings to exaggerate the effect:


shared_buffers = 32MB
checkpoint_segments = 256
log_checkpoints = on
autovacuum = off

Test case:

createdb pgbench
pgbench -i -s 20 pgbench
psql -d pgbench -c "select pg_size_pretty(pg_relation_size('public.pgbench_accounts_pkey'))"

psql -c "select pg_stat_reset_shared('bgwriter')"
pgbench -T 120 -c 4 -n pgbench
psql -x -c "SELECT * FROM pg_stat_bgwriter"

This gives the following size for the primary key and results:

 pg_size_pretty
----------------
 34 MB

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 13236
tps = 109.524954 (including connections establishing)
tps = 109.548498 (excluding connections establishing)

-[ RECORD 1 ]-+--
checkpoints_timed | 0
checkpoints_req   | 0
buffers_checkpoint| 0
buffers_clean | 16156
maxwritten_clean  | 131
buffers_backend   | 5701
buffers_backend_fsync | 0
buffers_alloc | 25276
stats_reset   | 2011-05-26 18:39:57.292777-04

Now, change so the whole index fits instead:

shared_buffers = 512MB

...which follows the good old 25% of RAM guidelines given this system 
has 2GB of RAM.  Restart the server, repeat the test case.  New results:


transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 103440
tps = 861.834090 (including connections establishing)
tps = 862.041716 (excluding connections establishing)

gsmith@meddle:~/personal/scripts$ psql -x -c "SELECT * FROM pg_stat_bgwriter"

-[ RECORD 1 ]-+--
checkpoints_timed | 0
checkpoints_req   | 0
buffers_checkpoint| 0
buffers_clean | 0
maxwritten_clean  | 0
buffers_backend   | 1160
buffers_backend_fsync | 0
buffers_alloc | 34071
stats_reset   | 2011-05-26 18:43:40.887229-04

Rather than writing 16156+5701=21857 buffers out during the test to 
support all the index churn, only 1160 buffers go out, consisting 
mostly of the data blocks for pgbench_accounts that are being updated 
irregularly.  With less than 1/18th as much I/O to do, the system 
executes nearly 8X as many UPDATE statements during the test run.
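
The figures quoted above can be sanity-checked directly from the two pg_stat_bgwriter snapshots and pgbench reports:

```python
# Buffers written in the 32MB run: background-writer cleans plus
# backend writes, versus backend writes alone in the 512MB run.
writes_small = 16156 + 5701   # buffers_clean + buffers_backend
writes_large = 1160           # buffers_backend only

io_ratio = writes_small / writes_large
tps_ratio = 861.834090 / 109.524954   # tps from the two pgbench runs

print("buffers written, 32MB case:", writes_small)       # 21857
print("I/O reduction: %.1fx" % io_ratio)                 # ~18.8x
print("UPDATE throughput gain: %.1fx" % tps_ratio)       # ~7.9x
```

That is exactly the "less than 1/18th the I/O, nearly 8X the throughput" claim in the text.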


As for figuring out how this impacts more complicated cases, I hear 
somebody wrote a book or something that went into pages and pages of 
detail about all this.  You might want to check it out.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Greg Smith

Mark Kirkwood wrote:
You can interrogate the config of the card and the raid 10 array using 
the megaraid cli package - you need to read the (frankly terrible) 
manual to discover which switches to use to determine battery and 
cache status etc. If you email me privately I'll get you a link to the 
relevant docs!


That's assuming the MegaCli utility will work against IBM's version of 
the card.  They use an LSI chipset for the RAID parts, but I don't know 
if the card is so similar that it will talk using that utility or not.


The main useful site here is 
http://tools.rapidsoft.de/perc/perc-cheat-sheet.html ; here's how to 
dump all the main config info from an LSI card:


MegaCli64 -LDInfo -Lall -aALL

You want to see a line like this:

 Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache 
if Bad BBU


For the arrays.  And then check the battery like this:

MegaCli64 -AdpBbuCmd -aALL

--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books




Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Mark Kirkwood

On 27/05/11 11:19, Greg Smith wrote:

Mark Kirkwood wrote:
You can interrogate the config of the card and the raid 10 array 
using the megaraid cli package - you need to read the (frankly 
terrible) manual to discover which switches to use to determine 
battery and cache status etc. If you email me privately I'll get you 
a link to the relevant docs!


That's assuming the MegaCli utility will work against IBM's version of 
the card.  They use an LSI chipset for the RAID parts, but I don't 
know if the card is so similar that it will talk using that utility or 
not.


It does seem to.

Cheers

Mark



Re: [PERFORM] serveRAID M5014 SAS

2011-05-26 Thread Mark Kirkwood

On 27/05/11 11:22, Mark Kirkwood wrote:

On 27/05/11 11:19, Greg Smith wrote:

Mark Kirkwood wrote:
You can interrogate the config of the card and the raid 10 array 
using the megaraid cli package - you need to read the (frankly 
terrible) manual to discover which switches to use to determine 
battery and cache status etc. If you email me privately I'll get you 
a link to the relevant docs!


That's assuming the MegaCli utility will work against IBM's version 
of the card.  They use an LSI chipset for the RAID parts, but I don't 
know if the card is so similar that it will talk using that utility 
or not.


It does seem to.

Cheers

Mark



e.g checking battery status:

root@learn-db2:~# MegaCli64 -AdpBbuCmd -GetBbuStatus -a0

BBU status for Adapter: 0

BatteryType: iBBU
Voltage: 4040 mV
Current: 0 mA
Temperature: 28 C

BBU Firmware Status:

  Charging Status  : None
  Voltage  : OK
  Temperature  : OK
  Learn Cycle Requested: No
  Learn Cycle Active   : No
  Learn Cycle Status   : OK
  Learn Cycle Timeout  : No
  I2c Errors Detected  : No
  Battery Pack Missing : No
  Battery Replacement required : No
  Remaining Capacity Low   : No
  Periodic Learn Required  : No
  Transparent Learn: No

Battery state:

GasGuageStatus:
  Fully Discharged: No
  Fully Charged   : Yes
  Discharging : Yes
  Initialized : Yes
  Remaining Time Alarm: No
  Remaining Capacity Alarm: No
  Discharge Terminated: No
  Over Temperature: No
  Charging Terminated : No
  Over Charged: No

Relative State of Charge: 99 %
Charger System State: 49168
Charger System Ctrl: 0
Charging current: 0 mA
Absolute state of charge: 99 %
Max Error: 2 %

Exit Code: 0x00


Reminds me of output from DB2 diag commands (years ago... I'm ok now).



Re: [PERFORM] The shared buffers challenge

2011-05-26 Thread Samuel Gendler
On Thu, May 26, 2011 at 4:10 PM, Greg Smith g...@2ndquadrant.com wrote:


 As for figuring out how this impacts more complicated cases, I hear
 somebody wrote a book or something that went into pages and pages of detail
 about all this.  You might want to check it out.


I was just going to suggest that there was significant and detailed
documentation of this stuff in a certain book, a well-thumbed copy of which
should be sitting on the desk of anyone attempting any kind of postgres
performance tuning.


Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Craig Ringer

On 05/27/2011 02:13 AM, Cédric Villemain wrote:


I am not a Hibernate expert, but I'd be surprised if you cannot drive
Hibernate to do what you want.


If nothing else, you can do a native query in hand-written SQL through 
Hibernate. ORMs are useful tools for some jobs, but it's good to be able 
to bypass them when needed too.


--
Craig Ringer



[PERFORM] Is there any effect on other process performance after vacuumdb finishes?

2011-05-26 Thread Junghwe Kim
Hi.

First, many thanks for your work on PostgreSQL.

I have noticed that after executing 'vacuumdb -z', some other processes
cannot read their own message queues for 2 ~ 3 minutes.

vacuum is executed every hour, and those processes have no relation to
PostgreSQL.

Is this possible?


Re: [PERFORM] Is there any effect on other process performance after vacuumdb finishes?

2011-05-26 Thread Craig Ringer

On 27/05/2011 9:58 AM, Junghwe Kim wrote:

Hi.

First, many thanks for your work on PostgreSQL.

I have noticed that after executing 'vacuumdb -z', some other processes
cannot read their own message queues for 2 ~ 3 minutes.


The most likely cause is checkpoint activity. Enable checkpoint logging 
and examine your server logs.



vacuum is executed every hour, and those processes have no relation to
PostgreSQL.


Instead of running vacuum from cron or some other scheduler, turn on 
autovacuum and set it to run aggressively. You will reduce the impact of 
vacuum if you run it *often* (every few minutes if not more frequently) 
and do so using autovacuum.
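
As a sketch only (parameter names as in PostgreSQL 8.3+; the thresholds are illustrative starting points, not tuned recommendations), an aggressive postgresql.conf setup in that spirit might look like:

```
autovacuum = on
log_autovacuum_min_duration = 0         # log every autovacuum run
autovacuum_naptime = 1min               # wake the launcher frequently
autovacuum_vacuum_scale_factor = 0.05   # vacuum after ~5% of a table changes
autovacuum_analyze_scale_factor = 0.02  # analyze after ~2% of a table changes
```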


Best results will be had on PostgreSQL 8.4 or above.

Please read this if you need to follow up, so you include enough 
information that a more complete answer can be given:


  http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/
