Re: [PERFORM] queries with subquery constraints on partitioned tables not optimized?

2010-02-03 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Davor J. dav...@live.com writes:
 Now, if one substitutes a subquery for the constant 1, the optimizer
 evaluates it first (let's say to 1), but then searches for it
 (sequentially) in every partition, which, for large partitions, can be
 very time-consuming and defeats the point of partitioning.

 No, the optimizer doesn't evaluate it first.  Subqueries aren't ever
 assumed to reduce to constants.  (If you actually do have a constant
 expression, why don't you just leave out the word SELECT?)

It's easy to run into the same problem with a JOIN that you'd want to
happen at the partition level but that the planner applies at the Append
node instead.

I've yet to figure out whether 8.4 is smarter about this; meanwhile I'm
using array tricks to force the push-down.

 WHERE ... 
   AND service = ANY ((SELECT array_accum(id) FROM services WHERE x=281)
   || (SELECT array_accum(id) FROM services WHERE y=281))

As it happens, I need the array concatenation more than the = ANY
operator itself (as compared to IN), so I also have queries using = ANY
('{}'::int[] || (SELECT array_accum(x) ...)) to really force the planner
into doing the join in the partitions rather than after the Append has
taken place.
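
For reference, a complete query in that style might look like this -- a
minimal sketch, assuming a table log partitioned by service, a lookup
table services, and the array_accum aggregate from the PostgreSQL
documentation (on 8.4 the built-in array_agg would do as well):

  CREATE AGGREGATE array_accum (anyelement) (
      sfunc    = array_append,
      stype    = anyarray,
      initcond = '{}'
  );

  SELECT count(*)
    FROM log
   WHERE service = ANY ('{}'::int[]
         || (SELECT array_accum(id) FROM services WHERE x = 281));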

Regards,
-- 
dim

PS: If you're interested in complete examples, I can provide them in
private.



Re: [PERFORM] System overload / context switching / oom, 8.3

2010-02-03 Thread Matthew Wakeling

On Tue, 2 Feb 2010, Rob wrote:

pg 8.3.9, Debian Etch, 8gb ram, quadcore xeon, megaraid (more details at end)
~240 active databases, 800+ db connections via tcp.



Linux 2.6.18-6-686-bigmem #1 SMP Thu Nov 5 17:30:05 UTC 2009 i686
GNU/Linux (Debian Etch)

8 GB RAM
4 Quad Core Intel(R) Xeon(R) CPU   E5440  @ 2.83GHz stepping 06


My advice?

1. Switch to 64-bit operating system and Postgres. Debian provides that, 
and it works a charm. You have a 64-bit system, so why not use it?


2. Buy more RAM. Think about it - you have 800 individual processes 
running on your box, and they will all want their own process space. To be 
honest, I'm impressed that the current machine works at all. You can get 
an idea of how much RAM you might need by multiplying the number of 
connections by (work_mem + about 3MB), and adding on shared_buffers. So even 
when the system is idle you're currently burning about 2400MB just sustaining 
800 processes - more if they are actually doing something.


3. Try to reduce the number of connections to the database server.

4. Think about your work_mem. Finding the correct value for you is going 
to be a matter of testing. Smaller values will result in large queries 
running slowly, while larger values carry the danger of driving the system 
to swap and OOM.
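
One mitigation (a sketch; the 64MB figure is purely illustrative): work_mem
can also be raised for a single session, so the server-wide value can stay
conservative while the occasional big query gets more:

  SET work_mem = '64MB';   -- affects this session only
  -- ... run the one big sorting/hashing query here ...
  RESET work_mem;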


Matthew

--
A good programmer is one who looks both ways before crossing a one-way street.
Considering the quality and quantity of one-way streets in Cambridge, it
should be no surprise that there are so many good programmers there.



[PERFORM] Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-03 Thread Greg Stark
On Tue, Feb 2, 2010 at 7:45 PM, Robert Haas robertmh...@gmail.com wrote:
 I think you're probably right, but it's not clear what the new name
 should be until we have a comment explaining what the function is
 responsible for.

So I wrote some comments but wasn't going to repost the patch with the
unchanged name without explanation... But I think you're right, though
I was looking at it the other way around: I want to have an API for a
two-stage sync, and of course if I do that I'll comment it to explain
that clearly.

The gist of the comments was that the function is preparing to fsync:
it initiates the i/o early so that the later fsync is fast -- while also
having the beneficial side-effect of avoiding cache poisoning. It's not
clear that the two are necessarily linked, though. Perhaps we need two
separate APIs, though it'll be hard to keep them separate on all
platforms.

-- 
greg



[PERFORM] Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-03 Thread Andres Freund

On 02/03/10 12:53, Greg Stark wrote:

On Tue, Feb 2, 2010 at 7:45 PM, Robert Haas robertmh...@gmail.com wrote:

I think you're probably right, but it's not clear what the new name
should be until we have a comment explaining what the function is
responsible for.


So I wrote some comments but wasn't going to repost the patch with the
unchanged name without explanation... But I think you're right, though
I was looking at it the other way around: I want to have an API for a
two-stage sync, and of course if I do that I'll comment it to explain
that clearly.

The gist of the comments was that the function is preparing to fsync:
it initiates the i/o early so that the later fsync is fast -- while also
having the beneficial side-effect of avoiding cache poisoning. It's not
clear that the two are necessarily linked, though. Perhaps we need two
separate APIs, though it'll be hard to keep them separate on all
platforms.
I vote for two separate APIs - sure, there will be some unfortunate 
overlap for most unixoid platforms, but it's surely better to allow 
adding more platforms later at a centralized place than to have to 
analyze every place where the API is used.


Andres



[PERFORM] Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-03 Thread Robert Haas
On Wed, Feb 3, 2010 at 6:53 AM, Greg Stark gsst...@mit.edu wrote:
 On Tue, Feb 2, 2010 at 7:45 PM, Robert Haas robertmh...@gmail.com wrote:
 I think you're probably right, but it's not clear what the new name
 should be until we have a comment explaining what the function is
 responsible for.

 So I wrote some comments but wasn't going to repost the patch with the
 unchanged name without explanation... But I think you're right, though
 I was looking at it the other way around: I want to have an API for a
 two-stage sync, and of course if I do that I'll comment it to explain
 that clearly.

 The gist of the comments was that the function is preparing to fsync:
 it initiates the i/o early so that the later fsync is fast -- while also
 having the beneficial side-effect of avoiding cache poisoning. It's not
 clear that the two are necessarily linked, though. Perhaps we need two
 separate APIs, though it'll be hard to keep them separate on all
 platforms.

Well, maybe we should start with a discussion of what kernel calls
you're aware of on different platforms and then we could try to put an
API around it.  I mean, right now all you've got is
POSIX_FADV_DONTNEED, so given just that I feel like the API could
simply be pg_dontneed() or something.  It's hard to design a general
framework based on one example.

...Robert



[PERFORM] Re: [HACKERS] Re: Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-02-03 Thread Andres Freund

On 02/03/10 14:42, Robert Haas wrote:

On Wed, Feb 3, 2010 at 6:53 AM, Greg Stark gsst...@mit.edu wrote:

On Tue, Feb 2, 2010 at 7:45 PM, Robert Haas robertmh...@gmail.com wrote:

I think you're probably right, but it's not clear what the new name
should be until we have a comment explaining what the function is
responsible for.


So I wrote some comments but wasn't going to repost the patch with the
unchanged name without explanation... But I think you're right, though
I was looking at it the other way around: I want to have an API for a
two-stage sync, and of course if I do that I'll comment it to explain
that clearly.

The gist of the comments was that the function is preparing to fsync:
it initiates the i/o early so that the later fsync is fast -- while also
having the beneficial side-effect of avoiding cache poisoning. It's not
clear that the two are necessarily linked, though. Perhaps we need two
separate APIs, though it'll be hard to keep them separate on all
platforms.


Well, maybe we should start with a discussion of what kernel calls
you're aware of on different platforms and then we could try to put an
API around it.
On Linux there is sync_file_range(). On newer POSIXish systems one can 
emulate that with mmap() and msync() (in batches, obviously).


No idea about Windows.

Andres



Re: [PERFORM] some problems when i use postgresql 8.4.2 in my projects .

2010-02-03 Thread Robert Haas
2010/2/2  wyx6...@sina.com:
 The unique constraints actually kill concurrent write transactions: when
 concurrent inserts violate a unique constraint, they block each other.
 I tested this in Oracle 10g, which has the same behaviour. I think this may
 be reasonable, because the unique check must be a serializable check.
 To work around this problem I do the unique check in the application where
 possible, but in a highly concurrent environment this is not a good way.

You may find that your way isn't actually very reliable, and that
making it reliable will be very, very much harder (and likely no
faster) than letting PostgreSQL do it.
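
To illustrate the race (a hypothetical sketch: a users table with an email
column and no unique index on it):

  -- Session A checks first:
  SELECT count(*) FROM users WHERE email = 'dup@example.com';  -- returns 0
  -- Session B runs the same check before A inserts, and also sees 0.
  -- Both sessions then insert, and both inserts succeed:
  INSERT INTO users (email) VALUES ('dup@example.com');
  -- Result: duplicate rows. Only a database-enforced unique index closes
  -- the window between the check and the insert.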

...Robert



[PERFORM] Re: Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Amitabh Kant
Forgot to add that I am using Postgres 8.4.2 from the default ports of
FreeBSD.

With regards

Amitabh Kant

On Wed, Feb 3, 2010 at 8:40 PM, Amitabh Kant amitabhk...@gmail.com wrote:

 Hello

 I have a server dedicated for Postgres with the following specs:

 RAM 16GB, 146GB SAS (15K) x 4 -  RAID 10 with BBU, Dual Xeon  E5345  @
 2.33GHz
 OS: FreeBSD 8.0

 It runs multiple (approx 10) databases ranging from 500MB to over 24 GB in
 size. All of them are of the same structure, and almost all of them have
 very heavy reads and writes.

 pgtune (http://pgfoundry.org/projects/pgtune/) suggests the settings to be
 changed as :

 maintenance_work_mem = 960MB # pg_generate_conf wizard 2010-02-03
 checkpoint_completion_target = 0.9 # pg_generate_conf wizard 2010-02-03
 effective_cache_size = 11GB # pg_generate_conf wizard 2010-02-03
 work_mem = 160MB # pg_generate_conf wizard 2010-02-03
 wal_buffers = 8MB # pg_generate_conf wizard 2010-02-03
 checkpoint_segments = 16 # pg_generate_conf wizard 2010-02-03
 shared_buffers = 3840MB # pg_generate_conf wizard 2010-02-03
 max_connections = 100 # pg_generate_conf wizard 2010-02-03


 While this gives me the changes for postgresql.conf, I am not sure of the
 changes that I need to make in FreeBSD to support such large memory
 allocations. The last time I tried, Postgres refused to start and I had to
 fall back to the default settings.

 I would appreciate if somebody could point out the sysctl/loader.conf
 settings that I need to have in FreeBSD.

 With regards

 Amitabh Kant



Re: [PERFORM] some problems when i use postgresql 8.4.2 in my projects .

2010-02-03 Thread Leo Mannhart
wyx6...@sina.com wrote:
 I think I should pick out some of my points:
 The unique constraints actually kill concurrent write transactions: when
 concurrent inserts violate a unique constraint, they block each other.
 I tested this in Oracle 10g, which has the same behaviour. I think this
 may be reasonable, because the unique check must be a serializable check.
 To work around this problem I do the unique check in the application where
 possible, but in a highly concurrent environment this is not a good way.
 

How can you enforce uniqueness in the application? If you implement it
correctly, it takes considerably longer than letting PostgreSQL do it.
Even if you use some kind of magic, I cannot imagine how you could
implement a unique constraint in the application and guarantee
uniqueness while at the same time being faster than the RDBMS.

Leo



Re: [PERFORM] queries with subquery constraints on partitioned tables not optimized?

2010-02-03 Thread Nikolas Everett
On Tue, Feb 2, 2010 at 7:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Davor J. dav...@live.com writes:
  Now, if one substitutes a subquery for the constant 1, the optimizer
  evaluates it first (let's say to 1), but then searches for it
  (sequentially) in every partition, which, for large partitions, can be
  very time-consuming and defeats the point of partitioning.

 No, the optimizer doesn't evaluate it first.  Subqueries aren't ever
 assumed to reduce to constants.  (If you actually do have a constant
 expression, why don't you just leave out the word SELECT?)

regards, tom lane


If you don't have a constant expression, then you can either loop
explicitly in the calling code or in a function, or you can index the key
in all the subtables. The index isn't really optimal, but it gets the job
done.
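
A minimal sketch of that last option, with hypothetical names (a parent
table measurements partitioned by month, the subquery constraining
service_id):

  CREATE INDEX measurements_2010_01_service_id_idx
      ON measurements_2010_01 (service_id);
  CREATE INDEX measurements_2010_02_service_id_idx
      ON measurements_2010_02 (service_id);
  -- ... and so on for each child table; the subquery's result is then
  -- matched by an index scan in each partition instead of a seq scan.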

Nik


Re: [PERFORM] Queries within a function

2010-02-03 Thread Віталій Тимчишин
2010/2/2 Mridula Mahadevan mmahade...@stratify.com

  Hi,

  I am running a bunch of queries within a function, creating some temp
 tables and populating them. When the data exceeds, say, 100k rows the
 queries start getting really slow and time out (30 min). When these are run
 outside of a transaction (in autocommit mode), they run in a few seconds.
 Any ideas on what may be going on and any postgresql.conf parameters etc
 that might help?

 Thanks

Have you tried to analyze the temp tables after you've populated them?
Because AFAIK it won't do that automatically for tables created, filled and
then used in the same transaction.
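
A minimal sketch of that suggestion, with hypothetical table names:

  CREATE OR REPLACE FUNCTION build_report() RETURNS void AS $$
  BEGIN
      CREATE TEMP TABLE tmp_orders ON COMMIT DROP AS
          SELECT * FROM orders
           WHERE created_at > now() - interval '1 day';
      -- Collect statistics explicitly, inside the same transaction,
      -- so the queries below get realistic row estimates:
      ANALYZE tmp_orders;
      -- ... the rest of the function's queries against tmp_orders ...
  END;
  $$ LANGUAGE plpgsql;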


Re: [PERFORM] Re: Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Reid Thompson
On Wed, 2010-02-03 at 20:42 +0530, Amitabh Kant wrote:
 Forgot to add that I am using Postgres 8.4.2 from the default ports of
 FreeBSD.

start with this page
http://www.postgresql.org/docs/8.4/static/kernel-resources.html




Re: [PERFORM] Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Ivan Voras

On 02/03/10 16:10, Amitabh Kant wrote:

Hello

I have a server dedicated for Postgres with the following specs:

RAM 16GB, 146GB SAS (15K) x 4 -  RAID 10 with BBU, Dual Xeon  E5345  @
2.33GHz
OS: FreeBSD 8.0


If you really do have heavy read and write load on the server, nothing 
will save you from the bottleneck of having only 4 drives in the system 
(or more accurately: adding more memory will help reads, but nothing 
helps writes except more drives or faster (SSD) drives). If you can, 
add another 2 drives in RAID 1 and move+symlink the pg_xlog directory to 
the new array.



maintenance_work_mem = 960MB # pg_generate_conf wizard 2010-02-03
checkpoint_completion_target = 0.9 # pg_generate_conf wizard 2010-02-03
effective_cache_size = 11GB # pg_generate_conf wizard 2010-02-03
work_mem = 160MB # pg_generate_conf wizard 2010-02-03
wal_buffers = 8MB # pg_generate_conf wizard 2010-02-03
checkpoint_segments = 16 # pg_generate_conf wizard 2010-02-03
shared_buffers = 3840MB # pg_generate_conf wizard 2010-02-03
max_connections = 100 # pg_generate_conf wizard 2010-02-03



I would appreciate if somebody could point out the sysctl/loader.conf
settings that I need to have in FreeBSD.


Firstly, you need to run a 64-bit version (amd64) of FreeBSD.

In /boot/loader.conf you will probably need to increase the number of 
SysV IPC semaphores:


kern.ipc.semmni=512
kern.ipc.semmns=1024

This depends mostly on the number of connections allowed to the server. 
The example values I gave above are more than enough but since this is a 
boot-only tunable it is expensive to modify later.


In /etc/sysctl.conf you will need to increase the shared memory sizes, 
e.g. for a 3900 MB shared_buffer:


kern.ipc.shmmax=4089446400
This is the maximum shared memory segment size, in bytes.

kern.ipc.shmall=1048576
This is the maximum amount of memory allowed to be used as sysv shared 
memory, in 4 kB pages.


If the database contains many objects (tables, indexes, etc.) you may 
need to increase the maximum number of open files and the amount of 
memory for the directory list cache:


kern.maxfiles=16384
vfs.ufs.dirhash_maxmem=4194304

If you estimate you will have large sequential reads on the database, 
you should increase read-ahead count:


vfs.read_max=32

Be sure that soft-updates is enabled on the file system you are using 
for data. Ignore all Linux-centric discussions about problems with 
journaling and write barriers :)


All settings in /etc/sysctl.conf can be changed at runtime (individually 
or by invoking /etc/rc.d/sysctl restart); settings in loader.conf are 
boot-time only.





Re: [PERFORM] Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Andy Colson

On 2/3/2010 9:10 AM, Amitabh Kant wrote:

Hello

I have a server dedicated for Postgres with the following specs:

RAM 16GB, 146GB SAS (15K) x 4 -  RAID 10 with BBU, Dual Xeon  E5345  @
2.33GHz
OS: FreeBSD 8.0

It runs multiple (approx 10) databases ranging from 500MB to over 24 GB
in size. All of them are of the same structure, and almost all of them
have very heavy reads and writes.


With regards

Amitabh Kant


What problems are you having?  Is it slow?  Is there something you are 
trying to fix, or is this just the first tune-up?



 memory allocations. The last time I tried, Postgres refused to start and
 I had to fall back to the default settings.

It's probably upset about the amount of shared memory. There is probably a 
way in BSD to set the max amount of shared memory available. A quick 
Google search turned up:


kern.ipc.shmmax

Dunno if that's right.  When you try to start PG, if it cannot allocate 
enough shared memory it'll spit out an error message into its log saying 
how much it tried to allocate.


Check:
http://archives.postgresql.org/pgsql-admin/2004-06/msg00155.php




 maintenance_work_mem = 960MB # pg_generate_conf wizard 2010-02-03
 checkpoint_completion_target = 0.9 # pg_generate_conf wizard 2010-02-03
 effective_cache_size = 11GB # pg_generate_conf wizard 2010-02-03
 work_mem = 160MB # pg_generate_conf wizard 2010-02-03
 wal_buffers = 8MB # pg_generate_conf wizard 2010-02-03
 checkpoint_segments = 16 # pg_generate_conf wizard 2010-02-03
 shared_buffers = 3840MB # pg_generate_conf wizard 2010-02-03
 max_connections = 100 # pg_generate_conf wizard 2010-02-03

Some of these seem like too much.  I'd recommend starting with one or 
two and seeing how it runs, then increasing if you're still slow.


Start with effective_cache_size, shared_buffers and checkpoint_segments.

Wait until very last to play with work_mem and maintenance_work_mem.


-Andy



Re: [PERFORM] Slow-ish Query Needs Some Love

2010-02-03 Thread Andy Colson

On 2/3/2010 11:17 AM, Matt White wrote:

On Feb 2, 1:11 pm, a...@squeakycode.net (Andy Colson) wrote:

On 2/2/2010 1:03 PM, Matt White wrote:






On Feb 2, 6:06 am, Edgardo Portale gportal2...@yahoo.com wrote:

On 2010-02-02, Matt White mattw...@gmail.com wrote:



I have a relatively straightforward query that by itself isn't that
slow, but we have to run it up to 40 times on one webpage load, so it
needs to run much faster than it does. Here it is:



SELECT COUNT(*) FROM users, user_groups
   WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND
user_groups.partner_id IN
   (partner_id_1, partner_id_2);



The structure is partners have user groups which have users. In the
test data there are over 200,000 user groups and users but only ~3000
partners. Anyone have any bright ideas on how to speed this query up?



Can you avoid running it 40 times, maybe by restructuring the
query (or making a view) along the lines of the following and
adding some logic to your page?



SELECT p.partner_id, ug.user_group_id, u.id, count(*)
FROM partners p
 LEFT JOIN user_groups ug
ON ug.partner_id=p.partner_id
 LEFT JOIN users u
ON u.user_group_id=ug.id
   WHERE NOT u.deleted
   GROUP BY 1,2,3
;



Thanks for the suggestion. The view didn't seem to speed things up.
Perhaps we can reduce the number of times it's called, we'll see. Any
additional ideas would be helpful. Thanks.


I agree with Edgardo, I think the biggest time saver will be reducing
trips to the database.

But... do you have an index on users.user_group_id?

Does rewriting it change the plan any?

SELECT COUNT(*) FROM users
inner join user_groups on (users.user_group_id = user_groups.id)
where NOT users.deleted
AND user_groups.partner_id IN (partner_id_1, partner_id_2);

And... it looks like the row guesstimate is off a little:

Index Scan using user_groups_partner_id_idx
on user_groups
(cost=0.00..133.86 rows=3346 width=8)
(actual time=0.049..96.992 rows=11 loops=2)

It guessed 3,346 rows, but actually got 100,001.  Have you run an
analyze on it?  If so, maybe bumping up the stats might help?

-Andy



Andy,

I have run analyze, see my query plan in my original post. You'll have
to forgive me for being a bit of a Postgres noob but what do you mean
by bumping up the stats?


That's not what I meant.  explain analyze select... is what you did, and 
that's correct.  What I meant was: analyze user_groups.


see:
http://www.postgresql.org/docs/8.4/interactive/sql-analyze.html


An analyze will make PG look at a table and calc stats on it, so it can 
make better guesses.  By default analyze only looks at a few rows (well, 
a small percentage of rows) and makes guesses about the entire table based 
on those rows.  If it guesses wrong, sometimes you need to tell it to 
analyze more rows (i.e. a bigger percentage of the table).


By bumping the stats I was referring to this:

http://wiki.postgresql.org/wiki/Planner_Statistics

I have never had to do it, so don't know much about it.  It may or may 
not help.  Just thought it was something you could try.
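
For what it's worth, bumping the stats for the column in your plan would
look something like this (a sketch; the 1000 target is illustrative):

  ALTER TABLE user_groups ALTER COLUMN partner_id SET STATISTICS 1000;
  ANALYZE user_groups;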


-Andy




Re: [PERFORM] Queries within a function

2010-02-03 Thread Mridula Mahadevan
Thank you all, you were right on the analyze. An insert statement with an 
aggregated subquery had a problem on an empty table.

I had to change the queries to do a simple insert, then an analyze on the 
table, followed by an update with an aggregated subquery. That goes through 
very fast.

-mridula

From: Віталій Тимчишин [mailto:tiv...@gmail.com]
Sent: Wednesday, February 03, 2010 8:11 AM
To: Mridula Mahadevan
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Queries within a function


2010/2/2 Mridula Mahadevan mmahade...@stratify.com
Hi,
 I am running a bunch of queries within a function, creating some temp tables 
and populating them. When the data exceeds, say, 100k rows the queries start 
getting really slow and time out (30 min). When these are run outside of a 
transaction (in autocommit mode), they run in a few seconds. Any ideas on what 
may be going on and any postgresql.conf parameters etc that might help?
Thanks
Have you tried to analyze the temp tables after you've populated them? Because 
AFAIK it won't do that automatically for tables created, filled and then used in 
the same transaction.


Re: [PERFORM] System overload / context switching / oom, 8.3

2010-02-03 Thread Robert Haas
On Tue, Feb 2, 2010 at 3:47 PM, Andy Colson a...@squeakycode.net wrote:
 effective_cache_size = 5000MB

 I see you're running 32-bit, but with bigmem support; still, one process
 is limited to 4 gig.  You'd make better use of all that RAM if you switched
 to 64-bit.  And this cache, I think, would be limited to 4 gig.

Just to be clear, effective_cache_size does not allocate any memory of
any kind, in any way, ever...

...Robert



Re: [PERFORM] Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Robert Haas
On Wed, Feb 3, 2010 at 10:10 AM, Amitabh Kant amitabhk...@gmail.com wrote:
 work_mem = 160MB # pg_generate_conf wizard 2010-02-03

Overall these settings look sane, but this one looks like an
exception.  That is an enormous value for that parameter...

...Robert



Re: [PERFORM] Optimizing Postgresql server and FreeBSD for heavy read and writes

2010-02-03 Thread Greg Smith

Robert Haas wrote:

On Wed, Feb 3, 2010 at 10:10 AM, Amitabh Kant amitabhk...@gmail.com wrote:
  

work_mem = 160MB # pg_generate_conf wizard 2010-02-03



Overall these settings look sane, but this one looks like an
exception.  That is an enormous value for that parameter...
  


Yeah, I think I need to retune the suggestions for that parameter.  The 
idea behind the tuning profile used in the web and OLTP setups is 
that you're unlikely to have all the available connections doing 
something involving sorting at the same time with those workloads, and 
when it does happen you want it to use the fastest approach possible 
even if that takes more RAM so the client waiting for a response is more 
likely to get one on time.  That's why the work_mem figure in those 
situations is set very aggressively:  total_mem / connections, so on a 
16GB server that comes out to the 160MB seen here.  I'm going to adjust 
that so that it's capped a little below (total_mem - shared_buffers) / 
connections instead.
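
As a worked example on the 16GB / 100-connection case above (a quick sketch
in SQL; values in MB), the adjusted cap would land around 125MB instead of
160MB:

  SELECT (16384 - 3840) / 100 AS capped_work_mem_mb;  -- 125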


pgtune just got a major bit of refactoring recently from Matt Harrison 
to make it more Python-esque, and I'll be pushing toward an official 1.0 
with all the major loose ends cleaned up and an adjusted tuning model 
that will be available before 9.0 ships.  I'm seeing enough people 
interested in it now to justify putting another block of work into 
improving it.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us



Re: [PERFORM] System overload / context switching / oom, 8.3

2010-02-03 Thread Rob Lemley

Andy Colson wrote:

 work_mem = 32MB
 maintenance_work_mem = 64MB


if you have lots and lots of connections, you might need to cut these 
down?


definitely, work_mem is the main focus.

If I understand correctly, the 64MB maintenance_work_mem is per vacuum 
task, and on this system there are 3 autovacuums.  I was wondering whether, 
with this many databases, to decrease maintenance_work_mem significantly 
and start up more autovacuum workers.


Yes, also moving databases to other servers in order to decrease the 
number of connections.


 effective_cache_size = 5000MB

I see you're running 32-bit, but with bigmem support; still, one 
process is limited to 4 gig.  You'd make better use of all that RAM if 
you switched to 64-bit.  And this cache, I think, would be limited to 
4 gig.
All of the cache is being used because the operating system kernel is 
built with the memory extensions to address memory outside the 32-bit 
range.  This is the cache size reported by free(1). However, there may be 
advantages to switching to 64-bit.




The oom-killer is kicking in, at some point, so somebody is using too 
much RAM.  There should be messages or logs or something, right?  
(I've never enabled the oom stuff so don't know much about it.)  But 
the log messages might be helpful.


Also, do you know what the oom max memory usage is set to?  You said: 
oom_adj -17.  vm.overcommit_memory set to 2, but at this time 
vm.overcommit_ratio was still at 50 (it has since been changed to 90; 
should this be 100?)


Oh man.  I encourage everyone to find out what /proc/pid/oom_adj 
means.  You have to set this to keep the Linux oom-killer from doing a 
kill -9 on the postgres postmaster.  On Debian:


echo -17 > /proc/$(cat /var/run/postgresql/8.3-main.pid)/oom_adj

This is my experience with oom-killer.  After putting -17 into 
/proc/pid/oom_adj, oom-killer seemed to kill one of the database 
connection processes.  Then the postmaster attempted to shut down all 
processes because of possible shared memory corruption.  The database 
then went into recovery mode.  After stopping the database some of the 
processes were stuck and could not be killed.  The operating system was 
rebooted and the database returned with no data loss.


My earlier experience with oom-killer: if you don't have this setting in 
oom_adj, then it seems likely (certain?) that oom-killer kills the 
postmaster, because the algorithm oom-killer uses (called badness()) 
adds child processes' scores to their parent's score.  I don't 
know if sshd was killed but I don't think anyone could log in to the 
OS.  After rebooting there was a segmentation violation when trying to 
start the postmaster.  I don't think that running pg_resetxlog with 
defaults is a good idea.  My colleague who has been investigating the 
crash believes that we could have probably eliminated at least some of 
the data loss with more judicious use of pg_resetxlog.


There was a discussion on the postgres lists about somehow having the 
postgres distribution include the functionality to set oom_adj on 
startup.  To my knowledge, that's not in 8.3, so I wrote a script and 
init.d script to do this on Debian systems.


As far as vm.overcommit_memory goes, there are three settings, and most 
people recommend setting it to 2 for postgres.  However, this does not turn 
off oom-killer!  You need to put -17 in /proc/pid/oom_adj whether you do 
anything about vm.overcommit_memory or not.  We had vm.overcommit_memory 
set to 2 and oom-killer became active and killed the postmaster.


Kind of off-topic, but a Linux kernel parameter that's often not set on 
database servers is elevator=deadline, which selects the I/O scheduling 
algorithm.  The algorithm can also be viewed/set at runtime, for example for 
the disk /dev/sdc via /sys/block/sdc/queue/scheduler.


Rob






Re: [PERFORM] System overload / context switching / oom, 8.3

2010-02-03 Thread Greg Smith

Rob Lemley wrote:
There was a discussion on the postgres lists about somehow having the 
postgres distribution include the functionality to set oom_adj on 
startup.  To my knowledge, that's not in 8.3, so I wrote a script and 
init.d script to do this on Debian systems.


That's not in anything earlier than the upcoming 9.0 because the support 
code involved just showed up:  
http://archives.postgresql.org/pgsql-committers/2010-01/msg00169.php


It was always possible to do this in an init script as you describe.  
The specific new feature added is the ability to remove client child 
processes from having that protection, so that they can still be killed 
normally.  Basically, it limits the protection to just the process that 
you really need it on.  The updated documentation for the new version 
has more details about this whole topic, useful to people running older 
versions too:  
http://developer.postgresql.org/pgdocs/postgres/kernel-resources.html


Kind of off-topic, but a Linux kernel parameter that's often not set 
on database servers is elevator=deadline, which selects the I/O 
scheduling algorithm.  The algorithm can also be viewed/set at runtime, 
for example for the disk /dev/sdc via /sys/block/sdc/queue/scheduler.


I've never seen a real-world PostgreSQL workload where deadline worked 
better than CFQ, and I've seen a couple where it was significantly 
worse.  Playing with that parameter needs a heavy disclaimer that you 
should benchmark *your app* before and after changing it to make sure it 
was actually useful.  Actually, benchmark three times:  return to CFQ again 
afterwards, too, just to confirm it's not a faster-on-the-second-run 
effect.


The important things to get right on Linux are read-ahead and reducing 
the size of the write cache -- the latter being the more direct and 
effective way to improve the problem that the scheduler change happens 
to impact too.  Those have dramatically more importance than sensible 
changes to the scheduler used (using the anticipatory one on a server 
system or the no-op one on a desktop would be non-sensible changes).


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

