Re: [PERFORM] Need help in performance tuning.

2010-07-14 Thread Hannu Krosing
On Fri, 2010-07-09 at 00:42 -0400, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
  On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
  cr...@postnewspapers.com.au wrote:
  If you're not using a connection pool, start using one.
 
  I see this issue and subsequent advice cross this list awfully
  frequently.  Is there in architectural reason why postgres itself
  cannot pool incoming connections in order to eliminate the requirement
  for an external pool?
 
 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

One example where you need a separate connection pool is pooling really
large number of connections, which you may want to do on another host
than the database itself is running.

For example pgbouncer had to add option to use incoming unix sockets,
because they run into the IP socket port number limit (a little above
31k, or more exactly 63k/2.
And unix sockets can be used only on local host .

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-14 Thread Kevin Grittner
Scott Marlowe scott.marl...@gmail.com wrote:
 Hannu Krosing ha...@2ndquadrant.com wrote:
 One example where you need a separate connection pool is pooling
 really large number of connections, which you may want to do on
 another host than the database itself is running.
 
 Definitely.  Often it's best placed on the individual webservers
 that are making requests, each running its own pool.
 
Each running its own pool?  You've just made a case for an
admissions policy based on active database transactions or active
queries (or both) on the server having a benefit when used with this
pooling arrangement.  This collection of pools can't know when the
CPUs have enough to keep them busy and adding more will degrade
performance.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-14 Thread Hannu Krosing
On Wed, 2010-07-14 at 08:58 -0500, Kevin Grittner wrote:
 Scott Marlowe scott.marl...@gmail.com wrote:
  Hannu Krosing ha...@2ndquadrant.com wrote:
  One example where you need a separate connection pool is pooling
  really large number of connections, which you may want to do on
  another host than the database itself is running.
  
  Definitely.  Often it's best placed on the individual webservers
  that are making requests, each running its own pool.
  
 Each running its own pool?  You've just made a case for an
 admissions policy based on active database transactions or active
 queries (or both) on the server having a benefit when used with this
 pooling arrangement.  This collection of pools can't know when the
 CPUs have enough to keep them busy and adding more will degrade
 performance.

I guess this setup is for OLTP load (read lots of short transactions
with low timeout limits), where you can just open 2-5 connections per
CPU for mostly-in-memory database, maybe a little more when disk
accesses are involved. If you have more, then they just wait a few
milliseconds, if you have less, you don't have anything else to run
anyway.


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-12 Thread Bruce Momjian
Craig Ringer wrote:
 It'll need to separate running queries from running processes, or
 start threading backends, so that one way or the other a single query
 can benefit from the capabilities of multiple CPUs. The same separation,
 or a move to async I/O, might be needed to get one query to concurrently
 read multiple partitions of a table, or otherwise get maximum benefit
 from high-capacity I/O subsystems when running just a few big, expensive
 queries.
 
 Otherwise I'm wondering if PostgreSQL will begin really suffering in
 performance on workloads where queries are big and expensive but there
 are relatively few of them running at a time.

Agreed.  We certainly are going to have to go in that direction someday.
We have TODO items for these.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-11 Thread Pierre C


Two problems to recognize.  First is that building something in has the  
potential to significantly limit use and therefore advancement of work  
on external pools, because of the let's use the built in one instead of  
installing something extra mentality.  I'd rather have a great external  
project (which is what we have with pgBouncer) than a mediocre built-in  
one that becomes the preferred way just by nature of being in the core.


I would prefer having supplier A build a great product that seamlessly  
interfaces with supplier B's great product, rather than having supplier M$  
buy A, develop a half-working brain-dead version of B into A and market it  
as the new hot stuff, sinking B in the process. Anyway, orthogonal feature  
sets (like database and pooler) implemented in separate applications fit  
the open source development model quite well I think. Merge everything in,  
you get PHP.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-10 Thread Jesper Krogh

On 2010-07-10 00:59, Greg Smith wrote:

 Matthew Wakeling wrote:
 If you have an external pool solution, you can put it somewhere
 else - maybe on multiple somewhere elses.

 This is the key point to observe:  if you're at the point where you
 have so many connections that you need a pool, the last place you
 want to put that is on the overloaded database server itself.
 Therefore, it must be an external piece of software to be effective,
 rather than being part of the server itself.  Database servers are
 relatively expensive computing hardware due to size/quantity/quality
 of disks required.  You can throw a pooler (or poolers) on any cheap
 1U server.  This is why a built-in pooler, while interesting, is not
 particularly functional for how people normally scale up real-world
 deployments.


That may be totally correct for the 10% of the userbase
that are in a squeezed situation, but for the 90% that isn't (or isn't aware
of being there), the build-in would be a direct benefit. For the 20%
living near the edge it may be the difference between just working and
extra hassle.

I think it is a fair assumption that the majority of PG's users solves
the problems without an connection pooler, and the question
is if it is beneficial to let them scale better without doing anything?

I have also provided a case where Kevin proposal might be a
benefit but a connection pooler cannot solve it:

http://archives.postgresql.org/pgsql-hackers/2010-06/msg01438.php
(at least as I see it, but I'm fully aware that there is stuff I dont 
know of)


I dont think a build-in connection-poller (or similiar) would in any
way limit the actions and abillities of an external one?

* Both numbers wildly guessed..
--
Jesper





Re: [PERFORM] Need help in performance tuning.

2010-07-10 Thread Greg Smith

Jesper Krogh wrote:

I dont think a build-in connection-poller (or similiar) would in any
way limit the actions and abillities of an external one?


Two problems to recognize.  First is that building something in has the 
potential to significantly limit use and therefore advancement of work 
on external pools, because of the let's use the built in one instead of 
installing something extra mentality.  I'd rather have a great external 
project (which is what we have with pgBouncer) than a mediocre built-in 
one that becomes the preferred way just by nature of being in the core.  
If work on a core pooler was started right now, it would be years before 
that reached feature/performance parity, and during that time its 
existence would be a net loss compared to the current status quo for 
many who used it.


The second problem is the limited amount of resources to work on 
improvements to PostgreSQL.  If you want to improve the reach of 
PostgreSQL, I consider projects like materialized views and easy 
built-in partitioning to be orders of magnitude more useful things to 
work on than the marginal benefit of merging the features of the 
external pool software inside the database.  I consider the whole topic 
a bit of a distraction compared to instead working on *any* of the 
highly rated ideas at http://postgresql.uservoice.com/forums/21853-general


As a random technical note, I would recommend that anyone who is 
thinking about a pooler in core take a look at how pgBouncer uses 
libevent to respond to requests, a design model inspired by that of 
memcached.  I haven't looked at it deeply yet, but my gut guess is that 
this proven successful model would be hard to graft on top of the 
existing PostgreSQL process design, and doing anything but that is 
unlikely to perform as well.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Craig Ringer
On 09/07/10 12:42, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.
 
 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?
 
 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

Admittedly I'm relatively ignorant of the details, but I increasingly
think PostgreSQL will need a different big architectural change in the
coming years, as the typical performance characteristics of machines
change:

It'll need to separate running queries from running processes, or
start threading backends, so that one way or the other a single query
can benefit from the capabilities of multiple CPUs. The same separation,
or a move to async I/O, might be needed to get one query to concurrently
read multiple partitions of a table, or otherwise get maximum benefit
from high-capacity I/O subsystems when running just a few big, expensive
queries.

Otherwise I'm wondering if PostgreSQL will begin really suffering in
performance on workloads where queries are big and expensive but there
are relatively few of them running at a time.

My point? *if* I'm not full of hot air and there's some truth to my
blather above, any change like that might be accompanied by a move to
separate query execution state from connection state, so that idle
connections have a much lower resource cost.

OK, that's my hand-waving for the day done.

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Harpreet singh Wadhwa
Thanx you all for the replies.
I got a gist on where should I head towards
like I should rely a bit on postgres for performance and rest on my
tomcat and application.
And will try connection pooling on postgres part.

And if I come back for any query (related to this topic) then this
time it will be more precise (with real time data of my testing). ;-)

Regards
haps

On Fri, Jul 9, 2010 at 1:22 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 09/07/10 12:42, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?

 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

 Admittedly I'm relatively ignorant of the details, but I increasingly
 think PostgreSQL will need a different big architectural change in the
 coming years, as the typical performance characteristics of machines
 change:

 It'll need to separate running queries from running processes, or
 start threading backends, so that one way or the other a single query
 can benefit from the capabilities of multiple CPUs. The same separation,
 or a move to async I/O, might be needed to get one query to concurrently
 read multiple partitions of a table, or otherwise get maximum benefit
 from high-capacity I/O subsystems when running just a few big, expensive
 queries.

 Otherwise I'm wondering if PostgreSQL will begin really suffering in
 performance on workloads where queries are big and expensive but there
 are relatively few of them running at a time.

 My point? *if* I'm not full of hot air and there's some truth to my
 blather above, any change like that might be accompanied by a move to
 separate query execution state from connection state, so that idle
 connections have a much lower resource cost.

 OK, that's my hand-waving for the day done.

 --
 Craig Ringer

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Craig Ringer

Otherwise I'm wondering if PostgreSQL will begin really suffering in
performance on workloads where queries are big and expensive but there
are relatively few of them running at a time.


Oh, I should note at this point that I'm *not* whining that someone 
should volunteer to do this, or that the postgresql project should 
just make it happen.


I'm fully aware that Pg is a volunteer project and that even if these 
speculations were in a vaguely reasonable direction, that doesn't mean 
anyone has the time/skills/knowledge/interest to undertake such major 
architectural change. I certainly know I have zero right to ask/expect 
anyone to - I'm very, very grateful to all those who already spend time 
helping out and enhancing Pg. With particular props to Tom Lane for 
patience on the -general list and heroic bug-fixing persistence.


Sorry for the rely-to-self, I just realized my post could've been taken 
as a whine about Pg's architecture and some kind of demand that someone 
do something about it. That couldn't be further from my intent.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Brad Nicholson
On Fri, 2010-07-09 at 00:42 -0400, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
  On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
  cr...@postnewspapers.com.au wrote:
  If you're not using a connection pool, start using one.
 
  I see this issue and subsequent advice cross this list awfully
  frequently.  Is there in architectural reason why postgres itself
  cannot pool incoming connections in order to eliminate the requirement
  for an external pool?
 
 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

This sounds similar to the approach to taken with Replication for years
before being moved into core.

Just like replication, pooling has different approaches.  I do think
that in both cases, having a solution that works, easily, out of the
box will meet the needs of most users.

There is also the issue of perception/adoption here as well.  One of my
colleagues mentioned that at PG East that he repeatedly heard people
talking (negatively) about the over reliance on add-on packages to deal
with core DB functionality.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Harpreet singh Wadhwa
Thanx you all for the replies.
I got a gist on where should I head towards
like I should rely a bit on postgres for performance and rest on my
tomcat and application.
And will try connection pooling on postgres part.

And if I come back for any query (related to this topic) then this
time it will be more precise (with real time data of my testing). ;-)

Regards
haps

On Fri, Jul 9, 2010 at 1:22 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 09/07/10 12:42, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?

 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

 Admittedly I'm relatively ignorant of the details, but I increasingly
 think PostgreSQL will need a different big architectural change in the
 coming years, as the typical performance characteristics of machines
 change:

 It'll need to separate running queries from running processes, or
 start threading backends, so that one way or the other a single query
 can benefit from the capabilities of multiple CPUs. The same separation,
 or a move to async I/O, might be needed to get one query to concurrently
 read multiple partitions of a table, or otherwise get maximum benefit
 from high-capacity I/O subsystems when running just a few big, expensive
 queries.

 Otherwise I'm wondering if PostgreSQL will begin really suffering in
 performance on workloads where queries are big and expensive but there
 are relatively few of them running at a time.

 My point? *if* I'm not full of hot air and there's some truth to my
 blather above, any change like that might be accompanied by a move to
 separate query execution state from connection state, so that idle
 connections have a much lower resource cost.

 OK, that's my hand-waving for the day done.

 --
 Craig Ringer

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Kevin Grittner
Brad Nicholson bnich...@ca.afilias.info wrote:
 
 Just like replication, pooling has different approaches.  I do
 think that in both cases, having a solution that works, easily,
 out of the box will meet the needs of most users.
 
Any thoughts on the minimalist solution I suggested a couple weeks
ago?:
 
http://archives.postgresql.org/pgsql-hackers/2010-06/msg01385.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg01387.php
 
So far, there has been no comment by anyone
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Matthew Wakeling

On Fri, 9 Jul 2010, Kevin Grittner wrote:

Any thoughts on the minimalist solution I suggested a couple weeks
ago?:

http://archives.postgresql.org/pgsql-hackers/2010-06/msg01385.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg01387.php

So far, there has been no comment by anyone


Interesting idea. As far as I can see, you are suggesting solving the too 
many connections problem by allowing lots of connections, but only 
allowing a certain number to do anything at a time?


A proper connection pool provides the following advantages over this:

1. Pool can be on a separate machine or machines, spreading load.
2. Pool has a lightweight footprint per connection, whereas Postgres
   doesn't.
3. A large amount of the overhead is sometimes connection setup, which
   this would not solve. A pool has cheap setup.
4. This could cause Postgres backends to be holding onto large amounts of
   memory while being prevented from doing anything, which is a bad use of
   resources.
5. A fair amount of the overhead is caused by context-switching between
   backends. The more backends, the less useful any CPU caches.
6. There are some internal workings of Postgres that involve keeping all
   the backends informed about something going on. The more backends, the
   greater this overhead is. (This was pretty bad with the sinval queue
   overflowing a while back, but a bit better now. It still causes some
   overhead).
7. That lock would have a metric *($!-load of contention.

Matthew

--
Unfortunately, university regulations probably prohibit me from eating
small children in front of the lecture class.
   -- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Kevin Grittner
In case there's any doubt, the questions below aren't rhetorical.
 
Matthew Wakeling matt...@flymine.org wrote:
 
 Interesting idea. As far as I can see, you are suggesting solving
 the too many connections problem by allowing lots of connections,
 but only allowing a certain number to do anything at a time?
 
Right.
 
 A proper connection pool provides the following advantages over
 this:
 
 1. Pool can be on a separate machine or machines, spreading load.
 
Sure, but how would you do that with a built-in implementation?
 
 2. Pool has a lightweight footprint per connection, whereas
Postgres doesn't.
 
I haven't compared footprint of, say, a pgpool connection on the
database server to that of an idle PostgreSQL connection.  Do you
have any numbers?
 
 3. A large amount of the overhead is sometimes connection setup,
which this would not solve. A pool has cheap setup.
 
This would probably be most useful where the client held a
connection for a long time, not for the login for each database
transaction approach.  I'm curious how often you think application
software uses that approach.
 
 4. This could cause Postgres backends to be holding onto large
amounts of memory while being prevented from doing anything,
which is a bad use of resources.
 
Isn't this point 2 again?  If not, what are you getting at?  Again,
do you have numbers for the comparison, assuming the connection
pooler is running on the database server?
 
 5. A fair amount of the overhead is caused by context-switching
between backends. The more backends, the less useful any CPU
caches.
 
Would this be true while a backend was blocked?  Would this not be
true for a connection pool client-side connection?
 
 6. There are some internal workings of Postgres that involve
keeping all the backends informed about something going on. The
more backends, the greater this overhead is. (This was pretty
bad with the sinval queue overflowing a while back, but a bit
better now. It still causes some overhead).
 
Hmmm...  I hadn't thought about that.  Again, any numbers (e.g.,
profile information) on this?
 
 7. That lock would have a metric *($!-load of contention.
 
Here I doubt you.  It would be held for such short periods that I
suspect that collisions would be relatively infrequent compared to
some of the other locks we use.  As noted in the email, it may
actually normally be an increment and test within an existing
locked block.  Also, assuming that any built in connection pool
would run on the database server, why would you think the contention
for this would be worse than for whatever is monitoring connection
count in the pooler?
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Jorge Montero
If your app is running under Tomcat, connection pooling is extremely easy to 
set up from there: It has connection pooling mechanisms built in. Request your 
db connections using said mechanisms, instead of doing it manually, make a 
couple of changes to server.xml, and the problem goes away. Hundreds, if not 
thousands of concurrent users might end up running with less than 10 
connections.
 


 Harpreet singh Wadhwa harpr...@openbravo.com 7/9/2010 3:55 AM 
Thanx you all for the replies.
I got a gist on where should I head towards
like I should rely a bit on postgres for performance and rest on my
tomcat and application.
And will try connection pooling on postgres part.

And if I come back for any query (related to this topic) then this
time it will be more precise (with real time data of my testing). ;-)

Regards
haps

On Fri, Jul 9, 2010 at 1:22 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 On 09/07/10 12:42, Tom Lane wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?

 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

 Admittedly I'm relatively ignorant of the details, but I increasingly
 think PostgreSQL will need a different big architectural change in the
 coming years, as the typical performance characteristics of machines
 change:

 It'll need to separate running queries from running processes, or
 start threading backends, so that one way or the other a single query
 can benefit from the capabilities of multiple CPUs. The same separation,
 or a move to async I/O, might be needed to get one query to concurrently
 read multiple partitions of a table, or otherwise get maximum benefit
 from high-capacity I/O subsystems when running just a few big, expensive
 queries.

 Otherwise I'm wondering if PostgreSQL will begin really suffering in
 performance on workloads where queries are big and expensive but there
 are relatively few of them running at a time.

 My point? *if* I'm not full of hot air and there's some truth to my
 blather above, any change like that might be accompanied by a move to
 separate query execution state from connection state, so that idle
 connections have a much lower resource cost.

 OK, that's my hand-waving for the day done.

 --
 Craig Ringer

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance 


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Matthew Wakeling

On Fri, 9 Jul 2010, Kevin Grittner wrote:

Interesting idea. As far as I can see, you are suggesting solving
the too many connections problem by allowing lots of connections,
but only allowing a certain number to do anything at a time?


Right.


I think in some situations, this arrangement would be an advantage. 
However, I do not think it will suit the majority of situations, and could 
reduce the performance when the user doesn't need the functionality, 
either because they have a pool already, or they don't have many 
connections.


No, I don't have any numbers.


1. Pool can be on a separate machine or machines, spreading load.


Sure, but how would you do that with a built-in implementation?


That's my point exactly. If you have an external pool solution, you can 
put it somewhere else - maybe on multiple somewhere elses.



3. A large amount of the overhead is sometimes connection setup,
   which this would not solve. A pool has cheap setup.


This would probably be most useful where the client held a
connection for a long time, not for the login for each database
transaction approach.  I'm curious how often you think application
software uses that approach.


What you say is true. I don't know how often that is, but it seems to be 
those times that people come crying to the mailing list.



4. This could cause Postgres backends to be holding onto large
   amounts of memory while being prevented from doing anything,
   which is a bad use of resources.


Isn't this point 2 again?


Kind of. Yes. Point 2 was simple overhead. This point was that the backend 
may have done a load of query-related allocation, and then been stopped.



7. That lock would have a metric *($!-load of contention.


Here I doubt you.  It would be held for such short periods that I
suspect that collisions would be relatively infrequent compared to
some of the other locks we use.  As noted in the email, it may
actually normally be an increment and test within an existing
locked block.


Fair enough. It may be much less of a problem than I had previously 
thought.


Matthew

--
Change is inevitable, except from vending machines.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Kevin Grittner
Matthew Wakeling matt...@flymine.org wrote:
 On Fri, 9 Jul 2010, Kevin Grittner wrote:
 Interesting idea. As far as I can see, you are suggesting
 solving the too many connections problem by allowing lots of
 connections, but only allowing a certain number to do anything
 at a time?

 Right.
 
 I think in some situations, this arrangement would be an
 advantage.  However, I do not think it will suit the majority of
 situations, and could reduce the performance when the user doesn't
 need the functionality, either because they have a pool already,
 or they don't have many connections.
 
Oh, totally agreed, except that I think we can have essentially nil
impact if they don't exceed a configured limit.  In my experience,
pooling is more effective the closer you put it to the client.  I
suppose the strongest argument that could be made against building
in some sort of pooling is that it doesn't encourage people to look
for client-side solutions.  However, we seem to get a lot of posts
from people who don't do this, are not able to easily manage it, and
who would benefit from even a simple solution like this.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Jorge Montero
If anything was built in the database to handle such connections, I'd recommend 
a big, bold warning, recommending the use of client-side pooling if available. 
For something like, say, a web-server, pooling connections to the database 
provides a massive performance advantage regardless of how good the database is 
at handling way more active queries than the hardware can handle: The 
assignment of a connection to a thread tends to be at least an order of 
magnitude cheaper than establishing a new connection for each new thread, and 
destroying it when it dies. This is especially true if the client architecture 
relies in relatively short lived threads.
 
While there are a few cases where pooling is counter productive, this only 
happens in relatively few scenarios. This is why every java application server 
out there wil strongly recommend using its own facilities to connect to a 
database: The performance is almost always better, and it provides less 
headaches to the DBAs.
 
Now, if remote clients are accessing your database directly, setting up a pool 
inbetween might not be as straightforward or give you the same gains across the 
board, and that might be the only case where letting the db do its own pooling 
makes sense.

 Kevin Grittner kevin.gritt...@wicourts.gov 7/9/2010 12:52 PM 
Matthew Wakeling matt...@flymine.org wrote:
 On Fri, 9 Jul 2010, Kevin Grittner wrote:
 Interesting idea. As far as I can see, you are suggesting
 solving the too many connections problem by allowing lots of
 connections, but only allowing a certain number to do anything
 at a time?

 Right.
 
 I think in some situations, this arrangement would be an
 advantage.  However, I do not think it will suit the majority of
 situations, and could reduce the performance when the user doesn't
 need the functionality, either because they have a pool already,
 or they don't have many connections.

Oh, totally agreed, except that I think we can have essentially nil
impact if they don't exceed a configured limit.  In my experience,
pooling is more effective the closer you put it to the client.  I
suppose the strongest argument that could be made against building
in some sort of pooling is that it doesn't encourage people to look
for client-side solutions.  However, we seem to get a lot of posts
from people who don't do this, are not able to easily manage it, and
who would benefit from even a simple solution like this.

-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Kevin Grittner
Jorge Montero jorge_mont...@homedecorators.com wrote:
 
 If anything was built in the database to handle such connections,
 I'd recommend a big, bold warning, recommending the use of client-
 side pooling if available.
 
+1
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Robert Haas
On Fri, Jul 9, 2010 at 12:42 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?

 Perhaps not, but there's no obvious benefit either.  Since there's
 More Than One Way To Do It, it seems more practical to keep that as a
 separate problem that can be solved by a choice of add-on packages.

I'm not buying it.  A separate connection pooler increases overhead
and management complexity, and, I believe, limits our ability to
implement optimizations like parallel query execution.  I'm glad there
are good ones available, but the fact that they're absolutely
necessary for good performance in some environments is not a feature.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Greg Smith

Matthew Wakeling wrote:
If you have an external pool solution, you can put it somewhere else - 
maybe on multiple somewhere elses.


This is the key point to observe:  if you're at the point where you have 
so many connections that you need a pool, the last place you want to put 
that is on the overloaded database server itself.  Therefore, it must be 
an external piece of software to be effective, rather than being part of 
the server itself.  Database servers are relatively expensive computing 
hardware due to size/quantity/quality of disks required.  You can throw 
a pooler (or poolers) on any cheap 1U server.  This is why a built-in 
pooler, while interesting, is not particularly functional for how people 
normally scale up real-world deployments.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Kevin Grittner
Greg Smith g...@2ndquadrant.com wrote:
 
 if you're at the point where you have so many connections that you
 need a pool, the last place you want to put that is on the
 overloaded database server itself.  Therefore, it must be an
 external piece of software to be effective, rather than being part
 of the server itself.
 
It *is* the last place you want to put it, but putting it there can
be much better than not putting it *anywhere*, which is what we've
often seen.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Mark Kirkwood

On 10/07/10 00:56, Brad Nicholson wrote:

On Fri, 2010-07-09 at 00:42 -0400, Tom Lane wrote:
   


Perhaps not, but there's no obvious benefit either.  Since there's
More Than One Way To Do It, it seems more practical to keep that as a
separate problem that can be solved by a choice of add-on packages.
 

This sounds similar to the approach to taken with Replication for years
before being moved into core.

Just like replication, pooling has different approaches.  I do think
that in both cases, having a solution that works, easily, out of the
box will meet the needs of most users.

There is also the issue of perception/adoption here as well.  One of my
colleagues mentioned that at PG East that he repeatedly heard people
talking (negatively) about the over reliance on add-on packages to deal
with core DB functionality.

   


It would be interesting to know more about what they thought an 'over 
reliance' was and which packages they meant.


While clearly in the case of replication something needed to be done to 
make it better and easier, it is not obvious that the situation with 
connection pools is analogous. For instance we make extensive use of 
PgBouncer, and it seems to do the job fine and is ridiculously easy to 
install and setup. So would having (something like) this in core be an 
improvement? Clearly if the 'in core' product is better then it is 
desirable... similarly if the packaged product is better... well let's 
have that then!


I've certainly observed a 'fear of package installation' on the part of 
some folk, which is often a hangover from the 'Big IT shop' mentality 
where it requires blood signatures and child sacrifice to get anything 
new installed.


regards

Mark

P.s Also note that Database Vendors like pooling integrated in the core 
of *their* product because it is another thing to charge a license for. 
Unfortunately this can also become an entrenched mentality of 'must be 
in core' on the part of consultants etc!


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Need help in performance tuning.

2010-07-08 Thread Harpreet singh Wadhwa
Hi,


I want to fine tune my postgresql to increase number of connects it
can handle in a minutes time.
Decrease the response time per request etc.
The exact case will be to handle around 100 concurrent requests.

Can any one please help me in this.
Any hardware suggestions are also welcomed.


Regards
Harpreet

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-08 Thread Kevin Grittner
Harpreet singh Wadhwa harpreetsingh.wad...@gmail.com wrote:
 
 I want to fine tune my postgresql to increase number of connects
 it can handle in a minutes time.
 Decrease the response time per request etc.
 The exact case will be to handle around 100 concurrent requests.
 
I have found that connection pooling is crucial.
 
The concurrent requests phrase worries me a bit -- you should be
focusing more on concurrent connections and perhaps requests per
second.  With most hardware, you will get faster response time and
better overall throughput by funneling 100 connections through a
connection pool which limits the number of concurrent requests to
just enough to keep all your hardware resources busy, queuing any
requests beyond that for submission when a pending request
completes.
 
 Any hardware suggestions are also welcomed.
 
If you don't have the hardware yet, you'd need to provide a bit more
information to get advice on what hardware you need.
 
-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-08 Thread Craig Ringer

On 9/07/2010 3:20 AM, Harpreet singh Wadhwa wrote:

Hi,


I want to fine tune my postgresql to increase number of connects it
can handle in a minutes time.
Decrease the response time per request etc.
The exact case will be to handle around 100 concurrent requests.


If you're not using a connection pool, start using one.

Do you really need 100 *active* working query threads at one time? 
Because if you do, you're going to need a scary-big disk subsystem and a 
lot of processors.


Most people actually only need a few queries executing simultaneously, 
they just need lots of connections to the database open concurrently 
and/or lots of queries queued up for processing. For that purpose, a 
connection pool is ideal.


You will get BETTER performance from postgresql with FEWER connections 
to the real database that're all doing active work. If you need lots 
and lots of connections you should use a connection pool to save the 
main database the overhead of managing that.


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-08 Thread Samuel Gendler
On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 Do you really need 100 *active* working query threads at one time? Because
 if you do, you're going to need a scary-big disk subsystem and a lot of
 processors.

I see this issue and subsequent advice cross this list awfully
frequently.  Is there in architectural reason why postgres itself
cannot pool incoming connections in order to eliminate the requirement
for an external pool?

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Need help in performance tuning.

2010-07-08 Thread Tom Lane
Samuel Gendler sgend...@ideasculptor.com writes:
 On Thu, Jul 8, 2010 at 8:11 PM, Craig Ringer
 cr...@postnewspapers.com.au wrote:
 If you're not using a connection pool, start using one.

 I see this issue and subsequent advice cross this list awfully
 frequently.  Is there in architectural reason why postgres itself
 cannot pool incoming connections in order to eliminate the requirement
 for an external pool?

Perhaps not, but there's no obvious benefit either.  Since there's
More Than One Way To Do It, it seems more practical to keep that as a
separate problem that can be solved by a choice of add-on packages.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance