subject:"\[PERFORM\] Pooling in Core WAS\: Need help in performance tuning."

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-29 Thread Josh Berkus




introduce bugs.  If we can get to the point where we have something to
play around with, even if it's kind of kludgey or doesn't quite work,
it'll give us some idea of whether further effort is worthwhile and
how it should be directed.


Should I put this on the TODO list,  then, in hopes that someone steps 
forward?


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Robert Haas

On Wed, Jul 28, 2010 at 4:10 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> However, if we don't support that, we can't do any sort of pooling-ish
>> thing without the ability to pass file descriptors between processes;
>> and Tom seems fairly convinced there's no portable way to do that.
>
> Well, what it would come down to is: are we prepared to not support
> pooling on platforms without such a capability?  It's certainly possible
> to do it on many modern platforms, but I don't believe we can make it
> happen everywhere.  Generally we've tried to avoid having major features
> that don't work everywhere ...

I suppose it depends on the magnitude of the benefit.  And how many
platforms aren't covered.  And how much code is required.  In short,
until someone writes a patch, who knows?  I think the core question we
should be thinking about is what would be the cleanest method of
resetting a backend - either for the same database or for a different
one, whichever seems easier.  And by cleanest, I mean least likely to
introduce bugs.  If we can get to the point where we have something to
play around with, even if it's kind of kludgey or doesn't quite work,
it'll give us some idea of whether further effort is worthwhile and
how it should be directed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Andres Freund

On Wed, Jul 28, 2010 at 04:10:08PM -0400, Tom Lane wrote:
> Robert Haas  writes:
> > However, if we don't support that, we can't do any sort of pooling-ish
> > thing without the ability to pass file descriptors between processes;
> > and Tom seems fairly convinced there's no portable way to do that.
>
> Well, what it would come down to is: are we prepared to not support
> pooling on platforms without such a capability?  It's certainly possible
> to do it on many modern platforms, but I don't believe we can make it
> happen everywhere.  Generally we've tried to avoid having major features
> that don't work everywhere ...
Which platforms do you have in mind here? All of the platforms I found
documented to be supported seem to support at least one of SCM_RIGHTS,
WSADuplicateSocket or STREAMS/FD_INSERT.
Most if not all beside windows support SCM_RIGHTS. The ones I am
dubious about support FD_INSERT...

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Tom Lane

Robert Haas  writes:
> However, if we don't support that, we can't do any sort of pooling-ish
> thing without the ability to pass file descriptors between processes;
> and Tom seems fairly convinced there's no portable way to do that.

Well, what it would come down to is: are we prepared to not support
pooling on platforms without such a capability?  It's certainly possible
to do it on many modern platforms, but I don't believe we can make it
happen everywhere.  Generally we've tried to avoid having major features
that don't work everywhere ...

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Robert Haas

On Wed, Jul 28, 2010 at 3:44 PM, Josh Berkus  wrote:
> On 7/27/10 6:56 PM, Tom Lane wrote:
>> Yeah, if it weren't for that I'd say "sure let's try it".  But I'm
>> afraid we'd be introducing significant headaches in return for a gain
>> that's quite speculative.
>
> Well, the *gain* isn't speculative.  For example, I am once again
> dealing with the issue that PG backend processes on Solaris never give
> up their RAM, resulting in pathological swapping situations if you have
> many idle connections.  This requires me to install pgpool, which is
> overkill (since it has load balancing, replication, and more) just to
> make sure that connections get recycled so that I don't have 300 idle
> connections eating up 8GB of RAM.
>
> Relative to switching databases, I'd tend to say that, like pgbouncer
> and pgpool, we don't need to support that.  Each user/database combo can
> have their own "pool".  While not ideal, this would be good enough for
> 90% of users.

However, if we don't support that, we can't do any sort of pooling-ish
thing without the ability to pass file descriptors between processes;
and Tom seems fairly convinced there's no portable way to do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Josh Berkus

On 7/27/10 6:56 PM, Tom Lane wrote:
> Yeah, if it weren't for that I'd say "sure let's try it".  But I'm
> afraid we'd be introducing significant headaches in return for a gain
> that's quite speculative.

Well, the *gain* isn't speculative.  For example, I am once again
dealing with the issue that PG backend processes on Solaris never give
up their RAM, resulting in pathological swapping situations if you have
many idle connections.  This requires me to install pgpool, which is
overkill (since it has load balancing, replication, and more) just to
make sure that connections get recycled so that I don't have 300 idle
connections eating up 8GB of RAM.

Relative to switching databases, I'd tend to say that, like pgbouncer
and pgpool, we don't need to support that.  Each user/database combo can
have their own "pool".  While not ideal, this would be good enough for
90% of users.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-28 Thread Vitalii Tymchyshyn


28.07.10 04:56, Tom Lane написав(ла):


I'm not asserting it's true, just suggesting it's entirely possible.
Other than the fork() cost itself and whatever authentication activity
there might be, practically all the startup cost of a new backend can be
seen as cache-populating activities.  You'd have to redo all of that,
*plus* pay the costs of getting rid of the previous cache entries.
Maybe the latter costs less than a fork(), or maybe not.  fork() is
pretty cheap on modern Unixen.

   
Actually as for me, the problem is that one can't raise number of 
database connections high without overloading CPU/memory/disk, so 
external pooling is needed. If postgresql had something like 
max_active_queries setting that limit number of connections that are not 
in IDLE [in transaction] state, one could raise max connections high 
(and I don't think idle process itself has much overhead) and limit 
max_active_queries to get maximum performance and won't use external 
pooling. Of course this won't help if the numbers are really high, but 
could work out the most common cases.


Best regards, Vitalii Tymchyshyn

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Craig Ringer

On 28/07/10 04:40, Tom Lane wrote:
> Robert Haas  writes:
>> On Thu, Jul 22, 2010 at 5:29 PM, Andres Freund  wrote:
 The problem is harder for us because a backend can't switch identities
 once it's been assigned to a database.  I haven't heard an adequate
 explanation of why that couldn't be changed, though.
> 
>>> Possibly it might decrease the performance significantly enough by
>>> reducing the cache locality (syscache, prepared plans)?
> 
>> Those things are backend-local.  The worst case scenario is you've got
>> to flush them all when you reinitialize, in which case you still save
>> the overhead of creating a new process.
> 
> "Flushing them all" is not zero-cost; it's not too hard to believe that
> it could actually be slower than forking a clean new backend.
> 
> What's much worse, it's not zero-bug.  We've got little bitty caches
> all over the backend, including (no doubt) some caching behavior in
> third-party code that wouldn't get the word about whatever API you
> invented to deal with this.

In addition to caches, there may be some places where memory is just
expected to leak. Since it's a one-off allocation, nobody really cares;
why bother cleaning it up when it's quicker to just let the OS do it
when the backend terminates?

Being able to reset a backend for re-use would require that per-session
memory use be as neatly managed as per-query and per-transaction memory,
with no leaked stragglers left lying around.

Such cleanup (and management) has its own costs. Plus, you have a
potentially increasingly fragmented memory map to deal with the longer
the backend lives. Overall, there are plenty of advantages to letting
the OS clean it up.

... however, if the requirement is introduced that a given backend may
only be re-used for connections to the same database, lots of things get
simpler. You have to be able to change the current user (which would be
a bonus anyway), reset GUCs, etc, but how much else is there to do?

That way you can maintain per-database pools of idle workers (apache
prefork style) with ageing-out of backends that're unused. Wouldn't this
do the vast majority of what most pools are needed for anyway? And
wouldn't it potentially save quite a bit of load by avoiding having
backends constantly switching databases, flushing caches, etc?

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Tom Lane

Robert Haas  writes:
> On Tue, Jul 27, 2010 at 9:56 PM, Tom Lane  wrote:
>> Other than the fork() cost itself and whatever authentication activity
>> there might be, practically all the startup cost of a new backend can be
>> seen as cache-populating activities.  You'd have to redo all of that,
>> *plus* pay the costs of getting rid of the previous cache entries.
>> Maybe the latter costs less than a fork(), or maybe not.  fork() is
>> pretty cheap on modern Unixen.

> I agree that the gain is minimal of itself; after all, how often do
> you need to switch databases, and what's the big deal if the
> postmaster has to fork a new backend?  Where I see it as a potentially
> big win is when it comes to things like parallel query.  I can't help
> thinking that's going to be a lot less efficient if you're forever
> forking new backends.

Color me unconvinced.  To do parallel queries with pre-existing worker
processes, you'd need to hook up with a worker that was (at least) in
your own database, and then somehow feed it the query plan that it needs
to execute.  I'm thinking fork() could easily be cheaper.  But obviously
this is all speculation (... and Windows is going to be a whole 'nother
story in any case ...)

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Robert Haas

On Tue, Jul 27, 2010 at 9:56 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, Jul 27, 2010 at 4:40 PM, Tom Lane  wrote:
>>> "Flushing them all" is not zero-cost; it's not too hard to believe that
>>> it could actually be slower than forking a clean new backend.
>
>> I'm not so sure I believe that.
>
> I'm not asserting it's true, just suggesting it's entirely possible.
> Other than the fork() cost itself and whatever authentication activity
> there might be, practically all the startup cost of a new backend can be
> seen as cache-populating activities.  You'd have to redo all of that,
> *plus* pay the costs of getting rid of the previous cache entries.
> Maybe the latter costs less than a fork(), or maybe not.  fork() is
> pretty cheap on modern Unixen.
>
>>> What's much worse, it's not zero-bug.
>
>> I think this is probably the biggest issue with the whole idea, and I
>> agree there would be some pain involved.
>
> Yeah, if it weren't for that I'd say "sure let's try it".  But I'm
> afraid we'd be introducing significant headaches in return for a gain
> that's quite speculative.

I agree that the gain is minimal of itself; after all, how often do
you need to switch databases, and what's the big deal if the
postmaster has to fork a new backend?  Where I see it as a potentially
big win is when it comes to things like parallel query.  I can't help
thinking that's going to be a lot less efficient if you're forever
forking new backends.  Perhaps the point here is that you'd actually
sort of like to NOT flush all those caches unless it turns out that
you're switching databases - many installations probably operate with
essentially one big database, so chances are good that even if you
distributed connections / parallel queries to backends round-robin,
you'd potentially save quite a bit of overhead.  Of course, for the
guy who has TWO big databases, you might hope to be a little smarter,
but that's another problem altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Tom Lane

Robert Haas  writes:
> On Tue, Jul 27, 2010 at 4:40 PM, Tom Lane  wrote:
>> "Flushing them all" is not zero-cost; it's not too hard to believe that
>> it could actually be slower than forking a clean new backend.

> I'm not so sure I believe that.

I'm not asserting it's true, just suggesting it's entirely possible.
Other than the fork() cost itself and whatever authentication activity
there might be, practically all the startup cost of a new backend can be
seen as cache-populating activities.  You'd have to redo all of that,
*plus* pay the costs of getting rid of the previous cache entries.
Maybe the latter costs less than a fork(), or maybe not.  fork() is
pretty cheap on modern Unixen.

>> What's much worse, it's not zero-bug.

> I think this is probably the biggest issue with the whole idea, and I
> agree there would be some pain involved.

Yeah, if it weren't for that I'd say "sure let's try it".  But I'm
afraid we'd be introducing significant headaches in return for a gain
that's quite speculative.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Robert Haas

On Tue, Jul 27, 2010 at 4:40 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Jul 22, 2010 at 5:29 PM, Andres Freund  wrote:
 The problem is harder for us because a backend can't switch identities
 once it's been assigned to a database.  I haven't heard an adequate
 explanation of why that couldn't be changed, though.
>
>>> Possibly it might decrease the performance significantly enough by
>>> reducing the cache locality (syscache, prepared plans)?
>
>> Those things are backend-local.  The worst case scenario is you've got
>> to flush them all when you reinitialize, in which case you still save
>> the overhead of creating a new process.
>
> "Flushing them all" is not zero-cost; it's not too hard to believe that
> it could actually be slower than forking a clean new backend.

I'm not so sure I believe that.  Is a sinval overrun slower than
forking a clean new backend?  Is DISCARD ALL slower that forking a
clean new backend?  How much white space is there between either of
those and what would be needed here?  I guess it could be slower, but
I wouldn't want to assume that without evidence.

> What's much worse, it's not zero-bug.  We've got little bitty caches
> all over the backend, including (no doubt) some caching behavior in
> third-party code that wouldn't get the word about whatever API you
> invented to deal with this.

I think this is probably the biggest issue with the whole idea, and I
agree there would be some pain involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Tom Lane

Robert Haas  writes:
> On Thu, Jul 22, 2010 at 5:29 PM, Andres Freund  wrote:
>>> The problem is harder for us because a backend can't switch identities
>>> once it's been assigned to a database.  I haven't heard an adequate
>>> explanation of why that couldn't be changed, though.

>> Possibly it might decrease the performance significantly enough by
>> reducing the cache locality (syscache, prepared plans)?

> Those things are backend-local.  The worst case scenario is you've got
> to flush them all when you reinitialize, in which case you still save
> the overhead of creating a new process.

"Flushing them all" is not zero-cost; it's not too hard to believe that
it could actually be slower than forking a clean new backend.

What's much worse, it's not zero-bug.  We've got little bitty caches
all over the backend, including (no doubt) some caching behavior in
third-party code that wouldn't get the word about whatever API you
invented to deal with this.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Robert Haas

On Sat, Jul 24, 2010 at 2:23 AM, Craig Ringer
 wrote:
> On 24/07/10 01:28, Robert Haas wrote:
>
>> Well, if we could change the backends so that they could fully
>> reinitialize themselves (disconnect from a database to which they are
>> bound, etc.), I don't see why we couldn't use the Apache approach.
>
> This would offer the bonus on the side that it'd be more practical to
> implement database changes for a connection, akin to MySQL's "USE".
> Inefficient, sure, but possible.

Yep.

> I don't care about that current limitation very much. I think anyone
> changing databases all the time probably has the wrong design and should
> be using schema. I'm sure there are times it'd be good to be able to
> switch databases on one connection, though.

I pretty much agree with this.  I think this is merely slightly nice
on its own, but I think it might be a building-block to other things
that we might want to do down the road.  Markus Wanner's Postgres-R
replication uses worker processes; autovacuum does as well; and then
there's parallel query.  I can't help thinking that not needing to
fork a new backend every time you want to connect to a new database
has got to be useful.

> My question with all this remains: is it worth the effort when external
> poolers already solve the problem.

Whether it's worth the effort is something anyone who is thinking
about working on this will have to decide for themselves.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Hannu Krosing

On Sat, 2010-07-24 at 14:36 +0800, Craig Ringer wrote:
> On 24/07/10 13:23, Greg Smith wrote:
> > Joshua Tolley wrote:
> >> Relatively minor, but it would be convenient to avoid having to query
> >> $external_pooler to determine the client_addr of an incoming connection.
> >>   
> > 
> > You suggest this as a minor concern, but I consider it to be one of the
> > most compelling arguments in favor of in-core pooling.  A constant pain
> > with external poolers is the need to then combine two sources of data in
> > order to track connections fully, which is something that everyone runs
> > into eventually and finds annoying.  It's one of the few things that
> > doesn't go away no matter how much fiddling you do with pgBouncer, it's
> > always getting in the way a bit.  And it seems to seriously bother
> > systems administrators and developers, not just the DBAs.
> 
> 
> Putting a pooler in core won't inherently fix this, and won't remove the
> need to solve it for cases where the pooler can't be on the same machine.
> 
> 9.0 has application_name to let apps identify themselves. Perhaps a
> "pooled_client_ip", to be set by a pooler rather than the app, could be
> added to address this problem in a way that can be used by all poolers
> new and existing, not just any new in-core pooling system.
> 
> If a privileged set of pooler functions is was considered, as per my
> other recent mail, the pooler could use a management connection to set
> the client ip before handing the connection to the client, so the client
> couldn't change pooled_client_ip its self by accident or through malice.
> But even without that, it'd be awfully handy.

Or maybe we can add some command extensions to the protocol for passing
extra info, so that instead of sending just the "(run_query:query)"
command over socket we could send both the extra info and execute
"(set_params:(proxy_client_ip:a.b.c.d)(proxy_client_post:n)(something
else))(run_query:query)" in one packet (for performance) and have these
things be available in logging and pg_stat_activity

I see no need to try to somehow restrict these if you can always be sure
that they are set by the direct client. proxy can decide to pass some of
these from the real client but it would be a decision made by proxy, not
mandated by some proxying rules.






-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-27 Thread Hannu Krosing

On Fri, 2010-07-23 at 09:52 -0700, Joshua D. Drake wrote:
> On Thu, 2010-07-22 at 20:56 +0100, Hannu Krosing wrote:
> >  
> > > Let's extend this shall we:
> > > 
> > > Avoid adding yet another network hop
> > 
> > postgreSQL is multi-process, so you either have a separate "pooler
> > process" or need to put pooler functionality in postmaster, bothw ways
> > you still have a two-hop scenario for connect. you may be able to pass
> > the socket to child process and also keep it, but doing this for both
> > client and db sides seems really convoluted. 
> 
> Which means, right now there is three hops. Reducing one is good.

No, it is still two, as postmaster passes the socket to spwaned child
postgresql process after login. 

the process is as follows

Client --connects--> postmaster --spawns--> postgreSQL server process

then socket is passed to be used directly so the use is

Client --talks-to---> postgreSQL server process

when using spooler it becomes

Client --connects-to--> Spooler --passes-requests-to-->  postgreSQL 

I see no way to have spooler select the postgreSQL process, pass the
client connection in a way that taks directly to postgrSQL server
process AND be able to get the server connection back once the client is
finishe with either the request, transaction or connection (depending on
pooling mode).

> 
> > Or is there a prortable way to pass sockets back and forth between
> > parent and child processes ?
> > 
> > If so, then pgbouncer could use it as well.
> > 
> > > Remove of a point of failure
> > 
> > rather move the point of failure from external pooler to internal
> > pooler ;)
> 
> Yes but at that point, it doesn't matter. 
> 
> > 
> > > Reduction of administrative overhead
> > 
> > Possibly. But once you start actually using it, you still need to
> > configure and monitor it and do other administrator-y tasks.
> 
> Yes, but it is inclusive.
> 
> > 
> > > Integration into our core authentication mechanisms
> > 
> > True, although for example having SSL on client side connection will be
> > so slow that it hides any performance gains from pooling, at least for
> > short-lived connections.
> 
> Yes, but right now you can't use *any* pooler with LDAP for example. We
> could if pooling was in core. Your SSL argument doesn't really work
> because its true with or without pooling.

As main slowdown in SSL is connection setup, so you can get the network
security and pooling speedup if you run pool on client side and make the
pooler-server connection over SSL.

> > > Greater flexibility in connection control
> > 
> > Yes, poolers can be much more flexible than default postgresql. See for
> > example pgbouncers PAUSE , RECONFIGURE and RESUME commands 
> 
> :D
> 
> > 
> > > And, having connection pooling in core does not eliminate the use of an
> > > external pool where it makes since.
> > 
> > Probably the easiest way to achieve "pooling in core" would be adding an
> > option to start pgbouncer under postmaster control.
> 
> Yeah but that won't happen. 

I guess it could happen as part of opening up the "postgresql controlled
process" part to be configurable and able to run third party stuff. 

Another thing to run under postmaster control would be pgqd . 

> Also I think we may have a libevent
> dependency that we have to work out.
> 
> > 
> > You probably can't get much leaner than pgbouncer.
> 
> Oh don't get me wrong. I love pgbouncer. It is my recommended pooler but
> even it has limitations (such as auth).

As pgbouncer is single-threaded and the main goal has been performance
there is not much enthusiasm about having _any_ auth method included
which cant be completed in a few cpu cycles. It may be possible to add
threads to wait for LDAP/Kerberos/... response or do SSL handshakes, but
i have not seen any interest from Marko to do it himself.

Maybe there is a way to modularise the auth part of postmaster in a way
that could be used from third party products through some nice API which
postmaster-controlled pgbouncer can start using.

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-25 Thread Dimitri Fontaine

Craig Ringer  writes:
> 9.0 has application_name to let apps identify themselves. Perhaps a
> "pooled_client_ip", to be set by a pooler rather than the app, could be
> added to address this problem in a way that can be used by all poolers
> new and existing, not just any new in-core pooling system.

X-Forwarded-For ?

-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Craig Ringer

On 24/07/10 13:23, Greg Smith wrote:
> Joshua Tolley wrote:
>> Relatively minor, but it would be convenient to avoid having to query
>> $external_pooler to determine the client_addr of an incoming connection.
>>   
> 
> You suggest this as a minor concern, but I consider it to be one of the
> most compelling arguments in favor of in-core pooling.  A constant pain
> with external poolers is the need to then combine two sources of data in
> order to track connections fully, which is something that everyone runs
> into eventually and finds annoying.  It's one of the few things that
> doesn't go away no matter how much fiddling you do with pgBouncer, it's
> always getting in the way a bit.  And it seems to seriously bother
> systems administrators and developers, not just the DBAs.

Putting a pooler in core won't inherently fix this, and won't remove the
need to solve it for cases where the pooler can't be on the same machine.

9.0 has application_name to let apps identify themselves. Perhaps a
"pooled_client_ip", to be set by a pooler rather than the app, could be
added to address this problem in a way that can be used by all poolers
new and existing, not just any new in-core pooling system.

If a privileged set of pooler functions is was considered, as per my
other recent mail, the pooler could use a management connection to set
the client ip before handing the connection to the client, so the client
couldn't change pooled_client_ip its self by accident or through malice.
But even without that, it'd be awfully handy.

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Craig Ringer

On 24/07/10 01:28, Robert Haas wrote:

> Well, if we could change the backends so that they could fully
> reinitialize themselves (disconnect from a database to which they are
> bound, etc.), I don't see why we couldn't use the Apache approach.

This would offer the bonus on the side that it'd be more practical to
implement database changes for a connection, akin to MySQL's "USE".
Inefficient, sure, but possible.

I don't care about that current limitation very much. I think anyone
changing databases all the time probably has the wrong design and should
be using schema. I'm sure there are times it'd be good to be able to
switch databases on one connection, though.

My question with all this remains: is it worth the effort when external
poolers already solve the problem. Can PostgreSQL offer tools and
interfaces to permit external poolers to solve the problems they have,
rather than trying to bring them in-core? For example, with auth, can
the Pg server offer features to help poolers implement passthrough
authentication against the real Pg server?

Perhaps Pg could expose auth features over SQL, permitting appropriately
privileged users to verify credentials with SQL-level calls. Poolers
could pass supplied user credentials through to the real Pg server for
verification. For bonus points, an SQL interface could be provided that
lets the super-priveleged auth managing connection be used to change the
login role of another running backend/connection, so the pooler could
hand out connections with different login user ids without having to
maintain a pool per user id.

( That'd probably also permit implementation of a "CHANGE USER" command,
where the client changed login roles on the fly by passing the
credentials of the new role. That'd be *awesome* for application server
connection pools. )

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Andres Freund

On Sat, Jul 24, 2010 at 01:23:08AM -0400, Greg Smith wrote:
> Joshua Tolley wrote:
> >Relatively minor, but it would be convenient to avoid having to query
> >$external_pooler to determine the client_addr of an incoming connection.
>
> You suggest this as a minor concern, but I consider it to be one of
> the most compelling arguments in favor of in-core pooling.  A
> constant pain with external poolers is the need to then combine two
> sources of data in order to track connections fully, which is
> something that everyone runs into eventually and finds annoying.
> It's one of the few things that doesn't go away no matter how much
> fiddling you do with pgBouncer, it's always getting in the way a
> bit.  And it seems to seriously bother systems administrators and
> developers, not just the DBAs.
But you have to admit that this problem won't vanish as people will
continue to use poolers on other machines for resource reasons.
So providing a capability to do something sensible here seems to be
useful independent of in-core pooling.

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Greg Smith


Joshua Tolley wrote:

Relatively minor, but it would be convenient to avoid having to query
$external_pooler to determine the client_addr of an incoming connection.
  


You suggest this as a minor concern, but I consider it to be one of the 
most compelling arguments in favor of in-core pooling.  A constant pain 
with external poolers is the need to then combine two sources of data in 
order to track connections fully, which is something that everyone runs 
into eventually and finds annoying.  It's one of the few things that 
doesn't go away no matter how much fiddling you do with pgBouncer, it's 
always getting in the way a bit.  And it seems to seriously bother 
systems administrators and developers, not just the DBAs.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Andres Freund

On Fri, Jul 23, 2010 at 01:28:53PM -0400, Robert Haas wrote:
> On Fri, Jul 23, 2010 at 11:58 AM, Hannu Krosing  wrote:
> > On Thu, 2010-07-22 at 20:57 -0400, Robert Haas wrote:
> >> On Thu, Jul 22, 2010 at 3:15 PM, Hannu Krosing  wrote:
> >> > On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
> >> >> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
> >> >>  wrote:
> >> >> > So rather than asking "should core have a connection pool" perhaps
> >> >> > what's needed is to ask "what can an in-core pool do that an external
> >> >> > pool cannot do?"
> >> >>
> >> >> Avoid sending every connection through an extra hop.
> >> >
> >> > not really. in-core != magically-in-right-backend-process
> >>
> >> Well, how about if we arrange it so it IS in the right backend
> >> process?  I don't believe magic is required.
> >
> > Do you have any design in mind, how you can make it so ?
>
> Well, if we could change the backends so that they could fully
> reinitialize themselves (disconnect from a database to which they are
> bound, etc.), I don't see why we couldn't use the Apache approach.
> There's a danger of memory leaks but that's why Apache has
> MaxRequestsPerChild, and it works pretty darn well.  Of course,
> passing file descriptors would be even nicer (you could pass the
> connection off to a child that was already bound to the correct
> database, perhaps) but has pointed out more than once, that's not
> portable.
Its not *that bad* though. To my knowledge its 2 or 3 implementations that
one would need to implement to support most if not all platforms.

- sendmsg/cmsg/SCM_RIGHTS based implementation (most if not all *nixes
  including solaris, linux, (free|open|net)bsd, OSX, AIX, HPUX, others)
- WSADuplicateSocket (windows)
- if needed: STREAMS based stuff (I_SENDFD) (at least solaris,  hpux, aix, 
tru64,
  irix, unixware allow this)


Note that I am still not convinced that its a good idea...

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Hannu Krosing

On Thu, 2010-07-22 at 20:57 -0400, Robert Haas wrote:
> On Thu, Jul 22, 2010 at 3:15 PM, Hannu Krosing  wrote:
> > On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
> >> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
> >>  wrote:
> >> > So rather than asking "should core have a connection pool" perhaps
> >> > what's needed is to ask "what can an in-core pool do that an external
> >> > pool cannot do?"
> >>
> >> Avoid sending every connection through an extra hop.
> >
> > not really. in-core != magically-in-right-backend-process
> 
> Well, how about if we arrange it so it IS in the right backend
> process?  I don't believe magic is required.

Do you have any design in mind, how you can make it so ?

---
Hannu




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Robert Haas

On Fri, Jul 23, 2010 at 11:58 AM, Hannu Krosing  wrote:
> On Thu, 2010-07-22 at 20:57 -0400, Robert Haas wrote:
>> On Thu, Jul 22, 2010 at 3:15 PM, Hannu Krosing  wrote:
>> > On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
>> >> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>> >>  wrote:
>> >> > So rather than asking "should core have a connection pool" perhaps
>> >> > what's needed is to ask "what can an in-core pool do that an external
>> >> > pool cannot do?"
>> >>
>> >> Avoid sending every connection through an extra hop.
>> >
>> > not really. in-core != magically-in-right-backend-process
>>
>> Well, how about if we arrange it so it IS in the right backend
>> process?  I don't believe magic is required.
>
> Do you have any design in mind, how you can make it so ?

Well, if we could change the backends so that they could fully
reinitialize themselves (disconnect from a database to which they are
bound, etc.), I don't see why we couldn't use the Apache approach.
There's a danger of memory leaks but that's why Apache has
MaxRequestsPerChild, and it works pretty darn well.  Of course,
passing file descriptors would be even nicer (you could pass the
connection off to a child that was already bound to the correct
database, perhaps) but has pointed out more than once, that's not
portable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Joshua D. Drake

On Thu, 2010-07-22 at 20:56 +0100, Hannu Krosing wrote:
>  
> > Let's extend this shall we:
> > 
> > Avoid adding yet another network hop
> 
> postgreSQL is multi-process, so you either have a separate "pooler
> process" or need to put pooler functionality in postmaster, bothw ways
> you still have a two-hop scenario for connect. you may be able to pass
> the socket to child process and also keep it, but doing this for both
> client and db sides seems really convoluted. 

Which means, right now there is three hops. Reducing one is good.

> Or is there a prortable way to pass sockets back and forth between
> parent and child processes ?
> 
> If so, then pgbouncer could use it as well.
> 
> > Remove of a point of failure
> 
> rather move the point of failure from external pooler to internal
> pooler ;)

Yes but at that point, it doesn't matter. 

> 
> > Reduction of administrative overhead
> 
> Possibly. But once you start actually using it, you still need to
> configure and monitor it and do other administrator-y tasks.

Yes, but it is inclusive.

> 
> > Integration into our core authentication mechanisms
> 
> True, although for example having SSL on client side connection will be
> so slow that it hides any performance gains from pooling, at least for
> short-lived connections.

Yes, but right now you can't use *any* pooler with LDAP for example. We
could if pooling was in core. Your SSL argument doesn't really work
because its true with or without pooling.

> > Greater flexibility in connection control
> 
> Yes, poolers can be much more flexible than default postgresql. See for
> example pgbouncers PAUSE , RECONFIGURE and RESUME commands 

:D

> 
> > And, having connection pooling in core does not eliminate the use of an
> > external pool where it makes since.
> 
> Probably the easiest way to achieve "pooling in core" would be adding an
> option to start pgbouncer under postmaster control.

Yeah but that won't happen. Also I think we may have a libevent
dependency that we have to work out.

> 
> You probably can't get much leaner than pgbouncer.

Oh don't get me wrong. I love pgbouncer. It is my recommended pooler but
even it has limitations (such as auth).

Sincerely,

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-23 Thread Joshua Tolley

On Thu, Jul 22, 2010 at 02:44:04PM -0700, Scott Carey wrote:
> On Jul 22, 2010, at 11:36 AM, Robert Haas wrote:
> > On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
> >  wrote:
> >> So rather than asking "should core have a connection pool" perhaps
> >> what's needed is to ask "what can an in-core pool do that an external
> >> pool cannot do?"
> > 
> > Avoid sending every connection through an extra hop.
> > 
> 
> Dynamically adjust settings based on resource usage of the DB.
> 

Relatively minor, but it would be convenient to avoid having to query
$external_pooler to determine the client_addr of an incoming connection.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


signature.asc
Description: Digital signature

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 5:29 PM, Andres Freund  wrote:
>> The problem is harder for us because a backend can't switch identities
>> once it's been assigned to a database.  I haven't heard an adequate
>> explanation of why that couldn't be changed, though.
> Possibly it might decrease the performance significantly enough by
> reducing the cache locality (syscache, prepared plans)?

Those things are backend-local.  The worst case scenario is you've got
to flush them all when you reinitialize, in which case you still save
the overhead of creating a new process.  The best case scenario is
that you can keep some of them around, in which case, great.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Robert Haas

On Thu, Jul 22, 2010 at 3:15 PM, Hannu Krosing  wrote:
> On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
>> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>>  wrote:
>> > So rather than asking "should core have a connection pool" perhaps
>> > what's needed is to ask "what can an in-core pool do that an external
>> > pool cannot do?"
>>
>> Avoid sending every connection through an extra hop.
>
> not really. in-core != magically-in-right-backend-process

Well, how about if we arrange it so it IS in the right backend
process?  I don't believe magic is required.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Scott Carey


On Jul 22, 2010, at 11:36 AM, Robert Haas wrote:

> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>  wrote:
>> So rather than asking "should core have a connection pool" perhaps
>> what's needed is to ask "what can an in-core pool do that an external
>> pool cannot do?"
> 
> Avoid sending every connection through an extra hop.
> 

Dynamically adjust settings based on resource usage of the DB.

> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Hannu Krosing

On Thu, 2010-07-22 at 12:15 -0700, Joshua D. Drake wrote:
> On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote: 
> > On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
> >  wrote:
> > > So rather than asking "should core have a connection pool" perhaps
> > > what's needed is to ask "what can an in-core pool do that an external
> > > pool cannot do?"
> > 
> > Avoid sending every connection through an extra hop.
> 
> Let's extend this shall we:
> 
> Avoid adding yet another network hop

postgreSQL is multi-process, so you either have a separate "pooler
process" or need to put pooler functionality in postmaster, bothw ways
you still have a two-hop scenario for connect. you may be able to pass
the socket to child process and also keep it, but doing this for both
client and db sides seems really convoluted. 

Or is there a prortable way to pass sockets back and forth between
parent and child processes ?

If so, then pgbouncer could use it as well.

> Remove of a point of failure

rather move the point of failure from external pooler to internal
pooler ;)

> Reduction of administrative overhead

Possibly. But once you start actually using it, you still need to
configure and monitor it and do other administrator-y tasks.

> Integration into our core authentication mechanisms

True, although for example having SSL on client side connection will be
so slow that it hides any performance gains from pooling, at least for
short-lived connections.

> Greater flexibility in connection control

Yes, poolers can be much more flexible than default postgresql. See for
example pgbouncers PAUSE , RECONFIGURE and RESUME commands 

> And, having connection pooling in core does not eliminate the use of an
> external pool where it makes since.

Probably the easiest way to achieve "pooling in core" would be adding an
option to start pgbouncer under postmaster control.

You probably can't get much leaner than pgbouncer.

> -- 
> PostgreSQL.org Major Contributor
> Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
> Consulting, Training, Support, Custom Development, Engineering
> http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
> 
> 
> 

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Hannu Krosing

On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote:
> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>  wrote:
> > So rather than asking "should core have a connection pool" perhaps
> > what's needed is to ask "what can an in-core pool do that an external
> > pool cannot do?"
> 
> Avoid sending every connection through an extra hop.

not really. in-core != magically-in-right-backend-process


there will still be "an extra hop",only it will be local, between pooler
and backend process.

similar to what currently happens with pgbouncer when you deploy it on
same server and use unix sockets

> 
> -- 
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
> 



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Andres Freund

On Thu, Jul 22, 2010 at 02:33:43PM -0400, Robert Haas wrote:
> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>  wrote:
> > On 12/07/10 17:45, Matthew Wakeling wrote:
> >>
> >> I'm surprised. Doesn't apache httpd do this? Does it have to do a whole
> >> load of non-portable stuff? It seems to work on a whole load of platforms.
> >
> > A lot of what Apache HTTPd does is handled via the Apache Portable
> > Runtime (APR). It contains a lot of per-platform handlers for various
> > functionality.
>
> Apache just has all of the worker processes call accept() on the
> socket, and whichever one the OS hands it off to gets the job.
As an inconsequential detail - afaik they keep the os from doing that
by protecting it with a mutex for various reasons (speed - as some
implementations wake up and see theres nothing to do, multiple
sockets, fairness)

> The problem is harder for us because a backend can't switch identities
> once it's been assigned to a database.  I haven't heard an adequate
> explanation of why that couldn't be changed, though.
Possibly it might decrease the performance significantly enough by
reducing the cache locality (syscache, prepared plans)?

Andres

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Joshua D. Drake

On Thu, 2010-07-22 at 14:36 -0400, Robert Haas wrote: 
> On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
>  wrote:
> > So rather than asking "should core have a connection pool" perhaps
> > what's needed is to ask "what can an in-core pool do that an external
> > pool cannot do?"
> 
> Avoid sending every connection through an extra hop.

Let's extend this shall we:

Avoid adding yet another network hop
Remove of a point of failure
Reduction of administrative overhead
Integration into our core authentication mechanisms
Greater flexibility in connection control

And, having connection pooling in core does not eliminate the use of an
external pool where it makes since.

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Robert Haas

On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
 wrote:
> So rather than asking "should core have a connection pool" perhaps
> what's needed is to ask "what can an in-core pool do that an external
> pool cannot do?"

Avoid sending every connection through an extra hop.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-22 Thread Robert Haas

On Mon, Jul 12, 2010 at 6:58 AM, Craig Ringer
 wrote:
> On 12/07/10 17:45, Matthew Wakeling wrote:
>>
>> I'm surprised. Doesn't apache httpd do this? Does it have to do a whole
>> load of non-portable stuff? It seems to work on a whole load of platforms.
>
> A lot of what Apache HTTPd does is handled via the Apache Portable
> Runtime (APR). It contains a lot of per-platform handlers for various
> functionality.

Apache just has all of the worker processes call accept() on the
socket, and whichever one the OS hands it off to gets the job.  The
problem is harder for us because a backend can't switch identities
once it's been assigned to a database.  I haven't heard an adequate
explanation of why that couldn't be changed, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Tatsuo Ishii

From: Rajesh Kumar Mallah 
Subject: Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.
Date: Mon, 19 Jul 2010 08:06:09 +0530
Message-ID: 

>  Thanks for the thought but it (-C) does not work .

Still you need:

pgbench's -c <= (pool_size + reserve_pool_size)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

 Thanks for the thought but it (-C) does not work .

>
>
> BTW, I think you should use -C option with pgbench for this kind of
> testing. -C establishes connection for each transaction, which is
> pretty much similar to the real world application which do not use
> connection pooling. You will be supprised how PostgreSQL connection
> overhead is large.
> --
> Tatsuo Ishii
> SRA OSS, Inc. Japan
> English: http://www.sraoss.co.jp/index_en.php
> Japanese: http://www.sraoss.co.jp
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Tatsuo Ishii

> pgbench cannot be used for testing with pgbouncer if number of
> pgbench clients exceeds pool_size + reserve_pool_size of pgbouncer.
> pgbench keeps waiting doing nothing. I am using pgbench  of postgresql 8.1.
> Are there changes to pgbench in this aspect ?

Pgbench won't start actual transaction until all connections to
PostgreSQL (in this case pgbounser I guess) successfully
established. IMO You sholud try other benchmark tools.

BTW, I think you should use -C option with pgbench for this kind of
testing. -C establishes connection for each transaction, which is
pretty much similar to the real world application which do not use
connection pooling. You will be supprised how PostgreSQL connection
overhead is large.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Hannu Krosing

On Sun, 2010-07-18 at 21:48 +0530, Rajesh Kumar Mallah wrote:
> Hi,
> 
> Sorry, if posting here was not proper instead of starting new thread
> (I am really not sure if its bad thing to do)
> 
> I would like to share my recent experience on implementation of  
> client side pooling using  pgbouncer. By client side i mean that 
> the the pgbouncer process in not on same machine as postgresql server.
> In first trial pgbouncer and postgresql were in same machine &
> phbouncer
> was connecting to postgresql using unix domain sockets. But i shifted
> it
> laters owing to high CPU usage > 50%. ( using top)
> Now i have shifted pgbouncer into a virtual machine (openvz container)
> in the application server hardware 

Why in VM (openvz container) ?

Did you also try it in the same OS as your appserver ?

Perhaps even connecting from appserver via unix seckets ?

> and all my applications on other virtual machines
> (web applications) connect to pgbouncer on this virtual machine.
> 
> I tested the setup with pgbench in two scenarios 
> 
> 1. connecting to DB server directly
> 2. connecting to DB via pgbouncer
> 
> the no of clients was 10 ( -c 10)  carrying out 1 transactions
> each (-t 1) .
> pgbench db was initilised with scaling  factor -s 100. 
> 
> since client count was less there was no queuing of requests in
> pgbouncer
> i would prefer to say  it was in 'passthrough' mode. 
> 
> the result was that 
> 
> 1. direct ~ 2000 tps
> 2. via pgbouncer ~ 1200 tps

Are you sure you are not measuring how much sunning pgbouncer slows down
pgbench directly, by competing for CPU resources and not by adding
latency to requests ?


> --
> Experience on deploying to production environment with real world
> load/usage pattern
> --
> 
> Pgbouncer was put in same machine as postgresql connecting via unix
> domain
> to server and tcp sockets with clients.
> 
> 1. There was drastic reduction in CPU loads  from  30 to 10 ldavg
> 2. There were no clients waiting, pool size was 150 and number of
> active
> connections was 100-120.
> 3. Application performance was worse (inspite of 0 clients waiting ) 
> 
> 
> I am still waiting to see what is the effect of shifting out pgbounce
> from dbserver
> to appserver, but with pgbench results i am not very hopeful. I am
> curious why inspite of 0 clients waiting pgbounce introduces a drop in
> tps.

If you have less clients than pgbouncer connections, you can't have any
clients waiting in pgbouncer, as each of them is allocated it's own
connection right away.

What you were measuring was 

1. pgbench and pgbouncer competeing for the same CPU
2. overhead from 2 hops to db (app-proxy-db) instead of 1 (app-db)

> Warm Regds
> Rajesh Kumar Mallah.
> CTO - tradeindia.com.
> 
> 
> 
> Keywords: pgbouncer performance
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Mon, Jul 12, 2010 at 6:11 PM, Kevin Grittner
>  wrote:
> Craig Ringer  wrote:
> 
> 
> > So rather than asking "should core have a connection pool"
> perhaps
> > what's needed is to ask "what can an in-core pool do that an
> > external pool cannot do?"
> 
> 
> (1)  It can prevent the most pessimal performance problems
> resulting
> from lack of an external connection pool (or a badly
> configured one)
> by setting a single GUC.  Configuration tools could suggest a
> good
> value during initial setup.
> 
> (2)  It can be used without installing and configuring a more
> sophisticated and complex product.
> 
> (3)  It might reduce latency because it avoids having to
> receive,
> parse, and resend data in both directions -- eliminating one
> "hop".
> I know the performance benefit would usually accrue to the
> external
> connection pooler, but there might be some circumstances where
> a
> built-in pool could win.
> 
> (4)  It's one more checkbox which can be ticked off on some
> RFPs.
> 
> That said, I fully agree that if we can include good
> documentation
> on the external poolers and we can get packagers to include
> poolers
> in their distribution, that gets us a much bigger benefit.  A
> built-in solution would only be worthwhile if it was simple
> enough
> and lightweight enough not to be a burden on execution time or
> maintenance.  Maybe that's too big an if.
> 
> -Kevin
> 
> 
> --
> Sent via pgsql-performance mailing list
> (pgsql-performance@postgresql.org)
> To mak

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

Looks like ,

pgbench cannot be used for testing with pgbouncer if number of
pgbench clients exceeds pool_size + reserve_pool_size of pgbouncer.
pgbench keeps waiting doing nothing. I am using pgbench  of postgresql 8.1.
Are there changes to pgbench in this aspect ?

regds
Rajesh Kumar Mallah.

On Sun, Jul 18, 2010 at 11:38 PM, Rajesh Kumar Mallah <
mallah.raj...@gmail.com> wrote:

>
>
> On Sun, Jul 18, 2010 at 10:55 PM, Greg Smith  wrote:
>
>> Rajesh Kumar Mallah wrote:
>>
>>> the no of clients was 10 ( -c 10)  carrying out 1 transactions each
>>> (-t 1) .
>>> pgbench db was initilised with scaling  factor -s 100.
>>>
>>> since client count was less there was no queuing of requests in pgbouncer
>>> i would prefer to say  it was in 'passthrough' mode.
>>>
>>
>> Of course pgbouncer is going decrease performance in this situation.
>>  You've added a whole layer to things that all traffic has to pass through,
>> without a setup that gains any benefit from the connection pooling.  Try
>> making the client count 1000 instead if you want a useful test.
>>
>
> Dear Greg,
>
> my  max_client is 300 shall i test  with client count 250 ?
> if so what should be the scaling factor while initializing
> the pgbench db?
>
>
>> --
>> Greg Smith  2ndQuadrant US  Baltimore, MD
>> PostgreSQL Training, Services and Support
>> g...@2ndquadrant.com   www.2ndQuadrant.us
>>
>>
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Scott Carey

On Jul 9, 2010, at 8:33 PM, Craig Ringer wrote:

> On 10/07/2010 9:25 AM, Josh Berkus wrote:
>> 
>>> It *is* the last place you want to put it, but putting it there can
>>> be much better than not putting it *anywhere*, which is what we've
>>> often seen.
>> 
>> Well, what you proposed is an admission control mechanism, which is
>> *different* from a connection pool, although the two overlap.  A
>> connection pool solves 4 problems when it's working:
>> 
>> a) limiting the number of database server processes
>> b) limiting the number of active concurrent queries
>> c) reducing response times for allocating a new connection
>> d) allowing management of connection routes to the database
>> (redirection, failover, etc.)
> 
> I agree with you: for most Pg users (a) is really, really important. As 
> you know, in PostgreSQL each connection maintains not only general 
> connection state (GUC settings, etc) and if in a transaction, 
> transaction state, but also a query executor (full backend). That gets 
> nasty not only in memory use, but in impact on active query performance, 
> as all those query executors have to participate in global signalling 
> for lock management etc.
> 
> So an in-server pool that solved (b) but not (a) would IMO not be 
> particularly useful for the majority of users.
> 
> That said, I don't think it follows that (a) cannot be solved in-core. 
> How much architectural change would be required to do it efficiently 
> enough, though...
> 

a, b, and c can all be handled in core.  But that would be a radical 
re-architecture to do it right.  Postgres assumes that the client connection, 
authentication, and query processing all happen in one place in one process on 
one thread.  Most server software built and designed today avoids that model in 
order to decouple its critical resources from the # of client connections.  
Most server software designed today tries to control its resources and not let 
the behavior of clients dictate resource usage.

Even Apache HTTPD is undergoing a radical re-design so that it can handle more 
connections and more easily decouple connections from concurrent processing to 
keep up with competitors.

I'm not saying that Postgres core should change -- again thats a radical 
re-architecture.  But it should be recognized that it is not like most other 
server applications -- it can't control its resources very well and needs help 
to do so.  From using a connection pool to manually setting work_mem 
differently for different clients or workloads, resource management is not what 
it does well.  It does a LOT of things very very well, just not that.

> --
> Craig Ringer
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

On Sun, Jul 18, 2010 at 10:55 PM, Greg Smith  wrote:

> Rajesh Kumar Mallah wrote:
>
>> the no of clients was 10 ( -c 10)  carrying out 1 transactions each
>> (-t 1) .
>> pgbench db was initilised with scaling  factor -s 100.
>>
>> since client count was less there was no queuing of requests in pgbouncer
>> i would prefer to say  it was in 'passthrough' mode.
>>
>
> Of course pgbouncer is going decrease performance in this situation.
>  You've added a whole layer to things that all traffic has to pass through,
> without a setup that gains any benefit from the connection pooling.  Try
> making the client count 1000 instead if you want a useful test.
>

Dear Greg,

my  max_client is 300 shall i test  with client count 250 ?
if so what should be the scaling factor while initializing
the pgbench db?


> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> g...@2ndquadrant.com   www.2ndQuadrant.us
>
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

ok ,
now the question is , is it possible to dig out from from postgresql
database
server if connection pooling is needed ? In our case eg i have kept
max_connections = 300  if i reduce below 250 i get error "max connection
reached."
on connecting to db directly,  if i put pgbouncer i get less performance
(even if no clients waiting)

without pooling the dbserver CPU usage increases but performance of apps
is also become good.

Regds
Rajesh Kumar Mallah.

On Sun, Jul 18, 2010 at 10:55 PM, Greg Smith  wrote:

> Rajesh Kumar Mallah wrote:
>
>> the no of clients was 10 ( -c 10)  carrying out 1 transactions each
>> (-t 1) .
>> pgbench db was initilised with scaling  factor -s 100.
>>
>> since client count was less there was no queuing of requests in pgbouncer
>> i would prefer to say  it was in 'passthrough' mode.
>>
>
> Of course pgbouncer is going decrease performance in this situation.
>  You've added a whole layer to things that all traffic has to pass through,
> without a setup that gains any benefit from the connection pooling.  Try
> making the client count 1000 instead if you want a useful test.
>
> --
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> g...@2ndquadrant.com   www.2ndQuadrant.us
>
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Greg Smith


Rajesh Kumar Mallah wrote:
the no of clients was 10 ( -c 10)  carrying out 1 transactions 
each (-t 1) .

pgbench db was initilised with scaling  factor -s 100.

since client count was less there was no queuing of requests in pgbouncer
i would prefer to say  it was in 'passthrough' mode.


Of course pgbouncer is going decrease performance in this situation.  
You've added a whole layer to things that all traffic has to pass 
through, without a setup that gains any benefit from the connection 
pooling.  Try making the client count 1000 instead if you want a useful 
test.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

Nice suggestion to try ,
I will put pgbouncer on raw hardware and run pgbench from same hardware.

regds
rajesh kumar mallah.



> Why in VM (openvz container) ?
>
> Did you also try it in the same OS as your appserver ?
>
> Perhaps even connecting from appserver via unix seckets ?
>
> > and all my applications on other virtual machines
>
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

note: my postgresql server & pgbouncer were not in virtualised environment
in the first setup. Only application server has many openvz containers.

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-18 Thread Rajesh Kumar Mallah

Hi,

Sorry, if posting here was not proper instead of starting new thread
(I am really not sure if its bad thing to do)

I would like to share my recent experience on implementation of
client side pooling using  pgbouncer. By client side i mean that
the the pgbouncer process in not on same machine as postgresql server.
In first trial pgbouncer and postgresql were in same machine & phbouncer
was connecting to postgresql using unix domain sockets. But i shifted it
laters owing to high CPU usage > 50%. ( using top)
Now i have shifted pgbouncer into a virtual machine (openvz container)
in the application server hardware and all my applications on other virtual
machines
(web applications) connect to pgbouncer on this virtual machine.

I tested the setup with pgbench in two scenarios

1. connecting to DB server directly
2. connecting to DB via pgbouncer

the no of clients was 10 ( -c 10)  carrying out 1 transactions each (-t
1) .
pgbench db was initilised with scaling  factor -s 100.

since client count was less there was no queuing of requests in pgbouncer
i would prefer to say  it was in 'passthrough' mode.

the result was that

1. direct ~ 2000 tps
2. via pgbouncer ~ 1200 tps

--
Experience on deploying to production environment with real world load/usage
pattern
--

Pgbouncer was put in same machine as postgresql connecting via unix domain
to server and tcp sockets with clients.

1. There was drastic reduction in CPU loads  from  30 to 10 ldavg
2. There were no clients waiting, pool size was 150 and number of active
connections was 100-120.
3. Application performance was worse (inspite of 0 clients waiting )

I am still waiting to see what is the effect of shifting out pgbounce from
dbserver
to appserver, but with pgbench results i am not very hopeful. I am curious
why
inspite of 0 clients waiting pgbounce introduces a drop in tps.

Warm Regds
Rajesh Kumar Mallah.
CTO - tradeindia.com.

Keywords: pgbouncer performance

On Mon, Jul 12, 2010 at 6:11 PM, Kevin Grittner  wrote:

> Craig Ringer  wrote:
>
> > So rather than asking "should core have a connection pool" perhaps
> > what's needed is to ask "what can an in-core pool do that an
> > external pool cannot do?"
>
> (1)  It can prevent the most pessimal performance problems resulting
> from lack of an external connection pool (or a badly configured one)
> by setting a single GUC.  Configuration tools could suggest a good
> value during initial setup.
>
> (2)  It can be used without installing and configuring a more
> sophisticated and complex product.
>
> (3)  It might reduce latency because it avoids having to receive,
> parse, and resend data in both directions -- eliminating one "hop".
> I know the performance benefit would usually accrue to the external
> connection pooler, but there might be some circumstances where a
> built-in pool could win.
>
> (4)  It's one more checkbox which can be ticked off on some RFPs.
>
> That said, I fully agree that if we can include good documentation
> on the external poolers and we can get packagers to include poolers
> in their distribution, that gets us a much bigger benefit.  A
> built-in solution would only be worthwhile if it was simple enough
> and lightweight enough not to be a burden on execution time or
> maintenance.  Maybe that's too big an if.
>
> -Kevin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-13 Thread Magnus Hagander

On Tue, Jul 13, 2010 at 16:42, Dimitri Fontaine  wrote:
> Tom Lane  writes:
>> I agree with the comments to the effect that this is really a packaging
>> and documentation problem.  There is no need for us to re-invent the
>> existing solutions, but there is a need for making sure that they are
>> readily available and people know when to use them.
>
> On this topic, I think we're getting back to the idea of having non-core
> daemon helpers that should get "supervised" the way postmaster already
> does with backends wrt starting and stoping them at the right time.
>
> So a supervisor daemon with a supervisor API that would have to support
> autovacuum as a use case, then things like pgagent, PGQ and pgbouncer,
> would be very welcome.
>
> What about starting a new thread about that? Or you already know you
> won't want to push the extensibility of PostgreSQL there?

+1 on this idea in general, if we can think up a good API - this seems
very useful to me, and you have some good examples there of cases
where it'd definitely be a help.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-13 Thread Dimitri Fontaine

Tom Lane  writes:
> I agree with the comments to the effect that this is really a packaging
> and documentation problem.  There is no need for us to re-invent the
> existing solutions, but there is a need for making sure that they are
> readily available and people know when to use them.

On this topic, I think we're getting back to the idea of having non-core
daemon helpers that should get "supervised" the way postmaster already
does with backends wrt starting and stoping them at the right time.

So a supervisor daemon with a supervisor API that would have to support
autovacuum as a use case, then things like pgagent, PGQ and pgbouncer,
would be very welcome.

What about starting a new thread about that? Or you already know you
won't want to push the extensibility of PostgreSQL there?

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Hannu Krosing

On Mon, 2010-07-12 at 18:58 +0800, Craig Ringer wrote:
> On 12/07/10 17:45, Matthew Wakeling wrote:
> > 
> > I'm surprised. Doesn't apache httpd do this? Does it have to do a whole
> > load of non-portable stuff? It seems to work on a whole load of platforms.
> 
> A lot of what Apache HTTPd does is handled via the Apache Portable
> Runtime (APR). It contains a lot of per-platform handlers for various
> functionality.
> 
> http://apr.apache.org/docs/apr/1.4/modules.html
> 
> I don't know if the socket passing is provided as part of APR or is part
> of Apache HTTPd its self, but I wouldn't be at all surprised if it was
> in APR.
> 
> Personally I'm now swayed by arguments presented here that trying to
> push pooling into core isn't really desirable, and that better
> packaging/bundling of existing solutions would be better.

"better packaging/bundling of existing solutions" is good in it's own
right,weather there will eventually be some support for pooling in core
or not.

> Perhaps documenting the pluses/minuses of the current pooling options
> and providing clear recommendations on which to use for different use
> cases would help, since half the trouble is users not knowing they need
> a pool or being confused as to which to select.
>
> This discussion reminds me a bit of Hibernate's built-in client-side
> connection pool. It has one, but it's a unloved stepchild that even the
> Hibernate devs suggest should be avoided in favour of a couple of
> external 3rd party options.

Yes, pooling _is_ often better handled as a (set of) separate options,
just because of the reason that here one size does definitely not fit
all;

And efficient in-core pooler probably will look very much like pgbouncer
running in a separate thread spawned by postmaster anyway.

Let's hope there will be some support in core for having user defined
helper processes soon(ish), so tweaking pgbouncer to run as one will be
reasonably easy :)

> A built-in pool seems like a great idea, but there are multiple existing
> ones because they solve different problems in different ways. Unless a
> built-in one could solve ALL those needs, or be so vastly simpler (due
> to code re-use, easier configuration, etc) that it's worth building one
> that won't fit everyone's needs, then it's best to stick to the existing
> external options.
> 
> So rather than asking "should core have a connection pool" perhaps
> what's needed is to ask "what can an in-core pool do that an external
> pool cannot do?"

Probably nothing. OTOH there are some things that an external pool can
do that a built-in one can't, like running on a separate host and
pooling more than 32000 client connections there.

Cascaded pooling seems also impossible with built-in pooling

> Admission control / resource limit features would be great to have in
> core, and can't really be done fully in external modules ... but could
> be designed in ways that would allow external poolers to add
> functionality on top. Josh Berkus has made some good points on why this
> isn't as easy as it looks, though:
> 
> 
> http://it.toolbox.com/blogs/database-soup/admission-control-and-its-discontents-39895
> 
> --
> Craig Ringer
> 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Kevin Grittner

Craig Ringer  wrote:

> So rather than asking "should core have a connection pool" perhaps
> what's needed is to ask "what can an in-core pool do that an
> external pool cannot do?"

(1)  It can prevent the most pessimal performance problems resulting
from lack of an external connection pool (or a badly configured one)
by setting a single GUC.  Configuration tools could suggest a good
value during initial setup.

(2)  It can be used without installing and configuring a more
sophisticated and complex product.

(3)  It might reduce latency because it avoids having to receive,
parse, and resend data in both directions -- eliminating one "hop". 
I know the performance benefit would usually accrue to the external
connection pooler, but there might be some circumstances where a
built-in pool could win.

(4)  It's one more checkbox which can be ticked off on some RFPs.

That said, I fully agree that if we can include good documentation
on the external poolers and we can get packagers to include poolers
in their distribution, that gets us a much bigger benefit.  A
built-in solution would only be worthwhile if it was simple enough
and lightweight enough not to be a burden on execution time or
maintenance.  Maybe that's too big an if.

-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Craig Ringer

On 12/07/10 17:45, Matthew Wakeling wrote:
> 
> I'm surprised. Doesn't apache httpd do this? Does it have to do a whole
> load of non-portable stuff? It seems to work on a whole load of platforms.

A lot of what Apache HTTPd does is handled via the Apache Portable
Runtime (APR). It contains a lot of per-platform handlers for various
functionality.

http://apr.apache.org/docs/apr/1.4/modules.html

I don't know if the socket passing is provided as part of APR or is part
of Apache HTTPd its self, but I wouldn't be at all surprised if it was
in APR.

Personally I'm now swayed by arguments presented here that trying to
push pooling into core isn't really desirable, and that better
packaging/bundling of existing solutions would be better.

Perhaps documenting the pluses/minuses of the current pooling options
and providing clear recommendations on which to use for different use
cases would help, since half the trouble is users not knowing they need
a pool or being confused as to which to select.

This discussion reminds me a bit of Hibernate's built-in client-side
connection pool. It has one, but it's a unloved stepchild that even the
Hibernate devs suggest should be avoided in favour of a couple of
external 3rd party options.

A built-in pool seems like a great idea, but there are multiple existing
ones because they solve different problems in different ways. Unless a
built-in one could solve ALL those needs, or be so vastly simpler (due
to code re-use, easier configuration, etc) that it's worth building one
that won't fit everyone's needs, then it's best to stick to the existing
external options.

So rather than asking "should core have a connection pool" perhaps
what's needed is to ask "what can an in-core pool do that an external
pool cannot do?"

Admission control / resource limit features would be great to have in
core, and can't really be done fully in external modules ... but could
be designed in ways that would allow external poolers to add
functionality on top. Josh Berkus has made some good points on why this
isn't as easy as it looks, though:

http://it.toolbox.com/blogs/database-soup/admission-control-and-its-discontents-39895

--
Craig Ringer

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-12 Thread Matthew Wakeling


On Sat, 10 Jul 2010, Tom Lane wrote:

Doesn't pgpool do this?


No, and in fact that's exactly why the proposed implementation isn't
ever going to be in core: it's not possible to do it portably.


I'm surprised. Doesn't apache httpd do this? Does it have to do a whole 
load of non-portable stuff? It seems to work on a whole load of platforms.


Matthew

--
I would like to think that in this day and age people would know better than
to open executables in an e-mail. I'd also like to be able to flap my arms
and fly to the moon.-- Tim Mullen

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-10 Thread Tom Lane

Josh Berkus  writes:
>> I'm also relying on the unsubstantiated assumption that it's
>> possible to pass a socket connection between processes.

> Doesn't pgpool do this?

No, and in fact that's exactly why the proposed implementation isn't
ever going to be in core: it's not possible to do it portably.
(And no, I'm not interested in hearing about how you can do it on
platforms X, Y, and/or Z.)

I agree with the comments to the effect that this is really a packaging
and documentation problem.  There is no need for us to re-invent the
existing solutions, but there is a need for making sure that they are
readily available and people know when to use them.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-10 Thread Josh Berkus




Right, let's not confuse Kevin's argument that we should have
connection pooling in core with advocacy for any particular patch or
feature suggestion that he may have offered on some other thread.  A
very simple in-core connection pooler might look something like this:
when a session terminates, the backend doesn't exit.  Instead, it
waits for the postmaster to reassign it to a new connection, which the
postmaster does in preference to starting new backends when possible.
But if a backend doesn't get assigned a new connection within a
certain period of time, then it goes ahead and exits anyway.


This would, in my opinion, be an excellent option for PostgreSQL and 
would save a LOT of newbie pain.  Going back to my list, it would help 
with both problems (a) and (c).  It wouldn't be as good as pgbouncer, 
but it would be "good enough" for a lot of users.


HOWEVER, there is the issue that such a mechanism isn't "free".  There 
are issue with sharing backends around GUCs, user configuration, 
security, and prepared plans -- all issues which presently cause people 
difficulty with pgbouncer.  I think the costs are worth it, but we'd 
have to make some attempt to tackle those issues as well.  And, of 
course, we'd need to let DBAs turn the pooling off.


I'd envision parameters:

pool_connections true/false
pool_connection_timeout 60s


 I'm also relying on the unsubstantiated assumption that it's
possible to pass a socket connection between processes.


Doesn't pgpool do this?


Still another approach is admission control based on transactions,
backends, queries, memory usage, I/O, or what have you.


That's a different problem, and on its own doesn't help newbies.  It's 
complimetary to pooling, though, so would be nice to have.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-09 Thread Samuel Gendler




Sent from my iPhone

On Jul 9, 2010, at 18:25, Josh Berkus  wrote:



So while adding (b) to core alone would be very useful for some users,
ironically it's generally for the more advanced users which are not  
the

ones we're trying to help on this thread.


It would seem from evidence presented on this thread that the more  
appropriate conversation would maybe be with package maintainers, to  
perhaps get them to include a connection pool or provide a package  
that comes with a pool preconfigured and installed, along with  
improving existing documentation so that it encourages the use of a  
pool as a first class installation choice since it seems to be  
something of a first class problem for a lot of novice users.


Just to give some background on my perspective - my prior familiarity  
with a connection pool was entirely on the client side, where I've  
been using them for years go keep resource consumption down on the  
client.. But it never occurred to me to consider one on the other end  
of those connections, despite the fact that I usually have a cluster  
of app hosts all talking to the same db. I assumed low connection  
count was desirable, but not mandatory, since surely the db server  
limited its own resource consumption, much the way a well written  
client app will. I basically assumed that the postgres devs used the  
same logic I did when I pooled my connections at the client side in  
order to minimize resource consumption there. I've got no truck with  
the reasons presented against doing so, since they make perfectly good  
sense to me.


However, I suspect there are lots of engineers like myself - folks  
working without the benefit of a dedicated dba or a dba who is new to  
the postgres platform - who make naive assumptions that aren't  
immediately or obviously corrected by the docs (I may be sticking my  
foot in my mouth here. I haven't read the standard docs in a very long  
time). With this issue in particular, the fix is fairly trivial and  
brings other benefits as well. But it sucks to discover it only after  
you've started to get errors on a running app, no matter how easy the  
fix.


So probably this is really only a bug in communication and can be  
fixed there. That's great. Easier to fix bugs are hard to find. I have  
yet to contribute to postgres development, so I guess, if no one  
objects, I'll see what I can do about improving the documentation of  
this issue, both in the official docs and just making sure it gets  
better billing in other sources of postgres documentation. But you'll  
have to bear with me, as I do have a more-than-full-time other job,  
and no experience with the pg developer community other than a couple  
of weeks on the mailing lists. But I do like to contribute to projects  
I use. It always winds up making me a more proficient user.


(for the record, if I wasn't limited to my phone at the moment I would  
actually check the state of existing documentation before sending  
this, so if I'm talking out of my ass on the lack of documentation,  
please go easy on me. I mean no offense)


--sam






--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-09 Thread Robert Haas

On Fri, Jul 9, 2010 at 11:33 PM, Craig Ringer
 wrote:
> On 10/07/2010 9:25 AM, Josh Berkus wrote:
>>
>>> It *is* the last place you want to put it, but putting it there can
>>> be much better than not putting it *anywhere*, which is what we've
>>> often seen.
>>
>> Well, what you proposed is an admission control mechanism, which is
>> *different* from a connection pool, although the two overlap.  A
>> connection pool solves 4 problems when it's working:
>>
>> a) limiting the number of database server processes
>> b) limiting the number of active concurrent queries
>> c) reducing response times for allocating a new connection
>> d) allowing management of connection routes to the database
>> (redirection, failover, etc.)
>
> I agree with you: for most Pg users (a) is really, really important. As you
> know, in PostgreSQL each connection maintains not only general connection
> state (GUC settings, etc) and if in a transaction, transaction state, but
> also a query executor (full backend). That gets nasty not only in memory
> use, but in impact on active query performance, as all those query executors
> have to participate in global signalling for lock management etc.
>
> So an in-server pool that solved (b) but not (a) would IMO not be
> particularly useful for the majority of users.
>
> That said, I don't think it follows that (a) cannot be solved in-core. How
> much architectural change would be required to do it efficiently enough,
> though...

Right, let's not confuse Kevin's argument that we should have
connection pooling in core with advocacy for any particular patch or
feature suggestion that he may have offered on some other thread.  A
very simple in-core connection pooler might look something like this:
when a session terminates, the backend doesn't exit.  Instead, it
waits for the postmaster to reassign it to a new connection, which the
postmaster does in preference to starting new backends when possible.
But if a backend doesn't get assigned a new connection within a
certain period of time, then it goes ahead and exits anyway.

You might argue that this is not really a connection pooler at all
because there's no admission control, but the point is you're avoiding
the overhead of creating and destroying backends unnecessarily.  Of
course, I'm also relying on the unsubstantiated assumption that it's
possible to pass a socket connection between processes.

Another approach to the performance problem is to try to find ways of
reducing the overhead associated with having a large number of
backends in the system.  That's not a connection pooler either, but it
might reduce the need for one.

Still another approach is admission control based on transactions,
backends, queries, memory usage, I/O, or what have you.

None of these things are mutually exclusive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-09 Thread Craig Ringer


On 10/07/2010 9:25 AM, Josh Berkus wrote:



It *is* the last place you want to put it, but putting it there can
be much better than not putting it *anywhere*, which is what we've
often seen.


Well, what you proposed is an admission control mechanism, which is
*different* from a connection pool, although the two overlap.  A
connection pool solves 4 problems when it's working:

a) limiting the number of database server processes
b) limiting the number of active concurrent queries
c) reducing response times for allocating a new connection
d) allowing management of connection routes to the database
(redirection, failover, etc.)


I agree with you: for most Pg users (a) is really, really important. As 
you know, in PostgreSQL each connection maintains not only general 
connection state (GUC settings, etc) and if in a transaction, 
transaction state, but also a query executor (full backend). That gets 
nasty not only in memory use, but in impact on active query performance, 
as all those query executors have to participate in global signalling 
for lock management etc.


So an in-server pool that solved (b) but not (a) would IMO not be 
particularly useful for the majority of users.


That said, I don't think it follows that (a) cannot be solved in-core. 
How much architectural change would be required to do it efficiently 
enough, though...


--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] Pooling in Core WAS: Need help in performance tuning.

2010-07-09 Thread Josh Berkus


> It *is* the last place you want to put it, but putting it there can
> be much better than not putting it *anywhere*, which is what we've
> often seen.

Well, what you proposed is an admission control mechanism, which is
*different* from a connection pool, although the two overlap.  A
connection pool solves 4 problems when it's working:

a) limiting the number of database server processes
b) limiting the number of active concurrent queries
c) reducing response times for allocating a new connection
d) allowing management of connection routes to the database
(redirection, failover, etc.)

What you were proposing is only (b).  While (b) alone is better than
nothing, it only solves some kinds of problems.  Database backend
processes are *not* free, and in general when I see users with "too many
connections" failures they are not because of too many concurrent
queries, but rather because of too many idle connections (I've seen up
to 1800 on a server).  Simply adding (b) for crappy applications would
make the problem worse, not better, because of the large number of
pending queries which the developer would fail to deal with, or monitor.

So while adding (b) to core alone would be very useful for some users,
ironically it's generally for the more advanced users which are not the
ones we're trying to help on this thread.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

59 matches

Mail list logo