[HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
This paper has a brief but interesting discussion of Admission
Control in section 2.4:
 
Architecture of a Database System. (Joseph M. Hellerstein, Michael
Stonebraker and James Hamilton). Foundations and Trends in Databases
1(2).
 
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
 
They describe a two-tier approach, where the first tier is already
effectively implemented in PostgreSQL with the max_connections and
superuser_reserved_connections GUCs.  The second tier is implemented
to run after a plan is chosen, and may postpone execution of a query
(or reduce the resources it is allowed) if starting it at that time
might overload available resources.  I think that implementing
something like this could potentially help with several types of
problems.
 
We often see posts from people who have more active connections than
is efficient.  We could, for example, have a policy which queues
query requests which are *not* from a superuser and not part of a
transaction which has acquired a snapshot or any locks, if the
number of active transactions is above a certain threshold.  Proper
configuration of a policy like this might change the performance
graph to stay relatively steady past the knee rather than
degrading.
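 
To make that concrete, here is a minimal sketch (untested) of how such a
second-tier check might be prototyped today using the existing
ExecutorStart_hook added in 8.4.  Everything named acp_* is hypothetical,
and the usual module boilerplate (_PG_init, shared memory allocation, an
ExecutorEnd hook to decrement the counter, checks for held snapshots and
locks) is omitted:
 
  #include "postgres.h"
 
  #include "executor/executor.h"
  #include "miscadmin.h"              /* superuser() */
  #include "storage/spin.h"
 
  /* Hypothetical shared state; real code would put this in shared memory. */
  typedef struct AcpSharedState
  {
      slock_t     mutex;
      int         active_queries;
  } AcpSharedState;
 
  static AcpSharedState *acp_state = NULL;
  static int      acp_active_limit = 20;      /* would be a GUC */
  static ExecutorStart_hook_type prev_ExecutorStart = NULL;
 
  /* Try to reserve an "active query" slot; true means we may run now. */
  static bool
  acp_try_admit(void)
  {
      bool        admitted;
 
      SpinLockAcquire(&acp_state->mutex);
      admitted = (acp_state->active_queries < acp_active_limit);
      if (admitted)
          acp_state->active_queries++;
      SpinLockRelease(&acp_state->mutex);
 
      return admitted;
  }
 
  /* Second-tier admission check: runs after planning, before execution. */
  static void
  acp_ExecutorStart(QueryDesc *queryDesc, int eflags)
  {
      /* Superusers bypass the policy, as with reserved connections. */
      if (acp_state != NULL && !superuser())
      {
          /* Crude polling; a real policy would sleep on a wait queue. */
          while (!acp_try_admit())
              pg_usleep(10000L);      /* 10 ms */
      }
 
      if (prev_ExecutorStart)
          prev_ExecutorStart(queryDesc, eflags);
      else
          standard_ExecutorStart(queryDesc, eflags);
  }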
 
We occasionally see posts where people have exhausted available
RAM and suffered a severe performance hit or a crash, due to an
excessively high setting of work_mem or maintenance_work_mem.
A good policy might warn and reduce the setting or reschedule
execution to keep things from getting too out of hand.
 
A good policy might also reduce conflicts between transactions,
making stricter transaction isolation less painful.  While this
observation motivated me to think about it, it seems potentially
useful on its own.
 
It might perhaps make sense to provide some hook to allow custom
policies to supplement or override a simple default policy.
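 
For illustration only, such a hook might look something like the
following declaration sketch; nothing like it exists today, and the
names are invented:
 
  #include "postgres.h"
  #include "executor/execdesc.h"      /* QueryDesc */
 
  /* Hypothetical -- no such hook exists in PostgreSQL today. */
 
  /* Return true to admit the query now, false to queue it. */
  typedef bool (*admission_policy_hook_type) (QueryDesc *queryDesc);
 
  extern PGDLLIMPORT admission_policy_hook_type admission_policy_hook;
 
  /*
   * The executor, after planning but before starting execution, would
   * consult the hook if set, falling back to a simple built-in policy
   * (such as an active query limit) when it is NULL.
   */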
 
Thoughts?
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
Hi,

On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
 We often see posts from people who have more active connections than
 is efficient.

How would your proposal better solve the problem than using pgbouncer?

<mad proposal time>
I'd be in favor of considering how to get pgbouncer into -core, and now that we
have Hot Standby, maybe implement a mode in which, as soon as a real XID is
needed, or maybe upon receiving a START TRANSACTION READ WRITE command, the
connection is transparently handed off to the master.
</mad proposal time>

Regards,
-- 
dim


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Andres Freund
On Monday 28 December 2009 22:39:06 Dimitri Fontaine wrote:
 Hi,
 
 On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
  We often see posts from people who have more active connections than
  is efficient.
 
 How would your proposal better solve the problem than using pgbouncer?
 
 <mad proposal time>
 I'd be in favor of considering how to get pgbouncer into -core, and now
  that we have Hot Standby, maybe implement a mode in which, as soon as a
  real XID is needed, or maybe upon receiving a START TRANSACTION READ WRITE
  command, the connection is transparently handed off to the master.
 </mad proposal time>
That's not as easy as it sounds - the master may no longer have all the data
needed by the snapshot on the slave.

Andres



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
On Dec 28, 2009, at 22:46, Andres Freund wrote:
 <mad proposal time>
 I'd be in favor of considering how to get pgbouncer into -core, and now
 that we have Hot Standby, maybe implement a mode in which, as soon as a
 real XID is needed, or maybe upon receiving a START TRANSACTION READ WRITE
 command, the connection is transparently handed off to the master.
 </mad proposal time>

 That's not as easy as it sounds - the master may no longer have all the data
 needed by the snapshot on the slave.

I suppose that if it were easy, some patch would already be around for the
next commit fest? :)

Seriously, your point is why I'd be tempted to consider switching to the
master only at transaction start time, that is, before any snapshot is taken.

Regards,
-- 
dim


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Dimitri Fontaine dfonta...@hi-media.com wrote:
 On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
 We often see posts from people who have more active connections
 than is efficient.
 
 How would your proposal better solve the problem than using
 pgbouncer?
 
With my current knowledge of pgbouncer I can't answer that
definitively; but *if* pgbouncer, when configured for transaction
pooling, can queue new transaction requests until a connection is
free, then the differences would be:
 
(1)  According to the pgbouncer documentation, transaction pooling
is a hack, as it breaks application expectations of the backend
connection.  You can use it only when the application cooperates
with such usage by not using features that can break.  This would
not be an issue with an ACP.
 
(2)  For the active connection aspect of the policy, you could let
through superuser requests while other requests were queuing.
 
(3)  With the ACP, the statements would be parsed and optimized
before queuing, so they would be ready to execute as soon as a
connection was freed.
 
(4)  Other factors than active connection count could be applied,
like expected memory consumption, or more esoteric metrics.
 
In favor of pgbouncer (or other connection poolers): they don't
require the overhead of a backend process and connection for each
idle client connection, so I would recommend a connection pooler
even with an ACP.  They cover overlapping ground, but I see them as
more complementary than competing.
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
On Dec 28, 2009, at 22:59, Kevin Grittner wrote:
 With my current knowledge of pgbouncer I can't answer that
 definitively; but *if* pgbouncer, when configured for transaction
 pooling, can queue new transaction requests until a connection is
 free, then the differences would be:

It does that, yes. You set up a pool, which is per database/user pair, and
when there are no server-side connections left in the pool, the clients are
held in the cl_waiting state.

 (1)  According to the pgbouncer documentation, transaction pooling
 is a hack, as it breaks application expectations of the backend
 connection.  You can use it only when the application cooperates
 with such usage by not using features that can break.  This would
 not be an issue with an ACP.

That's why there's both transaction and session pooling. The benefit of session 
pooling is to avoid forking backends, reusing them instead, and you still get 
the pooling control.

 (2)  For the active connection aspect of the policy, you could let
 through superuser requests while other requests were queuing.

The superuser is just another user and gets its own pool; I'm not sure whether
you can size it differently though (yet). It's possible to work around that by
defining another (virtual) database whose server connection string forces the
user, and then telling your application to use this special database.
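 
For what it's worth, that trick might look roughly like this in
pgbouncer.ini (the database names and pool sizes are made up; user= and
pool_size in a [databases] entry are standard pgbouncer options):
 
  [databases]
  ; normal pool for the application
  appdb       = host=127.0.0.1 port=5432 dbname=appdb
 
  ; "virtual" database forcing the superuser, with its own small pool
  appdb_admin = host=127.0.0.1 port=5432 dbname=appdb user=postgres pool_size=2
 
  [pgbouncer]
  listen_port       = 6432
  pool_mode         = transaction
  default_pool_size = 20
  max_client_conn   = 500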

 (3)  With the ACP, the statements would be parsed and optimized
 before queuing, so they would be ready to execute as soon as a
 connection was freed.

There's a pgfoundry project called preprepare, which can be used along with 
pgbouncer to get this effect. If you use 8.4, you can even get the effect 
without pgbouncer.

  http://preprepare.projects.postgresql.org/README.html

 (4)  Other factors than active connection count could be applied,
 like expected memory consumption, or more esoteric metrics.

Anything you can put in a connection string or a per-role setting can be used
in such a virtual database to have it pre-set, but that means different pools
(and those accumulate) and different connection strings for the application.
The only advantage is that it works with released and proven code!  (Except for
preprepare... well, I've been told it's running in production somewhere.)

 In favor of pgbouncer (or other connection poolers): they don't
 require the overhead of a backend process and connection for each
 idle client connection, so I would recommend a connection pooler
 even with an ACP.  They cover overlapping ground, but I see them as
 more complementary than competing.

Yeah, just trying to understand what you're proposing in terms of what I 
already know :)
-- 
dim




Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Dimitri Fontaine dfonta...@hi-media.com wrote:
 
 That's why there's both transaction and session pooling. The
 benefit of session pooling is to avoid forking backends, reusing
 them instead, and you still get the pooling control.
 
So the application would need to open and close a pgbouncer
connection for each database transaction in order to share the
backend properly?
 
 (4)  Other factors than active connection count could be applied,
 like expected memory consumption, or more esoteric metrics.
 
 Anything you can put in a connection string or a per-role setting
 can be used in such a virtual database to have it pre-set, but that
 means different pools (and those accumulate) and different
 connection strings for the application.
 
Well, I don't know that you can very accurately predict a plan or
what its memory usage would be.  Trying to work out all permutations
in advance and send each query to the right pool doesn't seem
workable on a large scale.
 
If we had a pooler bundled into the backend and defaulting to a
halfway reasonable configuration, it's possible that a second-tier
ACP implementing an active connection limit would cover close
enough to the same ground to be redundant.  I'm not quite
convinced, however, that your proposed use of pgbouncer for this,
given the multiple pools which would need to be configured and the
application awareness of and cooperation with the policy it might
require, would be better than a fairly simple ACP.  It seems a bit
like driving nails with a wrench.  I like wrenches, and I use them
to turn things, but I don't like using them to drive nails when I
can help it.  :-)
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
On Dec 28, 2009, at 23:35, Kevin Grittner wrote:
 So the application would need to open and close a pgbouncer
 connection for each database transaction in order to share the
 backend properly?

No, in session pooling you get the same backend connection for the entire
pgbouncer connection; it's a 1-1 mapping.

 Well, I don't know that you can very accurately predict a plan or
 what its memory usage would be.  Trying to work out all permutations
 in advance and send each query to the right pool doesn't seem
 workable on a large scale.

True. I was just trying to see what components we already have, while you're 
explaining what's missing: teamwork? :)

 If we had a pooler bundled into the backend and defaulting to a
 halfway reasonable configuration, it's possible that a second-tier
 ACP implementing an active connection limit would cover close
 enough to the same ground to be redundant.  I'm not quite
 convinced, however, that your proposed use of pgbouncer for this,
 given the multiple pools which would need to be configured and the
 application awareness of and cooperation with the policy it might
 require, would be better than a fairly simple ACP.  It seems a bit
 like driving nails with a wrench.  I like wrenches, and I use them
 to turn things, but I don't like using them to drive nails when I
 can help it.  :-)

Hehe, pushing what we already have to its limits is often a nice way to
describe what we want but still don't have... I think...
-- 
dim




Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Dimitri Fontaine dfonta...@hi-media.com wrote:
 On Dec 28, 2009, at 22:59, Kevin Grittner wrote:
 
 (3)  With the ACP, the statements would be parsed and optimized
 before queuing, so they would be ready to execute as soon as a
 connection was freed.
 
 There's a pgfoundry project called preprepare, which can be used
 along with pgbouncer to get this effect. If you use 8.4, you can
 even get the effect without pgbouncer.
 
   http://preprepare.projects.postgresql.org/README.html
 
I just reviewed the documentation for preprepare -- I can see a use
case for that, but I really don't think it has a huge overlap with
my point.  The parsing and planning mentioned in my point 3 would
apply to any query -- ad hoc, generated by an ORM, etc.  The
preprepare project seems to be a way to create persistent prepared
statements which are automatically materialized upon connection.
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Dimitri Fontaine
On Dec 28, 2009, at 23:56, Kevin Grittner wrote:
  http://preprepare.projects.postgresql.org/README.html
 
 I just reviewed the documentation for preprepare -- I can see a use
 case for that, but I really don't think it has a huge overlap with
 my point.  The parsing and planning mentioned in my point 3 would
 apply to any query -- ad hoc, generated by an ORM, etc.  The
 preprepare project seems to be a way to create persistent prepared
 statements which are automatically materialized upon connection.

Just that, right.
-- 
dim


Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Dimitri Fontaine dfonta...@hi-media.com wrote:
 
 No, in session pooling you get the same backend connection for the
 entire pgbouncer connection; it's a 1-1 mapping.
 
Right -- so it doesn't allow more logical connections than backend
connections while limiting how many are active at any one time,
*unless* the clients cooperate by closing their connections between
transactions -- effectively requiring a client-side yield to
accomplish what an ACP could do without special client cooperation.
 
 Well, I don't know that you can very accurately predict a plan or
 what its memory usage would be.  Trying to work out all
 permutations in advance and send each query to the right pool
 doesn't seem workable on a large scale.
 
 True. I was just trying to see what components we already have,
 while you're explaining what's missing: teamwork? :)
 
It would take a lot more than teamwork to accurately predict those
things, particularly in an environment with a large number of
dynamically generated queries.
 
 pushing what we already have to its limits is often a nice way
 to describe what we want but still don't have...
 
Sure, and I'm a big fan of building things from proven smaller
pieces where possible, as with the Linux utilities (grep, sed, awk,
find, xargs).  I just think that in this case a connection pool is
complementary and doesn't fit into the solution to these particular
problems very well.
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Robert Haas
On Mon, Dec 28, 2009 at 3:33 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 They describe a two-tier approach, where the first tier is already
 effectively implemented in PostgreSQL with the max_connections and
 superuser_reserved_connections GUCs.  The second tier is implemented
 to run after a plan is chosen, and may postpone execution of a query
 (or reduce the resources it is allowed) if starting it at that time
 might overload available resources.

It seems like it might be helpful, before tackling what you're talking
about here, to have some better tools for controlling resource
utilization.  Right now, the tools we have are pretty crude.  You can't
even nice/ionice a certain backend without risking priority inversion,
and there's no sensible way to limit the amount of working memory
per query, only per query node.

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php

...Robert



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 It seems like it might be helpful, before tackling what you're talking
 about here, to have some better tools for controlling resource
 utilization.  Right now, the tools we have are pretty crude.  You can't
 even nice/ionice a certain backend without risking priority inversion,
 and there's no sensible way to limit the amount of working memory
 per query, only per query node.
 
 http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php
 
I will review and consider.  Thanks.
 
-Kevin



Re: [HACKERS] Admission Control Policy

2009-12-28 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 The second tier is implemented to run after a plan is chosen, and
 may postpone execution of a query (or reduce the resources it is
 allowed) if starting it at that time might overload available
 resources.
 
 It seems like it might be helpful, before tackling what you're
 talking about here, to have some better tools for controlling
 resource utilization.  Right now, the tools we have are pretty
 crude.  You can't even nice/ionice a certain backend without
 risking priority inversion, and there's no sensible way to limit
 the amount of working memory per query, only per query node.
 
 http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php
 
I see your point, but it seems largely orthogonal:
 
(1)  These issues wouldn't preclude a very simple but still useful
ACP which just limits the active connection count.  This is really
what I most want, and would solve a problem frequently reported on
the lists.
 
(2)  If the ACP had a hook to allow plugging new policies, it would
support development and testing of the types of measurement and
control you describe, not hinder it.
 
(3)  You could get some useful benefit from an ACP which just
postponed queries when a memory-heavy plan was ready and a lot of
memory was already reserved by executing queries anticipated to be
memory-heavy.  That is, you wouldn't need to solve the harder
problem of *limiting* memory usage to get benefit from being able to
roughly *estimate* memory usage.
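 
To illustrate what I mean by a rough estimate, something like the
following sketch might be enough: it just counts the plan nodes that can
each use up to work_mem.  (The function names are mine, and it ignores
subplans, init plans, and multi-batch hash joins, so it is only a crude
bound -- but a crude bound may be all an ACP needs.)
 
  #include "postgres.h"
  #include "miscadmin.h"              /* work_mem, in kB */
  #include "nodes/plannodes.h"
 
  /* Count plan nodes that can each consume up to work_mem. */
  static int
  count_work_mem_nodes(Plan *plan)
  {
      int         count = 0;
 
      if (plan == NULL)
          return 0;
 
      switch (nodeTag(plan))
      {
          case T_Sort:
          case T_Hash:
          case T_Material:
          case T_Agg:                 /* hashed form uses work_mem */
              count = 1;
              break;
          default:
              break;
      }
 
      return count
          + count_work_mem_nodes(plan->lefttree)
          + count_work_mem_nodes(plan->righttree);
  }
 
  /* Very rough estimate of the query's peak work memory, in kilobytes. */
  static long
  estimate_query_work_mem(Plan *plan)
  {
      return (long) count_work_mem_nodes(plan) * work_mem;
  }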
 
Frankly, solving the problems you reference might be more work than
implementing true serializable transactions.  (At least *I'm*
clueless about how to solve the memory allocation problems, and feel
relatively confident about how to deal with serializable
transactions.)  I'm interested in ACPs because even the simplest
implementation could reduce the number of serialization errors in
some environments, improving performance at the serializable isolation
level.  If doing that is a first step in helping to solve the
problems you describe, I'll be happy to have helped.  I don't think
our shop can afford to tackle everything you reference there,
however.
 
-Kevin
