[HACKERS] Admission Control Policy
This paper has a brief but interesting discussion of admission control in section 2.4:

Architecture of a Database System. (Joseph M. Hellerstein, Michael Stonebraker and James Hamilton). Foundations and Trends in Databases 1(2).
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf

They describe a two-tier approach, where the first tier is already effectively implemented in PostgreSQL with the max_connections and superuser_reserved_connections GUCs. The second tier runs after a plan is chosen, and may postpone execution of a query (or reduce the resources it is allowed) if starting it at that time might overload available resources.

I think that implementing something like this could potentially help with several types of problems.

We often see posts from people who have more active connections than is efficient. We could, for example, have a policy which queues query requests which are *not* from a superuser and not part of a transaction which has already acquired a snapshot or any locks, whenever the number of active transactions is above a certain threshold. Proper configuration of a policy like this might change the performance graph to stay relatively steady past the knee rather than degrading.

We occasionally see posts where people have exhausted available RAM and suffered a severe performance hit or a crash, due to an excessively high setting of work_mem or maintenance_work_mem. A good policy might warn and reduce the setting, or reschedule execution, to keep things from getting too far out of hand.

A good policy might also reduce conflicts between transactions, making stricter transaction isolation less painful. While this observation is what motivated me to think about the idea, it seems potentially useful on its own.

It might perhaps make sense to provide some hook to allow custom policies to supplement or override a simple default policy.

Thoughts?
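The queuing policy sketched above can be illustrated outside the server. The following is a toy model (all names are hypothetical, not PostgreSQL internals) of a second-tier gate that admits superusers and transactions already holding resources immediately, and queues other new transactions once the active count hits a threshold:

```python
import threading

class AdmissionGate:
    """Toy second-tier admission control: cap concurrently active
    transactions. Superusers, and transactions that have already
    acquired a snapshot or locks, bypass the queue entirely (queuing
    them could cause deadlock or priority inversion)."""

    def __init__(self, max_active):
        self._slots = threading.Semaphore(max_active)

    def admit(self, is_superuser=False, holds_resources=False):
        """Block until admitted. Returns True iff an active slot was
        consumed, so the caller knows whether to call release()."""
        if is_superuser or holds_resources:
            return False        # never queued, no slot consumed
        self._slots.acquire()   # blocks until an active slot frees up
        return True

    def release(self):
        self._slots.release()

gate = AdmissionGate(max_active=2)
took = gate.admit()               # first query: admitted
gate.admit()                      # second query: admitted
gate.admit(is_superuser=True)     # superuser: never queued
# a third ordinary query would now block in admit() until release()
if took:
    gate.release()
```

A real implementation would live in the backend after planning, but the control flow (bypass tests, then block on a counted resource) is the essence of the proposed policy.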
-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Admission Control Policy
Hi,

On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
> We often see posts from people who have more active connections than
> is efficient.

How would your proposal better solve the problem than using pgbouncer?

<mad proposal time>
I'd be in favor of considering how to get pgbouncer into -core, and now that we have Hot Standby, maybe implement a mode in which, as soon as a real XID is needed (or maybe upon receiving a "start transaction read write" command), the connection is handed off transparently to the master.
</mad proposal time>

Regards,
--
dim
Re: [HACKERS] Admission Control Policy
On Monday 28 December 2009 22:39:06 Dimitri Fontaine wrote:
> Hi,
>
> On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
> > We often see posts from people who have more active connections than
> > is efficient.
>
> How would your proposal better solve the problem than using pgbouncer?
>
> <mad proposal time>
> I'd be in favor of considering how to get pgbouncer into -core, and
> now that we have Hot Standby, maybe implement a mode in which, as soon
> as a real XID is needed (or maybe upon receiving a "start transaction
> read write" command), the connection is handed off transparently to
> the master.
> </mad proposal time>

That's not as easy as it sounds - the master may not have all the data needed by the snapshot on the slave anymore.

Andres
Re: [HACKERS] Admission Control Policy
On Dec 28, 2009, at 22:46, Andres Freund wrote:
> > <mad proposal time>
> > I'd be in favor of considering how to get pgbouncer into -core, and
> > now that we have Hot Standby, maybe implement a mode in which, as
> > soon as a real XID is needed (or maybe upon receiving a "start
> > transaction read write" command), the connection is handed off
> > transparently to the master.
> > </mad proposal time>
>
> That's not as easy as it sounds - the master may not have all the data
> needed by the snapshot on the slave anymore.

I suppose that if it was easy, some patch would already be around for the next commit fest? :)

Seriously, your point is why I'd be tempted to only consider going to the master at transaction start time, that is, before any snapshot is taken.

Regards,
--
dim
Re: [HACKERS] Admission Control Policy
Dimitri Fontaine dfonta...@hi-media.com wrote:
> On Dec 28, 2009, at 21:33, Kevin Grittner wrote:
> > We often see posts from people who have more active connections than
> > is efficient.
>
> How would your proposal better solve the problem than using pgbouncer?

With my current knowledge of pgbouncer I can't answer that definitively; but *if* pgbouncer, when configured for transaction pooling, can queue new transaction requests until a connection is free, then the differences would be:

(1) According to the pgbouncer documentation, transaction pooling is a hack, as it breaks application expectations of the backend connection; you can use it only when the application cooperates by not using features that can break. This would not be an issue with an ACP.

(2) For the active connection aspect of the policy, you could let through superuser requests while other requests were queuing.

(3) With the ACP, the statements would be parsed and optimized before queuing, so they would be ready to execute as soon as a connection was freed.

(4) Factors other than active connection count could be applied, like expected memory consumption, or more esoteric metrics.

In favor of pgbouncer (and other connection poolers): they don't require the overhead of a process and a connection for each idle connection, so I would recommend a connection pooler even with an ACP. They cover overlapping ground, but I see them as more complementary than competing.

-Kevin
Re: [HACKERS] Admission Control Policy
On Dec 28, 2009, at 22:59, Kevin Grittner wrote:
> With my current knowledge of pgbouncer I can't answer that
> definitively; but *if* pgbouncer, when configured for transaction
> pooling, can queue new transaction requests until a connection is
> free, then the differences would be:

It does that, yes. You set up a pool, which is per database/user, and when there's no more server-side connection in the pool, the clients are held in cl_waiting state.

> (1) According to the pgbouncer documentation, transaction pooling is a
> hack [...] This would not be an issue with an ACP.

That's why there's both transaction and session pooling. The benefit of session pooling is to avoid forking backends, reusing them instead, while you still get the pooling control.

> (2) For the active connection aspect of the policy, you could let
> through superuser requests while other requests were queuing.

The superuser is another user and gets its own pool; I'm not sure you can size it differently though (yet). It's possible to trick it a little by defining another (virtual) database where you force the user in the connection string to the server, then tell your application to use this special database.

> (3) With the ACP, the statements would be parsed and optimized before
> queuing, so they would be ready to execute as soon as a connection was
> freed.

There's a pgfoundry project called preprepare, which can be used along with pgbouncer to get this effect. If you use 8.4, you can even get the effect without pgbouncer.
http://preprepare.projects.postgresql.org/README.html

> (4) Factors other than active connection count could be applied, like
> expected memory consumption, or more esoteric metrics.

Anything you can put in connection strings or per-role settings can be used to trick a virtual database and have it pre-set, but that means different pools (they accumulate, now) and different connection strings for the application. The only advantage is that it works with released and proven code! (Except for preprepare... well, I've been told it's running in production somewhere.)

> In favor of pgbouncer (and other connection poolers): they don't
> require the overhead of a process and a connection for each idle
> connection, so I would recommend a connection pooler even with an ACP.
> They cover overlapping ground, but I see them as more complementary
> than competing.

Yeah, just trying to understand what you're proposing in terms of what I already know :)

--
dim
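For readers unfamiliar with pgbouncer, the per-database pools and the "virtual database" trick described above might look something like the following pgbouncer.ini fragment. The directive names are pgbouncer's documented options, but the database names, sizes, and paths here are invented for illustration:

```ini
; Illustrative pgbouncer.ini fragment (names and sizes invented).
; Both entries point at the same backend database; "admindb" is the
; virtual database with a forced user and its own small pool.
[databases]
appdb   = host=127.0.0.1 dbname=mydb pool_size=20
admindb = host=127.0.0.1 dbname=mydb user=postgres pool_size=3

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling: a server connection is tied to a client only
; for the duration of a transaction (breaks session-level features)
pool_mode = transaction
default_pool_size = 20
```

An application connecting to "admindb" instead of "appdb" gets the separate pool, which is the sizing workaround discussed above.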
Re: [HACKERS] Admission Control Policy
Dimitri Fontaine dfonta...@hi-media.com wrote:
> That's why there's both transaction and session pooling. The benefit
> of session pooling is to avoid forking backends, reusing them instead,
> while you still get the pooling control.

So the application would need to open and close a pgbouncer connection for each database transaction in order to share the backend properly?

> > (4) Factors other than active connection count could be applied,
> > like expected memory consumption, or more esoteric metrics.
>
> Anything you can put in connection strings or per-role settings can be
> used to trick a virtual database and have it pre-set, but that means
> different pools (they accumulate, now) and different connection
> strings for the application.

Well, I don't know that you can very accurately predict a plan, or what its memory usage would be, up front. Trying to work out all the permutations in advance and send each query to the right pool doesn't seem workable on a large scale.

If we had a pooler bundled into the backend which defaulted to a halfway reasonable configuration, it's possible that implementing an active connection limit in the second-tier ACP would cover close enough to the same ground as to be redundant. I'm not quite convinced, however, that your proposed use of pgbouncer for this, given the multiple pools which would need to be configured and the application awareness of and cooperation with the policy that might be required, would be better than a fairly simple ACP.

It seems a bit like driving nails with a wrench. I like wrenches, I use them to turn things, but I don't like using them to drive nails when I can help it. :-)

-Kevin
Re: [HACKERS] Admission Control Policy
On Dec 28, 2009, at 23:35, Kevin Grittner wrote:
> So the application would need to open and close a pgbouncer connection
> for each database transaction in order to share the backend properly?

No, in session pooling you get the same backend connection for the entire pgbouncer connection; it's a 1-1 mapping.

> Well, I don't know that you can very accurately predict a plan, or
> what its memory usage would be. Trying to work out all the
> permutations in advance and send each query to the right pool doesn't
> seem workable on a large scale.

True. I was just trying to see what components we already have, while you're explaining what's missing: teamwork? :)

> If we had a pooler bundled into the backend [...] It seems a bit like
> driving nails with a wrench. I like wrenches, I use them to turn
> things, but I don't like using them to drive nails when I can help
> it. :-)

Hehe, pushing what we already have to its limits is often a nice way to describe what we want but still don't have... I think...

--
dim
Re: [HACKERS] Admission Control Policy
Dimitri Fontaine dfonta...@hi-media.com wrote:
> On Dec 28, 2009, at 22:59, Kevin Grittner wrote:
> > (3) With the ACP, the statements would be parsed and optimized
> > before queuing, so they would be ready to execute as soon as a
> > connection was freed.
>
> There's a pgfoundry project called preprepare, which can be used along
> with pgbouncer to get this effect. If you use 8.4, you can even get
> the effect without pgbouncer.
> http://preprepare.projects.postgresql.org/README.html

I just reviewed the documentation for preprepare -- I can see a use case for that, but I really don't think it has a huge overlap with my point. The parsing and planning mentioned in my point (3) would apply to any query -- ad hoc, generated by an ORM, etc. The preprepare project seems to be a way to create persistent prepared statements which are automatically materialized upon connection.

-Kevin
Re: [HACKERS] Admission Control Policy
On Dec 28, 2009, at 23:56, Kevin Grittner wrote:
> > http://preprepare.projects.postgresql.org/README.html
>
> [...] The preprepare project seems to be a way to create persistent
> prepared statements which are automatically materialized upon
> connection.

Just that, right.

--
dim
Re: [HACKERS] Admission Control Policy
Dimitri Fontaine dfonta...@hi-media.com wrote:
> No, in session pooling you get the same backend connection for the
> entire pgbouncer connection; it's a 1-1 mapping.

Right -- so it doesn't allow more logical connections than that, with a limit on how many are active at any one time, *unless* the clients cooperate by closing their connections between transactions -- effectively requiring a client yield to accomplish what an ACP could do without special client cooperation.

> > Well, I don't know that you can very accurately predict a plan, or
> > what its memory usage would be. Trying to work out all the
> > permutations in advance and send each query to the right pool
> > doesn't seem workable on a large scale.
>
> True. I was just trying to see what components we already have, while
> you're explaining what's missing: teamwork? :)

It would take a lot more than teamwork to accurately predict those things, particularly in an environment with a large number of dynamically generated queries.

> pushing what we already have to its limits is often a nice way to
> describe what we want but still don't have...

Sure, and I'm a big fan of building things from proven smaller pieces where possible, like the Linux utilities (grep, sed, awk, find, xargs). I just think that in this case a connection pooler is complementary, and doesn't fit into the solution to these particular problems very well.

-Kevin
Re: [HACKERS] Admission Control Policy
On Mon, Dec 28, 2009 at 3:33 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> They describe a two-tier approach, where the first tier is already
> effectively implemented in PostgreSQL with the max_connections and
> superuser_reserved_connections GUCs. The second tier runs after a plan
> is chosen, and may postpone execution of a query (or reduce the
> resources it is allowed) if starting it at that time might overload
> available resources.

It seems like it might be helpful, before tackling what you're talking about here, to have some better tools for controlling resource utilization. Right now, the tools we have are pretty crude. You can't even nice/ionice a certain backend without risking priority inversion, and there's no sensible way to limit the amount of working memory per query, only per query node.

http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php

...Robert
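The per-node point made in this message can be illustrated with back-of-envelope arithmetic: work_mem caps memory per sort or hash node, not per query, so one query can legitimately use many multiples of it. All the numbers below are invented for illustration:

```python
# work_mem limits memory per plan node (per sort/hash operation),
# not per query, so a single complex query can use a multiple of it.
work_mem_mb = 64          # per-node limit (the GUC value)
sort_nodes = 3            # e.g. sorts feeding merge joins
hash_nodes = 2            # e.g. hash joins
concurrent_queries = 50   # similar queries running at once

per_query_mb = (sort_nodes + hash_nodes) * work_mem_mb
worst_case_mb = per_query_mb * concurrent_queries

print(per_query_mb)       # 320 MB for this single query
print(worst_case_mb)      # 16000 MB across 50 such queries
```

This is exactly why "limit working memory per query" is hard to express with the current GUC, and why rough per-query estimates feed naturally into an admission policy.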
Re: [HACKERS] Admission Control Policy
Robert Haas robertmh...@gmail.com wrote:
> It seems like it might be helpful, before tackling what you're talking
> about here, to have some better tools for controlling resource
> utilization. Right now, the tools we have are pretty crude. You can't
> even nice/ionice a certain backend without risking priority inversion,
> and there's no sensible way to limit the amount of working memory per
> query, only per query node.
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php

I will review and consider. Thanks.

-Kevin
Re: [HACKERS] Admission Control Policy
Robert Haas robertmh...@gmail.com wrote:
> Kevin Grittner kevin.gritt...@wicourts.gov wrote:
> > The second tier runs after a plan is chosen, and may postpone
> > execution of a query (or reduce the resources it is allowed) if
> > starting it at that time might overload available resources.
>
> It seems like it might be helpful, before tackling what you're talking
> about here, to have some better tools for controlling resource
> utilization. Right now, the tools we have are pretty crude. You can't
> even nice/ionice a certain backend without risking priority inversion,
> and there's no sensible way to limit the amount of working memory per
> query, only per query node.
> http://archives.postgresql.org/pgsql-hackers/2009-10/msg00125.php

I see your point, but it seems largely orthogonal:

(1) These issues wouldn't preclude a very simple but still useful ACP which just limits the active connection count. This is really what I most want, and it would solve a problem frequently reported on the lists.

(2) If the ACP had a hook to allow plugging in new policies, it would support development and testing of the types of measurement and control you describe, not hinder them.

(3) You could get some useful benefit from an ACP which just postponed queries when a memory-heavy plan was ready to run and a lot of memory was already reserved by executing queries anticipated to be memory-heavy. That is, you wouldn't need to solve the harder problem of *limiting* memory usage to get benefit from being able to roughly *estimate* memory usage.

Frankly, solving the problems you reference might be more work than implementing true serializable transactions. (At least *I'm* clueless about how to solve the memory allocation problems, while I feel relatively confident about how to deal with serializable transactions.) I'm interested in ACPs because even the simplest implementation could reduce the number of serialization errors in some environments, improving performance at the serializable isolation level.

If doing that is a first step toward helping to solve the problems you describe, I'll be happy to have helped. I don't think our shop can afford to tackle everything you reference there, however.

-Kevin
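Point (3) above, admission by rough memory estimate rather than memory enforcement, can be sketched as a simple reservation ledger. Everything here is hypothetical (the budget, the estimates, and the names), and note that nothing *enforces* the estimate; the policy only decides whether to start or postpone:

```python
class MemoryAwareAdmission:
    """Sketch of admission by *estimated* memory footprint: postpone a
    query when its estimate would push the total reserved by running
    queries past a budget. No enforcement of the estimate is attempted;
    that harder problem is deliberately left unsolved here."""

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.reserved_mb = 0

    def try_admit(self, estimated_mb):
        # Admit only if the estimate fits in the remaining budget;
        # otherwise the caller should queue the query and retry later.
        if self.reserved_mb + estimated_mb > self.budget_mb:
            return False
        self.reserved_mb += estimated_mb
        return True

    def finish(self, estimated_mb):
        self.reserved_mb -= estimated_mb

acp = MemoryAwareAdmission(budget_mb=1024)
acp.try_admit(400)   # admitted
acp.try_admit(400)   # admitted
acp.try_admit(400)   # postponed: would exceed the 1024 MB budget
acp.finish(400)
acp.try_admit(400)   # a freed reservation lets it through
```

Even with crude estimates (say, work_mem times the number of sort/hash nodes in the chosen plan), this kind of gate could blunt the worst overcommit scenarios described earlier in the thread.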