Re: [HACKERS] Overhauling GUCS
Alvaro Herrera wrote: Tom Lane escribió: Gregory Stark [EMAIL PROTECTED] writes: The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. Perhaps the overlap between that and the people who can write a server-side module which dumps out a config file according to some rules is just too small? There's a veritable boatload of stuff we do that assumes shell access (how many times have you seen cron jobs recommended, for instance?). So I'm unconvinced that modify the config without shell access is really a goal that is worth lots of effort. Actually, lots of people are discouraged by suggestions of using cron to do anything. The only reason cron is suggested is because we don't have any other answer, and for many people it's a half-solution. An integrated task scheduler in Pg would be more than welcome. Yes, I hear a lot of people complaining aobut that too. Now that we have a working autovacuum, some of it goes away though - no need to cron your VACUUMs in most cases anymore. But there are still backups - but they are often managed by the scheduler of an enterprise backup software. Also, remember that pgAdmin already comes with a pgAgent thing. Yeah, it's a real life-saver on Windows where the builtin task-scheduler isn't as readily accessible or easy to use.. //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Monday, 18. August 2008 schrieb Josh Berkus: Right now, if you want to survey your databases, tables, approx disk space, query activity, etc., you can do that all through port 5432. You can't manage most of your server settings that way, and definitely can't manage the *persistent* settings. When you're trying to manage 1000 PostgreSQL servers, this is not a minor issue. Some of that effort could go into making less settings persistent. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Monday, 18. August 2008 schrieb Tom Lane: The impression I get every time this comes up is that various people have different problems they want to solve that (they think) require redesign of the way GUC works. Those complicated solutions arise from attempting to satisfy N different demands simultaneously. Which may be the reason that I have been getting the impression that the Problems and the proposed resolutions on http://wiki.postgresql.org/wiki/GUCS_Overhaul are not really closely related. I can agree with the Problems, but then I am lost. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
What I'm interested in is auto-tuning, not necessarily overhauling GUCS, which happens to be the subject of this thread :-) Having done a SELECT * FROM pg_settings, all the information you need seems to be there... Maybe I'm being over-simplistic here, but the important bit is knowing how you should tune stuff - and this is what I'm hoping to learn through this process. Now, you could probably sidestep UI and GUCS concerns by moving the auto-tuning process inside the database. You don't need fancy GUIs for guessing configuration parameters, and if you can already do that, coming up with a GUI should be pretty straightforward. For example, I see no reason why you couldn't capture the logic of tuning in a couple of PL/Python functions to look up usage stats, size of indices etc. PL/Python being an untrusted language, you could even write a new postgresql.conf file to disk, with the suggested alterations. Cheap, quick and cheerful! Perhaps the auto-tuning conversation should take place in a separate thread, how do you feel about changing the subject line? The most insteresting bit is discussing and testing tuning strategies. This, of course, is related to the [admin] and [perform] side of things, but there is also a development dimension. As soon as there is a tuning strategy published, a number of tools will certainly follow. Michael Yes, there's a grand plan for a super-wizard that queries the database for size, index, and statistics information for figure out what to do; I've been beating that drum for a while now. Unfortunately, the actual implementation is blocked behind the dreadfully boring work of sorting out how to organize and manage the GUC information a bit better, and the moderately boring work of building a UI for modifying things. If you were hoping to work on the sexy autotuning parts without doing some of the grunt work, let me know if you can figure out how so I can follow you.
Re: [HACKERS] Overhauling GUCS
On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote: Josh Berkus wrote: Steve, First pass is done. Needs a little cleanup before sharing. I spent a fair while down OS-specific-hardware-queries rathole, but I'm better now. Gods, I hope you gave up on that. You want to use SIGAR or something. If it's going to be C++, and reasonably cross platform, and a pg tool, why not try to build something as a module in pgAdmin? Certainly going to get you a larger exposure... And I'm sure the pgAdmin team would be happy to have it! I'm attempting to build it as something that can be used in several places. Where there's most need for it is as an install time option in installers, particularly on Windows. There's no reason the same underlying code couldn't also go into pgAdmin, of course. At the moment the code is a bit Qt specific, reducing that is part of the cleanup. Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Steve Atkins wrote: On Aug 18, 2008, at 1:05 AM, Magnus Hagander wrote: Josh Berkus wrote: Steve, First pass is done. Needs a little cleanup before sharing. I spent a fair while down OS-specific-hardware-queries rathole, but I'm better now. Gods, I hope you gave up on that. You want to use SIGAR or something. If it's going to be C++, and reasonably cross platform, and a pg tool, why not try to build something as a module in pgAdmin? Certainly going to get you a larger exposure... And I'm sure the pgAdmin team would be happy to have it! I'm attempting to build it as something that can be used in several places. Where there's most need for it is as an install time option in installers, particularly on Windows. Well, if it was in pgadmin, it would be there more or less by default on Windows. And very easy to get in on other platforms, since pgadmin is already packaged there. Plus, all the dependencies are already there on said platforms. //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Mon, 18 Aug 2008, Michael Nacos wrote: Having done a SELECT * FROM pg_settings, all the information you need seems to be there... See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You sound like you're at rung 2 on the tool author ladder I describe there, still thinking about the fun part of tuning but not yet aware of the annoying postgresql.conf management issues that show up in the field that motivate many of the GUCS changes suggested. Coping with user and system-generated comments is one difficult part that people normally don't consider, dealing with bad settings the server won't start with is another. I did make one mistake in that message, which is that the context field of pg_settings already exposes when a setting can be changed. And it is possible to get the value for a setting as entered by the admin by joining pg_settings against what current_setting returns, which is one part of handling the import/change/export cycle while keeping useful units intact. Maybe I'm being over-simplistic here, but the important bit is knowing how you should tune stuff - and this is what I'm hoping to learn through this process. The tuning references at the bottom of http://wiki.postgresql.org/wiki/GUCS_Overhaul provide more detail here than anyone has been able to automate so far. There's certainly room to improve on the suggestions there with some introspection of the database, I'm trying to stay focused on something to help newbies whose settings are completely wrong first. As soon as there is a tuning strategy published, a number of tools will certainly follow. Josh Berkus published one in 2005 and zero such tools have been produced since then, even though it looked to him then (like it does to you now and like it did to me once) that such a tool would easily follow: http://pgfoundry.org/docman/?group_id=1000106 The bright side here is that you don't have to waste time tinkering in this area to find out where the dead ends are like Josh and I independantly did. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Smith [EMAIL PROTECTED] writes: On Mon, 18 Aug 2008, Michael Nacos wrote: Having done a SELECT * FROM pg_settings, all the information you need seems to be there... See http://archives.postgresql.org/pgsql-hackers/2008-06/msg00209.php You sound like you're at rung 2 on the tool author ladder I describe there, still thinking about the fun part of tuning but not yet aware of the annoying postgresql.conf management issues that show up in the field that motivate many of the GUCS changes suggested. Coping with user and system-generated comments is one difficult part that people normally don't consider, Because coping with free-form user-edited text is a losing game. People don't consider it because it's a dead-end. Instead you have one file for user-edited configuration and a separate file for computer generated configuration. You never try to automatically edit a user-edited file -- that way lies madness. dealing with bad settings the server won't start with is another. A tuning interface can't be turing complete and detect all possible misconfigurations. To do that it would have to be as complex as the server. In any case worrying about things like this before you have a tuning interface that can do the basics is putting the cart before the horse. As soon as there is a tuning strategy published, a number of tools will certainly follow. Josh Berkus published one in 2005 and zero such tools have been produced since then, even though it looked to him then (like it does to you now and like it did to me once) that such a tool would easily follow: http://pgfoundry.org/docman/?group_id=1000106 The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. Perhaps the overlap between that and the people who can write a server-side module which dumps out a config file according to some rules is just too small? I do think you and others make it less likely every time you throw up big insoluble problems like above though. As a consequence every proposal has started with big overly-complex solutions trying to solve all these incidental issues which never go anywhere instead of simple solutions which directly tackle the main problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg, The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. That's incorrect. The main reason for having a port-based API (such as the SQL command line) for managing your server is that it makes it much easier to manage a large number of servers. Right now, if you want to survey your databases, tables, approx disk space, query activity, etc., you can do that all through port 5432. You can't manage most of your server settings that way, and definitely can't manage the *persistent* settings. When you're trying to manage 1000 PostgreSQL servers, this is not a minor issue. With the growing cloud sector, the lack of easy server parameter management is hurting PostgreSQL's adoption for hosted applications. This isn't a new complaint, and is a big part of the reason why 90% of web hosts still don't offer PostgreSQL. I've heard complaints about our manageability problems from more vendors than I can count. HOWEVER, it's completely possible to get a 1st-generation config tool out there without first implementing port-based config access. For one thing, there's Puppet. So that's what I'm intending to do. I do think you and others make it less likely every time you throw up big insoluble problems like above though. It's not an insoluble problem. It's a political problem; several people don't want to add this functionality to the project. As a consequence every proposal has started with big overly-complex solutions trying to solve all these incidental issues which never go anywhere instead of simple solutions which directly tackle the main problem. What, in your opinion, is the main problem? I'm not sure we agree on that. -- --Josh Josh Berkus PostgreSQL San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Gregory Stark [EMAIL PROTECTED] writes: The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. Perhaps the overlap between that and the people who can write a server-side module which dumps out a config file according to some rules is just too small? There's a veritable boatload of stuff we do that assumes shell access (how many times have you seen cron jobs recommended, for instance?). So I'm unconvinced that modify the config without shell access is really a goal that is worth lots of effort. In any case, there's already adequate support for sucking postgresql.conf out of the machine and putting it back: pg_read_file(), pg_file_write(), pg_reload_conf(). So at the end of the day remote access isn't a factor in this at all. I do think you and others make it less likely every time you throw up big insoluble problems like above though. As a consequence every proposal has started with big overly-complex solutions trying to solve all these incidental issues which never go anywhere instead of simple solutions which directly tackle the main problem. The impression I get every time this comes up is that various people have different problems they want to solve that (they think) require redesign of the way GUC works. Those complicated solutions arise from attempting to satisfy N different demands simultaneously. The fact that many of these goals aren't subscribed to by the whole community to begin with doesn't help to ease resolution of the issues. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Greg, The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. That's incorrect. The main reason for having a port-based API (such as the SQL command line) for managing your server is that it makes it much easier to manage a large number of servers. Right now, if you want to survey your databases, tables, approx disk space, query activity, etc., you can do that all through port 5432. You can't manage most of your server settings that way, and definitely can't manage the *persistent* settings. When you're trying to manage 1000 PostgreSQL servers, this is not a minor issue. This I don't understand. If you're managing lots of servers running lots of software the last thing you want to have to do is write a custom method for updating the configuration of each service. In that use case you would prefer to just use rsync/svn/git to push the new config file to all the machines anyways. With the growing cloud sector, the lack of easy server parameter management is hurting PostgreSQL's adoption for hosted applications. This isn't a new complaint, and is a big part of the reason why 90% of web hosts still don't offer PostgreSQL. I've heard complaints about our manageability problems from more vendors than I can count. These are both use cases which fall in the category I described where you want to allow users to configure the system through an automated interface. We can do that today by generating the automatically generated section and including that in postgresql.conf as an include file. As a consequence every proposal has started with big overly-complex solutions trying to solve all these incidental issues which never go anywhere instead of simple solutions which directly tackle the main problem. What, in your opinion, is the main problem? I'm not sure we agree on that. The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg, The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. Oh, ok. I think we're in agreement, though. I don't think that's the *1st* problem to be solved, but it's definitely important. -- --Josh Josh Berkus PostgreSQL San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark [EMAIL PROTECTED] wrote: The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. Which pgAdmin has done perfectly well for years, as long as the config is all in one file. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Mon, 18 Aug 2008, Gregory Stark wrote: Because coping with free-form user-edited text is a losing game. People don't consider it because it's a dead-end. Right, that's impossible technology to build, which is why I had to plan all these screen shots showing tools that handle that easily for Apache's very similar configuration file: http://www.apache-gui.com/apache-windows.html http://kochizz.sourceforge.net/quelques-captures-decran/ Instead you have one file for user-edited configuration and a separate file for computer generated configuration. It wouldn't be so difficult if the system generated postgresql.conf didn't have all this extra junk in it, which is part of what an overhaul plans to simplify. The way the file gets spit out right now encourages some of the awful practices people develop in the field. A tuning interface can't be turing complete and detect all possible misconfigurations. To do that it would have to be as complex as the server. Thank you for supporting the case for why changes need to be to the server code itself, to handle things like validating new postgresql.conf files before they get loaded. I try not to bring that up lest it complicate things further. The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. I've never setup a hosted database on a system I don't have shell access to, so I have no idea where you get the impression that was a primary goal of anything I've said. It just so happens that improving what tuning you can do over port 5432 helps that crowd out too, that's a bonus as I see it. Ask me about EnterpriseDB's On-Demand Production Tuning ...nah, too easy, I'll just let that go. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Dave Page wrote: On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark [EMAIL PROTECTED] wrote: The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. Which pgAdmin has done perfectly well for years, as long as the config is all in one file. I'll argue it's not done it perfectly well (it's not particularly user-friendly), but it has certainly *done* it... //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On 8/18/08, Magnus Hagander [EMAIL PROTECTED] wrote: Dave Page wrote: On Mon, Aug 18, 2008 at 8:51 PM, Gregory Stark [EMAIL PROTECTED] wrote: The main problem that I've seen described is what I mentioned before: allowing adjusting the postgresql.conf GUC settings by remote users who don't have shell access. Which pgAdmin has done perfectly well for years, as long as the config is all in one file. I'll argue it's not done it perfectly well (it's not particularly user-friendly), but it has certainly *done* it... I mean it's able to read write the config file correctly. I agree the ui is, umm, sub-optimal. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane escribió: Gregory Stark [EMAIL PROTECTED] writes: The entire target market for such a thing is DBAs stuck on hosted databases which don't have shell access to their machines. Perhaps the overlap between that and the people who can write a server-side module which dumps out a config file according to some rules is just too small? There's a veritable boatload of stuff we do that assumes shell access (how many times have you seen cron jobs recommended, for instance?). So I'm unconvinced that modify the config without shell access is really a goal that is worth lots of effort. Actually, lots of people are discouraged by suggestions of using cron to do anything. The only reason cron is suggested is because we don't have any other answer, and for many people it's a half-solution. An integrated task scheduler in Pg would be more than welcome. Also, remember that pgAdmin already comes with a pgAgent thing. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Wed, 13 Aug 2008, Michael Nacos wrote: Hi there... Configuration autotuning is something I am really interested in. I have seen this page: http://wiki.postgresql.org/wiki/GUCS_Overhaul and a couple of emails mentioning this, so I wanted to ask is someone already on it? If yes, I'd like to contribute. Good time to give a status report on what's been going on with all this. With some help I just finished off an answer to problem #1 there recently, Most people have no idea how to set these. There was some concern here that work was being done on config tools without a clear vision of what was going to be tuned. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for an intro on how to set the 18 most important parameters (+7 logging parameters) based on the best information I'm aware of. Circa June, Steve Atkins was looking into writing a C++/Qt GUI tuning interface application, with the idea that somebody else would figure out the actual smarts to the tuning effort. Don't know where that's at. Josh Berkus and I have been exchanging some ideas for the GUC internals overhaul and had a quick discussion about that in person last month. We've been gravitating toward putting all the extra information we'd like to push into there in an extra catalog table (pg_settings_info or something). The stuff the server needs to start can stay right where it is right now, all the other decoration can move to the table. Ideally, an external little app should also provide recommendations based on current database usage statistics -- wouldn't this constitute something akin to application-specific advice? Yes, there's a grand plan for a super-wizard that queries the database for size, index, and statistics information for figure out what to do; I've been beating that drum for a while now. Unfortunately, the actual implementation is blocked behind the dreadfully boring work of sorting out how to organize and manage the GUC information a bit better, and the moderately boring work of building a UI for modifying things. If you were hoping to work on the sexy autotuning parts without doing some of the grunt work, let me know if you can figure out how so I can follow you. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Smith [EMAIL PROTECTED] writes: Josh Berkus and I have been exchanging some ideas for the GUC internals overhaul and had a quick discussion about that in person last month. We've been gravitating toward putting all the extra information we'd like to push into there in an extra catalog table (pg_settings_info or something). The stuff the server needs to start can stay right where it is right now, all the other decoration can move to the table. Somehow, the attraction of that idea escapes me. What we have now was named Grand Unified Configuration for a reason: it centralized the handling of what had been a mess of different things configured in different ways. I'm not eager to go backwards on that. I'm also interested to know exactly what such a table would provide that isn't already available in the form of the pg_settings view. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Sun, 17 Aug 2008, Tom Lane wrote: What we have now was named Grand Unified Configuration for a reason: it centralized the handling of what had been a mess of different things configured in different ways. I'm not eager to go backwards on that. No need to change anything related to how the configuration is done. There's really only two things wrong with what's there right now IMHO and they don't require any changes to the internals, just what's shown: 1) The view should show both how the user defined the setting and how it's represented internally. Basically something that looks like this: select name,current_setting(name) as input_setting,setting from pg_settings; 2) Expose the default value. I'm also interested to know exactly what such a table would provide that isn't already available in the form of the pg_settings view. Links to the relevant documentation and a place to save both default and user comments about the setting were two things being considered that seemed a really bad fit to tack onto the GUC structure. There's some others. The main point is that that nobody wants to have to tinker with the core GUC itself just to decorate it with more information, that is complicated enough as it is. One might make a case that the stuff the GUC must handle (settings, units, type, defaults, etc.) could be usefully separated from all the more documentation-oriented bits stored there right now (category, descriptions), and that the existing documentation bits could move over to the table along with the hyperlinks and such. Doing that adds another place to have to edit, but I think there's an even exchange available there because it enables easy auto-generation of the postgresql.conf file at initdb time from that table + pg_settings. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Aug 17, 2008, at 1:48 PM, Greg Smith wrote: On Wed, 13 Aug 2008, Michael Nacos wrote: Hi there... Configuration autotuning is something I am really interested in. I have seen this page: http://wiki.postgresql.org/wiki/ GUCS_Overhaul and a couple of emails mentioning this, so I wanted to ask is someone already on it? If yes, I'd like to contribute. Good time to give a status report on what's been going on with all this. With some help I just finished off an answer to problem #1 there recently, Most people have no idea how to set these. There was some concern here that work was being done on config tools without a clear vision of what was going to be tuned. See http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server for an intro on how to set the 18 most important parameters (+7 logging parameters) based on the best information I'm aware of. Circa June, Steve Atkins was looking into writing a C++/Qt GUI tuning interface application, with the idea that somebody else would figure out the actual smarts to the tuning effort. Don't know where that's at. First pass is done. Needs a little cleanup before sharing. I spent a fair while down OS-specific-hardware-queries rathole, but I'm better now. Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Steve, First pass is done. Needs a little cleanup before sharing. I spent a fair while down OS-specific-hardware-queries rathole, but I'm better now. Gods, I hope you gave up on that. You want to use SIGAR or something. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Hi there... Configuration autotuning is something I am really interested in. I have seen this page: http://wiki.postgresql.org/wiki/GUCS_Overhaul and a couple of emails mentioning this, so I wanted to ask is someone already on it? If yes, I'd like to contribute. Ideally, an external little app should also provide recommendations based on current database usage statistics -- wouldn't this constitute something akin to application-specific advice? In this regard, the tool could also tell you things like how much RAM you should order if you were to replace your existing database server :-) Michael
Re: [HACKERS] Overhauling GUCS
Added to TODO: o Add external tool to auto-tune some postgresql.conf parameters http://archives.postgresql.org/pgsql-hackers/2008-06/msg0.php --- Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: * Can we build a configuration wizard to tell newbies what settings they need to tweak? That would trump all the other suggestions conclusively. Anyone good at expert systems? How far could we get with the answers to just three questions: * How many concurrent queries do you expect to have? * How much RAM space are you willing to let Postgres use? * How much overhead disk space are you willing to let Postgres use? concurrent queries drives max_connections, obviously, and RAM space would drive shared_buffers and effective_cache_size, and both of them would be needed to size work_mem. The third one is a bit weird but I don't see any other good way to set the checkpoint parameters. If those aren't enough questions, what else must we ask? Or maybe they aren't the right questions at all --- maybe we should ask is this a dedicated machine or not and try to extrapolate everything else from what we (hopefully) can find out about the hardware. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Smith wrote: On Thu, 5 Jun 2008, Magnus Hagander wrote: We really need a proper API for it, and the stuff in pgAdmin isn't even enough to base one on. I would be curious to hear your opinion on whether the GUC overhaul discussed in this thread is a useful precursor to building such a proper API. I'm getting back to this one now that the discussion has died down a bit. As usual, the discussion spread out into these huge changes needed to be done to satisfy everyone from day one. I don't think that's the way to go - we have to do it piece by piece if that's ever going to be done. IMHO the first thing to do is to create a stable API for modifying config variables remotely. *at this time* it doesn't matter how this API modifies the configuration, if it's in a file on in the db or whatever. We can always change that later... Which is why I will be proceeding with this one - make an API that requires the least possible change for now, per discussions earlier this year (not in this thread) :-) Now, this in itself is unrelated to the stuff Josh was talking about which is collecting the information in one place and being able to generate config files that don't contain everything, and being able to auto-tune things. It would be nice for such a tool to be able to get the full descriptions and such from the pg_settings view or such, which means it needs to go in the code and not in a standalone file, but it's not *as necessary*. I still think the config file we ship today is horrible, I get lots of questions around it, and I see lots of examples of people who tweak settings they have no idea what they do, just because it's there. That's a separate issue that could *either* be solved by shipping more than one default config file, or it could be solved by the config file generator Josh proposed. //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote: Dave Page wrote: On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Dave Page wrote: On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: pg_ctl -D data check? I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. Should this be a TODO? Yes please. Added to TODO: * Add pg_ctl option to do a syntax check of postgresql.conf ISTM we need something that can run inside the db as well, i'm thinking something like pg_check_conf() to go with pg_reload_conf(). Also, these should probably check validity of the pg_hba.conf as well. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Robert Treat wrote: On Wednesday 11 June 2008 16:54:23 Bruce Momjian wrote: Dave Page wrote: On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Dave Page wrote: On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: pg_ctl -D data check? I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. Should this be a TODO? Yes please. Added to TODO: * Add pg_ctl option to do a syntax check of postgresql.conf ISTM we need something that can run inside the db as well, i'm thinking something like pg_check_conf() to go with pg_reload_conf(). Also, these should probably check validity of the pg_hba.conf as well. Agreed, TODO updated: o Add functions to syntax check configuration files -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Also, I'd actually assert that 10 seems to be perfectly adequate for the majority of users. That is, the number of users where I've recommended increasing d_s_t for the whole database is smaller than the number where I don't, and of course we never hear from most users at all. So I'm pretty happy recommending Leave the default. If you encounter problem queries, increase it to 100, and analyse the database. Really? I'm the opposite: I never leave a client's setting at 10, that's just asking for trouble. Making it 100 *after* you encounter problem queries is reactive; I prefer being proactive. Nor is a setting of 10 perfectly adequate: I think you might be the last person on the lists who thinks so. That train has left the station, we've been trying to decide what a better default should be other than 10, and, more to the point, how to quantitatively measure it. The problem is, you really can't. Sure, you can graph a tiny increase in ANALYZE time and disk space, but there are no stock queries we can use to measure an increase in planning time. Frankly, I'd be shocked if there is any significant difference and all compared to the actual query run time. The orders of magnitude speed up of certain queries when the d_s_t goes above 98 is what spawned my original thread proposing a change to 100: http://markmail.org/message/tun3a3juxlsyjbsw While it's easy to get bogged down in theory about what things d_s_t should measure, the optimal size of buckets, etc., it's still a severe performance regression bug that should be fixed, IMO. Changing the subject line as well: this is only tangentially related to overhauling GUCS, although I'll point out again that this particular config is a good example of one that needs more comments. - -- Greg Sabino Mullane [EMAIL PROTECTED] PGP Key: 0x14964AC8 200806121213 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAkhRTKYACgkQvJuQZxSWSsjGvACeJkXZJ8cP385W9UXKzLHdzhvw gqQAoJWdrepFbkxR2be7oetK8/o/yd9I =w469 -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Sabino Mullane [EMAIL PROTECTED] writes: The orders of magnitude speed up of certain queries when the d_s_t goes above 98 is what spawned my original thread proposing a change to 100: http://markmail.org/message/tun3a3juxlsyjbsw That was a pretty special case (LIKE/regex estimation), and we've since eliminated the threshold change in the LIKE/regex estimates anyway, so there's no longer any reason to pick 100 as opposed to any other number. So we're still back at what's a good value and why?. Frankly, I'd be shocked if there is any significant difference and all compared to the actual query run time. I'm still concerned about the fact that eqjoinsel() is O(N^2). Show me some measurements demonstrating that a deep nest of equijoins doesn't get noticeably more expensive to plan --- preferably on a datatype with an expensive equality operator, eg numeric --- and I'm on board. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Sabino Mullane [EMAIL PROTECTED] writes: Also, I'd actually assert that 10 seems to be perfectly adequate for the majority of users. That is, the number of users where I've recommended increasing d_s_t for the whole database is smaller than the number where I don't, and of course we never hear from most users at all. So I'm pretty happy recommending Leave the default. If you encounter problem queries, increase it to 100, and analyse the database. Really? I'm the opposite: I never leave a client's setting at 10, that's just asking for trouble. Making it 100 *after* you encounter problem queries is reactive; I prefer being proactive. Have you ever measured the system speed before and after? Nor is a setting of 10 perfectly adequate: What percentage of your plans actually change with the larger statistics? How many for the better? How many were massively improved? I suspect you're looking at some single-digit percentage slowdown for planning across the board. In exchange if you a) have simple queries you probably see none improving. If you b) have moderately complex queries you probably get some single-digit percentage of them with improvements. And if you c) have very complex queries you probably have a handful of them which see massive improvements. Across the internet there are a whole lot more applications of type (a) than the others... Frankly, I'd be shocked if there is any significant difference and all compared to the actual query run time. Well you might start preparing to be shocked. Note that retrieving the statistics is a query itself so it's not hard for it to be comparable to a similarly simple query. It's not hard for a simple query using multiple columns to be using more records of statistics than it is from the actual data. And things can look much worse if that data is TOASTed and requires further lookups and/or decompression... The orders of magnitude speed up of certain queries when the d_s_t goes above 98 is what spawned my original thread proposing a change to 100: http://markmail.org/message/tun3a3juxlsyjbsw While it's easy to get bogged down in theory about what things d_s_t should measure, the optimal size of buckets, etc., it's still a severe performance regression bug that should be fixed, IMO. It was, three months ago: http://archives.postgresql.org/pgsql-committers/2008-03/msg00129.php 100 is no longer a magic threshold for LIKE queries (in CVS HEAD) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Bruce, I am concerned that each wizzard is going to have to duplicate the same logic each time, and adjust to release-based changes. I think that's a feature, not a bug. Right now, I'm not at all convinced that my algorithms for setting the various major dials are great (I just think that nobody yet has better). So I think we should *encourage* people to write their own wizards until we find one that works reasonably well. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus wrote: Bruce, I am concerned that each wizard is going to have to duplicate the same logic each time, and adjust to release-based changes. I think that's a feature, not a bug. Right now, I'm not at all convinced that my algorithms for setting the various major dials are great (I just think that nobody yet has better). So I think we should *encourage* people to write their own wizards until we find one that works reasonably well. I am thinking a web-based wizard would make the most sense. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Bruce, I am thinking a web-based wizard would make the most sense. I'd prefer command-line, so that people could run it on their own servers. For one thing, we need to generate at least two files on many platforms; a postgresql.conf and a sysctl. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus wrote: Bruce, I am thinking a web-based wizard would make the most sense. I'd prefer command-line, so that people could run it on their own servers. For one thing, we need to generate at least two files on many platforms; a postgresql.conf and a sysctl. They can just download the files the need from the web page, no? -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Jun 12, 2008, at 11:21 AM, Bruce Momjian wrote: Josh Berkus wrote: Bruce, I am concerned that each wizard is going to have to duplicate the same logic each time, and adjust to release-based changes. I think that's a feature, not a bug. Right now, I'm not at all convinced that my algorithms for setting the various major dials are great (I just think that nobody yet has better). So I think we should *encourage* people to write their own wizards until we find one that works reasonably well. I am thinking a web-based wizard would make the most sense. There's a definite need for an interactive GUI wizard (bundle with the Windows and OS X installers, at least). And a commandline wizard would certainly be nice, both interactive and non-interactive. Mostly for including in install scripts on unix platforms. And a web-based wizard would be useful too. And all of them would benefit from being able to both modify an existing configuration file, and to generate one from scratch. It looks like it's going to be reasonably easy to abstract away the interface to the user such that the first two (and likely the third) can be built from the same codebase, meaning that the smarts about how to set the various GUC settings (based on RAM available, estimates of database size and usage) can be maintained in one place. Cheers, Steve -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Thu, 12 Jun 2008, Bruce Momjian wrote: I am thinking a web-based wizard would make the most sense. I have not a single customer I work with who could use an external web-based wizard. Way too many companies have privacy policy restrictions that nobody dare cross by giving out any info about their server, or sometimes that they're even using PostgreSQL inside the firewall. If it's not a tool that you can run on the same server you're running PostgreSQL on, I'd consider that another diversion that's not worth pursuing. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. Yes, you could get into problems by having variable dependency loops, Who said anything about loops? What I am talking about is what happens during set memory_usage = X; // implicitly sets work_mem = X/100, say set work_mem = Y; set memory_usage = Z; My initial thought was that this would behave like a shell script variable, meaning once you set something it would affect all references to it below in postgresql.conf. The problem with that is that we comment out all settings, so there isn't a logical order like you would have in a shell script. I was not thinking of memory_usage implicity changing anything. I figured postgresql.conf would have: memory_usage = 100 work_mem = $memory_usage * 0.75 If you change memory_usage via SET, it will not change work_mem at all because you are not re-initializing the variables. Why? That's the exact opposite of what I'd expect. If I want a setting's value to be $memory_usage * .75, that's what I want it to be. Not some value based on whatever $memory_usage was set to when work_mem happened to be changed. Of course, if you set something to a hard value with no variables, then that's what it's set to. I am kind of lost how this would work logically and am willing to think about it some more, but I do think we aren't going to simplify postgresql.conf without such a facility. Agreed. And I think it's a lot more elegant for dealing with day-to- day tuning than some kind of external wizzard. The big problem I see is that right now everything has a constant default. If we allowed memory_usage to change some of the defaults, how would we signal that we want the variables based on it to change their values? This is your behind-the-scenes problem you mentioned. I would suggest that we just re-evaluate everything whenever any setting is changed (this assumes that we store postgresql.conf internally in some fashion, so that we're not actually hitting the file all the time and possibly picking up random edits). -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 smime.p7s Description: S/MIME cryptographic signature
Re: [HACKERS] Overhauling GUCS
Decibel! [EMAIL PROTECTED] writes: On Jun 11, 2008, at 9:34 PM, Bruce Momjian wrote: I am kind of lost how this would work logically and am willing to think about it some more, but I do think we aren't going to simplify postgresql.conf without such a facility. Agreed. And I think it's a lot more elegant for dealing with day-to- day tuning than some kind of external wizzard. You guys call this simplification? You're out of your minds. This proposal is ridiculously complicated, and yet it still fails even to consider adjusting non-numeric parameters. And what about things that require more than a trivial arithmetic expression to compute? It's not hard at all to imagine wanting log, sqrt, etc. We do not need to put computational capability into GUC. Any computations needed to determine a parameter setting should be done by a wizard. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Really? I'm the opposite: I never leave a client's setting at 10, that's just asking for trouble. Making it 100 *after* you encounter problem queries is reactive; I prefer being proactive. Have you ever measured the system speed before and after? Yes. No change (see below on caching) or faster (better plans). Nor is a setting of 10 perfectly adequate: What percentage of your plans actually change with the larger statistics? How many for the better? How many were massively improved? It matters not if there is a slight increase in planning time: the penalty of choosing a Bad Plan far outweighs any increased analyze or planning cost, period. Are you arguing that 10 is a good default, or just against larger values in general? I suspect you're looking at some single-digit percentage slowdown for planning across the board. In exchange if you a) have simple queries you probably see none improving. If you b) have moderately complex queries you probably get some single-digit percentage of them with improvements. And if you c) have very complex queries you probably have a handful of them which see massive improvements. Across the internet there are a whole lot more applications of type (a) than the others... I'm still skeptical that it's the case, but I wouldn't mind seeing some figures about how slowed down a simple database gets going from 10 to 100 (or larger). Robert, any chance we can use Pagila for some sort of test for that? Frankly, I'd be shocked if there is any significant difference and all compared to the actual query run time. Well you might start preparing to be shocked. Note that retrieving the statistics is a query itself so it's not hard for it to be comparable to a similarly simple query. It's not hard for a simple query using multiple columns to be using more records of statistics than it is from the actual data. And things can look much worse if that data is TOASTed and requires further lookups and/or decompression... Even if all you say above is true, and I think we'll have to agree to disagree on that, there's an important point to remember: query plans can be (and very often are) cached. Queries and query results cannot (although I'm working on that... :) Plans to queries is a 1-N, where N can be very, very large, and further boosts the query time vs planning time ratio. ... 100 is no longer a magic threshold for LIKE queries (in CVS HEAD) That's great, but doesn't help existing releases (unless that was backpatched, but I don't recall it if so) But that's a battle I'm going to stop fighting, and concentrate on helping to find a replacement for 10 that may or may not be 100. - -- Greg Sabino Mullane [EMAIL PROTECTED] End Point Corporation PGP Key: 0x14964AC8 200806122100 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -BEGIN PGP SIGNATURE- iEYEAREDAAYFAkhRyXIACgkQvJuQZxSWSshZpwCeOdLZCu0gSQGpOLciQ6H29Tsd iNgAn3AuoisE8zSbMjLuDL4aWzP6NAth =ujTa -END PGP SIGNATURE- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
You guys call this simplification? You're out of your minds. This proposal is ridiculously complicated, and yet it still fails even to consider adjusting non-numeric parameters. And what about things that require more than a trivial arithmetic expression to compute? It's not hard at all to imagine wanting log, sqrt, etc. We do not need to put computational capability into GUC. Any computations needed to determine a parameter setting should be done by a wizard. +1 (save the comment speculating about anyone's location relative to their mind ;) Additionally, obvious as it may be, there's nothing stopping anyone from developing a tool to generate the configuration file from a more interesting source. Whether that's XML or some DSL that supports computations, doesn't matter. I would think if such a tool showed dramatic merit it would provoke another discussion about core integration, but ISTM that leaving it dead simple is best. [mm, hook into the postgres startup script, shouldn't be that hard to administer..] -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: Oh, and wal_buffers, the default for which we should just change if it weren't for SHMMAX. Uh, why? On a workload of mostly small transactions, what value is there in lots of wal_buffers? None. But there's also little to no harm in having a higher setting; at worst you waste a few megabytes of memory. Besides, most databases are initialized from some outside source in the beginning, and data loading does benefit from a higher wal_buffers setting. Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Sabino Mullane wrote: * The word 'paramters' is still misspelled. :) Corrected for 8.4. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: Greg Sabino Mullane wrote: * The word 'paramters' is still misspelled. :) Corrected for 8.4. Technically this is a bug fix... why not backpatch it too? -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Heikki, Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... +1 --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Josh Berkus [EMAIL PROTECTED] writes: Oh, and wal_buffers, the default for which we should just change if it weren't for SHMMAX. Uh, why? On a workload of mostly small transactions, what value is there in lots of wal_buffers? Actually, it's also useful for any workload with many connections. Any time you have high throughput, really. We've seen this on DBT2, SpecJAppserver and iGen. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus wrote: Heikki, Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... Same for pg_subtrans, pg_clog, etc (as previously discussed) -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Robert Lor wrote: Robert Treat wrote: On Wednesday 04 June 2008 22:04:54 Greg Smith wrote: I was just talking to someone today about building a monitoring tool for this. Not having a clear way to recommend people monitor use of work_mem and its brother spilled to disk sorts is an issue right now, I'll whack that one myself if someone doesn't beat me to it before I get time. I remember *years* ago, someone wrote a perl script to poll pgsql_tmp and print out anytime something showed up... you could probably find that in the archives if you look around. of course to me this sounds like an excellent idea for a dtrace probe ;-) Actually, you can find out from the sort-end probe now whether or not the sort spilled to disk and number of disk blocks used. This is one of the probes from Simon. TRACE_POSTGRESQL_SORT_END(state-tapeset, (state-tapeset ? LogicalTapeSetBlocks(state-tapeset) : (state-allowedMem - state-availMem + 1023) / 1024)); 8.3 has GUC log_temp_files. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Joshua D. Drake wrote: On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: Greg Sabino Mullane wrote: * The word 'paramters' is still misspelled. :) Corrected for 8.4. Technically this is a bug fix... why not backpatch it too? That might show up as a diff for people doing upgrades where the minor version changed the spelling. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Bruce Momjian wrote: Joshua D. Drake wrote: On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: Greg Sabino Mullane wrote: * The word 'paramters' is still misspelled. :) Corrected for 8.4. Technically this is a bug fix... why not backpatch it too? That might show up as a diff for people doing upgrades where the minor version changed the spelling. People upgrading won't see it, I think. You only see it when you do a new initdb.. //Magnus -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus wrote: Heikki, Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... +1 --Josh +1 -Jignesh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Dave Page wrote: On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: I've seen people not doing so more often than you would think. Perhaps because they are DBAs and not sysadmins? I also meant a tool to do things like verify that the changes are valid, as someone else mentioned elsewhere in this thread. pg_ctl -D data check? I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. Should this be a TODO? -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Dave Page wrote: On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: pg_ctl -D data check? I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. Should this be a TODO? Yes please. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Dave Page wrote: On Wed, Jun 11, 2008 at 6:56 PM, Bruce Momjian [EMAIL PROTECTED] wrote: Dave Page wrote: On Mon, Jun 2, 2008 at 5:46 AM, Joshua D. Drake [EMAIL PROTECTED] wrote: pg_ctl -D data check? I would +1 that. I would also really like to see that - though I'd also like to see an SQL interface so we can check a config before saving when editing via pgAdmin or similar. Should this be a TODO? Yes please. Added to TODO: * Add pg_ctl option to do a syntax check of postgresql.conf -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Alvaro Herrera [EMAIL PROTECTED] writes: Josh Berkus wrote: Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... Same for pg_subtrans, pg_clog, etc (as previously discussed) I agree with that for pg_clog and friends, but I'm much more leery of folding WAL into the same framework. Its access pattern is *totally* unlike standard caches, so the argument that this would be good for performance is resting on nothing but imagination. Also I'm concerned about possible deadlocks, because WAL is customarily accessed while holding one or more exclusive buffer locks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: Tom Lane wrote: * Can we build a configuration wizard to tell newbies what settings they need to tweak? That would trump all the other suggestions conclusively. Anyone good at expert systems? How far could we get with the answers to just three questions: * How many concurrent queries do you expect to have? * How much RAM space are you willing to let Postgres use? * How much overhead disk space are you willing to let Postgres use? concurrent queries drives max_connections, obviously, and RAM space would drive shared_buffers and effective_cache_size, and both of them would be needed to size work_mem. The third one is a bit weird but I don't see any other good way to set the checkpoint parameters. If those aren't enough questions, what else must we ask? Or maybe they aren't the right questions at all --- maybe we should ask is this a dedicated machine or not and try to extrapolate everything else from what we (hopefully) can find out about the hardware. Having returned from Japan, I read through this thread. It had lots of ideas (new format for postgresql.conf, more/less comments in postgresql.conf) but I didn't see any of the ideas getting a majority. I think we do a good job of making many settings automatic (meaning no one even sees them), but we don't to a great job of making the visible settings easy to set, both in the process (no GUI) and in knowing the proper value. There are two ideas I did think had merit. First, using ## for system-supplied comments, so user comments would be easier to identify. There might be value in doing that even if it were not helpful for scripts. The second idea is the idea of having one parameter depend on another. Not only could we do that for some of our existing parameters, but we could have pseudo-parameters like concurrent_queries, memory_usage, and extra_disk_space that could be at the top of postgresql.conf and then affect the other settings. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Bruce Momjian [EMAIL PROTECTED] writes: The second idea is the idea of having one parameter depend on another. Not only could we do that for some of our existing parameters, but we could have pseudo-parameters like concurrent_queries, memory_usage, and extra_disk_space that could be at the top of postgresql.conf and then affect the other settings. We have tried to do that in the past, and it didn't work well *at all*. The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: The second idea is the idea of having one parameter depend on another. Not only could we do that for some of our existing parameters, but we could have pseudo-parameters like concurrent_queries, memory_usage, and extra_disk_space that could be at the top of postgresql.conf and then affect the other settings. We have tried to do that in the past, and it didn't work well *at all*. We have? When? The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. Yes, you could get into problems by having variable dependency loops, but I see no way to easily improve configuration without it. My idea was to have: memory_usage = 128MB## pseudo-parameter shared_buffers = $memory_usage * 0.75 -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane [EMAIL PROTECTED] writes: Alvaro Herrera [EMAIL PROTECTED] writes: Josh Berkus wrote: Ideally, of course, there would be no wal_buffers setting, and WAL buffers would be allocated from shared_buffers pool on demand... Same for pg_subtrans, pg_clog, etc (as previously discussed) I agree with that for pg_clog and friends, but I'm much more leery of folding WAL into the same framework. Its access pattern is *totally* unlike standard caches, so the argument that this would be good for performance is resting on nothing but imagination. Also I'm concerned about possible deadlocks, because WAL is customarily accessed while holding one or more exclusive buffer locks. Well it may still be worthwhile stealing buffers from shared_buffers even if we set a special flag marking them as owned by WAL and out of bounds for the normal buffer manager. At least that way we could always steal more if we want or return some, as long as we're careful about when we do it. That would open the door to having these parameters be dynamically adjustable. That alone would be worthwhile even if we bypass all bells and whistles of the buffer manager. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg, At least that way we could always steal more if we want or return some, as long as we're careful about when we do it. That would open the door to having these parameters be dynamically adjustable. That alone would be worthwhile even if we bypass all bells and whistles of the buffer manager. One hitch, though, is that asynchronous commit could consume big chunks of shared_buffers. So we might still need a limit for people who are using async. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Magnus Hagander wrote: Bruce Momjian wrote: Joshua D. Drake wrote: On Wed, 2008-06-11 at 11:45 -0400, Bruce Momjian wrote: Greg Sabino Mullane wrote: * The word 'paramters' is still misspelled. :) Corrected for 8.4. Technically this is a bug fix... why not backpatch it too? That might show up as a diff for people doing upgrades where the minor version changed the spelling. People upgrading won't see it, I think. You only see it when you do a new initdb.. The problem is that people doing initdb with different minor versions will have different stock postgresql.conf files. That isn't a huge problem, but I don't see a need to create the problem just to fix a spelling mistake that few have observed. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: The second idea is the idea of having one parameter depend on another. We have tried to do that in the past, and it didn't work well *at all*. We have? When? Just a couple months ago we had to give up enforcing an interrelationship between NBuffers and MaxConnections, because it got too complicated and un-explainable. I seem to recall some other interactions in the distant past, but a quick look through the CVS logs didn't find any smoking guns. The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. Yes, you could get into problems by having variable dependency loops, Who said anything about loops? What I am talking about is what happens during set memory_usage = X; // implicitly sets work_mem = X/100, say set work_mem = Y; set memory_usage = Z; What is work_mem now, and what's your excuse for saying so, and how will you document the behavior so that users can understand it? (Just to make things interesting, assume that some of the above SETs happen via changing postgresql.conf rather than directly.) If the objective is to make configuration easier to understand, I don't believe that behind-the-scenes changes of configuration values will advance that goal. but I see no way to easily improve configuration without it. The higher-level concepts should be things that a configuration wizard works with, and then tells you how to set the postmaster parameters. They should not end up in the configure file (unless maybe as comments?) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Greg, At least that way we could always steal more if we want or return some, as long as we're careful about when we do it. That would open the door to having these parameters be dynamically adjustable. That alone would be worthwhile even if we bypass all bells and whistles of the buffer manager. One hitch, though, is that asynchronous commit could consume big chunks of shared_buffers. So we might still need a limit for people who are using async. Well currently we use a fixed number of fixed-sized buffers, no? I doubt we'll change that even if we take this tact of making wal_buffers resizable by stealing buffers from the buffer manager for precisely the reasons Tom was describing. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I agree with that for pg_clog and friends, but I'm much more leery of folding WAL into the same framework. Well it may still be worthwhile stealing buffers from shared_buffers even if we set a special flag marking them as owned by WAL and out of bounds for the normal buffer manager. At least that way we could always steal more if we want or return some, as long as we're careful about when we do it. ... and as long as you can acquire the WAL per-buffer management space out of nowhere ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Wed, 11 Jun 2008, Tom Lane wrote: Who said anything about loops? What I am talking about is what happens during set memory_usage = X; // implicitly sets work_mem = X/100, say set work_mem = Y; set memory_usage = Z; What is work_mem now, and what's your excuse for saying so, and how will you document the behavior so that users can understand it? (Just to make things interesting, assume that some of the above SETs happen via changing postgresql.conf rather than directly.) People are already exposed to issues in this area via things like the include file mechanism. You can think of that two ways. You can say, there's already problems like this so who cares if there's another one. Or, you can say let's not add even more confusion like that. Having a mini programming language for setting parameters is interesting and all, and it might be enough to do a good job of handling the basic newbie setup chores. But I don't think it's a complete solution and therefore I find moving in that direction a bit of a distraction; your concerns about ambiguity just amplify that feeling. It's unlikely that will get powerful enough to enable the one true config file that just works for everybody. There's too many things that depend a bit on both data access pattern and on overall database size/structure no matter what you do. [If only there were some technology that did workload profiling and set the server parameters based on that. Some sort of dynamic tuning tool; wouldn't that be great? Oh well, that's just a dream right now I guess.] I'm not sure if I've stated this explicitly yet, but I personally have no interest in just solving the newbie problem. I want a tool to help out tuning medium to large installs, and generating a simple config file is absolutely something that should come out of that as a bonus. Anything that just targets the simple installs, though, I'm not very motivated to chase after. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: The idea has a fundamental logical flaw, which is that it's not clear which parameter wins if the user changes both. Yes, you could get into problems by having variable dependency loops, Who said anything about loops? What I am talking about is what happens during set memory_usage = X; // implicitly sets work_mem = X/100, say set work_mem = Y; set memory_usage = Z; My initial thought was that this would behave like a shell script variable, meaning once you set something it would affect all references to it below in postgresql.conf. The problem with that is that we comment out all settings, so there isn't a logical order like you would have in a shell script. I was not thinking of memory_usage implicity changing anything. I figured postgresql.conf would have: memory_usage = 100 work_mem = $memory_usage * 0.75 If you change memory_usage via SET, it will not change work_mem at all because you are not re-initializing the variables. I am kind of lost how this would work logically and am willing to think about it some more, but I do think we aren't going to simplify postgresql.conf without such a facility. The big problem I see is that right now everything has a constant default. If we allowed memory_usage to change some of the defaults, how would we signal that we want the variables based on it to change their values? This is your behind-the-scenes problem you mentioned. What is work_mem now, and what's your excuse for saying so, and how will you document the behavior so that users can understand it? (Just to make things interesting, assume that some of the above SETs happen via changing postgresql.conf rather than directly.) If the objective is to make configuration easier to understand, I don't believe that behind-the-scenes changes of configuration values will advance that goal. but I see no way to easily improve configuration without it. The higher-level concepts should be things that a configuration wizard works with, and then tells you how to set the postmaster parameters. They should not end up in the configure file (unless maybe as comments?) I am concerned that each wizzard is going to have to duplicate the same logic each time, and adjust to release-based changes. I thought having the configuration file itself help with setting some values would be helpful. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg Smith wrote: On Wed, 11 Jun 2008, Tom Lane wrote: Who said anything about loops? What I am talking about is what happens during set memory_usage = X; // implicitly sets work_mem = X/100, say set work_mem = Y; set memory_usage = Z; What is work_mem now, and what's your excuse for saying so, and how will you document the behavior so that users can understand it? (Just to make things interesting, assume that some of the above SETs happen via changing postgresql.conf rather than directly.) People are already exposed to issues in this area via things like the include file mechanism. You can think of that two ways. You can say, there's already problems like this so who cares if there's another one. Or, you can say let's not add even more confusion like that. Having a mini programming language for setting parameters is interesting and all, and it might be enough to do a good job of handling the basic newbie setup chores. But I don't think it's a complete solution and therefore I find moving in that direction a bit of a distraction; your concerns about ambiguity just amplify that feeling. It's unlikely that will get powerful enough to enable the one true config file that just works for everybody. There's too many things that depend a bit on both data access pattern and on overall database size/structure no matter what you do. You are right that the complete solution is going to have to ask users questions, and my idea of using variables is not going to get us that far. -- Bruce Momjian [EMAIL PROTECTED]http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Greg, Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. Your analogy is bad. For elections, the voters have only a few choices. In a 300 million row table, there could be 300 million different values, and the histogram becomes less accurate for every order of magnitude smaller than 300 million it is. I think you're right that you need, for example, 600 people *for each answer* to give good poll results. So for a two-way election which is about even that's about 1,200 people. If one of the candidates is much less popular you might have to sample many more people before you have 600 people in that bucket. The analogous case in our situation is not having 300 million distinct values, since we're not gathering info on specific values, only the buckets. We need, for example, 600 samples *for each bucket*. Each bucket is chosen to have the same number of samples in it. So that means that we always need the same number of samples for a given number of buckets. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). Actually, a number of papers have shown block-based algorithms which can arrive a reasonably confident (between 50% and 250% of accurate) estimates based on scanning only 5% of *blocks*. Simon did some work on this a couple years ago, but he and I had difficultly convincing -hackers that a genuine problem existed. Really? Could you send references? The paper I read surveyed previous work and found that you needed to scan up to 50% of the table to get good results. 50-250% is considerably looser than what I recall it considering good results so these aren't entirely inconsistent but I thought previous results were much worse than that. You're correct that we'd need to change pg_statistic, though. For one thing, we need to separate the sample size from the histogram size. That amounts to giving users control over the sample size per bucket. Which allows them to get a more or less accurate estimate for a range covering a single bucket without changing the size of the bucket. I'm a bit puzzled which direction you want to go. Also, we seem to be getting pretty far away from the original GUC discussion. Thank heavens :) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Ron, I wonder if the fastest way to generate the configurator would be to simply ask everyone to post their tuned postgresql.conf files along with a brief description of the use case for that file. The we could group the use-cases into various classes; and average the values of the submitted files. Then the configurator's one question choose which use case most closely matches yours from this list. This has been proposed a number of times, but is still infeasable for the simple reason that a lot of settings are dependant on how much RAM you have on the machine. Sample configs would just result in people trying to set their shared_buffers to 150% of system RAM if they have a lightweight system, or in having 500 different sample configurations, which would produce even *more* user confusion. A generator makes more sense. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Robert, shared_buffers effective_cache_size default_stats_target work_mem maintainance_work_mem listen_address max_connections the fsm parameters checkpoint_segements random_page_cost My list is very similar, execept that I drop random_page_cost and add synchronous_commit, autovaccum (on or off) and offer 4 packages of log settings. So I think that we can get consensus on what are the newbie parameters pretty easily. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Tuesday 10 June 2008 09:37, Josh Berkus wrote: Robert, shared_buffers effective_cache_size default_stats_target work_mem maintainance_work_mem listen_address max_connections the fsm parameters checkpoint_segements random_page_cost My list is very similar, execept that I drop random_page_cost and add synchronous_commit, autovaccum (on or off) and offer 4 packages of log settings. Oh, and wal_buffers, the default for which we should just change if it weren't for SHMMAX. -- Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Oh, and wal_buffers, the default for which we should just change if it weren't for SHMMAX. Uh, why? On a workload of mostly small transactions, what value is there in lots of wal_buffers? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: ...default_statistics_target?...Uhh 10. Ah, but we only ever hear about the cases where it's wrong of course. In other words even if we raised it to some optimal value we would still have precisely the same experience of seeing only posts on list about it being insufficient. Yet if it was raised above the optimal, wouldn't you expect to hear cases where it was too high? It seems a slow painful way of finding a good setting might be to keep increasing it until we had cases where the lists start telling people they need to lower the parameter? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom, Actually, the reason it's still 10 is that the effort expended to get it changed has been *ZERO*. I keep asking for someone to make some measurements, do some benchmarking, anything to make a plausible case for a specific higher value as being a reasonable place to set it. The silence has been deafening. Not surprising really. It is a simple adjustment to make and it also is easy to spot when its a problem. However it is not trivial to test for (in terms of time and effort). I know 10 is wrong and so do you. If you don't I am curious why I see so many posts from you saying, Your estimates are off, what is your default_statistics_target? with yet even more responses saying, Uhh 10. I tried (back in 7.4) to do some systematic testing of this. The problem is that the cases were higher d_s_t are required are specifically ones with complex, unbalanced data distributions and/or very large databases. This makes test cases extremely difficult and time-consuming to generate; further, I found that the test cases I had from my clients' databases were not portable (in addition to being confidential). Also, I'd actually assert that 10 seems to be perfectly adequate for the majority of users. That is, the number of users where I've recommended increasing d_s_t for the whole database is smaller than the number where I don't, and of course we never hear from most users at all. So I'm pretty happy recommending Leave the default. If you encounter problem queries, increase it to 100, and analyse the database. If you're running a data warehouse, increase it to 1000. Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. --Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Josh Berkus [EMAIL PROTECTED] writes: Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. In fact where those polls have difficulty is the same place we have some problems. For *smaller* populations like individual congressional races you need to have nearly the same 600 sample for each of those small races. That adds up to a lot more than 600 total. In our case it means when queries cover a range much less than a whole bucket then the confidence interval increases too. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On 6/9/08, Gregory Stark [EMAIL PROTECTED] wrote: Josh Berkus [EMAIL PROTECTED] writes: Where analyze does systematically fall down is with databases over 500GB in size, but that's not a function of d_s_t but rather of our tiny sample size. n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). Hi, is this (summarizing results in a constant sized data structure) something which could be achived by Bloom-Filters ? http://archives.postgresql.org/pgsql-general/2008-06/msg00076.php Kind regards Hakan Kocaman
Re: [HACKERS] Overhauling GUCS
Hakan Kocaman [EMAIL PROTECTED] writes: On 6/9/08, Gregory Stark [EMAIL PROTECTED] wrote: n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). is this (summarizing results in a constant sized data structure) something which could be achived by Bloom-Filters ? Uhm, it would be a bit of a strange application of them but actually it seems to me that would be a possible approach. It would need a formula for estimating the number of distinct values given the number of bits set in the bloom filter. That should be a tractable combinatorics problem (in fact it's pretty similar to the combinatorics I posted a while back about getting all the drives in a raid array busy). And if you have a dynamic structure where the filter size grows then it would overestimate because extra copied bits would be set. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Greg, Speak to the statisticians. Our sample size is calculated using the same theory behind polls which sample 600 people to learn what 250 million people are going to do on election day. You do NOT need (significantly) larger samples for larger populations. Your analogy is bad. For elections, the voters have only a few choices. In a 300 million row table, there could be 300 million different values, and the histogram becomes less accurate for every order of magnitude smaller than 300 million it is. Also, our estimates for n_distinct are very unreliable. The math behind sampling for statistics just doesn't work the same way for properties like n_distinct. For that Josh is right, we *would* need a sample size proportional to the whole data set which would practically require us to scan the whole table (and have a technique for summarizing the results in a nearly constant sized data structure). Actually, a number of papers have shown block-based algorithms which can arrive a reasonably confident (between 50% and 250% of accurate) estimates based on scanning only 5% of *blocks*. Simon did some work on this a couple years ago, but he and I had difficultly convincing -hackers that a genuine problem existed. You're correct that we'd need to change pg_statistic, though. For one thing, we need to separate the sample size from the histogram size. Also, we seem to be getting pretty far away from the original GUC discussion. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Joshua D. Drake [EMAIL PROTECTED] writes: On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: Actually, the reason it's still 10 is that the effort expended to get it changed has been *ZERO*. I keep asking for someone to make some measurements, do some benchmarking, anything to make a plausible case for a specific higher value as being a reasonable place to set it. The silence has been deafening. Not surprising really. It is a simple adjustment to make and it also is easy to spot when its a problem. However it is not trivial to test for (in terms of time and effort). I know 10 is wrong and so do you. If you don't I am curious why I see so many posts from you saying, Your estimates are off, what is your default_statistics_target? with yet even more responses saying, Uhh 10. Ah, but we only ever hear about the cases where it's wrong of course. In other words even if we raised it to some optimal value we would still have precisely the same experience of seeing only posts on list about it being insufficient. What's needed is some speed benchmarks for complex queries with varying size statistics so we can see how badly large statistic tables hurt planning time. The flip side of seeing how much larger tables help planning accuracy is much harder to measure. Offhand I don't see any systematic way to go about it. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Sunday 08 June 2008 19:07:21 Gregory Stark wrote: Joshua D. Drake [EMAIL PROTECTED] writes: On Fri, 2008-06-06 at 20:19 -0400, Tom Lane wrote: Robert Treat [EMAIL PROTECTED] writes: Actually, the reason it's still 10 is that the effort expended to get it changed has been *ZERO*. I keep asking for someone to make some measurements, do some benchmarking, anything to make a plausible case for a specific higher value as being a reasonable place to set it. The silence has been deafening. Not surprising really. It is a simple adjustment to make and it also is easy to spot when its a problem. However it is not trivial to test for (in terms of time and effort). I know 10 is wrong and so do you. If you don't I am curious why I see so many posts from you saying, Your estimates are off, what is your default_statistics_target? with yet even more responses saying, Uhh 10. Ah, but we only ever hear about the cases where it's wrong of course. In other words even if we raised it to some optimal value we would still have precisely the same experience of seeing only posts on list about it being insufficient. The slipside to this is that we're not trying to find the perfect setting, we're just trying to determine a number that will cause more benefit than harm compared to the number we have now. While I am sure there are cases where 100 is too low as well, I cannot recall ever having seen someone suggest lowering the default_stats_target to something less than 100. (I know sit back and wait for someone to comb the archives, just to find that 1 time). -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
David E. Wheeler wrote: On Jun 5, 2008, at 14:47, Greg Smith wrote: This is why there's the emphasis on preserving comments as they pass into the GUC structure and back to an output file. This is one of the implementation details I haven't fully made up my mind on: how to clearly label user comments in the postgresql.conf to distinguish them from verbose ones added to the file. I have no intention of letting manual user edits go away; what I'm trying to do here (and this part is much more me than Josh) is make them more uniform such that they can co-exist with machine edits without either stomping on the other. Right now doing that is difficult, because it's impossible to tell the default comments from the ones the users added and the current comment structure bleeds onto the same lines as the settings. How about a simple rule, such as that machine-generated comments start with ##, while user comments start with just #? I think that I've seen such a rule used before. At any rate, I think that, unless you have some sort of line marker for machine-generated comments, there will be no way to tell them apart from user comments. What comments do we consider machine-generated? Just the ones used to comment out settings, like #shared_buffers = 32MB or something else? If the automatic tool lets alone all other kind of comments, I think we're fine. In fact, it wouldn't necessarily need to modify those comments either, it could simply add a new setting line below that: #shared_buffers = 32MB shared_buffers = 1024MB For extra safety, it could comment out old settings, perhaps with something like this: #shared_buffers = 32MB #shared_buffers = 1024MB # commented out by wizard on 2008-06-05 shared_buffers = 2048MB This would preserve a full change history in the file. It would become quite messy after a lo of changes, of course, but a user can trim the history by hand if he wants to. Or perhaps we should explicitly mark the settings the tool has generated, and comment out: #shared_buffers = 32MB # commented out by wizard on 2008-06-05 shared_buffers = 1024MB # automatically set by wizard on 2008-06-05 That way the tool could safely replace automatically set settings, without replacing manually set ones without leaving a clear trace of what happened. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
David E. Wheeler wrote: How about a simple rule, such as that machine-generated comments start with ##, while user comments start with just #? I think that I've seen such a rule used before. At any rate, I think that, unless you have some sort of line marker for machine-generated comments, there will be no way to tell them apart from user comments. Two heretical questions: Do we need user generated comments at all? I can't remember ever having used any comment in postgresql.conf. Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c SET foo=bar; MAKE PERSISTENT just as good as sed'ing postgresql.conf or doing it manually? Looking around for different approaches, network appliances come to my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix: - use a command line (using ssh or telnet, eqivalent to psql); WRITE MEMORY to make the changes survive a reboot. - use a web interface (or similar tool) - use tftp to up/download the complete config in and out, editing the file. User comments will be lost, with the exception of those that have been applied with special comment commands (equivalent to comment on). Regards, Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Andreas Pflug [EMAIL PROTECTED] writes: Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c SET foo=bar; MAKE PERSISTENT just as good as sed'ing postgresql.conf or doing it manually? no, it's awful. Looking around for different approaches, network appliances come to my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix: - use a command line (using ssh or telnet, eqivalent to psql); WRITE MEMORY to make the changes survive a reboot. - use a web interface (or similar tool) - use tftp to up/download the complete config in and out, editing the file. User comments will be lost, with the exception of those that have been applied with special comment commands (equivalent to comment on). And in every major installation I've seen people use the last option. They treat the original text file which is kept elsewhere -- normally checked into some revision control system, tracked and managed like source code -- as the canonical and authoritative version. The fact that you have to go through an especially complicated and awkward process to load it onto the switches is just a technical detail to be worked around. And the fact that somebody could have altered the configuration directly on the switch and forgotten to write it to the configuration file is a danger to be worked around with policy, not a benefit. Essentially what good cisco network admins are doing here is working around the awkward interface using policy and replacing it with a static text configuration file kept elsewhere. Experience with other attempts at automatic tools to edit configuration files like webmin etc are that editing text files which are also under user control is just a terrible idea. You can make it almost work but it's always fragile and can't keep up with the flexibility of free-form text and the creativity of users. What I would suggest is going back to the plan when we added include files. Have an automatically generated config file which is freshly re-generated each time and doesn't try to preserve any user-formatting. Users then include that file at the top of postgresql.conf and override any of the settings by setting them in postgresql.conf after the include. The important thing here is to separate files which are under postgres control and those under user control. That doesn't necessarily mean we have to ban users from touching the automatically generated config file or switch formats, but it relieves us of any responsibility for maintaining free-form text. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Gregory Stark wrote: Andreas Pflug [EMAIL PROTECTED] writes: Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c SET foo=bar; MAKE PERSISTENT just as good as sed'ing postgresql.conf or doing it manually? no, it's awful. So I wonder why you accept it when configuring schemas. What's the big difference between setting a config param, and creating a table? And ultimately, the config param file format may well look like an SQL command file, restricted to SET only. And in every major installation I've seen people use the last option. They treat the original text file which is kept elsewhere -- normally checked into some revision control system, tracked and managed like source code -- as the canonical and authoritative version. That's how you'd have to manage the schema sources too, no? Your comments are lost as well after schema creation scripts are executed, and manual changes may interfere with that. Regards, Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Andreas Pflug [EMAIL PROTECTED] writes: Gregory Stark wrote: Andreas Pflug [EMAIL PROTECTED] writes: Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c SET foo=bar; MAKE PERSISTENT just as good as sed'ing postgresql.conf or doing it manually? no, it's awful. So I wonder why you accept it when configuring schemas. What's the big difference between setting a config param, and creating a table? Yes, this is a big impedence mismatch between the old-school DBMS (with emphasis on the MS part of the acronym) architecture and more modern approaches. You don't configure your web site with ALTER PAGE home.html SET FORM contactus INPUT TYPE SUBMIT either... This is partly out of necessity though as those database objects contain *data*. So merely replacing them with new data objects doesn't give the system enough information to understand what to do with that data. The DDL commands which modify the schema give that kind of data mutating instruction. And I'll note that DBAs go to *great* lengths to do exactly the same kind of thing I described the CISCO DBAs doing. Usually that means doing an ALTER command and separately editing a creation script and trying to keep the two in sync. It's a huge burden. And ultimately, the config param file format may well look like an SQL command file, restricted to SET only. So all you have is our existing file except with an additional layer of quoting to deal with, a useless SET keyword to annoy users, and a file that you need a bison parser to deal instead of a simple keyword-value syntax that sed can manipulate. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Gregory Stark wrote: So all you have is our existing file except with an additional layer of quoting to deal with, a useless SET keyword to annoy users, and a file that you need a bison parser Don't you think that's a little over the top, throwing bison at the simple task to extend postgresql.conf scanning so it accepts --, /**/ and SET? to deal instead of a simple keyword-value syntax that sed can manipulate. sed seems to be THE killer application... Apart from the fact that sed easily could identify SET, my answer would be to use psql to modify the config, not sed. Text config files are NOT friendly for beginner and mediocre users. IMHO the current restriction on GUC changes is a major obstacle towards pgsql tuning tools, e.g. written as a Google SoC project. Graphic tools aren't too popular at pgsql-hackers, but please contemplate a little how much pgadmin may have contributed to the pgsql usage boost, esp. on windows. Regards, Andreas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Mittwoch, 4. Juni 2008 schrieb Aidan Van Dyk: When reading this thread, I'm wondering if anybody ever saw a config file for a complex software product that was easily editable and understandable. I don't know one. If there was one, it'd be nice to know it so we can learn from it. PostreSQL, Apache, X.org And note that one of the major advances in X.org over XFree86 was that all the useless garbage was removed from the configuration file, so that the final and usable configuration fits on one screen, and you can even write it from memory if you dare to. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Mittwoch, 4. Juni 2008 schrieb Tom Lane: * Can we present the config options in a more helpful way (this is 99% a documentation problem, not a code problem)? ack * Can we build a configuration wizard to tell newbies what settings they need to tweak? Some questions to clarify this: - What settings do newbies (or anyone else) typically need to change? Please post a list. - What values would you set those settings to? Please provide a description for arriving at a value, which can later be transformed into code. Note that in some cases, not even the documentation provides more than handwaving help. - If we know better values, why don't we set them by default? Another orthogonal stumbling block on the way to making all of this automatic is that the surely criticial shared_buffers setting will in any useful configuration require messing around with kernel settings that no PostgreSQL tool can really help with. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Donnerstag, 5. Juni 2008 schrieb Tom Lane: How far could we get with the answers to just three questions: * How many concurrent queries do you expect to have? * How much RAM space are you willing to let Postgres use? * How much overhead disk space are you willing to let Postgres use? This is surely a good start. We could optimize this even more by saying, disk space is cheap, so let's just use a much higher default setting for checkpoint_segments. (If PostgreSQL is installed but not actually used, not all the space is actually going to be used anyway.) Then, increase max_connections a bit; that should be OK for most users. Then you are left with the memory settings, and those need kernel tuning in most cases, so any automation tool loses. Hmm. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
* Andreas Pflug [EMAIL PROTECTED] [080606 04:50]: David E. Wheeler wrote: How about a simple rule, such as that machine-generated comments start with ##, while user comments start with just #? I think that I've seen such a rule used before. At any rate, I think that, unless you have some sort of line marker for machine-generated comments, there will be no way to tell them apart from user comments. Two heretical questions: Do we need user generated comments at all? I can't remember ever having used any comment in postgresql.conf. Well, I have, mainly to leave traces as to what and why I've changed something from a default/previous value, based on chagne, hope, and test style tuning. And the one that Greg brought up earlier: ## Don't make this too high, or linux OOM will kill ther server!!! I'm guessing that comment was put in for a reason too. Why do so many people here insist on editing postgresql.conf as primary means of changing config params? Isn't a psql -c SET foo=bar; MAKE PERSISTENT just as good as sed'ing postgresql.conf or doing it manually? I would guess main for 3 reasons: 1) It's always been that way, it's the traditional unix way, and many admins are comfortable being able to comment/make quick test changes with simple text files. 2) Their postgresql.conf are distrubuted/synced/generated from central provisioning/SCM system 3) PostgreSQL the server isn't even running Looking around for different approaches, network appliances come to my mind, e.g. Cisco routers and PIX. You have 3 ways to configure a pix: - use a command line (using ssh or telnet, eqivalent to psql); WRITE MEMORY to make the changes survive a reboot. - use a web interface (or similar tool) - use tftp to up/download the complete config in and out, editing the file. User comments will be lost, with the exception of those that have been applied with special comment commands (equivalent to comment on). And, of course, other server software comes to mind too: apache, bind, postfix, sendmail, dhcpd, sshd, cron, xinetd... ;-) a. -- Aidan Van Dyk Create like a god, [EMAIL PROTECTED] command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] Overhauling GUCS
* Peter Eisentraut [EMAIL PROTECTED] [080606 08:25]: Am Mittwoch, 4. Juni 2008 schrieb Aidan Van Dyk: When reading this thread, I'm wondering if anybody ever saw a config file for a complex software product that was easily editable and understandable. I don't know one. If there was one, it'd be nice to know it so we can learn from it. PostreSQL, Apache, X.org And note that one of the major advances in X.org over XFree86 was that all the useless garbage was removed from the configuration file, so that the final and usable configuration fits on one screen, and you can even write it from memory if you dare to. Ya, and unfortunately, I haven't got my dual-head to work with recent distro X.org either, probably because I haven't found how how/where/what to set in Xorg.conf for new version (and my old one doesn't just work on new X.org either). So I've got an old x.org version pinned, even though I'm pretty sure that a newer X.org is probably better, dispite my lack of ability to configure it. Go figure ;-) -- Aidan Van Dyk Create like a god, [EMAIL PROTECTED] command like a king, http://www.highrise.ca/ work like a slave. signature.asc Description: Digital signature
Re: [HACKERS] Overhauling GUCS
Andreas Pflug [EMAIL PROTECTED] writes: Gregory Stark wrote: So all you have is our existing file except with an additional layer of quoting to deal with, a useless SET keyword to annoy users, and a file that you need a bison parser Don't you think that's a little over the top, throwing bison at the simple task to extend postgresql.conf scanning so it accepts --, /**/ and SET? Well you're the one who proposed generalising the simple text file format to SET commands. The whole point of a machine-editable format would be to make it more restricted and predictable, not less so. We had a proposal to do a postgresql.conf.auto which was automatically generated included from a free-form postgresql.conf. That file could generated in a simple format without worrying about formatting, comments, ordering, etc. Text config files are NOT friendly for beginner and mediocre users. IMHO the current restriction on GUC changes is a major obstacle towards pgsql tuning tools, e.g. written as a Google SoC project. Graphic tools aren't too popular at pgsql-hackers, but please contemplate a little how much pgadmin may have contributed to the pgsql usage boost, esp. on windows. Like it or not computers actually have to store state when you're done entering it via the GUI. If you design around the GUI you end up with system that can *only* be used via a GUI and spend years trying to work around that (witness Windows which is only now with a lot of effort recovering from that mistake). People got all focused on comments and sed, but in fact there are a host of reasons to want to be able to directly edit the state stored in a simple text file. You might like to group together related options, for example, perhaps grouping together the options that your site have to adjust between dev and production -- and probably have instructions on how to make the transition in a commented section. I happen to think being presented with 200 options is less confusing than being presented with an empty file and being told to go read the documentation to find out whether I might want to put anything in it. The documentation is a good reference but isn't good as a here's what you'll want to check before you start guide. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Peter Eisentraut [EMAIL PROTECTED] writes: And note that one of the major advances in X.org over XFree86 was that all the useless garbage was removed from the configuration file, so that the final and usable configuration fits on one screen, and you can even write it from memory if you dare to. This is a good point, though I think the big advance wasn't in actually removing the garbage so much as making it irrelevant. That is, the advances were in making the server actually work automatically with less manual configuration. That's always a noble goal. I would be interested in what parameters people thought we could eliminate from the initial config file because the server normally gets it right anyways and users shouldn't have to adjust it anyways. (I'm specifically *not* referring to sections like the kerberos parameters which even if most users can leave them alone those who do use that feature *do* need to adjust those parameters because the server does *not* get them right automatically) -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's PostGIS support! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Andreas Pflug [EMAIL PROTECTED] writes: Text config files are NOT friendly for beginner and mediocre users. IMHO the current restriction on GUC changes is a major obstacle towards pgsql tuning tools, e.g. written as a Google SoC project. Graphic tools aren't too popular at pgsql-hackers, but please contemplate a little how much pgadmin may have contributed to the pgsql usage boost, esp. on windows. I grow weary of this thread. I will say it once more: I do not believe for one instant that the current formatting of postgresql.conf is the major impediment, or even a noticeable impediment, to producing a useful configuration wizard. If you wish to prove otherwise, provide a complete wizard except for the parts that touch the config file, and I will promise to finish it. I will not read or respond to any further discussion of changing the config file format. It's a waste of bandwidth. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Peter Eisentraut [EMAIL PROTECTED] writes: - If we know better values, why don't we set them by default? The problem is: better for what? In particular, I'm uncomfortable with any changes in the direction of trying to make Postgres take over the entire machine by default. I'd want some fairly explicit permission from the user for that ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: - If we know better values, why don't we set them by default? The problem is: better for what? In particular, I'm uncomfortable with any changes in the direction of trying to make Postgres take over the entire machine by default. I'd want some fairly explicit permission from the user for that ... That is where some 80% solution sample config files come in. Joshua D. Drake regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Freitag, 6. Juni 2008 schrieb Tom Lane: Peter Eisentraut [EMAIL PROTECTED] writes: - If we know better values, why don't we set them by default? The problem is: better for what? In particular, I'm uncomfortable with any changes in the direction of trying to make Postgres take over the entire machine by default. I'd want some fairly explicit permission from the user for that ... Yes, those are decisions we are going to have to make, eventually. But recall the three step process: 1. What values need changing? 2. What to change them to? 3. Could that be the new default value? I haven't seen a proposal for item 1 yet, so the rest is idle discussion at this time. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Peter Eisentraut wrote: Am Freitag, 6. Juni 2008 schrieb Tom Lane: Peter Eisentraut [EMAIL PROTECTED] writes: - If we know better values, why don't we set them by default? The problem is: better for what? In particular, I'm uncomfortable with any changes in the direction of trying to make Postgres take over the entire machine by default. I'd want some fairly explicit permission from the user for that ... Yes, those are decisions we are going to have to make, eventually. But recall the three step process: 1. What values need changing? shared_buffers work_mem maintenance_work_mem checkpoint_segments wal_sync_method effective_cache_size I haven't seen a proposal for item 1 yet, so the rest is idle discussion at this time. I think those cover the biggest low hanging fruit, async_commit is arguable. Sincerely, Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Am Freitag, 6. Juni 2008 schrieb Joshua D. Drake: That is where some 80% solution sample config files come in. Considering that writing a sample configuration file is trivial, yet I haven't seen a single one posted in the six or more years of GUC, I have no faith in this plan until I actually see it implemented. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
Peter Eisentraut wrote: Am Freitag, 6. Juni 2008 schrieb Joshua D. Drake: That is where some 80% solution sample config files come in. Considering that writing a sample configuration file is trivial, yet I haven't seen a single one posted in the six or more years of GUC, I have no faith in this plan until I actually see it implemented. I fail to see why anyone would bother with the amount of noise on this thread. If people start leaning toward an actual solution, I am sure the work will get done. Joshua D. Drake -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Overhauling GUCS
On Jun 5, 2008, at 23:08, Heikki Linnakangas wrote: What comments do we consider machine-generated? Just the ones used to comment out settings, like #shared_buffers = 32MB or something else? Those and documentation comments. If the automatic tool lets alone all other kind of comments, I think we're fine. In fact, it wouldn't necessarily need to modify those comments either, it could simply add a new setting line below that: #shared_buffers = 32MB shared_buffers = 1024MB Well, we've been talking about having varying levels of documentation in the comments of the file based on the options passed to the configuration program. I think that these are the primary concern, though Greg, please do correct me if I'm mistaken. For extra safety, it could comment out old settings, perhaps with something like this: #shared_buffers = 32MB #shared_buffers = 1024MB # commented out by wizard on 2008-06-05 shared_buffers = 2048MB This would preserve a full change history in the file. It would become quite messy after a lo of changes, of course, but a user can trim the history by hand if he wants to. I guess that could be a feature. Personally, I use a vcs system for that. Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers